From nobody Sat Nov 15 14:49:49 2025 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=quarantine dis=none) header.from=redhat.com ARC-Seal: i=1; a=rsa-sha256; t=1750923996; cv=none; d=zohomail.com; s=zohoarc; b=kvREwBSNFObLCsILTas2uAf7z3wLLRVMoPymEYfg6f+DZC1cWFo9vW5Uy2TRE8q/DOM/5SEMnzwtfHZJogGaAG+LDrPzGv66u5pciFKRZ0ol2SFzrPQo1YxE5TaLsasscXcmBbbMvOILk7++UuFEqYWaBIJpdc5m9Nn/VHkc9wo= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1750923996; h=Content-Type:Content-Transfer-Encoding:Cc:Cc:Date:Date:From:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:Subject:To:To:Message-Id:Reply-To; bh=Fu14M398cI1ijP/+NwiuzviF3yT2V0YhBy6jnzONO+w=; b=ANmcsEPaiPyGUkq0cPq3lW+l82Bv82K5HcaDAZg9Holwa3PPSYQG3cexgu6zq7/I+1GbzP9QQ9rNJT+7Nn2nwZ3uH21qOpUHCtu0wB+kNf7iyCgBvCzjA8m1s2R+5IV6CqFQ2BJDbdm4Rsw/PMjAgq4NOuK7y916NmZZYqcgP9w= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass header.from= (p=quarantine dis=none) Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1750923996469979.5271148368963; Thu, 26 Jun 2025 00:46:36 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1uUhIo-0004N2-DO; Thu, 26 Jun 2025 03:45:46 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1uUhIm-0004Ms-Vu for qemu-devel@nongnu.org; Thu, 26 Jun 2025 03:45:45 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1uUhIk-0001uC-TQ for qemu-devel@nongnu.org; Thu, 26 Jun 2025 03:45:44 -0400 Received: from mx-prod-mc-08.mail-002.prod.us-west-2.aws.redhat.com (ec2-35-165-154-97.us-west-2.compute.amazonaws.com [35.165.154.97]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-76-XAR0f7DiNs6A9_b10qQjpQ-1; Thu, 26 Jun 2025 03:45:37 -0400 Received: from mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.93]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-08.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id E0D2C1808993; Thu, 26 Jun 2025 07:45:36 +0000 (UTC) Received: from corto.redhat.com (unknown [10.44.32.51]) by mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id C24EA180035C; Thu, 26 Jun 2025 07:45:34 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1750923941; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=Fu14M398cI1ijP/+NwiuzviF3yT2V0YhBy6jnzONO+w=; b=T/9D/o2NYN/l4hFPQ/EZx2bm3yzMZGlrpb5U9gKdBQjk/Da6Kr0k8ibWrb7LVyJEOLxOl7 sU0JXxW32lFyQ9CVogmG6E7bhn4xKQGp3kFYxV+z/GncC7ECkKuapOwTDtcH7boWyRIcJf XQNLxVn9zYxB4bohSmw7FLyjXJUzaW8= X-MC-Unique: XAR0f7DiNs6A9_b10qQjpQ-1 X-Mimecast-MFC-AGG-ID: XAR0f7DiNs6A9_b10qQjpQ_1750923937 From: =?UTF-8?q?C=C3=A9dric=20Le=20Goater?= To: qemu-devel@nongnu.org Cc: Alex Williamson , Rorie Reyes , Thomas Huth , =?UTF-8?q?C=C3=A9dric=20Le=20Goater?= Subject: [PULL 01/25] hw/vfio/ap: attribute constructor for cfg_chg_events_lock Date: Thu, 26 Jun 2025 09:45:05 +0200 Message-ID: <20250626074529.1384114-2-clg@redhat.com> In-Reply-To: <20250626074529.1384114-1-clg@redhat.com> References: <20250626074529.1384114-1-clg@redhat.com> MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.93 Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=170.10.133.124; envelope-from=clg@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H5=0.001, RCVD_IN_MSPIKE_WL=0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, RCVD_IN_VALIDITY_SAFE_BLOCKED=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZohoMail-DKIM: pass (identity @redhat.com) X-ZM-MESSAGEID: 1750923999604116600 From: Rorie Reyes Created an attribute constructor for cfg_chg_events_lock for locking mechanism when storing event information for an AP configuration change event Fixes: fd03360215 ("Storing event information for an AP configuration chang= e event") Signed-off-by: Rorie Reyes Reviewed-by: Thomas Huth Link: https://lore.kernel.org/qemu-devel/20250611211252.82107-1-rreyes@linu= x.ibm.com Signed-off-by: C=C3=A9dric Le Goater --- hw/vfio/ap.c | 12 +++++------- 1 file changed, 5 insertions(+), 7 deletions(-) diff --git a/hw/vfio/ap.c b/hw/vfio/ap.c index 874e0d1eaf1dafa105f9f42705e9973ffc719803..1df4438149d2a06ef176e03e358= 336381bfa4caa 100644 --- a/hw/vfio/ap.c +++ b/hw/vfio/ap.c @@ -52,6 +52,11 @@ static QTAILQ_HEAD(, APConfigChgEvent) cfg_chg_events = =3D =20 static QemuMutex cfg_chg_events_lock; =20 +static void __attribute__((constructor)) vfio_ap_global_init(void) +{ + qemu_mutex_init(&cfg_chg_events_lock); +} + OBJECT_DECLARE_SIMPLE_TYPE(VFIOAPDevice, VFIO_AP_DEVICE) =20 static void vfio_ap_compute_needs_reset(VFIODevice *vdev) @@ -230,13 +235,6 @@ static void vfio_ap_realize(DeviceState *dev, Error **= errp) VFIOAPDevice *vapdev =3D VFIO_AP_DEVICE(dev); VFIODevice *vbasedev =3D &vapdev->vdev; =20 - static bool lock_initialized; - - if (!lock_initialized) { - qemu_mutex_init(&cfg_chg_events_lock); - lock_initialized =3D true; - } - if (!vfio_device_get_name(vbasedev, errp)) { return; } --=20 2.49.0 From nobody Sat Nov 15 14:49:49 2025 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=quarantine dis=none) header.from=redhat.com ARC-Seal: i=1; a=rsa-sha256; t=1750924076; cv=none; d=zohomail.com; s=zohoarc; b=BHmLusSbg6Zg0btxnt2l6Lr6c7uZTYeE8NoS3ZYAfOGOq163oms7x4A0wz61xPdDn5dcZoH4ewNjAOoybIc8vJ9UxW+UBKbxCnUKuwocuNLRLyZCrFKJKzeOBRXX7uMwVuSm+wiDmzqz+PrJ60D+UTGt6xA2YcondjDKmj89hDw= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1750924076; h=Content-Type:Content-Transfer-Encoding:Cc:Cc:Date:Date:From:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:Subject:To:To:Message-Id:Reply-To; bh=RGinRN5UYQIQXdu8qzSrfoWxXQNkX7kh8kvKzxPuFBc=; b=M5rmGkUx6kIZHysEtC+pU8qUDe+FQIxxvDdgojZme1U12EYMKillUUNdoE+wO/YjKkbCs+JiLeWs54QAXlm0gn39tvYNjDJ60UOKt37NrJgHmWeWJn2mzezdpFqDwKrEOHaloSA5zTPdSP1/K3UmKTcuiwewfVgLJcjgKUa2HfI= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass header.from= (p=quarantine dis=none) Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 175092407692539.721272754876054; Thu, 26 Jun 2025 00:47:56 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1uUhIt-0004No-TK; Thu, 26 Jun 2025 03:45:51 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1uUhIr-0004NU-Na for qemu-devel@nongnu.org; Thu, 26 Jun 2025 03:45:49 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1uUhIp-0001vG-TN for qemu-devel@nongnu.org; Thu, 26 Jun 2025 03:45:49 -0400 Received: from mx-prod-mc-08.mail-002.prod.us-west-2.aws.redhat.com (ec2-35-165-154-97.us-west-2.compute.amazonaws.com [35.165.154.97]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-522-zoaH5A2DOqKu6jGsQYIJTg-1; Thu, 26 Jun 2025 03:45:40 -0400 Received: from mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.93]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-08.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 7851C18002ED; Thu, 26 Jun 2025 07:45:39 +0000 (UTC) Received: from corto.redhat.com (unknown [10.44.32.51]) by mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id 60B59180035C; Thu, 26 Jun 2025 07:45:36 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1750923946; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=RGinRN5UYQIQXdu8qzSrfoWxXQNkX7kh8kvKzxPuFBc=; b=NI7Pptnap2WC71XInHXosjXnR+zXU/ZcR3r8z0dAMlsQluiwNw09yIJ1v4nPkKC6zU3HBj tDdtyF11WCha0jT6+pC+e9RQ8IRL1ldwfNdqRrnLYnxOvYI7+8tGA4EuYdP6xHS3HA45hf f1yPU3qoBipX5xakV9MFtEW1yP4phJY= X-MC-Unique: zoaH5A2DOqKu6jGsQYIJTg-1 X-Mimecast-MFC-AGG-ID: zoaH5A2DOqKu6jGsQYIJTg_1750923939 From: =?UTF-8?q?C=C3=A9dric=20Le=20Goater?= To: qemu-devel@nongnu.org Cc: Alex Williamson , John Levon , =?UTF-8?q?C=C3=A9dric=20Le=20Goater?= , =?UTF-8?q?Philippe=20Mathieu-Daud=C3=A9?= Subject: [PULL 02/25] vfio: add vfio_device_get_region_fd() Date: Thu, 26 Jun 2025 09:45:06 +0200 Message-ID: <20250626074529.1384114-3-clg@redhat.com> In-Reply-To: <20250626074529.1384114-1-clg@redhat.com> References: <20250626074529.1384114-1-clg@redhat.com> MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.93 Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=170.10.133.124; envelope-from=clg@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H5=0.001, RCVD_IN_MSPIKE_WL=0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, RCVD_IN_VALIDITY_SAFE_BLOCKED=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZohoMail-DKIM: pass (identity @redhat.com) X-ZM-MESSAGEID: 1750924078460116600 From: John Levon This keeps the existence of ->region_fds private to hw/vfio/device.c. Signed-off-by: John Levon Reviewed-by: C=C3=A9dric Le Goater Reviewed-by: Philippe Mathieu-Daud=C3=A9 Link: https://lore.kernel.org/qemu-devel/20250616101337.3190027-1-john.levo= n@nutanix.com Signed-off-by: C=C3=A9dric Le Goater --- include/hw/vfio/vfio-device.h | 12 ++++++++++++ hw/vfio/device.c | 7 +++++++ hw/vfio/region.c | 5 +---- 3 files changed, 20 insertions(+), 4 deletions(-) diff --git a/include/hw/vfio/vfio-device.h b/include/hw/vfio/vfio-device.h index d45e5a68a24e7990fa93ce6549f5710b4f25a037..1926adb0a35c7d61ea84fb63de8= a83f719578053 100644 --- a/include/hw/vfio/vfio-device.h +++ b/include/hw/vfio/vfio-device.h @@ -256,6 +256,18 @@ int vfio_device_get_region_info(VFIODevice *vbasedev, = int index, struct vfio_region_info **info); int vfio_device_get_region_info_type(VFIODevice *vbasedev, uint32_t type, uint32_t subtype, struct vfio_region_= info **info); + +/** + * Return the fd for mapping this region. This is either the device's fd (= for + * e.g. kernel vfio), or a per-region fd (for vfio-user). + * + * @vbasedev: #VFIODevice to use + * @index: region index + * + * Returns the fd. + */ +int vfio_device_get_region_fd(VFIODevice *vbasedev, int index); + bool vfio_device_has_region_cap(VFIODevice *vbasedev, int region, uint16_t= cap_type); =20 int vfio_device_get_irq_info(VFIODevice *vbasedev, int index, diff --git a/hw/vfio/device.c b/hw/vfio/device.c index d32600eda1a282adaa4c58cfeb80c19c6489ee92..d91c695b69b67ff8f09f590d3fc= ca8f30f259170 100644 --- a/hw/vfio/device.c +++ b/hw/vfio/device.c @@ -243,6 +243,13 @@ retry: return 0; } =20 +int vfio_device_get_region_fd(VFIODevice *vbasedev, int index) +{ + return vbasedev->region_fds ? + vbasedev->region_fds[index] : + vbasedev->fd; +} + int vfio_device_get_region_info_type(VFIODevice *vbasedev, uint32_t type, uint32_t subtype, struct vfio_region_= info **info) { diff --git a/hw/vfio/region.c b/hw/vfio/region.c index f5b8e3cbf100f4f1b9fb916075653d3a6e1ca8d4..d04c57db630f3742ea90730e5b5= 8d9aa8b60ffed 100644 --- a/hw/vfio/region.c +++ b/hw/vfio/region.c @@ -273,10 +273,7 @@ int vfio_region_mmap(VFIORegion *region) goto no_mmap; } =20 - /* Use the per-region fd if set, or the shared fd. */ - fd =3D region->vbasedev->region_fds ? - region->vbasedev->region_fds[region->nr] : - region->vbasedev->fd, + fd =3D vfio_device_get_region_fd(region->vbasedev, region->nr); =20 map_align =3D (void *)ROUND_UP((uintptr_t)map_base, (uintptr_t)ali= gn); munmap(map_base, map_align - map_base); --=20 2.49.0 From nobody Sat Nov 15 14:49:49 2025 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=quarantine dis=none) header.from=redhat.com ARC-Seal: i=1; a=rsa-sha256; t=1750924257; cv=none; d=zohomail.com; s=zohoarc; b=OlfXDJD6yzlNXFC1ym5KEWgpS4Fv6eNwp0UaD3+bEXKzkbO4JeH711WImQncLGIN5ZIHzvd2lBILA+S7hvq7S6jRqt9ujPTgheY2LMngOLKoO2DA+wKwyhvXf76hrnQxY5Quj1rT/Ys3sMORImbZJEqcRmTNhDXPJpg4uEodvQQ= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1750924257; h=Content-Type:Content-Transfer-Encoding:Cc:Cc:Date:Date:From:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:Subject:To:To:Message-Id:Reply-To; bh=80aDPKkAGNekjZlBd3D72tmzWAavOYeI6L2XeZYLnsc=; b=HBqUFQB+EDXqGr8/yfRnt25AQsnHTOvgNpvsnKRhzqMSfVvpJ9BxMgDt2WLnBkER9Qu4UHTxQ/ArKQGMdaG6iThiKG6GdDDjI0voqA5ClbM8Mzc9LD0gOI6XXIi2Y2CezFn2dVPryxv4w9/7Q4XL/OBZD/ey7DvQkqvsNYnFyuc= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass header.from= (p=quarantine dis=none) Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1750924257559339.6535972681726; Thu, 26 Jun 2025 00:50:57 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1uUhIv-0004ON-9Y; Thu, 26 Jun 2025 03:45:53 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1uUhIt-0004Nf-3u for qemu-devel@nongnu.org; Thu, 26 Jun 2025 03:45:51 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1uUhIn-0001ut-62 for qemu-devel@nongnu.org; Thu, 26 Jun 2025 03:45:50 -0400 Received: from mx-prod-mc-03.mail-002.prod.us-west-2.aws.redhat.com (ec2-54-186-198-63.us-west-2.compute.amazonaws.com [54.186.198.63]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-136-BBW83-kiPi6epugieGjgmw-1; Thu, 26 Jun 2025 03:45:42 -0400 Received: from mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.93]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-03.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id B1E771956048; Thu, 26 Jun 2025 07:45:41 +0000 (UTC) Received: from corto.redhat.com (unknown [10.44.32.51]) by mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id 1AFC0180035C; Thu, 26 Jun 2025 07:45:39 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1750923944; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=80aDPKkAGNekjZlBd3D72tmzWAavOYeI6L2XeZYLnsc=; b=I7uSjKGqvooQKxeGRqxSzfsTM5cc/Hy7xPlghR+5+wsTulzjKXN0NNq7409vECAFfZ0hrC zFGKQ2ue5/m2K0xarSsE3+MxzZgmjhEEEMQAHt/U2FUzmWhVyeND9R9aOUNKT+FI1fdzGS IOMeM91u2yn33E0YjVXNOhYXE0y+lxM= X-MC-Unique: BBW83-kiPi6epugieGjgmw-1 X-Mimecast-MFC-AGG-ID: BBW83-kiPi6epugieGjgmw_1750923942 From: =?UTF-8?q?C=C3=A9dric=20Le=20Goater?= To: qemu-devel@nongnu.org Cc: Alex Williamson , John Levon , =?UTF-8?q?C=C3=A9dric=20Le=20Goater?= Subject: [PULL 03/25] vfio: add documentation for posted write argument Date: Thu, 26 Jun 2025 09:45:07 +0200 Message-ID: <20250626074529.1384114-4-clg@redhat.com> In-Reply-To: <20250626074529.1384114-1-clg@redhat.com> References: <20250626074529.1384114-1-clg@redhat.com> MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.93 Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=170.10.133.124; envelope-from=clg@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H5=0.001, RCVD_IN_MSPIKE_WL=0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, RCVD_IN_VALIDITY_SAFE_BLOCKED=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZohoMail-DKIM: pass (identity @redhat.com) X-ZM-MESSAGEID: 1750924260609116600 From: John Levon Signed-off-by: John Levon Reviewed-by: C=C3=A9dric Le Goater Link: https://lore.kernel.org/qemu-devel/20250616101314.3189793-1-john.levo= n@nutanix.com Signed-off-by: C=C3=A9dric Le Goater --- include/hw/vfio/vfio-device.h | 1 + 1 file changed, 1 insertion(+) diff --git a/include/hw/vfio/vfio-device.h b/include/hw/vfio/vfio-device.h index 1926adb0a35c7d61ea84fb63de8a83f719578053..959e458d7f7e9facd8216fbd369= 5d9d97dbc667a 100644 --- a/include/hw/vfio/vfio-device.h +++ b/include/hw/vfio/vfio-device.h @@ -240,6 +240,7 @@ struct VFIODeviceIOOps { * @off: offset within the region * @size: size in bytes to write * @data: buffer to write from + * @post: true if this is a posted write * * Returns number of bytes write on success or -errno. */ --=20 2.49.0 From nobody Sat Nov 15 14:49:49 2025 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=quarantine dis=none) header.from=redhat.com ARC-Seal: i=1; a=rsa-sha256; t=1750924012; cv=none; d=zohomail.com; s=zohoarc; b=SFnR6j64k0m1mljYMQXeowtMbNKi/XsZ6tW9SzZZHFvHosE8hUfZ/lnJc/E/UYhZFISasRodsV5lvgUPKvh5rTUICzRAgfgfltNS16JgzTOcZaC8FLmoTSlp18q/5S2vZsd0x2jhwR26qci1nF9SSW7nYqwqVr4DjQmV71pc7OU= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1750924012; h=Content-Type:Content-Transfer-Encoding:Cc:Cc:Date:Date:From:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:Subject:To:To:Message-Id:Reply-To; bh=UIa3DkdPiYbEQRu+bEiCZZDWHR5v/uJ1aistawXdiBM=; b=K6wxoJh960uN9FJikcpkzFSUuVLz83VRcFAxEWT/MtVmFIGorvkH9gl3p7StAQlMqIhPlCu9+t4vEVQqHpw4t7QWLCjSQvCfLqjr2mKHA4oOWgiJN4vCkhmuL9QUt3nOgdrIAGgvzuqB89UtjxO143+DiGlRXm5R1FLu0DFWK6s= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass header.from= (p=quarantine dis=none) Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1750924012306169.72782246378426; Thu, 26 Jun 2025 00:46:52 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1uUhIz-0004PL-Fa; Thu, 26 Jun 2025 03:45:57 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1uUhIx-0004Og-Iq for qemu-devel@nongnu.org; Thu, 26 Jun 2025 03:45:55 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1uUhIq-0001vV-Vg for qemu-devel@nongnu.org; Thu, 26 Jun 2025 03:45:55 -0400 Received: from mx-prod-mc-08.mail-002.prod.us-west-2.aws.redhat.com (ec2-35-165-154-97.us-west-2.compute.amazonaws.com [35.165.154.97]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-630-hoi7p12ROfSxZpSsb9b2Mw-1; Thu, 26 Jun 2025 03:45:45 -0400 Received: from mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.93]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-08.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 4A4DB1808985; Thu, 26 Jun 2025 07:45:44 +0000 (UTC) Received: from corto.redhat.com (unknown [10.44.32.51]) by mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id 5A610180045B; Thu, 26 Jun 2025 07:45:42 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1750923948; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=UIa3DkdPiYbEQRu+bEiCZZDWHR5v/uJ1aistawXdiBM=; b=UFU5jF+Jv9wnvEwdUFGUOuLdFymxymHgKJ+JMlJxCDK1S1RFVBKqAtm82XO9QbPtUXcyov FKHTdCBsw8b3kWJLf/zIqdtqHz2SAkJAo1kJ41Y50DlaU7nFnGKFvd0PyqOsuPexvtpQSm dnsMigx55B0XyLtmX+hne28XIl2Pqrw= X-MC-Unique: hoi7p12ROfSxZpSsb9b2Mw-1 X-Mimecast-MFC-AGG-ID: hoi7p12ROfSxZpSsb9b2Mw_1750923944 From: =?UTF-8?q?C=C3=A9dric=20Le=20Goater?= To: qemu-devel@nongnu.org Cc: Alex Williamson , John Levon , =?UTF-8?q?Daniel=20P=2E=20Berrang=C3=A9?= , =?UTF-8?q?C=C3=A9dric=20Le=20Goater?= Subject: [PULL 04/25] vfio: add license tag to some files Date: Thu, 26 Jun 2025 09:45:08 +0200 Message-ID: <20250626074529.1384114-5-clg@redhat.com> In-Reply-To: <20250626074529.1384114-1-clg@redhat.com> References: <20250626074529.1384114-1-clg@redhat.com> MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.93 Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=170.10.133.124; envelope-from=clg@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H5=0.001, RCVD_IN_MSPIKE_WL=0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, RCVD_IN_VALIDITY_SAFE_BLOCKED=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZohoMail-DKIM: pass (identity @redhat.com) X-ZM-MESSAGEID: 1750924013569116600 From: John Levon Add SPDX-License-Identifier to some files missing it in hw/vfio/. Signed-off-by: John Levon Reviewed-by: Daniel P. Berrang=C3=A9 Link: https://lore.kernel.org/qemu-devel/20250623093053.1495509-1-john.levo= n@nutanix.com Signed-off-by: C=C3=A9dric Le Goater --- hw/vfio/trace.h | 3 +++ hw/vfio/Kconfig | 2 ++ hw/vfio/meson.build | 2 ++ hw/vfio/trace-events | 2 ++ 4 files changed, 9 insertions(+) diff --git a/hw/vfio/trace.h b/hw/vfio/trace.h index 5a343aa59cc9b7aa76df9d900acbeec17a95f046..b34b61ddb285cedc250f35a2b77= 966b0d64e463b 100644 --- a/hw/vfio/trace.h +++ b/hw/vfio/trace.h @@ -1 +1,4 @@ +/* + * SPDX-License-Identifier: GPL-2.0-or-later + */ #include "trace/trace-hw_vfio.h" diff --git a/hw/vfio/Kconfig b/hw/vfio/Kconfig index 7cdba0560aa821c88d3420b36f86020575834202..91d9023b79b594975c6c5f65273= 011b89240691c 100644 --- a/hw/vfio/Kconfig +++ b/hw/vfio/Kconfig @@ -1,3 +1,5 @@ +# SPDX-License-Identifier: GPL-2.0-or-later + config VFIO bool depends on LINUX diff --git a/hw/vfio/meson.build b/hw/vfio/meson.build index 73d29f925ff5c668518698effc3e48cdf71c3a9a..63ea39307601cce4b0783766f68= c4cf8d9af71f9 100644 --- a/hw/vfio/meson.build +++ b/hw/vfio/meson.build @@ -1,3 +1,5 @@ +# SPDX-License-Identifier: GPL-2.0-or-later + vfio_ss =3D ss.source_set() vfio_ss.add(files( 'listener.c', diff --git a/hw/vfio/trace-events b/hw/vfio/trace-events index f06236f37b72cc64868456be3a5a58fdb5beb829..e1728c4ef64acfc4a377dfc4711= cad35c03a51b7 100644 --- a/hw/vfio/trace-events +++ b/hw/vfio/trace-events @@ -1,4 +1,6 @@ # See docs/devel/tracing.rst for syntax documentation. +# +# SPDX-License-Identifier: GPL-2.0-or-later =20 # pci.c vfio_intx_interrupt(const char *name, char line) " (%s) Pin %c" --=20 2.49.0 From nobody Sat Nov 15 14:49:49 2025 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=quarantine dis=none) header.from=redhat.com ARC-Seal: i=1; a=rsa-sha256; t=1750924218; cv=none; d=zohomail.com; s=zohoarc; b=VmGA6IsnclJC0VTHbcPe7sxQh7zD7DXMzJrE9JVYxVWNf4kcQ58Nyg5HUayvKHjpOnQyBE4twn3Pdjhh5oOPyB3PQub80+L6gyGqVVC4eH5pZOpInefhcz2DCWTY9ANyp2WWRNdly2rB2bCjqTwM5CLbpRpmGT7Z+BMRiX4rFno= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1750924218; h=Content-Type:Content-Transfer-Encoding:Cc:Cc:Date:Date:From:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:Subject:To:To:Message-Id:Reply-To; bh=ByhnffEcIoi804d7Url3PJ2rw+ryQ3F85UYi3x35kkM=; b=FFFRtS1S3kNwhYDjWNBbp/uIZq4A4eDZngtWQUxUCtlBKXnVwb02PwWznS1nwae5VZZWlSzTd7NxZyyfEp51XtmVt15PDh21tGLlYzXNZzNVyyEZ3t0pHPIc4PG2J7OgYXNk3mlS0FhhNbqAawbxG3INFx+eAoelI0k7ZgmbcTY= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass header.from= (p=quarantine dis=none) Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1750924218171545.8470089386103; Thu, 26 Jun 2025 00:50:18 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1uUhJE-0004RC-4q; Thu, 26 Jun 2025 03:46:12 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1uUhJ4-0004Q3-Gk for qemu-devel@nongnu.org; Thu, 26 Jun 2025 03:46:08 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1uUhIu-0001vt-0M for qemu-devel@nongnu.org; Thu, 26 Jun 2025 03:46:02 -0400 Received: from mx-prod-mc-08.mail-002.prod.us-west-2.aws.redhat.com (ec2-35-165-154-97.us-west-2.compute.amazonaws.com [35.165.154.97]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-28-yQL7QovrM2SSC1pj_gLtPA-1; Thu, 26 Jun 2025 03:45:47 -0400 Received: from mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.93]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-08.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id B63D81808985; Thu, 26 Jun 2025 07:45:46 +0000 (UTC) Received: from corto.redhat.com (unknown [10.44.32.51]) by mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id C87C518002B0; Thu, 26 Jun 2025 07:45:44 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1750923951; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=ByhnffEcIoi804d7Url3PJ2rw+ryQ3F85UYi3x35kkM=; b=hr6oBZkiBMnDfrTFvH2xur/uR5zsjzxbBV8P0U15fyRMimNZ5fgLDY4lkxOdQDCoBp3cjR WWxLuMyKIa4wCs4sxumHxY3a3m5fVvuceT0ZQbm31exIQzH5SU7pUi1CvWGdgh5VXO72zY j8wbjp0EdWD/3t9LPkt/soRlsQ4EXgA= X-MC-Unique: yQL7QovrM2SSC1pj_gLtPA-1 X-Mimecast-MFC-AGG-ID: yQL7QovrM2SSC1pj_gLtPA_1750923946 From: =?UTF-8?q?C=C3=A9dric=20Le=20Goater?= To: qemu-devel@nongnu.org Cc: Alex Williamson , Zhenzhong Duan , =?UTF-8?q?C=C3=A9dric=20Le=20Goater?= Subject: [PULL 05/25] vfio/container: Fix SIGSEGV when open container file fails Date: Thu, 26 Jun 2025 09:45:09 +0200 Message-ID: <20250626074529.1384114-6-clg@redhat.com> In-Reply-To: <20250626074529.1384114-1-clg@redhat.com> References: <20250626074529.1384114-1-clg@redhat.com> MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.93 Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=170.10.133.124; envelope-from=clg@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H5=0.001, RCVD_IN_MSPIKE_WL=0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, RCVD_IN_VALIDITY_SAFE_BLOCKED=0.001, SPF_HELO_NONE=0.001, T_SPF_TEMPERROR=0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZohoMail-DKIM: pass (identity @redhat.com) X-ZM-MESSAGEID: 1750924220166116600 From: Zhenzhong Duan When open /dev/vfio/vfio fails, SIGSEGV triggers because vfio_listener_unregister() doesn't support a NULL bcontainer pointer. Fixes: a1f267a7d4d9 ("vfio/container: reform vfio_container_connect cleanup= ") Signed-off-by: Zhenzhong Duan Reviewed-by: C=C3=A9dric Le Goater Link: https://lore.kernel.org/qemu-devel/20250623102235.94877-2-zhenzhong.d= uan@intel.com Signed-off-by: C=C3=A9dric Le Goater --- hw/vfio/container.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/hw/vfio/container.c b/hw/vfio/container.c index 3e8d645ebbd7b13c55625f38f4250670a63df124..2853f6f08b5d3f199b7f2c22de8= 84ec6b57279ce 100644 --- a/hw/vfio/container.c +++ b/hw/vfio/container.c @@ -710,7 +710,9 @@ static bool vfio_container_connect(VFIOGroup *group, Ad= dressSpace *as, return true; =20 fail: - vfio_listener_unregister(bcontainer); + if (new_container) { + vfio_listener_unregister(bcontainer); + } =20 if (group_was_added) { vfio_container_group_del(container, group); --=20 2.49.0 From nobody Sat Nov 15 14:49:49 2025 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=quarantine dis=none) header.from=redhat.com ARC-Seal: i=1; a=rsa-sha256; t=1750924191; cv=none; d=zohomail.com; s=zohoarc; b=BmOQUd/8qsBh/cBb+VkFnfaXPiqc/boJY7Sdt6S+iogTWAiN8SEeog2MCg+yQHyAual7YFb4EhXN8s1nvY4IPpQp7zEMyL5SC5FThQkIfO/K/PbmdzHUsWoHfLN0KXxk5yUcXoo2dysmtCljNHz/C+gbm2lAsJbfc/RH0OGD66c= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1750924191; h=Content-Type:Content-Transfer-Encoding:Cc:Cc:Date:Date:From:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:Subject:To:To:Message-Id:Reply-To; bh=cohC1TF4+wfU5TNquD8tlKMRJgQq52ejRwWMDBTo4QU=; b=ngLFoE7GNNProZGVm6wYq7LcGC5htMV+2dcKUxy1fI/YE6asdNoxX56haIHk/fL9BPHFdxn1fvBuZHgSCJk/IwwQ3BteRvVruitER87c2cLDPGwooj+9d6EEDA1eCYooM71vR/SSoemlDPmzFimGpfooz793lsABxeqvIK7AvII= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass header.from= (p=quarantine dis=none) Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1750924191596283.82089086377755; Thu, 26 Jun 2025 00:49:51 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1uUhIy-0004Or-5O; Thu, 26 Jun 2025 03:45:56 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1uUhIw-0004OW-HU for qemu-devel@nongnu.org; Thu, 26 Jun 2025 03:45:54 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1uUhIu-0001vy-Sr for qemu-devel@nongnu.org; Thu, 26 Jun 2025 03:45:54 -0400 Received: from mx-prod-mc-05.mail-002.prod.us-west-2.aws.redhat.com (ec2-54-186-198-63.us-west-2.compute.amazonaws.com [54.186.198.63]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-292-Ttvubg1VMpuR10ov-uc0Fg-1; Thu, 26 Jun 2025 03:45:50 -0400 Received: from mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.93]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-05.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 302151944AAB; Thu, 26 Jun 2025 07:45:49 +0000 (UTC) Received: from corto.redhat.com (unknown [10.44.32.51]) by mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id 6DADF180035C; Thu, 26 Jun 2025 07:45:46 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1750923952; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=cohC1TF4+wfU5TNquD8tlKMRJgQq52ejRwWMDBTo4QU=; b=UPKKqaruBeaYC/V99Dibu81H5zR2A48hzEyY8G/1+r7xz7HSKsdv8uqT9X2NaYs1POJ9tR GIUbdIRjXHm4kMMVZXCXHnYpzInwacbNFX/3PWFhTCm0hqCgjJXvcnk66X9P5IeaIQ/VVL OleCTcXw79d8xVPEbAnX1pbxYhTPOuM= X-MC-Unique: Ttvubg1VMpuR10ov-uc0Fg-1 X-Mimecast-MFC-AGG-ID: Ttvubg1VMpuR10ov-uc0Fg_1750923949 From: =?UTF-8?q?C=C3=A9dric=20Le=20Goater?= To: qemu-devel@nongnu.org Cc: Alex Williamson , Zhenzhong Duan , =?UTF-8?q?C=C3=A9dric=20Le=20Goater?= Subject: [PULL 06/25] vfio/container: fails mdev hotplug if add migration blocker failed Date: Thu, 26 Jun 2025 09:45:10 +0200 Message-ID: <20250626074529.1384114-7-clg@redhat.com> In-Reply-To: <20250626074529.1384114-1-clg@redhat.com> References: <20250626074529.1384114-1-clg@redhat.com> MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.93 Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=170.10.133.124; envelope-from=clg@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H5=0.001, RCVD_IN_MSPIKE_WL=0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, RCVD_IN_VALIDITY_SAFE_BLOCKED=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZohoMail-DKIM: pass (identity @redhat.com) X-ZM-MESSAGEID: 1750924193707116600 From: Zhenzhong Duan It's aggressive to abort a running QEMU process when hotplug a mdev and it fails migration blocker adding. Fix by just failing mdev hotplug itself. Signed-off-by: Zhenzhong Duan Reviewed-by: C=C3=A9dric Le Goater Link: https://lore.kernel.org/qemu-devel/20250623102235.94877-3-zhenzhong.d= uan@intel.com [ clg: Changed test on value returned by migrate_add_blocker_modes() ] Signed-off-by: C=C3=A9dric Le Goater --- hw/vfio/container.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/hw/vfio/container.c b/hw/vfio/container.c index 2853f6f08b5d3f199b7f2c22de884ec6b57279ce..3e13feaa74c30e8d7f5a0978e15= 824cecbf6d674 100644 --- a/hw/vfio/container.c +++ b/hw/vfio/container.c @@ -992,12 +992,16 @@ static bool vfio_legacy_attach_device(const char *nam= e, VFIODevice *vbasedev, if (vbasedev->mdev) { error_setg(&vbasedev->cpr.mdev_blocker, "CPR does not support vfio mdev %s", vbasedev->name); - migrate_add_blocker_modes(&vbasedev->cpr.mdev_blocker, &error_fata= l, - MIG_MODE_CPR_TRANSFER, -1); + if (migrate_add_blocker_modes(&vbasedev->cpr.mdev_blocker, errp, + MIG_MODE_CPR_TRANSFER, -1) < 0) { + goto hiod_unref_exit; + } } =20 return true; =20 +hiod_unref_exit: + object_unref(vbasedev->hiod); device_put_exit: vfio_device_put(vbasedev); group_put_exit: --=20 2.49.0 From nobody Sat Nov 15 14:49:49 2025 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=quarantine dis=none) header.from=redhat.com ARC-Seal: i=1; a=rsa-sha256; t=1750924010; cv=none; d=zohomail.com; s=zohoarc; b=a+zseQOM403SJNBWWhj0N/TclOdt2OJWU3bp3cT44WgDiXU2t6SOx7Nck9ovMEisu8omRA4oLfApWbDyKAQUPcH400s633QGNscxuN0CgF+w2Aj6a6Y9Dbc4IXujBtfMpL9Xc2q5bkX9bUPPpCOGP9U4ajtI8gmcrwY8lIhm/ok= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1750924010; h=Content-Type:Content-Transfer-Encoding:Cc:Cc:Date:Date:From:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:Subject:To:To:Message-Id:Reply-To; bh=NYaB4dUKOwAtubar7RX7PLoeMOj0N+gdeYw3pMcSMLw=; b=g26QaGPRRbx/JPiQdI9099sdtrKUPeWWaacxVHL3En2fqRONUQvfNGpg8VRY1dC9ZOl+mY5Y9g1S8o5PLdNCNiX6yX/1o5vO3vbAU1lqidWDCZDEqP16+iskHxiHH5zZ/v79OjfuQ6pQdMJ3SlLa6oOIcD2IAozu1FU13mMWCp4= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass header.from= (p=quarantine dis=none) Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1750924010448193.84339967621588; Thu, 26 Jun 2025 00:46:50 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1uUhJA-0004QB-GA; Thu, 26 Jun 2025 03:46:08 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1uUhJ3-0004Pr-0j for qemu-devel@nongnu.org; Thu, 26 Jun 2025 03:46:01 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1uUhJ0-0001wj-8C for qemu-devel@nongnu.org; Thu, 26 Jun 2025 03:46:00 -0400 Received: from mx-prod-mc-08.mail-002.prod.us-west-2.aws.redhat.com (ec2-35-165-154-97.us-west-2.compute.amazonaws.com [35.165.154.97]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-632-HsK8dw9pMYqv3PKX55Rsjw-1; Thu, 26 Jun 2025 03:45:53 -0400 Received: from mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.93]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-08.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id C5ADF1800368; Thu, 26 Jun 2025 07:45:52 +0000 (UTC) Received: from corto.redhat.com (unknown [10.44.32.51]) by mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id C780818002B3; Thu, 26 Jun 2025 07:45:49 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1750923957; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=NYaB4dUKOwAtubar7RX7PLoeMOj0N+gdeYw3pMcSMLw=; b=Fi1F/TXzIWSq9Nj31MzHVCTtX2gzGhcjbwig1+MGvok2iBcBfhmo6fsp5vcTTE7FPfEZY7 XDmR5lgwtGpNm+UnGjtDycMSqdrydKe4f1ONBVlUQ0kbf/7VHL5AZdMrdOX0TerGXAm6RX +IVmfFwRCiZHP4Q+CEOwgn67hlTpOIA= X-MC-Unique: HsK8dw9pMYqv3PKX55Rsjw-1 X-Mimecast-MFC-AGG-ID: HsK8dw9pMYqv3PKX55Rsjw_1750923952 From: =?UTF-8?q?C=C3=A9dric=20Le=20Goater?= To: qemu-devel@nongnu.org Cc: Alex Williamson , John Levon , John Johnson , Elena Ufimtseva , Jagannathan Raman , =?UTF-8?q?C=C3=A9dric=20Le=20Goater?= Subject: [PULL 07/25] vfio-user: add vfio-user class and container Date: Thu, 26 Jun 2025 09:45:11 +0200 Message-ID: <20250626074529.1384114-8-clg@redhat.com> In-Reply-To: <20250626074529.1384114-1-clg@redhat.com> References: <20250626074529.1384114-1-clg@redhat.com> MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.93 Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=170.10.133.124; envelope-from=clg@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H5=0.001, RCVD_IN_MSPIKE_WL=0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, RCVD_IN_VALIDITY_SAFE_BLOCKED=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZohoMail-DKIM: pass (identity @redhat.com) X-ZM-MESSAGEID: 1750924012042116600 From: John Levon Introduce basic plumbing for vfio-user with CONFIG_VFIO_USER. We introduce VFIOUserContainer in hw/vfio-user/container.c, which is a container type for the "IOMMU" type "vfio-iommu-user", and share some common container code from hw/vfio/container.c. Add hw/vfio-user/pci.c for instantiating VFIOUserPCIDevice objects, sharing some common code from hw/vfio/pci.c. Originally-by: John Johnson Signed-off-by: Elena Ufimtseva Signed-off-by: Jagannathan Raman Signed-off-by: John Levon Reviewed-by: C=C3=A9dric Le Goater Link: https://lore.kernel.org/qemu-devel/20250625193012.2316242-2-john.levo= n@nutanix.com Signed-off-by: C=C3=A9dric Le Goater --- MAINTAINERS | 8 + hw/vfio-user/container.h | 21 +++ include/hw/vfio/vfio-container-base.h | 1 + hw/vfio-user/container.c | 208 ++++++++++++++++++++++++++ hw/vfio-user/pci.c | 185 +++++++++++++++++++++++ hw/Kconfig | 1 + hw/meson.build | 1 + hw/vfio-user/Kconfig | 7 + hw/vfio-user/meson.build | 9 ++ 9 files changed, 441 insertions(+) create mode 100644 hw/vfio-user/container.h create mode 100644 hw/vfio-user/container.c create mode 100644 hw/vfio-user/pci.c create mode 100644 hw/vfio-user/Kconfig create mode 100644 hw/vfio-user/meson.build diff --git a/MAINTAINERS b/MAINTAINERS index 27f4fe3f2597cb9b5a28d825e965ed891e15d4f2..2369391004bc3637696a38d4fbb= 05fe317beb338 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -4253,6 +4253,14 @@ F: hw/remote/iommu.c F: include/hw/remote/iommu.h F: tests/functional/test_multiprocess.py =20 +VFIO-USER: +M: John Levon +M: Thanos Makatos +S: Supported +F: hw/vfio-user/* +F: include/hw/vfio-user/* +F: subprojects/libvfio-user + EBPF: M: Jason Wang R: Andrew Melnychenko diff --git a/hw/vfio-user/container.h b/hw/vfio-user/container.h new file mode 100644 index 0000000000000000000000000000000000000000..e4a46d2c1b0f712cd7fc1ff62a6= 5d889050de835 --- /dev/null +++ b/hw/vfio-user/container.h @@ -0,0 +1,21 @@ +/* + * vfio-user specific definitions. + * + * SPDX-License-Identifier: GPL-2.0-or-later + */ + +#ifndef HW_VFIO_USER_CONTAINER_H +#define HW_VFIO_USER_CONTAINER_H + +#include "qemu/osdep.h" + +#include "hw/vfio/vfio-container-base.h" + +/* MMU container sub-class for vfio-user. */ +typedef struct VFIOUserContainer { + VFIOContainerBase bcontainer; +} VFIOUserContainer; + +OBJECT_DECLARE_SIMPLE_TYPE(VFIOUserContainer, VFIO_IOMMU_USER); + +#endif /* HW_VFIO_USER_CONTAINER_H */ diff --git a/include/hw/vfio/vfio-container-base.h b/include/hw/vfio/vfio-c= ontainer-base.h index f0232654eedf19c4d9c4f0ed55e79074442720c3..3cd86ec59e1a4605dea92fadeca= 5816145ae409b 100644 --- a/include/hw/vfio/vfio-container-base.h +++ b/include/hw/vfio/vfio-container-base.h @@ -109,6 +109,7 @@ vfio_container_get_page_size_mask(const VFIOContainerBa= se *bcontainer) #define TYPE_VFIO_IOMMU_LEGACY TYPE_VFIO_IOMMU "-legacy" #define TYPE_VFIO_IOMMU_SPAPR TYPE_VFIO_IOMMU "-spapr" #define TYPE_VFIO_IOMMU_IOMMUFD TYPE_VFIO_IOMMU "-iommufd" +#define TYPE_VFIO_IOMMU_USER TYPE_VFIO_IOMMU "-user" =20 OBJECT_DECLARE_TYPE(VFIOContainerBase, VFIOIOMMUClass, VFIO_IOMMU) =20 diff --git a/hw/vfio-user/container.c b/hw/vfio-user/container.c new file mode 100644 index 0000000000000000000000000000000000000000..2367332177245e509842f9f0546= f028e8a132c14 --- /dev/null +++ b/hw/vfio-user/container.c @@ -0,0 +1,208 @@ +/* + * Container for vfio-user IOMMU type: rather than communicating with the = kernel + * vfio driver, we communicate over a socket to a server using the vfio-us= er + * protocol. + * + * SPDX-License-Identifier: GPL-2.0-or-later + */ + +#include +#include +#include "qemu/osdep.h" + +#include "hw/vfio-user/container.h" +#include "hw/vfio/vfio-cpr.h" +#include "hw/vfio/vfio-device.h" +#include "hw/vfio/vfio-listener.h" +#include "qapi/error.h" + +static int vfio_user_dma_unmap(const VFIOContainerBase *bcontainer, + hwaddr iova, ram_addr_t size, + IOMMUTLBEntry *iotlb, bool unmap_all) +{ + return -ENOTSUP; +} + +static int vfio_user_dma_map(const VFIOContainerBase *bcontainer, hwaddr i= ova, + ram_addr_t size, void *vaddr, bool readonly, + MemoryRegion *mrp) +{ + return -ENOTSUP; +} + +static int +vfio_user_set_dirty_page_tracking(const VFIOContainerBase *bcontainer, + bool start, Error **errp) +{ + error_setg_errno(errp, ENOTSUP, "Not supported"); + return -ENOTSUP; +} + +static int vfio_user_query_dirty_bitmap(const VFIOContainerBase *bcontaine= r, + VFIOBitmap *vbmap, hwaddr iova, + hwaddr size, Error **errp) +{ + error_setg_errno(errp, ENOTSUP, "Not supported"); + return -ENOTSUP; +} + +static bool vfio_user_setup(VFIOContainerBase *bcontainer, Error **errp) +{ + error_setg_errno(errp, ENOTSUP, "Not supported"); + return -ENOTSUP; +} + +static VFIOUserContainer *vfio_user_create_container(Error **errp) +{ + VFIOUserContainer *container; + + container =3D VFIO_IOMMU_USER(object_new(TYPE_VFIO_IOMMU_USER)); + return container; +} + +/* + * Try to mirror vfio_container_connect() as much as possible. + */ +static VFIOUserContainer * +vfio_user_container_connect(AddressSpace *as, Error **errp) +{ + VFIOContainerBase *bcontainer; + VFIOUserContainer *container; + VFIOAddressSpace *space; + VFIOIOMMUClass *vioc; + + space =3D vfio_address_space_get(as); + + container =3D vfio_user_create_container(errp); + if (!container) { + goto put_space_exit; + } + + bcontainer =3D &container->bcontainer; + + if (!vfio_cpr_register_container(bcontainer, errp)) { + goto free_container_exit; + } + + vioc =3D VFIO_IOMMU_GET_CLASS(bcontainer); + assert(vioc->setup); + + if (!vioc->setup(bcontainer, errp)) { + goto unregister_container_exit; + } + + vfio_address_space_insert(space, bcontainer); + + if (!vfio_listener_register(bcontainer, errp)) { + goto listener_release_exit; + } + + bcontainer->initialized =3D true; + + return container; + +listener_release_exit: + vfio_listener_unregister(bcontainer); + if (vioc->release) { + vioc->release(bcontainer); + } + +unregister_container_exit: + vfio_cpr_unregister_container(bcontainer); + +free_container_exit: + object_unref(container); + +put_space_exit: + vfio_address_space_put(space); + + return NULL; +} + +static void vfio_user_container_disconnect(VFIOUserContainer *container) +{ + VFIOContainerBase *bcontainer =3D &container->bcontainer; + VFIOIOMMUClass *vioc =3D VFIO_IOMMU_GET_CLASS(bcontainer); + + vfio_listener_unregister(bcontainer); + if (vioc->release) { + vioc->release(bcontainer); + } + + VFIOAddressSpace *space =3D bcontainer->space; + + vfio_cpr_unregister_container(bcontainer); + object_unref(container); + + vfio_address_space_put(space); +} + +static bool vfio_user_device_get(VFIOUserContainer *container, + VFIODevice *vbasedev, Error **errp) +{ + struct vfio_device_info info =3D { 0 }; + + vbasedev->fd =3D -1; + + vfio_device_prepare(vbasedev, &container->bcontainer, &info); + + return true; +} + +/* + * vfio_user_device_attach: attach a device to a new container. + */ +static bool vfio_user_device_attach(const char *name, VFIODevice *vbasedev, + AddressSpace *as, Error **errp) +{ + VFIOUserContainer *container; + + container =3D vfio_user_container_connect(as, errp); + if (container =3D=3D NULL) { + error_prepend(errp, "failed to connect proxy"); + return false; + } + + return vfio_user_device_get(container, vbasedev, errp); +} + +static void vfio_user_device_detach(VFIODevice *vbasedev) +{ + VFIOUserContainer *container =3D container_of(vbasedev->bcontainer, + VFIOUserContainer, bcontai= ner); + + vfio_device_unprepare(vbasedev); + + vfio_user_container_disconnect(container); +} + +static int vfio_user_pci_hot_reset(VFIODevice *vbasedev, bool single) +{ + /* ->needs_reset is always false for vfio-user. */ + return 0; +} + +static void vfio_iommu_user_class_init(ObjectClass *klass, const void *dat= a) +{ + VFIOIOMMUClass *vioc =3D VFIO_IOMMU_CLASS(klass); + + vioc->setup =3D vfio_user_setup; + vioc->dma_map =3D vfio_user_dma_map; + vioc->dma_unmap =3D vfio_user_dma_unmap; + vioc->attach_device =3D vfio_user_device_attach; + vioc->detach_device =3D vfio_user_device_detach; + vioc->set_dirty_page_tracking =3D vfio_user_set_dirty_page_tracking; + vioc->query_dirty_bitmap =3D vfio_user_query_dirty_bitmap; + vioc->pci_hot_reset =3D vfio_user_pci_hot_reset; +}; + +static const TypeInfo types[] =3D { + { + .name =3D TYPE_VFIO_IOMMU_USER, + .parent =3D TYPE_VFIO_IOMMU, + .instance_size =3D sizeof(VFIOUserContainer), + .class_init =3D vfio_iommu_user_class_init, + }, +}; + +DEFINE_TYPES(types) diff --git a/hw/vfio-user/pci.c b/hw/vfio-user/pci.c new file mode 100644 index 0000000000000000000000000000000000000000..86d705574736586a893feb529a0= 57e799ae610c8 --- /dev/null +++ b/hw/vfio-user/pci.c @@ -0,0 +1,185 @@ +/* + * vfio PCI device over a UNIX socket. + * + * Copyright =C2=A9 2018, 2021 Oracle and/or its affiliates. + * + * SPDX-License-Identifier: GPL-2.0-or-later + */ + +#include +#include "qemu/osdep.h" +#include "qapi-visit-sockets.h" + +#include "hw/qdev-properties.h" +#include "hw/vfio/pci.h" + +#define TYPE_VFIO_USER_PCI "vfio-user-pci" +OBJECT_DECLARE_SIMPLE_TYPE(VFIOUserPCIDevice, VFIO_USER_PCI) + +struct VFIOUserPCIDevice { + VFIOPCIDevice device; + SocketAddress *socket; +}; + +/* + * Emulated devices don't use host hot reset + */ +static void vfio_user_compute_needs_reset(VFIODevice *vbasedev) +{ + vbasedev->needs_reset =3D false; +} + +static Object *vfio_user_pci_get_object(VFIODevice *vbasedev) +{ + VFIOUserPCIDevice *vdev =3D container_of(vbasedev, VFIOUserPCIDevice, + device.vbasedev); + + return OBJECT(vdev); +} + +static VFIODeviceOps vfio_user_pci_ops =3D { + .vfio_compute_needs_reset =3D vfio_user_compute_needs_reset, + .vfio_eoi =3D vfio_pci_intx_eoi, + .vfio_get_object =3D vfio_user_pci_get_object, + /* No live migration support yet. */ + .vfio_save_config =3D NULL, + .vfio_load_config =3D NULL, +}; + +static void vfio_user_pci_realize(PCIDevice *pdev, Error **errp) +{ + ERRP_GUARD(); + VFIOUserPCIDevice *udev =3D VFIO_USER_PCI(pdev); + VFIOPCIDevice *vdev =3D VFIO_PCI_BASE(pdev); + VFIODevice *vbasedev =3D &vdev->vbasedev; + const char *sock_name; + AddressSpace *as; + + if (!udev->socket) { + error_setg(errp, "No socket specified"); + error_append_hint(errp, "e.g. -device '{" + "\"driver\":\"vfio-user-pci\", " + "\"socket\": {\"path\": \"/tmp/vfio-user.sock\", " + "\"type\": \"unix\"}'" + "}'\n"); + return; + } + + sock_name =3D udev->socket->u.q_unix.path; + + vbasedev->name =3D g_strdup_printf("vfio-user:%s", sock_name); + + /* + * vfio-user devices are effectively mdevs (don't use a host iommu). + */ + vbasedev->mdev =3D true; + + as =3D pci_device_iommu_address_space(pdev); + if (!vfio_device_attach_by_iommu_type(TYPE_VFIO_IOMMU_USER, + vbasedev->name, vbasedev, + as, errp)) { + error_prepend(errp, VFIO_MSG_PREFIX, vbasedev->name); + return; + } +} + +static void vfio_user_instance_init(Object *obj) +{ + PCIDevice *pci_dev =3D PCI_DEVICE(obj); + VFIOPCIDevice *vdev =3D VFIO_PCI_BASE(obj); + VFIODevice *vbasedev =3D &vdev->vbasedev; + + device_add_bootindex_property(obj, &vdev->bootindex, + "bootindex", NULL, + &pci_dev->qdev); + vdev->host.domain =3D ~0U; + vdev->host.bus =3D ~0U; + vdev->host.slot =3D ~0U; + vdev->host.function =3D ~0U; + + vfio_device_init(vbasedev, VFIO_DEVICE_TYPE_PCI, &vfio_user_pci_ops, + DEVICE(vdev), false); + + vdev->nv_gpudirect_clique =3D 0xFF; + + /* + * QEMU_PCI_CAP_EXPRESS initialization does not depend on QEMU command + * line, therefore, no need to wait to realize like other devices. + */ + pci_dev->cap_present |=3D QEMU_PCI_CAP_EXPRESS; +} + +static void vfio_user_instance_finalize(Object *obj) +{ + VFIOPCIDevice *vdev =3D VFIO_PCI_BASE(obj); + + vfio_pci_put_device(vdev); +} + +static const Property vfio_user_pci_dev_properties[] =3D { + DEFINE_PROP_UINT32("x-pci-vendor-id", VFIOPCIDevice, + vendor_id, PCI_ANY_ID), + DEFINE_PROP_UINT32("x-pci-device-id", VFIOPCIDevice, + device_id, PCI_ANY_ID), + DEFINE_PROP_UINT32("x-pci-sub-vendor-id", VFIOPCIDevice, + sub_vendor_id, PCI_ANY_ID), + DEFINE_PROP_UINT32("x-pci-sub-device-id", VFIOPCIDevice, + sub_device_id, PCI_ANY_ID), +}; + +static void vfio_user_pci_set_socket(Object *obj, Visitor *v, const char *= name, + void *opaque, Error **errp) +{ + VFIOUserPCIDevice *udev =3D VFIO_USER_PCI(obj); + bool success; + + qapi_free_SocketAddress(udev->socket); + + udev->socket =3D NULL; + + success =3D visit_type_SocketAddress(v, name, &udev->socket, errp); + + if (!success) { + return; + } + + if (udev->socket->type !=3D SOCKET_ADDRESS_TYPE_UNIX) { + error_setg(errp, "Unsupported socket type %s", + SocketAddressType_str(udev->socket->type)); + qapi_free_SocketAddress(udev->socket); + udev->socket =3D NULL; + return; + } +} + +static void vfio_user_pci_dev_class_init(ObjectClass *klass, const void *d= ata) +{ + DeviceClass *dc =3D DEVICE_CLASS(klass); + PCIDeviceClass *pdc =3D PCI_DEVICE_CLASS(klass); + + device_class_set_props(dc, vfio_user_pci_dev_properties); + + object_class_property_add(klass, "socket", "SocketAddress", NULL, + vfio_user_pci_set_socket, NULL, NULL); + object_class_property_set_description(klass, "socket", + "SocketAddress (UNIX sockets onl= y)"); + + dc->desc =3D "VFIO over socket PCI device assignment"; + pdc->realize =3D vfio_user_pci_realize; +} + +static const TypeInfo vfio_user_pci_dev_info =3D { + .name =3D TYPE_VFIO_USER_PCI, + .parent =3D TYPE_VFIO_PCI_BASE, + .instance_size =3D sizeof(VFIOUserPCIDevice), + .class_init =3D vfio_user_pci_dev_class_init, + .instance_init =3D vfio_user_instance_init, + .instance_finalize =3D vfio_user_instance_finalize, +}; + +static void register_vfio_user_dev_type(void) +{ + type_register_static(&vfio_user_pci_dev_info); +} + + type_init(register_vfio_user_dev_type) diff --git a/hw/Kconfig b/hw/Kconfig index 9a86a6a28a643c5ed17d37336ed006cc557f1cda..9e6c789ae7ee518fdea385fcd16= 6c4d8a626b041 100644 --- a/hw/Kconfig +++ b/hw/Kconfig @@ -42,6 +42,7 @@ source ufs/Kconfig source usb/Kconfig source virtio/Kconfig source vfio/Kconfig +source vfio-user/Kconfig source vmapple/Kconfig source xen/Kconfig source watchdog/Kconfig diff --git a/hw/meson.build b/hw/meson.build index b91f761fe08aae3e4732c2e463528d743e1390e1..791ce21ab42afbb4207413ffb21= b080489bf12b8 100644 --- a/hw/meson.build +++ b/hw/meson.build @@ -39,6 +39,7 @@ subdir('uefi') subdir('ufs') subdir('usb') subdir('vfio') +subdir('vfio-user') subdir('virtio') subdir('vmapple') subdir('watchdog') diff --git a/hw/vfio-user/Kconfig b/hw/vfio-user/Kconfig new file mode 100644 index 0000000000000000000000000000000000000000..24bdf7af903bee63e2eeefcbe62= 43d131c0f612d --- /dev/null +++ b/hw/vfio-user/Kconfig @@ -0,0 +1,7 @@ +# SPDX-License-Identifier: GPL-2.0-or-later + +config VFIO_USER + bool + default y + depends on VFIO_PCI + diff --git a/hw/vfio-user/meson.build b/hw/vfio-user/meson.build new file mode 100644 index 0000000000000000000000000000000000000000..b82c558252e82d0f492bd68ab6c= 8b657d5b6b08c --- /dev/null +++ b/hw/vfio-user/meson.build @@ -0,0 +1,9 @@ +# SPDX-License-Identifier: GPL-2.0-or-later + +vfio_user_ss =3D ss.source_set() +vfio_user_ss.add(files( + 'container.c', + 'pci.c', +)) + +system_ss.add_all(when: 'CONFIG_VFIO_USER', if_true: vfio_user_ss) --=20 2.49.0 From nobody Sat Nov 15 14:49:49 2025 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=quarantine dis=none) header.from=redhat.com ARC-Seal: i=1; a=rsa-sha256; t=1750924260; cv=none; d=zohomail.com; s=zohoarc; b=Yoda8/Ab59rxoEz7sNE0MgIed82XwaAr17FbVkQT2MMD7LTPJerXrciRbVfmspCw7OYSJGcZcO7bLMKqgvsIksXsDpcatanGOBYkDO2akIgtI7ziQoZSikmJETEkbLeIjqZhLcxU3efWaIcPBST1092f77dUCwqIu8sDKey0oFw= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1750924260; h=Content-Type:Content-Transfer-Encoding:Cc:Cc:Date:Date:From:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:Subject:To:To:Message-Id:Reply-To; bh=4njr8PwYAhNczIr5/nlvFNXwpt2ofdxrbzvEMNzQSNw=; b=kG0vIMnhplDHVsxcq8WP3ht96p29AcWBwtsmb0yRHyFjShkc1ODN9H+cIMIY33/oo6arKbhvi4L7PIMDoD2ON/DilzAkkm/NCjgiXtQNxNMeKiYDBNJC3iFhu5M8JvNjsVm90g2IF/s1l1gQUJLf6Nk7niQsy57EzDE0gr0VGUY= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass header.from= (p=quarantine dis=none) Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1750924260456619.7648458650352; Thu, 26 Jun 2025 00:51:00 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1uUhJI-0004VJ-OF; Thu, 26 Jun 2025 03:46:19 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1uUhJ6-0004QG-9z for qemu-devel@nongnu.org; Thu, 26 Jun 2025 03:46:08 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.129.124]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1uUhJ3-0001wu-Sh for qemu-devel@nongnu.org; Thu, 26 Jun 2025 03:46:04 -0400 Received: from mx-prod-mc-04.mail-002.prod.us-west-2.aws.redhat.com (ec2-54-186-198-63.us-west-2.compute.amazonaws.com [54.186.198.63]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-193-3eXKpNt4OtG71EttOO_R6g-1; Thu, 26 Jun 2025 03:45:57 -0400 Received: from mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.93]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-04.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id B316219560AE; Thu, 26 Jun 2025 07:45:55 +0000 (UTC) Received: from corto.redhat.com (unknown [10.44.32.51]) by mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id 4EFF8180035C; Thu, 26 Jun 2025 07:45:53 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1750923959; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=4njr8PwYAhNczIr5/nlvFNXwpt2ofdxrbzvEMNzQSNw=; b=UJuP5RPGTU+Dwc78tkX8YkTIgicQrYcjs9VQT/vQcenIDtPZpHTAwy3YaI0L7QFASQfbPz dCb56jHdQVurpM3IwkQDkRTflBfi+sgIzDGrvXtT5syJiTimQveD8Em8s8wAYWxXWsQ7pV +ZLinRqafWKAhuvPt1HP725KE62Aucw= X-MC-Unique: 3eXKpNt4OtG71EttOO_R6g-1 X-Mimecast-MFC-AGG-ID: 3eXKpNt4OtG71EttOO_R6g_1750923955 From: =?UTF-8?q?C=C3=A9dric=20Le=20Goater?= To: qemu-devel@nongnu.org Cc: Alex Williamson , John Levon , John Johnson , Elena Ufimtseva , Jagannathan Raman , =?UTF-8?q?C=C3=A9dric=20Le=20Goater?= Subject: [PULL 08/25] vfio-user: connect vfio proxy to remote server Date: Thu, 26 Jun 2025 09:45:12 +0200 Message-ID: <20250626074529.1384114-9-clg@redhat.com> In-Reply-To: <20250626074529.1384114-1-clg@redhat.com> References: <20250626074529.1384114-1-clg@redhat.com> MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.93 Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=170.10.129.124; envelope-from=clg@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H5=0.001, RCVD_IN_MSPIKE_WL=0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, RCVD_IN_VALIDITY_SAFE_BLOCKED=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZohoMail-DKIM: pass (identity @redhat.com) X-ZM-MESSAGEID: 1750924262950116600 From: John Levon Introduce the vfio-user "proxy": this is the client code responsible for sending and receiving vfio-user messages across the control socket. The new files hw/vfio-user/proxy.[ch] contain some basic plumbing for managing the proxy; initialize the proxy during realization of the VFIOUserPCIDevice instance. Originally-by: John Johnson Signed-off-by: Elena Ufimtseva Signed-off-by: Jagannathan Raman Signed-off-by: John Levon Reviewed-by: C=C3=A9dric Le Goater Link: https://lore.kernel.org/qemu-devel/20250625193012.2316242-3-john.levo= n@nutanix.com Signed-off-by: C=C3=A9dric Le Goater --- hw/vfio-user/proxy.h | 79 +++++++++++++++++ include/hw/vfio/vfio-device.h | 2 + hw/vfio-user/pci.c | 22 +++++ hw/vfio-user/proxy.c | 162 ++++++++++++++++++++++++++++++++++ hw/vfio-user/meson.build | 1 + 5 files changed, 266 insertions(+) create mode 100644 hw/vfio-user/proxy.h create mode 100644 hw/vfio-user/proxy.c diff --git a/hw/vfio-user/proxy.h b/hw/vfio-user/proxy.h new file mode 100644 index 0000000000000000000000000000000000000000..a9bce82239d0e9dc40a892e1542= 488cfffc4f3da --- /dev/null +++ b/hw/vfio-user/proxy.h @@ -0,0 +1,79 @@ +#ifndef VFIO_USER_PROXY_H +#define VFIO_USER_PROXY_H + +/* + * vfio protocol over a UNIX socket. + * + * Copyright =C2=A9 2018, 2021 Oracle and/or its affiliates. + * + * SPDX-License-Identifier: GPL-2.0-or-later + */ + +#include "io/channel.h" +#include "io/channel-socket.h" + +typedef struct { + int send_fds; + int recv_fds; + int *fds; +} VFIOUserFDs; + +enum msg_type { + VFIO_MSG_NONE, + VFIO_MSG_ASYNC, + VFIO_MSG_WAIT, + VFIO_MSG_NOWAIT, + VFIO_MSG_REQ, +}; + +typedef struct VFIOUserMsg { + QTAILQ_ENTRY(VFIOUserMsg) next; + VFIOUserFDs *fds; + uint32_t rsize; + uint32_t id; + QemuCond cv; + bool complete; + enum msg_type type; +} VFIOUserMsg; + + +enum proxy_state { + VFIO_PROXY_CONNECTED =3D 1, + VFIO_PROXY_ERROR =3D 2, + VFIO_PROXY_CLOSING =3D 3, + VFIO_PROXY_CLOSED =3D 4, +}; + +typedef QTAILQ_HEAD(VFIOUserMsgQ, VFIOUserMsg) VFIOUserMsgQ; + +typedef struct VFIOUserProxy { + QLIST_ENTRY(VFIOUserProxy) next; + char *sockname; + struct QIOChannel *ioc; + void (*request)(void *opaque, VFIOUserMsg *msg); + void *req_arg; + int flags; + QemuCond close_cv; + AioContext *ctx; + QEMUBH *req_bh; + + /* + * above only changed when BQL is held + * below are protected by per-proxy lock + */ + QemuMutex lock; + VFIOUserMsgQ free; + VFIOUserMsgQ pending; + VFIOUserMsgQ incoming; + VFIOUserMsgQ outgoing; + VFIOUserMsg *last_nowait; + enum proxy_state state; +} VFIOUserProxy; + +/* VFIOProxy flags */ +#define VFIO_PROXY_CLIENT 0x1 + +VFIOUserProxy *vfio_user_connect_dev(SocketAddress *addr, Error **errp); +void vfio_user_disconnect(VFIOUserProxy *proxy); + +#endif /* VFIO_USER_PROXY_H */ diff --git a/include/hw/vfio/vfio-device.h b/include/hw/vfio/vfio-device.h index 959e458d7f7e9facd8216fbd3695d9d97dbc667a..c616652ee72265c637cb5136fdf= fb4444639e0b7 100644 --- a/include/hw/vfio/vfio-device.h +++ b/include/hw/vfio/vfio-device.h @@ -47,6 +47,7 @@ typedef struct VFIOMigration VFIOMigration; =20 typedef struct IOMMUFDBackend IOMMUFDBackend; typedef struct VFIOIOASHwpt VFIOIOASHwpt; +typedef struct VFIOUserProxy VFIOUserProxy; =20 typedef struct VFIODevice { QLIST_ENTRY(VFIODevice) next; @@ -88,6 +89,7 @@ typedef struct VFIODevice { struct vfio_region_info **reginfo; int *region_fds; VFIODeviceCPR cpr; + VFIOUserProxy *proxy; } VFIODevice; =20 struct VFIODeviceOps { diff --git a/hw/vfio-user/pci.c b/hw/vfio-user/pci.c index 86d705574736586a893feb529a057e799ae610c8..642421e7916aad01a894ee390c4= 3453dd5297356 100644 --- a/hw/vfio-user/pci.c +++ b/hw/vfio-user/pci.c @@ -12,6 +12,7 @@ =20 #include "hw/qdev-properties.h" #include "hw/vfio/pci.h" +#include "hw/vfio-user/proxy.h" =20 #define TYPE_VFIO_USER_PCI "vfio-user-pci" OBJECT_DECLARE_SIMPLE_TYPE(VFIOUserPCIDevice, VFIO_USER_PCI) @@ -54,6 +55,8 @@ static void vfio_user_pci_realize(PCIDevice *pdev, Error = **errp) VFIODevice *vbasedev =3D &vdev->vbasedev; const char *sock_name; AddressSpace *as; + SocketAddress addr; + VFIOUserProxy *proxy; =20 if (!udev->socket) { error_setg(errp, "No socket specified"); @@ -69,6 +72,15 @@ static void vfio_user_pci_realize(PCIDevice *pdev, Error= **errp) =20 vbasedev->name =3D g_strdup_printf("vfio-user:%s", sock_name); =20 + memset(&addr, 0, sizeof(addr)); + addr.type =3D SOCKET_ADDRESS_TYPE_UNIX; + addr.u.q_unix.path =3D (char *)sock_name; + proxy =3D vfio_user_connect_dev(&addr, errp); + if (!proxy) { + return; + } + vbasedev->proxy =3D proxy; + /* * vfio-user devices are effectively mdevs (don't use a host iommu). */ @@ -112,8 +124,13 @@ static void vfio_user_instance_init(Object *obj) static void vfio_user_instance_finalize(Object *obj) { VFIOPCIDevice *vdev =3D VFIO_PCI_BASE(obj); + VFIODevice *vbasedev =3D &vdev->vbasedev; =20 vfio_pci_put_device(vdev); + + if (vbasedev->proxy !=3D NULL) { + vfio_user_disconnect(vbasedev->proxy); + } } =20 static const Property vfio_user_pci_dev_properties[] =3D { @@ -133,6 +150,11 @@ static void vfio_user_pci_set_socket(Object *obj, Visi= tor *v, const char *name, VFIOUserPCIDevice *udev =3D VFIO_USER_PCI(obj); bool success; =20 + if (udev->device.vbasedev.proxy) { + error_setg(errp, "Proxy is connected"); + return; + } + qapi_free_SocketAddress(udev->socket); =20 udev->socket =3D NULL; diff --git a/hw/vfio-user/proxy.c b/hw/vfio-user/proxy.c new file mode 100644 index 0000000000000000000000000000000000000000..bb436c9db91b6ebd90542b4121b= 62d56bbbdf4ab --- /dev/null +++ b/hw/vfio-user/proxy.c @@ -0,0 +1,162 @@ +/* + * vfio protocol over a UNIX socket. + * + * Copyright =C2=A9 2018, 2021 Oracle and/or its affiliates. + * + * SPDX-License-Identifier: GPL-2.0-or-later + */ + +#include "qemu/osdep.h" +#include + +#include "hw/vfio/vfio-device.h" +#include "hw/vfio-user/proxy.h" +#include "qapi/error.h" +#include "qemu/error-report.h" +#include "qemu/lockable.h" +#include "system/iothread.h" + +static IOThread *vfio_user_iothread; + +static void vfio_user_shutdown(VFIOUserProxy *proxy); + + +/* + * Functions called by main, CPU, or iothread threads + */ + +static void vfio_user_shutdown(VFIOUserProxy *proxy) +{ + qio_channel_shutdown(proxy->ioc, QIO_CHANNEL_SHUTDOWN_READ, NULL); + qio_channel_set_aio_fd_handler(proxy->ioc, proxy->ctx, NULL, + proxy->ctx, NULL, NULL); +} + +/* + * Functions only called by iothread + */ + +static void vfio_user_cb(void *opaque) +{ + VFIOUserProxy *proxy =3D opaque; + + QEMU_LOCK_GUARD(&proxy->lock); + + proxy->state =3D VFIO_PROXY_CLOSED; + qemu_cond_signal(&proxy->close_cv); +} + + +/* + * Functions called by main or CPU threads + */ + +static QLIST_HEAD(, VFIOUserProxy) vfio_user_sockets =3D + QLIST_HEAD_INITIALIZER(vfio_user_sockets); + +VFIOUserProxy *vfio_user_connect_dev(SocketAddress *addr, Error **errp) +{ + VFIOUserProxy *proxy; + QIOChannelSocket *sioc; + QIOChannel *ioc; + char *sockname; + + if (addr->type !=3D SOCKET_ADDRESS_TYPE_UNIX) { + error_setg(errp, "vfio_user_connect - bad address family"); + return NULL; + } + sockname =3D addr->u.q_unix.path; + + sioc =3D qio_channel_socket_new(); + ioc =3D QIO_CHANNEL(sioc); + if (qio_channel_socket_connect_sync(sioc, addr, errp)) { + object_unref(OBJECT(ioc)); + return NULL; + } + qio_channel_set_blocking(ioc, false, NULL); + + proxy =3D g_malloc0(sizeof(VFIOUserProxy)); + proxy->sockname =3D g_strdup_printf("unix:%s", sockname); + proxy->ioc =3D ioc; + proxy->flags =3D VFIO_PROXY_CLIENT; + proxy->state =3D VFIO_PROXY_CONNECTED; + + qemu_mutex_init(&proxy->lock); + qemu_cond_init(&proxy->close_cv); + + if (vfio_user_iothread =3D=3D NULL) { + vfio_user_iothread =3D iothread_create("VFIO user", errp); + } + + proxy->ctx =3D iothread_get_aio_context(vfio_user_iothread); + + QTAILQ_INIT(&proxy->outgoing); + QTAILQ_INIT(&proxy->incoming); + QTAILQ_INIT(&proxy->free); + QTAILQ_INIT(&proxy->pending); + QLIST_INSERT_HEAD(&vfio_user_sockets, proxy, next); + + return proxy; +} + +void vfio_user_disconnect(VFIOUserProxy *proxy) +{ + VFIOUserMsg *r1, *r2; + + qemu_mutex_lock(&proxy->lock); + + /* our side is quitting */ + if (proxy->state =3D=3D VFIO_PROXY_CONNECTED) { + vfio_user_shutdown(proxy); + if (!QTAILQ_EMPTY(&proxy->pending)) { + error_printf("vfio_user_disconnect: outstanding requests\n"); + } + } + object_unref(OBJECT(proxy->ioc)); + proxy->ioc =3D NULL; + + proxy->state =3D VFIO_PROXY_CLOSING; + QTAILQ_FOREACH_SAFE(r1, &proxy->outgoing, next, r2) { + qemu_cond_destroy(&r1->cv); + QTAILQ_REMOVE(&proxy->outgoing, r1, next); + g_free(r1); + } + QTAILQ_FOREACH_SAFE(r1, &proxy->incoming, next, r2) { + qemu_cond_destroy(&r1->cv); + QTAILQ_REMOVE(&proxy->incoming, r1, next); + g_free(r1); + } + QTAILQ_FOREACH_SAFE(r1, &proxy->pending, next, r2) { + qemu_cond_destroy(&r1->cv); + QTAILQ_REMOVE(&proxy->pending, r1, next); + g_free(r1); + } + QTAILQ_FOREACH_SAFE(r1, &proxy->free, next, r2) { + qemu_cond_destroy(&r1->cv); + QTAILQ_REMOVE(&proxy->free, r1, next); + g_free(r1); + } + + /* + * Make sure the iothread isn't blocking anywhere + * with a ref to this proxy by waiting for a BH + * handler to run after the proxy fd handlers were + * deleted above. + */ + aio_bh_schedule_oneshot(proxy->ctx, vfio_user_cb, proxy); + qemu_cond_wait(&proxy->close_cv, &proxy->lock); + + /* we now hold the only ref to proxy */ + qemu_mutex_unlock(&proxy->lock); + qemu_cond_destroy(&proxy->close_cv); + qemu_mutex_destroy(&proxy->lock); + + QLIST_REMOVE(proxy, next); + if (QLIST_EMPTY(&vfio_user_sockets)) { + iothread_destroy(vfio_user_iothread); + vfio_user_iothread =3D NULL; + } + + g_free(proxy->sockname); + g_free(proxy); +} diff --git a/hw/vfio-user/meson.build b/hw/vfio-user/meson.build index b82c558252e82d0f492bd68ab6c8b657d5b6b08c..9e85a8ea51c3402d828df0b78d5= 34a4adde193a2 100644 --- a/hw/vfio-user/meson.build +++ b/hw/vfio-user/meson.build @@ -4,6 +4,7 @@ vfio_user_ss =3D ss.source_set() vfio_user_ss.add(files( 'container.c', 'pci.c', + 'proxy.c', )) =20 system_ss.add_all(when: 'CONFIG_VFIO_USER', if_true: vfio_user_ss) --=20 2.49.0 From nobody Sat Nov 15 14:49:49 2025 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=quarantine dis=none) header.from=redhat.com ARC-Seal: i=1; a=rsa-sha256; t=1750924080; cv=none; d=zohomail.com; s=zohoarc; b=Hgr64oNAFscvScAvS0z8yhpcN1b0BDvzY5sdEkP/dFdM9/AZ3V+lae61fjJH2xHSo11BHj9S5z0+8sAITBwBu3XP+n+0DDbNMn0jQzuU8g6loRth0TDhEe4JnXijvxg8MuZ7ETYoe+wYrFON14MK6WYOT/mgxMxLr4RFS9eGeLQ= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1750924080; h=Content-Type:Content-Transfer-Encoding:Cc:Cc:Date:Date:From:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:Subject:To:To:Message-Id:Reply-To; bh=U2n7Dhy5Yznd0RJTImonPlt60nglT5ze/kfjz6hLk5I=; b=TIjoU5fpnv2763KuRAWxsSN8Kw0rvKTaBKLpnXG8jpqNuC2HLBYeBfd1Zv494cDiyAseZOlqGXqKRWneMwaECPJsEOC9BUKGYD8vbjUe3xtMBIQCObWJvKZG45oyRnnCxaUP4nhpPgUj5n0yJBYpx5uKYVjsT7/4MpXa9Jt0Wns= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass header.from= (p=quarantine dis=none) Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1750924080307256.6133085944075; Thu, 26 Jun 2025 00:48:00 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1uUhJS-0004fG-06; Thu, 26 Jun 2025 03:46:26 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1uUhJF-0004Uk-Jv for qemu-devel@nongnu.org; Thu, 26 Jun 2025 03:46:13 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.129.124]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1uUhJ9-0001xR-3S for qemu-devel@nongnu.org; Thu, 26 Jun 2025 03:46:13 -0400 Received: from mx-prod-mc-05.mail-002.prod.us-west-2.aws.redhat.com (ec2-54-186-198-63.us-west-2.compute.amazonaws.com [54.186.198.63]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-271-fMyTph_gNNy-kfrObnOaDw-1; Thu, 26 Jun 2025 03:46:01 -0400 Received: from mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.93]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-05.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id DE0A81944AA6; Thu, 26 Jun 2025 07:45:58 +0000 (UTC) Received: from corto.redhat.com (unknown [10.44.32.51]) by mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id 3AFE2180045B; Thu, 26 Jun 2025 07:45:55 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1750923965; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=U2n7Dhy5Yznd0RJTImonPlt60nglT5ze/kfjz6hLk5I=; b=UWZbPdMtTYZgR3DKZhF/vyZ6uw+ma97ucoa+j8wKTHyWXTu/aHFJrXAOGf5bDDjWUONvN4 PeCebm5jHShCYintfAfHIdGHOIiPtCAaGdxChyW9sEnPFL93Dn47HBlXkwk1JwpSDTeGGh cE90Oa/phiTnSNCds6AEM5vVHVDKuPA= X-MC-Unique: fMyTph_gNNy-kfrObnOaDw-1 X-Mimecast-MFC-AGG-ID: fMyTph_gNNy-kfrObnOaDw_1750923959 From: =?UTF-8?q?C=C3=A9dric=20Le=20Goater?= To: qemu-devel@nongnu.org Cc: Alex Williamson , John Levon , John Johnson , Elena Ufimtseva , Jagannathan Raman , =?UTF-8?q?C=C3=A9dric=20Le=20Goater?= Subject: [PULL 09/25] vfio-user: implement message receive infrastructure Date: Thu, 26 Jun 2025 09:45:13 +0200 Message-ID: <20250626074529.1384114-10-clg@redhat.com> In-Reply-To: <20250626074529.1384114-1-clg@redhat.com> References: <20250626074529.1384114-1-clg@redhat.com> MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.93 Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=170.10.129.124; envelope-from=clg@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H5=0.001, RCVD_IN_MSPIKE_WL=0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, RCVD_IN_VALIDITY_SAFE_BLOCKED=0.001, SPF_HELO_NONE=0.001, T_SPF_TEMPERROR=0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZohoMail-DKIM: pass (identity @redhat.com) X-ZM-MESSAGEID: 1750924082662116600 From: John Levon Add the basic implementation for receiving vfio-user messages from the control socket. Originally-by: John Johnson Signed-off-by: Elena Ufimtseva Signed-off-by: Jagannathan Raman Signed-off-by: John Levon Reviewed-by: C=C3=A9dric Le Goater Link: https://lore.kernel.org/qemu-devel/20250625193012.2316242-4-john.levo= n@nutanix.com Signed-off-by: C=C3=A9dric Le Goater --- meson.build | 1 + hw/vfio-user/protocol.h | 53 +++++ hw/vfio-user/proxy.h | 11 + hw/vfio-user/trace.h | 4 + hw/vfio-user/pci.c | 11 + hw/vfio-user/proxy.c | 411 ++++++++++++++++++++++++++++++++++++++ hw/vfio-user/trace-events | 8 + 7 files changed, 499 insertions(+) create mode 100644 hw/vfio-user/protocol.h create mode 100644 hw/vfio-user/trace.h create mode 100644 hw/vfio-user/trace-events diff --git a/meson.build b/meson.build index 4676908dbb2006181355785c7262a85df52fc87c..dbc97bfdf7a0d888e270056d52b= 565c494878646 100644 --- a/meson.build +++ b/meson.build @@ -3683,6 +3683,7 @@ if have_system 'hw/ufs', 'hw/usb', 'hw/vfio', + 'hw/vfio-user', 'hw/virtio', 'hw/vmapple', 'hw/watchdog', diff --git a/hw/vfio-user/protocol.h b/hw/vfio-user/protocol.h new file mode 100644 index 0000000000000000000000000000000000000000..4ddfb5f222e60c3bd212110bba6= 855b81f191d27 --- /dev/null +++ b/hw/vfio-user/protocol.h @@ -0,0 +1,53 @@ +#ifndef VFIO_USER_PROTOCOL_H +#define VFIO_USER_PROTOCOL_H + +/* + * vfio protocol over a UNIX socket. + * + * Copyright =C2=A9 2018, 2021 Oracle and/or its affiliates. + * + * Each message has a standard header that describes the command + * being sent, which is almost always a VFIO ioctl(). + * + * The header may be followed by command-specific data, such as the + * region and offset info for read and write commands. + * + * SPDX-License-Identifier: GPL-2.0-or-later + */ + +typedef struct { + uint16_t id; + uint16_t command; + uint32_t size; + uint32_t flags; + uint32_t error_reply; +} VFIOUserHdr; + +/* VFIOUserHdr commands */ +enum vfio_user_command { + VFIO_USER_VERSION =3D 1, + VFIO_USER_DMA_MAP =3D 2, + VFIO_USER_DMA_UNMAP =3D 3, + VFIO_USER_DEVICE_GET_INFO =3D 4, + VFIO_USER_DEVICE_GET_REGION_INFO =3D 5, + VFIO_USER_DEVICE_GET_REGION_IO_FDS =3D 6, + VFIO_USER_DEVICE_GET_IRQ_INFO =3D 7, + VFIO_USER_DEVICE_SET_IRQS =3D 8, + VFIO_USER_REGION_READ =3D 9, + VFIO_USER_REGION_WRITE =3D 10, + VFIO_USER_DMA_READ =3D 11, + VFIO_USER_DMA_WRITE =3D 12, + VFIO_USER_DEVICE_RESET =3D 13, + VFIO_USER_DIRTY_PAGES =3D 14, + VFIO_USER_MAX, +}; + +/* VFIOUserHdr flags */ +#define VFIO_USER_REQUEST 0x0 +#define VFIO_USER_REPLY 0x1 +#define VFIO_USER_TYPE 0xF + +#define VFIO_USER_NO_REPLY 0x10 +#define VFIO_USER_ERROR 0x20 + +#endif /* VFIO_USER_PROTOCOL_H */ diff --git a/hw/vfio-user/proxy.h b/hw/vfio-user/proxy.h index a9bce82239d0e9dc40a892e1542488cfffc4f3da..ff553cad9d50ce2ba4867d910ad= f92836dc068b5 100644 --- a/hw/vfio-user/proxy.h +++ b/hw/vfio-user/proxy.h @@ -12,6 +12,9 @@ #include "io/channel.h" #include "io/channel-socket.h" =20 +#include "qemu/sockets.h" +#include "hw/vfio-user/protocol.h" + typedef struct { int send_fds; int recv_fds; @@ -28,6 +31,7 @@ enum msg_type { =20 typedef struct VFIOUserMsg { QTAILQ_ENTRY(VFIOUserMsg) next; + VFIOUserHdr *hdr; VFIOUserFDs *fds; uint32_t rsize; uint32_t id; @@ -67,13 +71,20 @@ typedef struct VFIOUserProxy { VFIOUserMsgQ incoming; VFIOUserMsgQ outgoing; VFIOUserMsg *last_nowait; + VFIOUserMsg *part_recv; + size_t recv_left; enum proxy_state state; } VFIOUserProxy; =20 /* VFIOProxy flags */ #define VFIO_PROXY_CLIENT 0x1 =20 +typedef struct VFIODevice VFIODevice; + VFIOUserProxy *vfio_user_connect_dev(SocketAddress *addr, Error **errp); void vfio_user_disconnect(VFIOUserProxy *proxy); +void vfio_user_set_handler(VFIODevice *vbasedev, + void (*handler)(void *opaque, VFIOUserMsg *msg), + void *reqarg); =20 #endif /* VFIO_USER_PROXY_H */ diff --git a/hw/vfio-user/trace.h b/hw/vfio-user/trace.h new file mode 100644 index 0000000000000000000000000000000000000000..9cf02d9506dd3d7e5a1d2344977= 61f7c545aca76 --- /dev/null +++ b/hw/vfio-user/trace.h @@ -0,0 +1,4 @@ +/* + * SPDX-License-Identifier: GPL-2.0-or-later + */ +#include "trace/trace-hw_vfio_user.h" diff --git a/hw/vfio-user/pci.c b/hw/vfio-user/pci.c index 642421e7916aad01a894ee390c43453dd5297356..bad2829f5cfc76ef5716a67ffb0= 00c8d78ed360d 100644 --- a/hw/vfio-user/pci.c +++ b/hw/vfio-user/pci.c @@ -22,6 +22,16 @@ struct VFIOUserPCIDevice { SocketAddress *socket; }; =20 +/* + * Incoming request message callback. + * + * Runs off main loop, so BQL held. + */ +static void vfio_user_pci_process_req(void *opaque, VFIOUserMsg *msg) +{ + +} + /* * Emulated devices don't use host hot reset */ @@ -80,6 +90,7 @@ static void vfio_user_pci_realize(PCIDevice *pdev, Error = **errp) return; } vbasedev->proxy =3D proxy; + vfio_user_set_handler(vbasedev, vfio_user_pci_process_req, vdev); =20 /* * vfio-user devices are effectively mdevs (don't use a host iommu). diff --git a/hw/vfio-user/proxy.c b/hw/vfio-user/proxy.c index bb436c9db91b6ebd90542b4121b62d56bbbdf4ab..349ea2b27c9877993125f4f0f3c= a27cb4df9b1b3 100644 --- a/hw/vfio-user/proxy.c +++ b/hw/vfio-user/proxy.c @@ -11,15 +11,31 @@ =20 #include "hw/vfio/vfio-device.h" #include "hw/vfio-user/proxy.h" +#include "hw/vfio-user/trace.h" #include "qapi/error.h" #include "qemu/error-report.h" #include "qemu/lockable.h" +#include "qemu/main-loop.h" #include "system/iothread.h" =20 static IOThread *vfio_user_iothread; =20 static void vfio_user_shutdown(VFIOUserProxy *proxy); +static VFIOUserMsg *vfio_user_getmsg(VFIOUserProxy *proxy, VFIOUserHdr *hd= r, + VFIOUserFDs *fds); +static VFIOUserFDs *vfio_user_getfds(int numfds); +static void vfio_user_recycle(VFIOUserProxy *proxy, VFIOUserMsg *msg); =20 +static void vfio_user_recv(void *opaque); +static void vfio_user_cb(void *opaque); + +static void vfio_user_request(void *opaque); + +static inline void vfio_user_set_error(VFIOUserHdr *hdr, uint32_t err) +{ + hdr->flags |=3D VFIO_USER_ERROR; + hdr->error_reply =3D err; +} =20 /* * Functions called by main, CPU, or iothread threads @@ -32,10 +48,343 @@ static void vfio_user_shutdown(VFIOUserProxy *proxy) proxy->ctx, NULL, NULL); } =20 +static VFIOUserMsg *vfio_user_getmsg(VFIOUserProxy *proxy, VFIOUserHdr *hd= r, + VFIOUserFDs *fds) +{ + VFIOUserMsg *msg; + + msg =3D QTAILQ_FIRST(&proxy->free); + if (msg !=3D NULL) { + QTAILQ_REMOVE(&proxy->free, msg, next); + } else { + msg =3D g_malloc0(sizeof(*msg)); + qemu_cond_init(&msg->cv); + } + + msg->hdr =3D hdr; + msg->fds =3D fds; + return msg; +} + +/* + * Recycle a message list entry to the free list. + */ +static void vfio_user_recycle(VFIOUserProxy *proxy, VFIOUserMsg *msg) +{ + if (msg->type =3D=3D VFIO_MSG_NONE) { + error_printf("vfio_user_recycle - freeing free msg\n"); + return; + } + + /* free msg buffer if no one is waiting to consume the reply */ + if (msg->type =3D=3D VFIO_MSG_NOWAIT || msg->type =3D=3D VFIO_MSG_ASYN= C) { + g_free(msg->hdr); + if (msg->fds !=3D NULL) { + g_free(msg->fds); + } + } + + msg->type =3D VFIO_MSG_NONE; + msg->hdr =3D NULL; + msg->fds =3D NULL; + msg->complete =3D false; + QTAILQ_INSERT_HEAD(&proxy->free, msg, next); +} + +static VFIOUserFDs *vfio_user_getfds(int numfds) +{ + VFIOUserFDs *fds =3D g_malloc0(sizeof(*fds) + (numfds * sizeof(int))); + + fds->fds =3D (int *)((char *)fds + sizeof(*fds)); + + return fds; +} + /* * Functions only called by iothread */ =20 +/* + * Process a received message. + */ +static void vfio_user_process(VFIOUserProxy *proxy, VFIOUserMsg *msg, + bool isreply) +{ + + /* + * Replies signal a waiter, if none just check for errors + * and free the message buffer. + * + * Requests get queued for the BH. + */ + if (isreply) { + msg->complete =3D true; + if (msg->type =3D=3D VFIO_MSG_WAIT) { + qemu_cond_signal(&msg->cv); + } else { + if (msg->hdr->flags & VFIO_USER_ERROR) { + error_printf("vfio_user_process: error reply on async "); + error_printf("request command %x error %s\n", + msg->hdr->command, + strerror(msg->hdr->error_reply)); + } + /* youngest nowait msg has been ack'd */ + if (proxy->last_nowait =3D=3D msg) { + proxy->last_nowait =3D NULL; + } + vfio_user_recycle(proxy, msg); + } + } else { + QTAILQ_INSERT_TAIL(&proxy->incoming, msg, next); + qemu_bh_schedule(proxy->req_bh); + } +} + +/* + * Complete a partial message read + */ +static int vfio_user_complete(VFIOUserProxy *proxy, Error **errp) +{ + VFIOUserMsg *msg =3D proxy->part_recv; + size_t msgleft =3D proxy->recv_left; + bool isreply; + char *data; + int ret; + + data =3D (char *)msg->hdr + (msg->hdr->size - msgleft); + while (msgleft > 0) { + ret =3D qio_channel_read(proxy->ioc, data, msgleft, errp); + + /* error or would block */ + if (ret <=3D 0) { + /* try for rest on next iternation */ + if (ret =3D=3D QIO_CHANNEL_ERR_BLOCK) { + proxy->recv_left =3D msgleft; + } + return ret; + } + trace_vfio_user_recv_read(msg->hdr->id, ret); + + msgleft -=3D ret; + data +=3D ret; + } + + /* + * Read complete message, process it. + */ + proxy->part_recv =3D NULL; + proxy->recv_left =3D 0; + isreply =3D (msg->hdr->flags & VFIO_USER_TYPE) =3D=3D VFIO_USER_REPLY; + vfio_user_process(proxy, msg, isreply); + + /* return positive value */ + return 1; +} + +/* + * Receive and process one incoming message. + * + * For replies, find matching outgoing request and wake any waiters. + * For requests, queue in incoming list and run request BH. + */ +static int vfio_user_recv_one(VFIOUserProxy *proxy, Error **errp) +{ + VFIOUserMsg *msg =3D NULL; + g_autofree int *fdp =3D NULL; + VFIOUserFDs *reqfds; + VFIOUserHdr hdr; + struct iovec iov =3D { + .iov_base =3D &hdr, + .iov_len =3D sizeof(hdr), + }; + bool isreply =3D false; + int i, ret; + size_t msgleft, numfds =3D 0; + char *data =3D NULL; + char *buf =3D NULL; + + /* + * Complete any partial reads + */ + if (proxy->part_recv !=3D NULL) { + ret =3D vfio_user_complete(proxy, errp); + + /* still not complete, try later */ + if (ret =3D=3D QIO_CHANNEL_ERR_BLOCK) { + return ret; + } + + if (ret <=3D 0) { + goto fatal; + } + /* else fall into reading another msg */ + } + + /* + * Read header + */ + ret =3D qio_channel_readv_full(proxy->ioc, &iov, 1, &fdp, &numfds, 0, + errp); + if (ret =3D=3D QIO_CHANNEL_ERR_BLOCK) { + return ret; + } + + /* read error or other side closed connection */ + if (ret <=3D 0) { + goto fatal; + } + + if (ret < sizeof(hdr)) { + error_setg(errp, "short read of header"); + goto fatal; + } + + /* + * Validate header + */ + if (hdr.size < sizeof(VFIOUserHdr)) { + error_setg(errp, "bad header size"); + goto fatal; + } + switch (hdr.flags & VFIO_USER_TYPE) { + case VFIO_USER_REQUEST: + isreply =3D false; + break; + case VFIO_USER_REPLY: + isreply =3D true; + break; + default: + error_setg(errp, "unknown message type"); + goto fatal; + } + trace_vfio_user_recv_hdr(proxy->sockname, hdr.id, hdr.command, hdr.siz= e, + hdr.flags); + + /* + * For replies, find the matching pending request. + * For requests, reap incoming FDs. + */ + if (isreply) { + QTAILQ_FOREACH(msg, &proxy->pending, next) { + if (hdr.id =3D=3D msg->id) { + break; + } + } + if (msg =3D=3D NULL) { + error_setg(errp, "unexpected reply"); + goto err; + } + QTAILQ_REMOVE(&proxy->pending, msg, next); + + /* + * Process any received FDs + */ + if (numfds !=3D 0) { + if (msg->fds =3D=3D NULL || msg->fds->recv_fds < numfds) { + error_setg(errp, "unexpected FDs"); + goto err; + } + msg->fds->recv_fds =3D numfds; + memcpy(msg->fds->fds, fdp, numfds * sizeof(int)); + } + } else { + if (numfds !=3D 0) { + reqfds =3D vfio_user_getfds(numfds); + memcpy(reqfds->fds, fdp, numfds * sizeof(int)); + } else { + reqfds =3D NULL; + } + } + + /* + * Put the whole message into a single buffer. + */ + if (isreply) { + if (hdr.size > msg->rsize) { + error_setg(errp, "reply larger than recv buffer"); + goto err; + } + *msg->hdr =3D hdr; + data =3D (char *)msg->hdr + sizeof(hdr); + } else { + buf =3D g_malloc0(hdr.size); + memcpy(buf, &hdr, sizeof(hdr)); + data =3D buf + sizeof(hdr); + msg =3D vfio_user_getmsg(proxy, (VFIOUserHdr *)buf, reqfds); + msg->type =3D VFIO_MSG_REQ; + } + + /* + * Read rest of message. + */ + msgleft =3D hdr.size - sizeof(hdr); + while (msgleft > 0) { + ret =3D qio_channel_read(proxy->ioc, data, msgleft, errp); + + /* prepare to complete read on next iternation */ + if (ret =3D=3D QIO_CHANNEL_ERR_BLOCK) { + proxy->part_recv =3D msg; + proxy->recv_left =3D msgleft; + return ret; + } + + if (ret <=3D 0) { + goto fatal; + } + trace_vfio_user_recv_read(hdr.id, ret); + + msgleft -=3D ret; + data +=3D ret; + } + + vfio_user_process(proxy, msg, isreply); + return 0; + + /* + * fatal means the other side closed or we don't trust the stream + * err means this message is corrupt + */ +fatal: + vfio_user_shutdown(proxy); + proxy->state =3D VFIO_PROXY_ERROR; + + /* set error if server side closed */ + if (ret =3D=3D 0) { + error_setg(errp, "server closed socket"); + } + +err: + for (i =3D 0; i < numfds; i++) { + close(fdp[i]); + } + if (isreply && msg !=3D NULL) { + /* force an error to keep sending thread from hanging */ + vfio_user_set_error(msg->hdr, EINVAL); + msg->complete =3D true; + qemu_cond_signal(&msg->cv); + } + return -1; +} + +static void vfio_user_recv(void *opaque) +{ + VFIOUserProxy *proxy =3D opaque; + + QEMU_LOCK_GUARD(&proxy->lock); + + if (proxy->state =3D=3D VFIO_PROXY_CONNECTED) { + Error *local_err =3D NULL; + + while (vfio_user_recv_one(proxy, &local_err) =3D=3D 0) { + ; + } + + if (local_err !=3D NULL) { + error_report_err(local_err); + } + } +} + static void vfio_user_cb(void *opaque) { VFIOUserProxy *proxy =3D opaque; @@ -51,6 +400,53 @@ static void vfio_user_cb(void *opaque) * Functions called by main or CPU threads */ =20 +/* + * Process incoming requests. + * + * The bus-specific callback has the form: + * request(opaque, msg) + * where 'opaque' was specified in vfio_user_set_handler + * and 'msg' is the inbound message. + * + * The callback is responsible for disposing of the message buffer, + * usually by re-using it when calling vfio_send_reply or vfio_send_error, + * both of which free their message buffer when the reply is sent. + * + * If the callback uses a new buffer, it needs to free the old one. + */ +static void vfio_user_request(void *opaque) +{ + VFIOUserProxy *proxy =3D opaque; + VFIOUserMsgQ new, free; + VFIOUserMsg *msg, *m1; + + /* reap all incoming */ + QTAILQ_INIT(&new); + WITH_QEMU_LOCK_GUARD(&proxy->lock) { + QTAILQ_FOREACH_SAFE(msg, &proxy->incoming, next, m1) { + QTAILQ_REMOVE(&proxy->incoming, msg, next); + QTAILQ_INSERT_TAIL(&new, msg, next); + } + } + + /* process list */ + QTAILQ_INIT(&free); + QTAILQ_FOREACH_SAFE(msg, &new, next, m1) { + QTAILQ_REMOVE(&new, msg, next); + trace_vfio_user_recv_request(msg->hdr->command); + proxy->request(proxy->req_arg, msg); + QTAILQ_INSERT_HEAD(&free, msg, next); + } + + /* free list */ + WITH_QEMU_LOCK_GUARD(&proxy->lock) { + QTAILQ_FOREACH_SAFE(msg, &free, next, m1) { + vfio_user_recycle(proxy, msg); + } + } +} + + static QLIST_HEAD(, VFIOUserProxy) vfio_user_sockets =3D QLIST_HEAD_INITIALIZER(vfio_user_sockets); =20 @@ -89,6 +485,7 @@ VFIOUserProxy *vfio_user_connect_dev(SocketAddress *addr= , Error **errp) } =20 proxy->ctx =3D iothread_get_aio_context(vfio_user_iothread); + proxy->req_bh =3D qemu_bh_new(vfio_user_request, proxy); =20 QTAILQ_INIT(&proxy->outgoing); QTAILQ_INIT(&proxy->incoming); @@ -99,6 +496,18 @@ VFIOUserProxy *vfio_user_connect_dev(SocketAddress *add= r, Error **errp) return proxy; } =20 +void vfio_user_set_handler(VFIODevice *vbasedev, + void (*handler)(void *opaque, VFIOUserMsg *msg), + void *req_arg) +{ + VFIOUserProxy *proxy =3D vbasedev->proxy; + + proxy->request =3D handler; + proxy->req_arg =3D req_arg; + qio_channel_set_aio_fd_handler(proxy->ioc, proxy->ctx, + vfio_user_recv, NULL, NULL, proxy); +} + void vfio_user_disconnect(VFIOUserProxy *proxy) { VFIOUserMsg *r1, *r2; @@ -114,6 +523,8 @@ void vfio_user_disconnect(VFIOUserProxy *proxy) } object_unref(OBJECT(proxy->ioc)); proxy->ioc =3D NULL; + qemu_bh_delete(proxy->req_bh); + proxy->req_bh =3D NULL; =20 proxy->state =3D VFIO_PROXY_CLOSING; QTAILQ_FOREACH_SAFE(r1, &proxy->outgoing, next, r2) { diff --git a/hw/vfio-user/trace-events b/hw/vfio-user/trace-events new file mode 100644 index 0000000000000000000000000000000000000000..ddeb9f4b2f62557c6c7307f61e0= 888524d158b36 --- /dev/null +++ b/hw/vfio-user/trace-events @@ -0,0 +1,8 @@ +# See docs/devel/tracing.rst for syntax documentation. +# +# SPDX-License-Identifier: GPL-2.0-or-later + +# common.c +vfio_user_recv_hdr(const char *name, uint16_t id, uint16_t cmd, uint32_t s= ize, uint32_t flags) " (%s) id 0x%x cmd 0x%x size 0x%x flags 0x%x" +vfio_user_recv_read(uint16_t id, int read) " id 0x%x read 0x%x" +vfio_user_recv_request(uint16_t cmd) " command 0x%x" --=20 2.49.0 From nobody Sat Nov 15 14:49:49 2025 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=quarantine dis=none) header.from=redhat.com ARC-Seal: i=1; a=rsa-sha256; t=1750924150; cv=none; d=zohomail.com; s=zohoarc; b=UVE8cG+ZwZ+3qFiIJ4eTUDhw+GqYmYfwyYRoyCkh0CnI2qt0ynl7e6gj9PooaDJ/Br0QU2r55ofzL1ERoyfjE35n58gztdaTOzfLwAUjT/EXPBEOeo4HGbNSXqWUyponE6TDdA6fYSM84z0+7a5/nWAj4sHNrZEtjuiowSBmdLc= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1750924150; h=Content-Type:Content-Transfer-Encoding:Cc:Cc:Date:Date:From:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:Subject:To:To:Message-Id:Reply-To; bh=+zqaNpLmwjF+LRDhKXd6plBT9jkQ1fHc9Vy4M+yxq48=; b=HtNstXaxUHdvrwj3hU2cSzcsJD7tdBS8SOLVCqDlNggyembcqB4R7HmEoU6Ub+FlXXVbhQc5A4vaNz1k+D7UsCw6msFrW8isT2mNHXS9tzzuHt5zjsLrjqwn7GOOu4DU0HxzQh1ew+sr+HtMN2CosXqGpGze9IwPr0fORN9obYU= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass header.from= (p=quarantine dis=none) Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1750924149995633.8290084940945; Thu, 26 Jun 2025 00:49:09 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1uUhJT-0004gD-75; Thu, 26 Jun 2025 03:46:27 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1uUhJI-0004Z0-DL for qemu-devel@nongnu.org; Thu, 26 Jun 2025 03:46:16 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1uUhJE-0001xe-3L for qemu-devel@nongnu.org; Thu, 26 Jun 2025 03:46:15 -0400 Received: from mx-prod-mc-05.mail-002.prod.us-west-2.aws.redhat.com (ec2-54-186-198-63.us-west-2.compute.amazonaws.com [54.186.198.63]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-544--d_fnVnAOtKbG7XXi6FDhA-1; Thu, 26 Jun 2025 03:46:02 -0400 Received: from mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.93]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-05.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id C9C601944A83; Thu, 26 Jun 2025 07:46:01 +0000 (UTC) Received: from corto.redhat.com (unknown [10.44.32.51]) by mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id 64279180045B; Thu, 26 Jun 2025 07:45:59 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1750923971; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=+zqaNpLmwjF+LRDhKXd6plBT9jkQ1fHc9Vy4M+yxq48=; b=ZEvqWpkKiDq4w0brBP1r+rksGQ94U7U3HTwfwHsuAHIaVzkjYyhe25lJRU3DogP+ScvQsg IG1czycq2ODqhQ3s/DFB3h3h/XAeahbtJUlzrQUXsX1182LOhPr11ktkUkIVijXmGLyIvi lHAtJlrGff4qAQgidTnKgE5VNIxoqFo= X-MC-Unique: -d_fnVnAOtKbG7XXi6FDhA-1 X-Mimecast-MFC-AGG-ID: -d_fnVnAOtKbG7XXi6FDhA_1750923961 From: =?UTF-8?q?C=C3=A9dric=20Le=20Goater?= To: qemu-devel@nongnu.org Cc: Alex Williamson , John Levon , John Johnson , Jagannathan Raman , Elena Ufimtseva , =?UTF-8?q?C=C3=A9dric=20Le=20Goater?= Subject: [PULL 10/25] vfio-user: implement message send infrastructure Date: Thu, 26 Jun 2025 09:45:14 +0200 Message-ID: <20250626074529.1384114-11-clg@redhat.com> In-Reply-To: <20250626074529.1384114-1-clg@redhat.com> References: <20250626074529.1384114-1-clg@redhat.com> MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.93 Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=170.10.133.124; envelope-from=clg@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H5=0.001, RCVD_IN_MSPIKE_WL=0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, RCVD_IN_VALIDITY_SAFE_BLOCKED=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZohoMail-DKIM: pass (identity @redhat.com) X-ZM-MESSAGEID: 1750924151666116600 From: John Levon Add plumbing for sending vfio-user messages on the control socket. Add initial version negotation on connection. Originally-by: John Johnson Signed-off-by: Jagannathan Raman Signed-off-by: Elena Ufimtseva Signed-off-by: John Levon Reviewed-by: C=C3=A9dric Le Goater Link: https://lore.kernel.org/qemu-devel/20250625193012.2316242-5-john.levo= n@nutanix.com Signed-off-by: C=C3=A9dric Le Goater --- hw/vfio-user/protocol.h | 62 +++++ hw/vfio-user/proxy.h | 9 + hw/vfio-user/pci.c | 20 +- hw/vfio-user/proxy.c | 515 ++++++++++++++++++++++++++++++++++++++ hw/vfio-user/trace-events | 2 + 5 files changed, 606 insertions(+), 2 deletions(-) diff --git a/hw/vfio-user/protocol.h b/hw/vfio-user/protocol.h index 4ddfb5f222e60c3bd212110bba6855b81f191d27..2d52d0fb10d1418207bf383e5f8= ec24230431443 100644 --- a/hw/vfio-user/protocol.h +++ b/hw/vfio-user/protocol.h @@ -50,4 +50,66 @@ enum vfio_user_command { #define VFIO_USER_NO_REPLY 0x10 #define VFIO_USER_ERROR 0x20 =20 + +/* + * VFIO_USER_VERSION + */ +typedef struct { + VFIOUserHdr hdr; + uint16_t major; + uint16_t minor; + char capabilities[]; +} VFIOUserVersion; + +#define VFIO_USER_MAJOR_VER 0 +#define VFIO_USER_MINOR_VER 0 + +#define VFIO_USER_CAP "capabilities" + +/* "capabilities" members */ +#define VFIO_USER_CAP_MAX_FDS "max_msg_fds" +#define VFIO_USER_CAP_MAX_XFER "max_data_xfer_size" +#define VFIO_USER_CAP_PGSIZES "pgsizes" +#define VFIO_USER_CAP_MAP_MAX "max_dma_maps" +#define VFIO_USER_CAP_MIGR "migration" + +/* "migration" members */ +#define VFIO_USER_CAP_PGSIZE "pgsize" +#define VFIO_USER_CAP_MAX_BITMAP "max_bitmap_size" + +/* + * Max FDs mainly comes into play when a device supports multiple interrup= ts + * where each ones uses an eventfd to inject it into the guest. + * It is clamped by the the number of FDs the qio channel supports in a + * single message. + */ +#define VFIO_USER_DEF_MAX_FDS 8 +#define VFIO_USER_MAX_MAX_FDS 16 + +/* + * Max transfer limits the amount of data in region and DMA messages. + * Region R/W will be very small (limited by how much a single instruction + * can process) so just use a reasonable limit here. + */ +#define VFIO_USER_DEF_MAX_XFER (1024 * 1024) +#define VFIO_USER_MAX_MAX_XFER (64 * 1024 * 1024) + +/* + * Default pagesizes supported is 4k. + */ +#define VFIO_USER_DEF_PGSIZE 4096 + +/* + * Default max number of DMA mappings is stolen from the + * linux kernel "dma_entry_limit" + */ +#define VFIO_USER_DEF_MAP_MAX 65535 + +/* + * Default max bitmap size is also take from the linux kernel, + * where usage of signed ints limits the VA range to 2^31 bytes. + * Dividing that by the number of bits per byte yields 256MB + */ +#define VFIO_USER_DEF_MAX_BITMAP (256 * 1024 * 1024) + #endif /* VFIO_USER_PROTOCOL_H */ diff --git a/hw/vfio-user/proxy.h b/hw/vfio-user/proxy.h index ff553cad9d50ce2ba4867d910adf92836dc068b5..5bc890a0f5f06cce83e2bf54711= f7ff9af01cf8b 100644 --- a/hw/vfio-user/proxy.h +++ b/hw/vfio-user/proxy.h @@ -37,6 +37,7 @@ typedef struct VFIOUserMsg { uint32_t id; QemuCond cv; bool complete; + bool pending; enum msg_type type; } VFIOUserMsg; =20 @@ -56,6 +57,12 @@ typedef struct VFIOUserProxy { struct QIOChannel *ioc; void (*request)(void *opaque, VFIOUserMsg *msg); void *req_arg; + uint64_t max_xfer_size; + uint64_t max_send_fds; + uint64_t max_dma; + uint64_t dma_pgsizes; + uint64_t max_bitmap; + uint64_t migr_pgsize; int flags; QemuCond close_cv; AioContext *ctx; @@ -78,6 +85,7 @@ typedef struct VFIOUserProxy { =20 /* VFIOProxy flags */ #define VFIO_PROXY_CLIENT 0x1 +#define VFIO_PROXY_FORCE_QUEUED 0x4 =20 typedef struct VFIODevice VFIODevice; =20 @@ -86,5 +94,6 @@ void vfio_user_disconnect(VFIOUserProxy *proxy); void vfio_user_set_handler(VFIODevice *vbasedev, void (*handler)(void *opaque, VFIOUserMsg *msg), void *reqarg); +bool vfio_user_validate_version(VFIOUserProxy *proxy, Error **errp); =20 #endif /* VFIO_USER_PROXY_H */ diff --git a/hw/vfio-user/pci.c b/hw/vfio-user/pci.c index bad2829f5cfc76ef5716a67ffb000c8d78ed360d..61f525cf4a77f0dd0204d1cf28d= 213ef13824164 100644 --- a/hw/vfio-user/pci.c +++ b/hw/vfio-user/pci.c @@ -20,6 +20,7 @@ OBJECT_DECLARE_SIMPLE_TYPE(VFIOUserPCIDevice, VFIO_USER_P= CI) struct VFIOUserPCIDevice { VFIOPCIDevice device; SocketAddress *socket; + bool send_queued; /* all sends are queued */ }; =20 /* @@ -92,6 +93,16 @@ static void vfio_user_pci_realize(PCIDevice *pdev, Error= **errp) vbasedev->proxy =3D proxy; vfio_user_set_handler(vbasedev, vfio_user_pci_process_req, vdev); =20 + vbasedev->name =3D g_strdup_printf("vfio-user:%s", sock_name); + + if (udev->send_queued) { + proxy->flags |=3D VFIO_PROXY_FORCE_QUEUED; + } + + if (!vfio_user_validate_version(proxy, errp)) { + goto error; + } + /* * vfio-user devices are effectively mdevs (don't use a host iommu). */ @@ -101,9 +112,13 @@ static void vfio_user_pci_realize(PCIDevice *pdev, Err= or **errp) if (!vfio_device_attach_by_iommu_type(TYPE_VFIO_IOMMU_USER, vbasedev->name, vbasedev, as, errp)) { - error_prepend(errp, VFIO_MSG_PREFIX, vbasedev->name); - return; + goto error; } + + return; + +error: + error_prepend(errp, VFIO_MSG_PREFIX, vdev->vbasedev.name); } =20 static void vfio_user_instance_init(Object *obj) @@ -153,6 +168,7 @@ static const Property vfio_user_pci_dev_properties[] = =3D { sub_vendor_id, PCI_ANY_ID), DEFINE_PROP_UINT32("x-pci-sub-device-id", VFIOPCIDevice, sub_device_id, PCI_ANY_ID), + DEFINE_PROP_BOOL("x-send-queued", VFIOUserPCIDevice, send_queued, fals= e), }; =20 static void vfio_user_pci_set_socket(Object *obj, Visitor *v, const char *= name, diff --git a/hw/vfio-user/proxy.c b/hw/vfio-user/proxy.c index 349ea2b27c9877993125f4f0f3ca27cb4df9b1b3..874142e9e50ced664a7fce71971= eaa040cfe1e2b 100644 --- a/hw/vfio-user/proxy.c +++ b/hw/vfio-user/proxy.c @@ -13,11 +13,15 @@ #include "hw/vfio-user/proxy.h" #include "hw/vfio-user/trace.h" #include "qapi/error.h" +#include "qobject/qdict.h" +#include "qobject/qjson.h" +#include "qobject/qnum.h" #include "qemu/error-report.h" #include "qemu/lockable.h" #include "qemu/main-loop.h" #include "system/iothread.h" =20 +static int wait_time =3D 5000; /* wait up to 5 sec for busy servers */ static IOThread *vfio_user_iothread; =20 static void vfio_user_shutdown(VFIOUserProxy *proxy); @@ -27,9 +31,12 @@ static VFIOUserFDs *vfio_user_getfds(int numfds); static void vfio_user_recycle(VFIOUserProxy *proxy, VFIOUserMsg *msg); =20 static void vfio_user_recv(void *opaque); +static void vfio_user_send(void *opaque); static void vfio_user_cb(void *opaque); =20 static void vfio_user_request(void *opaque); +static void vfio_user_request_msg(VFIOUserHdr *hdr, uint16_t cmd, + uint32_t size, uint32_t flags); =20 static inline void vfio_user_set_error(VFIOUserHdr *hdr, uint32_t err) { @@ -48,6 +55,41 @@ static void vfio_user_shutdown(VFIOUserProxy *proxy) proxy->ctx, NULL, NULL); } =20 +/* + * Same return values as qio_channel_writev_full(): + * + * QIO_CHANNEL_ERR_BLOCK: *errp not set + * -1: *errp will be populated + * otherwise: bytes written + */ +static ssize_t vfio_user_send_qio(VFIOUserProxy *proxy, VFIOUserMsg *msg, + Error **errp) +{ + VFIOUserFDs *fds =3D msg->fds; + struct iovec iov =3D { + .iov_base =3D msg->hdr, + .iov_len =3D msg->hdr->size, + }; + size_t numfds =3D 0; + int *fdp =3D NULL; + ssize_t ret; + + if (fds !=3D NULL && fds->send_fds !=3D 0) { + numfds =3D fds->send_fds; + fdp =3D fds->fds; + } + + ret =3D qio_channel_writev_full(proxy->ioc, &iov, 1, fdp, numfds, 0, e= rrp); + + if (ret =3D=3D -1) { + vfio_user_set_error(msg->hdr, EIO); + vfio_user_shutdown(proxy); + } + trace_vfio_user_send_write(msg->hdr->id, ret); + + return ret; +} + static VFIOUserMsg *vfio_user_getmsg(VFIOUserProxy *proxy, VFIOUserHdr *hd= r, VFIOUserFDs *fds) { @@ -88,6 +130,7 @@ static void vfio_user_recycle(VFIOUserProxy *proxy, VFIO= UserMsg *msg) msg->hdr =3D NULL; msg->fds =3D NULL; msg->complete =3D false; + msg->pending =3D false; QTAILQ_INSERT_HEAD(&proxy->free, msg, next); } =20 @@ -385,6 +428,62 @@ static void vfio_user_recv(void *opaque) } } =20 +/* + * Send a single message, same return semantics as vfio_user_send_qio(). + * + * Sent async messages are freed, others are moved to pending queue. + */ +static ssize_t vfio_user_send_one(VFIOUserProxy *proxy, Error **errp) +{ + VFIOUserMsg *msg; + ssize_t ret; + + msg =3D QTAILQ_FIRST(&proxy->outgoing); + ret =3D vfio_user_send_qio(proxy, msg, errp); + if (ret < 0) { + return ret; + } + + QTAILQ_REMOVE(&proxy->outgoing, msg, next); + if (msg->type =3D=3D VFIO_MSG_ASYNC) { + vfio_user_recycle(proxy, msg); + } else { + QTAILQ_INSERT_TAIL(&proxy->pending, msg, next); + msg->pending =3D true; + } + + return ret; +} + +/* + * Send messages from outgoing queue when the socket buffer has space. + * If we deplete 'outgoing', remove ourselves from the poll list. + */ +static void vfio_user_send(void *opaque) +{ + VFIOUserProxy *proxy =3D opaque; + + QEMU_LOCK_GUARD(&proxy->lock); + + if (proxy->state =3D=3D VFIO_PROXY_CONNECTED) { + while (!QTAILQ_EMPTY(&proxy->outgoing)) { + Error *local_err =3D NULL; + int ret; + + ret =3D vfio_user_send_one(proxy, &local_err); + + if (ret =3D=3D QIO_CHANNEL_ERR_BLOCK) { + return; + } else if (ret =3D=3D -1) { + error_report_err(local_err); + return; + } + } + qio_channel_set_aio_fd_handler(proxy->ioc, proxy->ctx, + vfio_user_recv, NULL, NULL, proxy); + } +} + static void vfio_user_cb(void *opaque) { VFIOUserProxy *proxy =3D opaque; @@ -446,6 +545,128 @@ static void vfio_user_request(void *opaque) } } =20 +/* + * Messages are queued onto the proxy's outgoing list. + * + * It handles 3 types of messages: + * + * async messages - replies and posted writes + * + * There will be no reply from the server, so message + * buffers are freed after they're sent. + * + * nowait messages - map/unmap during address space transactions + * + * These are also sent async, but a reply is expected so that + * vfio_wait_reqs() can wait for the youngest nowait request. + * They transition from the outgoing list to the pending list + * when sent, and are freed when the reply is received. + * + * wait messages - all other requests + * + * The reply to these messages is waited for by their caller. + * They also transition from outgoing to pending when sent, but + * the message buffer is returned to the caller with the reply + * contents. The caller is responsible for freeing these messages. + * + * As an optimization, if the outgoing list and the socket send + * buffer are empty, the message is sent inline instead of being + * added to the outgoing list. The rest of the transitions are + * unchanged. + */ +static bool vfio_user_send_queued(VFIOUserProxy *proxy, VFIOUserMsg *msg, + Error **errp) +{ + int ret; + + /* + * Unsent outgoing msgs - add to tail + */ + if (!QTAILQ_EMPTY(&proxy->outgoing)) { + QTAILQ_INSERT_TAIL(&proxy->outgoing, msg, next); + return true; + } + + /* + * Try inline - if blocked, queue it and kick send poller + */ + if (proxy->flags & VFIO_PROXY_FORCE_QUEUED) { + ret =3D QIO_CHANNEL_ERR_BLOCK; + } else { + ret =3D vfio_user_send_qio(proxy, msg, errp); + } + + if (ret =3D=3D QIO_CHANNEL_ERR_BLOCK) { + QTAILQ_INSERT_HEAD(&proxy->outgoing, msg, next); + qio_channel_set_aio_fd_handler(proxy->ioc, proxy->ctx, + vfio_user_recv, proxy->ctx, + vfio_user_send, proxy); + return true; + } + if (ret =3D=3D -1) { + return false; + } + + /* + * Sent - free async, add others to pending + */ + if (msg->type =3D=3D VFIO_MSG_ASYNC) { + vfio_user_recycle(proxy, msg); + } else { + QTAILQ_INSERT_TAIL(&proxy->pending, msg, next); + msg->pending =3D true; + } + + return true; +} + +/* + * Returns false if we did not successfully receive a reply message, in wh= ich + * case @errp will be populated. + * + * In either case, the caller must free @hdr and @fds if needed. + */ +static bool vfio_user_send_wait(VFIOUserProxy *proxy, VFIOUserHdr *hdr, + VFIOUserFDs *fds, int rsize, Error **errp) +{ + VFIOUserMsg *msg; + bool ok =3D false; + + if (hdr->flags & VFIO_USER_NO_REPLY) { + error_setg_errno(errp, EINVAL, "%s on NO_REPLY message", __func__); + return false; + } + + qemu_mutex_lock(&proxy->lock); + + msg =3D vfio_user_getmsg(proxy, hdr, fds); + msg->id =3D hdr->id; + msg->rsize =3D rsize ? rsize : hdr->size; + msg->type =3D VFIO_MSG_WAIT; + + ok =3D vfio_user_send_queued(proxy, msg, errp); + + if (ok) { + while (!msg->complete) { + if (!qemu_cond_timedwait(&msg->cv, &proxy->lock, wait_time)) { + VFIOUserMsgQ *list; + + list =3D msg->pending ? &proxy->pending : &proxy->outgoing; + QTAILQ_REMOVE(list, msg, next); + error_setg_errno(errp, ETIMEDOUT, + "timed out waiting for reply"); + ok =3D false; + break; + } + } + } + + vfio_user_recycle(proxy, msg); + + qemu_mutex_unlock(&proxy->lock); + + return ok; +} =20 static QLIST_HEAD(, VFIOUserProxy) vfio_user_sockets =3D QLIST_HEAD_INITIALIZER(vfio_user_sockets); @@ -474,6 +695,15 @@ VFIOUserProxy *vfio_user_connect_dev(SocketAddress *ad= dr, Error **errp) proxy =3D g_malloc0(sizeof(VFIOUserProxy)); proxy->sockname =3D g_strdup_printf("unix:%s", sockname); proxy->ioc =3D ioc; + + /* init defaults */ + proxy->max_xfer_size =3D VFIO_USER_DEF_MAX_XFER; + proxy->max_send_fds =3D VFIO_USER_DEF_MAX_FDS; + proxy->max_dma =3D VFIO_USER_DEF_MAP_MAX; + proxy->dma_pgsizes =3D VFIO_USER_DEF_PGSIZE; + proxy->max_bitmap =3D VFIO_USER_DEF_MAX_BITMAP; + proxy->migr_pgsize =3D VFIO_USER_DEF_PGSIZE; + proxy->flags =3D VFIO_PROXY_CLIENT; proxy->state =3D VFIO_PROXY_CONNECTED; =20 @@ -571,3 +801,288 @@ void vfio_user_disconnect(VFIOUserProxy *proxy) g_free(proxy->sockname); g_free(proxy); } + +static void vfio_user_request_msg(VFIOUserHdr *hdr, uint16_t cmd, + uint32_t size, uint32_t flags) +{ + static uint16_t next_id; + + hdr->id =3D qatomic_fetch_inc(&next_id); + hdr->command =3D cmd; + hdr->size =3D size; + hdr->flags =3D (flags & ~VFIO_USER_TYPE) | VFIO_USER_REQUEST; + hdr->error_reply =3D 0; +} + +struct cap_entry { + const char *name; + bool (*check)(VFIOUserProxy *proxy, QObject *qobj, Error **errp); +}; + +static bool caps_parse(VFIOUserProxy *proxy, QDict *qdict, + struct cap_entry caps[], Error **errp) +{ + QObject *qobj; + struct cap_entry *p; + + for (p =3D caps; p->name !=3D NULL; p++) { + qobj =3D qdict_get(qdict, p->name); + if (qobj !=3D NULL) { + if (!p->check(proxy, qobj, errp)) { + return false; + } + qdict_del(qdict, p->name); + } + } + + /* warning, for now */ + if (qdict_size(qdict) !=3D 0) { + warn_report("spurious capabilities"); + } + return true; +} + +static bool check_migr_pgsize(VFIOUserProxy *proxy, QObject *qobj, Error *= *errp) +{ + QNum *qn =3D qobject_to(QNum, qobj); + uint64_t pgsize; + + if (qn =3D=3D NULL || !qnum_get_try_uint(qn, &pgsize)) { + error_setg(errp, "malformed %s", VFIO_USER_CAP_PGSIZE); + return false; + } + + /* must be larger than default */ + if (pgsize & (VFIO_USER_DEF_PGSIZE - 1)) { + error_setg(errp, "pgsize 0x%"PRIx64" too small", pgsize); + return false; + } + + proxy->migr_pgsize =3D pgsize; + return true; +} + +static bool check_bitmap(VFIOUserProxy *proxy, QObject *qobj, Error **errp) +{ + QNum *qn =3D qobject_to(QNum, qobj); + uint64_t bitmap_size; + + if (qn =3D=3D NULL || !qnum_get_try_uint(qn, &bitmap_size)) { + error_setg(errp, "malformed %s", VFIO_USER_CAP_MAX_BITMAP); + return false; + } + + /* can only lower it */ + if (bitmap_size > VFIO_USER_DEF_MAX_BITMAP) { + error_setg(errp, "%s too large", VFIO_USER_CAP_MAX_BITMAP); + return false; + } + + proxy->max_bitmap =3D bitmap_size; + return true; +} + +static struct cap_entry caps_migr[] =3D { + { VFIO_USER_CAP_PGSIZE, check_migr_pgsize }, + { VFIO_USER_CAP_MAX_BITMAP, check_bitmap }, + { NULL } +}; + +static bool check_max_fds(VFIOUserProxy *proxy, QObject *qobj, Error **err= p) +{ + QNum *qn =3D qobject_to(QNum, qobj); + uint64_t max_send_fds; + + if (qn =3D=3D NULL || !qnum_get_try_uint(qn, &max_send_fds) || + max_send_fds > VFIO_USER_MAX_MAX_FDS) { + error_setg(errp, "malformed %s", VFIO_USER_CAP_MAX_FDS); + return false; + } + proxy->max_send_fds =3D max_send_fds; + return true; +} + +static bool check_max_xfer(VFIOUserProxy *proxy, QObject *qobj, Error **er= rp) +{ + QNum *qn =3D qobject_to(QNum, qobj); + uint64_t max_xfer_size; + + if (qn =3D=3D NULL || !qnum_get_try_uint(qn, &max_xfer_size) || + max_xfer_size > VFIO_USER_MAX_MAX_XFER) { + error_setg(errp, "malformed %s", VFIO_USER_CAP_MAX_XFER); + return false; + } + proxy->max_xfer_size =3D max_xfer_size; + return true; +} + +static bool check_pgsizes(VFIOUserProxy *proxy, QObject *qobj, Error **err= p) +{ + QNum *qn =3D qobject_to(QNum, qobj); + uint64_t pgsizes; + + if (qn =3D=3D NULL || !qnum_get_try_uint(qn, &pgsizes)) { + error_setg(errp, "malformed %s", VFIO_USER_CAP_PGSIZES); + return false; + } + + /* must be larger than default */ + if (pgsizes & (VFIO_USER_DEF_PGSIZE - 1)) { + error_setg(errp, "pgsize 0x%"PRIx64" too small", pgsizes); + return false; + } + + proxy->dma_pgsizes =3D pgsizes; + return true; +} + +static bool check_max_dma(VFIOUserProxy *proxy, QObject *qobj, Error **err= p) +{ + QNum *qn =3D qobject_to(QNum, qobj); + uint64_t max_dma; + + if (qn =3D=3D NULL || !qnum_get_try_uint(qn, &max_dma)) { + error_setg(errp, "malformed %s", VFIO_USER_CAP_MAP_MAX); + return false; + } + + /* can only lower it */ + if (max_dma > VFIO_USER_DEF_MAP_MAX) { + error_setg(errp, "%s too large", VFIO_USER_CAP_MAP_MAX); + return false; + } + + proxy->max_dma =3D max_dma; + return true; +} + +static bool check_migr(VFIOUserProxy *proxy, QObject *qobj, Error **errp) +{ + QDict *qdict =3D qobject_to(QDict, qobj); + + if (qdict =3D=3D NULL) { + error_setg(errp, "malformed %s", VFIO_USER_CAP_MAX_FDS); + return true; + } + return caps_parse(proxy, qdict, caps_migr, errp); +} + +static struct cap_entry caps_cap[] =3D { + { VFIO_USER_CAP_MAX_FDS, check_max_fds }, + { VFIO_USER_CAP_MAX_XFER, check_max_xfer }, + { VFIO_USER_CAP_PGSIZES, check_pgsizes }, + { VFIO_USER_CAP_MAP_MAX, check_max_dma }, + { VFIO_USER_CAP_MIGR, check_migr }, + { NULL } +}; + +static bool check_cap(VFIOUserProxy *proxy, QObject *qobj, Error **errp) +{ + QDict *qdict =3D qobject_to(QDict, qobj); + + if (qdict =3D=3D NULL) { + error_setg(errp, "malformed %s", VFIO_USER_CAP); + return false; + } + return caps_parse(proxy, qdict, caps_cap, errp); +} + +static struct cap_entry ver_0_0[] =3D { + { VFIO_USER_CAP, check_cap }, + { NULL } +}; + +static bool caps_check(VFIOUserProxy *proxy, int minor, const char *caps, + Error **errp) +{ + QObject *qobj; + QDict *qdict; + bool ret; + + qobj =3D qobject_from_json(caps, NULL); + if (qobj =3D=3D NULL) { + error_setg(errp, "malformed capabilities %s", caps); + return false; + } + qdict =3D qobject_to(QDict, qobj); + if (qdict =3D=3D NULL) { + error_setg(errp, "capabilities %s not an object", caps); + qobject_unref(qobj); + return false; + } + ret =3D caps_parse(proxy, qdict, ver_0_0, errp); + + qobject_unref(qobj); + return ret; +} + +static GString *caps_json(void) +{ + QDict *dict =3D qdict_new(); + QDict *capdict =3D qdict_new(); + QDict *migdict =3D qdict_new(); + GString *str; + + qdict_put_int(migdict, VFIO_USER_CAP_PGSIZE, VFIO_USER_DEF_PGSIZE); + qdict_put_int(migdict, VFIO_USER_CAP_MAX_BITMAP, VFIO_USER_DEF_MAX_BIT= MAP); + qdict_put_obj(capdict, VFIO_USER_CAP_MIGR, QOBJECT(migdict)); + + qdict_put_int(capdict, VFIO_USER_CAP_MAX_FDS, VFIO_USER_MAX_MAX_FDS); + qdict_put_int(capdict, VFIO_USER_CAP_MAX_XFER, VFIO_USER_DEF_MAX_XFER); + qdict_put_int(capdict, VFIO_USER_CAP_PGSIZES, VFIO_USER_DEF_PGSIZE); + qdict_put_int(capdict, VFIO_USER_CAP_MAP_MAX, VFIO_USER_DEF_MAP_MAX); + + qdict_put_obj(dict, VFIO_USER_CAP, QOBJECT(capdict)); + + str =3D qobject_to_json(QOBJECT(dict)); + qobject_unref(dict); + return str; +} + +bool vfio_user_validate_version(VFIOUserProxy *proxy, Error **errp) +{ + g_autofree VFIOUserVersion *msgp =3D NULL; + GString *caps; + char *reply; + int size, caplen; + + caps =3D caps_json(); + caplen =3D caps->len + 1; + size =3D sizeof(*msgp) + caplen; + msgp =3D g_malloc0(size); + + vfio_user_request_msg(&msgp->hdr, VFIO_USER_VERSION, size, 0); + msgp->major =3D VFIO_USER_MAJOR_VER; + msgp->minor =3D VFIO_USER_MINOR_VER; + memcpy(&msgp->capabilities, caps->str, caplen); + g_string_free(caps, true); + trace_vfio_user_version(msgp->major, msgp->minor, msgp->capabilities); + + if (!vfio_user_send_wait(proxy, &msgp->hdr, NULL, 0, errp)) { + return false; + } + + if (msgp->hdr.flags & VFIO_USER_ERROR) { + error_setg_errno(errp, msgp->hdr.error_reply, "version reply"); + return false; + } + + if (msgp->major !=3D VFIO_USER_MAJOR_VER || + msgp->minor > VFIO_USER_MINOR_VER) { + error_setg(errp, "incompatible server version"); + return false; + } + + reply =3D msgp->capabilities; + if (reply[msgp->hdr.size - sizeof(*msgp) - 1] !=3D '\0') { + error_setg(errp, "corrupt version reply"); + return false; + } + + if (!caps_check(proxy, msgp->minor, reply, errp)) { + return false; + } + + trace_vfio_user_version(msgp->major, msgp->minor, msgp->capabilities); + return true; +} diff --git a/hw/vfio-user/trace-events b/hw/vfio-user/trace-events index ddeb9f4b2f62557c6c7307f61e0888524d158b36..a965c7b1f2843aa757818238524= bd90165267eb0 100644 --- a/hw/vfio-user/trace-events +++ b/hw/vfio-user/trace-events @@ -6,3 +6,5 @@ vfio_user_recv_hdr(const char *name, uint16_t id, uint16_t cmd, uint32_t s= ize, uint32_t flags) " (%s) id 0x%x cmd 0x%x size 0x%x flags 0x%x" vfio_user_recv_read(uint16_t id, int read) " id 0x%x read 0x%x" vfio_user_recv_request(uint16_t cmd) " command 0x%x" +vfio_user_send_write(uint16_t id, int wrote) " id 0x%x wrote 0x%x" +vfio_user_version(uint16_t major, uint16_t minor, const char *caps) " majo= r %d minor %d caps: %s" --=20 2.49.0 From nobody Sat Nov 15 14:49:49 2025 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=quarantine dis=none) header.from=redhat.com ARC-Seal: i=1; a=rsa-sha256; t=1750924019; cv=none; d=zohomail.com; s=zohoarc; b=h+UDQZx5ZMDryNRqqgpXidndiCnZl1+bmO11WplrxUGs5DozDvyw0Yy9KjTBl5TlFMNihtnWKyYoegYyd8syI6YkIEvkaOuTP1/esCnBmP4TKfDIcSvbS+NZDzfCNhiXvKDuCPHQuumZ6LX0J8dT6Y25nhgksuEucFkq3WD98v8= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1750924019; h=Content-Type:Content-Transfer-Encoding:Cc:Cc:Date:Date:From:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:Subject:To:To:Message-Id:Reply-To; bh=kSr5RrGhBBxgOGopny4FyzIZ6I/CPfIINFKI9uz7GVs=; b=ZwxPkMp/wgH3TKJayiJCkm/6DSMrLN46Go+RGSdCdNqO2I6zIh8kvPSN0dKrw71ufIt5zaKDhhUqBMG2hK5VcFKcScN2nAVTxEEhSgMRBp+L8JMOWTY2+991NPQXoJ41mbvwmknnDxgLP0EfnDNhPtIxPI6if04Q9mTJ8IuGPYU= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass header.from= (p=quarantine dis=none) Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1750924019581865.9770594367927; Thu, 26 Jun 2025 00:46:59 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1uUhJh-00056G-Dd; Thu, 26 Jun 2025 03:46:41 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1uUhJJ-0004Zs-GS for qemu-devel@nongnu.org; Thu, 26 Jun 2025 03:46:19 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.129.124]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1uUhJG-0001xg-7I for qemu-devel@nongnu.org; Thu, 26 Jun 2025 03:46:17 -0400 Received: from mx-prod-mc-08.mail-002.prod.us-west-2.aws.redhat.com (ec2-35-165-154-97.us-west-2.compute.amazonaws.com [35.165.154.97]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-59-BNgBq69XMyKT-F1nZTtvYw-1; Thu, 26 Jun 2025 03:46:09 -0400 Received: from mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.93]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-08.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 347411809C89; Thu, 26 Jun 2025 07:46:05 +0000 (UTC) Received: from corto.redhat.com (unknown [10.44.32.51]) by mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id 4F43A180045B; Thu, 26 Jun 2025 07:46:02 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1750923971; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=kSr5RrGhBBxgOGopny4FyzIZ6I/CPfIINFKI9uz7GVs=; b=gyQGeW3wRVDSdc4eBqw1lDv8P5n6WdVvSrH2kwS2O2EBAyRmPGjBUeH89TKo3xKoWOYwwX WtE1YXJUPnJMJs/MmoBDw96NtFw7/fcdSRqtQErsOKWYt+KyHPUtn4bYVOcyiJVFFt5sBu WbIHN6h9wSeVsHj8zWJEqtvegTKjDXg= X-MC-Unique: BNgBq69XMyKT-F1nZTtvYw-1 X-Mimecast-MFC-AGG-ID: BNgBq69XMyKT-F1nZTtvYw_1750923965 From: =?UTF-8?q?C=C3=A9dric=20Le=20Goater?= To: qemu-devel@nongnu.org Cc: Alex Williamson , John Levon , John Johnson , Elena Ufimtseva , Jagannathan Raman , =?UTF-8?q?C=C3=A9dric=20Le=20Goater?= Subject: [PULL 11/25] vfio-user: implement VFIO_USER_DEVICE_GET_INFO Date: Thu, 26 Jun 2025 09:45:15 +0200 Message-ID: <20250626074529.1384114-12-clg@redhat.com> In-Reply-To: <20250626074529.1384114-1-clg@redhat.com> References: <20250626074529.1384114-1-clg@redhat.com> MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.93 Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=170.10.129.124; envelope-from=clg@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H5=0.001, RCVD_IN_MSPIKE_WL=0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, RCVD_IN_VALIDITY_SAFE_BLOCKED=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZohoMail-DKIM: pass (identity @redhat.com) X-ZM-MESSAGEID: 1750924021877116600 From: John Levon Add support for getting basic device information. Originally-by: John Johnson Signed-off-by: Elena Ufimtseva Signed-off-by: Jagannathan Raman Signed-off-by: John Levon Reviewed-by: C=C3=A9dric Le Goater Link: https://lore.kernel.org/qemu-devel/20250625193012.2316242-6-john.levo= n@nutanix.com Signed-off-by: C=C3=A9dric Le Goater --- hw/vfio-user/device.h | 20 ++++++++++++++ hw/vfio-user/protocol.h | 12 +++++++++ hw/vfio-user/proxy.h | 7 +++++ hw/vfio-user/container.c | 8 +++++- hw/vfio-user/device.c | 55 +++++++++++++++++++++++++++++++++++++++ hw/vfio-user/proxy.c | 10 +++---- hw/vfio-user/meson.build | 1 + hw/vfio-user/trace-events | 1 + 8 files changed, 107 insertions(+), 7 deletions(-) create mode 100644 hw/vfio-user/device.h create mode 100644 hw/vfio-user/device.c diff --git a/hw/vfio-user/device.h b/hw/vfio-user/device.h new file mode 100644 index 0000000000000000000000000000000000000000..ef3f71ee697da301baf6d556796= 850524ac4ef34 --- /dev/null +++ b/hw/vfio-user/device.h @@ -0,0 +1,20 @@ +#ifndef VFIO_USER_DEVICE_H +#define VFIO_USER_DEVICE_H + +/* + * vfio protocol over a UNIX socket device handling. + * + * Copyright =C2=A9 2018, 2021 Oracle and/or its affiliates. + * + * SPDX-License-Identifier: GPL-2.0-or-later + */ + +#include "qemu/osdep.h" +#include "linux/vfio.h" + +#include "hw/vfio-user/proxy.h" + +bool vfio_user_get_device_info(VFIOUserProxy *proxy, + struct vfio_device_info *info, Error **errp= ); + +#endif /* VFIO_USER_DEVICE_H */ diff --git a/hw/vfio-user/protocol.h b/hw/vfio-user/protocol.h index 2d52d0fb10d1418207bf383e5f8ec24230431443..e0bba68739d05ee212821089466= 20e75dd11163d 100644 --- a/hw/vfio-user/protocol.h +++ b/hw/vfio-user/protocol.h @@ -112,4 +112,16 @@ typedef struct { */ #define VFIO_USER_DEF_MAX_BITMAP (256 * 1024 * 1024) =20 +/* + * VFIO_USER_DEVICE_GET_INFO + * imported from struct vfio_device_info + */ +typedef struct { + VFIOUserHdr hdr; + uint32_t argsz; + uint32_t flags; + uint32_t num_regions; + uint32_t num_irqs; +} VFIOUserDeviceInfo; + #endif /* VFIO_USER_PROTOCOL_H */ diff --git a/hw/vfio-user/proxy.h b/hw/vfio-user/proxy.h index 5bc890a0f5f06cce83e2bf54711f7ff9af01cf8b..837b02a8c486e9bf71d63b442a3= 62a9a9610ff18 100644 --- a/hw/vfio-user/proxy.h +++ b/hw/vfio-user/proxy.h @@ -12,7 +12,9 @@ #include "io/channel.h" #include "io/channel-socket.h" =20 +#include "qemu/queue.h" #include "qemu/sockets.h" +#include "qemu/thread.h" #include "hw/vfio-user/protocol.h" =20 typedef struct { @@ -96,4 +98,9 @@ void vfio_user_set_handler(VFIODevice *vbasedev, void *reqarg); bool vfio_user_validate_version(VFIOUserProxy *proxy, Error **errp); =20 +void vfio_user_request_msg(VFIOUserHdr *hdr, uint16_t cmd, + uint32_t size, uint32_t flags); +bool vfio_user_send_wait(VFIOUserProxy *proxy, VFIOUserHdr *hdr, + VFIOUserFDs *fds, int rsize, Error **errp); + #endif /* VFIO_USER_PROXY_H */ diff --git a/hw/vfio-user/container.c b/hw/vfio-user/container.c index 2367332177245e509842f9f0546f028e8a132c14..f5bfd5431675c3b8abfb3ba4409= 8ac61201fde58 100644 --- a/hw/vfio-user/container.c +++ b/hw/vfio-user/container.c @@ -11,6 +11,7 @@ #include "qemu/osdep.h" =20 #include "hw/vfio-user/container.h" +#include "hw/vfio-user/device.h" #include "hw/vfio/vfio-cpr.h" #include "hw/vfio/vfio-device.h" #include "hw/vfio/vfio-listener.h" @@ -140,7 +141,12 @@ static void vfio_user_container_disconnect(VFIOUserCon= tainer *container) static bool vfio_user_device_get(VFIOUserContainer *container, VFIODevice *vbasedev, Error **errp) { - struct vfio_device_info info =3D { 0 }; + struct vfio_device_info info =3D { .argsz =3D sizeof(info) }; + + + if (!vfio_user_get_device_info(vbasedev->proxy, &info, errp)) { + return false; + } =20 vbasedev->fd =3D -1; =20 diff --git a/hw/vfio-user/device.c b/hw/vfio-user/device.c new file mode 100644 index 0000000000000000000000000000000000000000..4212fefd44d7f8a5c1b91567656= 19047e54d7b1b --- /dev/null +++ b/hw/vfio-user/device.c @@ -0,0 +1,55 @@ +/* + * vfio protocol over a UNIX socket device handling. + * + * Copyright =C2=A9 2018, 2021 Oracle and/or its affiliates. + * + * SPDX-License-Identifier: GPL-2.0-or-later + */ + +#include "qemu/osdep.h" +#include "qapi/error.h" +#include "qemu/error-report.h" + +#include "hw/vfio-user/device.h" +#include "hw/vfio-user/trace.h" + +/* + * These are to defend against a malign server trying + * to force us to run out of memory. + */ +#define VFIO_USER_MAX_REGIONS 100 +#define VFIO_USER_MAX_IRQS 50 + +bool vfio_user_get_device_info(VFIOUserProxy *proxy, + struct vfio_device_info *info, Error **errp) +{ + VFIOUserDeviceInfo msg; + uint32_t argsz =3D sizeof(msg) - sizeof(msg.hdr); + + memset(&msg, 0, sizeof(msg)); + vfio_user_request_msg(&msg.hdr, VFIO_USER_DEVICE_GET_INFO, sizeof(msg)= , 0); + msg.argsz =3D argsz; + + if (!vfio_user_send_wait(proxy, &msg.hdr, NULL, 0, errp)) { + return false; + } + + if (msg.hdr.flags & VFIO_USER_ERROR) { + error_setg_errno(errp, -msg.hdr.error_reply, + "VFIO_USER_DEVICE_GET_INFO failed"); + return false; + } + + trace_vfio_user_get_info(msg.num_regions, msg.num_irqs); + + memcpy(info, &msg.argsz, argsz); + + /* defend against a malicious server */ + if (info->num_regions > VFIO_USER_MAX_REGIONS || + info->num_irqs > VFIO_USER_MAX_IRQS) { + error_setg_errno(errp, EINVAL, "invalid reply"); + return false; + } + + return true; +} diff --git a/hw/vfio-user/proxy.c b/hw/vfio-user/proxy.c index 874142e9e50ced664a7fce71971eaa040cfe1e2b..aed7b22e2a657cfc2ddd391163d= 380ee5ea22f68 100644 --- a/hw/vfio-user/proxy.c +++ b/hw/vfio-user/proxy.c @@ -35,8 +35,6 @@ static void vfio_user_send(void *opaque); static void vfio_user_cb(void *opaque); =20 static void vfio_user_request(void *opaque); -static void vfio_user_request_msg(VFIOUserHdr *hdr, uint16_t cmd, - uint32_t size, uint32_t flags); =20 static inline void vfio_user_set_error(VFIOUserHdr *hdr, uint32_t err) { @@ -626,8 +624,8 @@ static bool vfio_user_send_queued(VFIOUserProxy *proxy,= VFIOUserMsg *msg, * * In either case, the caller must free @hdr and @fds if needed. */ -static bool vfio_user_send_wait(VFIOUserProxy *proxy, VFIOUserHdr *hdr, - VFIOUserFDs *fds, int rsize, Error **errp) +bool vfio_user_send_wait(VFIOUserProxy *proxy, VFIOUserHdr *hdr, + VFIOUserFDs *fds, int rsize, Error **errp) { VFIOUserMsg *msg; bool ok =3D false; @@ -802,8 +800,8 @@ void vfio_user_disconnect(VFIOUserProxy *proxy) g_free(proxy); } =20 -static void vfio_user_request_msg(VFIOUserHdr *hdr, uint16_t cmd, - uint32_t size, uint32_t flags) +void vfio_user_request_msg(VFIOUserHdr *hdr, uint16_t cmd, + uint32_t size, uint32_t flags) { static uint16_t next_id; =20 diff --git a/hw/vfio-user/meson.build b/hw/vfio-user/meson.build index 9e85a8ea51c3402d828df0b78d534a4adde193a2..2ed0ae5b1dd987c76720125656c= 1174e9e83ae84 100644 --- a/hw/vfio-user/meson.build +++ b/hw/vfio-user/meson.build @@ -3,6 +3,7 @@ vfio_user_ss =3D ss.source_set() vfio_user_ss.add(files( 'container.c', + 'device.c', 'pci.c', 'proxy.c', )) diff --git a/hw/vfio-user/trace-events b/hw/vfio-user/trace-events index a965c7b1f2843aa757818238524bd90165267eb0..b7312d6d5928dc7d1d984fdb501= b7146e2011a79 100644 --- a/hw/vfio-user/trace-events +++ b/hw/vfio-user/trace-events @@ -8,3 +8,4 @@ vfio_user_recv_read(uint16_t id, int read) " id 0x%x read 0= x%x" vfio_user_recv_request(uint16_t cmd) " command 0x%x" vfio_user_send_write(uint16_t id, int wrote) " id 0x%x wrote 0x%x" vfio_user_version(uint16_t major, uint16_t minor, const char *caps) " majo= r %d minor %d caps: %s" +vfio_user_get_info(uint32_t nregions, uint32_t nirqs) " #regions %d #irqs = %d" --=20 2.49.0 From nobody Sat Nov 15 14:49:49 2025 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=quarantine dis=none) header.from=redhat.com ARC-Seal: i=1; a=rsa-sha256; t=1750924076; cv=none; d=zohomail.com; s=zohoarc; b=SLsGbg+k9s+WTk0xEcV1CBf9nB9HRh6pblX8lNS95f+CtQBGVzOR04F7wDmK7UcsEroTLgt6pNm3ZDZEiOyvgklbD+vSrp00hyk68eaN3TXonrnmuOy7NKZvYkK6VShHewIgcjKEFgQD0QHY+b0vHEJsEeH/PJBV1D1/+zi84y0= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1750924076; h=Content-Type:Content-Transfer-Encoding:Cc:Cc:Date:Date:From:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:Subject:To:To:Message-Id:Reply-To; bh=zo5xwy24PzhkdmLZji8s8LCN5YV4P1Qb4+NP0OylSMY=; b=IxTXKIcIRqdeJscs7n4/0RExhtk8s2C09w24u+iXXzwKMOZKuq/EEazpw/tva6+8vzP6EUt4ghN6V9EhcfHFRA2l37xzImWBYvPwozDCD4n8OrQlFz86RiFjuSAe0FEP+CubP+TAYW9ZPPTF9plP66XbOrWjjm4/rho95qiG9DA= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass header.from= (p=quarantine dis=none) Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1750924076259916.9457474644316; Thu, 26 Jun 2025 00:47:56 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1uUhJT-0004jl-VV; Thu, 26 Jun 2025 03:46:33 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1uUhJI-0004Yy-1S for qemu-devel@nongnu.org; Thu, 26 Jun 2025 03:46:16 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1uUhJF-0001xz-UW for qemu-devel@nongnu.org; Thu, 26 Jun 2025 03:46:15 -0400 Received: from mx-prod-mc-08.mail-002.prod.us-west-2.aws.redhat.com (ec2-35-165-154-97.us-west-2.compute.amazonaws.com [35.165.154.97]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-274-ObDLe1bVOSCYo0yf-XNJ4g-1; Thu, 26 Jun 2025 03:46:09 -0400 Received: from mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.93]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-08.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 59F1E1808993; Thu, 26 Jun 2025 07:46:08 +0000 (UTC) Received: from corto.redhat.com (unknown [10.44.32.51]) by mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id AE3BB180045B; Thu, 26 Jun 2025 07:46:05 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1750923973; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=zo5xwy24PzhkdmLZji8s8LCN5YV4P1Qb4+NP0OylSMY=; b=fmdOMRmvqtVOsuEy9wRYPG8061GjrsXA3ub/OGFJnpR1h6ADP7KRxq7K+/JGww3KiMYhLu X7e5QL3C7YZ9XBCXnX+MLLmHBFpu0e7xoKnTRxzS21U4eJFXeXGt4/KtVJlu/KOkk1xbDT Co5TnMflZ7+oUnWxWYUuFsj/SjusAEg= X-MC-Unique: ObDLe1bVOSCYo0yf-XNJ4g-1 X-Mimecast-MFC-AGG-ID: ObDLe1bVOSCYo0yf-XNJ4g_1750923968 From: =?UTF-8?q?C=C3=A9dric=20Le=20Goater?= To: qemu-devel@nongnu.org Cc: Alex Williamson , John Levon , John Johnson , Elena Ufimtseva , Jagannathan Raman , =?UTF-8?q?C=C3=A9dric=20Le=20Goater?= Subject: [PULL 12/25] vfio-user: implement VFIO_USER_DEVICE_GET_REGION_INFO Date: Thu, 26 Jun 2025 09:45:16 +0200 Message-ID: <20250626074529.1384114-13-clg@redhat.com> In-Reply-To: <20250626074529.1384114-1-clg@redhat.com> References: <20250626074529.1384114-1-clg@redhat.com> MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.93 Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=170.10.133.124; envelope-from=clg@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H5=0.001, RCVD_IN_MSPIKE_WL=0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, RCVD_IN_VALIDITY_SAFE_BLOCKED=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZohoMail-DKIM: pass (identity @redhat.com) X-ZM-MESSAGEID: 1750924078580116600 From: John Levon Add support for getting region info for vfio-user. As vfio-user has one fd per region, enable ->use_region_fds. Originally-by: John Johnson Signed-off-by: Elena Ufimtseva Signed-off-by: Jagannathan Raman Signed-off-by: John Levon Reviewed-by: C=C3=A9dric Le Goater Link: https://lore.kernel.org/qemu-devel/20250625193012.2316242-7-john.levo= n@nutanix.com Signed-off-by: C=C3=A9dric Le Goater --- hw/vfio-user/device.h | 2 ++ hw/vfio-user/protocol.h | 14 ++++++++ hw/vfio-user/proxy.h | 1 + hw/vfio-user/device.c | 74 +++++++++++++++++++++++++++++++++++++++ hw/vfio-user/pci.c | 11 ++++++ hw/vfio-user/trace-events | 1 + 6 files changed, 103 insertions(+) diff --git a/hw/vfio-user/device.h b/hw/vfio-user/device.h index ef3f71ee697da301baf6d556796850524ac4ef34..619c0f3140650599f930eddeb6d= 8a4a9ebf10b97 100644 --- a/hw/vfio-user/device.h +++ b/hw/vfio-user/device.h @@ -17,4 +17,6 @@ bool vfio_user_get_device_info(VFIOUserProxy *proxy, struct vfio_device_info *info, Error **errp= ); =20 +extern VFIODeviceIOOps vfio_user_device_io_ops_sock; + #endif /* VFIO_USER_DEVICE_H */ diff --git a/hw/vfio-user/protocol.h b/hw/vfio-user/protocol.h index e0bba68739d05ee21282108946620e75dd11163d..db88f5fcb17647613151684894f= 72b2cc77a3c44 100644 --- a/hw/vfio-user/protocol.h +++ b/hw/vfio-user/protocol.h @@ -124,4 +124,18 @@ typedef struct { uint32_t num_irqs; } VFIOUserDeviceInfo; =20 +/* + * VFIO_USER_DEVICE_GET_REGION_INFO + * imported from struct vfio_region_info + */ +typedef struct { + VFIOUserHdr hdr; + uint32_t argsz; + uint32_t flags; + uint32_t index; + uint32_t cap_offset; + uint64_t size; + uint64_t offset; +} VFIOUserRegionInfo; + #endif /* VFIO_USER_PROTOCOL_H */ diff --git a/hw/vfio-user/proxy.h b/hw/vfio-user/proxy.h index 837b02a8c486e9bf71d63b442a362a9a9610ff18..ba1c33aba8d08083c9afd2f203e= d0a04a56cf70f 100644 --- a/hw/vfio-user/proxy.h +++ b/hw/vfio-user/proxy.h @@ -15,6 +15,7 @@ #include "qemu/queue.h" #include "qemu/sockets.h" #include "qemu/thread.h" +#include "hw/vfio/vfio-device.h" #include "hw/vfio-user/protocol.h" =20 typedef struct { diff --git a/hw/vfio-user/device.c b/hw/vfio-user/device.c index 4212fefd44d7f8a5c1b9156765619047e54d7b1b..d90232a08f4b0d8320028796935= b8c59506da69d 100644 --- a/hw/vfio-user/device.c +++ b/hw/vfio-user/device.c @@ -53,3 +53,77 @@ bool vfio_user_get_device_info(VFIOUserProxy *proxy, =20 return true; } + +static int vfio_user_get_region_info(VFIOUserProxy *proxy, + struct vfio_region_info *info, + VFIOUserFDs *fds) +{ + g_autofree VFIOUserRegionInfo *msgp =3D NULL; + Error *local_err =3D NULL; + uint32_t size; + + /* data returned can be larger than vfio_region_info */ + if (info->argsz < sizeof(*info)) { + error_printf("vfio_user_get_region_info argsz too small\n"); + return -E2BIG; + } + if (fds !=3D NULL && fds->send_fds !=3D 0) { + error_printf("vfio_user_get_region_info can't send FDs\n"); + return -EINVAL; + } + + size =3D info->argsz + sizeof(VFIOUserHdr); + msgp =3D g_malloc0(size); + + vfio_user_request_msg(&msgp->hdr, VFIO_USER_DEVICE_GET_REGION_INFO, + sizeof(*msgp), 0); + msgp->argsz =3D info->argsz; + msgp->index =3D info->index; + + if (!vfio_user_send_wait(proxy, &msgp->hdr, fds, size, &local_err)) { + error_prepend(&local_err, "%s: ", __func__); + error_report_err(local_err); + return -EFAULT; + } + + if (msgp->hdr.flags & VFIO_USER_ERROR) { + return -msgp->hdr.error_reply; + } + trace_vfio_user_get_region_info(msgp->index, msgp->flags, msgp->size); + + memcpy(info, &msgp->argsz, info->argsz); + return 0; +} + + +static int vfio_user_device_io_get_region_info(VFIODevice *vbasedev, + struct vfio_region_info *in= fo, + int *fd) +{ + VFIOUserFDs fds =3D { 0, 1, fd}; + int ret; + + if (info->index > vbasedev->num_regions) { + return -EINVAL; + } + + ret =3D vfio_user_get_region_info(vbasedev->proxy, info, &fds); + if (ret) { + return ret; + } + + /* cap_offset in valid area */ + if ((info->flags & VFIO_REGION_INFO_FLAG_CAPS) && + (info->cap_offset < sizeof(*info) || info->cap_offset > info->args= z)) { + return -EINVAL; + } + + return 0; +} + +/* + * Socket-based io_ops + */ +VFIODeviceIOOps vfio_user_device_io_ops_sock =3D { + .get_region_info =3D vfio_user_device_io_get_region_info, +}; diff --git a/hw/vfio-user/pci.c b/hw/vfio-user/pci.c index 61f525cf4a77f0dd0204d1cf28d213ef13824164..d704e3d390c71444c9d913e38c4= dc064ff1c6625 100644 --- a/hw/vfio-user/pci.c +++ b/hw/vfio-user/pci.c @@ -12,6 +12,7 @@ =20 #include "hw/qdev-properties.h" #include "hw/vfio/pci.h" +#include "hw/vfio-user/device.h" #include "hw/vfio-user/proxy.h" =20 #define TYPE_VFIO_USER_PCI "vfio-user-pci" @@ -103,11 +104,21 @@ static void vfio_user_pci_realize(PCIDevice *pdev, Er= ror **errp) goto error; } =20 + /* + * Use socket-based device I/O instead of vfio kernel driver. + */ + vbasedev->io_ops =3D &vfio_user_device_io_ops_sock; + /* * vfio-user devices are effectively mdevs (don't use a host iommu). */ vbasedev->mdev =3D true; =20 + /* + * Enable per-region fds. + */ + vbasedev->use_region_fds =3D true; + as =3D pci_device_iommu_address_space(pdev); if (!vfio_device_attach_by_iommu_type(TYPE_VFIO_IOMMU_USER, vbasedev->name, vbasedev, diff --git a/hw/vfio-user/trace-events b/hw/vfio-user/trace-events index b7312d6d5928dc7d1d984fdb501b7146e2011a79..ef3f14c74dd5a6a3c0870795e9a= b9291d26dc925 100644 --- a/hw/vfio-user/trace-events +++ b/hw/vfio-user/trace-events @@ -9,3 +9,4 @@ vfio_user_recv_request(uint16_t cmd) " command 0x%x" vfio_user_send_write(uint16_t id, int wrote) " id 0x%x wrote 0x%x" vfio_user_version(uint16_t major, uint16_t minor, const char *caps) " majo= r %d minor %d caps: %s" vfio_user_get_info(uint32_t nregions, uint32_t nirqs) " #regions %d #irqs = %d" +vfio_user_get_region_info(uint32_t index, uint32_t flags, uint64_t size) "= index %d flags 0x%x size 0x%"PRIx64 --=20 2.49.0 From nobody Sat Nov 15 14:49:49 2025 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=quarantine dis=none) header.from=redhat.com ARC-Seal: i=1; a=rsa-sha256; t=1750924079; cv=none; d=zohomail.com; s=zohoarc; b=Mpsetbc+bklrkzUEbgDS4whudwwK/BjQ3oQLUm9dr+l0ZHZACS1UA9ioVvMyaxsHPKTY4x3mvLPDpPmSchKi5StjwaHMkW84XomR0I2I+8WVvyhhHix728rQzPbdOh41VOCGq8ztSlTa1MPC7xIEhh4nQFINKlF2iCqJKgdwzSA= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1750924079; h=Content-Type:Content-Transfer-Encoding:Cc:Cc:Date:Date:From:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:Subject:To:To:Message-Id:Reply-To; bh=E6D6Z+DLkGDP2rfLDxxYiy5ZSAQnqxmvOdaY+vUUigg=; b=G1sbZ+J0ABzpgZwEB0YSJU81aQ8DUCVcWRdnbxsEe4g5PSJq9l4+2FzMNX8rRBxTIwLXOajQtiPX7aAENqkcgc7AzUnGMEWaGyQBpF+z9fbNvGWfsXXnvZeq6i3kGl3Y8FVER8TrfP1drzKqMRx3Et/E5054oyW3ObdhYjyDPFY= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass header.from= (p=quarantine dis=none) Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1750924079014466.0586930035296; Thu, 26 Jun 2025 00:47:59 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1uUhJe-00050M-53; Thu, 26 Jun 2025 03:46:38 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1uUhJJ-0004Zt-P0 for qemu-devel@nongnu.org; Thu, 26 Jun 2025 03:46:19 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1uUhJH-0001yC-Ck for qemu-devel@nongnu.org; Thu, 26 Jun 2025 03:46:17 -0400 Received: from mx-prod-mc-05.mail-002.prod.us-west-2.aws.redhat.com (ec2-54-186-198-63.us-west-2.compute.amazonaws.com [54.186.198.63]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-396-N5sGj4mGOHOmKxd5XNaVnw-1; Thu, 26 Jun 2025 03:46:12 -0400 Received: from mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.93]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-05.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 815891944AAD; Thu, 26 Jun 2025 07:46:11 +0000 (UTC) Received: from corto.redhat.com (unknown [10.44.32.51]) by mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id D7B3D180035C; Thu, 26 Jun 2025 07:46:08 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1750923974; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=E6D6Z+DLkGDP2rfLDxxYiy5ZSAQnqxmvOdaY+vUUigg=; b=jL9eZdjpQ+UUWwyjBudPTOJunx7/2K1LGm78P5OfkFPtwcbDPZ9b1CTsh5Eir8eP11qrYu W+PE/zZvbQ08Poekoedzdjnbwy+LzRoT4QKgrqn04O5gdhox9B0QeRt7uA27kYoC6/yH11 JCOa+V3WRFm9HPnhLRmdHLYgj/sudmY= X-MC-Unique: N5sGj4mGOHOmKxd5XNaVnw-1 X-Mimecast-MFC-AGG-ID: N5sGj4mGOHOmKxd5XNaVnw_1750923971 From: =?UTF-8?q?C=C3=A9dric=20Le=20Goater?= To: qemu-devel@nongnu.org Cc: Alex Williamson , John Levon , John Johnson , Elena Ufimtseva , Jagannathan Raman , =?UTF-8?q?C=C3=A9dric=20Le=20Goater?= Subject: [PULL 13/25] vfio-user: implement VFIO_USER_REGION_READ/WRITE Date: Thu, 26 Jun 2025 09:45:17 +0200 Message-ID: <20250626074529.1384114-14-clg@redhat.com> In-Reply-To: <20250626074529.1384114-1-clg@redhat.com> References: <20250626074529.1384114-1-clg@redhat.com> MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.93 Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=170.10.133.124; envelope-from=clg@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H5=0.001, RCVD_IN_MSPIKE_WL=0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, RCVD_IN_VALIDITY_SAFE_BLOCKED=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZohoMail-DKIM: pass (identity @redhat.com) X-ZM-MESSAGEID: 1750924080503116600 From: John Levon Originally-by: John Johnson Signed-off-by: Elena Ufimtseva Signed-off-by: Jagannathan Raman Signed-off-by: John Levon Reviewed-by: C=C3=A9dric Le Goater Link: https://lore.kernel.org/qemu-devel/20250625193012.2316242-8-john.levo= n@nutanix.com Signed-off-by: C=C3=A9dric Le Goater --- hw/vfio-user/protocol.h | 12 ++++++ hw/vfio-user/device.c | 79 +++++++++++++++++++++++++++++++++++++++ hw/vfio-user/trace-events | 1 + 3 files changed, 92 insertions(+) diff --git a/hw/vfio-user/protocol.h b/hw/vfio-user/protocol.h index db88f5fcb17647613151684894f72b2cc77a3c44..0cd32ad71ae8e6767f613f9ef73= aaf1858c80f90 100644 --- a/hw/vfio-user/protocol.h +++ b/hw/vfio-user/protocol.h @@ -138,4 +138,16 @@ typedef struct { uint64_t offset; } VFIOUserRegionInfo; =20 +/* + * VFIO_USER_REGION_READ + * VFIO_USER_REGION_WRITE + */ +typedef struct { + VFIOUserHdr hdr; + uint64_t offset; + uint32_t region; + uint32_t count; + char data[]; +} VFIOUserRegionRW; + #endif /* VFIO_USER_PROTOCOL_H */ diff --git a/hw/vfio-user/device.c b/hw/vfio-user/device.c index d90232a08f4b0d8320028796935b8c59506da69d..4ef6a2d850c282cd8bf4a07659e= 7b751b1432735 100644 --- a/hw/vfio-user/device.c +++ b/hw/vfio-user/device.c @@ -121,9 +121,88 @@ static int vfio_user_device_io_get_region_info(VFIODev= ice *vbasedev, return 0; } =20 +static int vfio_user_device_io_region_read(VFIODevice *vbasedev, uint8_t i= ndex, + off_t off, uint32_t count, + void *data) +{ + g_autofree VFIOUserRegionRW *msgp =3D NULL; + VFIOUserProxy *proxy =3D vbasedev->proxy; + int size =3D sizeof(*msgp) + count; + Error *local_err =3D NULL; + + if (count > proxy->max_xfer_size) { + return -EINVAL; + } + + msgp =3D g_malloc0(size); + vfio_user_request_msg(&msgp->hdr, VFIO_USER_REGION_READ, sizeof(*msgp)= , 0); + msgp->offset =3D off; + msgp->region =3D index; + msgp->count =3D count; + trace_vfio_user_region_rw(msgp->region, msgp->offset, msgp->count); + + if (!vfio_user_send_wait(proxy, &msgp->hdr, NULL, size, &local_err)) { + error_prepend(&local_err, "%s: ", __func__); + error_report_err(local_err); + return -EFAULT; + } + + if (msgp->hdr.flags & VFIO_USER_ERROR) { + return -msgp->hdr.error_reply; + } else if (msgp->count > count) { + return -E2BIG; + } else { + memcpy(data, &msgp->data, msgp->count); + } + + return msgp->count; +} + +static int vfio_user_device_io_region_write(VFIODevice *vbasedev, uint8_t = index, + off_t off, unsigned count, + void *data, bool post) +{ + g_autofree VFIOUserRegionRW *msgp =3D NULL; + VFIOUserProxy *proxy =3D vbasedev->proxy; + int size =3D sizeof(*msgp) + count; + Error *local_err =3D NULL; + int ret; + + if (count > proxy->max_xfer_size) { + return -EINVAL; + } + + msgp =3D g_malloc0(size); + vfio_user_request_msg(&msgp->hdr, VFIO_USER_REGION_WRITE, size, 0); + msgp->offset =3D off; + msgp->region =3D index; + msgp->count =3D count; + memcpy(&msgp->data, data, count); + trace_vfio_user_region_rw(msgp->region, msgp->offset, msgp->count); + + /* Ignore post: all writes are synchronous/non-posted. */ + + if (!vfio_user_send_wait(proxy, &msgp->hdr, NULL, 0, &local_err)) { + error_prepend(&local_err, "%s: ", __func__); + error_report_err(local_err); + return -EFAULT; + } + + if (msgp->hdr.flags & VFIO_USER_ERROR) { + ret =3D -msgp->hdr.error_reply; + } else { + ret =3D count; + } + + return ret; +} + /* * Socket-based io_ops */ VFIODeviceIOOps vfio_user_device_io_ops_sock =3D { .get_region_info =3D vfio_user_device_io_get_region_info, + .region_read =3D vfio_user_device_io_region_read, + .region_write =3D vfio_user_device_io_region_write, + }; diff --git a/hw/vfio-user/trace-events b/hw/vfio-user/trace-events index ef3f14c74dd5a6a3c0870795e9ab9291d26dc925..733313cd1f60de37d24f6e0801a= a72fb4717bda6 100644 --- a/hw/vfio-user/trace-events +++ b/hw/vfio-user/trace-events @@ -10,3 +10,4 @@ vfio_user_send_write(uint16_t id, int wrote) " id 0x%x wr= ote 0x%x" vfio_user_version(uint16_t major, uint16_t minor, const char *caps) " majo= r %d minor %d caps: %s" vfio_user_get_info(uint32_t nregions, uint32_t nirqs) " #regions %d #irqs = %d" vfio_user_get_region_info(uint32_t index, uint32_t flags, uint64_t size) "= index %d flags 0x%x size 0x%"PRIx64 +vfio_user_region_rw(uint32_t region, uint64_t off, uint32_t count) " regio= n %d offset 0x%"PRIx64" count %d" --=20 2.49.0 From nobody Sat Nov 15 14:49:49 2025 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=quarantine dis=none) header.from=redhat.com ARC-Seal: i=1; a=rsa-sha256; t=1750924084; cv=none; d=zohomail.com; s=zohoarc; b=G8rWUg6460woD9DE84sy1TJNZW2CnhUo1FN+mtoYuwvBhrDN6hwu5bSNNNR0lXUHRY+p/fDpGRuLNTZ23PdRzEGeaP8E19pqsU9BDcYWEEmmpbt3gmxvEfuY0Q5GGDQ/faWLCuxL3yrr64GNlUrP+b2W66obSRPXOXn2sfNGQkI= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1750924084; h=Content-Type:Content-Transfer-Encoding:Cc:Cc:Date:Date:From:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:Subject:To:To:Message-Id:Reply-To; bh=otGVkjCf0opn+vETCAGncq2bMRafF3DOvc1cBsAO+2s=; b=eTc9c+Bx/YmeifokTbR0F2S0wpsZuN5WF051sDs9hq6QgrAC3aW2XSAcPTZ63JwS6TTfaoQPYbjKfJ6BKDRrF20wWjAr8auPOJeEqdWj/d7ByBTEguZWkamsvEvzvuzqYSNzDTu7ZdaJIkILvd6PnvDvkiOuEK1SW3z8hYIVbBY= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass header.from= (p=quarantine dis=none) Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1750924084710536.8487361586011; Thu, 26 Jun 2025 00:48:04 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1uUhJk-0005MY-PA; Thu, 26 Jun 2025 03:46:44 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1uUhJU-0004kk-Po for qemu-devel@nongnu.org; Thu, 26 Jun 2025 03:46:33 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.129.124]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1uUhJP-0001ys-Ji for qemu-devel@nongnu.org; Thu, 26 Jun 2025 03:46:28 -0400 Received: from mx-prod-mc-06.mail-002.prod.us-west-2.aws.redhat.com (ec2-35-165-154-97.us-west-2.compute.amazonaws.com [35.165.154.97]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-347-P1n4hmnePDiMmlmkC1o09g-1; Thu, 26 Jun 2025 03:46:15 -0400 Received: from mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.93]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-06.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 70D3A18001D1; Thu, 26 Jun 2025 07:46:14 +0000 (UTC) Received: from corto.redhat.com (unknown [10.44.32.51]) by mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id 06FA3180045B; Thu, 26 Jun 2025 07:46:11 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1750923981; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=otGVkjCf0opn+vETCAGncq2bMRafF3DOvc1cBsAO+2s=; b=KEpINegZEm/0xlYZKWKFrW8uJL47kG9L0Ynuc/JACWRF0weeJREv1fxks/fPvbQvSq887d KvLHSw1LB5ycZNVLoanCxQASsMxoYNp4S+763WO5CFRHEs8D2trWNGNdBkQsb0Wf7gli36 L9UkMsSJEAuD394KrRpaQIN4tUatX0A= X-MC-Unique: P1n4hmnePDiMmlmkC1o09g-1 X-Mimecast-MFC-AGG-ID: P1n4hmnePDiMmlmkC1o09g_1750923974 From: =?UTF-8?q?C=C3=A9dric=20Le=20Goater?= To: qemu-devel@nongnu.org Cc: Alex Williamson , John Levon , John Johnson , Elena Ufimtseva , Jagannathan Raman , =?UTF-8?q?C=C3=A9dric=20Le=20Goater?= Subject: [PULL 14/25] vfio-user: set up PCI in vfio_user_pci_realize() Date: Thu, 26 Jun 2025 09:45:18 +0200 Message-ID: <20250626074529.1384114-15-clg@redhat.com> In-Reply-To: <20250626074529.1384114-1-clg@redhat.com> References: <20250626074529.1384114-1-clg@redhat.com> MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.93 Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=170.10.129.124; envelope-from=clg@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H5=0.001, RCVD_IN_MSPIKE_WL=0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, RCVD_IN_VALIDITY_SAFE_BLOCKED=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZohoMail-DKIM: pass (identity @redhat.com) X-ZM-MESSAGEID: 1750924086661116600 From: John Levon Re-use PCI setup functions from hw/vfio/pci.c to realize the vfio-user PCI device. Originally-by: John Johnson Signed-off-by: Elena Ufimtseva Signed-off-by: Jagannathan Raman Signed-off-by: John Levon Reviewed-by: C=C3=A9dric Le Goater Link: https://lore.kernel.org/qemu-devel/20250625193012.2316242-9-john.levo= n@nutanix.com Signed-off-by: C=C3=A9dric Le Goater --- hw/vfio-user/pci.c | 29 +++++++++++++++++++++++++++++ 1 file changed, 29 insertions(+) diff --git a/hw/vfio-user/pci.c b/hw/vfio-user/pci.c index d704e3d390c71444c9d913e38c4dc064ff1c6625..b49f42b980017d61a2676c5dd7e= 01a869a63ac50 100644 --- a/hw/vfio-user/pci.c +++ b/hw/vfio-user/pci.c @@ -126,10 +126,39 @@ static void vfio_user_pci_realize(PCIDevice *pdev, Er= ror **errp) goto error; } =20 + if (!vfio_pci_populate_device(vdev, errp)) { + goto error; + } + + if (!vfio_pci_config_setup(vdev, errp)) { + goto error; + } + + /* + * vfio_pci_config_setup will have registered the device's BARs + * and setup any MSIX BARs, so errors after it succeeds must + * use out_teardown + */ + + if (!vfio_pci_add_capabilities(vdev, errp)) { + goto out_teardown; + } + + if (!vfio_pci_interrupt_setup(vdev, errp)) { + goto out_teardown; + } + + vfio_pci_register_err_notifier(vdev); + vfio_pci_register_req_notifier(vdev); + return; =20 +out_teardown: + vfio_pci_teardown_msi(vdev); + vfio_pci_bars_exit(vdev); error: error_prepend(errp, VFIO_MSG_PREFIX, vdev->vbasedev.name); + vfio_pci_put_device(vdev); } =20 static void vfio_user_instance_init(Object *obj) --=20 2.49.0 From nobody Sat Nov 15 14:49:49 2025 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=quarantine dis=none) header.from=redhat.com ARC-Seal: i=1; a=rsa-sha256; t=1750924081; cv=none; d=zohomail.com; s=zohoarc; b=mgDEe7QHIC1BcAh2Co8euHQbVDdHvu/HLzJJgq05rwLN3YqwjOVhb5eyLi34qfsrPTBpxiaY/kjVuBjNaKuf0a7kV5wmE7eDYeUahNPYljCS1OKYWe+iNKB7O1edOL2M/KFm18n69FWdC85TG47OSJe9sxd04nExc6JrdyxSJhM= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1750924081; h=Content-Type:Content-Transfer-Encoding:Cc:Cc:Date:Date:From:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:Subject:To:To:Message-Id:Reply-To; bh=ADCp8nC6oEqO58PRjxGR0DDe9L4EYjnFEK0y4X8B/FI=; b=jXSSOz4PFGPRwIg1M/GpDoovle3cNS4sREfmMXdgeMsq+fJLzfXvtzKFoAuNiUDquI7qnHsEzUHJK/jV5JzTS+psFnfYtELbrfEMrmTPZqzsOoDA/Yr+ViNOVdpt5Ag9uT6fF0GiPIHOBat+geJMdQHidxS1XDhYqTCPSVsFzaA= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass header.from= (p=quarantine dis=none) Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1750924081777443.87787973370337; Thu, 26 Jun 2025 00:48:01 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1uUhJm-0005Ra-1U; Thu, 26 Jun 2025 03:46:46 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1uUhJU-0004kj-Q1 for qemu-devel@nongnu.org; Thu, 26 Jun 2025 03:46:33 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1uUhJP-0001yu-JM for qemu-devel@nongnu.org; Thu, 26 Jun 2025 03:46:28 -0400 Received: from mx-prod-mc-06.mail-002.prod.us-west-2.aws.redhat.com (ec2-35-165-154-97.us-west-2.compute.amazonaws.com [35.165.154.97]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-90-OIU3baGfNWueHpIsc7KkSA-1; Thu, 26 Jun 2025 03:46:18 -0400 Received: from mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.93]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-06.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 3319A180029F; Thu, 26 Jun 2025 07:46:17 +0000 (UTC) Received: from corto.redhat.com (unknown [10.44.32.51]) by mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id ED2E5180035C; Thu, 26 Jun 2025 07:46:14 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1750923981; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=ADCp8nC6oEqO58PRjxGR0DDe9L4EYjnFEK0y4X8B/FI=; b=VHNqiDLgzVpTBGoj40h1fgPQZBAlgBY3x7Wyq53/mjlm6CWceTuewr9r+STboiATGY1E3o ptL8QW6NdRCgkZWWV3dh44JPaVAHkLL/MNpY28FuYXdv1fWZfPr32uKeS0rUWKv8WhlqOF QfESHmEcrXFS6asrXIzhfYhl0Nhjie4= X-MC-Unique: OIU3baGfNWueHpIsc7KkSA-1 X-Mimecast-MFC-AGG-ID: OIU3baGfNWueHpIsc7KkSA_1750923977 From: =?UTF-8?q?C=C3=A9dric=20Le=20Goater?= To: qemu-devel@nongnu.org Cc: Alex Williamson , John Levon , John Johnson , Elena Ufimtseva , Jagannathan Raman , =?UTF-8?q?C=C3=A9dric=20Le=20Goater?= Subject: [PULL 15/25] vfio-user: implement VFIO_USER_DEVICE_GET/SET_IRQ* Date: Thu, 26 Jun 2025 09:45:19 +0200 Message-ID: <20250626074529.1384114-16-clg@redhat.com> In-Reply-To: <20250626074529.1384114-1-clg@redhat.com> References: <20250626074529.1384114-1-clg@redhat.com> MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.93 Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=170.10.133.124; envelope-from=clg@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H5=0.001, RCVD_IN_MSPIKE_WL=0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, RCVD_IN_VALIDITY_SAFE_BLOCKED=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZohoMail-DKIM: pass (identity @redhat.com) X-ZM-MESSAGEID: 1750924084632116600 From: John Levon IRQ setup uses the same semantics as the traditional vfio path, but we need to share the corresponding file descriptors with the server as necessary. Originally-by: John Johnson Signed-off-by: Elena Ufimtseva Signed-off-by: Jagannathan Raman Signed-off-by: John Levon Reviewed-by: C=C3=A9dric Le Goater Link: https://lore.kernel.org/qemu-devel/20250625193012.2316242-10-john.lev= on@nutanix.com Signed-off-by: C=C3=A9dric Le Goater --- hw/vfio-user/protocol.h | 25 +++++++ hw/vfio-user/device.c | 138 ++++++++++++++++++++++++++++++++++++++ hw/vfio-user/trace-events | 2 + 3 files changed, 165 insertions(+) diff --git a/hw/vfio-user/protocol.h b/hw/vfio-user/protocol.h index 0cd32ad71ae8e6767f613f9ef73aaf1858c80f90..48144b2c33324faad4a07e1457a= 4709575466afb 100644 --- a/hw/vfio-user/protocol.h +++ b/hw/vfio-user/protocol.h @@ -138,6 +138,31 @@ typedef struct { uint64_t offset; } VFIOUserRegionInfo; =20 +/* + * VFIO_USER_DEVICE_GET_IRQ_INFO + * imported from struct vfio_irq_info + */ +typedef struct { + VFIOUserHdr hdr; + uint32_t argsz; + uint32_t flags; + uint32_t index; + uint32_t count; +} VFIOUserIRQInfo; + +/* + * VFIO_USER_DEVICE_SET_IRQS + * imported from struct vfio_irq_set + */ +typedef struct { + VFIOUserHdr hdr; + uint32_t argsz; + uint32_t flags; + uint32_t index; + uint32_t start; + uint32_t count; +} VFIOUserIRQSet; + /* * VFIO_USER_REGION_READ * VFIO_USER_REGION_WRITE diff --git a/hw/vfio-user/device.c b/hw/vfio-user/device.c index 4ef6a2d850c282cd8bf4a07659e7b751b1432735..f01b3925c5a5a22cd8f55303d48= b36d9dface54c 100644 --- a/hw/vfio-user/device.c +++ b/hw/vfio-user/device.c @@ -121,6 +121,142 @@ static int vfio_user_device_io_get_region_info(VFIODe= vice *vbasedev, return 0; } =20 +static int vfio_user_device_io_get_irq_info(VFIODevice *vbasedev, + struct vfio_irq_info *info) +{ + VFIOUserProxy *proxy =3D vbasedev->proxy; + Error *local_err =3D NULL; + VFIOUserIRQInfo msg; + + memset(&msg, 0, sizeof(msg)); + vfio_user_request_msg(&msg.hdr, VFIO_USER_DEVICE_GET_IRQ_INFO, + sizeof(msg), 0); + msg.argsz =3D info->argsz; + msg.index =3D info->index; + + if (!vfio_user_send_wait(proxy, &msg.hdr, NULL, 0, &local_err)) { + error_prepend(&local_err, "%s: ", __func__); + error_report_err(local_err); + return -EFAULT; + } + + if (msg.hdr.flags & VFIO_USER_ERROR) { + return -msg.hdr.error_reply; + } + trace_vfio_user_get_irq_info(msg.index, msg.flags, msg.count); + + memcpy(info, &msg.argsz, sizeof(*info)); + return 0; +} + +static int irq_howmany(int *fdp, uint32_t cur, uint32_t max) +{ + int n =3D 0; + + if (fdp[cur] !=3D -1) { + do { + n++; + } while (n < max && fdp[cur + n] !=3D -1); + } else { + do { + n++; + } while (n < max && fdp[cur + n] =3D=3D -1); + } + + return n; +} + +static int vfio_user_device_io_set_irqs(VFIODevice *vbasedev, + struct vfio_irq_set *irq) +{ + VFIOUserProxy *proxy =3D vbasedev->proxy; + g_autofree VFIOUserIRQSet *msgp =3D NULL; + uint32_t size, nfds, send_fds, sent_fds, max; + Error *local_err =3D NULL; + + if (irq->argsz < sizeof(*irq)) { + error_printf("vfio_user_set_irqs argsz too small\n"); + return -EINVAL; + } + + /* + * Handle simple case + */ + if ((irq->flags & VFIO_IRQ_SET_DATA_EVENTFD) =3D=3D 0) { + size =3D sizeof(VFIOUserHdr) + irq->argsz; + msgp =3D g_malloc0(size); + + vfio_user_request_msg(&msgp->hdr, VFIO_USER_DEVICE_SET_IRQS, size,= 0); + msgp->argsz =3D irq->argsz; + msgp->flags =3D irq->flags; + msgp->index =3D irq->index; + msgp->start =3D irq->start; + msgp->count =3D irq->count; + trace_vfio_user_set_irqs(msgp->index, msgp->start, msgp->count, + msgp->flags); + + if (!vfio_user_send_wait(proxy, &msgp->hdr, NULL, 0, &local_err)) { + error_prepend(&local_err, "%s: ", __func__); + error_report_err(local_err); + return -EFAULT; + } + + if (msgp->hdr.flags & VFIO_USER_ERROR) { + return -msgp->hdr.error_reply; + } + + return 0; + } + + /* + * Calculate the number of FDs to send + * and adjust argsz + */ + nfds =3D (irq->argsz - sizeof(*irq)) / sizeof(int); + irq->argsz =3D sizeof(*irq); + msgp =3D g_malloc0(sizeof(*msgp)); + /* + * Send in chunks if over max_send_fds + */ + for (sent_fds =3D 0; nfds > sent_fds; sent_fds +=3D send_fds) { + VFIOUserFDs *arg_fds, loop_fds; + + /* must send all valid FDs or all invalid FDs in single msg */ + max =3D nfds - sent_fds; + if (max > proxy->max_send_fds) { + max =3D proxy->max_send_fds; + } + send_fds =3D irq_howmany((int *)irq->data, sent_fds, max); + + vfio_user_request_msg(&msgp->hdr, VFIO_USER_DEVICE_SET_IRQS, + sizeof(*msgp), 0); + msgp->argsz =3D irq->argsz; + msgp->flags =3D irq->flags; + msgp->index =3D irq->index; + msgp->start =3D irq->start + sent_fds; + msgp->count =3D send_fds; + trace_vfio_user_set_irqs(msgp->index, msgp->start, msgp->count, + msgp->flags); + + loop_fds.send_fds =3D send_fds; + loop_fds.recv_fds =3D 0; + loop_fds.fds =3D (int *)irq->data + sent_fds; + arg_fds =3D loop_fds.fds[0] !=3D -1 ? &loop_fds : NULL; + + if (!vfio_user_send_wait(proxy, &msgp->hdr, arg_fds, 0, &local_err= )) { + error_prepend(&local_err, "%s: ", __func__); + error_report_err(local_err); + return -EFAULT; + } + + if (msgp->hdr.flags & VFIO_USER_ERROR) { + return -msgp->hdr.error_reply; + } + } + + return 0; +} + static int vfio_user_device_io_region_read(VFIODevice *vbasedev, uint8_t i= ndex, off_t off, uint32_t count, void *data) @@ -202,6 +338,8 @@ static int vfio_user_device_io_region_write(VFIODevice = *vbasedev, uint8_t index, */ VFIODeviceIOOps vfio_user_device_io_ops_sock =3D { .get_region_info =3D vfio_user_device_io_get_region_info, + .get_irq_info =3D vfio_user_device_io_get_irq_info, + .set_irqs =3D vfio_user_device_io_set_irqs, .region_read =3D vfio_user_device_io_region_read, .region_write =3D vfio_user_device_io_region_write, =20 diff --git a/hw/vfio-user/trace-events b/hw/vfio-user/trace-events index 733313cd1f60de37d24f6e0801aa72fb4717bda6..aa8f3c3d4d3b30e3d5091d77b8c= a2fccaa847a73 100644 --- a/hw/vfio-user/trace-events +++ b/hw/vfio-user/trace-events @@ -11,3 +11,5 @@ vfio_user_version(uint16_t major, uint16_t minor, const c= har *caps) " major %d m vfio_user_get_info(uint32_t nregions, uint32_t nirqs) " #regions %d #irqs = %d" vfio_user_get_region_info(uint32_t index, uint32_t flags, uint64_t size) "= index %d flags 0x%x size 0x%"PRIx64 vfio_user_region_rw(uint32_t region, uint64_t off, uint32_t count) " regio= n %d offset 0x%"PRIx64" count %d" +vfio_user_get_irq_info(uint32_t index, uint32_t flags, uint32_t count) " i= ndex %d flags 0x%x count %d" +vfio_user_set_irqs(uint32_t index, uint32_t start, uint32_t count, uint32_= t flags) " index %d start %d count %d flags 0x%x" --=20 2.49.0 From nobody Sat Nov 15 14:49:49 2025 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=quarantine dis=none) header.from=redhat.com ARC-Seal: i=1; a=rsa-sha256; t=1750924249; cv=none; d=zohomail.com; s=zohoarc; b=A9QTQRU9AQF6ea/yD2OSh9UQfk+o7aPZlocTjmJ84Snj2Hd5TGU9xLbtKQDCRQ9JW57ZzOlEGH/bWGDKGQSRVIwEvFmGscw8imqKdywLsxY+WRunqSnSsXr5XSHu6RmWRK40Vjun9gaxJroAAoE5NCeGQ3GtGMWL/JViPxn82x4= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1750924249; h=Content-Type:Content-Transfer-Encoding:Cc:Cc:Date:Date:From:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:Subject:To:To:Message-Id:Reply-To; bh=NMVdBE6sPxaz95ENjSAyU0hEzektczFNWQgGZnlVLh8=; b=ghlYBNUeUnHdOKRFdGcNq9RBdzOtlIylm2gZWK9urjeb3j6aEf0sS7daV0IP4iFa3NzFyhI38EaXg1o9YpyDqh4lfVw8aSlgj+YDgbEX61N7h7WC9eYt+PXS6a2lyoLqnZW8aujIYGVc0QM93HxZmq6UMXhjjIhxxW9XPu2AEns= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass header.from= (p=quarantine dis=none) Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1750924248934771.461343556137; Thu, 26 Jun 2025 00:50:48 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1uUhJl-0005Qa-T8; Thu, 26 Jun 2025 03:46:45 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1uUhJY-0004qx-1O for qemu-devel@nongnu.org; Thu, 26 Jun 2025 03:46:33 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1uUhJR-0001z0-IG for qemu-devel@nongnu.org; Thu, 26 Jun 2025 03:46:30 -0400 Received: from mx-prod-mc-05.mail-002.prod.us-west-2.aws.redhat.com (ec2-54-186-198-63.us-west-2.compute.amazonaws.com [54.186.198.63]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-322-lbaYQ96JOLKXjMdTzHr5cQ-1; Thu, 26 Jun 2025 03:46:21 -0400 Received: from mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.93]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-05.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 21511193F069; Thu, 26 Jun 2025 07:46:20 +0000 (UTC) Received: from corto.redhat.com (unknown [10.44.32.51]) by mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id AFAE6180035C; Thu, 26 Jun 2025 07:46:17 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1750923984; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=NMVdBE6sPxaz95ENjSAyU0hEzektczFNWQgGZnlVLh8=; b=JTFlgTt785jf2Z4u5iREPCmHmqlkd8tapOPmYmTMdMYXFW9Cs/PQIVavuN/imwOMt1h3hc jWwxUmfMRaY5+XBtA88udOk88IA+fc9nAoo4tMtNJDUvXbzyQ6hZWoOLg59DQmT2A2hsYA k7l7LX+nk6bvulioiAI1bHtdOG1zxL0= X-MC-Unique: lbaYQ96JOLKXjMdTzHr5cQ-1 X-Mimecast-MFC-AGG-ID: lbaYQ96JOLKXjMdTzHr5cQ_1750923980 From: =?UTF-8?q?C=C3=A9dric=20Le=20Goater?= To: qemu-devel@nongnu.org Cc: Alex Williamson , John Levon , John Johnson , Elena Ufimtseva , Jagannathan Raman , =?UTF-8?q?C=C3=A9dric=20Le=20Goater?= Subject: [PULL 16/25] vfio-user: forward MSI-X PBA BAR accesses to server Date: Thu, 26 Jun 2025 09:45:20 +0200 Message-ID: <20250626074529.1384114-17-clg@redhat.com> In-Reply-To: <20250626074529.1384114-1-clg@redhat.com> References: <20250626074529.1384114-1-clg@redhat.com> MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.93 Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=170.10.133.124; envelope-from=clg@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H5=0.001, RCVD_IN_MSPIKE_WL=0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, RCVD_IN_VALIDITY_SAFE_BLOCKED=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZohoMail-DKIM: pass (identity @redhat.com) X-ZM-MESSAGEID: 1750924250550116600 From: John Levon For vfio-user, the server holds the pending IRQ state; set up an I/O region for the MSI-X PBA so we can ask the server for this state on a PBA read. Originally-by: John Johnson Signed-off-by: Elena Ufimtseva Signed-off-by: Jagannathan Raman Signed-off-by: John Levon Reviewed-by: C=C3=A9dric Le Goater Link: https://lore.kernel.org/qemu-devel/20250625193012.2316242-11-john.lev= on@nutanix.com Signed-off-by: C=C3=A9dric Le Goater --- hw/vfio/pci.h | 1 + hw/vfio-user/pci.c | 64 ++++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 65 insertions(+) diff --git a/hw/vfio/pci.h b/hw/vfio/pci.h index d3dc2274a97bc591b02df117f4488d24cd39fe7a..5ba7330b27e80d1a565da270468= 9e48fa9bece18 100644 --- a/hw/vfio/pci.h +++ b/hw/vfio/pci.h @@ -116,6 +116,7 @@ typedef struct VFIOMSIXInfo { uint32_t pba_offset; unsigned long *pending; bool noresize; + MemoryRegion *pba_region; } VFIOMSIXInfo; =20 /* diff --git a/hw/vfio-user/pci.c b/hw/vfio-user/pci.c index b49f42b980017d61a2676c5dd7e01a869a63ac50..c0f00f15b10a80820606ff4ef03= 83d59ecd91670 100644 --- a/hw/vfio-user/pci.c +++ b/hw/vfio-user/pci.c @@ -24,6 +24,62 @@ struct VFIOUserPCIDevice { bool send_queued; /* all sends are queued */ }; =20 +/* + * The server maintains the device's pending interrupts, + * via its MSIX table and PBA, so we treat these accesses + * like PCI config space and forward them. + */ +static uint64_t vfio_user_pba_read(void *opaque, hwaddr addr, + unsigned size) +{ + VFIOPCIDevice *vdev =3D opaque; + VFIORegion *region =3D &vdev->bars[vdev->msix->pba_bar].region; + uint64_t data; + + /* server copy is what matters */ + data =3D vfio_region_read(region, addr + vdev->msix->pba_offset, size); + return data; +} + +static void vfio_user_pba_write(void *opaque, hwaddr addr, + uint64_t data, unsigned size) +{ + /* dropped */ +} + +static const MemoryRegionOps vfio_user_pba_ops =3D { + .read =3D vfio_user_pba_read, + .write =3D vfio_user_pba_write, + .endianness =3D DEVICE_LITTLE_ENDIAN, +}; + +static void vfio_user_msix_setup(VFIOPCIDevice *vdev) +{ + MemoryRegion *vfio_reg, *msix_reg, *pba_reg; + + pba_reg =3D g_new0(MemoryRegion, 1); + vdev->msix->pba_region =3D pba_reg; + + vfio_reg =3D vdev->bars[vdev->msix->pba_bar].mr; + msix_reg =3D &vdev->pdev.msix_pba_mmio; + memory_region_init_io(pba_reg, OBJECT(vdev), &vfio_user_pba_ops, vdev, + "VFIO MSIX PBA", int128_get64(msix_reg->size)); + memory_region_add_subregion_overlap(vfio_reg, vdev->msix->pba_offset, + pba_reg, 1); +} + +static void vfio_user_msix_teardown(VFIOPCIDevice *vdev) +{ + MemoryRegion *mr, *sub; + + mr =3D vdev->bars[vdev->msix->pba_bar].mr; + sub =3D vdev->msix->pba_region; + memory_region_del_subregion(mr, sub); + + g_free(vdev->msix->pba_region); + vdev->msix->pba_region =3D NULL; +} + /* * Incoming request message callback. * @@ -144,6 +200,10 @@ static void vfio_user_pci_realize(PCIDevice *pdev, Err= or **errp) goto out_teardown; } =20 + if (vdev->msix !=3D NULL) { + vfio_user_msix_setup(vdev); + } + if (!vfio_pci_interrupt_setup(vdev, errp)) { goto out_teardown; } @@ -192,6 +252,10 @@ static void vfio_user_instance_finalize(Object *obj) VFIOPCIDevice *vdev =3D VFIO_PCI_BASE(obj); VFIODevice *vbasedev =3D &vdev->vbasedev; =20 + if (vdev->msix !=3D NULL) { + vfio_user_msix_teardown(vdev); + } + vfio_pci_put_device(vdev); =20 if (vbasedev->proxy !=3D NULL) { --=20 2.49.0 From nobody Sat Nov 15 14:49:49 2025 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=quarantine dis=none) header.from=redhat.com ARC-Seal: i=1; a=rsa-sha256; t=1750924082; cv=none; d=zohomail.com; s=zohoarc; b=HatayoWTtnxzNZQ2qptvZ5/B+s1ihgjVq6Hc7V4vx1ipjUDGZyRjzY7sfAr9wGkoVd4tOtTurrTttLNpKPSYybPHu/1cVjPkLH9CKHv4YikpaA9MO08hGO19ygQAi6nTNKC80WWdGDz/H3OB8KTYR/fBgcsPOpjM+1hcE6lRViM= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1750924082; h=Content-Type:Content-Transfer-Encoding:Cc:Cc:Date:Date:From:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:Subject:To:To:Message-Id:Reply-To; bh=ZCSr4dPM/OB5HkYMabf3A1sfa8rX74gpdBjkUH/fH6Y=; b=a4e8LHz48I+mUBnULWn6m3ZteditAMXEaKJ9efP/yCWuvhy25kI8d9+2eKrUg4bfs185Zb8Ktr1GFPpPKqcyWwOdoW9Y0To0499KWsogdv6mFbQaVczr28bErjnlCLe0sTf7NFTlvQqoKpWc3QTQOGpzw6xVv2PTS9UVoFkG48s= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass header.from= (p=quarantine dis=none) Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1750924082299953.7684573749274; Thu, 26 Jun 2025 00:48:02 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1uUhJj-0005FA-D9; Thu, 26 Jun 2025 03:46:43 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1uUhJY-0004qy-9p for qemu-devel@nongnu.org; Thu, 26 Jun 2025 03:46:33 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1uUhJT-0001zA-Pq for qemu-devel@nongnu.org; Thu, 26 Jun 2025 03:46:30 -0400 Received: from mx-prod-mc-03.mail-002.prod.us-west-2.aws.redhat.com (ec2-54-186-198-63.us-west-2.compute.amazonaws.com [54.186.198.63]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-170-41vcD9K3PCGhFUHlaM_fZg-1; Thu, 26 Jun 2025 03:46:24 -0400 Received: from mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.93]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-03.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 00F7A18DA380; Thu, 26 Jun 2025 07:46:23 +0000 (UTC) Received: from corto.redhat.com (unknown [10.44.32.51]) by mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id 9AD6F180045B; Thu, 26 Jun 2025 07:46:20 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1750923986; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=ZCSr4dPM/OB5HkYMabf3A1sfa8rX74gpdBjkUH/fH6Y=; b=Og+PzRy+lQPTMuuXr0H5w8mxT9680tKSpaianNelt7+wbQFb2ZIJ/LvBhBOoWa44aFOGgH NSMrUlfEyPtcyTu/ab65m8rNuefxsku5VYZ+MZMpKLRqZnOtHeVv8RVy7Gg9+tR4IxazoT QMfg94IwTX/0wVt+KYIj6VU4ZDteKRg= X-MC-Unique: 41vcD9K3PCGhFUHlaM_fZg-1 X-Mimecast-MFC-AGG-ID: 41vcD9K3PCGhFUHlaM_fZg_1750923983 From: =?UTF-8?q?C=C3=A9dric=20Le=20Goater?= To: qemu-devel@nongnu.org Cc: Alex Williamson , John Levon , John Johnson , Elena Ufimtseva , Jagannathan Raman , =?UTF-8?q?C=C3=A9dric=20Le=20Goater?= Subject: [PULL 17/25] vfio-user: set up container access to the proxy Date: Thu, 26 Jun 2025 09:45:21 +0200 Message-ID: <20250626074529.1384114-18-clg@redhat.com> In-Reply-To: <20250626074529.1384114-1-clg@redhat.com> References: <20250626074529.1384114-1-clg@redhat.com> MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.93 Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=170.10.133.124; envelope-from=clg@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H5=0.001, RCVD_IN_MSPIKE_WL=0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, RCVD_IN_VALIDITY_SAFE_BLOCKED=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZohoMail-DKIM: pass (identity @redhat.com) X-ZM-MESSAGEID: 1750924084546116600 From: John Levon The user container will shortly need access to the underlying vfio-user proxy; set this up. Originally-by: John Johnson Signed-off-by: Elena Ufimtseva Signed-off-by: Jagannathan Raman Signed-off-by: John Levon Reviewed-by: C=C3=A9dric Le Goater Link: https://lore.kernel.org/qemu-devel/20250625193012.2316242-12-john.lev= on@nutanix.com Signed-off-by: C=C3=A9dric Le Goater --- hw/vfio-user/container.h | 2 ++ hw/vfio-user/container.c | 43 +++++++++++++++++++++++++++++++--------- 2 files changed, 36 insertions(+), 9 deletions(-) diff --git a/hw/vfio-user/container.h b/hw/vfio-user/container.h index e4a46d2c1b0f712cd7fc1ff62a65d889050de835..2bb1fa13431c280f595f52cd651= 0ccd141f1ffdb 100644 --- a/hw/vfio-user/container.h +++ b/hw/vfio-user/container.h @@ -10,10 +10,12 @@ #include "qemu/osdep.h" =20 #include "hw/vfio/vfio-container-base.h" +#include "hw/vfio-user/proxy.h" =20 /* MMU container sub-class for vfio-user. */ typedef struct VFIOUserContainer { VFIOContainerBase bcontainer; + VFIOUserProxy *proxy; } VFIOUserContainer; =20 OBJECT_DECLARE_SIMPLE_TYPE(VFIOUserContainer, VFIO_IOMMU_USER); diff --git a/hw/vfio-user/container.c b/hw/vfio-user/container.c index f5bfd5431675c3b8abfb3ba44098ac61201fde58..b4a5a840b089c5622ade83e632c= 2560475076e5b 100644 --- a/hw/vfio-user/container.c +++ b/hw/vfio-user/container.c @@ -49,15 +49,28 @@ static int vfio_user_query_dirty_bitmap(const VFIOConta= inerBase *bcontainer, =20 static bool vfio_user_setup(VFIOContainerBase *bcontainer, Error **errp) { - error_setg_errno(errp, ENOTSUP, "Not supported"); - return -ENOTSUP; + VFIOUserContainer *container =3D container_of(bcontainer, VFIOUserCont= ainer, + bcontainer); + + assert(container->proxy->dma_pgsizes !=3D 0); + bcontainer->pgsizes =3D container->proxy->dma_pgsizes; + bcontainer->dma_max_mappings =3D container->proxy->max_dma; + + /* No live migration support yet. */ + bcontainer->dirty_pages_supported =3D false; + bcontainer->max_dirty_bitmap_size =3D container->proxy->max_bitmap; + bcontainer->dirty_pgsizes =3D container->proxy->migr_pgsize; + + return true; } =20 -static VFIOUserContainer *vfio_user_create_container(Error **errp) +static VFIOUserContainer *vfio_user_create_container(VFIODevice *vbasedev, + Error **errp) { VFIOUserContainer *container; =20 container =3D VFIO_IOMMU_USER(object_new(TYPE_VFIO_IOMMU_USER)); + container->proxy =3D vbasedev->proxy; return container; } =20 @@ -65,16 +78,18 @@ static VFIOUserContainer *vfio_user_create_container(Er= ror **errp) * Try to mirror vfio_container_connect() as much as possible. */ static VFIOUserContainer * -vfio_user_container_connect(AddressSpace *as, Error **errp) +vfio_user_container_connect(AddressSpace *as, VFIODevice *vbasedev, + Error **errp) { VFIOContainerBase *bcontainer; VFIOUserContainer *container; VFIOAddressSpace *space; VFIOIOMMUClass *vioc; + int ret; =20 space =3D vfio_address_space_get(as); =20 - container =3D vfio_user_create_container(errp); + container =3D vfio_user_create_container(vbasedev, errp); if (!container) { goto put_space_exit; } @@ -85,11 +100,17 @@ vfio_user_container_connect(AddressSpace *as, Error **= errp) goto free_container_exit; } =20 + ret =3D ram_block_uncoordinated_discard_disable(true); + if (ret) { + error_setg_errno(errp, -ret, "Cannot set discarding of RAM broken"= ); + goto unregister_container_exit; + } + vioc =3D VFIO_IOMMU_GET_CLASS(bcontainer); assert(vioc->setup); =20 if (!vioc->setup(bcontainer, errp)) { - goto unregister_container_exit; + goto enable_discards_exit; } =20 vfio_address_space_insert(space, bcontainer); @@ -108,6 +129,9 @@ listener_release_exit: vioc->release(bcontainer); } =20 +enable_discards_exit: + ram_block_uncoordinated_discard_disable(false); + unregister_container_exit: vfio_cpr_unregister_container(bcontainer); =20 @@ -124,14 +148,15 @@ static void vfio_user_container_disconnect(VFIOUserCo= ntainer *container) { VFIOContainerBase *bcontainer =3D &container->bcontainer; VFIOIOMMUClass *vioc =3D VFIO_IOMMU_GET_CLASS(bcontainer); + VFIOAddressSpace *space =3D bcontainer->space; + + ram_block_uncoordinated_discard_disable(false); =20 vfio_listener_unregister(bcontainer); if (vioc->release) { vioc->release(bcontainer); } =20 - VFIOAddressSpace *space =3D bcontainer->space; - vfio_cpr_unregister_container(bcontainer); object_unref(container); =20 @@ -163,7 +188,7 @@ static bool vfio_user_device_attach(const char *name, V= FIODevice *vbasedev, { VFIOUserContainer *container; =20 - container =3D vfio_user_container_connect(as, errp); + container =3D vfio_user_container_connect(as, vbasedev, errp); if (container =3D=3D NULL) { error_prepend(errp, "failed to connect proxy"); return false; --=20 2.49.0 From nobody Sat Nov 15 14:49:49 2025 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=quarantine dis=none) header.from=redhat.com ARC-Seal: i=1; a=rsa-sha256; t=1750924179; cv=none; d=zohomail.com; s=zohoarc; b=UF+7B2Gdxvan3v2REddZVVVkf515SLX4E4Af0zEgLI83ZbUJFlORDGId8xPzLgER1vS3l10u0pkcvj8OecxEP2SQTdeZpsElEyBXK7AyR2+oQolcduAfX3yu8HzafZwqHG++aFpSUJnZwXe8fHIQinwCcidSRoXsZ6EeezhccVA= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1750924179; h=Content-Type:Content-Transfer-Encoding:Cc:Cc:Date:Date:From:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:Subject:To:To:Message-Id:Reply-To; bh=QwrYbNhZkajaewKca5Og/0GorUd7Ik/9Cud+ESMPAxA=; b=CIbVfLGFpW10v99F5sJCOfSS0Et1Q/CMbaf9QR23Rv6doT4f9NGEEAFAuM3a3ziZJbjALjAdGi1cbNK7j420V3754lqmemVe8I55bIpNf+NtkCje+LfuxeCUrwzj2Yg5CTk/Ns3ycjy1oamMjp5YFSFalkK106CT9zGYw0EKF4s= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass header.from= (p=quarantine dis=none) Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 175092417917473.92589134991033; Thu, 26 Jun 2025 00:49:39 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1uUhJh-0005A6-Qs; Thu, 26 Jun 2025 03:46:41 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1uUhJb-0004uM-Kq for qemu-devel@nongnu.org; Thu, 26 Jun 2025 03:46:37 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.129.124]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1uUhJZ-0001zm-Jq for qemu-devel@nongnu.org; Thu, 26 Jun 2025 03:46:35 -0400 Received: from mx-prod-mc-06.mail-002.prod.us-west-2.aws.redhat.com (ec2-35-165-154-97.us-west-2.compute.amazonaws.com [35.165.154.97]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-665-yhnMQZoeP-aYKDBAr6-rbQ-1; Thu, 26 Jun 2025 03:46:27 -0400 Received: from mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.93]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-06.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 9EEDE1800268; Thu, 26 Jun 2025 07:46:26 +0000 (UTC) Received: from corto.redhat.com (unknown [10.44.32.51]) by mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id 83FDD180045B; Thu, 26 Jun 2025 07:46:23 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1750923991; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=QwrYbNhZkajaewKca5Og/0GorUd7Ik/9Cud+ESMPAxA=; b=LY4NNxAK3amSNqjywCR7YkymwryLL/hToMNRA+MM59jE1gqsoLHIZ9WJYOiOuUW8CsuCco NxYrG+6FaaUuYk2i7qzdb5h3TfELaVJfrN1Zqbz2nvzs2/kavfUnk/9stIVDGxO3ywqrh4 S3Hmq0bTWBET+fn8dQzfk6IlOLb+UYA= X-MC-Unique: yhnMQZoeP-aYKDBAr6-rbQ-1 X-Mimecast-MFC-AGG-ID: yhnMQZoeP-aYKDBAr6-rbQ_1750923986 From: =?UTF-8?q?C=C3=A9dric=20Le=20Goater?= To: qemu-devel@nongnu.org Cc: Alex Williamson , John Levon , John Johnson , Elena Ufimtseva , Jagannathan Raman , =?UTF-8?q?C=C3=A9dric=20Le=20Goater?= Subject: [PULL 18/25] vfio-user: implement VFIO_USER_DEVICE_RESET Date: Thu, 26 Jun 2025 09:45:22 +0200 Message-ID: <20250626074529.1384114-19-clg@redhat.com> In-Reply-To: <20250626074529.1384114-1-clg@redhat.com> References: <20250626074529.1384114-1-clg@redhat.com> MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.93 Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=170.10.129.124; envelope-from=clg@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H5=0.001, RCVD_IN_MSPIKE_WL=0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, RCVD_IN_VALIDITY_SAFE_BLOCKED=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZohoMail-DKIM: pass (identity @redhat.com) X-ZM-MESSAGEID: 1750924179785116600 From: John Levon Hook this call up to the legacy reset handler for vfio-user-pci. Originally-by: John Johnson Signed-off-by: Elena Ufimtseva Signed-off-by: Jagannathan Raman Signed-off-by: John Levon Reviewed-by: C=C3=A9dric Le Goater Link: https://lore.kernel.org/qemu-devel/20250625193012.2316242-13-john.lev= on@nutanix.com Signed-off-by: C=C3=A9dric Le Goater --- hw/vfio-user/device.h | 2 ++ hw/vfio-user/device.c | 18 ++++++++++++++++++ hw/vfio-user/pci.c | 15 +++++++++++++++ 3 files changed, 35 insertions(+) diff --git a/hw/vfio-user/device.h b/hw/vfio-user/device.h index 619c0f3140650599f930eddeb6d8a4a9ebf10b97..d183a3950e26ea1defac8705092= 65cdfe57a4026 100644 --- a/hw/vfio-user/device.h +++ b/hw/vfio-user/device.h @@ -17,6 +17,8 @@ bool vfio_user_get_device_info(VFIOUserProxy *proxy, struct vfio_device_info *info, Error **errp= ); =20 +void vfio_user_device_reset(VFIOUserProxy *proxy); + extern VFIODeviceIOOps vfio_user_device_io_ops_sock; =20 #endif /* VFIO_USER_DEVICE_H */ diff --git a/hw/vfio-user/device.c b/hw/vfio-user/device.c index f01b3925c5a5a22cd8f55303d48b36d9dface54c..3a118e73615a40d5d455dfb3791= 15be66726627a 100644 --- a/hw/vfio-user/device.c +++ b/hw/vfio-user/device.c @@ -54,6 +54,24 @@ bool vfio_user_get_device_info(VFIOUserProxy *proxy, return true; } =20 +void vfio_user_device_reset(VFIOUserProxy *proxy) +{ + Error *local_err =3D NULL; + VFIOUserHdr hdr; + + vfio_user_request_msg(&hdr, VFIO_USER_DEVICE_RESET, sizeof(hdr), 0); + + if (!vfio_user_send_wait(proxy, &hdr, NULL, 0, &local_err)) { + error_prepend(&local_err, "%s: ", __func__); + error_report_err(local_err); + return; + } + + if (hdr.flags & VFIO_USER_ERROR) { + error_printf("reset reply error %d\n", hdr.error_reply); + } +} + static int vfio_user_get_region_info(VFIOUserProxy *proxy, struct vfio_region_info *info, VFIOUserFDs *fds) diff --git a/hw/vfio-user/pci.c b/hw/vfio-user/pci.c index c0f00f15b10a80820606ff4ef0383d59ecd91670..49d12763aba73f780ec5b202dfd= 4d3dc98837491 100644 --- a/hw/vfio-user/pci.c +++ b/hw/vfio-user/pci.c @@ -263,6 +263,20 @@ static void vfio_user_instance_finalize(Object *obj) } } =20 +static void vfio_user_pci_reset(DeviceState *dev) +{ + VFIOPCIDevice *vdev =3D VFIO_PCI_BASE(dev); + VFIODevice *vbasedev =3D &vdev->vbasedev; + + vfio_pci_pre_reset(vdev); + + if (vbasedev->reset_works) { + vfio_user_device_reset(vbasedev->proxy); + } + + vfio_pci_post_reset(vdev); +} + static const Property vfio_user_pci_dev_properties[] =3D { DEFINE_PROP_UINT32("x-pci-vendor-id", VFIOPCIDevice, vendor_id, PCI_ANY_ID), @@ -310,6 +324,7 @@ static void vfio_user_pci_dev_class_init(ObjectClass *k= lass, const void *data) DeviceClass *dc =3D DEVICE_CLASS(klass); PCIDeviceClass *pdc =3D PCI_DEVICE_CLASS(klass); =20 + device_class_set_legacy_reset(dc, vfio_user_pci_reset); device_class_set_props(dc, vfio_user_pci_dev_properties); =20 object_class_property_add(klass, "socket", "SocketAddress", NULL, --=20 2.49.0 From nobody Sat Nov 15 14:49:49 2025 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=quarantine dis=none) header.from=redhat.com ARC-Seal: i=1; a=rsa-sha256; t=1750924080; cv=none; d=zohomail.com; s=zohoarc; b=hrTaCvdxwr/1PSGLkzRf78fZRv+bPIr0PR9LBkR33lXBEVJ2Tt0ZM5Z4WxyaLrVQrOYGmmfjAt239tZukSwBCJjtNJnB1TzSmysinOzz0p94fy8SosbgTmNdBvRpDFLmz86sG4HGGHoHyCkvah5ZgNDF4ll3Fd8SM32XH2O+x48= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1750924080; h=Content-Type:Content-Transfer-Encoding:Cc:Cc:Date:Date:From:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:Subject:To:To:Message-Id:Reply-To; bh=52nDQriU5UkQa7Cq5IVTKOLBOOicAj6GSABdDv3pt98=; b=b7/Z7t0hlcCKVXxkFo4uaunqWO6+ri3OqwN0icC4hDcM8giOtMajxjA/dJeL+GZbF22pM+pBwDtJxfuuy1jsm0OQ+y66ZUKILDPVwe+Qv/RSsQy8h8ICJWpNja6zS23G9bWpxQthr5XO2B/mxPM/jEANGUq9fL21pW3RzmZvsTE= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass header.from= (p=quarantine dis=none) Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1750924080743424.7083277180127; Thu, 26 Jun 2025 00:48:00 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1uUhJq-0005b0-Ef; Thu, 26 Jun 2025 03:46:51 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1uUhJi-0005G6-Ra for qemu-devel@nongnu.org; Thu, 26 Jun 2025 03:46:43 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.129.124]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1uUhJc-00020Q-VC for qemu-devel@nongnu.org; Thu, 26 Jun 2025 03:46:42 -0400 Received: from mx-prod-mc-08.mail-002.prod.us-west-2.aws.redhat.com (ec2-35-165-154-97.us-west-2.compute.amazonaws.com [35.165.154.97]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-679-DX5jjTBPOXeYbuxzrZ8PYg-1; Thu, 26 Jun 2025 03:46:31 -0400 Received: from mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.93]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-08.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 233721808993; Thu, 26 Jun 2025 07:46:30 +0000 (UTC) Received: from corto.redhat.com (unknown [10.44.32.51]) by mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id 295A6180045B; Thu, 26 Jun 2025 07:46:26 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1750923995; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=52nDQriU5UkQa7Cq5IVTKOLBOOicAj6GSABdDv3pt98=; b=OUhvh+2rkuvoXpxsYt/WcBq+WL5iKibHa7sugVuanxAy9y7HvWggxLNRjssO1E0mfd5HUR 5vfTkJUehCwDooasZEKE0VjR1ezEWWE/cGTRns//16yKQbpUsLHX16Gox7pOECJI4M+e/B upu43CVqzgGKniaBlXtf7FswmLPwMeU= X-MC-Unique: DX5jjTBPOXeYbuxzrZ8PYg-1 X-Mimecast-MFC-AGG-ID: DX5jjTBPOXeYbuxzrZ8PYg_1750923990 From: =?UTF-8?q?C=C3=A9dric=20Le=20Goater?= To: qemu-devel@nongnu.org Cc: Alex Williamson , John Levon , John Johnson , Jagannathan Raman , Elena Ufimtseva , John Levon , =?UTF-8?q?C=C3=A9dric=20Le=20Goater?= Subject: [PULL 19/25] vfio-user: implement VFIO_USER_DMA_MAP/UNMAP Date: Thu, 26 Jun 2025 09:45:23 +0200 Message-ID: <20250626074529.1384114-20-clg@redhat.com> In-Reply-To: <20250626074529.1384114-1-clg@redhat.com> References: <20250626074529.1384114-1-clg@redhat.com> MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.93 Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=170.10.129.124; envelope-from=clg@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H5=0.001, RCVD_IN_MSPIKE_WL=0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, RCVD_IN_VALIDITY_SAFE_BLOCKED=0.001, SPF_HELO_NONE=0.001, T_SPF_TEMPERROR=0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZohoMail-DKIM: pass (identity @redhat.com) X-ZM-MESSAGEID: 1750924088678116600 From: John Levon When the vfio-user container gets mapping updates, share them with the vfio-user by sending a message; this can include the region fd, allowing the server to directly mmap() the region as needed. For performance, we only wait for the message responses when we're doing with a series of updates via the listener_commit() callback. Originally-by: John Johnson Signed-off-by: Jagannathan Raman Signed-off-by: Elena Ufimtseva Signed-off-by: John Levon Reviewed-by: C=C3=A9dric Le Goater Link: https://lore.kernel.org/qemu-devel/20250625193012.2316242-14-john.lev= on@nutanix.com Signed-off-by: C=C3=A9dric Le Goater --- hw/vfio-user/protocol.h | 32 +++++++++ hw/vfio-user/proxy.h | 6 ++ hw/vfio-user/container.c | 135 +++++++++++++++++++++++++++++++++++++- hw/vfio-user/proxy.c | 84 +++++++++++++++++++++++- hw/vfio-user/trace-events | 4 ++ 5 files changed, 257 insertions(+), 4 deletions(-) diff --git a/hw/vfio-user/protocol.h b/hw/vfio-user/protocol.h index 48144b2c33324faad4a07e1457a4709575466afb..524f3d633ab8800ad0934ffad86= 1c7b86a8e70b3 100644 --- a/hw/vfio-user/protocol.h +++ b/hw/vfio-user/protocol.h @@ -112,6 +112,31 @@ typedef struct { */ #define VFIO_USER_DEF_MAX_BITMAP (256 * 1024 * 1024) =20 +/* + * VFIO_USER_DMA_MAP + * imported from struct vfio_iommu_type1_dma_map + */ +typedef struct { + VFIOUserHdr hdr; + uint32_t argsz; + uint32_t flags; + uint64_t offset; /* FD offset */ + uint64_t iova; + uint64_t size; +} VFIOUserDMAMap; + +/* + * VFIO_USER_DMA_UNMAP + * imported from struct vfio_iommu_type1_dma_unmap + */ +typedef struct { + VFIOUserHdr hdr; + uint32_t argsz; + uint32_t flags; + uint64_t iova; + uint64_t size; +} VFIOUserDMAUnmap; + /* * VFIO_USER_DEVICE_GET_INFO * imported from struct vfio_device_info @@ -175,4 +200,11 @@ typedef struct { char data[]; } VFIOUserRegionRW; =20 +/*imported from struct vfio_bitmap */ +typedef struct { + uint64_t pgsize; + uint64_t size; + char data[]; +} VFIOUserBitmap; + #endif /* VFIO_USER_PROTOCOL_H */ diff --git a/hw/vfio-user/proxy.h b/hw/vfio-user/proxy.h index ba1c33aba8d08083c9afd2f203ed0a04a56cf70f..e2fc83ca3b12aa891d6843345c7= c5015e8811aa5 100644 --- a/hw/vfio-user/proxy.h +++ b/hw/vfio-user/proxy.h @@ -70,6 +70,7 @@ typedef struct VFIOUserProxy { QemuCond close_cv; AioContext *ctx; QEMUBH *req_bh; + bool async_ops; =20 /* * above only changed when BQL is held @@ -99,9 +100,14 @@ void vfio_user_set_handler(VFIODevice *vbasedev, void *reqarg); bool vfio_user_validate_version(VFIOUserProxy *proxy, Error **errp); =20 +VFIOUserFDs *vfio_user_getfds(int numfds); + void vfio_user_request_msg(VFIOUserHdr *hdr, uint16_t cmd, uint32_t size, uint32_t flags); +void vfio_user_wait_reqs(VFIOUserProxy *proxy); bool vfio_user_send_wait(VFIOUserProxy *proxy, VFIOUserHdr *hdr, VFIOUserFDs *fds, int rsize, Error **errp); +bool vfio_user_send_nowait(VFIOUserProxy *proxy, VFIOUserHdr *hdr, + VFIOUserFDs *fds, int rsize, Error **errp); =20 #endif /* VFIO_USER_PROXY_H */ diff --git a/hw/vfio-user/container.c b/hw/vfio-user/container.c index b4a5a840b089c5622ade83e632c2560475076e5b..3133fef17738e14fc512ad5272b= c349d928c980a 100644 --- a/hw/vfio-user/container.c +++ b/hw/vfio-user/container.c @@ -12,23 +12,152 @@ =20 #include "hw/vfio-user/container.h" #include "hw/vfio-user/device.h" +#include "hw/vfio-user/trace.h" #include "hw/vfio/vfio-cpr.h" #include "hw/vfio/vfio-device.h" #include "hw/vfio/vfio-listener.h" #include "qapi/error.h" =20 +/* + * When DMA space is the physical address space, the region add/del listen= ers + * will fire during memory update transactions. These depend on BQL being= held, + * so do any resulting map/demap ops async while keeping BQL. + */ +static void vfio_user_listener_begin(VFIOContainerBase *bcontainer) +{ + VFIOUserContainer *container =3D container_of(bcontainer, VFIOUserCont= ainer, + bcontainer); + + container->proxy->async_ops =3D true; +} + +static void vfio_user_listener_commit(VFIOContainerBase *bcontainer) +{ + VFIOUserContainer *container =3D container_of(bcontainer, VFIOUserCont= ainer, + bcontainer); + + /* wait here for any async requests sent during the transaction */ + container->proxy->async_ops =3D false; + vfio_user_wait_reqs(container->proxy); +} + static int vfio_user_dma_unmap(const VFIOContainerBase *bcontainer, hwaddr iova, ram_addr_t size, IOMMUTLBEntry *iotlb, bool unmap_all) { - return -ENOTSUP; + VFIOUserContainer *container =3D container_of(bcontainer, VFIOUserCont= ainer, + bcontainer); + Error *local_err =3D NULL; + int ret =3D 0; + + VFIOUserDMAUnmap *msgp =3D g_malloc(sizeof(*msgp)); + + vfio_user_request_msg(&msgp->hdr, VFIO_USER_DMA_UNMAP, sizeof(*msgp), = 0); + msgp->argsz =3D sizeof(struct vfio_iommu_type1_dma_unmap); + msgp->flags =3D unmap_all ? VFIO_DMA_UNMAP_FLAG_ALL : 0; + msgp->iova =3D iova; + msgp->size =3D size; + trace_vfio_user_dma_unmap(msgp->iova, msgp->size, msgp->flags, + container->proxy->async_ops); + + if (container->proxy->async_ops) { + if (!vfio_user_send_nowait(container->proxy, &msgp->hdr, NULL, + 0, &local_err)) { + error_report_err(local_err); + ret =3D -EFAULT; + } else { + ret =3D 0; + } + } else { + if (!vfio_user_send_wait(container->proxy, &msgp->hdr, NULL, + 0, &local_err)) { + error_report_err(local_err); + ret =3D -EFAULT; + } + + if (msgp->hdr.flags & VFIO_USER_ERROR) { + ret =3D -msgp->hdr.error_reply; + } + + g_free(msgp); + } + + return ret; } =20 static int vfio_user_dma_map(const VFIOContainerBase *bcontainer, hwaddr i= ova, ram_addr_t size, void *vaddr, bool readonly, MemoryRegion *mrp) { - return -ENOTSUP; + VFIOUserContainer *container =3D container_of(bcontainer, VFIOUserCont= ainer, + bcontainer); + int fd =3D memory_region_get_fd(mrp); + Error *local_err =3D NULL; + int ret; + + VFIOUserFDs *fds =3D NULL; + VFIOUserDMAMap *msgp =3D g_malloc0(sizeof(*msgp)); + + vfio_user_request_msg(&msgp->hdr, VFIO_USER_DMA_MAP, sizeof(*msgp), 0); + msgp->argsz =3D sizeof(struct vfio_iommu_type1_dma_map); + msgp->flags =3D VFIO_DMA_MAP_FLAG_READ; + msgp->offset =3D 0; + msgp->iova =3D iova; + msgp->size =3D size; + + /* + * vaddr enters as a QEMU process address; make it either a file offset + * for mapped areas or leave as 0. + */ + if (fd !=3D -1) { + msgp->offset =3D qemu_ram_block_host_offset(mrp->ram_block, vaddr); + } + + if (!readonly) { + msgp->flags |=3D VFIO_DMA_MAP_FLAG_WRITE; + } + + trace_vfio_user_dma_map(msgp->iova, msgp->size, msgp->offset, msgp->fl= ags, + container->proxy->async_ops); + + /* + * The async_ops case sends without blocking. They're later waited for= in + * vfio_send_wait_reqs. + */ + if (container->proxy->async_ops) { + /* can't use auto variable since we don't block */ + if (fd !=3D -1) { + fds =3D vfio_user_getfds(1); + fds->send_fds =3D 1; + fds->fds[0] =3D fd; + } + + if (!vfio_user_send_nowait(container->proxy, &msgp->hdr, fds, + 0, &local_err)) { + error_report_err(local_err); + ret =3D -EFAULT; + } else { + ret =3D 0; + } + } else { + VFIOUserFDs local_fds =3D { 1, 0, &fd }; + + fds =3D fd !=3D -1 ? &local_fds : NULL; + + if (!vfio_user_send_wait(container->proxy, &msgp->hdr, fds, + 0, &local_err)) { + error_report_err(local_err); + ret =3D -EFAULT; + } + + if (msgp->hdr.flags & VFIO_USER_ERROR) { + ret =3D -msgp->hdr.error_reply; + } + + g_free(msgp); + } + + return ret; } =20 static int @@ -218,6 +347,8 @@ static void vfio_iommu_user_class_init(ObjectClass *kla= ss, const void *data) VFIOIOMMUClass *vioc =3D VFIO_IOMMU_CLASS(klass); =20 vioc->setup =3D vfio_user_setup; + vioc->listener_begin =3D vfio_user_listener_begin, + vioc->listener_commit =3D vfio_user_listener_commit, vioc->dma_map =3D vfio_user_dma_map; vioc->dma_unmap =3D vfio_user_dma_unmap; vioc->attach_device =3D vfio_user_device_attach; diff --git a/hw/vfio-user/proxy.c b/hw/vfio-user/proxy.c index aed7b22e2a657cfc2ddd391163d380ee5ea22f68..c8ae8a59b441a1151c884dfb862= 5fde696af664c 100644 --- a/hw/vfio-user/proxy.c +++ b/hw/vfio-user/proxy.c @@ -27,7 +27,6 @@ static IOThread *vfio_user_iothread; static void vfio_user_shutdown(VFIOUserProxy *proxy); static VFIOUserMsg *vfio_user_getmsg(VFIOUserProxy *proxy, VFIOUserHdr *hd= r, VFIOUserFDs *fds); -static VFIOUserFDs *vfio_user_getfds(int numfds); static void vfio_user_recycle(VFIOUserProxy *proxy, VFIOUserMsg *msg); =20 static void vfio_user_recv(void *opaque); @@ -132,7 +131,7 @@ static void vfio_user_recycle(VFIOUserProxy *proxy, VFI= OUserMsg *msg) QTAILQ_INSERT_HEAD(&proxy->free, msg, next); } =20 -static VFIOUserFDs *vfio_user_getfds(int numfds) +VFIOUserFDs *vfio_user_getfds(int numfds) { VFIOUserFDs *fds =3D g_malloc0(sizeof(*fds) + (numfds * sizeof(int))); =20 @@ -618,6 +617,43 @@ static bool vfio_user_send_queued(VFIOUserProxy *proxy= , VFIOUserMsg *msg, return true; } =20 +/* + * nowait send - vfio_wait_reqs() can wait for it later + * + * Returns false if we did not successfully receive a reply message, in wh= ich + * case @errp will be populated. + * + * In either case, ownership of @hdr and @fds is taken, and the caller must + * *not* free them itself. + */ +bool vfio_user_send_nowait(VFIOUserProxy *proxy, VFIOUserHdr *hdr, + VFIOUserFDs *fds, int rsize, Error **errp) +{ + VFIOUserMsg *msg; + + QEMU_LOCK_GUARD(&proxy->lock); + + msg =3D vfio_user_getmsg(proxy, hdr, fds); + msg->id =3D hdr->id; + msg->rsize =3D rsize ? rsize : hdr->size; + msg->type =3D VFIO_MSG_NOWAIT; + + if (hdr->flags & VFIO_USER_NO_REPLY) { + error_setg_errno(errp, EINVAL, "%s on NO_REPLY message", __func__); + vfio_user_recycle(proxy, msg); + return false; + } + + if (!vfio_user_send_queued(proxy, msg, errp)) { + vfio_user_recycle(proxy, msg); + return false; + } + + proxy->last_nowait =3D msg; + + return true; +} + /* * Returns false if we did not successfully receive a reply message, in wh= ich * case @errp will be populated. @@ -666,6 +702,50 @@ bool vfio_user_send_wait(VFIOUserProxy *proxy, VFIOUse= rHdr *hdr, return ok; } =20 +void vfio_user_wait_reqs(VFIOUserProxy *proxy) +{ + VFIOUserMsg *msg; + + /* + * Any DMA map/unmap requests sent in the middle + * of a memory region transaction were sent nowait. + * Wait for them here. + */ + qemu_mutex_lock(&proxy->lock); + if (proxy->last_nowait !=3D NULL) { + /* + * Change type to WAIT to wait for reply + */ + msg =3D proxy->last_nowait; + msg->type =3D VFIO_MSG_WAIT; + proxy->last_nowait =3D NULL; + while (!msg->complete) { + if (!qemu_cond_timedwait(&msg->cv, &proxy->lock, wait_time)) { + VFIOUserMsgQ *list; + + list =3D msg->pending ? &proxy->pending : &proxy->outgoing; + QTAILQ_REMOVE(list, msg, next); + error_printf("vfio_wait_reqs - timed out\n"); + break; + } + } + + if (msg->hdr->flags & VFIO_USER_ERROR) { + error_printf("vfio_user_wait_reqs - error reply on async "); + error_printf("request: command %x error %s\n", msg->hdr->comma= nd, + strerror(msg->hdr->error_reply)); + } + + /* + * Change type back to NOWAIT to free + */ + msg->type =3D VFIO_MSG_NOWAIT; + vfio_user_recycle(proxy, msg); + } + + qemu_mutex_unlock(&proxy->lock); +} + static QLIST_HEAD(, VFIOUserProxy) vfio_user_sockets =3D QLIST_HEAD_INITIALIZER(vfio_user_sockets); =20 diff --git a/hw/vfio-user/trace-events b/hw/vfio-user/trace-events index aa8f3c3d4d3b30e3d5091d77b8ca2fccaa847a73..44dde020b3198e215065c4adcb9= d13832e5c4287 100644 --- a/hw/vfio-user/trace-events +++ b/hw/vfio-user/trace-events @@ -13,3 +13,7 @@ vfio_user_get_region_info(uint32_t index, uint32_t flags,= uint64_t size) " index vfio_user_region_rw(uint32_t region, uint64_t off, uint32_t count) " regio= n %d offset 0x%"PRIx64" count %d" vfio_user_get_irq_info(uint32_t index, uint32_t flags, uint32_t count) " i= ndex %d flags 0x%x count %d" vfio_user_set_irqs(uint32_t index, uint32_t start, uint32_t count, uint32_= t flags) " index %d start %d count %d flags 0x%x" + +# container.c +vfio_user_dma_map(uint64_t iova, uint64_t size, uint64_t off, uint32_t fla= gs, bool async_ops) " iova 0x%"PRIx64" size 0x%"PRIx64" off 0x%"PRIx64" fla= gs 0x%x async_ops %d" +vfio_user_dma_unmap(uint64_t iova, uint64_t size, uint32_t flags, bool as= ync_ops) " iova 0x%"PRIx64" size 0x%"PRIx64" flags 0x%x async_ops %d" --=20 2.49.0 From nobody Sat Nov 15 14:49:49 2025 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=quarantine dis=none) header.from=redhat.com ARC-Seal: i=1; a=rsa-sha256; t=1750924178; cv=none; d=zohomail.com; s=zohoarc; b=IY5b4OzlHbxVBOsGY1sd6n2zGcGGcV6Snq/4E3d7QI6ElwkmcTakSrJo/2V2IIUdIDsK/Ez8RlfUPE6jvyRvBCSZU6jTcqkFh3pUbfEkRDjeaFeHBDFNvSk390KT5q4eEBvnmJEzRj1OgBKRwcIivSKJKIgcKSroGWHQ1kL/5R4= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1750924178; h=Content-Type:Content-Transfer-Encoding:Cc:Cc:Date:Date:From:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:Subject:To:To:Message-Id:Reply-To; bh=PM04fP381Lx2923Bt16aAeeskeqZaHbRysSgF0zgPuE=; b=hrhS5uat7ZWHaBPKAgpfCTfupJ+Baadc5MZLhTlMTtagwIM8q574nG7kVkitE/XDCxjnEhc72STyWTQlcRxWA3TeSr90ioj9KcuDjTO0zHftn2Uf6j6k8Igqf6AvA6HCk5epPzpDvvfYSEWk5ZiKPjL3yug4TKlhqt5dAVYLKg0= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass header.from= (p=quarantine dis=none) Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1750924178324100.38238929581996; Thu, 26 Jun 2025 00:49:38 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1uUhJm-0005Ut-Rm; Thu, 26 Jun 2025 03:46:47 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1uUhJh-0005BU-Tq for qemu-devel@nongnu.org; Thu, 26 Jun 2025 03:46:41 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1uUhJf-00020c-BX for qemu-devel@nongnu.org; Thu, 26 Jun 2025 03:46:41 -0400 Received: from mx-prod-mc-01.mail-002.prod.us-west-2.aws.redhat.com (ec2-54-186-198-63.us-west-2.compute.amazonaws.com [54.186.198.63]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-588-lEm9DaFyNzKL0MVIW67rDg-1; Thu, 26 Jun 2025 03:46:34 -0400 Received: from mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.93]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-01.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 316EA195FCC2; Thu, 26 Jun 2025 07:46:33 +0000 (UTC) Received: from corto.redhat.com (unknown [10.44.32.51]) by mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id 95744180045B; Thu, 26 Jun 2025 07:46:30 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1750923996; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=PM04fP381Lx2923Bt16aAeeskeqZaHbRysSgF0zgPuE=; b=f84Lhdn+4nGRkoaDS4K3iZVj1cAqe1PG8ifAiv76ZPma+nPDvcqAXtRZsaCrSXw7Y9rnuO jP9iDibTT6lZYMHROeyD2PjFfBKCZickRz6H0MNutn+xpFXsU8opS36rJbBTilMezmP/8w kzyoGqJnO5n62ayQuUvrgs0nrPb5AgY= X-MC-Unique: lEm9DaFyNzKL0MVIW67rDg-1 X-Mimecast-MFC-AGG-ID: lEm9DaFyNzKL0MVIW67rDg_1750923993 From: =?UTF-8?q?C=C3=A9dric=20Le=20Goater?= To: qemu-devel@nongnu.org Cc: Alex Williamson , John Levon , John Johnson , Elena Ufimtseva , Jagannathan Raman , =?UTF-8?q?C=C3=A9dric=20Le=20Goater?= Subject: [PULL 20/25] vfio-user: implement VFIO_USER_DMA_READ/WRITE Date: Thu, 26 Jun 2025 09:45:24 +0200 Message-ID: <20250626074529.1384114-21-clg@redhat.com> In-Reply-To: <20250626074529.1384114-1-clg@redhat.com> References: <20250626074529.1384114-1-clg@redhat.com> MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.93 Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=170.10.133.124; envelope-from=clg@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H5=0.001, RCVD_IN_MSPIKE_WL=0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, RCVD_IN_VALIDITY_SAFE_BLOCKED=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZohoMail-DKIM: pass (identity @redhat.com) X-ZM-MESSAGEID: 1750924179945116600 From: John Levon Unlike most other messages, this is a server->client message, for when a server wants to do "DMA"; this is slow, so normally the server has memory directly mapped instead. Originally-by: John Johnson Signed-off-by: Elena Ufimtseva Signed-off-by: Jagannathan Raman Signed-off-by: John Levon Reviewed-by: C=C3=A9dric Le Goater Link: https://lore.kernel.org/qemu-devel/20250625193012.2316242-15-john.lev= on@nutanix.com Signed-off-by: C=C3=A9dric Le Goater --- hw/vfio-user/protocol.h | 13 ++++- hw/vfio-user/proxy.h | 3 ++ hw/vfio-user/pci.c | 111 ++++++++++++++++++++++++++++++++++++++++ hw/vfio-user/proxy.c | 97 +++++++++++++++++++++++++++++++++++ 4 files changed, 223 insertions(+), 1 deletion(-) diff --git a/hw/vfio-user/protocol.h b/hw/vfio-user/protocol.h index 524f3d633ab8800ad0934ffad861c7b86a8e70b3..3e9d8e576bae6a51d8fbc7cf26b= b35d29e0e1aee 100644 --- a/hw/vfio-user/protocol.h +++ b/hw/vfio-user/protocol.h @@ -200,7 +200,18 @@ typedef struct { char data[]; } VFIOUserRegionRW; =20 -/*imported from struct vfio_bitmap */ +/* + * VFIO_USER_DMA_READ + * VFIO_USER_DMA_WRITE + */ +typedef struct { + VFIOUserHdr hdr; + uint64_t offset; + uint32_t count; + char data[]; +} VFIOUserDMARW; + +/* imported from struct vfio_bitmap */ typedef struct { uint64_t pgsize; uint64_t size; diff --git a/hw/vfio-user/proxy.h b/hw/vfio-user/proxy.h index e2fc83ca3b12aa891d6843345c7c5015e8811aa5..39092c08c8d31ad315e1942ddd0= 9134a8a4569a0 100644 --- a/hw/vfio-user/proxy.h +++ b/hw/vfio-user/proxy.h @@ -101,6 +101,7 @@ void vfio_user_set_handler(VFIODevice *vbasedev, bool vfio_user_validate_version(VFIOUserProxy *proxy, Error **errp); =20 VFIOUserFDs *vfio_user_getfds(int numfds); +void vfio_user_putfds(VFIOUserMsg *msg); =20 void vfio_user_request_msg(VFIOUserHdr *hdr, uint16_t cmd, uint32_t size, uint32_t flags); @@ -109,5 +110,7 @@ bool vfio_user_send_wait(VFIOUserProxy *proxy, VFIOUser= Hdr *hdr, VFIOUserFDs *fds, int rsize, Error **errp); bool vfio_user_send_nowait(VFIOUserProxy *proxy, VFIOUserHdr *hdr, VFIOUserFDs *fds, int rsize, Error **errp); +void vfio_user_send_reply(VFIOUserProxy *proxy, VFIOUserHdr *hdr, int size= ); +void vfio_user_send_error(VFIOUserProxy *proxy, VFIOUserHdr *hdr, int erro= r); =20 #endif /* VFIO_USER_PROXY_H */ diff --git a/hw/vfio-user/pci.c b/hw/vfio-user/pci.c index 49d12763aba73f780ec5b202dfd4d3dc98837491..040660d1977cf97d9c983985d30= d9488e391e0c0 100644 --- a/hw/vfio-user/pci.c +++ b/hw/vfio-user/pci.c @@ -9,6 +9,7 @@ #include #include "qemu/osdep.h" #include "qapi-visit-sockets.h" +#include "qemu/error-report.h" =20 #include "hw/qdev-properties.h" #include "hw/vfio/pci.h" @@ -80,6 +81,95 @@ static void vfio_user_msix_teardown(VFIOPCIDevice *vdev) vdev->msix->pba_region =3D NULL; } =20 +static void vfio_user_dma_read(VFIOPCIDevice *vdev, VFIOUserDMARW *msg) +{ + PCIDevice *pdev =3D &vdev->pdev; + VFIOUserProxy *proxy =3D vdev->vbasedev.proxy; + VFIOUserDMARW *res; + MemTxResult r; + size_t size; + + if (msg->hdr.size < sizeof(*msg)) { + vfio_user_send_error(proxy, &msg->hdr, EINVAL); + return; + } + if (msg->count > proxy->max_xfer_size) { + vfio_user_send_error(proxy, &msg->hdr, E2BIG); + return; + } + + /* switch to our own message buffer */ + size =3D msg->count + sizeof(VFIOUserDMARW); + res =3D g_malloc0(size); + memcpy(res, msg, sizeof(*res)); + g_free(msg); + + r =3D pci_dma_read(pdev, res->offset, &res->data, res->count); + + switch (r) { + case MEMTX_OK: + if (res->hdr.flags & VFIO_USER_NO_REPLY) { + g_free(res); + return; + } + vfio_user_send_reply(proxy, &res->hdr, size); + break; + case MEMTX_ERROR: + vfio_user_send_error(proxy, &res->hdr, EFAULT); + break; + case MEMTX_DECODE_ERROR: + vfio_user_send_error(proxy, &res->hdr, ENODEV); + break; + case MEMTX_ACCESS_ERROR: + vfio_user_send_error(proxy, &res->hdr, EPERM); + break; + default: + error_printf("vfio_user_dma_read unknown error %d\n", r); + vfio_user_send_error(vdev->vbasedev.proxy, &res->hdr, EINVAL); + } +} + +static void vfio_user_dma_write(VFIOPCIDevice *vdev, VFIOUserDMARW *msg) +{ + PCIDevice *pdev =3D &vdev->pdev; + VFIOUserProxy *proxy =3D vdev->vbasedev.proxy; + MemTxResult r; + + if (msg->hdr.size < sizeof(*msg)) { + vfio_user_send_error(proxy, &msg->hdr, EINVAL); + return; + } + /* make sure transfer count isn't larger than the message data */ + if (msg->count > msg->hdr.size - sizeof(*msg)) { + vfio_user_send_error(proxy, &msg->hdr, E2BIG); + return; + } + + r =3D pci_dma_write(pdev, msg->offset, &msg->data, msg->count); + + switch (r) { + case MEMTX_OK: + if ((msg->hdr.flags & VFIO_USER_NO_REPLY) =3D=3D 0) { + vfio_user_send_reply(proxy, &msg->hdr, sizeof(msg->hdr)); + } else { + g_free(msg); + } + break; + case MEMTX_ERROR: + vfio_user_send_error(proxy, &msg->hdr, EFAULT); + break; + case MEMTX_DECODE_ERROR: + vfio_user_send_error(proxy, &msg->hdr, ENODEV); + break; + case MEMTX_ACCESS_ERROR: + vfio_user_send_error(proxy, &msg->hdr, EPERM); + break; + default: + error_printf("vfio_user_dma_write unknown error %d\n", r); + vfio_user_send_error(vdev->vbasedev.proxy, &msg->hdr, EINVAL); + } +} + /* * Incoming request message callback. * @@ -87,7 +177,28 @@ static void vfio_user_msix_teardown(VFIOPCIDevice *vdev) */ static void vfio_user_pci_process_req(void *opaque, VFIOUserMsg *msg) { + VFIOPCIDevice *vdev =3D opaque; + VFIOUserHdr *hdr =3D msg->hdr; + + /* no incoming PCI requests pass FDs */ + if (msg->fds !=3D NULL) { + vfio_user_send_error(vdev->vbasedev.proxy, hdr, EINVAL); + vfio_user_putfds(msg); + return; + } =20 + switch (hdr->command) { + case VFIO_USER_DMA_READ: + vfio_user_dma_read(vdev, (VFIOUserDMARW *)hdr); + break; + case VFIO_USER_DMA_WRITE: + vfio_user_dma_write(vdev, (VFIOUserDMARW *)hdr); + break; + default: + error_printf("vfio_user_pci_process_req unknown cmd %d\n", + hdr->command); + vfio_user_send_error(vdev->vbasedev.proxy, hdr, ENOSYS); + } } =20 /* diff --git a/hw/vfio-user/proxy.c b/hw/vfio-user/proxy.c index c8ae8a59b441a1151c884dfb8625fde696af664c..cb93d9a660605f6675a141a1645= f2f3391e4aa8a 100644 --- a/hw/vfio-user/proxy.c +++ b/hw/vfio-user/proxy.c @@ -347,6 +347,10 @@ static int vfio_user_recv_one(VFIOUserProxy *proxy, Er= ror **errp) *msg->hdr =3D hdr; data =3D (char *)msg->hdr + sizeof(hdr); } else { + if (hdr.size > proxy->max_xfer_size + sizeof(VFIOUserDMARW)) { + error_setg(errp, "vfio_user_recv request larger than max"); + goto err; + } buf =3D g_malloc0(hdr.size); memcpy(buf, &hdr, sizeof(hdr)); data =3D buf + sizeof(hdr); @@ -702,6 +706,40 @@ bool vfio_user_send_wait(VFIOUserProxy *proxy, VFIOUse= rHdr *hdr, return ok; } =20 +/* + * async send - msg can be queued, but will be freed when sent + * + * Returns false on failure, in which case @errp will be populated. + * + * In either case, ownership of @hdr and @fds is taken, and the caller must + * *not* free them itself. + */ +static bool vfio_user_send_async(VFIOUserProxy *proxy, VFIOUserHdr *hdr, + VFIOUserFDs *fds, Error **errp) +{ + VFIOUserMsg *msg; + + QEMU_LOCK_GUARD(&proxy->lock); + + msg =3D vfio_user_getmsg(proxy, hdr, fds); + msg->id =3D hdr->id; + msg->rsize =3D 0; + msg->type =3D VFIO_MSG_ASYNC; + + if (!(hdr->flags & (VFIO_USER_NO_REPLY | VFIO_USER_REPLY))) { + error_setg_errno(errp, EINVAL, "%s on sync message", __func__); + vfio_user_recycle(proxy, msg); + return false; + } + + if (!vfio_user_send_queued(proxy, msg, errp)) { + vfio_user_recycle(proxy, msg); + return false; + } + + return true; +} + void vfio_user_wait_reqs(VFIOUserProxy *proxy) { VFIOUserMsg *msg; @@ -746,6 +784,65 @@ void vfio_user_wait_reqs(VFIOUserProxy *proxy) qemu_mutex_unlock(&proxy->lock); } =20 +/* + * Reply to an incoming request. + */ +void vfio_user_send_reply(VFIOUserProxy *proxy, VFIOUserHdr *hdr, int size) +{ + Error *local_err =3D NULL; + + if (size < sizeof(VFIOUserHdr)) { + error_printf("%s: size too small", __func__); + g_free(hdr); + return; + } + + /* + * convert header to associated reply + */ + hdr->flags =3D VFIO_USER_REPLY; + hdr->size =3D size; + + if (!vfio_user_send_async(proxy, hdr, NULL, &local_err)) { + error_report_err(local_err); + } +} + +/* + * Send an error reply to an incoming request. + */ +void vfio_user_send_error(VFIOUserProxy *proxy, VFIOUserHdr *hdr, int erro= r) +{ + Error *local_err =3D NULL; + + /* + * convert header to associated reply + */ + hdr->flags =3D VFIO_USER_REPLY; + hdr->flags |=3D VFIO_USER_ERROR; + hdr->error_reply =3D error; + hdr->size =3D sizeof(*hdr); + + if (!vfio_user_send_async(proxy, hdr, NULL, &local_err)) { + error_report_err(local_err); + } +} + +/* + * Close FDs erroneously received in an incoming request. + */ +void vfio_user_putfds(VFIOUserMsg *msg) +{ + VFIOUserFDs *fds =3D msg->fds; + int i; + + for (i =3D 0; i < fds->recv_fds; i++) { + close(fds->fds[i]); + } + g_free(fds); + msg->fds =3D NULL; +} + static QLIST_HEAD(, VFIOUserProxy) vfio_user_sockets =3D QLIST_HEAD_INITIALIZER(vfio_user_sockets); =20 --=20 2.49.0 From nobody Sat Nov 15 14:49:49 2025 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=quarantine dis=none) header.from=redhat.com ARC-Seal: i=1; a=rsa-sha256; t=1750924181; cv=none; d=zohomail.com; s=zohoarc; b=XuJgMpo6IQV8M/zrZT82zPF8p3GclfXc4QmK1gGvRn6pQycgYa77N9W+ZFywNisLz/Y50LdKR/s4U8soAaoyTNcI6LQnRlfLSTnG7EPMcx6us6n+Oe1IUOOoFV6jZkqyrlqiYlEgzoh9H3JjGFRzAOqd+Kco6yv+6BoZR8E017U= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1750924181; h=Content-Type:Content-Transfer-Encoding:Cc:Cc:Date:Date:From:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:Subject:To:To:Message-Id:Reply-To; bh=ZTiYRdZHss2VOjKgyAJO7JfRn0HCz3tkkFa9ahC8avE=; b=fQr9u9Uc2RtLUB4DJUK1N/1eOFk+YWqvmXrgA+p81q7IhaEf0zU55Km2aOU6G1WoWF+NIlunhCaCf3y94exBJQaBSE2YfW3qt8Am4TAAmsJhc9jinBAfvbMpHRkxe46bCG/0gwkxBkJNwGIZpJRwySTkTDTe+v4cFMCsL2LqkYo= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass header.from= (p=quarantine dis=none) Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1750924181951349.35957909220804; Thu, 26 Jun 2025 00:49:41 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1uUhJt-0005wA-5r; Thu, 26 Jun 2025 03:46:53 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1uUhJm-0005Rr-2v for qemu-devel@nongnu.org; Thu, 26 Jun 2025 03:46:46 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.129.124]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1uUhJi-00021J-NZ for qemu-devel@nongnu.org; Thu, 26 Jun 2025 03:46:45 -0400 Received: from mx-prod-mc-01.mail-002.prod.us-west-2.aws.redhat.com (ec2-54-186-198-63.us-west-2.compute.amazonaws.com [54.186.198.63]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-13-6-3fOjScONK6YRnujYSLaA-1; Thu, 26 Jun 2025 03:46:37 -0400 Received: from mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.93]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-01.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 1AC721956096; Thu, 26 Jun 2025 07:46:36 +0000 (UTC) Received: from corto.redhat.com (unknown [10.44.32.51]) by mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id AD8DF180035C; Thu, 26 Jun 2025 07:46:33 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1750924001; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=ZTiYRdZHss2VOjKgyAJO7JfRn0HCz3tkkFa9ahC8avE=; b=T8yV1n5YDoPRcQxKyDLOzyjc7/LWUU0sYMY2yRpD7bYs/vM3QIDwCdY0Y7yV7nkksfRqmu vZsW8KQYypaOwcUNTezofG8gkqDaNdddaQpNYco6Hs7CVGWqFSRAN26/D2M8v51WScxuL3 gbm9hLVYqav6rk5qyhIkQO93YAG/JOs= X-MC-Unique: 6-3fOjScONK6YRnujYSLaA-1 X-Mimecast-MFC-AGG-ID: 6-3fOjScONK6YRnujYSLaA_1750923996 From: =?UTF-8?q?C=C3=A9dric=20Le=20Goater?= To: qemu-devel@nongnu.org Cc: Alex Williamson , John Levon , John Johnson , Elena Ufimtseva , Jagannathan Raman , =?UTF-8?q?C=C3=A9dric=20Le=20Goater?= Subject: [PULL 21/25] vfio-user: add 'x-msg-timeout' option Date: Thu, 26 Jun 2025 09:45:25 +0200 Message-ID: <20250626074529.1384114-22-clg@redhat.com> In-Reply-To: <20250626074529.1384114-1-clg@redhat.com> References: <20250626074529.1384114-1-clg@redhat.com> MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.93 Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=170.10.129.124; envelope-from=clg@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H5=0.001, RCVD_IN_MSPIKE_WL=0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, RCVD_IN_VALIDITY_SAFE_BLOCKED=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZohoMail-DKIM: pass (identity @redhat.com) X-ZM-MESSAGEID: 1750924183934116600 From: John Levon By default, the vfio-user subsystem will wait 5 seconds for a message reply from the server. Add an option to allow this to be configurable. Originally-by: John Johnson Signed-off-by: Elena Ufimtseva Signed-off-by: Jagannathan Raman Signed-off-by: John Levon Reviewed-by: C=C3=A9dric Le Goater Link: https://lore.kernel.org/qemu-devel/20250625193012.2316242-16-john.lev= on@nutanix.com Signed-off-by: C=C3=A9dric Le Goater --- hw/vfio-user/proxy.h | 1 + hw/vfio-user/pci.c | 5 +++++ hw/vfio-user/proxy.c | 7 ++++--- 3 files changed, 10 insertions(+), 3 deletions(-) diff --git a/hw/vfio-user/proxy.h b/hw/vfio-user/proxy.h index 39092c08c8d31ad315e1942ddd09134a8a4569a0..6b29cd7cd392ed5a1675dfc3f26= fee691f2f9d44 100644 --- a/hw/vfio-user/proxy.h +++ b/hw/vfio-user/proxy.h @@ -67,6 +67,7 @@ typedef struct VFIOUserProxy { uint64_t max_bitmap; uint64_t migr_pgsize; int flags; + uint32_t wait_time; QemuCond close_cv; AioContext *ctx; QEMUBH *req_bh; diff --git a/hw/vfio-user/pci.c b/hw/vfio-user/pci.c index 040660d1977cf97d9c983985d30d9488e391e0c0..f260bea4900bd4648ba38db191c= afe933ca64135 100644 --- a/hw/vfio-user/pci.c +++ b/hw/vfio-user/pci.c @@ -23,6 +23,7 @@ struct VFIOUserPCIDevice { VFIOPCIDevice device; SocketAddress *socket; bool send_queued; /* all sends are queued */ + uint32_t wait_time; /* timeout for message replies */ }; =20 /* @@ -267,6 +268,9 @@ static void vfio_user_pci_realize(PCIDevice *pdev, Erro= r **errp) proxy->flags |=3D VFIO_PROXY_FORCE_QUEUED; } =20 + /* user specified or 5 sec default */ + proxy->wait_time =3D udev->wait_time; + if (!vfio_user_validate_version(proxy, errp)) { goto error; } @@ -398,6 +402,7 @@ static const Property vfio_user_pci_dev_properties[] = =3D { DEFINE_PROP_UINT32("x-pci-sub-device-id", VFIOPCIDevice, sub_device_id, PCI_ANY_ID), DEFINE_PROP_BOOL("x-send-queued", VFIOUserPCIDevice, send_queued, fals= e), + DEFINE_PROP_UINT32("x-msg-timeout", VFIOUserPCIDevice, wait_time, 5000= ), }; =20 static void vfio_user_pci_set_socket(Object *obj, Visitor *v, const char *= name, diff --git a/hw/vfio-user/proxy.c b/hw/vfio-user/proxy.c index cb93d9a660605f6675a141a1645f2f3391e4aa8a..c3724ba212b437fb810a9ad8291= 5b4b1fdf7a5bc 100644 --- a/hw/vfio-user/proxy.c +++ b/hw/vfio-user/proxy.c @@ -21,7 +21,6 @@ #include "qemu/main-loop.h" #include "system/iothread.h" =20 -static int wait_time =3D 5000; /* wait up to 5 sec for busy servers */ static IOThread *vfio_user_iothread; =20 static void vfio_user_shutdown(VFIOUserProxy *proxy); @@ -686,7 +685,8 @@ bool vfio_user_send_wait(VFIOUserProxy *proxy, VFIOUser= Hdr *hdr, =20 if (ok) { while (!msg->complete) { - if (!qemu_cond_timedwait(&msg->cv, &proxy->lock, wait_time)) { + if (!qemu_cond_timedwait(&msg->cv, &proxy->lock, + proxy->wait_time)) { VFIOUserMsgQ *list; =20 list =3D msg->pending ? &proxy->pending : &proxy->outgoing; @@ -758,7 +758,8 @@ void vfio_user_wait_reqs(VFIOUserProxy *proxy) msg->type =3D VFIO_MSG_WAIT; proxy->last_nowait =3D NULL; while (!msg->complete) { - if (!qemu_cond_timedwait(&msg->cv, &proxy->lock, wait_time)) { + if (!qemu_cond_timedwait(&msg->cv, &proxy->lock, + proxy->wait_time)) { VFIOUserMsgQ *list; =20 list =3D msg->pending ? &proxy->pending : &proxy->outgoing; --=20 2.49.0 From nobody Sat Nov 15 14:49:49 2025 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=quarantine dis=none) header.from=redhat.com ARC-Seal: i=1; a=rsa-sha256; t=1750924064; cv=none; d=zohomail.com; s=zohoarc; b=Du8BqRmHTf315CaFBhLrjXAUQyldpQxhbXaq5Ut0mJar6gYkGzxrQq5pTkM0wYmg+yLxOh9sxh03PLYrBG/bAmVRuW1G6cZxJzBhpCcM2UBYwKCFimtl3R53cEN+rQn/0So1ys+Q6/t/590AmDKv93vqW82aKZg9KgKQMjtoJKo= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1750924064; h=Content-Type:Content-Transfer-Encoding:Cc:Cc:Date:Date:From:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:Subject:To:To:Message-Id:Reply-To; bh=au1bmCam4tqlhslh9zeYcUNLFWP/u2nkLdhtzCcyHM8=; b=m4CUCGCHCTd350XiDTUTYDzaz82HPLNf3qwZ1PVSZE0y5tyXLZjn/I6zlrUHZy4JwFQ0GUSQTD12B+5n6NTxUHzh2gBMXNoLTOMES8SlVKKi1LKjoZqFrA2uMDVFiwI2Xc5JrrB13jnX9+SYWyFfldMavWQ07aYI2Iz/dSVMTBE= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass header.from= (p=quarantine dis=none) Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1750924064049612.1678863600087; Thu, 26 Jun 2025 00:47:44 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1uUhJt-0005vh-36; Thu, 26 Jun 2025 03:46:53 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1uUhJr-0005jf-DW for qemu-devel@nongnu.org; Thu, 26 Jun 2025 03:46:51 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.129.124]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1uUhJk-00021f-Rh for qemu-devel@nongnu.org; Thu, 26 Jun 2025 03:46:49 -0400 Received: from mx-prod-mc-08.mail-002.prod.us-west-2.aws.redhat.com (ec2-35-165-154-97.us-west-2.compute.amazonaws.com [35.165.154.97]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-25-DuI-HmlYPimwpTEWyNQ6ow-1; Thu, 26 Jun 2025 03:46:39 -0400 Received: from mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.93]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-08.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 533411801BD8; Thu, 26 Jun 2025 07:46:38 +0000 (UTC) Received: from corto.redhat.com (unknown [10.44.32.51]) by mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id 95E0A180035C; Thu, 26 Jun 2025 07:46:36 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1750924003; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=au1bmCam4tqlhslh9zeYcUNLFWP/u2nkLdhtzCcyHM8=; b=PWH+4LFpY0I5G8vJiFud47DLsWkRp/ZO4fXC80f2MG0S3xF/ctPFEU6/WN+YyPN+1RRbj8 VS8RwZAU6vjIj8nz9eAmO8PM+Fkqe8Lem2LZHZzay0HvTT0XPYrzz0LmE1Q30y50NLL8uM eMsw/+oAVo8cMWOmpbJkDSrPDmhN4D4= X-MC-Unique: DuI-HmlYPimwpTEWyNQ6ow-1 X-Mimecast-MFC-AGG-ID: DuI-HmlYPimwpTEWyNQ6ow_1750923998 From: =?UTF-8?q?C=C3=A9dric=20Le=20Goater?= To: qemu-devel@nongnu.org Cc: Alex Williamson , John Levon , =?UTF-8?q?C=C3=A9dric=20Le=20Goater?= Subject: [PULL 22/25] vfio-user: support posted writes Date: Thu, 26 Jun 2025 09:45:26 +0200 Message-ID: <20250626074529.1384114-23-clg@redhat.com> In-Reply-To: <20250626074529.1384114-1-clg@redhat.com> References: <20250626074529.1384114-1-clg@redhat.com> MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.93 Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=170.10.129.124; envelope-from=clg@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H5=0.001, RCVD_IN_MSPIKE_WL=0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, RCVD_IN_VALIDITY_SAFE_BLOCKED=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZohoMail-DKIM: pass (identity @redhat.com) X-ZM-MESSAGEID: 1750924066558116600 From: John Levon Support an asynchronous send of a vfio-user socket message (no wait for a reply) when the write is posted. This is only safe when no regions are mappable by the VM. Add an option to explicitly disable this as well. Signed-off-by: John Levon Reviewed-by: C=C3=A9dric Le Goater Link: https://lore.kernel.org/qemu-devel/20250625193012.2316242-17-john.lev= on@nutanix.com Signed-off-by: C=C3=A9dric Le Goater --- hw/vfio-user/proxy.h | 6 ++++++ hw/vfio-user/device.c | 45 +++++++++++++++++++++++++++++++++++++++---- hw/vfio-user/pci.c | 6 ++++++ hw/vfio-user/proxy.c | 12 ++++++++++-- 4 files changed, 63 insertions(+), 6 deletions(-) diff --git a/hw/vfio-user/proxy.h b/hw/vfio-user/proxy.h index 6b29cd7cd392ed5a1675dfc3f26fee691f2f9d44..0418f58bf156121d65d363fd647= 114ca0cfecec0 100644 --- a/hw/vfio-user/proxy.h +++ b/hw/vfio-user/proxy.h @@ -91,6 +91,7 @@ typedef struct VFIOUserProxy { /* VFIOProxy flags */ #define VFIO_PROXY_CLIENT 0x1 #define VFIO_PROXY_FORCE_QUEUED 0x4 +#define VFIO_PROXY_NO_POST 0x8 =20 typedef struct VFIODevice VFIODevice; =20 @@ -104,6 +105,8 @@ bool vfio_user_validate_version(VFIOUserProxy *proxy, E= rror **errp); VFIOUserFDs *vfio_user_getfds(int numfds); void vfio_user_putfds(VFIOUserMsg *msg); =20 +void vfio_user_disable_posted_writes(VFIOUserProxy *proxy); + void vfio_user_request_msg(VFIOUserHdr *hdr, uint16_t cmd, uint32_t size, uint32_t flags); void vfio_user_wait_reqs(VFIOUserProxy *proxy); @@ -111,6 +114,9 @@ bool vfio_user_send_wait(VFIOUserProxy *proxy, VFIOUser= Hdr *hdr, VFIOUserFDs *fds, int rsize, Error **errp); bool vfio_user_send_nowait(VFIOUserProxy *proxy, VFIOUserHdr *hdr, VFIOUserFDs *fds, int rsize, Error **errp); +bool vfio_user_send_async(VFIOUserProxy *proxy, VFIOUserHdr *hdr, + VFIOUserFDs *fds, Error **errp); + void vfio_user_send_reply(VFIOUserProxy *proxy, VFIOUserHdr *hdr, int size= ); void vfio_user_send_error(VFIOUserProxy *proxy, VFIOUserHdr *hdr, int erro= r); =20 diff --git a/hw/vfio-user/device.c b/hw/vfio-user/device.c index 3a118e73615a40d5d455dfb379115be66726627a..aa07eac3303aef07d32592f9a56= a58e85011e6ab 100644 --- a/hw/vfio-user/device.c +++ b/hw/vfio-user/device.c @@ -110,10 +110,21 @@ static int vfio_user_get_region_info(VFIOUserProxy *p= roxy, trace_vfio_user_get_region_info(msgp->index, msgp->flags, msgp->size); =20 memcpy(info, &msgp->argsz, info->argsz); + + /* + * If at least one region is directly mapped into the VM, then we can = no + * longer rely on the sequential nature of vfio-user request handling = to + * ensure that posted writes are completed before a subsequent read. I= n this + * case, disable posted write support. This is a per-device property, = not + * per-region. + */ + if (info->flags & VFIO_REGION_INFO_FLAG_MMAP) { + vfio_user_disable_posted_writes(proxy); + } + return 0; } =20 - static int vfio_user_device_io_get_region_info(VFIODevice *vbasedev, struct vfio_region_info *in= fo, int *fd) @@ -312,33 +323,58 @@ static int vfio_user_device_io_region_read(VFIODevice= *vbasedev, uint8_t index, return msgp->count; } =20 +/* + * If this is a posted write, and VFIO_PROXY_NO_POST is not set, then we a= re OK + * to send the write to the socket without waiting for the server's reply: + * a subsequent read (of any region) will not pass the posted write, as all + * messages are handled sequentially. + */ static int vfio_user_device_io_region_write(VFIODevice *vbasedev, uint8_t = index, off_t off, unsigned count, void *data, bool post) { - g_autofree VFIOUserRegionRW *msgp =3D NULL; + VFIOUserRegionRW *msgp =3D NULL; VFIOUserProxy *proxy =3D vbasedev->proxy; int size =3D sizeof(*msgp) + count; Error *local_err =3D NULL; + int flags =3D 0; int ret; =20 if (count > proxy->max_xfer_size) { return -EINVAL; } =20 + if (proxy->flags & VFIO_PROXY_NO_POST) { + post =3D false; + } + + if (post) { + flags |=3D VFIO_USER_NO_REPLY; + } + msgp =3D g_malloc0(size); - vfio_user_request_msg(&msgp->hdr, VFIO_USER_REGION_WRITE, size, 0); + vfio_user_request_msg(&msgp->hdr, VFIO_USER_REGION_WRITE, size, flags); msgp->offset =3D off; msgp->region =3D index; msgp->count =3D count; memcpy(&msgp->data, data, count); trace_vfio_user_region_rw(msgp->region, msgp->offset, msgp->count); =20 - /* Ignore post: all writes are synchronous/non-posted. */ + /* async send will free msg after it's sent */ + if (post) { + if (!vfio_user_send_async(proxy, &msgp->hdr, NULL, &local_err)) { + error_prepend(&local_err, "%s: ", __func__); + error_report_err(local_err); + return -EFAULT; + } + + return count; + } =20 if (!vfio_user_send_wait(proxy, &msgp->hdr, NULL, 0, &local_err)) { error_prepend(&local_err, "%s: ", __func__); error_report_err(local_err); + g_free(msgp); return -EFAULT; } =20 @@ -348,6 +384,7 @@ static int vfio_user_device_io_region_write(VFIODevice = *vbasedev, uint8_t index, ret =3D count; } =20 + g_free(msgp); return ret; } =20 diff --git a/hw/vfio-user/pci.c b/hw/vfio-user/pci.c index f260bea4900bd4648ba38db191cafe933ca64135..be71c777291f0c68b01b5402961= 2c4dbc6aa86e2 100644 --- a/hw/vfio-user/pci.c +++ b/hw/vfio-user/pci.c @@ -24,6 +24,7 @@ struct VFIOUserPCIDevice { SocketAddress *socket; bool send_queued; /* all sends are queued */ uint32_t wait_time; /* timeout for message replies */ + bool no_post; /* all region writes are sync */ }; =20 /* @@ -268,6 +269,10 @@ static void vfio_user_pci_realize(PCIDevice *pdev, Err= or **errp) proxy->flags |=3D VFIO_PROXY_FORCE_QUEUED; } =20 + if (udev->no_post) { + proxy->flags |=3D VFIO_PROXY_NO_POST; + } + /* user specified or 5 sec default */ proxy->wait_time =3D udev->wait_time; =20 @@ -403,6 +408,7 @@ static const Property vfio_user_pci_dev_properties[] = =3D { sub_device_id, PCI_ANY_ID), DEFINE_PROP_BOOL("x-send-queued", VFIOUserPCIDevice, send_queued, fals= e), DEFINE_PROP_UINT32("x-msg-timeout", VFIOUserPCIDevice, wait_time, 5000= ), + DEFINE_PROP_BOOL("x-no-posted-writes", VFIOUserPCIDevice, no_post, fal= se), }; =20 static void vfio_user_pci_set_socket(Object *obj, Visitor *v, const char *= name, diff --git a/hw/vfio-user/proxy.c b/hw/vfio-user/proxy.c index c3724ba212b437fb810a9ad82915b4b1fdf7a5bc..7ce8573abbb4337028ab67cd045= 62ab822c672d2 100644 --- a/hw/vfio-user/proxy.c +++ b/hw/vfio-user/proxy.c @@ -714,8 +714,8 @@ bool vfio_user_send_wait(VFIOUserProxy *proxy, VFIOUser= Hdr *hdr, * In either case, ownership of @hdr and @fds is taken, and the caller must * *not* free them itself. */ -static bool vfio_user_send_async(VFIOUserProxy *proxy, VFIOUserHdr *hdr, - VFIOUserFDs *fds, Error **errp) +bool vfio_user_send_async(VFIOUserProxy *proxy, VFIOUserHdr *hdr, + VFIOUserFDs *fds, Error **errp) { VFIOUserMsg *msg; =20 @@ -844,6 +844,14 @@ void vfio_user_putfds(VFIOUserMsg *msg) msg->fds =3D NULL; } =20 +void +vfio_user_disable_posted_writes(VFIOUserProxy *proxy) +{ + WITH_QEMU_LOCK_GUARD(&proxy->lock) { + proxy->flags |=3D VFIO_PROXY_NO_POST; + } +} + static QLIST_HEAD(, VFIOUserProxy) vfio_user_sockets =3D QLIST_HEAD_INITIALIZER(vfio_user_sockets); =20 --=20 2.49.0 From nobody Sat Nov 15 14:49:49 2025 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=quarantine dis=none) header.from=redhat.com ARC-Seal: i=1; a=rsa-sha256; t=1750924183; cv=none; d=zohomail.com; s=zohoarc; b=PH6hFDUwP1THR6KayiGZniU8fDqXtYx5jolxLgpSDpeze2qSscOOfFNawOWZ8/0+FZOXOyELdkX4WF23riqPKpOyybICI2Y3WF4uTb+3a30YCOEVxeDQtYi9/bE+DnldyZ5cAVjiFZeKt90ls7nvaMOvQ5rjDjscPBsSvpDqVJ0= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1750924183; h=Content-Type:Content-Transfer-Encoding:Cc:Cc:Date:Date:From:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:Subject:To:To:Message-Id:Reply-To; bh=e+H0MhAcAv4nq4Wbd68BYhhQHn3QGKnFblTAmr6dPgk=; b=D4ICqW3cN2sbbgzkHr1+GuuWzhA4vzGZ/XbAQXlrPQs1OXZvgv+yH0OyVIHD1vUK9wNbqZnWm/6lfa3mFV4wIOH7luVs33+mA9OyEx0LTiTZQj6nSP5NAB45GN0wZm/nYGTaoLJDz6DOqKwv5IAAKVZxV8ynXsnY5G57tpySDzw= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass header.from= (p=quarantine dis=none) Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1750924183804886.4220954224926; Thu, 26 Jun 2025 00:49:43 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1uUhJy-0006JM-2R; Thu, 26 Jun 2025 03:46:58 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1uUhJv-00066d-Ni for qemu-devel@nongnu.org; Thu, 26 Jun 2025 03:46:55 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.129.124]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1uUhJs-000220-5H for qemu-devel@nongnu.org; Thu, 26 Jun 2025 03:46:55 -0400 Received: from mx-prod-mc-03.mail-002.prod.us-west-2.aws.redhat.com (ec2-54-186-198-63.us-west-2.compute.amazonaws.com [54.186.198.63]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-651-66eXxJ1AOP2NOarmzh-7HQ-1; Thu, 26 Jun 2025 03:46:42 -0400 Received: from mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.93]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-03.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id A5CF118DA385; Thu, 26 Jun 2025 07:46:41 +0000 (UTC) Received: from corto.redhat.com (unknown [10.44.32.51]) by mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id D1ECC180045B; Thu, 26 Jun 2025 07:46:38 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1750924006; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=e+H0MhAcAv4nq4Wbd68BYhhQHn3QGKnFblTAmr6dPgk=; b=ICbvdC5lTeHJt4Qp+6wHdoc4sX6QFQQKLgGaFbReal1rN0GqU4kIa5CYqM5L/aAKOSPppq M0DJqyU3H6DnUsy2Dq+M9k++CZwfQpqHHGqfVXAvHm+XVMHC7UCRCq5TDr2de9i8aSb3Ho 3fO0jTQU6K/7baE0nkL3THk0oXvvTB4= X-MC-Unique: 66eXxJ1AOP2NOarmzh-7HQ-1 X-Mimecast-MFC-AGG-ID: 66eXxJ1AOP2NOarmzh-7HQ_1750924001 From: =?UTF-8?q?C=C3=A9dric=20Le=20Goater?= To: qemu-devel@nongnu.org Cc: Alex Williamson , John Levon , John Johnson , Elena Ufimtseva , Jagannathan Raman , =?UTF-8?q?C=C3=A9dric=20Le=20Goater?= Subject: [PULL 23/25] vfio-user: add coalesced posted writes Date: Thu, 26 Jun 2025 09:45:27 +0200 Message-ID: <20250626074529.1384114-24-clg@redhat.com> In-Reply-To: <20250626074529.1384114-1-clg@redhat.com> References: <20250626074529.1384114-1-clg@redhat.com> MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.93 Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=170.10.129.124; envelope-from=clg@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H5=0.001, RCVD_IN_MSPIKE_WL=0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, RCVD_IN_VALIDITY_SAFE_BLOCKED=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZohoMail-DKIM: pass (identity @redhat.com) X-ZM-MESSAGEID: 1750924185936116600 From: John Levon Add new message to send multiple writes to server in a single message. Prevents the outgoing queue from overflowing when a long latency operation is followed by a series of posted writes. Originally-by: John Johnson Signed-off-by: Elena Ufimtseva Signed-off-by: Jagannathan Raman Signed-off-by: John Levon Reviewed-by: C=C3=A9dric Le Goater Link: https://lore.kernel.org/qemu-devel/20250625193012.2316242-18-john.lev= on@nutanix.com Signed-off-by: C=C3=A9dric Le Goater --- hw/vfio-user/protocol.h | 21 ++++++++++ hw/vfio-user/proxy.h | 12 ++++++ hw/vfio-user/device.c | 40 +++++++++++++++++++ hw/vfio-user/proxy.c | 84 +++++++++++++++++++++++++++++++++++++++ hw/vfio-user/trace-events | 1 + 5 files changed, 158 insertions(+) diff --git a/hw/vfio-user/protocol.h b/hw/vfio-user/protocol.h index 3e9d8e576bae6a51d8fbc7cf26bb35d29e0e1aee..3249a4a6b62af95e3718ed5053b= bf310cd08c024 100644 --- a/hw/vfio-user/protocol.h +++ b/hw/vfio-user/protocol.h @@ -39,6 +39,7 @@ enum vfio_user_command { VFIO_USER_DMA_WRITE =3D 12, VFIO_USER_DEVICE_RESET =3D 13, VFIO_USER_DIRTY_PAGES =3D 14, + VFIO_USER_REGION_WRITE_MULTI =3D 15, VFIO_USER_MAX, }; =20 @@ -72,6 +73,7 @@ typedef struct { #define VFIO_USER_CAP_PGSIZES "pgsizes" #define VFIO_USER_CAP_MAP_MAX "max_dma_maps" #define VFIO_USER_CAP_MIGR "migration" +#define VFIO_USER_CAP_MULTI "write_multiple" =20 /* "migration" members */ #define VFIO_USER_CAP_PGSIZE "pgsize" @@ -218,4 +220,23 @@ typedef struct { char data[]; } VFIOUserBitmap; =20 +/* + * VFIO_USER_REGION_WRITE_MULTI + */ +#define VFIO_USER_MULTI_DATA 8 +#define VFIO_USER_MULTI_MAX 200 + +typedef struct { + uint64_t offset; + uint32_t region; + uint32_t count; + char data[VFIO_USER_MULTI_DATA]; +} VFIOUserWROne; + +typedef struct { + VFIOUserHdr hdr; + uint64_t wr_cnt; + VFIOUserWROne wrs[VFIO_USER_MULTI_MAX]; +} VFIOUserWRMulti; + #endif /* VFIO_USER_PROTOCOL_H */ diff --git a/hw/vfio-user/proxy.h b/hw/vfio-user/proxy.h index 0418f58bf156121d65d363fd647114ca0cfecec0..61e64a0020fcd3a12e6cf846682= bb0c8e4352c07 100644 --- a/hw/vfio-user/proxy.h +++ b/hw/vfio-user/proxy.h @@ -85,6 +85,8 @@ typedef struct VFIOUserProxy { VFIOUserMsg *last_nowait; VFIOUserMsg *part_recv; size_t recv_left; + VFIOUserWRMulti *wr_multi; + int num_outgoing; enum proxy_state state; } VFIOUserProxy; =20 @@ -92,6 +94,11 @@ typedef struct VFIOUserProxy { #define VFIO_PROXY_CLIENT 0x1 #define VFIO_PROXY_FORCE_QUEUED 0x4 #define VFIO_PROXY_NO_POST 0x8 +#define VFIO_PROXY_USE_MULTI 0x16 + +/* coalescing high and low water marks for VFIOProxy num_outgoing */ +#define VFIO_USER_OUT_HIGH 1024 +#define VFIO_USER_OUT_LOW 128 =20 typedef struct VFIODevice VFIODevice; =20 @@ -120,4 +127,9 @@ bool vfio_user_send_async(VFIOUserProxy *proxy, VFIOUse= rHdr *hdr, void vfio_user_send_reply(VFIOUserProxy *proxy, VFIOUserHdr *hdr, int size= ); void vfio_user_send_error(VFIOUserProxy *proxy, VFIOUserHdr *hdr, int erro= r); =20 +void vfio_user_flush_multi(VFIOUserProxy *proxy); +void vfio_user_create_multi(VFIOUserProxy *proxy); +void vfio_user_add_multi(VFIOUserProxy *proxy, uint8_t index, + off_t offset, uint32_t count, void *data); + #endif /* VFIO_USER_PROXY_H */ diff --git a/hw/vfio-user/device.c b/hw/vfio-user/device.c index aa07eac3303aef07d32592f9a56a58e85011e6ab..0609a7dc25428c1e7efed1e7a4e= 85de91aace2d4 100644 --- a/hw/vfio-user/device.c +++ b/hw/vfio-user/device.c @@ -9,6 +9,8 @@ #include "qemu/osdep.h" #include "qapi/error.h" #include "qemu/error-report.h" +#include "qemu/lockable.h" +#include "qemu/thread.h" =20 #include "hw/vfio-user/device.h" #include "hw/vfio-user/trace.h" @@ -337,6 +339,7 @@ static int vfio_user_device_io_region_write(VFIODevice = *vbasedev, uint8_t index, VFIOUserProxy *proxy =3D vbasedev->proxy; int size =3D sizeof(*msgp) + count; Error *local_err =3D NULL; + bool can_multi; int flags =3D 0; int ret; =20 @@ -352,6 +355,43 @@ static int vfio_user_device_io_region_write(VFIODevice= *vbasedev, uint8_t index, flags |=3D VFIO_USER_NO_REPLY; } =20 + /* write eligible to be in a WRITE_MULTI msg ? */ + can_multi =3D (proxy->flags & VFIO_PROXY_USE_MULTI) && post && + count <=3D VFIO_USER_MULTI_DATA; + + /* + * This should be a rare case, so first check without the lock, + * if we're wrong, vfio_send_queued() will flush any posted writes + * we missed here + */ + if (proxy->wr_multi !=3D NULL || + (proxy->num_outgoing > VFIO_USER_OUT_HIGH && can_multi)) { + + /* + * re-check with lock + * + * if already building a WRITE_MULTI msg, + * add this one if possible else flush pending before + * sending the current one + * + * else if outgoing queue is over the highwater, + * start a new WRITE_MULTI message + */ + WITH_QEMU_LOCK_GUARD(&proxy->lock) { + if (proxy->wr_multi !=3D NULL) { + if (can_multi) { + vfio_user_add_multi(proxy, index, off, count, data); + return count; + } + vfio_user_flush_multi(proxy); + } else if (proxy->num_outgoing > VFIO_USER_OUT_HIGH && can_mul= ti) { + vfio_user_create_multi(proxy); + vfio_user_add_multi(proxy, index, off, count, data); + return count; + } + } + } + msgp =3D g_malloc0(size); vfio_user_request_msg(&msgp->hdr, VFIO_USER_REGION_WRITE, size, flags); msgp->offset =3D off; diff --git a/hw/vfio-user/proxy.c b/hw/vfio-user/proxy.c index 7ce8573abbb4337028ab67cd04562ab822c672d2..c418954440839b92864518cd757= c750f0576d0af 100644 --- a/hw/vfio-user/proxy.c +++ b/hw/vfio-user/proxy.c @@ -13,12 +13,14 @@ #include "hw/vfio-user/proxy.h" #include "hw/vfio-user/trace.h" #include "qapi/error.h" +#include "qobject/qbool.h" #include "qobject/qdict.h" #include "qobject/qjson.h" #include "qobject/qnum.h" #include "qemu/error-report.h" #include "qemu/lockable.h" #include "qemu/main-loop.h" +#include "qemu/thread.h" #include "system/iothread.h" =20 static IOThread *vfio_user_iothread; @@ -445,6 +447,7 @@ static ssize_t vfio_user_send_one(VFIOUserProxy *proxy,= Error **errp) } =20 QTAILQ_REMOVE(&proxy->outgoing, msg, next); + proxy->num_outgoing--; if (msg->type =3D=3D VFIO_MSG_ASYNC) { vfio_user_recycle(proxy, msg); } else { @@ -481,6 +484,11 @@ static void vfio_user_send(void *opaque) } qio_channel_set_aio_fd_handler(proxy->ioc, proxy->ctx, vfio_user_recv, NULL, NULL, proxy); + + /* queue empty - send any pending multi write msgs */ + if (proxy->wr_multi !=3D NULL) { + vfio_user_flush_multi(proxy); + } } } =20 @@ -579,11 +587,18 @@ static bool vfio_user_send_queued(VFIOUserProxy *prox= y, VFIOUserMsg *msg, { int ret; =20 + /* older coalesced writes go first */ + if (proxy->wr_multi !=3D NULL && + ((msg->hdr->flags & VFIO_USER_TYPE) =3D=3D VFIO_USER_REQUEST)) { + vfio_user_flush_multi(proxy); + } + /* * Unsent outgoing msgs - add to tail */ if (!QTAILQ_EMPTY(&proxy->outgoing)) { QTAILQ_INSERT_TAIL(&proxy->outgoing, msg, next); + proxy->num_outgoing++; return true; } =20 @@ -598,6 +613,7 @@ static bool vfio_user_send_queued(VFIOUserProxy *proxy,= VFIOUserMsg *msg, =20 if (ret =3D=3D QIO_CHANNEL_ERR_BLOCK) { QTAILQ_INSERT_HEAD(&proxy->outgoing, msg, next); + proxy->num_outgoing =3D 1; qio_channel_set_aio_fd_handler(proxy->ioc, proxy->ctx, vfio_user_recv, proxy->ctx, vfio_user_send, proxy); @@ -1151,12 +1167,27 @@ static bool check_migr(VFIOUserProxy *proxy, QObjec= t *qobj, Error **errp) return caps_parse(proxy, qdict, caps_migr, errp); } =20 +static bool check_multi(VFIOUserProxy *proxy, QObject *qobj, Error **errp) +{ + QBool *qb =3D qobject_to(QBool, qobj); + + if (qb =3D=3D NULL) { + error_setg(errp, "malformed %s", VFIO_USER_CAP_MULTI); + return false; + } + if (qbool_get_bool(qb)) { + proxy->flags |=3D VFIO_PROXY_USE_MULTI; + } + return true; +} + static struct cap_entry caps_cap[] =3D { { VFIO_USER_CAP_MAX_FDS, check_max_fds }, { VFIO_USER_CAP_MAX_XFER, check_max_xfer }, { VFIO_USER_CAP_PGSIZES, check_pgsizes }, { VFIO_USER_CAP_MAP_MAX, check_max_dma }, { VFIO_USER_CAP_MIGR, check_migr }, + { VFIO_USER_CAP_MULTI, check_multi }, { NULL } }; =20 @@ -1215,6 +1246,7 @@ static GString *caps_json(void) qdict_put_int(capdict, VFIO_USER_CAP_MAX_XFER, VFIO_USER_DEF_MAX_XFER); qdict_put_int(capdict, VFIO_USER_CAP_PGSIZES, VFIO_USER_DEF_PGSIZE); qdict_put_int(capdict, VFIO_USER_CAP_MAP_MAX, VFIO_USER_DEF_MAP_MAX); + qdict_put_bool(capdict, VFIO_USER_CAP_MULTI, true); =20 qdict_put_obj(dict, VFIO_USER_CAP, QOBJECT(capdict)); =20 @@ -1270,3 +1302,55 @@ bool vfio_user_validate_version(VFIOUserProxy *proxy= , Error **errp) trace_vfio_user_version(msgp->major, msgp->minor, msgp->capabilities); return true; } + +void vfio_user_flush_multi(VFIOUserProxy *proxy) +{ + VFIOUserMsg *msg; + VFIOUserWRMulti *wm =3D proxy->wr_multi; + Error *local_err =3D NULL; + + proxy->wr_multi =3D NULL; + + /* adjust size for actual # of writes */ + wm->hdr.size -=3D (VFIO_USER_MULTI_MAX - wm->wr_cnt) * sizeof(VFIOUser= WROne); + + msg =3D vfio_user_getmsg(proxy, &wm->hdr, NULL); + msg->id =3D wm->hdr.id; + msg->rsize =3D 0; + msg->type =3D VFIO_MSG_ASYNC; + trace_vfio_user_wrmulti("flush", wm->wr_cnt); + + if (!vfio_user_send_queued(proxy, msg, &local_err)) { + error_report_err(local_err); + vfio_user_recycle(proxy, msg); + } +} + +void vfio_user_create_multi(VFIOUserProxy *proxy) +{ + VFIOUserWRMulti *wm; + + wm =3D g_malloc0(sizeof(*wm)); + vfio_user_request_msg(&wm->hdr, VFIO_USER_REGION_WRITE_MULTI, + sizeof(*wm), VFIO_USER_NO_REPLY); + proxy->wr_multi =3D wm; +} + +void vfio_user_add_multi(VFIOUserProxy *proxy, uint8_t index, + off_t offset, uint32_t count, void *data) +{ + VFIOUserWRMulti *wm =3D proxy->wr_multi; + VFIOUserWROne *w1 =3D &wm->wrs[wm->wr_cnt]; + + w1->offset =3D offset; + w1->region =3D index; + w1->count =3D count; + memcpy(&w1->data, data, count); + + wm->wr_cnt++; + trace_vfio_user_wrmulti("add", wm->wr_cnt); + if (wm->wr_cnt =3D=3D VFIO_USER_MULTI_MAX || + proxy->num_outgoing < VFIO_USER_OUT_LOW) { + vfio_user_flush_multi(proxy); + } +} diff --git a/hw/vfio-user/trace-events b/hw/vfio-user/trace-events index 44dde020b3198e215065c4adcb9d13832e5c4287..abb67f4c119af533a77ad04e383= fa027e5aaa684 100644 --- a/hw/vfio-user/trace-events +++ b/hw/vfio-user/trace-events @@ -13,6 +13,7 @@ vfio_user_get_region_info(uint32_t index, uint32_t flags,= uint64_t size) " index vfio_user_region_rw(uint32_t region, uint64_t off, uint32_t count) " regio= n %d offset 0x%"PRIx64" count %d" vfio_user_get_irq_info(uint32_t index, uint32_t flags, uint32_t count) " i= ndex %d flags 0x%x count %d" vfio_user_set_irqs(uint32_t index, uint32_t start, uint32_t count, uint32_= t flags) " index %d start %d count %d flags 0x%x" +vfio_user_wrmulti(const char *s, uint64_t wr_cnt) " %s count 0x%"PRIx64 =20 # container.c vfio_user_dma_map(uint64_t iova, uint64_t size, uint64_t off, uint32_t fla= gs, bool async_ops) " iova 0x%"PRIx64" size 0x%"PRIx64" off 0x%"PRIx64" fla= gs 0x%x async_ops %d" --=20 2.49.0 From nobody Sat Nov 15 14:49:49 2025 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=quarantine dis=none) header.from=redhat.com ARC-Seal: i=1; a=rsa-sha256; t=1750924218; cv=none; d=zohomail.com; s=zohoarc; b=cKS2o9kNt2k3lB4G6YwKyTdRtu8JaRHriCNz0KMAFdfXEtFcB9ENyaaIuVyNesMgNiX7Ym8YDBz80j53PkCZvzDvraAuwVUehjOkn7IwsTKUiB9R9ZBr/Z0oxODzaOxzMyQ463wRT3suoL/dB9A1xiXfD5dy7BGV7erBuIz10QA= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1750924218; h=Content-Type:Content-Transfer-Encoding:Cc:Cc:Date:Date:From:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:Subject:To:To:Message-Id:Reply-To; bh=IErTZzC51n5Uhzt0ReIfXt+k26ISLRH++xMSba06ZjI=; b=X2butb801FlgxCVDmP3RXLMw7jdseOii+Nrw7y7n8CunPFSEqlmq4IE5WQHlXw4WsqbZ8lhHUDHOvC+N3mZ75EZ+PyPMl0MNHTaaHaosVbTZm+UE5miLOnwP3LWhujGl+vlM+w6fO7xW6f+Y6V9IojbGz5TEOBm9Oj1awStNqoY= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass header.from= (p=quarantine dis=none) Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1750924218131644.3855115132358; Thu, 26 Jun 2025 00:50:18 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1uUhJv-00064H-5y; Thu, 26 Jun 2025 03:46:55 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1uUhJt-0005xH-FO for qemu-devel@nongnu.org; Thu, 26 Jun 2025 03:46:53 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1uUhJr-00022U-6G for qemu-devel@nongnu.org; Thu, 26 Jun 2025 03:46:53 -0400 Received: from mx-prod-mc-03.mail-002.prod.us-west-2.aws.redhat.com (ec2-54-186-198-63.us-west-2.compute.amazonaws.com [54.186.198.63]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-650-2-fBjaM1PUevB3jQYYgVRA-1; Thu, 26 Jun 2025 03:46:45 -0400 Received: from mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.93]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-03.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 42D8718DA5CA; Thu, 26 Jun 2025 07:46:44 +0000 (UTC) Received: from corto.redhat.com (unknown [10.44.32.51]) by mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id 3500D180045B; Thu, 26 Jun 2025 07:46:41 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1750924009; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=IErTZzC51n5Uhzt0ReIfXt+k26ISLRH++xMSba06ZjI=; b=iSbvGkXj+booGIY2gImmmDjaA5EhQL0ytOVFKYmlju+kCAoQDvrrSbCh76dpDKWygdJgMm vtSpweCdcmmlmuvRiFFVFjhW6Myv8flTgmpfFwXmIXtbMYN+veqAd6X9PYgqFUX6mD2krn cmVh7W8PJEZCgHtwyGh92FEO8Hs0SJU= X-MC-Unique: 2-fBjaM1PUevB3jQYYgVRA-1 X-Mimecast-MFC-AGG-ID: 2-fBjaM1PUevB3jQYYgVRA_1750924004 From: =?UTF-8?q?C=C3=A9dric=20Le=20Goater?= To: qemu-devel@nongnu.org Cc: Alex Williamson , John Levon , =?UTF-8?q?C=C3=A9dric=20Le=20Goater?= Subject: [PULL 24/25] docs: add vfio-user documentation Date: Thu, 26 Jun 2025 09:45:28 +0200 Message-ID: <20250626074529.1384114-25-clg@redhat.com> In-Reply-To: <20250626074529.1384114-1-clg@redhat.com> References: <20250626074529.1384114-1-clg@redhat.com> MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.93 Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=170.10.133.124; envelope-from=clg@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H5=0.001, RCVD_IN_MSPIKE_WL=0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, RCVD_IN_VALIDITY_SAFE_BLOCKED=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZohoMail-DKIM: pass (identity @redhat.com) X-ZM-MESSAGEID: 1750924220196116600 From: John Levon Add some basic documentation on vfio-user usage. Signed-off-by: John Levon Reviewed-by: C=C3=A9dric Le Goater Link: https://lore.kernel.org/qemu-devel/20250625193012.2316242-19-john.lev= on@nutanix.com Signed-off-by: C=C3=A9dric Le Goater --- docs/system/device-emulation.rst | 1 + docs/system/devices/vfio-user.rst | 26 ++++++++++++++++++++++++++ 2 files changed, 27 insertions(+) create mode 100644 docs/system/devices/vfio-user.rst diff --git a/docs/system/device-emulation.rst b/docs/system/device-emulatio= n.rst index a1b0d7997e04f781484a0d48085858f39ee17e35..911381643f1c617dca1021962bc= 4446dbec73ec1 100644 --- a/docs/system/device-emulation.rst +++ b/docs/system/device-emulation.rst @@ -85,6 +85,7 @@ Emulated Devices devices/can.rst devices/ccid.rst devices/cxl.rst + devices/vfio-user.rst devices/ivshmem.rst devices/ivshmem-flat.rst devices/keyboard.rst diff --git a/docs/system/devices/vfio-user.rst b/docs/system/devices/vfio-u= ser.rst new file mode 100644 index 0000000000000000000000000000000000000000..b6dcaa5615e5354e80e63925f0a= 37537ca9e9371 --- /dev/null +++ b/docs/system/devices/vfio-user.rst @@ -0,0 +1,26 @@ +.. SPDX-License-Identifier: GPL-2.0-or-later + +=3D=3D=3D=3D=3D=3D=3D=3D=3D +vfio-user +=3D=3D=3D=3D=3D=3D=3D=3D=3D + +QEMU includes a ``vfio-user`` client. The ``vfio-user`` specification allo= ws for +implementing (PCI) devices in userspace outside of QEMU; it is similar to +``vhost-user`` in this respect (see :doc:`vhost-user`), but can emulate ar= bitrary +PCI devices, not just ``virtio``. Whereas ``vfio`` is handled by the host +kernel, ``vfio-user``, while similar in implementation, is handled entirel= y in +userspace. + +For example, SPDK includes a virtual PCI NVMe controller implementation; by +setting up a ``vfio-user`` UNIX socket between QEMU and SPDK, a VM can sen= d NVMe +I/O to the SPDK process. + +Presuming a suitable ``vfio-user`` server has opened a socket at +``/tmp/vfio-user.sock``, a device can be configured with for example: + +.. code-block:: console + +-device '{"driver": "vfio-user-pci","socket": {"path": "/tmp/vfio-user.soc= k", "type": "unix"}}' + +See `libvfio-user `_ for further +information. --=20 2.49.0 From nobody Sat Nov 15 14:49:49 2025 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=quarantine dis=none) header.from=redhat.com ARC-Seal: i=1; a=rsa-sha256; t=1750924262; cv=none; d=zohomail.com; s=zohoarc; b=kHKzi5z17KO+BdHZRqV5jsnk2H+dcpswywmflE2XMPbB37io2fMqH+CWIwRvqkor0oAGCKH6RNvARtb75vNSMA7lkFmjl0jhubo7BMWRVMNiqKHNvheormyxAq6t4tHZ17TLbHg3styZl7qmX9+zpx4HamkoSCs+tN9nxXWpXoc= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1750924262; h=Content-Type:Content-Transfer-Encoding:Cc:Cc:Date:Date:From:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:Subject:To:To:Message-Id:Reply-To; bh=bK2ax7sG/nNON6tKaxJaUYjhXJJ40ZEkUgwXa53wI/Q=; b=JLZ61U9LZZSnCRe2mS/NJ/GY6KX8jlBPXBGwd0vssRhz6sdSfsSRBd07Y/+KByFIyA+5gwGqzRJrCITRoigykWaYXYpJwF44cqs43dS4VIiPeF5yhC25PQ6vwz4xQbprnyoVtxZeJXX0/2y1s1vvMcmA9WUDBRVHvYE3liispEg= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass header.from= (p=quarantine dis=none) Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1750924262924619.909893582002; Thu, 26 Jun 2025 00:51:02 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1uUhK1-0006Vd-8x; Thu, 26 Jun 2025 03:47:01 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1uUhJz-0006PV-FO for qemu-devel@nongnu.org; Thu, 26 Jun 2025 03:46:59 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.129.124]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1uUhJr-00022Q-6E for qemu-devel@nongnu.org; Thu, 26 Jun 2025 03:46:59 -0400 Received: from mx-prod-mc-06.mail-002.prod.us-west-2.aws.redhat.com (ec2-35-165-154-97.us-west-2.compute.amazonaws.com [35.165.154.97]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-73-C7rUw7nTMgOx1xw5bMb1Hg-1; Thu, 26 Jun 2025 03:46:48 -0400 Received: from mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.93]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-06.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 2F2EC1800268; Thu, 26 Jun 2025 07:46:47 +0000 (UTC) Received: from corto.redhat.com (unknown [10.44.32.51]) by mx-prod-int-06.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id C97B5180045B; Thu, 26 Jun 2025 07:46:44 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1750924010; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=bK2ax7sG/nNON6tKaxJaUYjhXJJ40ZEkUgwXa53wI/Q=; b=DocKelpDnjqcNDmffDNwCK1ZYSuozqVSqGnL9ymGiJTymSLf64sPPFYZHSuvtlr+KCNc34 9ntzGb+HCWYcSFS9uEBONEkUrJScQqDOU0b0RdAel8jvToVsmzx5+lY5gs+J2VAM2MZ+fq z225Pw+vfbZ3Dz47s6kNwwukMi7gRlc= X-MC-Unique: C7rUw7nTMgOx1xw5bMb1Hg-1 X-Mimecast-MFC-AGG-ID: C7rUw7nTMgOx1xw5bMb1Hg_1750924007 From: =?UTF-8?q?C=C3=A9dric=20Le=20Goater?= To: qemu-devel@nongnu.org Cc: Alex Williamson , Thanos Makatos , John Levon , =?UTF-8?q?C=C3=A9dric=20Le=20Goater?= Subject: [PULL 25/25] vfio-user: introduce vfio-user protocol specification Date: Thu, 26 Jun 2025 09:45:29 +0200 Message-ID: <20250626074529.1384114-26-clg@redhat.com> In-Reply-To: <20250626074529.1384114-1-clg@redhat.com> References: <20250626074529.1384114-1-clg@redhat.com> MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.93 Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=170.10.129.124; envelope-from=clg@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H5=0.001, RCVD_IN_MSPIKE_WL=0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, RCVD_IN_VALIDITY_SAFE_BLOCKED=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, WEIRD_QUOTING=0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZohoMail-DKIM: pass (identity @redhat.com) X-ZM-MESSAGEID: 1750924265040116600 From: Thanos Makatos This patch introduces the vfio-user protocol specification (formerly known as VFIO-over-socket), which is designed to allow devices to be emulated outside QEMU, in a separate process. vfio-user reuses the existing VFIO defines, structs and concepts. It has been earlier discussed as an RFC in: "RFC: use VFIO over a UNIX domain socket to implement device offloading" Signed-off-by: Thanos Makatos Signed-off-by: John Levon Reviewed-by: C=C3=A9dric Le Goater Link: https://lore.kernel.org/qemu-devel/20250625193012.2316242-20-john.lev= on@nutanix.com Signed-off-by: C=C3=A9dric Le Goater --- MAINTAINERS | 3 +- docs/interop/index.rst | 1 + docs/interop/vfio-user.rst | 1520 ++++++++++++++++++++++++++++++++++++ 3 files changed, 1523 insertions(+), 1 deletion(-) create mode 100644 docs/interop/vfio-user.rst diff --git a/MAINTAINERS b/MAINTAINERS index 2369391004bc3637696a38d4fbb05fe317beb338..1b73b8b394a2c87cd5b073c3744= 17c214d50a062 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -4246,7 +4246,6 @@ F: hw/remote/proxy-memory-listener.c F: include/hw/remote/proxy-memory-listener.h F: hw/remote/iohub.c F: include/hw/remote/iohub.h -F: subprojects/libvfio-user F: hw/remote/vfio-user-obj.c F: include/hw/remote/vfio-user-obj.h F: hw/remote/iommu.c @@ -4257,6 +4256,8 @@ VFIO-USER: M: John Levon M: Thanos Makatos S: Supported +F: docs/interop/vfio-user.rst +F: docs/system/devices/vfio-user.rst F: hw/vfio-user/* F: include/hw/vfio-user/* F: subprojects/libvfio-user diff --git a/docs/interop/index.rst b/docs/interop/index.rst index 972f3e49ce9e586afd2177f820b14e4a71c503c1..d830c5c4104a609f54d9ae49e06= 2cd01378455c4 100644 --- a/docs/interop/index.rst +++ b/docs/interop/index.rst @@ -25,6 +25,7 @@ are useful for making QEMU interoperate with other softwa= re. qemu-ga-ref qemu-qmp-ref qemu-storage-daemon-qmp-ref + vfio-user vhost-user vhost-user-gpu vhost-vdpa diff --git a/docs/interop/vfio-user.rst b/docs/interop/vfio-user.rst new file mode 100644 index 0000000000000000000000000000000000000000..0b06f026b0d745f9f120d73717e= 967e145f35d38 --- /dev/null +++ b/docs/interop/vfio-user.rst @@ -0,0 +1,1520 @@ +.. include:: +.. SPDX-License-Identifier: GPL-2.0-or-later + +=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D +vfio-user Protocol Specification +=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D + +.. contents:: Table of Contents + +Introduction +=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D +vfio-user is a protocol that allows a device to be emulated in a separate +process outside of a Virtual Machine Monitor (VMM). vfio-user devices cons= ist +of a generic VFIO device type, living inside the VMM, which we call the cl= ient, +and the core device implementation, living outside the VMM, which we call = the +server. + +The vfio-user specification is partly based on the +`Linux VFIO ioctl interface `_. + +VFIO is a mature and stable API, backed by an extensively used framework. = The +existing VFIO client implementation in QEMU (``qemu/hw/vfio/``) can be lar= gely +re-used, though there is nothing in this specification that requires that +particular implementation. None of the VFIO kernel modules are required for +supporting the protocol, on either the client or server side. Some source +definitions in VFIO are re-used for vfio-user. + +The main idea is to allow a virtual device to function in a separate proce= ss in +the same host over a UNIX domain socket. A UNIX domain socket (``AF_UNIX``= ) is +chosen because file descriptors can be trivially sent over it, which in tu= rn +allows: + +* Sharing of client memory for DMA with the server. +* Sharing of server memory with the client for fast MMIO. +* Efficient sharing of eventfd's for triggering interrupts. + +Other socket types could be used which allow the server to run in a separa= te +guest in the same host (``AF_VSOCK``) or remotely (``AF_INET``). Theoretic= ally +the underlying transport does not necessarily have to be a socket, however= we do +not examine such alternatives. In this protocol version we focus on using = a UNIX +domain socket and introduce basic support for the other two types of socke= ts +without considering performance implications. + +While passing of file descriptors is desirable for performance reasons, su= pport +is not necessary for either the client or the server in order to implement= the +protocol. There is always an in-band, message-passing fall back mechanism. + +Overview +=3D=3D=3D=3D=3D=3D=3D=3D + +VFIO is a framework that allows a physical device to be securely passed th= rough +to a user space process; the device-specific kernel driver does not drive = the +device at all. Typically, the user space process is a VMM and the device = is +passed through to it in order to achieve high performance. VFIO provides a= n API +and the required functionality in the kernel. QEMU has adopted VFIO to all= ow a +guest to directly access physical devices, instead of emulating them in +software. + +vfio-user reuses the core VFIO concepts defined in its API, but implements= them +as messages to be sent over a socket. It does not change the kernel-based = VFIO +in any way, in fact none of the VFIO kernel modules need to be loaded to u= se +vfio-user. It is also possible for the client to concurrently use the curr= ent +kernel-based VFIO for one device, and vfio-user for another device. + +VFIO Device Model +----------------- + +A device under VFIO presents a standard interface to the user process. Man= y of +the VFIO operations in the existing interface use the ``ioctl()`` system c= all, and +references to the existing interface are called the ``ioctl()`` implementa= tion in +this document. + +The following sections describe the set of messages that implement the vfi= o-user +interface over a socket. In many cases, the messages are analogous to data +structures used in the ``ioctl()`` implementation. Messages derived from t= he +``ioctl()`` will have a name derived from the ``ioctl()`` command name. E= .g., the +``VFIO_DEVICE_GET_INFO`` ``ioctl()`` command becomes a +``VFIO_USER_DEVICE_GET_INFO`` message. The purpose of this reuse is to sh= are as +much code as feasible with the ``ioctl()`` implementation``. + +Connection Initiation +^^^^^^^^^^^^^^^^^^^^^ + +After the client connects to the server, the initial client message is +``VFIO_USER_VERSION`` to propose a protocol version and set of capabilitie= s to +apply to the session. The server replies with a compatible version and set= of +capabilities it supports, or closes the connection if it cannot support the +advertised version. + +Device Information +^^^^^^^^^^^^^^^^^^ + +The client uses a ``VFIO_USER_DEVICE_GET_INFO`` message to query the serve= r for +information about the device. This information includes: + +* The device type and whether it supports reset (``VFIO_DEVICE_FLAGS_``), +* the number of device regions, and +* the device presents to the client the number of interrupt types the devi= ce + supports. + +Region Information +^^^^^^^^^^^^^^^^^^ + +The client uses ``VFIO_USER_DEVICE_GET_REGION_INFO`` messages to query the +server for information about the device's regions. This information descri= bes: + +* Read and write permissions, whether it can be memory mapped, and whether= it + supports additional capabilities (``VFIO_REGION_INFO_CAP_``). +* Region index, size, and offset. + +When a device region can be mapped by the client, the server provides a fi= le +descriptor which the client can ``mmap()``. The server is responsible for +polling for client updates to memory mapped regions. + +Region Capabilities +""""""""""""""""""" + +Some regions have additional capabilities that cannot be described adequat= ely +by the region info data structure. These capabilities are returned in the +region info reply in a list similar to PCI capabilities in a PCI device's +configuration space. + +Sparse Regions +"""""""""""""" +A region can be memory-mappable in whole or in part. When only a subset of= a +region can be mapped by the client, a ``VFIO_REGION_INFO_CAP_SPARSE_MMAP`` +capability is included in the region info reply. This capability describes +which portions can be mapped by the client. + +.. Note:: + For example, in a virtual NVMe controller, sparse regions can be used so + that accesses to the NVMe registers (found in the beginning of BAR0) are + trapped (an infrequent event), while allowing direct access to the door= bells + (an extremely frequent event as every I/O submission requires a write to + BAR0), found in the next page after the NVMe registers in BAR0. + +Device-Specific Regions +""""""""""""""""""""""" + +A device can define regions additional to the standard ones (e.g. PCI inde= xes +0-8). This is achieved by including a ``VFIO_REGION_INFO_CAP_TYPE`` capabi= lity +in the region info reply of a device-specific region. Such regions are ref= lected +in ``struct vfio_user_device_info.num_regions``. Thus, for PCI devices this +value can be equal to, or higher than, ``VFIO_PCI_NUM_REGIONS``. + +Region I/O via file descriptors +------------------------------- + +For unmapped regions, region I/O from the client is done via +``VFIO_USER_REGION_READ/WRITE``. As an optimization, ioeventfds or ioregi= onfds +may be configured for sub-regions of some regions. A client may request +information on these sub-regions via ``VFIO_USER_DEVICE_GET_REGION_IO_FDS`= `; by +configuring the returned file descriptors as ioeventfds or ioregionfds, the +server can be directly notified of I/O (for example, by KVM) without takin= g a +trip through the client. + +Interrupts +^^^^^^^^^^ + +The client uses ``VFIO_USER_DEVICE_GET_IRQ_INFO`` messages to query the se= rver +for the device's interrupt types. The interrupt types are specific to the = bus +the device is attached to, and the client is expected to know the capabili= ties +of each interrupt type. The server can signal an interrupt by directly inj= ecting +interrupts into the guest via an event file descriptor. The client configu= res +how the server signals an interrupt with ``VFIO_USER_SET_IRQS`` messages. + +Device Read and Write +^^^^^^^^^^^^^^^^^^^^^ + +When the guest executes load or store operations to an unmapped device reg= ion, +the client forwards these operations to the server with +``VFIO_USER_REGION_READ`` or ``VFIO_USER_REGION_WRITE`` messages. The serv= er +will reply with data from the device on read operations or an acknowledgem= ent on +write operations. See `Read and Write Operations`_. + +Client memory access +-------------------- + +The client uses ``VFIO_USER_DMA_MAP`` and ``VFIO_USER_DMA_UNMAP`` messages= to +inform the server of the valid DMA ranges that the server can access on be= half +of a device (typically, VM guest memory). DMA memory may be accessed by the +server via ``VFIO_USER_DMA_READ`` and ``VFIO_USER_DMA_WRITE`` messages ove= r the +socket. In this case, the "DMA" part of the naming is a misnomer. + +Actual direct memory access of client memory from the server is possible i= f the +client provides file descriptors the server can ``mmap()``. Note that ``mm= ap()`` +privileges cannot be revoked by the client, therefore file descriptors sho= uld +only be exported in environments where the client trusts the server not to +corrupt guest memory. + +See `Read and Write Operations`_. + +Client/server interactions +=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D + +Socket +------ + +A server can serve: + +1) one or more clients, and/or +2) one or more virtual devices, belonging to one or more clients. + +The current protocol specification requires a dedicated socket per +client/server connection. It is a server-side implementation detail whethe= r a +single server handles multiple virtual devices from the same or multiple +clients. The location of the socket is implementation-specific. Multiplexi= ng +clients, devices, and servers over the same socket is not supported in this +version of the protocol. + +Authentication +-------------- + +For ``AF_UNIX``, we rely on OS mandatory access controls on the socket fil= es, +therefore it is up to the management layer to set up the socket as require= d. +Socket types that span guests or hosts will require a proper authentication +mechanism. Defining that mechanism is deferred to a future version of the +protocol. + +Command Concurrency +------------------- + +A client may pipeline multiple commands without waiting for previous comma= nd +replies. The server will process commands in the order they are received.= A +consequence of this is if a client issues a command with the *No_reply* bi= t, +then subsequently issues a command without *No_reply*, the older command w= ill +have been processed before the reply to the younger command is sent by the +server. The client must be aware of the device's capability to process +concurrent commands if pipelining is used. For example, pipelining allows +multiple client threads to concurrently access device regions; the client = must +ensure these accesses obey device semantics. + +An example is a frame buffer device, where the device may allow concurrent +access to different areas of video memory, but may have indeterminate beha= vior +if concurrent accesses are performed to command or status registers. + +Note that unrelated messages sent from the server to the client can appear= in +between a client to server request/reply and vice versa. + +Implementers should be prepared for certain commands to exhibit potentially +unbounded latencies. For example, ``VFIO_USER_DEVICE_RESET`` may take an +arbitrarily long time to complete; clients should take care not to block +unnecessarily. + +Socket Disconnection Behavior +----------------------------- +The server and the client can disconnect from each other, either intention= ally +or unexpectedly. Both the client and the server need to know how to handle= such +events. + +Server Disconnection +^^^^^^^^^^^^^^^^^^^^ +A server disconnecting from the client may indicate that: + +1) A virtual device has been restarted, either intentionally (e.g. because= of a + device update) or unintentionally (e.g. because of a crash). +2) A virtual device has been shut down with no intention to be restarted. + +It is impossible for the client to know whether or not a failure is +intermittent or innocuous and should be retried, therefore the client shou= ld +reset the VFIO device when it detects the socket has been disconnected. +Error recovery will be driven by the guest's device error handling +behavior. + +Client Disconnection +^^^^^^^^^^^^^^^^^^^^ +The client disconnecting from the server primarily means that the client +has exited. Currently, this means that the guest is shut down so the devic= e is +no longer needed therefore the server can automatically exit. However, the= re +can be cases where a client disconnection should not result in a server ex= it: + +1) A single server serving multiple clients. +2) A multi-process QEMU upgrading itself step by step, which is not yet + implemented. + +Therefore in order for the protocol to be forward compatible, the server s= hould +respond to a client disconnection as follows: + + - all client memory regions are unmapped and cleaned up (including closin= g any + passed file descriptors) + - all IRQ file descriptors passed from the old client are closed + - the device state should otherwise be retained + +The expectation is that when a client reconnects, it will re-establish IRQ= and +client memory mappings. + +If anything happens to the client (such as qemu really did exit), the cont= rol +stack will know about it and can clean up resources accordingly. + +Security Considerations +----------------------- + +Speaking generally, vfio-user clients should not trust servers, and vice v= ersa. +Standard tools and mechanisms should be used on both sides to validate inp= ut and +prevent against denial of service scenarios, buffer overflow, etc. + +Request Retry and Response Timeout +---------------------------------- +A failed command is a command that has been successfully sent and has been +responded to with an error code. Failure to send the command in the first = place +(e.g. because the socket is disconnected) is a different type of error exa= mined +earlier in the disconnect section. + +.. Note:: + QEMU's VFIO retries certain operations if they fail. While this makes s= ense + for real HW, we don't know for sure whether it makes sense for virtual + devices. + +Defining a retry and timeout scheme is deferred to a future version of the +protocol. + +Message sizes +------------- + +Some requests have an ``argsz`` field. In a request, it defines the maximum +expected reply payload size, which should be at least the size of the fixed +reply payload headers defined here. The *request* payload size is defined = by the +usual ``msg_size`` field in the header, not the ``argsz`` field. + +In a reply, the server sets ``argsz`` field to the size needed for a full +payload size. This may be less than the requested maximum size. This may be +larger than the requested maximum size: in that case, the full payload is = not +included in the reply, but the ``argsz`` field in the reply indicates the = needed +size, allowing a client to allocate a larger buffer for holding the reply = before +trying again. + +In addition, during negotiation (see `Version`_), the client and server m= ay +each specify a ``max_data_xfer_size`` value; this defines the maximum data= that +may be read or written via one of the ``VFIO_USER_DMA/REGION_READ/WRITE`` +messages; see `Read and Write Operations`_. + +Protocol Specification +=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D + +To distinguish from the base VFIO symbols, all vfio-user symbols are prefi= xed +with ``vfio_user`` or ``VFIO_USER``. In this revision, all data is in the +endianness of the host system, although this may be relaxed in future +revisions in cases where the client and server run on different hosts +with different endianness. + +Unless otherwise specified, all sizes should be presumed to be in bytes. + +.. _Commands: + +Commands +-------- +The following table lists the VFIO message command IDs, and whether the +message command is sent from the client or the server. + +=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D =3D=3D=3D=3D=3D=3D=3D=3D=3D = =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D +Name Command Request Direction +=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D =3D=3D=3D=3D=3D=3D=3D=3D=3D = =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D +``VFIO_USER_VERSION`` 1 client -> server +``VFIO_USER_DMA_MAP`` 2 client -> server +``VFIO_USER_DMA_UNMAP`` 3 client -> server +``VFIO_USER_DEVICE_GET_INFO`` 4 client -> server +``VFIO_USER_DEVICE_GET_REGION_INFO`` 5 client -> server +``VFIO_USER_DEVICE_GET_REGION_IO_FDS`` 6 client -> server +``VFIO_USER_DEVICE_GET_IRQ_INFO`` 7 client -> server +``VFIO_USER_DEVICE_SET_IRQS`` 8 client -> server +``VFIO_USER_REGION_READ`` 9 client -> server +``VFIO_USER_REGION_WRITE`` 10 client -> server +``VFIO_USER_DMA_READ`` 11 server -> client +``VFIO_USER_DMA_WRITE`` 12 server -> client +``VFIO_USER_DEVICE_RESET`` 13 client -> server +``VFIO_USER_REGION_WRITE_MULTI`` 15 client -> server +=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D =3D=3D=3D=3D=3D=3D=3D=3D=3D = =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D + +Header +------ + +All messages, both command messages and reply messages, are preceded by a +16-byte header that contains basic information about the message. The head= er is +followed by message-specific data described in the sections below. + ++----------------+--------+-------------+ +| Name | Offset | Size | ++=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D+=3D=3D=3D=3D=3D=3D=3D=3D= +=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D+ +| Message ID | 0 | 2 | ++----------------+--------+-------------+ +| Command | 2 | 2 | ++----------------+--------+-------------+ +| Message size | 4 | 4 | ++----------------+--------+-------------+ +| Flags | 8 | 4 | ++----------------+--------+-------------+ +| | +-----+------------+ | +| | | Bit | Definition | | +| | +=3D=3D=3D=3D=3D+=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D+ | +| | | 0-3 | Type | | +| | +-----+------------+ | +| | | 4 | No_reply | | +| | +-----+------------+ | +| | | 5 | Error | | +| | +-----+------------+ | ++----------------+--------+-------------+ +| Error | 12 | 4 | ++----------------+--------+-------------+ +| | 16 | variable | ++----------------+--------+-------------+ + +* *Message ID* identifies the message, and is echoed in the command's reply + message. Message IDs belong entirely to the sender, can be re-used (even + concurrently) and the receiver must not make any assumptions about their + uniqueness. +* *Command* specifies the command to be executed, listed in Commands_. It = is + also set in the reply header. +* *Message size* contains the size of the entire message, including the he= ader. +* *Flags* contains attributes of the message: + + * The *Type* bits indicate the message type. + + * *Command* (value 0x0) indicates a command message. + * *Reply* (value 0x1) indicates a reply message acknowledging a previ= ous + command with the same message ID. + * *No_reply* in a command message indicates that no reply is needed for = this + command. This is commonly used when multiple commands are sent, and o= nly + the last needs acknowledgement. + * *Error* in a reply message indicates the command being acknowledged had + an error. In this case, the *Error* field will be valid. + +* *Error* in a reply message is an optional UNIX errno value. It may be ze= ro + even if the Error bit is set in Flags. It is reserved in a command messa= ge. + +Each command message in Commands_ must be replied to with a reply message, +unless the message sets the *No_Reply* bit. The reply consists of the hea= der +with the *Reply* bit set, plus any additional data. + +If an error occurs, the reply message must only include the reply header. + +As the header is standard in both requests and replies, it is not included= in +the command-specific specifications below; each message definition should = be +appended to the standard header, and the offsets are given from the end of= the +standard header. + +``VFIO_USER_VERSION`` +--------------------- + +.. _Version: + +This is the initial message sent by the client after the socket connection= is +established; the same format is used for the server's reply. + +Upon establishing a connection, the client must send a ``VFIO_USER_VERSION= `` +message proposing a protocol version and a set of capabilities. The server +compares these with the versions and capabilities it supports and sends a +``VFIO_USER_VERSION`` reply according to the following rules. + +* The major version in the reply must be the same as proposed. If the clie= nt + does not support the proposed major, it closes the connection. +* The minor version in the reply must be equal to or less than the minor + version proposed. +* The capability list must be a subset of those proposed. If the server + requires a capability the client did not include, it closes the connecti= on. + +The protocol major version will only change when incompatible protocol cha= nges +are made, such as changing the message format. The minor version may change +when compatible changes are made, such as adding new messages or capabilit= ies, +Both the client and server must support all minor versions less than the +maximum minor version it supports. E.g., an implementation that supports +version 1.3 must also support 1.0 through 1.2. + +When making a change to this specification, the protocol version number mu= st +be included in the form "added in version X.Y" + +Request +^^^^^^^ + +=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D =3D=3D=3D=3D=3D=3D =3D=3D=3D= =3D +Name Offset Size +=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D =3D=3D=3D=3D=3D=3D =3D=3D=3D= =3D +version major 0 2 +version minor 2 2 +version data 4 variable (including terminating NUL). Optional. +=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D =3D=3D=3D=3D=3D=3D =3D=3D=3D= =3D + +The version data is an optional UTF-8 encoded JSON byte array with the fol= lowing +format: + ++--------------+--------+-----------------------------------+ +| Name | Type | Description | ++=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D+=3D=3D=3D=3D=3D=3D=3D=3D+=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D+ +| capabilities | object | Contains common capabilities that | +| | | the sender supports. Optional. | ++--------------+--------+-----------------------------------+ + +Capabilities: + ++--------------------+---------+------------------------------------------= ------+ +| Name | Type | Description = | ++=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D+=3D=3D=3D=3D= =3D=3D=3D=3D=3D+=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D+ +| max_msg_fds | number | Maximum number of file descriptors that c= an be | +| | | received by the sender in one message. = | +| | | Optional. If not specified then the recei= ver | +| | | must assume a value of ``1``. = | ++--------------------+---------+------------------------------------------= ------+ +| max_data_xfer_size | number | Maximum ``count`` for data transfer messa= ges; | +| | | see `Read and Write Operations`_. Optiona= l, | +| | | with a default value of 1048576 bytes. = | ++--------------------+---------+------------------------------------------= ------+ +| pgsizes | number | Page sizes supported in DMA map operation= s | +| | | or'ed together. Optional, with a default = value | +| | | of supporting only 4k pages. = | ++--------------------+---------+------------------------------------------= ------+ +| max_dma_maps | number | Maximum number DMA map windows that can b= e | +| | | valid simultaneously. Optional, with a = | +| | | value of 65535 (64k-1). = | ++--------------------+---------+------------------------------------------= ------+ +| migration | object | Migration capability parameters. If missi= ng | +| | | then migration is not supported by the se= nder. | ++--------------------+---------+------------------------------------------= ------+ +| write_multiple | boolean | ``VFIO_USER_REGION_WRITE_MULTI`` messages= | +| | | are supported if the value is ``true``. = | ++--------------------+---------+------------------------------------------= ------+ + +The migration capability contains the following name/value pairs: + ++-----------------+--------+----------------------------------------------= ----+ +| Name | Type | Description = | ++=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D+=3D=3D=3D=3D=3D=3D=3D= =3D+=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D+ +| pgsize | number | Page size of dirty pages bitmap. The smallest= | +| | | between the client and the server is used. = | ++-----------------+--------+----------------------------------------------= ----+ +| max_bitmap_size | number | Maximum bitmap size in ``VFIO_USER_DIRTY_PAGE= S`` | +| | | and ``VFIO_DMA_UNMAP`` messages. Optional, = | +| | | with a default value of 256MB. = | ++-----------------+--------+----------------------------------------------= ----+ + +Reply +^^^^^ + +The same message format is used in the server's reply with the semantics +described above. + +``VFIO_USER_DMA_MAP`` +--------------------- + +This command message is sent by the client to the server to inform it of t= he +memory regions the server can access. It must be sent before the server can +perform any DMA to the client. It is normally sent directly after the vers= ion +handshake is completed, but may also occur when memory is added to the cli= ent, +or if the client uses a vIOMMU. + +Request +^^^^^^^ + +The request payload for this message is a structure of the following forma= t: + ++-------------+--------+-------------+ +| Name | Offset | Size | ++=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D+=3D=3D=3D=3D=3D=3D=3D=3D+=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D+ +| argsz | 0 | 4 | ++-------------+--------+-------------+ +| flags | 4 | 4 | ++-------------+--------+-------------+ +| | +-----+------------+ | +| | | Bit | Definition | | +| | +=3D=3D=3D=3D=3D+=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D+ | +| | | 0 | readable | | +| | +-----+------------+ | +| | | 1 | writeable | | +| | +-----+------------+ | ++-------------+--------+-------------+ +| offset | 8 | 8 | ++-------------+--------+-------------+ +| address | 16 | 8 | ++-------------+--------+-------------+ +| size | 24 | 8 | ++-------------+--------+-------------+ + +* *argsz* is the size of the above structure. Note there is no reply paylo= ad, + so this field differs from other message types. +* *flags* contains the following region attributes: + + * *readable* indicates that the region can be read from. + + * *writeable* indicates that the region can be written to. + +* *offset* is the file offset of the region with respect to the associated= file + descriptor, or zero if the region is not mappable +* *address* is the base DMA address of the region. +* *size* is the size of the region. + +This structure is 32 bytes in size, so the message size is 16 + 32 bytes. + +If the DMA region being added can be directly mapped by the server, a file +descriptor must be sent as part of the message meta-data. The region can be +mapped via the mmap() system call. On ``AF_UNIX`` sockets, the file descri= ptor +must be passed as ``SCM_RIGHTS`` type ancillary data. Otherwise, if the D= MA +region cannot be directly mapped by the server, no file descriptor must be= sent +as part of the message meta-data and the DMA region can be accessed by the +server using ``VFIO_USER_DMA_READ`` and ``VFIO_USER_DMA_WRITE`` messages, +explained in `Read and Write Operations`_. A command to map over an existi= ng +region must be failed by the server with ``EEXIST`` set in error field in = the +reply. + +Reply +^^^^^ + +There is no payload in the reply message. + +``VFIO_USER_DMA_UNMAP`` +----------------------- + +This command message is sent by the client to the server to inform it that= a +DMA region, previously made available via a ``VFIO_USER_DMA_MAP`` command +message, is no longer available for DMA. It typically occurs when memory is +subtracted from the client or if the client uses a vIOMMU. The DMA region = is +described by the following structure: + +Request +^^^^^^^ + +The request payload for this message is a structure of the following forma= t: + ++--------------+--------+------------------------+ +| Name | Offset | Size | ++=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D+=3D=3D=3D=3D=3D=3D=3D=3D+=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D+ +| argsz | 0 | 4 | ++--------------+--------+------------------------+ +| flags | 4 | 4 | ++--------------+--------+------------------------+ +| address | 8 | 8 | ++--------------+--------+------------------------+ +| size | 16 | 8 | ++--------------+--------+------------------------+ + +* *argsz* is the maximum size of the reply payload. +* *flags* is unused in this version. +* *address* is the base DMA address of the DMA region. +* *size* is the size of the DMA region. + +The address and size of the DMA region being unmapped must match exactly a +previous mapping. + +Reply +^^^^^ + +Upon receiving a ``VFIO_USER_DMA_UNMAP`` command, if the file descriptor is +mapped then the server must release all references to that DMA region befo= re +replying, which potentially includes in-flight DMA transactions. + +The server responds with the original DMA entry in the request. + + +``VFIO_USER_DEVICE_GET_INFO`` +----------------------------- + +This command message is sent by the client to the server to query for basic +information about the device. + +Request +^^^^^^^ + ++-------------+--------+--------------------------+ +| Name | Offset | Size | ++=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D+=3D=3D=3D=3D=3D=3D=3D=3D+=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D+ +| argsz | 0 | 4 | ++-------------+--------+--------------------------+ +| flags | 4 | 4 | ++-------------+--------+--------------------------+ +| | +-----+-------------------------+ | +| | | Bit | Definition | | +| | +=3D=3D=3D=3D=3D+=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D+ | +| | | 0 | VFIO_DEVICE_FLAGS_RESET | | +| | +-----+-------------------------+ | +| | | 1 | VFIO_DEVICE_FLAGS_PCI | | +| | +-----+-------------------------+ | ++-------------+--------+--------------------------+ +| num_regions | 8 | 4 | ++-------------+--------+--------------------------+ +| num_irqs | 12 | 4 | ++-------------+--------+--------------------------+ + +* *argsz* is the maximum size of the reply payload +* all other fields must be zero. + +Reply +^^^^^ + ++-------------+--------+--------------------------+ +| Name | Offset | Size | ++=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D+=3D=3D=3D=3D=3D=3D=3D=3D+=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D+ +| argsz | 0 | 4 | ++-------------+--------+--------------------------+ +| flags | 4 | 4 | ++-------------+--------+--------------------------+ +| | +-----+-------------------------+ | +| | | Bit | Definition | | +| | +=3D=3D=3D=3D=3D+=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D+ | +| | | 0 | VFIO_DEVICE_FLAGS_RESET | | +| | +-----+-------------------------+ | +| | | 1 | VFIO_DEVICE_FLAGS_PCI | | +| | +-----+-------------------------+ | ++-------------+--------+--------------------------+ +| num_regions | 8 | 4 | ++-------------+--------+--------------------------+ +| num_irqs | 12 | 4 | ++-------------+--------+--------------------------+ + +* *argsz* is the size required for the full reply payload (16 bytes today) +* *flags* contains the following device attributes. + + * ``VFIO_DEVICE_FLAGS_RESET`` indicates that the device supports the + ``VFIO_USER_DEVICE_RESET`` message. + * ``VFIO_DEVICE_FLAGS_PCI`` indicates that the device is a PCI device. + +* *num_regions* is the number of memory regions that the device exposes. +* *num_irqs* is the number of distinct interrupt types that the device sup= ports. + +This version of the protocol only supports PCI devices. Additional devices= may +be supported in future versions. + +``VFIO_USER_DEVICE_GET_REGION_INFO`` +------------------------------------ + +This command message is sent by the client to the server to query for +information about device regions. The VFIO region info structure is define= d in +```` (``struct vfio_region_info``). + +Request +^^^^^^^ + ++------------+--------+------------------------------+ +| Name | Offset | Size | ++=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D+=3D=3D=3D=3D=3D=3D=3D=3D+=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D+ +| argsz | 0 | 4 | ++------------+--------+------------------------------+ +| flags | 4 | 4 | ++------------+--------+------------------------------+ +| index | 8 | 4 | ++------------+--------+------------------------------+ +| cap_offset | 12 | 4 | ++------------+--------+------------------------------+ +| size | 16 | 8 | ++------------+--------+------------------------------+ +| offset | 24 | 8 | ++------------+--------+------------------------------+ + +* *argsz* the maximum size of the reply payload +* *index* is the index of memory region being queried, it is the only field + that is required to be set in the command message. +* all other fields must be zero. + +Reply +^^^^^ + ++------------+--------+------------------------------+ +| Name | Offset | Size | ++=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D+=3D=3D=3D=3D=3D=3D=3D=3D+=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D+ +| argsz | 0 | 4 | ++------------+--------+------------------------------+ +| flags | 4 | 4 | ++------------+--------+------------------------------+ +| | +-----+-----------------------------+ | +| | | Bit | Definition | | +| | +=3D=3D=3D=3D=3D+=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D+ | +| | | 0 | VFIO_REGION_INFO_FLAG_READ | | +| | +-----+-----------------------------+ | +| | | 1 | VFIO_REGION_INFO_FLAG_WRITE | | +| | +-----+-----------------------------+ | +| | | 2 | VFIO_REGION_INFO_FLAG_MMAP | | +| | +-----+-----------------------------+ | +| | | 3 | VFIO_REGION_INFO_FLAG_CAPS | | +| | +-----+-----------------------------+ | ++------------+--------+------------------------------+ ++------------+--------+------------------------------+ +| index | 8 | 4 | ++------------+--------+------------------------------+ +| cap_offset | 12 | 4 | ++------------+--------+------------------------------+ +| size | 16 | 8 | ++------------+--------+------------------------------+ +| offset | 24 | 8 | ++------------+--------+------------------------------+ + +* *argsz* is the size required for the full reply payload (region info str= ucture + plus the size of any region capabilities) +* *flags* are attributes of the region: + + * ``VFIO_REGION_INFO_FLAG_READ`` allows client read access to the region. + * ``VFIO_REGION_INFO_FLAG_WRITE`` allows client write access to the regi= on. + * ``VFIO_REGION_INFO_FLAG_MMAP`` specifies the client can mmap() the reg= ion. + When this flag is set, the reply will include a file descriptor in its + meta-data. On ``AF_UNIX`` sockets, the file descriptors will be passed= as + ``SCM_RIGHTS`` type ancillary data. + * ``VFIO_REGION_INFO_FLAG_CAPS`` indicates additional capabilities found= in the + reply. + +* *index* is the index of memory region being queried, it is the only field + that is required to be set in the command message. +* *cap_offset* describes where additional region capabilities can be found. + cap_offset is relative to the beginning of the VFIO region info structur= e. + The data structure it points is a VFIO cap header defined in + ````. +* *size* is the size of the region. +* *offset* is the offset that should be given to the mmap() system call for + regions with the MMAP attribute. It is also used as the base offset when + mapping a VFIO sparse mmap area, described below. + +VFIO region capabilities +"""""""""""""""""""""""" + +The VFIO region information can also include a capabilities list. This lis= t is +similar to a PCI capability list - each entry has a common header that +identifies a capability and where the next capability in the list can be f= ound. +The VFIO capability header format is defined in ```` (``stru= ct +vfio_info_cap_header``). + +VFIO cap header format +"""""""""""""""""""""" + ++---------+--------+------+ +| Name | Offset | Size | ++=3D=3D=3D=3D=3D=3D=3D=3D=3D+=3D=3D=3D=3D=3D=3D=3D=3D+=3D=3D=3D=3D=3D=3D+ +| id | 0 | 2 | ++---------+--------+------+ +| version | 2 | 2 | ++---------+--------+------+ +| next | 4 | 4 | ++---------+--------+------+ + +* *id* is the capability identity. +* *version* is a capability-specific version number. +* *next* specifies the offset of the next capability in the capability lis= t. It + is relative to the beginning of the VFIO region info structure. + +VFIO sparse mmap cap header +""""""""""""""""""""""""""" + ++------------------+----------------------------------+ +| Name | Value | ++=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D+=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D+ +| id | VFIO_REGION_INFO_CAP_SPARSE_MMAP | ++------------------+----------------------------------+ +| version | 0x1 | ++------------------+----------------------------------+ +| next | | ++------------------+----------------------------------+ +| sparse mmap info | VFIO region info sparse mmap | ++------------------+----------------------------------+ + +This capability is defined when only a subrange of the region supports +direct access by the client via mmap(). The VFIO sparse mmap area is defin= ed in +```` (``struct vfio_region_sparse_mmap_area`` and ``struct +vfio_region_info_cap_sparse_mmap``). + +VFIO region info cap sparse mmap +"""""""""""""""""""""""""""""""" + ++----------+--------+------+ +| Name | Offset | Size | ++=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D+=3D=3D=3D=3D=3D=3D=3D=3D+=3D=3D=3D=3D=3D= =3D+ +| nr_areas | 0 | 4 | ++----------+--------+------+ +| reserved | 4 | 4 | ++----------+--------+------+ +| offset | 8 | 8 | ++----------+--------+------+ +| size | 16 | 8 | ++----------+--------+------+ +| ... | | | ++----------+--------+------+ + +* *nr_areas* is the number of sparse mmap areas in the region. +* *offset* and size describe a single area that can be mapped by the clien= t. + There will be *nr_areas* pairs of offset and size. The offset will be ad= ded to + the base offset given in the ``VFIO_USER_DEVICE_GET_REGION_INFO`` to for= m the + offset argument of the subsequent mmap() call. + +The VFIO sparse mmap area is defined in ```` (``struct +vfio_region_info_cap_sparse_mmap``). + + +``VFIO_USER_DEVICE_GET_REGION_IO_FDS`` +-------------------------------------- + +Clients can access regions via ``VFIO_USER_REGION_READ/WRITE`` or, if prov= ided, by +``mmap()`` of a file descriptor provided by the server. + +``VFIO_USER_DEVICE_GET_REGION_IO_FDS`` provides an alternative access mech= anism via +file descriptors. This is an optional feature intended for performance +improvements where an underlying sub-system (such as KVM) supports communi= cation +across such file descriptors to the vfio-user server, without needing to +round-trip through the client. + +The server returns an array of sub-regions for the requested region. Each +sub-region describes a span (offset and size) of a region, along with the +requested file descriptor notification mechanism to use. Each sub-region = in the +response message may choose to use a different method, as defined below. = The +two mechanisms supported in this specification are ioeventfds and ioregion= fds. + +The server in addition returns a file descriptor in the ancillary data; cl= ients +are expected to configure each sub-region's file descriptor with the reque= sted +notification method. For example, a client could configure KVM with the +requested ioeventfd via a ``KVM_IOEVENTFD`` ``ioctl()``. + +Request +^^^^^^^ + ++-------------+--------+------+ +| Name | Offset | Size | ++=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D+=3D=3D=3D=3D=3D=3D=3D=3D+=3D=3D= =3D=3D=3D=3D+ +| argsz | 0 | 4 | ++-------------+--------+------+ +| flags | 4 | 4 | ++-------------+--------+------+ +| index | 8 | 4 | ++-------------+--------+------+ +| count | 12 | 4 | ++-------------+--------+------+ + +* *argsz* the maximum size of the reply payload +* *index* is the index of memory region being queried +* all other fields must be zero + +The client must set ``flags`` to zero and specify the region being queried= in +the ``index``. + +Reply +^^^^^ + ++-------------+--------+------+ +| Name | Offset | Size | ++=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D+=3D=3D=3D=3D=3D=3D=3D=3D+=3D=3D= =3D=3D=3D=3D+ +| argsz | 0 | 4 | ++-------------+--------+------+ +| flags | 4 | 4 | ++-------------+--------+------+ +| index | 8 | 4 | ++-------------+--------+------+ +| count | 12 | 4 | ++-------------+--------+------+ +| sub-regions | 16 | ... | ++-------------+--------+------+ + +* *argsz* is the size of the region IO FD info structure plus the + total size of the sub-region array. Thus, each array entry "i" is at off= set + i * ((argsz - 32) / count). Note that currently this is 40 bytes for bot= h IO + FD types, but this is not to be relied on. As elsewhere, this indicates = the + full reply payload size needed. +* *flags* must be zero +* *index* is the index of memory region being queried +* *count* is the number of sub-regions in the array +* *sub-regions* is the array of Sub-Region IO FD info structures + +The reply message will additionally include at least one file descriptor i= n the +ancillary data. Note that more than one sub-region may share the same file +descriptor. + +Note that it is the client's responsibility to verify the requested values= (for +example, that the requested offset does not exceed the region's bounds). + +Each sub-region given in the response has one of two possible structures, +depending whether *type* is ``VFIO_USER_IO_FD_TYPE_IOEVENTFD`` or +``VFIO_USER_IO_FD_TYPE_IOREGIONFD``: + +Sub-Region IO FD info format (ioeventfd) +"""""""""""""""""""""""""""""""""""""""" + ++-----------+--------+------+ +| Name | Offset | Size | ++=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D+=3D=3D=3D=3D=3D=3D=3D=3D+=3D=3D=3D=3D= =3D=3D+ +| offset | 0 | 8 | ++-----------+--------+------+ +| size | 8 | 8 | ++-----------+--------+------+ +| fd_index | 16 | 4 | ++-----------+--------+------+ +| type | 20 | 4 | ++-----------+--------+------+ +| flags | 24 | 4 | ++-----------+--------+------+ +| padding | 28 | 4 | ++-----------+--------+------+ +| datamatch | 32 | 8 | ++-----------+--------+------+ + +* *offset* is the offset of the start of the sub-region within the region + requested ("physical address offset" for the region) +* *size* is the length of the sub-region. This may be zero if the access s= ize is + not relevant, which may allow for optimizations +* *fd_index* is the index in the ancillary data of the FD to use for ioeve= ntfd + notification; it may be shared. +* *type* is ``VFIO_USER_IO_FD_TYPE_IOEVENTFD`` +* *flags* is any of: + + * ``KVM_IOEVENTFD_FLAG_DATAMATCH`` + * ``KVM_IOEVENTFD_FLAG_PIO`` + * ``KVM_IOEVENTFD_FLAG_VIRTIO_CCW_NOTIFY`` (FIXME: makes sense?) + +* *datamatch* is the datamatch value if needed + +See https://www.kernel.org/doc/Documentation/virtual/kvm/api.txt, *4.59 +KVM_IOEVENTFD* for further context on the ioeventfd-specific fields. + +Sub-Region IO FD info format (ioregionfd) +""""""""""""""""""""""""""""""""""""""""" + ++-----------+--------+------+ +| Name | Offset | Size | ++=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D+=3D=3D=3D=3D=3D=3D=3D=3D+=3D=3D=3D=3D= =3D=3D+ +| offset | 0 | 8 | ++-----------+--------+------+ +| size | 8 | 8 | ++-----------+--------+------+ +| fd_index | 16 | 4 | ++-----------+--------+------+ +| type | 20 | 4 | ++-----------+--------+------+ +| flags | 24 | 4 | ++-----------+--------+------+ +| padding | 28 | 4 | ++-----------+--------+------+ +| user_data | 32 | 8 | ++-----------+--------+------+ + +* *offset* is the offset of the start of the sub-region within the region + requested ("physical address offset" for the region) +* *size* is the length of the sub-region. This may be zero if the access s= ize is + not relevant, which may allow for optimizations; ``KVM_IOREGION_POSTED_W= RITES`` + must be set in *flags* in this case +* *fd_index* is the index in the ancillary data of the FD to use for ioreg= ionfd + messages; it may be shared +* *type* is ``VFIO_USER_IO_FD_TYPE_IOREGIONFD`` +* *flags* is any of: + + * ``KVM_IOREGION_PIO`` + * ``KVM_IOREGION_POSTED_WRITES`` + +* *user_data* is an opaque value passed back to the server via a message o= n the + file descriptor + +For further information on the ioregionfd-specific fields, see: +https://lore.kernel.org/kvm/cover.1613828726.git.eafanasova@gmail.com/ + +(FIXME: update with final API docs.) + +``VFIO_USER_DEVICE_GET_IRQ_INFO`` +--------------------------------- + +This command message is sent by the client to the server to query for +information about device interrupt types. The VFIO IRQ info structure is +defined in ```` (``struct vfio_irq_info``). + +Request +^^^^^^^ + ++-------+--------+---------------------------+ +| Name | Offset | Size | ++=3D=3D=3D=3D=3D=3D=3D+=3D=3D=3D=3D=3D=3D=3D=3D+=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D+ +| argsz | 0 | 4 | ++-------+--------+---------------------------+ +| flags | 4 | 4 | ++-------+--------+---------------------------+ +| | +-----+--------------------------+ | +| | | Bit | Definition | | +| | +=3D=3D=3D=3D=3D+=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D+ | +| | | 0 | VFIO_IRQ_INFO_EVENTFD | | +| | +-----+--------------------------+ | +| | | 1 | VFIO_IRQ_INFO_MASKABLE | | +| | +-----+--------------------------+ | +| | | 2 | VFIO_IRQ_INFO_AUTOMASKED | | +| | +-----+--------------------------+ | +| | | 3 | VFIO_IRQ_INFO_NORESIZE | | +| | +-----+--------------------------+ | ++-------+--------+---------------------------+ +| index | 8 | 4 | ++-------+--------+---------------------------+ +| count | 12 | 4 | ++-------+--------+---------------------------+ + +* *argsz* is the maximum size of the reply payload (16 bytes today) +* index is the index of IRQ type being queried (e.g. ``VFIO_PCI_MSIX_IRQ_I= NDEX``) +* all other fields must be zero + +Reply +^^^^^ + ++-------+--------+---------------------------+ +| Name | Offset | Size | ++=3D=3D=3D=3D=3D=3D=3D+=3D=3D=3D=3D=3D=3D=3D=3D+=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D+ +| argsz | 0 | 4 | ++-------+--------+---------------------------+ +| flags | 4 | 4 | ++-------+--------+---------------------------+ +| | +-----+--------------------------+ | +| | | Bit | Definition | | +| | +=3D=3D=3D=3D=3D+=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D+ | +| | | 0 | VFIO_IRQ_INFO_EVENTFD | | +| | +-----+--------------------------+ | +| | | 1 | VFIO_IRQ_INFO_MASKABLE | | +| | +-----+--------------------------+ | +| | | 2 | VFIO_IRQ_INFO_AUTOMASKED | | +| | +-----+--------------------------+ | +| | | 3 | VFIO_IRQ_INFO_NORESIZE | | +| | +-----+--------------------------+ | ++-------+--------+---------------------------+ +| index | 8 | 4 | ++-------+--------+---------------------------+ +| count | 12 | 4 | ++-------+--------+---------------------------+ + +* *argsz* is the size required for the full reply payload (16 bytes today) +* *flags* defines IRQ attributes: + + * ``VFIO_IRQ_INFO_EVENTFD`` indicates the IRQ type can support server ev= entfd + signalling. + * ``VFIO_IRQ_INFO_MASKABLE`` indicates that the IRQ type supports the ``= MASK`` + and ``UNMASK`` actions in a ``VFIO_USER_DEVICE_SET_IRQS`` message. + * ``VFIO_IRQ_INFO_AUTOMASKED`` indicates the IRQ type masks itself after= being + triggered, and the client must send an ``UNMASK`` action to receive new + interrupts. + * ``VFIO_IRQ_INFO_NORESIZE`` indicates ``VFIO_USER_SET_IRQS`` operations= setup + interrupts as a set, and new sub-indexes cannot be enabled without dis= abling + the entire type. +* index is the index of IRQ type being queried +* count describes the number of interrupts of the queried type. + +``VFIO_USER_DEVICE_SET_IRQS`` +----------------------------- + +This command message is sent by the client to the server to set actions for +device interrupt types. The VFIO IRQ set structure is defined in +```` (``struct vfio_irq_set``). + +Request +^^^^^^^ + ++-------+--------+------------------------------+ +| Name | Offset | Size | ++=3D=3D=3D=3D=3D=3D=3D+=3D=3D=3D=3D=3D=3D=3D=3D+=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D+ +| argsz | 0 | 4 | ++-------+--------+------------------------------+ +| flags | 4 | 4 | ++-------+--------+------------------------------+ +| | +-----+-----------------------------+ | +| | | Bit | Definition | | +| | +=3D=3D=3D=3D=3D+=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D+ | +| | | 0 | VFIO_IRQ_SET_DATA_NONE | | +| | +-----+-----------------------------+ | +| | | 1 | VFIO_IRQ_SET_DATA_BOOL | | +| | +-----+-----------------------------+ | +| | | 2 | VFIO_IRQ_SET_DATA_EVENTFD | | +| | +-----+-----------------------------+ | +| | | 3 | VFIO_IRQ_SET_ACTION_MASK | | +| | +-----+-----------------------------+ | +| | | 4 | VFIO_IRQ_SET_ACTION_UNMASK | | +| | +-----+-----------------------------+ | +| | | 5 | VFIO_IRQ_SET_ACTION_TRIGGER | | +| | +-----+-----------------------------+ | ++-------+--------+------------------------------+ +| index | 8 | 4 | ++-------+--------+------------------------------+ +| start | 12 | 4 | ++-------+--------+------------------------------+ +| count | 16 | 4 | ++-------+--------+------------------------------+ +| data | 20 | variable | ++-------+--------+------------------------------+ + +* *argsz* is the size of the VFIO IRQ set request payload, including any *= data* + field. Note there is no reply payload, so this field differs from other + message types. +* *flags* defines the action performed on the interrupt range. The ``DATA`` + flags describe the data field sent in the message; the ``ACTION`` flags + describe the action to be performed. The flags are mutually exclusive for + both sets. + + * ``VFIO_IRQ_SET_DATA_NONE`` indicates there is no data field in the com= mand. + The action is performed unconditionally. + * ``VFIO_IRQ_SET_DATA_BOOL`` indicates the data field is an array of boo= lean + bytes. The action is performed if the corresponding boolean is true. + * ``VFIO_IRQ_SET_DATA_EVENTFD`` indicates an array of event file descrip= tors + was sent in the message meta-data. These descriptors will be signalled= when + the action defined by the action flags occurs. In ``AF_UNIX`` sockets,= the + descriptors are sent as ``SCM_RIGHTS`` type ancillary data. + If no file descriptors are provided, this de-assigns the specified + previously configured interrupts. + * ``VFIO_IRQ_SET_ACTION_MASK`` indicates a masking event. It can be used= with + ``VFIO_IRQ_SET_DATA_BOOL`` or ``VFIO_IRQ_SET_DATA_NONE`` to mask an in= terrupt, + or with ``VFIO_IRQ_SET_DATA_EVENTFD`` to generate an event when the gu= est masks + the interrupt. + * ``VFIO_IRQ_SET_ACTION_UNMASK`` indicates an unmasking event. It can be= used + with ``VFIO_IRQ_SET_DATA_BOOL`` or ``VFIO_IRQ_SET_DATA_NONE`` to unmas= k an + interrupt, or with ``VFIO_IRQ_SET_DATA_EVENTFD`` to generate an event = when the + guest unmasks the interrupt. + * ``VFIO_IRQ_SET_ACTION_TRIGGER`` indicates a triggering event. It can b= e used + with ``VFIO_IRQ_SET_DATA_BOOL`` or ``VFIO_IRQ_SET_DATA_NONE`` to trigg= er an + interrupt, or with ``VFIO_IRQ_SET_DATA_EVENTFD`` to generate an event = when the + server triggers the interrupt. + +* *index* is the index of IRQ type being setup. +* *start* is the start of the sub-index being set. +* *count* describes the number of sub-indexes being set. As a special case= , a + count (and start) of 0, with data flags of ``VFIO_IRQ_SET_DATA_NONE`` di= sables + all interrupts of the index. +* *data* is an optional field included when the + ``VFIO_IRQ_SET_DATA_BOOL`` flag is present. It contains an array of bool= eans + that specify whether the action is to be performed on the corresponding + index. It's used when the action is only performed on a subset of the ra= nge + specified. + +Not all interrupt types support every combination of data and action flags. +The client must know the capabilities of the device and IRQ index before it +sends a ``VFIO_USER_DEVICE_SET_IRQ`` message. + +In typical operation, a specific IRQ may operate as follows: + +1. The client sends a ``VFIO_USER_DEVICE_SET_IRQ`` message with + ``flags=3D(VFIO_IRQ_SET_DATA_EVENTFD|VFIO_IRQ_SET_ACTION_TRIGGER)`` alo= ng + with an eventfd. This associates the IRQ with a particular eventfd on t= he + server side. + +#. The client may send a ``VFIO_USER_DEVICE_SET_IRQ`` message with + ``flags=3D(VFIO_IRQ_SET_DATA_EVENTFD|VFIO_IRQ_SET_ACTION_MASK/UNMASK)``= along + with another eventfd. This associates the given eventfd with the + mask/unmask state on the server side. + +#. The server may trigger the IRQ by writing 1 to the eventfd. + +#. The server may mask/unmask an IRQ which will write 1 to the correspondi= ng + mask/unmask eventfd, if there is one. + +5. A client may trigger a device IRQ itself, by sending a + ``VFIO_USER_DEVICE_SET_IRQ`` message with + ``flags=3D(VFIO_IRQ_SET_DATA_NONE/BOOL|VFIO_IRQ_SET_ACTION_TRIGGER)``. + +6. A client may mask or unmask the IRQ, by sending a + ``VFIO_USER_DEVICE_SET_IRQ`` message with + ``flags=3D(VFIO_IRQ_SET_DATA_NONE/BOOL|VFIO_IRQ_SET_ACTION_MASK/UNMASK)= ``. + +Reply +^^^^^ + +There is no payload in the reply. + +.. _Read and Write Operations: + +Note that all of these operations must be supported by the client and/or s= erver, +even if the corresponding memory or device region has been shared as mappa= ble. + +The ``count`` field must not exceed the value of ``max_data_xfer_size`` of= the +peer, for both reads and writes. + +``VFIO_USER_REGION_READ`` +------------------------- + +If a device region is not mappable, it's not directly accessible by the cl= ient +via ``mmap()`` of the underlying file descriptor. In this case, a client c= an +read from a device region with this message. + +Request +^^^^^^^ + ++--------+--------+----------+ +| Name | Offset | Size | ++=3D=3D=3D=3D=3D=3D=3D=3D+=3D=3D=3D=3D=3D=3D=3D=3D+=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D+ +| offset | 0 | 8 | ++--------+--------+----------+ +| region | 8 | 4 | ++--------+--------+----------+ +| count | 12 | 4 | ++--------+--------+----------+ + +* *offset* into the region being accessed. +* *region* is the index of the region being accessed. +* *count* is the size of the data to be transferred. + +Reply +^^^^^ + ++--------+--------+----------+ +| Name | Offset | Size | ++=3D=3D=3D=3D=3D=3D=3D=3D+=3D=3D=3D=3D=3D=3D=3D=3D+=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D+ +| offset | 0 | 8 | ++--------+--------+----------+ +| region | 8 | 4 | ++--------+--------+----------+ +| count | 12 | 4 | ++--------+--------+----------+ +| data | 16 | variable | ++--------+--------+----------+ + +* *offset* into the region accessed. +* *region* is the index of the region accessed. +* *count* is the size of the data transferred. +* *data* is the data that was read from the device region. + +``VFIO_USER_REGION_WRITE`` +-------------------------- + +If a device region is not mappable, it's not directly accessible by the cl= ient +via mmap() of the underlying fd. In this case, a client can write to a dev= ice +region with this message. + +Request +^^^^^^^ + ++--------+--------+----------+ +| Name | Offset | Size | ++=3D=3D=3D=3D=3D=3D=3D=3D+=3D=3D=3D=3D=3D=3D=3D=3D+=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D+ +| offset | 0 | 8 | ++--------+--------+----------+ +| region | 8 | 4 | ++--------+--------+----------+ +| count | 12 | 4 | ++--------+--------+----------+ +| data | 16 | variable | ++--------+--------+----------+ + +* *offset* into the region being accessed. +* *region* is the index of the region being accessed. +* *count* is the size of the data to be transferred. +* *data* is the data to write + +Reply +^^^^^ + ++--------+--------+----------+ +| Name | Offset | Size | ++=3D=3D=3D=3D=3D=3D=3D=3D+=3D=3D=3D=3D=3D=3D=3D=3D+=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D+ +| offset | 0 | 8 | ++--------+--------+----------+ +| region | 8 | 4 | ++--------+--------+----------+ +| count | 12 | 4 | ++--------+--------+----------+ + +* *offset* into the region accessed. +* *region* is the index of the region accessed. +* *count* is the size of the data transferred. + +``VFIO_USER_DMA_READ`` +----------------------- + +If the client has not shared mappable memory, the server can use this mess= age to +read from guest memory. + +Request +^^^^^^^ + ++---------+--------+----------+ +| Name | Offset | Size | ++=3D=3D=3D=3D=3D=3D=3D=3D=3D+=3D=3D=3D=3D=3D=3D=3D=3D+=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D+ +| address | 0 | 8 | ++---------+--------+----------+ +| count | 8 | 8 | ++---------+--------+----------+ + +* *address* is the client DMA memory address being accessed. This address = must have + been previously exported to the server with a ``VFIO_USER_DMA_MAP`` mess= age. +* *count* is the size of the data to be transferred. + +Reply +^^^^^ + ++---------+--------+----------+ +| Name | Offset | Size | ++=3D=3D=3D=3D=3D=3D=3D=3D=3D+=3D=3D=3D=3D=3D=3D=3D=3D+=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D+ +| address | 0 | 8 | ++---------+--------+----------+ +| count | 8 | 8 | ++---------+--------+----------+ +| data | 16 | variable | ++---------+--------+----------+ + +* *address* is the client DMA memory address being accessed. +* *count* is the size of the data transferred. +* *data* is the data read. + +``VFIO_USER_DMA_WRITE`` +----------------------- + +If the client has not shared mappable memory, the server can use this mess= age to +write to guest memory. + +Request +^^^^^^^ + ++---------+--------+----------+ +| Name | Offset | Size | ++=3D=3D=3D=3D=3D=3D=3D=3D=3D+=3D=3D=3D=3D=3D=3D=3D=3D+=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D+ +| address | 0 | 8 | ++---------+--------+----------+ +| count | 8 | 8 | ++---------+--------+----------+ +| data | 16 | variable | ++---------+--------+----------+ + +* *address* is the client DMA memory address being accessed. This address = must have + been previously exported to the server with a ``VFIO_USER_DMA_MAP`` mess= age. +* *count* is the size of the data to be transferred. +* *data* is the data to write + +Reply +^^^^^ + ++---------+--------+----------+ +| Name | Offset | Size | ++=3D=3D=3D=3D=3D=3D=3D=3D=3D+=3D=3D=3D=3D=3D=3D=3D=3D+=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D+ +| address | 0 | 8 | ++---------+--------+----------+ +| count | 8 | 4 | ++---------+--------+----------+ + +* *address* is the client DMA memory address being accessed. +* *count* is the size of the data transferred. + +``VFIO_USER_DEVICE_RESET`` +-------------------------- + +This command message is sent from the client to the server to reset the de= vice. +Neither the request or reply have a payload. + +``VFIO_USER_REGION_WRITE_MULTI`` +-------------------------------- + +This message can be used to coalesce multiple device write operations +into a single messgage. It is only used as an optimization when the +outgoing message queue is relatively full. + +Request +^^^^^^^ + ++---------+--------+----------+ +| Name | Offset | Size | ++=3D=3D=3D=3D=3D=3D=3D=3D=3D+=3D=3D=3D=3D=3D=3D=3D=3D+=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D+ +| wr_cnt | 0 | 8 | ++---------+--------+----------+ +| wrs | 8 | variable | ++---------+--------+----------+ + +* *wr_cnt* is the number of device writes coalesced in the message +* *wrs* is an array of device writes defined below + +Single Device Write Format +"""""""""""""""""""""""""" + ++--------+--------+----------+ +| Name | Offset | Size | ++=3D=3D=3D=3D=3D=3D=3D=3D+=3D=3D=3D=3D=3D=3D=3D=3D+=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D+ +| offset | 0 | 8 | ++--------+--------+----------+ +| region | 8 | 4 | ++--------+--------+----------+ +| count | 12 | 4 | ++--------+--------+----------+ +| data | 16 | 8 | ++--------+--------+----------+ + +* *offset* into the region being accessed. +* *region* is the index of the region being accessed. +* *count* is the size of the data to be transferred. This format can + only describe writes of 8 bytes or less. +* *data* is the data to write. + +Reply +^^^^^ + ++---------+--------+----------+ +| Name | Offset | Size | ++=3D=3D=3D=3D=3D=3D=3D=3D=3D+=3D=3D=3D=3D=3D=3D=3D=3D+=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D+ +| wr_cnt | 0 | 8 | ++---------+--------+----------+ + +* *wr_cnt* is the number of device writes completed. + + +Appendices +=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D + +Unused VFIO ``ioctl()`` commands +-------------------------------- + +The following VFIO commands do not have an equivalent vfio-user command: + +* ``VFIO_GET_API_VERSION`` +* ``VFIO_CHECK_EXTENSION`` +* ``VFIO_SET_IOMMU`` +* ``VFIO_GROUP_GET_STATUS`` +* ``VFIO_GROUP_SET_CONTAINER`` +* ``VFIO_GROUP_UNSET_CONTAINER`` +* ``VFIO_GROUP_GET_DEVICE_FD`` +* ``VFIO_IOMMU_GET_INFO`` + +However, once support for live migration for VFIO devices is finalized some +of the above commands may have to be handled by the client in their +corresponding vfio-user form. This will be addressed in a future protocol +version. + +VFIO groups and containers +^^^^^^^^^^^^^^^^^^^^^^^^^^ + +The current VFIO implementation includes group and container idioms that +describe how a device relates to the host IOMMU. In the vfio-user +implementation, the IOMMU is implemented in SW by the client, and is not +visible to the server. The simplest idea would be that the client put each +device into its own group and container. + +Backend Program Conventions +--------------------------- + +vfio-user backend program conventions are based on the vhost-user ones. + +* The backend program must not daemonize itself. +* No assumptions must be made as to what access the backend program has on= the + system. +* File descriptors 0, 1 and 2 must exist, must have regular + stdin/stdout/stderr semantics, and can be redirected. +* The backend program must honor the SIGTERM signal. +* The backend program must accept the following commands line options: + + * ``--socket-path=3DPATH``: path to UNIX domain socket, + * ``--fd=3DFDNUM``: file descriptor for UNIX domain socket, incompatible= with + ``--socket-path`` +* The backend program must be accompanied with a JSON file stored under + ``/usr/share/vfio-user``. + +TODO add schema similar to docs/interop/vhost-user.json. --=20 2.49.0