To: devel@lists.libvirt.org
Subject: [PATCH v2 2/5] qemu: open VFIO FDs from libvirt backend
Date: Fri, 21 Nov 2025 18:20:54 -0800
Message-ID: <20251122022057.3440459-3-nathanc@nvidia.com>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20251122022057.3440459-1-nathanc@nvidia.com>
References: <20251122022057.3440459-1-nathanc@nvidia.com>
MIME-Version: 1.0
CC: skolothumtho@nvidia.com, nicolinc@nvidia.com, nathanc@nvidia.com, mochs@nvidia.com
List-Id: Development discussions about the libvirt library & tools
From: Nathan Chen via Devel
Reply-To: Nathan Chen
Content-Type: text/plain; charset="utf-8"

Open the VFIO device FDs from the libvirt backend without exposing
them to XML users: one FD per iommufd hostdev, opened from
/dev/vfio/devices/vfioX, is passed to QEMU on the command line.

Signed-off-by: Nathan Chen <nathanc@nvidia.com>
---
 src/conf/domain_conf.h   |   2 +
 src/libvirt_private.syms |   1 +
 src/qemu/qemu_command.c  |  26 ++++++++
 src/qemu/qemu_domain.c   |  39 ++++++++++++
 src/qemu/qemu_domain.h   |  17 +++++
 src/qemu/qemu_process.c  | 130 +++++++++++++++++++++++++++++++++++++++
 src/util/virpci.c        |  69 +++++++++++++++++++++
 src/util/virpci.h        |   2 +
 8 files changed, 286 insertions(+)

diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h
index 4fd8342950..da4ce9fc86 100644
--- a/src/conf/domain_conf.h
+++ b/src/conf/domain_conf.h
@@ -364,6 +364,8 @@ struct _virDomainHostdevDef {
      */
     virDomainNetDef *parentnet;
 
+    virObject *privateData;
+
     virDomainHostdevMode mode;
     virDomainStartupPolicy startupPolicy;
     bool managed;
diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms
index 4e57e4a8f6..ed2b0d381e 100644
--- a/src/libvirt_private.syms
+++ b/src/libvirt_private.syms
@@ -3159,6 +3159,7 @@ virPCIDeviceGetStubDriverName;
 virPCIDeviceGetStubDriverType;
 virPCIDeviceGetUnbindFromStub;
 virPCIDeviceGetUsedBy;
+virPCIDeviceGetVfioPath;
 virPCIDeviceGetVPD;
 virPCIDeviceHasPCIExpressLink;
 virPCIDeviceIsAssignable;
diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c
index 95d1c2ee98..9b08f66175 100644
--- a/src/qemu/qemu_command.c
+++ b/src/qemu/qemu_command.c
@@ -4756,6 +4756,12 @@ qemuBuildPCIHostdevDevProps(const virDomainDef *def,
     const char *iommufdId = NULL;
     /* 'ramfb' property must be omitted unless it's to be enabled */
     bool ramfb = pcisrc->ramfb == VIR_TRISTATE_SWITCH_ON;
+    bool useIommufd = false;
+
+    if (pcisrc->driver.name == VIR_DEVICE_HOSTDEV_PCI_DRIVER_NAME_VFIO &&
+        pcisrc->driver.iommufd == VIR_TRISTATE_BOOL_YES) {
+        useIommufd = true;
+    }
 
     /* caller has to assign proper passthrough driver name */
     switch (pcisrc->driver.name) {
@@ -4802,6 +4808,17 @@ qemuBuildPCIHostdevDevProps(const virDomainDef *def,
                               NULL) < 0)
         return NULL;
 
+    if (useIommufd && dev->privateData) {
+        qemuDomainHostdevPrivate *hostdevPriv = QEMU_DOMAIN_HOSTDEV_PRIVATE(dev);
+
+        if (hostdevPriv->vfioDeviceFd >= 0) {
+            if (virJSONValueObjectAdd(&props,
+                                      "S:fd", g_strdup_printf("%d", hostdevPriv->vfioDeviceFd),
+                                      NULL) < 0)
+                return NULL;
+        }
+    }
+
     if (qemuBuildDeviceAddressProps(props, def, dev->info) < 0)
         return NULL;
 
@@ -5260,6 +5277,15 @@ qemuBuildHostdevCommandLine(virCommand *cmd,
             if (qemuCommandAddExtDevice(cmd, hostdev->info, def, qemuCaps) < 0)
                 return -1;
 
+            if (subsys->u.pci.driver.iommufd == VIR_TRISTATE_BOOL_YES) {
+                qemuDomainHostdevPrivate *hostdevPriv = QEMU_DOMAIN_HOSTDEV_PRIVATE(hostdev);
+
+                if (hostdevPriv && hostdevPriv->vfioDeviceFd >= 0) {
+                    virCommandPassFD(cmd, hostdevPriv->vfioDeviceFd,
+                                     VIR_COMMAND_PASS_FD_CLOSE_PARENT);
+                }
+            }
+
             if (!(devprops = qemuBuildPCIHostdevDevProps(def, hostdev)))
                 return -1;
 
diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c
index ac56fc7cb4..7601bdbb2b 100644
--- a/src/qemu/qemu_domain.c
+++ b/src/qemu/qemu_domain.c
@@ -1238,6 +1238,45 @@ qemuDomainNetworkPrivateFormat(const virDomainNetDef *net,
 }
 
 
+static virClass *qemuDomainHostdevPrivateClass;
+
+static void
+qemuDomainHostdevPrivateDispose(void *obj)
+{
+    qemuDomainHostdevPrivate *priv = obj;
+
+    VIR_FORCE_CLOSE(priv->vfioDeviceFd);
+}
+
+
+static int
+qemuDomainHostdevPrivateOnceInit(void)
+{
+    if (!VIR_CLASS_NEW(qemuDomainHostdevPrivate, virClassForObject()))
+        return -1;
+
+    return 0;
+}
+
+VIR_ONCE_GLOBAL_INIT(qemuDomainHostdevPrivate);
+
+virObject *
+qemuDomainHostdevPrivateNew(void)
+{
+    qemuDomainHostdevPrivate *priv;
+
+    if (qemuDomainHostdevPrivateInitialize() < 0)
+        return NULL;
+
+    if (!(priv = virObjectNew(qemuDomainHostdevPrivateClass)))
+        return NULL;
+
+    priv->vfioDeviceFd = -1;
+
+    return (virObject *) priv;
+}
+
+
 /* qemuDomainSecretInfoSetup:
  * @priv: pointer to domain private object
  * @alias: alias of the secret
diff --git a/src/qemu/qemu_domain.h b/src/qemu/qemu_domain.h
index 3396f929fd..4736f1ede5 100644
--- a/src/qemu/qemu_domain.h
+++ b/src/qemu/qemu_domain.h
@@ -461,6 +461,17 @@ struct _qemuDomainTPMPrivate {
 };
 
 
+#define QEMU_DOMAIN_HOSTDEV_PRIVATE(hostdev) \
+    ((qemuDomainHostdevPrivate *) (hostdev)->privateData)
+
+typedef struct _qemuDomainHostdevPrivate qemuDomainHostdevPrivate;
+struct _qemuDomainHostdevPrivate {
+    virObject parent;
+
+    /* VFIO device file descriptor for iommufd passthrough */
+    int vfioDeviceFd;
+};
+
 void
 qemuDomainNetworkPrivateClearFDs(qemuDomainNetworkPrivate *priv);
 
@@ -1174,3 +1185,9 @@ qemuDomainCheckCPU(virArch arch,
 bool
 qemuDomainMachineSupportsFloppy(const char *machine,
                                 virQEMUCaps *qemuCaps);
+
+virObject *
+qemuDomainHostdevPrivateNew(void);
+
+int qemuProcessOpenVfioFds(virDomainObj *vm);
+void qemuProcessCloseVfioFds(virDomainObj *vm);
diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
index 45fc32a663..bf245ee8af 100644
--- a/src/qemu/qemu_process.c
+++ b/src/qemu/qemu_process.c
@@ -106,6 +106,7 @@
 
 #include "logging/log_manager.h"
 #include "logging/log_protocol.h"
+#include "util/virpci.h"
 
 #define VIR_FROM_THIS VIR_FROM_QEMU
 
@@ -8091,6 +8092,9 @@ qemuProcessLaunch(virConnectPtr conn,
     if (qemuExtDevicesStart(driver, vm, incomingMigrationExtDevices) < 0)
         goto cleanup;
 
+    if (qemuProcessOpenVfioFds(vm) < 0)
+        goto cleanup;
+
     if (!(cmd = qemuBuildCommandLine(vm,
                                      incoming ? "defer" : NULL,
                                      vmop,
@@ -10267,3 +10271,129 @@ qemuProcessHandleNbdkitExit(qemuNbdkitProcess *nbdkit,
     qemuProcessEventSubmit(vm, QEMU_PROCESS_EVENT_NBDKIT_EXITED, 0, 0, nbdkit);
     virObjectUnlock(vm);
 }
+
+/**
+ * qemuProcessOpenVfioDeviceFd:
+ * @hostdev: host device definition
+ * @vfioFd: returned file descriptor
+ *
+ * Opens the VFIO device file descriptor for a hostdev.
+ *
+ * Returns: 0 on success, -1 on failure
+ */
+static int
+qemuProcessOpenVfioDeviceFd(virDomainHostdevDef *hostdev,
+                            int *vfioFd)
+{
+    g_autofree char *vfioPath = NULL;
+    int fd = -1;
+
+
+    if (hostdev->mode != VIR_DOMAIN_HOSTDEV_MODE_SUBSYS ||
+        hostdev->source.subsys.type != VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI) {
+        virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
+                       _("VFIO FD only supported for PCI hostdevs"));
+        return -1;
+    }
+
+    if (virPCIDeviceGetVfioPath(&hostdev->source.subsys.u.pci.addr, &vfioPath) < 0)
+        return -1;
+
+    VIR_DEBUG("Opening VFIO device %s", vfioPath);
+
+    if ((fd = open(vfioPath, O_RDWR | O_CLOEXEC)) < 0) {
+        if (errno == ENOENT) {
+            virReportError(VIR_ERR_CONFIG_UNSUPPORTED,
+                           _("VFIO device %1$s not found - ensure device is bound to vfio-pci driver"),
+                           vfioPath);
+        } else {
+            virReportSystemError(errno,
+                                 _("cannot open VFIO device %1$s"), vfioPath);
+        }
+        return -1;
+    }
+
+    *vfioFd = fd;
+    VIR_DEBUG("Opened VFIO device FD %d for %s", *vfioFd, vfioPath);
+    return 0;
+}
+
+/**
+ * qemuProcessOpenVfioFds:
+ * @vm: domain object
+ *
+ * Opens all necessary VFIO file descriptors for the domain.
+ *
+ * Returns: 0 on success, -1 on failure
+ */
+int
+qemuProcessOpenVfioFds(virDomainObj *vm)
+{
+    size_t i;
+
+    /* Check if we have any hostdevs that need VFIO FDs */
+    for (i = 0; i < vm->def->nhostdevs; i++) {
+        virDomainHostdevDef *hostdev = vm->def->hostdevs[i];
+        qemuDomainHostdevPrivate *hostdevPriv = NULL;
+
+        if (hostdev->mode == VIR_DOMAIN_HOSTDEV_MODE_SUBSYS &&
+            hostdev->source.subsys.type == VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI) {
+
+            if (hostdev->source.subsys.u.pci.driver.name == VIR_DEVICE_HOSTDEV_PCI_DRIVER_NAME_VFIO &&
+                hostdev->source.subsys.u.pci.driver.iommufd == VIR_TRISTATE_BOOL_YES) {
+
+                if (!hostdev->privateData) {
+                    if (!(hostdev->privateData = qemuDomainHostdevPrivateNew()))
+                        goto error;
+                }
+
+                hostdevPriv = QEMU_DOMAIN_HOSTDEV_PRIVATE(hostdev);
+
+                /* Open VFIO device FD */
+                if (qemuProcessOpenVfioDeviceFd(hostdev, &hostdevPriv->vfioDeviceFd) < 0)
+                    goto error;
+
+                VIR_DEBUG("Stored VFIO FD %d in hostdev %04x:%02x:%02x.%d private data",
+                          hostdevPriv->vfioDeviceFd,
+                          hostdev->source.subsys.u.pci.addr.domain,
+                          hostdev->source.subsys.u.pci.addr.bus,
+                          hostdev->source.subsys.u.pci.addr.slot,
+                          hostdev->source.subsys.u.pci.addr.function);
+            }
+        }
+    }
+
+    return 0;
+
+ error:
+    qemuProcessCloseVfioFds(vm);
+    return -1;
+}
+
+/**
+ * qemuProcessCloseVfioFds:
+ * @vm: domain object
+ *
+ * Closes all VFIO file descriptors for the domain.
+ */
+void
+qemuProcessCloseVfioFds(virDomainObj *vm)
+{
+    size_t i;
+
+    /* Close all VFIO device FDs */
+    for (i = 0; i < vm->def->nhostdevs; i++) {
+        virDomainHostdevDef *hostdev = vm->def->hostdevs[i];
+        qemuDomainHostdevPrivate *hostdevPriv;
+
+        if (!hostdev->privateData)
+            continue;
+
+        hostdevPriv = QEMU_DOMAIN_HOSTDEV_PRIVATE(hostdev);
+
+        if (hostdevPriv->vfioDeviceFd >= 0) {
+            VIR_DEBUG("Closing VFIO device FD %d", hostdevPriv->vfioDeviceFd);
+            VIR_FORCE_CLOSE(hostdevPriv->vfioDeviceFd);
+        }
+    }
+}
diff --git a/src/util/virpci.c b/src/util/virpci.c
index 90617e69c6..da62ece0f6 100644
--- a/src/util/virpci.c
+++ b/src/util/virpci.c
@@ -3320,3 +3320,72 @@ virPCIDeviceAddressFree(virPCIDeviceAddress *address)
 {
     g_free(address);
 }
+
+/**
+ * virPCIDeviceGetVfioPath:
+ * @addr: host device PCI address
+ * @vfioPath: returned VFIO device path
+ *
+ * Constructs the VFIO device path for a PCI hostdev.
+ *
+ * Returns: 0 on success, -1 on failure
+ */
+int
+virPCIDeviceGetVfioPath(virPCIDeviceAddress *addr,
+                        char **vfioPath)
+{
+    g_autofree char *addrStr = NULL;
+
+    *vfioPath = NULL;
+    addrStr = virPCIDeviceAddressAsString(addr);
+
+    /* First try: Direct lookup in device's vfio-dev subdirectory */
+    {
+        g_autofree char *sysfsPath = NULL;
+        g_autoptr(DIR) dir = NULL;
+        struct dirent *entry = NULL;
+
+        sysfsPath = g_strdup_printf("/sys/bus/pci/devices/%s/vfio-dev/", addrStr);
+
+        if (virDirOpen(&dir, sysfsPath) == 1) {
+            while (virDirRead(dir, &entry, sysfsPath) > 0) {
+                if (STRPREFIX(entry->d_name, "vfio")) {
+                    *vfioPath = g_strdup_printf("/dev/vfio/devices/%s", entry->d_name);
+                    return 0;
+                }
+            }
+        }
+    }
+
+    /* Second try: Scan /sys/class/vfio-dev */
+    {
+        g_autofree char *sysfsPath = g_strdup("/sys/class/vfio-dev");
+        g_autoptr(DIR) dir = NULL;
+        struct dirent *entry = NULL;
+
+        if (virDirOpen(&dir, sysfsPath) == 1) {
+            while (virDirRead(dir, &entry, sysfsPath) > 0) {
+                g_autofree char *devLink = NULL;
+                g_autofree char *target = NULL;
+
+                if (!STRPREFIX(entry->d_name, "vfio"))
+                    continue;
+
+                devLink = g_strdup_printf("/sys/class/vfio-dev/%s/device", entry->d_name);
+
+                if (virFileResolveLink(devLink, &target) < 0)
+                    continue;
+
+                if (strstr(target, addrStr)) {
+                    *vfioPath = g_strdup_printf("/dev/vfio/devices/%s", entry->d_name);
+                    return 0;
+                }
+            }
+        }
+    }
+
+    virReportError(VIR_ERR_INTERNAL_ERROR,
+                   _("cannot find VFIO device for PCI device %1$s"),
+                   addrStr);
+    return -1;
+}
diff --git a/src/util/virpci.h b/src/util/virpci.h
index fc538566e1..24ede10755 100644
--- a/src/util/virpci.h
+++ b/src/util/virpci.h
@@ -296,6 +296,8 @@ void virPCIEDeviceInfoFree(virPCIEDeviceInfo *dev);
 
 void virPCIDeviceAddressFree(virPCIDeviceAddress *address);
 
+int virPCIDeviceGetVfioPath(virPCIDeviceAddress *addr, char **vfioPath);
+
 G_DEFINE_AUTOPTR_CLEANUP_FUNC(virPCIDevice, virPCIDeviceFree);
 G_DEFINE_AUTOPTR_CLEANUP_FUNC(virPCIDeviceAddress, virPCIDeviceAddressFree);
 G_DEFINE_AUTOPTR_CLEANUP_FUNC(virPCIEDeviceInfo, virPCIEDeviceInfoFree);
-- 
2.43.0