From nobody Fri Oct 3 06:37:01 2025
Subject: [RFC 01/14] vfio/nvgrace-gpu: Expand module_pci_driver to allow custom module init
Date: Thu, 4 Sep 2025 04:08:15 +0000
Message-ID: <20250904040828.319452-2-ankita@nvidia.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20250904040828.319452-1-ankita@nvidia.com>
References: <20250904040828.319452-1-ankita@nvidia.com>
X-Mailing-List: linux-kernel@vger.kernel.org

From: Ankit Agrawal

Allow custom changes to the nvgrace-gpu module init functions by
expanding the definition of module_pci_driver.
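For readers less familiar with the helper: module_pci_driver() is a
convenience macro that generates a module init function calling
pci_register_driver() and a module exit function calling
pci_unregister_driver(), which leaves no room for extra setup or
teardown. The userspace sketch below only mimics that shape so it can be
compiled outside the kernel. The names mock_pci_driver, mock_register,
mock_unregister, MOCK_MODULE_DRIVER and egm_list_ready are hypothetical
and are not kernel APIs.

```c
#include <assert.h>

/* Hypothetical stand-in for struct pci_driver; not the kernel type. */
struct mock_pci_driver {
	const char *name;
	int registered;
};

/* Mock of pci_register_driver(): returns 0 on success. */
static int mock_register(struct mock_pci_driver *drv)
{
	drv->registered = 1;
	return 0;
}

/* Mock of pci_unregister_driver(). */
static void mock_unregister(struct mock_pci_driver *drv)
{
	drv->registered = 0;
}

/*
 * Shape of the convenience macro: it stamps out init/exit functions
 * that do nothing except register and unregister the driver.
 */
#define MOCK_MODULE_DRIVER(drv)					\
	static int drv##_macro_init(void)			\
	{							\
		return mock_register(&(drv));			\
	}							\
	static void drv##_macro_exit(void)			\
	{							\
		mock_unregister(&(drv));			\
	}

static struct mock_pci_driver demo_driver = { .name = "demo" };
MOCK_MODULE_DRIVER(demo_driver)

/* Hypothetical module-wide state that must exist before any probe runs. */
static int egm_list_ready;

/* Open-coded init: custom setup can now precede registration. */
static int demo_custom_init(void)
{
	egm_list_ready = 1;
	return mock_register(&demo_driver);
}

/* Open-coded exit: custom teardown can follow unregistration. */
static void demo_custom_exit(void)
{
	mock_unregister(&demo_driver);
	egm_list_ready = 0;
}
```

The open-coded pair behaves like the macro-generated pair, but gives the
module a place to initialize shared state before the driver is
registered.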

Signed-off-by: Ankit Agrawal
---
 drivers/vfio/pci/nvgrace-gpu/main.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/drivers/vfio/pci/nvgrace-gpu/main.c b/drivers/vfio/pci/nvgrace-gpu/main.c
index d95761dcdd58..72e7ac1fa309 100644
--- a/drivers/vfio/pci/nvgrace-gpu/main.c
+++ b/drivers/vfio/pci/nvgrace-gpu/main.c
@@ -1009,7 +1009,17 @@ static struct pci_driver nvgrace_gpu_vfio_pci_driver = {
 	.driver_managed_dma = true,
 };
 
-module_pci_driver(nvgrace_gpu_vfio_pci_driver);
+static int __init nvgrace_gpu_vfio_pci_init(void)
+{
+	return pci_register_driver(&nvgrace_gpu_vfio_pci_driver);
+}
+module_init(nvgrace_gpu_vfio_pci_init);
+
+static void __exit nvgrace_gpu_vfio_pci_cleanup(void)
+{
+	pci_unregister_driver(&nvgrace_gpu_vfio_pci_driver);
+}
+module_exit(nvgrace_gpu_vfio_pci_cleanup);
 
 MODULE_LICENSE("GPL");
 MODULE_AUTHOR("Ankit Agrawal ");
-- 
2.34.1

From nobody Fri Oct 3 06:37:01 2025
Subject: [RFC 02/14] vfio/nvgrace-gpu: Create auxiliary device for EGM
Date: Thu, 4 Sep 2025 04:08:16 +0000
Message-ID: <20250904040828.319452-3-ankita@nvidia.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20250904040828.319452-1-ankita@nvidia.com>
References: <20250904040828.319452-1-ankita@nvidia.com>
X-Mailing-List: linux-kernel@vger.kernel.org

From: Ankit Agrawal

The Extended GPU Memory (EGM) feature enables GPU access to system
memory across sockets and physical systems on Grace Hopper and Grace
Blackwell systems. When the feature is enabled through SBIOS, part of
the system memory is made available to the GPU through the EGM path.

The EGM functionality is separate and largely independent from the
core GPU device functionality. However, the EGM region information
(base SPA and size) is associated with the GPU in the ACPI tables.
An architecture with EGM represented as an auxiliary device suits this
context well.

The parent GPU device creates an EGM auxiliary device to be managed
independently by an auxiliary EGM driver. The EGM region information
is kept as part of the shared struct nvgrace_egm_dev along with the
auxiliary device handle.

Each socket has a separate EGM region, and hence a multi-socket system
has multiple EGM regions. Each EGM region has a separate
nvgrace_egm_dev, and nvgrace-gpu keeps the EGM regions as part of a
list.

Note that EGM is an optional feature enabled through SBIOS. The EGM
properties are only populated in the ACPI tables if the feature is
enabled; they are absent otherwise. The absence of the properties is
thus not considered fatal. The presence of an improper set of values,
however, is considered fatal.
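
The bookkeeping described above (an optional firmware property, one
region per socket, regions kept on a list) can be sketched in userspace
C. This is a simplified illustration, not the driver code: egm_region,
egm_find_region, egm_track_region and the has_pxm flag are hypothetical,
and a plain singly linked list stands in for the kernel's list_head. The
lookup that skips an already tracked proximity domain is an assumption
of this sketch about how one entry per socket could be kept.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for one tracked EGM region (one per socket). */
struct egm_region {
	uint64_t pxm;			/* proximity domain of the region */
	struct egm_region *next;
};

static struct egm_region *egm_regions;	/* head of the region list */

/* Return the tracked region for @pxm, or NULL if none is tracked yet. */
static struct egm_region *egm_find_region(uint64_t pxm)
{
	struct egm_region *r;

	for (r = egm_regions; r; r = r->next)
		if (r->pxm == pxm)
			return r;
	return NULL;
}

/* Number of regions currently tracked. */
static size_t egm_region_count(void)
{
	size_t n = 0;
	struct egm_region *r;

	for (r = egm_regions; r; r = r->next)
		n++;
	return n;
}

/*
 * Track the EGM region reported by one GPU.
 * @has_pxm models whether firmware populated the property at all.
 * Returns 0 when there is nothing to do or the region is tracked,
 * -1 on allocation failure.
 */
static int egm_track_region(int has_pxm, uint64_t pxm)
{
	struct egm_region *r;

	if (!has_pxm)
		return 0;	/* feature disabled: not an error */

	if (egm_find_region(pxm))
		return 0;	/* region already tracked for this socket */

	r = calloc(1, sizeof(*r));
	if (!r)
		return -1;

	r->pxm = pxm;
	r->next = egm_regions;
	egm_regions = r;
	return 0;
}

/* Free every tracked region. */
static void egm_release_all(void)
{
	while (egm_regions) {
		struct egm_region *r = egm_regions;

		egm_regions = r->next;
		free(r);
	}
}
```

Calling egm_track_region() twice with the same proximity domain leaves a
single entry on the list, while a call with has_pxm set to 0 succeeds
without tracking anything.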

It is also noteworthy that there may be multiple GPUs present per
socket, each carrying duplicate EGM region information. Make sure the
duplicate data does not get added.

Suggested-by: Jason Gunthorpe
Signed-off-by: Ankit Agrawal
---
 MAINTAINERS                            |  5 +-
 drivers/vfio/pci/nvgrace-gpu/Makefile  |  2 +-
 drivers/vfio/pci/nvgrace-gpu/egm_dev.c | 61 ++++++++++++++++++++++
 drivers/vfio/pci/nvgrace-gpu/egm_dev.h | 17 +++++++
 drivers/vfio/pci/nvgrace-gpu/main.c    | 70 +++++++++++++++++++++++++-
 include/linux/nvgrace-egm.h            | 23 +++++++++
 6 files changed, 175 insertions(+), 3 deletions(-)
 create mode 100644 drivers/vfio/pci/nvgrace-gpu/egm_dev.c
 create mode 100644 drivers/vfio/pci/nvgrace-gpu/egm_dev.h
 create mode 100644 include/linux/nvgrace-egm.h

diff --git a/MAINTAINERS b/MAINTAINERS
index 6dcfbd11efef..dd7df834b70b 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -26471,7 +26471,10 @@ VFIO NVIDIA GRACE GPU DRIVER
 M:	Ankit Agrawal
 L:	kvm@vger.kernel.org
 S:	Supported
-F:	drivers/vfio/pci/nvgrace-gpu/
+F:	drivers/vfio/pci/nvgrace-gpu/egm_dev.c
+F:	drivers/vfio/pci/nvgrace-gpu/egm_dev.h
+F:	drivers/vfio/pci/nvgrace-gpu/main.c
+F:	include/linux/nvgrace-egm.h
 
 VFIO PCI DEVICE SPECIFIC DRIVERS
 R:	Jason Gunthorpe
diff --git a/drivers/vfio/pci/nvgrace-gpu/Makefile b/drivers/vfio/pci/nvgrace-gpu/Makefile
index 3ca8c187897a..e72cc6739ef8 100644
--- a/drivers/vfio/pci/nvgrace-gpu/Makefile
+++ b/drivers/vfio/pci/nvgrace-gpu/Makefile
@@ -1,3 +1,3 @@
 # SPDX-License-Identifier: GPL-2.0-only
 obj-$(CONFIG_NVGRACE_GPU_VFIO_PCI) += nvgrace-gpu-vfio-pci.o
-nvgrace-gpu-vfio-pci-y := main.o
+nvgrace-gpu-vfio-pci-y := main.o egm_dev.o
diff --git a/drivers/vfio/pci/nvgrace-gpu/egm_dev.c b/drivers/vfio/pci/nvgrace-gpu/egm_dev.c
new file mode 100644
index 000000000000..f4e27dadf1ef
--- /dev/null
+++ b/drivers/vfio/pci/nvgrace-gpu/egm_dev.c
@@ -0,0 +1,61 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (c) 2025, NVIDIA CORPORATION & AFFILIATES. All rights reserved
+ */
+
+#include
+#include "egm_dev.h"
+
+/*
+ * Determine if the EGM feature is enabled. If disabled, there
+ * will be no EGM properties populated in the ACPI tables and this
+ * fetch would fail.
+ */
+int nvgrace_gpu_has_egm_property(struct pci_dev *pdev, u64 *pegmpxm)
+{
+	return device_property_read_u64(&pdev->dev, "nvidia,egm-pxm",
+					pegmpxm);
+}
+
+static void nvgrace_gpu_release_aux_device(struct device *device)
+{
+	struct auxiliary_device *aux_dev = container_of(device, struct auxiliary_device, dev);
+	struct nvgrace_egm_dev *egm_dev = container_of(aux_dev, struct nvgrace_egm_dev, aux_dev);
+
+	kvfree(egm_dev);
+}
+
+struct nvgrace_egm_dev *
+nvgrace_gpu_create_aux_device(struct pci_dev *pdev, const char *name,
+			      u64 egmpxm)
+{
+	struct nvgrace_egm_dev *egm_dev;
+	int ret;
+
+	egm_dev = kvzalloc(sizeof(*egm_dev), GFP_KERNEL);
+	if (!egm_dev)
+		goto create_err;
+
+	egm_dev->egmpxm = egmpxm;
+	egm_dev->aux_dev.id = egmpxm;
+	egm_dev->aux_dev.name = name;
+	egm_dev->aux_dev.dev.release = nvgrace_gpu_release_aux_device;
+	egm_dev->aux_dev.dev.parent = &pdev->dev;
+
+	ret = auxiliary_device_init(&egm_dev->aux_dev);
+	if (ret)
+		goto free_dev;
+
+	ret = auxiliary_device_add(&egm_dev->aux_dev);
+	if (ret) {
+		auxiliary_device_uninit(&egm_dev->aux_dev);
+		goto create_err;
+	}
+
+	return egm_dev;
+
+free_dev:
+	kvfree(egm_dev);
+create_err:
+	return NULL;
+}
diff --git a/drivers/vfio/pci/nvgrace-gpu/egm_dev.h b/drivers/vfio/pci/nvgrace-gpu/egm_dev.h
new file mode 100644
index 000000000000..c00f5288f4e7
--- /dev/null
+++ b/drivers/vfio/pci/nvgrace-gpu/egm_dev.h
@@ -0,0 +1,17 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (c) 2025, NVIDIA CORPORATION & AFFILIATES. All rights reserved
+ */
+
+#ifndef EGM_DEV_H
+#define EGM_DEV_H
+
+#include
+
+int nvgrace_gpu_has_egm_property(struct pci_dev *pdev, u64 *pegmpxm);
+
+struct nvgrace_egm_dev *
+nvgrace_gpu_create_aux_device(struct pci_dev *pdev, const char *name,
+			      u64 egmphys);
+
+#endif /* EGM_DEV_H */
diff --git a/drivers/vfio/pci/nvgrace-gpu/main.c b/drivers/vfio/pci/nvgrace-gpu/main.c
index 72e7ac1fa309..2cf851492990 100644
--- a/drivers/vfio/pci/nvgrace-gpu/main.c
+++ b/drivers/vfio/pci/nvgrace-gpu/main.c
@@ -7,6 +7,8 @@
 #include
 #include
 #include
+#include
+#include "egm_dev.h"
 
 /*
  * The device memory usable to the workloads running in the VM is cached
@@ -60,6 +62,63 @@ struct nvgrace_gpu_pci_core_device {
 	bool has_mig_hw_bug;
 };
 
+static struct list_head egm_dev_list;
+
+static int nvgrace_gpu_create_egm_aux_device(struct pci_dev *pdev)
+{
+	struct nvgrace_egm_dev_entry *egm_entry;
+	u64 egmpxm;
+	int ret = 0;
+
+	/*
+	 * EGM is an optional feature enabled in SBIOS. If disabled, there
+	 * will be no EGM properties populated in the ACPI tables and this
+	 * fetch would fail. Treat this failure as non-fatal and return
+	 * early.
+	 */
+	if (nvgrace_gpu_has_egm_property(pdev, &egmpxm))
+		goto exit;
+
+	egm_entry = kvzalloc(sizeof(*egm_entry), GFP_KERNEL);
+	if (!egm_entry)
+		return -ENOMEM;
+
+	egm_entry->egm_dev =
+		nvgrace_gpu_create_aux_device(pdev, NVGRACE_EGM_DEV_NAME,
+					      egmpxm);
+	if (!egm_entry->egm_dev) {
+		kvfree(egm_entry);
+		ret = -EINVAL;
+		goto exit;
+	}
+
+	list_add_tail(&egm_entry->list, &egm_dev_list);
+
+exit:
+	return ret;
+}
+
+static void nvgrace_gpu_destroy_egm_aux_device(struct pci_dev *pdev)
+{
+	struct nvgrace_egm_dev_entry *egm_entry, *temp_egm_entry;
+	u64 egmpxm;
+
+	if (nvgrace_gpu_has_egm_property(pdev, &egmpxm))
+		return;
+
+	list_for_each_entry_safe(egm_entry, temp_egm_entry, &egm_dev_list, list) {
+		/*
+		 * Free the EGM region corresponding to the input GPU
+		 * device.
+		 */
+		if (egm_entry->egm_dev->egmpxm == egmpxm) {
+			auxiliary_device_destroy(&egm_entry->egm_dev->aux_dev);
+			list_del(&egm_entry->list);
+			kvfree(egm_entry);
+		}
+	}
+}
+
 static void nvgrace_gpu_init_fake_bar_emu_regs(struct vfio_device *core_vdev)
 {
 	struct nvgrace_gpu_pci_core_device *nvdev =
@@ -965,14 +1024,20 @@ static int nvgrace_gpu_probe(struct pci_dev *pdev,
 						    memphys, memlength);
 		if (ret)
 			goto out_put_vdev;
+
+		ret = nvgrace_gpu_create_egm_aux_device(pdev);
+		if (ret)
+			goto out_put_vdev;
 	}
 
 	ret = vfio_pci_core_register_device(&nvdev->core_device);
 	if (ret)
-		goto out_put_vdev;
+		goto out_reg;
 
 	return ret;
 
+out_reg:
+	nvgrace_gpu_destroy_egm_aux_device(pdev);
 out_put_vdev:
 	vfio_put_device(&nvdev->core_device.vdev);
 	return ret;
@@ -982,6 +1047,7 @@ static void nvgrace_gpu_remove(struct pci_dev *pdev)
 {
 	struct vfio_pci_core_device *core_device = dev_get_drvdata(&pdev->dev);
 
+	nvgrace_gpu_destroy_egm_aux_device(pdev);
 	vfio_pci_core_unregister_device(core_device);
 	vfio_put_device(&core_device->vdev);
 }
@@ -1011,6 +1077,8 @@ static struct pci_driver nvgrace_gpu_vfio_pci_driver = {
 
 static int __init nvgrace_gpu_vfio_pci_init(void)
 {
+	INIT_LIST_HEAD(&egm_dev_list);
+
 	return pci_register_driver(&nvgrace_gpu_vfio_pci_driver);
 }
 module_init(nvgrace_gpu_vfio_pci_init);
diff --git a/include/linux/nvgrace-egm.h b/include/linux/nvgrace-egm.h
new file mode 100644
index 000000000000..9575d4ad4338
--- /dev/null
+++ b/include/linux/nvgrace-egm.h
@@ -0,0 +1,23 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (c) 2025, NVIDIA CORPORATION & AFFILIATES. All rights reserved
+ */
+
+#ifndef NVGRACE_EGM_H
+#define NVGRACE_EGM_H
+
+#include
+
+#define NVGRACE_EGM_DEV_NAME "egm"
+
+struct nvgrace_egm_dev {
+	struct auxiliary_device aux_dev;
+	u64 egmpxm;
+};
+
+struct nvgrace_egm_dev_entry {
+	struct list_head list;
+	struct nvgrace_egm_dev *egm_dev;
+};
+
+#endif /* NVGRACE_EGM_H */
-- 
2.34.1

From nobody Fri Oct 3 06:37:01 2025
Subject: [RFC 03/14] vfio/nvgrace-gpu: track GPUs associated with the EGM regions
Date: Thu, 4 Sep 2025 04:08:17 +0000
Message-ID: <20250904040828.319452-4-ankita@nvidia.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20250904040828.319452-1-ankita@nvidia.com>
References: <20250904040828.319452-1-ankita@nvidia.com>
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Type: text/plain; charset="utf-8"

From: Ankit Agrawal

Grace Blackwell systems can have multiple GPUs on a socket, all of which
are associated with the EGM region for that socket. Track these GPUs in
a list.

On device probe, the GPU's pci_dev struct is added to a linked list on
the appropriate EGM region. Similarly, on device remove, the GPU's
pci_dev struct is removed from the EGM region.

Since the GPUs on a socket share the same EGM region, they have the same
set of EGM region information. Skip the EGM region information fetch if
it was already done through a different GPU on the same socket.

Signed-off-by: Ankit Agrawal
---
 drivers/vfio/pci/nvgrace-gpu/egm_dev.c | 29 ++++++++++++++++++++++
 drivers/vfio/pci/nvgrace-gpu/egm_dev.h |  4 +++
 drivers/vfio/pci/nvgrace-gpu/main.c    | 34 +++++++++++++++++++++++---
 include/linux/nvgrace-egm.h            |  6 +++++
 4 files changed, 70 insertions(+), 3 deletions(-)

diff --git a/drivers/vfio/pci/nvgrace-gpu/egm_dev.c b/drivers/vfio/pci/nvgrace-gpu/egm_dev.c
index f4e27dadf1ef..28cfd29eda56 100644
--- a/drivers/vfio/pci/nvgrace-gpu/egm_dev.c
+++ b/drivers/vfio/pci/nvgrace-gpu/egm_dev.c
@@ -17,6 +17,33 @@ int nvgrace_gpu_has_egm_property(struct pci_dev *pdev, u64 *pegmpxm)
 					pegmpxm);
 }
 
+int add_gpu(struct nvgrace_egm_dev *egm_dev, struct pci_dev *pdev)
+{
+	struct gpu_node *node;
+
+	node = kvzalloc(sizeof(*node), GFP_KERNEL);
+	if (!node)
+		return -ENOMEM;
+
+	node->pdev = pdev;
+
+	list_add_tail(&node->list, &egm_dev->gpus);
+
+	return 0;
+}
+
+void remove_gpu(struct nvgrace_egm_dev *egm_dev, struct pci_dev *pdev)
+{
+	struct gpu_node *node, *tmp;
+
+	list_for_each_entry_safe(node, tmp, &egm_dev->gpus, list) {
+		if (node->pdev == pdev) {
+			list_del(&node->list);
+			kvfree(node);
+		}
+	}
+}
+
 static void nvgrace_gpu_release_aux_device(struct device *device)
 {
 	struct auxiliary_device *aux_dev = container_of(device, struct auxiliary_device, dev);
@@ -37,6 +64,8 @@ nvgrace_gpu_create_aux_device(struct pci_dev *pdev, const char *name,
 		goto create_err;
 
 	egm_dev->egmpxm = egmpxm;
+	INIT_LIST_HEAD(&egm_dev->gpus);
+
 	egm_dev->aux_dev.id = egmpxm;
 	egm_dev->aux_dev.name = name;
 	egm_dev->aux_dev.dev.release = nvgrace_gpu_release_aux_device;
diff --git a/drivers/vfio/pci/nvgrace-gpu/egm_dev.h b/drivers/vfio/pci/nvgrace-gpu/egm_dev.h
index c00f5288f4e7..1635753c9e50 100644
--- a/drivers/vfio/pci/nvgrace-gpu/egm_dev.h
+++ b/drivers/vfio/pci/nvgrace-gpu/egm_dev.h
@@ -10,6 +10,10 @@
 
 int nvgrace_gpu_has_egm_property(struct pci_dev *pdev, u64 *pegmpxm);
 
+int add_gpu(struct nvgrace_egm_dev *egm_dev, struct pci_dev *pdev);
+
+void remove_gpu(struct nvgrace_egm_dev *egm_dev, struct pci_dev *pdev);
+
 struct nvgrace_egm_dev *
 nvgrace_gpu_create_aux_device(struct pci_dev *pdev, const char *name,
 			      u64 egmphys);
diff --git a/drivers/vfio/pci/nvgrace-gpu/main.c b/drivers/vfio/pci/nvgrace-gpu/main.c
index 2cf851492990..436f0ac17332 100644
--- a/drivers/vfio/pci/nvgrace-gpu/main.c
+++ b/drivers/vfio/pci/nvgrace-gpu/main.c
@@ -66,9 +66,10 @@ static struct list_head egm_dev_list;
 
 static int nvgrace_gpu_create_egm_aux_device(struct pci_dev *pdev)
 {
-	struct nvgrace_egm_dev_entry *egm_entry;
+	struct nvgrace_egm_dev_entry *egm_entry = NULL;
 	u64 egmpxm;
 	int ret = 0;
+	bool is_new_region = false;
 
 	/*
 	 * EGM is an optional feature enabled in SBIOS. If disabled, there
@@ -79,6 +80,19 @@ static int nvgrace_gpu_create_egm_aux_device(struct pci_dev *pdev)
 	if (nvgrace_gpu_has_egm_property(pdev, &egmpxm))
 		goto exit;
 
+	list_for_each_entry(egm_entry, &egm_dev_list, list) {
+		/*
+		 * A system could have multiple GPUs associated with an
+		 * EGM region and will have the same set of EGM region
+		 * information. Skip the EGM region information fetch if
+		 * already done through a different GPU on the same socket.
+		 */
+		if (egm_entry->egm_dev->egmpxm == egmpxm)
+			goto add_gpu;
+	}
+
+	is_new_region = true;
+
 	egm_entry = kvzalloc(sizeof(*egm_entry), GFP_KERNEL);
 	if (!egm_entry)
 		return -ENOMEM;
@@ -87,13 +101,23 @@ static int nvgrace_gpu_create_egm_aux_device(struct pci_dev *pdev)
 		nvgrace_gpu_create_aux_device(pdev, NVGRACE_EGM_DEV_NAME,
 					      egmpxm);
 	if (!egm_entry->egm_dev) {
-		kvfree(egm_entry);
 		ret = -EINVAL;
+		goto free_egm_entry;
+	}
+
+add_gpu:
+	ret = add_gpu(egm_entry->egm_dev, pdev);
+	if (!ret) {
+		if (is_new_region)
+			list_add_tail(&egm_entry->list, &egm_dev_list);
 		goto exit;
 	}
 
-	list_add_tail(&egm_entry->list, &egm_dev_list);
+	if (is_new_region)
+		auxiliary_device_destroy(&egm_entry->egm_dev->aux_dev);
 
+free_egm_entry:
+	kvfree(egm_entry);
 exit:
 	return ret;
 }
@@ -112,6 +136,10 @@ static void nvgrace_gpu_destroy_egm_aux_device(struct pci_dev *pdev)
 		 * device.
 		 */
 		if (egm_entry->egm_dev->egmpxm == egmpxm) {
+			remove_gpu(egm_entry->egm_dev, pdev);
+			if (!list_empty(&egm_entry->egm_dev->gpus))
+				break;
+
 			auxiliary_device_destroy(&egm_entry->egm_dev->aux_dev);
 			list_del(&egm_entry->list);
 			kvfree(egm_entry);
diff --git a/include/linux/nvgrace-egm.h b/include/linux/nvgrace-egm.h
index 9575d4ad4338..e42494a2b1a6 100644
--- a/include/linux/nvgrace-egm.h
+++ b/include/linux/nvgrace-egm.h
@@ -10,9 +10,15 @@
 
 #define NVGRACE_EGM_DEV_NAME "egm"
 
+struct gpu_node {
+	struct list_head list;
+	struct pci_dev *pdev;
+};
+
 struct nvgrace_egm_dev {
 	struct auxiliary_device aux_dev;
 	u64 egmpxm;
+	struct list_head gpus;
 };
 
 struct nvgrace_egm_dev_entry {
-- 
2.34.1

From nobody Fri Oct 3 06:37:01 2025
Received: from NAM10-MW2-obe.outbound.protection.outlook.com (mail-mw2nam10on2059.outbound.protection.outlook.com [40.107.94.59]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by
smtp.subspace.kernel.org (Postfix) with ESMTPS id 0ECA623814D; Thu, 4 Sep 2025 04:08:40 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=fail smtp.client-ip=40.107.94.59 ARC-Seal: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1756958922; cv=fail; b=pMQ9mrGi3zspdHwYmxErvlDBic940qEIytzJBrS2It3lF14BKf9ZuGDD02rQVoAuRiAk6E2yzUroBpv8UqNFcLqEaY0qaUC4jTgec4uNAIRZYULmlr9E8MI17gYnNZjZEIb1fyG0ALoodiDh4d5YItkiSmP8ZTKrc+QeK7u6UUg= ARC-Message-Signature: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1756958922; c=relaxed/simple; bh=bGzlAZvsc4fthEARNPfenMfbu8H0L9feoFQAmiy9+vI=; h=From:To:CC:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=I/B6Sd7HBJRVtzh/PPkuCWZoK5oSfzfvzK/tROaUG/617Wuf+qfEOAFZMroTve+DK3TC+Y3tM/Wq7CCiIlCjbpQw4aahCb3DLaiyjjGtWXfIn8N0fRpkqhv6dCdlyrj+dMlCjFUr5RA0ZMEe2w79Qa/6Q+dOWX6NVcLirsT0csY= ARC-Authentication-Results: i=2; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=nvidia.com; spf=fail smtp.mailfrom=nvidia.com; dkim=pass (2048-bit key) header.d=Nvidia.com header.i=@Nvidia.com header.b=NvwR261R; arc=fail smtp.client-ip=40.107.94.59 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=nvidia.com Authentication-Results: smtp.subspace.kernel.org; spf=fail smtp.mailfrom=nvidia.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=Nvidia.com header.i=@Nvidia.com header.b="NvwR261R" ARC-Seal: i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none; b=DqNy/HYJES4y42S1jxHEt05foTbFuQ4uBBFLHD+aGAnfoVjtty2ndzX/i/j/Bysw/EDl53jjLqP5BWdk5okOAkXq8+d1auYIyOjbSnupO8s1omkALGGwcry1t2CKw7w0KCsjyLJkxnp37U01chmuJmLo8VHp54LFfCmePCWZXKr4Pu2XMCzGlC/i+jkpjchROLde0IMyhIjoESbFWlyE8gHiph6yXWU8K1kWcymdbyxeDM1Cuez+YoRO+X6xy/q92x22cGECsVQuiQAb7Gmm9OftHvMhnA0+dEVHha1nZ/ZpZZK0EFIg8ClFSi1RMyy2o5Y/qSHYTzQ/K2xvAeDmXQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; 
h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=f3R5prkxlUWa+i6h6zI/FTbRbejLI9P8nFF/IddgsNw=; b=wyhzG3zmFegDz2NjFvWsSCSDw9bYaZgs1pHVb0NpYJn/8n0V5hXz2M/MsmCDzMKS5VlxYbPBHrYqL9iGmLZHiQqhkApO77BKSOLKIZ17vNzaCCvPUaqrmCIrqsAKtYgrxCFqbTGdmVPLjkQliMVnLYqWkrPaFzG2uVx+a0DmXhHRUjF19WYXS4nImqTp5OGuzcsyjHEDzu6yHoqJWchMjynsMKUz3DZsFflhM5pXQaUhEofP4wp7fqNVDbqTJ471ELNElmj9KmE8ruEvf9Q87Z3PU5B0RECbniJRzX/jc+jRiPUYgmym+UcpXQ+8Qcxug2S8bwoNnCJCO0IZ5IHdzA== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass (sender ip is 216.228.118.233) smtp.rcpttodomain=vger.kernel.org smtp.mailfrom=nvidia.com; dmarc=pass (p=reject sp=reject pct=100) action=none header.from=nvidia.com; dkim=none (message not signed); arc=none (0) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=Nvidia.com; s=selector2; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=f3R5prkxlUWa+i6h6zI/FTbRbejLI9P8nFF/IddgsNw=; b=NvwR261RBtl0aIT+2jstiiSzKGAh2s95vGjTaJ96dUumW5XMxhAyrEMKTCOBYDdPHgZD+R+cXjvGe7BpWBF96X8qWBitgiabFdRbYOedwpQ+k3wZI90OWx6cVkJEsCJeQpLTUwiDIzX5lnCqLQYgkIFhZ6irCqEhqb5073+DbJ0Ug85YVSyjaH2XrhFQNdZIPn8FCim8r/IPlihAS2lBrAem1eXzQNzPjp43OdU3JZJVF3wd310ydZhhvUvw0vf7MsIMrafIVANEum2/zd55VBM/F6rten/ryJKmRXoTztawekT2S+5U0G9d9zktWtUV3LBRHDGJWLVn8SdT6RrAKw== Received: from SJ0PR13CA0175.namprd13.prod.outlook.com (2603:10b6:a03:2c7::30) by DS0PR12MB6535.namprd12.prod.outlook.com (2603:10b6:8:c0::16) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.9052.25; Thu, 4 Sep 2025 04:08:37 +0000 Received: from SJ5PEPF000001D2.namprd05.prod.outlook.com (2603:10b6:a03:2c7:cafe::e9) by SJ0PR13CA0175.outlook.office365.com (2603:10b6:a03:2c7::30) with Microsoft SMTP Server (version=TLS1_3, cipher=TLS_AES_256_GCM_SHA384) id 15.20.9115.7 via Frontend Transport; Thu, 4 Sep 2025 04:08:37 +0000 
X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 216.228.118.233) smtp.mailfrom=nvidia.com; dkim=none (message not signed) header.d=none;dmarc=pass action=none header.from=nvidia.com; Received-SPF: Pass (protection.outlook.com: domain of nvidia.com designates 216.228.118.233 as permitted sender) receiver=protection.outlook.com; client-ip=216.228.118.233; helo=mail.nvidia.com; pr=C Received: from mail.nvidia.com (216.228.118.233) by SJ5PEPF000001D2.mail.protection.outlook.com (10.167.242.54) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.9094.14 via Frontend Transport; Thu, 4 Sep 2025 04:08:37 +0000 Received: from drhqmail202.nvidia.com (10.126.190.181) by mail.nvidia.com (10.127.129.6) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1544.14; Wed, 3 Sep 2025 21:08:31 -0700 Received: from drhqmail202.nvidia.com (10.126.190.181) by drhqmail202.nvidia.com (10.126.190.181) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1544.14; Wed, 3 Sep 2025 21:08:30 -0700 Received: from localhost.nvidia.com (10.127.8.12) by mail.nvidia.com (10.126.190.181) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1544.14 via Frontend Transport; Wed, 3 Sep 2025 21:08:30 -0700 From: To: , , , , , , , CC: , , , , , , , , , , , , , , Subject: [RFC 04/14] vfio/nvgrace-gpu: Introduce functions to fetch and save EGM info Date: Thu, 4 Sep 2025 04:08:18 +0000 Message-ID: <20250904040828.319452-5-ankita@nvidia.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20250904040828.319452-1-ankita@nvidia.com> References: <20250904040828.319452-1-ankita@nvidia.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-NV-OnPremToCloud: ExternallySecured X-EOPAttributedMessage: 0 
Content-Type: text/plain; charset="utf-8"

From: Ankit Agrawal

The nvgrace-gpu module tracks the various EGM regions on the system.
The EGM region information (base SPA and size) is part of the ACPI
tables and can be fetched from the DSD table using the GPU handle.

When a GPU is bound to the nvgrace-gpu module, the module fetches the
EGM region information from the ACPI table using the GPU's pci_dev. The
EGM regions are tracked in a list, and the per-region information is
maintained in struct nvgrace_egm_dev.

Signed-off-by: Ankit Agrawal
---
 drivers/vfio/pci/nvgrace-gpu/egm_dev.c | 24 +++++++++++++++++++++++-
 drivers/vfio/pci/nvgrace-gpu/egm_dev.h |  4 +++-
 drivers/vfio/pci/nvgrace-gpu/main.c    |  8 ++++++--
 include/linux/nvgrace-egm.h            |  2 ++
 4 files changed, 34 insertions(+), 4 deletions(-)

diff --git a/drivers/vfio/pci/nvgrace-gpu/egm_dev.c b/drivers/vfio/pci/nvgrace-gpu/egm_dev.c
index 28cfd29eda56..ca50bc1f67a0 100644
--- a/drivers/vfio/pci/nvgrace-gpu/egm_dev.c
+++ b/drivers/vfio/pci/nvgrace-gpu/egm_dev.c
@@ -17,6 +17,26 @@ int nvgrace_gpu_has_egm_property(struct pci_dev *pdev, u64 *pegmpxm)
 					pegmpxm);
 }
 
+int nvgrace_gpu_fetch_egm_property(struct pci_dev *pdev, u64 *pegmphys,
+				   u64 *pegmlength)
+{
+	int ret;
+
+	/*
+	 * The memory information is present in the system ACPI tables as DSD
+	 * properties nvidia,egm-base-pa and nvidia,egm-size.
+	 */
+	ret = device_property_read_u64(&pdev->dev, "nvidia,egm-size",
+				       pegmlength);
+	if (ret)
+		return ret;
+
+	ret = device_property_read_u64(&pdev->dev, "nvidia,egm-base-pa",
+				       pegmphys);
+
+	return ret;
+}
+
 int add_gpu(struct nvgrace_egm_dev *egm_dev, struct pci_dev *pdev)
 {
 	struct gpu_node *node;
@@ -54,7 +74,7 @@ static void nvgrace_gpu_release_aux_device(struct device *device)
 
 struct nvgrace_egm_dev *
 nvgrace_gpu_create_aux_device(struct pci_dev *pdev, const char *name,
-			      u64 egmpxm)
+			      u64 egmphys, u64 egmlength, u64 egmpxm)
 {
 	struct nvgrace_egm_dev *egm_dev;
 	int ret;
@@ -64,6 +84,8 @@ nvgrace_gpu_create_aux_device(struct pci_dev *pdev, const char *name,
 		goto create_err;
 
 	egm_dev->egmpxm = egmpxm;
+	egm_dev->egmphys = egmphys;
+	egm_dev->egmlength = egmlength;
 	INIT_LIST_HEAD(&egm_dev->gpus);
 
 	egm_dev->aux_dev.id = egmpxm;
diff --git a/drivers/vfio/pci/nvgrace-gpu/egm_dev.h b/drivers/vfio/pci/nvgrace-gpu/egm_dev.h
index 1635753c9e50..2e1612445898 100644
--- a/drivers/vfio/pci/nvgrace-gpu/egm_dev.h
+++ b/drivers/vfio/pci/nvgrace-gpu/egm_dev.h
@@ -16,6 +16,8 @@ void remove_gpu(struct nvgrace_egm_dev *egm_dev, struct pci_dev *pdev);
 
 struct nvgrace_egm_dev *
 nvgrace_gpu_create_aux_device(struct pci_dev *pdev, const char *name,
-			      u64 egmphys);
+			      u64 egmphys, u64 egmlength, u64 egmpxm);
 
+int nvgrace_gpu_fetch_egm_property(struct pci_dev *pdev, u64 *pegmphys,
+				   u64 *pegmlength);
 #endif /* EGM_DEV_H */
diff --git a/drivers/vfio/pci/nvgrace-gpu/main.c b/drivers/vfio/pci/nvgrace-gpu/main.c
index 436f0ac17332..7486a1b49275 100644
--- a/drivers/vfio/pci/nvgrace-gpu/main.c
+++ b/drivers/vfio/pci/nvgrace-gpu/main.c
@@ -67,7 +67,7 @@ static struct list_head egm_dev_list;
 static int nvgrace_gpu_create_egm_aux_device(struct pci_dev *pdev)
 {
 	struct nvgrace_egm_dev_entry *egm_entry = NULL;
-	u64 egmpxm;
+	u64 egmphys, egmlength, egmpxm;
 	int ret = 0;
 	bool is_new_region = false;
 
@@ -80,6 +80,10 @@ static int nvgrace_gpu_create_egm_aux_device(struct pci_dev *pdev)
 	if (nvgrace_gpu_has_egm_property(pdev, &egmpxm))
 		goto exit;
 
+	ret = nvgrace_gpu_fetch_egm_property(pdev, &egmphys, &egmlength);
+	if (ret)
+		goto exit;
+
 	list_for_each_entry(egm_entry, &egm_dev_list, list) {
 		/*
 		 * A system could have multiple GPUs associated with an
@@ -99,7 +103,7 @@ static int nvgrace_gpu_create_egm_aux_device(struct pci_dev *pdev)
 
 	egm_entry->egm_dev =
 		nvgrace_gpu_create_aux_device(pdev, NVGRACE_EGM_DEV_NAME,
-					      egmpxm);
+					      egmphys, egmlength, egmpxm);
 	if (!egm_entry->egm_dev) {
 		ret = -EINVAL;
 		goto free_egm_entry;
diff --git a/include/linux/nvgrace-egm.h b/include/linux/nvgrace-egm.h
index e42494a2b1a6..a66906753267 100644
--- a/include/linux/nvgrace-egm.h
+++ b/include/linux/nvgrace-egm.h
@@ -17,6 +17,8 @@ struct gpu_node {
 
 struct nvgrace_egm_dev {
 	struct auxiliary_device aux_dev;
+	phys_addr_t egmphys;
+	size_t egmlength;
 	u64 egmpxm;
 	struct list_head gpus;
 };
-- 
2.34.1

From nobody Fri Oct 3 06:37:01 2025
Received: from NAM12-MW2-obe.outbound.protection.outlook.com (mail-mw2nam12on2053.outbound.protection.outlook.com [40.107.244.53]) (using
TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 313DB23815C; Thu, 4 Sep 2025 04:08:43 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=fail smtp.client-ip=40.107.244.53 ARC-Seal: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1756958928; cv=fail; b=bFw36U3CXzYIuyYCeI0ajbeOnjmJj4KcWOTkh4NNO5WAxdaaLwscNJmMKZz/eCxeoSCAco3GQGtUmyZnh5jvRKvOM3O4jPOpipCszh/pjkqQFTbNAuaL4q/VY4w15boAc3C+u/b3LSscQ40bhdgcMbO6I7hKvVYHL6QiRQRAe6c= ARC-Message-Signature: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1756958928; c=relaxed/simple; bh=666Q32X7I9ZKmI6dIKXmgjwWwveBIi1IP3cFIRfuwfU=; h=From:To:CC:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=gHZKo6wb0o99uAIWqA7SWEt+D9oLHH3q1R8GI9V3ThhSS2ATu0dH/OLD6HEwjZtU1JhvgtgoPVyskfrNObD3OYs6mMwsPdPdAWXS5/YlLRjNXeakbqwqpw+ABQv4IOgNd65eS2sd5Q3SETPxbDJonqhqvTcogLyYAzCCPIwAh0M= ARC-Authentication-Results: i=2; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=nvidia.com; spf=fail smtp.mailfrom=nvidia.com; dkim=pass (2048-bit key) header.d=Nvidia.com header.i=@Nvidia.com header.b=kWTkzlQ2; arc=fail smtp.client-ip=40.107.244.53 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=nvidia.com Authentication-Results: smtp.subspace.kernel.org; spf=fail smtp.mailfrom=nvidia.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=Nvidia.com header.i=@Nvidia.com header.b="kWTkzlQ2" ARC-Seal: i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none; b=L9ydYLXHTAeUb59Phu2IQs5LGel2GQYD6hnBaGnuhD2YESMALEou9XUXrTypLXbXp0BzuzT+dxZ3sfDPEcqnEhjwxfvuX6Pzk+dr3skOknw1kmhf603TzL/I2kwR9BKQoKgfJkT9xTiKIF4oTs0lyhOMo0Kx/xAQOGCj5uuS23Rri5NXii8JRFPIpIkLjO9hxfxQ3D8zmRUzUH6U00zj7xaM24AvRje6mzQQv8OOuxRFtTihQGPUYwxaLDxPpn9IfN8Y+eM3FGK052DgHcwPfIlSZ+BmYpvxfk9s0zT6WNTwQlp7+n89itm0tPtfdpIKMwek4lMLIkNg7LI3pZuhDw== 
ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=PbolfrunMHeC6TYEJqZsnaqMg4ZrwH+CCYtdr2nFAsk=; b=yL/ZPupP7fxW24zY0WFDs/gUmhVU3c3IftW2TP2G/WwBO8awFewOAdOwlLCffUaWfaaNcLulJwMhzSht5xdy0XqJKQTGfO3L1hH6tm05y2ynzt8gpxMxS+xH6Xr1952fLva6yawxHbx3Djm9hDAeI5dTzLKkUutx0XzWVL2wf5cGRsLO5QDfHlEhXNvAWJjM5ac2mtWxGRqpflXQSaek02OMYIwxbZDnzFGBfW6IECOdGu2Ywqdqn8GyXQx5U5rzBPHBSzamgEDhoLcGUuNQcf4HjM3fbICKk8PvyIDokrbNJYK6IqY11kkreSreOpsnHMlJZsfrkmI/bb8w7tF+ow== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass (sender ip is 216.228.118.232) smtp.rcpttodomain=vger.kernel.org smtp.mailfrom=nvidia.com; dmarc=pass (p=reject sp=reject pct=100) action=none header.from=nvidia.com; dkim=none (message not signed); arc=none (0) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=Nvidia.com; s=selector2; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=PbolfrunMHeC6TYEJqZsnaqMg4ZrwH+CCYtdr2nFAsk=; b=kWTkzlQ2Y/PNsHcstzMbUX4N1qQ2UPThWS8NiPMt05oiP9hhg3SBuNxR4ixebkREGmXdVreTg5odxcMBoA56bZy2v+IQBa9Uh+wiuwG6b6Qi4sOi4pduzG78EC6ie+DgK8i4WvUXYC831w8hQFr6Gqk03sI7DhzghgjDdhbL3/ILlxOxIK9N1u5zFP2ffactK7qizBX3ZVLOEyJXz0Pynpl59rzezIStsCvMaNzTHHDbWwBRBJH6GadjsojhmgiVq+AKFMa7N9dUcovySa4mYhUrsmPkaI0wAl2hgtJtTKPPFlffqGZYnMs4ZlvSKTlyIpGi2fZ/HW3l84d+JnU8eQ== Received: from SA9PR13CA0145.namprd13.prod.outlook.com (2603:10b6:806:27::30) by MN2PR12MB4421.namprd12.prod.outlook.com (2603:10b6:208:26c::10) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.9094.16; Thu, 4 Sep 2025 04:08:39 +0000 Received: from SN1PEPF0002529D.namprd05.prod.outlook.com (2603:10b6:806:27:cafe::44) by SA9PR13CA0145.outlook.office365.com (2603:10b6:806:27::30) with Microsoft SMTP Server (version=TLS1_3, 
cipher=TLS_AES_256_GCM_SHA384) id 15.20.9094.12 via Frontend Transport; Thu, 4 Sep 2025 04:08:39 +0000 X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 216.228.118.232) smtp.mailfrom=nvidia.com; dkim=none (message not signed) header.d=none;dmarc=pass action=none header.from=nvidia.com; Received-SPF: Pass (protection.outlook.com: domain of nvidia.com designates 216.228.118.232 as permitted sender) receiver=protection.outlook.com; client-ip=216.228.118.232; helo=mail.nvidia.com; pr=C Received: from mail.nvidia.com (216.228.118.232) by SN1PEPF0002529D.mail.protection.outlook.com (10.167.242.4) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.9094.14 via Frontend Transport; Thu, 4 Sep 2025 04:08:39 +0000 Received: from drhqmail202.nvidia.com (10.126.190.181) by mail.nvidia.com (10.127.129.5) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1544.14; Wed, 3 Sep 2025 21:08:31 -0700 Received: from drhqmail202.nvidia.com (10.126.190.181) by drhqmail202.nvidia.com (10.126.190.181) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1544.14; Wed, 3 Sep 2025 21:08:31 -0700 Received: from localhost.nvidia.com (10.127.8.12) by mail.nvidia.com (10.126.190.181) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1544.14 via Frontend Transport; Wed, 3 Sep 2025 21:08:31 -0700 From: To: , , , , , , , CC: , , , , , , , , , , , , , , Subject: [RFC 05/14] vfio/nvgrace-egm: Introduce module to manage EGM Date: Thu, 4 Sep 2025 04:08:19 +0000 Message-ID: <20250904040828.319452-6-ankita@nvidia.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20250904040828.319452-1-ankita@nvidia.com> References: <20250904040828.319452-1-ankita@nvidia.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: 
quoted-printable
=?us-ascii?Q?fRjwgIjTbThSJEEsnC8Az/Dw3hjgWPPeqzLSSxqgCI8wUSg5XMwM3xlWGOsI?= =?us-ascii?Q?VCesa8YBcrci5YXDgbNCoQtB5ssCFcUHyOLzbhzGdcAoE5gls3tYlkt6T/fR?= =?us-ascii?Q?RjrMaqbLulZUfVKObhVG8SchebS57x1hz5gAjxXHIoqZucT1YhYE+j4gF7AN?= =?us-ascii?Q?xvEma585eggPivH8u94hWTrb6myMNuVV7FkMMpjrHC+jwq3TfygW3gfkRSrs?= =?us-ascii?Q?UOvVDSx3GqZYSlVUeKQwpsGzSBZbJ2w35e4AQ+5tYpQgA2Chg5f/wU6uovEj?= =?us-ascii?Q?x0IYx5ND81NJRsL1kVkLeJ2PzdxZSrZm5PA3?= X-Forefront-Antispam-Report: CIP:216.228.118.232;CTRY:US;LANG:en;SCL:1;SRV:;IPV:NLI;SFV:NSPM;H:mail.nvidia.com;PTR:dc7edge1.nvidia.com;CAT:NONE;SFS:(13230040)(36860700013)(376014)(82310400026)(1800799024);DIR:OUT;SFP:1101; X-OriginatorOrg: Nvidia.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 04 Sep 2025 04:08:39.3707 (UTC) X-MS-Exchange-CrossTenant-Network-Message-Id: 1254e421-24ef-4206-7b6e-08ddeb68ba87 X-MS-Exchange-CrossTenant-Id: 43083d15-7273-40c1-b7db-39efd9ccc17a X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=43083d15-7273-40c1-b7db-39efd9ccc17a;Ip=[216.228.118.232];Helo=[mail.nvidia.com] X-MS-Exchange-CrossTenant-AuthSource: SN1PEPF0002529D.namprd05.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Anonymous X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem X-MS-Exchange-Transport-CrossTenantHeadersStamped: MN2PR12MB4421 Content-Type: text/plain; charset="utf-8"

From: Ankit Agrawal

The Extended GPU Memory (EGM) feature enables the GPU to access system memory allocations within and across nodes through a high-bandwidth path on Grace-based systems. The GPU can use system memory located on the same socket, on a different socket, or even on a different node in a multi-node system [1]. When EGM mode is enabled through SBIOS, the host system memory is partitioned into two parts: a Hypervisor region for host OS usage, and a Hypervisor-Invisible (HI) region for the VM.
Only the Hypervisor region is part of the host EFI map and is thus visible to the host OS at boot. Since the entire VM sysmem is eligible for EGM allocations within the VM, the HI partition is referred to interchangeably as the EGM region in this series. The base SPA and size of this HI/EGM region are exposed through ACPI DSDT properties.

Although the EGM region is accessible on the host, it is not added to the kernel. The HI region is assigned to a VM by mapping the QEMU VMA to the SPA using remap_pfn_range().

The following figure shows the memory map in the virtualization environment.

|---- Sysmem ----|                  |--- GPU mem ---|  VM Memory
|                |                  |               |
|IPA <-> SPA map |                  |IPA <-> SPA map|
|                |                  |               |
|--- HI / EGM ---|-- Host Mem --|   |--- GPU mem ---|  Host Memory

Introduce a new nvgrace-egm auxiliary driver module to manage and map the HI/EGM region on Grace Blackwell systems. It binds to the auxiliary device created by the parent module, either nvgrace-gpu (the in-tree module for device assignment) or nvidia-vgpu-vfio (the out-of-tree open source module for SRIOV vGPU), to manage the EGM region for the VM. Note that there is a unique EGM region per socket, and an auxiliary device is created for every region. The parent module fetches the EGM region information from the ACPI tables and populates the data structures shared with the auxiliary nvgrace-egm module.

The nvgrace-egm module handles the following:
1. Fetch the EGM memory properties (base HPA, length, proximity domain) from the parent device's shared EGM region structure.
2. Create a char device that QEMU can use as a memory-backend-file for the VM, and implement its file operations. The char device is /dev/egmX, where X is the PXM node ID, fetched in step 1, of the EGM region being mapped.
3. Zero the EGM memory on first device open().
4. Map the QEMU VMA to the EGM region using remap_pfn_range().
5. Clean up state and destroy the chardev on device unbind.
6. Handle the presence of retired ECC pages in the EGM region.
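
For illustration only, here is a minimal userspace sketch of how a VMM might consume the char device described in steps 2 to 4. It is not part of this series. The helper names egm_dev_path() and egm_map() are hypothetical, and the sketch assumes the region is mapped from offset 0 with a shared read/write mapping; the actual QEMU integration goes through memory-backend-file.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: build "/dev/egm<pxm>" into buf.
 * Returns 0 on success, -1 if the name does not fit.
 */
static int egm_dev_path(char *buf, size_t len, unsigned int pxm)
{
	int n = snprintf(buf, len, "/dev/egm%u", pxm);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Hypothetical helper: map `size` bytes of the EGM region that belongs
 * to proximity domain `pxm`. Returns MAP_FAILED if the device node is
 * absent, cannot be opened, or the mapping is refused.
 */
static void *egm_map(unsigned int pxm, size_t size)
{
	char path[32];
	void *va;
	int fd;

	if (egm_dev_path(path, sizeof(path), pxm))
		return MAP_FAILED;

	fd = open(path, O_RDWR);
	if (fd < 0)
		return MAP_FAILED;

	/* Assumption: the region is exposed starting at file offset 0. */
	va = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	close(fd);	/* an established mapping stays valid after close() */
	return va;
}
```

A caller would check the result against MAP_FAILED and release the mapping with munmap(). On a machine without the driver loaded, egm_map() simply fails at open().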
Suggested-by: Jason Gunthorpe
Signed-off-by: Ankit Agrawal
---
 MAINTAINERS                           |  6 ++++++
 drivers/vfio/pci/nvgrace-gpu/Kconfig  | 11 +++++++++++
 drivers/vfio/pci/nvgrace-gpu/Makefile |  3 +++
 drivers/vfio/pci/nvgrace-gpu/egm.c    | 22 ++++++++++++++++++++++
 drivers/vfio/pci/nvgrace-gpu/main.c   |  1 +
 5 files changed, 43 insertions(+)
 create mode 100644 drivers/vfio/pci/nvgrace-gpu/egm.c

diff --git a/MAINTAINERS b/MAINTAINERS
index dd7df834b70b..ec6bc10f346d 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -26476,6 +26476,12 @@ F:	drivers/vfio/pci/nvgrace-gpu/egm_dev.h
 F:	drivers/vfio/pci/nvgrace-gpu/main.c
 F:	include/linux/nvgrace-egm.h
 
+VFIO NVIDIA GRACE EGM DRIVER
+M:	Ankit Agrawal
+L:	kvm@vger.kernel.org
+S:	Supported
+F:	drivers/vfio/pci/nvgrace-gpu/egm.c
+
 VFIO PCI DEVICE SPECIFIC DRIVERS
 R:	Jason Gunthorpe
 R:	Yishai Hadas
diff --git a/drivers/vfio/pci/nvgrace-gpu/Kconfig b/drivers/vfio/pci/nvgrace-gpu/Kconfig
index a7f624b37e41..d5773bbd22f5 100644
--- a/drivers/vfio/pci/nvgrace-gpu/Kconfig
+++ b/drivers/vfio/pci/nvgrace-gpu/Kconfig
@@ -1,8 +1,19 @@
 # SPDX-License-Identifier: GPL-2.0-only
+config NVGRACE_EGM
+	tristate "EGM driver for NVIDIA Grace Hopper and Blackwell Superchip"
+	depends on ARM64 || (COMPILE_TEST && 64BIT)
+	help
+	  Extended GPU Memory (EGM) support for the GPU in the NVIDIA Grace
+	  based chips required to avail the CPU memory as additional
+	  cross-node/cross-socket memory for GPU using KVM/qemu.
+
+	  If you don't know what to do here, say N.
+
 config NVGRACE_GPU_VFIO_PCI
 	tristate "VFIO support for the GPU in the NVIDIA Grace Hopper Superchip"
 	depends on ARM64 || (COMPILE_TEST && 64BIT)
 	select VFIO_PCI_CORE
+	select NVGRACE_EGM
 	help
 	  VFIO support for the GPU in the NVIDIA Grace Hopper Superchip is
 	  required to assign the GPU device to userspace using KVM/qemu/etc.
diff --git a/drivers/vfio/pci/nvgrace-gpu/Makefile b/drivers/vfio/pci/nvgrace-gpu/Makefile
index e72cc6739ef8..d0d191be56b9 100644
--- a/drivers/vfio/pci/nvgrace-gpu/Makefile
+++ b/drivers/vfio/pci/nvgrace-gpu/Makefile
@@ -1,3 +1,6 @@
 # SPDX-License-Identifier: GPL-2.0-only
 obj-$(CONFIG_NVGRACE_GPU_VFIO_PCI) += nvgrace-gpu-vfio-pci.o
 nvgrace-gpu-vfio-pci-y := main.o egm_dev.o
+
+obj-$(CONFIG_NVGRACE_EGM) += nvgrace-egm.o
+nvgrace-egm-y := egm.o
diff --git a/drivers/vfio/pci/nvgrace-gpu/egm.c b/drivers/vfio/pci/nvgrace-gpu/egm.c
new file mode 100644
index 000000000000..999808807019
--- /dev/null
+++ b/drivers/vfio/pci/nvgrace-gpu/egm.c
@@ -0,0 +1,22 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (c) 2025, NVIDIA CORPORATION & AFFILIATES. All rights reserved
+ */
+
+#include 
+
+static int __init nvgrace_egm_init(void)
+{
+	return 0;
+}
+
+static void __exit nvgrace_egm_cleanup(void)
+{
+}
+
+module_init(nvgrace_egm_init);
+module_exit(nvgrace_egm_cleanup);
+
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Ankit Agrawal ");
+MODULE_DESCRIPTION("NVGRACE EGM - Module to support Extended GPU Memory on NVIDIA Grace Based systems");
diff --git a/drivers/vfio/pci/nvgrace-gpu/main.c b/drivers/vfio/pci/nvgrace-gpu/main.c
index 7486a1b49275..b1ccd1ac2e0a 100644
--- a/drivers/vfio/pci/nvgrace-gpu/main.c
+++ b/drivers/vfio/pci/nvgrace-gpu/main.c
@@ -1125,3 +1125,4 @@ MODULE_LICENSE("GPL");
 MODULE_AUTHOR("Ankit Agrawal ");
 MODULE_AUTHOR("Aniket Agashe ");
 MODULE_DESCRIPTION("VFIO NVGRACE GPU PF - User Level driver for NVIDIA devices with CPU coherently accessible device memory");
+MODULE_SOFTDEP("pre: nvgrace-egm");
-- 
2.34.1

From nobody Fri Oct 3 06:37:01 2025 Received: from NAM12-BN8-obe.outbound.protection.outlook.com (mail-bn8nam12on2074.outbound.protection.outlook.com [40.107.237.74]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id
9AD7F23B63E; Thu, 4 Sep 2025 04:08:42 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=fail smtp.client-ip=40.107.237.74 ARC-Seal: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1756958924; cv=fail; b=Sff7I8Y7z6nxX1BPcMJ593UdC/XaghHe+mJAbtdXijT9/ZJjn8VBXSMcGNTEBeWBdB+vIa7gMGsdQrc6Rej2F4LQe03NEi9aYKRfwmQxfTx8bNSbEVIuYAo6kwlk8nfplWJdpR7EOZFo/Gf4mleVSg53qgPrd7QkdBjAhjPcOKo= ARC-Message-Signature: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1756958924; c=relaxed/simple; bh=t/x3PIIH49dIA2r4AkyvHyyrIpBnkaK3FkdB7qGrKyo=; h=From:To:CC:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=du95RYJ52mXYk0xcQJKdJidc58q5I1c5PHGcnGJol/Kh4cl9grbOtO4VwJbOeH56A26jRihWJyjZC1ct1PzzVcjGxYhta0kB4gMWKfIqr7FQVmbf1lc/dZ3uSBPld36n7Hb50Z9y75BHSse6A5mllV7ZOXLr+2xjrUpB62UF3NU= ARC-Authentication-Results: i=2; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=nvidia.com; spf=fail smtp.mailfrom=nvidia.com; dkim=pass (2048-bit key) header.d=Nvidia.com header.i=@Nvidia.com header.b=gNu5uFAJ; arc=fail smtp.client-ip=40.107.237.74 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=nvidia.com Authentication-Results: smtp.subspace.kernel.org; spf=fail smtp.mailfrom=nvidia.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=Nvidia.com header.i=@Nvidia.com header.b="gNu5uFAJ" ARC-Seal: i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none; b=dCrUYxmEEzkBPvrQLA6k9DtYkK+OUrZv7lrXvgofYDDBSFjRVU5RnoJY0BVyIqz0i/RYEnMl+VwfIMDvaezoRfIRnqCwnaE9pVW105MXrCzL9M1vBYh0YSmoRcvHHU0zsxwTjcXETcLcr2WWkRrKWwlonOrxPiyPFglzdxGHxm1kC07nkNydFHtNV4i9t/DusEVPwNBNNkwNb89VJItwLRcoWgoTtV62VXv9qOolb84qzofi0oGcUeN3qmN6pvOLlLxFFhaB8f5cnVRBUuXQk99HdaI2Vs7OblRw64L4oZabzGoBk3Qdysm4/0/xhipwtc1ZuuqY4d60q9Po5jhA3w== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; 
h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=MzJ+e4pr6eIhdtuH+3dJ2Ii5jpfG5RP1f6Yx5bd3C50=; b=e894c2QyiNA2m3iTu+EJYvWsf/1J0aJ4NJmrfoRjw2TJdFd0CQPpM8OfCt1lPOTV+8BvYlOMI0+IVRDM+MrnLszfIsgthvjPSUacfVmDnhHhQOBkKW2cWOW4XVGckOEzeYu30ReijwXwl5vmHhNrOpELLAskndVROwg83lkOnwpDPp0HBXsOQueKFPibPnxSmO0Jyap30UJzEaMI6w8dNPmn6uF139lmCtDdcDQTBbd5AflvwKkDJx3Df0GVw9tZkE+VwSxqlky6sQzAnYcaOU2/iw9tTUwU7q3lC9xl5owksI5DkqA8HL0Z9MBSDNua0wsD71J4sMzSsFw0BYRtfg== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass (sender ip is 216.228.118.233) smtp.rcpttodomain=vger.kernel.org smtp.mailfrom=nvidia.com; dmarc=pass (p=reject sp=reject pct=100) action=none header.from=nvidia.com; dkim=none (message not signed); arc=none (0) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=Nvidia.com; s=selector2; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=MzJ+e4pr6eIhdtuH+3dJ2Ii5jpfG5RP1f6Yx5bd3C50=; b=gNu5uFAJUOxjrkqagF3Zj0mRJsgQ7t++55LU5Hy+sPkTRNLHJ80Z4GOdUVn32TaWaZMd7boED/tqM2DhzETjAykiCUeXjVQFa/XJqwgCQM/7MUCz8+qEhIG4s7cwf0xRIWaT4KFwJmvqGuBUOnJFoM7CmoDzKffaEJpFlrwfn0yYSDte7q3ioGj54ocPF+7+2ao8VNtVuvAP72T+SUFYUtbtAWXYxzIiHwj8/nN5fl/ai6S1nAadsmECm3T/7YRQEj/nuHi1gI9xtPEBfpdR7Pusgv6ReBfcv1yoETkqwnT7ubbx+gsmxwlMmGrf5tWwS/0FmOO1ZU9SWUPW3PgXWw== Received: from SJ0PR13CA0175.namprd13.prod.outlook.com (2603:10b6:a03:2c7::30) by DS0PR12MB8477.namprd12.prod.outlook.com (2603:10b6:8:15b::18) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.9094.18; Thu, 4 Sep 2025 04:08:38 +0000 Received: from SJ5PEPF000001D2.namprd05.prod.outlook.com (2603:10b6:a03:2c7:cafe::c8) by SJ0PR13CA0175.outlook.office365.com (2603:10b6:a03:2c7::30) with Microsoft SMTP Server (version=TLS1_3, cipher=TLS_AES_256_GCM_SHA384) id 15.20.9115.7 via Frontend Transport; Thu, 4 Sep 2025 04:08:38 +0000 
X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 216.228.118.233) smtp.mailfrom=nvidia.com; dkim=none (message not signed) header.d=none;dmarc=pass action=none header.from=nvidia.com; Received-SPF: Pass (protection.outlook.com: domain of nvidia.com designates 216.228.118.233 as permitted sender) receiver=protection.outlook.com; client-ip=216.228.118.233; helo=mail.nvidia.com; pr=C Received: from mail.nvidia.com (216.228.118.233) by SJ5PEPF000001D2.mail.protection.outlook.com (10.167.242.54) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.9094.14 via Frontend Transport; Thu, 4 Sep 2025 04:08:38 +0000 Received: from drhqmail202.nvidia.com (10.126.190.181) by mail.nvidia.com (10.127.129.6) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1544.14; Wed, 3 Sep 2025 21:08:31 -0700 Received: from drhqmail202.nvidia.com (10.126.190.181) by drhqmail202.nvidia.com (10.126.190.181) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1544.14; Wed, 3 Sep 2025 21:08:31 -0700 Received: from localhost.nvidia.com (10.127.8.12) by mail.nvidia.com (10.126.190.181) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1544.14 via Frontend Transport; Wed, 3 Sep 2025 21:08:31 -0700 From: To: , , , , , , , CC: , , , , , , , , , , , , , , Subject: [RFC 06/14] vfio/nvgrace-egm: Introduce egm class and register char device numbers Date: Thu, 4 Sep 2025 04:08:20 +0000 Message-ID: <20250904040828.319452-7-ankita@nvidia.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20250904040828.319452-1-ankita@nvidia.com> References: <20250904040828.319452-1-ankita@nvidia.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-NV-OnPremToCloud: ExternallySecured X-EOPAttributedMessage: 0 
X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: SJ5PEPF000001D2:EE_|DS0PR12MB8477:EE_ X-MS-Office365-Filtering-Correlation-Id: b0696b30-f381-4c83-1f0c-08ddeb68b9c5 X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0;ARA:13230040|1800799024|82310400026|376014|36860700013; X-Microsoft-Antispam-Message-Info: =?us-ascii?Q?N47in0NsYCmiq33J9wNHeYK7qZsbwZECWHTI2ZUaQ4kSPtBA2RiIlOwOaase?= =?us-ascii?Q?utJFqCDwXfCNvIrqGWKv6NVpohV/9vl6vUIh/E0d49M543gwz3zbRo38G0U0?= =?us-ascii?Q?9n4nvKJV933oPan86xlS4QXPfssBCt7er32rRuFUPMrXtDwRE6oQfJGeJYIj?= =?us-ascii?Q?UkAx5QCgCcp0QOQQmQh3Lt0L1pEJVi1YwycAC0Cclvo2R/WBuRXhb0cKVPFL?= =?us-ascii?Q?e7m9mY/HUcdJgNl92+3Ee6RE1puQw+8mtBYqnWHd2VEqD/CDeO0Bd7rEogtH?= =?us-ascii?Q?C+kdkPye0isb8Gx0WXYvaI46EJNecIpI86iWlWKf9MygFGe1cpXqJjzHauN2?= =?us-ascii?Q?LCiIn3Q+K/lGu9QB3cfnl6PFCsqtjCJcH7Em9lkh7GeJQtXGvsMH1ib081cq?= =?us-ascii?Q?Gp21+sOwETlG74Q81kBZkQaQnEjsSOszkSdw9f0pItqZazjy9ooB4vdMG2z6?= =?us-ascii?Q?ZBasAX4pk1NGoWC/A9f6EWWumRpLmjZGlN16jZVI9Il+GKiuGcb+a669xQHt?= =?us-ascii?Q?r/C/A5JJrkVoTHAO9VEU0kAjo6wH/i+iCjcYz7a3MCbkNs92qB9hf4CV2nka?= =?us-ascii?Q?pMaGQQVFLh3/eXPHSrSvNZyV4LUanp3lxM0UzKMr/TRVRPn3szCBVLV48MAk?= =?us-ascii?Q?0msNCV8S0W2Amur96SrbQZDkPPQF1JLTq+FkT2HW94RF1yOMFeqAPJnM1s7y?= =?us-ascii?Q?bOAFnjixbO6UTgL3wxqmPjprFtu/PziA7RIdaO866y4Z6x60Mg7FsQEbtP6q?= =?us-ascii?Q?T/0FuUKsauOzXjuuQ+XlKr1X33dqF/6sCVls6+jw1xqogFLvMFmA11MD8NP3?= =?us-ascii?Q?SK0odpYxZMmX01uD5Uds8ZmWS+E4XbSM9yX+AzzouAojkyQKErkNOYamxhbe?= =?us-ascii?Q?4RMVVB2HGloxqlNtzo9fu/4j/5Fmv3tORwEUa0a6VNQ3AuNigs7sa2KBFlhq?= =?us-ascii?Q?lb9Ypb4Pa+j4TIWDafmWWFyuMqnXA5TKs488CTzuFBjEO+Oy1UzYZI7t29TH?= =?us-ascii?Q?G2sowUObw7P8GI90WbSnMHklJnZ3AyRILeRS1R8qKut65ZsJZmoKfJhgsRIs?= =?us-ascii?Q?k1XUW52wHVVIoDUupLO9QEhq0Elv6Gw7yMcG00foYaOlYOXujDRlw7WNANbe?= =?us-ascii?Q?HFnIOcLFq5ZUodTl3mfFcBvtaGTCXjlPAEuC4tB1orKX23AZKFqPwcT2FzwJ?= =?us-ascii?Q?KO8t19EYmvYMvoDCyX/Vc1Ngo8gcr7xcGNlA/+fExuBRxdoCjVpdntvMlkV9?= 
=?us-ascii?Q?k11tmZy7omLWASqLn4qnxfHJVtVLABcoAM0X0NXQNtxqvPDCQvcRbWHDifeF?= =?us-ascii?Q?32QGjWFolxxiKCxJMjsmAYk8OEsiNCuGY874lOOxMjErscWrA80OxMg3c6d7?= =?us-ascii?Q?/+vnBRaqP/M1+stlx+/E9ekF+E6zoUazzBpDGbcfeNARA2lQh/k/3giQMAM/?= =?us-ascii?Q?iF/9gkf+R/3Wcx+cz6Y4Cw10X4+JYPo8A0EhFSfJGGKdQz6w8x3hnN0alrOa?= =?us-ascii?Q?2EXJXXKqDA3s706p+quBC0L9jc68xi6ADruk?= X-Forefront-Antispam-Report: CIP:216.228.118.233;CTRY:US;LANG:en;SCL:1;SRV:;IPV:NLI;SFV:NSPM;H:mail.nvidia.com;PTR:dc7edge2.nvidia.com;CAT:NONE;SFS:(13230040)(1800799024)(82310400026)(376014)(36860700013);DIR:OUT;SFP:1101; X-OriginatorOrg: Nvidia.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 04 Sep 2025 04:08:38.2101 (UTC) X-MS-Exchange-CrossTenant-Network-Message-Id: b0696b30-f381-4c83-1f0c-08ddeb68b9c5 X-MS-Exchange-CrossTenant-Id: 43083d15-7273-40c1-b7db-39efd9ccc17a X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=43083d15-7273-40c1-b7db-39efd9ccc17a;Ip=[216.228.118.233];Helo=[mail.nvidia.com] X-MS-Exchange-CrossTenant-AuthSource: SJ5PEPF000001D2.namprd05.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Anonymous X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem X-MS-Exchange-Transport-CrossTenantHeadersStamped: DS0PR12MB8477 Content-Type: text/plain; charset="utf-8"

From: Ankit Agrawal

The EGM regions are exposed to userspace as char devices. Each EGM region belongs to a different Grace socket and is assigned its own char device with a distinct minor number.

Add a new egm class and register a range of char device numbers for these devices. Set MAX_EGM_NODES to 4, since the 4-socket configuration is the largest on Grace-based systems.
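
As a rough illustration of the numbering scheme described above, the userspace sketch below derives a per-region device number from the first number of a reserved range. It uses the glibc makedev()/major()/minor() macros as stand-ins for the kernel's MKDEV()/MAJOR()/MINOR(). The helper name egm_devt() is hypothetical, and using the region index directly as the minor offset is an assumption, not something this patch states.

```c
#include <sys/sysmacros.h>
#include <sys/types.h>

#define MAX_EGM_NODES 4	/* same limit as in the patch */

/*
 * Hypothetical helper: return the device number for EGM region `idx`,
 * given the first number of the reserved range. Returns 0 when idx is
 * outside the range; a dynamically allocated char major is never 0,
 * so 0 is safe as an "invalid" marker here.
 */
static dev_t egm_devt(dev_t base, unsigned int idx)
{
	if (idx >= MAX_EGM_NODES)
		return 0;

	/* Assumption: region index maps 1:1 onto the minor offset. */
	return makedev(major(base), minor(base) + idx);
}
```

With a base of major 240 and minor 0, the four regions would get minors 0 through 3 under this assumption, and any larger index is rejected.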
Suggested-by: Aniket Agashe
Signed-off-by: Ankit Agrawal
---
 drivers/vfio/pci/nvgrace-gpu/egm.c | 36 ++++++++++++++++++++++++++++++
 1 file changed, 36 insertions(+)

diff --git a/drivers/vfio/pci/nvgrace-gpu/egm.c b/drivers/vfio/pci/nvgrace-gpu/egm.c
index 999808807019..6bab4d94cb99 100644
--- a/drivers/vfio/pci/nvgrace-gpu/egm.c
+++ b/drivers/vfio/pci/nvgrace-gpu/egm.c
@@ -4,14 +4,50 @@
  */
 
 #include 
+#include 
+
+#define MAX_EGM_NODES 4
+
+static dev_t dev;
+static struct class *class;
+
+static char *egm_devnode(const struct device *device, umode_t *mode)
+{
+	if (mode)
+		*mode = 0600;
+
+	return NULL;
+}
 
 static int __init nvgrace_egm_init(void)
 {
+	int ret;
+
+	/*
+	 * Each EGM region on a system is represented with a unique
+	 * char device with a different minor number. Allow a range
+	 * of char device creation.
+	 */
+	ret = alloc_chrdev_region(&dev, 0, MAX_EGM_NODES,
+				  NVGRACE_EGM_DEV_NAME);
+	if (ret < 0)
+		return ret;
+
+	class = class_create(NVGRACE_EGM_DEV_NAME);
+	if (IS_ERR(class)) {
+		unregister_chrdev_region(dev, MAX_EGM_NODES);
+		return PTR_ERR(class);
+	}
+
+	class->devnode = egm_devnode;
+
 	return 0;
 }
 
 static void __exit nvgrace_egm_cleanup(void)
 {
+	class_destroy(class);
+	unregister_chrdev_region(dev, MAX_EGM_NODES);
 }
 
 module_init(nvgrace_egm_init);
-- 
2.34.1

From nobody Fri Oct 3 06:37:01 2025 Received: from NAM11-CO1-obe.outbound.protection.outlook.com (mail-co1nam11on2069.outbound.protection.outlook.com [40.107.220.69]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 31C3723D7DA; Thu, 4 Sep 2025 04:08:44 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=fail smtp.client-ip=40.107.220.69 ARC-Seal: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1756958928; cv=fail;
b=XGCjyapRsUJLaidM8wHQ+FvwXzx+qEQpKvry4AwNFPw/z6Z1AJrl+0PK2nrf9D3tPZPwcSLY+XC+DtGJWmh7fKN3YrsXBwJ7yKjIpxN38JxAO8wz1vf0qHVtyChzZjSnCyZDU6O9/RCxLVCOV2BEYvOT5XKESUBXZ99WyH33nXw= ARC-Message-Signature: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1756958928; c=relaxed/simple; bh=QXlZETqdmAvmv2iWMPHfnuYBP49NTuNBRExgtcSimrA=; h=From:To:CC:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=pcwBYEIAWBqTQE9bZUfwwEuZYp1XADIOI704SvLNR9hnTHr38iQiSAsF5bdkXrFaBm0zgjTRcGjUdLtdCuCfosglw4RO5aW33M8RHX+iYF0tLa0eczLnW9nyZujFv0ouv6VgGmA5xeGHIRotGDXf82xXpPmR0RfMdCs0TR9zYmc= ARC-Authentication-Results: i=2; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=nvidia.com; spf=fail smtp.mailfrom=nvidia.com; dkim=pass (2048-bit key) header.d=Nvidia.com header.i=@Nvidia.com header.b=hhhpFyJK; arc=fail smtp.client-ip=40.107.220.69 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=nvidia.com Authentication-Results: smtp.subspace.kernel.org; spf=fail smtp.mailfrom=nvidia.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=Nvidia.com header.i=@Nvidia.com header.b="hhhpFyJK" ARC-Seal: i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none; b=lTBkQYqYI6U3CXV7lQFuOyIU0hJlllQR/0tlUGWFQYAwbe0fSrA0vmJxsKR2GsZC+SXnXg+JYFnJTu4lHPWXuit6BUGqsrh2KpTMvB9krbGLS8x2Etng6+IHrqblvuTbA+RV1cARlP3oQTm4Jde0Jl2+mbxYnwePYvKFFRfybefcxhkn+FGiq8uEJu02jbwRxpRrmncEhwNvO5sD8uneKOMYOfNNtparGhatfDRTXTdSMdDVB/cywxZfkHJExNlCtU5kCXwnTee+OtpFO36cC2wMzgAewil6BWmYTu5c+tE5ZK4oB4Y31imH2Z3F3VWMYZMPW+gaj0AF9ht/Cxn+uw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=QRbQJdpskjsg4IWQzFquwJE/UP4PjjXeSD376SvXuXU=; 
b=mA2whtfN1l5uQc/V7LqGhDDUbKtB72OrOmO3gpxA3iDUXLai8Zb+YkMNwfBCA+Zst2yYeTjY7CRs/QlZfaExuDbxIQqmbZbEeIskNic6ccitXen9c9eSVM22+JgrOuUaTC059X8c2soBKLv7BubgKTE4826nSY+bGAgPqO64gR1KjuHpWapiC/NKlSa7efty/YtP5wh1ElzOwgKx/6+bYvql4DUbiHmRNUU/8yJF5mwqvDeMU7fL/0jtIeNU/zC86d+HaI7xpQiDKpJGtpZuWYAH+77XX6vqZu/yhoB0wSalaSMVPLWogNvnlLXVsHay7M/iGBEcZQf7L7+qJjncvw== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass (sender ip is 216.228.118.232) smtp.rcpttodomain=vger.kernel.org smtp.mailfrom=nvidia.com; dmarc=pass (p=reject sp=reject pct=100) action=none header.from=nvidia.com; dkim=none (message not signed); arc=none (0) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=Nvidia.com; s=selector2; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=QRbQJdpskjsg4IWQzFquwJE/UP4PjjXeSD376SvXuXU=; b=hhhpFyJKklHB4kSJWUfwC/ELNJBpddn9tIGF3CnzAhRh20JN59AlTm85knnSpC6Vim90OJ3DHTfAJAbxmOHoeq/r1WjohUXwOVjptZpYPir8pIsFwXGG2wtAqqIJ7CLSyiDb6AF+A5/OTp/CjUm1xtl77DXzUwY3lpUrxfyqBLpNgrDifWWE1dfDaYs9uddxiQYzGg4auYIvYXK6F/SHeNoXokDnSxk/ZSIBLwW5LaFadg5WVByrNxbTWmzJMfDqKhG9x6zMhMwjv6m+7ZqGLJoCm2JfuPzCHh1g+3wg19Xj25In6pLXWS9YttWlA1ONvxbv9pisDyuuUyeI0r66vg== Received: from SA9PR13CA0130.namprd13.prod.outlook.com (2603:10b6:806:27::15) by MW4PR12MB7030.namprd12.prod.outlook.com (2603:10b6:303:20a::8) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.9094.18; Thu, 4 Sep 2025 04:08:41 +0000 Received: from SN1PEPF0002529D.namprd05.prod.outlook.com (2603:10b6:806:27:cafe::31) by SA9PR13CA0130.outlook.office365.com (2603:10b6:806:27::15) with Microsoft SMTP Server (version=TLS1_3, cipher=TLS_AES_256_GCM_SHA384) id 15.20.9094.16 via Frontend Transport; Thu, 4 Sep 2025 04:08:40 +0000 X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 216.228.118.232) smtp.mailfrom=nvidia.com; dkim=none (message not signed) header.d=none;dmarc=pass action=none header.from=nvidia.com; Received-SPF: Pass 
(protection.outlook.com: domain of nvidia.com designates 216.228.118.232 as permitted sender) receiver=protection.outlook.com; client-ip=216.228.118.232; helo=mail.nvidia.com; pr=C Received: from mail.nvidia.com (216.228.118.232) by SN1PEPF0002529D.mail.protection.outlook.com (10.167.242.4) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.9094.14 via Frontend Transport; Thu, 4 Sep 2025 04:08:40 +0000 Received: from drhqmail202.nvidia.com (10.126.190.181) by mail.nvidia.com (10.127.129.5) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1544.14; Wed, 3 Sep 2025 21:08:32 -0700 Received: from drhqmail202.nvidia.com (10.126.190.181) by drhqmail202.nvidia.com (10.126.190.181) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1544.14; Wed, 3 Sep 2025 21:08:31 -0700 Received: from localhost.nvidia.com (10.127.8.12) by mail.nvidia.com (10.126.190.181) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1544.14 via Frontend Transport; Wed, 3 Sep 2025 21:08:31 -0700 From: To: , , , , , , , CC: , , , , , , , , , , , , , , Subject: [RFC 07/14] vfio/nvgrace-egm: Register auxiliary driver ops Date: Thu, 4 Sep 2025 04:08:21 +0000 Message-ID: <20250904040828.319452-8-ankita@nvidia.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20250904040828.319452-1-ankita@nvidia.com> References: <20250904040828.319452-1-ankita@nvidia.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-NV-OnPremToCloud: ExternallySecured X-EOPAttributedMessage: 0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: SN1PEPF0002529D:EE_|MW4PR12MB7030:EE_ X-MS-Office365-Filtering-Correlation-Id: 881c359f-2940-4957-5a2c-08ddeb68bb49 X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 
X-Microsoft-Antispam: BCL:0;ARA:13230040|36860700013|82310400026|376014|1800799024; X-Microsoft-Antispam-Message-Info: =?us-ascii?Q?IrlW6iJ8cyUQK0d1TifnTG1VDoHw382s5mIh/VGnlGoQwbu42lW08FzNGu+D?= =?us-ascii?Q?546VVvXno7hmBu4ayTqY2pz0AHKSM/WwE+ys0YL7RN/bYDmZJi05wM38uuRX?= =?us-ascii?Q?qlavQQ04iv5D2ESZQf6ysZ9/xYMyOPX6RxvM6XDNR06Opj/3x7ZoKmOg1/1e?= =?us-ascii?Q?pvnJy50Mx2p/VqMjp3VTiNZ0ZBAa6VNbnXrJE4u7KQt7tkOOpeSl10Vr+cew?= =?us-ascii?Q?4IkSAhwM6G0SkmNMoIloYmjH1pvG7VcqySW133IaD6BQ1ckPTyRan+nlNgTt?= =?us-ascii?Q?nqaSfR3CyglgzB7f1FCwhdc5wffrZzWggunry3z19AzpOUNA6/46pQs+Tin1?= =?us-ascii?Q?j3o2vX7dKb9QV8gKFFrqLT72eBnSAxJBbMdYg2EKMPUAcoxrR6M94FA5hM2S?= =?us-ascii?Q?eOrBEoFZjBAF6GBNtNCjDVeSqV94tEd6QUVWYKd97DZhIOG0MwVb0MtQMyXV?= =?us-ascii?Q?Y+JH0xoD9a2GHgU8FsH/fNNmuLwP08cM3iY6wZOlR26Ql9+WhabCsyq20FOo?= =?us-ascii?Q?Qg4IFLENIPOkBKU8EG3YJ47wNJqE5gLERMz6pAAXU5Cdka0yG++nj5EfEXqo?= =?us-ascii?Q?rlYKHch68877wOg4KgJ8sX6t7lBogwWpzerlDbhfd72+wCKJuGOS57CfXiC1?= =?us-ascii?Q?E4UHS2HufzVqaV1196dpKF/B9h7RbiImI8pR9cX2Y9+x2JejU8d9FBXqjjkK?= =?us-ascii?Q?iwS96H8YZcgxh8mgmhKDWPqrufwmoDiW+KFyZk5HLJn3p86uor91mzUenrcH?= =?us-ascii?Q?i3/HxAjEn4DNFOsp+VWAC+D+ftTkMQVXYrUWPseiAZXYlpoa6ZqhcgbKcE6e?= =?us-ascii?Q?CZBMDMOy3zzjcm932md9v7Ki51D0V8zUkvPPaodrLdeW36GNzNYqkGlaKQOX?= =?us-ascii?Q?U1oB7A4BeUkks7dmYrtEpnxSSPB90X7HCy9qcbvacVktkmTB5/frV6t3n7/E?= =?us-ascii?Q?EgKrLxiyr8NiGSqkcnWzRa/Sluj72sTVEDnt+eWHu0b7COl1L2oRWrLPtv45?= =?us-ascii?Q?ifJPiSRaCxgjvKA2eUF0A2bmEpmlViZeYFwTCRj15qpsMHOHgwLOM91Mo9+l?= =?us-ascii?Q?iDZLxjt/++GYueONjMsq+Nkr3F0w+UwlNP1wWBnlMbfyiBSq4qnleA7/iwAO?= =?us-ascii?Q?JUUUV5JynDJx0dVBSNKqQRXJ1sve3Lz0/zzdv2jRtAAavirl9pgPSgKCKd8n?= =?us-ascii?Q?8pQOlnVdNPRjXl04Ll4WVH6gzp/Zesv4RcAq1h/frxGOqN8KTGYLGIqyVzbf?= =?us-ascii?Q?xoj2nbm9hIzZAYkaKWqg7DACr3LZjD25v/J3DAW90Kw3O4mw2Yej03Z6y/0K?= =?us-ascii?Q?fEW+/kp+AG/ghQ7FVEgRfq6V0uuTSizdRRMH1HCa5uMKNm9bVBJvYIzMXinx?= =?us-ascii?Q?zrU2XB68WJplaVtQUCUk4s31B8R3uW4pernMvZw7u6MBT7Xcb1102hiDzqHM?= 
=?us-ascii?Q?VeMnbGj+tzsGHvx59YFN2kTFFltPcLYpZXctEOQO9FFdq4nnhvjFbXt94i+V?= =?us-ascii?Q?18kLPNgqMneCsjqaAht2wGpFwglgg4hjbanw?= X-Forefront-Antispam-Report: CIP:216.228.118.232;CTRY:US;LANG:en;SCL:1;SRV:;IPV:NLI;SFV:NSPM;H:mail.nvidia.com;PTR:dc7edge1.nvidia.com;CAT:NONE;SFS:(13230040)(36860700013)(82310400026)(376014)(1800799024);DIR:OUT;SFP:1101; X-OriginatorOrg: Nvidia.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 04 Sep 2025 04:08:40.6323 (UTC) X-MS-Exchange-CrossTenant-Network-Message-Id: 881c359f-2940-4957-5a2c-08ddeb68bb49 X-MS-Exchange-CrossTenant-Id: 43083d15-7273-40c1-b7db-39efd9ccc17a X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=43083d15-7273-40c1-b7db-39efd9ccc17a;Ip=[216.228.118.232];Helo=[mail.nvidia.com] X-MS-Exchange-CrossTenant-AuthSource: SN1PEPF0002529D.namprd05.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Anonymous X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem X-MS-Exchange-Transport-CrossTenantHeadersStamped: MW4PR12MB7030 Content-Type: text/plain; charset="utf-8"

From: Ankit Agrawal

Set up dummy auxiliary driver ops so that the EGM auxiliary devices can be probed by the nvgrace-egm auxiliary driver.

Both nvgrace-gpu and the out-of-tree nvidia-vgpu-vfio will make use of EGM, for device assignment and for the SRIOV vGPU virtualization solution respectively. Hence allow auxiliary device probing for both.
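
To show why two ID strings let both parent modules bind, here is a small userspace sketch of auxiliary-bus style name matching. It follows my understanding of how the bus compares names: an auxiliary device is named "<module>.<device>.<id>", and the part before the last '.' is compared against each ID table entry. The function egm_match() and the array egm_ids are hypothetical stand-ins, not kernel API; the two strings are the ones this patch places in egm_id_table.

```c
#include <stddef.h>
#include <string.h>

/* ID strings copied from egm_id_table in this patch. */
static const char *const egm_ids[] = {
	"nvgrace_gpu_vfio_pci.egm",
	"nvidia_vgpu_vfio.egm",
	NULL,
};

/*
 * Hypothetical helper: return the table entry matching an auxiliary
 * device name of the form "<module>.<device>.<id>", or NULL if no
 * entry matches. Only the text before the last '.' is compared.
 */
static const char *egm_match(const char *dev_name)
{
	const char *dot = strrchr(dev_name, '.');
	size_t prefix_len;
	size_t i;

	if (!dot)
		return NULL;
	prefix_len = (size_t)(dot - dev_name);

	for (i = 0; egm_ids[i]; i++) {
		if (strlen(egm_ids[i]) == prefix_len &&
		    !strncmp(dev_name, egm_ids[i], prefix_len))
			return egm_ids[i];
	}
	return NULL;
}
```

Under this scheme a device named "nvgrace_gpu_vfio_pci.egm.0" matches the first entry, while a name without the trailing instance ID does not match at all.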
Signed-off-by: Ankit Agrawal
---
 drivers/vfio/pci/nvgrace-gpu/egm.c | 39 +++++++++++++++++++++++++++---
 1 file changed, 36 insertions(+), 3 deletions(-)

diff --git a/drivers/vfio/pci/nvgrace-gpu/egm.c b/drivers/vfio/pci/nvgrace-gpu/egm.c
index 6bab4d94cb99..12d4e6e83fff 100644
--- a/drivers/vfio/pci/nvgrace-gpu/egm.c
+++ b/drivers/vfio/pci/nvgrace-gpu/egm.c
@@ -11,6 +11,30 @@
 static dev_t dev;
 static struct class *class;
 
+static int egm_driver_probe(struct auxiliary_device *aux_dev,
+			    const struct auxiliary_device_id *id)
+{
+	return 0;
+}
+
+static void egm_driver_remove(struct auxiliary_device *aux_dev)
+{
+}
+
+static const struct auxiliary_device_id egm_id_table[] = {
+	{ .name = "nvgrace_gpu_vfio_pci.egm" },
+	{ .name = "nvidia_vgpu_vfio.egm" },
+	{ },
+};
+MODULE_DEVICE_TABLE(auxiliary, egm_id_table);
+
+static struct auxiliary_driver egm_driver = {
+	.name = KBUILD_MODNAME,
+	.id_table = egm_id_table,
+	.probe = egm_driver_probe,
+	.remove = egm_driver_remove,
+};
+
 static char *egm_devnode(const struct device *device, umode_t *mode)
 {
 	if (mode)
@@ -35,19 +59,28 @@ static int __init nvgrace_egm_init(void)
 
 	class = class_create(NVGRACE_EGM_DEV_NAME);
 	if (IS_ERR(class)) {
-		unregister_chrdev_region(dev, MAX_EGM_NODES);
-		return PTR_ERR(class);
+		ret = PTR_ERR(class);
+		goto unregister_chrdev;
 	}
 
 	class->devnode = egm_devnode;
 
-	return 0;
+	ret = auxiliary_driver_register(&egm_driver);
+	if (!ret)
+		goto fn_exit;
+
+	class_destroy(class);
+unregister_chrdev:
+	unregister_chrdev_region(dev, MAX_EGM_NODES);
+fn_exit:
+	return ret;
 }
 
 static void __exit nvgrace_egm_cleanup(void)
 {
 	class_destroy(class);
 	unregister_chrdev_region(dev, MAX_EGM_NODES);
+	auxiliary_driver_unregister(&egm_driver);
 }
 
 module_init(nvgrace_egm_init);
-- 
2.34.1

From nobody Fri Oct 3 06:37:01 2025 Received: from NAM12-DM6-obe.outbound.protection.outlook.com (mail-dm6nam12on2065.outbound.protection.outlook.com [40.107.243.65]) (using TLSv1.2 with
 cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 76554242909; Thu, 4 Sep 2025 04:08:43 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=fail smtp.client-ip=40.107.243.65
Subject: [RFC 08/14] vfio/nvgrace-egm: Expose EGM region as char device
Date: Thu, 4 Sep 2025 04:08:22 +0000
Message-ID: <20250904040828.319452-9-ankita@nvidia.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20250904040828.319452-1-ankita@nvidia.com>
References: <20250904040828.319452-1-ankita@nvidia.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

From: Ankit Agrawal

The EGM module exposes each EGM region as a char device. A usermode
app such as QEMU may mmap the region and use it as VM sysmem. Each EGM
region is represented by a unique char device, /dev/egmX, bearing a
distinct minor number.

The EGM module implements the mmap file_ops to manage the usermode
app's VMA mapping to the EGM region. The appropriate region is
determined from the minor number.

Note that the EGM memory region is invisible to the host kernel as it
is not present in the host EFI map. The host Linux MM thus cannot
manage the memory, even though it is accessible on the host SPA. The
EGM module therefore uses remap_pfn_range() to perform the VMA mapping
to the EGM region.

Suggested-by: Aniket Agashe
Signed-off-by: Ankit Agrawal
---
 drivers/vfio/pci/nvgrace-gpu/egm.c | 99 ++++++++++++++++++++++++++++++
 1 file changed, 99 insertions(+)

diff --git a/drivers/vfio/pci/nvgrace-gpu/egm.c b/drivers/vfio/pci/nvgrace-gpu/egm.c
index 12d4e6e83fff..c2dce5fa797a 100644
--- a/drivers/vfio/pci/nvgrace-gpu/egm.c
+++ b/drivers/vfio/pci/nvgrace-gpu/egm.c
@@ -10,15 +10,114 @@
 
 static dev_t dev;
 static struct class *class;
+static DEFINE_XARRAY(egm_chardevs);
+
+struct chardev {
+	struct device device;
+	struct cdev cdev;
+};
+
+static int nvgrace_egm_open(struct inode *inode, struct file *file)
+{
+	return 0;
+}
+
+static int nvgrace_egm_release(struct inode *inode, struct file *file)
+{
+	return 0;
+}
+
+static int nvgrace_egm_mmap(struct file *file, struct vm_area_struct *vma)
+{
+	return 0;
+}
+
+static const struct file_operations file_ops = {
+	.owner = THIS_MODULE,
+	.open = nvgrace_egm_open,
+	.release = nvgrace_egm_release,
+	.mmap = nvgrace_egm_mmap,
+};
+
+static void egm_chardev_release(struct device *dev)
+{
+	struct chardev *egm_chardev = container_of(dev, struct chardev, device);
+
+	kvfree(egm_chardev);
+}
+
+static struct chardev *
+setup_egm_chardev(struct nvgrace_egm_dev *egm_dev)
+{
+	struct chardev *egm_chardev;
+	int ret;
+
+	egm_chardev = kvzalloc(sizeof(*egm_chardev), GFP_KERNEL);
+	if (!egm_chardev)
+		goto create_err;
+
+	device_initialize(&egm_chardev->device);
+
+	/*
+	 * Use the proximity domain number as the device minor
+	 * number. So the EGM corresponding to node X would be
+	 * /dev/egmX.
+	 */
+	egm_chardev->device.devt = MKDEV(MAJOR(dev), egm_dev->egmpxm);
+	egm_chardev->device.class = class;
+	egm_chardev->device.release = egm_chardev_release;
+	egm_chardev->device.parent = &egm_dev->aux_dev.dev;
+	cdev_init(&egm_chardev->cdev, &file_ops);
+	egm_chardev->cdev.owner = THIS_MODULE;
+
+	ret = dev_set_name(&egm_chardev->device, "egm%lld", egm_dev->egmpxm);
+	if (ret)
+		goto error_exit;
+
+	ret = cdev_device_add(&egm_chardev->cdev, &egm_chardev->device);
+	if (ret)
+		goto error_exit;
+
+	return egm_chardev;
+
+error_exit:
+	kvfree(egm_chardev);
+create_err:
+	return NULL;
+}
+
+static void del_egm_chardev(struct chardev *egm_chardev)
+{
+	cdev_device_del(&egm_chardev->cdev, &egm_chardev->device);
+	put_device(&egm_chardev->device);
+}
 
 static int egm_driver_probe(struct auxiliary_device *aux_dev,
 			    const struct auxiliary_device_id *id)
 {
+	struct nvgrace_egm_dev *egm_dev =
+		container_of(aux_dev, struct nvgrace_egm_dev, aux_dev);
+	struct chardev *egm_chardev;
+
+	egm_chardev = setup_egm_chardev(egm_dev);
+	if (!egm_chardev)
+		return -EINVAL;
+
+	xa_store(&egm_chardevs, egm_dev->egmpxm, egm_chardev, GFP_KERNEL);
+
 	return 0;
 }
 
 static void egm_driver_remove(struct auxiliary_device *aux_dev)
 {
+	struct nvgrace_egm_dev *egm_dev =
+		container_of(aux_dev, struct nvgrace_egm_dev, aux_dev);
+	struct chardev *egm_chardev = xa_erase(&egm_chardevs, egm_dev->egmpxm);
+
+	if (!egm_chardev)
+		return;
+
+	del_egm_chardev(egm_chardev);
 }
 
 static const struct auxiliary_device_id egm_id_table[] = {
-- 
2.34.1

From nobody Fri Oct 3 06:37:01 2025
Received: from NAM11-DM6-obe.outbound.protection.outlook.com (mail-dm6nam11on2054.outbound.protection.outlook.com [40.107.223.54]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A61CB25B312; Thu, 4 Sep 2025 04:08:47 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=fail
 smtp.client-ip=40.107.223.54
Subject: [RFC 09/14] vfio/nvgrace-egm: Add chardev ops for EGM management
Date: Thu, 4 Sep 2025 04:08:23 +0000
Message-ID: <20250904040828.319452-10-ankita@nvidia.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20250904040828.319452-1-ankita@nvidia.com>
References: <20250904040828.319452-1-ankita@nvidia.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

From: Ankit Agrawal

The EGM module implements the mmap file_ops to manage the usermode
app's VMA mapping to the EGM region. The appropriate region is
determined from the minor number.

Note that the EGM memory region is invisible to the host kernel as it
is not present in the host EFI map. The host Linux MM thus cannot
manage the memory, even though it is accessible on the host SPA. The
EGM module therefore uses remap_pfn_range() to perform the VMA mapping
to the EGM region.

Suggested-by: Aniket Agashe
Signed-off-by: Ankit Agrawal
---
 drivers/vfio/pci/nvgrace-gpu/egm.c | 29 ++++++++++++++++++++++++++++-
 1 file changed, 28 insertions(+), 1 deletion(-)

diff --git a/drivers/vfio/pci/nvgrace-gpu/egm.c b/drivers/vfio/pci/nvgrace-gpu/egm.c
index c2dce5fa797a..7bf6a05aa967 100644
--- a/drivers/vfio/pci/nvgrace-gpu/egm.c
+++ b/drivers/vfio/pci/nvgrace-gpu/egm.c
@@ -17,19 +17,46 @@ struct chardev {
 	struct cdev cdev;
 };
 
+static struct nvgrace_egm_dev *
+egm_chardev_to_nvgrace_egm_dev(struct chardev *egm_chardev)
+{
+	struct auxiliary_device *aux_dev =
+		container_of(egm_chardev->device.parent, struct auxiliary_device, dev);
+
+	return container_of(aux_dev, struct nvgrace_egm_dev, aux_dev);
+}
+
 static int nvgrace_egm_open(struct inode *inode, struct file *file)
 {
+	struct chardev *egm_chardev =
+		container_of(inode->i_cdev, struct chardev, cdev);
+
+	file->private_data = egm_chardev;
+
 	return 0;
 }
 
 static int nvgrace_egm_release(struct inode *inode, struct file *file)
 {
+	file->private_data = NULL;
+
 	return 0;
 }
 
 static int nvgrace_egm_mmap(struct file *file, struct vm_area_struct *vma)
 {
-	return 0;
+	struct chardev *egm_chardev = file->private_data;
+	struct nvgrace_egm_dev *egm_dev =
+		egm_chardev_to_nvgrace_egm_dev(egm_chardev);
+
+	/*
+	 * EGM memory is invisible to the host kernel and is not managed
+	 * by it. Map the usermode VMA to the EGM region.
+	 */
+	return remap_pfn_range(vma, vma->vm_start,
+			       PHYS_PFN(egm_dev->egmphys),
+			       (vma->vm_end - vma->vm_start),
+			       vma->vm_page_prot);
 }
 
 static const struct file_operations file_ops = {
-- 
2.34.1

From nobody Fri Oct 3 06:37:01 2025
Received: from NAM12-MW2-obe.outbound.protection.outlook.com (mail-mw2nam12on2060.outbound.protection.outlook.com [40.107.244.60]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 3715E26A0D0; Thu, 4 Sep 2025 04:08:47 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=fail smtp.client-ip=40.107.244.60
ARC-Seal: i=1; a=rsa-sha256;
 s=arcselector10001; d=microsoft.com; cv=none; b=KzPQ10OtFm80sn5Sun5atlwIVuZuuNRK9ZUnqNCkgrtDOADGSehzZffvv5pyDOE5nHVyihwx3RosDLCHBo8RMceGpsOR8gkdCcwOApR1XLXrRmlQsv8BNgDaYv0l6ybOTkEOOgjA9QXEux/RZINiZfOJ0virsNBme965dwN0uWh9sjGQ2QN+Fs+KIS2HG12UAu29uoFjQVLYuBTmgLsu2Hhjuc73OaSSm7tP+XkF8jyMofsHM1iGLL2KlglmcJnX00I4JQP+So2Pwnrgyi5itbH0wT95qofoDsoIcLedGFISzWGghgdfumpHAAWAi2qpViZ9/vdJAter3B+va0Z8OA==
Subject: [RFC 10/14] vfio/nvgrace-egm: Clear Memory before handing out to VM
Date: Thu, 4 Sep 2025 04:08:24 +0000
Message-ID: <20250904040828.319452-11-ankita@nvidia.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20250904040828.319452-1-ankita@nvidia.com>
References: <20250904040828.319452-1-ankita@nvidia.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

From: Ankit Agrawal

The EGM region is invisible to the host Linux kernel, which does not
manage it. The EGM module manages the EGM memory and is thus
responsible for clearing the region before handing it out to the VM.

Clear the EGM region on EGM chardev open. Tools such as kvmtool may
open the device multiple times, so ensure the region is cleared only
on the first open.

Suggested-by: Vikram Sethi
Signed-off-by: Ankit Agrawal
---
 drivers/vfio/pci/nvgrace-gpu/egm.c | 28 +++++++++++++++++++++++++++-
 1 file changed, 27 insertions(+), 1 deletion(-)

diff --git a/drivers/vfio/pci/nvgrace-gpu/egm.c b/drivers/vfio/pci/nvgrace-gpu/egm.c
index 7bf6a05aa967..bf1241ed1d60 100644
--- a/drivers/vfio/pci/nvgrace-gpu/egm.c
+++ b/drivers/vfio/pci/nvgrace-gpu/egm.c
@@ -15,6 +15,7 @@ static DEFINE_XARRAY(egm_chardevs);
 struct chardev {
 	struct device device;
 	struct cdev cdev;
+	atomic_t open_count;
 };
 
 static struct nvgrace_egm_dev *
@@ -30,6 +31,26 @@ static int nvgrace_egm_open(struct inode *inode, struct file *file)
 {
 	struct chardev *egm_chardev =
 		container_of(inode->i_cdev, struct chardev, cdev);
+	struct nvgrace_egm_dev *egm_dev =
+		egm_chardev_to_nvgrace_egm_dev(egm_chardev);
+	void *memaddr;
+
+	if (atomic_inc_return(&egm_chardev->open_count) > 1)
+		return 0;
+
+	/*
+	 * nvgrace-egm module is responsible to manage the EGM memory as
+	 * the host kernel has no knowledge of it. Clear the region before
+	 * handing over to userspace.
+	 */
+	memaddr = memremap(egm_dev->egmphys, egm_dev->egmlength, MEMREMAP_WB);
+	if (!memaddr) {
+		atomic_dec(&egm_chardev->open_count);
+		return -EINVAL;
+	}
+
+	memset((u8 *)memaddr, 0, egm_dev->egmlength);
+	memunmap(memaddr);
 
 	file->private_data = egm_chardev;
 
@@ -38,7 +59,11 @@ static int nvgrace_egm_open(struct inode *inode, struct file *file)
 
 static int nvgrace_egm_release(struct inode *inode, struct file *file)
 {
-	file->private_data = NULL;
+	struct chardev *egm_chardev =
+		container_of(inode->i_cdev, struct chardev, cdev);
+
+	if (atomic_dec_and_test(&egm_chardev->open_count))
+		file->private_data = NULL;
 
 	return 0;
 }
@@ -96,6 +121,7 @@ setup_egm_chardev(struct nvgrace_egm_dev *egm_dev)
 	egm_chardev->device.parent = &egm_dev->aux_dev.dev;
 	cdev_init(&egm_chardev->cdev, &file_ops);
 	egm_chardev->cdev.owner = THIS_MODULE;
+	atomic_set(&egm_chardev->open_count, 0);
 
 	ret = dev_set_name(&egm_chardev->device, "egm%lld", egm_dev->egmpxm);
 	if (ret)
-- 
2.34.1

From nobody Fri Oct 3 06:37:01 2025
Received: from NAM10-BN7-obe.outbound.protection.outlook.com (mail-bn7nam10on2067.outbound.protection.outlook.com [40.107.92.67]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id E722126CE17; Thu, 4 Sep 2025 04:08:49 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=fail smtp.client-ip=40.107.92.67
ARC-Seal: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1756958932; cv=fail; b=HRKgm1hi4euFMSGAb5bqcqQ+3xXBFiNmwZCofFgm5IR3YgLITcDcmT/MfnxTJgPI5E1g0icdVjscwXWlAHdaWAIuQDiuBGHqC6I5yN+RP30LYgUsfBU1XHDcetEM5UNMenOBJirqZGPX8RiUj/miMm93byeUB+9qzAAlmdqHrRk=
ARC-Message-Signature: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1756958932; c=relaxed/simple; bh=ZFikG7Tjp8zlLVutDGVh3X+rA6wse6HV9rhb0jT2VNk=; h=From:To:CC:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type;
b=F6cXcudKhul2Jt10ujgxvGu3NLjrQT7rlgGqUsx627v6k1H2zXsDP3gd6FMK73Uc9Ln5ZhJA1vt0Oq7K1s3fwAnONLYKiEEb35Nb7p6rwvgud18K+wrbIudmdanhkCeEITRDKtrxJlTZxt5XjKvCVkC2+6o5EGcFpojM4iaXN3Q= ARC-Authentication-Results: i=2; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=nvidia.com; spf=fail smtp.mailfrom=nvidia.com; dkim=pass (2048-bit key) header.d=Nvidia.com header.i=@Nvidia.com header.b=kIhXJNNq; arc=fail smtp.client-ip=40.107.92.67 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=nvidia.com Authentication-Results: smtp.subspace.kernel.org; spf=fail smtp.mailfrom=nvidia.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=Nvidia.com header.i=@Nvidia.com header.b="kIhXJNNq" ARC-Seal: i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none; b=rU5BXtxjX4brIovHJpPC3y8TUlJoQm/u93pPhUUBRg5Ee29M+gw99P1O4XbAiu8Q8+Hrq8PI57gJsHmFmI+ZAOLSJOJk/ROw50W+g2DfbUgbJpmf9Xz0UcrDJuSLN4wbbw010dzdNi72QvxS7Mg5UQOLGY+zAzzWjbiWXDhSsIvr811Ik1Q0LUQYzr7rqSBbkoW1a8DZglLPVyZ7MSIxe3Qoyjdud0VaZvjSramlJEyJ4t7BzHYf7VuZbhf5wiRSWS/YEWuW2c8YcMRvYE0C23hTlKoOJY+MSPwikG4Iio7entjof6rXgyy7lTy5FQSwXbipRg5Ia1Rb4Kp13y0aZg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=EabDFE5r3erzRQEbAWOqlFl95J5u923xUXoF5yM2Qp0=; b=jFAC6EvmMR8jUBprXQz6Ohoo4HUt+BaIzzupFUsZdLp/NkcoLmZ77l2HodGtpGdHatZdz0KiYtgGKwH7X4ZJsYPheiUnOsecsvY1kfXabrW1nymtjWMXwZKxs3TB2Ea9pggaVBifU9dXi6nJsMr+4EolevuRxMX6lGwXbPSxj6THboqnAwvoAVd2nbasgF0vjfM3it64XxTYUd2VJ4ffMDEwaMFwivxjm7KhnnwsLtRpws7ojqMx4GfmnaATOdYTfpGY7KcoiVVslNxhiY71Nib/O4Lbr0I52YAWZK4mmRMiKigi3PG/v+E07yrV1itFqAdWwG/NWB9xZEWBbzIhnA== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass (sender ip is 216.228.118.232) smtp.rcpttodomain=vger.kernel.org 
smtp.mailfrom=nvidia.com; dmarc=pass (p=reject sp=reject pct=100) action=none header.from=nvidia.com; dkim=none (message not signed); arc=none (0) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=Nvidia.com; s=selector2; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=EabDFE5r3erzRQEbAWOqlFl95J5u923xUXoF5yM2Qp0=; b=kIhXJNNqBqLQufFEQGTK4+tT5dmNxMzcxGVCRCwATaeqTJUctUvuYsX3Vqry8Onmit/EeQLHMI+zRNXZqkK7eGlAS5JDqqDBZ9Qj2Zqqv6OyNXOz93/EdkQU3Q8AN6JcCt48SGx8Q6HbAFQqrEGFn9CNLLPpi0GiKKJdmgNnwWRKvLosZJPHOEWz4A5bAUkgh818l8bkxw/B2yYKlwQlTgn0Y5Gb6UGuT/c4SmSmIEnYCfyxU9tuAhOqr4JCgllT9sDZNUc8YylA9oHtDHpS1hm9NVZkWIUDmsiaBewamSDjt6mYAOEhQ0+81bAGu5ZNs/8vidzLXF8YhnizQVOY+A== Received: from SA9PR13CA0139.namprd13.prod.outlook.com (2603:10b6:806:27::24) by DS0PR12MB6656.namprd12.prod.outlook.com (2603:10b6:8:d2::10) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.9094.16; Thu, 4 Sep 2025 04:08:43 +0000 Received: from SN1PEPF0002529D.namprd05.prod.outlook.com (2603:10b6:806:27:cafe::ce) by SA9PR13CA0139.outlook.office365.com (2603:10b6:806:27::24) with Microsoft SMTP Server (version=TLS1_3, cipher=TLS_AES_256_GCM_SHA384) id 15.20.9094.17 via Frontend Transport; Thu, 4 Sep 2025 04:08:43 +0000 X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 216.228.118.232) smtp.mailfrom=nvidia.com; dkim=none (message not signed) header.d=none;dmarc=pass action=none header.from=nvidia.com; Received-SPF: Pass (protection.outlook.com: domain of nvidia.com designates 216.228.118.232 as permitted sender) receiver=protection.outlook.com; client-ip=216.228.118.232; helo=mail.nvidia.com; pr=C Received: from mail.nvidia.com (216.228.118.232) by SN1PEPF0002529D.mail.protection.outlook.com (10.167.242.4) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.9094.14 via Frontend Transport; Thu, 4 Sep 2025 04:08:42 +0000 Received: from drhqmail202.nvidia.com 
(10.126.190.181) by mail.nvidia.com (10.127.129.5) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1544.14; Wed, 3 Sep 2025 21:08:33 -0700 Received: from drhqmail202.nvidia.com (10.126.190.181) by drhqmail202.nvidia.com (10.126.190.181) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1544.14; Wed, 3 Sep 2025 21:08:33 -0700 Received: from localhost.nvidia.com (10.127.8.12) by mail.nvidia.com (10.126.190.181) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1544.14 via Frontend Transport; Wed, 3 Sep 2025 21:08:33 -0700 From: To: , , , , , , , CC: , , , , , , , , , , , , , , Subject: [RFC 11/14] vfio/nvgrace-egm: Fetch EGM region retired pages list Date: Thu, 4 Sep 2025 04:08:25 +0000 Message-ID: <20250904040828.319452-12-ankita@nvidia.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20250904040828.319452-1-ankita@nvidia.com> References: <20250904040828.319452-1-ankita@nvidia.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-NV-OnPremToCloud: ExternallySecured X-EOPAttributedMessage: 0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: SN1PEPF0002529D:EE_|DS0PR12MB6656:EE_ X-MS-Office365-Filtering-Correlation-Id: 5b5058c7-5d78-422d-bfef-08ddeb68bca0 X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0;ARA:13230040|36860700013|1800799024|82310400026|376014; X-Microsoft-Antispam-Message-Info: =?us-ascii?Q?wroWYFKwhzE5TWbVE6QKsLjy4TJ4K50RQnJAzbvqOZ8EF7APlanZ5QAwC0Zg?= =?us-ascii?Q?2VxflgzVrVXeHjgFogh28JV8RujMIML4RxoX3jKeQewLsW8/rcgT2hS++R2k?= =?us-ascii?Q?SBGCC07OAuqlRjeQaII/XNXE9t+Gz4ejuJhCb6wG1MJ8Srj1zZxP7i4Vgmpl?= =?us-ascii?Q?o6l2Cp839vNNiAxQEq0nH7u7BjN1Ufnm3pmgufjZDojEGc6Eh29iqX5llgM4?= 
=?us-ascii?Q?A1FSknbKDIuMtxAYHZtkdoGoLc2pukWCItSsWJd7dt/6iZBV5aUOJ6r4Cn/A?= =?us-ascii?Q?HUE/NuT+TNp4xqC7sRWBcbZhqN2NM2Ixe1zfC94w81pOANIIBk7ExrEOrjMo?= =?us-ascii?Q?o7HU0q2F+GEbE7nuo9TAwy1+WRUSk9W6R/DY4oxgPuRfTbJyHyT2frZVWR31?= =?us-ascii?Q?FGUzmT55GkuTS83BzJfjeCQYf0PXi329OvBtrWJbqXKC3A1l0y9KE/maIXRE?= =?us-ascii?Q?t6LldHxTvm17+s2AaBZEZDV9vYRrbXQN/dahxDr5a7w4YjRHcGI+cZaIrm8n?= =?us-ascii?Q?5rsJ/bVMCvHx14M8fG2pprWdo7Ft2IRHNHymrXtm64UGrlww/1unSmCPjt1H?= =?us-ascii?Q?hdFxPt/1O6LiRISUSWwDeF2ulobXK0Q27HgN9Fnk5yPaCGfEshWEXXC7NHDR?= =?us-ascii?Q?Nz1xzjovH3R/vlJ9z/kvcOCglgxuLoM6TC2q3vvJipoG0Hkbd0HRq7gpmlE9?= =?us-ascii?Q?OiO7uWiQd+sdb6PH7MPleCf38Lt88cdqRDU8wIUmt8tra+C94lQdgRMrlTIl?= =?us-ascii?Q?vvsgkxgtuZL9NzM9HvCZjsX6qc4sv7k10NeGRK9vVujfkG42bcKw8mm/sIsP?= =?us-ascii?Q?ABdcB8s5JuqiD/iPgvmj1vHifIPxcKo42+KEC9M75K/HzFgfpidzzdeLh6So?= =?us-ascii?Q?ImCBqRdv9mbCdUl70wWHcDu7k+qYZaob8PMyrpCrZJtowktZqqjVscrW7n2S?= =?us-ascii?Q?nANzTGfEPeemZD58V0yXnzHOwWSyxhEaX3FFxGJXR0pO50bg7sGvq2hvQ8gf?= =?us-ascii?Q?i2GPcSsHrx6Z1DNBcecvQHgw7ZRhdBN4kkKw/6qQUAoRiFrbeos6ODiF2uUz?= =?us-ascii?Q?108isWKWZlAy7wMnpEF/VTjNYc9ZdcEXQmRZ67jT2TZXQHuNS1seM98dMGTY?= =?us-ascii?Q?bZmk9c1KczqGo5xNNInXmJ4FnOOljT+jaE+yzlvLC8aTZWsL911QiDDPYzZR?= =?us-ascii?Q?TEyfqIAMnIPIluEPCG4A8loTlBpuPohtQt49E44SR5Qf7ranJ/RghzoiTWM3?= =?us-ascii?Q?okcGFKMQpabxvKn3xjMRSpuXc7KY4ZrnR9fG6VhAJWF5l8zrh+Dr9CHSlbSc?= =?us-ascii?Q?rXX+fJvDnaScXSTkXXoxnKFI8t4u0G3a/if5/FBGJdkfvs2AS7IjVM0Sdluy?= =?us-ascii?Q?Gf0l0LstVlTcH4StKjeGzWjxorrr0RkOmB5KPBdWs09wdzAqcy8TQcOsJbb/?= =?us-ascii?Q?MALZ4IJtRCEanGp+sPmNSo3f0l8Xfq6qza0GVy6gm7uXkf08bH2mDQovFbyH?= =?us-ascii?Q?uL2AKuPUnsE5wNXl+BWebHRjwS5OIwI/Fz7M?= X-Forefront-Antispam-Report: CIP:216.228.118.232;CTRY:US;LANG:en;SCL:1;SRV:;IPV:NLI;SFV:NSPM;H:mail.nvidia.com;PTR:dc7edge1.nvidia.com;CAT:NONE;SFS:(13230040)(36860700013)(1800799024)(82310400026)(376014);DIR:OUT;SFP:1101; X-OriginatorOrg: Nvidia.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 04 Sep 2025 04:08:42.8919 (UTC) 
X-MS-Exchange-CrossTenant-Network-Message-Id: 5b5058c7-5d78-422d-bfef-08ddeb68bca0 X-MS-Exchange-CrossTenant-Id: 43083d15-7273-40c1-b7db-39efd9ccc17a X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=43083d15-7273-40c1-b7db-39efd9ccc17a;Ip=[216.228.118.232];Helo=[mail.nvidia.com] X-MS-Exchange-CrossTenant-AuthSource: SN1PEPF0002529D.namprd05.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Anonymous X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem X-MS-Exchange-Transport-CrossTenantHeadersStamped: DS0PR12MB6656 Content-Type: text/plain; charset="utf-8" From: Ankit Agrawal Some system memory pages in the EGM region may have uncorrectable ECC errors. The Host UEFI maintains a list of pages known to have such errors (referred to as retired pages) and populates it in a reserved region. It communicates the SPA of this region through an ACPI DSDT property. The nvgrace-egm module is responsible for storing the list of retired page offsets so that it can be made available to usermode processes. The module: 1. Gets the reserved memory region SPA and maps the region to fetch the list of bad pages. 2. Calculates the retired page offsets in the EGM and stores them.
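The two steps above can be sketched in plain userspace C. This is an illustration only, not the driver code. It assumes the reserved region holds a 64-bit entry count followed by that many 64-bit page SPAs, which is the layout this patch reads. The function name parse_retired_pages, its parameters, and the bounds checks are hypothetical and were added for the example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Illustrative only: parse a retired-pages table laid out as
 *   u64 count; u64 spa[count];
 * and convert each SPA into an offset from the EGM base.
 * The name and the validation are assumptions made for this sketch.
 *
 * Returns the number of offsets written, or -1 if the table is malformed.
 */
static long parse_retired_pages(const void *region, size_t region_bytes,
				uint64_t egm_base, uint64_t egm_len,
				uint64_t *offsets, size_t max_offsets)
{
	uint64_t count, spa;
	size_t i;

	if (region_bytes < sizeof(count))
		return -1;
	memcpy(&count, region, sizeof(count));

	/* The count must fit in the mapped region and in the output array. */
	if (count > (region_bytes - sizeof(count)) / sizeof(spa) ||
	    count > max_offsets)
		return -1;

	for (i = 0; i < count; i++) {
		memcpy(&spa, (const uint8_t *)region + (i + 1) * sizeof(spa),
		       sizeof(spa));
		/* Reject addresses that fall outside the EGM carveout. */
		if (spa < egm_base || spa - egm_base >= egm_len)
			return -1;
		/* Linear mapping: offset in the carveout == offset in the VM. */
		offsets[i] = spa - egm_base;
	}
	return (long)count;
}
```

Validating the firmware-provided count against the mapped size is a design choice of this sketch. It keeps a corrupt table from causing reads beyond the mapping.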
Signed-off-by: Ankit Agrawal --- drivers/vfio/pci/nvgrace-gpu/egm.c | 81 ++++++++++++++++++++++++++ drivers/vfio/pci/nvgrace-gpu/egm_dev.c | 32 ++++++++-- drivers/vfio/pci/nvgrace-gpu/egm_dev.h | 5 +- drivers/vfio/pci/nvgrace-gpu/main.c | 8 ++- include/linux/nvgrace-egm.h | 2 + 5 files changed, 118 insertions(+), 10 deletions(-) diff --git a/drivers/vfio/pci/nvgrace-gpu/egm.c b/drivers/vfio/pci/nvgrace-= gpu/egm.c index bf1241ed1d60..7a026b4d98f7 100644 --- a/drivers/vfio/pci/nvgrace-gpu/egm.c +++ b/drivers/vfio/pci/nvgrace-gpu/egm.c @@ -8,6 +8,11 @@ =20 #define MAX_EGM_NODES 4 =20 +struct h_node { + unsigned long mem_offset; + struct hlist_node node; +}; + static dev_t dev; static struct class *class; static DEFINE_XARRAY(egm_chardevs); @@ -16,6 +21,7 @@ struct chardev { struct device device; struct cdev cdev; atomic_t open_count; + DECLARE_HASHTABLE(htbl, 0x10); }; =20 static struct nvgrace_egm_dev * @@ -145,20 +151,86 @@ static void del_egm_chardev(struct chardev *egm_chard= ev) put_device(&egm_chardev->device); } =20 +static void cleanup_retired_pages(struct chardev *egm_chardev) +{ + struct h_node *cur_page; + unsigned long bkt; + struct hlist_node *temp_node; + + hash_for_each_safe(egm_chardev->htbl, bkt, temp_node, cur_page, node) { + hash_del(&cur_page->node); + kvfree(cur_page); + } +} + +static int nvgrace_egm_fetch_retired_pages(struct nvgrace_egm_dev *egm_dev, + struct chardev *egm_chardev) +{ + u64 count; + void *memaddr; + int index, ret =3D 0; + + memaddr =3D memremap(egm_dev->retiredpagesphys, PAGE_SIZE, MEMREMAP_WB); + if (!memaddr) + return -ENOMEM; + + count =3D *(u64 *)memaddr; + + for (index =3D 0; index < count; index++) { + struct h_node *retired_page; + + /* + * Since the EGM is linearly mapped, the offset in the + * carveout is the same offset in the VM system memory. + * + * Calculate the offset to communicate to the usermode + * apps. 
+ */ + retired_page =3D kvzalloc(sizeof(*retired_page), GFP_KERNEL); + if (!retired_page) { + ret =3D -ENOMEM; + break; + } + + retired_page->mem_offset =3D *((u64 *)memaddr + index + 1) - + egm_dev->egmphys; + hash_add(egm_chardev->htbl, &retired_page->node, + retired_page->mem_offset); + } + + memunmap(memaddr); + + if (ret) + cleanup_retired_pages(egm_chardev); + + return ret; +} + static int egm_driver_probe(struct auxiliary_device *aux_dev, const struct auxiliary_device_id *id) { struct nvgrace_egm_dev *egm_dev =3D container_of(aux_dev, struct nvgrace_egm_dev, aux_dev); struct chardev *egm_chardev; + int ret; =20 egm_chardev =3D setup_egm_chardev(egm_dev); if (!egm_chardev) return -EINVAL; =20 + hash_init(egm_chardev->htbl); + + ret =3D nvgrace_egm_fetch_retired_pages(egm_dev, egm_chardev); + if (ret) + goto error_exit; + xa_store(&egm_chardevs, egm_dev->egmpxm, egm_chardev, GFP_KERNEL); =20 return 0; + +error_exit: + del_egm_chardev(egm_chardev); + return ret; } =20 static void egm_driver_remove(struct auxiliary_device *aux_dev) @@ -166,10 +238,19 @@ static void egm_driver_remove(struct auxiliary_device= *aux_dev) struct nvgrace_egm_dev *egm_dev =3D container_of(aux_dev, struct nvgrace_egm_dev, aux_dev); struct chardev *egm_chardev =3D xa_erase(&egm_chardevs, egm_dev->egmpxm); + struct h_node *cur_page; + unsigned long bkt; + struct hlist_node *temp_node; =20 if (!egm_chardev) return; =20 + hash_for_each_safe(egm_chardev->htbl, bkt, temp_node, cur_page, node) { + hash_del(&cur_page->node); + kvfree(cur_page); + } + + cleanup_retired_pages(egm_chardev); del_egm_chardev(egm_chardev); } =20 diff --git a/drivers/vfio/pci/nvgrace-gpu/egm_dev.c b/drivers/vfio/pci/nvgr= ace-gpu/egm_dev.c index ca50bc1f67a0..b8e143542bce 100644 --- a/drivers/vfio/pci/nvgrace-gpu/egm_dev.c +++ b/drivers/vfio/pci/nvgrace-gpu/egm_dev.c @@ -18,22 +18,41 @@ int nvgrace_gpu_has_egm_property(struct pci_dev *pdev, = u64 *pegmpxm) } =20 int nvgrace_gpu_fetch_egm_property(struct pci_dev *pdev, 
u64 *pegmphys, - u64 *pegmlength) + u64 *pegmlength, u64 *pretiredpagesphys) { int ret; =20 /* - * The memory information is present in the system ACPI tables as DSD - * properties nvidia,egm-base-pa and nvidia,egm-size. + * The EGM memory information is present in the system ACPI tables + * as DSD properties nvidia,egm-base-pa and nvidia,egm-size. */ ret =3D device_property_read_u64(&pdev->dev, "nvidia,egm-size", pegmlength); if (ret) - return ret; + goto error_exit; =20 ret =3D device_property_read_u64(&pdev->dev, "nvidia,egm-base-pa", pegmphys); + if (ret) + goto error_exit; + + /* + * SBIOS puts the list of retired pages on a region. The region + * SPA is exposed as "nvidia,egm-retired-pages-data-base". + */ + ret =3D device_property_read_u64(&pdev->dev, + "nvidia,egm-retired-pages-data-base", + pretiredpagesphys); + if (ret) + goto error_exit; + + /* Catch firmware bug and avoid a crash */ + if (*pretiredpagesphys =3D=3D 0) { + dev_err(&pdev->dev, "Retired pages region is not setup\n"); + ret =3D -EINVAL; + } =20 +error_exit: return ret; } =20 @@ -74,7 +93,8 @@ static void nvgrace_gpu_release_aux_device(struct device = *device) =20 struct nvgrace_egm_dev * nvgrace_gpu_create_aux_device(struct pci_dev *pdev, const char *name, - u64 egmphys, u64 egmlength, u64 egmpxm) + u64 egmphys, u64 egmlength, u64 egmpxm, + u64 retiredpagesphys) { struct nvgrace_egm_dev *egm_dev; int ret; @@ -86,6 +106,8 @@ nvgrace_gpu_create_aux_device(struct pci_dev *pdev, cons= t char *name, egm_dev->egmpxm =3D egmpxm; egm_dev->egmphys =3D egmphys; egm_dev->egmlength =3D egmlength; + egm_dev->retiredpagesphys =3D retiredpagesphys; + INIT_LIST_HEAD(&egm_dev->gpus); =20 egm_dev->aux_dev.id =3D egmpxm; diff --git a/drivers/vfio/pci/nvgrace-gpu/egm_dev.h b/drivers/vfio/pci/nvgr= ace-gpu/egm_dev.h index 2e1612445898..2f329a05685d 100644 --- a/drivers/vfio/pci/nvgrace-gpu/egm_dev.h +++ b/drivers/vfio/pci/nvgrace-gpu/egm_dev.h @@ -16,8 +16,9 @@ void remove_gpu(struct nvgrace_egm_dev *egm_dev, 
struct p= ci_dev *pdev); =20 struct nvgrace_egm_dev * nvgrace_gpu_create_aux_device(struct pci_dev *pdev, const char *name, - u64 egmphys, u64 egmlength, u64 egmpxm); + u64 egmphys, u64 egmlength, u64 egmpxm, + u64 retiredpagesphys); =20 int nvgrace_gpu_fetch_egm_property(struct pci_dev *pdev, u64 *pegmphys, - u64 *pegmlength); + u64 *pegmlength, u64 *pretiredpagesphys); #endif /* EGM_DEV_H */ diff --git a/drivers/vfio/pci/nvgrace-gpu/main.c b/drivers/vfio/pci/nvgrace= -gpu/main.c index b1ccd1ac2e0a..534dc3ee6113 100644 --- a/drivers/vfio/pci/nvgrace-gpu/main.c +++ b/drivers/vfio/pci/nvgrace-gpu/main.c @@ -67,7 +67,7 @@ static struct list_head egm_dev_list; static int nvgrace_gpu_create_egm_aux_device(struct pci_dev *pdev) { struct nvgrace_egm_dev_entry *egm_entry =3D NULL; - u64 egmphys, egmlength, egmpxm; + u64 egmphys, egmlength, egmpxm, retiredpagesphys; int ret =3D 0; bool is_new_region =3D false; =20 @@ -80,7 +80,8 @@ static int nvgrace_gpu_create_egm_aux_device(struct pci_d= ev *pdev) if (nvgrace_gpu_has_egm_property(pdev, &egmpxm)) goto exit; =20 - ret =3D nvgrace_gpu_fetch_egm_property(pdev, &egmphys, &egmlength); + ret =3D nvgrace_gpu_fetch_egm_property(pdev, &egmphys, &egmlength, + &retiredpagesphys); if (ret) goto exit; =20 @@ -103,7 +104,8 @@ static int nvgrace_gpu_create_egm_aux_device(struct pci= _dev *pdev) =20 egm_entry->egm_dev =3D nvgrace_gpu_create_aux_device(pdev, NVGRACE_EGM_DEV_NAME, - egmphys, egmlength, egmpxm); + egmphys, egmlength, egmpxm, + retiredpagesphys); if (!egm_entry->egm_dev) { ret =3D -EINVAL; goto free_egm_entry; diff --git a/include/linux/nvgrace-egm.h b/include/linux/nvgrace-egm.h index a66906753267..197255c2a3b7 100644 --- a/include/linux/nvgrace-egm.h +++ b/include/linux/nvgrace-egm.h @@ -7,6 +7,7 @@ #define NVGRACE_EGM_H =20 #include +#include =20 #define NVGRACE_EGM_DEV_NAME "egm" =20 @@ -19,6 +20,7 @@ struct nvgrace_egm_dev { struct auxiliary_device aux_dev; phys_addr_t egmphys; size_t egmlength; + phys_addr_t 
retiredpagesphys; u64 egmpxm; struct list_head gpus; }; --=20 2.34.1 From nobody Fri Oct 3 06:37:01 2025 Received: from NAM10-DM6-obe.outbound.protection.outlook.com (mail-dm6nam10on2089.outbound.protection.outlook.com [40.107.93.89]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id D2391258CED; Thu, 4 Sep 2025 04:08:46 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=fail smtp.client-ip=40.107.93.89 ARC-Seal: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1756958929; cv=fail; b=C9uZGMTkGYtZpSAAWU8hzvGSyafvuT18qOdXkuIj1noJWNhMB00HOIAji542+09IZM56krqAvOFe0F8hMlgRGDLjpc9N1yl2TpIgl2bLd1d9XWiTpYzRGUpIZPT4G0clD7kUHRb2kNo1ZicyUH3VJzQpp2drPP3vP0ipcrQWUGg= ARC-Message-Signature: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1756958929; c=relaxed/simple; bh=EhEhxtDd4+lqofVLump5Fu9L38P35M+Mc6h47XTW+xs=; h=From:To:CC:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=Y+M5CPkZVkqEJjdpqRbloT8P6psVNXluWm/mRuWYGXT00gy17M2gjgdBn4xTppASaP/7ppfDgaiBOKD85x/eb4m52b6mak+ys7atrXQ/0FDEtzgKmx2DBT+1+k50ZPTsiv4TrO2p7fmo2oVAOXSHJh81noDlUc5E2JtCMXW9FXU= ARC-Authentication-Results: i=2; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=nvidia.com; spf=fail smtp.mailfrom=nvidia.com; dkim=pass (2048-bit key) header.d=Nvidia.com header.i=@Nvidia.com header.b=Q+yh2iog; arc=fail smtp.client-ip=40.107.93.89 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=nvidia.com Authentication-Results: smtp.subspace.kernel.org; spf=fail smtp.mailfrom=nvidia.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=Nvidia.com header.i=@Nvidia.com header.b="Q+yh2iog" ARC-Seal: i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none; 
b=LOMpIuch7dbtS9vxmkg7IAnAxtzAAh+gQRANSIDtbIxCSLoMM+zAS08P0Gepk7xGG4soKEPzkNTzsewAmgLsV84NSet5sm5zIsTVU/jgG8E+qwH8M7RvLyKoHL3L1dMYxYafrS5j198uX2TH2cBmPdwRoIDjw1pQlGXRDs6JMKK/Mxqx17+W3Aim/R9b0fUtmrgvbssJMzN7rAIkePxJGjn0qaqJWxvFMtUdh9N/NIdq1hd7H7FTY0rp4OWDdboDj3WrV7g095qpf6i0tOxJL7w76xDMGc+Vxxo/4/QKls7uTtkMJPEd8DBbHqzhXAFCOLwjea/mf9UkDjDbFYtORw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=JePaWapvwsKekbtka54kAQ+t860eZBEn2CaYPVS0l9U=; b=m8OSeJMKM8LiQq0CBFSfwTxuudCJG55bDn7S6SI4s+AqnA/sEZmgkovx58dUu7lCD3o1QlVVz7lJSWvfWluClnDVGS1kddsaHXUNPBRGS+lstud/t2lBy79AaBZJ2bVH4jn7GCwfDaIclX7059Sm/Uxna+VM2uMmsDzja+dKm/GHDDaywsWfUwrSWWaIlxnu8YifMDHdY472MRtcfz/qudaa+C1tjPz3uXXczsZj2XYWUHIJj0SkVw8lQ8BYRrOOMf8p7oPOCCCnvm8jZrcJfi3nDUNksSSVMxYIYWZ/mVwQZ80/GKxgczwRTtEnqvoacqQg6PJ5EfdLy3bKgQH3qA== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass (sender ip is 216.228.118.233) smtp.rcpttodomain=vger.kernel.org smtp.mailfrom=nvidia.com; dmarc=pass (p=reject sp=reject pct=100) action=none header.from=nvidia.com; dkim=none (message not signed); arc=none (0) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=Nvidia.com; s=selector2; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=JePaWapvwsKekbtka54kAQ+t860eZBEn2CaYPVS0l9U=; b=Q+yh2iog0UjCnZWV3agWzoIv/oSm7HcssGi6+taF2IqtqC0d18omlwdco9cuPrsbaFdo2zSc/aeBDld2+XdMtMGsfy4GOQXgwbG8f/p8WRipISgVOTKcMINNBjJKV1o+ZKyv/fahaVODZ4rrYuKX/ioqqPbbsvVZBP4vy8PwezBVkMoMOzgRVESbdomLoYsxBmNMJCWu5L/Ks3qhaxSwOGxVB1jzldgPueXrHiF55/2R2/gl4QHjG5YosxizRR6VSuvWOm2s8F+tzPGku6UTvlip/7ilMJDlS6iFzGsh4G16SKZts9WbBzpVA5Pcg7NLIKUUnqjliEA3DMTle1dQfA== Received: from BYAPR03CA0036.namprd03.prod.outlook.com (2603:10b6:a02:a8::49) by IA1PR12MB9497.namprd12.prod.outlook.com 
(2603:10b6:208:593::22) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.9073.27; Thu, 4 Sep 2025 04:08:42 +0000 Received: from SJ5PEPF000001D5.namprd05.prod.outlook.com (2603:10b6:a02:a8:cafe::7c) by BYAPR03CA0036.outlook.office365.com (2603:10b6:a02:a8::49) with Microsoft SMTP Server (version=TLS1_3, cipher=TLS_AES_256_GCM_SHA384) id 15.20.9094.18 via Frontend Transport; Thu, 4 Sep 2025 04:08:42 +0000 X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 216.228.118.233) smtp.mailfrom=nvidia.com; dkim=none (message not signed) header.d=none;dmarc=pass action=none header.from=nvidia.com; Received-SPF: Pass (protection.outlook.com: domain of nvidia.com designates 216.228.118.233 as permitted sender) receiver=protection.outlook.com; client-ip=216.228.118.233; helo=mail.nvidia.com; pr=C Received: from mail.nvidia.com (216.228.118.233) by SJ5PEPF000001D5.mail.protection.outlook.com (10.167.242.57) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.9094.14 via Frontend Transport; Thu, 4 Sep 2025 04:08:42 +0000 Received: from drhqmail202.nvidia.com (10.126.190.181) by mail.nvidia.com (10.127.129.6) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1544.14; Wed, 3 Sep 2025 21:08:34 -0700 Received: from drhqmail202.nvidia.com (10.126.190.181) by drhqmail202.nvidia.com (10.126.190.181) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1544.14; Wed, 3 Sep 2025 21:08:33 -0700 Received: from localhost.nvidia.com (10.127.8.12) by mail.nvidia.com (10.126.190.181) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1544.14 via Frontend Transport; Wed, 3 Sep 2025 21:08:33 -0700 From: To: , , , , , , , CC: , , , , , , , , , , , , , , Subject: [RFC 12/14] vfio/nvgrace-egm: Introduce ioctl to share retired pages Date: Thu, 4 Sep 2025 
04:08:26 +0000 Message-ID: <20250904040828.319452-13-ankita@nvidia.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20250904040828.319452-1-ankita@nvidia.com> References: <20250904040828.319452-1-ankita@nvidia.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-NV-OnPremToCloud: ExternallySecured X-EOPAttributedMessage: 0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: SJ5PEPF000001D5:EE_|IA1PR12MB9497:EE_ X-MS-Office365-Filtering-Correlation-Id: b22e721f-9603-4f2d-a422-08ddeb68bc0a X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0;ARA:13230040|376014|82310400026|1800799024|36860700013|7053199007; X-Microsoft-Antispam-Message-Info: =?us-ascii?Q?S8/h0LgUPGJ3xRy5zqnTqCbzBWiv2lUgpWu7WdLD6q2AZpE0g4qipvrE7H9k?= =?us-ascii?Q?JYaoKYW2RoWPNi+mURZIPyAq4LYcnLdtYyC+fRRTTMG4LvTwU3zaZurICHhH?= =?us-ascii?Q?B0QNPSrg3mKxVLFjIcOWghA/2q96px5Ev8xUVaSEuszshqnbVNzOjrlvZrnW?= =?us-ascii?Q?4yseP/oB9m51vyjU4Nr83Jans7dv5LjzWFrJLguzakqPWRHlAhIO3dCKzzx2?= =?us-ascii?Q?tHtynyxAqkvIbNKTEPsbBiz+PfIZuijY03tCCGqFyPo+nnrn9IIGiFLf9NBn?= =?us-ascii?Q?HosMqbEpaZWfhLMVQZFCUGz3NVfdV8LxIaMHALH8dBfCkvjeKe8v70Bs1sNy?= =?us-ascii?Q?zdTlwqsGZF9ZQnHYapjzTJdnHGLWuRX/a5bq4Oa6h1lbRP+kLCFksL95kDB0?= =?us-ascii?Q?wESuRjl15eGl8aqw6lLxS3DhU3Klu+DnHxL0sby+IU6djcJbPJZg3RAoCGO/?= =?us-ascii?Q?eU93mhcf+bC+r5n+/C4qRDEQpGlvKFXH/tvlgDtUNVYmEVrlx5z4xKkNK1QN?= =?us-ascii?Q?8VwyHuiorQl5/qZU3selv2GMPAadrZcBAbE3D55/EBOaCYYujBgE0XFLBZZC?= =?us-ascii?Q?bhZjXiq+ELKaQrfgKEQKbCqtg4cPbW+jf1f/Kmp5Ygb1T8uLa/offi4CiW0v?= =?us-ascii?Q?ZuEpyCj7Se6QhT5vOsbDl2DuijFemOYEA/nUJJTP/cRiIDLhcWsd0wsRlEjT?= =?us-ascii?Q?DPpLo/zLranNp9IIu0pPfI76P/C7g3hxxr3TMzJWW+emBu0oY4sUYu3Ws/Cv?= =?us-ascii?Q?fn9GGEsAUOiVKANJlzRQhRPVwXDpuDBHd3/xEA7XEk2pNX/T3ldAOM9bWuLl?= =?us-ascii?Q?WUk4DmQT8C6A5Gy6cKeHzNQyJqyS5+1xFfaK1qEgaXTkk73fN9REVVFr2udj?= 
=?us-ascii?Q?daK4seM6FrzFakJDF4wbTMmlMZhSC4VsoE1vVJmjL3xms+flP7Z1a1YPHdEy?= =?us-ascii?Q?zKmWhhtNGlJnbxOFTn+uP6RXATT94hpKT5mszJb9C64EYEMP33glt8raaLPh?= =?us-ascii?Q?MJBZ+yIFM6BCbx6NiHDgT5X608KiF4n51turMdpu7D7lZlbcpEag3JHFW9OY?= =?us-ascii?Q?oIVgQbQ7YT7tiZWgLtKksKNhd1NsRvY+h3SS7CIvOg9R1qej4UUW3uL16Gmk?= =?us-ascii?Q?VK6Yznp2j3hN01LrCD107gmSAZ2VAhP85iUUdMkj52NyePH1vQutjQuOR3xz?= =?us-ascii?Q?ylcOdKCGlNozOqq43suoxgMPS1s4Ej3/vCwp0aTfYRcV5fhtd2uxaRcK7sj0?= =?us-ascii?Q?Hb+ZuhEfQGqKQdvfFoK9NRFigGx9PEvI6tkR+ub02Rg8BCCsDlynzE0bIweu?= =?us-ascii?Q?TSv3kzOMC+HMqqZveB0EQZzo9UUI52msT5j+XspoubR3Y0Z4aT39fsBP1Gb2?= =?us-ascii?Q?0Lt4Qq9XRopy/Hi3kzaq/Ioa+6cX4pBwr7UDdxicTfmOiCjEldmjIBnGmy1l?= =?us-ascii?Q?cxWnnIcuMRy/lIguGjabufY426mkEo7Mmk824v9dcyC6IIMLhHsEd/t8YlAX?= =?us-ascii?Q?gpm9SdA4uVo748Ql5QH81U8L2TaZc66ITVZE?= X-Forefront-Antispam-Report: CIP:216.228.118.233;CTRY:US;LANG:en;SCL:1;SRV:;IPV:NLI;SFV:NSPM;H:mail.nvidia.com;PTR:dc7edge2.nvidia.com;CAT:NONE;SFS:(13230040)(376014)(82310400026)(1800799024)(36860700013)(7053199007);DIR:OUT;SFP:1101; X-OriginatorOrg: Nvidia.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 04 Sep 2025 04:08:42.0134 (UTC) X-MS-Exchange-CrossTenant-Network-Message-Id: b22e721f-9603-4f2d-a422-08ddeb68bc0a X-MS-Exchange-CrossTenant-Id: 43083d15-7273-40c1-b7db-39efd9ccc17a X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=43083d15-7273-40c1-b7db-39efd9ccc17a;Ip=[216.228.118.233];Helo=[mail.nvidia.com] X-MS-Exchange-CrossTenant-AuthSource: SJ5PEPF000001D5.namprd05.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Anonymous X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem X-MS-Exchange-Transport-CrossTenantHeadersStamped: IA1PR12MB9497 Content-Type: text/plain; charset="utf-8" From: Ankit Agrawal nvgrace-egm module stores the list of retired page offsets to be made available for usermode processes. Introduce an ioctl to share the information with the userspace. 
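A usermode caller could use the ioctl roughly as sketched here. This is a minimal sketch under stated assumptions, not a tested client. The structures are local copies of the ones proposed in include/uapi/linux/egm.h, rewritten with <stdint.h> types so the example builds without that header. The helpers egm_list_bytes and egm_query_retired_pages are hypothetical names. The two-call pattern (first ask for the required size, then fetch the entries) follows the argsz handling in this RFC and may change in later revisions.

```c
#include <stdint.h>
#include <stdlib.h>
#include <sys/ioctl.h>

/* Local copies of the proposed uapi structures (assumption: RFC layout). */
struct egm_retired_pages_info {
	uint64_t offset;
	uint64_t size;
};

struct egm_retired_pages_list {
	uint32_t argsz;
	uint32_t count;
	struct egm_retired_pages_info retired_pages[];
};

#define EGM_TYPE ('E')
#define EGM_RETIRED_PAGES_LIST _IO(EGM_TYPE, 100)

/* Hypothetical helper: bytes needed for a list holding n entries. */
static size_t egm_list_bytes(uint32_t n)
{
	return sizeof(struct egm_retired_pages_list) +
	       (size_t)n * sizeof(struct egm_retired_pages_info);
}

/*
 * Hypothetical helper: query the retired pages from an open EGM chardev.
 * Returns a malloc()ed list that the caller frees, or NULL on error.
 */
static struct egm_retired_pages_list *egm_query_retired_pages(int fd)
{
	struct egm_retired_pages_list hdr = { .argsz = sizeof(hdr) };
	struct egm_retired_pages_list *list;

	/* First call: header only; the driver reports the size it needs. */
	if (ioctl(fd, EGM_RETIRED_PAGES_LIST, &hdr) < 0)
		return NULL;
	if (hdr.argsz < sizeof(hdr))
		return NULL;

	list = calloc(1, hdr.argsz);
	if (!list)
		return NULL;

	/* Second call: the buffer is now large enough for every entry. */
	list->argsz = hdr.argsz;
	if (ioctl(fd, EGM_RETIRED_PAGES_LIST, list) < 0) {
		free(list);
		return NULL;
	}
	return list;
}
```

On success, list->count holds the number of valid entries in list->retired_pages, each giving a page offset and size within the EGM region.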
The ioctl is called by usermode apps such as QEMU to get the retired page
offsets. The usermode apps are expected to take appropriate action to
communicate the list to the VM.

Signed-off-by: Ankit Agrawal
---
 MAINTAINERS                        |  1 +
 drivers/vfio/pci/nvgrace-gpu/egm.c | 67 ++++++++++++++++++++++++++++++
 include/uapi/linux/egm.h           | 26 ++++++++++++
 3 files changed, 94 insertions(+)
 create mode 100644 include/uapi/linux/egm.h

diff --git a/MAINTAINERS b/MAINTAINERS
index ec6bc10f346d..bd2d2d309d92 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -26481,6 +26481,7 @@ M:	Ankit Agrawal
 L:	kvm@vger.kernel.org
 S:	Supported
 F:	drivers/vfio/pci/nvgrace-gpu/egm.c
+F:	include/uapi/linux/egm.h
 
 VFIO PCI DEVICE SPECIFIC DRIVERS
 R:	Jason Gunthorpe
diff --git a/drivers/vfio/pci/nvgrace-gpu/egm.c b/drivers/vfio/pci/nvgrace-gpu/egm.c
index 7a026b4d98f7..2cb100e39c4b 100644
--- a/drivers/vfio/pci/nvgrace-gpu/egm.c
+++ b/drivers/vfio/pci/nvgrace-gpu/egm.c
@@ -5,6 +5,7 @@
 
 #include
 #include
+#include
 
 #define MAX_EGM_NODES 4
 
@@ -90,11 +91,77 @@ static int nvgrace_egm_mmap(struct file *file, struct vm_area_struct *vma)
 			       vma->vm_page_prot);
 }
 
+static long nvgrace_egm_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
+{
+	unsigned long minsz = offsetofend(struct egm_retired_pages_list, count);
+	struct egm_retired_pages_list info;
+	void __user *uarg = (void __user *)arg;
+	struct chardev *egm_chardev = file->private_data;
+
+	if (copy_from_user(&info, uarg, minsz))
+		return -EFAULT;
+
+	if (info.argsz < minsz || !egm_chardev)
+		return -EINVAL;
+
+	switch (cmd) {
+	case EGM_RETIRED_PAGES_LIST:
+		int ret;
+		unsigned long retired_page_struct_size = sizeof(struct egm_retired_pages_info);
+		struct egm_retired_pages_info tmp;
+		struct h_node *cur_page;
+		struct hlist_node *tmp_node;
+		unsigned long bkt;
+		int count = 0, index = 0;
+
+		hash_for_each_safe(egm_chardev->htbl, bkt, tmp_node, cur_page, node)
+			count++;
+
+		if (info.argsz < (minsz + count * retired_page_struct_size)) {
+			info.argsz = minsz + count * retired_page_struct_size;
+			info.count = 0;
+			goto done;
+		} else {
+			hash_for_each_safe(egm_chardev->htbl, bkt, tmp_node, cur_page, node) {
+				/*
+				 * This check fails if there was an ECC error
+				 * after the usermode app read the count of
+				 * bad pages through this ioctl.
+				 */
+				if (minsz + index * retired_page_struct_size >= info.argsz) {
+					info.argsz = minsz + index * retired_page_struct_size;
+					info.count = index;
+					goto done;
+				}
+
+				tmp.offset = cur_page->mem_offset;
+				tmp.size = PAGE_SIZE;
+
+				ret = copy_to_user(uarg + minsz +
+						   index * retired_page_struct_size,
+						   &tmp, retired_page_struct_size);
+				if (ret)
+					return -EFAULT;
+				index++;
+			}
+
+			info.count = index;
+		}
+		break;
+	default:
+		return -EINVAL;
+	}
+
+done:
+	return copy_to_user(uarg, &info, minsz) ? -EFAULT : 0;
+}
+
 static const struct file_operations file_ops = {
 	.owner = THIS_MODULE,
 	.open = nvgrace_egm_open,
 	.release = nvgrace_egm_release,
 	.mmap = nvgrace_egm_mmap,
+	.unlocked_ioctl = nvgrace_egm_ioctl,
 };
 
 static void egm_chardev_release(struct device *dev)
diff --git a/include/uapi/linux/egm.h b/include/uapi/linux/egm.h
new file mode 100644
index 000000000000..d157fbb5e305
--- /dev/null
+++ b/include/uapi/linux/egm.h
@@ -0,0 +1,26 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+/*
+ * Copyright (c) 2025, NVIDIA CORPORATION & AFFILIATES. All rights reserved
+ */
+
+#ifndef _UAPIEGM_H
+#define _UAPIEGM_H
+
+#define EGM_TYPE ('E')
+
+struct egm_retired_pages_info {
+	__aligned_u64 offset;
+	__aligned_u64 size;
+};
+
+struct egm_retired_pages_list {
+	__u32 argsz;
+	/* out */
+	__u32 count;
+	/* out */
+	struct egm_retired_pages_info retired_pages[];
+};
+
+#define EGM_RETIRED_PAGES_LIST _IO(EGM_TYPE, 100)
+
+#endif /* _UAPIEGM_H */
-- 
2.34.1

From nobody Fri Oct 3 06:37:01 2025
Received: from NAM11-CO1-obe.outbound.protection.outlook.com (mail-co1nam11on2085.outbound.protection.outlook.com [40.107.220.85])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (No client certificate requested)
 by smtp.subspace.kernel.org (Postfix) with ESMTPS id 26AAC270ED9;
 Thu, 4 Sep 2025 04:08:50 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=fail smtp.client-ip=40.107.220.85
ARC-Seal: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116;
 t=1756958933; cv=fail;
 b=LvDj4fApgsYUT0InqeRf/e1HTlmbixZevA5g2Yy0o8NTUBt661aR9vtQSonGRkNu/Vg4eXocMba9gasQzTkT2QYYjmla9mF4DgXX2lSznwTaeKhZd0zE2/2wHZv8hky7FKVjJe1D3JySgO198rII903/lVu7uAvyXmYFljdevko=
ARC-Message-Signature: i=2; a=rsa-sha256; d=subspace.kernel.org;
 s=arc-20240116; t=1756958933; c=relaxed/simple;
 bh=RMWczn1FNhW+1XGc2lfPXuslvoZbcc6h22zsEt8Frjk=;
 h=From:To:CC:Subject:Date:Message-ID:In-Reply-To:References:
 MIME-Version:Content-Type;
 b=cpvA+lI6jscKq1hOQTbAtbOoyhCEdOGge8/hUZuU86GQw5Rp7Rnvcg5si6NLBKamzlxmYcGy3XVBOSQQexxEwd+uVv4daxV1VtDcamg+l0m/2w1mqsE82DMUENS9riZyD4edBuIxsJR3ItNxtthcKQGIQQtmHyCdwflzNKjBdWA=
ARC-Authentication-Results: i=2; smtp.subspace.kernel.org;
 dmarc=pass (p=reject dis=none) header.from=nvidia.com;
 spf=fail smtp.mailfrom=nvidia.com;
 dkim=pass (2048-bit key) header.d=Nvidia.com header.i=@Nvidia.com header.b=CyI8JK9u;
 arc=fail smtp.client-ip=40.107.220.85
Authentication-Results: smtp.subspace.kernel.org;
 dmarc=pass (p=reject dis=none) header.from=nvidia.com
Authentication-Results: smtp.subspace.kernel.org;
 spf=fail smtp.mailfrom=nvidia.com
Authentication-Results: smtp.subspace.kernel.org;
 dkim=pass (2048-bit key) header.d=Nvidia.com header.i=@Nvidia.com header.b="CyI8JK9u"
Subject: [RFC 13/14] vfio/nvgrace-egm: expose the egm size through sysfs
Date: Thu, 4 Sep 2025 04:08:27 +0000
Message-ID: <20250904040828.319452-14-ankita@nvidia.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20250904040828.319452-1-ankita@nvidia.com>
References: <20250904040828.319452-1-ankita@nvidia.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

From: Ankit Agrawal

To allocate the EGM, userspace needs to know its size. Currently, there
is no easy way for userspace to determine that.

Make nvgrace-egm expose the size through sysfs, where userspace can
query it from /egm_size.

On a 2-socket, 4 GPU Grace Blackwell setup, it shows up as:
Socket0:
/sys/devices/pci0008:00/0008:00:00.0/0008:01:00.0/nvgrace_gpu_vfio_pci.egm.4/egm/egm4/egm_size
/sys/devices/pci0009:00/0009:00:00.0/0009:01:00.0/nvgrace_gpu_vfio_pci.egm.4/egm/egm4/egm_size
Socket1:
/sys/devices/pci0018:00/0018:00:00.0/0018:01:00.0/nvgrace_gpu_vfio_pci.egm.5/egm/egm5/egm_size
/sys/devices/pci0019:00/0019:00:00.0/0019:01:00.0/nvgrace_gpu_vfio_pci.egm.5/egm/egm5/egm_size

Signed-off-by: Ankit Agrawal
---
 drivers/vfio/pci/nvgrace-gpu/egm.c | 27 +++++++++++++++++++++++++++
 1 file changed, 27 insertions(+)

diff --git a/drivers/vfio/pci/nvgrace-gpu/egm.c b/drivers/vfio/pci/nvgrace-gpu/egm.c
index 2cb100e39c4b..346607eeb0f9 100644
--- a/drivers/vfio/pci/nvgrace-gpu/egm.c
+++ b/drivers/vfio/pci/nvgrace-gpu/egm.c
@@ -343,6 +343,32 @@ static char *egm_devnode(const struct device *device, umode_t *mode)
 	return NULL;
 }
 
+static ssize_t egm_size_show(struct device *dev, struct device_attribute *attr,
+			     char *buf)
+{
+	struct chardev *egm_chardev = container_of(dev, struct chardev, device);
+	struct nvgrace_egm_dev *egm_dev =
+		egm_chardev_to_nvgrace_egm_dev(egm_chardev);
+
+	return sysfs_emit(buf, "0x%lx\n", egm_dev->egmlength);
+}
+
+static DEVICE_ATTR_RO(egm_size);
+
+static struct attribute *attrs[] = {
+	&dev_attr_egm_size.attr,
+	NULL,
+};
+
+static struct attribute_group attr_group = {
+	.attrs = attrs,
+};
+
+static const struct attribute_group *attr_groups[2] = {
+	&attr_group,
+	NULL
+};
+
 static int __init nvgrace_egm_init(void)
 {
 	int ret;
@@ -364,6 +390,7 @@ static int __init nvgrace_egm_init(void)
 	}
 
 	class->devnode = egm_devnode;
+	class->dev_groups = attr_groups;
 
 	ret = auxiliary_driver_register(&egm_driver);
 	if (!ret)
-- 
2.34.1

From nobody Fri Oct 3 06:37:01 2025
Received: from NAM11-DM6-obe.outbound.protection.outlook.com (mail-dm6nam11on2044.outbound.protection.outlook.com [40.107.223.44])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (No client certificate requested)
 by smtp.subspace.kernel.org (Postfix) with ESMTPS id 9210125A65B;
 Thu, 4 Sep 2025 04:08:47 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=fail smtp.client-ip=40.107.223.44
ARC-Seal: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116;
 t=1756958929; cv=fail;
 b=BlZkNQy2GZoDqL5p1Azmj7z8Cxr2VMFl5OBgNRwNn8SJ9zQ4PTG4x6YqI0PzZMssBmwEyYhtf85CLU7HipB7zjwvgSW3Q/UmmqQtRjkIUdiEOz0mLXx+Hiv+ub/07ojVBh+tuW2Vqo39fDCmu9+PsIozRjy5AXN3d2ZdaK+2UCI=
ARC-Message-Signature: i=2; a=rsa-sha256; d=subspace.kernel.org;
 s=arc-20240116; t=1756958929; c=relaxed/simple;
 bh=zOAXaHa/xO5N9QYTZEPrDXfQ9I6bNF/usDo6ghY65qo=;
 h=From:To:CC:Subject:Date:Message-ID:In-Reply-To:References:
 MIME-Version:Content-Type;
 b=j0moJYLVu4AAWkaT25riCcpdLTGifS65U4Eyjk3miYkhMCB/L6PjCqaA6oVBuu+th0Lk+pkY0Ueu/1NBhWEGmTJWBjOy8PFgIPH48jInNm/zRficOQb/Zj/TKoHI4WA9hjeVZXs+wQiV+XQzFj860zeE19HXg1iJEJgfdRfO/zY=
ARC-Authentication-Results: i=2; smtp.subspace.kernel.org;
 dmarc=pass (p=reject dis=none) header.from=nvidia.com;
 spf=fail smtp.mailfrom=nvidia.com;
 dkim=pass (2048-bit key) header.d=Nvidia.com header.i=@Nvidia.com header.b=aSDBu0gQ;
 arc=fail smtp.client-ip=40.107.223.44
Authentication-Results: smtp.subspace.kernel.org;
 dmarc=pass (p=reject dis=none) header.from=nvidia.com
Authentication-Results: smtp.subspace.kernel.org;
 spf=fail smtp.mailfrom=nvidia.com
Authentication-Results: smtp.subspace.kernel.org;
 dkim=pass (2048-bit key) header.d=Nvidia.com header.i=@Nvidia.com header.b="aSDBu0gQ"
Subject: [RFC 14/14] vfio/nvgrace-gpu: Add link from pci to EGM
Date: Thu, 4 Sep 2025 04:08:28 +0000
Message-ID: <20250904040828.319452-15-ankita@nvidia.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20250904040828.319452-1-ankita@nvidia.com>
References: <20250904040828.319452-1-ankita@nvidia.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

From: Ankit Agrawal

To replicate the host EGM topology in the VM in terms of the GPU
affinity, userspace needs to be aware of which GPUs belong to the same
socket as the EGM region.

Expose the list of GPUs associated with an EGM region through sysfs.
The list can be queried from the auxiliary device path.

On a 2-socket, 4 GPU Grace Blackwell setup, it shows up as the following:
/sys/devices/pci0008:00/0008:00:00.0/0008:01:00.0/nvgrace_gpu_vfio_pci.egm.4
/sys/devices/pci0009:00/0009:00:00.0/0009:01:00.0/nvgrace_gpu_vfio_pci.egm.4
pointing to egm4.

/sys/devices/pci0018:00/0018:00:00.0/0018:01:00.0/nvgrace_gpu_vfio_pci.egm.5
/sys/devices/pci0019:00/0019:00:00.0/0019:01:00.0/nvgrace_gpu_vfio_pci.egm.5
pointing to egm5.

Moreover
/sys/devices/pci0008:00/0008:00:00.0/0008:01:00.0/nvgrace_gpu_vfio_pci.egm.4
/sys/devices/pci0009:00/0009:00:00.0/0009:01:00.0/nvgrace_gpu_vfio_pci.egm.4
lists links to both the 0008:01:00.0 & 0009:01:00.0 GPU devices.

and
/sys/devices/pci0018:00/0018:00:00.0/0018:01:00.0/nvgrace_gpu_vfio_pci.egm.5
/sys/devices/pci0019:00/0019:00:00.0/0019:01:00.0/nvgrace_gpu_vfio_pci.egm.5
lists links to both the 0018:01:00.0 & 0019:01:00.0.

Suggested-by: Matthew R. Ochs
Signed-off-by: Ankit Agrawal
---
 drivers/vfio/pci/nvgrace-gpu/egm_dev.c | 42 +++++++++++++++++++++++++-
 1 file changed, 41 insertions(+), 1 deletion(-)

diff --git a/drivers/vfio/pci/nvgrace-gpu/egm_dev.c b/drivers/vfio/pci/nvgrace-gpu/egm_dev.c
index b8e143542bce..20e9213aa0ac 100644
--- a/drivers/vfio/pci/nvgrace-gpu/egm_dev.c
+++ b/drivers/vfio/pci/nvgrace-gpu/egm_dev.c
@@ -56,6 +56,36 @@ int nvgrace_gpu_fetch_egm_property(struct pci_dev *pdev, u64 *pegmphys,
 	return ret;
 }
 
+static int create_egm_symlinks(struct nvgrace_egm_dev *egm_dev,
+			       struct pci_dev *pdev)
+{
+	int ret_l1, ret_l2;
+
+	ret_l1 = sysfs_create_link_nowarn(&pdev->dev.kobj,
+					  &egm_dev->aux_dev.dev.kobj,
+					  dev_name(&egm_dev->aux_dev.dev));
+
+	/*
+	 * Allow if Link already exists - created since GPU is the auxiliary
+	 * device's parent; flag the error otherwise.
+	 */
+	if (ret_l1 && ret_l1 != -EEXIST)
+		return ret_l1;
+
+	ret_l2 = sysfs_create_link(&egm_dev->aux_dev.dev.kobj,
+				   &pdev->dev.kobj,
+				   dev_name(&pdev->dev));
+
+	/*
+	 * Remove the aux dev link only if wasn't already present.
+	 */
+	if (ret_l2 && !ret_l1)
+		sysfs_remove_link(&pdev->dev.kobj,
+				  dev_name(&egm_dev->aux_dev.dev));
+
+	return ret_l2;
+}
+
 int add_gpu(struct nvgrace_egm_dev *egm_dev, struct pci_dev *pdev)
 {
 	struct gpu_node *node;
@@ -68,7 +98,16 @@ int add_gpu(struct nvgrace_egm_dev *egm_dev, struct pci_dev *pdev)
 
 	list_add_tail(&node->list, &egm_dev->gpus);
 
-	return 0;
+	return create_egm_symlinks(egm_dev, pdev);
+}
+
+static void remove_egm_symlinks(struct nvgrace_egm_dev *egm_dev,
+				struct pci_dev *pdev)
+{
+	sysfs_remove_link(&pdev->dev.kobj,
+			  dev_name(&egm_dev->aux_dev.dev));
+	sysfs_remove_link(&egm_dev->aux_dev.dev.kobj,
+			  dev_name(&pdev->dev));
 }
 
 void remove_gpu(struct nvgrace_egm_dev *egm_dev, struct pci_dev *pdev)
@@ -77,6 +116,7 @@ void remove_gpu(struct nvgrace_egm_dev *egm_dev, struct pci_dev *pdev)
 
 	list_for_each_entry_safe(node, tmp, &egm_dev->gpus, list) {
 		if (node->pdev == pdev) {
+			remove_egm_symlinks(egm_dev, pdev);
 			list_del(&node->list);
 			kvfree(node);
 		}
-- 
2.34.1
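
Note on the EGM_RETIRED_PAGES_LIST ioctl posted earlier in this series: the handler sizes the user buffer as a fixed header (argsz and count) plus one egm_retired_pages_info record per retired page. A minimal userspace-side sketch of that arithmetic follows. It assumes the struct layout from the proposed include/uapi/linux/egm.h. The local struct copies and the helper names egm_list_bytes() and egm_list_capacity() are hypothetical and are not part of the patches.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local mirrors of the proposed uapi structs (assumption: layout as posted). */
struct egm_retired_pages_info {
	uint64_t offset __attribute__((aligned(8)));
	uint64_t size __attribute__((aligned(8)));
};

struct egm_retired_pages_list {
	uint32_t argsz;
	uint32_t count;
	struct egm_retired_pages_info retired_pages[];
};

/* The fixed header should match the handler's minsz (end of 'count'). */
static_assert(offsetof(struct egm_retired_pages_list, retired_pages) == 8,
	      "unexpected header size");

/* Hypothetical helper: bytes to allocate, and to report in argsz, for n entries. */
static inline size_t egm_list_bytes(size_t n)
{
	return offsetof(struct egm_retired_pages_list, retired_pages) +
	       n * sizeof(struct egm_retired_pages_info);
}

/* Hypothetical helper: whole entries that fit in argsz bytes (0 if too small). */
static inline size_t egm_list_capacity(size_t argsz)
{
	size_t hdr = offsetof(struct egm_retired_pages_list, retired_pages);

	if (argsz < hdr)
		return 0;
	return (argsz - hdr) / sizeof(struct egm_retired_pages_info);
}
```

Going by the handler as posted, a caller would probably issue the ioctl once with argsz set to egm_list_bytes(0), read back the argsz the kernel reports, allocate that many bytes, and call again. It should check count on return, since the list can grow between the two calls.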