From nobody Mon Dec 1 21:33:25 2025
Received: from SN4PR2101CU001.outbound.protection.outlook.com
	(mail-southcentralusazon11012060.outbound.protection.outlook.com [40.93.195.60])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id 7986A19C540
	for ; Fri, 28 Nov 2025 04:42:36 +0000 (UTC)
From: Jordan Niethe
To: linux-mm@kvack.org
Cc: balbirs@nvidia.com, matthew.brost@intel.com, akpm@linux-foundation.org,
	linux-kernel@vger.kernel.org, dri-devel@lists.freedesktop.org,
	david@redhat.com, ziy@nvidia.com, apopple@nvidia.com,
	lorenzo.stoakes@oracle.com, lyude@redhat.com, dakr@kernel.org,
	airlied@gmail.com, simona@ffwll.ch, rcampbell@nvidia.com,
	mpenttil@redhat.com, jgg@nvidia.com, willy@infradead.org
Subject: [RFC PATCH 1/6] mm/hmm: Add flag to track device private PFNs
Date: Fri, 28 Nov 2025 15:41:41 +1100
Message-Id: <20251128044146.80050-2-jniethe@nvidia.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20251128044146.80050-1-jniethe@nvidia.com>
References: <20251128044146.80050-1-jniethe@nvidia.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

A future change will remove device private pages from the physical
address space. This will mean that device private pages no longer have
a normal PFN and must be handled separately.

Prepare for this by adding a HMM_PFN_DEVICE_PRIVATE flag to indicate
that a hmm_pfn contains a PFN for a device private page.

Signed-off-by: Jordan Niethe
Signed-off-by: Alistair Popple
---
 include/linux/hmm.h | 2 ++
 mm/hmm.c            | 2 +-
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/include/linux/hmm.h b/include/linux/hmm.h
index db75ffc949a7..df571fa75a44 100644
--- a/include/linux/hmm.h
+++ b/include/linux/hmm.h
@@ -23,6 +23,7 @@ struct mmu_interval_notifier;
  * HMM_PFN_WRITE - if the page memory can be written to (requires HMM_PFN_VALID)
  * HMM_PFN_ERROR - accessing the pfn is impossible and the device should
  *                 fail. ie poisoned memory, special pages, no vma, etc
+ * HMM_PFN_DEVICE_PRIVATE - the pfn field contains a DEVICE_PRIVATE pfn.
  * HMM_PFN_P2PDMA - P2P page
  * HMM_PFN_P2PDMA_BUS - Bus mapped P2P transfer
  * HMM_PFN_DMA_MAPPED - Flag preserved on input-to-output transformation
@@ -40,6 +41,7 @@ enum hmm_pfn_flags {
 	HMM_PFN_VALID = 1UL << (BITS_PER_LONG - 1),
 	HMM_PFN_WRITE = 1UL << (BITS_PER_LONG - 2),
 	HMM_PFN_ERROR = 1UL << (BITS_PER_LONG - 3),
+	HMM_PFN_DEVICE_PRIVATE = 1UL << (BITS_PER_LONG - 7),
 	/*
 	 * Sticky flags, carried from input to output,
 	 * don't forget to update HMM_PFN_INOUT_FLAGS
diff --git a/mm/hmm.c b/mm/hmm.c
index 87562914670a..1cff68ade1d4 100644
--- a/mm/hmm.c
+++ b/mm/hmm.c
@@ -262,7 +262,7 @@ static int hmm_vma_handle_pte(struct mm_walk *walk, unsigned long addr,
 		if (is_device_private_entry(entry) &&
 		    page_pgmap(pfn_swap_entry_to_page(entry))->owner ==
 		    range->dev_private_owner) {
-			cpu_flags = HMM_PFN_VALID;
+			cpu_flags = HMM_PFN_VALID | HMM_PFN_DEVICE_PRIVATE;
 			if (is_writable_device_private_entry(entry))
 				cpu_flags |= HMM_PFN_WRITE;
 			new_pfn_flags = swp_offset_pfn(entry) | cpu_flags;
-- 
2.34.1

From nobody Mon Dec 1 21:33:25 2025
Received: from SN4PR0501CU005.outbound.protection.outlook.com
	(mail-southcentralusazon11011053.outbound.protection.outlook.com [40.93.194.53])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id 8A7D12FF65C
	for ; Fri, 28 Nov 2025 04:42:41 +0000 (UTC)
From: Jordan Niethe
To: linux-mm@kvack.org
Cc: balbirs@nvidia.com, matthew.brost@intel.com, akpm@linux-foundation.org,
	linux-kernel@vger.kernel.org, dri-devel@lists.freedesktop.org,
	david@redhat.com, ziy@nvidia.com, apopple@nvidia.com,
	lorenzo.stoakes@oracle.com, lyude@redhat.com, dakr@kernel.org,
	airlied@gmail.com, simona@ffwll.ch, rcampbell@nvidia.com,
	mpenttil@redhat.com, jgg@nvidia.com, willy@infradead.org
Subject: [RFC PATCH 2/6] mm/migrate_device: Add migrate PFN flag to track device private PFNs
Date: Fri, 28 Nov 2025 15:41:42 +1100
Message-Id: <20251128044146.80050-3-jniethe@nvidia.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20251128044146.80050-1-jniethe@nvidia.com>
References: <20251128044146.80050-1-jniethe@nvidia.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

A future change will remove device private pages from the physical
address space. This will mean that device private pages no longer have
a normal PFN and must be handled separately.

Prepare for this by adding a MIGRATE_PFN_DEVICE flag to indicate
that a migrate pfn contains a PFN for a device private page.
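
For readers less familiar with the migrate PFN encoding, here is a minimal
userspace sketch of how the new flag could sit next to the existing ones.
The flag values and MIGRATE_PFN_SHIFT are copied from the
include/linux/migrate.h hunk in this patch. The migrate_pfn() body is an
assumption meant to mirror the kernel helper of the same name, and
mpfn_to_pfn() and mpfn_is_device() are hypothetical names used only for
illustration. They are not part of this series.

```c
/* Flag values and shift as they appear in this patch's migrate.h hunk. */
#define MIGRATE_PFN_VALID	(1UL << 0)
#define MIGRATE_PFN_MIGRATE	(1UL << 1)
#define MIGRATE_PFN_WRITE	(1UL << 3)
#define MIGRATE_PFN_DEVICE	(1UL << 4)
#define MIGRATE_PFN_SHIFT	6

/*
 * Assumption: mirrors the kernel's migrate_pfn(), which stores the PFN
 * above the flag bits and marks the entry valid.
 */
static inline unsigned long migrate_pfn(unsigned long pfn)
{
	return (pfn << MIGRATE_PFN_SHIFT) | MIGRATE_PFN_VALID;
}

/* Hypothetical helper: recover the raw PFN by dropping the flag bits. */
static inline unsigned long mpfn_to_pfn(unsigned long mpfn)
{
	return mpfn >> MIGRATE_PFN_SHIFT;
}

/*
 * Hypothetical helper: true only for a valid entry that is tagged as
 * holding a device private PFN.
 */
static inline int mpfn_is_device(unsigned long mpfn)
{
	return (mpfn & MIGRATE_PFN_VALID) && (mpfn & MIGRATE_PFN_DEVICE);
}
```

Because the flag lives in the low bits below MIGRATE_PFN_SHIFT, tagging an
entry does not disturb the PFN itself, so a consumer can test the flag
first and only then decide how to interpret the PFN bits.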
Signed-off-by: Jordan Niethe Signed-off-by: Alistair Popple --- Note: Existing drivers must also be updated in next revision. --- include/linux/migrate.h | 1 + lib/test_hmm.c | 3 ++- mm/migrate_device.c | 5 +++-- 3 files changed, 6 insertions(+), 3 deletions(-) diff --git a/include/linux/migrate.h b/include/linux/migrate.h index 1f0ac122c3bf..d8f520dca342 100644 --- a/include/linux/migrate.h +++ b/include/linux/migrate.h @@ -125,6 +125,7 @@ static inline int migrate_misplaced_folio(struct folio = *folio, int node) #define MIGRATE_PFN_VALID (1UL << 0) #define MIGRATE_PFN_MIGRATE (1UL << 1) #define MIGRATE_PFN_WRITE (1UL << 3) +#define MIGRATE_PFN_DEVICE (1UL << 4) #define MIGRATE_PFN_SHIFT 6 =20 static inline struct page *migrate_pfn_to_page(unsigned long mpfn) diff --git a/lib/test_hmm.c b/lib/test_hmm.c index 83e3d8208a54..0035e1b7beec 100644 --- a/lib/test_hmm.c +++ b/lib/test_hmm.c @@ -684,7 +684,8 @@ static void dmirror_migrate_alloc_and_copy(struct migra= te_vma *args, =20 pr_debug("migrating from sys to dev pfn src: 0x%lx pfn dst: 0x%lx\n", page_to_pfn(spage), page_to_pfn(dpage)); - *dst =3D migrate_pfn(page_to_pfn(dpage)); + *dst =3D migrate_pfn(page_to_pfn(dpage)) | + MIGRATE_PFN_DEVICE; if ((*src & MIGRATE_PFN_WRITE) || (!spage && args->vma->vm_flags & VM_WRITE)) *dst |=3D MIGRATE_PFN_WRITE; diff --git a/mm/migrate_device.c b/mm/migrate_device.c index abd9f6850db6..82f09b24d913 100644 --- a/mm/migrate_device.c +++ b/mm/migrate_device.c @@ -148,7 +148,8 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp, goto next; =20 mpfn =3D migrate_pfn(page_to_pfn(page)) | - MIGRATE_PFN_MIGRATE; + MIGRATE_PFN_MIGRATE | + MIGRATE_PFN_DEVICE; if (is_writable_device_private_entry(entry)) mpfn |=3D MIGRATE_PFN_WRITE; } else { @@ -918,7 +919,7 @@ static unsigned long migrate_device_pfn_lock(unsigned l= ong pfn) return 0; } =20 - return migrate_pfn(pfn) | MIGRATE_PFN_MIGRATE; + return migrate_pfn(pfn) | MIGRATE_PFN_MIGRATE | MIGRATE_PFN_DEVICE; } =20 /** --=20 2.34.1 From 
nobody Mon Dec 1 21:33:25 2025 Received: from PH7PR06CU001.outbound.protection.outlook.com (mail-westus3azon11010000.outbound.protection.outlook.com [52.101.201.0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 6079B2FFF95 for ; Fri, 28 Nov 2025 04:42:46 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=fail smtp.client-ip=52.101.201.0 ARC-Seal: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1764304967; cv=fail; b=AYxS6d8pzP+mydQekeBzGG5OdsAHOcBaNbsnLOPKT44cpFolCJ72CJlp5JS+DqZkQscUDqope7nqpzm5+88w+WzxJXGJe9jH1pTnOAnhDzvRKOvcw5VCGEjqLHwlUGU++N0pOXX5mwfgxgEB8xNnVk+J43pfod/rZ1R0hGWQe8c= ARC-Message-Signature: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1764304967; c=relaxed/simple; bh=VkDRbrFwUWi7US13qwcHo+D5yw0of/onjhPqi6P/5QQ=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: Content-Type:MIME-Version; b=QIePz+OHwlr5qgfEboI7H0i0u1xCYJqvLVoE3/0nZPGJg2IsWWgid8eBHSYGDWWaKdZqVtDc9xI6Ao9uPt1dwXy72rcu/l65C8iQuBNqH37YAnE+hVdGazeAbparSZfTWzz8CqaFJTQUaXbwBFLprUQWZ3FfQxJycCgQV2AIoa4= ARC-Authentication-Results: i=2; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=nvidia.com; spf=fail smtp.mailfrom=nvidia.com; dkim=pass (2048-bit key) header.d=Nvidia.com header.i=@Nvidia.com header.b=BWyqxuBX; arc=fail smtp.client-ip=52.101.201.0 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=nvidia.com Authentication-Results: smtp.subspace.kernel.org; spf=fail smtp.mailfrom=nvidia.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=Nvidia.com header.i=@Nvidia.com header.b="BWyqxuBX" ARC-Seal: i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none; 
b=OujqEzTuyiqrDyzUL4S9ZSbNdhjk1rKgVf6ev4sXFeVYZjjKif/MyKzjY7zXAYxjWUZCu0jdnZ7XffCVL7K9ozRi6ZKITtysjnf3dPxpObZ2c+lSDNa0Rs5B3GJcCmsUYGI/atgB3RejWVA2pzkI+x+3XUTFiK2lyqF6lqoLzj47WeBnTz79ii/vFrVY4YWbOB+N/649NDbAj/+J/8pqaU3ql0lVqb1cJut7sL0UrR1XYg2upb1bm+B8b1AA68SMUyu0ef32yjmLsZ7NMNQza8xJvFgQOuPfbquHB65UZr7xLGvZO/mfpvuFoGOwp2KIz0QGEIcSb7VH4Pjl4xWuwA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=NR/puIQWtRRGPq8UicCYMmQO8w2eMsSWf3EGcjyMrVQ=; b=xWxt7Ic3j9v/CMmLKAo6C+WWkRoI35hyFoE6eIX+pyjnXnOwmii9e5JxZQ3eJUNffEFHuAgWo9vOrYtaM/8EwcdhxR4yPn2TwDI6OG7lMz961kbGe4gXnWUAm2kIx9JxrZp9pnkEzU0cXpICh4ypGir7r2UffKWw670NJn9WtSPE0+Urjz0ADRDRus7wtEZ1OkArfqOqTAh95DGqOqrUylRO6r0Lb6711tGJ7OF7KpvUROOH6b0DWkviPg/c9wl1PSiFFovvABWvrV3yfzg/enMJ3irCDTSFszR1wjpAbkMkaZ/NyIKscD8N+9g5R/vpLZMGG2vsADdTkiOzx6OR1g== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=nvidia.com; dmarc=pass action=none header.from=nvidia.com; dkim=pass header.d=nvidia.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=Nvidia.com; s=selector2; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=NR/puIQWtRRGPq8UicCYMmQO8w2eMsSWf3EGcjyMrVQ=; b=BWyqxuBXXFeRU723AV7nrcIuI8rURSKU6BqFqUQUliQyxBlREDYGPAqjtCNpcepMI9KV9NwmcYryrCBDp/HlwkfhlRulzzDjZUxVLSdx1XcrZhKHBhwD7n6h46abwsFcekRYcw84V00FrA874Q8hzLNxFuLElzwJBnBdrSQBrGQtNXci+XvcZMdf+ke9l5rmxPaQimyOQFuCpvoYFjSI78bF9LLldV3IDfERtIoeL1XZ7+ykfUxdezVQ2WXOkJqcagQkYAidnvI/grwOzMA60YuXYmccf9N0M0KvFF+Bzie71Kp0zBWB8DgQH9e7lyrMc/m0lps874mdReTVVL6IjQ== Authentication-Results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=nvidia.com; Received: from DM4PR12MB9072.namprd12.prod.outlook.com (2603:10b6:8:be::6) by MW6PR12MB8735.namprd12.prod.outlook.com 
(2603:10b6:303:245::10) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.9366.13; Fri, 28 Nov 2025 04:42:43 +0000 Received: from DM4PR12MB9072.namprd12.prod.outlook.com ([fe80::9e49:782:8e98:1ff1]) by DM4PR12MB9072.namprd12.prod.outlook.com ([fe80::9e49:782:8e98:1ff1%3]) with mapi id 15.20.9343.016; Fri, 28 Nov 2025 04:42:43 +0000 From: Jordan Niethe To: linux-mm@kvack.org Cc: balbirs@nvidia.com, matthew.brost@intel.com, akpm@linux-foundation.org, linux-kernel@vger.kernel.org, dri-devel@lists.freedesktop.org, david@redhat.com, ziy@nvidia.com, apopple@nvidia.com, lorenzo.stoakes@oracle.com, lyude@redhat.com, dakr@kernel.org, airlied@gmail.com, simona@ffwll.ch, rcampbell@nvidia.com, mpenttil@redhat.com, jgg@nvidia.com, willy@infradead.org Subject: [RFC PATCH 3/6] mm/page_vma_mapped: Add flags to page_vma_mapped_walk::pfn to track device private PFNs Date: Fri, 28 Nov 2025 15:41:43 +1100 Message-Id: <20251128044146.80050-4-jniethe@nvidia.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20251128044146.80050-1-jniethe@nvidia.com> References: <20251128044146.80050-1-jniethe@nvidia.com> Content-Transfer-Encoding: quoted-printable X-ClientProxiedBy: SJ0PR03CA0297.namprd03.prod.outlook.com (2603:10b6:a03:39e::32) To DM4PR12MB9072.namprd12.prod.outlook.com (2603:10b6:8:be::6) Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: DM4PR12MB9072:EE_|MW6PR12MB8735:EE_ X-MS-Office365-Filtering-Correlation-Id: 20c53a5c-a002-476a-2480-08de2e3891c4 X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0;ARA:13230040|366016|7416014|376014|1800799024; X-Microsoft-Antispam-Message-Info: =?us-ascii?Q?S+xbHesd+n2R8I7wITbhHd4DpQ/bW64h0DxWeqdmW1NqrKwKvDkAAVraYeUN?= =?us-ascii?Q?YXskEXaHdpvmD2vRQPwDTVZDsxWjs3d2A+G3hAjgn5BZUYMDh7zhg+lbDk5M?= 
X-OriginatorOrg: Nvidia.com
X-MS-Exchange-CrossTenant-Network-Message-Id: 20c53a5c-a002-476a-2480-08de2e3891c4
X-MS-Exchange-CrossTenant-AuthSource: DM4PR12MB9072.namprd12.prod.outlook.com
X-MS-Exchange-CrossTenant-AuthAs: Internal
X-MS-Exchange-CrossTenant-OriginalArrivalTime: 28 Nov 2025 04:42:43.5186 (UTC)
X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted
X-MS-Exchange-CrossTenant-Id: 43083d15-7273-40c1-b7db-39efd9ccc17a
X-MS-Exchange-CrossTenant-MailboxType: HOSTED
X-MS-Exchange-CrossTenant-UserPrincipalName: 5oJrcPTRRTgdEpQaGd5+G2CpbIVkC0mxd/pB7bmPcWGTEo09hDziIBA1luec+nvN0qyR5GF0sY1hEQDMZSeikw==
X-MS-Exchange-Transport-CrossTenantHeadersStamped: MW6PR12MB8735
Content-Type: text/plain; charset="utf-8"

A future change will remove device private pages from the physical
address space. Device private pages will then no longer have normal
PFNs and must be handled separately.

Prepare for this by modifying page_vma_mapped_walk::pfn to contain flags
as well as a PFN: the PFN is stored shifted left by PVMW_PFN_SHIFT,
which leaves the low bit free for flags. Introduce a
PVMW_PFN_DEVICE_PRIVATE flag to indicate that a
page_vma_mapped_walk::pfn contains a PFN for a device private page.

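For illustration only (this is not part of the patch), the encoding can
be modelled in plain userspace C. PVMW_PFN_DEVICE_PRIVATE, PVMW_PFN_SHIFT
and page_vma_walk_pfn() mirror the definitions in this patch; the helpers
pvmw_pfn_to_raw(), pvmw_pfn_is_device_private() and pvmw_range_overlaps()
are hypothetical names introduced here for the sketch and do not exist in
the series.

```c
#include <assert.h>
#include <stdbool.h>

/* Same values as in the patch: bit 0 is the flag, the PFN sits above it. */
#define PVMW_PFN_DEVICE_PRIVATE	(1UL << 0)
#define PVMW_PFN_SHIFT		1

/* Encode a raw PFN so that the low bit is free for flags. */
static inline unsigned long page_vma_walk_pfn(unsigned long pfn)
{
	return pfn << PVMW_PFN_SHIFT;
}

/* Hypothetical helper: strip the flag bits and recover the raw PFN. */
static inline unsigned long pvmw_pfn_to_raw(unsigned long pvmw_pfn)
{
	return pvmw_pfn >> PVMW_PFN_SHIFT;
}

/* Hypothetical helper: does the encoded value carry the device private flag? */
static inline bool pvmw_pfn_is_device_private(unsigned long pvmw_pfn)
{
	return (pvmw_pfn & PVMW_PFN_DEVICE_PRIVATE) != 0;
}

/*
 * Hypothetical helper modelled on the range test in check_pte():
 * does [pfn, pfn + pte_nr) overlap the nr_pages pages starting at the
 * PFN encoded in pvmw_pfn?
 */
static inline bool pvmw_range_overlaps(unsigned long pvmw_pfn,
				       unsigned long nr_pages,
				       unsigned long pfn,
				       unsigned long pte_nr)
{
	unsigned long start = pvmw_pfn >> PVMW_PFN_SHIFT;

	if ((pfn + pte_nr - 1) < start)
		return false;
	if (pfn > (start + nr_pages - 1))
		return false;
	return true;
}
```

The point of the sketch is that every reader of the field has to shift
before comparing against a raw PFN, which is why check_pte() and
lru_gen_look_around() change in this patch.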
Signed-off-by: Jordan Niethe
Signed-off-by: Alistair Popple
---
 include/linux/rmap.h | 26 +++++++++++++++++++++++++-
 mm/page_vma_mapped.c |  6 +++---
 mm/rmap.c            |  4 ++--
 mm/vmscan.c          |  2 +-
 4 files changed, 31 insertions(+), 7 deletions(-)

diff --git a/include/linux/rmap.h b/include/linux/rmap.h
index daa92a58585d..79e5c733d9c8 100644
--- a/include/linux/rmap.h
+++ b/include/linux/rmap.h
@@ -939,9 +939,33 @@ struct page_vma_mapped_walk {
 	unsigned int flags;
 };
 
+/* pfn is a device private offset */
+#define PVMW_PFN_DEVICE_PRIVATE	(1UL << 0)
+#define PVMW_PFN_SHIFT		1
+
+static inline unsigned long page_vma_walk_pfn(unsigned long pfn)
+{
+	return (pfn << PVMW_PFN_SHIFT);
+}
+
+static inline unsigned long folio_page_vma_walk_pfn(const struct folio *folio)
+{
+	return page_vma_walk_pfn(folio_pfn(folio));
+}
+
+static inline struct page *page_vma_walk_pfn_to_page(unsigned long pvmw_pfn)
+{
+	return pfn_to_page(pvmw_pfn >> PVMW_PFN_SHIFT);
+}
+
+static inline struct folio *page_vma_walk_pfn_to_folio(unsigned long pvmw_pfn)
+{
+	return page_folio(page_vma_walk_pfn_to_page(pvmw_pfn));
+}
+
 #define DEFINE_FOLIO_VMA_WALK(name, _folio, _vma, _address, _flags)	\
 	struct page_vma_mapped_walk name = {				\
-		.pfn = folio_pfn(_folio),				\
+		.pfn = folio_page_vma_walk_pfn(_folio),			\
 		.nr_pages = folio_nr_pages(_folio),			\
 		.pgoff = folio_pgoff(_folio),				\
 		.vma = _vma,						\
diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
index c498a91b6706..9146bd084435 100644
--- a/mm/page_vma_mapped.c
+++ b/mm/page_vma_mapped.c
@@ -133,9 +133,9 @@ static bool check_pte(struct page_vma_mapped_walk *pvmw, unsigned long pte_nr)
 		pfn = pte_pfn(ptent);
 	}
 
-	if ((pfn + pte_nr - 1) < pvmw->pfn)
+	if ((pfn + pte_nr - 1) < (pvmw->pfn >> PVMW_PFN_SHIFT))
 		return false;
-	if (pfn > (pvmw->pfn + pvmw->nr_pages - 1))
+	if (pfn > ((pvmw->pfn >> PVMW_PFN_SHIFT) + pvmw->nr_pages - 1))
 		return false;
 	return true;
 }
@@ -346,7 +346,7 @@ unsigned long page_mapped_in_vma(const struct page *page,
 {
 	const struct folio *folio = page_folio(page);
 	struct page_vma_mapped_walk pvmw = {
-		.pfn = page_to_pfn(page),
+		.pfn = folio_page_vma_walk_pfn(folio),
 		.nr_pages = 1,
 		.vma = vma,
 		.flags = PVMW_SYNC,
diff --git a/mm/rmap.c b/mm/rmap.c
index ac4f783d6ec2..e94500318f92 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1129,7 +1129,7 @@ static bool mapping_wrprotect_range_one(struct folio *folio,
 {
 	struct wrprotect_file_state *state = (struct wrprotect_file_state *)arg;
 	struct page_vma_mapped_walk pvmw = {
-		.pfn = state->pfn,
+		.pfn = page_vma_walk_pfn(state->pfn),
 		.nr_pages = state->nr_pages,
 		.pgoff = state->pgoff,
 		.vma = vma,
@@ -1207,7 +1207,7 @@ int pfn_mkclean_range(unsigned long pfn, unsigned long nr_pages, pgoff_t pgoff,
 		      struct vm_area_struct *vma)
 {
 	struct page_vma_mapped_walk pvmw = {
-		.pfn = pfn,
+		.pfn = page_vma_walk_pfn(pfn),
 		.nr_pages = nr_pages,
 		.pgoff = pgoff,
 		.vma = vma,
diff --git a/mm/vmscan.c b/mm/vmscan.c
index b2fc8b626d3d..e07ad830e30a 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -4238,7 +4238,7 @@ bool lru_gen_look_around(struct page_vma_mapped_walk *pvmw)
 	pte_t *pte = pvmw->pte;
 	unsigned long addr = pvmw->address;
 	struct vm_area_struct *vma = pvmw->vma;
-	struct folio *folio = pfn_folio(pvmw->pfn);
+	struct folio *folio = page_vma_walk_pfn_to_folio(pvmw->pfn);
 	struct mem_cgroup *memcg = folio_memcg(folio);
 	struct pglist_data *pgdat = folio_pgdat(folio);
 	struct lruvec *lruvec = mem_cgroup_lruvec(memcg, pgdat);
-- 
2.34.1

From nobody Mon Dec 1 21:33:25 2025
Received: from SN4PR0501CU005.outbound.protection.outlook.com (mail-southcentralusazon11011040.outbound.protection.outlook.com [40.93.194.40]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 1CA72301465 for ; Fri, 28 Nov 2025 04:42:52 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=fail
smtp.client-ip=40.93.194.40 ARC-Seal: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1764304975; cv=fail; b=OB+OnDWqISmNSa/UTK8WCH125KGZNOL7A01cTh2fDzI/0PtevR/Imvtn5OHYbUl5jZN27CCC1STloKbHGhlApzXYWYNG4ke1uMYq11DaBSewKiOndGlF2//xs4ayJPMzRl9youcBsDKev7OsA/v7Y91dGlNwEV3Fze2Khkww+sw= ARC-Message-Signature: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1764304975; c=relaxed/simple; bh=/4/VFZoV9b7KVZkbEveSVQAEMeaQn1zRi25rEqEwRoM=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: Content-Type:MIME-Version; b=hoU/mstGbSxy33WDFktLp2xiZEJaDVbjpRUqpzKwOnM8r0m1ea9z+viQgpwu34B+YKMCRYuHVV8k82toqgDeA3+TS5BS4qp81HU70y8FOwD+8XXaUxWzCesejqUsycn42rWkNZjgiP1lFO7dVmhkA/pWsjN/LvOPUkAsqxj6HgA= ARC-Authentication-Results: i=2; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=nvidia.com; spf=fail smtp.mailfrom=nvidia.com; dkim=pass (2048-bit key) header.d=Nvidia.com header.i=@Nvidia.com header.b=TSISMFgS; arc=fail smtp.client-ip=40.93.194.40 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=nvidia.com Authentication-Results: smtp.subspace.kernel.org; spf=fail smtp.mailfrom=nvidia.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=Nvidia.com header.i=@Nvidia.com header.b="TSISMFgS" ARC-Seal: i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none; b=kv9nUqt20ywrh+twP4Y2qe97Kcy0BicRK20Cp1SCFHHwca0Q3dhNOv79GBz0Wy8j4PvaGiBkMf1WcyGh0rODMe28aeEI1TS7Rfm8wOaxFhx+KUvB1s8k3Or4mvffW5beu2osAQlp93wQxMWvukaw3tJwgItPaCaH8vbKONhbuHWgboLQTU5z1jXMcnDqAgQh7cGtuILY9+dO+PJfOXxzNVPtnkH88rIHgJEyfiPoFS2tXYgDh+zOQ087ZmNcZwsVUUxetz+U9oLl8ecR4nXfRNm9WHJve9NASeiFYz5A2jJy7sA1IY/hCwgkj+mOuKZokPss6rjIJgWrC1nKZRWNkw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; 
h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=GY0dbfjKjUXBulNyR9nzUSJvMxAlo02giuEdw1eVWCw=; b=jp6OJKl5tS9r2jsk7E/fjLxUFDmHpd9ugGl0VfMG8zGjXAE2sG5nfJPNU9xsMLW9PJ4Ol6hzpoczFPGWvh8YkhAoQKB+I0lvy80RHGWVWVsNFXr/5RQT8qj7yKKzVMla1BQ8l7yrjZHYVdpISPjaabnBXqQS7duZo11zAmo4aqz2ZyEU+KIpWpTj50TCGVf6hiQqE30sFafy5WC1TUEj4nzf8XiUdKEK7UpmkvrbLStgzVJYactZjzsch2//1wPM3EzsLIkPJAGa9wlZxAs2K8N2FLeVAYjdJ0/zZt8uuuzlivSPTziC12Hz2X0XLpLdVfheM5K+rQ840l0wP4lLNw== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=nvidia.com; dmarc=pass action=none header.from=nvidia.com; dkim=pass header.d=nvidia.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=Nvidia.com; s=selector2; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=GY0dbfjKjUXBulNyR9nzUSJvMxAlo02giuEdw1eVWCw=; b=TSISMFgSNGa6VhifJ5YwrlUC9YwlA8ELsSy4IAXB4KFCX94j1n92tvWd1f6DpEtOCT7bSpGE0A9z+zQrQBY/yp0yt+BdoynlRKpNU8gnDjl1jnEuXwzn7SVbllbUYbJd9pUUhKRpQ2ENoFVjLRxbKxLLOytFUGs3fxyl+whiqc7j2YAGXDCFU4fxyzhBSznnUjn6NiShpMDEDxArJdhHwsggtq4SMimZwOo/CshOCg/7sxqQN8PX7S9ZgLncXd+L15fTXmfPvn8SZRHYi9CqtlDKowLmkg0XZpq+WD2gPE3MqAjbRD1C1OZGW5HVqINqtgm+IZcxGTC8+Zexf0vPiQ== Authentication-Results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=nvidia.com; Received: from DM4PR12MB9072.namprd12.prod.outlook.com (2603:10b6:8:be::6) by MW6PR12MB8735.namprd12.prod.outlook.com (2603:10b6:303:245::10) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.9366.13; Fri, 28 Nov 2025 04:42:49 +0000 Received: from DM4PR12MB9072.namprd12.prod.outlook.com ([fe80::9e49:782:8e98:1ff1]) by DM4PR12MB9072.namprd12.prod.outlook.com ([fe80::9e49:782:8e98:1ff1%3]) with mapi id 15.20.9343.016; Fri, 28 Nov 2025 04:42:49 +0000 From: Jordan Niethe To: linux-mm@kvack.org Cc: balbirs@nvidia.com, 
matthew.brost@intel.com, akpm@linux-foundation.org, linux-kernel@vger.kernel.org, dri-devel@lists.freedesktop.org, david@redhat.com, ziy@nvidia.com, apopple@nvidia.com, lorenzo.stoakes@oracle.com, lyude@redhat.com, dakr@kernel.org, airlied@gmail.com, simona@ffwll.ch, rcampbell@nvidia.com, mpenttil@redhat.com, jgg@nvidia.com, willy@infradead.org Subject: [RFC PATCH 4/6] mm: Add a new swap type for migration entries with device private PFNs Date: Fri, 28 Nov 2025 15:41:44 +1100 Message-Id: <20251128044146.80050-5-jniethe@nvidia.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20251128044146.80050-1-jniethe@nvidia.com> References: <20251128044146.80050-1-jniethe@nvidia.com> Content-Transfer-Encoding: quoted-printable X-ClientProxiedBy: BY5PR16CA0024.namprd16.prod.outlook.com (2603:10b6:a03:1a0::37) To DM4PR12MB9072.namprd12.prod.outlook.com (2603:10b6:8:be::6) Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: DM4PR12MB9072:EE_|MW6PR12MB8735:EE_ X-MS-Office365-Filtering-Correlation-Id: d792c4aa-11a9-43e5-350c-08de2e38958c X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0;ARA:13230040|366016|7416014|376014|1800799024; X-Microsoft-Antispam-Message-Info: =?us-ascii?Q?pQuRGMwITgR4XOiqEfQIWaYOGBxJJpwNb21LalUpB/zzx1fuiIKU9jsMzU4O?= =?us-ascii?Q?Vx85APTARqRsIzmD+bx0VDULmhu1RqyAP7Ka9+G34C1d9HpHNdSLGgEMg+5M?= =?us-ascii?Q?+CpjvVwGVZEQ0hJ6G0AGv3MsubUeCEhq7uMpV+IkyTYQTvQxG9Q+qDLTatqX?= =?us-ascii?Q?6pGXaRjLpCDokV87AWnoE/YNMPcLNVKDyaZN3m09B0xY4bV27137/s2x9psd?= =?us-ascii?Q?l7oH9eh600A7OP8LOWp8rI2+AjJcrCZ2t03erC0WCB+ivOpVZ80Tyo/fYe62?= =?us-ascii?Q?ORQxluGOikxOCtRueWdiLq/ZEbQivwc9k0WRtdN8a95YI5FyU4te7BwcnYcr?= =?us-ascii?Q?jWHepvKict8YnKi2ht1jf/wXNtJImpFKObTNDlOzOpYVr3L7XHvpFOJpAmU5?= =?us-ascii?Q?a2HFmOWqSBwb3tCaJon6AnEhdCr0BCor/VBPUhaK4TguQ2eD1ghAgIolVoL9?= 
X-OriginatorOrg: Nvidia.com
X-MS-Exchange-CrossTenant-Network-Message-Id: d792c4aa-11a9-43e5-350c-08de2e38958c
X-MS-Exchange-CrossTenant-AuthSource: DM4PR12MB9072.namprd12.prod.outlook.com
X-MS-Exchange-CrossTenant-AuthAs: Internal
X-MS-Exchange-CrossTenant-OriginalArrivalTime: 28 Nov 2025 04:42:49.6489 (UTC)
X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted
X-MS-Exchange-CrossTenant-Id: 43083d15-7273-40c1-b7db-39efd9ccc17a
X-MS-Exchange-CrossTenant-MailboxType: HOSTED
X-MS-Exchange-CrossTenant-UserPrincipalName: kWTgNaeYhfQ3tAHdJEfRo3C7l20kaNnIvcudhowfDDZsBYSukx9L0w59kKwFydQkx9Y6UdaQTUuwkxQby200mg==
X-MS-Exchange-Transport-CrossTenantHeadersStamped: MW6PR12MB8735
Content-Type: text/plain; charset="utf-8"

A future change will remove device private pages from the physical
address space. Device private pages will then no longer have normal
PFNs and must be handled separately.

When migrating a device private page, a migration entry is created for
that page; the entry includes the PFN for that page. Once device private
PFNs exist in a different address space to regular PFNs, we need to be
able to determine which kind of PFN is in the entry so we can associate
it with the correct page.

Introduce new swap types:

  - SWP_MIGRATION_DEVICE_READ
  - SWP_MIGRATION_DEVICE_WRITE
  - SWP_MIGRATION_DEVICE_READ_EXCLUSIVE

These correspond to

  - SWP_MIGRATION_READ
  - SWP_MIGRATION_WRITE
  - SWP_MIGRATION_READ_EXCLUSIVE

except that the swap entry contains a device private PFN.

The existing helpers such as is_writable_migration_entry() will still
return true for a SWP_MIGRATION_DEVICE_WRITE entry.

Introduce new helpers such as
is_writable_device_migration_private_entry() to disambiguate between a
SWP_MIGRATION_WRITE and a SWP_MIGRATION_DEVICE_WRITE entry.

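For illustration only (this is not part of the patch), the relationship
between the generic and the device-specific predicates can be modelled
in userspace C. The enum values below are placeholders, since the kernel
derives the real type numbers from MAX_SWAPFILES and related constants;
struct swp_entry_model and mk_entry() are hypothetical stand-ins for
swp_entry_t and swp_entry(). The predicate names follow the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Placeholder type numbers, not the kernel's real values. */
enum swp_type_model {
	SWP_MIGRATION_READ,
	SWP_MIGRATION_READ_EXCLUSIVE,
	SWP_MIGRATION_WRITE,
	SWP_MIGRATION_DEVICE_READ,
	SWP_MIGRATION_DEVICE_READ_EXCLUSIVE,
	SWP_MIGRATION_DEVICE_WRITE,
};

/* Hypothetical, simplified stand-in for swp_entry_t. */
struct swp_entry_model {
	enum swp_type_model type;
	unsigned long offset;
};

/* Hypothetical stand-in for swp_entry(). */
static inline struct swp_entry_model mk_entry(enum swp_type_model type,
					      unsigned long offset)
{
	struct swp_entry_model e = { type, offset };

	return e;
}

/* True only for the three new device private migration types. */
static inline bool is_device_private_migration_entry(struct swp_entry_model e)
{
	return e.type == SWP_MIGRATION_DEVICE_READ ||
	       e.type == SWP_MIGRATION_DEVICE_READ_EXCLUSIVE ||
	       e.type == SWP_MIGRATION_DEVICE_WRITE;
}

/* True only for the device private writable type. */
static inline bool is_writable_device_migration_private_entry(struct swp_entry_model e)
{
	return e.type == SWP_MIGRATION_DEVICE_WRITE;
}

/* Generic predicate: true for both the regular and the device type. */
static inline bool is_writable_migration_entry(struct swp_entry_model e)
{
	return e.type == SWP_MIGRATION_WRITE ||
	       is_writable_device_migration_private_entry(e);
}
```

A caller that only cares about permissions can keep using the generic
predicate; a caller that must interpret the offset as a PFN has to check
the device-specific predicate first.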
Signed-off-by: Jordan Niethe
Signed-off-by: Alistair Popple
---
 include/linux/swap.h    |  8 +++-
 include/linux/swapops.h | 87 ++++++++++++++++++++++++++++++++++++++---
 mm/memory.c             |  9 ++++-
 mm/migrate.c            |  2 +-
 mm/migrate_device.c     | 31 ++++++++++-----
 mm/mprotect.c           | 21 +++++++---
 mm/page_vma_mapped.c    |  2 +-
 mm/pagewalk.c           |  3 +-
 mm/rmap.c               | 32 ++++++++++-----
 9 files changed, 161 insertions(+), 34 deletions(-)

diff --git a/include/linux/swap.h b/include/linux/swap.h
index e818fbade1e2..87f14d673979 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -74,12 +74,18 @@ static inline int current_is_kswapd(void)
  *
  * When a page is mapped by the device for exclusive access we set the CPU page
  * table entries to a special SWP_DEVICE_EXCLUSIVE entry.
+ *
+ * Because device private pages do not use regular PFNs, special migration
+ * entries are also needed.
  */
 #ifdef CONFIG_DEVICE_PRIVATE
-#define SWP_DEVICE_NUM 3
+#define SWP_DEVICE_NUM 6
 #define SWP_DEVICE_WRITE (MAX_SWAPFILES+SWP_HWPOISON_NUM+SWP_MIGRATION_NUM)
 #define SWP_DEVICE_READ (MAX_SWAPFILES+SWP_HWPOISON_NUM+SWP_MIGRATION_NUM+1)
 #define SWP_DEVICE_EXCLUSIVE (MAX_SWAPFILES+SWP_HWPOISON_NUM+SWP_MIGRATION_NUM+2)
+#define SWP_MIGRATION_DEVICE_READ (MAX_SWAPFILES+SWP_HWPOISON_NUM+SWP_MIGRATION_NUM+3)
+#define SWP_MIGRATION_DEVICE_READ_EXCLUSIVE (MAX_SWAPFILES+SWP_HWPOISON_NUM+SWP_MIGRATION_NUM+4)
+#define SWP_MIGRATION_DEVICE_WRITE (MAX_SWAPFILES+SWP_HWPOISON_NUM+SWP_MIGRATION_NUM+5)
 #else
 #define SWP_DEVICE_NUM 0
 #endif
diff --git a/include/linux/swapops.h b/include/linux/swapops.h
index 64ea151a7ae3..7aa3f00e304a 100644
--- a/include/linux/swapops.h
+++ b/include/linux/swapops.h
@@ -196,6 +196,43 @@ static inline bool is_device_exclusive_entry(swp_entry_t entry)
 	return swp_type(entry) == SWP_DEVICE_EXCLUSIVE;
 }
 
+static inline swp_entry_t make_readable_migration_device_private_entry(pgoff_t offset)
+{
+	return swp_entry(SWP_MIGRATION_DEVICE_READ, offset);
+}
+
+static inline swp_entry_t make_writable_migration_device_private_entry(pgoff_t offset)
+{
+	return swp_entry(SWP_MIGRATION_DEVICE_WRITE, offset);
+}
+
+static inline bool is_device_private_migration_entry(swp_entry_t entry)
+{
+	return unlikely(swp_type(entry) == SWP_MIGRATION_DEVICE_READ ||
+			swp_type(entry) == SWP_MIGRATION_DEVICE_READ_EXCLUSIVE ||
+			swp_type(entry) == SWP_MIGRATION_DEVICE_WRITE);
+}
+
+static inline bool is_readable_device_migration_private_entry(swp_entry_t entry)
+{
+	return unlikely(swp_type(entry) == SWP_MIGRATION_DEVICE_READ);
+}
+
+static inline bool is_writable_device_migration_private_entry(swp_entry_t entry)
+{
+	return unlikely(swp_type(entry) == SWP_MIGRATION_DEVICE_WRITE);
+}
+
+static inline swp_entry_t make_device_migration_readable_exclusive_migration_entry(pgoff_t offset)
+{
+	return swp_entry(SWP_MIGRATION_DEVICE_READ_EXCLUSIVE, offset);
+}
+
+static inline bool is_device_migration_readable_exclusive_entry(swp_entry_t entry)
+{
+	return swp_type(entry) == SWP_MIGRATION_DEVICE_READ_EXCLUSIVE;
+}
+
 #else /* CONFIG_DEVICE_PRIVATE */
 static inline swp_entry_t make_readable_device_private_entry(pgoff_t offset)
 {
@@ -217,6 +254,11 @@ static inline bool is_writable_device_private_entry(swp_entry_t entry)
 	return false;
 }
 
+static inline bool is_readable_device_migration_private_entry(swp_entry_t entry)
+{
+	return false;
+}
+
 static inline swp_entry_t make_device_exclusive_entry(pgoff_t offset)
 {
 	return swp_entry(0, 0);
@@ -227,6 +269,36 @@ static inline bool is_device_exclusive_entry(swp_entry_t entry)
 	return false;
 }
 
+static inline swp_entry_t make_readable_migration_device_private_entry(pgoff_t offset)
+{
+	return swp_entry(0, 0);
+}
+
+static inline swp_entry_t make_writable_migration_device_private_entry(pgoff_t offset)
+{
+	return swp_entry(0, 0);
+}
+
+static inline bool is_device_private_migration_entry(swp_entry_t entry)
+{
+	return false;
+}
+
+static inline bool is_writable_device_migration_private_entry(swp_entry_t entry)
+{
+	return false;
+}
+
+static inline swp_entry_t make_device_migration_readable_exclusive_migration_entry(pgoff_t offset)
+{
+	return swp_entry(0, 0);
+}
+
+static inline bool is_device_migration_readable_exclusive_entry(swp_entry_t entry)
+{
+	return false;
+}
+
 #endif /* CONFIG_DEVICE_PRIVATE */
 
 #ifdef CONFIG_MIGRATION
@@ -234,22 +306,26 @@ static inline int is_migration_entry(swp_entry_t entry)
 {
 	return unlikely(swp_type(entry) == SWP_MIGRATION_READ ||
 			swp_type(entry) == SWP_MIGRATION_READ_EXCLUSIVE ||
-			swp_type(entry) == SWP_MIGRATION_WRITE);
+			swp_type(entry) == SWP_MIGRATION_WRITE ||
+			is_device_private_migration_entry(entry));
 }
 
 static inline int is_writable_migration_entry(swp_entry_t entry)
 {
-	return unlikely(swp_type(entry) == SWP_MIGRATION_WRITE);
+	return unlikely(swp_type(entry) == SWP_MIGRATION_WRITE ||
+			is_writable_device_migration_private_entry(entry));
 }
 
 static inline int is_readable_migration_entry(swp_entry_t entry)
 {
-	return unlikely(swp_type(entry) == SWP_MIGRATION_READ);
+	return unlikely(swp_type(entry) == SWP_MIGRATION_READ ||
+			is_readable_device_migration_private_entry(entry));
 }
 
 static inline int is_readable_exclusive_migration_entry(swp_entry_t entry)
 {
-	return unlikely(swp_type(entry) == SWP_MIGRATION_READ_EXCLUSIVE);
+	return unlikely(swp_type(entry) == SWP_MIGRATION_READ_EXCLUSIVE ||
+			is_device_migration_readable_exclusive_entry(entry));
 }
 
 static inline swp_entry_t make_readable_migration_entry(pgoff_t offset)
@@ -525,7 +601,8 @@ static inline bool is_pfn_swap_entry(swp_entry_t entry)
 	BUILD_BUG_ON(SWP_TYPE_SHIFT < SWP_PFN_BITS);
 
 	return is_migration_entry(entry) || is_device_private_entry(entry) ||
-	       is_device_exclusive_entry(entry) || is_hwpoison_entry(entry);
+	       is_device_exclusive_entry(entry) || is_hwpoison_entry(entry) ||
+	       is_device_private_migration_entry(entry);
 }
 
 struct page_vma_mapped_walk;
diff --git a/mm/memory.c b/mm/memory.c
index b59ae7ce42eb..f1ed361434ff 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -962,8 +962,13 @@ copy_nonpresent_pte(struct mm_struct *dst_mm, struct mm_struct *src_mm,
 			 * to be set to read. A previously exclusive entry is
 			 * now shared.
 			 */
-			entry = make_readable_migration_entry(
-							swp_offset(entry));
+			if (is_device_private_migration_entry(entry))
+				entry = make_readable_migration_device_private_entry(
+								swp_offset(entry));
+			else
+				entry = make_readable_migration_entry(
+								swp_offset(entry));
+
 			pte = swp_entry_to_pte(entry);
 			if (pte_swp_soft_dirty(orig_pte))
 				pte = pte_swp_mksoft_dirty(pte);
diff --git a/mm/migrate.c b/mm/migrate.c
index c0e9f15be2a2..3c561d61afba 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -495,7 +495,7 @@ void migration_entry_wait(struct mm_struct *mm, pmd_t *pmd,
 		goto out;
 
 	entry = pte_to_swp_entry(pte);
-	if (!is_migration_entry(entry))
+	if (!(is_migration_entry(entry)))
 		goto out;
 
 	migration_entry_wait_on_locked(entry, ptl);
diff --git a/mm/migrate_device.c b/mm/migrate_device.c
index 82f09b24d913..458b5114bb2b 100644
--- a/mm/migrate_device.c
+++ b/mm/migrate_device.c
@@ -235,15 +235,28 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp,
 				folio_mark_dirty(folio);
 
 			/* Setup special migration page table entry */
-			if (mpfn & MIGRATE_PFN_WRITE)
-				entry = make_writable_migration_entry(
-							page_to_pfn(page));
-			else if (anon_exclusive)
-				entry = make_readable_exclusive_migration_entry(
-							page_to_pfn(page));
-			else
-				entry = make_readable_migration_entry(
-							page_to_pfn(page));
+			if (mpfn & MIGRATE_PFN_WRITE) {
+				if (is_device_private_page(page))
+					entry = make_writable_migration_device_private_entry(
+								page_to_pfn(page));
+				else
+					entry = make_writable_migration_entry(
+								page_to_pfn(page));
+			} else if (anon_exclusive) {
+				if (is_device_private_page(page))
+					entry = make_device_migration_readable_exclusive_migration_entry(
+								page_to_pfn(page));
+				else
+					entry = make_readable_exclusive_migration_entry(
+								page_to_pfn(page));
+			} else {
+				if (is_device_private_page(page))
+					entry = make_readable_migration_device_private_entry(
+								page_to_pfn(page));
+				else
+					entry = make_readable_migration_entry(
+								page_to_pfn(page));
+			}
 			if (pte_present(pte)) {
 				if (pte_young(pte))
 					entry = make_migration_entry_young(entry);
diff --git a/mm/mprotect.c b/mm/mprotect.c
index 113b48985834..7d79a0f53bf5 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -365,11 +365,22 @@ static long change_pte_range(struct mmu_gather *tlb,
 				 * A protection check is difficult so
 				 * just be safe and disable write
 				 */
-				if (folio_test_anon(folio))
-					entry = make_readable_exclusive_migration_entry(
-							     swp_offset(entry));
-				else
-					entry = make_readable_migration_entry(swp_offset(entry));
+				if (!is_writable_device_migration_private_entry(entry)) {
+					if (folio_test_anon(folio))
+						entry = make_readable_exclusive_migration_entry(
+								swp_offset(entry));
+					else
+						entry = make_readable_migration_entry(
+								swp_offset(entry));
+				} else {
+					if (folio_test_anon(folio))
+						entry = make_device_migration_readable_exclusive_migration_entry(
+								swp_offset(entry));
+					else
+						entry = make_readable_migration_device_private_entry(
+								swp_offset(entry));
+				}
+
 				newpte = swp_entry_to_pte(entry);
 				if (pte_swp_soft_dirty(oldpte))
 					newpte = pte_swp_mksoft_dirty(newpte);
diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
index 9146bd084435..e9fe747d3df3 100644
--- a/mm/page_vma_mapped.c
+++ b/mm/page_vma_mapped.c
@@ -112,7 +112,7 @@ static bool check_pte(struct page_vma_mapped_walk *pvmw, unsigned long pte_nr)
 			return false;
 		entry = pte_to_swp_entry(ptent);
 
-		if (!is_migration_entry(entry))
+		if (!(is_migration_entry(entry)))
 			return false;
 
 		pfn = swp_offset_pfn(entry);
diff --git a/mm/pagewalk.c b/mm/pagewalk.c
index 9f91cf85a5be..f5c77dda3359 100644
--- a/mm/pagewalk.c
+++ b/mm/pagewalk.c
@@ -1003,7 +1003,8 @@ struct folio *folio_walk_start(struct folio_walk *fw,
 		swp_entry_t entry = pte_to_swp_entry(pte);
 
 		if ((flags & FW_MIGRATION) &&
-		    is_migration_entry(entry)) {
+		    (is_migration_entry(entry) ||
+		     is_device_private_migration_entry(entry))) {
 			page = pfn_swap_entry_to_page(entry);
 			expose_page = false;
 			goto found;
diff --git a/mm/rmap.c b/mm/rmap.c
index e94500318f92..9642a79cbdb4 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -2535,15 +2535,29 @@ static bool try_to_migrate_one(struct folio *folio, struct vm_area_struct *vma,
 			 * pte. do_swap_page() will wait until the migration
 			 * pte is removed and then restart fault handling.
 			 */
-			if (writable)
-				entry = make_writable_migration_entry(
-							page_to_pfn(subpage));
-			else if (anon_exclusive)
-				entry = make_readable_exclusive_migration_entry(
-							page_to_pfn(subpage));
-			else
-				entry = make_readable_migration_entry(
-							page_to_pfn(subpage));
+			if (writable) {
+				if (is_device_private_page(subpage))
+					entry = make_writable_migration_device_private_entry(
+								page_to_pfn(subpage));
+				else
+					entry = make_writable_migration_entry(
+								page_to_pfn(subpage));
+			} else if (anon_exclusive) {
+				if (is_device_private_page(subpage))
+					entry = make_device_migration_readable_exclusive_migration_entry(
+								page_to_pfn(subpage));
+				else
+					entry = make_readable_exclusive_migration_entry(
+								page_to_pfn(subpage));
+			} else {
+				if (is_device_private_page(subpage))
+					entry = make_readable_migration_device_private_entry(
+								page_to_pfn(subpage));
+				else
+					entry = make_readable_migration_entry(
+								page_to_pfn(subpage));
+			}
+
 			if (likely(pte_present(pteval))) {
 				if (pte_young(pteval))
 					entry = make_migration_entry_young(entry);
-- 
2.34.1

From nobody Mon Dec 1 21:33:25 2025
Received: from CY3PR05CU001.outbound.protection.outlook.com (mail-westcentralusazon11013059.outbound.protection.outlook.com [40.93.201.59]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id
5223D2FD1D6 for ; Fri, 28 Nov 2025 04:42:56 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=fail smtp.client-ip=40.93.201.59 ARC-Seal: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1764304978; cv=fail; b=CqUrFyMop59R7+0+l0ZsQizfyo9TQ+SK6O9jkoNkPjMdBZdtoHkTJ+8BMX7NMgkA/JGaZTwSYz/e0xkLGtlpxHOwZmGUFiz/qjojBOJl4yVu+YCBrVddKAI/9mIl50FUNegrrpgzVH+UR6Pf7Vrb3eJt+JsukBTCxLfXvNwoGVY= ARC-Message-Signature: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1764304978; c=relaxed/simple; bh=KuG7HhQUXHpN/spvxI/xD7yeCEo8dmE+3587QJDhH1M=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: Content-Type:MIME-Version; b=bVRZeDdDZDNlB0BZ0Ego7PxeWafbx6nsj8CYWi216EI9RTc8pAUu41RV8xYKRl/FXoknT2yQ96DBvKAeVXCQAHmPrCdrVuEj8NoIXVJhhGYOmI3oqPo2uPnYfxlkJvpDLL2H/cRosbiaEbN9OJMBgby/iUHZVmGGZm4JQmZci0s= ARC-Authentication-Results: i=2; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=nvidia.com; spf=fail smtp.mailfrom=nvidia.com; dkim=pass (2048-bit key) header.d=Nvidia.com header.i=@Nvidia.com header.b=hoEeTKwh; arc=fail smtp.client-ip=40.93.201.59 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=nvidia.com Authentication-Results: smtp.subspace.kernel.org; spf=fail smtp.mailfrom=nvidia.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=Nvidia.com header.i=@Nvidia.com header.b="hoEeTKwh" ARC-Seal: i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none; b=liOEc3/yFAzCQqc8263oU0rxVpA9PaKz59fvimEsiUwckvI/f8eVSbYD25os6wYy0SPuARsHHtMu8ZwclUxeukfLuTEksLa6dtLk/onu5j9QJF+X6XSZiZR0iDjpi83NDrd5ZUUt0c0bySCj7Oag7Nk8F1XsYs4qTTxZRRFaMNjIzSB479AR8VgyRUbMavex7QeLUivWPJq2li9DXH1v3ucnrk405+zibAyRS2oFwaKw1RnVVlo/bx9XSDcLYdxi315hzjAH8cAK7on13wWyk+KnGodT33jOm5s1QAFDpQGCsulo6T1ePQp+pEWGhkfvAtQCarGwEguppRPFkM1IGw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; 
h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=LrU9ocWwwrvSGqzUbXguGNq3kciL0EoWAWkk9iyHF/w=; b=W3+QNcIHQaKkbrxgLHrxw3pdT2ep1MNR+8O2NI4FKyTU3crUwi0gBr76dK61rz2pqy+X6GzWrAsXOEDxmysqqRQbsv90s+anXhI4GdU05JSAnf+UPC/yLl1m2iEBwFPONZbBu1zfG6ara1HE4a6ejQxbRcOO0giC144gdKqZtcwB+bTbS2EE6hZ5ogX4zF5pTrXaSP2gIb3QySapCrqigtjPKdUWmXct925bEfurewRsws2vTSV9f1glChmyHHPVIHLklqCVj+IY5fbeSQHVasP177s/xxHy91x4LbQwQ58tM69zTPRPaI59nqOS7DY6FaqwexAebDFyQ4Ncd05cSQ== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=nvidia.com; dmarc=pass action=none header.from=nvidia.com; dkim=pass header.d=nvidia.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=Nvidia.com; s=selector2; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=LrU9ocWwwrvSGqzUbXguGNq3kciL0EoWAWkk9iyHF/w=; b=hoEeTKwhi+ZOy8fMKDCQvzaOKvu5BoLjTVAx5q4KrN4qc4sPddJ4kPRPXQVSqzol7PJcJm6HsratLjfrRCMSo1EKqmV0ARSCkwDIKWLQ6+hBXbEOOznOzFakUP0uFqtnGpuXdnOxehMxJqnKA0j9aDFW0sNgcYUxOQmo89N20nysBQCkPGY0mhNmlKvmQaHCRLMWFNG2+VjdfSuUJgTBe7pdH75C+tx1D+8u6ufDOz1DOZFz40a33b9WaSAywI24HklCLd6WOrrLe+DH0oP3UD6XpVsfxLsYWUHivZYZIvkiy5mMN/yzVf/99F8+MYoxn7lXfu3xtyVWzngfuwLL2g== Authentication-Results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=nvidia.com; Received: from DM4PR12MB9072.namprd12.prod.outlook.com (2603:10b6:8:be::6) by MW6PR12MB8735.namprd12.prod.outlook.com (2603:10b6:303:245::10) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.9366.13; Fri, 28 Nov 2025 04:42:55 +0000 Received: from DM4PR12MB9072.namprd12.prod.outlook.com ([fe80::9e49:782:8e98:1ff1]) by DM4PR12MB9072.namprd12.prod.outlook.com ([fe80::9e49:782:8e98:1ff1%3]) with mapi id 15.20.9343.016; Fri, 28 Nov 2025 04:42:55 +0000 From: Jordan Niethe To: linux-mm@kvack.org Cc: balbirs@nvidia.com, 
matthew.brost@intel.com, akpm@linux-foundation.org, linux-kernel@vger.kernel.org, dri-devel@lists.freedesktop.org, david@redhat.com, ziy@nvidia.com, apopple@nvidia.com, lorenzo.stoakes@oracle.com, lyude@redhat.com, dakr@kernel.org, airlied@gmail.com, simona@ffwll.ch, rcampbell@nvidia.com, mpenttil@redhat.com, jgg@nvidia.com, willy@infradead.org Subject: [RFC PATCH 5/6] mm/util: Add flag to track device private PFNs in page snapshots Date: Fri, 28 Nov 2025 15:41:45 +1100 Message-Id: <20251128044146.80050-6-jniethe@nvidia.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20251128044146.80050-1-jniethe@nvidia.com> References: <20251128044146.80050-1-jniethe@nvidia.com> Content-Transfer-Encoding: quoted-printable X-ClientProxiedBy: SJ0PR05CA0058.namprd05.prod.outlook.com (2603:10b6:a03:33f::33) To DM4PR12MB9072.namprd12.prod.outlook.com (2603:10b6:8:be::6) Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

A future change will remove device private pages from the physical address space. As a result, device private pages will no longer have a normal PFN and must be handled separately.

Add a new flag, PAGE_SNAPSHOT_DEVICE_PRIVATE, to track when the pfn of a page snapshot refers to a device private page.

Signed-off-by: Jordan Niethe
Signed-off-by: Alistair Popple
---
 fs/proc/page.c | 6 ++++--
 include/linux/mm.h | 7 ++++---
 mm/util.c | 3 +++
 3 files changed, 11 insertions(+), 5 deletions(-)

diff --git a/fs/proc/page.c b/fs/proc/page.c index fc64f23e05e5..c3e88a199c19 100644 --- a/fs/proc/page.c +++ b/fs/proc/page.c @@ -192,10 +192,12 @@ u64 stable_page_flags(const struct page *page) folio_test_large_rmappable(folio)) { /* Note: we indicate any THPs here, not just PMD-sized ones */ u |=3D 1 << KPF_THP; - } else if (is_huge_zero_pfn(ps.pfn)) { + } else if (!(ps.flags & PAGE_SNAPSHOT_DEVICE_PRIVATE) && + is_huge_zero_pfn(ps.pfn)) { u |=3D 1 << KPF_ZERO_PAGE; u |=3D 1 << KPF_THP; - } else if (is_zero_pfn(ps.pfn)) { + } else if (!(ps.flags & PAGE_SNAPSHOT_DEVICE_PRIVATE) + && is_zero_pfn(ps.pfn)) { u |=3D 1 << KPF_ZERO_PAGE; } =20 diff --git a/include/linux/mm.h b/include/linux/mm.h index 7c79b3369b82..6b8c299a6687 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -4317,9 +4317,10 @@ static inline bool page_pool_page_is_pp(const struct= page *page) } #endif =20 -#define PAGE_SNAPSHOT_FAITHFUL (1 << 0) -#define PAGE_SNAPSHOT_PG_BUDDY (1 << 1) -#define PAGE_SNAPSHOT_PG_IDLE (1 << 2) +#define PAGE_SNAPSHOT_FAITHFUL (1 << 0) +#define PAGE_SNAPSHOT_PG_BUDDY (1 << 1) +#define
PAGE_SNAPSHOT_PG_IDLE (1 << 2) +#define PAGE_SNAPSHOT_DEVICE_PRIVATE (1 << 3) =20 struct page_snapshot { struct folio folio_snapshot; diff --git a/mm/util.c b/mm/util.c index 8989d5767528..2472b7381b11 100644 --- a/mm/util.c +++ b/mm/util.c @@ -1215,6 +1215,9 @@ static void set_ps_flags(struct page_snapshot *ps, co= nst struct folio *folio, =20 if (folio_test_idle(folio)) ps->flags |=3D PAGE_SNAPSHOT_PG_IDLE; + + if (is_device_private_page(page)) + ps->flags |=3D PAGE_SNAPSHOT_DEVICE_PRIVATE; } =20 /** --=20 2.34.1 From nobody Mon Dec 1 21:33:25 2025 Received: from CO1PR03CU002.outbound.protection.outlook.com (mail-westus2azon11010004.outbound.protection.outlook.com [52.101.46.4]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 8F4C12FE05D for ; Fri, 28 Nov 2025 04:43:06 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=fail smtp.client-ip=52.101.46.4 ARC-Seal: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1764304991; cv=fail; b=dB6FPlhYkjO5Rb8v+UDCtSnOFCsJ5SbDTUJjJwpLzV1kx0W7VTQLPvWptC77go1gFL1Cw7fBGN306QV4jtmY210T9WwOGWDN+Rni1U69YxC9xGCtfBZmErFnlTC4MwuRGU5k+BD5HjyruEOpUVysdQUQNmqpgkSZNHHT7NvkPiQ= ARC-Message-Signature: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1764304991; c=relaxed/simple; bh=XoxeMXEg1rpOZinjeUds9Gc8xU8lnotwd+lQz7C0Fo0=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: Content-Type:MIME-Version; b=sXf/FgjfB5v2RuowqYqjYLAXXYIwBUuMV6LmcGfRtznDKMXIms6m4wSxeRa2HoZzH2VyiPXudE6jKEaYTc+FDv+ot9rifyaA3gEXyjGwRNtlNUSye5nCzmU/lheIBk4hdzQ7xzIdY817mpmVJDLxdtd1A1+osYwRyMKk9C85njc= ARC-Authentication-Results: i=2; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=nvidia.com; spf=fail smtp.mailfrom=nvidia.com; dkim=pass (2048-bit key) header.d=Nvidia.com header.i=@Nvidia.com header.b=RTs/RKu6; arc=fail smtp.client-ip=52.101.46.4 Authentication-Results: 
smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=nvidia.com Authentication-Results: smtp.subspace.kernel.org; spf=fail smtp.mailfrom=nvidia.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=Nvidia.com header.i=@Nvidia.com header.b="RTs/RKu6" ARC-Seal: i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none; b=B4oh9+qeAEadXToZwmeRCky8t44HQzAFEjw0grRKUJLFVlFpr/SrJplt9aXB5W3FL5Rq7otkFB0tbBuYHkpFHn68Qaa6+KdrC15PQxFxGwWSRSB2zCgSnqMeVhySgFVu7PlYWYi5vMPhwvei3xIBA0nx3l3yr1nud60STGcD1sEtTDiCvLXWyrnADXb1fcuVdycXsh5BHMda4WXAZik/WHLA32hnCxgEWEspYpMW+/vc4ATY5G/LkBcBHlB9d6mDfkH3pyMYh34EesfDe6H4OJ7myGkQ7AmEKJ+ekS5lJgu/mdsruzZhUG56EeYyteGwpCHf7rlRswbeQgmCLjDmWQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=E4zxis86Tbsq0XEk6uGCwOSRX1uMP5AU6BpdE/rvAjk=; b=xfRMnNAETOIWUXiOZYq9Ip4X2c58kloeqUNtLm7CT+oGWXS1Ms+IQBX0MvgfuGU1ENL6jdYZJvU3xSm9Ohzt1K2dSqAgFW5f5wT74io50g/861ulcMCRS1ygsahpOrSDt001eOZ8Jnr/gzH/Onl61fkftTyR8tbZygcXjv42cNh3GFWVBFDP1jXNo9oGChg/VJzl2+a1b+LjIWNHE444x4Q2RHK2w7aAUUL0T2IWPh+qgQzUBUbn31xRb+sdbPKAyYOzrNjSvzLy4ONaKIMkFUUFPXlEkll03tBfTmhsch+mX6YDOEW+Of/zbkmaeZDN+4HQ+oBCGSpRahTqpwhPbQ== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=nvidia.com; dmarc=pass action=none header.from=nvidia.com; dkim=pass header.d=nvidia.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=Nvidia.com; s=selector2; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=E4zxis86Tbsq0XEk6uGCwOSRX1uMP5AU6BpdE/rvAjk=; 
b=RTs/RKu6lvuwldSgnVmXUfFAwZLeHfpok9QDWxq05IOLNHWTrNi8zVUnAzKlZc2fdmdzrOjh969rCJbootk9fW1a+8HEVmqs+dKFqm20TXdihiohE9HY/3XlYFqYUEmeBw88fLdYLGonBq7kl2XQx1CZhN6sl3lXe9b2PhbMC8SzBhfzt2E/lpLp7nvBgK1IJUaIuq6aDQ+S+kBKPKki3SkegnPelyhgnnIVNCfDf4RjVPd/vFXn0jUcgBsIraBWMVZ6iXAhy4DZfskRhn4AxQcAZ/zS+LWWE2RP1X1jOHxU55PEMLV1ITty4TWeO0paJGnscDkrp9IGlj01EUIv9A== Authentication-Results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=nvidia.com; Received: from DM4PR12MB9072.namprd12.prod.outlook.com (2603:10b6:8:be::6) by MW6PR12MB8735.namprd12.prod.outlook.com (2603:10b6:303:245::10) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.9366.13; Fri, 28 Nov 2025 04:43:00 +0000 Received: from DM4PR12MB9072.namprd12.prod.outlook.com ([fe80::9e49:782:8e98:1ff1]) by DM4PR12MB9072.namprd12.prod.outlook.com ([fe80::9e49:782:8e98:1ff1%3]) with mapi id 15.20.9343.016; Fri, 28 Nov 2025 04:43:00 +0000 From: Jordan Niethe To: linux-mm@kvack.org Cc: balbirs@nvidia.com, matthew.brost@intel.com, akpm@linux-foundation.org, linux-kernel@vger.kernel.org, dri-devel@lists.freedesktop.org, david@redhat.com, ziy@nvidia.com, apopple@nvidia.com, lorenzo.stoakes@oracle.com, lyude@redhat.com, dakr@kernel.org, airlied@gmail.com, simona@ffwll.ch, rcampbell@nvidia.com, mpenttil@redhat.com, jgg@nvidia.com, willy@infradead.org Subject: [RFC PATCH 6/6] mm: Remove device private pages from the physical address space Date: Fri, 28 Nov 2025 15:41:46 +1100 Message-Id: <20251128044146.80050-7-jniethe@nvidia.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20251128044146.80050-1-jniethe@nvidia.com> References: <20251128044146.80050-1-jniethe@nvidia.com> Content-Transfer-Encoding: quoted-printable X-ClientProxiedBy: SJ0PR05CA0048.namprd05.prod.outlook.com (2603:10b6:a03:33f::23) To DM4PR12MB9072.namprd12.prod.outlook.com (2603:10b6:8:be::6) Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: 
List-Unsubscribe: MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

Currently, when creating device private struct pages, the first step is to use request_free_mem_region() to get a range of physical address space large enough to represent the device's memory. This allocated physical address range is then remapped as device private memory using memremap_pages().

Requiring an allocation of physical address space causes some problems:

1) There may be insufficient physical address space to represent the device memory.
KASLR reducing the physical address space and VM configurations with limited physical address space increase the likelihood of hitting this, especially as device memory increases. This has been observed to prevent device private memory from being initialized.

2) Attempting to add the device private pages to the linear map at addresses beyond the actual physical memory causes issues on architectures like aarch64, meaning the feature does not work there.

Instead of using the physical address space, introduce a device private address space and allocate device regions from there to represent the device private pages.

Introduce a new interface, memremap_device_private_pagemap(), that allocates a requested amount of device private address space and creates the necessary device private pages.

To support this new interface, struct dev_pagemap needs some changes:

- Add a new dev_pagemap::nr_pages field as an input parameter.
- Add a new dev_pagemap::pages array to store the device private pages.

When using memremap_device_private_pagemap(), rather than passing in dev_pagemap::ranges[dev_pagemap::nr_range] of physical address space to be remapped, dev_pagemap::nr_range will always be 1, and the device private range that is reserved is returned in dev_pagemap::range.

Forbid calling memremap_pages() with dev_pagemap::type =3D MEMORY_DEVICE_PRIVATE.

Represent this device private address space using a new device_private_pgmap_tree maple tree. This tree maps a given device private address to a struct dev_pagemap, where a specific device private page may then be looked up in that dev_pagemap::pages array.

Device private address space can be reclaimed and the associated device private pages freed using the corresponding new memunmap_device_private_pagemap() interface.

Because the device private pages now live outside the physical address space, they no longer have a normal PFN. This means that page_to_pfn(), et al. are no longer meaningful.
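
For illustration only, the lookup described above (device private offset, to owning pagemap, to entry in that pagemap's pages array) can be modelled in userspace C roughly as follows. This is a simplified sketch, not code from this series: a small linear table stands in for the maple tree, the pagemap is passed explicitly when converting a page back to an offset, and every model_* name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT the kernel's struct page or
 * struct dev_pagemap, only a userspace model of the idea. */
struct model_page {
	int data;
};

struct model_pagemap {
	unsigned long start;      /* first offset reserved for this pagemap (output) */
	unsigned long nr_pages;   /* number of pages requested (input) */
	struct model_page *pages; /* backing array of nr_pages entries */
};

#define MODEL_MAX_PGMAPS 8

/* Assumption: a linear table replaces the maple tree used by the patch. */
static struct model_pagemap *model_table[MODEL_MAX_PGMAPS];
static size_t model_count;
static unsigned long model_next_free; /* simple bump allocator for offsets */

/* Reserve nr_pages offsets in the modelled device private address space. */
static int model_register(struct model_pagemap *pgmap)
{
	if (model_count == MODEL_MAX_PGMAPS || pgmap->nr_pages == 0)
		return -1;
	pgmap->start = model_next_free;
	model_next_free += pgmap->nr_pages;
	model_table[model_count++] = pgmap;
	return 0;
}

/* Offset -> page: find the owning pagemap, then index its pages array. */
static struct model_page *model_offset_to_page(unsigned long offset)
{
	for (size_t i = 0; i < model_count; i++) {
		struct model_pagemap *pgmap = model_table[i];

		if (offset >= pgmap->start &&
		    offset - pgmap->start < pgmap->nr_pages)
			return &pgmap->pages[offset - pgmap->start];
	}
	return NULL; /* offset not backed by any registered pagemap */
}

/* Page -> offset: base of the pagemap plus the index within its array. */
static unsigned long model_page_to_offset(const struct model_pagemap *pgmap,
					  const struct model_page *page)
{
	return pgmap->start + (unsigned long)(page - pgmap->pages);
}
```

In this model, registering a 4-page pagemap followed by a 2-page pagemap reserves offsets 0..3 and 4..5 respectively, so offset 5 resolves to the second entry of the second pagemap's array, and an offset past the last reservation resolves to no page.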
Introduce helpers:

- device_private_page_to_offset()
- device_private_folio_to_offset()

to take a given device private page / folio and return its offset within the device private address space (this is essentially a PFN within the device private address space).

Update the places where we previously converted a device private page to a PFN to use these new helpers. When we encounter a device private PFN, use device_private_offset_to_page() rather than looking up its page within the pagemap.

Update lib/test_hmm.c to use the new memremap_device_private_pagemap() interface.

Signed-off-by: Jordan Niethe
Signed-off-by: Alistair Popple
---
Note: The existing users of memremap_pages() will be updated in the next revision.
---
 Documentation/mm/hmm.rst | 9 +-
 include/linux/hmm.h | 3 +
 include/linux/memremap.h | 25 +++++-
 include/linux/migrate.h | 4 +
 include/linux/mm.h | 2 +
 include/linux/rmap.h | 7 ++
 include/linux/swapops.h | 15 +++-
 lib/test_hmm.c | 65 ++++++++-------
 mm/debug.c | 9 +-
 mm/memremap.c | 174 +++++++++++++++++++++++++++++----------
 mm/migrate.c | 4 +-
 mm/migrate_device.c | 14 ++--
 mm/mm_init.c | 8 +-
 mm/page_vma_mapped.c | 10 +++
 mm/pagewalk.c | 3 +-
 mm/rmap.c | 38 ++++++---
 mm/util.c | 5 +-
 17 files changed, 282 insertions(+), 113 deletions(-)

diff --git a/Documentation/mm/hmm.rst b/Documentation/mm/hmm.rst index 7d61b7a8b65b..49a10d3dfb2d 100644 --- a/Documentation/mm/hmm.rst +++ b/Documentation/mm/hmm.rst @@ -276,17 +276,12 @@ These can be allocated and freed with:: struct resource *res; struct dev_pagemap pagemap; =20 - res =3D request_free_mem_region(&iomem_resource, /* number of bytes */, - "name of driver resource"); pagemap.type =3D MEMORY_DEVICE_PRIVATE; - pagemap.range.start =3D res->start; - pagemap.range.end =3D res->end; - pagemap.nr_range =3D 1; + pagemap.nr_pages =3D /* number of pages */; pagemap.ops =3D &device_devmem_ops; - memremap_pages(&pagemap, numa_node_id()); + memremap_device_private_pagemap(&pagemap, numa_node_id()); =20
memunmap_pages(&pagemap); - release_mem_region(pagemap.range.start, range_len(&pagemap.range)); =20 There are also devm_request_free_mem_region(), devm_memremap_pages(), devm_memunmap_pages(), and devm_release_mem_region() when the resources can diff --git a/include/linux/hmm.h b/include/linux/hmm.h index df571fa75a44..f6e65a6d80ea 100644 --- a/include/linux/hmm.h +++ b/include/linux/hmm.h @@ -68,6 +68,9 @@ enum hmm_pfn_flags { */ static inline struct page *hmm_pfn_to_page(unsigned long hmm_pfn) { + if (hmm_pfn & HMM_PFN_DEVICE_PRIVATE) + return device_private_offset_to_page(hmm_pfn & ~HMM_PFN_FLAGS); + return pfn_to_page(hmm_pfn & ~HMM_PFN_FLAGS); } =20 diff --git a/include/linux/memremap.h b/include/linux/memremap.h index e5951ba12a28..737574209cea 100644 --- a/include/linux/memremap.h +++ b/include/linux/memremap.h @@ -38,6 +38,7 @@ struct vmem_altmap { * backing the device memory. Doing so simplifies the implementation, but = it is * important to remember that there are certain points at which the struct= page * must be treated as an opaque object, rather than a "normal" struct page. + * Unlike "normal" struct pages, the page_to_pfn() is invalid. * * A more complete discussion of unaddressable memory may be found in * include/linux/hmm.h and Documentation/mm/hmm.rst. @@ -120,9 +121,13 @@ struct dev_pagemap_ops { * @owner: an opaque pointer identifying the entity that manages this * instance. Used by various helpers to make sure that no * foreign ZONE_DEVICE memory is accessed. - * @nr_range: number of ranges to be mapped - * @range: range to be mapped when nr_range =3D=3D 1 + * @nr_range: number of ranges to be mapped. Always =3D=3D 1 for + * MEMORY_DEVICE_PRIVATE. + * @range: range to be mapped when nr_range =3D=3D 1. Used as an output pa= ram for + * MEMORY_DEVICE_PRIVATE. * @ranges: array of ranges to be mapped when nr_range > 1 + * @nr_pages: number of pages requested to be mapped for MEMORY_DEVICE_PRI= VATE. 
+ * @pages: array of nr_pages initialized for MEMORY_DEVICE_PRIVATE. */ struct dev_pagemap { struct vmem_altmap altmap; @@ -138,6 +143,8 @@ struct dev_pagemap { struct range range; DECLARE_FLEX_ARRAY(struct range, ranges); }; + unsigned long nr_pages; + struct page *pages; }; =20 static inline bool pgmap_has_memory_failure(struct dev_pagemap *pgmap) @@ -164,6 +171,15 @@ static inline bool folio_is_device_private(const struc= t folio *folio) folio->pgmap->type =3D=3D MEMORY_DEVICE_PRIVATE; } =20 +struct page *device_private_offset_to_page(unsigned long offset); +struct page *device_private_entry_to_page(swp_entry_t entry); +pgoff_t device_private_page_to_offset(const struct page *page); + +static inline pgoff_t device_private_folio_to_offset(const struct folio *f= olio) +{ + return device_private_page_to_offset((const struct page *)&folio->page); +} + static inline bool is_device_private_page(const struct page *page) { return IS_ENABLED(CONFIG_DEVICE_PRIVATE) && @@ -206,7 +222,12 @@ static inline bool is_fsdax_page(const struct page *pa= ge) } =20 #ifdef CONFIG_ZONE_DEVICE +void __init_zone_device_page(struct page *page, unsigned long pfn, + unsigned long zone_idx, int nid, + struct dev_pagemap *pgmap); void zone_device_page_init(struct page *page); +unsigned long memremap_device_private_pagemap(struct dev_pagemap *pgmap); +void memunmap_device_private_pagemap(struct dev_pagemap *pgmap); void *memremap_pages(struct dev_pagemap *pgmap, int nid); void memunmap_pages(struct dev_pagemap *pgmap); void *devm_memremap_pages(struct device *dev, struct dev_pagemap *pgmap); diff --git a/include/linux/migrate.h b/include/linux/migrate.h index d8f520dca342..d50684dd4ee6 100644 --- a/include/linux/migrate.h +++ b/include/linux/migrate.h @@ -132,6 +132,10 @@ static inline struct page *migrate_pfn_to_page(unsigne= d long mpfn) { if (!(mpfn & MIGRATE_PFN_VALID)) return NULL; + + if (mpfn & MIGRATE_PFN_DEVICE) + return device_private_offset_to_page(mpfn >> MIGRATE_PFN_SHIFT); + 
return pfn_to_page(mpfn >> MIGRATE_PFN_SHIFT); } =20 diff --git a/include/linux/mm.h b/include/linux/mm.h index 6b8c299a6687..94d83897ea18 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -1851,6 +1851,8 @@ static inline unsigned long memdesc_section(memdesc_f= lags_t mdf) */ static inline unsigned long folio_pfn(const struct folio *folio) { + VM_BUG_ON(folio_is_device_private(folio)); + return page_to_pfn(&folio->page); } =20 diff --git a/include/linux/rmap.h b/include/linux/rmap.h index 79e5c733d9c8..c1561a92864f 100644 --- a/include/linux/rmap.h +++ b/include/linux/rmap.h @@ -950,11 +950,18 @@ static inline unsigned long page_vma_walk_pfn(unsigne= d long pfn) =20 static inline unsigned long folio_page_vma_walk_pfn(const struct folio *fo= lio) { + if (folio_is_device_private(folio)) + return page_vma_walk_pfn(device_private_folio_to_offset(folio)) | + PVMW_PFN_DEVICE_PRIVATE; + return page_vma_walk_pfn(folio_pfn(folio)); } =20 static inline struct page *page_vma_walk_pfn_to_page(unsigned long pvmw_pf= n) { + if (pvmw_pfn & PVMW_PFN_DEVICE_PRIVATE) + return device_private_offset_to_page(pvmw_pfn >> PVMW_PFN_SHIFT); + return pfn_to_page(pvmw_pfn >> PVMW_PFN_SHIFT); } =20 diff --git a/include/linux/swapops.h b/include/linux/swapops.h index 7aa3f00e304a..03271ad98f73 100644 --- a/include/linux/swapops.h +++ b/include/linux/swapops.h @@ -565,7 +565,13 @@ static inline int pte_none_mostly(pte_t pte) =20 static inline struct page *pfn_swap_entry_to_page(swp_entry_t entry) { - struct page *p =3D pfn_to_page(swp_offset_pfn(entry)); + struct page *p; + + if (is_device_private_entry(entry) || + is_device_private_migration_entry(entry)) + p =3D device_private_entry_to_page(entry); + else + p =3D pfn_to_page(swp_offset_pfn(entry)); =20 /* * Any use of migration entries may only occur while the @@ -578,8 +584,13 @@ static inline struct page *pfn_swap_entry_to_page(swp_= entry_t entry) =20 static inline struct folio *pfn_swap_entry_folio(swp_entry_t entry) { - struct 
folio *folio =3D pfn_folio(swp_offset_pfn(entry)); + struct folio *folio; =20 + if (is_device_private_entry(entry) || + is_device_private_migration_entry(entry)) + folio =3D page_folio(device_private_entry_to_page(entry)); + else + folio =3D pfn_folio(swp_offset_pfn(entry)); /* * Any use of migration entries may only occur while the * corresponding folio is locked diff --git a/lib/test_hmm.c b/lib/test_hmm.c index 0035e1b7beec..59dae2ec628a 100644 --- a/lib/test_hmm.c +++ b/lib/test_hmm.c @@ -495,7 +495,7 @@ static int dmirror_allocate_chunk(struct dmirror_device= *mdevice, struct page **ppage) { struct dmirror_chunk *devmem; - struct resource *res =3D NULL; + bool device_private =3D false; unsigned long pfn; unsigned long pfn_first; unsigned long pfn_last; @@ -508,13 +508,9 @@ static int dmirror_allocate_chunk(struct dmirror_devic= e *mdevice, =20 switch (mdevice->zone_device_type) { case HMM_DMIRROR_MEMORY_DEVICE_PRIVATE: - res =3D request_free_mem_region(&iomem_resource, DEVMEM_CHUNK_SIZE, - "hmm_dmirror"); - if (IS_ERR_OR_NULL(res)) - goto err_devmem; - devmem->pagemap.range.start =3D res->start; - devmem->pagemap.range.end =3D res->end; + device_private =3D true; devmem->pagemap.type =3D MEMORY_DEVICE_PRIVATE; + devmem->pagemap.nr_pages =3D DEVMEM_CHUNK_SIZE / PAGE_SIZE; break; case HMM_DMIRROR_MEMORY_DEVICE_COHERENT: devmem->pagemap.range.start =3D (MINOR(mdevice->cdevice.dev) - 2) ? 
@@ -523,13 +519,13 @@ static int dmirror_allocate_chunk(struct dmirror_device *mdevice,
 		devmem->pagemap.range.end = devmem->pagemap.range.start +
 					    DEVMEM_CHUNK_SIZE - 1;
 		devmem->pagemap.type = MEMORY_DEVICE_COHERENT;
+		devmem->pagemap.nr_range = 1;
 		break;
 	default:
 		ret = -EINVAL;
 		goto err_devmem;
 	}
 
-	devmem->pagemap.nr_range = 1;
 	devmem->pagemap.ops = &dmirror_devmem_ops;
 	devmem->pagemap.owner = mdevice;
 
@@ -549,13 +545,20 @@ static int dmirror_allocate_chunk(struct dmirror_device *mdevice,
 		mdevice->devmem_capacity = new_capacity;
 		mdevice->devmem_chunks = new_chunks;
 	}
-	ptr = memremap_pages(&devmem->pagemap, numa_node_id());
-	if (IS_ERR_OR_NULL(ptr)) {
-		if (ptr)
-			ret = PTR_ERR(ptr);
-		else
-			ret = -EFAULT;
-		goto err_release;
+
+	if (device_private) {
+		ret = memremap_device_private_pagemap(&devmem->pagemap);
+		if (ret)
+			goto err_release;
+	} else {
+		ptr = memremap_pages(&devmem->pagemap, numa_node_id());
+		if (IS_ERR_OR_NULL(ptr)) {
+			if (ptr)
+				ret = PTR_ERR(ptr);
+			else
+				ret = -EFAULT;
+			goto err_release;
+		}
 	}
 
 	devmem->mdevice = mdevice;
@@ -565,15 +568,21 @@ static int dmirror_allocate_chunk(struct dmirror_device *mdevice,
 
 	mutex_unlock(&mdevice->devmem_lock);
 
-	pr_info("added new %u MB chunk (total %u chunks, %u MB) PFNs [0x%lx 0x%lx)\n",
+	pr_info("added new %u MB chunk (total %u chunks, %u MB) %sPFNs [0x%lx 0x%lx)\n",
 		DEVMEM_CHUNK_SIZE / (1024 * 1024),
 		mdevice->devmem_count,
 		mdevice->devmem_count * (DEVMEM_CHUNK_SIZE / (1024 * 1024)),
+		device_private ? "device " : "",
 		pfn_first, pfn_last);
 
 	spin_lock(&mdevice->lock);
 	for (pfn = pfn_first; pfn < pfn_last; pfn++) {
-		struct page *page = pfn_to_page(pfn);
+		struct page *page;
+
+		if (device_private)
+			page = device_private_offset_to_page(pfn);
+		else
+			page = pfn_to_page(pfn);
 
 		page->zone_device_data = mdevice->free_pages;
 		mdevice->free_pages = page;
@@ -589,9 +598,6 @@ static int dmirror_allocate_chunk(struct dmirror_device *mdevice,
 
 err_release:
 	mutex_unlock(&mdevice->devmem_lock);
-	if (res && devmem->pagemap.type == MEMORY_DEVICE_PRIVATE)
-		release_mem_region(devmem->pagemap.range.start,
-				   range_len(&devmem->pagemap.range));
 err_devmem:
 	kfree(devmem);
 
@@ -660,8 +666,8 @@ static void dmirror_migrate_alloc_and_copy(struct migrate_vma *args,
 		 */
 		spage = migrate_pfn_to_page(*src);
 		if (WARN(spage && is_zone_device_page(spage),
-		     "page already in device spage pfn: 0x%lx\n",
-		     page_to_pfn(spage)))
+		     "page already in device spage dev pfn: 0x%lx\n",
+		     device_private_page_to_offset(spage)))
 			continue;
 
 		dpage = dmirror_devmem_alloc_page(mdevice);
@@ -683,8 +689,9 @@ static void dmirror_migrate_alloc_and_copy(struct migrate_vma *args,
 		rpage->zone_device_data = dmirror;
 
 		pr_debug("migrating from sys to dev pfn src: 0x%lx pfn dst: 0x%lx\n",
-			 page_to_pfn(spage), page_to_pfn(dpage));
-		*dst = migrate_pfn(page_to_pfn(dpage)) |
+			 page_to_pfn(spage),
+			 device_private_page_to_offset(dpage));
+		*dst = migrate_pfn(device_private_page_to_offset(dpage)) |
 		       MIGRATE_PFN_DEVICE;
 		if ((*src & MIGRATE_PFN_WRITE) ||
 		    (!spage && args->vma->vm_flags & VM_WRITE))
@@ -846,8 +853,8 @@ static vm_fault_t dmirror_devmem_fault_alloc_and_copy(struct migrate_vma *args,
 		dpage = alloc_page_vma(GFP_HIGHUSER_MOVABLE, args->vma, addr);
 		if (!dpage)
 			continue;
-		pr_debug("migrating from dev to sys pfn src: 0x%lx pfn dst: 0x%lx\n",
-			 page_to_pfn(spage), page_to_pfn(dpage));
+		pr_debug("migrating from dev to sys dev pfn src: 0x%lx pfn dst: 0x%lx\n",
+			 device_private_page_to_offset(spage), page_to_pfn(dpage));
 
 		lock_page(dpage);
 		xa_erase(&dmirror->pt, addr >> PAGE_SHIFT);
@@ -1257,10 +1264,10 @@ static void dmirror_device_remove_chunks(struct dmirror_device *mdevice)
 		spin_unlock(&mdevice->lock);
 
 		dmirror_device_evict_chunk(devmem);
-		memunmap_pages(&devmem->pagemap);
 		if (devmem->pagemap.type == MEMORY_DEVICE_PRIVATE)
-			release_mem_region(devmem->pagemap.range.start,
-					   range_len(&devmem->pagemap.range));
+			memunmap_device_private_pagemap(&devmem->pagemap);
+		else
+			memunmap_pages(&devmem->pagemap);
 		kfree(devmem);
 	}
 	mdevice->devmem_count = 0;
diff --git a/mm/debug.c b/mm/debug.c
index 64ddb0c4b4be..81326d96a678 100644
--- a/mm/debug.c
+++ b/mm/debug.c
@@ -77,9 +77,11 @@ static void __dump_folio(struct folio *folio, struct page *page,
 	if (page_mapcount_is_type(mapcount))
 		mapcount = 0;
 
-	pr_warn("page: refcount:%d mapcount:%d mapping:%p index:%#lx pfn:%#lx\n",
+	pr_warn("page: refcount:%d mapcount:%d mapping:%p index:%#lx %spfn:%#lx\n",
 			folio_ref_count(folio), mapcount, mapping,
-			folio->index + idx, pfn);
+			folio->index + idx,
+			folio_is_device_private(folio) ? "device " : "",
+			pfn);
 	if (folio_test_large(folio)) {
 		int pincount = 0;
 
@@ -113,7 +115,8 @@ static void __dump_folio(struct folio *folio, struct page *page,
 	 * inaccuracy here due to racing.
 	 */
 	pr_warn("%sflags: %pGp%s\n", type, &folio->flags,
-		is_migrate_cma_folio(folio, pfn) ? " CMA" : "");
+		(!folio_is_device_private(folio) &&
+		 is_migrate_cma_folio(folio, pfn)) ? " CMA" : "");
 	if (page_has_type(&folio->page))
 		pr_warn("page_type: %x(%s)\n", folio->page.page_type >> 24,
 				page_type_name(folio->page.page_type));
diff --git a/mm/memremap.c b/mm/memremap.c
index 46cb1b0b6f72..eb8dec1e550e 100644
--- a/mm/memremap.c
+++ b/mm/memremap.c
@@ -12,9 +12,12 @@
 #include
 #include
 #include
+#include
 #include "internal.h"
 
 static DEFINE_XARRAY(pgmap_array);
+static struct maple_tree device_private_pgmap_tree =
+	MTREE_INIT(device_private_pgmap_tree, MT_FLAGS_ALLOC_RANGE);
 
 /*
  * The memremap() and memremap_pages() interfaces are alternately used
@@ -113,9 +116,10 @@ void memunmap_pages(struct dev_pagemap *pgmap)
 {
 	int i;
 
+	WARN_ONCE(pgmap->type == MEMORY_DEVICE_PRIVATE, "Type should not be MEMORY_DEVICE_PRIVATE\n");
+
 	percpu_ref_kill(&pgmap->ref);
-	if (pgmap->type != MEMORY_DEVICE_PRIVATE &&
-	    pgmap->type != MEMORY_DEVICE_COHERENT)
+	if (pgmap->type != MEMORY_DEVICE_COHERENT)
 		for (i = 0; i < pgmap->nr_range; i++)
 			percpu_ref_put_many(&pgmap->ref, pfn_len(pgmap, i));
 
@@ -144,7 +148,6 @@ static void dev_pagemap_percpu_release(struct percpu_ref *ref)
 static int pagemap_range(struct dev_pagemap *pgmap, struct mhp_params *params,
 		int range_id, int nid)
 {
-	const bool is_private = pgmap->type == MEMORY_DEVICE_PRIVATE;
 	struct range *range = &pgmap->ranges[range_id];
 	struct dev_pagemap *conflict_pgmap;
 	int error, is_ram;
@@ -190,7 +193,7 @@ static int pagemap_range(struct dev_pagemap *pgmap, struct mhp_params *params,
 	if (error)
 		goto err_pfn_remap;
 
-	if (!mhp_range_allowed(range->start, range_len(range), !is_private)) {
+	if (!mhp_range_allowed(range->start, range_len(range), true)) {
 		error = -EINVAL;
 		goto err_kasan;
 	}
@@ -198,30 +201,19 @@ static int pagemap_range(struct dev_pagemap *pgmap, struct mhp_params *params,
 	mem_hotplug_begin();
 
 	/*
-	 * For device private memory we call add_pages() as we only need to
-	 * allocate and initialize struct page for the device memory. More-
-	 * over the device memory is un-accessible thus we do not want to
-	 * create a linear mapping for the memory like arch_add_memory()
-	 * would do.
-	 *
-	 * For all other device memory types, which are accessible by
-	 * the CPU, we do want the linear mapping and thus use
+	 * All device memory types except device private memory are accessible
+	 * by the CPU, so we want the linear mapping and thus use
 	 * arch_add_memory().
 	 */
-	if (is_private) {
-		error = add_pages(nid, PHYS_PFN(range->start),
-				PHYS_PFN(range_len(range)), params);
-	} else {
-		error = kasan_add_zero_shadow(__va(range->start), range_len(range));
-		if (error) {
-			mem_hotplug_done();
-			goto err_kasan;
-		}
-
-		error = arch_add_memory(nid, range->start, range_len(range),
-					params);
+	error = kasan_add_zero_shadow(__va(range->start), range_len(range));
+	if (error) {
+		mem_hotplug_done();
+		goto err_kasan;
 	}
 
+	error = arch_add_memory(nid, range->start, range_len(range),
+				params);
+
 	if (!error) {
 		struct zone *zone;
 
@@ -248,8 +240,7 @@ static int pagemap_range(struct dev_pagemap *pgmap, struct mhp_params *params,
 	return 0;
 
 err_add_memory:
-	if (!is_private)
-		kasan_remove_zero_shadow(__va(range->start), range_len(range));
+	kasan_remove_zero_shadow(__va(range->start), range_len(range));
 err_kasan:
 	pfnmap_untrack(PHYS_PFN(range->start), range_len(range));
 err_pfn_remap:
@@ -281,22 +272,8 @@ void *memremap_pages(struct dev_pagemap *pgmap, int nid)
 
 	switch (pgmap->type) {
 	case MEMORY_DEVICE_PRIVATE:
-		if (!IS_ENABLED(CONFIG_DEVICE_PRIVATE)) {
-			WARN(1, "Device private memory not supported\n");
-			return ERR_PTR(-EINVAL);
-		}
-		if (!pgmap->ops || !pgmap->ops->migrate_to_ram) {
-			WARN(1, "Missing migrate_to_ram method\n");
-			return ERR_PTR(-EINVAL);
-		}
-		if (!pgmap->ops->page_free) {
-			WARN(1, "Missing page_free method\n");
-			return ERR_PTR(-EINVAL);
-		}
-		if (!pgmap->owner) {
-			WARN(1, "Missing owner\n");
-			return ERR_PTR(-EINVAL);
-		}
+		WARN(1, "Use memremap_device_private_pagemap()\n");
+		return ERR_PTR(-EINVAL);
 		break;
 	case MEMORY_DEVICE_COHERENT:
 		if (!pgmap->ops->page_free) {
@@ -491,3 +468,116 @@ void zone_device_page_init(struct page *page)
 	lock_page(page);
 }
 EXPORT_SYMBOL_GPL(zone_device_page_init);
+
+unsigned long memremap_device_private_pagemap(struct dev_pagemap *pgmap)
+{
+	unsigned long dpfn, dpfn_first, dpfn_last = 0;
+	unsigned long start;
+	int rc;
+
+	if (pgmap->type != MEMORY_DEVICE_PRIVATE) {
+		WARN(1, "Not device private memory\n");
+		return -EINVAL;
+	}
+	if (!IS_ENABLED(CONFIG_DEVICE_PRIVATE)) {
+		WARN(1, "Device private memory not supported\n");
+		return -EINVAL;
+	}
+	if (!pgmap->ops || !pgmap->ops->migrate_to_ram) {
+		WARN(1, "Missing migrate_to_ram method\n");
+		return -EINVAL;
+	}
+	if (!pgmap->ops->page_free) {
+		WARN(1, "Missing page_free method\n");
+		return -EINVAL;
+	}
+	if (!pgmap->owner) {
+		WARN(1, "Missing owner\n");
+		return -EINVAL;
+	}
+
+	pgmap->pages = kzalloc(sizeof(struct page) * pgmap->nr_pages,
+			       GFP_KERNEL);
+	if (!pgmap->pages)
+		return -ENOMEM;
+
+	rc = mtree_alloc_range(&device_private_pgmap_tree, &start, pgmap,
+			       pgmap->nr_pages * PAGE_SIZE, 0,
+			       1ull << MAX_PHYSMEM_BITS, GFP_KERNEL);
+	if (rc < 0)
+		goto err_mtree_alloc;
+
+	pgmap->range.start = start;
+	pgmap->range.end = pgmap->range.start + (pgmap->nr_pages * PAGE_SIZE) - 1;
+	pgmap->nr_range = 1;
+
+	init_completion(&pgmap->done);
+	rc = percpu_ref_init(&pgmap->ref, dev_pagemap_percpu_release, 0,
+			     GFP_KERNEL);
+	if (rc < 0)
+		goto err_ref_init;
+
+	dpfn_first = pgmap->range.start >> PAGE_SHIFT;
+	dpfn_last = dpfn_first + (range_len(&pgmap->range) >> PAGE_SHIFT);
+	for (dpfn = dpfn_first; dpfn < dpfn_last; dpfn++) {
+		struct page *page = device_private_offset_to_page(dpfn);
+
+		__init_zone_device_page(page, dpfn, ZONE_DEVICE, numa_node_id(), pgmap);
+		page_folio(page)->pgmap = (void *) pgmap;
+	}
+
+	return 0;
+
+err_ref_init:
+	mtree_erase(&device_private_pgmap_tree, pgmap->range.start);
+err_mtree_alloc:
+	kfree(pgmap->pages);
+	return rc;
+}
+EXPORT_SYMBOL_GPL(memremap_device_private_pagemap);
+
+void memunmap_device_private_pagemap(struct dev_pagemap *pgmap)
+{
+	percpu_ref_kill(&pgmap->ref);
+	wait_for_completion(&pgmap->done);
+	percpu_ref_exit(&pgmap->ref);
+	kfree(pgmap->pages);
+	mtree_erase(&device_private_pgmap_tree, pgmap->range.start);
+}
+EXPORT_SYMBOL_GPL(memunmap_device_private_pagemap);
+
+struct page *device_private_offset_to_page(unsigned long offset)
+{
+	struct dev_pagemap *pgmap;
+
+	pgmap = mtree_load(&device_private_pgmap_tree, offset << PAGE_SHIFT);
+	if (WARN_ON_ONCE(!pgmap))
+		return NULL;
+
+	return &pgmap->pages[offset - (pgmap->range.start >> PAGE_SHIFT)];
+}
+EXPORT_SYMBOL_GPL(device_private_offset_to_page);
+
+struct page *device_private_entry_to_page(swp_entry_t entry)
+{
+	unsigned long offset;
+
+	if (!((is_device_private_entry(entry) ||
+	    (is_device_private_migration_entry(entry))))) {
+		return NULL;
+	}
+
+	offset = swp_offset_pfn(entry);
+
+	return device_private_offset_to_page(offset);
+}
+
+pgoff_t device_private_page_to_offset(const struct page *page)
+{
+	struct dev_pagemap *pgmap = (struct dev_pagemap *) page_pgmap(page);
+
+	VM_BUG_ON_PAGE(!is_device_private_page(page), page);
+
+	return (pgmap->range.start >> PAGE_SHIFT) + ((page - pgmap->pages));
+}
+EXPORT_SYMBOL_GPL(device_private_page_to_offset);
diff --git a/mm/migrate.c b/mm/migrate.c
index 3c561d61afba..76e08fedbf2b 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -399,10 +399,10 @@ static bool remove_migration_pte(struct folio *folio,
 		if (unlikely(is_device_private_page(new))) {
 			if (pte_write(pte))
 				entry = make_writable_device_private_entry(
-							page_to_pfn(new));
+							device_private_page_to_offset(new));
 			else
 				entry = make_readable_device_private_entry(
-							page_to_pfn(new));
+							device_private_page_to_offset(new));
 			pte = swp_entry_to_pte(entry);
 			if (pte_swp_soft_dirty(old_pte))
 				pte = pte_swp_mksoft_dirty(pte);
diff --git a/mm/migrate_device.c b/mm/migrate_device.c
index 458b5114bb2b..4579f8e9b759 100644
--- a/mm/migrate_device.c
+++ b/mm/migrate_device.c
@@ -147,7 +147,7 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp,
 			    pgmap->owner != migrate->pgmap_owner)
 				goto next;
 
-			mpfn = migrate_pfn(page_to_pfn(page)) |
+			mpfn = migrate_pfn(device_private_page_to_offset(page)) |
 					MIGRATE_PFN_MIGRATE |
 					MIGRATE_PFN_DEVICE;
 			if (is_writable_device_private_entry(entry))
@@ -238,21 +238,21 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp,
 			if (mpfn & MIGRATE_PFN_WRITE) {
 				if (is_device_private_page(page))
 					entry = make_writable_migration_device_private_entry(
-							page_to_pfn(page));
+							device_private_page_to_offset(page));
 				else
 					entry = make_writable_migration_entry(
 							page_to_pfn(page));
 			} else if (anon_exclusive) {
 				if (is_device_private_page(page))
 					entry = make_device_migration_readable_exclusive_migration_entry(
-							page_to_pfn(page));
+							device_private_page_to_offset(page));
 				else
 					entry = make_readable_exclusive_migration_entry(
 							page_to_pfn(page));
 			} else {
 				if (is_device_private_page(page))
 					entry = make_readable_migration_device_private_entry(
-							page_to_pfn(page));
+							device_private_page_to_offset(page));
 				else
 					entry = make_readable_migration_entry(
 							page_to_pfn(page));
@@ -650,10 +650,10 @@ static void migrate_vma_insert_page(struct migrate_vma *migrate,
 
 			if (vma->vm_flags & VM_WRITE)
 				swp_entry = make_writable_device_private_entry(
-							page_to_pfn(page));
+							device_private_page_to_offset(page));
 			else
 				swp_entry = make_readable_device_private_entry(
-							page_to_pfn(page));
+							device_private_page_to_offset(page));
 			entry = swp_entry_to_pte(swp_entry);
 		} else {
 			if (folio_is_zone_device(folio) &&
@@ -923,7 +923,7 @@ static unsigned long migrate_device_pfn_lock(unsigned long pfn)
 {
 	struct folio *folio;
 
-	folio = folio_get_nontail_page(pfn_to_page(pfn));
+	folio = folio_get_nontail_page(device_private_offset_to_page(pfn));
 	if (!folio)
 		return 0;
 
diff --git a/mm/mm_init.c b/mm/mm_init.c
index 7712d887b696..772025d833f4 100644
--- a/mm/mm_init.c
+++ b/mm/mm_init.c
@@ -1004,9 +1004,9 @@ static void __init memmap_init(void)
 }
 
 #ifdef CONFIG_ZONE_DEVICE
-static void __ref __init_zone_device_page(struct page *page, unsigned long pfn,
-					  unsigned long zone_idx, int nid,
-					  struct dev_pagemap *pgmap)
+void __ref __init_zone_device_page(struct page *page, unsigned long pfn,
+				   unsigned long zone_idx, int nid,
+				   struct dev_pagemap *pgmap)
 {
 
 	__init_single_page(page, pfn, zone_idx, nid);
@@ -1038,7 +1038,7 @@ static void __ref __init_zone_device_page(struct page *page, unsigned long pfn,
 	 * Please note that MEMINIT_HOTPLUG path doesn't clear memmap
 	 * because this is done early in section_activate()
 	 */
-	if (pageblock_aligned(pfn)) {
+	if (pgmap->type != MEMORY_DEVICE_PRIVATE && pageblock_aligned(pfn)) {
 		init_pageblock_migratetype(page, MIGRATE_MOVABLE, false);
 		cond_resched();
 	}
diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
index e9fe747d3df3..9911bbe15699 100644
--- a/mm/page_vma_mapped.c
+++ b/mm/page_vma_mapped.c
@@ -104,6 +104,7 @@ static bool map_pte(struct page_vma_mapped_walk *pvmw, pmd_t *pmdvalp,
 static bool check_pte(struct page_vma_mapped_walk *pvmw, unsigned long pte_nr)
 {
 	unsigned long pfn;
+	bool device_private = false;
 	pte_t ptent = ptep_get(pvmw->pte);
 
 	if (pvmw->flags & PVMW_MIGRATION) {
@@ -115,6 +116,9 @@ static bool check_pte(struct page_vma_mapped_walk *pvmw, unsigned long pte_nr)
 		if (!(is_migration_entry(entry)))
 			return false;
 
+		if (is_device_private_migration_entry(entry))
+			device_private = true;
+
 		pfn = swp_offset_pfn(entry);
 	} else if (is_swap_pte(ptent)) {
 		swp_entry_t entry;
@@ -125,6 +129,9 @@ static bool check_pte(struct page_vma_mapped_walk *pvmw, unsigned long pte_nr)
 		    !is_device_exclusive_entry(entry))
 			return false;
 
+		if (is_device_private_entry(entry))
+			device_private = true;
+
 		pfn = swp_offset_pfn(entry);
 	} else {
 		if (!pte_present(ptent))
@@ -133,6 +140,9 @@ static bool check_pte(struct page_vma_mapped_walk *pvmw, unsigned long pte_nr)
 		pfn = pte_pfn(ptent);
 	}
 
+	if ((device_private) ^ !!(pvmw->pfn & PVMW_PFN_DEVICE_PRIVATE))
+		return false;
+
 	if ((pfn + pte_nr - 1) < (pvmw->pfn >> PVMW_PFN_SHIFT))
 		return false;
 	if (pfn > ((pvmw->pfn >> PVMW_PFN_SHIFT) + pvmw->nr_pages - 1))
diff --git a/mm/pagewalk.c b/mm/pagewalk.c
index f5c77dda3359..5970f62bc4b2 100644
--- a/mm/pagewalk.c
+++ b/mm/pagewalk.c
@@ -1003,8 +1003,7 @@ struct folio *folio_walk_start(struct folio_walk *fw,
 		swp_entry_t entry = pte_to_swp_entry(pte);
 
 		if ((flags & FW_MIGRATION) &&
-		    (is_migration_entry(entry) ||
-		     is_device_private_migration_entry(entry))) {
+		    (is_migration_entry(entry))) {
 			page = pfn_swap_entry_to_page(entry);
 			expose_page = false;
 			goto found;
diff --git a/mm/rmap.c b/mm/rmap.c
index 9642a79cbdb4..5aef8223914b 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1873,7 +1873,7 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
 	struct mmu_notifier_range range;
 	enum ttu_flags flags = (enum ttu_flags)(long)arg;
 	unsigned long nr_pages = 1, end_addr;
-	unsigned long pfn;
+	unsigned long nr;
 	unsigned long hsz = 0;
 	int ptes = 0;
 
@@ -1980,13 +1980,20 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
 		 */
 		pteval = ptep_get(pvmw.pte);
 		if (likely(pte_present(pteval))) {
-			pfn = pte_pfn(pteval);
+			nr = pte_pfn(pteval) - folio_pfn(folio);
 		} else {
-			pfn = swp_offset_pfn(pte_to_swp_entry(pteval));
+			swp_entry_t entry = pte_to_swp_entry(pteval);
+
+			if (is_device_private_entry(entry) ||
+			    is_device_private_migration_entry(entry))
+				nr = swp_offset_pfn(entry) - device_private_folio_to_offset(folio);
+			else
+				nr = swp_offset_pfn(entry) - folio_pfn(folio);
+
 			VM_WARN_ON_FOLIO(folio_test_hugetlb(folio), folio);
 		}
 
-		subpage = folio_page(folio, pfn - folio_pfn(folio));
+		subpage = folio_page(folio, nr);
 		address = pvmw.address;
 		anon_exclusive = folio_test_anon(folio) &&
 				 PageAnonExclusive(subpage);
@@ -2300,7 +2307,7 @@ static bool try_to_migrate_one(struct folio *folio, struct vm_area_struct *vma,
 	struct page *subpage;
 	struct mmu_notifier_range range;
 	enum ttu_flags flags = (enum ttu_flags)(long)arg;
-	unsigned long pfn;
+	unsigned long nr;
 	unsigned long hsz = 0;
 
 	/*
@@ -2370,13 +2377,20 @@ static bool try_to_migrate_one(struct folio *folio, struct vm_area_struct *vma,
 		 */
 		pteval = ptep_get(pvmw.pte);
 		if (likely(pte_present(pteval))) {
-			pfn = pte_pfn(pteval);
+			nr = pte_pfn(pteval) - folio_pfn(folio);
 		} else {
-			pfn = swp_offset_pfn(pte_to_swp_entry(pteval));
+			swp_entry_t entry = pte_to_swp_entry(pteval);
+
+			if (is_device_private_entry(entry) ||
+			    is_device_private_migration_entry(entry))
+				nr = swp_offset_pfn(entry) - device_private_folio_to_offset(folio);
+			else
+				nr = swp_offset_pfn(entry) - folio_pfn(folio);
+
 			VM_WARN_ON_FOLIO(folio_test_hugetlb(folio), folio);
 		}
 
-		subpage = folio_page(folio, pfn - folio_pfn(folio));
+		subpage = folio_page(folio, nr);
 		address = pvmw.address;
 		anon_exclusive = folio_test_anon(folio) &&
 				 PageAnonExclusive(subpage);
@@ -2436,7 +2450,7 @@ static bool try_to_migrate_one(struct folio *folio, struct vm_area_struct *vma,
 				folio_mark_dirty(folio);
 			writable = pte_write(pteval);
 		} else if (likely(pte_present(pteval))) {
-			flush_cache_page(vma, address, pfn);
+			flush_cache_page(vma, address, pte_pfn(pteval));
 			/* Nuke the page table entry. */
 			if (should_defer_flush(mm, flags)) {
 				/*
@@ -2538,21 +2552,21 @@ static bool try_to_migrate_one(struct folio *folio, struct vm_area_struct *vma,
 			if (writable) {
 				if (is_device_private_page(subpage))
 					entry = make_writable_migration_device_private_entry(
-							page_to_pfn(subpage));
+							device_private_page_to_offset(subpage));
 				else
 					entry = make_writable_migration_entry(
 							page_to_pfn(subpage));
 			} else if (anon_exclusive) {
 				if (is_device_private_page(subpage))
 					entry = make_device_migration_readable_exclusive_migration_entry(
-							page_to_pfn(subpage));
+							device_private_page_to_offset(subpage));
 				else
 					entry = make_readable_exclusive_migration_entry(
 							page_to_pfn(subpage));
 			} else {
 				if (is_device_private_page(subpage))
 					entry = make_readable_migration_device_private_entry(
-							page_to_pfn(subpage));
+							device_private_page_to_offset(subpage));
 				else
 					entry = make_readable_migration_entry(
 							page_to_pfn(subpage));
diff --git a/mm/util.c b/mm/util.c
index 2472b7381b11..5f2aef804035 100644
--- a/mm/util.c
+++ b/mm/util.c
@@ -1241,7 +1241,10 @@ void snapshot_page(struct page_snapshot *ps, const struct page *page)
 	struct folio *foliop;
 	int loops = 5;
 
-	ps->pfn = page_to_pfn(page);
+	if (is_device_private_page(page))
+		ps->pfn = device_private_page_to_offset(page);
+	else
+		ps->pfn = page_to_pfn(page);
 	ps->flags = PAGE_SNAPSHOT_FAITHFUL;
 
 again:
-- 
2.34.1
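
For readers following the device_private_offset_to_page() and device_private_page_to_offset() pair in mm/memremap.c above, here is a minimal userspace sketch of the same round trip: look up which pagemap owns an offset, then index into that pagemap's own page array. It is an illustration under stated assumptions, not the kernel code. Every sketch_* name is hypothetical, a small fixed table with a linear scan stands in for the maple tree, the range start is chosen by the caller rather than allocated, and a 4 KiB page size is assumed.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SHIFT 12	/* assumption: 4 KiB pages */
#define SKETCH_MAX_PGMAPS 8	/* arbitrary table size for this sketch */

struct sketch_pgmap;

/* Stand-in for struct page; the back-pointer plays the role of page_pgmap(). */
struct sketch_page {
	struct sketch_pgmap *pgmap;
};

/* Stand-in for the dev_pagemap fields the patch relies on. */
struct sketch_pgmap {
	unsigned long start;		/* first byte of the reserved offset range */
	unsigned long nr_pages;
	struct sketch_page *pages;	/* per-pagemap page array */
};

/* Hypothetical replacement for the maple tree: a flat table, scanned linearly. */
static struct sketch_pgmap *sketch_table[SKETCH_MAX_PGMAPS];
static size_t sketch_count;

/* Record a pagemap and point each of its pages back at it. */
static inline int sketch_register(struct sketch_pgmap *pgmap)
{
	unsigned long i;

	if (sketch_count == SKETCH_MAX_PGMAPS)
		return -1;
	for (i = 0; i < pgmap->nr_pages; i++)
		pgmap->pages[i].pgmap = pgmap;
	sketch_table[sketch_count++] = pgmap;
	return 0;
}

/* Offset to page: find the owning range, then index its page array. */
static inline struct sketch_page *sketch_offset_to_page(unsigned long offset)
{
	unsigned long addr = offset << SKETCH_PAGE_SHIFT;
	size_t i;

	for (i = 0; i < sketch_count; i++) {
		struct sketch_pgmap *pgmap = sketch_table[i];
		unsigned long len = pgmap->nr_pages << SKETCH_PAGE_SHIFT;

		if (addr >= pgmap->start && addr - pgmap->start < len)
			return &pgmap->pages[offset -
					     (pgmap->start >> SKETCH_PAGE_SHIFT)];
	}
	return NULL;	/* no pagemap owns this offset */
}

/* Page to offset: range base in pages plus the index within the array. */
static inline unsigned long sketch_page_to_offset(const struct sketch_page *page)
{
	const struct sketch_pgmap *pgmap = page->pgmap;

	return (pgmap->start >> SKETCH_PAGE_SHIFT) +
	       (unsigned long)(page - pgmap->pages);
}

/* Small self-check: four pages whose offsets start at 0x10. */
static inline int sketch_selftest(void)
{
	static struct sketch_page pages[4];
	static struct sketch_pgmap pgmap = {
		.start = 0x10000UL, .nr_pages = 4, .pages = pages,
	};

	if (sketch_register(&pgmap))
		return -1;
	assert(sketch_offset_to_page(0x10) == &pages[0]);
	assert(sketch_page_to_offset(&pages[3]) == 0x13);
	assert(sketch_offset_to_page(0x14) == NULL);
	return 0;
}
```

The point of the sketch is that these offsets never index the global memmap. They are only meaningful relative to a registered range, which is presumably why the patch adds VM_BUG_ON() to folio_pfn() for device private folios.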