From nobody Sat Feb 7 17:48:37 2026
From: Zi Yan
To: linux-mm@kvack.org
Cc: Andrew Morton, David Hildenbrand, "Huang, Ying", Baolin Wang,
 Kefeng Wang, Yang Shi, Mel Gorman, linux-kernel@vger.kernel.org,
 Zi Yan, stable@vger.kernel.org
Subject: [PATCH v3 1/3] mm/numa: no task_numa_fault() call if PTE is changed
Date: Fri, 9 Aug 2024 10:59:04 -0400
Message-ID: <20240809145906.1513458-2-ziy@nvidia.com>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20240809145906.1513458-1-ziy@nvidia.com>
References: <20240809145906.1513458-1-ziy@nvidia.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

When handling a NUMA page fault, task_numa_fault() should be called only
by the process that restores the page table of the faulted folio, to
avoid duplicated stats counting. Commit b99a342d4f11 ("NUMA balancing:
reduce TLB flush via delaying mapping on hint page fault") restructured
do_numa_page() and did not avoid the task_numa_fault() call in the second
page table check after a NUMA migration failure. Fix it by making all
!pte_same() checks return immediately.

This issue can cause task_numa_fault() to be called more often than
necessary and lead to unexpected NUMA balancing results. (It is hard to
tell whether the duplicated NUMA fault counting has a positive or
negative performance impact.)

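The intended rule can be sketched as a small userspace model. This is
only an illustration, not kernel code: the struct, its flags, and the
model_*() helpers are hypothetical names, and the model leaves out the
nid != NUMA_NO_NODE test and the paths that jump straight to out_map.

```c
#include <stdbool.h>

/*
 * Userspace model only: these names are hypothetical and do not exist in
 * the kernel. Each flag stands in for one runtime condition in the
 * patched do_numa_page().
 */
struct numa_fault_model {
	bool pte_changed_before_lock;   /* first !pte_same() check fails */
	bool migration_succeeds;        /* migrate_misplaced_folio() returns 0 */
	bool pte_changed_after_failure; /* second !pte_same() check fails */
	int stats_calls;                /* modeled task_numa_fault() calls */
};

/* Stand-in for task_numa_fault(): only counts how often it is reached. */
static void model_task_numa_fault(struct numa_fault_model *m)
{
	m->stats_calls++;
}

/* Mirrors the order of checks after the patch; always returns 0. */
static int model_do_numa_page(struct numa_fault_model *m)
{
	/* The PTE changed under us, another path handled it: no stats. */
	if (m->pte_changed_before_lock)
		return 0;

	/* This process migrated the folio: count the fault once. */
	if (m->migration_succeeds) {
		model_task_numa_fault(m);
		return 0;
	}

	/* Migration failed and the PTE changed meanwhile: no stats. */
	if (m->pte_changed_after_failure)
		return 0;

	/* This process restores the mapping (out_map path): count once. */
	model_task_numa_fault(m);
	return 0;
}
```

In this model every path reaches model_task_numa_fault() at most once,
and the paths where the PTE was changed by someone else never reach it.
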
Reported-by: "Huang, Ying"
Closes: https://lore.kernel.org/linux-mm/87zfqfw0yw.fsf@yhuang6-desk2.ccr.corp.intel.com/
Fixes: b99a342d4f11 ("NUMA balancing: reduce TLB flush via delaying mapping on hint page fault")
Cc:
Signed-off-by: Zi Yan
Acked-by: David Hildenbrand
---
 mm/memory.c | 33 ++++++++++++++++-----------------
 1 file changed, 16 insertions(+), 17 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index 67496dc5064f..bf791da57cab 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -5461,7 +5461,7 @@ static vm_fault_t do_numa_page(struct vm_fault *vmf)
 
 	if (unlikely(!pte_same(old_pte, vmf->orig_pte))) {
 		pte_unmap_unlock(vmf->pte, vmf->ptl);
-		goto out;
+		return 0;
 	}
 
 	pte = pte_modify(old_pte, vma->vm_page_prot);
@@ -5523,23 +5523,19 @@ static vm_fault_t do_numa_page(struct vm_fault *vmf)
 	if (!migrate_misplaced_folio(folio, vma, target_nid)) {
 		nid = target_nid;
 		flags |= TNF_MIGRATED;
-	} else {
-		flags |= TNF_MIGRATE_FAIL;
-		vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd,
-					       vmf->address, &vmf->ptl);
-		if (unlikely(!vmf->pte))
-			goto out;
-		if (unlikely(!pte_same(ptep_get(vmf->pte), vmf->orig_pte))) {
-			pte_unmap_unlock(vmf->pte, vmf->ptl);
-			goto out;
-		}
-		goto out_map;
+		task_numa_fault(last_cpupid, nid, nr_pages, flags);
+		return 0;
 	}
 
-out:
-	if (nid != NUMA_NO_NODE)
-		task_numa_fault(last_cpupid, nid, nr_pages, flags);
-	return 0;
+	flags |= TNF_MIGRATE_FAIL;
+	vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd,
+				       vmf->address, &vmf->ptl);
+	if (unlikely(!vmf->pte))
+		return 0;
+	if (unlikely(!pte_same(ptep_get(vmf->pte), vmf->orig_pte))) {
+		pte_unmap_unlock(vmf->pte, vmf->ptl);
+		return 0;
+	}
 out_map:
 	/*
 	 * Make it present again, depending on how arch implements
@@ -5552,7 +5548,10 @@ static vm_fault_t do_numa_page(struct vm_fault *vmf)
 		numa_rebuild_single_mapping(vmf, vma, vmf->address, vmf->pte,
 					    writable);
 	pte_unmap_unlock(vmf->pte, vmf->ptl);
-	goto out;
+
+	if (nid != NUMA_NO_NODE)
+		task_numa_fault(last_cpupid, nid, nr_pages, flags);
+	return 0;
 }
 
 static inline vm_fault_t create_huge_pmd(struct vm_fault *vmf)
-- 
2.43.0

From nobody Sat Feb 7 17:48:37 2026
From: Zi Yan
To: linux-mm@kvack.org
Cc: Andrew Morton, David Hildenbrand, "Huang, Ying", Baolin Wang,
 Kefeng Wang, Yang Shi, Mel Gorman, linux-kernel@vger.kernel.org,
 Zi Yan, stable@vger.kernel.org
Subject: [PATCH v3 2/3] mm/numa: no task_numa_fault() call if PMD is changed
Date: Fri, 9 Aug 2024 10:59:05 -0400
Message-ID: <20240809145906.1513458-3-ziy@nvidia.com>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20240809145906.1513458-1-ziy@nvidia.com>
References: <20240809145906.1513458-1-ziy@nvidia.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

When handling a NUMA page fault, task_numa_fault() should be called only
by the process that restores the page table of the faulted folio, to
avoid duplicated stats counting. Commit c5b5a3dd2c1f ("mm: thp: refactor
NUMA fault handling") restructured do_huge_pmd_numa_page() and did not
avoid the task_numa_fault() call in the second page table check after a
NUMA migration failure. Fix it by making all !pmd_same() checks return
immediately.

This issue can cause task_numa_fault() to be called more often than
necessary and lead to unexpected NUMA balancing results. (It is hard to
tell whether the duplicated NUMA fault counting has a positive or
negative performance impact.)

Reported-by: "Huang, Ying"
Closes: https://lore.kernel.org/linux-mm/87zfqfw0yw.fsf@yhuang6-desk2.ccr.corp.intel.com/
Fixes: c5b5a3dd2c1f ("mm: thp: refactor NUMA fault handling")
Cc:
Signed-off-by: Zi Yan
Acked-by: David Hildenbrand
---
 mm/huge_memory.c | 29 +++++++++++++----------------
 1 file changed, 13 insertions(+), 16 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 0024266dea0a..666fa675e5b6 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1681,7 +1681,7 @@ vm_fault_t do_huge_pmd_numa_page(struct vm_fault *vmf)
 	vmf->ptl = pmd_lock(vma->vm_mm, vmf->pmd);
 	if (unlikely(!pmd_same(oldpmd, *vmf->pmd))) {
 		spin_unlock(vmf->ptl);
-		goto out;
+		return 0;
 	}
 
 	pmd = pmd_modify(oldpmd, vma->vm_page_prot);
@@ -1724,22 +1724,16 @@ vm_fault_t do_huge_pmd_numa_page(struct vm_fault *vmf)
 	if (!migrate_misplaced_folio(folio, vma, target_nid)) {
 		flags |= TNF_MIGRATED;
 		nid = target_nid;
-	} else {
-		flags |= TNF_MIGRATE_FAIL;
-		vmf->ptl = pmd_lock(vma->vm_mm, vmf->pmd);
-		if (unlikely(!pmd_same(oldpmd, *vmf->pmd))) {
-			spin_unlock(vmf->ptl);
-			goto out;
-		}
-		goto out_map;
-	}
-
-out:
-	if (nid != NUMA_NO_NODE)
 		task_numa_fault(last_cpupid, nid, HPAGE_PMD_NR, flags);
+		return 0;
+	}
 
-	return 0;
-
+	flags |= TNF_MIGRATE_FAIL;
+	vmf->ptl = pmd_lock(vma->vm_mm, vmf->pmd);
+	if (unlikely(!pmd_same(oldpmd, *vmf->pmd))) {
+		spin_unlock(vmf->ptl);
+		return 0;
+	}
 out_map:
 	/* Restore the PMD */
 	pmd = pmd_modify(oldpmd, vma->vm_page_prot);
@@ -1749,7 +1743,10 @@ vm_fault_t do_huge_pmd_numa_page(struct vm_fault *vmf)
 	set_pmd_at(vma->vm_mm, haddr, vmf->pmd, pmd);
 	update_mmu_cache_pmd(vma, vmf->address, vmf->pmd);
 	spin_unlock(vmf->ptl);
-	goto out;
+
+	if (nid != NUMA_NO_NODE)
+		task_numa_fault(last_cpupid, nid, HPAGE_PMD_NR, flags);
+	return 0;
 }
 
 /*
-- 
2.43.0

From nobody Sat Feb 7 17:48:37 2026
From: Zi Yan
To: linux-mm@kvack.org
Cc: Andrew Morton, David Hildenbrand, "Huang, Ying", Baolin Wang,
 Kefeng Wang, Yang Shi, Mel Gorman, linux-kernel@vger.kernel.org,
 Zi Yan
Subject: [PATCH v3 3/3] mm/migrate: move common code to numa_migrate_check (was numa_migrate_prep)
Date: Fri, 9 Aug 2024 10:59:06 -0400
Message-ID: <20240809145906.1513458-4-ziy@nvidia.com>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20240809145906.1513458-1-ziy@nvidia.com>
References: <20240809145906.1513458-1-ziy@nvidia.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

do_numa_page() and do_huge_pmd_numa_page() share a lot of common code. To
reduce redundancy, move common code to numa_migrate_prep() and rename the
function to numa_migrate_check() to reflect its functionality.

Now do_huge_pmd_numa_page() also checks shared folios to set TNF_SHARED
flag.

Suggested-by: David Hildenbrand
Signed-off-by: Zi Yan
Reviewed-by: "Huang, Ying"
Reviewed-by: Baolin Wang
Acked-by: David Hildenbrand
---
 mm/huge_memory.c | 29 +++++++++-------------
 mm/internal.h    |  5 ++--
 mm/memory.c      | 63 +++++++++++++++++++++++++-----------------------
 3 files changed, 47 insertions(+), 50 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 666fa675e5b6..f2fd3aabb67b 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1669,22 +1669,23 @@ static inline bool can_change_pmd_writable(struct vm_area_struct *vma,
 vm_fault_t do_huge_pmd_numa_page(struct vm_fault *vmf)
 {
 	struct vm_area_struct *vma = vmf->vma;
-	pmd_t oldpmd = vmf->orig_pmd;
-	pmd_t pmd;
 	struct folio *folio;
 	unsigned long haddr = vmf->address & HPAGE_PMD_MASK;
 	int nid = NUMA_NO_NODE;
-	int target_nid, last_cpupid = (-1 & LAST_CPUPID_MASK);
+	int target_nid, last_cpupid;
+	pmd_t pmd, old_pmd;
 	bool writable = false;
 	int flags = 0;
 
 	vmf->ptl = pmd_lock(vma->vm_mm, vmf->pmd);
-	if (unlikely(!pmd_same(oldpmd, *vmf->pmd))) {
+	old_pmd = pmdp_get(vmf->pmd);
+
+	if (unlikely(!pmd_same(old_pmd, vmf->orig_pmd))) {
 		spin_unlock(vmf->ptl);
 		return 0;
 	}
 
-	pmd = pmd_modify(oldpmd, vma->vm_page_prot);
+	pmd = pmd_modify(old_pmd, vma->vm_page_prot);
 
 	/*
 	 * Detect now whether the PMD could be writable; this information
@@ -1699,18 +1700,10 @@ vm_fault_t do_huge_pmd_numa_page(struct vm_fault *vmf)
 	if (!folio)
 		goto out_map;
 
-	/* See similar comment in do_numa_page for explanation */
-	if (!writable)
-		flags |= TNF_NO_GROUP;
-
 	nid = folio_nid(folio);
-	/*
-	 * For memory tiering mode, cpupid of slow memory page is used
-	 * to record page access time. So use default value.
-	 */
-	if (!folio_use_access_time(folio))
-		last_cpupid = folio_last_cpupid(folio);
-	target_nid = numa_migrate_prep(folio, vmf, haddr, nid, &flags);
+
+	target_nid = numa_migrate_check(folio, vmf, haddr, &flags, writable,
+					&last_cpupid);
 	if (target_nid == NUMA_NO_NODE)
 		goto out_map;
 	if (migrate_misplaced_folio_prepare(folio, vma, target_nid)) {
@@ -1730,13 +1723,13 @@ vm_fault_t do_huge_pmd_numa_page(struct vm_fault *vmf)
 
 	flags |= TNF_MIGRATE_FAIL;
 	vmf->ptl = pmd_lock(vma->vm_mm, vmf->pmd);
-	if (unlikely(!pmd_same(oldpmd, *vmf->pmd))) {
+	if (unlikely(!pmd_same(pmdp_get(vmf->pmd), vmf->orig_pmd))) {
 		spin_unlock(vmf->ptl);
 		return 0;
 	}
 out_map:
 	/* Restore the PMD */
-	pmd = pmd_modify(oldpmd, vma->vm_page_prot);
+	pmd = pmd_modify(pmdp_get(vmf->pmd), vma->vm_page_prot);
 	pmd = pmd_mkyoung(pmd);
 	if (writable)
 		pmd = pmd_mkwrite(pmd, vma);
diff --git a/mm/internal.h b/mm/internal.h
index 52f7fc4e8ac3..fb16e18c9761 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -1191,8 +1191,9 @@ void vunmap_range_noflush(unsigned long start, unsigned long end);
 
 void __vunmap_range_noflush(unsigned long start, unsigned long end);
 
-int numa_migrate_prep(struct folio *folio, struct vm_fault *vmf,
-		      unsigned long addr, int page_nid, int *flags);
+int numa_migrate_check(struct folio *folio, struct vm_fault *vmf,
+		      unsigned long addr, int *flags, bool writable,
+		      int *last_cpupid);
 
 void free_zone_device_folio(struct folio *folio);
 int migrate_device_coherent_page(struct page *page);
diff --git a/mm/memory.c b/mm/memory.c
index bf791da57cab..e4f27c0696cb 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -5368,16 +5368,43 @@ static vm_fault_t do_fault(struct vm_fault *vmf)
 	return ret;
 }
 
-int numa_migrate_prep(struct folio *folio, struct vm_fault *vmf,
-		      unsigned long addr, int page_nid, int *flags)
+int numa_migrate_check(struct folio *folio, struct vm_fault *vmf,
+		      unsigned long addr, int *flags,
+		      bool writable, int *last_cpupid)
 {
 	struct vm_area_struct *vma = vmf->vma;
 
+	/*
+	 * Avoid grouping on RO pages in general. RO pages shouldn't hurt as
+	 * much anyway since they can be in shared cache state. This misses
+	 * the case where a mapping is writable but the process never writes
+	 * to it but pte_write gets cleared during protection updates and
+	 * pte_dirty has unpredictable behaviour between PTE scan updates,
+	 * background writeback, dirty balancing and application behaviour.
+	 */
+	if (!writable)
+		*flags |= TNF_NO_GROUP;
+
+	/*
+	 * Flag if the folio is shared between multiple address spaces. This
+	 * is later used when determining whether to group tasks together
+	 */
+	if (folio_likely_mapped_shared(folio) && (vma->vm_flags & VM_SHARED))
+		*flags |= TNF_SHARED;
+	/*
+	 * For memory tiering mode, cpupid of slow memory page is used
+	 * to record page access time. So use default value.
+	 */
+	if (folio_use_access_time(folio))
+		*last_cpupid = (-1 & LAST_CPUPID_MASK);
+	else
+		*last_cpupid = folio_last_cpupid(folio);
+
 	/* Record the current PID acceesing VMA */
 	vma_set_access_pid_bit(vma);
 
 	count_vm_numa_event(NUMA_HINT_FAULTS);
-	if (page_nid == numa_node_id()) {
+	if (folio_nid(folio) == numa_node_id()) {
 		count_vm_numa_event(NUMA_HINT_FAULTS_LOCAL);
 		*flags |= TNF_FAULT_LOCAL;
 	}
@@ -5479,35 +5506,11 @@ static vm_fault_t do_numa_page(struct vm_fault *vmf)
 	if (!folio || folio_is_zone_device(folio))
 		goto out_map;
 
-	/*
-	 * Avoid grouping on RO pages in general. RO pages shouldn't hurt as
-	 * much anyway since they can be in shared cache state. This misses
-	 * the case where a mapping is writable but the process never writes
-	 * to it but pte_write gets cleared during protection updates and
-	 * pte_dirty has unpredictable behaviour between PTE scan updates,
-	 * background writeback, dirty balancing and application behaviour.
-	 */
-	if (!writable)
-		flags |= TNF_NO_GROUP;
-
-	/*
-	 * Flag if the folio is shared between multiple address spaces. This
-	 * is later used when determining whether to group tasks together
-	 */
-	if (folio_likely_mapped_shared(folio) && (vma->vm_flags & VM_SHARED))
-		flags |= TNF_SHARED;
-
 	nid = folio_nid(folio);
 	nr_pages = folio_nr_pages(folio);
-	/*
-	 * For memory tiering mode, cpupid of slow memory page is used
-	 * to record page access time. So use default value.
-	 */
-	if (folio_use_access_time(folio))
-		last_cpupid = (-1 & LAST_CPUPID_MASK);
-	else
-		last_cpupid = folio_last_cpupid(folio);
-	target_nid = numa_migrate_prep(folio, vmf, vmf->address, nid, &flags);
+
+	target_nid = numa_migrate_check(folio, vmf, vmf->address, &flags,
+					writable, &last_cpupid);
 	if (target_nid == NUMA_NO_NODE)
 		goto out_map;
 	if (migrate_misplaced_folio_prepare(folio, vma, target_nid)) {
-- 
2.43.0