From nobody Sun May 10 21:18:10 2026
From: Jagdish Gediya
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org, akpm@linux-foundation.org
Cc: baolin.wang@linux.alibaba.com, dave.hansen@linux.intel.com,
 ying.huang@intel.com, aneesh.kumar@linux.ibm.com, shy828301@gmail.com,
 weixugc@google.com, gthelen@google.com, dan.j.williams@intel.com,
 Jagdish Gediya
Subject: [PATCH v3 1/7] mm: demotion: Fix demotion targets sharing among sources
Date: Sat, 23 Apr 2022 01:25:10 +0530
Message-Id: <20220422195516.10769-2-jvgediya@linux.ibm.com>
X-Mailer: git-send-email 2.35.1
In-Reply-To: <20220422195516.10769-1-jvgediya@linux.ibm.com>
References: <20220422195516.10769-1-jvgediya@linux.ibm.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

Sharing used_targets between multiple nodes in a single pass limits some
of the opportunities for demotion target sharing.

Don't share the used targets between multiple nodes in a single pass.
Instead, accumulate all the used targets in 'source_nodes', which is
shared by all passes, and reset 'used_targets' to 'source_nodes' when
finding demotion targets for each new node.

This results in more opportunities to share demotion targets between
multiple source nodes, e.g.
with the NUMA topology below, where nodes 0 & 1 are cpu + dram nodes,
nodes 2 & 3 are equally slower memory-only nodes, and node 4 is the
slowest memory-only node,

available: 5 nodes (0-4)
node 0 cpus: 0 1
node 0 size: n MB
node 0 free: n MB
node 1 cpus: 2 3
node 1 size: n MB
node 1 free: n MB
node 2 cpus:
node 2 size: n MB
node 2 free: n MB
node 3 cpus:
node 3 size: n MB
node 3 free: n MB
node 4 cpus:
node 4 size: n MB
node 4 free: n MB
node distances:
node   0   1   2   3   4
  0:  10  20  40  40  80
  1:  20  10  40  40  80
  2:  40  40  10  40  80
  3:  40  40  40  10  80
  4:  80  80  80  80  10

The existing implementation gives the demotion targets below,

node    demotion_target
 0              3, 2
 1              4
 2              X
 3              X
 4              X

With this patch applied, the demotion targets are,

node    demotion_target
 0              3, 2
 1              3, 2
 2              4
 3              4
 4              X

As a second example, with the NUMA topology below, where nodes 0, 1 & 2
are cpu + dram nodes and node 3 is a slow memory node,

available: 4 nodes (0-3)
node 0 cpus: 0 1
node 0 size: n MB
node 0 free: n MB
node 1 cpus: 2 3
node 1 size: n MB
node 1 free: n MB
node 2 cpus: 4 5
node 2 size: n MB
node 2 free: n MB
node 3 cpus:
node 3 size: n MB
node 3 free: n MB
node distances:
node   0   1   2   3
  0:  10  20  20  40
  1:  20  10  20  40
  2:  20  20  10  40
  3:  40  40  40  10

The existing implementation gives the demotion targets below,

node    demotion_target
 0              3
 1              X
 2              X
 3              X

With this patch applied, the demotion targets are,

node    demotion_target
 0              3
 1              3
 2              3
 3              X

Fixes: 79c28a416722 ("mm/numa: automatically generate node migration order")
Signed-off-by: Aneesh Kumar K.V
Signed-off-by: Jagdish Gediya
Reviewed-by: Baolin Wang
Tested-by: Baolin Wang
Acked-by: "Huang, Ying"
---
 mm/migrate.c | 25 ++++++++++++++-----------
 1 file changed, 14 insertions(+), 11 deletions(-)

diff --git a/mm/migrate.c b/mm/migrate.c
index 6c31ee1e1c9b..8bbe1e478122 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -2355,7 +2355,7 @@ static void __set_migration_target_nodes(void)
 {
 	nodemask_t next_pass	= NODE_MASK_NONE;
 	nodemask_t this_pass	= NODE_MASK_NONE;
-	nodemask_t used_targets = NODE_MASK_NONE;
+	nodemask_t source_nodes = NODE_MASK_NONE;
 	int node, best_distance;
 
 	/*
@@ -2373,20 +2373,23 @@ static void __set_migration_target_nodes(void)
 again:
 	this_pass = next_pass;
 	next_pass = NODE_MASK_NONE;
+
 	/*
-	 * To avoid cycles in the migration "graph", ensure
-	 * that migration sources are not future targets by
-	 * setting them in 'used_targets'. Do this only
-	 * once per pass so that multiple source nodes can
-	 * share a target node.
-	 *
-	 * 'used_targets' will become unavailable in future
-	 * passes. This limits some opportunities for
-	 * multiple source nodes to share a destination.
+	 * Accumulate source nodes to avoid the cycle in migration
+	 * list.
 	 */
-	nodes_or(used_targets, used_targets, this_pass);
+	nodes_or(source_nodes, source_nodes, this_pass);
 
 	for_each_node_mask(node, this_pass) {
+		/*
+		 * To avoid cycles in the migration "graph", ensure
+		 * that migration sources are not future targets by
+		 * setting them in 'used_targets'. Reset used_targets
+		 * to source nodes for each node in this pass so that
+		 * multiple source nodes can share a target node.
+		 */
+		nodemask_t used_targets = source_nodes;
+
 		best_distance = -1;
 
 		/*
-- 
2.35.1

From nobody Sun May 10 21:18:10 2026
From: Jagdish Gediya
Subject: [PATCH v3 2/7] mm: demotion: Add new node state N_DEMOTION_TARGETS
Date: Sat, 23 Apr 2022 01:25:11 +0530
Message-Id: <20220422195516.10769-3-jvgediya@linux.ibm.com>
Content-Type: text/plain; charset="utf-8"

Some systems (e.g. PowerVM) have DRAM (fast memory) only NUMA nodes which
are N_MEMORY, as well as slow memory (persistent memory) only NUMA nodes
which are also N_MEMORY. As the current demotion target finding algorithm
works based on N_MEMORY and best distance, it can choose a DRAM only NUMA
node as a demotion target instead of a persistent memory node on such
systems.

If a DRAM only NUMA node is filled with demoted pages, then at some point
new allocations can start falling back to persistent memory. Cold pages
then sit in fast memory (due to demotion) while new pages land in slow
memory. This is why persistent memory nodes should be utilized for
demotion, and DRAM nodes should be avoided for demotion so that they
remain available for new allocations.
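
As a rough illustration of the problem described above, the following
userspace sketch (not kernel code) picks the nearest memory node by
distance alone. The 3-node topology, the bitmask representation of node
states and the helper nearest_node() are assumptions made up for this
example only.

```c
#include <assert.h>
#include <stdint.h>

#define NR_NODES 3	/* hypothetical 3-node system */
#define NO_NODE  (-1)

/*
 * Hypothetical topology: node 0 = CPU + DRAM, node 1 = DRAM only,
 * node 2 = persistent memory only.
 */
static const int dist[NR_NODES][NR_NODES] = {
	{ 10, 20, 40 },
	{ 20, 10, 40 },
	{ 40, 40, 10 },
};

/* Return the node nearest to 'src' among the nodes set in 'candidates'. */
static int nearest_node(int src, uint32_t candidates)
{
	int best = NO_NODE;

	assert(src >= 0 && src < NR_NODES);

	for (int n = 0; n < NR_NODES; n++) {
		if (n == src || !(candidates & (1u << n)))
			continue;
		/* Strict '<' keeps the lowest node id on equal distance. */
		if (best == NO_NODE || dist[src][n] < dist[src][best])
			best = n;
	}
	return best;
}
```

With every memory node as a candidate (mask 0x7), nearest_node(0, 0x7)
returns node 1, the DRAM only node. Restricting the candidates to node 2
(mask 0x4) returns the persistent memory node instead, which is the
behaviour the rest of the series aims for.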

Add a new node state, N_DEMOTION_TARGETS. node_states[N_DEMOTION_TARGETS]
can then be used to hold the list of nodes which can be used as demotion
targets. Later patches in the series build the demotion targets based on
the nodes available in node_states[N_DEMOTION_TARGETS].

Signed-off-by: Aneesh Kumar K.V
Signed-off-by: Jagdish Gediya
Acked-by: Wei Xu
---
 drivers/base/node.c      | 4 ++++
 include/linux/nodemask.h | 1 +
 2 files changed, 5 insertions(+)

diff --git a/drivers/base/node.c b/drivers/base/node.c
index ec8bb24a5a22..6eef22e6413e 100644
--- a/drivers/base/node.c
+++ b/drivers/base/node.c
@@ -1038,6 +1038,9 @@ static struct node_attr node_state_attr[] = {
 	[N_CPU] = _NODE_ATTR(has_cpu, N_CPU),
 	[N_GENERIC_INITIATOR] = _NODE_ATTR(has_generic_initiator,
 					   N_GENERIC_INITIATOR),
+	[N_DEMOTION_TARGETS] = _NODE_ATTR(demotion_targets,
+					  N_DEMOTION_TARGETS),
+
 };
 
 static struct attribute *node_state_attrs[] = {
@@ -1050,6 +1053,7 @@ static struct attribute *node_state_attrs[] = {
 	&node_state_attr[N_MEMORY].attr.attr,
 	&node_state_attr[N_CPU].attr.attr,
 	&node_state_attr[N_GENERIC_INITIATOR].attr.attr,
+	&node_state_attr[N_DEMOTION_TARGETS].attr.attr,
 	NULL
 };
 
diff --git a/include/linux/nodemask.h b/include/linux/nodemask.h
index 567c3ddba2c4..17844300fd57 100644
--- a/include/linux/nodemask.h
+++ b/include/linux/nodemask.h
@@ -400,6 +400,7 @@ enum node_states {
 	N_MEMORY,		/* The node has memory(regular, high, movable) */
 	N_CPU,		/* The node has one or more cpus */
 	N_GENERIC_INITIATOR,	/* The node has one or more Generic Initiators */
+	N_DEMOTION_TARGETS,	/* Nodes that should be considered as demotion targets */
 	NR_NODE_STATES
 };
 
-- 
2.35.1

From nobody Sun May 10 21:18:10 2026
From: Jagdish Gediya
Subject: [PATCH v3 3/7] drivers/base/node: Add support to write node_states[] via sysfs
Date: Sat, 23 Apr 2022 01:25:12 +0530
Message-Id: <20220422195516.10769-4-jvgediya@linux.ibm.com>
Content-Type: text/plain; charset="utf-8"

The current /sys/devices/system/node/* interface doesn't support writing
node_states[]. Write support is needed in case users want to set them
manually, e.g. when a user wants to override the default
N_DEMOTION_TARGETS found by the kernel.

Rename the existing _NODE_ATTR to _NODE_ATTR_RO and introduce a new
_NODE_ATTR_RW, which can be used for node_states[] entries that are
writable from sysfs.

It may be necessary to validate written values and take action based on
them in a state-specific way, so a 'write' callback is introduced in
'struct node_attr'.

A new function, demotion_targets_write(), is added to validate the input
nodes for N_DEMOTION_TARGETS, which must be a subset of N_MEMORY, and to
build the new demotion list based on the new nodes.
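
The validation done by demotion_targets_write() can be sketched in
userspace as follows. This is only an illustration under assumptions:
node masks are plain 32-bit bitmasks, and n_memory, n_demotion_targets,
mask_subset() and demotion_targets_write_sketch() are hypothetical
stand-ins for node_states[N_MEMORY], node_states[N_DEMOTION_TARGETS],
nodes_subset() and the function added by this patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical state: nodes 0-3 have memory, nodes 2-3 are targets. */
static uint32_t n_memory = 0x0f;
static uint32_t n_demotion_targets = 0x0c;

/* Return non-zero if every node set in 'a' is also set in 'b'. */
static int mask_subset(uint32_t a, uint32_t b)
{
	return (a & ~b) == 0;
}

/*
 * Accept the new target mask only if it is a subset of the memory
 * nodes. A rejected write leaves the current targets untouched.
 */
static int demotion_targets_write_sketch(uint32_t nodes)
{
	/* Invariant: the current targets are always memory nodes. */
	assert(mask_subset(n_demotion_targets, n_memory));

	if (!mask_subset(nodes, n_memory))
		return -EINVAL;

	n_demotion_targets = nodes;
	/* The patch rebuilds the demotion list at this point. */
	return 0;
}
```

In this sketch, writing mask 0x08 (node 3 only) succeeds and replaces
the targets, while writing mask 0x10 (node 4, which has no memory here)
fails with -EINVAL and keeps the previous value.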

Signed-off-by: Jagdish Gediya
Acked-by: Wei Xu
---
 drivers/base/node.c | 62 +++++++++++++++++++++++++++++++++++----------
 1 file changed, 49 insertions(+), 13 deletions(-)

diff --git a/drivers/base/node.c b/drivers/base/node.c
index 6eef22e6413e..e03eedbc421b 100644
--- a/drivers/base/node.c
+++ b/drivers/base/node.c
@@ -20,6 +20,7 @@
 #include
 #include
 #include
+#include
 
 static struct bus_type node_subsys = {
 	.name = "node",
@@ -1013,6 +1014,7 @@ void unregister_one_node(int nid)
 struct node_attr {
 	struct device_attribute attr;
 	enum node_states state;
+	int (*write)(nodemask_t nodes);
 };
 
 static ssize_t show_node_state(struct device *dev,
@@ -1024,23 +1026,57 @@ static ssize_t show_node_state(struct device *dev,
 			  nodemask_pr_args(&node_states[na->state]));
 }
 
-#define _NODE_ATTR(name, state) \
-	{ __ATTR(name, 0444, show_node_state, NULL), state }
+static ssize_t store_node_state(struct device *s,
+				struct device_attribute *attr,
+				const char *buf, size_t count)
+{
+	nodemask_t nodes;
+	struct node_attr *na = container_of(attr, struct node_attr, attr);
+
+	if (nodelist_parse(buf, nodes))
+		return -EINVAL;
+
+	if (na->write) {
+		if (na->write(nodes))
+			return -EINVAL;
+	} else {
+		node_states[na->state] = nodes;
+	}
+
+	return count;
+}
+
+static int demotion_targets_write(nodemask_t nodes)
+{
+	if (nodes_subset(nodes, node_states[N_MEMORY])) {
+		node_states[N_DEMOTION_TARGETS] = nodes;
+		set_migration_target_nodes();
+		return 0;
+	}
+
+	return -EINVAL;
+}
+
+#define _NODE_ATTR_RO(name, state) \
+	{ __ATTR(name, 0444, show_node_state, NULL), state, NULL }
+
+#define _NODE_ATTR_RW(name, state, write_fn) \
+	{ __ATTR(name, 0644, show_node_state, store_node_state), state, write_fn }
 
 static struct node_attr node_state_attr[] = {
-	[N_POSSIBLE] = _NODE_ATTR(possible, N_POSSIBLE),
-	[N_ONLINE] = _NODE_ATTR(online, N_ONLINE),
-	[N_NORMAL_MEMORY] = _NODE_ATTR(has_normal_memory, N_NORMAL_MEMORY),
+	[N_POSSIBLE] = _NODE_ATTR_RO(possible, N_POSSIBLE),
+	[N_ONLINE] = _NODE_ATTR_RO(online, N_ONLINE),
+	[N_NORMAL_MEMORY] = _NODE_ATTR_RO(has_normal_memory, N_NORMAL_MEMORY),
 #ifdef CONFIG_HIGHMEM
-	[N_HIGH_MEMORY] = _NODE_ATTR(has_high_memory, N_HIGH_MEMORY),
+	[N_HIGH_MEMORY] = _NODE_ATTR_RO(has_high_memory, N_HIGH_MEMORY),
 #endif
-	[N_MEMORY] = _NODE_ATTR(has_memory, N_MEMORY),
-	[N_CPU] = _NODE_ATTR(has_cpu, N_CPU),
-	[N_GENERIC_INITIATOR] = _NODE_ATTR(has_generic_initiator,
-					   N_GENERIC_INITIATOR),
-	[N_DEMOTION_TARGETS] = _NODE_ATTR(demotion_targets,
-					  N_DEMOTION_TARGETS),
-
+	[N_MEMORY] = _NODE_ATTR_RO(has_memory, N_MEMORY),
+	[N_CPU] = _NODE_ATTR_RO(has_cpu, N_CPU),
+	[N_GENERIC_INITIATOR] = _NODE_ATTR_RO(has_generic_initiator,
+					      N_GENERIC_INITIATOR),
+	[N_DEMOTION_TARGETS] = _NODE_ATTR_RW(demotion_targets,
+					     N_DEMOTION_TARGETS,
+					     demotion_targets_write),
 };
 
 static struct attribute *node_state_attrs[] = {
-- 
2.35.1

From nobody Sun May 10 21:18:10 2026
From: Jagdish Gediya
Subject: [PATCH v3 4/7] device-dax/kmem: Set node state as N_DEMOTION_TARGETS
Date: Sat, 23 Apr 2022 01:25:13 +0530
Message-Id: <20220422195516.10769-5-jvgediya@linux.ibm.com>
Content-Type: text/plain; charset="utf-8"

Set the dax-device node as N_DEMOTION_TARGETS so that it can be used as
a demotion target.
In the future, support should be added to distinguish dax devices that
are not preferred as demotion targets (e.g. HBM); for such devices, the
node shouldn't be set in N_DEMOTION_TARGETS.

Signed-off-by: Aneesh Kumar K.V
Signed-off-by: Jagdish Gediya
Acked-by: Wei Xu
---
 drivers/dax/kmem.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/dax/kmem.c b/drivers/dax/kmem.c
index a37622060fff..f42ab9d04bdf 100644
--- a/drivers/dax/kmem.c
+++ b/drivers/dax/kmem.c
@@ -147,6 +147,8 @@ static int dev_dax_kmem_probe(struct dev_dax *dev_dax)
 
 	dev_set_drvdata(dev, data);
 
+	node_set_state(numa_node, N_DEMOTION_TARGETS);
+
 	return 0;
 
 err_request_mem:
-- 
2.35.1

From nobody Sun May 10 21:18:10 2026
From: Jagdish Gediya
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org, akpm@linux-foundation.org
Cc: baolin.wang@linux.alibaba.com, dave.hansen@linux.intel.com, ying.huang@intel.com, aneesh.kumar@linux.ibm.com, shy828301@gmail.com, weixugc@google.com, gthelen@google.com, dan.j.williams@intel.com, Jagdish Gediya
Subject: [PATCH v3 5/7] mm: demotion: Build demotion list based on N_DEMOTION_TARGETS
Date: Sat, 23 Apr 2022 01:25:14 +0530
Message-Id: <20220422195516.10769-6-jvgediya@linux.ibm.com>
X-Mailer: git-send-email 2.35.1
In-Reply-To: <20220422195516.10769-1-jvgediya@linux.ibm.com>
References: <20220422195516.10769-1-jvgediya@linux.ibm.com>

Only nodes that have the state N_DEMOTION_TARGETS should be used as
demotion targets. While building the demotion target list, treat every
node that is not in N_DEMOTION_TARGETS as a source node, so that
demotion targets are chosen only from N_DEMOTION_TARGETS.
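
As an illustration only, the source-node selection described above can be
modelled in userspace C. This is a sketch under stated assumptions: the
type nodemask_model_t and the function demotion_source_nodes() are
hypothetical names that do not exist in the kernel, and a 64-bit integer
stands in for nodemask_t. It mirrors what nodes_andnot() computes from
node_states[N_ONLINE] and node_states[N_DEMOTION_TARGETS].

```c
#include <stdint.h>

/* Hypothetical stand-in for nodemask_t: one bit per NUMA node (max 64). */
typedef uint64_t nodemask_model_t;

/*
 * Models nodes_andnot(dst, online, demotion_targets): the result holds
 * every node that is online but is not itself a demotion target. These
 * are the nodes the first pass treats as demotion sources.
 */
static nodemask_model_t demotion_source_nodes(nodemask_model_t online,
					      nodemask_model_t demotion_targets)
{
	return online & ~demotion_targets;
}
```

For example, with nodes 0-3 online (mask 0xF) and nodes 2-3 marked as
demotion targets (mask 0xC), the model yields 0x3, i.e. nodes 0 and 1
start the demotion path.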

Signed-off-by: Aneesh Kumar K.V
Signed-off-by: Jagdish Gediya
Acked-by: Wei Xu
---
 mm/migrate.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/mm/migrate.c b/mm/migrate.c
index 8bbe1e478122..5b92a09fbe4a 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -2366,10 +2366,10 @@ static void __set_migration_target_nodes(void)
 	disable_all_migrate_targets();
 
 	/*
-	 * Allocations go close to CPUs, first. Assume that
-	 * the migration path starts at the nodes with CPUs.
+	 * Some systems can have DRAM(fast memory) only NUMA nodes, demotion targets
+	 * need to be found for them as well.
 	 */
-	next_pass = node_states[N_CPU];
+	nodes_andnot(next_pass, node_states[N_ONLINE], node_states[N_DEMOTION_TARGETS]);
 again:
 	this_pass = next_pass;
 	next_pass = NODE_MASK_NONE;
-- 
2.35.1

From nobody Sun May 10 21:18:10 2026
From: Jagdish Gediya
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org, akpm@linux-foundation.org
Cc: baolin.wang@linux.alibaba.com, dave.hansen@linux.intel.com, ying.huang@intel.com, aneesh.kumar@linux.ibm.com, shy828301@gmail.com, weixugc@google.com, gthelen@google.com, dan.j.williams@intel.com, Jagdish Gediya
Subject: [PATCH v3 6/7] mm: demotion: expose per-node demotion targets via sysfs
Date: Sat, 23 Apr 2022 01:25:15 +0530
Message-Id: <20220422195516.10769-7-jvgediya@linux.ibm.com>
X-Mailer: git-send-email 2.35.1
In-Reply-To: <20220422195516.10769-1-jvgediya@linux.ibm.com>
References: <20220422195516.10769-1-jvgediya@linux.ibm.com>

The kernel prepares a per-node demotion target list based on
node_states[N_DEMOTION_TARGETS]. If enabled through sysfs, demotion
kicks in during reclaim, and pages get migrated according to the
demotion target list prepared by the kernel.

It is helpful to know the demotion target list prepared by the kernel
in order to understand the demotion behaviour, so add the interface
/sys/devices/system/node/nodeX/demotion_targets to view per-node
demotion targets via sysfs.

Signed-off-by: Jagdish Gediya
Reported-by: kernel test robot
---
 drivers/base/node.c     | 10 ++++++++++
 include/linux/migrate.h |  1 +
 mm/migrate.c            | 17 +++++++++++++++++
 3 files changed, 28 insertions(+)

diff --git a/drivers/base/node.c b/drivers/base/node.c
index e03eedbc421b..92326219aac2 100644
--- a/drivers/base/node.c
+++ b/drivers/base/node.c
@@ -561,11 +561,21 @@ static ssize_t node_read_distance(struct device *dev,
 }
 static DEVICE_ATTR(distance, 0444, node_read_distance, NULL);
 
+static ssize_t demotion_targets_show(struct device *dev,
+				     struct device_attribute *attr, char *buf)
+{
+	nodemask_t demotion_targets = node_get_demotion_targets(dev->id);
+
+	return sysfs_emit(buf, "%*pbl\n", nodemask_pr_args(&demotion_targets));
+}
+static DEVICE_ATTR_RO(demotion_targets);
+
 static struct attribute *node_dev_attrs[] = {
 	&dev_attr_meminfo.attr,
 	&dev_attr_numastat.attr,
 	&dev_attr_distance.attr,
 	&dev_attr_vmstat.attr,
+	&dev_attr_demotion_targets.attr,
 	NULL
 };
 
diff --git a/include/linux/migrate.h b/include/linux/migrate.h
index 90e75d5a54d6..072019441a24 100644
--- a/include/linux/migrate.h
+++ b/include/linux/migrate.h
@@ -173,6 +173,7 @@ int migrate_vma_setup(struct migrate_vma *args);
 void migrate_vma_pages(struct migrate_vma *migrate);
 void migrate_vma_finalize(struct migrate_vma *migrate);
 int next_demotion_node(int node);
+nodemask_t node_get_demotion_targets(int node);
 
 #else /* CONFIG_MIGRATION disabled: */
 
diff --git a/mm/migrate.c b/mm/migrate.c
index 5b92a09fbe4a..da864831bc0c 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -2187,6 +2187,23 @@ struct demotion_nodes {
 
 static struct demotion_nodes *node_demotion __read_mostly;
 
+nodemask_t node_get_demotion_targets(int node)
+{
+	nodemask_t demotion_targets = NODE_MASK_NONE;
+	unsigned short target_nr;
+
+	if (!node_demotion)
+		return NODE_MASK_NONE;
+
+	rcu_read_lock();
+	target_nr = READ_ONCE(node_demotion[node].nr);
+	for (int i = 0; i < target_nr; i++)
+		node_set(READ_ONCE(node_demotion[node].nodes[i]), demotion_targets);
+	rcu_read_unlock();
+
+	return demotion_targets;
+}
+
 /**
  * next_demotion_node() - Get the next node in the demotion path
  * @node: The starting node to lookup the next node
-- 
2.35.1

From nobody Sun May 10 21:18:10 2026
From: Jagdish Gediya
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org, akpm@linux-foundation.org
Cc: baolin.wang@linux.alibaba.com, dave.hansen@linux.intel.com, ying.huang@intel.com, aneesh.kumar@linux.ibm.com, shy828301@gmail.com, weixugc@google.com, gthelen@google.com, dan.j.williams@intel.com, Jagdish Gediya
Subject: [PATCH v3 7/7] docs: numa: Add documentation for demotion
Date: Sat, 23 Apr 2022 01:25:16 +0530
Message-Id: <20220422195516.10769-8-jvgediya@linux.ibm.com>
X-Mailer: git-send-email 2.35.1
In-Reply-To: <20220422195516.10769-1-jvgediya@linux.ibm.com>
References: <20220422195516.10769-1-jvgediya@linux.ibm.com>

Add documentation for demotion, explaining why it is required and
describing the sysfs interfaces related to demotion.

Signed-off-by: Jagdish Gediya
---
 Documentation/admin-guide/mm/index.rst        |  1 +
 .../admin-guide/mm/numa_demotion.rst          | 57 +++++++++++++++++++
 2 files changed, 58 insertions(+)
 create mode 100644 Documentation/admin-guide/mm/numa_demotion.rst

diff --git a/Documentation/admin-guide/mm/index.rst b/Documentation/admin-guide/mm/index.rst
index c21b5823f126..4bd0ed3de9c5 100644
--- a/Documentation/admin-guide/mm/index.rst
+++ b/Documentation/admin-guide/mm/index.rst
@@ -34,6 +34,7 @@ the Linux memory management.
    memory-hotplug
    nommu-mmap
    numa_memory_policy
+   numa_demotion
    numaperf
    pagemap
    soft-dirty
diff --git a/Documentation/admin-guide/mm/numa_demotion.rst b/Documentation/admin-guide/mm/numa_demotion.rst
new file mode 100644
index 000000000000..252be9dc0517
--- /dev/null
+++ b/Documentation/admin-guide/mm/numa_demotion.rst
@@ -0,0 +1,57 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+==================
+NUMA Demotion
+==================
+
+Why is demotion required?
+============================
+
+With the advent of various new memory types, systems have multiple
+types of memory, e.g. DRAM and PMEM (persistent memory). The memory
+subsystem of such systems can be called a memory tiering system,
+because the performance of the different types of memory is usually
+different.
+
+In a system with some DRAM and some persistent memory, once DRAM
+fills up, reclaim will start and some of the DRAM contents will be
+thrown out to swap even if there is space in persistent memory.
+Allocations will, at some point, start falling over to the slower
+persistent memory.
+
+Instead of a page being discarded during reclaim, it can be moved to
+persistent memory. Allowing page migration during reclaim enables
+these systems to migrate pages from fast tiers to slow tiers when
+the fast tier is under pressure.
+
+SYSFS interface
+======================
+
+Enable/Disable demotion
+------------------------
+
+By default demotion is disabled. It can be enabled or disabled using the
+sysfs interface below:
+
+echo 0/1 or false/true > /sys/kernel/mm/numa/demotion_enabled
+
+Read system demotion targets
+-----------------------------
+cat /sys/devices/system/node/demotion_targets
+
+The kernel shows node_states[N_DEMOTION_TARGETS] when this command
+is run.
+
+Override default demotion targets
+---------------------------------
+echo <nodelist> > /sys/devices/system/node/demotion_targets
+
+If the nodelist is valid and a subset of N_MEMORY, then
+node_states[N_DEMOTION_TARGETS] is set to this new nodelist, and
+the kernel builds the new demotion list based on it.
+
+Read per node demotion targets
+-------------------------------
+cat /sys/devices/system/node/nodeX/demotion_targets
+
+It shows the per-node demotion targets configured by the kernel.
-- 
2.35.1
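
A note for readers of the demotion_targets files: the show handler in
patch 6/7 prints the mask with "%*pbl", which renders a bitmap as a
range list such as "0-1,3". The userspace sketch below only illustrates
that output style; format_nodelist() is a hypothetical helper, not a
kernel function, it assumes at most 64 nodes, and it omits the trailing
newline that the sysfs file adds.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical userspace helper (not a kernel API): render a 64-bit node
 * mask as a range list such as "0-1,3". Returns the number of characters
 * written, excluding the terminating NUL.
 */
static int format_nodelist(uint64_t mask, char *buf, size_t len)
{
	size_t used = 0;
	int node = 0;

	if (len == 0)
		return 0;
	buf[0] = '\0';

	while (node < 64) {
		int first, last, n;

		if (!(mask & (UINT64_C(1) << node))) {
			node++;
			continue;
		}
		/* Extend the run of consecutive set bits starting at node. */
		first = node;
		while (node < 64 && (mask & (UINT64_C(1) << node)))
			node++;
		last = node - 1;

		if (first == last)
			n = snprintf(buf + used, len - used, "%s%d",
				     used ? "," : "", first);
		else
			n = snprintf(buf + used, len - used, "%s%d-%d",
				     used ? "," : "", first, last);
		if (n < 0 || (size_t)n >= len - used) {
			/* Does not fit: drop the partial entry and stop. */
			buf[used] = '\0';
			break;
		}
		used += (size_t)n;
	}
	return (int)used;
}
```

With this model, a node whose demotion targets are nodes 0, 1 and 3
(mask 0xB) is rendered as "0-1,3", and a node with no demotion targets
yields an empty string.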