From nobody Tue May 13 19:53:56 2025
Received: from invmail4.hynix.com (exvmail4.skhynix.com [166.125.252.92])
	by smtp.subspace.kernel.org (Postfix) with ESMTP id 76200847B;
	Fri,  4 Apr 2025 07:46:42 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org;
 arc=none smtp.client-ip=166.125.252.92
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116;
	t=1743752809; cv=none;
 b=ROhyLMTGfhjq6pYHjBHBv5uvy7sFUqLPby/mYyuGMEjlqYxz5zRHsoeRZNi5anuhK7tiHDPOEg1OD7jHW1R/tHhFva+iTwcz3lzqtntNTu9n1Gm/hhQK+OYjBOKGsv1+EKwGdIkVvaTL630otsLK0cCSH/4TWLNmqClEBd8FJI0=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org;
	s=arc-20240116; t=1743752809; c=relaxed/simple;
	bh=sYySw3ebV2rPRTM+5OKnTORtx8ZjzQN/s/eAyKcnWUI=;
	h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References:
	 MIME-Version;
 b=pPrqhtkXuRyB0SiPx1ShcgbnQydkUZTG4zN1FgBeNCTt0yIWGdvz+uDMlRupVaGxf/mg/Pacx9l1ouhZ2VBBB3zZ0Qw9I5zP/4IX8qXkmvD127pxYaM4p7DuBgdDQzHeYoYA8olW4OBvkX/eSPb9mTi+yDYR1O6gUmJId/IzJww=
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org;
 dmarc=none (p=none dis=none) header.from=sk.com;
 spf=pass smtp.mailfrom=sk.com; arc=none smtp.client-ip=166.125.252.92
Authentication-Results: smtp.subspace.kernel.org;
 dmarc=none (p=none dis=none) header.from=sk.com
Authentication-Results: smtp.subspace.kernel.org;
 spf=pass smtp.mailfrom=sk.com
X-AuditID: a67dfc5b-681ff7000002311f-ca-67ef8e5a4971
From: Rakie Kim <rakie.kim@sk.com>
To: akpm@linux-foundation.org
Cc: gourry@gourry.net,
	linux-mm@kvack.org,
	linux-kernel@vger.kernel.org,
	linux-cxl@vger.kernel.org,
	joshua.hahnjy@gmail.com,
	dan.j.williams@intel.com,
	ying.huang@linux.alibaba.com,
	david@redhat.com,
	Jonathan.Cameron@huawei.com,
	osalvador@suse.de,
	kernel_team@skhynix.com,
	honggyu.kim@sk.com,
	yunjeong.mun@sk.com,
	rakie.kim@sk.com
Subject: [PATCH v6 1/3] mm/mempolicy: Fix memory leaks in weighted interleave
 sysfs
Date: Fri,  4 Apr 2025 16:46:19 +0900
Message-ID: <20250404074623.1179-2-rakie.kim@sk.com>
X-Mailer: git-send-email 2.48.1.windows.1
In-Reply-To: <20250404074623.1179-1-rakie.kim@sk.com>
References: <20250404074623.1179-1-rakie.kim@sk.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id: <linux-kernel.vger.kernel.org>
List-Subscribe: <mailto:linux-kernel+subscribe@vger.kernel.org>
List-Unsubscribe: <mailto:linux-kernel+unsubscribe@vger.kernel.org>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
X-CFilter-Loop: Reflected
Content-Type: text/plain; charset="utf-8"

The weighted interleave sysfs code leaked memory when initialization
failed or when the sysfs attributes were torn down, because its
kobjects were not deallocated correctly.

This patch fixes the leaks by replacing the direct `kfree()` calls on
initialized kobjects with `kobject_del()` followed by `kobject_put()`.
The release callbacks now free the memory they own:
`sysfs_wi_release()` frees `node_attrs` and `wi_kobj` instead of
calling `kobject_put()` on the kobject that is being released.

Calling `kobject_del()` before `kobject_put()` removes the kobject
from sysfs first, so the release function runs on the final put and
frees all memory associated with the kobject.

Fixes: dce41f5ae253 ("mm/mempolicy: implement the sysfs-based weighted_inte=
rleave interface")
Signed-off-by: Rakie Kim <rakie.kim@sk.com>
Signed-off-by: Honggyu Kim <honggyu.kim@sk.com>
Signed-off-by: Yunjeong Mun <yunjeong.mun@sk.com>
Reviewed-by: Gregory Price <gourry@gourry.net>
---
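Note for reviewers: the teardown rule described in the commit message
can be sketched in plain userspace C. This is a simplified model, not
kernel code. `struct obj`, `obj_init()`, `obj_del()`, `obj_put()` and
`obj_create()` are hypothetical stand-ins for the kobject API, and the
model keeps only a refcount and a release hook. It shows why an object
that has been initialized is released with a put, which runs the
release hook, rather than freed directly.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for a kobject. */
struct obj {
	int refcount;
	int linked;	/* 1 while the object is "visible" */
	void (*release)(struct obj *o);
};

static int released_count;	/* release() calls, for demonstration */

/* Release hook: the only place that frees the object. */
static void obj_release(struct obj *o)
{
	released_count++;
	free(o);
}

static void obj_init(struct obj *o, void (*release)(struct obj *o))
{
	o->refcount = 1;
	o->linked = 0;
	o->release = release;
}

/* Unlink the object; it stays allocated until the last put. */
static void obj_del(struct obj *o)
{
	o->linked = 0;
}

/* Drop one reference; the final put runs the release hook. */
static void obj_put(struct obj *o)
{
	assert(o->refcount > 0);
	if (--o->refcount == 0)
		o->release(o);
}

/* Returns 0 and sets *out on success, -1 on failure. */
static int obj_create(int fail_late, struct obj **out)
{
	struct obj *o = calloc(1, sizeof(*o));

	if (!o)
		return -1;	/* never initialized, nothing to put */

	obj_init(o, obj_release);
	o->linked = 1;

	if (fail_late) {
		/* Initialized: unlink, then put. Do not free(o) here. */
		obj_del(o);
		obj_put(o);
		return -1;
	}

	*out = o;
	return 0;
}
```

In this model a late failure in `obj_create()` leaves no allocation
behind, because the final `obj_put()` reaches `obj_release()`.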
 mm/mempolicy.c | 64 +++++++++++++++++++++++++++-----------------------
 1 file changed, 34 insertions(+), 30 deletions(-)

diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index bbaadbeeb291..af3753925573 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -3448,7 +3448,9 @@ static void sysfs_wi_release(struct kobject *wi_kobj)
=20
 	for (i =3D 0; i < nr_node_ids; i++)
 		sysfs_wi_node_release(node_attrs[i], wi_kobj);
-	kobject_put(wi_kobj);
+
+	kfree(node_attrs);
+	kfree(wi_kobj);
 }
=20
 static const struct kobj_type wi_ktype =3D {
@@ -3494,15 +3496,22 @@ static int add_weighted_interleave_group(struct kob=
ject *root_kobj)
 	struct kobject *wi_kobj;
 	int nid, err;
=20
-	wi_kobj =3D kzalloc(sizeof(struct kobject), GFP_KERNEL);
-	if (!wi_kobj)
+	node_attrs =3D kcalloc(nr_node_ids, sizeof(struct iw_node_attr *),
+			     GFP_KERNEL);
+	if (!node_attrs)
 		return -ENOMEM;
=20
+	wi_kobj =3D kzalloc(sizeof(struct kobject), GFP_KERNEL);
+	if (!wi_kobj) {
+		err =3D -ENOMEM;
+		goto node_out;
+	}
+
 	err =3D kobject_init_and_add(wi_kobj, &wi_ktype, root_kobj,
 				   "weighted_interleave");
 	if (err) {
-		kfree(wi_kobj);
-		return err;
+		kobject_put(wi_kobj);
+		goto err_out;
 	}
=20
 	for_each_node_state(nid, N_POSSIBLE) {
@@ -3512,9 +3521,18 @@ static int add_weighted_interleave_group(struct kobj=
ect *root_kobj)
 			break;
 		}
 	}
-	if (err)
+	if (err) {
+		kobject_del(wi_kobj);
 		kobject_put(wi_kobj);
+		goto err_out;
+	}
+
 	return 0;
+
+node_out:
+	kfree(node_attrs);
+err_out:
+	return err;
 }
=20
 static void mempolicy_kobj_release(struct kobject *kobj)
@@ -3528,7 +3546,6 @@ static void mempolicy_kobj_release(struct kobject *ko=
bj)
 	mutex_unlock(&iw_table_lock);
 	synchronize_rcu();
 	kfree(old);
-	kfree(node_attrs);
 	kfree(kobj);
 }
=20
@@ -3542,37 +3559,24 @@ static int __init mempolicy_sysfs_init(void)
 	static struct kobject *mempolicy_kobj;
=20
 	mempolicy_kobj =3D kzalloc(sizeof(*mempolicy_kobj), GFP_KERNEL);
-	if (!mempolicy_kobj) {
-		err =3D -ENOMEM;
-		goto err_out;
-	}
-
-	node_attrs =3D kcalloc(nr_node_ids, sizeof(struct iw_node_attr *),
-			     GFP_KERNEL);
-	if (!node_attrs) {
-		err =3D -ENOMEM;
-		goto mempol_out;
-	}
+	if (!mempolicy_kobj)
+		return -ENOMEM;
=20
 	err =3D kobject_init_and_add(mempolicy_kobj, &mempolicy_ktype, mm_kobj,
 				   "mempolicy");
 	if (err)
-		goto node_out;
+		goto err_out;
=20
 	err =3D add_weighted_interleave_group(mempolicy_kobj);
-	if (err) {
-		pr_err("mempolicy sysfs structure failed to initialize\n");
-		kobject_put(mempolicy_kobj);
-		return err;
-	}
+	if (err)
+		goto err_del;
=20
-	return err;
-node_out:
-	kfree(node_attrs);
-mempol_out:
-	kfree(mempolicy_kobj);
+	return 0;
+
+err_del:
+	kobject_del(mempolicy_kobj);
 err_out:
-	pr_err("failed to add mempolicy kobject to the system\n");
+	kobject_put(mempolicy_kobj);
 	return err;
 }
=20
--=20
2.34.1
From nobody Tue May 13 19:53:56 2025
Received: from invmail4.hynix.com (exvmail4.hynix.com [166.125.252.92])
	by smtp.subspace.kernel.org (Postfix) with ESMTP id E948518FC91;
	Fri,  4 Apr 2025 07:46:46 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org;
 arc=none smtp.client-ip=166.125.252.92
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116;
	t=1743752811; cv=none;
 b=bxftHyEc3qujiSLNdlBJN4UyP6prkmboTtDdt1vdEWdvbRGgEG0Rw2TLZZE9CnJ9wmuqs/m+tm4J2Gqc/slJAoPkjRq4F3OiHeYQPLJsbwsmXcEp/mphmrbP71+wAArmg1eXY+86ejg8IxeC7TaPjMCBKRQMZ31eRmSi8DlTApY=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org;
	s=arc-20240116; t=1743752811; c=relaxed/simple;
	bh=HrX9Yn58RNgtGFV4viTIov2i+s6Rp3kU0uuVvAluDRo=;
	h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References:
	 MIME-Version;
 b=bje1KGXs/gTI7HWBpDt8vNTXhG+waOyU6lX0F2B9TKFWReyMruWXybial4UCHiOFP+yS6GtFeqStu2B9ihZZnzKjyrQEjNr8sCKv1qeIWyuOHgrU8j22Vre8CAmjHgL+D5BYCXNj9mtSEj03e8Gco5Lp0+E+k4BNtdn8JnzzRuc=
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org;
 dmarc=none (p=none dis=none) header.from=sk.com;
 spf=pass smtp.mailfrom=sk.com; arc=none smtp.client-ip=166.125.252.92
Authentication-Results: smtp.subspace.kernel.org;
 dmarc=none (p=none dis=none) header.from=sk.com
Authentication-Results: smtp.subspace.kernel.org;
 spf=pass smtp.mailfrom=sk.com
X-AuditID: a67dfc5b-681ff7000002311f-d0-67ef8e5ceb36
From: Rakie Kim <rakie.kim@sk.com>
To: akpm@linux-foundation.org
Cc: gourry@gourry.net,
	linux-mm@kvack.org,
	linux-kernel@vger.kernel.org,
	linux-cxl@vger.kernel.org,
	joshua.hahnjy@gmail.com,
	dan.j.williams@intel.com,
	ying.huang@linux.alibaba.com,
	david@redhat.com,
	Jonathan.Cameron@huawei.com,
	osalvador@suse.de,
	kernel_team@skhynix.com,
	honggyu.kim@sk.com,
	yunjeong.mun@sk.com,
	rakie.kim@sk.com
Subject: [PATCH v6 2/3] mm/mempolicy: Prepare weighted interleave sysfs for
 memory hotplug
Date: Fri,  4 Apr 2025 16:46:20 +0900
Message-ID: <20250404074623.1179-3-rakie.kim@sk.com>
X-Mailer: git-send-email 2.48.1.windows.1
In-Reply-To: <20250404074623.1179-1-rakie.kim@sk.com>
References: <20250404074623.1179-1-rakie.kim@sk.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id: <linux-kernel.vger.kernel.org>
List-Subscribe: <mailto:linux-kernel+subscribe@vger.kernel.org>
List-Unsubscribe: <mailto:linux-kernel+unsubscribe@vger.kernel.org>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
X-CFilter-Loop: Reflected
Content-Type: text/plain; charset="utf-8"

Previously, the weighted interleave sysfs structure was statically
managed during initialization. This prevented new nodes from being
recognized when memory hotplug events occurred, limiting the ability
to update or extend sysfs entries dynamically at runtime.

To address this, this patch refactors the sysfs infrastructure and
encapsulates it within a new structure, `sysfs_wi_group`, which holds
both the kobject and an array of node attribute pointers.

By keeping this group structure in a file-scope pointer, the per-node
sysfs attributes can be managed beyond initialization time, so later
code can insert or remove node entries in response to events such as
memory hotplug or node online/offline transitions.

The per-node helpers are renamed to sysfs_wi_node_add() and
sysfs_wi_node_delete() and now operate on the group by node id rather
than on a kobject passed in by the caller. This refactoring makes it
possible to manage per-node sysfs entries individually and prepares
the infrastructure for runtime extension.

Signed-off-by: Rakie Kim <rakie.kim@sk.com>
Signed-off-by: Honggyu Kim <honggyu.kim@sk.com>
Signed-off-by: Yunjeong Mun <yunjeong.mun@sk.com>
Reviewed-by: Gregory Price <gourry@gourry.net>
---
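Note for reviewers: the single-allocation layout described in the
commit message can be sketched in plain userspace C. This is a
simplified model, not kernel code. `struct group`, `group_alloc()`,
`group_add()` and `group_free()` are hypothetical names, and the
explicit overflow check stands in for what `struct_size()` provides in
the kernel. It shows a header and a per-node pointer table sharing one
allocation through a flexible array member.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

struct node_attr {
	int nid;
};

/* Header and per-node pointer table in one allocation. */
struct group {
	size_t nr_slots;
	struct node_attr *slots[];	/* flexible array member */
};

static struct group *group_alloc(size_t nr_slots)
{
	struct group *g;
	size_t i;

	/* Reject sizes whose byte count would overflow size_t. */
	if (nr_slots > (SIZE_MAX - sizeof(*g)) / sizeof(g->slots[0]))
		return NULL;

	g = malloc(sizeof(*g) + nr_slots * sizeof(g->slots[0]));
	if (!g)
		return NULL;

	g->nr_slots = nr_slots;
	for (i = 0; i < nr_slots; i++)
		g->slots[i] = NULL;
	return g;
}

/* Fill one slot; returns 0 on success, -1 on a bad id or no memory. */
static int group_add(struct group *g, int nid)
{
	struct node_attr *attr;

	if (nid < 0 || (size_t)nid >= g->nr_slots)
		return -1;
	if (g->slots[nid])
		return 0;	/* already present */

	attr = malloc(sizeof(*attr));
	if (!attr)
		return -1;
	attr->nid = nid;
	g->slots[nid] = attr;
	return 0;
}

/* Free every populated slot, then the group itself. */
static void group_free(struct group *g)
{
	size_t i;

	if (!g)
		return;
	for (i = 0; i < g->nr_slots; i++)
		free(g->slots[i]);
	free(g);
}
```

With this layout one `free()` of the group releases the header and the
table together, which mirrors how a single release callback can own
both in the patch.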
 mm/mempolicy.c | 73 ++++++++++++++++++++++----------------------------
 1 file changed, 32 insertions(+), 41 deletions(-)

diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index af3753925573..73a9405ff352 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -3388,6 +3388,13 @@ struct iw_node_attr {
 	int nid;
 };
=20
+struct sysfs_wi_group {
+	struct kobject wi_kobj;
+	struct iw_node_attr *nattrs[];
+};
+
+static struct sysfs_wi_group *wi_group;
+
 static ssize_t node_show(struct kobject *kobj, struct kobj_attribute *attr,
 			 char *buf)
 {
@@ -3430,27 +3437,24 @@ static ssize_t node_store(struct kobject *kobj, str=
uct kobj_attribute *attr,
 	return count;
 }
=20
-static struct iw_node_attr **node_attrs;
-
-static void sysfs_wi_node_release(struct iw_node_attr *node_attr,
-				  struct kobject *parent)
+static void sysfs_wi_node_delete(int nid)
 {
-	if (!node_attr)
+	if (!wi_group->nattrs[nid])
 		return;
-	sysfs_remove_file(parent, &node_attr->kobj_attr.attr);
-	kfree(node_attr->kobj_attr.attr.name);
-	kfree(node_attr);
+
+	sysfs_remove_file(&wi_group->wi_kobj,
+			  &wi_group->nattrs[nid]->kobj_attr.attr);
+	kfree(wi_group->nattrs[nid]->kobj_attr.attr.name);
+	kfree(wi_group->nattrs[nid]);
 }
=20
 static void sysfs_wi_release(struct kobject *wi_kobj)
 {
-	int i;
-
-	for (i =3D 0; i < nr_node_ids; i++)
-		sysfs_wi_node_release(node_attrs[i], wi_kobj);
+	int nid;
=20
-	kfree(node_attrs);
-	kfree(wi_kobj);
+	for (nid =3D 0; nid < nr_node_ids; nid++)
+		sysfs_wi_node_delete(nid);
+	kfree(wi_group);
 }
=20
 static const struct kobj_type wi_ktype =3D {
@@ -3458,7 +3462,7 @@ static const struct kobj_type wi_ktype =3D {
 	.release =3D sysfs_wi_release,
 };
=20
-static int add_weight_node(int nid, struct kobject *wi_kobj)
+static int sysfs_wi_node_add(int nid)
 {
 	struct iw_node_attr *node_attr;
 	char *name;
@@ -3480,58 +3484,45 @@ static int add_weight_node(int nid, struct kobject =
*wi_kobj)
 	node_attr->kobj_attr.store =3D node_store;
 	node_attr->nid =3D nid;
=20
-	if (sysfs_create_file(wi_kobj, &node_attr->kobj_attr.attr)) {
+	if (sysfs_create_file(&wi_group->wi_kobj, &node_attr->kobj_attr.attr)) {
 		kfree(node_attr->kobj_attr.attr.name);
 		kfree(node_attr);
 		pr_err("failed to add attribute to weighted_interleave\n");
 		return -ENOMEM;
 	}
=20
-	node_attrs[nid] =3D node_attr;
+	wi_group->nattrs[nid] =3D node_attr;
 	return 0;
 }
=20
-static int add_weighted_interleave_group(struct kobject *root_kobj)
+static int add_weighted_interleave_group(struct kobject *mempolicy_kobj)
 {
-	struct kobject *wi_kobj;
 	int nid, err;
=20
-	node_attrs =3D kcalloc(nr_node_ids, sizeof(struct iw_node_attr *),
-			     GFP_KERNEL);
-	if (!node_attrs)
+	wi_group =3D kzalloc(struct_size(wi_group, nattrs, nr_node_ids),
+			GFP_KERNEL);
+	if (!wi_group)
 		return -ENOMEM;
=20
-	wi_kobj =3D kzalloc(sizeof(struct kobject), GFP_KERNEL);
-	if (!wi_kobj) {
-		err =3D -ENOMEM;
-		goto node_out;
-	}
-
-	err =3D kobject_init_and_add(wi_kobj, &wi_ktype, root_kobj,
+	err =3D kobject_init_and_add(&wi_group->wi_kobj, &wi_ktype, mempolicy_kob=
j,
 				   "weighted_interleave");
-	if (err) {
-		kobject_put(wi_kobj);
+	if (err)
 		goto err_out;
-	}
=20
 	for_each_node_state(nid, N_POSSIBLE) {
-		err =3D add_weight_node(nid, wi_kobj);
+		err =3D sysfs_wi_node_add(nid);
 		if (err) {
 			pr_err("failed to add sysfs [node%d]\n", nid);
-			break;
+			goto err_del;
 		}
 	}
-	if (err) {
-		kobject_del(wi_kobj);
-		kobject_put(wi_kobj);
-		goto err_out;
-	}
=20
 	return 0;
=20
-node_out:
-	kfree(node_attrs);
+err_del:
+	kobject_del(&wi_group->wi_kobj);
 err_out:
+	kobject_put(&wi_group->wi_kobj);
 	return err;
 }
=20
--=20
2.34.1
From nobody Tue May 13 19:53:56 2025
Received: from invmail4.hynix.com (exvmail4.skhynix.com [166.125.252.92])
	by smtp.subspace.kernel.org (Postfix) with ESMTP id B0B061991DD;
	Fri,  4 Apr 2025 07:46:49 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org;
 arc=none smtp.client-ip=166.125.252.92
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116;
	t=1743752813; cv=none;
 b=pHjjOhdN/xhT2IX6EF6Gt88K+i/HTcfQyhLkffMIZviPAJ3FWUUsR84Z0mRUSS2J5Pn+VV5bjMFlITW9lUyC5oOBSd8l0XSsmkvzoJ4VDysjgsEB8LSOBn15P11gm4CvL383SH1Y7N4jKUG24LoOrg8R5tHlmXeBKlSEaxAvd3o=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org;
	s=arc-20240116; t=1743752813; c=relaxed/simple;
	bh=CSZVRIjnXcSmPYcN+7RUGoYO5fSewwMnIim4k59qa3k=;
	h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References:
	 MIME-Version;
 b=B1mLjTet7DQ9vWftocAFmsGLPY491Bh19TbUPqToXZm33KbhBYYO0QrzrPS8u3sg88lNZEEMMSyX5xQ0/Gbom/PwEgMriE05UIbA5wL3Ut7V6IvsPgIDki9PT14R4xnL0ckn3Yoh+jOgRiMH5BdpwLVZSkHix1rAm9lRXUozXb8=
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org;
 dmarc=none (p=none dis=none) header.from=sk.com;
 spf=pass smtp.mailfrom=sk.com; arc=none smtp.client-ip=166.125.252.92
Authentication-Results: smtp.subspace.kernel.org;
 dmarc=none (p=none dis=none) header.from=sk.com
Authentication-Results: smtp.subspace.kernel.org;
 spf=pass smtp.mailfrom=sk.com
X-AuditID: a67dfc5b-681ff7000002311f-dc-67ef8e5e12db
From: Rakie Kim <rakie.kim@sk.com>
To: akpm@linux-foundation.org
Cc: gourry@gourry.net,
	linux-mm@kvack.org,
	linux-kernel@vger.kernel.org,
	linux-cxl@vger.kernel.org,
	joshua.hahnjy@gmail.com,
	dan.j.williams@intel.com,
	ying.huang@linux.alibaba.com,
	david@redhat.com,
	Jonathan.Cameron@huawei.com,
	osalvador@suse.de,
	kernel_team@skhynix.com,
	honggyu.kim@sk.com,
	yunjeong.mun@sk.com,
	rakie.kim@sk.com
Subject: [PATCH v6 3/3] mm/mempolicy: Support memory hotplug in weighted
 interleave
Date: Fri,  4 Apr 2025 16:46:21 +0900
Message-ID: <20250404074623.1179-4-rakie.kim@sk.com>
X-Mailer: git-send-email 2.48.1.windows.1
In-Reply-To: <20250404074623.1179-1-rakie.kim@sk.com>
References: <20250404074623.1179-1-rakie.kim@sk.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id: <linux-kernel.vger.kernel.org>
List-Subscribe: <mailto:linux-kernel+subscribe@vger.kernel.org>
List-Unsubscribe: <mailto:linux-kernel+unsubscribe@vger.kernel.org>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
X-CFilter-Loop: Reflected
Content-Type: text/plain; charset="utf-8"

The weighted interleave policy distributes page allocations across multiple
NUMA nodes based on their performance weight, thereby improving memory
bandwidth utilization. The weight values for each node are configured
through sysfs.

Previously, sysfs entries for configuring weighted interleave were created
for all possible nodes (N_POSSIBLE) at initialization, including nodes that
might not have memory. However, not all nodes in N_POSSIBLE are usable at
runtime, as some may remain memoryless or offline.
This led to sysfs entries being created for unusable nodes, causing
potential misconfiguration issues.

To address this issue, this patch modifies the sysfs creation logic to:
1) Limit sysfs entries to nodes that are online and have memory, avoiding
   the creation of sysfs entries for nodes that cannot be used.
2) Support memory hotplug by dynamically adding and removing sysfs entries
   based on whether a node transitions into or out of the N_MEMORY state.

Additionally, the sysfs attribute of a node is removed when the node
leaves the N_MEMORY state, so stale entries do not persist in the
system.

With these changes, the weighted interleave policy exposes sysfs
entries only for nodes that currently have memory and keeps them in
sync with memory hotplug events.

Signed-off-by: Rakie Kim <rakie.kim@sk.com>
Signed-off-by: Honggyu Kim <honggyu.kim@sk.com>
Signed-off-by: Yunjeong Mun <yunjeong.mun@sk.com>
---
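Note for reviewers: the add/remove behaviour described in the commit
message can be sketched in plain userspace C. This is a simplified
model, not kernel code. `MAX_NODES`, `enum mem_action`, `entry_add()`,
`entry_delete()`, `entry_count()` and `on_memory_event()` are
hypothetical names, a boolean table stands in for the sysfs
attributes, and the locking done by the patch is left out. It shows
entries that follow memory online/offline events, an idempotent add,
and a negative node id treated as "nothing to do".

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_NODES 8	/* hypothetical upper bound on node ids */

enum mem_action { ACT_ONLINE, ACT_OFFLINE, ACT_OTHER };

/* One flag per node: true while the node has an entry. */
static bool entry_present[MAX_NODES];

/* Idempotent add; returns -1 only for an out-of-range id. */
static int entry_add(int nid)
{
	if (nid < 0 || nid >= MAX_NODES)
		return -1;
	entry_present[nid] = true;
	return 0;
}

/* Removing a missing or out-of-range entry is a no-op. */
static void entry_delete(int nid)
{
	if (nid < 0 || nid >= MAX_NODES)
		return;
	entry_present[nid] = false;
}

static int entry_count(void)
{
	int nid, count = 0;

	for (nid = 0; nid < MAX_NODES; nid++)
		if (entry_present[nid])
			count++;
	return count;
}

/* Dispatch one event; a negative nid means no node changed state. */
static int on_memory_event(enum mem_action action, int nid)
{
	if (nid < 0)
		return 0;

	switch (action) {
	case ACT_ONLINE:
		return entry_add(nid);
	case ACT_OFFLINE:
		entry_delete(nid);
		return 0;
	default:
		return 0;
	}
}
```

In this model a repeated online event for the same node leaves a
single entry, which corresponds to the "already exists" early return
in sysfs_wi_node_add().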
 mm/mempolicy.c | 109 ++++++++++++++++++++++++++++++++++++++-----------
 1 file changed, 86 insertions(+), 23 deletions(-)

diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 73a9405ff352..f25c2c7f8fcf 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -113,6 +113,7 @@
 #include <asm/tlbflush.h>
 #include <asm/tlb.h>
 #include <linux/uaccess.h>
+#include <linux/memory.h>
=20
 #include "internal.h"
=20
@@ -3390,6 +3391,7 @@ struct iw_node_attr {
=20
 struct sysfs_wi_group {
 	struct kobject wi_kobj;
+	struct mutex kobj_lock;
 	struct iw_node_attr *nattrs[];
 };
=20
@@ -3439,13 +3441,24 @@ static ssize_t node_store(struct kobject *kobj, str=
uct kobj_attribute *attr,
=20
 static void sysfs_wi_node_delete(int nid)
 {
-	if (!wi_group->nattrs[nid])
+	struct iw_node_attr *attr;
+
+	if (nid < 0 || nid >=3D nr_node_ids)
+		return;
+
+	mutex_lock(&wi_group->kobj_lock);
+	attr =3D wi_group->nattrs[nid];
+	if (!attr) {
+		mutex_unlock(&wi_group->kobj_lock);
 		return;
+	}
+
+	wi_group->nattrs[nid] =3D NULL;
+	mutex_unlock(&wi_group->kobj_lock);
=20
-	sysfs_remove_file(&wi_group->wi_kobj,
-			  &wi_group->nattrs[nid]->kobj_attr.attr);
-	kfree(wi_group->nattrs[nid]->kobj_attr.attr.name);
-	kfree(wi_group->nattrs[nid]);
+	sysfs_remove_file(&wi_group->wi_kobj, &attr->kobj_attr.attr);
+	kfree(attr->kobj_attr.attr.name);
+	kfree(attr);
 }
=20
 static void sysfs_wi_release(struct kobject *wi_kobj)
@@ -3464,35 +3477,80 @@ static const struct kobj_type wi_ktype =3D {
=20
 static int sysfs_wi_node_add(int nid)
 {
-	struct iw_node_attr *node_attr;
+	int ret =3D 0;
 	char *name;
+	struct iw_node_attr *new_attr =3D NULL;
=20
-	node_attr =3D kzalloc(sizeof(*node_attr), GFP_KERNEL);
-	if (!node_attr)
+	if (nid < 0 || nid >=3D nr_node_ids) {
+		pr_err("Invalid node id: %d\n", nid);
+		return -EINVAL;
+	}
+
+	new_attr =3D kzalloc(sizeof(struct iw_node_attr), GFP_KERNEL);
+	if (!new_attr)
 		return -ENOMEM;
=20
 	name =3D kasprintf(GFP_KERNEL, "node%d", nid);
 	if (!name) {
-		kfree(node_attr);
+		kfree(new_attr);
 		return -ENOMEM;
 	}
=20
-	sysfs_attr_init(&node_attr->kobj_attr.attr);
-	node_attr->kobj_attr.attr.name =3D name;
-	node_attr->kobj_attr.attr.mode =3D 0644;
-	node_attr->kobj_attr.show =3D node_show;
-	node_attr->kobj_attr.store =3D node_store;
-	node_attr->nid =3D nid;
+	mutex_lock(&wi_group->kobj_lock);
+	if (wi_group->nattrs[nid]) {
+		mutex_unlock(&wi_group->kobj_lock);
+		pr_info("Node [%d] already exists\n", nid);
+		kfree(new_attr);
+		kfree(name);
+		return 0;
+	}
+	wi_group->nattrs[nid] =3D new_attr;
=20
-	if (sysfs_create_file(&wi_group->wi_kobj, &node_attr->kobj_attr.attr)) {
-		kfree(node_attr->kobj_attr.attr.name);
-		kfree(node_attr);
-		pr_err("failed to add attribute to weighted_interleave\n");
-		return -ENOMEM;
+	sysfs_attr_init(&wi_group->nattrs[nid]->kobj_attr.attr);
+	wi_group->nattrs[nid]->kobj_attr.attr.name =3D name;
+	wi_group->nattrs[nid]->kobj_attr.attr.mode =3D 0644;
+	wi_group->nattrs[nid]->kobj_attr.show =3D node_show;
+	wi_group->nattrs[nid]->kobj_attr.store =3D node_store;
+	wi_group->nattrs[nid]->nid =3D nid;
+
+	ret =3D sysfs_create_file(&wi_group->wi_kobj,
+				&wi_group->nattrs[nid]->kobj_attr.attr);
+	if (ret) {
+		kfree(wi_group->nattrs[nid]->kobj_attr.attr.name);
+		kfree(wi_group->nattrs[nid]);
+		wi_group->nattrs[nid] =3D NULL;
+		pr_err("Failed to add attribute to weighted_interleave: %d\n", ret);
 	}
+	mutex_unlock(&wi_group->kobj_lock);
=20
-	wi_group->nattrs[nid] =3D node_attr;
-	return 0;
+	return ret;
+}
+
+static int wi_node_notifier(struct notifier_block *nb,
+			       unsigned long action, void *data)
+{
+	int err;
+	struct memory_notify *arg =3D data;
+	int nid =3D arg->status_change_nid;
+
+	if (nid < 0)
+		goto notifier_end;
+
+	switch (action) {
+	case MEM_ONLINE:
+		err =3D sysfs_wi_node_add(nid);
+		if (err) {
+			pr_err("failed to add sysfs [node%d]\n", nid);
+			return NOTIFY_BAD;
+		}
+		break;
+	case MEM_OFFLINE:
+		sysfs_wi_node_delete(nid);
+		break;
+	}
+
+notifier_end:
+	return NOTIFY_OK;
 }
=20
 static int add_weighted_interleave_group(struct kobject *mempolicy_kobj)
@@ -3503,13 +3561,17 @@ static int add_weighted_interleave_group(struct kob=
ject *mempolicy_kobj)
 			GFP_KERNEL);
 	if (!wi_group)
 		return -ENOMEM;
+	mutex_init(&wi_group->kobj_lock);
=20
 	err =3D kobject_init_and_add(&wi_group->wi_kobj, &wi_ktype, mempolicy_kob=
j,
 				   "weighted_interleave");
 	if (err)
 		goto err_out;
=20
-	for_each_node_state(nid, N_POSSIBLE) {
+	for_each_online_node(nid) {
+		if (!node_state(nid, N_MEMORY))
+			continue;
+
 		err =3D sysfs_wi_node_add(nid);
 		if (err) {
 			pr_err("failed to add sysfs [node%d]\n", nid);
@@ -3517,6 +3579,7 @@ static int add_weighted_interleave_group(struct kobje=
ct *mempolicy_kobj)
 		}
 	}
=20
+	hotplug_memory_notifier(wi_node_notifier, DEFAULT_CALLBACK_PRI);
 	return 0;
=20
 err_del:
--=20
2.34.1