[PATCH v2] sched/topology: Fix memory leak in the error path of sched_init_numa

Posted by Luo Gengkun 2 months ago
In sched_init_numa(), the masks array holds the cpumasks allocated for
each NUMA level, but the error paths return directly without freeing
the memory that has already been allocated.
To fix this, extract the freeing logic from sched_reset_numa() into a
new function, __sched_free_masks(), which can also be called on the
error paths.

Fixes: 0fb3978b0aac ("sched/numa: Fix NUMA topology for systems with CPU-less nodes")
Reviewed-by: Huang Ying <ying.huang@linux.alibaba.com>
Signed-off-by: Luo Gengkun <luogengkun@huaweicloud.com>
---
 Change log
 V1->V2:
 - Rename free_masks to __sched_free_masks, as suggested by Huang Ying
 - Add Reviewed-by: Huang Ying <ying.huang@linux.alibaba.com>
 Link to V1: https://lore.kernel.org/all/20251013041348.350886-1-luogengkun@huaweicloud.com/
---
 kernel/sched/topology.c | 33 +++++++++++++++++++++------------
 1 file changed, 21 insertions(+), 12 deletions(-)

diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
index 444bdfdab731..4e4e0bffd323 100644
--- a/kernel/sched/topology.c
+++ b/kernel/sched/topology.c
@@ -1924,6 +1924,20 @@ static void init_numa_topology_type(int offline_node)
 
 #define NR_DISTANCE_VALUES (1 << DISTANCE_BITS)
 
+static void __sched_free_masks(struct cpumask ***masks, int nr_levels)
+{
+	int i, j;
+
+	for (i = 0; i < nr_levels && masks; i++) {
+		if (!masks[i])
+			continue;
+		for_each_node(j)
+			kfree(masks[i][j]);
+		kfree(masks[i]);
+	}
+	kfree(masks);
+}
+
 void sched_init_numa(int offline_node)
 {
 	struct sched_domain_topology_level *tl;
@@ -2003,15 +2017,19 @@ void sched_init_numa(int offline_node)
 	 */
 	for (i = 0; i < nr_levels; i++) {
 		masks[i] = kzalloc(nr_node_ids * sizeof(void *), GFP_KERNEL);
-		if (!masks[i])
+		if (!masks[i]) {
+			__sched_free_masks(masks, nr_levels);
 			return;
+		}
 
 		for_each_cpu_node_but(j, offline_node) {
 			struct cpumask *mask = kzalloc(cpumask_size(), GFP_KERNEL);
 			int k;
 
-			if (!mask)
+			if (!mask) {
+				__sched_free_masks(masks, nr_levels);
 				return;
+			}
 
 			masks[i][j] = mask;
 
@@ -2079,18 +2097,9 @@ static void sched_reset_numa(void)
 	masks = sched_domains_numa_masks;
 	rcu_assign_pointer(sched_domains_numa_masks, NULL);
 	if (distances || masks) {
-		int i, j;
-
 		synchronize_rcu();
 		kfree(distances);
-		for (i = 0; i < nr_levels && masks; i++) {
-			if (!masks[i])
-				continue;
-			for_each_node(j)
-				kfree(masks[i][j]);
-			kfree(masks[i]);
-		}
-		kfree(masks);
+		__sched_free_masks(masks, nr_levels);
 	}
 	if (sched_domain_topology_saved) {
 		kfree(sched_domain_topology);
-- 
2.34.1
Re: [PATCH v2] sched/topology: Fix memory leak in the error path of sched_init_numa
Posted by Valentin Schneider 2 months ago
On 14/10/25 08:36, Luo Gengkun wrote:
> @@ -2003,15 +2017,19 @@ void sched_init_numa(int offline_node)
>        */
>       for (i = 0; i < nr_levels; i++) {
>               masks[i] = kzalloc(nr_node_ids * sizeof(void *), GFP_KERNEL);
> -		if (!masks[i])
> +		if (!masks[i]) {
> +			__sched_free_masks(masks, nr_levels);

This could be passed 'i + 1' instead of nr_levels.
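
A rough, untested sketch for this particular error path (masks[i] itself
is NULL here, and the helper already skips NULL levels, so 'i + 1' is
safe):

		if (!masks[i]) {
			/* only levels 0..i-1 were fully allocated */
			__sched_free_masks(masks, i + 1);
			return;
		}

The same 'i + 1' would also work for the inner allocation failure below,
where masks[i] is partially filled and does need to be freed.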

>                       return;

So the tricky thing here is that sched_init_numa() doesn't have a return
value, pretty much because it's not meant to fail; if you get memory
allocation failures while setting up scheduler structures, then things
are not looking good at all...

Best case scenario would be the rest of the topology gets set up correctly
and you get non-NUMA aware scheduling, but AFAICT the call order is

  sched_cpu_activate()
    sched_init_numa()
    partition_sched_domains()

So if the NUMA topology setup failed, the regular topology setup isn't
going to fare much better.

Anywho, there's nothing wrong with your patch; it's just that in case of
such a failure it IMO doesn't improve the situation much.