[PATCH] sched/isolation: Enforce at least one housekeeping CPU per node unless maxcpus limits

Aaron Tomlin posted 1 patch 3 months, 2 weeks ago
kernel/sched/isolation.c | 40 ++++++++++++++++++++++++++++++++++++++--
1 file changed, 38 insertions(+), 2 deletions(-)
Posted by Aaron Tomlin 3 months, 2 weeks ago
This patch improves housekeeping CPU selection by ensuring that each
online NUMA node has at least one housekeeping CPU, which gives kernel
threads and timed work better NUMA locality.

Before assigning additional housekeeping CPUs, the patch checks whether
any online NUMA node contains a CPU whose logical ID is greater than or
equal to the maxcpus= limit. If so, per-node NUMA enforcement is skipped
and a warning is issued, since such a node could not be serviced under
the CPU limit.
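
For illustration only, the check described above can be modelled in
userspace as follows. This is a minimal sketch, not the kernel code:
mask_t, mask_last() and must_skip_numa_enforcement() are hypothetical
names, and a 64-bit integer stands in for a cpumask (bit N set means
CPU N belongs to the node).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a cpumask: bit N set means CPU N is present. */
typedef uint64_t mask_t;

/* Return the index of the highest set bit. The mask must not be empty. */
unsigned int mask_last(mask_t m)
{
	unsigned int last = 0;

	assert(m != 0);
	while (m >>= 1)
		last++;
	return last;
}

/*
 * Return true if any non-empty node contains a CPU whose logical ID is
 * greater than or equal to max_cpus. In that case per-node enforcement
 * would be skipped, as described in the text above.
 */
bool must_skip_numa_enforcement(const mask_t *node_cpus, int nr_nodes,
				unsigned int max_cpus)
{
	for (int node = 0; node < nr_nodes; node++) {
		/* Nodes without CPUs cannot violate the limit. */
		if (node_cpus[node] == 0)
			continue;

		if (mask_last(node_cpus[node]) >= max_cpus)
			return true;
	}
	return false;
}
```

With two nodes holding CPUs 0-3 and 4-7, a limit of 4 makes the sketch
report that enforcement must be skipped (CPU 7 >= 4), while a limit of 8
does not.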

If NUMA enforcement is possible, each online node that lacks a
housekeeping CPU has its lowest-numbered present CPU added to the
housekeeping staging mask and cleared from the non-housekeeping mask,
with a warning logged for visibility. The existing guarantee that at
least one present CPU is a housekeeping CPU system-wide remains intact.
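
As a rough illustration of the per-node step, the following userspace
sketch mirrors the behaviour described above. It is not the kernel
implementation: mask_t and ensure_node_coverage() are hypothetical
names, and 64-bit integers stand in for the node, present, housekeeping
and non-housekeeping cpumasks.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a cpumask: bit N set means CPU N is in the mask. */
typedef uint64_t mask_t;

/*
 * For every node that has no CPU in *housekeeping, move the node's
 * lowest-numbered present CPU from *isolated into *housekeeping.
 * Returns the number of CPUs that were added.
 */
int ensure_node_coverage(const mask_t *node_cpus, int nr_nodes,
			 mask_t present, mask_t *housekeeping,
			 mask_t *isolated)
{
	int added = 0;

	assert(housekeeping && isolated);

	for (int node = 0; node < nr_nodes; node++) {
		mask_t candidates, lowest;

		/* The node is already covered by a housekeeping CPU. */
		if (node_cpus[node] & *housekeeping)
			continue;

		/* Only present CPUs are eligible. */
		candidates = node_cpus[node] & present;
		if (!candidates)
			continue;

		/* Isolate the lowest set bit, i.e. the lowest logical ID. */
		lowest = candidates & (~candidates + 1);

		*housekeeping |= lowest;
		*isolated &= ~lowest;
		added++;
	}
	return added;
}
```

For example, with nodes covering CPUs 0-3 and 4-7, all CPUs present, and
only CPU 0 initially marked as housekeeping, the sketch adds CPU 4 so
that the second node is covered as well.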

Signed-off-by: Aaron Tomlin <atomlin@atomlin.com>
---
 kernel/sched/isolation.c | 40 ++++++++++++++++++++++++++++++++++++++--
 1 file changed, 38 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/isolation.c b/kernel/sched/isolation.c
index a4cf17b1fab0..87b7f20d76b1 100644
--- a/kernel/sched/isolation.c
+++ b/kernel/sched/isolation.c
@@ -114,8 +114,10 @@ static void __init housekeeping_setup_type(enum hk_type type,
 static int __init housekeeping_setup(char *str, unsigned long flags)
 {
 	cpumask_var_t non_housekeeping_mask, housekeeping_staging;
-	unsigned int first_cpu;
-	int err = 0;
+	const struct cpumask *node_cpus;
+	unsigned int first_cpu, last_cpu;
+	int node, node_cpu, err = 0;
+	bool skip_numa_enforcement = false;
 
 	if ((flags & HK_FLAG_KERNEL_NOISE) && !(housekeeping.flags & HK_FLAG_KERNEL_NOISE)) {
 		if (!IS_ENABLED(CONFIG_NO_HZ_FULL)) {
@@ -135,6 +137,40 @@ static int __init housekeeping_setup(char *str, unsigned long flags)
 	cpumask_andnot(housekeeping_staging,
 		       cpu_possible_mask, non_housekeeping_mask);
 
+	for_each_online_node(node) {
+		node_cpus = cpumask_of_node(node);
+
+		if (cpumask_empty(node_cpus))
+			continue;
+
+		last_cpu = cpumask_last(node_cpus);
+		if (last_cpu >= setup_max_cpus) {
+			skip_numa_enforcement = true;
+			pr_warn("Housekeeping: NUMA node %d has CPU %u >= "
+				"maxcpus=%u. Skipping NUMA enforcement\n",
+				node, last_cpu, setup_max_cpus);
+			break;
+		}
+	}
+
+	if (!skip_numa_enforcement) {
+		for_each_online_node(node) {
+			node_cpus = cpumask_of_node(node);
+
+			if (cpumask_intersects(node_cpus, housekeeping_staging))
+				continue;
+
+			for_each_cpu_and(node_cpu, node_cpus, cpu_present_mask) {
+				pr_warn("Housekeeping: Adding CPU %d "
+					"from node %d to ensure NUMA "
+					"coverage\n", node_cpu, node);
+				__cpumask_set_cpu(node_cpu, housekeeping_staging);
+				__cpumask_clear_cpu(node_cpu, non_housekeeping_mask);
+				break;
+			}
+		}
+	}
+
 	first_cpu = cpumask_first_and(cpu_present_mask, housekeeping_staging);
 	if (first_cpu >= nr_cpu_ids || first_cpu >= setup_max_cpus) {
 		__cpumask_set_cpu(smp_processor_id(), housekeeping_staging);
-- 
2.51.0
Re: [PATCH] sched/isolation: Enforce at least one housekeeping CPU per node unless maxcpus limits
Posted by Aaron Tomlin 2 months, 4 weeks ago
On Sat, Oct 25, 2025 at 12:11:13PM -0400, Aaron Tomlin wrote:
> This patch improves housekeeping CPU selection logic by enforcing that
> each online NUMA node has at least one dedicated housekeeping CPU,
> ensuring better NUMA locality for kernel threads and timed work.
> 
> Before assigning additional housekeeping CPUs, the patch checks if any
> online NUMA node contains CPUs with logical IDs greater than or equal to
> maxcpus=. If so, per-node NUMA enforcement is skipped and a
> warning is issued, since some nodes would be unserviceable given the CPU
> limit.
> 
> If NUMA enforcement is possible, each online node lacking a housekeeping
> CPU will have one present CPU (the lowest logical ID) assigned and
> included in the housekeeping staging mask, with a warning logged for
> visibility. The final guarantee that at least one present housekeeping
> CPU is assigned across the system remains intact.

Hi,

Just following up on this patch submission.

Please let me know if it requires any further revision or if there is any
feedback I should address.


Kind regards,
-- 
Aaron Tomlin