[PATCH v10 6/8] sched/isolation: Force housekeeping if isolcpus and nohz_full don't leave any

Gabriele Monaco posted 8 patches 1 month, 4 weeks ago
There is a newer version of this series
[PATCH v10 6/8] sched/isolation: Force housekeeping if isolcpus and nohz_full don't leave any
Posted by Gabriele Monaco 1 month, 4 weeks ago
Currently the user can set up isolcpus and nohz_full in a way that
leaves no housekeeping CPU (i.e. no CPU that is neither domain isolated
nor nohz full). This can be a problem for other subsystems (e.g. the
timer wheel migration).

Prevent this configuration by invalidating the last setting in case the
union of isolcpus (domain) and nohz_full covers all CPUs.

Signed-off-by: Gabriele Monaco <gmonaco@redhat.com>
---
 kernel/sched/isolation.c | 23 +++++++++++++++++++++++
 1 file changed, 23 insertions(+)

diff --git a/kernel/sched/isolation.c b/kernel/sched/isolation.c
index a4cf17b1fab0..0bf7b91684c0 100644
--- a/kernel/sched/isolation.c
+++ b/kernel/sched/isolation.c
@@ -167,6 +167,29 @@ static int __init housekeeping_setup(char *str, unsigned long flags)
 			}
 		}
 
+		/*
+		 * Check the combination of nohz_full and isolcpus=domain,
+		 * necessary to avoid problems with the timer migration
+		 * hierarchy. managed_irq is ignored by this check since it
+		 * isn't considered in the timer migration logic.
+		 */
+		iter_flags = housekeeping.flags & (HK_FLAG_KERNEL_NOISE | HK_FLAG_DOMAIN);
+		type = find_first_bit(&iter_flags, HK_TYPE_MAX);
+		/*
+		 * Pass the check if none of these flags were previously set or
+		 * are not in the current selection.
+		 */
+		iter_flags = flags & (HK_FLAG_KERNEL_NOISE | HK_FLAG_DOMAIN);
+		first_cpu = type == HK_TYPE_MAX || !iter_flags ? 0 :
+			    cpumask_first_and_and(cpu_present_mask,
+				    housekeeping_staging, housekeeping.cpumasks[type]);
+		if (first_cpu >= nr_cpu_ids || first_cpu >= setup_max_cpus) {
+			pr_warn("Housekeeping: must include one present CPU "
+				"neither in nohz_full= nor in isolcpus=domain, "
+				"ignoring setting %s\n", str);
+			goto free_housekeeping_staging;
+		}
+
 		iter_flags = flags & ~housekeeping.flags;
 
 		for_each_set_bit(type, &iter_flags, HK_TYPE_MAX)
-- 
2.50.1
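
For illustration, the core of the check above can be modelled in userspace
as below. This is a minimal sketch, not kernel code. It assumes at most 64
CPUs held as bits in a uint64_t. The names hk_combination_ok, present,
staged and previous are hypothetical stand-ins for cpu_present_mask,
housekeeping_staging and housekeeping.cpumasks[type]. The setup_max_cpus
bound is left out.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Userspace model of the check (assumption: at most 64 CPUs, one bit each).
 *
 * present:  CPUs that are present
 * staged:   housekeeping CPUs left by the parameter being parsed
 * previous: housekeeping CPUs left by the parameter parsed earlier
 *
 * Returns true if at least one present CPU is housekeeping under both
 * parameters, i.e. the three-way intersection is not empty.
 */
bool hk_combination_ok(uint64_t present, uint64_t staged, uint64_t previous)
{
	return (present & staged & previous) != 0;
}
```

With four present CPUs (0xF), isolcpus=domain,0-1 leaves housekeeping CPUs
2-3 (0xC) and nohz_full=2-3 leaves housekeeping CPUs 0-1 (0x3). The
intersection is empty, so in this model the second setting is the one that
would be ignored.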
Re: [PATCH v10 6/8] sched/isolation: Force housekeeping if isolcpus and nohz_full don't leave any
Posted by Waiman Long 1 month, 4 weeks ago
On 8/6/25 5:38 AM, Gabriele Monaco wrote:
> Currently the user can set up isolcpus and nohz_full in a way that
> leaves no housekeeping CPU (i.e. no CPU that is neither domain isolated
> nor nohz full). This can be a problem for other subsystems (e.g. the
> timer wheel migration).
>
> Prevent this configuration by invalidating the last setting in case the
> union of isolcpus (domain) and nohz_full covers all CPUs.
>
> Signed-off-by: Gabriele Monaco <gmonaco@redhat.com>
> ---
>   kernel/sched/isolation.c | 23 +++++++++++++++++++++++
>   1 file changed, 23 insertions(+)
>
> diff --git a/kernel/sched/isolation.c b/kernel/sched/isolation.c
> index a4cf17b1fab0..0bf7b91684c0 100644
> --- a/kernel/sched/isolation.c
> +++ b/kernel/sched/isolation.c
> @@ -167,6 +167,29 @@ static int __init housekeeping_setup(char *str, unsigned long flags)
>   			}
>   		}
>   
> +		/*
> +		 * Check the combination of nohz_full and isolcpus=domain,
> +		 * necessary to avoid problems with the timer migration
> +		 * hierarchy. managed_irq is ignored by this check since it
> +		 * isn't considered in the timer migration logic.
> +		 */
> +		iter_flags = housekeeping.flags & (HK_FLAG_KERNEL_NOISE | HK_FLAG_DOMAIN);
> +		type = find_first_bit(&iter_flags, HK_TYPE_MAX);
> +		/*
> +		 * Pass the check if none of these flags were previously set or
> +		 * are not in the current selection.
> +		 */
> +		iter_flags = flags & (HK_FLAG_KERNEL_NOISE | HK_FLAG_DOMAIN);
> +		first_cpu = type == HK_TYPE_MAX || !iter_flags ? 0 :
> +			    cpumask_first_and_and(cpu_present_mask,
> +				    housekeeping_staging, housekeeping.cpumasks[type]);
This statement is a bit hard to read, as I have to look up the C operator
precedence table to make sure that I get the precedence right. Adding
parentheses would make it easier to read.
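
For reference, == binds tighter than ||, which in turn binds tighter than
?:, so the whole "type == HK_TYPE_MAX || !iter_flags" part is the condition
of the ternary. A minimal standalone sketch of the two spellings follows.
TYPE_MAX, pick_implicit, pick_explicit and masked_first are hypothetical
stand-ins for illustration, not kernel code.

```c
enum { TYPE_MAX = 8 };	/* hypothetical stand-in for HK_TYPE_MAX */

/* Original shape: relies on the relative precedence of ==, || and ?:. */
unsigned int pick_implicit(unsigned int type, unsigned long iter_flags,
			   unsigned int masked_first)
{
	return type == TYPE_MAX || !iter_flags ? 0 : masked_first;
}

/* Same expression with the grouping spelled out. */
unsigned int pick_explicit(unsigned int type, unsigned long iter_flags,
			   unsigned int masked_first)
{
	return ((type == TYPE_MAX) || !iter_flags) ? 0 : masked_first;
}
```

Both functions return the same value for every input. The second form only
makes the grouping visible.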
> +		if (first_cpu >= nr_cpu_ids || first_cpu >= setup_max_cpus) {

Perhaps you can simplify it to "if (first_cpu >= min(nr_cpu_ids, 
setup_max_cpus))".
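
The two forms are equivalent because "x >= a || x >= b" holds exactly when
x is at least the smaller of a and b. A small standalone sketch of that
equivalence follows. It uses a hypothetical min_uint() helper in place of
the kernel's min() macro, and illustrative function names.

```c
#include <stdbool.h>

/* Hypothetical helper standing in for the kernel's min() macro. */
unsigned int min_uint(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

/* Shape used in the patch: two separate comparisons. */
bool out_of_range_two(unsigned int first_cpu, unsigned int nr_ids,
		      unsigned int max_cpus)
{
	return first_cpu >= nr_ids || first_cpu >= max_cpus;
}

/* Suggested shape: one comparison against the smaller bound. */
bool out_of_range_min(unsigned int first_cpu, unsigned int nr_ids,
		      unsigned int max_cpus)
{
	return first_cpu >= min_uint(nr_ids, max_cpus);
}
```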

Other than that, the logic looks right to me.

Cheers,
Longman


> +			pr_warn("Housekeeping: must include one present CPU "
> +				"neither in nohz_full= nor in isolcpus=domain, "
> +				"ignoring setting %s\n", str);
> +			goto free_housekeeping_staging;
> +		}
> +
>   		iter_flags = flags & ~housekeeping.flags;
>   
>   		for_each_set_bit(type, &iter_flags, HK_TYPE_MAX)