[PATCH v3 04/12] smp: Use on-stack cpumask in smp_call_function_many_cond

Chuyi Zhou posted 12 patches 2 weeks, 5 days ago
There is a newer version of this series
Posted by Chuyi Zhou 2 weeks, 5 days ago
This patch uses an on-stack cpumask to replace the percpu cfd cpumask in
smp_call_function_many_cond(). Note that when both CONFIG_CPUMASK_OFFSTACK
and PREEMPT_RT are enabled, an allocation in a preempt-disabled section
would break RT. Therefore, only do this when CONFIG_CPUMASK_OFFSTACK=n.
This is preparation for enabling preemption during csd_lock_wait() in
smp_call_function_many_cond().

Signed-off-by: Chuyi Zhou <zhouchuyi@bytedance.com>
Reviewed-by: Muchun Song <muchun.song@linux.dev>
---
 kernel/smp.c | 25 +++++++++++++++++++------
 1 file changed, 19 insertions(+), 6 deletions(-)

diff --git a/kernel/smp.c b/kernel/smp.c
index 80daf9dd4a25..9728ba55944d 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -799,14 +799,25 @@ static void smp_call_function_many_cond(const struct cpumask *mask,
 					unsigned int scf_flags,
 					smp_cond_func_t cond_func)
 {
+	bool preemptible_wait = !IS_ENABLED(CONFIG_CPUMASK_OFFSTACK);
 	int cpu, last_cpu, this_cpu = smp_processor_id();
 	struct call_function_data *cfd;
 	bool wait = scf_flags & SCF_WAIT;
+	cpumask_var_t cpumask_stack;
+	struct cpumask *cpumask;
 	int nr_cpus = 0;
 	bool run_remote = false;
 
 	lockdep_assert_preemption_disabled();
 
+	cfd = this_cpu_ptr(&cfd_data);
+	cpumask = cfd->cpumask;
+
+	if (preemptible_wait) {
+		BUILD_BUG_ON(!alloc_cpumask_var(&cpumask_stack, GFP_ATOMIC));
+		cpumask = cpumask_stack;
+	}
+
 	/*
 	 * Can deadlock when called with interrupts disabled.
 	 * We allow cpu's that are not yet online though, as no one else can
@@ -827,16 +838,15 @@ static void smp_call_function_many_cond(const struct cpumask *mask,
 
 	/* Check if we need remote execution, i.e., any CPU excluding this one. */
 	if (cpumask_any_and_but(mask, cpu_online_mask, this_cpu) < nr_cpu_ids) {
-		cfd = this_cpu_ptr(&cfd_data);
-		cpumask_and(cfd->cpumask, mask, cpu_online_mask);
-		__cpumask_clear_cpu(this_cpu, cfd->cpumask);
+		cpumask_and(cpumask, mask, cpu_online_mask);
+		__cpumask_clear_cpu(this_cpu, cpumask);
 
 		cpumask_clear(cfd->cpumask_ipi);
-		for_each_cpu(cpu, cfd->cpumask) {
+		for_each_cpu(cpu, cpumask) {
 			call_single_data_t *csd = per_cpu_ptr(cfd->csd, cpu);
 
 			if (cond_func && !cond_func(cpu, info)) {
-				__cpumask_clear_cpu(cpu, cfd->cpumask);
+				__cpumask_clear_cpu(cpu, cpumask);
 				continue;
 			}
 
@@ -887,13 +897,16 @@ static void smp_call_function_many_cond(const struct cpumask *mask,
 	}
 
 	if (run_remote && wait) {
-		for_each_cpu(cpu, cfd->cpumask) {
+		for_each_cpu(cpu, cpumask) {
 			call_single_data_t *csd;
 
 			csd = per_cpu_ptr(cfd->csd, cpu);
 			csd_lock_wait(csd);
 		}
 	}
+
+	if (preemptible_wait)
+		free_cpumask_var(cpumask_stack);
 }
 
 /**
-- 
2.20.1
Re: [PATCH v3 04/12] smp: Use on-stack cpumask in smp_call_function_many_cond
Posted by Sebastian Andrzej Siewior 2 weeks, 5 days ago
On 2026-03-18 12:56:30 [+0800], Chuyi Zhou wrote:
> This patch use on-stack cpumask to replace percpu cfd cpumask in
> smp_call_function_many_cond(). Note that when both CONFIG_CPUMASK_OFFSTACK
> and PREEMPT_RT are enabled, allocation during preempt-disabled section
> would break RT. Therefore, only do this when CONFIG_CPUMASK_OFFSTACK=n.
> This is a preparation for enabling preemption during csd_lock_wait() in
> smp_call_function_many_cond().

You explained why we do this only for !CONFIG_CPUMASK_OFFSTACK but
failed to explain why we need a function-local cpumask, other than as a
preparation step. But this allocation looks pointless, let me look
further…

> Signed-off-by: Chuyi Zhou <zhouchuyi@bytedance.com>
> Reviewed-by: Muchun Song <muchun.song@linux.dev>

Sebastian
Re: [PATCH v3 04/12] smp: Use on-stack cpumask in smp_call_function_many_cond
Posted by Chuyi Zhou 2 weeks, 4 days ago
Hi Sebastian,

On 2026/3/18 23:55, Sebastian Andrzej Siewior wrote:
> On 2026-03-18 12:56:30 [+0800], Chuyi Zhou wrote:
>> This patch use on-stack cpumask to replace percpu cfd cpumask in
>> smp_call_function_many_cond(). Note that when both CONFIG_CPUMASK_OFFSTACK
>> and PREEMPT_RT are enabled, allocation during preempt-disabled section
>> would break RT. Therefore, only do this when CONFIG_CPUMASK_OFFSTACK=n.
>> This is a preparation for enabling preemption during csd_lock_wait() in
>> smp_call_function_many_cond().
> 
> You explained why we do this only for !CONFIG_CPUMASK_OFFSTACK but
> failed to explain why we need a function local cpumask. Other than
> preparation step. But this allocation looks pointless, let me look
> further…
> 

OK. It might be better to explain here why we need a local cpumask.

Thanks.

>> Signed-off-by: Chuyi Zhou <zhouchuyi@bytedance.com>
>> Reviewed-by: Muchun Song <muchun.song@linux.dev>
> 
> Sebastian
Re: [PATCH v3 04/12] smp: Use on-stack cpumask in smp_call_function_many_cond
Posted by Steven Rostedt 2 weeks, 5 days ago
On Wed, 18 Mar 2026 12:56:30 +0800
"Chuyi Zhou" <zhouchuyi@bytedance.com> wrote:

> diff --git a/kernel/smp.c b/kernel/smp.c
> index 80daf9dd4a25..9728ba55944d 100644
> --- a/kernel/smp.c
> +++ b/kernel/smp.c
> @@ -799,14 +799,25 @@ static void smp_call_function_many_cond(const struct cpumask *mask,
>  					unsigned int scf_flags,
>  					smp_cond_func_t cond_func)
>  {
> +	bool preemptible_wait = !IS_ENABLED(CONFIG_CPUMASK_OFFSTACK);
>  	int cpu, last_cpu, this_cpu = smp_processor_id();
>  	struct call_function_data *cfd;
>  	bool wait = scf_flags & SCF_WAIT;
> +	cpumask_var_t cpumask_stack;
> +	struct cpumask *cpumask;
>  	int nr_cpus = 0;
>  	bool run_remote = false;

> @@ -887,13 +897,16 @@ static void smp_call_function_many_cond(const struct cpumask *mask,
>  	}
>  
>  	if (run_remote && wait) {
> -		for_each_cpu(cpu, cfd->cpumask) {
> +		for_each_cpu(cpu, cpumask) {
>  			call_single_data_t *csd;
>  
>  			csd = per_cpu_ptr(cfd->csd, cpu);
>  			csd_lock_wait(csd);
>  		}
>  	}
> +
> +	if (preemptible_wait)
> +		free_cpumask_var(cpumask_stack);

Ironic, that preemptible_wait is only true if !CONFIG_CPUMASK_OFFSTACK, and
free_cpumask_var() is defined as:

// #ifndef CONFIG_CPUMASK_OFFSTACK
static __always_inline void free_cpumask_var(cpumask_var_t mask)
{
}

So basically the above is just a compiler exercise to insert a nop :-/

-- Steve


>  }
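
The point about free_cpumask_var() can be illustrated with a small userspace
sketch. This is not the kernel's <linux/cpumask.h>: every name ending in
_sketch, the SKETCH_* macros, and the 64-CPU size are assumptions made up for
illustration. It only models the general shape of the two configurations: with
the off-stack option unset, the "var" type is a one-element array living in
the caller's frame, allocation trivially succeeds, and freeing does nothing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct cpumask; sized for 64 CPUs. */
#define SKETCH_NR_CPUS 64
struct cpumask_sketch {
	unsigned long bits[SKETCH_NR_CPUS / (8 * sizeof(unsigned long))];
};

#ifdef SKETCH_CPUMASK_OFFSTACK
/* Off-stack model: the variable is a pointer, storage comes from the heap. */
typedef struct cpumask_sketch *cpumask_var_sketch_t;

static bool alloc_cpumask_var_sketch(cpumask_var_sketch_t *mask)
{
	*mask = malloc(sizeof(**mask));
	return *mask != NULL;
}

static void free_cpumask_var_sketch(cpumask_var_sketch_t mask)
{
	free(mask);
}
#else
/* On-stack model: the variable is a one-element array in the caller's frame. */
typedef struct cpumask_sketch cpumask_var_sketch_t[1];

static bool alloc_cpumask_var_sketch(cpumask_var_sketch_t *mask)
{
	(void)mask;	/* nothing to allocate */
	return true;
}

static void free_cpumask_var_sketch(cpumask_var_sketch_t mask)
{
	(void)mask;	/* nothing to free: the body is empty */
}
#endif

/* Size of a local mask variable: full bitmap on-stack, a pointer off-stack. */
size_t local_mask_size_sketch(void)
{
	cpumask_var_sketch_t mask;

	return sizeof(mask);
}

/* Allocate, use, and free a mask; the same caller code builds in both models. */
int roundtrip_sketch(void)
{
	cpumask_var_sketch_t mask;
	int ok;

	if (!alloc_cpumask_var_sketch(&mask))
		return -1;

	memset(mask, 0, sizeof(struct cpumask_sketch));
	mask->bits[0] |= 1UL;		/* mark "CPU 0" */
	ok = (mask->bits[0] == 1UL);
	assert(ok);

	free_cpumask_var_sketch(mask);	/* empty in the on-stack model */
	return ok ? 0 : -1;
}
```

Built without -DSKETCH_CPUMASK_OFFSTACK, the free call in roundtrip_sketch()
has an empty body, so guarding it with a flag that is only true in that same
configuration adds nothing. Building with the macro defined switches the same
caller code to heap storage.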