[PATCH v3 03/12] smp: Remove get_cpu from smp_call_function_any

Chuyi Zhou posted 12 patches 2 weeks, 5 days ago
There is a newer version of this series
Posted by Chuyi Zhou 2 weeks, 5 days ago
smp_call_function_single() now enables preemption before csd_lock_wait() to
reduce the critical section. To allow callers of smp_call_function_any() to
benefit from this optimization as well, remove the get_cpu()/put_cpu() pair
from smp_call_function_any() for the remote-CPU case.

Signed-off-by: Chuyi Zhou <zhouchuyi@bytedance.com>
Reviewed-by: Muchun Song <muchun.song@linux.dev>
---
 kernel/smp.c | 16 +++++++++++++---
 1 file changed, 13 insertions(+), 3 deletions(-)

diff --git a/kernel/smp.c b/kernel/smp.c
index b603d4229f95..80daf9dd4a25 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -761,16 +761,26 @@ EXPORT_SYMBOL_GPL(smp_call_function_single_async);
 int smp_call_function_any(const struct cpumask *mask,
 			  smp_call_func_t func, void *info, int wait)
 {
+	bool local = true;
 	unsigned int cpu;
 	int ret;
 
-	/* Try for same CPU (cheapest) */
+	/*
+	 * Prevent migration to another CPU after selecting the current CPU
+	 * as the target.
+	 */
 	cpu = get_cpu();
-	if (!cpumask_test_cpu(cpu, mask))
+
+	/* Try for same CPU (cheapest) */
+	if (!cpumask_test_cpu(cpu, mask)) {
 		cpu = sched_numa_find_nth_cpu(mask, 0, cpu_to_node(cpu));
+		local = false;
+		put_cpu();
+	}
 
 	ret = smp_call_function_single(cpu, func, info, wait);
-	put_cpu();
+	if (local)
+		put_cpu();
 	return ret;
 }
 EXPORT_SYMBOL_GPL(smp_call_function_any);
-- 
2.20.1
Re: [PATCH v3 03/12] smp: Remove get_cpu from smp_call_function_any
Posted by Steven Rostedt 2 weeks, 5 days ago
On Wed, 18 Mar 2026 12:56:29 +0800
"Chuyi Zhou" <zhouchuyi@bytedance.com> wrote:

> Now smp_call_function_single() would enable preemption before
> csd_lock_wait() to reduce the critical section. To allow callers of
> smp_call_function_any() to also benefit from this optimization, remove
> get_cpu()/put_cpu() from smp_call_function_any().
> 
> Signed-off-by: Chuyi Zhou <zhouchuyi@bytedance.com>
> Reviewed-by: Muchun Song <muchun.song@linux.dev>
> ---
>  kernel/smp.c | 16 +++++++++++++---
>  1 file changed, 13 insertions(+), 3 deletions(-)
> 
> diff --git a/kernel/smp.c b/kernel/smp.c
> index b603d4229f95..80daf9dd4a25 100644
> --- a/kernel/smp.c
> +++ b/kernel/smp.c
> @@ -761,16 +761,26 @@ EXPORT_SYMBOL_GPL(smp_call_function_single_async);
>  int smp_call_function_any(const struct cpumask *mask,
>  			  smp_call_func_t func, void *info, int wait)
>  {
> +	bool local = true;
>  	unsigned int cpu;
>  	int ret;
>  
> -	/* Try for same CPU (cheapest) */
> +	/*
> +	 * Prevent migration to another CPU after selecting the current CPU
> +	 * as the target.
> +	 */
>  	cpu = get_cpu();
> -	if (!cpumask_test_cpu(cpu, mask))
> +
> +	/* Try for same CPU (cheapest) */
> +	if (!cpumask_test_cpu(cpu, mask)) {
>  		cpu = sched_numa_find_nth_cpu(mask, 0, cpu_to_node(cpu));

Hmm, isn't this looking for another CPU that is closest to the current CPU?

By allowing migration, it is possible that the task will migrate to another
CPU where this will pick one that is much farther.

I'm not sure if that's really an issue or not, as I believe it's mostly for
performance reasons. Then again, why even keep preemption disabled for the
current CPU case? Isn't disabling preemption more for performance than
correctness?

Perhaps migrate_disable() is all that is needed?

-- Steve


> +		local = false;
> +		put_cpu();
> +	}
>  
>  	ret = smp_call_function_single(cpu, func, info, wait);
> -	put_cpu();
> +	if (local)
> +		put_cpu();
>  	return ret;
>  }
>  EXPORT_SYMBOL_GPL(smp_call_function_any);
Re: [PATCH v3 03/12] smp: Remove get_cpu from smp_call_function_any
Posted by Sebastian Andrzej Siewior 2 weeks, 5 days ago
On 2026-03-18 10:32:37 [-0400], Steven Rostedt wrote:
> On Wed, 18 Mar 2026 12:56:29 +0800
> "Chuyi Zhou" <zhouchuyi@bytedance.com> wrote:
> 
> > Now smp_call_function_single() would enable preemption before
> > csd_lock_wait() to reduce the critical section. To allow callers of
> > smp_call_function_any() to also benefit from this optimization, remove
> > get_cpu()/put_cpu() from smp_call_function_any().
> > 
> > Signed-off-by: Chuyi Zhou <zhouchuyi@bytedance.com>
> > Reviewed-by: Muchun Song <muchun.song@linux.dev>
> > ---
> >  kernel/smp.c | 16 +++++++++++++---
> >  1 file changed, 13 insertions(+), 3 deletions(-)
> > 
> > diff --git a/kernel/smp.c b/kernel/smp.c
> > index b603d4229f95..80daf9dd4a25 100644
> > --- a/kernel/smp.c
> > +++ b/kernel/smp.c
> > @@ -761,16 +761,26 @@ EXPORT_SYMBOL_GPL(smp_call_function_single_async);
> >  int smp_call_function_any(const struct cpumask *mask,
> >  			  smp_call_func_t func, void *info, int wait)
> >  {
> > +	bool local = true;
> >  	unsigned int cpu;
> >  	int ret;
> >  
> > -	/* Try for same CPU (cheapest) */
> > +	/*
> > +	 * Prevent migration to another CPU after selecting the current CPU
> > +	 * as the target.
> > +	 */
> >  	cpu = get_cpu();
> > -	if (!cpumask_test_cpu(cpu, mask))
> > +
> > +	/* Try for same CPU (cheapest) */
> > +	if (!cpumask_test_cpu(cpu, mask)) {
> >  		cpu = sched_numa_find_nth_cpu(mask, 0, cpu_to_node(cpu));
> 
> Hmm, isn't this looking for another CPU that is closest to the current CPU?
> 
> By allowing migration, it is possible that the task will migrate to another
> CPU where this will pick one that is much farther.
> 
> I'm not sure if that's really an issue or not, as I believe it's mostly for
> performance reasons. Then again, why even keep preemption disabled for the
> current CPU case? Isn't disabling preemption more for performance than
> correctness?
> 
> Perhaps migrate_disable() is all that is needed?

You want the local CPU because it is cheap, as in a direct invocation. But if
you migrate away between the get_cpu() and the check in
generic_exec_single(), you do trigger an interrupt.
A migration here would not be wrong; it would just not be as cheap as
intended. Not sure whether migrate_disable() counters the cheap part again.

An alternative would be to move the local execution into
__generic_exec_local() and have it run with preemption disabled, plus a
fallback with preemption enabled where the request is enqueued and probably
waited for.

> -- Steve

Sebastian