[patch 06/19] sched/mmcid: Prevent pointless work in mm_update_cpus_allowed()

Thomas Gleixner posted 19 patches 2 months ago
There is a newer version of this series
[patch 06/19] sched/mmcid: Prevent pointless work in mm_update_cpus_allowed()
Posted by Thomas Gleixner 2 months ago
The @nr_cpus_allowed management does way too much useless work for the
common case where a process starts with unrestricted affinity,
i.e. @nr_cpus_allowed is equal to the number of possible CPUs right
away.

Add a check whether that limit is reached already and then avoid the whole
cpumask update and evaluation.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 kernel/sched/core.c |   16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)

--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2719,6 +2719,7 @@ void set_cpus_allowed_common(struct task
 
 	cpumask_copy(&p->cpus_mask, ctx->new_mask);
 	p->nr_cpus_allowed = cpumask_weight(ctx->new_mask);
+	mm_update_cpus_allowed(p->mm, ctx->new_mask);
 
 	/*
 	 * Swap in a new user_cpus_ptr if SCA_USER flag set
@@ -2765,7 +2766,6 @@ static void
 		put_prev_task(rq, p);
 
 	p->sched_class->set_cpus_allowed(p, ctx);
-	mm_update_cpus_allowed(p->mm, ctx->new_mask);
 
 	if (queued)
 		enqueue_task(rq, p, ENQUEUE_RESTORE | ENQUEUE_NOCLOCK);
@@ -10408,12 +10408,20 @@ void call_trace_sched_update_nr_running(
  */
 static inline void mm_update_cpus_allowed(struct mm_struct *mm, const struct cpumask *affmsk)
 {
-	struct cpumask *mm_allowed = mm_cpus_allowed(mm);
+	struct cpumask *mm_allowed;
 
-	if (!mm)
+	if (!mm || READ_ONCE(mm->mm_cid.nr_cpus_allowed) == nr_cpu_ids)
 		return;
-	/* The mm_cpus_allowed is the union of each thread allowed CPUs masks. */
+
+	/*
+	 * mm::mm_cid::mm_cpus_allowed is the superset of each threads
+	 * allowed CPUs mask which means it can only grow.
+	 */
 	guard(raw_spinlock)(&mm->mm_cid.lock);
+	/* Check again under the lock */
+	if (mm->mm_cid.nr_cpus_allowed == nr_cpu_ids)
+		return;
+	mm_allowed = mm_cpus_allowed(mm);
 	cpumask_or(mm_allowed, mm_allowed, affmsk);
 	WRITE_ONCE(mm->mm_cid.nr_cpus_allowed, cpumask_weight(mm_allowed));
 }
Re: [patch 06/19] sched/mmcid: Prevent pointless work in mm_update_cpus_allowed()
Posted by Peter Zijlstra 2 months ago
On Wed, Oct 15, 2025 at 07:29:34PM +0200, Thomas Gleixner wrote:

> +	if (!mm || READ_ONCE(mm->mm_cid.nr_cpus_allowed) == nr_cpu_ids)
>  		return;

FWIW this doesn't work on architectures that change their
cpu_possible_mask around (eg. Power).
Re: [patch 06/19] sched/mmcid: Prevent pointless work in mm_update_cpus_allowed()
Posted by Thomas Gleixner 2 months ago
On Fri, Oct 17 2025 at 13:12, Peter Zijlstra wrote:
> On Wed, Oct 15, 2025 at 07:29:34PM +0200, Thomas Gleixner wrote:
>
>> +	if (!mm || READ_ONCE(mm->mm_cid.nr_cpus_allowed) == nr_cpu_ids)
>>  		return;
>
> FWIW this doesn't work on architectures that change their
> cpu_possible_mask around (eg. Power).

No. Power does not change it after boot either. Half of the kernel would
explode if that'd be the case.

Thanks,

        tglx
Re: [patch 06/19] sched/mmcid: Prevent pointless work in mm_update_cpus_allowed()
Posted by Peter Zijlstra 2 months ago
On Fri, Oct 17, 2025 at 02:49:29PM +0200, Thomas Gleixner wrote:
> On Fri, Oct 17 2025 at 13:12, Peter Zijlstra wrote:
> > On Wed, Oct 15, 2025 at 07:29:34PM +0200, Thomas Gleixner wrote:
> >
> >> +	if (!mm || READ_ONCE(mm->mm_cid.nr_cpus_allowed) == nr_cpu_ids)
> >>  		return;
> >
> > FWIW this doesn't work on architectures that change their
> > cpu_possible_mask around (eg. Power).
> 
> No. Power does not change it after boot either. Half of the kernel would
> explode if that'd be the case.

Power very much does changes cpu_possible_mask; it doesn't change
nr_cpu_ids. Anyway, the point is that a full mask won't be nr_cpu_ids.

Same is true when you offline a CPU come to think of it.

Same is true if the cpumask is sparse.

Anyway, just saying, checking against nr_cpu_ids might not be the best
shortcut here.
Re: [patch 06/19] sched/mmcid: Prevent pointless work in mm_update_cpus_allowed()
Posted by Peter Zijlstra 2 months ago
On Fri, Oct 17, 2025 at 07:58:53PM +0200, Peter Zijlstra wrote:
> On Fri, Oct 17, 2025 at 02:49:29PM +0200, Thomas Gleixner wrote:
> > On Fri, Oct 17 2025 at 13:12, Peter Zijlstra wrote:
> > > On Wed, Oct 15, 2025 at 07:29:34PM +0200, Thomas Gleixner wrote:
> > >
> > >> +	if (!mm || READ_ONCE(mm->mm_cid.nr_cpus_allowed) == nr_cpu_ids)
> > >>  		return;
> > >
> > > FWIW this doesn't work on architectures that change their
> > > cpu_possible_mask around (eg. Power).
> > 
> > No. Power does not change it after boot either. Half of the kernel would
> > explode if that'd be the case.
> 
> Power very much does changes cpu_possible_mask; it doesn't change
> nr_cpu_ids. Anyway, the point is that a full mask won't be nr_cpu_ids.

Gah, bad memories, it is cpu_present_mask they change.

> Same is true when you offline a CPU come to think of it.
> 
> Same is true if the cpumask is sparse.
> 
> Anyway, just saying, checking against nr_cpu_ids might not be the best
> shortcut here.

Put another way, nr_cpus_allowed == nr_cpu_ids only work when none of
the masks involved have holes. The moment anything {possible, present,
online} has holes in, it goes sideways.
Re: [patch 06/19] sched/mmcid: Prevent pointless work in mm_update_cpus_allowed()
Posted by Thomas Gleixner 2 months ago
On Fri, Oct 17 2025 at 20:19, Peter Zijlstra wrote:
> On Fri, Oct 17, 2025 at 07:58:53PM +0200, Peter Zijlstra wrote:
>> Same is true when you offline a CPU come to think of it.
>> 
>> Same is true if the cpumask is sparse.
>> 
>> Anyway, just saying, checking against nr_cpu_ids might not be the best
>> shortcut here.
>
> Put another way, nr_cpus_allowed == nr_cpu_ids only work when none of
> the masks involved have holes. The moment anything {possible, present,
> online} has holes in, it goes sideways.

You're right. I was too narrowly focussed on the normal x86 case, where
nr_cpu_ids == num_possible_cpus ....

Let me think about that.
Re: [patch 06/19] sched/mmcid: Prevent pointless work in mm_update_cpus_allowed()
Posted by Peter Zijlstra 2 months ago
On Sun, Oct 19, 2025 at 10:32:47PM +0200, Thomas Gleixner wrote:
> On Fri, Oct 17 2025 at 20:19, Peter Zijlstra wrote:
> > On Fri, Oct 17, 2025 at 07:58:53PM +0200, Peter Zijlstra wrote:
> >> Same is true when you offline a CPU come to think of it.
> >> 
> >> Same is true if the cpumask is sparse.
> >> 
> >> Anyway, just saying, checking against nr_cpu_ids might not be the best
> >> shortcut here.
> >
> > Put another way, nr_cpus_allowed == nr_cpu_ids only work when none of
> > the masks involved have holes. The moment anything {possible, present,
> > online} has holes in, it goes sideways.
> 
> You're right. I was too narrowly focussed on the normal x86 case, where
> nr_cpu_ids == num_possible_cpus ....
> 
> Let me think about that.

So the obvious idea would be to grow hotplug hooks, such that you can
do:

  nr_cpus_allowed == num_online_cpus()

But then hotplug will have to iterate all mm's. Doable, but not really
nice.
Re: [patch 06/19] sched/mmcid: Prevent pointless work in mm_update_cpus_allowed()
Posted by Thomas Gleixner 1 month, 4 weeks ago
On Mon, Oct 20 2025 at 10:22, Peter Zijlstra wrote:
> On Sun, Oct 19, 2025 at 10:32:47PM +0200, Thomas Gleixner wrote:
>> On Fri, Oct 17 2025 at 20:19, Peter Zijlstra wrote:
>> > On Fri, Oct 17, 2025 at 07:58:53PM +0200, Peter Zijlstra wrote:
>> >> Same is true when you offline a CPU come to think of it.
>> >> 
>> >> Same is true if the cpumask is sparse.
>> >> 
>> >> Anyway, just saying, checking against nr_cpu_ids might not be the best
>> >> shortcut here.
>> >
>> > Put another way, nr_cpus_allowed == nr_cpu_ids only work when none of
>> > the masks involved have holes. The moment anything {possible, present,
>> > online} has holes in, it goes sideways.
>> 
>> You're right. I was too narrowly focussed on the normal x86 case, where
>> nr_cpu_ids == num_possible_cpus ....
>> 
>> Let me think about that.
>
> So the obvious idea would be to grow hotplug hooks, such that you can
> do:
>
>   nr_cpus_allowed == num_online_cpus()
>
> But then hotplug will have to iterate all mm's. Doable, but not really
> nice.

Right, but that can be done once the dust settled and if there is
actually a need for it.

Thanks,

        tglx