[RFC patch v3 04/20] sched: Avoid calculating the cpumask if the system is overloaded
Posted by Tim Chen 3 months, 3 weeks ago
From: K Prateek Nayak <kprateek.nayak@amd.com>

If SIS_UTIL cuts off the idle CPU search, the result of the cpumask_and()
is of no use. Since select_idle_cpu() can now be called twice per wakeup
from select_idle_sibling() due to cache-aware wakeup, this overhead can
become visible in benchmarks like hackbench.

To save those extra cycles, especially in cases where we target the LLC
frequently and the search bails out because the LLC is busy, only
calculate the cpumask if the system is not overloaded.

Signed-off-by: K Prateek Nayak <kprateek.nayak@amd.com>
---
 kernel/sched/fair.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 567ad2a0cfa2..6a2678f9d44a 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -7918,8 +7918,6 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, bool
 	int i, cpu, idle_cpu = -1, nr = INT_MAX;
 	struct sched_domain_shared *sd_share;
 
-	cpumask_and(cpus, sched_domain_span(sd), p->cpus_ptr);
-
 	if (sched_feat(SIS_UTIL)) {
 		sd_share = rcu_dereference(per_cpu(sd_llc_shared, target));
 		if (sd_share) {
@@ -7931,6 +7929,8 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, bool
 		}
 	}
 
+	cpumask_and(cpus, sched_domain_span(sd), p->cpus_ptr);
+
 	if (static_branch_unlikely(&sched_cluster_active)) {
 		struct sched_group *sg = sd->groups;
 
-- 
2.32.0
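
For context, a hedged, abridged sketch of how the top of select_idle_cpu()
reads with this change applied is shown below. The cluster-aware handling and
the per-CPU scan loop are elided and the comments are paraphrased, so treat
this as an illustration of the reordering rather than the exact kernel source.

static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd,
			   bool has_idle_core, int target)
{
	struct cpumask *cpus = this_cpu_cpumask_var_ptr(select_rq_mask);
	int i, cpu, idle_cpu = -1, nr = INT_MAX;
	struct sched_domain_shared *sd_share;

	if (sched_feat(SIS_UTIL)) {
		sd_share = rcu_dereference(per_cpu(sd_llc_shared, target));
		if (sd_share) {
			/* because !--nr is the condition to stop the scan */
			nr = READ_ONCE(sd_share->nr_idle_scan) + 1;
			/* overloaded LLC: bail out before any cpumask work */
			if (unlikely(nr == 1))
				return -1;
		}
	}

	/*
	 * Compute the candidate mask only once the scan is known to run;
	 * previously this was done unconditionally above the SIS_UTIL check.
	 */
	cpumask_and(cpus, sched_domain_span(sd), p->cpus_ptr);

	/* ... cluster-aware handling and the idle-CPU scan loop elided ... */

	return idle_cpu;
}

Nothing in the SIS_UTIL block reads 'cpus', so delaying the cpumask_and() is
safe; in the overloaded-LLC case the intersection is simply never computed.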
Re: [RFC patch v3 04/20] sched: Avoid calculating the cpumask if the system is overloaded
Posted by Shrikanth Hegde 3 months, 1 week ago

On 6/18/25 23:57, Tim Chen wrote:
> From: K Prateek Nayak <kprateek.nayak@amd.com>
> 
> If SIS_UTIL cuts off the idle CPU search, the result of the cpumask_and()
> is of no use. Since select_idle_cpu() can now be called twice per wakeup
> from select_idle_sibling() due to cache-aware wakeup, this overhead can
> become visible in benchmarks like hackbench.
> 
> To save those extra cycles, especially in cases where we target the LLC
> frequently and the search bails out because the LLC is busy, only
> calculate the cpumask if the system is not overloaded.
> 

This patch could be independent and should help in general,
but the changelog needs to be updated.


> Signed-off-by: K Prateek Nayak <kprateek.nayak@amd.com>
> ---
>   kernel/sched/fair.c | 4 ++--
>   1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 567ad2a0cfa2..6a2678f9d44a 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -7918,8 +7918,6 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, bool
>   	int i, cpu, idle_cpu = -1, nr = INT_MAX;
>   	struct sched_domain_shared *sd_share;
>   
> -	cpumask_and(cpus, sched_domain_span(sd), p->cpus_ptr);
> -
>   	if (sched_feat(SIS_UTIL)) {
>   		sd_share = rcu_dereference(per_cpu(sd_llc_shared, target));
>   		if (sd_share) {
> @@ -7931,6 +7929,8 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, bool
>   		}
>   	}
>   
> +	cpumask_and(cpus, sched_domain_span(sd), p->cpus_ptr);
> +
>   	if (static_branch_unlikely(&sched_cluster_active)) {
>   		struct sched_group *sg = sd->groups;
>
Re: [RFC patch v3 04/20] sched: Avoid calculating the cpumask if the system is overloaded
Posted by Tim Chen 3 months ago
On Fri, 2025-07-04 at 01:09 +0530, Shrikanth Hegde wrote:
> 
> On 6/18/25 23:57, Tim Chen wrote:
> > From: K Prateek Nayak <kprateek.nayak@amd.com>
> > 
> > If SIS_UTIL cuts off the idle CPU search, the result of the cpumask_and()
> > is of no use. Since select_idle_cpu() can now be called twice per wakeup
> > from select_idle_sibling() due to cache-aware wakeup, this overhead can
> > become visible in benchmarks like hackbench.
> > 
> > To save those extra cycles, especially in cases where we target the LLC
> > frequently and the search bails out because the LLC is busy, only
> > calculate the cpumask if the system is not overloaded.
> > 
> 
> This patch could be independent and should help in general,
> but the changelog needs to be updated.
> 
> 

Yes, that makes sense.

Tim

> > Signed-off-by: K Prateek Nayak <kprateek.nayak@amd.com>
> > ---
> >   kernel/sched/fair.c | 4 ++--
> >   1 file changed, 2 insertions(+), 2 deletions(-)
> > 
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index 567ad2a0cfa2..6a2678f9d44a 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -7918,8 +7918,6 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, bool
> >   	int i, cpu, idle_cpu = -1, nr = INT_MAX;
> >   	struct sched_domain_shared *sd_share;
> >   
> > -	cpumask_and(cpus, sched_domain_span(sd), p->cpus_ptr);
> > -
> >   	if (sched_feat(SIS_UTIL)) {
> >   		sd_share = rcu_dereference(per_cpu(sd_llc_shared, target));
> >   		if (sd_share) {
> > @@ -7931,6 +7929,8 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, bool
> >   		}
> >   	}
> >   
> > +	cpumask_and(cpus, sched_domain_span(sd), p->cpus_ptr);
> > +
> >   	if (static_branch_unlikely(&sched_cluster_active)) {
> >   		struct sched_group *sg = sd->groups;
> >   
>