[PATCH] sched_ext: idle: small CPU iteration refactoring

Andrea Righi posted 1 patch 1 year, 1 month ago
kernel/sched/ext.c | 6 ++----
1 file changed, 2 insertions(+), 4 deletions(-)
[PATCH] sched_ext: idle: small CPU iteration refactoring
Posted by Andrea Righi 1 year, 1 month ago
Replace the loop to check if all SMT CPUs are idle with
cpumask_subset(). This simplifies the code and slightly improves
efficiency, while preserving the original behavior.

Note that idle_masks.smt handling remains racy, which is acceptable as
it serves as an optimization and is self-correcting.

Suggested-by: Yury Norov <yury.norov@gmail.com>
Signed-off-by: Andrea Righi <arighi@nvidia.com>
---
 kernel/sched/ext.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
index 926579624c41..0ce116e0f67c 100644
--- a/kernel/sched/ext.c
+++ b/kernel/sched/ext.c
@@ -3671,10 +3671,8 @@ void __scx_update_idle(struct rq *rq, bool idle)
 			 * idle_masks.smt handling is racy but that's fine as
 			 * it's only for optimization and self-correcting.
 			 */
-			for_each_cpu(cpu, smt) {
-				if (!cpumask_test_cpu(cpu, idle_masks.cpu))
-					return;
-			}
+			if (!cpumask_subset(smt, idle_masks.cpu))
+				return;
 			cpumask_or(idle_masks.smt, idle_masks.smt, smt);
 		} else {
 			cpumask_andnot(idle_masks.smt, idle_masks.smt, smt);
-- 
2.47.1
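
The claim that the subset test preserves the loop's behavior can be checked with a small user-space sketch. This is only a model under stated assumptions: a cpumask is represented as a single 64-bit word, and `mask_t`, `all_idle_loop()`, `all_idle_subset()` and `check_equivalence()` are hypothetical names that do not exist in the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical user-space stand-in for a cpumask: bit N set means CPU N. */
typedef uint64_t mask_t;

/* Old shape: visit every CPU set in smt, bail out on the first non-idle one. */
bool all_idle_loop(mask_t smt, mask_t idle)
{
	for (unsigned int cpu = 0; cpu < 64; cpu++) {
		if (!(smt & ((mask_t)1 << cpu)))
			continue;	/* CPU is not in the SMT mask, skip it */
		if (!(idle & ((mask_t)1 << cpu)))
			return false;
	}
	return true;
}

/* New shape: smt is a subset of idle iff no bit of smt is missing from idle. */
bool all_idle_subset(mask_t smt, mask_t idle)
{
	return (smt & ~idle) == 0;
}

/* Compare both forms over every pair of masks that fits in nbits (< 64). */
void check_equivalence(unsigned int nbits)
{
	mask_t limit = (mask_t)1 << nbits;

	for (mask_t smt = 0; smt < limit; smt++)
		for (mask_t idle = 0; idle < limit; idle++)
			assert(all_idle_loop(smt, idle) ==
			       all_idle_subset(smt, idle));
}
```

Calling `check_equivalence(8)` from a `main()` compares the two forms for all 8-bit mask pairs. An empty `smt` mask counts as "all idle" in both forms.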
Re: [PATCH] sched_ext: idle: small CPU iteration refactoring
Posted by Tejun Heo 1 year, 1 month ago
On Sat, Jan 04, 2025 at 10:00:09AM +0100, Andrea Righi wrote:
> Replace the loop to check if all SMT CPUs are idle with
> cpumask_subset(). This simplifies the code and slightly improves
> efficiency, while preserving the original behavior.
> 
> Note that idle_masks.smt handling remains racy, which is acceptable as
> it serves as an optimization and is self-correcting.
> 
> Suggested-by: Yury Norov <yury.norov@gmail.com>
> Signed-off-by: Andrea Righi <arighi@nvidia.com>

Applied to sched_ext/for-6.14.

Thanks.

-- 
tejun
Re: [PATCH] sched_ext: idle: small CPU iteration refactoring
Posted by Yury Norov 1 year, 1 month ago
On Sat, Jan 04, 2025 at 10:00:09AM +0100, Andrea Righi wrote:
> Replace the loop to check if all SMT CPUs are idle with
> cpumask_subset(). This simplifies the code and slightly improves
> efficiency, while preserving the original behavior.

In my experience, replacing an iterator with a dedicated function makes
the code 400-500 times faster. So 'slightly' sounds like an
understatement here.
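
One plausible reason a dedicated mask operation can beat a per-bit iterator is that it handles a whole machine word of CPUs per step. The following is a simplified user-space sketch of that idea, not the kernel's implementation; `words_subset()`, `words_subset_demo()` and `WORD_BITS` are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#define WORD_BITS (sizeof(unsigned long) * CHAR_BIT)

/*
 * Hypothetical helper: true if every bit set in a[] within the first nbits
 * is also set in b[]. Bits past nbits in the last word are ignored.
 */
bool words_subset(const unsigned long *a, const unsigned long *b, size_t nbits)
{
	size_t full = nbits / WORD_BITS;
	size_t rem = nbits % WORD_BITS;
	size_t k;

	/* One AND-NOT per word covers WORD_BITS CPUs at a time. */
	for (k = 0; k < full; k++)
		if (a[k] & ~b[k])
			return false;

	/* Mask off bits beyond nbits in a trailing partial word. */
	if (rem) {
		unsigned long last = (1UL << rem) - 1;

		if (a[k] & ~b[k] & last)
			return false;
	}
	return true;
}

/* Small self-check with a 2-CPU sibling mask, as an illustration. */
void words_subset_demo(void)
{
	unsigned long smt[2] = { 0x3UL, 0 };	/* CPUs 0 and 1 are siblings */
	unsigned long idle[2] = { 0x7UL, 0 };	/* CPUs 0, 1 and 2 are idle */

	assert(words_subset(smt, idle, 2 * WORD_BITS));
	idle[0] = 0x5UL;			/* CPU 1 becomes busy */
	assert(!words_subset(smt, idle, 2 * WORD_BITS));
}
```

How much this helps in practice depends on how many bits the iterated mask has set and on how wide the masks are.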

> Note that idle_masks.smt handling remains racy, which is acceptable as
> it serves as an optimization and is self-correcting.
> 
> Suggested-by: Yury Norov <yury.norov@gmail.com>
> Signed-off-by: Andrea Righi <arighi@nvidia.com>

Reviewed-by: Yury Norov <yury.norov@gmail.com>

> ---
>  kernel/sched/ext.c | 6 ++----
>  1 file changed, 2 insertions(+), 4 deletions(-)
> 
> diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
> index 926579624c41..0ce116e0f67c 100644
> --- a/kernel/sched/ext.c
> +++ b/kernel/sched/ext.c
> @@ -3671,10 +3671,8 @@ void __scx_update_idle(struct rq *rq, bool idle)
>  			 * idle_masks.smt handling is racy but that's fine as
>  			 * it's only for optimization and self-correcting.
>  			 */
> -			for_each_cpu(cpu, smt) {
> -				if (!cpumask_test_cpu(cpu, idle_masks.cpu))
> -					return;
> -			}
> +			if (!cpumask_subset(smt, idle_masks.cpu))
> +				return;
>  			cpumask_or(idle_masks.smt, idle_masks.smt, smt);
>  		} else {
>  			cpumask_andnot(idle_masks.smt, idle_masks.smt, smt);
> -- 
> 2.47.1
> 
> From yury
Re: [PATCH] sched_ext: idle: small CPU iteration refactoring
Posted by Andrea Righi 1 year, 1 month ago
Hi Yury,

On Sat, Jan 04, 2025 at 08:00:10PM -0800, Yury Norov wrote:
> On Sat, Jan 04, 2025 at 10:00:09AM +0100, Andrea Righi wrote:
> > Replace the loop to check if all SMT CPUs are idle with
> > cpumask_subset(). This simplifies the code and slightly improves
> > efficiency, while preserving the original behavior.
> 
> In my experience, replacing an iterator with a dedicated function makes
> the code 400-500 times faster. So 'slightly' sounds like an
> understatement here.

There's also the fact that the smt mask will typically have just 2 bits set
(maybe 4 on some high-end servers), so I wasn't expecting big improvements.
But in any case, it's definitely a good change to make.

Thanks!
-Andrea
