Replace the loop to check if all SMT CPUs are idle with
cpumask_subset(). This simplifies the code and slightly improves
efficiency, while preserving the original behavior.
Note that idle_masks.smt handling remains racy, which is acceptable as
it serves as an optimization and is self-correcting.
Suggested-by: Yury Norov <yury.norov@gmail.com>
Signed-off-by: Andrea Righi <arighi@nvidia.com>
---
kernel/sched/ext.c | 6 ++----
1 file changed, 2 insertions(+), 4 deletions(-)
diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
index 926579624c41..0ce116e0f67c 100644
--- a/kernel/sched/ext.c
+++ b/kernel/sched/ext.c
@@ -3671,10 +3671,8 @@ void __scx_update_idle(struct rq *rq, bool idle)
 			 * idle_masks.smt handling is racy but that's fine as
 			 * it's only for optimization and self-correcting.
 			 */
-			for_each_cpu(cpu, smt) {
-				if (!cpumask_test_cpu(cpu, idle_masks.cpu))
-					return;
-			}
+			if (!cpumask_subset(smt, idle_masks.cpu))
+				return;
 			cpumask_or(idle_masks.smt, idle_masks.smt, smt);
 		} else {
 			cpumask_andnot(idle_masks.smt, idle_masks.smt, smt);
--
2.47.1
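
For illustration only, here is a minimal user-space sketch of why the
removed loop and the subset test give the same answer. It is not the
kernel implementation: struct toy_mask, toy_set(), toy_test(),
toy_all_set_loop() and toy_subset() are hypothetical stand-ins for
struct cpumask and its helpers, and NBITS is an arbitrary size picked
for the example. The per-bit version mirrors the for_each_cpu() loop;
the word-at-a-time version relies on the identity that A is a subset
of B exactly when (A & ~B) is zero in every word.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define NBITS 128 /* assumed mask size, a multiple of the word size */
#define NWORDS ((NBITS + BITS_PER_WORD - 1) / BITS_PER_WORD)

/* Hypothetical stand-in for struct cpumask: a fixed-size bitmap. */
struct toy_mask {
	unsigned long bits[NWORDS];
};

static void toy_set(struct toy_mask *m, size_t bit)
{
	m->bits[bit / BITS_PER_WORD] |= 1UL << (bit % BITS_PER_WORD);
}

static bool toy_test(const struct toy_mask *m, size_t bit)
{
	return (m->bits[bit / BITS_PER_WORD] >> (bit % BITS_PER_WORD)) & 1UL;
}

/* Per-bit check, mirroring the removed for_each_cpu() loop. */
static bool toy_all_set_loop(const struct toy_mask *sub,
			     const struct toy_mask *super)
{
	for (size_t bit = 0; bit < NBITS; bit++) {
		if (toy_test(sub, bit) && !toy_test(super, bit))
			return false;
	}
	return true;
}

/* Word-at-a-time check: sub is a subset iff (sub & ~super) == 0. */
static bool toy_subset(const struct toy_mask *sub,
		       const struct toy_mask *super)
{
	for (size_t i = 0; i < NWORDS; i++) {
		if (sub->bits[i] & ~super->bits[i])
			return false;
	}
	return true;
}
```

With a sibling mask that has only two bits set, both helpers do very
little work, so the gain in this particular call site is likely small;
the word-at-a-time form mainly removes the per-bit iteration and reads
more directly as "all siblings are idle".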
On Sat, Jan 04, 2025 at 10:00:09AM +0100, Andrea Righi wrote:
> Replace the loop to check if all SMT CPUs are idle with
> cpumask_subset(). This simplifies the code and slightly improves
> efficiency, while preserving the original behavior.
>
> Note that idle_masks.smt handling remains racy, which is acceptable as
> it serves as an optimization and is self-correcting.
>
> Suggested-by: Yury Norov <yury.norov@gmail.com>
> Signed-off-by: Andrea Righi <arighi@nvidia.com>

Applied to sched_ext/for-6.14.

Thanks.

--
tejun
On Sat, Jan 04, 2025 at 10:00:09AM +0100, Andrea Righi wrote:
> Replace the loop to check if all SMT CPUs are idle with
> cpumask_subset(). This simplifies the code and slightly improves
> efficiency, while preserving the original behavior.
In my experience, replacing iterator with a dedicated function makes
you 400-500 times faster. So 'slightly' sounds like an understatement
here.
> Note that idle_masks.smt handling remains racy, which is acceptable as
> it serves as an optimization and is self-correcting.
>
> Suggested-by: Yury Norov <yury.norov@gmail.com>
> Signed-off-by: Andrea Righi <arighi@nvidia.com>
Reviewed-by: Yury Norov <yury.norov@gmail.com>
> ---
> kernel/sched/ext.c | 6 ++----
> 1 file changed, 2 insertions(+), 4 deletions(-)
>
> diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
> index 926579624c41..0ce116e0f67c 100644
> --- a/kernel/sched/ext.c
> +++ b/kernel/sched/ext.c
> @@ -3671,10 +3671,8 @@ void __scx_update_idle(struct rq *rq, bool idle)
>  			 * idle_masks.smt handling is racy but that's fine as
>  			 * it's only for optimization and self-correcting.
>  			 */
> -			for_each_cpu(cpu, smt) {
> -				if (!cpumask_test_cpu(cpu, idle_masks.cpu))
> -					return;
> -			}
> +			if (!cpumask_subset(smt, idle_masks.cpu))
> +				return;
>  			cpumask_or(idle_masks.smt, idle_masks.smt, smt);
>  		} else {
>  			cpumask_andnot(idle_masks.smt, idle_masks.smt, smt);
> --
> 2.47.1
>
> From yury
Hi Yury,
On Sat, Jan 04, 2025 at 08:00:10PM -0800, Yury Norov wrote:
> On Sat, Jan 04, 2025 at 10:00:09AM +0100, Andrea Righi wrote:
> > Replace the loop to check if all SMT CPUs are idle with
> > cpumask_subset(). This simplifies the code and slightly improves
> > efficiency, while preserving the original behavior.
>
> In my experience, replacing iterator with a dedicated function makes
> you 400-500 times faster. So 'slightly' sounds like an understatement
> here.
There's also the fact that the smt mask typically will have just 2 bits set
(maybe 4 in some high end servers), so I wasn't expecting big improvements.
But in any case, it's definitely a good change to do.
Thanks!
-Andrea
>
> > Note that idle_masks.smt handling remains racy, which is acceptable as
> > it serves as an optimization and is self-correcting.
> >
> > Suggested-by: Yury Norov <yury.norov@gmail.com>
> > Signed-off-by: Andrea Righi <arighi@nvidia.com>
>
> Reviewed-by: Yury Norov <yury.norov@gmail.com>
>
> > ---
> > kernel/sched/ext.c | 6 ++----
> > 1 file changed, 2 insertions(+), 4 deletions(-)
> >
> > diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
> > index 926579624c41..0ce116e0f67c 100644
> > --- a/kernel/sched/ext.c
> > +++ b/kernel/sched/ext.c
> > @@ -3671,10 +3671,8 @@ void __scx_update_idle(struct rq *rq, bool idle)
> >  			 * idle_masks.smt handling is racy but that's fine as
> >  			 * it's only for optimization and self-correcting.
> >  			 */
> > -			for_each_cpu(cpu, smt) {
> > -				if (!cpumask_test_cpu(cpu, idle_masks.cpu))
> > -					return;
> > -			}
> > +			if (!cpumask_subset(smt, idle_masks.cpu))
> > +				return;
> >  			cpumask_or(idle_masks.smt, idle_masks.smt, smt);
> >  		} else {
> >  			cpumask_andnot(idle_masks.smt, idle_masks.smt, smt);
> > --
> > 2.47.1