[PATCH] sched/fair: Fix integer underflow
Posted by Pierre Gondois 2 months, 2 weeks ago
(struct sg_lb_stats).idle_cpus is of type 'unsigned int'.
If busiest->idle_cpus is greater than local->idle_cpus, the unsigned
subtraction (local->idle_cpus - busiest->idle_cpus) wraps around (to
UINT_MAX for instance), and max_t(long, 0, UINT_MAX) evaluates to
UINT_MAX instead of 0.

Use lsub_positive() instead of max_t().
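
For illustration, a minimal userspace sketch of the same arithmetic,
using made-up values; the two macros below are simplified stand-ins
for the kernel helpers (not the kernel definitions), assuming LP64:

#include <stdio.h>

/* Simplified stand-ins for the kernel's max_t() and lsub_positive(). */
#define max_t(type, x, y) ((type)(x) > (type)(y) ? (type)(x) : (type)(y))
#define lsub_positive(ptr, val) \
	do { *(ptr) -= (*(ptr) > (val) ? (val) : *(ptr)); } while (0)

int main(void)
{
	unsigned int local_idle = 2, busiest_idle = 4;
	long imbalance;

	/* The unsigned subtraction wraps: 2 - 4 == UINT_MAX - 1. */
	imbalance = max_t(long, 0, local_idle - busiest_idle);
	printf("max_t:         %ld\n", imbalance); /* 4294967294 on LP64 */

	/* lsub_positive() clamps the difference at 0 instead. */
	imbalance = local_idle;
	lsub_positive(&imbalance, busiest_idle);
	printf("lsub_positive: %ld\n", imbalance); /* 0 */

	return 0;
}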

Fixes: 0b0695f2b34a ("sched/fair: Rework load_balance()")
Cc: stable@vger.kernel.org
Signed-off-by: Pierre Gondois <pierre.gondois@arm.com>
---
 kernel/sched/fair.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 9057584ec06d..6d9124499f52 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -10775,8 +10775,8 @@ static inline void calculate_imbalance(struct lb_env *env, struct sd_lb_stats *s
 			 * idle CPUs.
 			 */
 			env->migration_type = migrate_task;
-			env->imbalance = max_t(long, 0,
-					       (local->idle_cpus - busiest->idle_cpus));
+			env->imbalance = local->idle_cpus;
+			lsub_positive(&env->imbalance, busiest->idle_cpus);
 		}
 
 #ifdef CONFIG_NUMA
-- 
2.25.1
Re: [PATCH] sched/fair: Fix integer underflow
Posted by Vincent Guittot 2 months, 2 weeks ago
Hi Pierre

On Fri, 13 Sept 2024 at 10:58, Pierre Gondois <pierre.gondois@arm.com> wrote:
>
> (struct sg_lb_stats).idle_cpus is of type 'unsigned int'.
> If busiest->idle_cpus is greater than local->idle_cpus, the unsigned
> subtraction (local->idle_cpus - busiest->idle_cpus) wraps around (to
> UINT_MAX for instance), and max_t(long, 0, UINT_MAX) evaluates to
> UINT_MAX instead of 0.
>
> Use lsub_positive() instead of max_t().

Have you faced the problem, or is this patch based on code review?

We have the check below in sched_balance_find_src_group(), which should
ensure that local->idle_cpus > busiest->idle_cpus:

if (busiest->group_weight > 1 &&
    local->idle_cpus <= (busiest->idle_cpus + 1)) {
	/*
	 * If the busiest group is not overloaded
	 * and there is no imbalance between this and busiest
	 * group wrt idle CPUs, it is balanced. The imbalance
	 * becomes significant if the diff is greater than 1
	 * otherwise we might end up to just move the imbalance
	 * on another group. Of course this applies only if
	 * there is more than 1 CPU per group.
	 */
	goto out_balanced;
}

> [...]
Re: [PATCH] sched/fair: Fix integer underflow
Posted by Pierre Gondois 2 months, 2 weeks ago
Hello Vincent,

On 9/13/24 18:14, Vincent Guittot wrote:
> Hi Pierre
> 
> On Fri, 13 Sept 2024 at 10:58, Pierre Gondois <pierre.gondois@arm.com> wrote:
>>
>> (struct sg_lb_stats).idle_cpus is of type 'unsigned int'.
>> If busiest->idle_cpus is greater than local->idle_cpus, the unsigned
>> subtraction (local->idle_cpus - busiest->idle_cpus) wraps around (to
>> UINT_MAX for instance), and max_t(long, 0, UINT_MAX) evaluates to
>> UINT_MAX instead of 0.
>>
>> Use lsub_positive() instead of max_t().
> 
> Have you faced the problem, or is this patch based on code review?
> 
> We have the check below in sched_balance_find_src_group(), which should
> ensure that local->idle_cpus > busiest->idle_cpus:
> 
> if (busiest->group_weight > 1 &&
>     local->idle_cpus <= (busiest->idle_cpus + 1)) {
> 	/*
> 	 * If the busiest group is not overloaded
> 	 * and there is no imbalance between this and busiest
> 	 * group wrt idle CPUs, it is balanced. The imbalance
> 	 * becomes significant if the diff is greater than 1
> 	 * otherwise we might end up to just move the imbalance
> 	 * on another group. Of course this applies only if
> 	 * there is more than 1 CPU per group.
> 	 */
> 	goto out_balanced;
> }

It was with this setup:
- busiest->group_type == group_overloaded
so this check might not have been reached. I can check the exact path if desired.
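
From a first look, the relevant control flow is roughly as below (a
simplified paraphrase of sched_balance_find_src_group(), not verbatim
kernel code):

if (busiest->group_type != group_overloaded) {
	...
	/* The guard you quoted only runs in this branch: */
	if (busiest->group_weight > 1 &&
	    local->idle_cpus <= (busiest->idle_cpus + 1))
		goto out_balanced;
	...
}
/*
 * With busiest->group_type == group_overloaded, the guard is skipped,
 * so calculate_imbalance() can still run with
 * local->idle_cpus < busiest->idle_cpus.
 */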

Regards,
Pierre
Re: [PATCH] sched/fair: Fix integer underflow
Posted by Vincent Guittot 2 months ago
On Fri, 13 Sept 2024 at 19:36, Pierre Gondois <pierre.gondois@arm.com> wrote:
>
> Hello Vincent,
>
> On 9/13/24 18:14, Vincent Guittot wrote:
> > Hi Pierre
> >
> > On Fri, 13 Sept 2024 at 10:58, Pierre Gondois <pierre.gondois@arm.com> wrote:
> >>
> >> (struct sg_lb_stats).idle_cpus is of type 'unsigned int'.
> >> If busiest->idle_cpus is greater than local->idle_cpus, the unsigned
> >> subtraction (local->idle_cpus - busiest->idle_cpus) wraps around (to
> >> UINT_MAX for instance), and max_t(long, 0, UINT_MAX) evaluates to
> >> UINT_MAX instead of 0.
> >>
> >> Use lsub_positive() instead of max_t().
> >
> > Have you faced the problem, or is this patch based on code review?
> >
> > We have the check below in sched_balance_find_src_group(), which should
> > ensure that local->idle_cpus > busiest->idle_cpus:
> >
> > if (busiest->group_weight > 1 &&
> >     local->idle_cpus <= (busiest->idle_cpus + 1)) {
> > 	/*
> > 	 * If the busiest group is not overloaded
> > 	 * and there is no imbalance between this and busiest
> > 	 * group wrt idle CPUs, it is balanced. The imbalance
> > 	 * becomes significant if the diff is greater than 1
> > 	 * otherwise we might end up to just move the imbalance
> > 	 * on another group. Of course this applies only if
> > 	 * there is more than 1 CPU per group.
> > 	 */
> > 	goto out_balanced;
> > }
>
> It was with this setup:
> - busiest->group_type == group_overloaded
> so this check might not have been reached. I can check the exact path if desired.

I'm curious that we never faced this before, as the change is almost 4
years old now, but there are probably some topologies that can end up
in such a situation.

Also, the Fixes tag should be:
Fixes: 16b0a7a1a0af ("sched/fair: Ensure tasks spreading in LLC during LB")
Before this commit, group_overloaded was not comparing the number of tasks.

With the corrected Fixes tag:
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
