[PATCH] sched/fair: simplify __calc_delta()
Posted by Dawei Li 1 year, 9 months ago
Based on how __calc_delta() is called now, the input parameter, weight
is always NICE_0_LOAD. I think we don't need it as an input parameter
now?

Also, when weight is always NICE_0_LOAD, the initial fact value is
always 2^10, and the first fact_hi will always be 0. Thus, we can get
rid of the first if block.
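
(As an illustration: with weight == NICE_0_LOAD, and assuming CONFIG_64BIT
where NICE_0_LOAD is 2^20 and scale_load_down() drops
SCHED_FIXEDPOINT_SHIFT = 10 bits, the old code reduces to

	u64 fact = scale_load_down(NICE_0_LOAD);	/* 2^20 >> 10 = 2^10 */
	u32 fact_hi = (u32)(fact >> 32);		/* upper 32 bits of 2^10: always 0 */

so the first "if (unlikely(fact_hi))" branch can never be taken. On
CONFIG_32BIT, NICE_0_LOAD is already 2^10 and scale_load_down() is a no-op,
with the same result.)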

The previous comment "(delta_exec * (weight * lw->inv_weight)) >>
WMULT_SHIFT" seems to be assuming that lw->weight * lw->inv_weight is
always (approximately) equal to 2^WMULT_SHIFT. However, when
CONFIG_64BIT is set, lw->weight * lw->inv_weight is (approximately)
equal to 2^WMULT_SHIFT * 2^10. What remains true for both CONFIG_32BIT
and CONFIG_64BIT is: scale_load_down(lw->weight) * lw->inv_weight is
(approximately) equal to 2^WMULT_SHIFT. (Correct me if I am wrong.)
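
(Concretely, with WMULT_CONST = 2^32 - 1 and WMULT_SHIFT = 32, and taking a
nice-0 entity as an example: scale_load_down(lw->weight) = 2^10 and
__update_inv_weight() sets lw->inv_weight = WMULT_CONST / 2^10, roughly 2^22,
so scale_load_down(lw->weight) * lw->inv_weight is roughly 2^32 =
2^WMULT_SHIFT; but on CONFIG_64BIT lw->weight itself is NICE_0_LOAD = 2^20,
making lw->weight * lw->inv_weight roughly 2^42 = 2^WMULT_SHIFT * 2^10.)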

Also updated the comment for calc_delta_fair() to make it more
accurate.

Signed-off-by: Dawei Li <daweilics@gmail.com>
---
 kernel/sched/fair.c | 29 ++++++++++-------------------
 1 file changed, 10 insertions(+), 19 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 6a16129f9a5c..c5cdb15f7d62 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -252,32 +252,23 @@ static void __update_inv_weight(struct load_weight *lw)
 }
 
 /*
- * delta_exec * weight / lw.weight
+ * delta_exec * NICE_0_LOAD / lw->weight
  *   OR
- * (delta_exec * (weight * lw->inv_weight)) >> WMULT_SHIFT
+ * (delta_exec * scale_load_down(NICE_0_LOAD) * lw->inv_weight) >> WMULT_SHIFT
  *
- * Either weight := NICE_0_LOAD and lw \e sched_prio_to_wmult[], in which case
- * we're guaranteed shift stays positive because inv_weight is guaranteed to
- * fit 32 bits, and NICE_0_LOAD gives another 10 bits; therefore shift >= 22.
- *
- * Or, weight =< lw.weight (because lw.weight is the runqueue weight), thus
- * weight/lw.weight <= 1, and therefore our shift will also be positive.
+ * We're guaranteed shift stays positive because inv_weight is guaranteed to
+ * fit 32 bits, and scale_load_down(NICE_0_LOAD) gives another 10 bits;
+ * therefore shift >= 22.
  */
-static u64 __calc_delta(u64 delta_exec, unsigned long weight, struct load_weight *lw)
+static u64 __calc_delta(u64 delta_exec, struct load_weight *lw)
 {
-	u64 fact = scale_load_down(weight);
-	u32 fact_hi = (u32)(fact >> 32);
+	u64 fact = scale_load_down(NICE_0_LOAD);
+	int fact_hi;
 	int shift = WMULT_SHIFT;
 	int fs;
 
 	__update_inv_weight(lw);
 
-	if (unlikely(fact_hi)) {
-		fs = fls(fact_hi);
-		shift -= fs;
-		fact >>= fs;
-	}
-
 	fact = mul_u32_u32(fact, lw->inv_weight);
 
 	fact_hi = (u32)(fact >> 32);
@@ -291,12 +282,12 @@ static u64 __calc_delta(u64 delta_exec, unsigned long weight, struct load_weight
 }
 
 /*
- * delta /= w
+ * delta *= NICE_0_LOAD / se->load.weight
  */
 static inline u64 calc_delta_fair(u64 delta, struct sched_entity *se)
 {
 	if (unlikely(se->load.weight != NICE_0_LOAD))
-		delta = __calc_delta(delta, NICE_0_LOAD, &se->load);
+		delta = __calc_delta(delta, &se->load);
 
 	return delta;
 }
-- 
2.40.1
Re: [PATCH] sched/fair: simplify __calc_delta()
Posted by Pierre Gondois 1 year, 9 months ago
Hello Dawei,

On 3/6/24 23:28, Dawei Li wrote:
> Based on how __calc_delta() is called now, the input parameter, weight
> is always NICE_0_LOAD. I think we don't need it as an input parameter
> now?

Maybe
   5e963f2bd4654a202a8a05aa3a86cb0300b10e6c ("sched/fair: Commit to EEVDF")
should be referenced to explain that the case where (weight =< lw.weight)
doesn't exist anymore and that NICE_0_LOAD could be incorporated in
__calc_delta() directly.


Also I think indirect forms are preferred in general:
"I think we don't need it as an input parameter now ?" ->
"The 'weight' parameter doesn't seem to be required anymore"
(same note for the whole commit message)

> 
> Also, when weight is always NICE_0_LOAD, the initial fact value is
> always 2^10, and the first fact_hi will always be 0. Thus, we can get
> rid of the first if block.
> 
> The previous comment "(delta_exec * (weight * lw->inv_weight)) >>
> WMULT_SHIFT" seems to be assuming that lw->weight * lw->inv_weight is
> always (approximately) equal to 2^WMULT_SHIFT. However, when
> CONFIG_64BIT is set, lw->weight * lw->inv_weight is (approximately)
> equal to 2^WMULT_SHIFT * 2^10. What remains true for both CONFIG_32BIT
> and CONFIG_64BIT is: scale_load_down(lw->weight) * lw->inv_weight is
> (approximately) equal to 2^WMULT_SHIFT. (Correct me if I am wrong.)

I think the comment is more about explaining that:
   X * lw.weight
equals:
   X * lw->inv_weight >> WMULT_SHIFT

Also, if CONFIG_64BIT is set, we should have:
   weight / lw.weight == scale_load_down(lw->weight) * 2**10 * lw->inv_weight

So IIUC, either both lines should be updated, or neither.
(meaning that:
   delta_exec * NICE_0_LOAD / lw->weight
should be changed to
   delta_exec * scale_load_down(NICE_0_LOAD) / lw->weight
)

I assume it's better to leave the comment as is.


> 
> Also updated the comment for calc_delta_fair() to make it more
> accurate.
> 
> Signed-off-by: Dawei Li <daweilics@gmail.com>
> ---
>   kernel/sched/fair.c | 29 ++++++++++-------------------
>   1 file changed, 10 insertions(+), 19 deletions(-)
> 
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 6a16129f9a5c..c5cdb15f7d62 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -252,32 +252,23 @@ static void __update_inv_weight(struct load_weight *lw)
>   }
>   
>   /*
> - * delta_exec * weight / lw.weight
> + * delta_exec * NICE_0_LOAD / lw->weight
>    *   OR
> - * (delta_exec * (weight * lw->inv_weight)) >> WMULT_SHIFT
> + * (delta_exec * scale_load_down(NICE_0_LOAD) * lw->inv_weight) >> WMULT_SHIFT
>    *
> - * Either weight := NICE_0_LOAD and lw \e sched_prio_to_wmult[], in which case
> - * we're guaranteed shift stays positive because inv_weight is guaranteed to
> - * fit 32 bits, and NICE_0_LOAD gives another 10 bits; therefore shift >= 22.
> - *
> - * Or, weight =< lw.weight (because lw.weight is the runqueue weight), thus
> - * weight/lw.weight <= 1, and therefore our shift will also be positive.
> + * We're guaranteed shift stays positive because inv_weight is guaranteed to
> + * fit 32 bits, and scale_load_down(NICE_0_LOAD) gives another 10 bits;
> + * therefore shift >= 22.
>    */
> -static u64 __calc_delta(u64 delta_exec, unsigned long weight, struct load_weight *lw)
> +static u64 __calc_delta(u64 delta_exec, struct load_weight *lw)
>   {
> -	u64 fact = scale_load_down(weight);
> -	u32 fact_hi = (u32)(fact >> 32);
> +	u64 fact = scale_load_down(NICE_0_LOAD);
> +	int fact_hi;
>   	int shift = WMULT_SHIFT;
>   	int fs;

NIT: maybe re-ordering the variables to have a reverse tree
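(e.g., just a sketch of the ordering I mean, longest declaration first:

	u64 fact = scale_load_down(NICE_0_LOAD);
	int shift = WMULT_SHIFT;
	int fact_hi;
	int fs;
)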

Otherwise, the patch looks good to me,
Regards,
Pierre
Re: [PATCH] sched/fair: simplify __calc_delta()
Posted by Dawei Li 1 year, 9 months ago
Hi Pierre,
Thank you for the review!

On Tue, Mar 12, 2024 at 6:18 AM Pierre Gondois <pierre.gondois@arm.com> wrote:
>
> Hello Dawei,
>
> On 3/6/24 23:28, Dawei Li wrote:
> > Based on how __calc_delta() is called now, the input parameter, weight
> > is always NICE_0_LOAD. I think we don't need it as an input parameter
> > now?
>
> Maybe
>    5e963f2bd4654a202a8a05aa3a86cb0300b10e6c ("sched/fair: Commit to EEVDF")
> should be referenced to explain that the case where (weight =< lw.weight)
> doesn't exist anymore and that NICE_0_LOAD could be incorporated in
> __calc_delta() directly.
>
>
> Also I think indirect forms are preferred in general:
> "I think we don't need it as an input parameter now ?" ->
> "The 'weight' parameter doesn't seem to be required anymore"
> (same note for the whole commit message)
>
> >
> > Also, when weight is always NICE_0_LOAD, the initial fact value is
> > always 2^10, and the first fact_hi will always be 0. Thus, we can get
> > rid of the first if block.
> >
> > The previous comment "(delta_exec * (weight * lw->inv_weight)) >>
> > WMULT_SHIFT" seems to be assuming that lw->weight * lw->inv_weight is
> > always (approximately) equal to 2^WMULT_SHIFT. However, when
> > CONFIG_64BIT is set, lw->weight * lw->inv_weight is (approximately)
> > equal to 2^WMULT_SHIFT * 2^10. What remains true for both CONFIG_32BIT
> > and CONFIG_64BIT is: scale_load_down(lw->weight) * lw->inv_weight is
> > (approximately) equal to 2^WMULT_SHIFT. (Correct me if I am wrong.)
>
> I think the comment is more about explaining that:
>    X * lw.weight
> equals:
>    X * lw->inv_weight >> WMULT_SHIFT
>
I assume you mean
X / lw->weight
equals:
X * lw->inv_weight >> WMULT_SHIFT
However, this is not always true, and that's why I'd like to revise it.
It is true for CONFIG_32BIT. However, for CONFIG_64BIT, we have
lw->weight * lw->inv_weight = 2**WMULT_SHIFT * 2**10. Thus,
X / lw->weight
equals:
X * lw->inv_weight >> (WMULT_SHIFT + 10)
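
(For example, for a nice-0 entity on CONFIG_64BIT: lw->weight = NICE_0_LOAD =
2**20, and __update_inv_weight() gives lw->inv_weight = WMULT_CONST /
scale_load_down(lw->weight), roughly 2**32 / 2**10 = 2**22, so
lw->weight * lw->inv_weight is roughly 2**42 = 2**(WMULT_SHIFT + 10), if I
read __update_inv_weight() correctly.)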


> Also, if CONFIG_64BIT is set, we should have:
>    weight / lw.weight == scale_load_down(lw->weight) * 2**10 * lw->inv_weight
>

weight / lw->weight should be equal to
scale_load_down(weight) / scale_load_down(lw->weight)
= scale_load_down(weight) * lw->inv_weight / (2**WMULT_SHIFT)
Right?

> So IIUC, either both lines should be updated, or neither.
> (meaning that:
>    delta_exec * NICE_0_LOAD / lw->weight
> should be changed to
>    delta_exec * scale_load_down(NICE_0_LOAD) / lw->weight

I think this is not correct? scale_load_down(NICE_0_LOAD) is the true
weight, as mapped directly from the task's nice/priority value, while
lw->weight is the scaled-up load. Their units/scales don't match.
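
(For reference, my understanding of the relevant scheduler definitions,
slightly simplified, for CONFIG_64BIT:

	#define SCHED_FIXEDPOINT_SHIFT	10
	#define NICE_0_LOAD	(1L << (2 * SCHED_FIXEDPOINT_SHIFT))	/* 2**20 */
	#define scale_load(w)	((w) << SCHED_FIXEDPOINT_SHIFT)
	#define scale_load_down(w)	((w) >> SCHED_FIXEDPOINT_SHIFT)	/* the real macro also clamps to >= 2 */

so for a nice-0 task lw->weight is 2**20, while scale_load_down(NICE_0_LOAD) =
2**10 matches the sched_prio_to_weight[] value.)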

I am quite new to the source code. I could be wrong, but would like to
see more clarification on this.

> )
>
> I assume it's better to leave the comment as is.
>
>
> >
> > Also updated the comment for calc_delta_fair() to make it more
> > accurate.
> >
> > Signed-off-by: Dawei Li <daweilics@gmail.com>
> > ---
> >   kernel/sched/fair.c | 29 ++++++++++-------------------
> >   1 file changed, 10 insertions(+), 19 deletions(-)
> >
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index 6a16129f9a5c..c5cdb15f7d62 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -252,32 +252,23 @@ static void __update_inv_weight(struct load_weight *lw)
> >   }
> >
> >   /*
> > - * delta_exec * weight / lw.weight
> > + * delta_exec * NICE_0_LOAD / lw->weight
> >    *   OR
> > - * (delta_exec * (weight * lw->inv_weight)) >> WMULT_SHIFT
> > + * (delta_exec * scale_load_down(NICE_0_LOAD) * lw->inv_weight) >> WMULT_SHIFT
> >    *
> > - * Either weight := NICE_0_LOAD and lw \e sched_prio_to_wmult[], in which case
> > - * we're guaranteed shift stays positive because inv_weight is guaranteed to
> > - * fit 32 bits, and NICE_0_LOAD gives another 10 bits; therefore shift >= 22.
> > - *
> > - * Or, weight =< lw.weight (because lw.weight is the runqueue weight), thus
> > - * weight/lw.weight <= 1, and therefore our shift will also be positive.
> > + * We're guaranteed shift stays positive because inv_weight is guaranteed to
> > + * fit 32 bits, and scale_load_down(NICE_0_LOAD) gives another 10 bits;
> > + * therefore shift >= 22.
> >    */
> > -static u64 __calc_delta(u64 delta_exec, unsigned long weight, struct load_weight *lw)
> > +static u64 __calc_delta(u64 delta_exec, struct load_weight *lw)
> >   {
> > -     u64 fact = scale_load_down(weight);
> > -     u32 fact_hi = (u32)(fact >> 32);
> > +     u64 fact = scale_load_down(NICE_0_LOAD);
> > +     int fact_hi;
> >       int shift = WMULT_SHIFT;
> >       int fs;
>
> NIT: maybe re-ordering the variables to have a reverse tree
>
> Otherwise, the patch looks good to me,
> Regards,
> Pierre
Re: [PATCH] sched/fair: simplify __calc_delta()
Posted by Pierre Gondois 1 year, 9 months ago
Hello Dawei,

On 3/13/24 00:25, Dawei Li wrote:
> Hi Pierre,
> Thank you for the review!
> 
> On Tue, Mar 12, 2024 at 6:18 AM Pierre Gondois <pierre.gondois@arm.com> wrote:
>>
>> Hello Dawei,
>>
>> On 3/6/24 23:28, Dawei Li wrote:
>>> Based on how __calc_delta() is called now, the input parameter, weight
>>> is always NICE_0_LOAD. I think we don't need it as an input parameter
>>> now?
>>
>> Maybe
>>     5e963f2bd4654a202a8a05aa3a86cb0300b10e6c ("sched/fair: Commit to EEVDF")
>> should be referenced to explain that the case where (weight =< lw.weight)
>> doesn't exist anymore and that NICE_0_LOAD could be incorporated in
>> __calc_delta() directly.
>>
>>
>> Also I think indirect forms are preferred in general:
>> "I think we don't need it as an input parameter now ?" ->
>> "The 'weight' parameter doesn't seem to be required anymore"
>> (same note for the whole commit message)
>>
>>>
>>> Also, when weight is always NICE_0_LOAD, the initial fact value is
>>> always 2^10, and the first fact_hi will always be 0. Thus, we can get
>>> rid of the first if block.
>>>
>>> The previous comment "(delta_exec * (weight * lw->inv_weight)) >>
>>> WMULT_SHIFT" seems to be assuming that lw->weight * lw->inv_weight is
>>> always (approximately) equal to 2^WMULT_SHIFT. However, when
>>> CONFIG_64BIT is set, lw->weight * lw->inv_weight is (approximately)
>>> equal to 2^WMULT_SHIFT * 2^10. What remains true for both CONFIG_32BIT
>>> and CONFIG_64BIT is: scale_load_down(lw->weight) * lw->inv_weight is
>>> (approximately) equal to 2^WMULT_SHIFT. (Correct me if I am wrong.)
>>
>> I think the comment is more about explaining that:
>>     X * lw.weight
>> equals:
>>     X * lw->inv_weight >> WMULT_SHIFT
>>
> I assume you mean
> X / lw->weight
> equals:
> X * lw->inv_weight >> WMULT_SHIFT

Yes right indeed.

> However, this is not always true, and that's why I'd like to revise
> it. It is true for
> CONFIG_32BIT. However, For CONFIG_64BIT, we have lw->weight * lw->inv_weight =
> 2**WMULT_SHIFT * 2**10. Thus,
> X / lw->weight
> equals:
> X * lw->inv_weight >> (WMULT_SHIFT + 10)

Ok yes, you're correct indeed.
The equality is always correct when scale_load_down() is used.

Regards,
Pierre

> 
> 
>> Also, if CONFIG_64BIT is set, we should have:
>>     weight / lw.weight == scale_load_down(lw->weight) * 2**10 * lw->inv_weight
>>
> 
> weight / lw->weight should be equal to scale_load_down(weight) /
> scale_load_down(lw->weight)
> = scale_load_down(weight) * lw->inv_weight / (2**WMULT_SHIFT)
> Right?
> 
>> So IIUC, either both lines should be updated, or neither.
>> (meaning that:
>>     delta_exec * NICE_0_LOAD / lw->weight
>> should be changed to
>>     delta_exec * scale_load_down(NICE_0_LOAD) / lw->weight
> 
> I think this is not correct? scale_load_down(NICE_0_LOAD) is the true
> weight, as mapped
> directly from the task's nice/priority value, while lw->weight is the
> scaled_up load.
> Their units/scales don't match.
>