sched/fair: feec()/scale_rt_capacity() improvements

[PATCH 1/2] sched/fair: Check if prev_cpu has highest spare cap in feec()

Posted by Pierre Gondois 3 years, 7 months ago

When evaluating the CPU candidates in the perf domain (pd) containing
the previously used CPU (prev_cpu), find_energy_efficient_cpu()
evaluates the energy of the pd:
- without the task (base_energy)
- with the task placed on prev_cpu (if the task fits)
- with the task placed on the CPU with the highest spare capacity,
  prev_cpu being excluded from this set

If prev_cpu is already the CPU with the highest spare capacity,
max_spare_cap_cpu will be the CPU with the second highest spare
capacity.

On an Arm64 Juno-r2, with a workload of 10 tasks at a 10% duty cycle,
when prev_cpu and max_spare_cap_cpu are both valid candidates,
prev_spare_cap > max_spare_cap in ~82% of the cases.
Thus, ~82% of the time, the energy of the pd with the task placed on
max_spare_cap_cpu is computed with no possible positive outcome.

Do not consider max_spare_cap_cpu as a valid candidate if
prev_spare_cap > max_spare_cap.

Signed-off-by: Pierre Gondois <pierre.gondois@arm.com>
---
 kernel/sched/fair.c | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 914096c5b1ae..bcae7bdd5582 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6900,7 +6900,7 @@ static int find_energy_efficient_cpu(struct task_struct *p, int prev_cpu)
 	for (; pd; pd = pd->next) {
 		unsigned long cpu_cap, cpu_thermal_cap, util;
 		unsigned long cur_delta, max_spare_cap = 0;
-		bool compute_prev_delta = false;
+		unsigned long prev_spare_cap = 0;
 		int max_spare_cap_cpu = -1;
 		unsigned long base_energy;
 
@@ -6944,18 +6944,19 @@ static int find_energy_efficient_cpu(struct task_struct *p, int prev_cpu)
 
 			if (cpu == prev_cpu) {
 				/* Always use prev_cpu as a candidate. */
-				compute_prev_delta = true;
+				prev_spare_cap = cpu_cap;
 			} else if (cpu_cap > max_spare_cap) {
 				/*
 				 * Find the CPU with the maximum spare capacity
-				 * in the performance domain.
+				 * among the remaining CPUs in the performance
+				 * domain.
 				 */
 				max_spare_cap = cpu_cap;
 				max_spare_cap_cpu = cpu;
 			}
 		}
 
-		if (max_spare_cap_cpu < 0 && !compute_prev_delta)
+		if (max_spare_cap_cpu < 0 && prev_spare_cap == 0)
 			continue;
 
 		eenv_pd_busy_time(&eenv, cpus, p);
@@ -6963,7 +6964,7 @@ static int find_energy_efficient_cpu(struct task_struct *p, int prev_cpu)
 		base_energy = compute_energy(&eenv, pd, cpus, p, -1);
 
 		/* Evaluate the energy impact of using prev_cpu. */
-		if (compute_prev_delta) {
+		if (prev_spare_cap > 0) {
 			prev_delta = compute_energy(&eenv, pd, cpus, p,
 						    prev_cpu);
 			/* CPU utilization has changed */
@@ -6974,7 +6975,7 @@ static int find_energy_efficient_cpu(struct task_struct *p, int prev_cpu)
 		}
 
 		/* Evaluate the energy impact of using max_spare_cap_cpu. */
-		if (max_spare_cap_cpu >= 0) {
+		if (max_spare_cap_cpu >= 0 && max_spare_cap > prev_spare_cap) {
 			cur_delta = compute_energy(&eenv, pd, cpus, p,
 						   max_spare_cap_cpu);
 			/* CPU utilization has changed */
-- 
2.25.1
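
For readers following the thread, the candidate selection that the patch
above changes can be sketched in isolation. This is a minimal userspace
sketch, not the kernel code: struct pd_candidates, select_candidates()
and the spare_cap[] array are hypothetical names introduced for
illustration, and a spare capacity of 0 is assumed to mean "the task
does not fit on this CPU".

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical result type: which placements still need an energy estimate. */
struct pd_candidates {
	bool eval_prev;        /* estimate energy with the task on prev_cpu */
	bool eval_max_spare;   /* estimate energy with the task on max_spare_cap_cpu */
	int max_spare_cap_cpu; /* -1 if no CPU other than prev_cpu fits the task */
};

/*
 * spare_cap[i] is the spare capacity of CPU i in one performance domain.
 * Assumption of this sketch: 0 means the task does not fit on that CPU.
 * prev_cpu is an index into spare_cap[], or -1 if prev_cpu is in another pd.
 */
static struct pd_candidates select_candidates(const unsigned long *spare_cap,
					      size_t nr_cpus, int prev_cpu)
{
	unsigned long max_spare_cap = 0, prev_spare_cap = 0;
	struct pd_candidates c = { false, false, -1 };

	for (size_t cpu = 0; cpu < nr_cpus; cpu++) {
		unsigned long cpu_cap = spare_cap[cpu];

		if (cpu_cap == 0)
			continue;

		if ((int)cpu == prev_cpu) {
			/* Remember prev_cpu's spare capacity instead of a flag. */
			prev_spare_cap = cpu_cap;
		} else if (cpu_cap > max_spare_cap) {
			/* Best spare capacity among the remaining CPUs. */
			max_spare_cap = cpu_cap;
			c.max_spare_cap_cpu = (int)cpu;
		}
	}

	c.eval_prev = prev_spare_cap > 0;
	/* Skip the second estimate unless it can beat prev_cpu's spare capacity. */
	c.eval_max_spare = c.max_spare_cap_cpu >= 0 &&
			   max_spare_cap > prev_spare_cap;
	return c;
}
```

With spare capacities {300, 200, 100} and prev_cpu = 0, only prev_cpu is
evaluated; with prev_cpu = 2, both prev_cpu and CPU 0 are evaluated.
Note that, as in the diff, the comparison is strict, so a tie between
prev_spare_cap and max_spare_cap also skips max_spare_cap_cpu.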

Re: [PATCH 1/2] sched/fair: Check if prev_cpu has highest spare cap in feec()

Posted by Dietmar Eggemann 3 years, 7 months ago

On 19/08/2022 17:33, Pierre Gondois wrote:
> When evaluating the CPU candidates in the perf domain (pd) containing
> the previously used CPU (prev_cpu), find_energy_efficient_cpu()
> evaluates the energy of the pd:
> - without the task (base_energy)
> - with the task placed on prev_cpu (if the task fits)
> - with the task placed on the CPU with the highest spare capacity,
>   prev_cpu being excluded from this set
> 
> If prev_cpu is already the CPU with the highest spare capacity,
> max_spare_cap_cpu will be the CPU with the second highest spare
> capacity.
> 
> On an Arm64 Juno-r2, with a workload of 10 tasks at a 10% duty cycle,
> when prev_cpu and max_spare_cap_cpu are both valid candidates,
> prev_spare_cap > max_spare_cap at ~82%.
> Thus the energy of the pd when placing the task on max_spare_cap_cpu
> is computed with no possible positive outcome 82% most of the time.
> 
> Do not consider max_spare_cap_cpu as a valid candidate if
> prev_spare_cap > max_spare_cap.
> 
> Signed-off-by: Pierre Gondois <pierre.gondois@arm.com>

LGTM. When I ran the workload I see this happening in 50%-90% of the EAS
wakeups. This should prevent one needless compute_energy() call out of 7
on a typical 3-gear system like 2x2x4 in these cases.

Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
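
One way to read the "one out of 7" figure above: with three performance
domains, each pd needs a base_energy estimate and a max_spare_cap_cpu
estimate, and the pd containing prev_cpu needs one more for prev_cpu,
giving 3 + 3 + 1 = 7 compute_energy() calls. The sketch below only
counts calls under that reading. count_energy_calls() is a hypothetical
helper, and it assumes that the task fits on prev_cpu and that every pd
has at least one other CPU on which the task fits.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Count energy estimations for one wakeup across nr_pds performance
 * domains. Assumption of this sketch: prev_cpu lives in pd 0.
 */
static unsigned int count_energy_calls(size_t nr_pds,
				       bool prev_has_highest_spare_cap)
{
	unsigned int calls = 0;

	for (size_t pd = 0; pd < nr_pds; pd++) {
		bool has_prev = (pd == 0);

		calls++;		/* base_energy, without the task */
		if (has_prev)
			calls++;	/* task placed on prev_cpu */
		/* With the patch, skipped when prev_cpu already has the most spare capacity. */
		if (!(has_prev && prev_has_highest_spare_cap))
			calls++;	/* task placed on max_spare_cap_cpu */
	}
	return calls;
}
```

Under these assumptions, count_energy_calls(3, false) gives 7 and
count_energy_calls(3, true) gives 6.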

[...]

Re: [PATCH 1/2] sched/fair: Check if prev_cpu has highest spare cap in feec()

Posted by Pierre Gondois 3 years, 6 months ago

Hello Peter,

The second patch:
  - [PATCH 2/2] sched/fair: Use IRQ scaling for all sched classes
must be dropped (cf. Vincent Guittot's review), but I believe this patch
should be ok to take if there are no other comments.

Regards,
Pierre
