[PATCH v2 1/2] sched/deadline: Restore dl_server bandwidth on non-destructive root domain changes

Juri Lelli posted 2 patches 1 week, 1 day ago
When non-destructive root domain changes happen (e.g., only one of the
existing root domains is modified while the rest are left untouched), we
still need to clear DEADLINE bandwidth accounting so that it can then be
properly restored, taking into account the DEADLINE tasks associated
with each cpuset (and thus with each root domain). Since the
introduction of dl_servers, we fail to restore the servers' contribution
after such non-destructive changes, as dl_servers are only considered on
destructive changes, when runqueues are attached to the new domains.

Fix this by making sure we iterate over the dl_servers attached to
domains that have not been destroyed and add their bandwidth
contribution back correctly.

Signed-off-by: Juri Lelli <juri.lelli@redhat.com>

---
v1->v2: always restore, considering a root domain span (and check for
        active cpus)
---
 kernel/sched/deadline.c | 17 ++++++++++++++---
 kernel/sched/topology.c |  8 +++++---
 2 files changed, 19 insertions(+), 6 deletions(-)

diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index 9ce93d0bf452..a9cdbf058871 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -2970,11 +2970,22 @@ void dl_add_task_root_domain(struct task_struct *p)
 
 void dl_clear_root_domain(struct root_domain *rd)
 {
-	unsigned long flags;
+	int i;
 
-	raw_spin_lock_irqsave(&rd->dl_bw.lock, flags);
+	guard(raw_spinlock_irqsave)(&rd->dl_bw.lock);
 	rd->dl_bw.total_bw = 0;
-	raw_spin_unlock_irqrestore(&rd->dl_bw.lock, flags);
+
+	/*
+	 * dl_server bandwidth is only restored when CPUs are attached to root
+	 * domains (after domains are created or CPUs moved back to the
+	 * default root domain).
+	 */
+	for_each_cpu(i, rd->span) {
+		struct sched_dl_entity *dl_se = &cpu_rq(i)->fair_server;
+
+		if (dl_server(dl_se) && cpu_active(i))
+			rd->dl_bw.total_bw += dl_se->dl_bw;
+	}
 }
 
 #endif /* CONFIG_SMP */
diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
index 9748a4c8d668..9c405f0e7b26 100644
--- a/kernel/sched/topology.c
+++ b/kernel/sched/topology.c
@@ -2721,9 +2721,11 @@ void partition_sched_domains_locked(int ndoms_new, cpumask_var_t doms_new[],
 
 				/*
 				 * This domain won't be destroyed and as such
-				 * its dl_bw->total_bw needs to be cleared.  It
-				 * will be recomputed in function
-				 * update_tasks_root_domain().
+				 * its dl_bw->total_bw needs to be cleared.
+				 * The tasks' contribution will then be
+				 * recomputed in dl_update_tasks_root_domain()
+				 * and the dl_servers' contribution in
+				 * dl_restore_server_root_domain().
 				 */
 				rd = cpu_rq(cpumask_any(doms_cur[i]))->rd;
 				dl_clear_root_domain(rd);
-- 
2.47.0
Re: [PATCH v2 1/2] sched/deadline: Restore dl_server bandwidth on non-destructive root domain changes
Posted by Phil Auld 1 week, 1 day ago
On Thu, Nov 14, 2024 at 02:28:09PM +0000 Juri Lelli wrote:
> [...]

Reviewed-by: Phil Auld <pauld@redhat.com>