sched/deadline: Fix GRUB accounting

[PATCH 3/5] sched/deadline: Fix accounting after global limits change

Posted by Juri Lelli 7 months, 2 weeks ago

A global limits change (sched_rt_handler() logic) currently leaves stale
and/or incorrect values in variables related to accounting (e.g.
extra_bw).

Properly clean up per runqueue variables before implementing the change
and rebuild scheduling domains (so that accounting is also properly
restored) after such a change is complete.

Reported-by: Marcel Ziswiler <marcel.ziswiler@codethink.co.uk>
Signed-off-by: Juri Lelli <juri.lelli@redhat.com>
---
 kernel/sched/deadline.c | 4 +++-
 kernel/sched/rt.c       | 6 ++++++
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index 7a3b556d45a99..187f324565f92 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -3166,6 +3166,9 @@ void sched_dl_do_global(void)
 	if (global_rt_runtime() != RUNTIME_INF)
 		new_bw = to_ratio(global_rt_period(), global_rt_runtime());
 
+	for_each_possible_cpu(cpu)
+		init_dl_rq_bw_ratio(&cpu_rq(cpu)->dl);
+
 	for_each_possible_cpu(cpu) {
 		rcu_read_lock_sched();
 
@@ -3181,7 +3184,6 @@ void sched_dl_do_global(void)
 		raw_spin_unlock_irqrestore(&dl_b->lock, flags);
 
 		rcu_read_unlock_sched();
-		init_dl_rq_bw_ratio(&cpu_rq(cpu)->dl);
 	}
 }
 
diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
index 15d5855c542cb..be6e9bcbe82b6 100644
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -2886,6 +2886,12 @@ static int sched_rt_handler(const struct ctl_table *table, int write, void *buff
 	sched_domains_mutex_unlock();
 	mutex_unlock(&mutex);
 
+	/*
+	 * After changing maximum available bandwidth for DEADLINE, we need to
+	 * recompute per root domain and per cpus variables accordingly.
+	 */
+	rebuild_sched_domains();
+
 	return ret;
 }
 
-- 
2.49.0
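
To make the accounting concrete, here is a minimal user-space sketch of the fixed-point arithmetic involved. It is not the kernel code: `struct dl_rq_model`, `model_init_bw()` and `model_add_task()` are hypothetical names for illustration, and the sketch assumes the kernel's 20-bit bandwidth fixed point (`BW_SHIFT`) and that admitting a task lowers each runqueue's `extra_bw` by the task bandwidth divided by the number of CPUs in the root domain.

```c
#include <stdint.h>

/* Assumed to mirror the kernel's fixed-point bandwidth format. */
#define BW_SHIFT	20
#define BW_UNIT		(1ULL << BW_SHIFT)
#define RUNTIME_INF	UINT64_MAX

/* Bandwidth runtime/period as a fraction of BW_UNIT (rounded down). */
static uint64_t to_ratio(uint64_t period, uint64_t runtime)
{
	if (runtime == RUNTIME_INF)
		return BW_UNIT;
	if (period == 0)
		return 0;
	return (runtime << BW_SHIFT) / period;
}

/* Hypothetical stand-in for the per-runqueue fields discussed above. */
struct dl_rq_model {
	uint64_t max_bw;
	uint64_t extra_bw;
};

/* Reset both fields from the global limits, dropping any task shares. */
static void model_init_bw(struct dl_rq_model *rq, uint64_t period,
			  uint64_t runtime)
{
	rq->max_bw = rq->extra_bw = to_ratio(period, runtime);
}

/* Account one admitted task: its share is taken out of extra_bw. */
static void model_add_task(struct dl_rq_model *rq, uint64_t task_bw,
			   int cpus)
{
	rq->extra_bw -= task_bw / cpus;
}
```

In this model, a limits change corresponds to calling `model_init_bw()` with the new values and then calling `model_add_task()` again for every admitted task. The re-add step is the role the patch gives to `rebuild_sched_domains()`; without it, `extra_bw` would ignore bandwidth that is still allocated.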

Re: [PATCH 3/5] sched/deadline: Fix accounting after global limits change

Posted by Peter Zijlstra 6 months, 4 weeks ago

On Fri, Jun 27, 2025 at 01:51:16PM +0200, Juri Lelli wrote:

> diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
> index 15d5855c542cb..be6e9bcbe82b6 100644
> --- a/kernel/sched/rt.c
> +++ b/kernel/sched/rt.c
> @@ -2886,6 +2886,12 @@ static int sched_rt_handler(const struct ctl_table *table, int write, void *buff
>  	sched_domains_mutex_unlock();
>  	mutex_unlock(&mutex);
>  
> +	/*
> +	 * After changing maximum available bandwidth for DEADLINE, we need to
> +	 * recompute per root domain and per cpus variables accordingly.
> +	 */
> +	rebuild_sched_domains();
> +
>  	return ret;
>  }

So I'll merge these patches since correctness first etc. But the above
is quite terrible. It would be really good not to have to rebuild the
sched domains for every rt change. Surely we can iterate the existing
domains and update stuff?

Re: [PATCH 3/5] sched/deadline: Fix accounting after global limits change

Posted by Juri Lelli 6 months, 4 weeks ago

On 14/07/25 10:59, Peter Zijlstra wrote:
> On Fri, Jun 27, 2025 at 01:51:16PM +0200, Juri Lelli wrote:
> 
> > diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
> > index 15d5855c542cb..be6e9bcbe82b6 100644
> > --- a/kernel/sched/rt.c
> > +++ b/kernel/sched/rt.c
> > @@ -2886,6 +2886,12 @@ static int sched_rt_handler(const struct ctl_table *table, int write, void *buff
> >  	sched_domains_mutex_unlock();
> >  	mutex_unlock(&mutex);
> >  
> > +	/*
> > +	 * After changing maximum available bandwidth for DEADLINE, we need to
> > +	 * recompute per root domain and per cpus variables accordingly.
> > +	 */
> > +	rebuild_sched_domains();
> > +
> >  	return ret;
> >  }
> 
> So I'll merge these patches since correctness first etc. But the above

Thanks!

> is quite terrible. It would be really good not to have to rebuild the
> sched domains for every rt change. Surely we can iterate the existing
> domains and update stuff?

Yeah, I agree. I tried doing an update at first, but the locking
involved and the not-so-pleasant thing I could come up with made me
go for the big hammer, also because it should be a very infrequent
operation anyway.

But, I will try again somewhat soon.

Thanks,
Juri

[tip: sched/core] sched/deadline: Fix accounting after global limits change

Posted by tip-bot2 for Juri Lelli 6 months, 4 weeks ago

The following commit has been merged into the sched/core branch of tip:

Commit-ID:     440989c10f4e32620e9e2717ca52c3ed7ae11048
Gitweb:        https://git.kernel.org/tip/440989c10f4e32620e9e2717ca52c3ed7ae11048
Author:        Juri Lelli <juri.lelli@redhat.com>
AuthorDate:    Fri, 27 Jun 2025 13:51:16 +02:00
Committer:     Peter Zijlstra <peterz@infradead.org>
CommitterDate: Mon, 14 Jul 2025 10:59:33 +02:00

sched/deadline: Fix accounting after global limits change

A global limits change (sched_rt_handler() logic) currently leaves stale
and/or incorrect values in variables related to accounting (e.g.
extra_bw).

Properly clean up per runqueue variables before implementing the change
and rebuild scheduling domains (so that accounting is also properly
restored) after such a change is complete.

Reported-by: Marcel Ziswiler <marcel.ziswiler@codethink.co.uk>
Signed-off-by: Juri Lelli <juri.lelli@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Marcel Ziswiler <marcel.ziswiler@codethink.co.uk> # nuc & rock5b
Link: https://lore.kernel.org/r/20250627115118.438797-4-juri.lelli@redhat.com
---
 kernel/sched/deadline.c | 4 +++-
 kernel/sched/rt.c       | 6 ++++++
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index 0abffe3..9c7d952 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -3183,6 +3183,9 @@ void sched_dl_do_global(void)
 	if (global_rt_runtime() != RUNTIME_INF)
 		new_bw = to_ratio(global_rt_period(), global_rt_runtime());
 
+	for_each_possible_cpu(cpu)
+		init_dl_rq_bw_ratio(&cpu_rq(cpu)->dl);
+
 	for_each_possible_cpu(cpu) {
 		rcu_read_lock_sched();
 
@@ -3198,7 +3201,6 @@ void sched_dl_do_global(void)
 		raw_spin_unlock_irqrestore(&dl_b->lock, flags);
 
 		rcu_read_unlock_sched();
-		init_dl_rq_bw_ratio(&cpu_rq(cpu)->dl);
 	}
 }
 
diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
index 15d5855..be6e9bc 100644
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -2886,6 +2886,12 @@ undo:
 	sched_domains_mutex_unlock();
 	mutex_unlock(&mutex);
 
+	/*
+	 * After changing maximum available bandwidth for DEADLINE, we need to
+	 * recompute per root domain and per cpus variables accordingly.
+	 */
+	rebuild_sched_domains();
+
 	return ret;
 }

[PATCH 1/5] sched/deadline: Initialize dl_servers after SMP
[PATCH 2/5] sched/deadline: Reset extra_bw to max_bw when clearing root domains
[PATCH 3/5] sched/deadline: Fix accounting after global limits change
[PATCH 4/5] tools/sched: Add root_domains_dump.py which dumps root domains info
[PATCH 5/5] tools/sched: Add dl_bw_dump.py for printing bandwidth accounting info