[PATCH] sched/deadline: Reject debugfs dl_server writes for offline CPUs

Andrea Righi posted 1 patch 1 week, 6 days ago
[PATCH] sched/deadline: Reject debugfs dl_server writes for offline CPUs
Posted by Andrea Righi 1 week, 6 days ago
Writing runtime or period via the per-CPU dl_server debugfs files
(/sys/kernel/debug/sched/{fair,ext}_server/cpu*/{runtime,period}) on an
offline CPU can trigger two distinct kernel issues:

1) Divide-by-zero in dl_server_apply_params():

  Oops: divide error: 0000 [#1] SMP NOPTI
  RIP: 0010:dl_server_apply_params+0x239/0x3a0
  Call Trace:
   sched_server_write_common.isra.0+0x21a/0x3c0
   full_proxy_write+0x78/0xd0
   vfs_write+0xe7/0x6e0

  Both __dl_sub() and __dl_add() divide by cpus internally, which can be
  0 once the CPU has been removed from any active root-domain span (this
  has been latent since the debugfs interface was introduced).

2) WARN_ON_ONCE in dl_server_start():

  WARNING: kernel/sched/deadline.c:1805 at dl_server_start+0x232/0x270

  Commit ee6e44dfe6e5 ("sched/deadline: Stop dl_server before CPU goes
  offline") added this check to catch enqueueing the server on an
  offline rq.

There's no meaningful semantics for re-configuring the per-CPU dl_server
bandwidth while the CPU is offline, so simply reject the write with
-EBUSY so userspace gets a clear error.

Reported-by: Sashiko <sashiko-bot@kernel.org>
Closes: https://lore.kernel.org/all/20260526092228.3B6891F00A3A@smtp.kernel.org/
Fixes: d741f297bcea ("sched/fair: Fair server interface")
Signed-off-by: Andrea Righi <arighi@nvidia.com>
---
 kernel/sched/debug.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/kernel/sched/debug.c b/kernel/sched/debug.c
index ed3a0d65da0ca..e57ad8c78a60e 100644
--- a/kernel/sched/debug.c
+++ b/kernel/sched/debug.c
@@ -415,6 +415,9 @@ static ssize_t sched_server_write_common(struct file *filp, const char __user *u
 			return  -EINVAL;
 		}
 
+		if (!cpu_online(cpu_of(rq)))
+			return -EBUSY;
+
 		update_rq_clock(rq);
 		dl_server_stop(dl_se);
 		retval = dl_server_apply_params(dl_se, runtime, period, 0);

base-commit: 7b197f597bc895b01204d8389a4cf3b00780bd21
-- 
2.54.0
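
The failure mode and the guard described in the commit message can be modeled in userspace. The sketch below is a simplified stand-in, not the kernel code: struct model_dl_bw, model_dl_add() and model_server_write() are hypothetical names, and the arithmetic only mirrors the general shape of a bandwidth update that divides by the number of active CPUs. It shows why checking the online state first keeps the division from ever seeing cpus == 0.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the root-domain bandwidth accounting state. */
struct model_dl_bw {
	int64_t total_bw;	/* bandwidth currently allocated */
	int64_t extra_bw;	/* per-CPU share still available */
	int cpus;		/* active CPUs in the span; 0 once the CPU is gone */
};

/* Simplified model of an update that spreads bandwidth across cpus. */
static void model_dl_add(struct model_dl_bw *b, int64_t bw)
{
	b->total_bw += bw;
	b->extra_bw -= bw / b->cpus;	/* divide error if cpus == 0 */
}

/*
 * Model of the write path with the guard from the patch: reject the
 * request before any accounting is touched when the CPU is offline.
 */
static int model_server_write(struct model_dl_bw *b, bool cpu_is_online,
			      int64_t bw)
{
	if (!cpu_is_online)
		return -EBUSY;

	model_dl_add(b, bw);
	return 0;
}
```

With cpus == 0 and cpu_is_online == false, model_server_write() returns -EBUSY and leaves the accounting untouched; without the early return the same call would reach the division.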
Re: [PATCH] sched/deadline: Reject debugfs dl_server writes for offline CPUs
Posted by Juri Lelli 1 week, 6 days ago
Hi Andrea,

On 26/05/26 12:05, Andrea Righi wrote:
> Writing runtime or period via the per-CPU dl_server debugfs files
> (/sys/kernel/debug/sched/{fair,ext}_server/cpu*/{runtime,period}) on an
> offline CPU can trigger two distinct kernel issues:
> 
> 1) Divide-by-zero in dl_server_apply_params():
> 
>   Oops: divide error: 0000 [#1] SMP NOPTI
>   RIP: 0010:dl_server_apply_params+0x239/0x3a0
>   Call Trace:
>    sched_server_write_common.isra.0+0x21a/0x3c0
>    full_proxy_write+0x78/0xd0
>    vfs_write+0xe7/0x6e0
> 
>   Both __dl_sub() and __dl_add() divide by cpus internally, which can be
>   0 once the CPU has been removed from any active root-domain span (this
>   has been latent since the debugfs interface was introduced).
> 
> 2) WARN_ON_ONCE in dl_server_start():
> 
>   WARNING: kernel/sched/deadline.c:1805 at dl_server_start+0x232/0x270
> 
>   Commit ee6e44dfe6e5 ("sched/deadline: Stop dl_server before CPU goes
>   offline") added this check to catch enqueueing the server on an
>   offline rq.
> 
> There's no meaningful semantics for re-configuring the per-CPU dl_server
> bandwidth while the CPU is offline, so simply reject the write with
> -EBUSY so userspace gets a clear error.
> 
> Reported-by: Sashiko <sashiko-bot@kernel.org>
> Closes: https://lore.kernel.org/all/20260526092228.3B6891F00A3A@smtp.kernel.org/
> Fixes: d741f297bcea ("sched/fair: Fair server interface")
> Signed-off-by: Andrea Righi <arighi@nvidia.com>
> ---
>  kernel/sched/debug.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/kernel/sched/debug.c b/kernel/sched/debug.c
> index ed3a0d65da0ca..e57ad8c78a60e 100644
> --- a/kernel/sched/debug.c
> +++ b/kernel/sched/debug.c
> @@ -415,6 +415,9 @@ static ssize_t sched_server_write_common(struct file *filp, const char __user *u
>  			return  -EINVAL;
>  		}
>  
> +		if (!cpu_online(cpu_of(rq)))
> +			return -EBUSY;
> +
>  		update_rq_clock(rq);
>  		dl_server_stop(dl_se);
>  		retval = dl_server_apply_params(dl_se, runtime, period, 0);

I was looking at the Sashiko findings and wondered what to do about this as
well. I think what you are proposing should be fine, unless for some
reason one wants to tweak dl-server parameters before switching a CPU
on. But since hotplug is already a disruptive operation, I would say
requiring such a change to be made after the CPU is online should be ok
(and simpler to get right from a bandwidth accounting pov).

Reviewed-by: Juri Lelli <juri.lelli@redhat.com>

Thanks,
Juri
Re: [PATCH] sched/deadline: Reject debugfs dl_server writes for offline CPUs
Posted by Andrea Righi 1 week, 3 days ago
Hi Peter,

On Tue, May 26, 2026 at 02:07:53PM +0200, Juri Lelli wrote:
> Hi Andrea,
> 
> I was looking at the Sashiko findings and wondered what to do about this as
> well. I think what you are proposing should be fine, unless for some
> reason one wants to tweak dl-server parameters before switching a CPU
> on. But since hotplug is already a disruptive operation, I would say
> requiring such a change to be made after the CPU is online should be ok
> (and simpler to get right from a bandwidth accounting pov).
> 
> Reviewed-by: Juri Lelli <juri.lelli@redhat.com>

If this makes sense to you, could you add it to your queue:sched/core?

Otherwise it's possible to trigger the issues above by changing dl_server
bandwidth for offline CPUs.

Thanks,
-Andrea
Re: [PATCH] sched/deadline: Reject debugfs dl_server writes for offline CPUs
Posted by Peter Zijlstra 1 week, 3 days ago
On Fri, May 29, 2026 at 09:09:30AM +0200, Andrea Righi wrote:
> Hi Peter,
> 
> On Tue, May 26, 2026 at 02:07:53PM +0200, Juri Lelli wrote:
> > Hi Andrea,
> > 
> > I was looking at the Sashiko findings and wondered what to do about this as
> > well. I think what you are proposing should be fine, unless for some
> > reason one wants to tweak dl-server parameters before switching a CPU
> > on. But since hotplug is already a disruptive operation, I would say
> > requiring such a change to be made after the CPU is online should be ok
> > (and simpler to get right from a bandwidth accounting pov).
> > 
> > Reviewed-by: Juri Lelli <juri.lelli@redhat.com>
> 
> If this makes sense to you, could you add it to your queue:sched/core?
> 
> Otherwise it's possible to trigger the issues above by changing dl_server
> bandwidth for offline CPUs.

Right, so I had seen these patches fly by, but then I couldn't find it
in a hurry when I took these other patches :/.

It is a bit unfortunate, but oh well. I'm sure people will try and fix
it if they're too annoyed by this behaviour and we can revisit things
then.

So yes, let me go add this to the other two.
[tip: sched/core] sched/deadline: Reject debugfs dl_server writes for offline CPUs
Posted by tip-bot2 for Andrea Righi 1 week, 3 days ago
The following commit has been merged into the sched/core branch of tip:

Commit-ID:     4043f549841619a01999bf5d4e0b7931ef87f6cc
Gitweb:        https://git.kernel.org/tip/4043f549841619a01999bf5d4e0b7931ef87f6cc
Author:        Andrea Righi <arighi@nvidia.com>
AuthorDate:    Tue, 26 May 2026 12:05:02 +02:00
Committer:     Peter Zijlstra <peterz@infradead.org>
CommitterDate: Fri, 29 May 2026 12:43:15 +02:00

sched/deadline: Reject debugfs dl_server writes for offline CPUs

Writing runtime or period via the per-CPU dl_server debugfs files
(/sys/kernel/debug/sched/{fair,ext}_server/cpu*/{runtime,period}) on an
offline CPU can trigger two distinct kernel issues:

1) Divide-by-zero in dl_server_apply_params():

  Oops: divide error: 0000 [#1] SMP NOPTI
  RIP: 0010:dl_server_apply_params+0x239/0x3a0
  Call Trace:
   sched_server_write_common.isra.0+0x21a/0x3c0
   full_proxy_write+0x78/0xd0
   vfs_write+0xe7/0x6e0

  Both __dl_sub() and __dl_add() divide by cpus internally, which can be
  0 once the CPU has been removed from any active root-domain span (this
  has been latent since the debugfs interface was introduced).

2) WARN_ON_ONCE in dl_server_start():

  WARNING: kernel/sched/deadline.c:1805 at dl_server_start+0x232/0x270

  Commit ee6e44dfe6e5 ("sched/deadline: Stop dl_server before CPU goes
  offline") added this check to catch enqueueing the server on an
  offline rq.

There's no meaningful semantics for re-configuring the per-CPU dl_server
bandwidth while the CPU is offline, so simply reject the write with
-EBUSY so userspace gets a clear error.

Closes: https://lore.kernel.org/all/20260526092228.3B6891F00A3A@smtp.kernel.org/
Fixes: d741f297bcea ("sched/fair: Fair server interface")
Reported-by: Sashiko <sashiko-bot@kernel.org>
Signed-off-by: Andrea Righi <arighi@nvidia.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Juri Lelli <juri.lelli@redhat.com>
Tested-by: abaci-kreproducer <abaci@linux.alibaba.com>
Link: https://patch.msgid.link/20260526100502.575774-1-arighi@nvidia.com
---
 kernel/sched/debug.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/kernel/sched/debug.c b/kernel/sched/debug.c
index 4809e1d..5e09cf9 100644
--- a/kernel/sched/debug.c
+++ b/kernel/sched/debug.c
@@ -416,6 +416,9 @@ static ssize_t sched_server_write_common(struct file *filp, const char __user *u
 			return  -EINVAL;
 		}
 
+		if (!cpu_online(cpu_of(rq)))
+			return -EBUSY;
+
 		update_rq_clock(rq);
 		dl_server_stop(dl_se);
 		retval = dl_server_apply_params(dl_se, runtime, period, 0);
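
From userspace, the visible effect of this change is the error code returned by write(). The helper below is a minimal sketch for checking it; write_server_param() is a hypothetical name, and the path and value in the usage note are illustrative placeholders that follow the debugfs layout named in the commit message.

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Write a string to a file and report the outcome as 0 or a negative
 * errno value, so the caller can tell -EBUSY apart from other failures.
 */
static int write_server_param(const char *path, const char *value)
{
	ssize_t n;
	int err;
	int fd;

	fd = open(path, O_WRONLY);
	if (fd < 0)
		return -errno;

	n = write(fd, value, strlen(value));
	err = (n < 0) ? -errno : 0;	/* capture errno before close() */

	close(fd);
	return err;
}
```

For example, on a kernel carrying this commit, a call such as write_server_param("/sys/kernel/debug/sched/fair_server/cpu1/runtime", "50000000") would be expected to return -EBUSY while cpu1 is offline, assuming debugfs is mounted and the caller has permission to write there.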