[PATCH] sched/core: Report failed rt migrations to non-root cgroup without rt bandwidth under RT_GROUP_SCHED

Aaron Tomlin posted 1 patch 2 weeks, 2 days ago
kernel/sched/core.c | 12 +++++++++++-
1 file changed, 11 insertions(+), 1 deletion(-)
[PATCH] sched/core: Report failed rt migrations to non-root cgroup without rt bandwidth under RT_GROUP_SCHED
Posted by Aaron Tomlin 2 weeks, 2 days ago
Following on from commit c7461cca91 ("cgroup, docs: Be explicit about
independence of RT_GROUP_SCHED and non-cpu controllers"), print an
explicit error message that tells the user why the task migration
failed. User space then has a clear and actionable reason for the
failure, which helps with debugging.

Signed-off-by: Aaron Tomlin <atomlin@atomlin.com>
---
 kernel/sched/core.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 62b3416f5e43..b9d7689d21e4 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -9185,8 +9185,18 @@ static int cpu_cgroup_can_attach(struct cgroup_taskset *tset)
 		goto scx_check;
 
 	cgroup_taskset_for_each(task, css, tset) {
-		if (!sched_rt_can_attach(css_tg(css), task))
+		struct task_group *tg = css_tg(css);
+
+		if (!sched_rt_can_attach(tg, task)) {
+			if (tg != &root_task_group) {
+				pr_err_ratelimited("cgroup: cannot attach "
+						"cpu controller. Task "
+						"%s:%d is not in the root "
+						"cgroup.\n", task->comm,
+						task_pid_nr(task));
+			}
 			return -EINVAL;
+		}
 	}
 scx_check:
 #endif /* CONFIG_RT_GROUP_SCHED */
-- 
2.49.0
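
For reference, the -EINVAL path above is taken when sched_rt_can_attach()
refuses the task. A minimal user-space model of that condition might look
like the sketch below. It assumes the check amounts to "RT group scheduling
is active, the task is realtime, and the target group has zero RT runtime".
The struct and function names are hypothetical stand-ins, not kernel
symbols, and the in-kernel test may differ in detail (for example for
priority-boosted tasks).

```c
#include <assert.h>
#include <sched.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's task_group. */
struct model_group {
	uint64_t rt_runtime_ns;	/* 0 means no RT bandwidth assigned */
};

/* Hypothetical stand-in for the kernel's task_struct. */
struct model_task {
	int policy;		/* SCHED_OTHER, SCHED_FIFO, SCHED_RR, ... */
};

/* Assumption: a task counts as realtime if its policy is FIFO or RR. */
static bool model_is_rt(const struct model_task *t)
{
	assert(t != NULL);
	return t->policy == SCHED_FIFO || t->policy == SCHED_RR;
}

/* Returns true when the modelled attach would be allowed. */
static bool model_rt_can_attach(bool rt_group_sched_on,
				const struct model_group *g,
				const struct model_task *t)
{
	assert(g != NULL);

	/* A realtime task cannot run in a group without RT runtime. */
	if (rt_group_sched_on && model_is_rt(t) && g->rt_runtime_ns == 0)
		return false;
	return true;
}
```

In this model a SCHED_FIFO task is rejected only by a group whose runtime
is zero, and nothing is rejected when RT group scheduling is off.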
Re: [PATCH] sched/core: Report failed rt migrations to non-root cgroup without rt bandwidth under RT_GROUP_SCHED
Posted by Michal Koutný 2 weeks, 2 days ago
On Mon, Sep 15, 2025 at 09:11:46PM -0400, Aaron Tomlin <atomlin@atomlin.com> wrote:
> Following on from commit c7461cca91 ("cgroup, docs: Be explicit about
> independence of RT_GROUP_SCHED and non-cpu controllers"), this patch
> introduces an explicit error message that informs the user why the
> task migration failed. Now user-mode has a clear and actionable reason
> for the failure, greatly assisting with debugging.

The action in this case should be to assign a non-trivial quota to the
target task_group (or un-rt the task), no?

What setup do you envision for this message? (CONFIG_RT_GROUP_SCHED and
cgroup v2?)

Thanks,
Michal
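
The "un-rt the task" option mentioned above can be sketched from user space
with the POSIX scheduling calls. This is only an illustration, assuming the
caller has permission to change the target's policy; demote_if_rt() is a
hypothetical helper name, and on Linux the pid argument addresses a single
thread, so every realtime thread of a process would need the same treatment.

```c
#include <assert.h>
#include <sched.h>
#include <sys/types.h>

/* Fallback in case the libc headers do not expose the flag. */
#ifndef SCHED_RESET_ON_FORK
#define SCHED_RESET_ON_FORK 0x40000000
#endif

/*
 * Hypothetical helper: switch @pid (0 means the calling thread) to
 * SCHED_OTHER if it currently runs under SCHED_FIFO or SCHED_RR.
 * Returns 1 if demoted, 0 if no change was needed, -1 with errno set
 * on failure.
 */
static int demote_if_rt(pid_t pid)
{
	struct sched_param param = { .sched_priority = 0 };
	int policy;

	assert(pid >= 0);

	policy = sched_getscheduler(pid);
	if (policy < 0)
		return -1;

	/* The reset-on-fork flag may be OR-ed into the returned policy. */
	policy &= ~SCHED_RESET_ON_FORK;
	if (policy != SCHED_FIFO && policy != SCHED_RR)
		return 0;

	/* SCHED_OTHER requires a static priority of 0. */
	if (sched_setscheduler(pid, SCHED_OTHER, &param) < 0)
		return -1;
	return 1;
}
```

Whether demoting a realtime task is acceptable depends entirely on the
workload; the sketch only shows the mechanics.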
Re: [PATCH] sched/core: Report failed rt migrations to non-root cgroup without rt bandwidth under RT_GROUP_SCHED
Posted by Aaron Tomlin 2 weeks, 1 day ago
On Tue, Sep 16, 2025 at 11:09:45AM +0200, Michal Koutný wrote:
> The action in this case should be to assign non-trivial quota to target
> task_group (or un-rt the task), no?

Hi Michal,

I would prefer not to take any action. However, is there a strong
preference to demote the rt task so the CPU controller can be enabled in
this context?

> What setup do you envision for this message? (CONFIG_RT_GROUP_SCHED and
> cgroup v2?)

Yes. At this point, RT_GROUP_SCHED would be enabled at runtime.


Kind regards,
-- 
Aaron Tomlin
Re: [PATCH] sched/core: Report failed rt migrations to non-root cgroup without rt bandwidth under RT_GROUP_SCHED
Posted by Michal Koutný 2 weeks, 1 day ago
On Tue, Sep 16, 2025 at 10:47:56AM -0400, Aaron Tomlin <atomlin@atomlin.com> wrote:
> I would prefer not to take any action. However, is there a strong
> preference to demote the rt task so the CPU controller can be enabled in
> this context?

Maybe more context clarifies. The preference is not to end up in this
corner.

> > What setup do you envision for this message? (CONFIG_RT_GROUP_SCHED and
> > cgroup v2?)
> 
> Yes. At this point, RT_GROUP_SCHED would be enabled at runtime.

I wonder whether this combination comes from a distro or is your custom
setup.

I assume the latter (but I'm curious if there's such a distro). In that
case you likely want to have the cpu controller on a v1 hierarchy. v1
usage is what the boot-time switch is currently useful for. v2 de facto
doesn't support RT group scheduling as of today [1], so v2 systems should
simply unset CONFIG_RT_GROUP_SCHED to avoid issues enabling the cpu
controller.

HTH,
Michal

[1] This [2] didn't make it into the tree, thus I'd be reserved about
    the message printed in your patch too.
[2] https://lore.kernel.org/all/20250310170442.504716-11-mkoutny@suse.com/
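
Since the advice above hinges on whether the system runs cgroup v1 or v2,
here is a small sketch of one way to tell them apart from user space: ask
statfs() for the filesystem type of the cgroup mount point and compare it
with CGROUP2_SUPER_MAGIC. The helper name is hypothetical, and the usual
mount point /sys/fs/cgroup is an assumption about the system layout.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/vfs.h>
#include <linux/magic.h>

/* Fallback for older kernel headers that lack the cgroup2 magic. */
#ifndef CGROUP2_SUPER_MAGIC
#define CGROUP2_SUPER_MAGIC 0x63677270
#endif

/*
 * Hypothetical helper: returns 1 if @path lives on a cgroup2
 * filesystem, 0 if it lives on something else, -1 if statfs() fails.
 */
static int is_cgroup2_mount(const char *path)
{
	struct statfs fs;

	assert(path != NULL);

	if (statfs(path, &fs) < 0)
		return -1;
	return fs.f_type == CGROUP2_SUPER_MAGIC ? 1 : 0;
}
```

A result of 1 for /sys/fs/cgroup would indicate a unified (v2) hierarchy.
On a v1 or hybrid layout that path is typically a tmpfs, so the helper
returns 0 there.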
Re: [PATCH] sched/core: Report failed rt migrations to non-root cgroup without rt bandwidth under RT_GROUP_SCHED
Posted by Aaron Tomlin 2 weeks, 1 day ago
On Tue, Sep 16, 2025 at 05:44:14PM +0200, Michal Koutný wrote:
> On Tue, Sep 16, 2025 at 10:47:56AM -0400, Aaron Tomlin <atomlin@atomlin.com> wrote:
> > I would prefer not to take any action. However, is there a strong
> > preference to demote the rt task so the CPU controller can be enabled in
> > this context?
> 
> Maybe more context clarifies. The preference is not to end up in this
> corner.

I see. If I understand correctly, in this context, are you suggesting
modifying the identified task's scheduling policy (e.g. to SCHED_NORMAL) so
the CPU controller can be enabled?

> I wonder does this combination come from a distro or is it your custom
> setup? I assume the latter (but I'm curious if there's such a distro), in
> that case you likely want to have the cpu controller on v1 hierarchy. v1
> usage is what the boottime switch is currently useful for, v2 de facto
> doesn't support RT group scheduling as of today [1], v2 systems should
> simply unset CONFIG_RT_GROUP_SCHED to avoid issues enabling cpu
> controller.

Yes. Under Red Hat Enterprise Linux (RHEL) 9, the Kconfig option
CONFIG_RT_GROUP_SCHED and cgroup version 2 are both enabled by default.
That said, upstream can disable SCHED_FIFO/SCHED_RR group scheduling at
boot time via rt_group_sched=.

> HTH,
> Michal
> 
> [1] This [2] didn't make it into tree thus I'd be reserved with the
>     message printed in your patch too.
> [2] https://lore.kernel.org/all/20250310170442.504716-11-mkoutny@suse.com/

I see. Understood.


Kind regards,
-- 
Aaron Tomlin
Re: [PATCH] sched/core: Report failed rt migrations to non-root cgroup without rt bandwidth under RT_GROUP_SCHED
Posted by Michal Koutný 2 weeks, 1 day ago
On Tue, Sep 16, 2025 at 12:57:04PM -0400, Aaron Tomlin <atomlin@atomlin.com> wrote:
> I see. If I understand correctly, in this context, are you suggesting to
> modify the identified task's scheduling class (e.g. to SCHED_NORMAL) so the
> CPU controller can be enabled?

So I'd primarily suggest that you disable CONFIG_RT_GROUP_SCHED. It seems
you ran into a situation where you have a kernel with
CONFIG_RT_GROUP_SCHED but userspace that'd like to use cgroup v2, and
these two are known not to get along well. (Alternatively, you may switch
userspace to v1 if you really really need RT groups.)

Michal
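
For the v1 alternative, giving a group a non-zero RT quota means writing a
value into that group's cpu.rt_runtime_us file. The sketch below shows the
write itself. The helper name and any concrete path (for instance something
under /sys/fs/cgroup/cpu/) are assumptions for illustration, and the kernel
may still reject a value that does not fit within the parent group's
budget.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: write @runtime_us, followed by a newline, into
 * @file, which is expected to be a cgroup v1 cpu.rt_runtime_us file.
 * Returns 0 on success or a negative errno value.
 */
static int write_rt_runtime_us(const char *file, long runtime_us)
{
	char buf[32];
	ssize_t n;
	int fd, len, ret = 0;

	assert(file != NULL);

	len = snprintf(buf, sizeof(buf), "%ld\n", runtime_us);

	fd = open(file, O_WRONLY);
	if (fd < 0)
		return -errno;

	n = write(fd, buf, (size_t)len);
	if (n < 0)
		ret = -errno;		/* e.g. -EINVAL if the kernel refuses */
	else if (n != len)
		ret = -EIO;		/* short write, treat as an I/O error */

	close(fd);
	return ret;
}
```

A caller would pass the group's cpu.rt_runtime_us path and a runtime in
microseconds, then retry the migration if the write succeeded.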
Re: [PATCH] sched/core: Report failed rt migrations to non-root cgroup without rt bandwidth under RT_GROUP_SCHED
Posted by Aaron Tomlin 1 week, 5 days ago
On Wed, Sep 17, 2025 at 11:37:37AM +0200, Michal Koutný wrote:
> So I'd primarily suggest that you disable CONFIG_RT_GROUP_SCHED. It seems
> you ran into a situation where you have a kernel with
> CONFIG_RT_GROUP_SCHED but userspace that'd like to use cgroup v2, and
> these two are known not to get along well. (Alternatively, you may switch
> userspace to v1 if you really really need RT groups.)

Thanks Michal! I initially thought you were suggesting to "normalize" the
RT task in this context (e.g. a simplified version of
__sched_setscheduler()) to allow forward progress so the CPU controller can
be attached. Please ignore.


Kind regards,
-- 
Aaron Tomlin
Re: [PATCH] sched/core: Report failed rt migrations to non-root cgroup without rt bandwidth under RT_GROUP_SCHED
Posted by Phil Auld 2 weeks, 1 day ago
On Tue, Sep 16, 2025 at 12:57:04PM -0400 Aaron Tomlin wrote:
> On Tue, Sep 16, 2025 at 05:44:14PM +0200, Michal Koutný wrote:
> > On Tue, Sep 16, 2025 at 10:47:56AM -0400, Aaron Tomlin <atomlin@atomlin.com> wrote:
> > > I would prefer not to take any action. However, is there a strong
> > > preference to demote the rt task so the CPU controller can be enabled in
> > > this context?
> > 
> > Maybe more context clarifies. The preference is not to end up in this
> > corner.
> 
> I see. If I understand correctly, in this context, are you suggesting to
> modify the identified task's scheduling class (e.g. to SCHED_NORMAL) so the
> CPU controller can be enabled?
> 
> > I wonder does this combination come from a distro or is it your custom
> > setup? I assume the latter (but I'm curious if there's such a distro), in
> > that case you likely want to have the cpu controller on v1 hierarchy. v1
> > usage is what the boottime switch is currently useful for, v2 de facto
> > doesn't support RT group scheduling as of today [1], v2 systems should
> > simply unset CONFIG_RT_GROUP_SCHED to avoid issues enabling cpu
> > controller.
> 
> Yes. Under Red Hat Enterprise Linux (RHEL) 9 Kconfig option
> CONFIG_RT_GROUP_SCHED and Cgroup version 2 is enabled by default.
> Albeit, upstream can disable SCHED_FIFO/SCHED_RT group scheduling at
> boot-time via rt_group_sched=.
>

CONFIG_RT_GROUP_SCHED is not enabled in RHEL 9 and later. It was on in
RHEL 8, which also defaulted to cgroup v1.


Thanks,
Phil



Re: [PATCH] sched/core: Report failed rt migrations to non-root cgroup without rt bandwidth under RT_GROUP_SCHED
Posted by Aaron Tomlin 2 weeks, 1 day ago
On Tue, Sep 16, 2025 at 01:07:35PM -0400, Phil Auld wrote:
>  CONFIG_RT_GROUP_SCHED is not enabled in RHEL9 and later.  It was on in
>  RHEL8 which also defaulted to cgroupv1.

Hi Phil,

Sorry about that! It is only enabled on RHEL 8. I can see under
redhat/configs/kernel-5.14.0-x86_64.config it is indeed disabled.


Kind regards,
-- 
Aaron Tomlin
Re: [PATCH] sched/core: Report failed rt migrations to non-root cgroup without rt bandwidth under RT_GROUP_SCHED
Posted by Phil Auld 2 weeks, 1 day ago
On Tue, Sep 16, 2025 at 01:23:36PM -0400 Aaron Tomlin wrote:
> On Tue, Sep 16, 2025 at 01:07:35PM -0400, Phil Auld wrote:
> >  CONFIG_RT_GROUP_SCHED is not enabled in RHEL9 and later.  It was on in
> >  RHEL8 which also defaulted to cgroupv1.
> 
> Hi Phil,
> 
> Sorry about that! It is only enabled on RHEL 8. I can see under
> redhat/configs/kernel-5.14.0-x86_64.config it is indeed disabled.
> 
>

Hi Aaron, no need to apologize. I just wanted to clarify :)


Cheers,
Phil

