[PATCH sched_ext/for-6.18] sched_ext: Add migration-disabled counter to error state dump

Andrea Righi posted 1 patch 1 week, 6 days ago
There is a newer version of this series
[PATCH sched_ext/for-6.18] sched_ext: Add migration-disabled counter to error state dump
Posted by Andrea Righi 1 week, 6 days ago
Include the task's migration-disabled counter when dumping task state
during an error exit.

This can help diagnose cases where tasks can get stuck, because they're
unable to migrate elsewhere.

Signed-off-by: Andrea Righi <arighi@nvidia.com>
---
 kernel/sched/ext.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
index 477eccf023388..e03bb51364661 100644
--- a/kernel/sched/ext.c
+++ b/kernel/sched/ext.c
@@ -4167,7 +4167,8 @@ static void scx_dump_task(struct seq_buf *s, struct scx_dump_ctx *dctx,
 		  p->scx.sticky_cpu, p->scx.holding_cpu, dsq_id_buf);
 	dump_line(s, "      dsq_vtime=%llu slice=%llu weight=%u",
 		  p->scx.dsq_vtime, p->scx.slice, p->scx.weight);
-	dump_line(s, "      cpus=%*pb", cpumask_pr_args(p->cpus_ptr));
+	dump_line(s, "      cpus=%*pb migration_disabled=%u", cpumask_pr_args(p->cpus_ptr),
+		  p->migration_disabled);
 
 	if (SCX_HAS_OP(sch, dump_task)) {
 		ops_dump_init(s, "    ");
-- 
2.51.0
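
For readers unfamiliar with the field being dumped: the value is a nesting counter rather than a flag, which is why the patch prints it with %u. The following is a minimal user-space model of that behaviour, assuming that each migrate_disable() call increments the counter and each matching migrate_enable() decrements it. The mock_* names and the struct are hypothetical and exist only for this illustration; this is not the kernel implementation.

```c
/* User-space model for illustration only; not kernel code. */
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the one task field the patch dumps. */
struct mock_task {
	unsigned int migration_disabled;	/* nesting depth, 0 = may migrate */
};

/* Models one migrate_disable() call: one more level of nesting. */
static void mock_migrate_disable(struct mock_task *p)
{
	p->migration_disabled++;
}

/* Models one migrate_enable() call: undo one level of nesting. */
static void mock_migrate_enable(struct mock_task *p)
{
	assert(p->migration_disabled > 0);	/* unbalanced enable is a bug */
	p->migration_disabled--;
}

/* The task may move to another CPU only when the counter is back at zero. */
static bool mock_can_migrate(const struct mock_task *p)
{
	return p->migration_disabled == 0;
}
```

In this model a task that shows a non-zero value in the dump is still inside at least one disable section, so it cannot be moved off its current CPU regardless of what its cpus= mask allows.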
Re: [PATCH sched_ext/for-6.18] sched_ext: Add migration-disabled counter to error state dump
Posted by Tejun Heo 1 week, 6 days ago
Hello,

On Thu, Sep 18, 2025 at 11:29:28AM +0200, Andrea Righi wrote:
> Include the task's migration-disabled counter when dumping task state
> during an error exit.
> 
> This can help diagnose cases where tasks can get stuck, because they're
> unable to migrate elsewhere.
> 
> Signed-off-by: Andrea Righi <arighi@nvidia.com>
> ---
>  kernel/sched/ext.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
> index 477eccf023388..e03bb51364661 100644
> --- a/kernel/sched/ext.c
> +++ b/kernel/sched/ext.c
> @@ -4167,7 +4167,8 @@ static void scx_dump_task(struct seq_buf *s, struct scx_dump_ctx *dctx,
>  		  p->scx.sticky_cpu, p->scx.holding_cpu, dsq_id_buf);
>  	dump_line(s, "      dsq_vtime=%llu slice=%llu weight=%u",
>  		  p->scx.dsq_vtime, p->scx.slice, p->scx.weight);
> -	dump_line(s, "      cpus=%*pb", cpumask_pr_args(p->cpus_ptr));
> +	dump_line(s, "      cpus=%*pb migration_disabled=%u", cpumask_pr_args(p->cpus_ptr),

Can you abbreviate this a bit? We have limited dump buffer size and real
estate. Wouldn't e.g. no_mig= achieve the same with a lot less space?

Thanks.

-- 
tejun
Re: [PATCH sched_ext/for-6.18] sched_ext: Add migration-disabled counter to error state dump
Posted by Andrea Righi 1 week, 6 days ago
On Thu, Sep 18, 2025 at 05:49:20AM -1000, Tejun Heo wrote:
> Hello,
> 
> On Thu, Sep 18, 2025 at 11:29:28AM +0200, Andrea Righi wrote:
> > Include the task's migration-disabled counter when dumping task state
> > during an error exit.
> > 
> > This can help diagnose cases where tasks can get stuck, because they're
> > unable to migrate elsewhere.
> > 
> > Signed-off-by: Andrea Righi <arighi@nvidia.com>
> > ---
> >  kernel/sched/ext.c | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> > 
> > diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
> > index 477eccf023388..e03bb51364661 100644
> > --- a/kernel/sched/ext.c
> > +++ b/kernel/sched/ext.c
> > @@ -4167,7 +4167,8 @@ static void scx_dump_task(struct seq_buf *s, struct scx_dump_ctx *dctx,
> >  		  p->scx.sticky_cpu, p->scx.holding_cpu, dsq_id_buf);
> >  	dump_line(s, "      dsq_vtime=%llu slice=%llu weight=%u",
> >  		  p->scx.dsq_vtime, p->scx.slice, p->scx.weight);
> > -	dump_line(s, "      cpus=%*pb", cpumask_pr_args(p->cpus_ptr));
> > +	dump_line(s, "      cpus=%*pb migration_disabled=%u", cpumask_pr_args(p->cpus_ptr),
> 
> Can you abbreviate this a bit? We have limited dump buffer size and real
> estate. Wouldn't e.g. no_mig= achieve the same with a lot less space?

Oh yes, good point. I think no_mig= is clear enough, or even nomig=. And if
it isn't clear, people can always check the code to see what it refers to.

Will send a v2 with this change.

Thanks,
-Andrea
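
To put a number on the space argument above, here is a small user-space sketch that measures how many bytes each candidate key adds to one dump line. It uses plain snprintf() instead of the kernel's dump_line(), and the helper names dump_field_len() and dump_field_savings() are made up for this example.

```c
/* User-space sketch, not kernel code: compares the key names discussed
 * in this thread by the bytes each one adds to a single dump line. */
#include <assert.h>
#include <stdio.h>

/* Format " <key>=<val>" into buf and return the number of bytes the
 * field needs, excluding the terminating NUL, as reported by snprintf(). */
static int dump_field_len(char *buf, size_t size, const char *key,
			  unsigned int val)
{
	assert(buf && key);
	return snprintf(buf, size, " %s=%u", key, val);
}

/* Bytes saved per dumped task when short_key replaces long_key. */
static int dump_field_savings(const char *long_key, const char *short_key,
			      unsigned int val)
{
	char buf[64];

	return dump_field_len(buf, sizeof(buf), long_key, val) -
	       dump_field_len(buf, sizeof(buf), short_key, val);
}
```

Calling dump_field_savings("migration_disabled", "no_mig", 1) yields 12, and the same call with "nomig" yields 13. The saving applies once per dumped task, so it scales with the number of tasks that end up in the dump.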