[PATCH v4 10/15] sched: Add task enqueue/dequeue trace points

Gabriele Monaco posted 15 patches 3 weeks, 2 days ago
There is a newer version of this series
Posted by Gabriele Monaco 3 weeks, 2 days ago
From: Nam Cao <namcao@linutronix.de>

Add trace points to enqueue_task() and dequeue_task(). Dequeues taken on
the sleep path (DEQUEUE_SLEEP) are traced from __block_task() instead.

Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Nam Cao <namcao@linutronix.de>
Signed-off-by: Gabriele Monaco <gmonaco@redhat.com>
---
 include/trace/events/sched.h | 13 +++++++++++++
 kernel/sched/core.c          |  9 ++++++++-
 kernel/sched/sched.h         |  2 ++
 3 files changed, 23 insertions(+), 1 deletion(-)

diff --git a/include/trace/events/sched.h b/include/trace/events/sched.h
index 366b2e8ec40c..f4e1d3554e3e 100644
--- a/include/trace/events/sched.h
+++ b/include/trace/events/sched.h
@@ -912,6 +912,19 @@ DECLARE_TRACE(sched_dl_server_stop,
 	TP_PROTO(struct sched_dl_entity *dl_se, int cpu),
 	TP_ARGS(dl_se, cpu));
 
+/*
+ * The two trace points below may not work as expected for fair tasks due
+ * to delayed dequeue. See:
+ * https://lore.kernel.org/lkml/179674c6-f82a-4718-ace2-67b5e672fdee@amd.com/
+ */
+DECLARE_TRACE(sched_enqueue,
+	TP_PROTO(struct task_struct *tsk, int cpu),
+	TP_ARGS(tsk, cpu));
+
+DECLARE_TRACE(sched_dequeue,
+	TP_PROTO(struct task_struct *tsk, int cpu),
+	TP_ARGS(tsk, cpu));
+
 #endif /* _TRACE_SCHED_H */
 
 /* This part must be outside protection */
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index f6293fa02fb7..8958c013fb2c 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2087,6 +2087,8 @@ unsigned long get_wchan(struct task_struct *p)
 
 void enqueue_task(struct rq *rq, struct task_struct *p, int flags)
 {
+	trace_sched_enqueue_tp(p, rq->cpu);
+
 	if (!(flags & ENQUEUE_NOCLOCK))
 		update_rq_clock(rq);
 
@@ -2114,6 +2116,8 @@ void enqueue_task(struct rq *rq, struct task_struct *p, int flags)
  */
 inline bool dequeue_task(struct rq *rq, struct task_struct *p, int flags)
 {
+	int ret;
+
 	if (sched_core_enabled(rq))
 		sched_core_dequeue(rq, p, flags);
 
@@ -2131,7 +2135,10 @@ inline bool dequeue_task(struct rq *rq, struct task_struct *p, int flags)
 	 */
 	uclamp_rq_dec(rq, p);
 	rq->queue_mask |= p->sched_class->queue_mask;
-	return p->sched_class->dequeue_task(rq, p, flags);
+	ret = p->sched_class->dequeue_task(rq, p, flags);
+	if (trace_sched_dequeue_tp_enabled() && !(flags & DEQUEUE_SLEEP))
+		trace_sched_dequeue_tp(p, rq->cpu);
+	return ret;
 }
 
 void activate_task(struct rq *rq, struct task_struct *p, int flags)
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index e885a935b716..8465472b40fa 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -2918,6 +2918,8 @@ static inline void sub_nr_running(struct rq *rq, unsigned count)
 
 static inline void __block_task(struct rq *rq, struct task_struct *p)
 {
+	trace_sched_dequeue_tp(p, rq->cpu);
+
 	if (p->sched_contributes_to_load)
 		rq->nr_uninterruptible++;
 
-- 
2.52.0
Re: [PATCH v4 10/15] sched: Add task enqueue/dequeue trace points
Posted by K Prateek Nayak 3 weeks, 2 days ago
Hello Gabriele, Nam,

On 1/16/2026 6:09 PM, Gabriele Monaco wrote:
> @@ -2087,6 +2087,8 @@ unsigned long get_wchan(struct task_struct *p)
>  
>  void enqueue_task(struct rq *rq, struct task_struct *p, int flags)
>  {

For delayed tasks, I think you'll need:

	if (trace_sched_enqueue_tp_enabled() && !(flags & ENQUEUE_DELAYED))

> +		trace_sched_enqueue_tp(p, rq->cpu);
> +
>  	if (!(flags & ENQUEUE_NOCLOCK))
>  		update_rq_clock(rq);

Since delayed tasks haven't hit __block_task(), they are essentially
still enqueued. Peter should be able to confirm. Other than that,
the placements of the tracepoints look good now. Feel free to include:

Reviewed-by: K Prateek Nayak <kprateek.nayak@amd.com>

-- 
Thanks and Regards,
Prateek
Re: [PATCH v4 10/15] sched: Add task enqueue/dequeue trace points
Posted by Gabriele Monaco 3 weeks, 2 days ago
On Fri, 2026-01-16 at 20:20 +0530, K Prateek Nayak wrote:
> Hello Gabriele, Nam,
> 
> On 1/16/2026 6:09 PM, Gabriele Monaco wrote:
> > @@ -2087,6 +2087,8 @@ unsigned long get_wchan(struct task_struct *p)
> >  
> >  void enqueue_task(struct rq *rq, struct task_struct *p, int flags)
> >  {
> 
> For delayed tasks, I think you'll need:
> 
> 	if (trace_sched_enqueue_tp_enabled() && !(flags & ENQUEUE_DELAYED))
> 
> > +		trace_sched_enqueue_tp(p, rq->cpu);
> > +
> >  	if (!(flags & ENQUEUE_NOCLOCK))
> >  		update_rq_clock(rq);
> 
> Since delayed tasks haven't hit __block_task(), they are essentially
> still enqueued. Peter should be able to confirm. Other than that,
> the placements of the tracepoints look good now. Feel free to include:
> 
> Reviewed-by: K Prateek Nayak <kprateek.nayak@amd.com>

Mmh, I was sure I had missed something after the comments on Nam's patch.
Thanks for the heads-up and the review!

Going to try your suggestion.

Thanks,
Gabriele