[v1] sched_ext: Clarify CPU context for running/stopping callbacks

[PATCH] sched_ext: Clarify CPU context for running/stopping callbacks

Posted by Andrea Righi 9 months, 2 weeks ago

The ops.running() and ops.stopping() callbacks can be invoked from a CPU
other than the one the task is assigned to, particularly during affinity
changes, as both set_next_task_scx() and dequeue_task_scx() may run on
CPUs different from the task's target CPU.

This behavior can lead to confusion or incorrect assumptions if not
properly clarified, potentially resulting in bugs (see [1]).

Therefore, update the documentation to clarify this aspect and advise
users to use scx_bpf_task_cpu() to determine the actual CPU the task
will run on or was running on.

[1] https://github.com/sched-ext/scx/pull/1728

Cc: Jake Hillion <jake@hillion.co.uk>
Cc: Changwoo Min <changwoo@igalia.com>
Signed-off-by: Andrea Righi <arighi@nvidia.com>
---
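Not part of the patch: a minimal userspace sketch of the pitfall described
above, assuming a scheduler that keeps per-CPU accounting in ops.running().
The cut-down struct task_struct, the calling_cpu variable, the two mocked
helpers, and the example_running() and simulate_remote_running() names are
all hypothetical stand-ins so the snippet compiles outside a BPF environment.
Only the names scx_bpf_task_cpu() and bpf_get_smp_processor_id() mirror real
interfaces.

```c
#define NR_CPUS 4

/* Stand-in for the kernel's struct task_struct; only the CPU is modeled. */
struct task_struct {
	int cpu;	/* CPU the task is assigned to */
};

/* Mock state: the CPU that is executing the callback. */
static int calling_cpu;

/* Mock of the BPF helper that returns the CPU executing the program. */
static unsigned int bpf_get_smp_processor_id(void)
{
	return (unsigned int)calling_cpu;
}

/* Mock of the sched_ext kfunc that returns the CPU assigned to @p. */
static int scx_bpf_task_cpu(const struct task_struct *p)
{
	return p->cpu;
}

/* Per-CPU "running" counters, indexed in two different ways. */
static int nr_running_by_caller[NR_CPUS];	/* indexed by calling CPU */
static int nr_running_by_task[NR_CPUS];		/* indexed by task's CPU */

/* Hypothetical ops.running() body that updates both counters. */
static void example_running(struct task_struct *p)
{
	/* Misattributes the task if the callback runs on another CPU. */
	nr_running_by_caller[bpf_get_smp_processor_id()]++;

	/* Attributes the task to the CPU it is going to run on. */
	nr_running_by_task[scx_bpf_task_cpu(p)]++;
}

/* Simulate the callback for a task on CPU 2 being invoked from CPU 0. */
static void simulate_remote_running(void)
{
	struct task_struct p = { .cpu = 2 };

	calling_cpu = 0;
	example_running(&p);
}
```

In this mock, calling simulate_remote_running() once charges the task to
CPU 0 in nr_running_by_caller[] but to CPU 2 in nr_running_by_task[], which
is the mismatch the new comments warn about.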
 kernel/sched/ext.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
index a2380a6bba210..f146e678cc261 100644
--- a/kernel/sched/ext.c
+++ b/kernel/sched/ext.c
@@ -373,6 +373,10 @@ struct sched_ext_ops {
 	 * @running: A task is starting to run on its associated CPU
 	 * @p: task starting to run
 	 *
+	 * Note that this callback may be called from a CPU other than the
+	 * one the task is going to run on. Use scx_bpf_task_cpu(@p) to
+	 * determine the target CPU the task is going to use.
+	 *
 	 * See ->runnable() for explanation on the task state notifiers.
 	 */
 	void (*running)(struct task_struct *p);
@@ -382,6 +386,10 @@ struct sched_ext_ops {
 	 * @p: task stopping to run
 	 * @runnable: is task @p still runnable?
 	 *
+	 * Note that this callback may be called from a CPU other than the
+	 * one the task was running on. Use scx_bpf_task_cpu(@p) to
+	 * retrieve the CPU used by the task.
+	 *
 	 * See ->runnable() for explanation on the task state notifiers. If
 	 * !@runnable, ->quiescent() will be invoked after this operation
 	 * returns.
-- 
2.49.0

Re: [PATCH] sched_ext: Clarify CPU context for running/stopping callbacks

Posted by Tejun Heo 9 months, 2 weeks ago

On Wed, Apr 23, 2025 at 09:00:59PM +0200, Andrea Righi wrote:
> The ops.running() and ops.stopping() callbacks can be invoked from a CPU
> other than the one the task is assigned to, particularly during affinity
> changes, as both set_next_task_scx() and dequeue_task_scx() may run on
> CPUs different from the task's target CPU.
> 
> This behavior can lead to confusion or incorrect assumptions if not
> properly clarified, potentially resulting in bugs (see [1]).
> 
> Therefore, update the documentation to clarify this aspect and advise
> users to use scx_bpf_task_cpu() to determine the actual CPU the task
> will run on or was running on.
> 
> [1] https://github.com/sched-ext/scx/pull/1728
> 
> Cc: Jake Hillion <jake@hillion.co.uk>
> Cc: Changwoo Min <changwoo@igalia.com>
> Signed-off-by: Andrea Righi <arighi@nvidia.com>
> ---
>  kernel/sched/ext.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
> 
> diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
> index a2380a6bba210..f146e678cc261 100644
> --- a/kernel/sched/ext.c
> +++ b/kernel/sched/ext.c
> @@ -373,6 +373,10 @@ struct sched_ext_ops {
>  	 * @running: A task is starting to run on its associated CPU
>  	 * @p: task starting to run
>  	 *
> +	 * Note that this callback may be called from a CPU other than the
> +	 * one the task is going to run on. Use scx_bpf_task_cpu(@p) to
> +	 * determine the target CPU the task is going to use.

Can you briefly explain the scenario as part of the comment? Just a couple
of sentences can go a long way against future head scratches.

Thanks.

-- 
tejun