[v2] sched_ext: Clarify CPU context for running/stopping callbacks

[PATCH v2] sched_ext: Clarify CPU context for running/stopping callbacks

Posted by Andrea Righi 9 months, 2 weeks ago

The ops.running() and ops.stopping() callbacks can be invoked from a CPU
other than the one the task is assigned to, particularly when a task
property is changed, as both scx_next_task_scx() and dequeue_task_scx() may
run on CPUs different from the task's target CPU.

This behavior can lead to confusion or incorrect assumptions if not
properly clarified, potentially resulting in bugs (see [1]).

Therefore, update the documentation to clarify this aspect and advise
users to use scx_bpf_task_cpu() to determine the actual CPU the task
will run on or was running on.

[1] https://github.com/sched-ext/scx/pull/1728

Cc: Jake Hillion <jake@hillion.co.uk>
Cc: Changwoo Min <changwoo@igalia.com>
Signed-off-by: Andrea Righi <arighi@nvidia.com>
---
 kernel/sched/ext.c | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

Changes in v2:
 - clarify the scenario a bit more in the code comments
 - link to v1: https://lore.kernel.org/all/20250423190059.270236-1-arighi@nvidia.com/

diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
index ac79067dc87e6..a83232a032aa4 100644
--- a/kernel/sched/ext.c
+++ b/kernel/sched/ext.c
@@ -368,6 +368,15 @@ struct sched_ext_ops {
 	 * @running: A task is starting to run on its associated CPU
 	 * @p: task starting to run
 	 *
+	 * Note that this callback may be called from a CPU other than the
+	 * one the task is going to run on. This can happen when a task
+	 * property is changed (e.g., affinity), since scx_next_task_scx(),
+	 * which triggers this callback, may run on a CPU different from
+	 * the task's assigned CPU.
+	 *
+	 * Therefore, always use scx_bpf_task_cpu(@p) to determine the
+	 * target CPU the task is going to use.
+	 *
 	 * See ->runnable() for explanation on the task state notifiers.
 	 */
 	void (*running)(struct task_struct *p);
@@ -377,6 +386,15 @@ struct sched_ext_ops {
 	 * @p: task stopping to run
 	 * @runnable: is task @p still runnable?
 	 *
+	 * Note that this callback may be called from a CPU other than the
+	 * one the task was running on. This can happen when a task
+	 * property is changed (e.g., affinity), since dequeue_task_scx(),
+	 * which triggers this callback, may run on a CPU different from
+	 * the task's assigned CPU.
+	 *
+	 * Therefore, always use scx_bpf_task_cpu(@p) to retrieve the CPU
+	 * the task was running on.
+	 *
 	 * See ->runnable() for explanation on the task state notifiers. If
 	 * !@runnable, ->quiescent() will be invoked after this operation
 	 * returns.
-- 
2.49.0

Re: [PATCH v2] sched_ext: Clarify CPU context for running/stopping callbacks

Posted by Tejun Heo 9 months, 2 weeks ago

On Wed, Apr 23, 2025 at 11:02:05PM +0200, Andrea Righi wrote:
> The ops.running() and ops.stopping() callbacks can be invoked from a CPU
> other than the one the task is assigned to, particularly when a task
> property is changed, as both scx_next_task_scx() and dequeue_task_scx() may
> run on CPUs different from the task's target CPU.
> 
> This behavior can lead to confusion or incorrect assumptions if not
> properly clarified, potentially resulting in bugs (see [1]).
> 
> Therefore, update the documentation to clarify this aspect and advise
> users to use scx_bpf_task_cpu() to determine the actual CPU the task
> will run on or was running on.
> 
> [1] https://github.com/sched-ext/scx/pull/1728
> 
> Cc: Jake Hillion <jake@hillion.co.uk>
> Cc: Changwoo Min <changwoo@igalia.com>
> Signed-off-by: Andrea Righi <arighi@nvidia.com>

Applied to sched_ext/for-6.16.

Thanks.

-- 
tejun

Re: [PATCH v2] sched_ext: Clarify CPU context for running/stopping callbacks

Posted by Changwoo Min 9 months, 2 weeks ago

Hi Andrea,

On 4/24/25 06:02, Andrea Righi wrote:
> The ops.running() and ops.stopping() callbacks can be invoked from a CPU
> other than the one the task is assigned to, particularly when a task
> property is changed, as both scx_next_task_scx() and dequeue_task_scx() may
> run on CPUs different from the task's target CPU.

The same goes for ops.quiescent(), since it is also called from
dequeue_task_scx().

Reviewed-by: Changwoo Min <changwoo@igalia.com>

Regards,
Changwoo Min


Re: [PATCH v2] sched_ext: Clarify CPU context for running/stopping callbacks

Posted by Andrea Righi 9 months, 2 weeks ago

Hi Changwoo,

On Thu, Apr 24, 2025 at 08:06:47AM +0900, Changwoo Min wrote:
> Hi Andrea,
> 
> On 4/24/25 06:02, Andrea Righi wrote:
> > The ops.running() and ops.stopping() callbacks can be invoked from a CPU
> > other than the one the task is assigned to, particularly when a task
> > property is changed, as both scx_next_task_scx() and dequeue_task_scx() may
> > run on CPUs different from the task's target CPU.
> 
> The same goes for ops.quiescent(), since it is also called from
> dequeue_task_scx().

Yeah, I was a bit conflicted about mentioning this for ops.runnable() and
ops.quiescent() as well. In those cases it's more obvious that they're
executed outside the context of the "current CPU", because the task isn't
running on any CPU yet, or it's no longer running. In the end, I decided to
update only ops.running() and ops.stopping(), where it's less clear that
the task's CPU may not match the current CPU.
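
To make the difference concrete, here is a small user-space model (plain C,
not BPF) of the kind of accounting mistake the new comments are meant to
prevent. Everything in it is hypothetical and made up for illustration:
struct task, calling_cpu, task_cpu() and the model_*() functions are
stand-ins, not kernel or sched_ext APIs. In a real BPF scheduler,
calling_cpu would roughly correspond to bpf_get_smp_processor_id() and
task_cpu() to scx_bpf_task_cpu().

```c
#define NR_CPUS 4

/* Toy stand-in for struct task_struct: only what this model needs. */
struct task {
	int pid;
	int cpu;	/* CPU the task is assigned to */
};

/* CPU executing the callback (role of bpf_get_smp_processor_id()). */
static int calling_cpu;

/* Hypothetical stand-in playing the role of scx_bpf_task_cpu(p). */
static int task_cpu(const struct task *p)
{
	return p->cpu;
}

/* Two per-CPU "running tasks" counters, differing only in the index used. */
static int nr_running_wrong[NR_CPUS];	/* indexed by the calling CPU */
static int nr_running_right[NR_CPUS];	/* indexed by the task's CPU */

/* Model of ops.running(): account the task as running. */
static void model_running(const struct task *p)
{
	nr_running_wrong[calling_cpu]++;	/* assumes caller == task's CPU */
	nr_running_right[task_cpu(p)]++;	/* follows the advice in the patch */
}

/* Model of ops.stopping(): undo the accounting done in model_running(). */
static void model_stopping(const struct task *p)
{
	nr_running_wrong[calling_cpu]--;
	nr_running_right[task_cpu(p)]--;
}

/*
 * Scenario: the task starts running on CPU 2, then a property change
 * (e.g., affinity) issued from CPU 0 triggers the stopping callback
 * while CPU 0 is the one executing it.
 */
void simulate_remote_stopping(void)
{
	struct task t = { .pid = 1234, .cpu = 2 };

	calling_cpu = 2;	/* callback invoked on the task's own CPU */
	model_running(&t);

	calling_cpu = 0;	/* callback invoked from a different CPU */
	model_stopping(&t);
}
```

After simulate_remote_stopping(), the counter indexed by the task's CPU is
balanced again, while the one indexed by the calling CPU is left at 1 on
CPU 2 and -1 on CPU 0. That is the class of bug that goes away when the
callback asks for the task's CPU instead of assuming it runs there.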

Thanks for taking a look!
-Andrea


Re: [PATCH v2] sched_ext: Clarify CPU context for running/stopping callbacks

Posted by Changwoo Min 9 months, 2 weeks ago

Hi Andrea,

On 4/24/25 14:26, Andrea Righi wrote:
> Hi Changwoo,
> 
> On Thu, Apr 24, 2025 at 08:06:47AM +0900, Changwoo Min wrote:
>> Hi Andrea,
>>
>> On 4/24/25 06:02, Andrea Righi wrote:
>>> The ops.running() and ops.stopping() callbacks can be invoked from a CPU
>>> other than the one the task is assigned to, particularly when a task
>>> property is changed, as both scx_next_task_scx() and dequeue_task_scx() may
>>> run on CPUs different from the task's target CPU.
>>
>> The same goes for ops.quiescent(), since it is also called from
>> dequeue_task_scx().
> 
> Yeah, I was a bit conflicted about mentioning this for ops.runnable() and
> ops.quiescent() as well. In those cases it's more obvious that they're
> executed outside the context of the "current CPU", because the task isn't
> running on any CPU yet, or it's no longer running. In the end, I decided to
> update only ops.running() and ops.stopping(), where it's less clear that
> the task's CPU may not match the current CPU.

That makes sense. Thanks for the clarification!

-- Changwoo
