[PATCH] drm/sched: Document racy behavior of drm_sched_entity_push_job()
Posted by Philipp Stanner 2 months, 3 weeks ago
drm_sched_entity_push_job() uses the unlocked spsc_queue. It takes a
reference to that queue's tip at the start, and some time later removes
that entry from that list, without locking or protection against
preemption.

This is by design, since the spsc_queue demands single producer and
single consumer. It was, however, never documented.

Document that you must not call drm_sched_entity_push_job() in parallel
for the same entity.

Signed-off-by: Philipp Stanner <phasta@kernel.org>
---
 drivers/gpu/drm/scheduler/sched_entity.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
index 5a4697f636f2..b31e8d14aa20 100644
--- a/drivers/gpu/drm/scheduler/sched_entity.c
+++ b/drivers/gpu/drm/scheduler/sched_entity.c
@@ -562,6 +562,9 @@ void drm_sched_entity_select_rq(struct drm_sched_entity *entity)
  * drm_sched_entity_push_job - Submit a job to the entity's job queue
  * @sched_job: job to submit
  *
+ * It is illegal to call this function in parallel, at least for jobs belonging
+ * to the same entity. Doing so leads to undefined behavior.
+ *
  * Note: To guarantee that the order of insertion to queue matches the job's
  * fence sequence number this function should be called with drm_sched_job_arm()
  * under common lock for the struct drm_sched_entity that was set up for
-- 
2.49.0
Re: [PATCH] drm/sched: Document racy behavior of drm_sched_entity_push_job()
Posted by Tvrtko Ursulin 2 months, 3 weeks ago
On 12/11/2025 07:31, Philipp Stanner wrote:
> drm_sched_entity_push_job() uses the unlocked spsc_queue. It takes a
> reference to that queue's tip at the start, and some time later removes
> that entry from that list, without locking or protection against
> preemption.

I couldn't figure out what you mean by taking a reference to the queue's
tip at the start and later removing that entry. Are you talking about
the top level view from drm_sched_entity_push_job() or where exactly?
> This is by design, since the spsc_queue demands single producer and
> single consumer. It was, however, never documented.
> 
> Document that you must not call drm_sched_entity_push_job() in parallel
> for the same entity.
> 
> Signed-off-by: Philipp Stanner <phasta@kernel.org>
> ---
>   drivers/gpu/drm/scheduler/sched_entity.c | 3 +++
>   1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
> index 5a4697f636f2..b31e8d14aa20 100644
> --- a/drivers/gpu/drm/scheduler/sched_entity.c
> +++ b/drivers/gpu/drm/scheduler/sched_entity.c
> @@ -562,6 +562,9 @@ void drm_sched_entity_select_rq(struct drm_sched_entity *entity)
>    * drm_sched_entity_push_job - Submit a job to the entity's job queue
>    * @sched_job: job to submit
>    *
> + * It is illegal to call this function in parallel, at least for jobs belonging
> + * to the same entity. Doing so leads to undefined behavior.

One thing that is documented in the very next paragraph is that the 
design implies a lock held between arm and push. At least to ensure 
seqno order matches the queue order.

I did not get what other breakage you found, but I also previously did 
find something other than that. Hm.. if I could only remember what it 
was. Probably mine was something involving drm_sched_entity_select_rq(), 
drm_sched_entity_modify_sched() and (theoretical) multi-threaded 
userspace submit on the same entity. Luckily it seems no one does that.

The issue you found is separate and not theoretically fixed by this 
hypothetical common lock held over arm and push?

Regards,

Tvrtko

> + *
>    * Note: To guarantee that the order of insertion to queue matches the job's
>    * fence sequence number this function should be called with drm_sched_job_arm()
>    * under common lock for the struct drm_sched_entity that was set up for
Re: [PATCH] drm/sched: Document racy behavior of drm_sched_entity_push_job()
Posted by Philipp Stanner 2 months, 3 weeks ago
On Wed, 2025-11-12 at 09:42 +0000, Tvrtko Ursulin wrote:
> 
> On 12/11/2025 07:31, Philipp Stanner wrote:
> > drm_sched_entity_push_job() uses the unlocked spsc_queue. It takes a
> > reference to that queue's tip at the start, and some time later removes
> > that entry from that list, without locking or protection against
> > preemption.
> 
> I couldn't figure out what you mean by taking a reference to the queue's
> tip at the start and later removing that entry. Are you talking about
> the top level view from drm_sched_entity_push_job() or where exactly?
> > This is by design, since the spsc_queue demands single producer and
> > single consumer. It was, however, never documented.
> > 
> > Document that you must not call drm_sched_entity_push_job() in parallel
> > for the same entity.
> > 
> > Signed-off-by: Philipp Stanner <phasta@kernel.org>
> > ---
> >   drivers/gpu/drm/scheduler/sched_entity.c | 3 +++
> >   1 file changed, 3 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
> > index 5a4697f636f2..b31e8d14aa20 100644
> > --- a/drivers/gpu/drm/scheduler/sched_entity.c
> > +++ b/drivers/gpu/drm/scheduler/sched_entity.c
> > @@ -562,6 +562,9 @@ void drm_sched_entity_select_rq(struct drm_sched_entity *entity)
> >    * drm_sched_entity_push_job - Submit a job to the entity's job queue
> >    * @sched_job: job to submit
> >    *
> > + * It is illegal to call this function in parallel, at least for jobs belonging
> > + * to the same entity. Doing so leads to undefined behavior.
> 
> One thing that is documented in the very next paragraph is that the 
> design implies a lock held between arm and push. At least to ensure 
> seqno order matches the queue order.
> 
> I did not get what other breakage you found, but I also previously did 
> find something other than that. Hm.. if I could only remember what it
> was. Probably mine was something involving drm_sched_entity_select_rq(), 
> drm_sched_entity_modify_sched() and (theoretical) multi-threaded 
> userspace submit on the same entity. Luckily it seems no one does that.
> 
> The issue you found is separate and not theoretically fixed by this 
> hypothetical common lock held over arm and push?

Well, if 2 CPUs should ever run in parallel in
drm_sched_entity_push_job() the spsc_queue will just explode. Most
notably, one CPU could get the job at the tip (the oldest job), then be
preempted, and then another CPU takes the same job and pops it.

The API contract should be that the user doesn't have to know whether
there's a linked list or the magic spsc_queue.

The entire entity submission pattern is basically designed around
"single producer [per entity]", and that's not very well documented.

I don't think the common lock comment fully addresses that, because
it's merely about the sequence numbers. I think the common lock should
likely prevent any issues, but I feel more comfortable if the user gets
informed explicitly about the potential raciness of this function.
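
For illustration, the pattern we expect drivers to follow is roughly the
sketch below (hypothetical driver, the foo_* names and ctx->submit_lock
are made up, error handling omitted): arm and push happen back to back
under one lock per entity, so there is only ever a single producer.

	struct foo_context {			/* hypothetical */
		struct mutex submit_lock;
		struct drm_sched_entity entity;
	};

	/* Sketch of a driver's submission path. */
	static void foo_submit_job(struct foo_context *ctx,
				   struct drm_sched_job *job)
	{
		/* Serializes all producers for this context's entity. */
		mutex_lock(&ctx->submit_lock);

		/* Assigns the job's fence seqno. */
		drm_sched_job_arm(job);

		/* Hands the job to the entity's job queue (the spsc_queue). */
		drm_sched_entity_push_job(job);

		mutex_unlock(&ctx->submit_lock);
	}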

Maybe we can combine the two paragraphs.

P.
Re: [PATCH] drm/sched: Document racy behavior of drm_sched_entity_push_job()
Posted by Tvrtko Ursulin 2 months, 3 weeks ago
On 12/11/2025 12:15, Philipp Stanner wrote:
> On Wed, 2025-11-12 at 09:42 +0000, Tvrtko Ursulin wrote:
>>
>> On 12/11/2025 07:31, Philipp Stanner wrote:
>>> drm_sched_entity_push_job() uses the unlocked spsc_queue. It takes a
>>> reference to that queue's tip at the start, and some time later removes
>>> that entry from that list, without locking or protection against
>>> preemption.
>>
>> I couldn't figure out what you mean by taking a reference to the queue's
>> tip at the start and later removing that entry. Are you talking about
>> the top level view from drm_sched_entity_push_job() or where exactly?
>>> This is by design, since the spsc_queue demands single producer and
>>> single consumer. It was, however, never documented.
>>>
>>> Document that you must not call drm_sched_entity_push_job() in parallel
>>> for the same entity.
>>>
>>> Signed-off-by: Philipp Stanner <phasta@kernel.org>
>>> ---
>>>    drivers/gpu/drm/scheduler/sched_entity.c | 3 +++
>>>    1 file changed, 3 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
>>> index 5a4697f636f2..b31e8d14aa20 100644
>>> --- a/drivers/gpu/drm/scheduler/sched_entity.c
>>> +++ b/drivers/gpu/drm/scheduler/sched_entity.c
>>> @@ -562,6 +562,9 @@ void drm_sched_entity_select_rq(struct drm_sched_entity *entity)
>>>     * drm_sched_entity_push_job - Submit a job to the entity's job queue
>>>     * @sched_job: job to submit
>>>     *
>>> + * It is illegal to call this function in parallel, at least for jobs belonging
>>> + * to the same entity. Doing so leads to undefined behavior.
>>
>> One thing that is documented in the very next paragraph is that the
>> design implies a lock held between arm and push. At least to ensure
>> seqno order matches the queue order.
>>
>> I did not get what other breakage you found, but I also previously did
>> find something other than that. Hm.. if I could only remember what it
>> was. Probably mine was something involving drm_sched_entity_select_rq(),
>> drm_sched_entity_modify_sched() and (theoretical) multi-threaded
>> userspace submit on the same entity. Luckily it seems no one does that.
>>
>> The issue you found is separate and not theoretically fixed by this
>> hypothetical common lock held over arm and push?
> 
> Well, if 2 CPUs should ever run in parallel in
> drm_sched_entity_push_job() the spsc_queue will just explode. Most
> notably, one CPU could get the job at the tip (the oldest job), then be
> preempted, and then another CPU takes the same job and pops it.

Ah, you are talking about the dequeue/pop side. First paragraph of the 
commit message can be clarified in that case.

Pop is serialised by the worker so I don't think two simultaneous 
dequeues on the same scheduler are possible. How did you trigger it?
> The API contract should be that the user doesn't have to know whether
> there's a linked list or the magic spsc_queue.

Luckily we moved the peek/pop helpers to sched_internal.h.

Btw I thought you gave up on the scheduler and are working on the simple 
rust queue for firmware schedulers so how come you are finding subtle 
bugs in this code?

> The entire entity submission pattern is basically designed around
> "single producer [per entity]", and that's not very well documented.
> 
> I don't think the common lock comment fully addresses that, because
> it's merely about the sequence numbers. I think the common lock should
> likely prevent any issues, but I feel more comfortable if the user gets
> informed explicitly about the potential raciness of this function.
> 
> Maybe we can combine the two paragraphs.

Yeah, in some way.

Tvrtko

Re: [PATCH] drm/sched: Document racy behavior of drm_sched_entity_push_job()
Posted by Philipp Stanner 2 months, 3 weeks ago
On Wed, 2025-11-12 at 13:13 +0000, Tvrtko Ursulin wrote:
> 
> On 12/11/2025 12:15, Philipp Stanner wrote:
> > On Wed, 2025-11-12 at 09:42 +0000, Tvrtko Ursulin wrote:
> > > 
> > > On 12/11/2025 07:31, Philipp Stanner wrote:
> > > > drm_sched_entity_push_job() uses the unlocked spsc_queue. It takes a
> > > > reference to that queue's tip at the start, and some time later removes
> > > > that entry from that list, without locking or protection against
> > > > preemption.
> > > 
> > > I couldn't figure out what you mean by taking a reference to the queue's
> > > tip at the start and later removing that entry. Are you talking about
> > > the top level view from drm_sched_entity_push_job() or where exactly?
> > > > This is by design, since the spsc_queue demands single producer and
> > > > single consumer. It was, however, never documented.
> > > > 
> > > > Document that you must not call drm_sched_entity_push_job() in parallel
> > > > for the same entity.
> > > > 
> > > > Signed-off-by: Philipp Stanner <phasta@kernel.org>
> > > > ---
> > > >    drivers/gpu/drm/scheduler/sched_entity.c | 3 +++
> > > >    1 file changed, 3 insertions(+)
> > > > 
> > > > diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
> > > > index 5a4697f636f2..b31e8d14aa20 100644
> > > > --- a/drivers/gpu/drm/scheduler/sched_entity.c
> > > > +++ b/drivers/gpu/drm/scheduler/sched_entity.c
> > > > @@ -562,6 +562,9 @@ void drm_sched_entity_select_rq(struct drm_sched_entity *entity)
> > > >     * drm_sched_entity_push_job - Submit a job to the entity's job queue
> > > >     * @sched_job: job to submit
> > > >     *
> > > > + * It is illegal to call this function in parallel, at least for jobs belonging
> > > > + * to the same entity. Doing so leads to undefined behavior.
> > > 
> > > One thing that is documented in the very next paragraph is that the
> > > design implies a lock held between arm and push. At least to ensure
> > > seqno order matches the queue order.
> > > 
> > > I did not get what other breakage you found, but I also previously did
> > > find something other than that. Hm.. if I could only remember what it
> > > was. Probably mine was something involving drm_sched_entity_select_rq(),
> > > drm_sched_entity_modify_sched() and (theoretical) multi-threaded
> > > userspace submit on the same entity. Luckily it seems no one does that.
> > > 
> > > The issue you found is separate and not theoretically fixed by this
> > > hypothetical common lock held over arm and push?
> > 
> > Well, if 2 CPUs should ever run in parallel in
> > drm_sched_entity_push_job() the spsc_queue will just explode. Most
> > notably, one CPU could get the job at the tip (the oldest job), then be
> > preempted, and then another CPU takes the same job and pops it.
> 
> Ah, you are talking about the dequeue/pop side. First paragraph of the 
> commit message can be clarified in that case.
> 
> Pop is serialised by the worker so I don't think two simultaneous 
> dequeues on the same scheduler are possible. How did you trigger it?
> > The API contract should be that the user doesn't have to know whether
> > there's a linked list or the magic spsc_queue.
> 
> Luckily we moved the peek/pop helpers to sched_internal.h.
> 
> Btw I thought you gave up on the scheduler and are working on the simple 
> rust queue for firmware schedulers so how come you are finding subtle
> bugs in this code?

I'm a maintainer still, for a variety of reasons. That we work on
something for FW with a clean locking design doesn't mean we don't care
about existing infrastructure anymore. And I firmly believe in
documentation. People should know the rules from the API documentation,
not from looking into the implementation.

It's kind of crucial design info that you must only push into
entities sequentially, no?

This doesn't fix a bug, obviously, since it's just a line of docu.
Regardless, pushing into the spsc queue in parallel would explode.
Emphasizing that costs us nothing.
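
To illustrate, something like this (hypothetical driver again, made-up
names) is exactly what the new sentence forbids, even though nothing in
the API stops you from writing it:

	/* Broken: two threads may call this for the same entity. */
	static void foo_submit_job_unlocked(struct foo_context *ctx,
					    struct drm_sched_job *job)
	{
		drm_sched_job_arm(job);
		/*
		 * No common lock here: another thread pushing into the same
		 * entity can interleave with us between arm and push and
		 * inside drm_sched_entity_push_job() itself, which is the
		 * undefined behavior the new comment warns about.
		 */
		drm_sched_entity_push_job(job);
	}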



P.
Re: [PATCH] drm/sched: Document racy behavior of drm_sched_entity_push_job()
Posted by Tvrtko Ursulin 2 months, 3 weeks ago
On 12/11/2025 13:31, Philipp Stanner wrote:
> On Wed, 2025-11-12 at 13:13 +0000, Tvrtko Ursulin wrote:
>>
>> On 12/11/2025 12:15, Philipp Stanner wrote:
>>> On Wed, 2025-11-12 at 09:42 +0000, Tvrtko Ursulin wrote:
>>>>
>>>> On 12/11/2025 07:31, Philipp Stanner wrote:
>>>>> drm_sched_entity_push_job() uses the unlocked spsc_queue. It takes a
>>>>> reference to that queue's tip at the start, and some time later removes
>>>>> that entry from that list, without locking or protection against
>>>>> preemption.
>>>>
>>>> I couldn't figure out what you mean by taking a reference to the queue's
>>>> tip at the start and later removing that entry. Are you talking about
>>>> the top level view from drm_sched_entity_push_job() or where exactly?
>>>>> This is by design, since the spsc_queue demands single producer and
>>>>> single consumer. It was, however, never documented.
>>>>>
>>>>> Document that you must not call drm_sched_entity_push_job() in parallel
>>>>> for the same entity.
>>>>>
>>>>> Signed-off-by: Philipp Stanner <phasta@kernel.org>
>>>>> ---
>>>>>     drivers/gpu/drm/scheduler/sched_entity.c | 3 +++
>>>>>     1 file changed, 3 insertions(+)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
>>>>> index 5a4697f636f2..b31e8d14aa20 100644
>>>>> --- a/drivers/gpu/drm/scheduler/sched_entity.c
>>>>> +++ b/drivers/gpu/drm/scheduler/sched_entity.c
>>>>> @@ -562,6 +562,9 @@ void drm_sched_entity_select_rq(struct drm_sched_entity *entity)
>>>>>      * drm_sched_entity_push_job - Submit a job to the entity's job queue
>>>>>      * @sched_job: job to submit
>>>>>      *
>>>>> + * It is illegal to call this function in parallel, at least for jobs belonging
>>>>> + * to the same entity. Doing so leads to undefined behavior.
>>>>
>>>> One thing that is documented in the very next paragraph is that the
>>>> design implies a lock held between arm and push. At least to ensure
>>>> seqno order matches the queue order.
>>>>
>>>> I did not get what other breakage you found, but I also previously did
>>>> find something other than that. Hm.. if I could only remember what it
>>>> was. Probably mine was something involving drm_sched_entity_select_rq(),
>>>> drm_sched_entity_modify_sched() and (theoretical) multi-threaded
>>>> userspace submit on the same entity. Luckily it seems no one does that.
>>>>
>>>> The issue you found is separate and not theoretically fixed by this
>>>> hypothetical common lock held over arm and push?
>>>
>>> Well, if 2 CPUs should ever run in parallel in
>>> drm_sched_entity_push_job() the spsc_queue will just explode. Most
>>> notably, one CPU could get the job at the tip (the oldest job), then be
>>> preempted, and then another CPU takes the same job and pops it.
>>
>> Ah, you are talking about the dequeue/pop side. First paragraph of the
>> commit message can be clarified in that case.
>>
>> Pop is serialised by the worker so I don't think two simultaneous
>> dequeues on the same scheduler are possible. How did you trigger it?
>>> The API contract should be that the user doesn't have to know whether
>>> there's a linked list or the magic spsc_queue.
>>
>> Luckily we moved the peek/pop helpers to sched_internal.h.
>>
>> Btw I thought you gave up on the scheduler and are working on the simple
>> rust queue for firmware schedulers so how come you are finding subtle
>> bugs in this code?
> 
> I'm a maintainer still, for a variety of reasons. That we work on
> something for FW with a clean locking design doesn't mean we don't care
> about existing infrastructure anymore. And I firmly believe in
> documentation. People should know the rules from the API documentation,
> not from looking into the implementation.
> 
> It's kind of crucial design info that you must only push into
> entities sequentially, no?
> 
> This doesn't fix a bug, obviously, since it's just a line of docu.
> Regardless, pushing into the spsc queue in parallel would explode.
> Emphasizing that costs us nothing.

It wasn't an argument. I was just curious if this is something you 
managed to hit and how.

Regards,

Tvrtko