[PATCH] drm/panfrost: Fix scheduler workqueue bug

Philipp Stanner posted 1 patch 3 months ago
drivers/gpu/drm/panfrost/panfrost_job.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
[PATCH] drm/panfrost: Fix scheduler workqueue bug
Posted by Philipp Stanner 3 months ago
When the GPU scheduler was ported to using a struct for its
initialization parameters, it was overlooked that panfrost creates a
distinct workqueue for timeout handling.

The init args struct is filled in before this workqueue is allocated,
so its timeout_wq member is still NULL when the struct is passed to the
scheduler, which then uses the system_wq for timeout handling.

Set the workqueue in the init args struct after it has been allocated.

Cc: stable@vger.kernel.org # 6.15+
Fixes: 796a9f55a8d1 ("drm/sched: Use struct for drm_sched_init() params")
Reported-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Closes: https://lore.kernel.org/dri-devel/b5d0921c-7cbf-4d55-aa47-c35cd7861c02@igalia.com/
Signed-off-by: Philipp Stanner <phasta@kernel.org>
---
 drivers/gpu/drm/panfrost/panfrost_job.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
index 5657106c2f7d..15e2d505550f 100644
--- a/drivers/gpu/drm/panfrost/panfrost_job.c
+++ b/drivers/gpu/drm/panfrost/panfrost_job.c
@@ -841,7 +841,6 @@ int panfrost_job_init(struct panfrost_device *pfdev)
 		.num_rqs = DRM_SCHED_PRIORITY_COUNT,
 		.credit_limit = 2,
 		.timeout = msecs_to_jiffies(JOB_TIMEOUT_MS),
-		.timeout_wq = pfdev->reset.wq,
 		.name = "pan_js",
 		.dev = pfdev->dev,
 	};
@@ -879,6 +878,7 @@ int panfrost_job_init(struct panfrost_device *pfdev)
 	pfdev->reset.wq = alloc_ordered_workqueue("panfrost-reset", 0);
 	if (!pfdev->reset.wq)
 		return -ENOMEM;
+	args.timeout_wq = pfdev->reset.wq;
 
 	for (j = 0; j < NUM_JOB_SLOTS; j++) {
 		js->queue[j].fence_context = dma_fence_context_alloc(1);
-- 
2.49.0
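
For readers outside the driver, the ordering problem can be reproduced
in plain user-space C. The following is only a minimal sketch: every
type and function name in it (struct workqueue, struct sched_init_args,
sched_pick_timeout_wq(), init_buggy(), init_fixed()) is a hypothetical
stand-in and not the real DRM scheduler API. The NULL fallback to a
system-wide queue is modelled on the behaviour described in the commit
message above.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; not the real DRM API. */
struct workqueue {
	const char *name;
};

struct sched_init_args {
	struct workqueue *timeout_wq;	/* NULL means "use the default" */
};

struct device {
	struct workqueue *reset_wq;	/* allocated late during init */
};

static struct workqueue system_wq = { "system_wq" };
static struct workqueue reset_wq_storage = { "panfrost-reset" };

/* Models the fallback described in the commit message. */
static struct workqueue *
sched_pick_timeout_wq(const struct sched_init_args *args)
{
	return args->timeout_wq ? args->timeout_wq : &system_wq;
}

/* Pre-fix ordering: the initializer copies reset_wq while it is NULL. */
static struct workqueue *init_buggy(struct device *dev)
{
	struct sched_init_args args = {
		.timeout_wq = dev->reset_wq,	/* snapshot of NULL */
	};

	dev->reset_wq = &reset_wq_storage;	/* args is not updated */
	return sched_pick_timeout_wq(&args);
}

/* Post-fix ordering: assign the member once the queue exists. */
static struct workqueue *init_fixed(struct device *dev)
{
	struct sched_init_args args = { 0 };

	dev->reset_wq = &reset_wq_storage;
	args.timeout_wq = dev->reset_wq;
	return sched_pick_timeout_wq(&args);
}
```

Called on a device whose reset_wq starts out as NULL, init_buggy()
ends up with the system queue, while init_fixed() ends up with the
dedicated one. A designated initializer copies the pointer value at
that point in the function and does not track later assignments to
the source field.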
Re: [PATCH] drm/panfrost: Fix scheduler workqueue bug
Posted by Steven Price 3 months ago
On 09/07/2025 11:29, Philipp Stanner wrote:
> When the GPU scheduler was ported to using a struct for its
> initialization parameters, it was overlooked that panfrost creates a
> distinct workqueue for timeout handling.
> 
> The pointer to this new workqueue is not initialized to the struct,
> resulting in NULL being passed to the scheduler, which then uses the
> system_wq for timeout handling.
> 
> Set the correct workqueue to the init args struct.
> 
> Cc: stable@vger.kernel.org # 6.15+
> Fixes: 796a9f55a8d1 ("drm/sched: Use struct for drm_sched_init() params")
> Reported-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
> Closes: https://lore.kernel.org/dri-devel/b5d0921c-7cbf-4d55-aa47-c35cd7861c02@igalia.com/
> Signed-off-by: Philipp Stanner <phasta@kernel.org>

Reviewed-by: Steven Price <steven.price@arm.com>

> ---
>  drivers/gpu/drm/panfrost/panfrost_job.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
> index 5657106c2f7d..15e2d505550f 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_job.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_job.c
> @@ -841,7 +841,6 @@ int panfrost_job_init(struct panfrost_device *pfdev)
>  		.num_rqs = DRM_SCHED_PRIORITY_COUNT,
>  		.credit_limit = 2,
>  		.timeout = msecs_to_jiffies(JOB_TIMEOUT_MS),
> -		.timeout_wq = pfdev->reset.wq,
>  		.name = "pan_js",
>  		.dev = pfdev->dev,
>  	};
> @@ -879,6 +878,7 @@ int panfrost_job_init(struct panfrost_device *pfdev)
>  	pfdev->reset.wq = alloc_ordered_workqueue("panfrost-reset", 0);
>  	if (!pfdev->reset.wq)
>  		return -ENOMEM;
> +	args.timeout_wq = pfdev->reset.wq;
>  
>  	for (j = 0; j < NUM_JOB_SLOTS; j++) {
>  		js->queue[j].fence_context = dma_fence_context_alloc(1);
Re: [PATCH] drm/panfrost: Fix scheduler workqueue bug
Posted by Tvrtko Ursulin 3 months ago
On 09/07/2025 11:29, Philipp Stanner wrote:
> When the GPU scheduler was ported to using a struct for its
> initialization parameters, it was overlooked that panfrost creates a
> distinct workqueue for timeout handling.
> 
> The pointer to this new workqueue is not initialized to the struct,
> resulting in NULL being passed to the scheduler, which then uses the
> system_wq for timeout handling.
> 
> Set the correct workqueue to the init args struct.
> 
> Cc: stable@vger.kernel.org # 6.15+
> Fixes: 796a9f55a8d1 ("drm/sched: Use struct for drm_sched_init() params")
> Reported-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
> Closes: https://lore.kernel.org/dri-devel/b5d0921c-7cbf-4d55-aa47-c35cd7861c02@igalia.com/
> Signed-off-by: Philipp Stanner <phasta@kernel.org>
> ---
>   drivers/gpu/drm/panfrost/panfrost_job.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
> index 5657106c2f7d..15e2d505550f 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_job.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_job.c
> @@ -841,7 +841,6 @@ int panfrost_job_init(struct panfrost_device *pfdev)
>   		.num_rqs = DRM_SCHED_PRIORITY_COUNT,
>   		.credit_limit = 2,
>   		.timeout = msecs_to_jiffies(JOB_TIMEOUT_MS),
> -		.timeout_wq = pfdev->reset.wq,
>   		.name = "pan_js",
>   		.dev = pfdev->dev,
>   	};
> @@ -879,6 +878,7 @@ int panfrost_job_init(struct panfrost_device *pfdev)
>   	pfdev->reset.wq = alloc_ordered_workqueue("panfrost-reset", 0);
>   	if (!pfdev->reset.wq)
>   		return -ENOMEM;
> +	args.timeout_wq = pfdev->reset.wq;
>   
>   	for (j = 0; j < NUM_JOB_SLOTS; j++) {
>   		js->queue[j].fence_context = dma_fence_context_alloc(1);

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>

Regards,

Tvrtko