[PATCH V3] drm/sched: Fix fence reference count leak

Qianyi Liu posted 1 patch 11 months, 2 weeks ago
There is a newer version of this series
Posted by Qianyi Liu 11 months, 2 weeks ago
From: qianyi liu <liuqianyi125@gmail.com>

The last_scheduled fence leaked when an entity was being killed and
adding its callback failed.

Decrement the reference count of prev when dma_fence_add_callback()
fails, ensuring proper balance.

Cc: stable@vger.kernel.org
Fixes: 2fdb8a8f07c2 ("drm/scheduler: rework entity flush, kill and fini")
Signed-off-by: qianyi liu <liuqianyi125@gmail.com>
---
v2 -> v3: Rework commit message (Markus)
v1 -> v2: Added 'Fixes:' tag and clarified commit message (Philipp and Matthew)
---
 drivers/gpu/drm/scheduler/sched_entity.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
index 69bcf0e99d57..1c0c14bcf726 100644
--- a/drivers/gpu/drm/scheduler/sched_entity.c
+++ b/drivers/gpu/drm/scheduler/sched_entity.c
@@ -259,9 +259,12 @@ static void drm_sched_entity_kill(struct drm_sched_entity *entity)
 		struct drm_sched_fence *s_fence = job->s_fence;
 
 		dma_fence_get(&s_fence->finished);
-		if (!prev || dma_fence_add_callback(prev, &job->finish_cb,
-					   drm_sched_entity_kill_jobs_cb))
+		if (!prev ||
+		    dma_fence_add_callback(prev, &job->finish_cb,
+					   drm_sched_entity_kill_jobs_cb)) {
+			dma_fence_put(prev);
 			drm_sched_entity_kill_jobs_cb(NULL, &job->finish_cb);
+		}
 
 		prev = &s_fence->finished;
 	}
-- 
2.25.1
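
The imbalance described in the commit message can be reproduced in a small
userspace model. The sketch below is not kernel code: the toy_* names are
hypothetical stand-ins for struct dma_fence, dma_fence_get(), dma_fence_put(),
dma_fence_add_callback() and drm_sched_entity_kill_jobs_cb(), and it models only
one loop iteration in which the previous fence has already signaled, so adding
the callback fails with -ENOENT.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Toy stand-in for struct dma_fence: a counter and a signaled flag only. */
struct toy_fence {
	int refcount;
	int signaled;
};

static struct toy_fence *toy_fence_get(struct toy_fence *f)
{
	if (f)
		f->refcount++;
	return f;
}

/* Like dma_fence_put(), this tolerates a NULL fence. */
static void toy_fence_put(struct toy_fence *f)
{
	if (!f)
		return;
	assert(f->refcount > 0);
	f->refcount--;
}

/* Assumption: like the real call, fail with -ENOENT once signaled. */
static int toy_fence_add_callback(struct toy_fence *f)
{
	return f->signaled ? -ENOENT : 0;
}

/* The kill callback drops the reference on the fence it was armed on. */
static void toy_kill_jobs_cb(struct toy_fence *f)
{
	toy_fence_put(f);
}

/*
 * Model of one pass through the kill loop. Returns the number of
 * references left on the initial fence beyond the owner's own one,
 * i.e. the number of leaked references.
 */
static int toy_entity_kill(int with_fix)
{
	struct toy_fence last_scheduled = { .refcount = 1, .signaled = 1 };
	struct toy_fence finished = { .refcount = 1, .signaled = 0 };
	struct toy_fence *prev = toy_fence_get(&last_scheduled);

	toy_fence_get(&finished);
	if (!prev || toy_fence_add_callback(prev)) {
		/* Adding the callback failed; nobody else will drop prev. */
		if (with_fix)
			toy_fence_put(prev);
		toy_kill_jobs_cb(NULL);
	}
	prev = &finished;

	/* Final put after the loop only covers the newest prev. */
	toy_fence_put(prev);
	return last_scheduled.refcount - 1;
}
```

In this model, toy_entity_kill(0) leaves one extra reference on the initial
fence, while toy_entity_kill(1) leaves none, because the direct callback
invocation receives NULL and therefore cannot drop the reference itself.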
Re: [PATCH V3] drm/sched: Fix fence reference count leak
Posted by Philipp Stanner 11 months ago
Sorry for the delay

On Wed, 2025-02-26 at 17:05 +0800, Qianyi Liu wrote:
> From: qianyi liu <liuqianyi125@gmail.com>
> 
> The last_scheduled fence leaked when an entity was being killed and
> adding its callback failed.

s/leaked/leaks

s/was being/is being

s/its callback/the cleanup callback

s/failed/fails


> 
> Decrement the reference count of prev when dma_fence_add_callback()
> fails, ensuring proper balance.
> 
> Cc: stable@vger.kernel.org
> Fixes: 2fdb8a8f07c2 ("drm/scheduler: rework entity flush, kill and fini")
> Signed-off-by: qianyi liu <liuqianyi125@gmail.com>
> ---
> v2 -> v3: Rework commit message (Markus)
> v1 -> v2: Added 'Fixes:' tag and clarified commit message (Philipp and Matthew)
> ---
>  drivers/gpu/drm/scheduler/sched_entity.c | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
> index 69bcf0e99d57..1c0c14bcf726 100644
> --- a/drivers/gpu/drm/scheduler/sched_entity.c
> +++ b/drivers/gpu/drm/scheduler/sched_entity.c
> @@ -259,9 +259,12 @@ static void drm_sched_entity_kill(struct drm_sched_entity *entity)
>  		struct drm_sched_fence *s_fence = job->s_fence;
>  
>  		dma_fence_get(&s_fence->finished);
> -		if (!prev || dma_fence_add_callback(prev, &job->finish_cb,
> -					   drm_sched_entity_kill_jobs_cb))
> +		if (!prev ||
> +		    dma_fence_add_callback(prev, &job->finish_cb,
> +					   drm_sched_entity_kill_jobs_cb)) {
> +			dma_fence_put(prev);

Please add a little comment about the dma_fence_put()'s purpose. Sth
like "Adding callback above failed. dma_fence_put() checks for NULL."

Then we should be good I think

Thx

>  			drm_sched_entity_kill_jobs_cb(NULL, &job->finish_cb);
> +		}
>  
>  		prev = &s_fence->finished;
>  	}
[PATCH V3] drm/sched: Fix fence reference count leak
Posted by Qianyi Liu 11 months ago
> Sorry for the delay
>
> On Wed, 2025-02-26 at 17:05 +0800, Qianyi Liu wrote:
>> From: qianyi liu <liuqianyi125@gmail.com>
>>
>> The last_scheduled fence leaked when an entity was being killed and
>> adding its callback failed.
>
> s/leaked/leaks
>
> s/was being/is being
>
> s/its callback/the cleanup callback
>
> s/failed/fails

>>
>> Decrement the reference count of prev when dma_fence_add_callback()
>> fails, ensuring proper balance.
>>
>> Cc: stable@vger.kernel.org
>> Fixes: 2fdb8a8f07c2 ("drm/scheduler: rework entity flush, kill and fini")
>> Signed-off-by: qianyi liu <liuqianyi125@gmail.com>
>> ---
>> v2 -> v3: Rework commit message (Markus)
>> v1 -> v2: Added 'Fixes:' tag and clarified commit message (Philipp and Matthew)
>> ---
>>  drivers/gpu/drm/scheduler/sched_entity.c | 7 +++++--
>>  1 file changed, 5 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
>> index 69bcf0e99d57..1c0c14bcf726 100644
>> --- a/drivers/gpu/drm/scheduler/sched_entity.c
>> +++ b/drivers/gpu/drm/scheduler/sched_entity.c
>> @@ -259,9 +259,12 @@ static void drm_sched_entity_kill(struct drm_sched_entity *entity)
>>  		struct drm_sched_fence *s_fence = job->s_fence;
>>
>>  		dma_fence_get(&s_fence->finished);
>> -		if (!prev || dma_fence_add_callback(prev, &job->finish_cb,
>> -					   drm_sched_entity_kill_jobs_cb))
>> +		if (!prev ||
>> +		    dma_fence_add_callback(prev, &job->finish_cb,
>> +					   drm_sched_entity_kill_jobs_cb)) {
>> +			dma_fence_put(prev);
>
> Please add a little comment about the dma_fence_put()'s purpose. Sth
> like "Adding callback above failed. dma_fence_put() checks for NULL."
>
> Then we should be good I think
>
> Thx

OK, thank you for your detailed feedback.

Best regards.
QianYi.
Re: [PATCH V3] drm/sched: Fix fence reference count leak
Posted by Philipp Stanner 11 months, 2 weeks ago
On Wed, 2025-02-26 at 17:05 +0800, Qianyi Liu wrote:
> From: qianyi liu <liuqianyi125@gmail.com>
> 
> The last_scheduled fence leaked when an entity was being killed and
> adding its callback failed.
> 
> Decrement the reference count of prev when dma_fence_add_callback()
> fails, ensuring proper balance.
> 
> Cc: stable@vger.kernel.org
> Fixes: 2fdb8a8f07c2 ("drm/scheduler: rework entity flush, kill and fini")
> Signed-off-by: qianyi liu <liuqianyi125@gmail.com>

@Matt: since you in principle agreed with this patch, could you give it
an official RB?

I could then take it [but will probably rephrase some nits in the
commit message]


P.

> ---
> v2 -> v3: Rework commit message (Markus)
> v1 -> v2: Added 'Fixes:' tag and clarified commit message (Philipp and Matthew)
> ---
>  drivers/gpu/drm/scheduler/sched_entity.c | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
> index 69bcf0e99d57..1c0c14bcf726 100644
> --- a/drivers/gpu/drm/scheduler/sched_entity.c
> +++ b/drivers/gpu/drm/scheduler/sched_entity.c
> @@ -259,9 +259,12 @@ static void drm_sched_entity_kill(struct drm_sched_entity *entity)
>  		struct drm_sched_fence *s_fence = job->s_fence;
>  
>  		dma_fence_get(&s_fence->finished);
> -		if (!prev || dma_fence_add_callback(prev, &job->finish_cb,
> -					   drm_sched_entity_kill_jobs_cb))
> +		if (!prev ||
> +		    dma_fence_add_callback(prev, &job->finish_cb,
> +					   drm_sched_entity_kill_jobs_cb)) {
> +			dma_fence_put(prev);
>  			drm_sched_entity_kill_jobs_cb(NULL, &job->finish_cb);
> +		}
>  
>  		prev = &s_fence->finished;
>  	}
[PATCH V3] drm/sched: Fix fence reference count leak
Posted by Qianyi Liu 11 months ago
>> From: qianyi liu <liuqianyi125@gmail.com>
>>
>> The last_scheduled fence leaked when an entity was being killed and
>> adding its callback failed.
>>
>> Decrement the reference count of prev when dma_fence_add_callback()
>> fails, ensuring proper balance.
>>
>> Cc: stable@vger.kernel.org
>> Fixes: 2fdb8a8f07c2 ("drm/scheduler: rework entity flush, kill and fini")
>> Signed-off-by: qianyi liu <liuqianyi125@gmail.com>
>
> @Matt: since you in principle agreed with this patch, could you give it
> an official RB?
>
> I could then take it [but will probably rephrase some nits in the
> commit message]

Hello,

This patch was submitted a while back but hasn't seen any updates, so this is
just a gentle ping.

Best regards.
QianYi.