[PATCH] accel/amdxdna: Fix dma_fence leak when job is canceled

Lizhi Hou posted 1 patch 1 month, 1 week ago
drivers/accel/amdxdna/aie2_ctx.c    | 1 -
drivers/accel/amdxdna/amdxdna_ctx.c | 1 +
2 files changed, 1 insertion(+), 1 deletion(-)
Posted by Lizhi Hou 1 month, 1 week ago
Currently, dma_fence_put(job->fence) is called in the job notification
callback. However, if a job is canceled, the notification callback is never
invoked, leading to a memory leak. Move dma_fence_put(job->fence)
to the job cleanup function to ensure the fence is always released.

Fixes: aac243092b70 ("accel/amdxdna: Add command execution")
Signed-off-by: Lizhi Hou <lizhi.hou@amd.com>
---
 drivers/accel/amdxdna/aie2_ctx.c    | 1 -
 drivers/accel/amdxdna/amdxdna_ctx.c | 1 +
 2 files changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/accel/amdxdna/aie2_ctx.c b/drivers/accel/amdxdna/aie2_ctx.c
index 2b51c5211c2d..e9dd9e03ef07 100644
--- a/drivers/accel/amdxdna/aie2_ctx.c
+++ b/drivers/accel/amdxdna/aie2_ctx.c
@@ -189,7 +189,6 @@ aie2_sched_notify(struct amdxdna_sched_job *job)
 
 	up(&job->hwctx->priv->job_sem);
 	job->job_done = true;
-	dma_fence_put(fence);
 	mmput_async(job->mm);
 	aie2_job_put(job);
 }
diff --git a/drivers/accel/amdxdna/amdxdna_ctx.c b/drivers/accel/amdxdna/amdxdna_ctx.c
index 878cc955f56d..d17aef89a0ad 100644
--- a/drivers/accel/amdxdna/amdxdna_ctx.c
+++ b/drivers/accel/amdxdna/amdxdna_ctx.c
@@ -422,6 +422,7 @@ void amdxdna_sched_job_cleanup(struct amdxdna_sched_job *job)
 	trace_amdxdna_debug_point(job->hwctx->name, job->seq, "job release");
 	amdxdna_arg_bos_put(job);
 	amdxdna_gem_put_obj(job->cmd_bo);
+	dma_fence_put(job->fence);
 }
 
 int amdxdna_cmd_submit(struct amdxdna_client *client,
-- 
2.34.1
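
For readers following the thread, the ownership change can be sketched in plain userspace C. This is only an illustration under stated assumptions, not the driver's code: `toy_fence`, `toy_job` and every `toy_*` helper are hypothetical stand-ins for `struct dma_fence`, `struct amdxdna_sched_job`, `aie2_sched_notify()` and `amdxdna_sched_job_cleanup()`, and the reference count is a bare integer with no locking.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct dma_fence: only a reference count. */
struct toy_fence {
	int refcount;
	bool *freed;	/* set to true when the last reference is dropped */
};

static struct toy_fence *toy_fence_create(bool *freed)
{
	struct toy_fence *f = malloc(sizeof(*f));

	if (!f)
		return NULL;
	f->refcount = 1;	/* reference held by the job */
	f->freed = freed;
	*freed = false;
	return f;
}

/* Drop one reference and free the fence when none remain. */
static void toy_fence_put(struct toy_fence *f)
{
	if (!f)
		return;
	assert(f->refcount > 0);
	if (--f->refcount == 0) {
		*f->freed = true;
		free(f);
	}
}

/* Hypothetical stand-in for the scheduler job. */
struct toy_job {
	struct toy_fence *fence;
	bool job_done;
};

/*
 * Completion callback: runs only when the job finishes normally.
 * Mirroring the patch, it no longer drops the fence reference.
 */
static void toy_job_notify(struct toy_job *job)
{
	job->job_done = true;
}

/*
 * Cleanup: runs for every job, completed or canceled, so the fence
 * reference is dropped here.
 */
static void toy_job_cleanup(struct toy_job *job)
{
	toy_fence_put(job->fence);
	job->fence = NULL;
}

/* Run one job lifetime; return true if the fence was freed by the end. */
static bool toy_run_job(bool canceled)
{
	bool freed = false;
	struct toy_job job = {
		.fence = toy_fence_create(&freed),
		.job_done = false,
	};

	if (!job.fence)
		return false;
	if (!canceled)
		toy_job_notify(&job);	/* skipped for a canceled job */
	toy_job_cleanup(&job);
	return freed;
}
```

In this sketch `toy_run_job(true)` models a canceled job: the notify step is skipped, yet the fence is still freed because the put lives in the cleanup path. Had the put stayed in `toy_job_notify()`, the canceled case would return with the fence still allocated.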
Re: [PATCH] accel/amdxdna: Fix dma_fence leak when job is canceled
Posted by Mario Limonciello (AMD) (kernel.org) 1 month, 1 week ago

On 11/5/2025 1:41 PM, Lizhi Hou wrote:
> Currently, dma_fence_put(job->fence) is called in the job notification
> callback. However, if a job is canceled, the notification callback is never
> invoked, leading to a memory leak. Move dma_fence_put(job->fence)
> to the job cleanup function to ensure the fence is always released.
> 
> Fixes: aac243092b70 ("accel/amdxdna: Add command execution")
> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com>
Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
> ---
Re: [PATCH] accel/amdxdna: Fix dma_fence leak when job is canceled
Posted by Lizhi Hou 1 month, 1 week ago
Applied to drm-misc-next.

On 11/5/25 11:43, Mario Limonciello (AMD) (kernel.org) wrote:
>
>
> On 11/5/2025 1:41 PM, Lizhi Hou wrote:
>> Currently, dma_fence_put(job->fence) is called in the job notification
>> callback. However, if a job is canceled, the notification callback is never
>> invoked, leading to a memory leak. Move dma_fence_put(job->fence)
>> to the job cleanup function to ensure the fence is always released.
>>
>> Fixes: aac243092b70 ("accel/amdxdna: Add command execution")
>> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com>
> Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
>> ---