[PATCH] drm/amdgpu: fix fence reference leak in amdgpu_gfx_run_cleaner_shader_job

Wentao Liang posted 1 patch 2 days, 20 hours ago
Posted by Wentao Liang 2 days, 20 hours ago
In amdgpu_gfx_run_cleaner_shader_job(), amdgpu_job_submit() returns a
dma_fence with an elevated reference count. The function correctly
releases this reference on the success path after dma_fence_wait().
However, if dma_fence_wait() fails (e.g., due to a signal interruption),
the code jumps to the error label without calling dma_fence_put(),
resulting in a reference leak.

Fix the leak by adding dma_fence_put(f) before the goto err when
dma_fence_wait() returns an error.

Fixes: 559a285816af ("drm/amdgpu: Replace 'amdgpu_job_submit_direct' with 'drm_sched_entity' in cleaner shader")
Cc: stable@vger.kernel.org
Signed-off-by: Wentao Liang <vulab@iscas.ac.cn>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
index b8ca876694ff..88bec4e93712 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
@@ -1686,8 +1686,10 @@ static int amdgpu_gfx_run_cleaner_shader_job(struct amdgpu_ring *ring)
 	f = amdgpu_job_submit(job);
 
 	r = dma_fence_wait(f, false);
-	if (r)
+	if (r) {
+		dma_fence_put(f);
 		goto err;
+	}
 
 	dma_fence_put(f);
 
-- 
2.34.1
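
For readers less familiar with fence reference counting, the leak described in the commit message can be modelled in plain userspace C. This is only a sketch under stated assumptions: toy_fence, toy_submit(), toy_wait() and toy_fence_put() are hypothetical stand-ins invented for illustration, not kernel APIs, and the wait result is injected by the caller rather than produced by a real wait.

```c
#include <assert.h>

/* Hypothetical stand-in for struct dma_fence: only a reference count. */
struct toy_fence {
	int refcount;
};

/* Stand-in for dma_fence_put(): drop one reference. */
static void toy_fence_put(struct toy_fence *f)
{
	assert(f->refcount > 0);
	f->refcount--;
}

/* Stand-in for amdgpu_job_submit(): hands the caller one reference. */
static struct toy_fence *toy_submit(struct toy_fence *f)
{
	f->refcount++;
	return f;
}

/* Stand-in for dma_fence_wait(): returns the simulated result as-is. */
static int toy_wait(struct toy_fence *f, int simulated_result)
{
	(void)f;
	return simulated_result;
}

/* Shape of the code before the patch: the error path skips the put. */
static int run_job_leaky(struct toy_fence *f, int wait_result)
{
	struct toy_fence *ref = toy_submit(f);
	int r = toy_wait(ref, wait_result);

	if (r)
		goto err;	/* reference from toy_submit() is never dropped */

	toy_fence_put(ref);
	return 0;
err:
	return r;
}

/* Shape of the code after the patch: both paths drop the reference. */
static int run_job_fixed(struct toy_fence *f, int wait_result)
{
	struct toy_fence *ref = toy_submit(f);
	int r = toy_wait(ref, wait_result);

	if (r) {
		toy_fence_put(ref);
		goto err;
	}

	toy_fence_put(ref);
	return 0;
err:
	return r;
}
```

In this model, calling run_job_leaky() with a non-zero wait result leaves one reference outstanding, while run_job_fixed() returns the same error with the count back at zero.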
Re: [PATCH] drm/amdgpu: fix fence reference leak in amdgpu_gfx_run_cleaner_shader_job
Posted by Alex Deucher 2 days, 9 hours ago
On Fri, Jun 5, 2026 at 5:24 AM Wentao Liang <vulab@iscas.ac.cn> wrote:
>
> In amdgpu_gfx_run_cleaner_shader_job(), amdgpu_job_submit() returns a
> dma_fence with an elevated reference count. The function correctly
> releases this reference on the success path after dma_fence_wait().
> However, if dma_fence_wait() fails (e.g., due to a signal interruption),
> the code jumps to the error label without calling dma_fence_put(),
> resulting in a reference leak.
>
> Fix the leak by adding dma_fence_put(f) before the goto err when
> dma_fence_wait() returns an error.
>
> Fixes: 559a285816af ("drm/amdgpu: Replace 'amdgpu_job_submit_direct' with 'drm_sched_entity' in cleaner shader")
> Cc: stable@vger.kernel.org
> Signed-off-by: Wentao Liang <vulab@iscas.ac.cn>
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> index b8ca876694ff..88bec4e93712 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> @@ -1686,8 +1686,10 @@ static int amdgpu_gfx_run_cleaner_shader_job(struct amdgpu_ring *ring)
>         f = amdgpu_job_submit(job);
>
>         r = dma_fence_wait(f, false);
> -       if (r)
> +       if (r) {
> +               dma_fence_put(f);
>                 goto err;
> +       }

I think all of the cleanup paths have issues.  How about something like this:

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
index 321d7aa52f042..848846ac9391e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
@@ -1701,7 +1701,7 @@ static int amdgpu_gfx_run_cleaner_shader_job(struct amdgpu_ring *ring)
                                  &sched, 1, NULL);
        if (r) {
                dev_err(adev->dev, "Failed setting up GFX kernel entity.\n");
-               goto err;
+               return r;
        }

        /*
@@ -1729,16 +1729,12 @@ static int amdgpu_gfx_run_cleaner_shader_job(struct amdgpu_ring *ring)
        f = amdgpu_job_submit(job);

        r = dma_fence_wait(f, false);
-       if (r)
-               goto err;

        dma_fence_put(f);

+err:
        /* Clean up the scheduler entity */
        drm_sched_entity_destroy(&entity);
-       return 0;
-
-err:
        return r;
 }
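
The control flow suggested by the diff above (return directly when entity setup fails, otherwise funnel every later exit through a single err label that destroys the entity) can be sketched as a small userspace model. Everything here is an assumption made for illustration: toy_state, toy_entity_init() and toy_entity_destroy() are hypothetical names, the three result parameters are injected outcomes, and the job-allocation step between the two hunks is not shown in the thread, so it is represented by a single simulated failure point.

```c
#include <assert.h>

/* Hypothetical counters standing in for the scheduler entity and fence. */
struct toy_state {
	int entity_live;	/* 1 while the entity is initialised */
	int fence_refs;		/* outstanding fence references */
};

/* Stand-in for entity setup: fails without creating anything. */
static int toy_entity_init(struct toy_state *s, int result)
{
	if (result)
		return result;
	s->entity_live = 1;
	return 0;
}

/* Stand-in for entity teardown: only valid on a live entity. */
static void toy_entity_destroy(struct toy_state *s)
{
	assert(s->entity_live == 1);
	s->entity_live = 0;
}

/*
 * init_result, alloc_result and wait_result are simulated outcomes
 * (0 for success, negative for failure) of the three fallible steps.
 */
static int run_job_single_exit(struct toy_state *s, int init_result,
			       int alloc_result, int wait_result)
{
	int r;

	r = toy_entity_init(s, init_result);
	if (r)
		return r;	/* nothing to undo yet */

	r = alloc_result;	/* stands in for job allocation and setup */
	if (r)
		goto err;

	s->fence_refs++;	/* submit hands back one fence reference */
	r = wait_result;	/* stands in for the fence wait */
	s->fence_refs--;	/* dropped whether or not the wait failed */

err:
	toy_entity_destroy(s);
	return r;
}
```

With this shape, every path that got past entity setup tears the entity down exactly once, and the fence reference is dropped regardless of the wait result, so no per-branch cleanup is needed.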



>
>         dma_fence_put(f);
>
> --
> 2.34.1
>