[PATCH 2/2] drm/amdgpu: protect queue access in signal IOCTL

Chenyuan Mi posted 2 patches 1 month ago
[PATCH 2/2] drm/amdgpu: protect queue access in signal IOCTL
Posted by Chenyuan Mi 1 month ago
amdgpu_userq_signal_ioctl() retrieves the user queue via xa_load()
and then dereferences it in amdgpu_userq_fence_read_wptr(),
amdgpu_userq_fence_create(), and direct queue->last_fence accesses,
all before userq_mutex is acquired by amdgpu_userq_ensure_ev_fence().

A concurrent AMDGPU_USERQ_OP_FREE can destroy and free the queue
in this window, leading to a use-after-free.

This bug predates the queue-id wait ioctl changes and has been
present since the original signal/wait ioctl implementation.

Fix this by moving amdgpu_userq_ensure_ev_fence() before xa_load()
so that the queue lookup and all subsequent accesses are performed
under the userq_mutex that ensure_ev_fence acquires. Add the
necessary mutex_unlock() calls to the error paths between the moved
ensure_ev_fence and the existing unlock.

Fixes: a292fdecd728 ("drm/amdgpu: Implement userqueue signal/wait IOCTL")
Cc: stable@vger.kernel.org
Signed-off-by: Chenyuan Mi <chenyuan_mi@163.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c | 17 +++++++++++------
 1 file changed, 11 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c
index 1785ea7c18fe..7866f583eea4 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c
@@ -545,23 +545,28 @@ int amdgpu_userq_signal_ioctl(struct drm_device *dev, void *data,
 		}
 	}
 
-	/* Retrieve the user queue */
+	/* We are here means UQ is active, make sure the eviction fence is valid */
+	amdgpu_userq_ensure_ev_fence(&fpriv->userq_mgr, &fpriv->evf_mgr);
+
+	/* Retrieve the user queue under userq_mutex (held by ensure_ev_fence) */
 	queue = xa_load(&userq_mgr->userq_xa, args->queue_id);
 	if (!queue) {
+		mutex_unlock(&userq_mgr->userq_mutex);
 		r = -ENOENT;
 		goto put_gobj_write;
 	}
 
 	r = amdgpu_userq_fence_read_wptr(adev, queue, &wptr);
-	if (r)
+	if (r) {
+		mutex_unlock(&userq_mgr->userq_mutex);
 		goto put_gobj_write;
+	}
 
 	r = amdgpu_userq_fence_alloc(&userq_fence);
-	if (r)
+	if (r) {
+		mutex_unlock(&userq_mgr->userq_mutex);
 		goto put_gobj_write;
-
-	/* We are here means UQ is active, make sure the eviction fence is valid */
-	amdgpu_userq_ensure_ev_fence(&fpriv->userq_mgr, &fpriv->evf_mgr);
+	}
 
 	/* Create a new fence */
 	r = amdgpu_userq_fence_create(queue, userq_fence, wptr, &fence);
-- 
2.53.0
Re: [PATCH 2/2] drm/amdgpu: protect queue access in signal IOCTL
Posted by Christian König 1 month ago
On 3/9/26 03:22, Chenyuan Mi wrote:
> [Some people who received this message don't often get email from chenyuan_mi@163.com. Learn why this is important at https://aka.ms/LearnAboutSenderIdentification ]
> 
> amdgpu_userq_signal_ioctl() retrieves the user queue via xa_load()
> and then dereferences it in amdgpu_userq_fence_read_wptr(),
> amdgpu_userq_fence_create(), and direct queue->last_fence accesses,
> all before userq_mutex is acquired by amdgpu_userq_ensure_ev_fence().
> 
> A concurrent AMDGPU_USERQ_OP_FREE can destroy and free the queue
> in this window, leading to a use-after-free.
> 
> This bug predates the queue-id wait ioctl changes and has been
> present since the original signal/wait ioctl implementation.
> 
> Fix this by moving amdgpu_userq_ensure_ev_fence() before xa_load()

Again that trivially causes a deadlock. So the patch is just not working at all.

Regards,
Christian.

> so that the queue lookup and all subsequent accesses are performed
> under the userq_mutex that ensure_ev_fence acquires. Add the
> necessary mutex_unlock() calls to the error paths between the moved
> ensure_ev_fence and the existing unlock.
> 
> Fixes: a292fdecd728 ("drm/amdgpu: Implement userqueue signal/wait IOCTL")
> Cc: stable@vger.kernel.org
> Signed-off-by: Chenyuan Mi <chenyuan_mi@163.com>
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c | 17 +++++++++++------
>  1 file changed, 11 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c
> index 1785ea7c18fe..7866f583eea4 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c
> @@ -545,23 +545,28 @@ int amdgpu_userq_signal_ioctl(struct drm_device *dev, void *data,
>                 }
>         }
> 
> -       /* Retrieve the user queue */
> +       /* We are here means UQ is active, make sure the eviction fence is valid */
> +       amdgpu_userq_ensure_ev_fence(&fpriv->userq_mgr, &fpriv->evf_mgr);
> +
> +       /* Retrieve the user queue under userq_mutex (held by ensure_ev_fence) */
>         queue = xa_load(&userq_mgr->userq_xa, args->queue_id);
>         if (!queue) {
> +               mutex_unlock(&userq_mgr->userq_mutex);
>                 r = -ENOENT;
>                 goto put_gobj_write;
>         }
> 
>         r = amdgpu_userq_fence_read_wptr(adev, queue, &wptr);
> -       if (r)
> +       if (r) {
> +               mutex_unlock(&userq_mgr->userq_mutex);
>                 goto put_gobj_write;
> +       }
> 
>         r = amdgpu_userq_fence_alloc(&userq_fence);
> -       if (r)
> +       if (r) {
> +               mutex_unlock(&userq_mgr->userq_mutex);
>                 goto put_gobj_write;
> -
> -       /* We are here means UQ is active, make sure the eviction fence is valid */
> -       amdgpu_userq_ensure_ev_fence(&fpriv->userq_mgr, &fpriv->evf_mgr);
> +       }
> 
>         /* Create a new fence */
>         r = amdgpu_userq_fence_create(queue, userq_fence, wptr, &fence);
> --
> 2.53.0
>