[PATCH] RDMA/rxe: Fix double free in rxe_srq_from_init

Jiasheng Jiang posted 1 patch 3 weeks, 6 days ago
There is a newer version of this series
Posted by Jiasheng Jiang 3 weeks, 6 days ago
In rxe_srq_from_init(), the queue pointer 'q' is assigned to
'srq->rq.queue' before copying the SRQ number to user space.
If copy_to_user() fails, the function calls rxe_queue_cleanup()
to free the queue, but leaves the now-invalid pointer in
'srq->rq.queue'.

The caller of rxe_srq_from_init() (rxe_create_srq) eventually
calls rxe_srq_cleanup() upon receiving the error, which triggers
a second rxe_queue_cleanup() on the same memory, leading to a
double free.

Fix this by setting 'srq->rq.queue' to NULL after the initial
cleanup in the error path.

Fixes: aae0484e15f0 ("IB/rxe: avoid srq memory leak")
Signed-off-by: Jiasheng Jiang <jiashengjiangcool@gmail.com>
---
 drivers/infiniband/sw/rxe/rxe_srq.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/infiniband/sw/rxe/rxe_srq.c b/drivers/infiniband/sw/rxe/rxe_srq.c
index 2a234f26ac10..c527c1cbd4ec 100644
--- a/drivers/infiniband/sw/rxe/rxe_srq.c
+++ b/drivers/infiniband/sw/rxe/rxe_srq.c
@@ -84,6 +84,7 @@ int rxe_srq_from_init(struct rxe_dev *rxe, struct rxe_srq *srq,
 		if (copy_to_user(&uresp->srq_num, &srq->srq_num,
 				 sizeof(uresp->srq_num))) {
 			rxe_queue_cleanup(q);
+			srq->rq.queue = NULL;
 			return -EFAULT;
 		}
 	}
-- 
2.25.1
Re: [PATCH] RDMA/rxe: Fix double free in rxe_srq_from_init
Posted by Zhu Yanjun 3 weeks, 5 days ago
On 2026/1/11 9:12, Jiasheng Jiang wrote:
> In rxe_srq_from_init(), the queue pointer 'q' is assigned to
> 'srq->rq.queue' before copying the SRQ number to user space.
> If copy_to_user() fails, the function calls rxe_queue_cleanup()
> to free the queue, but leaves the now-invalid pointer in
> 'srq->rq.queue'.
> 
> The caller of rxe_srq_from_init() (rxe_create_srq) eventually
> calls rxe_srq_cleanup() upon receiving the error, which triggers
> a second rxe_queue_cleanup() on the same memory, leading to a
> double free.
> 
> Fix this by setting 'srq->rq.queue' to NULL after the initial
> cleanup in the error path.

In the function rxe_srq_from_init,

    srq->rq.queue = q;
    init->attr.max_wr = srq->rq.max_wr;

    if (uresp) {
        if (copy_to_user(&uresp->srq_num, &srq->srq_num,
                 sizeof(uresp->srq_num))) {
            rxe_queue_cleanup(q);
            return -EFAULT;
        }
    }

If we move the following
"
srq->rq.queue = q;
init->attr.max_wr = srq->rq.max_wr;
"
after copy_to_user, it also seems to fix the mentioned problem.
The change would look like this:
"
diff --git a/drivers/infiniband/sw/rxe/rxe_srq.c b/drivers/infiniband/sw/rxe/rxe_srq.c
index 2a234f26ac10..c9a7cd38953d 100644
--- a/drivers/infiniband/sw/rxe/rxe_srq.c
+++ b/drivers/infiniband/sw/rxe/rxe_srq.c
@@ -77,9 +77,6 @@ int rxe_srq_from_init(struct rxe_dev *rxe, struct rxe_srq *srq,
                 goto err_free;
         }

-       srq->rq.queue = q;
-       init->attr.max_wr = srq->rq.max_wr;
-
         if (uresp) {
                 if (copy_to_user(&uresp->srq_num, &srq->srq_num,
                                  sizeof(uresp->srq_num))) {
@@ -88,6 +85,9 @@ int rxe_srq_from_init(struct rxe_dev *rxe, struct rxe_srq *srq,
                 }
         }

+       srq->rq.queue = q;
+       init->attr.max_wr = srq->rq.max_wr;
+
         return 0;

  err_free:
"

But setting srq->rq.queue to NULL also fixes this problem.

I am fine with this. Thanks a lot.

BTW, it would be better if you could post the call trace in the commit log.

Reviewed-by: Zhu Yanjun <yanjun.zhu@linux.dev>
Thanks a lot.

Zhu Yanjun

> 
> Fixes: aae0484e15f0 ("IB/rxe: avoid srq memory leak")
> Signed-off-by: Jiasheng Jiang <jiashengjiangcool@gmail.com>
> ---
>   drivers/infiniband/sw/rxe/rxe_srq.c | 1 +
>   1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/infiniband/sw/rxe/rxe_srq.c b/drivers/infiniband/sw/rxe/rxe_srq.c
> index 2a234f26ac10..c527c1cbd4ec 100644
> --- a/drivers/infiniband/sw/rxe/rxe_srq.c
> +++ b/drivers/infiniband/sw/rxe/rxe_srq.c
> @@ -84,6 +84,7 @@ int rxe_srq_from_init(struct rxe_dev *rxe, struct rxe_srq *srq,
>   		if (copy_to_user(&uresp->srq_num, &srq->srq_num,
>   				 sizeof(uresp->srq_num))) {
>   			rxe_queue_cleanup(q);
> +			srq->rq.queue = NULL;
>   			return -EFAULT;
>   		}
>   	}

[PATCH v2] RDMA/rxe: Fix double free in rxe_srq_from_init
Posted by Jiasheng Jiang 3 weeks, 5 days ago
In rxe_srq_from_init(), the queue pointer 'q' is assigned to
'srq->rq.queue' before copying the SRQ number to user space.
If copy_to_user() fails, the function calls rxe_queue_cleanup()
to free the queue, but leaves the now-invalid pointer in
'srq->rq.queue'.

The caller of rxe_srq_from_init() (rxe_create_srq) eventually
calls rxe_srq_cleanup() upon receiving the error, which triggers
a second rxe_queue_cleanup() on the same memory, leading to a
double free.

The call trace looks like this:
   kmem_cache_free+0x.../0x...
   rxe_queue_cleanup+0x1a/0x30 [rdma_rxe]
   rxe_srq_cleanup+0x42/0x60 [rdma_rxe]
   rxe_elem_release+0x31/0x70 [rdma_rxe]
   rxe_create_srq+0x12b/0x1a0 [rdma_rxe]
   ib_create_srq_user+0x9a/0x150 [ib_core]

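Fix this by moving 'srq->rq.queue = q' and
'init->attr.max_wr = srq->rq.max_wr' after copy_to_user(), so that
'srq->rq.queue' is never set when the error path frees the queue.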

Fixes: aae0484e15f0 ("IB/rxe: avoid srq memory leak")
Signed-off-by: Jiasheng Jiang <jiashengjiangcool@gmail.com>
---
Changelog:

v1 -> v2:

1. Move both 'srq->rq.queue = q' and 'init->attr.max_wr = srq->rq.max_wr'
after copy_to_user().
2. Add call trace for better understanding of the issue.
---
 drivers/infiniband/sw/rxe/rxe_srq.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/infiniband/sw/rxe/rxe_srq.c b/drivers/infiniband/sw/rxe/rxe_srq.c
index 2a234f26ac10..c9a7cd38953d 100644
--- a/drivers/infiniband/sw/rxe/rxe_srq.c
+++ b/drivers/infiniband/sw/rxe/rxe_srq.c
@@ -77,9 +77,6 @@ int rxe_srq_from_init(struct rxe_dev *rxe, struct rxe_srq *srq,
 		goto err_free;
 	}
 
-	srq->rq.queue = q;
-	init->attr.max_wr = srq->rq.max_wr;
-
 	if (uresp) {
 		if (copy_to_user(&uresp->srq_num, &srq->srq_num,
 				 sizeof(uresp->srq_num))) {
@@ -88,6 +85,9 @@ int rxe_srq_from_init(struct rxe_dev *rxe, struct rxe_srq *srq,
 		}
 	}
 
+	srq->rq.queue = q;
+	init->attr.max_wr = srq->rq.max_wr;
+
 	return 0;
 
 err_free:
-- 
2.25.1
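
For readers following the thread, the failure path and the two fixes
discussed above can be modelled in plain userspace C. This is only a
rough sketch under stated assumptions, not the driver code: every
model_* name and the init_variant enum are hypothetical stand-ins, the
copy_to_user() failure is simulated with a flag, and "freeing" is
replaced by a counter so that the double cleanup can be observed
without undefined behaviour.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the rxe structures (not kernel definitions). */
struct model_queue { int free_count; };   /* counts cleanup calls */
struct model_srq { struct model_queue *queue; };

enum init_variant {
	INIT_ORIGINAL,    /* pointer stored early, left dangling on error */
	INIT_V1_SET_NULL, /* v1: clear the pointer after cleanup */
	INIT_V2_REORDER,  /* v2: store the pointer only after the copy */
};

/* Stand-in for rxe_queue_cleanup(): records the call instead of freeing. */
static void model_queue_cleanup(struct model_queue *q)
{
	q->free_count++;
}

/* Stand-in for rxe_srq_from_init(); copy_fails simulates copy_to_user(). */
static int model_srq_from_init(struct model_srq *srq, struct model_queue *q,
			       bool copy_fails, enum init_variant v)
{
	if (v != INIT_V2_REORDER)
		srq->queue = q;         /* stored before the fallible copy */

	if (copy_fails) {
		model_queue_cleanup(q);
		if (v == INIT_V1_SET_NULL)
			srq->queue = NULL;  /* v1 fix */
		return -1;
	}

	if (v == INIT_V2_REORDER)
		srq->queue = q;         /* v2 fix: store only on success */
	return 0;
}

/* Stand-in for rxe_srq_cleanup(): cleans up the queue if one is set. */
static void model_srq_cleanup(struct model_srq *srq)
{
	if (srq->queue)
		model_queue_cleanup(srq->queue);
}

/*
 * Models the caller's error path: init fails, then the SRQ is cleaned up.
 * Returns how many times the queue was cleaned up.
 */
int model_create_srq_failing(enum init_variant v)
{
	struct model_queue q = { 0 };
	struct model_srq srq = { NULL };
	int rc = model_srq_from_init(&srq, &q, true, v);

	assert(rc != 0);                /* this helper only models failure */
	model_srq_cleanup(&srq);
	return q.free_count;
}
```

In this model, model_create_srq_failing(INIT_ORIGINAL) reports two
cleanups of the same queue, while both INIT_V1_SET_NULL and
INIT_V2_REORDER report one. The reordering variant avoids ever leaving
a stale pointer in the SRQ, which is the approach v2 takes.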