From: Zhu Yanjun <yanjun.zhu@linux.dev>
commit b2b1ddc457458fecd1c6f385baa9fbda5f0c63ad upstream.
In the function rxe_create_qp(), rxe_qp_from_init() is called to
initialize qp, internally things like rxe_init_task are not setup until
rxe_qp_init_req().
If an error occurred before this point then the unwind will call
rxe_cleanup() and eventually to rxe_qp_do_cleanup()/rxe_cleanup_task()
which will oops when trying to access the uninitialized spinlock.
If rxe_init_task is not executed, rxe_cleanup_task will not be called.
Reported-by: syzbot+cfcc1a3c85be15a40cba@syzkaller.appspotmail.com
Link: https://syzkaller.appspot.com/bug?id=fd85757b74b3eb59f904138486f755f71e090df8
Fixes: 8700e3e7c485 ("Soft RoCE driver")
Fixes: 2d4b21e0a291 ("IB/rxe: Prevent from completer to operate on non valid QP")
Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev>
Link: https://lore.kernel.org/r/20230413101115.1366068-1-yanjun.zhu@intel.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
[ Vladislav: match upstream cleanup order and add the missing
resp.task.func check. ]
Signed-off-by: Vladislav Nikolaev <vlad102nikolaev@gmail.com>
---
v2: Move rxe_cleanup_task(&qp->resp.task) after RC timer cleanup.
Add missing qp->resp.task.func check before cleaning up the responder task.
Backport fix for CVE-2023-54028.
drivers/infiniband/sw/rxe/rxe_qp.c | 11 ++++++++---
1 file changed, 8 insertions(+), 3 deletions(-)
diff --git a/drivers/infiniband/sw/rxe/rxe_qp.c b/drivers/infiniband/sw/rxe/rxe_qp.c
index 4c938d841f76..616efae0c09a 100644
--- a/drivers/infiniband/sw/rxe/rxe_qp.c
+++ b/drivers/infiniband/sw/rxe/rxe_qp.c
@@ -760,15 +760,20 @@ void rxe_qp_destroy(struct rxe_qp *qp)
 {
 	qp->valid = 0;
 	qp->qp_timeout_jiffies = 0;
-	rxe_cleanup_task(&qp->resp.task);
 
 	if (qp_type(qp) == IB_QPT_RC) {
 		del_timer_sync(&qp->retrans_timer);
 		del_timer_sync(&qp->rnr_nak_timer);
 	}
 
-	rxe_cleanup_task(&qp->req.task);
-	rxe_cleanup_task(&qp->comp.task);
+	if (qp->resp.task.func)
+		rxe_cleanup_task(&qp->resp.task);
+
+	if (qp->req.task.func)
+		rxe_cleanup_task(&qp->req.task);
+
+	if (qp->comp.task.func)
+		rxe_cleanup_task(&qp->comp.task);
 
 	/* flush out any receive wr's or pending requests */
 	if (qp->req.task.func)
--
2.39.5
> [PATCH v2 5.10/5.15] RDMA/rxe: Fix the error "trying to register non-static key in rxe_cleanup_task"

I'm dropping this for now; it isn't right for either branch as submitted:

- 5.15.y: the bug doesn't exist there -- the task locks are already
  spin_lock_init()'d on the QP-create error path.
- 5.10.y: mis-targeted -- it patches rxe_qp_do_cleanup(), but the 5.10
  error-unwind path doesn't call rxe_cleanup_task() there.

--
Thanks,
Sasha

On Wed, 03. Jun 15:18, Vladislav Nikolaev wrote:
> From: Zhu Yanjun <yanjun.zhu@linux.dev>
>
> commit b2b1ddc457458fecd1c6f385baa9fbda5f0c63ad upstream.
>
> In the function rxe_create_qp(), rxe_qp_from_init() is called to
> initialize qp, internally things like rxe_init_task are not setup until
> rxe_qp_init_req().
>
> If an error occurred before this point then the unwind will call
> rxe_cleanup() and eventually to rxe_qp_do_cleanup()/rxe_cleanup_task()
> which will oops when trying to access the uninitialized spinlock.
>
> If rxe_init_task is not executed, rxe_cleanup_task will not be called.
>
> Reported-by: syzbot+cfcc1a3c85be15a40cba@syzkaller.appspotmail.com
> Link: https://syzkaller.appspot.com/bug?id=fd85757b74b3eb59f904138486f755f71e090df8
> Fixes: 8700e3e7c485 ("Soft RoCE driver")
> Fixes: 2d4b21e0a291 ("IB/rxe: Prevent from completer to operate on non valid QP")
> Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev>
> Link: https://lore.kernel.org/r/20230413101115.1366068-1-yanjun.zhu@intel.com
> Signed-off-by: Leon Romanovsky <leon@kernel.org>
> [ Vladislav: match upstream cleanup order and add the missing
> resp.task.func check. ]
> Signed-off-by: Vladislav Nikolaev <vlad102nikolaev@gmail.com>
> ---
Thanks for the update.
> v2: Move rxe_cleanup_task(&qp->resp.task) after RC timer cleanup.
> Add missing qp->resp.task.func check before cleaning up the responder task.
I did actually suggest only adding a corresponding check for the
rxe_cleanup_task(&qp->resp.task) call which the upstream commit performs.
Moving it a couple of lines around requires some explanation why it's
okay in 5.10/5.15 kernels. Note that in upstream it was done by another
commit 960ebe97e523 ("RDMA/rxe: Remove __rxe_do_task()").
[ yeah, it should be safe to move the call but it'd better be stated
explicitly in the backporter's comment ]
Worth saying that checkpatch.pl for the current patch gives:
ERROR: trailing whitespace
#52: FILE: drivers/infiniband/sw/rxe/rxe_qp.c:771:
+^I$
You might also want to consider porting 1c7eec4d5f3b ("RDMA/rxe: Fix
"trying to register non-static key in rxe_qp_do_cleanup" bug") which fixes
the similar problem for del_timer_sync / timer_delete_sync calls in this
code. This all could go as a series now probably.
On Wed, 3 Jun 2026 at 18:03:00 +0300, Fedor Pchelkin wrote:
> Moving it a couple of lines around requires some explanation why it's
> okay in 5.10/5.15 kernels. Note that in upstream it was done by another
> commit 960ebe97e523 ("RDMA/rxe: Remove __rxe_do_task()").
>
> [ yeah, it should be safe to move the call but it'd better be stated
> explicitly in the backporter's comment ]
>
> Worth saying that checkpatch.pl for the current patch gives:
>
> ERROR: trailing whitespace
> #52: FILE: drivers/infiniband/sw/rxe/rxe_qp.c:771:
> +^I$
>
> You might also want to consider porting 1c7eec4d5f3b ("RDMA/rxe: Fix
> "trying to register non-static key in rxe_qp_do_cleanup" bug") which fixes
> the similar problem for del_timer_sync / timer_delete_sync calls in this
> code. This all could go as a series now probably.
Thanks for the review.
I have prepared v3 as a 5.10/5.15 series and addressed all three points:

1. extended the backporter's comment to explain why moving
   rxe_cleanup_task(&qp->resp.task) after the RC timer cleanup is safe
   for 5.10/5.15 even though upstream got that order via 960ebe97e523;
2. fixed the trailing whitespace;
3. added the backport of 1c7eec4d5f3b as the second patch in the series.

The updated series is available here:
https://lore.kernel.org/all/20260605171449.1760-1-vlad102nikolaev@gmail.com/
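
For readers following the thread, the guard pattern discussed above can be
sketched in plain userspace C. This is only an illustrative model, not the
rxe code: struct demo_task, struct demo_qp and every demo_* function are
hypothetical names, and the lock_ready flag merely stands in for
spin_lock_init() having run on the task's lock. The point it shows is that
a task whose func pointer was never set is skipped during unwind, so
cleanup never touches state that init did not set up.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a task object: func stays NULL until init. */
struct demo_task {
	int (*func)(void *arg);
	int lock_ready;	/* models "spin_lock_init() has run" */
	int cleaned;	/* set once cleanup has actually run */
};

/* Hypothetical stand-in for a QP holding three tasks. */
struct demo_qp {
	struct demo_task req, comp, resp;
};

/* Placeholder work function, used only to give a task a non-NULL func. */
static int demo_work(void *arg)
{
	(void)arg;
	return 0;
}

/* Models task init: sets func and marks the lock as initialized. */
static void demo_init_task(struct demo_task *t, int (*func)(void *))
{
	assert(func != NULL);
	t->func = func;
	t->lock_ready = 1;
}

/* Models task cleanup: fails (-1) if the lock was never initialized. */
static int demo_cleanup_task(struct demo_task *t)
{
	if (!t->lock_ready)
		return -1;
	t->cleaned = 1;
	return 0;
}

/*
 * Guarded unwind: only tasks that went through init are cleaned up.
 * Returns the number of tasks cleaned, or -1 if cleanup hit an
 * uninitialized task (which the func check is meant to prevent).
 */
static int demo_qp_cleanup(struct demo_qp *qp)
{
	struct demo_task *tasks[] = { &qp->resp, &qp->req, &qp->comp };
	int cleaned = 0;

	for (size_t i = 0; i < sizeof(tasks) / sizeof(tasks[0]); i++) {
		if (!tasks[i]->func)
			continue;	/* never initialized, nothing to undo */
		if (demo_cleanup_task(tasks[i]))
			return -1;
		cleaned++;
	}
	return cleaned;
}
```

With a zero-initialized struct demo_qp, demo_qp_cleanup() skips all three
tasks; after demo_init_task(&qp.req, demo_work) it cleans only the
requester task. Calling demo_cleanup_task() directly on an uninitialized
task returns -1, which models the unguarded path the patch avoids.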