[PATCH] rdma_rxe: call comp_handler without holding cq->cq_lock

Philipp Reisner posted 1 patch 1 month, 2 weeks ago
There is a newer version of this series
drivers/infiniband/sw/rxe/rxe_cq.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
Posted by Philipp Reisner 1 month, 2 weeks ago
Allow the comp_handler callback implementation to call ib_poll_cq().
With the rdma_rxe driver, ib_poll_cq() calls rxe_poll_cq(), which takes
cq->cq_lock. Because rxe_cq_post() invokes comp_handler while holding
that lock, such a callback deadlocks on the spinlock.

The Mellanox and Intel drivers allow a comp_handler callback
implementation to call ib_poll_cq().

Avoid the deadlock by calling the comp_handler callback without
holding cq->cq_lock.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
---
 drivers/infiniband/sw/rxe/rxe_cq.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/infiniband/sw/rxe/rxe_cq.c b/drivers/infiniband/sw/rxe/rxe_cq.c
index fffd144d509e..95652001665d 100644
--- a/drivers/infiniband/sw/rxe/rxe_cq.c
+++ b/drivers/infiniband/sw/rxe/rxe_cq.c
@@ -88,6 +88,7 @@ int rxe_cq_post(struct rxe_cq *cq, struct rxe_cqe *cqe, int solicited)
 	int full;
 	void *addr;
 	unsigned long flags;
+	bool invoke_handler = false;
 
 	spin_lock_irqsave(&cq->cq_lock, flags);
 
@@ -113,11 +114,14 @@ int rxe_cq_post(struct rxe_cq *cq, struct rxe_cqe *cqe, int solicited)
 	if ((cq->notify & IB_CQ_NEXT_COMP) ||
 	    (cq->notify & IB_CQ_SOLICITED && solicited)) {
 		cq->notify = 0;
-		cq->ibcq.comp_handler(&cq->ibcq, cq->ibcq.cq_context);
+		invoke_handler = true;
 	}
 
 	spin_unlock_irqrestore(&cq->cq_lock, flags);
 
+	if (invoke_handler)
+		cq->ibcq.comp_handler(&cq->ibcq, cq->ibcq.cq_context);
+
 	return 0;
 }
 
-- 
2.50.1
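
The locking pattern in the patch above can be tried outside the kernel. The following is a minimal userspace sketch, not the rxe code: it assumes a pthread mutex as a stand-in for cq->cq_lock, and struct demo_cq, demo_post(), demo_poll_cq() and demo_handler() are hypothetical names invented for illustration. The post path records under the lock that the handler is due and calls it only after unlocking, so the handler is free to take the same lock in its poll path.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct rxe_cq; not the kernel type. */
struct demo_cq {
	pthread_mutex_t lock;	/* plays the role of cq->cq_lock */
	int queued;		/* completions waiting to be polled */
	bool notify;		/* armed notification, like cq->notify */
	void (*comp_handler)(struct demo_cq *cq);
	int polled;		/* total completions seen by the handler */
};

/* Poll path: takes the lock to drain the queue, as rxe_poll_cq() does. */
static int demo_poll_cq(struct demo_cq *cq)
{
	int n;

	pthread_mutex_lock(&cq->lock);
	n = cq->queued;
	cq->queued = 0;
	pthread_mutex_unlock(&cq->lock);

	return n;
}

/* Post path: decide under the lock, call the handler after unlocking. */
static void demo_post(struct demo_cq *cq)
{
	bool invoke_handler = false;

	assert(cq->comp_handler != NULL);

	pthread_mutex_lock(&cq->lock);
	cq->queued++;
	if (cq->notify) {
		cq->notify = false;	/* one-shot, re-armed by the consumer */
		invoke_handler = true;
	}
	pthread_mutex_unlock(&cq->lock);

	/* The lock is no longer held, so the handler may poll. */
	if (invoke_handler)
		cq->comp_handler(cq);
}

/* A completion handler that polls from within the callback. */
static void demo_handler(struct demo_cq *cq)
{
	cq->polled += demo_poll_cq(cq);
}
```

If demo_post() called cq->comp_handler() before pthread_mutex_unlock(), demo_handler() would try to lock a mutex its own thread already holds, which is the userspace analogue of the deadlock described in the commit message.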
Re: [PATCH] rdma_rxe: call comp_handler without holding cq->cq_lock
Posted by Zhu Yanjun 1 month, 1 week ago
On 2025/8/19 10:24, Philipp Reisner wrote:
> Allow the comp_handler callback implementation to call ib_poll_cq().
> A call to ib_poll_cq() calls rxe_poll_cq() with the rdma_rxe driver.
> And rxe_poll_cq() locks cq->cq_lock. That leads to a spinlock deadlock.
> 
> The Mellanox and Intel drivers allow a comp_handler callback
> implementation to call ib_poll_cq().
> 
> Avoid the deadlock by calling the comp_handler callback without
> holding cq->cq_lock.
> 
> Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
> ---

The new patch should be sent in a new mail thread; it is not appropriate 
to reply to the old thread.

Additionally:

The subject line does not include a version number.

The commit log of the new patch does not contain a changelog.

Other than these issues, I am fine with this commit.

Reviewed-by: Zhu Yanjun <yanjun.zhu@linux.dev>

Zhu Yanjun

>   drivers/infiniband/sw/rxe/rxe_cq.c | 6 +++++-
>   1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/infiniband/sw/rxe/rxe_cq.c b/drivers/infiniband/sw/rxe/rxe_cq.c
> index fffd144d509e..95652001665d 100644
> --- a/drivers/infiniband/sw/rxe/rxe_cq.c
> +++ b/drivers/infiniband/sw/rxe/rxe_cq.c
> @@ -88,6 +88,7 @@ int rxe_cq_post(struct rxe_cq *cq, struct rxe_cqe *cqe, int solicited)
>   	int full;
>   	void *addr;
>   	unsigned long flags;
> +	bool invoke_handler = false;
>   
>   	spin_lock_irqsave(&cq->cq_lock, flags);
>   
> @@ -113,11 +114,14 @@ int rxe_cq_post(struct rxe_cq *cq, struct rxe_cqe *cqe, int solicited)
>   	if ((cq->notify & IB_CQ_NEXT_COMP) ||
>   	    (cq->notify & IB_CQ_SOLICITED && solicited)) {
>   		cq->notify = 0;
> -		cq->ibcq.comp_handler(&cq->ibcq, cq->ibcq.cq_context);
> +		invoke_handler = true;
>   	}
>   
>   	spin_unlock_irqrestore(&cq->cq_lock, flags);
>   
> +	if (invoke_handler)
> +		cq->ibcq.comp_handler(&cq->ibcq, cq->ibcq.cq_context);
> +
>   	return 0;
>   }
>