[PATCH] RDMA/irdma: Fix typo in SQ completions generation

Cyrill Gorcunov posted 1 patch 1 week, 3 days ago
drivers/infiniband/hw/irdma/utils.c |    2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
[PATCH] RDMA/irdma: Fix typo in SQ completions generation
Posted by Cyrill Gorcunov 1 week, 3 days ago
When we generate a flush completion for the SQ, the opcode is read properly
from the ring buffer but then ignored when the completion is filled in.
Seems to be a simple typo.

Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
---
Hopefully I didn't miss something obvious here; I found this while
fighting with an unrelated issue.

 drivers/infiniband/hw/irdma/utils.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: linux-tip.git/drivers/infiniband/hw/irdma/utils.c
===================================================================
--- linux-tip.git.orig/drivers/infiniband/hw/irdma/utils.c
+++ linux-tip.git/drivers/infiniband/hw/irdma/utils.c
@@ -2442,7 +2442,7 @@ void irdma_generate_flush_completions(st
 			cmpl->cpi.wr_id = qp->sq_wrtrk_array[wqe_idx].wrid;
 			sw_wqe = qp->sq_base[wqe_idx].elem;
 			get_64bit_val(sw_wqe, 24, &wqe_qword);
-			cmpl->cpi.op_type = (u8)FIELD_GET(IRDMAQPSQ_OPCODE, IRDMAQPSQ_OPCODE);
+			cmpl->cpi.op_type = (u8)FIELD_GET(IRDMAQPSQ_OPCODE, wqe_qword);
 			cmpl->cpi.q_type = IRDMA_CQE_QTYPE_SQ;
 			/* remove the SQ WR by moving SQ tail*/
 			IRDMA_RING_SET_TAIL(*sq_ring,
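
To see why the removed line cannot be right, here is a small user-space sketch.
It is not driver code: GENMASK_ULL() and FIELD_GET() below are simplified
stand-ins for the kernel macros, the opcode field position (bits 37:32) is an
assumption made for illustration, and op_type_before()/op_type_after() are
hypothetical helper names. Decoding the mask against itself always yields the
all-ones field value, no matter what the WQE contains.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified user-space stand-ins for the kernel's GENMASK_ULL()/FIELD_GET().
 * __builtin_ctzll() is a GCC/Clang builtin. */
#define GENMASK_ULL(h, l) (((~0ULL) >> (63 - (h))) & ((~0ULL) << (l)))
#define FIELD_GET(mask, reg) (((reg) & (mask)) >> __builtin_ctzll(mask))

/* ASSUMPTION: opcode occupies bits 37:32 of the WQE qword at offset 24. */
#define IRDMAQPSQ_OPCODE GENMASK_ULL(37, 32)

static_assert(IRDMAQPSQ_OPCODE == 0x3f00000000ULL,
	      "opcode mask covers bits 37:32");

/* Pre-patch expression: the mask is decoded against itself. */
static inline uint8_t op_type_before(uint64_t wqe_qword)
{
	(void)wqe_qword;	/* read from the WQE, then never used */
	return (uint8_t)FIELD_GET(IRDMAQPSQ_OPCODE, IRDMAQPSQ_OPCODE);
}

/* Post-patch expression: the opcode is extracted from the WQE qword. */
static inline uint8_t op_type_after(uint64_t wqe_qword)
{
	return (uint8_t)FIELD_GET(IRDMAQPSQ_OPCODE, wqe_qword);
}
```

Under these assumptions op_type_before() returns 0x3f for every input, while
op_type_after(0x0cULL << 32) returns 0x0c (0x0c being just a sample value), so
any later check that compares op_type against a specific opcode can only work
with the patched expression.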
Re: [PATCH] RDMA/irdma: Fix typo in SQ completions generation
Posted by Jason Gunthorpe 4 days, 19 hours ago
On Fri, May 29, 2026 at 01:30:11AM +0300, Cyrill Gorcunov wrote:
> When we generate a flush completion for the SQ, the opcode is read properly
> from the ring buffer but then ignored when the completion is filled in.
> Seems to be a simple typo.
> 
> Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
> Reviewed-by: Jacob Moroni <jmoroni@google.com>
> ---
> Hopefully I didn't miss something obvious here; I found this while
> fighting with an unrelated issue.

Applied to for-next, thanks

Jason
Re: [PATCH] RDMA/irdma: Fix typo in SQ completions generation
Posted by Cyrill Gorcunov 4 days, 3 hours ago
On Wed, Jun 03, 2026 at 03:16:36PM -0300, Jason Gunthorpe wrote:
> On Fri, May 29, 2026 at 01:30:11AM +0300, Cyrill Gorcunov wrote:
> > When we generate a flush completion for the SQ, the opcode is read properly
> > from the ring buffer but then ignored when the completion is filled in.
> > Seems to be a simple typo.
> > 
> > Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
> > Reviewed-by: Jacob Moroni <jmoroni@google.com>
> > ---
> > Hopefully I didn't miss something obvious here; I found this while
> > fighting with an unrelated issue.
> 
> Applied to for-next, thanks

Thanks a lot, Jason! What about the series https://lore.kernel.org/netdev/20260522142239.628965142@gmail.com/ ?
Guys, could you take a look at it, please?

	Cyrill
Re: [PATCH] RDMA/irdma: Fix typo in SQ completions generation
Posted by Jacob Moroni 1 week, 3 days ago
> Hopefully I didn't miss something obvious here; I found this while
> fighting with an unrelated issue.

Nice find. I took a look and your fix seems valid to me.

I guess prior to this fix, it could potentially generate flush
completions for the NOP/pad WQEs which I can see being
a problem since the WR ID would be totally bogus.

Reviewed-by: Jacob Moroni <jmoroni@google.com>
Re: [PATCH] RDMA/irdma: Fix typo in SQ completions generation
Posted by Cyrill Gorcunov 1 week, 1 day ago
Thanks a lot for the review, Jacob! I forgot to add that the bug came in
with commit 81091d7696ae71627ff80bbf2c6b0986d2c1cce3,
which is in the 5.18 series, so I guess we might need to backport the fix to the stable trees later.
Re: [PATCH] RDMA/irdma: Fix typo in SQ completions generation
Posted by Jacob Moroni 5 days, 23 hours ago
BTW, your fix matches the OOT driver code:
https://github.com/intel/ethernet-linux-irdma-and-idpf/blob/main/rdma-driver/src/irdma/utils.c#L3561

Regarding the "unrelated issue" - is it an irdma issue? Anything we
can help with?

- Jake
Re: [PATCH] RDMA/irdma: Fix typo in SQ completions generation
Posted by Cyrill Gorcunov 5 days, 15 hours ago
On Tue, Jun 02, 2026 at 10:11:46AM -0400, Jacob Moroni wrote:
> BTW, your fix matches the OOT driver code:
> https://github.com/intel/ethernet-linux-irdma-and-idpf/blob/main/rdma-driver/src/irdma/utils.c#L3561
> 

Wow, I didn't know about the OOT driver :-) Thanks for pointing it out!

> Regarding the "unrelated issue" - is it an irdma issue? Anything we
> can help with?

Well, the issue I'm dealing with is still messy (I mean I'm not sure whether it
is an irdma issue or not: I'm losing completions on posted operations when the
hardware is in reset mode and at the same time the card is physically removed).
Once I manage to collect all the pieces of the problem I'll be back with a
report. At the moment I suspect that we need to *flush* the queue here instead
of cancelling it:

---
static int irdma_destroy_qp(struct ib_qp *ibqp, struct ib_udata *udata)
{
	struct irdma_qp *iwqp = to_iwqp(ibqp);
	struct irdma_device *iwdev = iwqp->iwdev;

	iwqp->sc_qp.qp_uk.destroy_pending = true;

	if (iwqp->iwarp_state == IRDMA_QP_STATE_RTS)
		irdma_modify_qp_to_err(&iwqp->sc_qp);

	if (!iwqp->user_mode)
-->		cancel_delayed_work_sync(&iwqp->dwork_flush);
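
To make the cancel-versus-flush distinction concrete, here is a toy user-space
model. It is not kernel code and not a proposed fix: struct toy_dwork and the
toy_*() helpers are hypothetical names invented for this sketch. It only mirrors
the documented workqueue semantics, where cancel_delayed_work_sync() drops work
that is still pending, while flush_delayed_work() runs pending work immediately
and waits for it to finish.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a delayed work item that has been scheduled. */
struct toy_dwork {
	bool pending;
	void (*fn)(void *ctx);
	void *ctx;
};

/* Models cancel_delayed_work_sync(): pending work is dropped without
 * running. Returns true if work was pending. */
static inline bool toy_cancel_sync(struct toy_dwork *w)
{
	bool was_pending = w->pending;

	w->pending = false;
	return was_pending;
}

/* Models flush_delayed_work(): pending work is executed right away.
 * Returns true if there was work to run. */
static inline bool toy_flush(struct toy_dwork *w)
{
	if (!w->pending)
		return false;
	w->pending = false;
	w->fn(w->ctx);
	return true;
}

/* Stand-in for the flush worker: counts one generated flush completion. */
static inline void toy_flush_cqes(void *ctx)
{
	*(int *)ctx += 1;
}
```

In this model, cancelling a pending item leaves the counter untouched (the
completions are never generated), whereas flushing it bumps the counter once.
Whether that difference explains the lost completions in the real driver is
exactly the open question here.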
Re: [PATCH] RDMA/irdma: Fix typo in SQ completions generation
Posted by Jason Gunthorpe 5 days, 19 hours ago
On Tue, Jun 02, 2026 at 10:11:46AM -0400, Jacob Moroni wrote:
> BTW, your fix matches the OOT driver code:
> https://github.com/intel/ethernet-linux-irdma-and-idpf/blob/main/rdma-driver/src/irdma/utils.c#L3561

Oh lovely.

Can I delete irdma then if Intel would prefer to keep functional bug
fixes out of tree? Hmm?

Jason
Re: [PATCH] RDMA/irdma: Fix typo in SQ completions generation
Posted by Cyrill Gorcunov 5 days, 15 hours ago
On Tue, Jun 02, 2026 at 03:25:03PM -0300, Jason Gunthorpe wrote:
> On Tue, Jun 02, 2026 at 10:11:46AM -0400, Jacob Moroni wrote:
> > BTW, your fix matches the OOT driver code:
> > https://github.com/intel/ethernet-linux-irdma-and-idpf/blob/main/rdma-driver/src/irdma/utils.c#L3561
> 
> Oh lovely.
> 
> Can I delete irdma then if Intel would prefer to keep functional bug
> fixes out of tree? Hmm?

I guess there is simply not enough manpower to keep the OOT code in sync with
the kernel tree.

	Cyrill
Re: [PATCH] RDMA/irdma: Fix typo in SQ completions generation
Posted by Jacob Moroni 5 days, 14 hours ago
Sorry, didn't mean to open up a can of worms. I personally would like to
move in the other direction - let's get all of these fixes upstream :)

> I'm losing completions on posted operations when the
> hardware is in reset mode

I think this has come up before. For example, I vaguely recall an async
VF reset during heavy RDMA CM activity resulting in some timeout firing
(cm_destroy_id_wait_timeout), which I think was due to completions
getting lost.

You may be on to something regarding flushing that WQ. I'll give it a try
on my end.

- Jake


> At the moment I suspect that we need to *flush* the queue here instead of
> cancelling it