When we generate a flush completion for the SQ, the opcode is read
from the ring buffer correctly but then ignored when the completion is
filled in: FIELD_GET() is applied to the mask itself instead of to
wqe_qword. This looks like a simple typo.
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
---
Hopefully I didn't miss something obvious here; I found it while
fighting with an unrelated issue.
drivers/infiniband/hw/irdma/utils.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
Index: linux-tip.git/drivers/infiniband/hw/irdma/utils.c
===================================================================
--- linux-tip.git.orig/drivers/infiniband/hw/irdma/utils.c
+++ linux-tip.git/drivers/infiniband/hw/irdma/utils.c
@@ -2442,7 +2442,7 @@ void irdma_generate_flush_completions(st
cmpl->cpi.wr_id = qp->sq_wrtrk_array[wqe_idx].wrid;
sw_wqe = qp->sq_base[wqe_idx].elem;
get_64bit_val(sw_wqe, 24, &wqe_qword);
- cmpl->cpi.op_type = (u8)FIELD_GET(IRDMAQPSQ_OPCODE, IRDMAQPSQ_OPCODE);
+ cmpl->cpi.op_type = (u8)FIELD_GET(IRDMAQPSQ_OPCODE, wqe_qword);
cmpl->cpi.q_type = IRDMA_CQE_QTYPE_SQ;
/* remove the SQ WR by moving SQ tail*/
IRDMA_RING_SET_TAIL(*sq_ring,
On Fri, May 29, 2026 at 01:30:11AM +0300, Cyrill Gorcunov wrote:
> When we generate completion for SQ the opcode while being properly read
> from ring buffer is ignored when written back to completion. Seems
> to be a simple typo.
>
> Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
> Reviewed-by: Jacob Moroni <jmoroni@google.com>
> ---
> Hopefully I didn't miss something obvious here, found it while been
> fighting with unrelated issue.

Applied to for-next, thanks

Jason
On Wed, Jun 03, 2026 at 03:16:36PM -0300, Jason Gunthorpe wrote:
> On Fri, May 29, 2026 at 01:30:11AM +0300, Cyrill Gorcunov wrote:
> > When we generate completion for SQ the opcode while being properly read
> > from ring buffer is ignored when written back to completion. Seems
> > to be a simple typo.
> >
> > Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
> > Reviewed-by: Jacob Moroni <jmoroni@google.com>
> > ---
> > Hopefully I didn't miss something obvious here, found it while been
> > fighting with unrelated issue.
>
> Applied to for-next, thanks

Thanks a lot, Jason! What about the series

https://lore.kernel.org/netdev/20260522142239.628965142@gmail.com/ ?

Guys, could you take a look, please?

Cyrill
> Hopefully I didn't miss something obvious here, found it while been
> fighting with unrelated issue.

Nice find. I took a look and your fix seems valid to me.

I guess prior to this fix, it could potentially generate flush
completions for the NOP/pad WQEs which I can see being a problem
since the WR ID would be totally bogus.

Reviewed-by: Jacob Moroni <jmoroni@google.com>
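
To make the effect of the typo concrete, here is a minimal userspace sketch. It is not the driver code: DEMO_GENMASK_ULL() and demo_field_get() are hypothetical stand-ins for the kernel's GENMASK_ULL() and FIELD_GET(), and the opcode field position (bits 37:32) is an assumption made for illustration only. It shows that extracting a field from the mask itself always yields the all-ones field value, whatever the WQE holds.

```c
#include <stdint.h>

/* Userspace stand-in for the kernel's GENMASK_ULL(h, l): bits h..l set. */
#define DEMO_GENMASK_ULL(h, l) \
	(((~0ULL) << (l)) & (~0ULL >> (63 - (h))))

/*
 * Userspace stand-in for FIELD_GET(mask, val): mask the value, then
 * shift it down by dividing by the lowest set bit of the mask.
 * The mask must be non-zero and contiguous.
 */
static inline uint64_t demo_field_get(uint64_t mask, uint64_t val)
{
	return (val & mask) / (mask & (~mask + 1));
}

/* Assumed layout for this sketch: 6-bit opcode at bits 37:32. */
#define DEMO_OPCODE_MASK DEMO_GENMASK_ULL(37, 32)

/* Pre-fix form: the mask is passed as the value, so wqe_qword is ignored. */
static uint8_t demo_opcode_buggy(uint64_t wqe_qword)
{
	(void)wqe_qword;
	return (uint8_t)demo_field_get(DEMO_OPCODE_MASK, DEMO_OPCODE_MASK);
}

/* Post-fix form: the opcode is extracted from the WQE quadword. */
static uint8_t demo_opcode_fixed(uint64_t wqe_qword)
{
	return (uint8_t)demo_field_get(DEMO_OPCODE_MASK, wqe_qword);
}
```

With this layout, demo_opcode_buggy() returns 0x3f for every input, while demo_opcode_fixed(0x5ULL << 32) returns 0x05.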
Thanks a lot for the review, Jacob! I forgot to add that the bug came in
with commit 81091d7696ae71627ff80bbf2c6b0986d2c1cce3, which is in the 5.18
series, so I guess we might need to get the fix into the stable tree later.
BTW, your fix matches the OOT driver code:
https://github.com/intel/ethernet-linux-irdma-and-idpf/blob/main/rdma-driver/src/irdma/utils.c#L3561

Regarding the "unrelated issue" - is it an irdma issue? Anything we
can help with?

- Jake
On Tue, Jun 02, 2026 at 10:11:46AM -0400, Jacob Moroni wrote:
> BTW, your fix matches the OOT driver code:
> https://github.com/intel/ethernet-linux-irdma-and-idpf/blob/main/rdma-driver/src/irdma/utils.c#L3561
>
Wow, I didn't know about the OOT driver :-) Thanks for pointing it out!
> Regarding the "unrelated issue" - is it an irdma issue? Anything we
> can help with?
Well, the issue I'm dealing with is still messy (I mean I'm not sure whether
it is an irdma issue or not: I'm losing completions on posted operations when
the hardware is in reset mode and at the same time the card is physically
removed). Look, once I manage to collect all the pieces of the problem I'll
come back with a report ) At the moment I suspect that we need to *flush* the
queue here instead of cancelling it
---
static int irdma_destroy_qp(struct ib_qp *ibqp, struct ib_udata *udata)
{
	struct irdma_qp *iwqp = to_iwqp(ibqp);
	struct irdma_device *iwdev = iwqp->iwdev;
	iwqp->sc_qp.qp_uk.destroy_pending = true;
	if (iwqp->iwarp_state == IRDMA_QP_STATE_RTS)
		irdma_modify_qp_to_err(&iwqp->sc_qp);
	if (!iwqp->user_mode)
-->		cancel_delayed_work_sync(&iwqp->dwork_flush);
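
The difference between cancelling and flushing a pending delayed work item can be sketched with a toy userspace model. Everything below is hypothetical (the demo_* names and struct are not kernel or irdma code), and it leaves out the part where the real calls wait for an already-running handler. It only shows that a cancel drops pending work without running its handler, while a flush runs it first.

```c
#include <stdbool.h>

/* Hypothetical toy model of one delayed work item; not the kernel's struct. */
struct demo_dwork {
	bool pending;		/* timer armed, handler not yet run */
	void (*fn)(void *arg);	/* handler to run when the work fires */
	void *arg;
};

/* Model of cancelling: drop the pending work, the handler never runs. */
static bool demo_cancel_sync(struct demo_dwork *w)
{
	bool was_pending = w->pending;

	w->pending = false;
	return was_pending;
}

/* Model of flushing: run the pending handler now, then return. */
static bool demo_flush(struct demo_dwork *w)
{
	if (!w->pending)
		return false;

	w->pending = false;
	w->fn(w->arg);
	return true;
}

/* Hypothetical handler: counts one generated flush completion. */
static void demo_flush_handler(void *arg)
{
	int *completions = arg;

	(*completions)++;
}
```

In this model, cancelling a pending item leaves the completion counter at zero, whereas flushing the same item bumps it to one. Whether that is actually the cause of the lost completions in irdma is exactly the open question in this thread.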
On Tue, Jun 02, 2026 at 10:11:46AM -0400, Jacob Moroni wrote:
> BTW, your fix matches the OOT driver code:
> https://github.com/intel/ethernet-linux-irdma-and-idpf/blob/main/rdma-driver/src/irdma/utils.c#L3561

Oh lovely.

Can I delete irdma then if Intel would prefer to keep functional bug
fixes out of tree? Hmm?

Jason
On Tue, Jun 02, 2026 at 03:25:03PM -0300, Jason Gunthorpe wrote:
> On Tue, Jun 02, 2026 at 10:11:46AM -0400, Jacob Moroni wrote:
> > BTW, your fix matches the OOT driver code:
> > https://github.com/intel/ethernet-linux-irdma-and-idpf/blob/main/rdma-driver/src/irdma/utils.c#L3561
>
> Oh lovely.
>
> Can I delete irdma then if Intel would prefer to keep functional bug
> fixes out of tree? Hmm?

I guess there is simply not enough manpower to keep the OOT code in sync
with the kernel tree.

Cyrill
Sorry, didn't mean to open up a can of worms. I personally would like to
move in the other direction - let's get all of these fixes upstream :)

> i'm loosing completions on posted operations when
> hardware in reset mode

I think this has come up before. For example, I vaguely recall an async VF
reset during heavy RDMA CM activity resulting in some timeout firing
(cm_destroy_id_wait_timeout), which I think was due to completions getting
lost.

You may be on to something regarding flushing that WQ. I'll give it a try
on my end.

- Jake

> At
> moment I suspect that we need to *flush* queue here instead of cancelling it