[PATCH iwl-next] libie: log more info when virtchnl fails

Li Li posted 1 patch 1 month, 3 weeks ago
There is a newer version of this series
drivers/net/ethernet/intel/libie/controlq.c | 15 +++++++++++++++
1 file changed, 15 insertions(+)
[PATCH iwl-next] libie: log more info when virtchnl fails
Posted by Li Li 1 month, 3 weeks ago
Virtchnl failures can be hard to debug without logs. Logging the details
of virtchnl transactions can be useful for debugging virtchnl-related
issues.

Tested: Built and booted on a test machine.

Signed-off-by: Li Li <boolli@google.com>
---
 drivers/net/ethernet/intel/libie/controlq.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/drivers/net/ethernet/intel/libie/controlq.c b/drivers/net/ethernet/intel/libie/controlq.c
index ebc05355e39d..7eaa77413621 100644
--- a/drivers/net/ethernet/intel/libie/controlq.c
+++ b/drivers/net/ethernet/intel/libie/controlq.c
@@ -762,6 +762,16 @@ libie_ctlq_xn_process_recv(struct libie_ctlq_xn_recv_params *params,
 	status = ctlq_msg->chnl_retval ? -EFAULT : 0;
 
 	xn = &xnm->ring[xn_index];
+
+	if (ctlq_msg->chnl_retval) {
+		dev_err_ratelimited(
+			params->ctlq->dev,
+			"Non-zero virtchnl ret val (msg op: %u, ret val: %u, msg_cookie: %u, data_len: %u); xn op: %u, id: %u, cookie: %u\n",
+			ctlq_msg->chnl_opcode, ctlq_msg->chnl_retval,
+			msg_cookie, ctlq_msg->data_len, xn->virtchnl_opcode,
+			xn->index, xn->cookie);
+	}
+
 	if (ctlq_msg->chnl_opcode != xn->virtchnl_opcode ||
 	    msg_cookie != xn->cookie)
 		return false;
@@ -1011,6 +1021,11 @@ int libie_ctlq_xn_send(struct libie_ctlq_xn_send_params *params)
 		params->recv_mem = xn->recv_mem;
 		break;
 	default:
+		dev_notice_ratelimited(
+			params->ctlq->dev,
+			"Transaction failed (op %u, xn state: %d, id: %u, cookie: %u, size: %zu)\n",
+			params->chnl_opcode, xn->state, xn->index, xn->cookie,
+			xn->recv_mem.iov_len);
 		ret = -EBADMSG;
 		break;
 	}
-- 
2.54.0.rc2.544.gc7ae2d5bb8-goog
Re: [Intel-wired-lan] [PATCH iwl-next] libie: log more info when virtchnl fails
Posted by Paul Menzel 1 month, 2 weeks ago
Dear Li,


Thank you for your patch.

Am 24.04.26 um 05:15 schrieb Li Li via Intel-wired-lan:
> Virtchnl failures can be hard to debug without logs. Logging the details
> of virtchnl transactions can be useful for debugging virtchnl-related
> issues.

Why is the first added one added as error, and the second as info?

Do you have a reproducer to get each of the new log messages?

> Tested: Built and booted on a test machine.

Please paste the new messages.

> Signed-off-by: Li Li <boolli@google.com>
> ---
>   drivers/net/ethernet/intel/libie/controlq.c | 15 +++++++++++++++
>   1 file changed, 15 insertions(+)
> 
> diff --git a/drivers/net/ethernet/intel/libie/controlq.c b/drivers/net/ethernet/intel/libie/controlq.c
> index ebc05355e39d..7eaa77413621 100644
> --- a/drivers/net/ethernet/intel/libie/controlq.c
> +++ b/drivers/net/ethernet/intel/libie/controlq.c
> @@ -762,6 +762,16 @@ libie_ctlq_xn_process_recv(struct libie_ctlq_xn_recv_params *params,
>   	status = ctlq_msg->chnl_retval ? -EFAULT : 0;
>   
>   	xn = &xnm->ring[xn_index];
> +
> +	if (ctlq_msg->chnl_retval) {
> +		dev_err_ratelimited(
> +			params->ctlq->dev,
> +			"Non-zero virtchnl ret val (msg op: %u, ret val: %u, msg_cookie: %u, data_len: %u); xn op: %u, id: %u, cookie: %u\n",
> +			ctlq_msg->chnl_opcode, ctlq_msg->chnl_retval,
> +			msg_cookie, ctlq_msg->data_len, xn->virtchnl_opcode,
> +			xn->index, xn->cookie);
> +	}
> +
>   	if (ctlq_msg->chnl_opcode != xn->virtchnl_opcode ||
>   	    msg_cookie != xn->cookie)
>   		return false;
> @@ -1011,6 +1021,11 @@ int libie_ctlq_xn_send(struct libie_ctlq_xn_send_params *params)
>   		params->recv_mem = xn->recv_mem;
>   		break;
>   	default:
> +		dev_notice_ratelimited(
> +			params->ctlq->dev,
> +			"Transaction failed (op %u, xn state: %d, id: %u, cookie: %u, size: %zu)\n",
> +			params->chnl_opcode, xn->state, xn->index, xn->cookie,
> +			xn->recv_mem.iov_len);
>   		ret = -EBADMSG;
>   		break;
>   	}


Kind regards,

Paul
Re: [Intel-wired-lan] [PATCH iwl-next] libie: log more info when virtchnl fails
Posted by Li Li 1 month, 2 weeks ago
On Mon, Apr 27, 2026 at 3:40 AM Paul Menzel <pmenzel@molgen.mpg.de> wrote:
>
> Dear Li,
>
>
> Thank you for your patch.
>
> Am 24.04.26 um 05:15 schrieb Li Li via Intel-wired-lan:
> > Virtchnl failures can be hard to debug without logs. Logging the details
> > of virtchnl transactions can be useful for debugging virtchnl-related
> > issues.
>
> Why is the first added one added as error, and the second as info?

Originally I was thinking the second log doesn't warrant an err level
log, but on second thought, I think it makes more sense to use
dev_err_ratelimited() in both places. I will send out a v3 patch to
update it.

>
>
> Do you have a reproducer to get each of the new log messages?

We recently had an IOMMU subsystem bug in our internal kernel that
caused virtchnl failures, and this patch helped root cause the IOMMU
subsystem issue. However, the patch we used internally was applied to
the OOT driver, which is implemented slightly differently than the
upstream driver. Example log we saw in our OOT driver, with the patch
applied:

idpf 0000:43:00.0: Non-zero virtchnl ret val (msg: 523, ret val: 6,
data_len: 0) (xn 523, id: 254, salt: 40, iov len: 128)

The message looks slightly different from the log added in this patch
because the OOT driver uses different terminology.

>
> > Tested: Built and booted on a test machine.
>
> Please paste the new messages.

As mentioned above, I don't have a reproducer for a real virtchnl
error in the upstream kernel, but I synthetically produced an error by
changing ctlq_msg->chnl_retval, and the following message appeared:

[  309.891790] idpf 0000:01:00.0: Non-zero virtchnl ret val (msg op:
1, ret val: 6, msg_cookie: 0, data_len: 8); xn op: 1, id: 0, cookie: 0
[  309.892357] idpf 0000:01:00.0: Transaction failed (op 1, xn state:
3, id: 0, cookie: 0, size: 8)

I will mention it in the v3 patch as well.



>
> > Signed-off-by: Li Li <boolli@google.com>
> > ---
> >   drivers/net/ethernet/intel/libie/controlq.c | 15 +++++++++++++++
> >   1 file changed, 15 insertions(+)
> >
> > diff --git a/drivers/net/ethernet/intel/libie/controlq.c b/drivers/net/ethernet/intel/libie/controlq.c
> > index ebc05355e39d..7eaa77413621 100644
> > --- a/drivers/net/ethernet/intel/libie/controlq.c
> > +++ b/drivers/net/ethernet/intel/libie/controlq.c
> > @@ -762,6 +762,16 @@ libie_ctlq_xn_process_recv(struct libie_ctlq_xn_recv_params *params,
> >       status = ctlq_msg->chnl_retval ? -EFAULT : 0;
> >
> >       xn = &xnm->ring[xn_index];
> > +
> > +     if (ctlq_msg->chnl_retval) {
> > +             dev_err_ratelimited(
> > +                     params->ctlq->dev,
> > +                     "Non-zero virtchnl ret val (msg op: %u, ret val: %u, msg_cookie: %u, data_len: %u); xn op: %u, id: %u, cookie: %u\n",
> > +                     ctlq_msg->chnl_opcode, ctlq_msg->chnl_retval,
> > +                     msg_cookie, ctlq_msg->data_len, xn->virtchnl_opcode,
> > +                     xn->index, xn->cookie);
> > +     }
> > +
> >       if (ctlq_msg->chnl_opcode != xn->virtchnl_opcode ||
> >           msg_cookie != xn->cookie)
> >               return false;
> > @@ -1011,6 +1021,11 @@ int libie_ctlq_xn_send(struct libie_ctlq_xn_send_params *params)
> >               params->recv_mem = xn->recv_mem;
> >               break;
> >       default:
> > +             dev_notice_ratelimited(
> > +                     params->ctlq->dev,
> > +                     "Transaction failed (op %u, xn state: %d, id: %u, cookie: %u, size: %zu)\n",
> > +                     params->chnl_opcode, xn->state, xn->index, xn->cookie,
> > +                     xn->recv_mem.iov_len);
> >               ret = -EBADMSG;
> >               break;
> >       }
>
>
> Kind regards,
>
> Paul
RE: [Intel-wired-lan] [PATCH iwl-next] libie: log more info when virtchnl fails
Posted by Loktionov, Aleksandr 1 month, 3 weeks ago

> -----Original Message-----
> From: Intel-wired-lan <intel-wired-lan-bounces@osuosl.org> On Behalf
> Of Li Li via Intel-wired-lan
> Sent: Friday, April 24, 2026 5:16 AM
> To: Nguyen, Anthony L <anthony.l.nguyen@intel.com>; Kitszel,
> Przemyslaw <przemyslaw.kitszel@intel.com>; David S. Miller
> <davem@davemloft.net>; Jakub Kicinski <kuba@kernel.org>; Eric Dumazet
> <edumazet@google.com>; intel-wired-lan@lists.osuosl.org
> Cc: netdev@vger.kernel.org; linux-kernel@vger.kernel.org; David
> Decotigny <decot@google.com>; Singhai, Anjali
> <anjali.singhai@intel.com>; Samudrala, Sridhar
> <sridhar.samudrala@intel.com>; Brian Vazquez <brianvv@google.com>; Li
> Li <boolli@google.com>; Tantilov, Emil S <emil.s.tantilov@intel.com>
> Subject: [Intel-wired-lan] [PATCH iwl-next] libie: log more info when
> virtchnl fails
> 
> Virtchnl failures can be hard to debug without logs. Logging the
> details of virtchnl transactions can be useful for debugging virtchnl-
> related issues.
> 
> Tested: Built and booted on a test machine.
> 
> Signed-off-by: Li Li <boolli@google.com>
> ---
>  drivers/net/ethernet/intel/libie/controlq.c | 15 +++++++++++++++
>  1 file changed, 15 insertions(+)
> 
> diff --git a/drivers/net/ethernet/intel/libie/controlq.c
> b/drivers/net/ethernet/intel/libie/controlq.c
> index ebc05355e39d..7eaa77413621 100644
> --- a/drivers/net/ethernet/intel/libie/controlq.c
> +++ b/drivers/net/ethernet/intel/libie/controlq.c
> @@ -762,6 +762,16 @@ libie_ctlq_xn_process_recv(struct
> libie_ctlq_xn_recv_params *params,
>  	status = ctlq_msg->chnl_retval ? -EFAULT : 0;
> 
>  	xn = &xnm->ring[xn_index];
> +
> +	if (ctlq_msg->chnl_retval) {
> +		dev_err_ratelimited(
> +			params->ctlq->dev,
> +			"Non-zero virtchnl ret val (msg op: %u, ret val:
> %u, msg_cookie: %u, data_len: %u); xn op: %u, id: %u, cookie: %u\n",
> +			ctlq_msg->chnl_opcode, ctlq_msg->chnl_retval,
> +			msg_cookie, ctlq_msg->data_len, xn-
> >virtchnl_opcode,
> +			xn->index, xn->cookie);
> +	}
> +
>  	if (ctlq_msg->chnl_opcode != xn->virtchnl_opcode ||
>  	    msg_cookie != xn->cookie)
>  		return false;
> @@ -1011,6 +1021,11 @@ int libie_ctlq_xn_send(struct
> libie_ctlq_xn_send_params *params)
>  		params->recv_mem = xn->recv_mem;
>  		break;
>  	default:
> +		dev_notice_ratelimited(
> +			params->ctlq->dev,
> +			"Transaction failed (op %u, xn state: %d, id: %u,
> cookie: %u, size: %zu)\n",
> +			params->chnl_opcode, xn->state, xn->index, xn-
> >cookie,
> +			xn->recv_mem.iov_len);
For me dev_notice_ratelimited() level is low for a failure messages.
Why not dev_warn_ratelimited() instead?

>  		ret = -EBADMSG;
>  		break;
>  	}
> --
> 2.54.0.rc2.544.gc7ae2d5bb8-goog

Re: [Intel-wired-lan] [PATCH iwl-next] libie: log more info when virtchnl fails
Posted by Li Li 1 month, 3 weeks ago
On Fri, Apr 24, 2026 at 9:08 AM Loktionov, Aleksandr
<aleksandr.loktionov@intel.com> wrote:
>
>
>
> > -----Original Message-----
> > From: Intel-wired-lan <intel-wired-lan-bounces@osuosl.org> On Behalf
> > Of Li Li via Intel-wired-lan
> > Sent: Friday, April 24, 2026 5:16 AM
> > To: Nguyen, Anthony L <anthony.l.nguyen@intel.com>; Kitszel,
> > Przemyslaw <przemyslaw.kitszel@intel.com>; David S. Miller
> > <davem@davemloft.net>; Jakub Kicinski <kuba@kernel.org>; Eric Dumazet
> > <edumazet@google.com>; intel-wired-lan@lists.osuosl.org
> > Cc: netdev@vger.kernel.org; linux-kernel@vger.kernel.org; David
> > Decotigny <decot@google.com>; Singhai, Anjali
> > <anjali.singhai@intel.com>; Samudrala, Sridhar
> > <sridhar.samudrala@intel.com>; Brian Vazquez <brianvv@google.com>; Li
> > Li <boolli@google.com>; Tantilov, Emil S <emil.s.tantilov@intel.com>
> > Subject: [Intel-wired-lan] [PATCH iwl-next] libie: log more info when
> > virtchnl fails
> >
> > Virtchnl failures can be hard to debug without logs. Logging the
> > details of virtchnl transactions can be useful for debugging virtchnl-
> > related issues.
> >
> > Tested: Built and booted on a test machine.
> >
> > Signed-off-by: Li Li <boolli@google.com>
> > ---
> >  drivers/net/ethernet/intel/libie/controlq.c | 15 +++++++++++++++
> >  1 file changed, 15 insertions(+)
> >
> > diff --git a/drivers/net/ethernet/intel/libie/controlq.c
> > b/drivers/net/ethernet/intel/libie/controlq.c
> > index ebc05355e39d..7eaa77413621 100644
> > --- a/drivers/net/ethernet/intel/libie/controlq.c
> > +++ b/drivers/net/ethernet/intel/libie/controlq.c
> > @@ -762,6 +762,16 @@ libie_ctlq_xn_process_recv(struct
> > libie_ctlq_xn_recv_params *params,
> >       status = ctlq_msg->chnl_retval ? -EFAULT : 0;
> >
> >       xn = &xnm->ring[xn_index];
> > +
> > +     if (ctlq_msg->chnl_retval) {
> > +             dev_err_ratelimited(
> > +                     params->ctlq->dev,
> > +                     "Non-zero virtchnl ret val (msg op: %u, ret val:
> > %u, msg_cookie: %u, data_len: %u); xn op: %u, id: %u, cookie: %u\n",
> > +                     ctlq_msg->chnl_opcode, ctlq_msg->chnl_retval,
> > +                     msg_cookie, ctlq_msg->data_len, xn-
> > >virtchnl_opcode,
> > +                     xn->index, xn->cookie);
> > +     }
> > +
> >       if (ctlq_msg->chnl_opcode != xn->virtchnl_opcode ||
> >           msg_cookie != xn->cookie)
> >               return false;
> > @@ -1011,6 +1021,11 @@ int libie_ctlq_xn_send(struct
> > libie_ctlq_xn_send_params *params)
> >               params->recv_mem = xn->recv_mem;
> >               break;
> >       default:
> > +             dev_notice_ratelimited(
> > +                     params->ctlq->dev,
> > +                     "Transaction failed (op %u, xn state: %d, id: %u,
> > cookie: %u, size: %zu)\n",
> > +                     params->chnl_opcode, xn->state, xn->index, xn-
> > >cookie,
> > +                     xn->recv_mem.iov_len);
> For me dev_notice_ratelimited() level is low for a failure messages.
> Why not dev_warn_ratelimited() instead?

Good call. Switched to using  dev_warn_ratelimited() in the v2 patch.
Thanks for the suggestion!

>
> >               ret = -EBADMSG;
> >               break;
> >       }
> > --
> > 2.54.0.rc2.544.gc7ae2d5bb8-goog
>