[v1] net: usb: r8152: fix transmit queue timeout | Patchew

[PATCH] net: usb: r8152: fix transmit queue timeout

insyelu posted 1 patch 3 weeks, 4 days ago

Diff against v2 v2 v2 v2
Download mbox

There is a newer version of this series

drivers/net/usb/r8152.c | 2 ++
1 file changed, 2 insertions(+)

Expand all Fold all

[PATCH] net: usb: r8152: fix transmit queue timeout

Posted by insyelu 3 weeks, 4 days ago

When the TX queue length reaches the threshold, the netdev watchdog
immediately detects a TX queue timeout.

This patch updates the transmit queue's trans_start timestamp upon
completion of each asynchronous USB URB submission on the TX path,
ensuring the network watchdog correctly reflects ongoing transmission
activity.

Signed-off-by: insyelu <insyelu@gmail.com>
---
 drivers/net/usb/r8152.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/net/usb/r8152.c b/drivers/net/usb/r8152.c
index fa5192583860..afec602a5fdb 100644
--- a/drivers/net/usb/r8152.c
+++ b/drivers/net/usb/r8152.c
@@ -1954,6 +1954,8 @@ static void write_bulk_callback(struct urb *urb)
 
 	if (!skb_queue_empty(&tp->tx_queue))
 		tasklet_schedule(&tp->tx_tl);
+
+	netif_trans_update(netdev);
 }
 
 static void intr_callback(struct urb *urb)
-- 
2.34.1

[PATCH v2] net: usb: r8152: fix transmit queue timeout

Posted by insyelu 3 weeks, 2 days ago

When the TX queue length reaches the threshold, the netdev watchdog
immediately detects a TX queue timeout.

This patch updates the trans_start timestamp of the transmit queue
on every asynchronous USB URB submission along the transmit path,
ensuring that the network watchdog accurately reflects ongoing
transmission activity.

Signed-off-by: insyelu <insyelu@gmail.com>
---
v2: Update the transmit timestamp when submitting the USB URB.
---
 drivers/net/usb/r8152.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/net/usb/r8152.c b/drivers/net/usb/r8152.c
index fa5192583860..880b59ed5422 100644
--- a/drivers/net/usb/r8152.c
+++ b/drivers/net/usb/r8152.c
@@ -2449,6 +2449,8 @@ static int r8152_tx_agg_fill(struct r8152 *tp, struct tx_agg *agg)
 	ret = usb_submit_urb(agg->urb, GFP_ATOMIC);
 	if (ret < 0)
 		usb_autopm_put_interface_async(tp->intf);
+	else
+		netif_trans_update(tp->netdev);
 
 out_tx_fill:
 	return ret;
-- 
2.34.1

Re: [PATCH v2] net: usb: r8152: fix transmit queue timeout

Posted by Andrew Lunn 3 weeks, 1 day ago

On Fri, Jan 16, 2026 at 10:37:25AM +0800, insyelu wrote:
> When the TX queue length reaches the threshold, the netdev watchdog
> immediately detects a TX queue timeout.
> 
> This patch updates the trans_start timestamp of the transmit queue
> on every asynchronous USB URB submission along the transmit path,
> ensuring that the network watchdog accurately reflects ongoing
> transmission activity.
> 
> Signed-off-by: insyelu <insyelu@gmail.com>
> ---
> v2: Update the transmit timestamp when submitting the USB URB.

Always create a new thread for a new version of a patch. The CI/CD
system just thinks this is a discussion comment, so has ignored it.

    Andrew

---
pw-bot: cr

RE: [PATCH] net: usb: r8152: fix transmit queue timeout

Posted by Hayes Wang 3 weeks, 4 days ago

insyelu <insyelu@gmail.com>
> Sent: Wednesday, January 14, 2026 10:56 AM
[...]
> When the TX queue length reaches the threshold, the netdev watchdog
> immediately detects a TX queue timeout.
> 
> This patch updates the transmit queue's trans_start timestamp upon
> completion of each asynchronous USB URB submission on the TX path,
> ensuring the network watchdog correctly reflects ongoing transmission
> activity.
> 
> Signed-off-by: insyelu <insyelu@gmail.com>
> ---
>  drivers/net/usb/r8152.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/drivers/net/usb/r8152.c b/drivers/net/usb/r8152.c
> index fa5192583860..afec602a5fdb 100644
> --- a/drivers/net/usb/r8152.c
> +++ b/drivers/net/usb/r8152.c
> @@ -1954,6 +1954,8 @@ static void write_bulk_callback(struct urb *urb)
> 
>         if (!skb_queue_empty(&tp->tx_queue))
>                 tasklet_schedule(&tp->tx_tl);
> +
> +       netif_trans_update(netdev);
>  }

Based on the definition of netif_trans_update(), I think it would be better to move it into tx_agg_fill().
Such as

--- a/drivers/net/usb/r8152.c
+++ b/drivers/net/usb/r8152.c
@@ -2449,6 +2449,8 @@ static int r8152_tx_agg_fill(struct r8152 *tp, struct tx_agg *agg)
        ret = usb_submit_urb(agg->urb, GFP_ATOMIC);
        if (ret < 0)
                usb_autopm_put_interface_async(tp->intf);
+       else
+               netif_trans_update(tp->netdev);

 out_tx_fill:
        return ret;

Best Regards,
Hayes

Re: [PATCH] net: usb: r8152: fix transmit queue timeout

Posted by lu lu 3 weeks, 3 days ago

Hayes Wang <hayeswang@realtek.com> 于2026年1月14日周三 12:38写道：
>
> insyelu <insyelu@gmail.com>
> > Sent: Wednesday, January 14, 2026 10:56 AM
> [...]
> > When the TX queue length reaches the threshold, the netdev watchdog
> > immediately detects a TX queue timeout.
> >
> > This patch updates the transmit queue's trans_start timestamp upon
> > completion of each asynchronous USB URB submission on the TX path,
> > ensuring the network watchdog correctly reflects ongoing transmission
> > activity.
> >
> > Signed-off-by: insyelu <insyelu@gmail.com>
> > ---
> >  drivers/net/usb/r8152.c | 2 ++
> >  1 file changed, 2 insertions(+)
> >
> > diff --git a/drivers/net/usb/r8152.c b/drivers/net/usb/r8152.c
> > index fa5192583860..afec602a5fdb 100644
> > --- a/drivers/net/usb/r8152.c
> > +++ b/drivers/net/usb/r8152.c
> > @@ -1954,6 +1954,8 @@ static void write_bulk_callback(struct urb *urb)
> >
> >         if (!skb_queue_empty(&tp->tx_queue))
> >                 tasklet_schedule(&tp->tx_tl);
> > +
> > +       netif_trans_update(netdev);
> >  }
>
> Based on the definition of netif_trans_update(), I think it would be better to move it into tx_agg_fill().
> Such as

HI Hayes Wang,

To reduce the performance impact on the tx_tl tasklet’s transmit path,
netif_trans_update() has been moved from the main transmit path into
write_bulk_callback (the USB transfer completion callback).
The main considerations are as follows:
1. Reduce frequent tasklet overhead
netif_trans_update() is invoked frequently under high-throughput
conditions. Calling it directly in the main transmit path continuously
introduces a small but noticeable CPU overhead, degrading the
scheduling efficiency of the tx_tl tasklet.
2. Move non-critical operations out of the critical path
By deferring netif_trans_update() to the USB callback thread—and
ensuring it executes after tasklet_schedule(&tp->tx_tl)—the timestamp
update is removed from the critical transmit scheduling path, further
reducing the burden on tx_tl.
3. This design follows the approach used in mature USB NIC drivers
such as rtl8150

Although it is semantically more intuitive to call
netif_trans_update() at the point where tx_agg_fill is defined,
delaying the timestamp update is both reasonable and safe from a
performance standpoint—the network watchdog mechanism only requires a
coarse indication that “a transmission occurred recently” and does not
require strict synchronization with the submission time of each
individual packet.
If strict semantic consistency is preferred, I am happy to resubmit a
patch that aligns with the above rationale.

Best Regards,
insyelu
>
> --- a/drivers/net/usb/r8152.c
> +++ b/drivers/net/usb/r8152.c
> @@ -2449,6 +2449,8 @@ static int r8152_tx_agg_fill(struct r8152 *tp, struct tx_agg *agg)
>         ret = usb_submit_urb(agg->urb, GFP_ATOMIC);
>         if (ret < 0)
>                 usb_autopm_put_interface_async(tp->intf);
> +       else
> +               netif_trans_update(tp->netdev);
>
>  out_tx_fill:
>         return ret;
>
> Best Regards,
> Hayes
>

RE: [PATCH] net: usb: r8152: fix transmit queue timeout

Posted by Hayes Wang 3 weeks, 2 days ago

lu lu <insyelu@gmail.com>
> Sent: Thursday, January 15, 2026 9:37 AM
[...]
> To reduce the performance impact on the tx_tl tasklet’s transmit path,
> netif_trans_update() has been moved from the main transmit path into
> write_bulk_callback (the USB transfer completion callback).
> The main considerations are as follows:
> 1. Reduce frequent tasklet overhead
> netif_trans_update() is invoked frequently under high-throughput
> conditions. Calling it directly in the main transmit path continuously
> introduces a small but noticeable CPU overhead, degrading the
> scheduling efficiency of the tx_tl tasklet.
> 2. Move non-critical operations out of the critical path
> By deferring netif_trans_update() to the USB callback thread—and
> ensuring it executes after tasklet_schedule(&tp->tx_tl)—the timestamp
> update is removed from the critical transmit scheduling path, further
> reducing the burden on tx_tl.

Excuse me, I do not fully understand the reasoning above.
It seems that this change merely shifts the time (or effort) from tx_tl to the TX completion callback.

While the intention is to make tx_tl run faster, this also delays the completion of the callback,
which in turn may delay both the next callback execution and the next scheduling of tx_tl.

From this perspective, it is unclear what is actually being saved.

Have you observed a measurable difference based on testing?

If you want to reduce the frequency of calling netif_trans_update(),
you could try something like the following. This way,
netif_trans_update() would not be executed on every transmission.

--- a/drivers/net/usb/r8152.c
+++ b/drivers/net/usb/r8152.c
@@ -2432,9 +2432,12 @@ static int r8152_tx_agg_fill(struct r8152 *tp, struct tx_agg *agg)

        netif_tx_lock(tp->netdev);

-       if (netif_queue_stopped(tp->netdev) &&
-           skb_queue_len(&tp->tx_queue) < tp->tx_qlen)
+       if (netif_queue_stopped(tp->netdev)) {
+           if (skb_queue_len(&tp->tx_queue) < tp->tx_qlen)
                netif_wake_queue(tp->netdev);
+           else
+               netif_trans_update(tp->netdev);
+       }

        netif_tx_unlock(tp->netdev);

Best Regards,
Hayes

Re: [PATCH] net: usb: r8152: fix transmit queue timeout

Posted by lu lu 3 weeks, 2 days ago

Hayes Wang <hayeswang@realtek.com> 于2026年1月15日周四 19:42写道：
>
> lu lu <insyelu@gmail.com>
> > Sent: Thursday, January 15, 2026 9:37 AM
> [...]
> > To reduce the performance impact on the tx_tl tasklet’s transmit path,
> > netif_trans_update() has been moved from the main transmit path into
> > write_bulk_callback (the USB transfer completion callback).
> > The main considerations are as follows:
> > 1. Reduce frequent tasklet overhead
> > netif_trans_update() is invoked frequently under high-throughput
> > conditions. Calling it directly in the main transmit path continuously
> > introduces a small but noticeable CPU overhead, degrading the
> > scheduling efficiency of the tx_tl tasklet.
> > 2. Move non-critical operations out of the critical path
> > By deferring netif_trans_update() to the USB callback thread—and
> > ensuring it executes after tasklet_schedule(&tp->tx_tl)—the timestamp
> > update is removed from the critical transmit scheduling path, further
> > reducing the burden on tx_tl.
>
> Excuse me, I do not fully understand the reasoning above.
> It seems that this change merely shifts the time (or effort) from tx_tl to the TX completion callback.
>
> While the intention is to make tx_tl run faster, this also delays the completion of the callback,
> which in turn may delay both the next callback execution and the next scheduling of tx_tl.
>
> From this perspective, it is unclear what is actually being saved.
>
> Have you observed a measurable difference based on testing?

HW: RK3399-based development board (dual-core Cortex-A72 + quad-core Cortex-A53)
OS: LEDE (OpenWrt)
NIC: 2.5G USB Ethernet adapter connected via USB 3.0

Due to platform performance limitations, it cannot reach 2.5G. When
processing in the write_bulk_callback and during USB submission,
iperf3 stress testing shows a difference of approximately 8Mbit/s.
This may be related to the platform's big.LITTLE core configuration.


>
> If you want to reduce the frequency of calling netif_trans_update(),
> you could try something like the following. This way,
> netif_trans_update() would not be executed on every transmission.
>
> --- a/drivers/net/usb/r8152.c
> +++ b/drivers/net/usb/r8152.c
> @@ -2432,9 +2432,12 @@ static int r8152_tx_agg_fill(struct r8152 *tp, struct tx_agg *agg)
>
>         netif_tx_lock(tp->netdev);
>
> -       if (netif_queue_stopped(tp->netdev) &&
> -           skb_queue_len(&tp->tx_queue) < tp->tx_qlen)
> +       if (netif_queue_stopped(tp->netdev)) {
> +           if (skb_queue_len(&tp->tx_queue) < tp->tx_qlen)
>                 netif_wake_queue(tp->netdev);
> +           else
> +               netif_trans_update(tp->netdev);
> +       }
The queue was stopped because it exceeded the threshold. Attempting to
refresh the time at this point is clearly too late.

Thank you for your suggestion. I will submit v2 (immediately updating
the transfer start time after successful USB submission).

Best Regards,
insyelu

>
>         netif_tx_unlock(tp->netdev);
>
> Best Regards,
> Hayes
>

RE: [PATCH] net: usb: r8152: fix transmit queue timeout

Posted by Hayes Wang 3 weeks, 2 days ago

lu lu <insyelu@gmail.com>
> Sent: Friday, January 16, 2026 10:11 AM
[...]
> >         netif_tx_lock(tp->netdev);
> >
> > -       if (netif_queue_stopped(tp->netdev) &&
> > -           skb_queue_len(&tp->tx_queue) < tp->tx_qlen)
> > +       if (netif_queue_stopped(tp->netdev)) {
> > +           if (skb_queue_len(&tp->tx_queue) < tp->tx_qlen)
> >                 netif_wake_queue(tp->netdev);
> > +           else
> > +               netif_trans_update(tp->netdev);
> > +       }
> The queue was stopped because it exceeded the threshold. Attempting to
> refresh the time at this point is clearly too late.

Why would this be considered too late?
Based on RTL8152_TX_TIMEOUT, there are about 5 seconds to
wake the queue or update the timestamp before a TX timeout occurs.
I believe 5 seconds should be sufficient.

If there is no TX submission for 5 seconds after the driver stops the queue,
then something is already wrong.

Best Regards,
Hayes

Re: [PATCH] net: usb: r8152: fix transmit queue timeout

Posted by lu lu 3 weeks, 1 day ago

Hayes Wang <hayeswang@realtek.com> 于2026年1月16日周五 11:11写道：
>
> lu lu <insyelu@gmail.com>
> > Sent: Friday, January 16, 2026 10:11 AM
> [...]
> > >         netif_tx_lock(tp->netdev);
> > >
> > > -       if (netif_queue_stopped(tp->netdev) &&
> > > -           skb_queue_len(&tp->tx_queue) < tp->tx_qlen)
> > > +       if (netif_queue_stopped(tp->netdev)) {
> > > +           if (skb_queue_len(&tp->tx_queue) < tp->tx_qlen)
> > >                 netif_wake_queue(tp->netdev);
> > > +           else
> > > +               netif_trans_update(tp->netdev);
> > > +       }
> > The queue was stopped because it exceeded the threshold. Attempting to
> > refresh the time at this point is clearly too late.
>
> Why would this be considered too late?

if (netif_queue_stopped(tp->netdev)) {
    if (skb_queue_len(&tp->tx_queue) < tp->tx_qlen)
        netif_wake_queue(tp->netdev);
    else
        netif_trans_update(tp->netdev);
}
The first time xmit stops the transmit queue, the queue is not full,
and it is successfully woken up afterward — OK.
The second time xmit stops the transmit queue, the network watchdog
times out immediately because the transmit timestamp was not refreshed
when the queue was last resumed — FAIL.
This scenario is logically possible.

There is no clear evidence that netif_trans_update imposes a
significant CPU load.
Please help me review:
https://patchwork.kernel.org/project/linux-usb/patch/20260116023725.8095-1-insyelu@gmail.com

> Based on RTL8152_TX_TIMEOUT, there are about 5 seconds to
> wake the queue or update the timestamp before a TX timeout occurs.
> I believe 5 seconds should be sufficient.

>
> If there is no TX submission for 5 seconds after the driver stops the queue,
> then something is already wrong.
>
> Best Regards,
> Hayes
>

RE: [PATCH] net: usb: r8152: fix transmit queue timeout

Posted by Hayes Wang 2 weeks, 6 days ago

Original Message-----
> From: lu lu <insyelu@gmail.com>
[...]
> if (netif_queue_stopped(tp->netdev)) {
>     if (skb_queue_len(&tp->tx_queue) < tp->tx_qlen)
>         netif_wake_queue(tp->netdev);
>     else
>         netif_trans_update(tp->netdev);
> }
> The first time xmit stops the transmit queue, the queue is not full,
> and it is successfully woken up afterward — OK.
> The second time xmit stops the transmit queue, the network watchdog
> times out immediately because the transmit timestamp was not refreshed
> when the queue was last resumed — FAIL.
> This scenario is logically possible.

This situation should not happen, because trans_start is also updated when the driver stops the TX queue.

https://elixir.bootlin.com/linux/v6.18.6/source/include/linux/netdevice.h#L3629

A TX timeout occurs only if the TX queue has been stopped for longer than RTL8152_TX_TIMEOUT.
It should not occur immediately when the driver stops the TX queue.

Therefore, what needs to be done is to update the timestamp when the TX queue is stopped.
Updating trans_start while the TX queue is not stopped is useless.

Best Regards,
Hayes

Re: [PATCH] net: usb: r8152: fix transmit queue timeout

Posted by lu lu 2 weeks, 5 days ago

Hayes Wang <hayeswang@realtek.com> 于2026年1月19日周一 10:51写道：
>
> Original Message-----
> > From: lu lu <insyelu@gmail.com>
> [...]
> > if (netif_queue_stopped(tp->netdev)) {
> >     if (skb_queue_len(&tp->tx_queue) < tp->tx_qlen)
> >         netif_wake_queue(tp->netdev);
> >     else
> >         netif_trans_update(tp->netdev);
> > }
> > The first time xmit stops the transmit queue, the queue is not full,
> > and it is successfully woken up afterward — OK.
> > The second time xmit stops the transmit queue, the network watchdog
> > times out immediately because the transmit timestamp was not refreshed
> > when the queue was last resumed — FAIL.
> > This scenario is logically possible.
>
> This situation should not happen, because trans_start is also updated when the driver stops the TX queue.
>
> https://elixir.bootlin.com/linux/v6.18.6/source/include/linux/netdevice.h#L3629
>
> A TX timeout occurs only if the TX queue has been stopped for longer than RTL8152_TX_TIMEOUT.
> It should not occur immediately when the driver stops the TX queue.
Thank you for the correction! Upon review, I confirmed that
netif_tx_stop_queue() already updates trans_start,
so the timeout scenario I originally envisioned does not actually occur.

>
> Therefore, what needs to be done is to update the timestamp when the TX queue is stopped.
> Updating trans_start while the TX queue is not stopped is useless.
if (netif_queue_stopped(tp->netdev)) {
    if (skb_queue_len(&tp->tx_queue) < tp->tx_qlen)
        netif_wake_queue(tp->netdev);
    else
        netif_trans_update(tp->netdev);
}
This change continuously updates the trans_start value, even when the
TX queue has been stopped and its length exceeds the threshold.
This may prevent the watchdog timer from ever timing out, thereby
masking potential transmission stall issues.

The timestamp should be updated only upon successful URB submission to
accurately reflect that the transport layer is still operational.

Best regards,
insyelu

RE: [PATCH] net: usb: r8152: fix transmit queue timeout

Posted by Hayes Wang 2 weeks, 5 days ago

lu lu <insyelu@gmail.com>
> Sent: Monday, January 19, 2026 2:58 PM
[...]
> > Therefore, what needs to be done is to update the timestamp when the TX queue is stopped.
> > Updating trans_start while the TX queue is not stopped is useless.
> if (netif_queue_stopped(tp->netdev)) {
>     if (skb_queue_len(&tp->tx_queue) < tp->tx_qlen)
>         netif_wake_queue(tp->netdev);
>     else
>         netif_trans_update(tp->netdev);
> }
> This change continuously updates the trans_start value, even when the
> TX queue has been stopped and its length exceeds the threshold.
> This may prevent the watchdog timer from ever timing out, thereby
> masking potential transmission stall issues.
> 
> The timestamp should be updated only upon successful URB submission to
> accurately reflect that the transport layer is still operational.

Although I think a URB error and a transmission stall are different,
I am fine with the simpler approach in v2.

Best Regards,
Hayes