[v1] veth: Fix TXQ stall race condition and add recovery

[PATCH net V1 2/3] veth: stop and start all TX queue in netdev down/up

Posted by Jesper Dangaard Brouer 5 months, 2 weeks ago

The veth driver started manipulating TXQ states in commit
dc82a33297fc ("veth: apply qdisc backpressure on full ptr_ring
to reduce TX drops").

Other drivers that manipulate TXQ states take care of stopping
and starting TXQs in their NDOs.  Thus, add this to veth's .ndo_open
and .ndo_stop.

Fixes: dc82a33297fc ("veth: apply qdisc backpressure on full ptr_ring to reduce TX drops")
Signed-off-by: Jesper Dangaard Brouer <hawk@kernel.org>
---
 drivers/net/veth.c |    7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/drivers/net/veth.c b/drivers/net/veth.c
index 7b1a9805b270..3976ddda5fb8 100644
--- a/drivers/net/veth.c
+++ b/drivers/net/veth.c
@@ -1404,6 +1404,9 @@ static int veth_open(struct net_device *dev)
 			return err;
 	}
 
+	netif_tx_start_all_queues(dev);
+	netif_tx_start_all_queues(peer);
+
 	if (peer->flags & IFF_UP) {
 		netif_carrier_on(dev);
 		netif_carrier_on(peer);
@@ -1423,6 +1426,10 @@ static int veth_close(struct net_device *dev)
 	if (peer)
 		netif_carrier_off(peer);
 
+	netif_tx_stop_all_queues(dev);
+	if (peer)
+		netif_tx_stop_all_queues(peer);
+
 	if (priv->_xdp_prog)
 		veth_disable_xdp(dev);
 	else if (veth_gro_requested(dev))

Re: [PATCH net V1 2/3] veth: stop and start all TX queue in netdev down/up

Posted by Jakub Kicinski 5 months, 2 weeks ago

On Thu, 23 Oct 2025 16:59:37 +0200 Jesper Dangaard Brouer wrote:
> The veth driver started manipulating TXQ states in commit
> dc82a33297fc ("veth: apply qdisc backpressure on full ptr_ring
> to reduce TX drops").
> 
> Other drivers that manipulate TXQ states take care of stopping
> and starting TXQs in their NDOs.  Thus, add this to veth's .ndo_open
> and .ndo_stop.

Kinda, but taking a device up or down resets the qdisc, IIRC.

So stopping the qdisc for real drivers is mostly a way to make sure
that there's nothing entering the xmit handler as the driver dismantles
its state.

I'm not sure if this is an official rule, but I'm under the impression
that stopping the queues on carrier loss (and
netif_tx_stop_all_queues(peer) in close() is stopping peer's Tx queue
on carrier loss) is inadvisable as it may lead to old packets getting
transmitted when carrier comes back.

IOW based on the commit msg - I'm not sure this patch is needed..

Re: [PATCH net V1 2/3] veth: stop and start all TX queue in netdev down/up

Posted by Jesper Dangaard Brouer 5 months, 1 week ago

On 25/10/2025 02.54, Jakub Kicinski wrote:
> On Thu, 23 Oct 2025 16:59:37 +0200 Jesper Dangaard Brouer wrote:
>> The veth driver started manipulating TXQ states in commit
>> dc82a33297fc ("veth: apply qdisc backpressure on full ptr_ring
>> to reduce TX drops").
>>
>> Other drivers that manipulate TXQ states take care of stopping
>> and starting TXQs in their NDOs.  Thus, add this to veth's .ndo_open
>> and .ndo_stop.
> 
> Kinda, but taking a device up or down resets the qdisc, IIRC.
> 
> So stopping the qdisc for real drivers is mostly a way to make sure
> that there's nothing entering the xmit handler as the driver dismantles
> its state.
> 
> I'm not sure if this is an official rule, but I'm under the impression
> that stopping the queues on carrier loss (and
> netif_tx_stop_all_queues(peer) in close() is stopping peer's Tx queue
> on carrier loss) is inadvisable as it may lead to old packets getting
> transmitted when carrier comes back.
> 
> IOW based on the commit msg - I'm not sure this patch is needed..

During the incident, doing ip link set 'down' flushed all packets in
the qdisc, but the TXQs were not reset (started again) on link 'up'.
Thus, the qdisc would fill up again and block all packets on the
interface.  Chris also tried to replace the qdisc, but the TXQ was still
in the stopped QUEUE_STATE_DRV_XOFF state.
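
To make the stuck state concrete, here is a minimal userspace sketch of it.
It is a toy model, not kernel code: every toy_* name and TOY_QDISC_LIMIT are
hypothetical, and the only behavior it encodes is what is described above
(link 'down' flushes the qdisc backlog but leaves the driver's stop bit set).
The boolean parameter stands in for the netif_tx_start_all_queues() calls
this patch proposed for .ndo_open.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model (assumption, not kernel code): one TX queue with a
 * driver-owned stop bit standing in for QUEUE_STATE_DRV_XOFF, plus a
 * counter for packets waiting in the qdisc. */
#define TOY_QDISC_LIMIT 4u

struct toy_txq {
	bool drv_xoff;        /* set by the driver; only the driver clears it */
	unsigned int backlog; /* packets currently held in the qdisc */
};

/* Enqueue one packet; returns false if the qdisc is full (packet dropped). */
static bool toy_enqueue(struct toy_txq *q)
{
	assert(q != NULL);
	if (q->backlog >= TOY_QDISC_LIMIT)
		return false;
	q->backlog++;
	return true;
}

/* Hand the backlog to the driver; nothing leaves while the queue is stopped. */
static unsigned int toy_run_queue(struct toy_txq *q)
{
	unsigned int sent;

	assert(q != NULL);
	if (q->drv_xoff)
		return 0;
	sent = q->backlog;
	q->backlog = 0;
	return sent;
}

/* Link 'down' as described in the thread: backlog flushed, stop bit kept. */
static void toy_link_down(struct toy_txq *q)
{
	assert(q != NULL);
	q->backlog = 0;
}

/* Link 'up'; start_queues models the restart proposed for .ndo_open. */
static void toy_link_up(struct toy_txq *q, bool start_queues)
{
	assert(q != NULL);
	if (start_queues)
		q->drv_xoff = false;
}

/* Begin in the stalled state, cycle the link, offer six packets, and
 * return how many reach the driver. */
static unsigned int toy_sent_after_link_cycle(bool start_queues_on_up)
{
	struct toy_txq q = { .drv_xoff = true, .backlog = TOY_QDISC_LIMIT };
	int i;

	toy_link_down(&q);
	toy_link_up(&q, start_queues_on_up);
	for (i = 0; i < 6; i++)
		(void)toy_enqueue(&q);
	return toy_run_queue(&q);
}
```

In this model toy_sent_after_link_cycle(false) sends nothing, because the
backlog simply refills behind the stop bit, while
toy_sent_after_link_cycle(true) drains the queue again.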

This was the origin of the patch: we could not recover the machine
from this state.  Thus, the idea was that starting all queues on link
'up' would give us a recovery mechanism.  With dev_watchdog enabled
(patch 1/3), this change isn't really needed.
As you mention, this may lead to old packets getting transmitted when
carrier comes back, which would be a change in behavior that we don't
want in a fixes patch.  So, I will drop this patch.

--Jesper

[PATCH net V1 1/3] veth: enable dev_watchdog for detecting stalled TXQs
[PATCH net V1 2/3] veth: stop and start all TX queue in netdev down/up
[PATCH net V1 3/3] veth: more robust handing of race to avoid txq getting stuck