[PATCH v2] net: mctp: Fix tx queue stall

Jinliang Wang posted 1 patch 3 months, 2 weeks ago
There is a newer version of this series
drivers/net/mctp/mctp-usb.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)
[PATCH v2] net: mctp: Fix tx queue stall
Posted by Jinliang Wang 3 months, 2 weeks ago
The tx queue can become permanently stuck in a stopped state due to a
race condition between the URB submission path and its completion
callback.

The URB completion callback can run immediately after usb_submit_urb()
returns, before the submitting function calls netif_stop_queue(). If
this occurs, the queue state management becomes desynchronized, leading
to a stall where the queue is never woken.

Fix this by moving the netif_stop_queue() call to before submitting the
URB. This closes the race window by ensuring the network stack is aware
the queue is stopped before the URB completion can possibly run.

Signed-off-by: Jinliang Wang <jinliangw@google.com>
---
 drivers/net/mctp/mctp-usb.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/net/mctp/mctp-usb.c b/drivers/net/mctp/mctp-usb.c
index 36ccc53b1797..ef860cfc629f 100644
--- a/drivers/net/mctp/mctp-usb.c
+++ b/drivers/net/mctp/mctp-usb.c
@@ -96,11 +96,13 @@ static netdev_tx_t mctp_usb_start_xmit(struct sk_buff *skb,
 			  skb->data, skb->len,
 			  mctp_usb_out_complete, skb);
 
+	/* Stops TX queue first to prevent race condition with URB complete */
+	netif_stop_queue(dev);
 	rc = usb_submit_urb(urb, GFP_ATOMIC);
-	if (rc)
+	if (rc) {
+		netif_wake_queue(dev);
 		goto err_drop;
-	else
-		netif_stop_queue(dev);
+	}
 
 	return NETDEV_TX_OK;
 
-- 
2.51.1.821.gb6fe4d2222-goog
Re: [PATCH v2] net: mctp: Fix tx queue stall
Posted by Jeremy Kerr 3 months, 1 week ago
Hi Jinliang,

Thanks for the fix! A couple of comments on the metadata:

> Subject: [PATCH v2] net: mctp: Fix tx queue stall

You'll want to indicate that this is for net rather than net-next in
the subject prefix, so:

> Subject: [PATCH net v2] net: mctp: Fix tx queue stall

Then, since we're targeting net, we'll want a fixes tag too. I would
suggest:

Fixes: 0791c0327a6e ("net: mctp: Add MCTP USB transport driver")

- so we get appropriate backports. No need to reply to the original
message for subsequent versions either, a new thread is best.

Also, it's helpful to indicate changes between submitted versions,
under the '---' marker, which doesn't end up in the git commit.
Something like:

---
v3:
 - target net tree, add fixes tag

v2:
 - remove duplicate comment in commit message

---
 drivers/net/mctp/mctp-usb.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

> The tx queue can become permanently stuck in a stopped state due to a
> race condition between the URB submission path and its completion
> callback.
> 
> The URB completion callback can run immediately after usb_submit_urb()
> returns, before the submitting function calls netif_stop_queue(). If
> this occurs, the queue state management becomes desynchronized, leading
> to a stall where the queue is never woken.
> 
> Fix this by moving the netif_stop_queue() call to before submitting the
> URB. This closes the race window by ensuring the network stack is aware
> the queue is stopped before the URB completion can possibly run.

LGTM. With the changes above:

Acked-by: Jeremy Kerr <jk@codeconstruct.com.au>

Cheers,


Jeremy