Fix the transmission stall due to backlog

[Patch mptcp-net 3/3] mptcp: fix stall because of data_ready

Posted by Gang Yan 3 days, 22 hours ago

From: Gang Yan <yangang@kylinos.cn>

This patch fixes the second type of backlog_list stall issue that
occurs when the data_ready callback attempts to trigger data transfer.
The issue reproduces at approximately a 20% rate (once every five runs)
when running multi_chunk.sh tests.

The stall occurs under the following conditions:
1. A large amount of out-of-order (OFO) data causes sk_rmem_alloc to
   exceed sk_rcvbuf
2. The skb matching the current ack_seq is not present in backlog_list
3. Data reception relies on data_ready callback notification

In this scenario, the data_ready callback (via mptcp_data_ready() ->
sk->sk_data_ready()) attempts to trigger data movement, but
__mptcp_move_skbs_from_subflow() repeatedly moves the ack_seq skb into
backlog_list and returns false:

'''
[  144.961282][    C0] MPTCP: msk->ack_seq:3442119990924456661, map_seq:3442119990924456661, offset:0, fin:0
[  144.961293][    C0] MPTCP: [MPTCP_BACKLOG] #0 map_seq=3442119990924655746 end_seq=3442119990924660542 len=4796
[  144.962310][    C0] MPTCP: [MPTCP_BACKLOG] #1 map_seq=3442119990924491850 end_seq=3442119990924500364 len=8514
[  144.962783][    C0] MPTCP: [MPTCP_BACKLOG] #2 map_seq=3442119990924660542 end_seq=3442119990924726001 len=65459
[  144.963260][    C0] MPTCP: [MPTCP_BACKLOG] #3 map_seq=3442119990924508114 end_seq=3442119990924514971 len=6857
[  144.963729][    C0] MPTCP: [MPTCP_BACKLOG] #4 map_seq=3442119990924726001 end_seq=3442119990924731093 len=5092
[  144.964193][    C0] MPTCP: [MPTCP_BACKLOG] #5 map_seq=3442119990924456661 end_seq=3442119990924464310 len=7649
[  144.964651][    C0] MPTCP: [MPTCP_BACKLOG] #6 map_seq=3442119990924456661 end_seq=3442119990924464310 len=7649
[  144.965164][    C0] MPTCP: [MPTCP_BACKLOG] #7 map_seq=3442119990924456661 end_seq=3442119990924464310 len=7649
[  144.965711][    C0] MPTCP: [MPTCP_BACKLOG] #8 map_seq=3442119990924456661 end_seq=3442119990924464310 len=7649
[  144.966260][    C0] MPTCP: [MPTCP_BACKLOG] #9 map_seq=3442119990924456661 end_seq=3442119990924464310 len=7649
[  144.966804][    C0] MPTCP: [MPTCP_BACKLOG] #10 map_seq=3442119990924456661 end_seq=3442119990924464310 len=7649
[  144.967308][    C0] MPTCP: [MPTCP_BACKLOG] #11 map_seq=3442119990924456661 end_seq=3442119990924464310 len=7649
[  144.967793][    C0] MPTCP: [MPTCP_BACKLOG] #12 map_seq=3442119990924456661 end_seq=3442119990924464310 len=7649
...
'''

The fix adds a check for an empty receive queue in addition to the
rcvbuf comparison. When the receive queue is empty, skbs should still
be moved to prevent the stall. With this patch, all mptcp_tls tests
pass successfully.

Co-developed-by: Geliang Tang <tanggeliang@kylinos.cn>
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Signed-off-by: Gang Yan <yangang@kylinos.cn>
---
 net/mptcp/protocol.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index b6bafc37eea4..054aa72c9aa6 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -739,7 +739,8 @@ static bool __mptcp_move_skbs_from_subflow(struct mptcp_sock *msk,
 
 			mptcp_init_skb(ssk, skb, offset, len);
 
-			if (own_msk && sk_rmem_alloc_get(sk) < sk->sk_rcvbuf) {
+			if (own_msk && (sk_rmem_alloc_get(sk) < sk->sk_rcvbuf ||
+					skb_queue_empty(&sk->sk_receive_queue))) {
 				mptcp_subflow_lend_fwdmem(subflow, skb);
 				ret |= __mptcp_move_skb(sk, skb);
 			} else {
-- 
2.43.0
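
Editorial sketch: the condition change in the hunk above can be modelled in user space. This is a toy model under stated assumptions, not kernel code. `struct toy_sk`, `admit_old()`, `admit_patched()` and `toy_move()` are hypothetical stand-ins for the msk state and for the branch inside __mptcp_move_skbs_from_subflow(), and the buffer sizes used in the note below are made up.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for the msk-level socket state (NOT the kernel's struct sock). */
struct toy_sk {
	int rmem_alloc;   /* models sk_rmem_alloc_get(sk) */
	int rcvbuf;       /* models sk->sk_rcvbuf */
	int rcvq_len;     /* number of skbs in the receive queue */
	int backlog_len;  /* number of skbs parked in backlog_list */
};

typedef bool (*admit_fn)(const struct toy_sk *sk, bool own_msk);

/* Pre-patch test: only the rcvbuf comparison. */
static inline bool admit_old(const struct toy_sk *sk, bool own_msk)
{
	return own_msk && sk->rmem_alloc < sk->rcvbuf;
}

/* Test as posted in this patch: also admit when the receive queue is empty. */
static inline bool admit_patched(const struct toy_sk *sk, bool own_msk)
{
	return own_msk && (sk->rmem_alloc < sk->rcvbuf || sk->rcvq_len == 0);
}

/*
 * Models one data_ready notification for an skb of 'len' bytes while the
 * msk is owned. Returns true if the skb reached the receive queue, false
 * if it was parked in the backlog instead.
 */
static inline bool toy_move(struct toy_sk *sk, int len, admit_fn admit)
{
	assert(len > 0);

	if (admit(sk, true)) {
		sk->rmem_alloc += len;
		sk->rcvq_len++;
		return true;
	}

	sk->backlog_len++;
	return false;
}
```

Starting from rmem_alloc = 300000, rcvbuf = 262144 and an empty receive queue, repeated toy_move() calls with admit_old all return false and only grow backlog_len, which mirrors the repeated backlog entries in the log. With admit_patched a single call moves the skb, and the next call is refused again because the receive queue is no longer empty.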

Re: [Patch mptcp-net 3/3] mptcp: fix stall because of data_ready

Posted by Paolo Abeni 3 days, 19 hours ago

On 2/5/26 7:41 AM, Gang Yan wrote:
> From: Gang Yan <yangang@kylinos.cn>
> 
> This patch fixes the second type of backlog_list stall issue that
> occurs when the data_ready callback attempts to trigger data transfer.
> The issue reproduces at approximately a 20% rate (once every five runs)
> when running multi_chunk.sh tests.
> 
> The stall occurs under the following conditions:
> 1. A large amount of out-of-order (OFO) data causes sk_rmem_alloc to
>    exceed sk_rcvbuf

Thinking again about this scenario, the above condition is quite
unexpected to me. Could you please share a pcap capture of this stall?
In theory the sender should not fill the rcv window with OoO skb only.

> 2. The skb matching the current ack_seq is not present in backlog_list
> 3. Data reception relies on data_ready callback notification
> 
> In this scenario, the data_ready callback (via mptcp_data_ready() ->
> sk->sk_data_ready()) attempts to trigger data movement, but
> __mptcp_move_skbs_from_subflow() repeatedly moves the ack_seq skb into
> backlog_list and returns false:
> 
> '''
> [  144.961282][    C0] MPTCP: msk->ack_seq:3442119990924456661, map_seq:3442119990924456661, offset:0, fin:0
> [  144.961293][    C0] MPTCP: [MPTCP_BACKLOG] #0 map_seq=3442119990924655746 end_seq=3442119990924660542 len=4796
> [  144.962310][    C0] MPTCP: [MPTCP_BACKLOG] #1 map_seq=3442119990924491850 end_seq=3442119990924500364 len=8514
> [  144.962783][    C0] MPTCP: [MPTCP_BACKLOG] #2 map_seq=3442119990924660542 end_seq=3442119990924726001 len=65459
> [  144.963260][    C0] MPTCP: [MPTCP_BACKLOG] #3 map_seq=3442119990924508114 end_seq=3442119990924514971 len=6857
> [  144.963729][    C0] MPTCP: [MPTCP_BACKLOG] #4 map_seq=3442119990924726001 end_seq=3442119990924731093 len=5092
> [  144.964193][    C0] MPTCP: [MPTCP_BACKLOG] #5 map_seq=3442119990924456661 end_seq=3442119990924464310 len=7649
> [  144.964651][    C0] MPTCP: [MPTCP_BACKLOG] #6 map_seq=3442119990924456661 end_seq=3442119990924464310 len=7649
> [  144.965164][    C0] MPTCP: [MPTCP_BACKLOG] #7 map_seq=3442119990924456661 end_seq=3442119990924464310 len=7649
> [  144.965711][    C0] MPTCP: [MPTCP_BACKLOG] #8 map_seq=3442119990924456661 end_seq=3442119990924464310 len=7649
> [  144.966260][    C0] MPTCP: [MPTCP_BACKLOG] #9 map_seq=3442119990924456661 end_seq=3442119990924464310 len=7649
> [  144.966804][    C0] MPTCP: [MPTCP_BACKLOG] #10 map_seq=3442119990924456661 end_seq=3442119990924464310 len=7649
> [  144.967308][    C0] MPTCP: [MPTCP_BACKLOG] #11 map_seq=3442119990924456661 end_seq=3442119990924464310 len=7649
> [  144.967793][    C0] MPTCP: [MPTCP_BACKLOG] #12 map_seq=3442119990924456661 end_seq=3442119990924464310 len=7649
> ...
> '''
> 
> The fix adds a check for an empty receive queue in addition to the
> rcvbuf comparison. When the receive queue is empty, skbs should still
> be moved to prevent the stall. With this patch, all mptcp_tls tests
> pass successfully.
> 
> Co-developed-by: Geliang Tang <tanggeliang@kylinos.cn>
> Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
> Signed-off-by: Gang Yan <yangang@kylinos.cn>
> ---
>  net/mptcp/protocol.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
> index b6bafc37eea4..054aa72c9aa6 100644
> --- a/net/mptcp/protocol.c
> +++ b/net/mptcp/protocol.c
> @@ -739,7 +739,8 @@ static bool __mptcp_move_skbs_from_subflow(struct mptcp_sock *msk,
>  
>  			mptcp_init_skb(ssk, skb, offset, len);
>  
> -			if (own_msk && sk_rmem_alloc_get(sk) < sk->sk_rcvbuf) {
> +			if (own_msk && (sk_rmem_alloc_get(sk) < sk->sk_rcvbuf ||
> +					skb_queue_empty(&sk->sk_receive_queue))) {

Similar consideration WRT the previous patch: we need to protect against
possible attacks. I think checking and allowing in-sequence skbs should
be enough:

			if (own_msk && (sk_rmem_alloc_get(sk) < sk->sk_rcvbuf ||
			    MPTCP_SKB_CB(skb)->map_seq == msk->ack_seq)) {

/P
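
Editorial sketch: the difference between the posted check and the in-sequence variant suggested above can be modelled in user space. This is a minimal sketch and not kernel code. `struct toy_state`, `admit_v1()` and `admit_in_seq()` are hypothetical names, and the interpretation of the attack concern (a peer pushing out-of-order data past sk_rcvbuf while the receive queue happens to be empty) is one plausible reading rather than something stated in the thread.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy snapshot of the state consulted by the check (NOT kernel structures). */
struct toy_state {
	bool own_msk;      /* caller owns the msk socket */
	int rmem_alloc;    /* models sk_rmem_alloc_get(sk) */
	int rcvbuf;        /* models sk->sk_rcvbuf */
	bool rcvq_empty;   /* models skb_queue_empty(&sk->sk_receive_queue) */
	uint64_t ack_seq;  /* models msk->ack_seq */
};

/* Check as posted in v1: any skb passes once the receive queue is empty. */
static inline bool admit_v1(const struct toy_state *s, uint64_t map_seq)
{
	(void)map_seq; /* the v1 check does not look at the skb's sequence */
	return s->own_msk && (s->rmem_alloc < s->rcvbuf || s->rcvq_empty);
}

/* Suggested variant: over the limit, only the in-sequence skb passes. */
static inline bool admit_in_seq(const struct toy_state *s, uint64_t map_seq)
{
	return s->own_msk &&
	       (s->rmem_alloc < s->rcvbuf || map_seq == s->ack_seq);
}
```

With ack_seq = 3442119990924456661 from the log, rmem_alloc above rcvbuf and an empty receive queue, both predicates admit the skb whose map_seq equals ack_seq. For backlog entry #0 (map_seq = 3442119990924655746), admit_v1() still returns true while admit_in_seq() returns false, so only the variant keeps out-of-order data subject to the rcvbuf limit.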

Re: [Patch mptcp-net 3/3] mptcp: fix stall because of data_ready

Posted by gang.yan@linux.dev 3 days, 16 hours ago

February 5, 2026 at 6:07 PM, "Paolo Abeni" <pabeni@redhat.com> wrote:
Hi, Paolo:

> 
> On 2/5/26 7:41 AM, Gang Yan wrote:
> 
> > 
> > From: Gang Yan <yangang@kylinos.cn>
> >  
> >  This patch fixes the second type of backlog_list stall issue that
> >  occurs when the data_ready callback attempts to trigger data transfer.
> >  The issue reproduces at approximately a 20% rate (once every five runs)
> >  when running multi_chunk.sh tests.
> >  
> >  The stall occurs under the following conditions:
> >  1. A large amount of out-of-order (OFO) data causes sk_rmem_alloc to
> >  exceed sk_rcvbuf
> > 
> Thinking again about this scenario, the above condition is quite
> unexpected to me. Could you please share a pcap capture of this stall?
> In theory the sender should not fill the rcv window with OoO skb only.

The captured packets (ns1.pcap) are attached to this email.

> 
> > 
> > 2. The skb matching the current ack_seq is not present in backlog_list
> >  3. Data reception relies on data_ready callback notification
> >  
> >  In this scenario, the data_ready callback (via mptcp_data_ready() ->
> >  sk->sk_data_ready()) attempts to trigger data movement, but
> >  __mptcp_move_skbs_from_subflow() repeatedly moves the ack_seq skb into
> >  backlog_list and returns false:
> >  
> >  '''
> >  [ 144.961282][ C0] MPTCP: msk->ack_seq:3442119990924456661, map_seq:3442119990924456661, offset:0, fin:0
> >  [ 144.961293][ C0] MPTCP: [MPTCP_BACKLOG] #0 map_seq=3442119990924655746 end_seq=3442119990924660542 len=4796
> >  [ 144.962310][ C0] MPTCP: [MPTCP_BACKLOG] #1 map_seq=3442119990924491850 end_seq=3442119990924500364 len=8514
> >  [ 144.962783][ C0] MPTCP: [MPTCP_BACKLOG] #2 map_seq=3442119990924660542 end_seq=3442119990924726001 len=65459
> >  [ 144.963260][ C0] MPTCP: [MPTCP_BACKLOG] #3 map_seq=3442119990924508114 end_seq=3442119990924514971 len=6857
> >  [ 144.963729][ C0] MPTCP: [MPTCP_BACKLOG] #4 map_seq=3442119990924726001 end_seq=3442119990924731093 len=5092
> >  [ 144.964193][ C0] MPTCP: [MPTCP_BACKLOG] #5 map_seq=3442119990924456661 end_seq=3442119990924464310 len=7649
> >  [ 144.964651][ C0] MPTCP: [MPTCP_BACKLOG] #6 map_seq=3442119990924456661 end_seq=3442119990924464310 len=7649
> >  [ 144.965164][ C0] MPTCP: [MPTCP_BACKLOG] #7 map_seq=3442119990924456661 end_seq=3442119990924464310 len=7649
> >  [ 144.965711][ C0] MPTCP: [MPTCP_BACKLOG] #8 map_seq=3442119990924456661 end_seq=3442119990924464310 len=7649
> >  [ 144.966260][ C0] MPTCP: [MPTCP_BACKLOG] #9 map_seq=3442119990924456661 end_seq=3442119990924464310 len=7649
> >  [ 144.966804][ C0] MPTCP: [MPTCP_BACKLOG] #10 map_seq=3442119990924456661 end_seq=3442119990924464310 len=7649
> >  [ 144.967308][ C0] MPTCP: [MPTCP_BACKLOG] #11 map_seq=3442119990924456661 end_seq=3442119990924464310 len=7649
> >  [ 144.967793][ C0] MPTCP: [MPTCP_BACKLOG] #12 map_seq=3442119990924456661 end_seq=3442119990924464310 len=7649
> >  ...
> >  '''
> >  
> >  The fix adds a check for an empty receive queue in addition to the
> >  rcvbuf comparison. When the receive queue is empty, skbs should still
> >  be moved to prevent the stall. With this patch, all mptcp_tls tests
> >  pass successfully.
> >  
> >  Co-developed-by: Geliang Tang <tanggeliang@kylinos.cn>
> >  Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
> >  Signed-off-by: Gang Yan <yangang@kylinos.cn>
> >  ---
> >  net/mptcp/protocol.c | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> >  
> >  diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
> >  index b6bafc37eea4..054aa72c9aa6 100644
> >  --- a/net/mptcp/protocol.c
> >  +++ b/net/mptcp/protocol.c
> >  @@ -739,7 +739,8 @@ static bool __mptcp_move_skbs_from_subflow(struct mptcp_sock *msk,
> >  
> >  mptcp_init_skb(ssk, skb, offset, len);
> >  
> >  - if (own_msk && sk_rmem_alloc_get(sk) < sk->sk_rcvbuf) {
> >  + if (own_msk && (sk_rmem_alloc_get(sk) < sk->sk_rcvbuf ||
> >  + skb_queue_empty(&sk->sk_receive_queue))) {
> > 
> Similar consideration WRT the previous patch: we need to protect against
> possible attacks. I think checking and allowing in-sequence skbs should
> be enough:
> 
>  if (own_msk && (sk_rmem_alloc_get(sk) < sk->sk_rcvbuf ||
>  MPTCP_SKB_CB(skb)->map_seq == msk->ack_seq)) {
> 
This patch addresses the stall that is blocking our NVMe-over-MPTCP testing.
Could it be considered for merging into the export branch on its own, as a
standalone fix? If that’s acceptable, I’ll prepare and send a v2 patch that
contains only this fix shortly.

Best regards,
Gang


> /P
>

[Patch mptcp-net 1/3] mptcp: add backlog_list bug reproducer test
[Patch mptcp-net 2/3] mptcp: fix receive stalls when 'ack_seq' in backlog_list
[Patch mptcp-net 3/3] mptcp: fix stall because of data_ready