[PATCH v3 mptcp-next 00/10] mptcp: address stall under memory pressure

Paolo Abeni posted 10 patches 1 week, 2 days ago
git fetch https://github.com/multipath-tcp/mptcp_net-next tags/patchew/cover.1777908248.git.pabeni@redhat.com
There is a newer version of this series
[PATCH v3 mptcp-next 00/10] mptcp: address stall under memory pressure
Posted by Paolo Abeni 1 week, 2 days ago
This is an attempt to fix the data transfer stall reported by Geliang and
Gang by more carefully enforcing memory constraints at the MPTCP level.

Patch 1/10 moves the bound check before entering the TCP socket.
Patches 2, 3, 4 and 5 are cleanups/refactors aimed at safely re-using
TCP helpers on MPTCP skbs.
Patch 6 makes TCP pruning related helpers available to MPTCP and patch 7
makes use of them. Patch 8 addresses an edge scenario that could still
lead to a transfer stall under memory pressure.
Finally, patches 9 and 10 improve the MPTCP-level retransmission scheme to
make recovery from memory pressure significantly faster.

Note that the diffstat is biased by the quite large patch 4/10, which
contains a mechanical transformation of existing code; the "real" changes
are noticeably smaller.

Tested successfully vs the test cases proposed by Geliang and Gang and
vs the selftests.
---
Some notes on each patch WRT ignored or false positive issues noticed
by sashiko so far.

Paolo Abeni (10):
  mptcp: move checks vs rcvbuf size earlier in the RX path
  mptcp: drop the mptcp_ooo_try_coalesce() helper
  mptcp: drop the cant_coalesce CB field
  mptcp: remove CB offset field
  mptcp: sync mptcp skb cb layout with tcp one
  tcp: expose the tcp_collapse_ofo_queue() helper to mptcp usage, too
  mptcp: implemented OoO queue pruning
  mptcp: track prune recovery status
  mptcp: move the retrans loop to a separate helper
  mptcp: let the retrans scheduler do its job.

 include/net/tcp.h    |   8 +
 net/ipv4/tcp_input.c |  55 +++---
 net/mptcp/fastopen.c |  17 +-
 net/mptcp/mib.c      |   3 +
 net/mptcp/mib.h      |   3 +
 net/mptcp/options.c  |  64 ++++++-
 net/mptcp/protocol.c | 399 ++++++++++++++++++++++++++++---------------
 net/mptcp/protocol.h |  24 ++-
 net/mptcp/subflow.c  |  11 ++
 9 files changed, 414 insertions(+), 170 deletions(-)

-- 
2.54.0
Re: [PATCH v3 mptcp-next 00/10] mptcp: address stall under memory pressure
Posted by Matthieu Baerts 5 days, 8 hours ago
Hi Geliang, Gang,

On 04/05/2026 17:39, Paolo Abeni wrote:
> This is an attempt to fix the data transfer stall reported by Geliang and
> Gang by more carefully enforcing memory constraints at the MPTCP level.
>
> Patch 1/10 moves the bound check before entering the TCP socket.
> Patches 2, 3, 4 and 5 are cleanups/refactors aimed at safely re-using
> TCP helpers on MPTCP skbs.
> Patch 6 makes TCP pruning related helpers available to MPTCP and patch 7
> makes use of them. Patch 8 addresses an edge scenario that could still
> lead to a transfer stall under memory pressure.
> Finally, patches 9 and 10 improve the MPTCP-level retransmission scheme to
> make recovery from memory pressure significantly faster.
>
> Note that the diffstat is biased by the quite large patch 4/10, which
> contains a mechanical transformation of existing code; the "real" changes
> are noticeably smaller.
>
> Tested successfully vs the test cases proposed by Geliang and Gang and
> vs the selftests.

At the last meeting on Wednesday, Geliang mentioned he validated this
series. Just to be sure, was it the v2 -- from last week -- or the v3 --
from this week, while you were in EU -- that you validated? Because
Paolo couldn't reproduce the issue you mentioned on the v3 on his side.

Cheers,
Matt
-- 
Sponsored by the NGI0 Core fund.
Re: [PATCH v3 mptcp-next 00/10] mptcp: address stall under memory pressure
Posted by gang.yan@linux.dev 4 days, 12 hours ago
May 8, 2026 at 6:49 PM, "Matthieu Baerts" <matttbe@kernel.org> wrote:


> At the last meeting on Wednesday, Geliang mentioned he validated this
> series. Just to be sure, was it the v2 -- from last week -- or the v3 --
> from this week, while you were in EU -- that you validated? Because
> Paolo couldn't reproduce the issue you mentioned on the v3 on his side.
> 
Hi, Matt

The issue can also be reproduced on v3. I reproduced it using Docker's
auto-debug mode with the mptcp_data.sh selftest:

```
	Not running all tests but:

-------- 8< --------
run_loop run_selftest_one mptcp_data.sh
-------- 8< --------




	=== Attempt: 1 (Sat, 09 May 2026 06:55:44 +0000) ===


Selftest Test: ./mptcp_data.sh
TAP version 13
1..1
# add_addr_accepted 4 subflows 4
# id 1 flags signal 127.0.0.1 10001
# id 2 flags signal 127.0.0.1 10002
# id 3 flags signal 127.0.0.1 10003
# id 4 flags signal 127.0.0.1 10004
# TAP version 13
# 1..48
# # Starting 48 tests from 2 test cases.
# #  RUN           global.mptcp_v6 ...
# #            OK  global.mptcp_v6
# ok 1 global.mptcp_v6
# #  RUN           mptcp.shutdown_reuse ...
# #            OK  mptcp.shutdown_reuse
# ok 2 mptcp.shutdown_reuse
...
# ok 48 mptcp.sendfile
# # FAILED: 41 / 48 tests passed.
# # Totals: pass:41 fail:7 xfail:0 xpass:0 skip:0 error:0
not ok 1 test: selftest_mptcp_data # FAIL
# time=33
```

As Geliang mentioned in the weekly meeting, we will continue
to debug and locate the problem based on Paolo's v3 patches.

We will post any updates to the mailing list once we make
some progress.

Thanks
Gang
Re: [PATCH v3 mptcp-next 00/10] mptcp: address stall under memory pressure
Posted by Paolo Abeni 2 days, 10 hours ago
On 5/9/26 9:07 AM, gang.yan@linux.dev wrote:
> As Geliang mentioned in the weekly meeting, we will continue
> to debug and locate the problem based on Paolo's v3 patches.

Thanks for testing. I went over a couple more revisions. I ran more than
100 iterations successfully on a local build on top of v5 (no failures;
I stopped due to time constraints):

https://lore.kernel.org/mptcp/f00bdac0-b544-87b8-2ef4-ca4de0f045de@gmail.com/T/#t

Please give that later version a spin.

Thanks,

Paolo

Re: [PATCH v3 mptcp-next 00/10] mptcp: address stall under memory pressure
Posted by Geliang Tang 2 days, 8 hours ago
Hi Paolo,

On Mon, 2026-05-11 at 10:29 +0200, Paolo Abeni wrote:
> Thanks for testing. I went over a couple more revisions. I ran more
> than 100 iterations successfully on a local build on top of v5 (no
> failures; I stopped due to time constraints):
>
> https://lore.kernel.org/mptcp/f00bdac0-b544-87b8-2ef4-ca4de0f045de@gmail.com/T/#t
>
> Please give that later version a spin.

On v5, mptcp_data.sh works fine in normal mode with the virtme docker
image, but in debug mode it's still unstable - failing after several
loop iterations.

Thanks,
-Geliang

Re: [PATCH v3 mptcp-next 00/10] mptcp: address stall under memory pressure
Posted by Paolo Abeni 2 days, 2 hours ago
On 5/11/26 1:11 PM, Geliang Tang wrote:
> On v5, mptcp_data.sh works fine in normal mode with the virtme docker
> image, but in debug mode it's still unstable - failing after several
> loop iterations.

How many iterations? Can you please provide a sample of such a failure?
Does the failure always happen in the same way/on the same test case?
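
To see whether the failures cluster on one test case, saved logs can be
summarised with a small filter. This is only a sketch: `failed_subtests` is a
hypothetical name, and it assumes failing sub-tests show up as
`# not ok N name` lines, mirroring the `# ok N name` lines in the log quoted
earlier in the thread.

```shell
# Hypothetical helper: read TAP logs on stdin and print each failed
# sub-test name with the number of times it failed, most frequent first.
# Only nested lines (prefixed with one or more "# ") are considered, so
# the top-level "not ok 1 test: ..." summary line is ignored.
failed_subtests() {
	awk '/^(# )+not ok [0-9]+ / {
		sub(/^(# )+/, "")	# drop the nesting prefix
		print $4		# fields: not, ok, <number>, <name>
	}' |
		sort | uniq -c | sort -rn |
		awk '{ print $2, $1 }'
}
```

Something like `cat *.log | failed_subtests` would then list the sub-tests
that failed across all the collected runs.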

Thanks!

Paolo

Re: [PATCH v3 mptcp-next 00/10] mptcp: address stall under memory pressure
Posted by Paolo Abeni 9 hours ago
On 5/11/26 6:35 PM, Paolo Abeni wrote:
> How many iterations? Can you please provide a sample of such a failure?
> Does the failure always happen in the same way/on the same test case?

I could reproduce some failures after several iterations with a debug build.
Interestingly, they pointed to:
- a pre-existing issue (missing wake-up) that deserves a separate patch [1].
- an intrinsic problem with queue collapsing: it's (very) slow and can
slow down the receiver a lot. I had to raise the timeout (replacing
TEST_F with TEST_F_TIMEOUT) above 300 seconds to complete 600 mptcp_data
runs without errors.
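
A loop like the following is one way to drive that many runs and learn at
which iteration the first failure shows up. It is a minimal POSIX shell
sketch, not the `run_loop` helper from the virtme docker image:
`run_iterations` and the `LOG_DIR` variable are hypothetical names introduced
here.

```shell
# Hypothetical helper (not part of the selftests or of the docker image):
# run a command up to MAX times, keep one log per iteration and stop at
# the first failure.
# Usage: run_iterations MAX COMMAND [ARGS...]
run_iterations() {
	ri_max="$1"
	shift
	ri_dir="${LOG_DIR:-.}"	# where the per-iteration logs are written
	ri_i=1
	while [ "$ri_i" -le "$ri_max" ]; do
		if ! "$@" > "$ri_dir/iteration-$ri_i.log" 2>&1; then
			echo "failed at iteration $ri_i (log: $ri_dir/iteration-$ri_i.log)"
			return 1
		fi
		ri_i=$((ri_i + 1))
	done
	echo "completed $ri_max iterations without errors"
}
```

For instance, `run_iterations 600 ./mptcp_data.sh` would mirror the 600 runs
mentioned above, assuming it is started from the selftest directory inside
the test VM.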

A working alternative to the latter change is to avoid the
xtcp_*collapse() stuff entirely. The trade-off is that this option will make
./mptcp_join.sh -R slower (as more MPTCP-level retransmissions will be
needed).

I'll try to share the fix [1] soonish.

/P

Re: [PATCH v3 mptcp-next 00/10] mptcp: address stall under memory pressure
Posted by MPTCP CI 1 week, 2 days ago
Hi Paolo,

Thank you for your modifications, that's great!

Our CI did some validations and here is its report:

- KVM Validation: normal (except selftest_mptcp_join): Success! ✅
- KVM Validation: normal (only selftest_mptcp_join): Success! ✅
- KVM Validation: debug (except selftest_mptcp_join): Unstable: 1 failed test(s): packetdrill_sockopts ⚠️ 
- KVM Validation: debug (only selftest_mptcp_join): Success! ✅
- KVM Validation: btf-normal (only bpftest_all): Success! ✅
- KVM Validation: btf-debug (only bpftest_all): Success! ✅
- Task: https://github.com/multipath-tcp/mptcp_net-next/actions/runs/25329194631

Initiator: Patchew Applier
Commits: https://github.com/multipath-tcp/mptcp_net-next/commits/4ee2213ed212
Patchwork: https://patchwork.kernel.org/project/mptcp/list/?series=1089374


If there are some issues, you can reproduce them using the same environment as
the one used by the CI thanks to a docker image, e.g.:

    $ cd [kernel source code]
    $ docker run -v "${PWD}:${PWD}:rw" -w "${PWD}" --privileged --rm -it \
        --pull always mptcp/mptcp-upstream-virtme-docker:latest \
        auto-normal

For more details:

    https://github.com/multipath-tcp/mptcp-upstream-virtme-docker


Please note that despite all the efforts already made to have a stable test
suite when executed on a public CI like this one, it is possible that some
reported issues are not due to your modifications. Still, do not hesitate to
help us improve that ;-)

Cheers,
MPTCP GH Action bot
Bot operated by Matthieu Baerts (NGI0 Core)