[PATCH mptcp-net v3 0/5] selftests: mptcp: avoid spurious errors on TCP disconnect

Matthieu Baerts (NGI0) posted 5 patches 1 week ago
Patches applied successfully (tree, apply log)
git fetch https://github.com/multipath-tcp/mptcp_net-next tags/patchew/20250910-sft-mptcp-disc-err-v3-0-6799fa467d77@kernel.org
net/mptcp/protocol.c                               | 16 ++++++++++++++++
tools/testing/selftests/net/mptcp/mptcp_connect.c  | 11 ++++++-----
tools/testing/selftests/net/mptcp/mptcp_connect.sh |  6 +++++-
tools/testing/selftests/net/mptcp/mptcp_lib.sh     |  2 +-
4 files changed, 28 insertions(+), 7 deletions(-)
[PATCH mptcp-net v3 0/5] selftests: mptcp: avoid spurious errors on TCP disconnect
Posted by Matthieu Baerts (NGI0) 1 week ago
This should fix the recent instabilities seen by MPTCP and NIPA CIs
where the 'mptcp_connect.sh' tests fail regularly when running the
'disconnect' subtests with "plain" TCP sockets, e.g.

  # INFO: disconnect
  # 63 ns1 MPTCP -> ns1 (10.0.1.1:20001      ) MPTCP     (duration   996ms) [ OK ]
  # 64 ns1 MPTCP -> ns1 (10.0.1.1:20002      ) TCP       (duration   851ms) [ OK ]
  # 65 ns1 TCP   -> ns1 (10.0.1.1:20003      ) MPTCP     Unexpected revents: POLLERR/POLLNVAL(19)
  # (duration   896ms) [FAIL] file received by server does not match (in, out):
  # -rw-r--r-- 1 root root 11112852 Aug 19 09:16 /tmp/tmp.hlJe5DoMoq.disconnect
  # Trailing bytes are:
  # /{ga 6@=#.8:-rw------- 1 root root 10085368 Aug 19 09:16 /tmp/tmp.blClunilxx
  # Trailing bytes are:
  # /{ga 6@=#.8:66 ns1 MPTCP -> ns1 (dead:beef:1::1:20004) MPTCP     (duration   987ms) [ OK ]
  # 67 ns1 MPTCP -> ns1 (dead:beef:1::1:20005) TCP       (duration   911ms) [ OK ]
  # 68 ns1 TCP   -> ns1 (dead:beef:1::1:20006) MPTCP     (duration   980ms) [ OK ]
  # [FAIL] Tests of the full disconnection have failed

Patch 3 fixes this issue, but a fix in MPTCP in patch 1 is also needed
to fix MPTCP behaviour, and simplify patch 3. Patches 2 and 4 improve
some errors reported by the selftests, and patch 5 helps with the
debugging. I guess all these patches can be sent to 'net'

Note: Patch 1 will cause errors in Packetdrill, that's normal, see [1].

Link: https://github.com/multipath-tcp/packetdrill/pull/171 [1]
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
---
Changes in v3:
- Patch 2: close some FDs in case of errors (Geliang)
- Patch 4: mention there are now more bytes (Geliang)
- Link to v2: https://lore.kernel.org/r/20250905-sft-mptcp-disc-err-v2-0-dfb3b6b4a877@kernel.org

Changes in v2:
- Patch 1: new
- Patch 3: no more state filtering needed thanks to patch 1
- Patch 4: switch from hexdump to od (Mat)
- Patch 5: new
- Link to v1: https://lore.kernel.org/r/20250819-sft-mptcp-disc-err-v1-0-9d0cf296bc13@kernel.org

---
Matthieu Baerts (NGI0) (5):
      mptcp: propagate shutdown to subflows when possible
      selftests: mptcp: connect: catch IO errors on listen side
      selftests: mptcp: avoid spurious errors on TCP disconnect
      selftests: mptcp: print trailing bytes with od
      selftests: mptcp: connect: print pcap suffix

 net/mptcp/protocol.c                               | 16 ++++++++++++++++
 tools/testing/selftests/net/mptcp/mptcp_connect.c  | 11 ++++++-----
 tools/testing/selftests/net/mptcp/mptcp_connect.sh |  6 +++++-
 tools/testing/selftests/net/mptcp/mptcp_lib.sh     |  2 +-
 4 files changed, 28 insertions(+), 7 deletions(-)
---
base-commit: 1e51af26b216c689f6d75872290bb0de1eeaae34
change-id: 20250806-sft-mptcp-disc-err-3357b769bcdb

Best regards,
-- 
Matthieu Baerts (NGI0) <matttbe@kernel.org>
Re: [PATCH mptcp-net v3 0/5] selftests: mptcp: avoid spurious errors on TCP disconnect
Posted by MPTCP CI 1 week ago
Hi Matthieu,

Thank you for your modifications, that's great!

Our CI did some validations and here is its report:

- KVM Validation: normal: Unstable: 3 failed test(s): packetdrill_dss packetdrill_fastopen packetdrill_syscalls 🔴
- KVM Validation: debug: Unstable: 3 failed test(s): packetdrill_dss packetdrill_fastopen packetdrill_syscalls 🔴
- KVM Validation: btf-normal (only bpftest_all): Success! ✅
- KVM Validation: btf-debug (only bpftest_all): Success! ✅
- Task: https://github.com/multipath-tcp/mptcp_net-next/actions/runs/17609032414

Initiator: Patchew Applier
Commits: https://github.com/multipath-tcp/mptcp_net-next/commits/4c0b302d6616
Patchwork: https://patchwork.kernel.org/project/mptcp/list/?series=1000860


If there are some issues, you can reproduce them using the same environment as
the one used by the CI thanks to a docker image, e.g.:

    $ cd [kernel source code]
    $ docker run -v "${PWD}:${PWD}:rw" -w "${PWD}" --privileged --rm -it \
        --pull always mptcp/mptcp-upstream-virtme-docker:latest \
        auto-normal

For more details:

    https://github.com/multipath-tcp/mptcp-upstream-virtme-docker


Please note that despite all the efforts that have been already done to have a
stable tests suite when executed on a public CI like here, it is possible some
reported issues are not due to your modifications. Still, do not hesitate to
help us improve that ;-)

Cheers,
MPTCP GH Action bot
Bot operated by Matthieu Baerts (NGI0 Core)
Re: [PATCH mptcp-net v3 0/5] selftests: mptcp: avoid spurious errors on TCP disconnect
Posted by Geliang Tang 1 week ago
Hi Matt,

Thanks for this v3.

On Wed, 2025-09-10 at 10:55 +0200, Matthieu Baerts (NGI0) wrote:
> This should fix the recent instabilities seen by MPTCP and NIPA CIs
> where the 'mptcp_connect.sh' tests fail regularly when running the
> 'disconnect' subtests with "plain" TCP sockets, e.g.
> 
>   # INFO: disconnect
>   # 63 ns1 MPTCP -> ns1 (10.0.1.1:20001      ) MPTCP     (duration  
> 996ms) [ OK ]
>   # 64 ns1 MPTCP -> ns1 (10.0.1.1:20002      ) TCP       (duration  
> 851ms) [ OK ]
>   # 65 ns1 TCP   -> ns1 (10.0.1.1:20003      ) MPTCP     Unexpected
> revents: POLLERR/POLLNVAL(19)
>   # (duration   896ms) [FAIL] file received by server does not match
> (in, out):
>   # -rw-r--r-- 1 root root 11112852 Aug 19 09:16
> /tmp/tmp.hlJe5DoMoq.disconnect
>   # Trailing bytes are:
>   # /{ga 6@=#.8:-rw------- 1 root root 10085368 Aug 19 09:16
> /tmp/tmp.blClunilxx
>   # Trailing bytes are:
>   # /{ga 6@=#.8:66 ns1 MPTCP -> ns1 (dead:beef:1::1:20004) MPTCP    
> (duration   987ms) [ OK ]
>   # 67 ns1 MPTCP -> ns1 (dead:beef:1::1:20005) TCP       (duration  
> 911ms) [ OK ]
>   # 68 ns1 TCP   -> ns1 (dead:beef:1::1:20006) MPTCP     (duration  
> 980ms) [ OK ]
>   # [FAIL] Tests of the full disconnection have failed
> 
> Patch 3 fixes this issue, but a fix in MPTCP in patch 1 is also
> needed
> to fix MPTCP behaviour, and simplify patch 3. Patches 2 and 4 improve
> some errors reported by the selftests, and patch 5 helps with the
> debugging. I guess all these patches can be sent to 'net'
> 
> Note: Patch 1 will cause errors in Packetdrill, that's normal, see
> [1].
> 
> Link: https://github.com/multipath-tcp/packetdrill/pull/171 [1]
> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>

Looks good.

Reviewed-by: Geliang Tang <geliang@kernel.org>

-Geliang

> ---
> Changes in v3:
> - Patch 2: close some FDs in case of errors (Geliang)
> - Patch 4: mention there are now more bytes (Geliang)
> - Link to v2:
> https://lore.kernel.org/r/20250905-sft-mptcp-disc-err-v2-0-dfb3b6b4a877@kernel.org
> 
> Changes in v2:
> - Patch 1: new
> - Patch 3: no more state filtering needed thanks to patch 1
> - Patch 4: switch from hexdump to od (Mat)
> - Patch 5: new
> - Link to v1:
> https://lore.kernel.org/r/20250819-sft-mptcp-disc-err-v1-0-9d0cf296bc13@kernel.org
> 
> ---
> Matthieu Baerts (NGI0) (5):
>       mptcp: propagate shutdown to subflows when possible
>       selftests: mptcp: connect: catch IO errors on listen side
>       selftests: mptcp: avoid spurious errors on TCP disconnect
>       selftests: mptcp: print trailing bytes with od
>       selftests: mptcp: connect: print pcap suffix
> 
>  net/mptcp/protocol.c                               | 16
> ++++++++++++++++
>  tools/testing/selftests/net/mptcp/mptcp_connect.c  | 11 ++++++-----
>  tools/testing/selftests/net/mptcp/mptcp_connect.sh |  6 +++++-
>  tools/testing/selftests/net/mptcp/mptcp_lib.sh     |  2 +-
>  4 files changed, 28 insertions(+), 7 deletions(-)
> ---
> base-commit: 1e51af26b216c689f6d75872290bb0de1eeaae34
> change-id: 20250806-sft-mptcp-disc-err-3357b769bcdb
> 
> Best regards,
Re: [PATCH mptcp-net v3 0/5] selftests: mptcp: avoid spurious errors on TCP disconnect
Posted by Mat Martineau 6 days, 19 hours ago
On Wed, 10 Sep 2025, Matthieu Baerts (NGI0) wrote:

> This should fix the recent instabilities seen by MPTCP and NIPA CIs
> where the 'mptcp_connect.sh' tests fail regularly when running the
> 'disconnect' subtests with "plain" TCP sockets, e.g.
>
>  # INFO: disconnect
>  # 63 ns1 MPTCP -> ns1 (10.0.1.1:20001      ) MPTCP     (duration   996ms) [ OK ]
>  # 64 ns1 MPTCP -> ns1 (10.0.1.1:20002      ) TCP       (duration   851ms) [ OK ]
>  # 65 ns1 TCP   -> ns1 (10.0.1.1:20003      ) MPTCP     Unexpected revents: POLLERR/POLLNVAL(19)
>  # (duration   896ms) [FAIL] file received by server does not match (in, out):
>  # -rw-r--r-- 1 root root 11112852 Aug 19 09:16 /tmp/tmp.hlJe5DoMoq.disconnect
>  # Trailing bytes are:
>  # /{ga 6@=#.8:-rw------- 1 root root 10085368 Aug 19 09:16 /tmp/tmp.blClunilxx
>  # Trailing bytes are:
>  # /{ga 6@=#.8:66 ns1 MPTCP -> ns1 (dead:beef:1::1:20004) MPTCP     (duration   987ms) [ OK ]
>  # 67 ns1 MPTCP -> ns1 (dead:beef:1::1:20005) TCP       (duration   911ms) [ OK ]
>  # 68 ns1 TCP   -> ns1 (dead:beef:1::1:20006) MPTCP     (duration   980ms) [ OK ]
>  # [FAIL] Tests of the full disconnection have failed
>
> Patch 3 fixes this issue, but a fix in MPTCP in patch 1 is also needed
> to fix MPTCP behaviour, and simplify patch 3. Patches 2 and 4 improve
> some errors reported by the selftests, and patch 5 helps with the
> debugging. I guess all these patches can be sent to 'net'
>
> Note: Patch 1 will cause errors in Packetdrill, that's normal, see [1].
>
> Link: https://github.com/multipath-tcp/packetdrill/pull/171 [1]
> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
> ---
> Changes in v3:
> - Patch 2: close some FDs in case of errors (Geliang)
> - Patch 4: mention there are now more bytes (Geliang)
> - Link to v2: https://lore.kernel.org/r/20250905-sft-mptcp-disc-err-v2-0-dfb3b6b4a877@kernel.org

Hi Matthieu -

Patch 5 I acked separately, one minor fix in the commit message but no 
need for v4.

Patches 1-4 LGTM. Even though it's possible (in patch 1) that a shutdown 
may be attempted on a subflow that doesn't need it, tcp_shutdown() has the 
logic to handle that.

Reviewed-by: Mat Martineau <martineau@kernel.org>


>
> Changes in v2:
> - Patch 1: new
> - Patch 3: no more state filtering needed thanks to patch 1
> - Patch 4: switch from hexdump to od (Mat)
> - Patch 5: new
> - Link to v1: https://lore.kernel.org/r/20250819-sft-mptcp-disc-err-v1-0-9d0cf296bc13@kernel.org
>
> ---
> Matthieu Baerts (NGI0) (5):
>      mptcp: propagate shutdown to subflows when possible
>      selftests: mptcp: connect: catch IO errors on listen side
>      selftests: mptcp: avoid spurious errors on TCP disconnect
>      selftests: mptcp: print trailing bytes with od
>      selftests: mptcp: connect: print pcap suffix
>
> net/mptcp/protocol.c                               | 16 ++++++++++++++++
> tools/testing/selftests/net/mptcp/mptcp_connect.c  | 11 ++++++-----
> tools/testing/selftests/net/mptcp/mptcp_connect.sh |  6 +++++-
> tools/testing/selftests/net/mptcp/mptcp_lib.sh     |  2 +-
> 4 files changed, 28 insertions(+), 7 deletions(-)
> ---
> base-commit: 1e51af26b216c689f6d75872290bb0de1eeaae34
> change-id: 20250806-sft-mptcp-disc-err-3357b769bcdb
>
> Best regards,
> -- 
> Matthieu Baerts (NGI0) <matttbe@kernel.org>
>
>
>
Re: [PATCH mptcp-net v3 0/5] selftests: mptcp: avoid spurious errors on TCP disconnect
Posted by Matthieu Baerts 6 days, 6 hours ago
Hi Geliang, Mat,

On 10/09/2025 21:03, Mat Martineau wrote:
> 
> On Wed, 10 Sep 2025, Matthieu Baerts (NGI0) wrote:
> 
>> This should fix the recent instabilities seen by MPTCP and NIPA CIs
>> where the 'mptcp_connect.sh' tests fail regularly when running the
>> 'disconnect' subtests with "plain" TCP sockets, e.g.
>>
>>  # INFO: disconnect
>>  # 63 ns1 MPTCP -> ns1 (10.0.1.1:20001      ) MPTCP     (duration  
>> 996ms) [ OK ]
>>  # 64 ns1 MPTCP -> ns1 (10.0.1.1:20002      ) TCP       (duration  
>> 851ms) [ OK ]
>>  # 65 ns1 TCP   -> ns1 (10.0.1.1:20003      ) MPTCP     Unexpected
>> revents: POLLERR/POLLNVAL(19)
>>  # (duration   896ms) [FAIL] file received by server does not match
>> (in, out):
>>  # -rw-r--r-- 1 root root 11112852 Aug 19 09:16 /tmp/
>> tmp.hlJe5DoMoq.disconnect
>>  # Trailing bytes are:
>>  # /{ga 6@=#.8:-rw------- 1 root root 10085368 Aug 19 09:16 /tmp/
>> tmp.blClunilxx
>>  # Trailing bytes are:
>>  # /{ga 6@=#.8:66 ns1 MPTCP -> ns1 (dead:beef:1::1:20004) MPTCP    
>> (duration   987ms) [ OK ]
>>  # 67 ns1 MPTCP -> ns1 (dead:beef:1::1:20005) TCP       (duration  
>> 911ms) [ OK ]
>>  # 68 ns1 TCP   -> ns1 (dead:beef:1::1:20006) MPTCP     (duration  
>> 980ms) [ OK ]
>>  # [FAIL] Tests of the full disconnection have failed
>>
>> Patch 3 fixes this issue, but a fix in MPTCP in patch 1 is also needed
>> to fix MPTCP behaviour, and simplify patch 3. Patches 2 and 4 improve
>> some errors reported by the selftests, and patch 5 helps with the
>> debugging. I guess all these patches can be sent to 'net'
>>
>> Note: Patch 1 will cause errors in Packetdrill, that's normal, see [1].
>>
>> Link: https://github.com/multipath-tcp/packetdrill/pull/171 [1]
>> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
>> ---
>> Changes in v3:
>> - Patch 2: close some FDs in case of errors (Geliang)
>> - Patch 4: mention there are now more bytes (Geliang)
>> - Link to v2: https://lore.kernel.org/r/20250905-sft-mptcp-disc-err-
>> v2-0-dfb3b6b4a877@kernel.org
> 
> Hi Matthieu -
> 
> Patch 5 I acked separately, one minor fix in the commit message but no
> need for v4.

Good catch! Fixed!

> Patches 1-4 LGTM. Even though it's possible (in patch 1) that a shutdown
> may be attempted on a subflow that doesn't need it, tcp_shutdown() has
> the logic to handle that.

Indeed, that's why I didn't add extra checks before calling
tcp_shutdown(). I just added this note in the commit message:

 Note that tcp_shutdown() will check the subflow state, so no need to do
 that again before calling it.

> Reviewed-by: Mat Martineau <martineau@kernel.org>

Thank you for the review here and in the packetdrill PR (applied and
backported)!

Now in our tree (fixes for -net):

New patches for t/upstream-net and t/upstream:
- 8cc6e542f150: mptcp: propagate shutdown to subflows when possible
- 07cc3161a4a6: selftests: mptcp: connect: catch IO errors on listen side
- 4f6152573302: selftests: mptcp: avoid spurious errors on TCP disconnect
- 42c1206d2751: selftests: mptcp: print trailing bytes with od
- 83fd25a7e401: selftests: mptcp: connect: print pcap prefix
- Results: c62821417a96..c3bb5aa20b8e (export-net)
- Results: 623fb42fecc5..6ff9d54223e3 (export)

Tests are now in progress:

- export-net:
https://github.com/multipath-tcp/mptcp_net-next/commit/b6b24ce1bd89c893d3e7f9262f7a9b92aaa19736/checks
- export:
https://github.com/multipath-tcp/mptcp_net-next/commit/bf5738cace8ab21e5bba9ebe1d8fa7175407f222/checks

Cheers,
Matt
-- 
Sponsored by the NGI0 Core fund.