[v2] selftests: mptcp: connect: cover alt modes

[PATCH net v2 0/2] selftests: mptcp: connect: cover alt modes

Posted by Matthieu Baerts (NGI0) 2 months ago

mptcp_connect.sh can be executed manually with "-m <MODE>" and "-C" to
make sure everything works as expected when using "mmap" and "sendfile"
modes instead of "poll", and with the MPTCP checksum support.

These modes should be validated, but they are not when the selftests are
executed via the kselftest helpers. It means that most CIs validating
these selftests, like NIPA for the net development trees and LKFT for
the stable ones, are not covering these modes.

To fix that, new test programs have been added, simply calling
mptcp_connect.sh with the right parameters.
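
As a rough illustration only (this is not the content of the patches),
such a wrapper could boil down to the shell sketch below. The "-m <MODE>"
and "-C" options come from the description above; the helper names
variant_args and run_variant are hypothetical, and the real wrappers may
be structured differently.

```shell
#!/bin/bash
# Sketch in the spirit of the mptcp_connect_*.sh wrappers; the files
# actually added by the series may differ.

# Hypothetical helper: print the options passed to mptcp_connect.sh
# for a given variant name.
variant_args() {
	case "$1" in
	mmap|sendfile) printf '%s\n' "-m $1" ;;
	checksum)      printf '%s\n' "-C" ;;
	*)             return 1 ;;
	esac
}

# Hypothetical helper: run mptcp_connect.sh found in directory $1 for
# variant $2, forwarding any remaining arguments unchanged.
run_variant() {
	local dir="$1" variant="$2" args
	shift 2
	args="$(variant_args "${variant}")" || return 1
	# Word splitting of ${args} is intended here ("-m mmap" is two words).
	"${dir}/mptcp_connect.sh" ${args} "$@"
}
```

A wrapper file would then only need one line such as
run_variant "$(dirname "${0}")" mmap "${@}".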

The first patch can be backported up to v5.6, and the second one up to
v5.14.

Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
---
Changes in v2:
- force using a different prefix in the subtests to avoid having the
  same test names in all mptcp_connect*.sh selftests.
- Link to v1: https://lore.kernel.org/r/20250714-net-mptcp-sft-connect-alt-v1-0-bf1c5abbe575@kernel.org

---
Matthieu Baerts (NGI0) (2):
      selftests: mptcp: connect: also cover alt modes
      selftests: mptcp: connect: also cover checksum

 tools/testing/selftests/net/mptcp/Makefile                  | 3 ++-
 tools/testing/selftests/net/mptcp/mptcp_connect_checksum.sh | 5 +++++
 tools/testing/selftests/net/mptcp/mptcp_connect_mmap.sh     | 5 +++++
 tools/testing/selftests/net/mptcp/mptcp_connect_sendfile.sh | 5 +++++
 4 files changed, 17 insertions(+), 1 deletion(-)
---
base-commit: b640daa2822a39ff76e70200cb2b7b892b896dce
change-id: 20250714-net-mptcp-sft-connect-alt-c1aaf073ef4e

Best regards,
-- 
Matthieu Baerts (NGI0) <matttbe@kernel.org>

Re: [PATCH net v2 0/2] selftests: mptcp: connect: cover alt modes

Posted by patchwork-bot+netdevbpf@kernel.org 1 month, 3 weeks ago

Hello:

This series was applied to netdev/net.git (main)
by Jakub Kicinski <kuba@kernel.org>:

On Tue, 15 Jul 2025 20:43:27 +0200 you wrote:
> mptcp_connect.sh can be executed manually with "-m <MODE>" and "-C" to
> make sure everything works as expected when using "mmap" and "sendfile"
> modes instead of "poll", and with the MPTCP checksum support.
> 
> These modes should be validated, but they are not when the selftests are
> executed via the kselftest helpers. It means that most CIs validating
> these selftests, like NIPA for the net development trees and LKFT for
> the stable ones, are not covering these modes.
> 
> [...]

Here is the summary with links:
  - [net,v2,1/2] selftests: mptcp: connect: also cover alt modes
    https://git.kernel.org/netdev/net/c/37848a456fc3
  - [net,v2,2/2] selftests: mptcp: connect: also cover checksum
    https://git.kernel.org/netdev/net/c/fdf0f60a2bb0

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html

Re: [PATCH net v2 0/2] selftests: mptcp: connect: cover alt modes

Posted by Jakub Kicinski 2 months ago

On Tue, 15 Jul 2025 20:43:27 +0200 Matthieu Baerts (NGI0) wrote:
> mptcp_connect.sh can be executed manually with "-m <MODE>" and "-C" to
> make sure everything works as expected when using "mmap" and "sendfile"
> modes instead of "poll", and with the MPTCP checksum support.
> 
> These modes should be validated, but they are not when the selftests are
> executed via the kselftest helpers. It means that most CIs validating
> these selftests, like NIPA for the net development trees and LKFT for
> the stable ones, are not covering these modes.
> 
> To fix that, new test programs have been added, simply calling
> mptcp_connect.sh with the right parameters.
> 
> The first patch can be backported up to v5.6, and the second one up to
> v5.14.

Looks like the failures that Paolo flagged yesterday:

https://lore.kernel.org/all/a7a89aa2-7354-42c7-8219-99a3cafd3b33@redhat.com/

are back as soon as this hit NIPA :(

https://netdev.bots.linux.dev/contest.html?branch=net-next-2025-07-16--00-00&executor=vmksft-mptcp&pw-n=0&pass=0

No idea why TBH, the tests run sequentially and connect.sh runs before
any of the new ones.

I'm gonna leave it in patchwork in case the next run is clean,
please use pw-bot to discard them if they keep failing.

Re: [PATCH net v2 0/2] selftests: mptcp: connect: cover alt modes

Posted by Jakub Kicinski 2 months ago

On Tue, 15 Jul 2025 18:53:08 -0700 Jakub Kicinski wrote:
> On Tue, 15 Jul 2025 20:43:27 +0200 Matthieu Baerts (NGI0) wrote:
> > mptcp_connect.sh can be executed manually with "-m <MODE>" and "-C" to
> > make sure everything works as expected when using "mmap" and "sendfile"
> > modes instead of "poll", and with the MPTCP checksum support.
> > 
> > These modes should be validated, but they are not when the selftests are
> > executed via the kselftest helpers. It means that most CIs validating
> > these selftests, like NIPA for the net development trees and LKFT for
> > the stable ones, are not covering these modes.
> > 
> > To fix that, new test programs have been added, simply calling
> > mptcp_connect.sh with the right parameters.
> > 
> > The first patch can be backported up to v5.6, and the second one up to
> > v5.14.  
> 
> Looks like the failures that Paolo flagged yesterday:
> 
> https://lore.kernel.org/all/a7a89aa2-7354-42c7-8219-99a3cafd3b33@redhat.com/
> 
> are back as soon as this hit NIPA :(
> 
> https://netdev.bots.linux.dev/contest.html?branch=net-next-2025-07-16--00-00&executor=vmksft-mptcp&pw-n=0&pass=0
> 
> No idea why TBH, the tests run sequentially and connect.sh run before
> any of the new ones.
> 
> I'm gonna leave it in patchwork in case the next run is clean,
> please use pw-bot to discard them if they keep failing.

It failed again on the latest run, in a somewhat more concerning way :(

# (duration 30279ms) [FAIL] file received by server does not match (in, out):
# -rw------- 1 root root 5171914 Jul 16 05:24 /tmp/tmp.W2c96hxSIz
# Trailing bytes are: 
# w,ѐ)-rw------- 1 root root 5166208 Jul 16 05:24 /tmp/tmp.s33PNcrN6M
# Trailing bytes are: 
# (<v /&^<ֱrnFsaC7INFO: with peek mode: saveAfterPeek

https://netdev-3.bots.linux.dev/vmksft-mptcp/results/211121/4-mptcp-connect-sh/stdout

BTW feeding the random data into hexdump-like formatter seems
advisable? :P
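
For illustration, a hexdump-like step could look like the sketch below.
It relies only on POSIX tail and od; the function name
print_trailing_bytes and the default of 32 bytes are hypothetical and
not taken from the selftests.

```shell
#!/bin/bash
# Hypothetical helper, not taken from mptcp_connect.sh: print the last
# bytes of a file as hex instead of raw binary, so logs stay readable.
print_trailing_bytes() {
	local file="$1" count="${2:-32}"

	echo "Trailing bytes are:"
	# -A n: no offset column, -t x1: one hex value per byte,
	# -v: do not collapse repeated lines into "*".
	tail -c "${count}" "${file}" | od -A n -t x1 -v
}
```

Each byte then shows up as two hex digits, and the next log line starts
on its own line.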

Re: [PATCH net v2 0/2] selftests: mptcp: connect: cover alt modes

Posted by Matthieu Baerts 2 months ago

Hi Jakub,

On 16/07/2025 16:26, Jakub Kicinski wrote:
> On Tue, 15 Jul 2025 18:53:08 -0700 Jakub Kicinski wrote:
>> On Tue, 15 Jul 2025 20:43:27 +0200 Matthieu Baerts (NGI0) wrote:
>>> mptcp_connect.sh can be executed manually with "-m <MODE>" and "-C" to
>>> make sure everything works as expected when using "mmap" and "sendfile"
>>> modes instead of "poll", and with the MPTCP checksum support.
>>>
>>> These modes should be validated, but they are not when the selftests are
>>> executed via the kselftest helpers. It means that most CIs validating
>>> these selftests, like NIPA for the net development trees and LKFT for
>>> the stable ones, are not covering these modes.
>>>
>>> To fix that, new test programs have been added, simply calling
>>> mptcp_connect.sh with the right parameters.
>>>
>>> The first patch can be backported up to v5.6, and the second one up to
>>> v5.14.  
>>
>> Looks like the failures that Paolo flagged yesterday:
>>
>> https://lore.kernel.org/all/a7a89aa2-7354-42c7-8219-99a3cafd3b33@redhat.com/
>>
>> are back as soon as this hit NIPA :(
>>
>> https://netdev.bots.linux.dev/contest.html?branch=net-next-2025-07-16--00-00&executor=vmksft-mptcp&pw-n=0&pass=0
>>
>> No idea why TBH, the tests run sequentially and connect.sh run before
>> any of the new ones.

And just to be sure, no CPU or IO overload at that moment? I didn't see
such errors reported by our CI, but I can try to reproduce them locally
in different conditions.

>> I'm gonna leave it in patchwork in case the next run is clean,
>> please use pw-bot to discard them if they keep failing.

Oops, sorry I forgot to reply: when I checked in the morning, the last
two builds were clean. I wanted to check the next one, then I forgot :)

> It failed again on the latest run, in a somewhat more concerning way :(
> 
> # (duration 30279ms) [FAIL] file received by server does not match (in, out):
> # -rw------- 1 root root 5171914 Jul 16 05:24 /tmp/tmp.W2c96hxSIz
> # Trailing bytes are: 
> # w,ѐ)-rw------- 1 root root 5166208 Jul 16 05:24 /tmp/tmp.s33PNcrN6M
> # Trailing bytes are: 
> # (<v /&^<rnFsaC7INFO: with peek mode: saveAfterPeek
> 
> https://netdev-3.bots.linux.dev/vmksft-mptcp/results/211121/4-mptcp-connect-sh/stdout

I see, the error can be a bit scary :)

If I'm not mistaken, there was a poll timeout error before. When it is
detected, the test is stopped. After each test, even in case of errors,
the received file is compared with the sending one. So here, this
concerning error is expected.
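
The flow described above can be sketched as follows. This is a
simplified illustration, not the selftest's code: the function names and
the exact messages are assumptions, and only cmp and ls are relied upon.

```shell
#!/bin/bash
# Hypothetical sketch: the file check runs even when the transfer itself
# failed, so a timeout is followed by a "does not match" error.
check_transfer() {
	local in="$1" out="$2" what="$3"

	if ! cmp -s "${in}" "${out}"; then
		echo "[FAIL] ${what} does not match (in, out):"
		ls -l "${in}" "${out}"
		return 1
	fi
}

run_one_test() {
	local in="$1" out="$2" transfer_ret="$3" ret=0

	[ "${transfer_ret}" -eq 0 ] || ret=1   # e.g. a poll timeout
	# Compared unconditionally: a truncated output file fails here too.
	check_transfer "${in}" "${out}" "file received by server" || ret=1
	return "${ret}"
}
```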

Anyway, even if the errors are not caused by this series, I think it is
better to delay these patches while we are investigating that:

pw-bot: cr


> BTW feeding the random data into hexdump-like formatter seems
> advisable? :P

It is just to check that the CIs can correctly parse random data :-D

Cheers,
Matt
-- 
Sponsored by the NGI0 Core fund.

Re: [PATCH net v2 0/2] selftests: mptcp: connect: cover alt modes

Posted by Jakub Kicinski 2 months ago

On Wed, 16 Jul 2025 16:55:21 +0200 Matthieu Baerts wrote:
> >> Looks like the failures that Paolo flagged yesterday:
> >>
> >> https://lore.kernel.org/all/a7a89aa2-7354-42c7-8219-99a3cafd3b33@redhat.com/
> >>
> >> are back as soon as this hit NIPA :(
> >>
> >> https://netdev.bots.linux.dev/contest.html?branch=net-next-2025-07-16--00-00&executor=vmksft-mptcp&pw-n=0&pass=0
> >>
> >> No idea why TBH, the tests run sequentially and connect.sh run before
> >> any of the new ones.  
> 
> And just to be sure, no CPU or IO overload at that moment? I didn't see
> such errors reported by our CI, but I can try to reproduce them locally
> in different conditions.

None that I can see. The test ran ~10min after all the builds completed,
and we wait now for the CPU load to die down and writeback to finish
before we kick off VMs. The VMs for various tests are running at that
point, the CPU util averaged across cores is 66%.

Re: [PATCH net v2 0/2] selftests: mptcp: connect: cover alt modes

Posted by Matthieu Baerts 2 months ago

On 16/07/2025 17:36, Jakub Kicinski wrote:
> On Wed, 16 Jul 2025 16:55:21 +0200 Matthieu Baerts wrote:
>>>> Looks like the failures that Paolo flagged yesterday:
>>>>
>>>> https://lore.kernel.org/all/a7a89aa2-7354-42c7-8219-99a3cafd3b33@redhat.com/
>>>>
>>>> are back as soon as this hit NIPA :(
>>>>
>>>> https://netdev.bots.linux.dev/contest.html?branch=net-next-2025-07-16--00-00&executor=vmksft-mptcp&pw-n=0&pass=0
>>>>
>>>> No idea why TBH, the tests run sequentially and connect.sh run before
>>>> any of the new ones.  
>>
>> And just to be sure, no CPU or IO overload at that moment? I didn't see
>> such errors reported by our CI, but I can try to reproduce them locally
>> in different conditions.
> 
> None that I can see. The test run ~10min after all the builds completed,
> and we wait now for the CPU load to die down and writeback to finish
> before we kick off VMs. The VMs for various tests are running at that
> point, the CPU util averaged across cores is 66%.

Thank you for having checked, and for the explanations!

OK, so running stress-ng in parallel to try to reproduce the issue
might not help. We will investigate.

Cheers,
Matt
-- 
Sponsored by the NGI0 Core fund.

Re: [PATCH net v2 0/2] selftests: mptcp: connect: cover alt modes

Posted by Jakub Kicinski 2 months ago

On Wed, 16 Jul 2025 18:35:11 +0200 Matthieu Baerts wrote:
> >> And just to be sure, no CPU or IO overload at that moment? I didn't see
> >> such errors reported by our CI, but I can try to reproduce them locally
> >> in different conditions.  
> > 
> > None that I can see. The test run ~10min after all the builds completed,
> > and we wait now for the CPU load to die down and writeback to finish
> > before we kick off VMs. The VMs for various tests are running at that
> > point, the CPU util averaged across cores is 66%.  
> 
> Thank you for having checked, and for the explanations!
> 
> OK, so maybe running stress-ng in parallel to be able to reproduce the
> issue might not help. We will investigate.

The connect tests failed again overnight. Now I see why Paolo was
responding on Eric's series: that seems like a more likely culprit.

Re: [PATCH net v2 0/2] selftests: mptcp: connect: cover alt modes

Posted by Matthieu Baerts 2 months ago

Hi Jakub,

On 17/07/2025 16:42, Jakub Kicinski wrote:
> On Wed, 16 Jul 2025 18:35:11 +0200 Matthieu Baerts wrote:
>>>> And just to be sure, no CPU or IO overload at that moment? I didn't see
>>>> such errors reported by our CI, but I can try to reproduce them locally
>>>> in different conditions.  
>>>
>>> None that I can see. The test run ~10min after all the builds completed,
>>> and we wait now for the CPU load to die down and writeback to finish
>>> before we kick off VMs. The VMs for various tests are running at that
>>> point, the CPU util averaged across cores is 66%.  
>>
>> Thank you for having checked, and for the explanations!
>>
>> OK, so maybe running stress-ng in parallel to be able to reproduce the
>> issue might not help. We will investigate.
> 
> connect tests failed again overnight. Now I see why Paolo was
> responding on Eric's series, that seems like a more likely culprit..

Good point, Paolo was certainly right, as always :)

We do need to investigate. Note that it might be hard for me to do that
the next few days as I'm travelling for work, but we are tracking the issue:

  https://github.com/multipath-tcp/mptcp_net-next/issues/574

I see that you already marked the mptcp-connect-sh selftest as ignored,
so I guess we are not causing other troubles with the CI. (We could then
also apply this series here and ignore the new tests, but it is also
fine for me to wait.)

Cheers,
Matt
-- 
Sponsored by the NGI0 Core fund.

Re: [PATCH net v2 0/2] selftests: mptcp: connect: cover alt modes

Posted by Jakub Kicinski 2 months ago

On Fri, 18 Jul 2025 01:49:24 +0200 Matthieu Baerts wrote:
> I see that you already marked the mptcp-connect-sh selftest as ignored,
> so I guess we are not causing other troubles with the CI. (We could then
> also apply this series here and ignore the new tests, but it is also
> fine for me to wait.)

If you're okay either way I'd rather wait. From our perspective the new
tests would go straight into the ignore bucket.

Re: [PATCH net v2 0/2] selftests: mptcp: connect: cover alt modes

Posted by Jakub Kicinski 2 months ago

On Thu, 17 Jul 2025 18:33:46 -0700 Jakub Kicinski wrote:
> On Fri, 18 Jul 2025 01:49:24 +0200 Matthieu Baerts wrote:
> > I see that you already marked the mptcp-connect-sh selftest as ignored,
> > so I guess we are not causing other troubles with the CI. (We could then
> > also apply this series here and ignore the new tests, but it is also
> > fine for me to wait.)  
> 
> If you're okay either way I'd rather wait. From our perspective the new
> tests would go straight into the ignore bucket.

Restoring now, given Paolo's fixes.