I was going to send the related commit upstream, but then I realised that
in a slow environment (e.g. public CI) with a debug kernel, the default
10-second timeout can be too short:

  086 multiple bind to allow joins v4
        join Rx                             [ OK ]
        join Tx                             [ OK ]
  main_loop_s: timed out
        add addr rx                         [ OK ]
        add addr echo rx                    [ OK ]
  ./mptcp_join.sh: line 3305: kill: (2032) - No such process

This doesn't affect the results, because it happens after the test, when
checking the counters. Still, it is better to fix that by using an
infinite timeout: the process will be killed at the end.

Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
---
 tools/testing/selftests/net/mptcp/mptcp_join.sh | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/tools/testing/selftests/net/mptcp/mptcp_join.sh b/tools/testing/selftests/net/mptcp/mptcp_join.sh
index 05faacefb719..e0cdb9c662aa 100755
--- a/tools/testing/selftests/net/mptcp/mptcp_join.sh
+++ b/tools/testing/selftests/net/mptcp/mptcp_join.sh
@@ -3293,7 +3293,7 @@ bind_tests()
 
 		# Launching another app listening on a different address
 		# Note: it could be a totally different app, e.g. nc, socat, ...
-		ip netns exec ${ns1} ./mptcp_connect -l -p "$(get_port)" \
+		ip netns exec ${ns1} ./mptcp_connect -l -t -1 -p "$(get_port)" \
 			-s MPTCP 10.0.2.1 &
 		extra_bind=$!
 
@@ -3315,7 +3315,7 @@ bind_tests()
 
 		# Launching another app listening on a different address
 		# Note: it could be a totally different app, e.g. nc, socat, ...
-		ip netns exec ${ns1} ./mptcp_connect -l -p "$(get_port)" \
+		ip netns exec ${ns1} ./mptcp_connect -l -t -1 -p "$(get_port)" \
 			-s MPTCP dead:beef:2::1 &
 		extra_bind=$!
 
@@ -3340,7 +3340,7 @@ bind_tests()
 
 		wait_ll_ready $ns1 # to be able to bind
 		wait_ll_ready $ns2 # also needed to bind on the client side
-		ip netns exec ${ns1} ./mptcp_connect -l -p "$(get_port)" \
+		ip netns exec ${ns1} ./mptcp_connect -l -t -1 -p "$(get_port)" \
 			-s MPTCP "${ns1ll2}%ns1eth2" &
 		extra_bind=$!
 
@@ -3371,7 +3371,7 @@ bind_tests()
 
 		wait_ll_ready $ns1 # to be able to bind
 		wait_ll_ready $ns2 # also needed to bind on the client side
-		ip netns exec ${ns1} ./mptcp_connect -l -p "$(get_port)" \
+		ip netns exec ${ns1} ./mptcp_connect -l -t -1 -p "$(get_port)" \
 			-s MPTCP "${ns1ll2}%ns1eth2" &
 		extra_bind=$!
 
---
base-commit: 6354d0b328258d1974cc7d32a82983b2bd3e5871
change-id: 20251024-sft-fix-bind-timeout-7e7c47a5d50a

Best regards,
-- 
Matthieu Baerts (NGI0) <matttbe@kernel.org>

Hi Matt,

Thanks for this fix.

On Fri, 2025-10-24 at 12:02 +0200, Matthieu Baerts (NGI0) wrote:
> I was going to send the related commit upstream, but then I realised
> in
> a slow environment (e.g. public CI), with a debug kernel, the default
> 10 seconds timeout can be too short:
> 
>   086 multiple bind to allow joins v4
>         join Rx                             [ OK ]
>         join Tx                             [ OK ]
>   main_loop_s: timed out
>         add addr rx                         [ OK ]
>         add addr echo rx                    [ OK ]
>   ./mptcp_join.sh: line 3305: kill: (2032) - No such process
> 
> This doesn't affect the results, because it happens after the test,
> when
> checking the counters. Still, better to fix that by using an infinite
> timeout: the process will be killed at the end.
> 
> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
> ---
>  tools/testing/selftests/net/mptcp/mptcp_join.sh | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/tools/testing/selftests/net/mptcp/mptcp_join.sh
> b/tools/testing/selftests/net/mptcp/mptcp_join.sh
> index 05faacefb719..e0cdb9c662aa 100755
> --- a/tools/testing/selftests/net/mptcp/mptcp_join.sh
> +++ b/tools/testing/selftests/net/mptcp/mptcp_join.sh
> @@ -3293,7 +3293,7 @@ bind_tests()
>  
>  		# Launching another app listening on a different
> address
>  		# Note: it could be a totally different app, e.g.
> nc, socat, ...
> -		ip netns exec ${ns1} ./mptcp_connect -l -p
> "$(get_port)" \
> +		ip netns exec ${ns1} ./mptcp_connect -l -t -1 -p
> "$(get_port)" \

A timeout of -1 means block indefinitely, right? Do you think using the
default 30 seconds (-t ${timeout_poll}) would be sufficient?

Thanks,
-Geliang

>  			-s MPTCP 10.0.2.1 &
>  		extra_bind=$!
>  
> @@ -3315,7 +3315,7 @@ bind_tests()
>  
>  		# Launching another app listening on a different
> address
>  		# Note: it could be a totally different app, e.g.
> nc, socat, ...
> -		ip netns exec ${ns1} ./mptcp_connect -l -p
> "$(get_port)" \
> +		ip netns exec ${ns1} ./mptcp_connect -l -t -1 -p
> "$(get_port)" \
>  			-s MPTCP dead:beef:2::1 &
>  		extra_bind=$!
>  
> @@ -3340,7 +3340,7 @@ bind_tests()
>  
>  		wait_ll_ready $ns1 # to be able to bind
>  		wait_ll_ready $ns2 # also needed to bind on the
> client side
> -		ip netns exec ${ns1} ./mptcp_connect -l -p
> "$(get_port)" \
> +		ip netns exec ${ns1} ./mptcp_connect -l -t -1 -p
> "$(get_port)" \
>  			-s MPTCP "${ns1ll2}%ns1eth2" &
>  		extra_bind=$!
>  
> @@ -3371,7 +3371,7 @@ bind_tests()
>  
>  		wait_ll_ready $ns1 # to be able to bind
>  		wait_ll_ready $ns2 # also needed to bind on the
> client side
> -		ip netns exec ${ns1} ./mptcp_connect -l -p
> "$(get_port)" \
> +		ip netns exec ${ns1} ./mptcp_connect -l -t -1 -p
> "$(get_port)" \
>  			-s MPTCP "${ns1ll2}%ns1eth2" &
>  		extra_bind=$!
>  
> 
> ---
> base-commit: 6354d0b328258d1974cc7d32a82983b2bd3e5871
> change-id: 20251024-sft-fix-bind-timeout-7e7c47a5d50a
> 
> Best regards,
                
Hi Geliang,

On 24/10/2025 14:56, Geliang Tang wrote:
> Hi Matt,
> 
> Thanks for this fix.
> 
> On Fri, 2025-10-24 at 12:02 +0200, Matthieu Baerts (NGI0) wrote:
>> I was going to send the related commit upstream, but then I realised
>> in
>> a slow environment (e.g. public CI), with a debug kernel, the default
>> 10 seconds timeout can be too short:
>>
>>   086 multiple bind to allow joins v4
>>         join Rx                             [ OK ]
>>         join Tx                             [ OK ]
>>   main_loop_s: timed out
>>         add addr rx                         [ OK ]
>>         add addr echo rx                    [ OK ]
>>   ./mptcp_join.sh: line 3305: kill: (2032) - No such process
>>
>> This doesn't affect the results, because it happens after the test,
>> when
>> checking the counters. Still, better to fix that by using an infinite
>> timeout: the process will be killed at the end.
>>
>> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
>> ---
>>  tools/testing/selftests/net/mptcp/mptcp_join.sh | 8 ++++----
>>  1 file changed, 4 insertions(+), 4 deletions(-)
>>
>> diff --git a/tools/testing/selftests/net/mptcp/mptcp_join.sh
>> b/tools/testing/selftests/net/mptcp/mptcp_join.sh
>> index 05faacefb719..e0cdb9c662aa 100755
>> --- a/tools/testing/selftests/net/mptcp/mptcp_join.sh
>> +++ b/tools/testing/selftests/net/mptcp/mptcp_join.sh
>> @@ -3293,7 +3293,7 @@ bind_tests()
>>  
>>  		# Launching another app listening on a different
>> address
>>  		# Note: it could be a totally different app, e.g.
>> nc, socat, ...
>> -		ip netns exec ${ns1} ./mptcp_connect -l -p
>> "$(get_port)" \
>> +		ip netns exec ${ns1} ./mptcp_connect -l -t -1 -p
>> "$(get_port)" \
> 
> A timeout of -1 means block indefinitely, right? Do you think using the
> default 30 seconds (-t ${timeout_poll}) is sufficient?

No need to use the same default timeout: this process is executed in the
background, and we explicitly kill it at the end. Worst case, the other
process times out after 30 seconds, and we will still kill this one.

So it looks easier/better to use -1, no?
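
To illustrate the point, here is a minimal sketch of the race. It is not
taken from mptcp_join.sh: 'sleep' stands in for ./mptcp_connect, and
run_case is a hypothetical helper. A listener that exits on its own before
the final kill gives the "No such process" case, while one without a
timeout is still there to be killed.

```shell
#!/bin/bash
# Sketch only: 'sleep' stands in for "./mptcp_connect -l ..." and run_case is
# a hypothetical helper, not a function from mptcp_join.sh.
# $1 is how long the background "listener" lives before exiting on its own.
run_case() {
	local lifetime="$1"
	local extra_bind

	# Keep the stand-in off stdout so command substitution does not wait on it.
	sleep "${lifetime}" >/dev/null 2>&1 &
	extra_bind=$!

	sleep 2 # simulated slow checks, e.g. on a debug kernel

	if kill "${extra_bind}" 2>/dev/null; then
		echo "killed at the end"
	else
		# Without the redirection, bash reports "No such process" here.
		echo "already gone"
	fi
	wait "${extra_bind}" 2>/dev/null || true
}

short=$(run_case 1)    # listener gives up before the checks are done
long=$(run_case 3600)  # effectively no timeout: still alive when killed
echo "short timeout: ${short}"
echo "no timeout:    ${long}"
```

The long-lived case never depends on how slow the checks are, which is the
reason for not picking a finite value.
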
Cheers,
Matt
-- 
Sponsored by the NGI0 Core fund.
                
Hi Matt,

On Mon, 2025-10-27 at 13:21 +0100, Matthieu Baerts wrote:
> Hi Geliang,
> 
> On 24/10/2025 14:56, Geliang Tang wrote:
> > Hi Matt,
> > 
> > Thanks for this fix.
> > 
> > On Fri, 2025-10-24 at 12:02 +0200, Matthieu Baerts (NGI0) wrote:
> > > I was going to send the related commit upstream, but then I
> > > realised
> > > in
> > > a slow environment (e.g. public CI), with a debug kernel, the
> > > default
> > > 10 seconds timeout can be too short:
> > > 
> > >   086 multiple bind to allow joins v4
> > >         join Rx                             [ OK ]
> > >         join Tx                             [ OK ]
> > >   main_loop_s: timed out
> > >         add addr rx                         [ OK ]
> > >         add addr echo rx                    [ OK ]
> > >   ./mptcp_join.sh: line 3305: kill: (2032) - No such process
> > > 
> > > This doesn't affect the results, because it happens after the
> > > test,
> > > when
> > > checking the counters. Still, better to fix that by using an
> > > infinite
> > > timeout: the process will be killed at the end.
> > > 
> > > Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
> > > ---
> > >  tools/testing/selftests/net/mptcp/mptcp_join.sh | 8 ++++----
> > >  1 file changed, 4 insertions(+), 4 deletions(-)
> > > 
> > > diff --git a/tools/testing/selftests/net/mptcp/mptcp_join.sh
> > > b/tools/testing/selftests/net/mptcp/mptcp_join.sh
> > > index 05faacefb719..e0cdb9c662aa 100755
> > > --- a/tools/testing/selftests/net/mptcp/mptcp_join.sh
> > > +++ b/tools/testing/selftests/net/mptcp/mptcp_join.sh
> > > @@ -3293,7 +3293,7 @@ bind_tests()
> > >  
> > >  		# Launching another app listening on a different
> > > address
> > >  		# Note: it could be a totally different app,
> > > e.g.
> > > nc, socat, ...
> > > -		ip netns exec ${ns1} ./mptcp_connect -l -p
> > > "$(get_port)" \
> > > +		ip netns exec ${ns1} ./mptcp_connect -l -t -1 -p
> > > "$(get_port)" \
> > 
> > A timeout of -1 means block indefinitely, right? Do you think using
> > the
> > default 30 seconds (-t ${timeout_poll}) is sufficient?
> 
> No need to use the same default timeout: this process is executed in
> the
> background, and we explicitly kill it at the end. Worst case, the
> other
> process timeout after 30 seconds, and we will still kill this one.
> 
> So it looks easier/better to use -1, no?

If that's the case, could we set a value that is, for example, twice or
ten times the value of ${timeout_poll}? But it's up to you.

Reviewed-by: Geliang Tang <geliang@kernel.org>

Thanks,
-Geliang
> 
> Cheers,
> Matt
                
Hi Geliang,

On 28/10/2025 02:34, Geliang Tang wrote:
> Hi Matt,
> 
> On Mon, 2025-10-27 at 13:21 +0100, Matthieu Baerts wrote:
>> Hi Geliang,
>>
>> On 24/10/2025 14:56, Geliang Tang wrote:
>>> Hi Matt,
>>>
>>> Thanks for this fix.
>>>
>>> On Fri, 2025-10-24 at 12:02 +0200, Matthieu Baerts (NGI0) wrote:
>>>> I was going to send the related commit upstream, but then I
>>>> realised
>>>> in
>>>> a slow environment (e.g. public CI), with a debug kernel, the
>>>> default
>>>> 10 seconds timeout can be too short:
>>>>
>>>>   086 multiple bind to allow joins v4
>>>>         join Rx                             [ OK ]
>>>>         join Tx                             [ OK ]
>>>>   main_loop_s: timed out
>>>>         add addr rx                         [ OK ]
>>>>         add addr echo rx                    [ OK ]
>>>>   ./mptcp_join.sh: line 3305: kill: (2032) - No such process
>>>>
>>>> This doesn't affect the results, because it happens after the
>>>> test,
>>>> when
>>>> checking the counters. Still, better to fix that by using an
>>>> infinite
>>>> timeout: the process will be killed at the end.
>>>>
>>>> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
>>>> ---
>>>>  tools/testing/selftests/net/mptcp/mptcp_join.sh | 8 ++++----
>>>>  1 file changed, 4 insertions(+), 4 deletions(-)
>>>>
>>>> diff --git a/tools/testing/selftests/net/mptcp/mptcp_join.sh
>>>> b/tools/testing/selftests/net/mptcp/mptcp_join.sh
>>>> index 05faacefb719..e0cdb9c662aa 100755
>>>> --- a/tools/testing/selftests/net/mptcp/mptcp_join.sh
>>>> +++ b/tools/testing/selftests/net/mptcp/mptcp_join.sh
>>>> @@ -3293,7 +3293,7 @@ bind_tests()
>>>>  
>>>>  		# Launching another app listening on a different
>>>> address
>>>>  		# Note: it could be a totally different app,
>>>> e.g.
>>>> nc, socat, ...
>>>> -		ip netns exec ${ns1} ./mptcp_connect -l -p
>>>> "$(get_port)" \
>>>> +		ip netns exec ${ns1} ./mptcp_connect -l -t -1 -p
>>>> "$(get_port)" \
>>>
>>> A timeout of -1 means block indefinitely, right? Do you think using
>>> the
>>> default 30 seconds (-t ${timeout_poll}) is sufficient?
>>
>> No need to use the same default timeout: this process is executed in
>> the
>> background, and we explicitly kill it at the end. Worst case, the
>> other
>> process timeout after 30 seconds, and we will still kill this one.
>>
>> So it looks easier/better to use -1, no?
> 
> If that's the case, could we set a value that is, for example, twice or
> ten times the value of ${timeout_poll}? But it's up to you.

I don't think that's needed: here, we don't wait for any events related
to this process executed in the background, so we don't need a timeout.
If there really is an issue reaching the kill part at the end, the
process will be killed when cleaning the netns.

New patches for t/upstream:
- 284511309756: "squashed" in "selftests: mptcp: join: validate extra
  bind cases"
- Results: cfb47c65c4ba..2ae62d17b961 (export)

Tests are now in progress:

- export:
  https://github.com/multipath-tcp/mptcp_net-next/commit/ef9e74921bb057425352352ee9e9e69df0d0c705/checks

Cheers,
Matt
-- 
Sponsored by the NGI0 Core fund.
                
Hi Matthieu,

Thank you for your modifications, that's great!

Our CI did some validations and here is its report:

- KVM Validation: normal (except selftest_mptcp_join): Success! ✅
- KVM Validation: normal (only selftest_mptcp_join): Success! ✅
- KVM Validation: debug (except selftest_mptcp_join): Unstable: 1 failed test(s): packetdrill_add_addr 🔴
- KVM Validation: debug (only selftest_mptcp_join): Success! ✅
- KVM Validation: btf-normal (only bpftest_all): Success! ✅
- KVM Validation: btf-debug (only bpftest_all): Success! ✅
- Task: https://github.com/multipath-tcp/mptcp_net-next/actions/runs/18777184422
Initiator: Patchew Applier
Commits: https://github.com/multipath-tcp/mptcp_net-next/commits/0db3101c3b80
Patchwork: https://patchwork.kernel.org/project/mptcp/list/?series=1015428

If there are some issues, you can reproduce them using the same environment as
the one used by the CI thanks to a docker image, e.g.:

    $ cd [kernel source code]
    $ docker run -v "${PWD}:${PWD}:rw" -w "${PWD}" --privileged --rm -it \
        --pull always mptcp/mptcp-upstream-virtme-docker:latest \
        auto-normal

For more details:

    https://github.com/multipath-tcp/mptcp-upstream-virtme-docker

Please note that despite all the efforts that have already been made to have a
stable test suite when executed on a public CI like this one, it is possible
that some reported issues are not due to your modifications. Still, do not
hesitate to help us improve that ;-)

Cheers,
MPTCP GH Action bot
Bot operated by Matthieu Baerts (NGI0 Core)
                