[PATCH v2 mptcp-next] selftests: mptcp: fix diag instability

Paolo Abeni posted 1 patch 2 years, 1 month ago
Patches applied successfully (tree, apply log)
git fetch https://github.com/multipath-tcp/mptcp_net-next tags/patchew/db50196c67b30e7b89d241faffc78e5ffbb04e83.1643976999.git.pabeni@redhat.com
Maintainers: Shuah Khan <shuah@kernel.org>, Mat Martineau <mathew.j.martineau@linux.intel.com>, "David S. Miller" <davem@davemloft.net>, Jakub Kicinski <kuba@kernel.org>, Matthieu Baerts <matthieu.baerts@tessares.net>
tools/testing/selftests/net/mptcp/diag.sh | 44 +++++++++++++++++++----
1 file changed, 37 insertions(+), 7 deletions(-)
[PATCH v2 mptcp-next] selftests: mptcp: fix diag instability
Posted by Paolo Abeni 2 years, 1 month ago
Instead of waiting for an arbitrary amount of time for the MPTCP
MP_CAPABLE handshake to complete, explicitly wait for the relevant
socket to enter into the established status.

Additionally let the data transfer application use the slowest
transfer mode available (-r), to cope with very slow host, or
high jitter caused by hosting VMs.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
v1 -> v2:
 - use wait_for_ instead larger sleep
 - hopefully better commit message
---
 tools/testing/selftests/net/mptcp/diag.sh | 44 +++++++++++++++++++----
 1 file changed, 37 insertions(+), 7 deletions(-)

diff --git a/tools/testing/selftests/net/mptcp/diag.sh b/tools/testing/selftests/net/mptcp/diag.sh
index 2674ba20d524..ff821025d309 100755
--- a/tools/testing/selftests/net/mptcp/diag.sh
+++ b/tools/testing/selftests/net/mptcp/diag.sh
@@ -71,6 +71,36 @@ chk_msk_remote_key_nr()
 		__chk_nr "grep -c remote_key" $*
 }
 
+# $1: ns, $2: port
+wait_local_port_listen()
+{
+	local listener_ns="${1}"
+	local port="${2}"
+
+	local port_hex i
+
+	port_hex="$(printf "%04X" "${port}")"
+	for i in $(seq 10); do
+		ip netns exec "${listener_ns}" cat /proc/net/tcp | \
+			awk "BEGIN {rc=1} {if (\$2 ~ /:${port_hex}\$/ && \$4 ~ /0A/) {rc=0; exit}} END {exit rc}" &&
+			break
+		sleep 0.1
+	done
+}
+
+wait_connected()
+{
+	local listener_ns="${1}"
+	local port="${2}"
+
+	local port_hex i
+
+	port_hex="$(printf "%04X" "${port}")"
+	for i in $(seq 10); do
+		ip netns exec ${listener_ns} grep -q " 0100007F:${port_hex} " /proc/net/tcp && break
+		sleep 0.1
+	done
+}
 
 trap cleanup EXIT
 ip netns add $ns
@@ -81,15 +111,15 @@ echo "a" | \
 		ip netns exec $ns \
 			./mptcp_connect -p 10000 -l -t ${timeout_poll} \
 				0.0.0.0 >/dev/null &
-sleep 0.1
+wait_local_port_listen $ns 10000
 chk_msk_nr 0 "no msk on netns creation"
 
 echo "b" | \
 	timeout ${timeout_test} \
 		ip netns exec $ns \
-			./mptcp_connect -p 10000 -j -t ${timeout_poll} \
+			./mptcp_connect -p 10000 -r 0 -t ${timeout_poll} \
 				127.0.0.1 >/dev/null &
-sleep 0.1
+wait_connected $ns 10000
 chk_msk_nr 2 "after MPC handshake "
 chk_msk_remote_key_nr 2 "....chk remote_key"
 chk_msk_fallback_nr 0 "....chk no fallback"
@@ -101,13 +131,13 @@ echo "a" | \
 		ip netns exec $ns \
 			./mptcp_connect -p 10001 -l -s TCP -t ${timeout_poll} \
 				0.0.0.0 >/dev/null &
-sleep 0.1
+wait_local_port_listen $ns 10001
 echo "b" | \
 	timeout ${timeout_test} \
 		ip netns exec $ns \
-			./mptcp_connect -p 10001 -j -t ${timeout_poll} \
+			./mptcp_connect -p 10001 -r 0 -t ${timeout_poll} \
 				127.0.0.1 >/dev/null &
-sleep 0.1
+wait_connected $ns 10001
 chk_msk_fallback_nr 1 "check fallback"
 flush_pids
 
@@ -119,7 +149,7 @@ for I in `seq 1 $NR_CLIENTS`; do
 				./mptcp_connect -p $((I+10001)) -l -w 10 \
 					-t ${timeout_poll} 0.0.0.0 >/dev/null &
 done
-sleep 0.1
+wait_local_port_listen $ns $((NR_CLIENTS + 10001))
 
 for I in `seq 1 $NR_CLIENTS`; do
 	echo "b" | \
-- 
2.34.1


Re: [PATCH v2 mptcp-next] selftests: mptcp: fix diag instability
Posted by Matthieu Baerts 2 years, 1 month ago
Hi Paolo,

On 04/02/2022 13:17, Paolo Abeni wrote:
> Instead of waiting for an arbitrary amount of time for the MPTCP
> MP_CAPABLE handshake to complete, explicitly wait for the relevant
> socket to enter into the established status.
> 
> Additionally let the data transfer application use the slowest
> transfer mode available (-r), to cope with very slow host, or
> high jitter caused by hosting VMs.
> 
> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
> ---
> v1 -> v2:
>  - use wait_for_ instead larger sleep
>  - hopefully better commit message

Thank you for the new version!

It is being tested in the background. More than 175 attempts in a slow
environment with a debug kernel and no failures so far!

> diff --git a/tools/testing/selftests/net/mptcp/diag.sh b/tools/testing/selftests/net/mptcp/diag.sh
> index 2674ba20d524..ff821025d309 100755
> --- a/tools/testing/selftests/net/mptcp/diag.sh
> +++ b/tools/testing/selftests/net/mptcp/diag.sh-
> @@ -119,7 +149,7 @@ for I in `seq 1 $NR_CLIENTS`; do
>  				./mptcp_connect -p $((I+10001)) -l -w 10 \
>  					-t ${timeout_poll} 0.0.0.0 >/dev/null &
>  done
> -sleep 0.1
> +wait_local_port_listen $ns $((NR_CLIENTS + 10001))
>  
>  for I in `seq 1 $NR_CLIENTS`; do
>  	echo "b" | \

Do we need the change the last sleep (sleep 1.5) as well? We could wait
for a shorter time but well that's only 1.5 second once. And I guess we
might need to look for each connection because looking at the last one
might not be enough if there was an issue with a previous one.

So best to keep the "sleep 1.5"?

If yes, do you prefer if the test run for a longer time or can I already
apply it?

Just in case:

Reported-and-tested-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/258

Cheers,
Matt
-- 
Tessares | Belgium | Hybrid Access Solutions
www.tessares.net

Re: [PATCH v2 mptcp-next] selftests: mptcp: fix diag instability
Posted by Paolo Abeni 2 years, 1 month ago
On Fri, 2022-02-04 at 16:53 +0100, Matthieu Baerts wrote:
> Hi Paolo,
> 
> On 04/02/2022 13:17, Paolo Abeni wrote:
> > Instead of waiting for an arbitrary amount of time for the MPTCP
> > MP_CAPABLE handshake to complete, explicitly wait for the relevant
> > socket to enter into the established status.
> > 
> > Additionally let the data transfer application use the slowest
> > transfer mode available (-r), to cope with very slow host, or
> > high jitter caused by hosting VMs.
> > 
> > Signed-off-by: Paolo Abeni <pabeni@redhat.com>
> > ---
> > v1 -> v2:
> >  - use wait_for_ instead larger sleep
> >  - hopefully better commit message
> 
> Thank you for the new version!
> 
> It is being tested in the background. More than 175 attempts in a slow
> environment with a debug kernel and no failures so far!
> 
> > diff --git a/tools/testing/selftests/net/mptcp/diag.sh b/tools/testing/selftests/net/mptcp/diag.sh
> > index 2674ba20d524..ff821025d309 100755
> > --- a/tools/testing/selftests/net/mptcp/diag.sh
> > +++ b/tools/testing/selftests/net/mptcp/diag.sh-
> > @@ -119,7 +149,7 @@ for I in `seq 1 $NR_CLIENTS`; do
> >  				./mptcp_connect -p $((I+10001)) -l -w 10 \
> >  					-t ${timeout_poll} 0.0.0.0 >/dev/null &
> >  done
> > -sleep 0.1
> > +wait_local_port_listen $ns $((NR_CLIENTS + 10001))
> >  
> >  for I in `seq 1 $NR_CLIENTS`; do
> >  	echo "b" | \
> 
> Do we need the change the last sleep (sleep 1.5) as well? We could wait
> for a shorter time but well that's only 1.5 second once. And I guess we
> might need to look for each connection because looking at the last one
> might not be enough if there was an issue with a previous one.
> 
> So best to keep the "sleep 1.5"?

We need a more complex test to replace sleep 1.5 reliably. I'll look at
that later, surely in a different patch.
> 
> If yes, do you prefer if the test run for a longer time or can I already
> apply it?

Please, go ahead, thanks!

/P


Re: [PATCH v2 mptcp-next] selftests: mptcp: fix diag instability
Posted by Matthieu Baerts 2 years, 1 month ago
Hi Paolo, Mat,

On 04/02/2022 19:32, Paolo Abeni wrote:
> On Fri, 2022-02-04 at 16:53 +0100, Matthieu Baerts wrote:
>> Hi Paolo,
>>
>> On 04/02/2022 13:17, Paolo Abeni wrote:
>>> Instead of waiting for an arbitrary amount of time for the MPTCP
>>> MP_CAPABLE handshake to complete, explicitly wait for the relevant
>>> socket to enter into the established status.
>>>
>>> Additionally let the data transfer application use the slowest
>>> transfer mode available (-r), to cope with very slow host, or
>>> high jitter caused by hosting VMs.
>>>
>>> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
>>> ---
>>> v1 -> v2:
>>>  - use wait_for_ instead larger sleep
>>>  - hopefully better commit message
>>
>> Thank you for the new version!
>>
>> It is being tested in the background. More than 175 attempts in a slow
>> environment with a debug kernel and no failures so far!
>>
>>> diff --git a/tools/testing/selftests/net/mptcp/diag.sh b/tools/testing/selftests/net/mptcp/diag.sh
>>> index 2674ba20d524..ff821025d309 100755
>>> --- a/tools/testing/selftests/net/mptcp/diag.sh
>>> +++ b/tools/testing/selftests/net/mptcp/diag.sh-
>>> @@ -119,7 +149,7 @@ for I in `seq 1 $NR_CLIENTS`; do
>>>  				./mptcp_connect -p $((I+10001)) -l -w 10 \
>>>  					-t ${timeout_poll} 0.0.0.0 >/dev/null &
>>>  done
>>> -sleep 0.1
>>> +wait_local_port_listen $ns $((NR_CLIENTS + 10001))
>>>  
>>>  for I in `seq 1 $NR_CLIENTS`; do
>>>  	echo "b" | \
>>
>> Do we need the change the last sleep (sleep 1.5) as well? We could wait
>> for a shorter time but well that's only 1.5 second once. And I guess we
>> might need to look for each connection because looking at the last one
>> might not be enough if there was an issue with a previous one.
>>
>> So best to keep the "sleep 1.5"?
> 
> We need a more complex test to replace sleep 1.5 reliably. I'll look at
> that later, surely in a different patch.

If we don't see other issues with the public CI, it might be OK like that!

>> If yes, do you prefer if the test run for a longer time or can I already
>> apply it?
> 
> Please, go ahead, thanks!

Thanks!

Should I add a Fixes tag and put it in -net?

Fixes: df62f2ec3df6 ("selftests/mptcp: add diag interface tests")

@Mat: Also OK for you?

Cheers,
Matt
-- 
Tessares | Belgium | Hybrid Access Solutions
www.tessares.net

Re: [PATCH v2 mptcp-next] selftests: mptcp: fix diag instability
Posted by Mat Martineau 2 years, 1 month ago
On Fri, 4 Feb 2022, Matthieu Baerts wrote:

> Hi Paolo, Mat,
>
> On 04/02/2022 19:32, Paolo Abeni wrote:
>> On Fri, 2022-02-04 at 16:53 +0100, Matthieu Baerts wrote:
>>> Hi Paolo,
>>>
>>> On 04/02/2022 13:17, Paolo Abeni wrote:
>>>> Instead of waiting for an arbitrary amount of time for the MPTCP
>>>> MP_CAPABLE handshake to complete, explicitly wait for the relevant
>>>> socket to enter into the established status.
>>>>
>>>> Additionally let the data transfer application use the slowest
>>>> transfer mode available (-r), to cope with very slow host, or
>>>> high jitter caused by hosting VMs.
>>>>
>>>> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
>>>> ---
>>>> v1 -> v2:
>>>>  - use wait_for_ instead larger sleep
>>>>  - hopefully better commit message
>>>
>>> Thank you for the new version!
>>>
>>> It is being tested in the background. More than 175 attempts in a slow
>>> environment with a debug kernel and no failures so far!
>>>
>>>> diff --git a/tools/testing/selftests/net/mptcp/diag.sh b/tools/testing/selftests/net/mptcp/diag.sh
>>>> index 2674ba20d524..ff821025d309 100755
>>>> --- a/tools/testing/selftests/net/mptcp/diag.sh
>>>> +++ b/tools/testing/selftests/net/mptcp/diag.sh-
>>>> @@ -119,7 +149,7 @@ for I in `seq 1 $NR_CLIENTS`; do
>>>>  				./mptcp_connect -p $((I+10001)) -l -w 10 \
>>>>  					-t ${timeout_poll} 0.0.0.0 >/dev/null &
>>>>  done
>>>> -sleep 0.1
>>>> +wait_local_port_listen $ns $((NR_CLIENTS + 10001))
>>>>
>>>>  for I in `seq 1 $NR_CLIENTS`; do
>>>>  	echo "b" | \
>>>
>>> Do we need the change the last sleep (sleep 1.5) as well? We could wait
>>> for a shorter time but well that's only 1.5 second once. And I guess we
>>> might need to look for each connection because looking at the last one
>>> might not be enough if there was an issue with a previous one.
>>>
>>> So best to keep the "sleep 1.5"?
>>
>> We need a more complex test to replace sleep 1.5 reliably. I'll look at
>> that later, surely in a different patch.
>
> If we don't see other issues with the public CI, it might be OK like that!
>
>>> If yes, do you prefer if the test run for a longer time or can I already
>>> apply it?
>>
>> Please, go ahead, thanks!
>
> Thanks!
>
> Should I add a Fixes tag and put it in -net?
>
> Fixes: df62f2ec3df6 ("selftests/mptcp: add diag interface tests")
>
> @Mat: Also OK for you?
>

Yes, good for me. And:

Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>

--
Mat Martineau
Intel

Re: [PATCH v2 mptcp-next] selftests: mptcp: fix diag instability
Posted by Matthieu Baerts 2 years, 1 month ago
Hi Paolo, Mat,

On 04/02/2022 13:17, Paolo Abeni wrote:
> Instead of waiting for an arbitrary amount of time for the MPTCP
> MP_CAPABLE handshake to complete, explicitly wait for the relevant
> socket to enter into the established status.
> 
> Additionally let the data transfer application use the slowest
> transfer mode available (-r), to cope with very slow host, or
> high jitter caused by hosting VMs.

Thank you for the patch and the review!

Now in our tree (fixes for -net) with Closes, Fixes, my R&Tb and Mat's
RvB tags:

- 3b63b1552ae2: selftests: mptcp: fix diag instability

- Results: 4514f808428c..3150c57070f7

Builds and tests are now in progress:



https://cirrus-ci.com/github/multipath-tcp/mptcp_net-next/export/20220205T103236

https://github.com/multipath-tcp/mptcp_net-next/actions/workflows/build-validation.yml?query=branch:export

Cheers,
Matt
-- 
Tessares | Belgium | Hybrid Access Solutions
www.tessares.net