[PATCH bpf-next] tools/testing/selftests/bpf/test_tc_tunnel.sh: Fix wait for server bind

Marco Leogrande posted 1 patch 1 week, 2 days ago
Posted by Marco Leogrande 1 week, 2 days ago
Commit f803bcf9208a ("selftests/bpf: Prevent client connect before
server bind in test_tc_tunnel.sh") added code that waits for the
netcat server to start before the netcat client attempts to connect to
it. However, not all calls to 'server_listen' were guarded.

This patch adds the existing 'wait_for_port' guard after the remaining
call to 'server_listen'.

Fixes: f803bcf9208a ("selftests/bpf: Prevent client connect before server bind in test_tc_tunnel.sh")
Signed-off-by: Marco Leogrande <leogrande@google.com>
---
 tools/testing/selftests/bpf/test_tc_tunnel.sh | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tools/testing/selftests/bpf/test_tc_tunnel.sh b/tools/testing/selftests/bpf/test_tc_tunnel.sh
index 7989ec6084545..cb55a908bb0d7 100755
--- a/tools/testing/selftests/bpf/test_tc_tunnel.sh
+++ b/tools/testing/selftests/bpf/test_tc_tunnel.sh
@@ -305,6 +305,7 @@ else
 	client_connect
 	verify_data
 	server_listen
+	wait_for_port ${port} ${netcat_opt}
 fi
 
 # serverside, use BPF for decap
-- 
2.47.0.338.g60cca15819-goog
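
The guard added above is a bounded wait for the server's port to be bound. A minimal sketch of that pattern follows. The helper names `wait_until` and `port_is_listening` are hypothetical and are not taken from test_tc_tunnel.sh, whose actual `wait_for_port` may be implemented differently; the sketch also assumes `ss` from iproute2 is available.

```shell
# Hypothetical helpers for illustration; not the selftest's wait_for_port.

# Poll a predicate command until it succeeds or the retries run out.
# usage: wait_until <retries> <delay-seconds> <command> [args...]
wait_until() {
	local retries=$1 delay=$2 i=0
	shift 2
	while [ "$i" -lt "$retries" ]; do
		if "$@"; then
			return 0
		fi
		sleep "$delay"
		i=$((i + 1))
	done
	return 1
}

# Succeeds once some TCP socket is listening on the given port.
# This inspects the socket table instead of connecting, because a probe
# connection may use up the one connection a single-shot listener accepts.
port_is_listening() {
	local port=$1
	ss -Htln "sport = :${port}" | grep -q .
}

# Example (not executed here): wait up to about 2 seconds for port 9000.
#   wait_until 20 0.1 port_is_listening 9000 || echo "server never bound"
```

Keeping the retry loop separate from the predicate means the same loop can guard other conditions; the retry count and delay shown in the example comment are arbitrary.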
Re: [PATCH bpf-next] tools/testing/selftests/bpf/test_tc_tunnel.sh: Fix wait for server bind
Posted by Stanislav Fomichev 1 week, 1 day ago
On 12/02, Marco Leogrande wrote:
> Commit f803bcf9208a ("selftests/bpf: Prevent client connect before
> server bind in test_tc_tunnel.sh") added code that waits for the
> netcat server to start before the netcat client attempts to connect to
> it. However, not all calls to 'server_listen' were guarded.
> 
> This patch adds the existing 'wait_for_port' guard after the remaining
> call to 'server_listen'.
> 
> Fixes: f803bcf9208a ("selftests/bpf: Prevent client connect before server bind in test_tc_tunnel.sh")
> Signed-off-by: Marco Leogrande <leogrande@google.com>
> ---
>  tools/testing/selftests/bpf/test_tc_tunnel.sh | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/tools/testing/selftests/bpf/test_tc_tunnel.sh b/tools/testing/selftests/bpf/test_tc_tunnel.sh
> index 7989ec6084545..cb55a908bb0d7 100755
> --- a/tools/testing/selftests/bpf/test_tc_tunnel.sh
> +++ b/tools/testing/selftests/bpf/test_tc_tunnel.sh
> @@ -305,6 +305,7 @@ else
>  	client_connect
>  	verify_data
>  	server_listen
> +	wait_for_port ${port} ${netcat_opt}
>  fi
>  
>  # serverside, use BPF for decap
> -- 
> 2.47.0.338.g60cca15819-goog
> 

Do you see this failing in your CI or in the BPF CI? It seems ok
to add wait_for_port here, but the likelihood of the issue seems
minuscule. There is a bunch of ip/tc/etc calls between this
server_listen and the next client_connect (and I'd be surprised to hear
that netcat is still not listening by the time we reach next
client_connect).
Re: [PATCH bpf-next] tools/testing/selftests/bpf/test_tc_tunnel.sh: Fix wait for server bind
Posted by Marco Leogrande 1 week, 1 day ago
On Mon, Dec 2, 2024 at 4:15 PM Stanislav Fomichev <stfomichev@gmail.com> wrote:
> Do you see this failing in your CI or in the BPF CI?

I see this failing in our internal CI, in around 1% to 2% of the CI runs.

> It seems ok
> to add wait_for_port here, but the likelihood of the issue seems
> minuscule. There is a bunch of ip/tc/etc calls between this
> server_listen and the next client_connect (and I'd be surprised to hear
> that netcat is still not listening by the time we reach next
> client_connect).

I'm surprised as well, and I've not found a good correlation with the
root cause of the delayed server start, besides generic "slowness".

You also make a good point - by calling wait_for_port this early we
"waste" the opportunity to run the other ip commands in parallel in
the meantime.
I had considered moving this wait down, just before the next
client_connect, but I concluded it might be less readable since it
would be so distant from the server_listen it pairs with. But I can
make that change if it seems better.
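
To make the trade-off concrete, here is a rough sketch of the deferred placement, where the wait sits just before the client would connect so that the intermediate setup overlaps with server startup. Every name in it (`start_server`, `reconfigure`, `wait_for_ready`, `ready_file`) is a hypothetical stand-in, and the sleeps only simulate a slow server start and the ip/tc calls; the patch as posted uses the immediate placement instead.

```shell
# All names below are hypothetical stand-ins, not the selftest's functions.
ready_file=$(mktemp -u)

start_server() {
	# Stand-in for a backgrounded netcat: becomes "ready" after a delay.
	( sleep 0.3; : > "$ready_file" ) &
}

reconfigure() {
	# Stand-in for the ip/tc calls between listen and connect.
	sleep 0.3
}

wait_for_ready() {
	# Poll up to 50 times, 0.1 s apart, for the readiness marker.
	n=0
	while [ "$n" -lt 50 ]; do
		[ -e "$ready_file" ] && return 0
		sleep 0.1
		n=$((n + 1))
	done
	return 1
}

# Deferred placement: the setup runs while the server is still starting,
# and the wait happens right before a client would connect.
start_server
reconfigure
wait_for_ready && deferred_result=ready
rm -f "$ready_file"
```

With the immediate placement, `wait_for_ready` would directly follow `start_server`, which keeps the pair visually together at the cost of running the setup only after the server is up.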
Re: [PATCH bpf-next] tools/testing/selftests/bpf/test_tc_tunnel.sh: Fix wait for server bind
Posted by Stanislav Fomichev 1 week ago
On 12/03, Marco Leogrande wrote:
> On Mon, Dec 2, 2024 at 4:15 PM Stanislav Fomichev <stfomichev@gmail.com> wrote:
> > Do you see this failing in your CI or in the BPF CI?
> 
> I see this failing in our internal CI, in around 1% to 2% of the CI runs.
> 
> > It seems ok
> > to add wait_for_port here, but the likelihood of the issue seems
> > minuscule. There is a bunch of ip/tc/etc calls between this
> > server_listen and the next client_connect (and I'd be surprised to hear
> > that netcat is still not listening by the time we reach next
> > client_connect).
> 
> I'm surprised as well, and I've not found a good correlation with the
> root cause of the delayed server start, besides generic "slowness".
> 
> You also make a good point - by calling wait_for_port this early we
> "waste" the opportunity to run the other ip commands in parallel in
> the meantime.
> I had considered moving this wait down, just before the next
> client_connect, but I concluded it might be less readable since it
> would be so distant from the server_listen it pairs with. But I can
> make that change if it seems better.

Thanks for the details, let's keep it as is.

Acked-by: Stanislav Fomichev <sdf@fomichev.me>