[PATCH mptcp-next v5 0/3] mptcp: add statistics for mptcp socket in use

menglong8.dong@gmail.com posted 3 patches 1 year, 5 months ago
git fetch https://github.com/multipath-tcp/mptcp_net-next tags/patchew/20221007092922.13169-1-imagedong@tencent.com
Maintainers: Mat Martineau <mathew.j.martineau@linux.intel.com>, Matthieu Baerts <matthieu.baerts@tessares.net>, "David S. Miller" <davem@davemloft.net>, Eric Dumazet <edumazet@google.com>, Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>, Shuah Khan <shuah@kernel.org>
[PATCH mptcp-next v5 0/3] mptcp: add statistics for mptcp socket in use
Posted by menglong8.dong@gmail.com 1 year, 5 months ago
From: Menglong Dong <imagedong@tencent.com>

In the first patch, we do some code cleanup, replacing 'sock->sk'
with 'sk'. In the second patch, we add statistics for MPTCP sockets in
use. And in the third patch, we add a selftest for this feature.
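
To make the new counter easy to observe (a hypothetical helper, not
part of this series), the per-protocol "sockets" (in-use) column of
/proc/net/protocols can be read with a few lines of C:

  /* Hypothetical helper (not part of this series): print the "sockets"
   * (in-use) counter for MPTCP from /proc/net/protocols, where the
   * per-protocol inuse statistics are reported.
   */
  #include <stdio.h>
  #include <string.h>

  int main(void)
  {
          char line[512];
          FILE *f = fopen("/proc/net/protocols", "r");

          if (!f) {
                  perror("fopen");
                  return 1;
          }
          while (fgets(line, sizeof(line), f)) {
                  char name[64];
                  long size, sockets;

                  /* columns: protocol, size, sockets, ... */
                  if (sscanf(line, "%63s %ld %ld", name, &size, &sockets) == 3 &&
                      !strcmp(name, "MPTCP"))
                          printf("MPTCP inuse: %ld\n", sockets);
          }
          fclose(f);
          return 0;
  }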

Changes since v4:
- rebase to solve merge conflict

Changes since v3:
- rename MPTCP_DESTROIED to MPTCP_DESTROYED in the second patch

Changes since v2:
- add testing

Changes since v1:
- split the code cleanup into the first patch.
- decrease the in-use statistics for listening mptcp sockets with
  mptcp_listen_inuse_dec()
- add the MPTCP_DESTROIED flag to record whether mptcp_destroy_common()
  was called on the msk. For the fallback case, we need to decrease the
  statistics only once, and mptcp_destroy_common() can be called more
  than once (see the sketch after this list).
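
A minimal user-space sketch of that "decrease only once" rule
(hypothetical names, not the actual kernel change): a DESTROYED-style
flag guards the single decrement even if destroy_common() runs twice.

  #include <stdatomic.h>
  #include <stdio.h>

  static atomic_int inuse = 1;                     /* one msk in use */
  static atomic_flag destroyed = ATOMIC_FLAG_INIT;

  static void destroy_common(void)
  {
          /* only the first caller decrements; later calls are no-ops */
          if (!atomic_flag_test_and_set(&destroyed))
                  atomic_fetch_sub(&inuse, 1);
  }

  int main(void)
  {
          destroy_common();
          destroy_common();      /* fallback path: called a second time */
          printf("inuse = %d\n", atomic_load(&inuse));     /* 0, not -1 */
          return 0;
  }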

Menglong Dong (3):
  mptcp: introduce 'sk' to replace 'sock->sk' in mptcp_listen()
  mptcp: add statistics for mptcp socket in use
  selftest: mptcp: add test for mptcp socket in use

 net/mptcp/protocol.c                      | 35 ++++++++++++++++++-----
 net/mptcp/protocol.h                      |  1 +
 net/mptcp/subflow.c                       |  3 ++
 tools/testing/selftests/net/mptcp/diag.sh | 28 ++++++++++++++++++
 4 files changed, 60 insertions(+), 7 deletions(-)

-- 
2.37.2
Re: [PATCH mptcp-next v5 0/3] mptcp: add statistics for mptcp socket in use
Posted by Mat Martineau 1 year, 5 months ago
On Fri, 7 Oct 2022, menglong8.dong@gmail.com wrote:

> From: Menglong Dong <imagedong@tencent.com>
>
> In the first patch, we do some code cleanup, replacing 'sock->sk'
> with 'sk'. In the second patch, we add statistics for MPTCP sockets in
> use. And in the third patch, we add a selftest for this feature.
>

I think v5 looks ok for the export branch - we can get more test coverage 
there and update if needed.

Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>

> Changes since v4:
> - rebase to solve merge conflict
>
> Changes since v3:
> - rename MPTCP_DESTROIED to MPTCP_DESTROYED in the second patch
>
> Changes since v2:
> - add testing
>
> Changes since v1:
> - split the code cleanup into the first patch.
> - decrease the in-use statistics for listening mptcp sockets with
>  mptcp_listen_inuse_dec()
> - add the MPTCP_DESTROIED flag to record whether mptcp_destroy_common()
>  was called on the msk. For the fallback case, we need to decrease the
>  statistics only once, and mptcp_destroy_common() can be called more
>  than once.
>
> Menglong Dong (3):
>  mptcp: introduce 'sk' to replace 'sock->sk' in mptcp_listen()
>  mptcp: add statistics for mptcp socket in use
>  selftest: mptcp: add test for mptcp socket in use
>
> net/mptcp/protocol.c                      | 35 ++++++++++++++++++-----
> net/mptcp/protocol.h                      |  1 +
> net/mptcp/subflow.c                       |  3 ++
> tools/testing/selftests/net/mptcp/diag.sh | 28 ++++++++++++++++++
> 4 files changed, 60 insertions(+), 7 deletions(-)
>
> -- 
> 2.37.2
>
>

--
Mat Martineau
Intel
Re: [PATCH mptcp-next v5 0/3] mptcp: add statistics for mptcp socket in use
Posted by Matthieu Baerts 1 year, 5 months ago
Hi Menglong, Mat,

On 12/10/2022 02:53, Mat Martineau wrote:
> On Fri, 7 Oct 2022, menglong8.dong@gmail.com wrote:
> 
>> From: Menglong Dong <imagedong@tencent.com>
>>
>> In the first patch, we do some code cleanup, replacing 'sock->sk'
>> with 'sk'. In the second patch, we add statistics for MPTCP sockets in
>> use. And in the third patch, we add a selftest for this feature.
>>
> 
> I think v5 looks ok for the export branch - we can get more test
> coverage there and update if needed.
> 
> Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>

Thank you for the patch and the review!

I was going to apply it but I just noticed our public CI complained
about the new test during its second test:


 KVM Validation: debug:
  - Unstable: 2 failed test(s): selftest_diag selftest_mptcp_join 🔴:
  - Task: https://cirrus-ci.com/task/6538547548127232
  - Summary: https://api.cirrus-ci.com/v1/artifact/task/6538547548127232/summary/summary.txt


The useful bit:

----------------------- 8< -----------------------
not ok 1 test: selftest_diag.tap # exit=7
# no msk on netns creation                          [  ok  ]
# listen match for dport 10000                      [  ok  ]
# listen match for sport 10000                      [  ok  ]
# listen match for saddr and sport                  [  ok  ]
# all listen sockets                                [  ok  ]
# after MPC handshake                               [  ok  ]
# ....chk remote_key                                [  ok  ]
# ....chk no fallback                               [  ok  ]
# msk in use statistics                             [  ok  ]
# msk in use statistics                             [  ok  ]
# check fallback                                    [  ok  ]
# many msk socket present                           [  ok  ]
# msk in use statistics                             [ fail ] expected 249 found 248
# msk in use statistics                             [  ok  ]
----------------------- 8< -----------------------

Do you mind looking at that please?


Please note that the public CI is known to be slow. Here it was failing
with a debug kernel config (KASAN, KMEMLEAK, etc.; see the
kernel/configs/debug.config file).

If you want to reproduce it in closer conditions, you can do something
like this to build the same kernel and run diag.sh in a loop until it fails:

  $ cd [kernel source code]
  $ echo "run_loop run_selftest_one ./diag.sh" > .virtme-exec-run
  $ docker run -v "${PWD}:${PWD}:rw" -w "${PWD}" --privileged --rm -it \
      --pull always mptcp/mptcp-upstream-virtme-docker:latest \
      auto-debug

For more details:

    https://github.com/multipath-tcp/mptcp-upstream-virtme-docker


Cheers,
Matt
-- 
Tessares | Belgium | Hybrid Access Solutions
www.tessares.net
Re: [PATCH mptcp-next v5 0/3] mptcp: add statistics for mptcp socket in use
Posted by Menglong Dong 1 year, 5 months ago
On Wed, Oct 12, 2022 at 5:29 PM Matthieu Baerts
<matthieu.baerts@tessares.net> wrote:
>
> Hi Menglong, Mat,
>
> On 12/10/2022 02:53, Mat Martineau wrote:
> > On Fri, 7 Oct 2022, menglong8.dong@gmail.com wrote:
> >
> >> From: Menglong Dong <imagedong@tencent.com>
> >>
> >> In the first patch, we do some code cleanup, replacing 'sock->sk'
> >> with 'sk'. In the second patch, we add statistics for MPTCP sockets in
> >> use. And in the third patch, we add a selftest for this feature.
> >>
> >
> > I think v5 looks ok for the export branch - we can get more test
> > coverage there and update if needed.
> >
> > Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
>
> Thank you for the patch and the review!
>
> I was going to apply it but I just noticed our public CI complained
> about the new test during its second test:
>
>
>  KVM Validation: debug:
>   - Unstable: 2 failed test(s): selftest_diag selftest_mptcp_join 🔴:
>   - Task: https://cirrus-ci.com/task/6538547548127232
>   - Summary: https://api.cirrus-ci.com/v1/artifact/task/6538547548127232/summary/summary.txt
>
>
> The useful bit:
>
> ----------------------- 8< -----------------------
> not ok 1 test: selftest_diag.tap # exit=7
> # no msk on netns creation                          [  ok  ]
> # listen match for dport 10000                      [  ok  ]
> # listen match for sport 10000                      [  ok  ]
> # listen match for saddr and sport                  [  ok  ]
> # all listen sockets                                [  ok  ]
> # after MPC handshake                               [  ok  ]
> # ....chk remote_key                                [  ok  ]
> # ....chk no fallback                               [  ok  ]
> # msk in use statistics                             [  ok  ]
> # msk in use statistics                             [  ok  ]
> # check fallback                                    [  ok  ]
> # many msk socket present                           [  ok  ]
> # msk in use statistics                             [ fail ] expected 249 found 248
> # msk in use statistics                             [  ok  ]
> ----------------------- 8< -----------------------
>
> Do you mind looking at that please?

Hello,

With the test command on my computer, the 'many msk socket present'
test always fails; below is the log:

# no msk on netns creation                          [  ok  ]
# listen match for dport 10000                      [  ok  ]
# listen match for sport 10000                      [  ok  ]
# listen match for saddr and sport                  [  ok  ]
# all listen sockets                                [  ok  ]
# after MPC handshake                               [  ok  ]
# ....chk remote_key                                [  ok  ]
# ....chk no fallback                               [  ok  ]
# msk in use statistics                             [  ok  ]
# msk in use statistics                             [  ok  ]
# check fallback                                    [  ok  ]
# main_loop_s: timed out
# connect(): Connection refused
# main_loop_s: timed out
# main_loop_s: timed out
# connect(): Connection refused
# connect(): Connection refused
# main_loop_s: timed out
# main_loop_s: timed out
# connect(): Connection refused
# main_loop_s: timed out
# main_loop_s: timed out
# connect(): Connection refused
# connect(): Connection refused
# main_loop_s: timed out
# main_loop_s: timed out
# connect(): Connection refused
# connect(): Connection refused
# main_loop_s: timed out
# connect(): Connection refused
# main_loop_s: timed out
# main_loop_s: timed out
# main_loop_s: timed out
# connect(): Connection refused
# connect(): Connection refused
# main_loop_s: timed out
# connect(): Connection refused
# connect(): Connection refused
# connect(): Connection refused
# main_loop_s: timed out
# connect(): Connection refused
# main_loop_s: timed out
# main_loop_s: timed out
# main_loop_s: timed out
# main_loop_s: timed out
# connect(): Connection refused
# main_loop_s: timed out
# connect(): Connection refused
# connect(): Connection refused
# main_loop_s: timed out
# main_loop_s: timed out
# connect(): Connection refused
# main_loop_s: timed out
# connect(): Connection refused
# main_loop_s: timed out
# main_loop_s: timed out
# connect(): Connection refused
# main_loop_s: timed out
# main_loop_s: timed out
# connect(): Connection refused
# connect(): Connection refused
# connect(): Connection refused
# connect(): Connection refused
# main_loop_s: timed out
# main_loop_s: timed out
# connect(): Connection refused
# connect(): Connection refused
# main_loop_s: timed out
# main_loop_s: timed out
# connect(): Connection refused
# main_loop_s: timed out
# connect(): Connection refused
# main_loop_s: timed out
# connect(): Connection refused
# main_loop_s: timed out
# connect(): Connection refused
# main_loop_s: timed out
# main_loop_s: timed out
# connect(): Connection refused
# connect(): Connection refused
# main_loop_s: timed out
# connect(): Connection refused
# main_loop_s: timed out
# connect(): Connection refused
# main_loop_s: timed out
# main_loop_s: timed out
# connect(): Connection refused
# main_loop_s: timed out
# connect(): Connection refused
# main_loop_s: timed out
# connect(): Connection refused
# main_loop_s: timed out
# connect(): Connection refused
# connect(): Connection refused
# main_loop_s: timed out
# main_loop_s: timed out
# main_loop_s: timed out
# main_loop_s: timed out
# connect(): Connection refused
# main_loop_s: timed out
# connect(): Connection refused
# main_loop_s: timed out
# main_loop_s: timed out
# connect(): Connection refused
# main_loop_s: timed out
# connect(): Connection refused
# main_loop_s: timed out
# connect(): Connection refused
# main_loop_s: timed out
# main_loop_s: timed out
# connect(): Connection refused
# main_loop_s: timed out
# main_loop_s: timed out
# main_loop_s: timed out
# main_loop_s: timed out
# main_loop_s: timed out
# main_loop_s: timed out
# connect(): Connection refused
# connect(): Connection refused
# connect(): Connection refused
# connect(): Connection refused
# connect(): Connection refused
# connect(): Connection refused
# connect(): Connection refused
# connect(): Connection refused
# connect(): Connection refused
# connect(): Connection refused
# connect(): Connection refused
# connect(): Connection refused
# connect(): Connection refused
# connect(): Connection refused
# many msk socket present                           [ fail ] timeout while expecting 200 max 0 last 0
# msk in use statistics                             [ fail ] expected 201 found 0
# msk in use statistics                             [  ok  ]

Do you have any ideas?

I suspect the failure in your tests has something to do with the '+1'
in 'chk_msk_inuse $((NR_CLIENTS*2+1))'. I added the '+1' because I found
that 'mptcp_connect' with '-r' won't exit after the first flush_pids(),
only after the second one. I'm not sure if this opinion is right.

Thanks!
Menglong Dong

>
>
> Please note that the public CI is known to be slow. Here it was failing
> with a debug kernel config (KASAN, KMEMLEAK, etc.; see the
> kernel/configs/debug.config file).
>
> If you want to reproduce it in closer conditions, you can do something
> like this to build the same kernel and run diag.sh in a loop until it fails:
>
>   $ cd [kernel source code]
>   $ echo "run_loop run_selftest_one ./diag.sh" > .virtme-exec-run
>   $ docker run -v "${PWD}:${PWD}:rw" -w "${PWD}" --privileged --rm -it \
>       --pull always mptcp/mptcp-upstream-virtme-docker:latest \
>       auto-debug
>
> For more details:
>
>     https://github.com/multipath-tcp/mptcp-upstream-virtme-docker
>
>
> Cheers,
> Matt
> --
> Tessares | Belgium | Hybrid Access Solutions
> www.tessares.net
Re: [PATCH mptcp-next v5 0/3] mptcp: add statistics for mptcp socket in use
Posted by Matthieu Baerts 1 year, 4 months ago
Hi Menglong,

On 28/10/2022 05:18, Menglong Dong wrote:
> On Wed, Oct 12, 2022 at 5:29 PM Matthieu Baerts
> <matthieu.baerts@tessares.net> wrote:
>>
>> Hi Menglong, Mat,
>>
>> On 12/10/2022 02:53, Mat Martineau wrote:
>>> On Fri, 7 Oct 2022, menglong8.dong@gmail.com wrote:
>>>
>>>> From: Menglong Dong <imagedong@tencent.com>
>>>>
> >>>> In the first patch, we do some code cleanup, replacing 'sock->sk'
> >>>> with 'sk'. In the second patch, we add statistics for MPTCP sockets in
> >>>> use. And in the third patch, we add a selftest for this feature.
>>>>
>>>
>>> I think v5 looks ok for the export branch - we can get more test
>>> coverage there and update if needed.
>>>
>>> Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
>>
>> Thank you for the patch and the review!
>>
>> I was going to apply it but I just noticed our public CI complained
>> about the new test during its second test:

(...)
>> Do you mind looking at that please?
> 
> Hello,
> 
> With the test command on my computer, the 'many msk socket present'
> test always fails; below is the log:
> 
> # no msk on netns creation                          [  ok  ]
> # listen match for dport 10000                      [  ok  ]
> # listen match for sport 10000                      [  ok  ]
> # listen match for saddr and sport                  [  ok  ]
> # all listen sockets                                [  ok  ]
> # after MPC handshake                               [  ok  ]
> # ....chk remote_key                                [  ok  ]
> # ....chk no fallback                               [  ok  ]
> # msk in use statistics                             [  ok  ]
> # msk in use statistics                             [  ok  ]
> # check fallback                                    [  ok  ]
> # main_loop_s: timed out
> # connect(): Connection refused
> # main_loop_s: timed out
> # main_loop_s: timed out
> # connect(): Connection refused
> # connect(): Connection refused

(...)

> # many msk socket present                           [ fail ] timeout while expecting 200 max 0 last 0
> # msk in use statistics                             [ fail ] expected 201 found 0
> # msk in use statistics                             [  ok  ]
> 
> Do you have any ideas?

The test "many msk socket present" creates 100 listening MPTCP sockets,
wait maximum 1 second for them to be ready, then tries to connect to it
and check counters reported by 'ss'.

Could it be that your environment is quite slow and needs more than one
second for the 100 listening MPTCP sockets to be ready? Could you try
this patch, please?

-------------------- 8< --------------------
diff --git a/tools/testing/selftests/net/mptcp/diag.sh b/tools/testing/selftests/net/mptcp/diag.sh
index 515859a5168b..72ff61f5dd99 100755
--- a/tools/testing/selftests/net/mptcp/diag.sh
+++ b/tools/testing/selftests/net/mptcp/diag.sh
@@ -150,7 +150,7 @@ wait_local_port_listen()
        local port_hex i

        port_hex="$(printf "%04X" "${port}")"
-       for i in $(seq 10); do
+       for i in $(seq 20); do
                ip netns exec "${listener_ns}" cat /proc/net/tcp | \
                        awk "BEGIN {rc=1} {if (\$2 ~ /:${port_hex}\$/ &&
\$4 ~ /0A/) {rc=0; exit}} END {exit rc}" &&
                        break
-------------------- 8< --------------------

(or even with 'seq 50')


Or is there something limiting the number of sockets being created on
your side? I see that you got 60 connect() errors, so maybe something
caps it at 40 somewhere?


> I suspect the failure in your tests has something to do with the '+1'
> in 'chk_msk_inuse $((NR_CLIENTS*2+1))'. I add the '+1' as I find
> that the 'mptcp_connect' with '-r' won't exit after flush_pids(),
> until the second flush_pids(). I'm sure if this opinion is right.

Mmm, flush_pids() sends a 'SIGUSR1' signal to all pids but:

(1) it doesn't wait for the mptcp_connect processes to finish: could you
try with this patch, please?

-------------------- 8< --------------------
diff --git a/tools/testing/selftests/net/mptcp/diag.sh b/tools/testing/selftests/net/mptcp/diag.sh
index 515859a5168b..4ad70ead3bdf 100755
--- a/tools/testing/selftests/net/mptcp/diag.sh
+++ b/tools/testing/selftests/net/mptcp/diag.sh
@@ -16,6 +16,12 @@ flush_pids()
        sleep 1.1

        ip netns pids "${ns}" | xargs --no-run-if-empty kill -SIGUSR1 &>/dev/null
+
+       for _ in $(seq 10); do
+               [ -z "$(ip netns pids "${ns}")" ] && break
+
+               sleep 0.1
+       done
 }

 cleanup()
-------------------- 8< --------------------


(2) I think mptcp_connect might not stop in some conditions when SIGUSR1
is received. Could you try with this patch, please?

-------------------- 8< --------------------
diff --git a/tools/testing/selftests/net/mptcp/mptcp_connect.c b/tools/testing/selftests/net/mptcp/mptcp_connect.c
index e54653ea2ed4..596e3344d5af 100644
--- a/tools/testing/selftests/net/mptcp/mptcp_connect.c
+++ b/tools/testing/selftests/net/mptcp/mptcp_connect.c
@@ -586,7 +586,7 @@ static int copyfd_io_poll(int infd, int peerfd, int outfd, bool *in_closed_after
                char rbuf[8192];
                ssize_t len;

-               if (fds.events == 0)
+               if (fds.events == 0 || quit)
                        break;

                switch (poll(&fds, 1, poll_timeout)) {
-------------------- 8< --------------------

If these fixes are needed, it is best to add them in (a) separate
commit(s). I can also send them if you prefer; they are not directly
related to your series.

Cheers,
Matt
-- 
Tessares | Belgium | Hybrid Access Solutions
www.tessares.net
Re: [PATCH mptcp-next v5 0/3] mptcp: add statistics for mptcp socket in use
Posted by Menglong Dong 1 year, 4 months ago
On Tue, Nov 1, 2022 at 1:07 AM Matthieu Baerts
<matthieu.baerts@tessares.net> wrote:
>
> Hi Menglong,
>
> On 28/10/2022 05:18, Menglong Dong wrote:
> > On Wed, Oct 12, 2022 at 5:29 PM Matthieu Baerts
> > <matthieu.baerts@tessares.net> wrote:
> >>
> >> Hi Menglong, Mat,
> >>
> >> On 12/10/2022 02:53, Mat Martineau wrote:
> >>> On Fri, 7 Oct 2022, menglong8.dong@gmail.com wrote:
> >>>
> >>>> From: Menglong Dong <imagedong@tencent.com>
> >>>>
> >>>> In the first patch, we do some code cleanup, replacing 'sock->sk'
> >>>> with 'sk'. In the second patch, we add statistics for MPTCP sockets in
> >>>> use. And in the third patch, we add a selftest for this feature.
> >>>>
> >>>
> >>> I think v5 looks ok for the export branch - we can get more test
> >>> coverage there and update if needed.
> >>>
> >>> Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
> >>
> >> Thank you for the patch and the review!
> >>
> >> I was going to apply it but I just noticed our public CI complained
> >> about the new test during its second test:
>
> (...)
> >> Do you mind looking at that please?
> >
> > Hello,
> >
> > With the test command on my computer, the 'many msk socket present'
> > test always fails; below is the log:
> >
> > # no msk on netns creation                          [  ok  ]
> > # listen match for dport 10000                      [  ok  ]
> > # listen match for sport 10000                      [  ok  ]
> > # listen match for saddr and sport                  [  ok  ]
> > # all listen sockets                                [  ok  ]
> > # after MPC handshake                               [  ok  ]
> > # ....chk remote_key                                [  ok  ]
> > # ....chk no fallback                               [  ok  ]
> > # msk in use statistics                             [  ok  ]
> > # msk in use statistics                             [  ok  ]
> > # check fallback                                    [  ok  ]
> > # main_loop_s: timed out
> > # connect(): Connection refused
> > # main_loop_s: timed out
> > # main_loop_s: timed out
> > # connect(): Connection refused
> > # connect(): Connection refused
>
> (...)
>
> > # many msk socket present                           [ fail ] timeout while expecting 200 max 0 last 0
> > # msk in use statistics                             [ fail ] expected 201 found 0
> > # msk in use statistics                             [  ok  ]
> >
> > Do you have any ideas?
>
> The test "many msk socket present" creates 100 listening MPTCP sockets,
> wait maximum 1 second for them to be ready, then tries to connect to it
> and check counters reported by 'ss'.
>
> Could it be that your environment is quite slow and needs more than one
> second for the 100 listening MPTCP sockets to be ready? Could you try
> this patch, please?
>
> -------------------- 8< --------------------
> diff --git a/tools/testing/selftests/net/mptcp/diag.sh b/tools/testing/selftests/net/mptcp/diag.sh
> index 515859a5168b..72ff61f5dd99 100755
> --- a/tools/testing/selftests/net/mptcp/diag.sh
> +++ b/tools/testing/selftests/net/mptcp/diag.sh
> @@ -150,7 +150,7 @@ wait_local_port_listen()
>         local port_hex i
>
>         port_hex="$(printf "%04X" "${port}")"
> -       for i in $(seq 10); do
> +       for i in $(seq 20); do
>                 ip netns exec "${listener_ns}" cat /proc/net/tcp | \
>                 awk "BEGIN {rc=1} {if (\$2 ~ /:${port_hex}\$/ && \$4 ~ /0A/) {rc=0; exit}} END {exit rc}" &&
>                         break
> -------------------- 8< --------------------
>
> (or even with 'seq 50')
>
>
> Or is there something limiting the number of sockets being created on
> your side? I see that you got 60 connect() errors, so maybe something
> caps it at 40 somewhere?
>

You are right, the qemu that runs the tests is quite slow on my
computer. After increasing 'timeout_poll', the '-w' value passed to
mptcp_connect, etc., it finally works.

Meanwhile, I figured out why the test failed. On a fast computer,
mptcp_connect with '-r' won't exit after flush_pids(). However, on a
slow computer, such as qemu, it exits because of the 'timeout'
command.

>
> > I suspect the failure in your tests has something to do with the '+1'
> > in 'chk_msk_inuse $((NR_CLIENTS*2+1))'. I added the '+1' because I found
> > that 'mptcp_connect' with '-r' won't exit after the first flush_pids(),
> > only after the second one. I'm not sure if this opinion is right.
>
> Mmm, flush_pids() sends a 'SIGUSR1' signal to all pids but:
>
> (1) it doesn't wait for the mptcp_connect processes to finish: could you
> try with this patch, please?
>
> -------------------- 8< --------------------
> diff --git a/tools/testing/selftests/net/mptcp/diag.sh b/tools/testing/selftests/net/mptcp/diag.sh
> index 515859a5168b..4ad70ead3bdf 100755
> --- a/tools/testing/selftests/net/mptcp/diag.sh
> +++ b/tools/testing/selftests/net/mptcp/diag.sh
> @@ -16,6 +16,12 @@ flush_pids()
>         sleep 1.1
>
>         ip netns pids "${ns}" | xargs --no-run-if-empty kill -SIGUSR1 &>/dev/null
> +
> +       for _ in $(seq 10); do
> +               [ -z "$(ip netns pids "${ns}")" ] && break
> +
> +               sleep 0.1
> +       done
>  }
>
>  cleanup()
> -------------------- 8< --------------------
>
>
> (2) I think mptcp_connect might not stop in some conditions when SIGUSR1
> is received. Could you try with this patch, please?
>
> -------------------- 8< --------------------
> diff --git a/tools/testing/selftests/net/mptcp/mptcp_connect.c b/tools/testing/selftests/net/mptcp/mptcp_connect.c
> index e54653ea2ed4..596e3344d5af 100644
> --- a/tools/testing/selftests/net/mptcp/mptcp_connect.c
> +++ b/tools/testing/selftests/net/mptcp/mptcp_connect.c
> @@ -586,7 +586,7 @@ static int copyfd_io_poll(int infd, int peerfd, int outfd, bool *in_closed_after
>                 char rbuf[8192];
>                 ssize_t len;
>
> -               if (fds.events == 0)
> +               if (fds.events == 0 || quit)
>                         break;
>
>                 switch (poll(&fds, 1, poll_timeout)) {
> -------------------- 8< --------------------
>
> If these fixes are needed, it is best to add them in (a) separate
> commit(s). I can also send them if you prefer; they are not directly
> related to your series.
>

The 'quit' flag is not used anywhere, which also puzzled me before.
It seems mptcp_connect can exit normally with the second patch you
proposed, plus the following:

@@ -692,7 +692,7 @@ static int copyfd_io_poll(int infd, int peerfd, int outfd, bool *in_closed_after
        }

        /* leave some time for late join/announce */
-       if (cfg_remove)
+       if (cfg_remove && !quit)
                usleep(cfg_wait);
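
For reference, a self-contained sketch of the pattern we are relying
on (hypothetical code, not the actual mptcp_connect.c): a SIGUSR1
handler sets 'quit', and both the poll loop and the final wait check
it, so the process exits promptly instead of sleeping out its timeouts.

  #include <poll.h>
  #include <signal.h>
  #include <stdio.h>
  #include <unistd.h>

  static volatile sig_atomic_t quit;

  static void on_sigusr1(int sig)
  {
          quit = 1;
  }

  int main(void)
  {
          struct pollfd fds = { .fd = STDIN_FILENO, .events = POLLIN };

          signal(SIGUSR1, on_sigusr1);

          while (!quit) {                 /* mirrors the '|| quit' check */
                  if (poll(&fds, 1, 100) > 0 && (fds.revents & POLLIN))
                          break;          /* real code would do I/O here */
          }

          if (!quit)                      /* mirrors the '&& !quit' check */
                  usleep(100000);         /* leave time for late join */

          printf("exiting, quit=%d\n", (int)quit);
          return 0;
  }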

I'll add them in a separate commit.

Thanks!
Menglong Dong

> Cheers,
> Matt
> --
> Tessares | Belgium | Hybrid Access Solutions
> www.tessares.net
Re: [PATCH mptcp-next v5 0/3] mptcp: add statistics for mptcp socket in use
Posted by Menglong Dong 1 year, 5 months ago
On Wed, Oct 12, 2022 at 5:29 PM Matthieu Baerts
<matthieu.baerts@tessares.net> wrote:
>
> Hi Menglong, Mat,
>
> On 12/10/2022 02:53, Mat Martineau wrote:
> > On Fri, 7 Oct 2022, menglong8.dong@gmail.com wrote:
> >
> >> From: Menglong Dong <imagedong@tencent.com>
> >>
> >> In the first patch, we do some code cleanup, replacing 'sock->sk'
> >> with 'sk'. In the second patch, we add statistics for MPTCP sockets in
> >> use. And in the third patch, we add a selftest for this feature.
> >>
> >
> > I think v5 looks ok for the export branch - we can get more test
> > coverage there and update if needed.
> >
> > Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
>
> Thank you for the patch and the review!
>
> I was going to apply it but I just noticed our public CI complained
> about the new test during its second test:
>
>
>  KVM Validation: debug:
>   - Unstable: 2 failed test(s): selftest_diag selftest_mptcp_join 🔴:
>   - Task: https://cirrus-ci.com/task/6538547548127232
>   - Summary: https://api.cirrus-ci.com/v1/artifact/task/6538547548127232/summary/summary.txt
>
>
> The useful bit:
>
> ----------------------- 8< -----------------------
> not ok 1 test: selftest_diag.tap # exit=7
> # no msk on netns creation                          [  ok  ]
> # listen match for dport 10000                      [  ok  ]
> # listen match for sport 10000                      [  ok  ]
> # listen match for saddr and sport                  [  ok  ]
> # all listen sockets                                [  ok  ]
> # after MPC handshake                               [  ok  ]
> # ....chk remote_key                                [  ok  ]
> # ....chk no fallback                               [  ok  ]
> # msk in use statistics                             [  ok  ]
> # msk in use statistics                             [  ok  ]
> # check fallback                                    [  ok  ]
> # many msk socket present                           [  ok  ]
> # msk in use statistics                             [ fail ] expected 249 found 248
> # msk in use statistics                             [  ok  ]
> ----------------------- 8< -----------------------
>
> Do you mind looking at that please?
>
>
> Please note that the public CI is known to be slow. Here it was failing
> with a debug kernel config (KASAN, KMEMLEAK, etc.; see the
> kernel/configs/debug.config file).
>
> If you want to reproduce it in closer conditions, you can do something
> like this to build the same kernel and run diag.sh in a loop until it fails:
>

Hello,

It seems there is something wrong with the selftest I added.
Maybe it needs some sleep between flush_pids() and
chk_msk_fallback_nr().

>   $ cd [kernel source code]
>   $ echo "run_loop run_selftest_one ./diag.sh" > .virtme-exec-run
>   $ docker run -v "${PWD}:${PWD}:rw" -w "${PWD}" --privileged --rm -it \
>       --pull always mptcp/mptcp-upstream-virtme-docker:latest \
>       auto-debug
>

Thanks for the command, I'll try to reproduce and solve
it.

Menglong Dong

> For more details:
>
>     https://github.com/multipath-tcp/mptcp-upstream-virtme-docker
>
>
> Cheers,
> Matt
> --
> Tessares | Belgium | Hybrid Access Solutions
> www.tessares.net