:p
atchew
Login
Even more fixes for the in-kernel PM :) There are a few Squash-to patches fist: - a partial revert for a previous Squash-to patch: probably fine not to block the send of the original patch then. - improve code coverage - small fix for the fullmesh case when re-adding the ID 0 endpoint Then a few more fixes: - avoid the RM_SUBFLOW MIB counter to be incremented twice - fix re-re-creation of the ID 0 endpoint - validate that in the selftests - avoid duplicated SUB_CLOSED events - validate that in the selftests - ADD_ADDR 0 is not taking into account by the 'add_addr_accepted' counter, then it should bypass 'accept_addr' & make sure ADD_ADDR 0 is not sent with a new address - validate that in the selftests Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> --- Matthieu Baerts (NGI0) (10): Squash to "mptcp: pm: re-using ID of unused removed ADD_ADDR" Squash to "selftests: mptcp: join: check removing ID 0 endpoint" Squash to "mptcp: pm: reuse ID 0 after delete and re-add" mptcp: pm: do not remove already closed subflows mptcp: pm: fix ID 0 endp usage after multiple re-creations selftests: mptcp: join: check re-re-adding ID 0 endp mptcp: avoid duplicated SUB_CLOSED events selftests: mptcp: join: validate event numbers mptcp: pm: ADD_ADDR 0 is not a new address selftests: mptcp: join: check re-re-adding ID 0 signal net/mptcp/pm.c | 4 +- net/mptcp/pm_netlink.c | 31 ++++-- net/mptcp/protocol.c | 6 ++ net/mptcp/protocol.h | 5 +- tools/testing/selftests/net/mptcp/mptcp_join.sh | 134 ++++++++++++++++++++---- tools/testing/selftests/net/mptcp/mptcp_lib.sh | 4 + 6 files changed, 157 insertions(+), 27 deletions(-) --- base-commit: 41b6d36c6f0efd987c937a4cc70cfb55be460e5b change-id: 20240813-mptcp-dup-close-evt-e4512ec78742 Best regards, -- Matthieu Baerts (NGI0) <matttbe@kernel.org>
When removing an announced ADD_ADDR, the ID should be marked as available only if it was announced before. Otherwise, local_addr_used will not be decremented when removing the endpoint. That's somehow the behaviour we had from the original patch, before the previous Squash-to patch [1]. Link: https://lore.kernel.org/20240802-mptcp-pm-avail-v6-1-964ba9ce279f@kernel.org [1] Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> --- net/mptcp/pm_netlink.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/net/mptcp/pm_netlink.c b/net/mptcp/pm_netlink.c index XXXXXXX..XXXXXXX 100644 --- a/net/mptcp/pm_netlink.c +++ b/net/mptcp/pm_netlink.c @@ -XXX,XX +XXX,XX @@ static bool mptcp_pm_remove_anno_addr(struct mptcp_sock *msk, ret = remove_anno_list_by_saddr(msk, addr); if (ret || force) { spin_lock_bh(&msk->pm.lock); - __set_bit(addr->id, msk->pm.id_avail_bitmap); - msk->pm.add_addr_signaled -= ret; + if (ret) { + __set_bit(addr->id, msk->pm.id_avail_bitmap); + msk->pm.add_addr_signaled--; + } mptcp_pm_remove_addr(msk, &list); spin_unlock_bh(&msk->pm.lock); } -- 2.45.2
The original commit was replacing the recreation of an endpoint used by an additional subflow, by the one used by the initial subflow. Except that it reduced the code coverage, as shown by the previous patch fixing a bug no longer visible with the modification of "selftests: mptcp: join: check removing ID 0 endpoint". Instead of replacing the endpoint 2 by 1, here an additional del/add is done on the endpoint used by the initial subflow. So the two cases are now covered. Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> --- tools/testing/selftests/net/mptcp/mptcp_join.sh | 32 ++++++++++++++++--------- 1 file changed, 21 insertions(+), 11 deletions(-) diff --git a/tools/testing/selftests/net/mptcp/mptcp_join.sh b/tools/testing/selftests/net/mptcp/mptcp_join.sh index XXXXXXX..XXXXXXX 100755 --- a/tools/testing/selftests/net/mptcp/mptcp_join.sh +++ b/tools/testing/selftests/net/mptcp/mptcp_join.sh @@ -XXX,XX +XXX,XX @@ endpoint_tests() wait_mpj $ns2 pm_nl_check_endpoint "creation" \ $ns2 10.0.2.2 id 2 flags subflow dev ns2eth2 - chk_subflow_nr "before delete" 2 + chk_subflow_nr "before delete id 2" 2 chk_mptcp_info subflows 1 subflows 1 - pm_nl_del_endpoint $ns2 1 10.0.1.2 + pm_nl_del_endpoint $ns2 2 10.0.2.2 sleep 0.5 - chk_subflow_nr "after delete" 1 - chk_mptcp_info subflows 1 subflows 1 + chk_subflow_nr "after delete id 2" 1 + chk_mptcp_info subflows 0 subflows 0 - pm_nl_add_endpoint $ns2 10.0.1.2 id 1 dev ns2eth1 flags subflow + pm_nl_add_endpoint $ns2 10.0.2.2 id 2 dev ns2eth2 flags subflow wait_mpj $ns2 - chk_subflow_nr "after re-add" 2 - chk_mptcp_info subflows 2 subflows 2 + chk_subflow_nr "after re-add id 2" 2 + chk_mptcp_info subflows 1 subflows 1 pm_nl_add_endpoint $ns2 10.0.3.2 id 3 flags subflow wait_attempt_fail $ns2 chk_subflow_nr "after new reject" 2 - chk_mptcp_info subflows 2 subflows 2 + chk_mptcp_info subflows 1 subflows 1 ip netns exec "${ns2}" ${iptables} -D OUTPUT -s "10.0.3.2" -p tcp -j REJECT pm_nl_del_endpoint $ns2 3 10.0.3.2 pm_nl_add_endpoint $ns2 10.0.3.2 id 3 flags subflow wait_mpj $ns2 chk_subflow_nr "after no reject" 3 + chk_mptcp_info subflows 2 subflows 2 + + pm_nl_del_endpoint $ns2 1 10.0.1.2 + sleep 0.5 + chk_subflow_nr "after delete id 0" 2 + chk_mptcp_info subflows 2 subflows 2 # only decr for additional sf + + pm_nl_add_endpoint $ns2 10.0.1.2 id 1 dev ns2eth1 flags subflow + wait_mpj $ns2 + chk_subflow_nr "after re-add id 0" 3 chk_mptcp_info subflows 3 subflows 3 mptcp_lib_kill_wait $tests_pid - join_syn_tx=4 \ - chk_join_nr 3 3 3 - chk_rm_nr 1 1 + join_syn_tx=5 \ + chk_join_nr 4 4 4 + chk_rm_nr 2 2 fi # remove and re-add -- 2.45.2
Set the address ID to 0 before calling fill_remote_addresses_vec(): for fullmesh cases, a bitmap will be created after having looked at all subflow IDs matching the local one. The ID visible on the wire (e.g. 0) should be compared to, not the one of the global endpoint (cannot be 0). Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> --- net/mptcp/pm_netlink.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/net/mptcp/pm_netlink.c b/net/mptcp/pm_netlink.c index XXXXXXX..XXXXXXX 100644 --- a/net/mptcp/pm_netlink.c +++ b/net/mptcp/pm_netlink.c @@ -XXX,XX +XXX,XX @@ static void mptcp_pm_create_subflow_or_signal_addr(struct mptcp_sock *msk) msk->pm.local_addr_used++; __clear_bit(local.addr.id, msk->pm.id_avail_bitmap); - nr = fill_remote_addresses_vec(msk, &local.addr, fullmesh, addrs); - if (nr == 0) - continue; /* Special case for ID0: set the correct ID */ if (local.addr.id == msk->mpc_endpoint_id) local.addr.id = 0; + nr = fill_remote_addresses_vec(msk, &local.addr, fullmesh, addrs); + if (nr == 0) + continue; + spin_unlock_bh(&msk->pm.lock); for (i = 0; i < nr; i++) __mptcp_subflow_connect(sk, &local, &addrs[i]); -- 2.45.2
It is possible to have in the list already closed subflows, e.g. the initial subflow has been already closed, but still in the list. No need to try to close it again, and increments the related counters again. Fixes: 0ee4261a3681 ("mptcp: implement mptcp_pm_remove_subflow") Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> --- net/mptcp/pm_netlink.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/net/mptcp/pm_netlink.c b/net/mptcp/pm_netlink.c index XXXXXXX..XXXXXXX 100644 --- a/net/mptcp/pm_netlink.c +++ b/net/mptcp/pm_netlink.c @@ -XXX,XX +XXX,XX @@ static void mptcp_pm_nl_rm_addr_or_subflow(struct mptcp_sock *msk, int how = RCV_SHUTDOWN | SEND_SHUTDOWN; u8 id = subflow_get_local_id(subflow); + if (inet_sk_state_load(ssk) == TCP_CLOSE) + continue; if (rm_type == MPTCP_MIB_RMADDR && remote_id != rm_id) continue; if (rm_type == MPTCP_MIB_RMSUBFLOW && id != rm_id) -- 2.45.2
'local_addr_used' and 'add_addr_accepted' are decremented for addresses not related to the initial subflow (ID0), because the source and destination addresses of the initial subflows are known from the beginning: they don't count as "additional local address being used" or "ADD_ADDR being accepted". It is then required not to increment them when the entrypoint used by the initial subflow is removed and re-added during a connection. Without this modification, this entrypoint cannot be removed and re-added more than once. Reported-by: Arınç ÜNAL <arinc.unal@arinc9.com> Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/512 Fixes: 3ad14f54bd74 ("mptcp: more accurate MPC endpoint tracking") Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> --- net/mptcp/pm_netlink.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/net/mptcp/pm_netlink.c b/net/mptcp/pm_netlink.c index XXXXXXX..XXXXXXX 100644 --- a/net/mptcp/pm_netlink.c +++ b/net/mptcp/pm_netlink.c @@ -XXX,XX +XXX,XX @@ static void mptcp_pm_create_subflow_or_signal_addr(struct mptcp_sock *msk) fullmesh = !!(local.flags & MPTCP_PM_ADDR_FLAG_FULLMESH); - msk->pm.local_addr_used++; __clear_bit(local.addr.id, msk->pm.id_avail_bitmap); /* Special case for ID0: set the correct ID */ if (local.addr.id == msk->mpc_endpoint_id) local.addr.id = 0; + else /* local_addr_used is not decr for ID 0 */ + msk->pm.local_addr_used++; nr = fill_remote_addresses_vec(msk, &local.addr, fullmesh, addrs); if (nr == 0) @@ -XXX,XX +XXX,XX @@ static void mptcp_pm_nl_add_addr_received(struct mptcp_sock *msk) spin_lock_bh(&msk->pm.lock); if (sf_created) { - msk->pm.add_addr_accepted++; + /* add_addr_accepted is not decr for ID 0 */ + if (remote.id) + msk->pm.add_addr_accepted++; if (msk->pm.add_addr_accepted >= add_addr_accept_max || msk->pm.subflows >= subflows_max) WRITE_ONCE(msk->pm.accept_addr, false); -- 2.45.2
This test extends "delete and re-add" to validate the previous commit: when the endpoint linked to the initial subflow (ID 0) is re-added multiple times, it was no longer being used, because the internal linked counters are not decremented for this special endpoint: it is not an additional endpoint. Here, the "del/add id 0" steps are duplicated to unsure this case is validated. The 'Fixes' tag here below is the same as the one from the previous commit: this patch here is not fixing anything wrong in the selftests, but it validates the previous fix for an issue introduced by this commit ID. Fixes: 3ad14f54bd74 ("mptcp: more accurate MPC endpoint tracking") Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> --- tools/testing/selftests/net/mptcp/mptcp_join.sh | 16 +++++++++++++--- 1 file changed, 13 insertions(+), 3 deletions(-) diff --git a/tools/testing/selftests/net/mptcp/mptcp_join.sh b/tools/testing/selftests/net/mptcp/mptcp_join.sh index XXXXXXX..XXXXXXX 100755 --- a/tools/testing/selftests/net/mptcp/mptcp_join.sh +++ b/tools/testing/selftests/net/mptcp/mptcp_join.sh @@ -XXX,XX +XXX,XX @@ endpoint_tests() chk_subflow_nr "after re-add id 0" 3 chk_mptcp_info subflows 3 subflows 3 + pm_nl_del_endpoint $ns2 1 10.0.1.2 + sleep 0.5 + chk_subflow_nr "after re-delete id 0" 2 + chk_mptcp_info subflows 2 subflows 2 # it was an additional sf + + pm_nl_add_endpoint $ns2 10.0.1.2 id 1 dev ns2eth1 flags subflow + wait_mpj $ns2 + chk_subflow_nr "after re-re-add id 0" 3 + chk_mptcp_info subflows 3 subflows 3 + mptcp_lib_kill_wait $tests_pid - join_syn_tx=5 \ - chk_join_nr 4 4 4 - chk_rm_nr 2 2 + join_syn_tx=6 \ + chk_join_nr 5 5 5 + chk_rm_nr 3 3 fi # remove and re-add -- 2.45.2
The initial subflow might have already been closed, but still in the connection list. When the worker is instructed to close the subflows that have been marked as closed, it might then try to close the initial subflow again. A consequence of that is that the SUB_CLOSED event can be seen twice: # ip mptcp endpoint 1.1.1.1 id 1 subflow dev eth0 2.2.2.2 id 2 subflow dev eth1 # ip mptcp monitor & [ CREATED] remid=0 locid=0 saddr4=1.1.1.1 daddr4=9.9.9.9 [ ESTABLISHED] remid=0 locid=0 saddr4=1.1.1.1 daddr4=9.9.9.9 [ SF_ESTABLISHED] remid=0 locid=2 saddr4=2.2.2.2 daddr4=9.9.9.9 # ip mptcp endpoint delete id 1 [ SF_CLOSED] remid=0 locid=0 saddr4=1.1.1.1 daddr4=9.9.9.9 [ SF_CLOSED] remid=0 locid=0 saddr4=1.1.1.1 daddr4=9.9.9.9 The first one is coming from mptcp_pm_nl_rm_subflow_received(), and the second one from __mptcp_close_subflow(). To avoid doing the post-closed processing twice, the subflow is now marked as closed the first time. Note that it is not enough to check if we are dealing with the first subflow and check its sk_state: the subflow might have been reset or closed before calling mptcp_close_ssk(). Fixes: b911c97c7dc7 ("mptcp: add netlink event support") Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> --- net/mptcp/protocol.c | 6 ++++++ net/mptcp/protocol.h | 3 ++- 2 files changed, 8 insertions(+), 1 deletion(-) diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c index XXXXXXX..XXXXXXX 100644 --- a/net/mptcp/protocol.c +++ b/net/mptcp/protocol.c @@ -XXX,XX +XXX,XX @@ static void __mptcp_close_ssk(struct sock *sk, struct sock *ssk, void mptcp_close_ssk(struct sock *sk, struct sock *ssk, struct mptcp_subflow_context *subflow) { + /* The first subflow can already be closed and still in the list */ + if (subflow->closed) + return; + + subflow->closed = true; + if (sk->sk_state == TCP_ESTABLISHED) mptcp_event(MPTCP_EVENT_SUB_CLOSED, mptcp_sk(sk), ssk, GFP_KERNEL); diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h index XXXXXXX..XXXXXXX 100644 --- a/net/mptcp/protocol.h +++ b/net/mptcp/protocol.h @@ -XXX,XX +XXX,XX @@ struct mptcp_subflow_context { stale : 1, /* unable to snd/rcv data, do not use for xmit */ valid_csum_seen : 1, /* at least one csum validated */ is_mptfo : 1, /* subflow is doing TFO */ - __unused : 10; + closed : 1, /* has done the post-closed part */ + __unused : 9; bool data_avail; bool scheduled; u32 remote_nonce; -- 2.45.2
This test extends "delete and re-add" to validate the previous commit: the number of MPTCP events are checked to make sure there are no duplicated or unexpected ones. A new helper has been introduced to easily check these events. The missing events have been added to the lib. The 'Fixes' tag here below is the same as the one from the previous commit: this patch here is not fixing anything wrong in the selftests, but it validates the previous fix for an issue introduced by this commit ID. Fixes: b911c97c7dc7 ("mptcp: add netlink event support") Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> --- tools/testing/selftests/net/mptcp/mptcp_join.sh | 74 ++++++++++++++++++++++++- tools/testing/selftests/net/mptcp/mptcp_lib.sh | 4 ++ 2 files changed, 75 insertions(+), 3 deletions(-) diff --git a/tools/testing/selftests/net/mptcp/mptcp_join.sh b/tools/testing/selftests/net/mptcp/mptcp_join.sh index XXXXXXX..XXXXXXX 100755 --- a/tools/testing/selftests/net/mptcp/mptcp_join.sh +++ b/tools/testing/selftests/net/mptcp/mptcp_join.sh @@ -XXX,XX +XXX,XX @@ reset_with_fail() fi } +start_events() +{ + mptcp_lib_events "${ns1}" "${evts_ns1}" evts_ns1_pid + mptcp_lib_events "${ns2}" "${evts_ns2}" evts_ns2_pid +} + reset_with_events() { reset "${1}" || return 1 - mptcp_lib_events "${ns1}" "${evts_ns1}" evts_ns1_pid - mptcp_lib_events "${ns2}" "${evts_ns2}" evts_ns2_pid + start_events } reset_with_tcp_filter() @@ -XXX,XX +XXX,XX @@ userspace_pm_chk_get_addr() fi } +# $1: ns ; $2: event type ; $3: count +chk_evt_nr() +{ + local ns=${1} + local evt_name="${2}" + local exp="${3}" + + local evts="${evts_ns1}" + local evt="${!evt_name}" + local count + + evt_name="${evt_name:16}" # without MPTCP_LIB_EVENT_ + [ "${ns}" == "ns2" ] && evts="${evts_ns2}" + + print_check "event ${ns} ${evt_name} (${exp})" + + if [[ "${evt_name}" = "LISTENER_"* ]] && + ! mptcp_lib_kallsyms_has "mptcp_event_pm_listener$"; then + print_skip "event not supported" + return + fi + + count=$(grep -cw "type:${evt}" "${evts}") + if [ "${count}" != "${exp}" ]; then + fail_test "got ${count} events, expected ${exp}" + else + print_ok + fi +} + userspace_tests() { # userspace pm type prevents add_addr @@ -XXX,XX +XXX,XX @@ endpoint_tests() if reset_with_tcp_filter "delete and re-add" ns2 10.0.3.2 REJECT OUTPUT && mptcp_lib_kallsyms_has "subflow_rebuild_header$"; then + start_events pm_nl_set_limits $ns1 0 3 pm_nl_set_limits $ns2 0 3 pm_nl_add_endpoint $ns2 10.0.1.2 id 1 dev ns2eth1 flags subflow @@ -XXX,XX +XXX,XX @@ endpoint_tests() mptcp_lib_kill_wait $tests_pid + kill_events_pids + chk_evt_nr ns1 MPTCP_LIB_EVENT_LISTENER_CREATED 1 + chk_evt_nr ns1 MPTCP_LIB_EVENT_CREATED 1 + chk_evt_nr ns1 MPTCP_LIB_EVENT_ESTABLISHED 1 + chk_evt_nr ns1 MPTCP_LIB_EVENT_ANNOUNCED 0 + chk_evt_nr ns1 MPTCP_LIB_EVENT_REMOVED 3 + chk_evt_nr ns1 MPTCP_LIB_EVENT_SUB_ESTABLISHED 5 + chk_evt_nr ns1 MPTCP_LIB_EVENT_SUB_CLOSED 3 + + chk_evt_nr ns2 MPTCP_LIB_EVENT_CREATED 1 + chk_evt_nr ns2 MPTCP_LIB_EVENT_ESTABLISHED 1 + chk_evt_nr ns2 MPTCP_LIB_EVENT_ANNOUNCED 0 + chk_evt_nr ns2 MPTCP_LIB_EVENT_REMOVED 0 + chk_evt_nr ns2 MPTCP_LIB_EVENT_SUB_ESTABLISHED 5 + chk_evt_nr ns2 MPTCP_LIB_EVENT_SUB_CLOSED 4 # one has been closed before estab + join_syn_tx=6 \ chk_join_nr 5 5 5 chk_rm_nr 3 3 fi # remove and re-add - if reset "delete re-add signal" && + if reset_with_events "delete re-add signal" && mptcp_lib_kallsyms_has "subflow_rebuild_header$"; then pm_nl_set_limits $ns1 0 3 pm_nl_set_limits $ns2 3 3 @@ -XXX,XX +XXX,XX @@ endpoint_tests() chk_mptcp_info subflows 3 subflows 3 mptcp_lib_kill_wait $tests_pid + kill_events_pids + chk_evt_nr ns1 MPTCP_LIB_EVENT_LISTENER_CREATED 1 + chk_evt_nr ns1 MPTCP_LIB_EVENT_CREATED 1 + chk_evt_nr ns1 MPTCP_LIB_EVENT_ESTABLISHED 1 + chk_evt_nr ns1 MPTCP_LIB_EVENT_ANNOUNCED 0 + chk_evt_nr ns1 MPTCP_LIB_EVENT_REMOVED 0 + chk_evt_nr ns1 MPTCP_LIB_EVENT_SUB_ESTABLISHED 4 + chk_evt_nr ns1 MPTCP_LIB_EVENT_SUB_CLOSED 2 + + chk_evt_nr ns2 MPTCP_LIB_EVENT_CREATED 1 + chk_evt_nr ns2 MPTCP_LIB_EVENT_ESTABLISHED 1 + chk_evt_nr ns2 MPTCP_LIB_EVENT_ANNOUNCED 5 + chk_evt_nr ns2 MPTCP_LIB_EVENT_REMOVED 3 + chk_evt_nr ns2 MPTCP_LIB_EVENT_SUB_ESTABLISHED 4 + chk_evt_nr ns2 MPTCP_LIB_EVENT_SUB_CLOSED 2 + join_connect_err=1 \ chk_join_nr 4 4 4 chk_add_nr 5 5 diff --git a/tools/testing/selftests/net/mptcp/mptcp_lib.sh b/tools/testing/selftests/net/mptcp/mptcp_lib.sh index XXXXXXX..XXXXXXX 100644 --- a/tools/testing/selftests/net/mptcp/mptcp_lib.sh +++ b/tools/testing/selftests/net/mptcp/mptcp_lib.sh @@ -XXX,XX +XXX,XX @@ readonly KSFT_SKIP=4 readonly KSFT_TEST="${MPTCP_LIB_KSFT_TEST:-$(basename "${0}" .sh)}" # These variables are used in some selftests, read-only +declare -rx MPTCP_LIB_EVENT_CREATED=1 # MPTCP_EVENT_CREATED +declare -rx MPTCP_LIB_EVENT_ESTABLISHED=2 # MPTCP_EVENT_ESTABLISHED +declare -rx MPTCP_LIB_EVENT_CLOSED=3 # MPTCP_EVENT_CLOSED declare -rx MPTCP_LIB_EVENT_ANNOUNCED=6 # MPTCP_EVENT_ANNOUNCED declare -rx MPTCP_LIB_EVENT_REMOVED=7 # MPTCP_EVENT_REMOVED declare -rx MPTCP_LIB_EVENT_SUB_ESTABLISHED=10 # MPTCP_EVENT_SUB_ESTABLISHED declare -rx MPTCP_LIB_EVENT_SUB_CLOSED=11 # MPTCP_EVENT_SUB_CLOSED +declare -rx MPTCP_LIB_EVENT_SUB_PRIORITY=13 # MPTCP_EVENT_SUB_PRIORITY declare -rx MPTCP_LIB_EVENT_LISTENER_CREATED=15 # MPTCP_EVENT_LISTENER_CREATED declare -rx MPTCP_LIB_EVENT_LISTENER_CLOSED=16 # MPTCP_EVENT_LISTENER_CLOSED -- 2.45.2
The ADD_ADDR 0 with the address from the initial subflow should not be considered as a new address: this is not something new. If the host receives it, it simply means that the address is available again. When receiving an ADD_ADDR for the ID 0, the PM already doesn't consider it as new by not incrementing the 'add_addr_accepted' counter. But the 'accept_addr' might not be set if the limit has already been reached: this can be bypassed in this case. But before, it is important to check that this ADD_ADDR for the ID 0 is for the same address as the initial subflow. If not, it is not something that should happen, and the ADD_ADDR can be ignored. Note that if an ADD_ADDR is received while there is already a subflow opened using the same address, this ADD_ADDR is ignored as well. It means that if multiple ADD_ADDR for ID 0 are received, there will not be any duplicated subflows created by the client. Fixes: d0876b2284cf ("mptcp: add the incoming RM_ADDR support") Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> --- net/mptcp/pm.c | 4 +++- net/mptcp/pm_netlink.c | 9 +++++++++ net/mptcp/protocol.h | 2 ++ 3 files changed, 14 insertions(+), 1 deletion(-) diff --git a/net/mptcp/pm.c b/net/mptcp/pm.c index XXXXXXX..XXXXXXX 100644 --- a/net/mptcp/pm.c +++ b/net/mptcp/pm.c @@ -XXX,XX +XXX,XX @@ void mptcp_pm_add_addr_received(const struct sock *ssk, } else { __MPTCP_INC_STATS(sock_net((struct sock *)msk), MPTCP_MIB_ADDADDRDROP); } - } else if (!READ_ONCE(pm->accept_addr)) { + /* id0 should not have a different address */ + } else if ((addr->id == 0 && !mptcp_pm_nl_is_init_remote_addr(msk, addr)) || + (addr->id > 0 && !READ_ONCE(pm->accept_addr))) { mptcp_pm_announce_addr(msk, addr, true); mptcp_pm_add_addr_send_ack(msk); } else if (mptcp_pm_schedule_work(msk, MPTCP_PM_ADD_ADDR_RECEIVED)) { diff --git a/net/mptcp/pm_netlink.c b/net/mptcp/pm_netlink.c index XXXXXXX..XXXXXXX 100644 --- a/net/mptcp/pm_netlink.c +++ b/net/mptcp/pm_netlink.c @@ -XXX,XX +XXX,XX @@ static void mptcp_pm_nl_add_addr_received(struct mptcp_sock *msk) } } +bool mptcp_pm_nl_is_init_remote_addr(struct mptcp_sock *msk, + const struct mptcp_addr_info *remote) +{ + struct mptcp_addr_info mpc_remote; + + remote_address((struct sock_common *)msk, &mpc_remote); + return mptcp_addresses_equal(&mpc_remote, remote, remote->port); +} + void mptcp_pm_nl_addr_send_ack(struct mptcp_sock *msk) { struct mptcp_subflow_context *subflow; diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h index XXXXXXX..XXXXXXX 100644 --- a/net/mptcp/protocol.h +++ b/net/mptcp/protocol.h @@ -XXX,XX +XXX,XX @@ void mptcp_pm_add_addr_received(const struct sock *ssk, void mptcp_pm_add_addr_echoed(struct mptcp_sock *msk, const struct mptcp_addr_info *addr); void mptcp_pm_add_addr_send_ack(struct mptcp_sock *msk); +bool mptcp_pm_nl_is_init_remote_addr(struct mptcp_sock *msk, + const struct mptcp_addr_info *remote); void mptcp_pm_nl_addr_send_ack(struct mptcp_sock *msk); void mptcp_pm_rm_addr_received(struct mptcp_sock *msk, const struct mptcp_rm_list *rm_list); -- 2.45.2
This test extends "delete re-add signal" to validate the previous commit: when the 'signal' endpoint linked to the initial subflow (ID 0) is re-added multiple times, it will re-send the ADD_ADDR with id 0. The client should still be able to re-create this subflow, even if the add_addr_accepted limit has been reached as this special address is not considered as a new address. The 'Fixes' tag here below is the same as the one from the previous commit: this patch here is not fixing anything wrong in the selftests, but it validates the previous fix for an issue introduced by this commit ID. Fixes: d0876b2284cf ("mptcp: add the incoming RM_ADDR support") Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> --- tools/testing/selftests/net/mptcp/mptcp_join.sh | 30 ++++++++++++++++--------- 1 file changed, 20 insertions(+), 10 deletions(-) diff --git a/tools/testing/selftests/net/mptcp/mptcp_join.sh b/tools/testing/selftests/net/mptcp/mptcp_join.sh index XXXXXXX..XXXXXXX 100755 --- a/tools/testing/selftests/net/mptcp/mptcp_join.sh +++ b/tools/testing/selftests/net/mptcp/mptcp_join.sh @@ -XXX,XX +XXX,XX @@ endpoint_tests() pm_nl_add_endpoint $ns1 10.0.1.1 id 99 flags signal wait_mpj $ns2 - chk_subflow_nr "after re-add" 3 + chk_subflow_nr "after re-add ID 0" 3 + chk_mptcp_info subflows 3 subflows 3 + + pm_nl_del_endpoint $ns1 99 10.0.1.1 + sleep 0.5 + chk_subflow_nr "after re-delete ID 0" 2 + chk_mptcp_info subflows 2 subflows 2 + + pm_nl_add_endpoint $ns1 10.0.1.1 id 88 flags signal + wait_mpj $ns2 + chk_subflow_nr "after re-re-add ID 0" 3 chk_mptcp_info subflows 3 subflows 3 mptcp_lib_kill_wait $tests_pid @@ -XXX,XX +XXX,XX @@ endpoint_tests() chk_evt_nr ns1 MPTCP_LIB_EVENT_ESTABLISHED 1 chk_evt_nr ns1 MPTCP_LIB_EVENT_ANNOUNCED 0 chk_evt_nr ns1 MPTCP_LIB_EVENT_REMOVED 0 - chk_evt_nr ns1 MPTCP_LIB_EVENT_SUB_ESTABLISHED 4 - chk_evt_nr ns1 MPTCP_LIB_EVENT_SUB_CLOSED 2 + chk_evt_nr ns1 MPTCP_LIB_EVENT_SUB_ESTABLISHED 5 + chk_evt_nr ns1 MPTCP_LIB_EVENT_SUB_CLOSED 3 chk_evt_nr ns2 MPTCP_LIB_EVENT_CREATED 1 chk_evt_nr ns2 MPTCP_LIB_EVENT_ESTABLISHED 1 - chk_evt_nr ns2 MPTCP_LIB_EVENT_ANNOUNCED 5 - chk_evt_nr ns2 MPTCP_LIB_EVENT_REMOVED 3 - chk_evt_nr ns2 MPTCP_LIB_EVENT_SUB_ESTABLISHED 4 - chk_evt_nr ns2 MPTCP_LIB_EVENT_SUB_CLOSED 2 + chk_evt_nr ns2 MPTCP_LIB_EVENT_ANNOUNCED 6 + chk_evt_nr ns2 MPTCP_LIB_EVENT_REMOVED 4 + chk_evt_nr ns2 MPTCP_LIB_EVENT_SUB_ESTABLISHED 5 + chk_evt_nr ns2 MPTCP_LIB_EVENT_SUB_CLOSED 3 join_connect_err=1 \ - chk_join_nr 4 4 4 - chk_add_nr 5 5 - chk_rm_nr 3 2 invert + chk_join_nr 5 5 5 + chk_add_nr 6 6 + chk_rm_nr 4 3 invert fi # flush and re-add -- 2.45.2
Even more fixes for the in-kernel PM :) There are a few Squash-to patches fist: - a partial revert for a previous Squash-to patch: probably fine not to block the send of the original patch then. - improve code coverage - use the right ID with __mark_subflow_endp_available() - small fix for the fullmesh case when re-adding the ID 0 endpoint - set the ID for __mark_subflow_endp_available() Then a few more fixes: - avoid the RM_SUBFLOW MIB counter to be incremented twice - fix re-re-creation of the ID 0 endpoint - validate that in the selftests - avoid duplicated SUB_CLOSED events - validate that in the selftests - ADD_ADDR 0 is not taking into account by the 'add_addr_accepted' counter, then it should bypass 'accept_addr' & make sure ADD_ADDR 0 is not sent with a new address - validate that in the selftests Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> --- Changes in v2: - New squash-to patches after new issue reported by Arınç: patch 3,5/12. - Patch 8/12 is re-creating the ID 0 endpoint 3 times. - Link to v1: https://lore.kernel.org/r/20240815-mptcp-dup-close-evt-v1-0-5a551d3a66cc@kernel.org --- Matthieu Baerts (NGI0) (12): Squash to "mptcp: pm: re-using ID of unused removed ADD_ADDR" Squash to "selftests: mptcp: join: check removing ID 0 endpoint" Squash to "mptcp: pm: only mark 'subflow' endp as available" Squash to "mptcp: pm: reuse ID 0 after delete and re-add" Squash to "mptcp: pm: fix RM_ADDR ID for the initial subflow" mptcp: pm: do not remove already closed subflows mptcp: pm: fix ID 0 endp usage after multiple re-creations selftests: mptcp: join: check re-re-adding ID 0 endp mptcp: avoid duplicated SUB_CLOSED events selftests: mptcp: join: validate event numbers mptcp: pm: ADD_ADDR 0 is not a new address selftests: mptcp: join: check re-re-adding ID 0 signal net/mptcp/pm.c | 4 +- net/mptcp/pm_netlink.c | 38 +++++-- net/mptcp/protocol.c | 6 ++ net/mptcp/protocol.h | 5 +- tools/testing/selftests/net/mptcp/mptcp_join.sh | 129 ++++++++++++++++++++---- tools/testing/selftests/net/mptcp/mptcp_lib.sh | 4 + 6 files changed, 154 insertions(+), 32 deletions(-) --- base-commit: e2b07354530a5aea0cfc25e7d16094a15f080643 change-id: 20240813-mptcp-dup-close-evt-e4512ec78742 Best regards, -- Matthieu Baerts (NGI0) <matttbe@kernel.org>
When removing an announced ADD_ADDR, the ID should be marked as available only if it was announced before. Otherwise, local_addr_used will not be decremented when removing the endpoint. That's somehow the behaviour we had from the original patch, before the previous Squash-to patch [1]. Link: https://lore.kernel.org/20240802-mptcp-pm-avail-v6-1-964ba9ce279f@kernel.org [1] Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> --- net/mptcp/pm_netlink.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/net/mptcp/pm_netlink.c b/net/mptcp/pm_netlink.c index XXXXXXX..XXXXXXX 100644 --- a/net/mptcp/pm_netlink.c +++ b/net/mptcp/pm_netlink.c @@ -XXX,XX +XXX,XX @@ static bool mptcp_pm_remove_anno_addr(struct mptcp_sock *msk, ret = remove_anno_list_by_saddr(msk, addr); if (ret || force) { spin_lock_bh(&msk->pm.lock); - __set_bit(addr->id, msk->pm.id_avail_bitmap); - msk->pm.add_addr_signaled -= ret; + if (ret) { + __set_bit(addr->id, msk->pm.id_avail_bitmap); + msk->pm.add_addr_signaled--; + } mptcp_pm_remove_addr(msk, &list); spin_unlock_bh(&msk->pm.lock); } -- 2.45.2
The original commit was replacing the recreation of an endpoint used by an additional subflow, by the one used by the initial subflow. Except that it reduced the code coverage, as shown by the previous patch fixing a bug no longer visible with the modification of "selftests: mptcp: join: check removing ID 0 endpoint". Instead of replacing the endpoint 2 by 1, here an additional del/add is done on the endpoint used by the initial subflow. So the two cases are now covered. Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> --- tools/testing/selftests/net/mptcp/mptcp_join.sh | 32 ++++++++++++++++--------- 1 file changed, 21 insertions(+), 11 deletions(-) diff --git a/tools/testing/selftests/net/mptcp/mptcp_join.sh b/tools/testing/selftests/net/mptcp/mptcp_join.sh index XXXXXXX..XXXXXXX 100755 --- a/tools/testing/selftests/net/mptcp/mptcp_join.sh +++ b/tools/testing/selftests/net/mptcp/mptcp_join.sh @@ -XXX,XX +XXX,XX @@ endpoint_tests() wait_mpj $ns2 pm_nl_check_endpoint "creation" \ $ns2 10.0.2.2 id 2 flags subflow dev ns2eth2 - chk_subflow_nr "before delete" 2 + chk_subflow_nr "before delete id 2" 2 chk_mptcp_info subflows 1 subflows 1 - pm_nl_del_endpoint $ns2 1 10.0.1.2 + pm_nl_del_endpoint $ns2 2 10.0.2.2 sleep 0.5 - chk_subflow_nr "after delete" 1 - chk_mptcp_info subflows 1 subflows 1 + chk_subflow_nr "after delete id 2" 1 + chk_mptcp_info subflows 0 subflows 0 - pm_nl_add_endpoint $ns2 10.0.1.2 id 1 dev ns2eth1 flags subflow + pm_nl_add_endpoint $ns2 10.0.2.2 id 2 dev ns2eth2 flags subflow wait_mpj $ns2 - chk_subflow_nr "after re-add" 2 - chk_mptcp_info subflows 2 subflows 2 + chk_subflow_nr "after re-add id 2" 2 + chk_mptcp_info subflows 1 subflows 1 pm_nl_add_endpoint $ns2 10.0.3.2 id 3 flags subflow wait_attempt_fail $ns2 chk_subflow_nr "after new reject" 2 - chk_mptcp_info subflows 2 subflows 2 + chk_mptcp_info subflows 1 subflows 1 ip netns exec "${ns2}" ${iptables} -D OUTPUT -s "10.0.3.2" -p tcp -j REJECT pm_nl_del_endpoint $ns2 3 10.0.3.2 pm_nl_add_endpoint $ns2 10.0.3.2 id 3 flags subflow wait_mpj $ns2 chk_subflow_nr "after no reject" 3 + chk_mptcp_info subflows 2 subflows 2 + + pm_nl_del_endpoint $ns2 1 10.0.1.2 + sleep 0.5 + chk_subflow_nr "after delete id 0" 2 + chk_mptcp_info subflows 2 subflows 2 # only decr for additional sf + + pm_nl_add_endpoint $ns2 10.0.1.2 id 1 dev ns2eth1 flags subflow + wait_mpj $ns2 + chk_subflow_nr "after re-add id 0" 3 chk_mptcp_info subflows 3 subflows 3 mptcp_lib_kill_wait $tests_pid - join_syn_tx=4 \ - chk_join_nr 3 3 3 - chk_rm_nr 1 1 + join_syn_tx=5 \ + chk_join_nr 4 4 4 + chk_rm_nr 2 2 fi # remove and re-add -- 2.45.2
__mark_subflow_endp_available takes the ID of the subflow, which can be 0, not the one of the entry. If it is 0, the 'local_addr_used' is not decremented as expected. Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> --- net/mptcp/pm_netlink.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/net/mptcp/pm_netlink.c b/net/mptcp/pm_netlink.c index XXXXXXX..XXXXXXX 100644 --- a/net/mptcp/pm_netlink.c +++ b/net/mptcp/pm_netlink.c @@ -XXX,XX +XXX,XX @@ static int mptcp_nl_remove_subflow_and_signal_addr(struct net *net, if (entry->flags & MPTCP_PM_ADDR_FLAG_SUBFLOW) { spin_lock_bh(&msk->pm.lock); - __mark_subflow_endp_available(msk, entry->addr.id); + __mark_subflow_endp_available(msk, list.ids[0]); spin_unlock_bh(&msk->pm.lock); } @@ -XXX,XX +XXX,XX @@ static void mptcp_pm_nl_fullmesh(struct mptcp_sock *msk, spin_lock_bh(&msk->pm.lock); mptcp_pm_nl_rm_subflow_received(msk, &list); - __mark_subflow_endp_available(msk, addr->id); + __mark_subflow_endp_available(msk, list.ids[0]); mptcp_pm_create_subflow_or_signal_addr(msk); spin_unlock_bh(&msk->pm.lock); } -- 2.45.2
Set the address ID to 0 before calling fill_remote_addresses_vec(): for fullmesh cases, a bitmap will be created after having looked at all subflow IDs matching the local one. The ID visible on the wire (e.g. 0) should be compared to, not the one of the global endpoint (cannot be 0). Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> --- net/mptcp/pm_netlink.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/net/mptcp/pm_netlink.c b/net/mptcp/pm_netlink.c index XXXXXXX..XXXXXXX 100644 --- a/net/mptcp/pm_netlink.c +++ b/net/mptcp/pm_netlink.c @@ -XXX,XX +XXX,XX @@ static void mptcp_pm_create_subflow_or_signal_addr(struct mptcp_sock *msk) msk->pm.local_addr_used++; __clear_bit(local.addr.id, msk->pm.id_avail_bitmap); - nr = fill_remote_addresses_vec(msk, &local.addr, fullmesh, addrs); - if (nr == 0) - continue; /* Special case for ID0: set the correct ID */ if (local.addr.id == msk->mpc_endpoint_id) local.addr.id = 0; + nr = fill_remote_addresses_vec(msk, &local.addr, fullmesh, addrs); + if (nr == 0) + continue; + spin_unlock_bh(&msk->pm.lock); for (i = 0; i < nr; i++) __mptcp_subflow_connect(sk, &local, &addrs[i]); -- 2.45.2
To be used with __mark_subflow_endp_available() below: the ID should be the one of the subflow -- can be 0 -- not the one of the entry -- cannot be 0. Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> --- net/mptcp/pm_netlink.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/net/mptcp/pm_netlink.c b/net/mptcp/pm_netlink.c index XXXXXXX..XXXXXXX 100644 --- a/net/mptcp/pm_netlink.c +++ b/net/mptcp/pm_netlink.c @@ -XXX,XX +XXX,XX @@ static int mptcp_nl_remove_subflow_and_signal_addr(struct net *net, mptcp_pm_remove_anno_addr(msk, addr, remove_subflow && !(entry->flags & MPTCP_PM_ADDR_FLAG_IMPLICIT)); + list.ids[0] = mptcp_endp_get_local_id(msk, addr); if (remove_subflow) { - list.ids[0] = mptcp_endp_get_local_id(msk, addr); - spin_lock_bh(&msk->pm.lock); mptcp_pm_nl_rm_subflow_received(msk, &list); spin_unlock_bh(&msk->pm.lock); -- 2.45.2
It is possible to have in the list already closed subflows, e.g. the initial subflow has been already closed, but still in the list. No need to try to close it again, and increments the related counters again. Fixes: 0ee4261a3681 ("mptcp: implement mptcp_pm_remove_subflow") Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> --- net/mptcp/pm_netlink.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/net/mptcp/pm_netlink.c b/net/mptcp/pm_netlink.c index XXXXXXX..XXXXXXX 100644 --- a/net/mptcp/pm_netlink.c +++ b/net/mptcp/pm_netlink.c @@ -XXX,XX +XXX,XX @@ static void mptcp_pm_nl_rm_addr_or_subflow(struct mptcp_sock *msk, int how = RCV_SHUTDOWN | SEND_SHUTDOWN; u8 id = subflow_get_local_id(subflow); + if (inet_sk_state_load(ssk) == TCP_CLOSE) + continue; if (rm_type == MPTCP_MIB_RMADDR && remote_id != rm_id) continue; if (rm_type == MPTCP_MIB_RMSUBFLOW && id != rm_id) -- 2.45.2
'local_addr_used' and 'add_addr_accepted' are decremented for addresses not related to the initial subflow (ID0), because the source and destination addresses of the initial subflows are known from the beginning: they don't count as "additional local address being used" or "ADD_ADDR being accepted". It is then required not to increment them when the entrypoint used by the initial subflow is removed and re-added during a connection. Without this modification, this entrypoint cannot be removed and re-added more than once. Reported-by: Arınç ÜNAL <arinc.unal@arinc9.com> Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/512 Fixes: 3ad14f54bd74 ("mptcp: more accurate MPC endpoint tracking") Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> --- net/mptcp/pm_netlink.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/net/mptcp/pm_netlink.c b/net/mptcp/pm_netlink.c index XXXXXXX..XXXXXXX 100644 --- a/net/mptcp/pm_netlink.c +++ b/net/mptcp/pm_netlink.c @@ -XXX,XX +XXX,XX @@ static void mptcp_pm_create_subflow_or_signal_addr(struct mptcp_sock *msk) fullmesh = !!(local.flags & MPTCP_PM_ADDR_FLAG_FULLMESH); - msk->pm.local_addr_used++; __clear_bit(local.addr.id, msk->pm.id_avail_bitmap); /* Special case for ID0: set the correct ID */ if (local.addr.id == msk->mpc_endpoint_id) local.addr.id = 0; + else /* local_addr_used is not decr for ID 0 */ + msk->pm.local_addr_used++; nr = fill_remote_addresses_vec(msk, &local.addr, fullmesh, addrs); if (nr == 0) @@ -XXX,XX +XXX,XX @@ static void mptcp_pm_nl_add_addr_received(struct mptcp_sock *msk) spin_lock_bh(&msk->pm.lock); if (sf_created) { - msk->pm.add_addr_accepted++; + /* add_addr_accepted is not decr for ID 0 */ + if (remote.id) + msk->pm.add_addr_accepted++; if (msk->pm.add_addr_accepted >= add_addr_accept_max || msk->pm.subflows >= subflows_max) WRITE_ONCE(msk->pm.accept_addr, false); -- 2.45.2
This test extends "delete and re-add" to validate the previous commit: when the endpoint linked to the initial subflow (ID 0) is re-added multiple times, it was no longer being used, because the internal linked counters are not decremented for this special endpoint: it is not an additional endpoint. Here, the "del/add id 0" steps are done 3 times to unsure this case is validated. The 'Fixes' tag here below is the same as the one from the previous commit: this patch here is not fixing anything wrong in the selftests, but it validates the previous fix for an issue introduced by this commit ID. Fixes: 3ad14f54bd74 ("mptcp: more accurate MPC endpoint tracking") Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> --- v2: - Re-create the ID 0 endpoint 3 times, in a loop --- tools/testing/selftests/net/mptcp/mptcp_join.sh | 25 ++++++++++++++----------- 1 file changed, 14 insertions(+), 11 deletions(-) diff --git a/tools/testing/selftests/net/mptcp/mptcp_join.sh b/tools/testing/selftests/net/mptcp/mptcp_join.sh index XXXXXXX..XXXXXXX 100755 --- a/tools/testing/selftests/net/mptcp/mptcp_join.sh +++ b/tools/testing/selftests/net/mptcp/mptcp_join.sh @@ -XXX,XX +XXX,XX @@ endpoint_tests() chk_subflow_nr "after no reject" 3 chk_mptcp_info subflows 2 subflows 2 - pm_nl_del_endpoint $ns2 1 10.0.1.2 - sleep 0.5 - chk_subflow_nr "after delete id 0" 2 - chk_mptcp_info subflows 2 subflows 2 # only decr for additional sf + local i + for i in $(seq 3); do + pm_nl_del_endpoint $ns2 1 10.0.1.2 + sleep 0.5 + chk_subflow_nr "after delete id 0 ($i)" 2 + chk_mptcp_info subflows 2 subflows 2 # only decr for additional sf - pm_nl_add_endpoint $ns2 10.0.1.2 id 1 dev ns2eth1 flags subflow - wait_mpj $ns2 - chk_subflow_nr "after re-add id 0" 3 - chk_mptcp_info subflows 3 subflows 3 + pm_nl_add_endpoint $ns2 10.0.1.2 id 1 dev ns2eth1 flags subflow + wait_mpj $ns2 + chk_subflow_nr "after re-add id 0 ($i)" 3 + chk_mptcp_info subflows 3 subflows 3 + done mptcp_lib_kill_wait $tests_pid - join_syn_tx=5 \ - chk_join_nr 4 4 4 - chk_rm_nr 2 2 + join_syn_tx=7 \ + chk_join_nr 6 6 6 + chk_rm_nr 4 4 fi # remove and re-add -- 2.45.2
The initial subflow might have already been closed, but still in the connection list. When the worker is instructed to close the subflows that have been marked as closed, it might then try to close the initial subflow again. A consequence of that is that the SUB_CLOSED event can be seen twice: # ip mptcp endpoint 1.1.1.1 id 1 subflow dev eth0 2.2.2.2 id 2 subflow dev eth1 # ip mptcp monitor & [ CREATED] remid=0 locid=0 saddr4=1.1.1.1 daddr4=9.9.9.9 [ ESTABLISHED] remid=0 locid=0 saddr4=1.1.1.1 daddr4=9.9.9.9 [ SF_ESTABLISHED] remid=0 locid=2 saddr4=2.2.2.2 daddr4=9.9.9.9 # ip mptcp endpoint delete id 1 [ SF_CLOSED] remid=0 locid=0 saddr4=1.1.1.1 daddr4=9.9.9.9 [ SF_CLOSED] remid=0 locid=0 saddr4=1.1.1.1 daddr4=9.9.9.9 The first one is coming from mptcp_pm_nl_rm_subflow_received(), and the second one from __mptcp_close_subflow(). To avoid doing the post-closed processing twice, the subflow is now marked as closed the first time. Note that it is not enough to check if we are dealing with the first subflow and check its sk_state: the subflow might have been reset or closed before calling mptcp_close_ssk(). Fixes: b911c97c7dc7 ("mptcp: add netlink event support") Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> --- net/mptcp/protocol.c | 6 ++++++ net/mptcp/protocol.h | 3 ++- 2 files changed, 8 insertions(+), 1 deletion(-) diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c index XXXXXXX..XXXXXXX 100644 --- a/net/mptcp/protocol.c +++ b/net/mptcp/protocol.c @@ -XXX,XX +XXX,XX @@ static void __mptcp_close_ssk(struct sock *sk, struct sock *ssk, void mptcp_close_ssk(struct sock *sk, struct sock *ssk, struct mptcp_subflow_context *subflow) { + /* The first subflow can already be closed and still in the list */ + if (subflow->closed) + return; + + subflow->closed = true; + if (sk->sk_state == TCP_ESTABLISHED) mptcp_event(MPTCP_EVENT_SUB_CLOSED, mptcp_sk(sk), ssk, GFP_KERNEL); diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h index XXXXXXX..XXXXXXX 100644 --- a/net/mptcp/protocol.h +++ b/net/mptcp/protocol.h @@ -XXX,XX +XXX,XX @@ struct mptcp_subflow_context { stale : 1, /* unable to snd/rcv data, do not use for xmit */ valid_csum_seen : 1, /* at least one csum validated */ is_mptfo : 1, /* subflow is doing TFO */ - __unused : 10; + closed : 1, /* has done the post-closed part */ + __unused : 9; bool data_avail; bool scheduled; u32 remote_nonce; -- 2.45.2
This test extends "delete and re-add" and "delete re-add signal" to validate the previous commit: the number of MPTCP events are checked to make sure there are no duplicated or unexpected ones. A new helper has been introduced to easily check these events. The missing events have been added to the lib. The 'Fixes' tag here below is the same as the one from the previous commit: this patch here is not fixing anything wrong in the selftests, but it validates the previous fix for an issue introduced by this commit ID. Fixes: b911c97c7dc7 ("mptcp: add netlink event support") Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> --- tools/testing/selftests/net/mptcp/mptcp_join.sh | 74 ++++++++++++++++++++++++- tools/testing/selftests/net/mptcp/mptcp_lib.sh | 4 ++ 2 files changed, 75 insertions(+), 3 deletions(-) diff --git a/tools/testing/selftests/net/mptcp/mptcp_join.sh b/tools/testing/selftests/net/mptcp/mptcp_join.sh index XXXXXXX..XXXXXXX 100755 --- a/tools/testing/selftests/net/mptcp/mptcp_join.sh +++ b/tools/testing/selftests/net/mptcp/mptcp_join.sh @@ -XXX,XX +XXX,XX @@ reset_with_fail() fi } +start_events() +{ + mptcp_lib_events "${ns1}" "${evts_ns1}" evts_ns1_pid + mptcp_lib_events "${ns2}" "${evts_ns2}" evts_ns2_pid +} + reset_with_events() { reset "${1}" || return 1 - mptcp_lib_events "${ns1}" "${evts_ns1}" evts_ns1_pid - mptcp_lib_events "${ns2}" "${evts_ns2}" evts_ns2_pid + start_events } reset_with_tcp_filter() @@ -XXX,XX +XXX,XX @@ userspace_pm_chk_get_addr() fi } +# $1: ns ; $2: event type ; $3: count +chk_evt_nr() +{ + local ns=${1} + local evt_name="${2}" + local exp="${3}" + + local evts="${evts_ns1}" + local evt="${!evt_name}" + local count + + evt_name="${evt_name:16}" # without MPTCP_LIB_EVENT_ + [ "${ns}" == "ns2" ] && evts="${evts_ns2}" + + print_check "event ${ns} ${evt_name} (${exp})" + + if [[ "${evt_name}" = "LISTENER_"* ]] && + ! mptcp_lib_kallsyms_has "mptcp_event_pm_listener$"; then + print_skip "event not supported" + return + fi + + count=$(grep -cw "type:${evt}" "${evts}") + if [ "${count}" != "${exp}" ]; then + fail_test "got ${count} events, expected ${exp}" + else + print_ok + fi +} + userspace_tests() { # userspace pm type prevents add_addr @@ -XXX,XX +XXX,XX @@ endpoint_tests() if reset_with_tcp_filter "delete and re-add" ns2 10.0.3.2 REJECT OUTPUT && mptcp_lib_kallsyms_has "subflow_rebuild_header$"; then + start_events pm_nl_set_limits $ns1 0 3 pm_nl_set_limits $ns2 0 3 pm_nl_add_endpoint $ns2 10.0.1.2 id 1 dev ns2eth1 flags subflow @@ -XXX,XX +XXX,XX @@ endpoint_tests() mptcp_lib_kill_wait $tests_pid + kill_events_pids + chk_evt_nr ns1 MPTCP_LIB_EVENT_LISTENER_CREATED 1 + chk_evt_nr ns1 MPTCP_LIB_EVENT_CREATED 1 + chk_evt_nr ns1 MPTCP_LIB_EVENT_ESTABLISHED 1 + chk_evt_nr ns1 MPTCP_LIB_EVENT_ANNOUNCED 0 + chk_evt_nr ns1 MPTCP_LIB_EVENT_REMOVED 4 + chk_evt_nr ns1 MPTCP_LIB_EVENT_SUB_ESTABLISHED 6 + chk_evt_nr ns1 MPTCP_LIB_EVENT_SUB_CLOSED 4 + + chk_evt_nr ns2 MPTCP_LIB_EVENT_CREATED 1 + chk_evt_nr ns2 MPTCP_LIB_EVENT_ESTABLISHED 1 + chk_evt_nr ns2 MPTCP_LIB_EVENT_ANNOUNCED 0 + chk_evt_nr ns2 MPTCP_LIB_EVENT_REMOVED 0 + chk_evt_nr ns2 MPTCP_LIB_EVENT_SUB_ESTABLISHED 6 + chk_evt_nr ns2 MPTCP_LIB_EVENT_SUB_CLOSED 5 # one has been closed before estab + join_syn_tx=7 \ chk_join_nr 6 6 6 chk_rm_nr 4 4 fi # remove and re-add - if reset "delete re-add signal" && + if reset_with_events "delete re-add signal" && mptcp_lib_kallsyms_has "subflow_rebuild_header$"; then pm_nl_set_limits $ns1 0 3 pm_nl_set_limits $ns2 3 3 @@ -XXX,XX +XXX,XX @@ endpoint_tests() chk_mptcp_info subflows 3 subflows 3 mptcp_lib_kill_wait $tests_pid + kill_events_pids + chk_evt_nr ns1 MPTCP_LIB_EVENT_LISTENER_CREATED 1 + chk_evt_nr ns1 MPTCP_LIB_EVENT_CREATED 1 + chk_evt_nr ns1 MPTCP_LIB_EVENT_ESTABLISHED 1 + chk_evt_nr ns1 MPTCP_LIB_EVENT_ANNOUNCED 0 + chk_evt_nr ns1 MPTCP_LIB_EVENT_REMOVED 0 + chk_evt_nr ns1 MPTCP_LIB_EVENT_SUB_ESTABLISHED 4 + chk_evt_nr ns1 MPTCP_LIB_EVENT_SUB_CLOSED 2 + + chk_evt_nr ns2 MPTCP_LIB_EVENT_CREATED 1 + chk_evt_nr ns2 MPTCP_LIB_EVENT_ESTABLISHED 1 + chk_evt_nr ns2 MPTCP_LIB_EVENT_ANNOUNCED 5 + chk_evt_nr ns2 MPTCP_LIB_EVENT_REMOVED 3 + chk_evt_nr ns2 MPTCP_LIB_EVENT_SUB_ESTABLISHED 4 + chk_evt_nr ns2 MPTCP_LIB_EVENT_SUB_CLOSED 2 + join_connect_err=1 \ chk_join_nr 4 4 4 chk_add_nr 5 5 diff --git a/tools/testing/selftests/net/mptcp/mptcp_lib.sh b/tools/testing/selftests/net/mptcp/mptcp_lib.sh index XXXXXXX..XXXXXXX 100644 --- a/tools/testing/selftests/net/mptcp/mptcp_lib.sh +++ b/tools/testing/selftests/net/mptcp/mptcp_lib.sh @@ -XXX,XX +XXX,XX @@ readonly KSFT_SKIP=4 readonly KSFT_TEST="${MPTCP_LIB_KSFT_TEST:-$(basename "${0}" .sh)}" # These variables are used in some selftests, read-only +declare -rx MPTCP_LIB_EVENT_CREATED=1 # MPTCP_EVENT_CREATED +declare -rx MPTCP_LIB_EVENT_ESTABLISHED=2 # MPTCP_EVENT_ESTABLISHED +declare -rx MPTCP_LIB_EVENT_CLOSED=3 # MPTCP_EVENT_CLOSED declare -rx MPTCP_LIB_EVENT_ANNOUNCED=6 # MPTCP_EVENT_ANNOUNCED declare -rx MPTCP_LIB_EVENT_REMOVED=7 # MPTCP_EVENT_REMOVED declare -rx MPTCP_LIB_EVENT_SUB_ESTABLISHED=10 # MPTCP_EVENT_SUB_ESTABLISHED declare -rx MPTCP_LIB_EVENT_SUB_CLOSED=11 # MPTCP_EVENT_SUB_CLOSED +declare -rx MPTCP_LIB_EVENT_SUB_PRIORITY=13 # MPTCP_EVENT_SUB_PRIORITY declare -rx MPTCP_LIB_EVENT_LISTENER_CREATED=15 # MPTCP_EVENT_LISTENER_CREATED declare -rx MPTCP_LIB_EVENT_LISTENER_CLOSED=16 # MPTCP_EVENT_LISTENER_CLOSED -- 2.45.2
The ADD_ADDR 0 with the address from the initial subflow should not be considered as a new address: this is not something new. If the host receives it, it simply means that the address is available again. When receiving an ADD_ADDR for the ID 0, the PM already doesn't consider it as new by not incrementing the 'add_addr_accepted' counter. But the 'accept_addr' might not be set if the limit has already been reached: this can be bypassed in this case. But before, it is important to check that this ADD_ADDR for the ID 0 is for the same address as the initial subflow. If not, it is not something that should happen, and the ADD_ADDR can be ignored. Note that if an ADD_ADDR is received while there is already a subflow opened using the same address, this ADD_ADDR is ignored as well. It means that if multiple ADD_ADDR for ID 0 are received, there will not be any duplicated subflows created by the client. Fixes: d0876b2284cf ("mptcp: add the incoming RM_ADDR support") Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> --- net/mptcp/pm.c | 4 +++- net/mptcp/pm_netlink.c | 9 +++++++++ net/mptcp/protocol.h | 2 ++ 3 files changed, 14 insertions(+), 1 deletion(-) diff --git a/net/mptcp/pm.c b/net/mptcp/pm.c index XXXXXXX..XXXXXXX 100644 --- a/net/mptcp/pm.c +++ b/net/mptcp/pm.c @@ -XXX,XX +XXX,XX @@ void mptcp_pm_add_addr_received(const struct sock *ssk, } else { __MPTCP_INC_STATS(sock_net((struct sock *)msk), MPTCP_MIB_ADDADDRDROP); } - } else if (!READ_ONCE(pm->accept_addr)) { + /* id0 should not have a different address */ + } else if ((addr->id == 0 && !mptcp_pm_nl_is_init_remote_addr(msk, addr)) || + (addr->id > 0 && !READ_ONCE(pm->accept_addr))) { mptcp_pm_announce_addr(msk, addr, true); mptcp_pm_add_addr_send_ack(msk); } else if (mptcp_pm_schedule_work(msk, MPTCP_PM_ADD_ADDR_RECEIVED)) { diff --git a/net/mptcp/pm_netlink.c b/net/mptcp/pm_netlink.c index XXXXXXX..XXXXXXX 100644 --- a/net/mptcp/pm_netlink.c +++ b/net/mptcp/pm_netlink.c @@ -XXX,XX +XXX,XX @@ static void mptcp_pm_nl_add_addr_received(struct mptcp_sock *msk) } } +bool mptcp_pm_nl_is_init_remote_addr(struct mptcp_sock *msk, + const struct mptcp_addr_info *remote) +{ + struct mptcp_addr_info mpc_remote; + + remote_address((struct sock_common *)msk, &mpc_remote); + return mptcp_addresses_equal(&mpc_remote, remote, remote->port); +} + void mptcp_pm_nl_addr_send_ack(struct mptcp_sock *msk) { struct mptcp_subflow_context *subflow; diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h index XXXXXXX..XXXXXXX 100644 --- a/net/mptcp/protocol.h +++ b/net/mptcp/protocol.h @@ -XXX,XX +XXX,XX @@ void mptcp_pm_add_addr_received(const struct sock *ssk, void mptcp_pm_add_addr_echoed(struct mptcp_sock *msk, const struct mptcp_addr_info *addr); void mptcp_pm_add_addr_send_ack(struct mptcp_sock *msk); +bool mptcp_pm_nl_is_init_remote_addr(struct mptcp_sock *msk, + const struct mptcp_addr_info *remote); void mptcp_pm_nl_addr_send_ack(struct mptcp_sock *msk); void mptcp_pm_rm_addr_received(struct mptcp_sock *msk, const struct mptcp_rm_list *rm_list); -- 2.45.2
This test extends "delete re-add signal" to validate the previous commit: when the 'signal' endpoint linked to the initial subflow (ID 0) is re-added multiple times, it will re-send the ADD_ADDR with id 0. The client should still be able to re-create this subflow, even if the add_addr_accepted limit has been reached as this special address is not considered as a new address. The 'Fixes' tag here below is the same as the one from the previous commit: this patch here is not fixing anything wrong in the selftests, but it validates the previous fix for an issue introduced by this commit ID. Fixes: d0876b2284cf ("mptcp: add the incoming RM_ADDR support") Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> --- tools/testing/selftests/net/mptcp/mptcp_join.sh | 30 ++++++++++++++++--------- 1 file changed, 20 insertions(+), 10 deletions(-) diff --git a/tools/testing/selftests/net/mptcp/mptcp_join.sh b/tools/testing/selftests/net/mptcp/mptcp_join.sh index XXXXXXX..XXXXXXX 100755 --- a/tools/testing/selftests/net/mptcp/mptcp_join.sh +++ b/tools/testing/selftests/net/mptcp/mptcp_join.sh @@ -XXX,XX +XXX,XX @@ endpoint_tests() pm_nl_add_endpoint $ns1 10.0.1.1 id 99 flags signal wait_mpj $ns2 - chk_subflow_nr "after re-add" 3 + chk_subflow_nr "after re-add ID 0" 3 + chk_mptcp_info subflows 3 subflows 3 + + pm_nl_del_endpoint $ns1 99 10.0.1.1 + sleep 0.5 + chk_subflow_nr "after re-delete ID 0" 2 + chk_mptcp_info subflows 2 subflows 2 + + pm_nl_add_endpoint $ns1 10.0.1.1 id 88 flags signal + wait_mpj $ns2 + chk_subflow_nr "after re-re-add ID 0" 3 chk_mptcp_info subflows 3 subflows 3 mptcp_lib_kill_wait $tests_pid @@ -XXX,XX +XXX,XX @@ endpoint_tests() chk_evt_nr ns1 MPTCP_LIB_EVENT_ESTABLISHED 1 chk_evt_nr ns1 MPTCP_LIB_EVENT_ANNOUNCED 0 chk_evt_nr ns1 MPTCP_LIB_EVENT_REMOVED 0 - chk_evt_nr ns1 MPTCP_LIB_EVENT_SUB_ESTABLISHED 4 - chk_evt_nr ns1 MPTCP_LIB_EVENT_SUB_CLOSED 2 + chk_evt_nr ns1 MPTCP_LIB_EVENT_SUB_ESTABLISHED 5 + chk_evt_nr ns1 MPTCP_LIB_EVENT_SUB_CLOSED 3 chk_evt_nr ns2 MPTCP_LIB_EVENT_CREATED 1 chk_evt_nr ns2 MPTCP_LIB_EVENT_ESTABLISHED 1 - chk_evt_nr ns2 MPTCP_LIB_EVENT_ANNOUNCED 5 - chk_evt_nr ns2 MPTCP_LIB_EVENT_REMOVED 3 - chk_evt_nr ns2 MPTCP_LIB_EVENT_SUB_ESTABLISHED 4 - chk_evt_nr ns2 MPTCP_LIB_EVENT_SUB_CLOSED 2 + chk_evt_nr ns2 MPTCP_LIB_EVENT_ANNOUNCED 6 + chk_evt_nr ns2 MPTCP_LIB_EVENT_REMOVED 4 + chk_evt_nr ns2 MPTCP_LIB_EVENT_SUB_ESTABLISHED 5 + chk_evt_nr ns2 MPTCP_LIB_EVENT_SUB_CLOSED 3 join_connect_err=1 \ - chk_join_nr 4 4 4 - chk_add_nr 5 5 - chk_rm_nr 3 2 invert + chk_join_nr 5 5 5 + chk_add_nr 6 6 + chk_rm_nr 4 3 invert fi # flush and re-add -- 2.45.2