mptcp: pm: fix ADD_ADDR timer infinite retry on option space insufficient

[RFC PATCH net] mptcp: pm: fix ADD_ADDR timer infinite retry on option space insufficient

Posted by Li Xiasong 2 weeks, 4 days ago

When TCP option space is insufficient (e.g., IPv6 with tcp_timestamps
enabled), the original code jumped to out_unlock without clearing the
addr_signal flag. This caused mptcp_pm_add_timer to keep rescheduling
indefinitely without sending ADD_ADDR, preventing the endpoint list from
being traversed.

In a pure ACK scenario (indicated by drop_other_suboptions=true), if
the option space is insufficient to carry the ADD_ADDR suboption, it
is appropriate to drop this address signal to allow the timer handler
to move on to other addresses.

Fixes: 00cfd77b9063 ("mptcp: retransmit ADD_ADDR when timeout")
Signed-off-by: Li Xiasong <lixiasong1@huawei.com>
---

Seeking feedback on:

When announcing addresses to the peer, MPTCP sends a pure ACK packet
to carry MPTCP options (ADD_ADDR). In this scenario, if the option space
is insufficient for ADD_ADDR, clearing addr_signal would:

  - Prevent the timer from retrying infinitely
  - Allow the timer to continue traversing and processing other addresses
  - Not block other subflow creation or address announcement operations

Is there any scenario where we should retry later instead of clearing
the address signal/echo flag? However, if a pure ACK doesn't have
enough space for the flag, subsequent packets won't either.
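
The stall described above can be illustrated with a deliberately simplified
toy model. Every name here (sim_pm, sim_xmit, sim_timer, sim_run) is
hypothetical and none of it is the kernel's data structures or the real
timer code; the only assumption carried over is that the timer cannot move
on to the next address while the previous signal is still pending.

```c
#include <stdbool.h>

/* Toy state: one pending-signal bit plus a cursor over the address list. */
struct sim_pm {
	bool signal_pending;	/* stands in for the ADD_ADDR signal bit */
	unsigned int next;	/* number of addresses picked up so far */
	unsigned int sent;	/* announcements actually emitted */
	unsigned int dropped;	/* announcements given up on */
};

/* Pure-ACK transmit attempt; returns true if the announcement went out. */
static bool sim_xmit(struct sim_pm *pm, unsigned int remaining,
		     unsigned int needed, bool clear_on_overflow)
{
	if (remaining < needed) {
		if (clear_on_overflow) {
			pm->signal_pending = false;
			pm->dropped++;
		}
		return false;
	}
	pm->signal_pending = false;
	pm->sent++;
	return true;
}

/* Timer tick: only picks the next address once the previous signal is gone. */
static void sim_timer(struct sim_pm *pm, unsigned int nr_addrs)
{
	if (pm->signal_pending)
		return;		/* just re-arm and wait */
	if (pm->next < nr_addrs) {
		pm->next++;
		pm->signal_pending = true;
	}
}

/* Two addresses: the first needs 32 bytes, the second 28; 28 are available. */
static unsigned int sim_run(bool clear_on_overflow, unsigned int ticks)
{
	const unsigned int needed[] = { 32, 28 };
	struct sim_pm pm = { 0 };
	unsigned int t;

	for (t = 0; t < ticks; t++) {
		sim_timer(&pm, 2);
		if (pm.signal_pending)
			sim_xmit(&pm, 28, needed[pm.next - 1],
				 clear_on_overflow);
	}
	return pm.sent;
}
```

In this model, sim_run(false, ticks) never announces anything no matter how
many ticks elapse, because the first address keeps the bit set. With
sim_run(true, ticks) the first address is dropped and the second one is sent.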

---
 net/mptcp/pm.c | 17 ++++++++---------
 1 file changed, 8 insertions(+), 9 deletions(-)

diff --git a/net/mptcp/pm.c b/net/mptcp/pm.c
index 57a456690406..1d49779c6a1f 100644
--- a/net/mptcp/pm.c
+++ b/net/mptcp/pm.c
@@ -881,19 +881,18 @@ bool mptcp_pm_add_addr_signal(struct mptcp_sock *msk, const struct sk_buff *skb,
 	}
 
 	*echo = mptcp_pm_should_add_signal_echo(msk);
+	add_addr = msk->pm.addr_signal &
+		~(*echo ? BIT(MPTCP_ADD_ADDR_ECHO) : BIT(MPTCP_ADD_ADDR_SIGNAL));
 	port = !!(*echo ? msk->pm.remote.port : msk->pm.local.port);
-
 	family = *echo ? msk->pm.remote.family : msk->pm.local.family;
-	if (remaining < mptcp_add_addr_len(family, *echo, port))
-		goto out_unlock;
 
-	if (*echo) {
-		*addr = msk->pm.remote;
-		add_addr = msk->pm.addr_signal & ~BIT(MPTCP_ADD_ADDR_ECHO);
-	} else {
-		*addr = msk->pm.local;
-		add_addr = msk->pm.addr_signal & ~BIT(MPTCP_ADD_ADDR_SIGNAL);
+	if (remaining < mptcp_add_addr_len(family, *echo, port)) {
+		if (*drop_other_suboptions)
+			WRITE_ONCE(msk->pm.addr_signal, add_addr);
+		goto out_unlock;
 	}
+
+	*addr = *echo ? msk->pm.remote : msk->pm.local;
 	WRITE_ONCE(msk->pm.addr_signal, add_addr);
 	ret = true;
 
-- 
2.34.1

Re: [RFC PATCH net] mptcp: pm: fix ADD_ADDR timer infinite retry on option space insufficient

Posted by Matthieu Baerts 2 weeks, 2 days ago

Hi Li,

On 18/04/2026 12:00, Li Xiasong wrote:
> When TCP option space is insufficient (e.g., IPv6 with tcp_timestamps
> enabled), the original code jumped to out_unlock without clearing the
> addr_signal flag. This caused mptcp_pm_add_timer to keep rescheduling
> indefinitely without sending ADD_ADDR,

Funny, I was looking at this issue on Friday evening :)

> preventing the endpoint list from being traversed.

It might help to add a bit of context: I guess here you meant that it
prevents other ADD_ADDRs from being advertised, not that it prevents
other subflows from being used when sending data, right?

> In a pure ACK scenario (indicated by drop_other_suboptions=true), if
> the option space is insufficient to carry the ADD_ADDR suboption, it
> is appropriate to drop this address signal to allow the timer handler
> to move on to other addresses.
> 
> Fixes: 00cfd77b9063 ("mptcp: retransmit ADD_ADDR when timeout")
> Signed-off-by: Li Xiasong <lixiasong1@huawei.com>
> ---
> 
> Seeking feedback on:
> 
> When announcing addresses to the peer, MPTCP sends a pure ACK packet
> to carry MPTCP options (ADD_ADDR). In this scenario, if the option space
> is insufficient for ADD_ADDR, clearing addr_signal would:
> 
>   - Prevent the timer from retrying infinitely
>   - Allow the timer to continue traversing and processing other addresses
>   - Not block other subflow creation or address announcement operations
> 
> Is there any scenario where we should retry later instead of clearing
> the address signal/echo flag? However, if a pure ACK doesn't have
> enough space for the flag, subsequent packets won't either.

That's correct: for the moment, if it is a pure ACK and there is not
enough space, no need to retry later because it is not possible to have
more space. It should only happen with an ADD_ADDR containing an IPv6
address and a port number. It might be good to specify this in the
commit message.
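
To make the arithmetic behind this concrete, here is a small sketch. The
field sizes follow the ADD_ADDR layout in RFC 8684 (4 bytes of header, a 4
or 16 byte address, an optional 2 byte port, and an 8 byte truncated HMAC
that is absent on echo). The constant and function names are illustrative
and not the kernel's, and the sketch assumes the timestamp option is the
only other TCP option present on the pure ACK.

```c
#include <stdbool.h>

enum {
	OPT_SPACE_MAX  = 40,	/* 60-byte max TCP header minus 20 fixed bytes */
	TSTAMP_ALIGNED = 12,	/* 10-byte timestamp option + 2 bytes padding */
	ADD_ADDR_HDR   = 4,	/* kind, length, subtype/flags, address ID */
	ADD_ADDR_HMAC  = 8,	/* truncated HMAC, not sent on echo */
	ADD_ADDR_PORT  = 2,
};

/* Bytes used by one ADD_ADDR suboption, rounded up to a 32-bit boundary. */
static unsigned int add_addr_len(bool ipv6, bool echo, bool port)
{
	unsigned int len = ADD_ADDR_HDR + (ipv6 ? 16 : 4);

	if (!echo)
		len += ADD_ADDR_HMAC;
	if (port)
		len += ADD_ADDR_PORT;
	return (len + 3) & ~3u;
}

/* True when the suboption fits next to the timestamp option, if enabled. */
static bool add_addr_fits(bool ipv6, bool echo, bool port, bool timestamps)
{
	unsigned int remaining = OPT_SPACE_MAX -
				 (timestamps ? TSTAMP_ALIGNED : 0);

	return add_addr_len(ipv6, echo, port) <= remaining;
}
```

Under these assumptions an IPv6 ADD_ADDR without a port takes 28 bytes and
exactly fills the 28 bytes left after timestamps, while adding a port pads
it to 32 bytes, which no longer fits.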

> ---
>  net/mptcp/pm.c | 17 ++++++++---------
>  1 file changed, 8 insertions(+), 9 deletions(-)
> 
> diff --git a/net/mptcp/pm.c b/net/mptcp/pm.c
> index 57a456690406..1d49779c6a1f 100644
> --- a/net/mptcp/pm.c
> +++ b/net/mptcp/pm.c
> @@ -881,19 +881,18 @@ bool mptcp_pm_add_addr_signal(struct mptcp_sock *msk, const struct sk_buff *skb,
>  	}
>  
>  	*echo = mptcp_pm_should_add_signal_echo(msk);
> +	add_addr = msk->pm.addr_signal &
> +		~(*echo ? BIT(MPTCP_ADD_ADDR_ECHO) : BIT(MPTCP_ADD_ADDR_SIGNAL));
>  	port = !!(*echo ? msk->pm.remote.port : msk->pm.local.port);
> -
>  	family = *echo ? msk->pm.remote.family : msk->pm.local.family;

nit: while at it, maybe clearer to have a dedicated 'if (*echo)' instead
of 3 lines with '*echo ? ... : ...', no?

  if (*echo) {
      add_addr = ...
      port = ...
      family = ...
  } else {
      add_addr = ...
      port = ...
      family = ...
  }

> -	if (remaining < mptcp_add_addr_len(family, *echo, port))
> -		goto out_unlock;
>  
> -	if (*echo) {
> -		*addr = msk->pm.remote;
> -		add_addr = msk->pm.addr_signal & ~BIT(MPTCP_ADD_ADDR_ECHO);
> -	} else {
> -		*addr = msk->pm.local;
> -		add_addr = msk->pm.addr_signal & ~BIT(MPTCP_ADD_ADDR_SIGNAL);
> +	if (remaining < mptcp_add_addr_len(family, *echo, port)) {
> +		if (*drop_other_suboptions)
> +			WRITE_ONCE(msk->pm.addr_signal, add_addr);

If it is dropped, it would be helpful to increment the ADDADDRTXDROP MIB
counter, and ideally check that in the MPTCP selftests (e.g. adding a
new subtest in mptcp_join.sh, in add_addr_ports_tests()?).

Also, I wonder if it would not be clearer to jump to a new label here...

> +		goto out_unlock;
>  	}
> +
> +	*addr = *echo ? msk->pm.remote : msk->pm.local;
>  	WRITE_ONCE(msk->pm.addr_signal, add_addr);
>  	ret = true;

... inverting the two lines above, and adding "drop_signal_mark" label?
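
One possible reading of the label suggestion, written as a standalone toy
rather than as kernel code: the toy_* names and the tx_drop field are
hypothetical stand-ins (tx_drop plays the role of the ADDADDRTXDROP
counter), and the real function also fills in the address, port and family,
which is left out here.

```c
#include <stdbool.h>

enum { TOY_SIGNAL = 1 << 0, TOY_ECHO = 1 << 1 };

struct toy_pm {
	unsigned int addr_signal;	/* pending-announcement bits */
	unsigned int tx_drop;		/* stand-in for the drop counter */
};

static bool toy_add_addr_signal(struct toy_pm *pm, unsigned int remaining,
				unsigned int needed,
				bool drop_other_suboptions)
{
	bool echo = pm->addr_signal & TOY_ECHO;
	unsigned int add_addr;
	bool ret = false;

	if (!(pm->addr_signal & (TOY_SIGNAL | TOY_ECHO)))
		goto out;

	/* A single branch instead of several 'echo ? a : b' expressions. */
	if (echo)
		add_addr = pm->addr_signal & ~TOY_ECHO;
	else
		add_addr = pm->addr_signal & ~TOY_SIGNAL;

	if (remaining < needed) {
		if (!drop_other_suboptions)
			goto out;	/* not a pure ACK: keep the bit set */
		pm->tx_drop++;		/* pure ACK: count it and give up */
		goto drop_signal_mark;
	}

	ret = true;
drop_signal_mark:
	pm->addr_signal = add_addr;
out:
	return ret;
}
```

With this layout the flag is cleared in a single place, and the success and
drop paths differ only in whether ret is set before reaching the label.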

Apart from the comments above, I think your patch is doing the right thing.

Also, one last request: do you mind sending the v2 only to the mptcp ML,
please? I have a bunch of related fixes [1] plus this one is not urgent.

In fact, except for (urgent) fixes, it might be better to send MPTCP
patches only to the MPTCP ML: the first versions then reach a restricted
number of people, and there is already enough traffic on Netdev.

[1]
https://lore.kernel.org/20260415-mptcp-inc-limits-v5-0-e54c3bf80e4e@kernel.org

Cheers,
Matt
-- 
Sponsored by the NGI0 Core fund.

Re: [RFC PATCH net] mptcp: pm: fix ADD_ADDR timer infinite retry on option space insufficient

Posted by MPTCP CI 2 weeks, 4 days ago

Hi Li,

Thank you for your modifications, that's great!

Our CI did some validations and here is its report:

- KVM Validation: normal (except selftest_mptcp_join): Success! ✅
- KVM Validation: normal (only selftest_mptcp_join): Success! ✅
- KVM Validation: debug (except selftest_mptcp_join): Success! ✅
- KVM Validation: debug (only selftest_mptcp_join): Success! ✅
- KVM Validation: btf-normal (only bpftest_all): Success! ✅
- KVM Validation: btf-debug (only bpftest_all): Success! ✅
- Task: https://github.com/multipath-tcp/mptcp_net-next/actions/runs/24602264963

Initiator: Patchew Applier
Commits: https://github.com/multipath-tcp/mptcp_net-next/commits/32c3fb79b0b4
Patchwork: https://patchwork.kernel.org/project/mptcp/list/?series=1082765


If there are some issues, you can reproduce them using the same environment as
the one used by the CI thanks to a docker image, e.g.:

    $ cd [kernel source code]
    $ docker run -v "${PWD}:${PWD}:rw" -w "${PWD}" --privileged --rm -it \
        --pull always mptcp/mptcp-upstream-virtme-docker:latest \
        auto-normal

For more details:

    https://github.com/multipath-tcp/mptcp-upstream-virtme-docker


Please note that despite all the efforts already made to have a stable
test suite when executed on a public CI like here, it is possible that some
reported issues are not due to your modifications. Still, do not hesitate to
help us improve that ;-)

Cheers,
MPTCP GH Action bot
Bot operated by Matthieu Baerts (NGI0 Core)