When servers set the C-flag in their MP_CAPABLE to tell clients not to
create subflows to the initial address and port, clients will likely not
use their other endpoints. That's because the in-kernel path-manager
uses the 'subflow' endpoints to create subflows only to the initial
address and port.
If the limits have not been modified to accept ADD_ADDR, the client
doesn't try to establish new subflows. If the limits accept ADD_ADDR,
the routing routes will be used to select the source IP.
The C-flag is typically set when the server is operating behind a legacy
Layer 4 load balancer, or using anycast IP address. Clients having their
different 'subflow' endpoints setup, don't end up creating multiple
subflows as expected, and causing some deployment issues.
A special case is then added here: when servers set the C-flag in the
MPC and directly sends an ADD_ADDR, this single ADD_ADDR is accepted.
The 'subflows' endpoints will then be used with this new remote IP and
port. This exception is only allowed when the ADD_ADDR is sent
immediately after the 3WHS, and makes the client switching to the 'fully
established' mode. After that, 'select_local_address()' will not be able
to find any subflows, because 'id_avail_bitmap' will be filled in
mptcp_pm_create_subflow_or_signal_addr(), when switching to 'fully
established' mode.
Fixes: df377be38725 ("mptcp: add deny_join_id0 in mptcp_options_received")
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/536
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
---
Notes:
- I tried to find a simple solution that can hopefully be backported to
cover the 'CDN' use-case.
- I tried to find a solution respecting the 'accepted add addr' limit,
but I don't see how to... That's why here I limit the exception a
maximum: when this limit is 0 (default case), only the first ADD_ADDR,
etc. Feel free to share what you think about that.
- I have some additional code doing some cleanup, and I started to look
at #503, which is a bit linked to that.
---
net/mptcp/pm.c | 7 +++++--
net/mptcp/pm_kernel.c | 39 +++++++++++++++++++++++++++++++++++++++
2 files changed, 44 insertions(+), 2 deletions(-)
diff --git a/net/mptcp/pm.c b/net/mptcp/pm.c
index 02dfb379417e2843301f121039ec0d370b040ef7..fda8d1880224dedbbfb20954f93a04da435346df 100644
--- a/net/mptcp/pm.c
+++ b/net/mptcp/pm.c
@@ -637,9 +637,12 @@ void mptcp_pm_add_addr_received(const struct sock *ssk,
} else {
__MPTCP_INC_STATS(sock_net((struct sock *)msk), MPTCP_MIB_ADDADDRDROP);
}
- /* id0 should not have a different address */
+ /* id0 should not have a different address ; special case for C-flag */
} else if ((addr->id == 0 && !mptcp_pm_is_init_remote_addr(msk, addr)) ||
- (addr->id > 0 && !READ_ONCE(pm->accept_addr))) {
+ (addr->id > 0 && !READ_ONCE(pm->accept_addr) &&
+ (!READ_ONCE(msk->pm.remote_deny_join_id0) ||
+ msk->pm.local_addr_used != 0 ||
+ mptcp_pm_get_add_addr_accept_max(msk) != 0))) {
mptcp_pm_announce_addr(msk, addr, true);
mptcp_pm_add_addr_send_ack(msk);
} else if (mptcp_pm_schedule_work(msk, MPTCP_PM_ADD_ADDR_RECEIVED)) {
diff --git a/net/mptcp/pm_kernel.c b/net/mptcp/pm_kernel.c
index f30df8e884a0b3de9fb092e5774cafe08f6350ce..408801e20ed6e874b80ccbf38af76abc13615a3a 100644
--- a/net/mptcp/pm_kernel.c
+++ b/net/mptcp/pm_kernel.c
@@ -389,10 +389,15 @@ static unsigned int fill_local_addresses_vec(struct mptcp_sock *msk,
struct mptcp_addr_info mpc_addr;
struct pm_nl_pernet *pernet;
unsigned int subflows_max;
+ bool deny_join_id0;
int i = 0;
pernet = pm_nl_get_pernet_from_msk(msk);
subflows_max = mptcp_pm_get_subflows_max(msk);
+ deny_join_id0 = remote->id && READ_ONCE(msk->pm.remote_deny_join_id0) &&
+ !READ_ONCE(msk->pm.accept_addr) &&
+ msk->pm.local_addr_used == 0 &&
+ mptcp_pm_get_add_addr_accept_max(msk) == 0;
mptcp_local_address((struct sock_common *)msk, &mpc_addr);
@@ -409,6 +414,9 @@ static unsigned int fill_local_addresses_vec(struct mptcp_sock *msk,
locals[i].flags = entry->flags;
locals[i].ifindex = entry->ifindex;
+ if (deny_join_id0)
+ __clear_bit(locals[i].addr.id, msk->pm.id_avail_bitmap);
+
/* Special case for ID0: set the correct ID */
if (mptcp_addresses_equal(&locals[i].addr, &mpc_addr, locals[i].addr.port))
locals[i].addr.id = 0;
@@ -419,6 +427,37 @@ static unsigned int fill_local_addresses_vec(struct mptcp_sock *msk,
}
rcu_read_unlock();
+ /* Special case: peer sets the C flag, accept one ADD_ADDR if default
+ * limits are used -- accepting no ADD_ADDR -- and use subflow endpoints
+ */
+ if (!i && deny_join_id0) {
+ unsigned int local_addr_max = mptcp_pm_get_local_addr_max(msk);
+
+ while (msk->pm.local_addr_used < local_addr_max &&
+ msk->pm.subflows < subflows_max) {
+ struct mptcp_pm_local *local = &locals[i];
+
+ if (!select_local_address(pernet, msk, local))
+ break;
+
+ __clear_bit(local->addr.id, msk->pm.id_avail_bitmap);
+
+ if (!mptcp_pm_addr_families_match(sk, &local->addr,
+ remote))
+ continue;
+
+ if (mptcp_addresses_equal(&local->addr, &mpc_addr,
+ local->addr.port))
+ continue;
+
+ msk->pm.local_addr_used++;
+ msk->pm.subflows++;
+ i++;
+ }
+
+ return i;
+ }
+
/* If the array is empty, fill in the single
* 'IPADDRANY' local address
*/
--
2.51.0
Hi Matt, On Mon, 2025-09-15 at 20:11 +0200, Matthieu Baerts (NGI0) wrote: > When servers set the C-flag in their MP_CAPABLE to tell clients not > to > create subflows to the initial address and port, clients will likely > not > use their other endpoints. That's because the in-kernel path-manager > uses the 'subflow' endpoints to create subflows only to the initial > address and port. > > If the limits have not been modified to accept ADD_ADDR, the client > doesn't try to establish new subflows. If the limits accept ADD_ADDR, > the routing routes will be used to select the source IP. > > The C-flag is typically set when the server is operating behind a > legacy > Layer 4 load balancer, or using anycast IP address. Clients having > their > different 'subflow' endpoints setup, don't end up creating multiple > subflows as expected, and causing some deployment issues. > > A special case is then added here: when servers set the C-flag in the > MPC and directly sends an ADD_ADDR, this single ADD_ADDR is accepted. > The 'subflows' endpoints will then be used with this new remote IP > and > port. This exception is only allowed when the ADD_ADDR is sent > immediately after the 3WHS, and makes the client switching to the > 'fully > established' mode. After that, 'select_local_address()' will not be > able > to find any subflows, because 'id_avail_bitmap' will be filled in > mptcp_pm_create_subflow_or_signal_addr(), when switching to 'fully > established' mode. > > Fixes: df377be38725 ("mptcp: add deny_join_id0 in > mptcp_options_received") > Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/536 > Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> > --- > Notes: > - I tried to find a simple solution that can hopefully be backported > to > cover the 'CDN' use-case. > - I tried to find a solution respecting the 'accepted add addr' > limit, > but I don't see how to... That's why here I limit the exception a > maximum: when this limit is 0 (default case), only the first > ADD_ADDR, > etc. Feel free to share what you think about that. > - I have some additional code doing some cleanup, and I started to > look > at #503, which is a bit linked to that. > --- > net/mptcp/pm.c | 7 +++++-- > net/mptcp/pm_kernel.c | 39 +++++++++++++++++++++++++++++++++++++++ > 2 files changed, 44 insertions(+), 2 deletions(-) > > diff --git a/net/mptcp/pm.c b/net/mptcp/pm.c > index > 02dfb379417e2843301f121039ec0d370b040ef7..fda8d1880224dedbbfb20954f93 > a04da435346df 100644 > --- a/net/mptcp/pm.c > +++ b/net/mptcp/pm.c > @@ -637,9 +637,12 @@ void mptcp_pm_add_addr_received(const struct > sock *ssk, > } else { > __MPTCP_INC_STATS(sock_net((struct sock > *)msk), MPTCP_MIB_ADDADDRDROP); > } > - /* id0 should not have a different address */ > + /* id0 should not have a different address ; special case > for C-flag */ > } else if ((addr->id == 0 && > !mptcp_pm_is_init_remote_addr(msk, addr)) || > - (addr->id > 0 && !READ_ONCE(pm->accept_addr))) { > + (addr->id > 0 && !READ_ONCE(pm->accept_addr) && > + (!READ_ONCE(msk->pm.remote_deny_join_id0) || > + msk->pm.local_addr_used != 0 || > + mptcp_pm_get_add_addr_accept_max(msk) != 0))) { > mptcp_pm_announce_addr(msk, addr, true); > mptcp_pm_add_addr_send_ack(msk); > } else if (mptcp_pm_schedule_work(msk, > MPTCP_PM_ADD_ADDR_RECEIVED)) { > diff --git a/net/mptcp/pm_kernel.c b/net/mptcp/pm_kernel.c > index > f30df8e884a0b3de9fb092e5774cafe08f6350ce..408801e20ed6e874b80ccbf38af > 76abc13615a3a 100644 > --- a/net/mptcp/pm_kernel.c > +++ b/net/mptcp/pm_kernel.c > @@ -389,10 +389,15 @@ static unsigned int > fill_local_addresses_vec(struct mptcp_sock *msk, > struct mptcp_addr_info mpc_addr; > struct pm_nl_pernet *pernet; > unsigned int subflows_max; > + bool deny_join_id0; > int i = 0; > > pernet = pm_nl_get_pernet_from_msk(msk); > subflows_max = mptcp_pm_get_subflows_max(msk); > + deny_join_id0 = remote->id && READ_ONCE(msk- > >pm.remote_deny_join_id0) && > + !READ_ONCE(msk->pm.accept_addr) && > + msk->pm.local_addr_used == 0 && > + mptcp_pm_get_add_addr_accept_max(msk) == 0; > > mptcp_local_address((struct sock_common *)msk, &mpc_addr); > > @@ -409,6 +414,9 @@ static unsigned int > fill_local_addresses_vec(struct mptcp_sock *msk, > locals[i].flags = entry->flags; > locals[i].ifindex = entry->ifindex; > > + if (deny_join_id0) > + __clear_bit(locals[i].addr.id, msk- > >pm.id_avail_bitmap); > + > /* Special case for ID0: set the correct ID > */ > if (mptcp_addresses_equal(&locals[i].addr, > &mpc_addr, locals[i].addr.port)) > locals[i].addr.id = 0; > @@ -419,6 +427,37 @@ static unsigned int > fill_local_addresses_vec(struct mptcp_sock *msk, > } > rcu_read_unlock(); > > + /* Special case: peer sets the C flag, accept one ADD_ADDR > if default > + * limits are used -- accepting no ADD_ADDR -- and use > subflow endpoints > + */ > + if (!i && deny_join_id0) { > + unsigned int local_addr_max = > mptcp_pm_get_local_addr_max(msk); > + > + while (msk->pm.local_addr_used < local_addr_max && > + msk->pm.subflows < subflows_max) { > + struct mptcp_pm_local *local = &locals[i]; > + > + if (!select_local_address(pernet, msk, > local)) > + break; > + > + __clear_bit(local->addr.id, msk- > >pm.id_avail_bitmap); > + > + if (!mptcp_pm_addr_families_match(sk, > &local->addr, > + remote)) Here we need to consider the case of v4mapped addresses, same as in the section below "/* If the array is empty */". Thanks, -Geliang > + continue; > + > + if (mptcp_addresses_equal(&local->addr, > &mpc_addr, > + local->addr.port)) > + continue; > + > + msk->pm.local_addr_used++; > + msk->pm.subflows++; > + i++; > + } > + > + return i; > + } > + > /* If the array is empty, fill in the single > * 'IPADDRANY' local address > */
Hi Geliang, On 16/09/2025 12:16, Geliang Tang wrote: > Hi Matt, > > On Mon, 2025-09-15 at 20:11 +0200, Matthieu Baerts (NGI0) wrote: >> When servers set the C-flag in their MP_CAPABLE to tell clients not >> to >> create subflows to the initial address and port, clients will likely >> not >> use their other endpoints. That's because the in-kernel path-manager >> uses the 'subflow' endpoints to create subflows only to the initial >> address and port. >> >> If the limits have not been modified to accept ADD_ADDR, the client >> doesn't try to establish new subflows. If the limits accept ADD_ADDR, >> the routing routes will be used to select the source IP. >> >> The C-flag is typically set when the server is operating behind a >> legacy >> Layer 4 load balancer, or using anycast IP address. Clients having >> their >> different 'subflow' endpoints setup, don't end up creating multiple >> subflows as expected, and causing some deployment issues. >> >> A special case is then added here: when servers set the C-flag in the >> MPC and directly sends an ADD_ADDR, this single ADD_ADDR is accepted. >> The 'subflows' endpoints will then be used with this new remote IP >> and >> port. This exception is only allowed when the ADD_ADDR is sent >> immediately after the 3WHS, and makes the client switching to the >> 'fully >> established' mode. After that, 'select_local_address()' will not be >> able >> to find any subflows, because 'id_avail_bitmap' will be filled in >> mptcp_pm_create_subflow_or_signal_addr(), when switching to 'fully >> established' mode. (...) >> diff --git a/net/mptcp/pm_kernel.c b/net/mptcp/pm_kernel.c >> index >> f30df8e884a0b3de9fb092e5774cafe08f6350ce..408801e20ed6e874b80ccbf38af >> 76abc13615a3a 100644 >> --- a/net/mptcp/pm_kernel.c >> +++ b/net/mptcp/pm_kernel.c (...) >> @@ -419,6 +427,37 @@ static unsigned int >> fill_local_addresses_vec(struct mptcp_sock *msk, >> } >> rcu_read_unlock(); >> >> + /* Special case: peer sets the C flag, accept one ADD_ADDR >> if default >> + * limits are used -- accepting no ADD_ADDR -- and use >> subflow endpoints >> + */ >> + if (!i && deny_join_id0) { >> + unsigned int local_addr_max = >> mptcp_pm_get_local_addr_max(msk); >> + >> + while (msk->pm.local_addr_used < local_addr_max && >> + msk->pm.subflows < subflows_max) { >> + struct mptcp_pm_local *local = &locals[i]; >> + >> + if (!select_local_address(pernet, msk, >> local)) >> + break; >> + >> + __clear_bit(local->addr.id, msk- >>> pm.id_avail_bitmap); >> + >> + if (!mptcp_pm_addr_families_match(sk, >> &local->addr, >> + remote)) > > Here we need to consider the case of v4mapped addresses, same as in the > section below "/* If the array is empty */". I'm not sure to understand: here, we pick a local, then mptcp_pm_addr_families_match() takes care of the v4mapped addresses. Here under, we define a local, and we want to take one for the same address family. We still need the check just in case we receive a v6 address via the ADD_ADDR, and v6 is not supported. So I don't need to change anything here. Or did I miss something? Cheers, Matt -- Sponsored by the NGI0 Core fund.
Hi Matt, Thanks for this set. I have some comments below. On Mon, 2025-09-15 at 20:11 +0200, Matthieu Baerts (NGI0) wrote: > When servers set the C-flag in their MP_CAPABLE to tell clients not > to > create subflows to the initial address and port, clients will likely > not > use their other endpoints. That's because the in-kernel path-manager > uses the 'subflow' endpoints to create subflows only to the initial > address and port. > > If the limits have not been modified to accept ADD_ADDR, the client > doesn't try to establish new subflows. If the limits accept ADD_ADDR, > the routing routes will be used to select the source IP. > > The C-flag is typically set when the server is operating behind a > legacy > Layer 4 load balancer, or using anycast IP address. Clients having > their > different 'subflow' endpoints setup, don't end up creating multiple > subflows as expected, and causing some deployment issues. > > A special case is then added here: when servers set the C-flag in the > MPC and directly sends an ADD_ADDR, this single ADD_ADDR is accepted. > The 'subflows' endpoints will then be used with this new remote IP > and > port. This exception is only allowed when the ADD_ADDR is sent > immediately after the 3WHS, and makes the client switching to the > 'fully > established' mode. After that, 'select_local_address()' will not be > able > to find any subflows, because 'id_avail_bitmap' will be filled in > mptcp_pm_create_subflow_or_signal_addr(), when switching to 'fully > established' mode. > > Fixes: df377be38725 ("mptcp: add deny_join_id0 in > mptcp_options_received") > Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/536 > Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> > --- > Notes: > - I tried to find a simple solution that can hopefully be backported > to > cover the 'CDN' use-case. > - I tried to find a solution respecting the 'accepted add addr' > limit, > but I don't see how to... That's why here I limit the exception a > maximum: when this limit is 0 (default case), only the first > ADD_ADDR, > etc. Feel free to share what you think about that. > - I have some additional code doing some cleanup, and I started to > look > at #503, which is a bit linked to that. > --- > net/mptcp/pm.c | 7 +++++-- > net/mptcp/pm_kernel.c | 39 +++++++++++++++++++++++++++++++++++++++ > 2 files changed, 44 insertions(+), 2 deletions(-) > > diff --git a/net/mptcp/pm.c b/net/mptcp/pm.c > index > 02dfb379417e2843301f121039ec0d370b040ef7..fda8d1880224dedbbfb20954f93 > a04da435346df 100644 > --- a/net/mptcp/pm.c > +++ b/net/mptcp/pm.c > @@ -637,9 +637,12 @@ void mptcp_pm_add_addr_received(const struct > sock *ssk, > } else { > __MPTCP_INC_STATS(sock_net((struct sock > *)msk), MPTCP_MIB_ADDADDRDROP); > } > - /* id0 should not have a different address */ > + /* id0 should not have a different address ; special case > for C-flag */ > } else if ((addr->id == 0 && > !mptcp_pm_is_init_remote_addr(msk, addr)) || > - (addr->id > 0 && !READ_ONCE(pm->accept_addr))) { > + (addr->id > 0 && !READ_ONCE(pm->accept_addr) && > + (!READ_ONCE(msk->pm.remote_deny_join_id0) || > + msk->pm.local_addr_used != 0 || > + mptcp_pm_get_add_addr_accept_max(msk) != 0))) { I think we can define an mptcp_pm_deny_join_id0(struct mptcp_sock *msk) helper, then use !mptcp_pm_deny_join_id0(msk) here, and use mptcp_pm_deny_join_id0(msk) in fill_local_addresses_vec(). For example: static inline bool mptcp_pm_deny_join_id0(struct mptcp_sock *msk) { return (READ_ONCE(msk->pm.remote_deny_join_id0) && msk->pm.local_addr_used == 0 && mptcp_pm_get_add_addr_accept_max(msk) == 0); } > mptcp_pm_announce_addr(msk, addr, true); > mptcp_pm_add_addr_send_ack(msk); > } else if (mptcp_pm_schedule_work(msk, > MPTCP_PM_ADD_ADDR_RECEIVED)) { > diff --git a/net/mptcp/pm_kernel.c b/net/mptcp/pm_kernel.c > index > f30df8e884a0b3de9fb092e5774cafe08f6350ce..408801e20ed6e874b80ccbf38af > 76abc13615a3a 100644 > --- a/net/mptcp/pm_kernel.c > +++ b/net/mptcp/pm_kernel.c > @@ -389,10 +389,15 @@ static unsigned int > fill_local_addresses_vec(struct mptcp_sock *msk, > struct mptcp_addr_info mpc_addr; > struct pm_nl_pernet *pernet; > unsigned int subflows_max; > + bool deny_join_id0; > int i = 0; > > pernet = pm_nl_get_pernet_from_msk(msk); > subflows_max = mptcp_pm_get_subflows_max(msk); > + deny_join_id0 = remote->id && READ_ONCE(msk- > >pm.remote_deny_join_id0) && > + !READ_ONCE(msk->pm.accept_addr) && > + msk->pm.local_addr_used == 0 && > + mptcp_pm_get_add_addr_accept_max(msk) == 0; I think it's unnecessary to check msk->pm.accept_addr again here, as it has already been verified in mptcp_pm_add_addr_received, which should be sufficient: deny_join_id0 = remote->id && mptcp_pm_deny_join_id0(msk); > > mptcp_local_address((struct sock_common *)msk, &mpc_addr); > > @@ -409,6 +414,9 @@ static unsigned int > fill_local_addresses_vec(struct mptcp_sock *msk, > locals[i].flags = entry->flags; > locals[i].ifindex = entry->ifindex; > > + if (deny_join_id0) > + __clear_bit(locals[i].addr.id, msk- > >pm.id_avail_bitmap); > + > /* Special case for ID0: set the correct ID > */ > if (mptcp_addresses_equal(&locals[i].addr, > &mpc_addr, locals[i].addr.port)) > locals[i].addr.id = 0; > @@ -419,6 +427,37 @@ static unsigned int > fill_local_addresses_vec(struct mptcp_sock *msk, > } > rcu_read_unlock(); > > + /* Special case: peer sets the C flag, accept one ADD_ADDR > if default > + * limits are used -- accepting no ADD_ADDR -- and use > subflow endpoints > + */ > + if (!i && deny_join_id0) { > + unsigned int local_addr_max = > mptcp_pm_get_local_addr_max(msk); How about moving this line to after subflows_max = mptcp_pm_get_subflows_max(msk);? They are similar and can be grouped together. > + > + while (msk->pm.local_addr_used < local_addr_max && > + msk->pm.subflows < subflows_max) { > + struct mptcp_pm_local *local = &locals[i]; > + > + if (!select_local_address(pernet, msk, > local)) > + break; > + > + __clear_bit(local->addr.id, msk- > >pm.id_avail_bitmap); > + > + if (!mptcp_pm_addr_families_match(sk, > &local->addr, > + remote)) > + continue; > + > + if (mptcp_addresses_equal(&local->addr, > &mpc_addr, > + local->addr.port)) > + continue; > + > + msk->pm.local_addr_used++; > + msk->pm.subflows++; > + i++; > + } > + > + return i; Here, we can avoid using a return statement to allow the subsequent array empty check to proceed. > + } > + > /* If the array is empty, fill in the single > * 'IPADDRANY' local address > */
Hi Geliang, On 16/09/2025 11:18, Geliang Tang wrote: > Hi Matt, > > Thanks for this set. I have some comments below. > > On Mon, 2025-09-15 at 20:11 +0200, Matthieu Baerts (NGI0) wrote: >> When servers set the C-flag in their MP_CAPABLE to tell clients not >> to >> create subflows to the initial address and port, clients will likely >> not >> use their other endpoints. That's because the in-kernel path-manager >> uses the 'subflow' endpoints to create subflows only to the initial >> address and port. >> >> If the limits have not been modified to accept ADD_ADDR, the client >> doesn't try to establish new subflows. If the limits accept ADD_ADDR, >> the routing routes will be used to select the source IP. >> >> The C-flag is typically set when the server is operating behind a >> legacy >> Layer 4 load balancer, or using anycast IP address. Clients having >> their >> different 'subflow' endpoints setup, don't end up creating multiple >> subflows as expected, and causing some deployment issues. >> >> A special case is then added here: when servers set the C-flag in the >> MPC and directly sends an ADD_ADDR, this single ADD_ADDR is accepted. >> The 'subflows' endpoints will then be used with this new remote IP >> and >> port. This exception is only allowed when the ADD_ADDR is sent >> immediately after the 3WHS, and makes the client switching to the >> 'fully >> established' mode. After that, 'select_local_address()' will not be >> able >> to find any subflows, because 'id_avail_bitmap' will be filled in >> mptcp_pm_create_subflow_or_signal_addr(), when switching to 'fully >> established' mode. >> >> Fixes: df377be38725 ("mptcp: add deny_join_id0 in >> mptcp_options_received") >> Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/536 >> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> >> --- >> Notes: >> - I tried to find a simple solution that can hopefully be backported >> to >> cover the 'CDN' use-case. >> - I tried to find a solution respecting the 'accepted add addr' >> limit, >> but I don't see how to... That's why here I limit the exception a >> maximum: when this limit is 0 (default case), only the first >> ADD_ADDR, >> etc. Feel free to share what you think about that. >> - I have some additional code doing some cleanup, and I started to >> look >> at #503, which is a bit linked to that. >> --- >> net/mptcp/pm.c | 7 +++++-- >> net/mptcp/pm_kernel.c | 39 +++++++++++++++++++++++++++++++++++++++ >> 2 files changed, 44 insertions(+), 2 deletions(-) >> >> diff --git a/net/mptcp/pm.c b/net/mptcp/pm.c >> index >> 02dfb379417e2843301f121039ec0d370b040ef7..fda8d1880224dedbbfb20954f93 >> a04da435346df 100644 >> --- a/net/mptcp/pm.c >> +++ b/net/mptcp/pm.c >> @@ -637,9 +637,12 @@ void mptcp_pm_add_addr_received(const struct >> sock *ssk, >> } else { >> __MPTCP_INC_STATS(sock_net((struct sock >> *)msk), MPTCP_MIB_ADDADDRDROP); >> } >> - /* id0 should not have a different address */ >> + /* id0 should not have a different address ; special case >> for C-flag */ >> } else if ((addr->id == 0 && >> !mptcp_pm_is_init_remote_addr(msk, addr)) || >> - (addr->id > 0 && !READ_ONCE(pm->accept_addr))) { >> + (addr->id > 0 && !READ_ONCE(pm->accept_addr) && >> + (!READ_ONCE(msk->pm.remote_deny_join_id0) || >> + msk->pm.local_addr_used != 0 || >> + mptcp_pm_get_add_addr_accept_max(msk) != 0))) { > > I think we can define an mptcp_pm_deny_join_id0(struct mptcp_sock *msk) > helper, then use !mptcp_pm_deny_join_id0(msk) here, and use > mptcp_pm_deny_join_id0(msk) in fill_local_addresses_vec(). > > For example: > > static inline bool mptcp_pm_deny_join_id0(struct mptcp_sock *msk) > { > return (READ_ONCE(msk->pm.remote_deny_join_id0) && > msk->pm.local_addr_used == 0 && > mptcp_pm_get_add_addr_accept_max(msk) == 0); > } I initially didn't want to add such helper which means modifying more files: this will cause more troubles for the backports just for that. But now thinking about that again, I guess we will have issues with the backports anyway, because the code has been moved around in pm_kernel.c. So I suppose it is fine to introduce more conflicts, and we can add a new helper. > >> mptcp_pm_announce_addr(msk, addr, true); >> mptcp_pm_add_addr_send_ack(msk); >> } else if (mptcp_pm_schedule_work(msk, >> MPTCP_PM_ADD_ADDR_RECEIVED)) { >> diff --git a/net/mptcp/pm_kernel.c b/net/mptcp/pm_kernel.c >> index >> f30df8e884a0b3de9fb092e5774cafe08f6350ce..408801e20ed6e874b80ccbf38af >> 76abc13615a3a 100644 >> --- a/net/mptcp/pm_kernel.c >> +++ b/net/mptcp/pm_kernel.c >> @@ -389,10 +389,15 @@ static unsigned int >> fill_local_addresses_vec(struct mptcp_sock *msk, >> struct mptcp_addr_info mpc_addr; >> struct pm_nl_pernet *pernet; >> unsigned int subflows_max; >> + bool deny_join_id0; >> int i = 0; >> >> pernet = pm_nl_get_pernet_from_msk(msk); >> subflows_max = mptcp_pm_get_subflows_max(msk); >> + deny_join_id0 = remote->id && READ_ONCE(msk- >>> pm.remote_deny_join_id0) && >> + !READ_ONCE(msk->pm.accept_addr) && >> + msk->pm.local_addr_used == 0 && >> + mptcp_pm_get_add_addr_accept_max(msk) == 0; > > I think it's unnecessary to check msk->pm.accept_addr again here, as it > has already been verified in mptcp_pm_add_addr_received, which should > be sufficient: > > deny_join_id0 = remote->id && mptcp_pm_deny_join_id0(msk); Mmh, we can be in a situation where remote_deny_join_id0 is true, but accept_addr is true as well, and in this case, we don't want the exception. But I suppose that mptcp_pm_get_add_addr_accept_max(msk) == 0 implies accept_addr being false. So I guess we can drop this condition. > >> >> mptcp_local_address((struct sock_common *)msk, &mpc_addr); >> >> @@ -409,6 +414,9 @@ static unsigned int >> fill_local_addresses_vec(struct mptcp_sock *msk, >> locals[i].flags = entry->flags; >> locals[i].ifindex = entry->ifindex; >> >> + if (deny_join_id0) >> + __clear_bit(locals[i].addr.id, msk- >>> pm.id_avail_bitmap); >> + >> /* Special case for ID0: set the correct ID >> */ >> if (mptcp_addresses_equal(&locals[i].addr, >> &mpc_addr, locals[i].addr.port)) >> locals[i].addr.id = 0; >> @@ -419,6 +427,37 @@ static unsigned int >> fill_local_addresses_vec(struct mptcp_sock *msk, >> } >> rcu_read_unlock(); >> >> + /* Special case: peer sets the C flag, accept one ADD_ADDR >> if default >> + * limits are used -- accepting no ADD_ADDR -- and use >> subflow endpoints >> + */ >> + if (!i && deny_join_id0) { >> + unsigned int local_addr_max = >> mptcp_pm_get_local_addr_max(msk); > > How about moving this line to after > > subflows_max = mptcp_pm_get_subflows_max(msk);? > > They are similar and can be grouped together. Simply to restrict the scope to the block where it is used: local_addr_max is only used here. > >> + >> + while (msk->pm.local_addr_used < local_addr_max && >> + msk->pm.subflows < subflows_max) { >> + struct mptcp_pm_local *local = &locals[i]; >> + >> + if (!select_local_address(pernet, msk, >> local)) >> + break; >> + >> + __clear_bit(local->addr.id, msk- >>> pm.id_avail_bitmap); >> + >> + if (!mptcp_pm_addr_families_match(sk, >> &local->addr, >> + remote)) >> + continue; >> + >> + if (mptcp_addresses_equal(&local->addr, >> &mpc_addr, >> + local->addr.port)) >> + continue; >> + >> + msk->pm.local_addr_used++; >> + msk->pm.subflows++; >> + i++; >> + } >> + >> + return i; > > Here, we can avoid using a return statement to allow the subsequent > array empty check to proceed. No, we cannot: if we are in the exception (deny_join_id0), it is possible we are not allowed to create additional subflows using the 'subflow' endpoints, see the conditions in the 'while()'. In this case, we don't want to create a new subflow below, because the idea is to use the 'subflow' endpoints if we can. Or we need to check these conditions before, but that's a lot of conditions, and it seems safer to return here, no, > >> + } >> + >> /* If the array is empty, fill in the single >> * 'IPADDRANY' local address >> */ Cheers, Matt -- Sponsored by the NGI0 Core fund.
© 2016 - 2025 Red Hat, Inc.