net/mptcp/options.c | 3 +- net/mptcp/pm.c | 3 +- net/mptcp/pm_netlink.c | 20 +++++---- net/mptcp/protocol.h | 5 ++- tools/testing/selftests/net/mptcp/mptcp_join.sh | 55 ++++++++++++++++++------- 5 files changed, 60 insertions(+), 26 deletions(-)
When looking at improving the user experience around the MPTCP endpoints setup, I noticed that setting an endpoint with both the 'signal' and the 'subflow' flags -- as it has been done in the past by users according to bug reports we got -- were resulting on only announcing the endpoint, but not using it to create subflows: the 'subflow' flag was then ignored. My initial thought was to modify IPRoute2 to warn the user when the two flags were set, but it doesn't sound normal to ignore one of them. I then looked at modifying the kernel not to allow having the two flags set, but when discussing about that with Mat, we thought it was maybe not ideal to do that, as there might be use-cases, we might break some configs, and it was working before apparently. So instead, I fixed the support by splitting the previously shared 'available IDs' bitmap. While at it, I added a new selftest (patch 4) to validate this case -- including a modification of the chk_add_nr helper to inverse the sides were the counters are checked (patch 3) -- and allowed ADD_ADDR echo just after the MP_JOIN 3WHS. The selftests modification don't have a Fixes tag, because backporting will not be easy: on some versions, the backport will not generate conflicts, but the test will fail because some helpers are not available. That's OK, many CIs will use the selftests from the last stable version to validate previous stable releases. Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> --- Matthieu Baerts (NGI0) (4): mptcp: fully established after ADD_ADDR echo on MPJ mptcp: split id_avail_bitmap per target selftests: mptcp: join: ability to invert ADD_ADDR check selftests: mptcp: join: test both signal & subflow net/mptcp/options.c | 3 +- net/mptcp/pm.c | 3 +- net/mptcp/pm_netlink.c | 20 +++++---- net/mptcp/protocol.h | 5 ++- tools/testing/selftests/net/mptcp/mptcp_join.sh | 55 ++++++++++++++++++------- 5 files changed, 60 insertions(+), 26 deletions(-) --- base-commit: e6b62d29168e1aef6aa328f8956e5e0aadffe3d5 change-id: 20240620-mptcp-pm-avail-f5e3957be441 Best regards, -- Matthieu Baerts (NGI0) <matttbe@kernel.org>
Hi, On Fri, 2024-06-21 at 13:39 +0200, Matthieu Baerts (NGI0) wrote: > When looking at improving the user experience around the MPTCP endpoints > setup, I noticed that setting an endpoint with both the 'signal' and the > 'subflow' flags -- as it has been done in the past by users according to > bug reports we got -- were resulting on only announcing the endpoint, > but not using it to create subflows: the 'subflow' flag was then > ignored. > > My initial thought was to modify IPRoute2 to warn the user when the two > flags were set, but it doesn't sound normal to ignore one of them. I > then looked at modifying the kernel not to allow having the two flags > set, but when discussing about that with Mat, we thought it was maybe > not ideal to do that, as there might be use-cases, we might break some > configs, and it was working before apparently. So instead, I fixed the > support by splitting the previously shared 'available IDs' bitmap. > > While at it, I added a new selftest (patch 4) to validate this case -- > including a modification of the chk_add_nr helper to inverse the sides > were the counters are checked (patch 3) -- and allowed ADD_ADDR echo > just after the MP_JOIN 3WHS. > > The selftests modification don't have a Fixes tag, because backporting > will not be easy: on some versions, the backport will not generate > conflicts, but the test will fail because some helpers are not > available. That's OK, many CIs will use the selftests from the last > stable version to validate previous stable releases. I'm sorry for the large latency. I think the above is more a new feature than a fix, I would avoid including fixes tag, to prevent stable backports. I am a bit doubtful about having endpoint with both subflow | signal, do we want encourage such bogus setup? What if the endpoint specify a port, too? AFAICS it can not work, as the port-based signal endpoint will create a listener on <IP>:<port> and later on the 'subflow' flag will try to bind on the same <IP>:<port>, failing, while creating the outgoing connection. What about just documenting the existing behavior? Thanks, Paolo
Hi Paolo, On 04/07/2024 17:47, Paolo Abeni wrote: > Hi, > > On Fri, 2024-06-21 at 13:39 +0200, Matthieu Baerts (NGI0) wrote: >> When looking at improving the user experience around the MPTCP endpoints >> setup, I noticed that setting an endpoint with both the 'signal' and the >> 'subflow' flags -- as it has been done in the past by users according to >> bug reports we got -- were resulting on only announcing the endpoint, >> but not using it to create subflows: the 'subflow' flag was then >> ignored. >> >> My initial thought was to modify IPRoute2 to warn the user when the two >> flags were set, but it doesn't sound normal to ignore one of them. I >> then looked at modifying the kernel not to allow having the two flags >> set, but when discussing about that with Mat, we thought it was maybe >> not ideal to do that, as there might be use-cases, we might break some >> configs, and it was working before apparently. So instead, I fixed the >> support by splitting the previously shared 'available IDs' bitmap. >> >> While at it, I added a new selftest (patch 4) to validate this case -- >> including a modification of the chk_add_nr helper to inverse the sides >> were the counters are checked (patch 3) -- and allowed ADD_ADDR echo >> just after the MP_JOIN 3WHS. >> >> The selftests modification don't have a Fixes tag, because backporting >> will not be easy: on some versions, the backport will not generate >> conflicts, but the test will fail because some helpers are not >> available. That's OK, many CIs will use the selftests from the last >> stable version to validate previous stable releases. > > I'm sorry for the large latency. No problem, thank you for your reply! > I think the above is more a new feature than a fix, I would avoid > including fixes tag, to prevent stable backports. My main goal here is to reduce frustration from users who are (certainly "badly") setting both the 'signal' and 'subflow' flags on new endpoints. In the different doc we have, we mention what does 'signal' and 'subflow', but nothing about using them together. For us, it seems obvious not to set the two flags together. But if you are new to MPTCP, I can understand that "signalling doesn't seem to hurt", and both 'subflow' and 'signal' flags are used. (Note that in the mptcp.org implementation, the fullmesh PM was sending ADD_ADDR and establishing new subflows, just because "it doesn't hurt to announce all addresses" :) ) Currently, if we use the two flags together, the 'subflow' flag is ignored. I added the Fixes tag in patch 2, because it sounds like a bug not to create subflows when the 'subflow' flag is set. And also because it was working before the mentioned 'Fixes' commit which has accidentally removed this support. I understand this can be seen as a feature, but I'm not sure how to better fix that to avoid this frustration from users when the 'subflow' flag is ignored. Or we change the kernel not to accept setting the two flags together? > I am a bit doubtful about having endpoint with both subflow | signal, > do we want encourage such bogus setup? What if the endpoint specify a > port, too? > > AFAICS it can not work, as the port-based signal endpoint will create a > listener on <IP>:<port> and later on the 'subflow' flag will try to > bind on the same <IP>:<port>, failing, while creating the outgoing > connection. What about returning an error when a port is given with the 'subflow' flag? Currently, we do: if (addr.addr.port && !(addr.flags & MPTCP_PM_ADDR_FLAG_SIGNAL)) return -EINVAL; We could add "(... || addr.flags & MPTCP_PM_ADDR_FLAG_SUBLFOW)". > What about just documenting the existing behavior? I don't think we should document that "setting both the 'signal' and 'subflow' flags is allowed, but in fact the 'subflow' flag is ignored in >= 5.17 versions", no? :) I initially looked at adding a warning in IPRoute2, but then I failed to explain when setting these two flags can be justified. (And the new warning would take time to arrive in Linux distro currently ignoring the 'subflow' flag in this case.) Cheers, Matt -- Sponsored by the NGI0 Core fund.
On Thu, 2024-07-04 at 18:43 +0200, Matthieu Baerts wrote: > Hi Paolo, > > On 04/07/2024 17:47, Paolo Abeni wrote: > > Hi, > > > > On Fri, 2024-06-21 at 13:39 +0200, Matthieu Baerts (NGI0) wrote: > > > When looking at improving the user experience around the MPTCP endpoints > > > setup, I noticed that setting an endpoint with both the 'signal' and the > > > 'subflow' flags -- as it has been done in the past by users according to > > > bug reports we got -- were resulting on only announcing the endpoint, > > > but not using it to create subflows: the 'subflow' flag was then > > > ignored. > > > > > > My initial thought was to modify IPRoute2 to warn the user when the two > > > flags were set, but it doesn't sound normal to ignore one of them. I > > > then looked at modifying the kernel not to allow having the two flags > > > set, but when discussing about that with Mat, we thought it was maybe > > > not ideal to do that, as there might be use-cases, we might break some > > > configs, and it was working before apparently. So instead, I fixed the > > > support by splitting the previously shared 'available IDs' bitmap. > > > > > > While at it, I added a new selftest (patch 4) to validate this case -- > > > including a modification of the chk_add_nr helper to inverse the sides > > > were the counters are checked (patch 3) -- and allowed ADD_ADDR echo > > > just after the MP_JOIN 3WHS. > > > > > > The selftests modification don't have a Fixes tag, because backporting > > > will not be easy: on some versions, the backport will not generate > > > conflicts, but the test will fail because some helpers are not > > > available. That's OK, many CIs will use the selftests from the last > > > stable version to validate previous stable releases. > > > > I'm sorry for the large latency. > > No problem, thank you for your reply! > > > I think the above is more a new feature than a fix, I would avoid > > including fixes tag, to prevent stable backports. > > My main goal here is to reduce frustration from users who are (certainly > "badly") setting both the 'signal' and 'subflow' flags on new endpoints. > In the different doc we have, we mention what does 'signal' and > 'subflow', but nothing about using them together. For us, it seems > obvious not to set the two flags together. But if you are new to MPTCP, > I can understand that "signalling doesn't seem to hurt", and both > 'subflow' and 'signal' flags are used. > > (Note that in the mptcp.org implementation, the fullmesh PM was sending > ADD_ADDR and establishing new subflows, just because "it doesn't hurt to > announce all addresses" :) ) > > Currently, if we use the two flags together, the 'subflow' flag is > ignored. I added the Fixes tag in patch 2, because it sounds like a bug > not to create subflows when the 'subflow' flag is set. And also because > it was working before the mentioned 'Fixes' commit which has > accidentally removed this support. Ah, my bad! I underlooked the commit message and missed this part. > > I am a bit doubtful about having endpoint with both subflow | signal, > > do we want encourage such bogus setup? What if the endpoint specify a > > port, too? > > > > AFAICS it can not work, as the port-based signal endpoint will create a > > listener on <IP>:<port> and later on the 'subflow' flag will try to > > bind on the same <IP>:<port>, failing, while creating the outgoing > > connection. > > What about returning an error when a port is given with the 'subflow' flag? > > Currently, we do: > > if (addr.addr.port && !(addr.flags & MPTCP_PM_ADDR_FLAG_SIGNAL)) > return -EINVAL; > > We could add "(... || addr.flags & MPTCP_PM_ADDR_FLAG_SUBLFOW)". IMHO if we deny SIGNAL | SUBFLOW for a subset of the possible configuration, than is better to be consistent and deny them for all the conf. > > What about just documenting the existing behavior? > > I don't think we should document that "setting both the 'signal' and > 'subflow' flags is allowed, but in fact the 'subflow' flag is ignored in > > = 5.17 versions", no? :) Yup, I missed the regression part of it. If possible, I would be happy to reduce complexity by preventing signal and subflow flags together. > I initially looked at adding a warning in IPRoute2, but then I failed to > explain when setting these two flags can be justified. (And the new > warning would take time to arrive in Linux distro currently ignoring the > 'subflow' flag in this case.) Well, distro problems are distro problems, I'm sure we can ignore the distro lifecycle problems here. Should be addressed by a completely different set of persons ;) Indeed the ip-mptcp man page could help some significant rewording, what about starting from there? PORT description implicitly refers to 'signal' endpoint without mentioning such fact. Also it has effect for subflow endpoint, not mentioned there /P
4 Jul 2024 19:32:47 Paolo Abeni <pabeni@redhat.com>: > On Thu, 2024-07-04 at 18:43 +0200, Matthieu Baerts wrote: >> Hi Paolo, >> >> On 04/07/2024 17:47, Paolo Abeni wrote: >>> Hi, >>> >>> On Fri, 2024-06-21 at 13:39 +0200, Matthieu Baerts (NGI0) wrote: >>>> When looking at improving the user experience around the MPTCP endpoints >>>> setup, I noticed that setting an endpoint with both the 'signal' and the >>>> 'subflow' flags -- as it has been done in the past by users according to >>>> bug reports we got -- were resulting on only announcing the endpoint, >>>> but not using it to create subflows: the 'subflow' flag was then >>>> ignored. >>>> >>>> My initial thought was to modify IPRoute2 to warn the user when the two >>>> flags were set, but it doesn't sound normal to ignore one of them. I >>>> then looked at modifying the kernel not to allow having the two flags >>>> set, but when discussing about that with Mat, we thought it was maybe >>>> not ideal to do that, as there might be use-cases, we might break some >>>> configs, and it was working before apparently. So instead, I fixed the >>>> support by splitting the previously shared 'available IDs' bitmap. >>>> >>>> While at it, I added a new selftest (patch 4) to validate this case -- >>>> including a modification of the chk_add_nr helper to inverse the sides >>>> were the counters are checked (patch 3) -- and allowed ADD_ADDR echo >>>> just after the MP_JOIN 3WHS. >>>> >>>> The selftests modification don't have a Fixes tag, because backporting >>>> will not be easy: on some versions, the backport will not generate >>>> conflicts, but the test will fail because some helpers are not >>>> available. That's OK, many CIs will use the selftests from the last >>>> stable version to validate previous stable releases. >>> >>> I'm sorry for the large latency. >> >> No problem, thank you for your reply! >> >>> I think the above is more a new feature than a fix, I would avoid >>> including fixes tag, to prevent stable backports. >> >> My main goal here is to reduce frustration from users who are (certainly >> "badly") setting both the 'signal' and 'subflow' flags on new endpoints. >> In the different doc we have, we mention what does 'signal' and >> 'subflow', but nothing about using them together. For us, it seems >> obvious not to set the two flags together. But if you are new to MPTCP, >> I can understand that "signalling doesn't seem to hurt", and both >> 'subflow' and 'signal' flags are used. >> >> (Note that in the mptcp.org implementation, the fullmesh PM was sending >> ADD_ADDR and establishing new subflows, just because "it doesn't hurt to >> announce all addresses" :) ) >> >> Currently, if we use the two flags together, the 'subflow' flag is >> ignored. I added the Fixes tag in patch 2, because it sounds like a bug >> not to create subflows when the 'subflow' flag is set. And also because >> it was working before the mentioned 'Fixes' commit which has >> accidentally removed this support. > > Ah, my bad! I underlooked the commit message and missed this part. > >>> I am a bit doubtful about having endpoint with both subflow | signal, >>> do we want encourage such bogus setup? What if the endpoint specify a >>> port, too? >>> >>> AFAICS it can not work, as the port-based signal endpoint will create a >>> listener on <IP>:<port> and later on the 'subflow' flag will try to >>> bind on the same <IP>:<port>, failing, while creating the outgoing >>> connection. >> >> What about returning an error when a port is given with the 'subflow' flag? >> >> Currently, we do: >> >> if (addr.addr.port && !(addr.flags & MPTCP_PM_ADDR_FLAG_SIGNAL)) >> return -EINVAL; >> >> We could add "(... || addr.flags & MPTCP_PM_ADDR_FLAG_SUBLFOW)". > > IMHO if we deny SIGNAL | SUBFLOW for a subset of the possible > configuration, than is better to be consistent and deny them for all > the conf. I thought about denying them all, but I wonder if there are no use cases currently using that. I didn't want to break them. Do you think it would be OK? For the moment, using signal + subflow + port doesn't work. But more importantly, it cannot work. This looks similar to signal + fullmesh that is currently denied. So maybe OK to deny only a subset? >>> What about just documenting the existing behavior? >> >> I don't think we should document that "setting both the 'signal' and >> 'subflow' flags is allowed, but in fact the 'subflow' flag is ignored in >>> = 5.17 versions", no? :) > > Yup, I missed the regression part of it. > > If possible, I would be happy to reduce complexity by preventing signal > and subflow flags together. Do you think it would be OK to backport that? How to know if this configuration is not already used for valid use cases? >> I initially looked at adding a warning in IPRoute2, but then I failed to >> explain when setting these two flags can be justified. (And the new >> warning would take time to arrive in Linux distro currently ignoring the >> 'subflow' flag in this case.) > > Well, distro problems are distro problems, I'm sure we can ignore the > distro lifecycle problems here. Should be addressed by a completely > different set of persons ;) That's our problem if people report issues to us :-P > Indeed the ip-mptcp man page could help some significant rewording, > what about starting from there? Yes I started to do that, I will look at the v2 :) https://lore.kernel.org/mptcp/20240605-mptcp-user-feedback-v1-0-d2dc3b399643@kernel.org/T/ > PORT description implicitly refers to 'signal' endpoint without > mentioning such fact. Also it has effect for subflow endpoint, not > mentioned there Good idea to mention that. I will look at adding that in the v2. Cheers, Matt
On Thu, 2024-07-04 at 22:02 +0200, Matthieu Baerts wrote: > > > I don't think we should document that "setting both the 'signal' and > > > 'subflow' flags is allowed, but in fact the 'subflow' flag is ignored in > > > > = 5.17 versions", no? :) > > > > Yup, I missed the regression part of it. > > > > If possible, I would be happy to reduce complexity by preventing signal > > and subflow flags together. > > Do you think it would be OK to backport that? How to know if this > configuration is not already used for valid use cases? I guess that what worry me most is the addition of another large field to the msk "just" to track (again) endpoints. Thinking again about it, perhaps we don't need it. Something alike the following should be sufficient to replace patch 2/4 here: --- diff --git a/net/mptcp/pm_netlink.c b/net/mptcp/pm_netlink.c index ea9e5817b9e9..24e18cd43f48 100644 --- a/net/mptcp/pm_netlink.c +++ b/net/mptcp/pm_netlink.c @@ -513,8 +513,8 @@ __lookup_addr(struct pm_nl_pernet *pernet, const struct mptcp_addr_info *info) static void mptcp_pm_create_subflow_or_signal_addr(struct mptcp_sock *msk) { + struct mptcp_pm_addr_entry *local, *signal_and_subflow = NULL; struct sock *sk = (struct sock *)msk; - struct mptcp_pm_addr_entry *local; unsigned int add_addr_signal_max; unsigned int local_addr_max; struct pm_nl_pernet *pernet; @@ -568,14 +568,27 @@ static void mptcp_pm_create_subflow_or_signal_addr(struct mptcp_sock *msk) if (msk->pm.addr_signal & BIT(MPTCP_ADD_ADDR_SIGNAL)) return; - if (local) { - if (mptcp_pm_alloc_anno_list(msk, &local->addr)) { - __clear_bit(local->addr.id, msk->pm.id_avail_bitmap); - msk->pm.add_addr_signaled++; - mptcp_pm_announce_addr(msk, &local->addr, false); - mptcp_pm_nl_addr_send_ack(msk); - } - } + if (!local) + continue; + + /* If the allocation fails we are on memory pressure, it looks + * not worthy try any forther later action, that will likely fail + * And if local has both the SIGNAL and SUBFLOW flags, and the + * by bad luck the later subflow creation is successfully, the + * msk will end-up not signaling this address; not even + * re-attempting to do it. + * Likely worthy incrementing a new MIB here + */ + if (!mptcp_pm_alloc_anno_list(msk, &local->addr)) + return; + + __clear_bit(local->addr.id, msk->pm.id_avail_bitmap); + msk->pm.add_addr_signaled++; + mptcp_pm_announce_addr(msk, &local->addr, false); + mptcp_pm_nl_addr_send_ack(msk); + if (local->flags & MPTCP_PM_ADDR_FLAG_SUBFLOW) + signal_and_subflow = local; + break; } /* check if should create a new subflow */ @@ -585,9 +598,14 @@ static void mptcp_pm_create_subflow_or_signal_addr(struct mptcp_sock *msk) bool fullmesh; int i, nr; - local = select_local_address(pernet, msk); - if (!local) - break; + if (signal_and_subflow) { + local = signal_and_subflow; + signal_and_subflow = NULL; + } else { + local = select_local_address(pernet, msk); + if (!local) + break; + } fullmesh = !!(local->flags & MPTCP_PM_ADDR_FLAG_FULLMESH); --- that is, when we use a signal endpoint, if that has the subflow flag, use additionally to create the subflows, without doing an additional search. WDYT? Thanks! Paolo p.s. Jakubs report the upstream mptcp_connect() st are flacky: https://netdev.bots.linux.dev/contest.html?executor=vmksft-mptcp-dbg&test=mptcp-connect-sh can you see that in our CI?
Hi Paolo, On 05/07/2024 10:20, Paolo Abeni wrote: > On Thu, 2024-07-04 at 22:02 +0200, Matthieu Baerts wrote: >>>> I don't think we should document that "setting both the 'signal' and >>>> 'subflow' flags is allowed, but in fact the 'subflow' flag is ignored in >>>>> = 5.17 versions", no? :) >>> >>> Yup, I missed the regression part of it. >>> >>> If possible, I would be happy to reduce complexity by preventing signal >>> and subflow flags together. >> >> Do you think it would be OK to backport that? How to know if this >> configuration is not already used for valid use cases? > > I guess that what worry me most is the addition of another large field > to the msk "just" to track (again) endpoints. > > Thinking again about it, perhaps we don't need it. Something alike the > following should be sufficient to replace patch 2/4 here: > > --- > diff --git a/net/mptcp/pm_netlink.c b/net/mptcp/pm_netlink.c > index ea9e5817b9e9..24e18cd43f48 100644 > --- a/net/mptcp/pm_netlink.c > +++ b/net/mptcp/pm_netlink.c > @@ -513,8 +513,8 @@ __lookup_addr(struct pm_nl_pernet *pernet, const struct mptcp_addr_info *info) > > static void mptcp_pm_create_subflow_or_signal_addr(struct mptcp_sock *msk) > { > + struct mptcp_pm_addr_entry *local, *signal_and_subflow = NULL; > struct sock *sk = (struct sock *)msk; > - struct mptcp_pm_addr_entry *local; > unsigned int add_addr_signal_max; > unsigned int local_addr_max; > struct pm_nl_pernet *pernet; > @@ -568,14 +568,27 @@ static void mptcp_pm_create_subflow_or_signal_addr(struct mptcp_sock *msk) > if (msk->pm.addr_signal & BIT(MPTCP_ADD_ADDR_SIGNAL)) > return; > > - if (local) { > - if (mptcp_pm_alloc_anno_list(msk, &local->addr)) { > - __clear_bit(local->addr.id, msk->pm.id_avail_bitmap); > - msk->pm.add_addr_signaled++; > - mptcp_pm_announce_addr(msk, &local->addr, false); > - mptcp_pm_nl_addr_send_ack(msk); > - } > - } > + if (!local) > + continue; > + > + /* If the allocation fails we are on memory pressure, it looks > + * not worthy try any forther later action, that will likely fail > + * And if local has both the SIGNAL and SUBFLOW flags, and the > + * by bad luck the later subflow creation is successfully, the > + * msk will end-up not signaling this address; not even > + * re-attempting to do it. If the subflow creation is successful, the new address will be somehow advertised to the other peer. Should we not try to create the subflow anyway? That would reduce the diff as well. > + * Likely worthy incrementing a new MIB here I think there are many other places where we don't have such MIB counters for not being able to allocate memory. Do you think we would need one here? > + */ > + if (!mptcp_pm_alloc_anno_list(msk, &local->addr)) > + return; > + > + __clear_bit(local->addr.id, msk->pm.id_avail_bitmap); > + msk->pm.add_addr_signaled++; > + mptcp_pm_announce_addr(msk, &local->addr, false); > + mptcp_pm_nl_addr_send_ack(msk); > + if (local->flags & MPTCP_PM_ADDR_FLAG_SUBFLOW) > + signal_and_subflow = local; > + break; > } > > /* check if should create a new subflow */ > @@ -585,9 +598,14 @@ static void mptcp_pm_create_subflow_or_signal_addr(struct mptcp_sock *msk) > bool fullmesh; > int i, nr; > > - local = select_local_address(pernet, msk); > - if (!local) > - break; > + if (signal_and_subflow) { > + local = signal_and_subflow; > + signal_and_subflow = NULL; > + } else { > + local = select_local_address(pernet, msk); > + if (!local) > + break; > + } > > fullmesh = !!(local->flags & MPTCP_PM_ADDR_FLAG_FULLMESH); > --- > that is, when we use a signal endpoint, if that has the subflow flag, > use additionally to create the subflows, without doing an additional > search. > > WDYT? Smart! Indeed, we can do that because we always mark the IDs as being used for one or the other from the same place! I will update the v2 with your version! > Thanks! > > Paolo > > p.s. Jakubs report the upstream mptcp_connect() st are flacky: > > https://netdev.bots.linux.dev/contest.html?executor=vmksft-mptcp-dbg&test=mptcp-connect-sh I noticed that this morning too, but I didn't have a chance to look at the logs. Now I just did, and I think they are all linked to: https://github.com/multipath-tcp/mptcp_net-next/issues/499 > can you see that in our CI? No: https://ci-results.mptcp.dev/flakes.html?branch=export https://ci-results.mptcp.dev/flakes.html?branch=export-net I only saw these issues on the Netdev CI, in debug mode, when the same IP is used for both the client and the server. I just updated this #499 issue with the two new occurrences. Cheers, Matt -- Sponsored by the NGI0 Core fund.
On Fri, 2024-07-05 at 17:29 +0200, Matthieu Baerts wrote: > On 05/07/2024 10:20, Paolo Abeni wrote: > > @@ -568,14 +568,27 @@ static void mptcp_pm_create_subflow_or_signal_addr(struct mptcp_sock *msk) > > if (msk->pm.addr_signal & BIT(MPTCP_ADD_ADDR_SIGNAL)) > > return; > > > > - if (local) { > > - if (mptcp_pm_alloc_anno_list(msk, &local->addr)) { > > - __clear_bit(local->addr.id, msk->pm.id_avail_bitmap); > > - msk->pm.add_addr_signaled++; > > - mptcp_pm_announce_addr(msk, &local->addr, false); > > - mptcp_pm_nl_addr_send_ack(msk); > > - } > > - } > > + if (!local) > > + continue; > > + > > + /* If the allocation fails we are on memory pressure, it looks > > + * not worthy try any forther later action, that will likely fail > > + * And if local has both the SIGNAL and SUBFLOW flags, and the > > + * by bad luck the later subflow creation is successfully, the > > + * msk will end-up not signaling this address; not even > > + * re-attempting to do it. > > If the subflow creation is successful, the new address will be somehow > advertised to the other peer. Should we not try to create the subflow > anyway? That would reduce the diff as well. The announce list requires a very small amount of memory from a single slab. A subflow creation will require a tcp socket, an mptcp context, and likely some lsm context, I think is very unlikely all of them will be successfully. The diff reduction will be just mostly for the trimmed comment, am I correct? If so I think we can keep the code this way. Note that most/all of comment above could be moved into the commit message itself. > > + * Likely worthy incrementing a new MIB here > > I think there are many other places where we don't have such MIB > counters for not being able to allocate memory. Do you think we would > need one here? Thinking again about it, we don't need them. The warn in dmsg should be enough ;) Thanks! Paolo
Hi Paolo, Thank you for your reply! On 09/07/2024 18:47, Paolo Abeni wrote: > On Fri, 2024-07-05 at 17:29 +0200, Matthieu Baerts wrote: >> On 05/07/2024 10:20, Paolo Abeni wrote: >>> @@ -568,14 +568,27 @@ static void mptcp_pm_create_subflow_or_signal_addr(struct mptcp_sock *msk) >>> if (msk->pm.addr_signal & BIT(MPTCP_ADD_ADDR_SIGNAL)) >>> return; >>> >>> - if (local) { >>> - if (mptcp_pm_alloc_anno_list(msk, &local->addr)) { >>> - __clear_bit(local->addr.id, msk->pm.id_avail_bitmap); >>> - msk->pm.add_addr_signaled++; >>> - mptcp_pm_announce_addr(msk, &local->addr, false); >>> - mptcp_pm_nl_addr_send_ack(msk); >>> - } >>> - } >>> + if (!local) >>> + continue; >>> + >>> + /* If the allocation fails we are on memory pressure, it looks >>> + * not worthy try any forther later action, that will likely fail >>> + * And if local has both the SIGNAL and SUBFLOW flags, and the >>> + * by bad luck the later subflow creation is successfully, the >>> + * msk will end-up not signaling this address; not even >>> + * re-attempting to do it. >> >> If the subflow creation is successful, the new address will be somehow >> advertised to the other peer. Should we not try to create the subflow >> anyway? That would reduce the diff as well. > > The announce list requires a very small amount of memory from a single > slab. A subflow creation will require a tcp socket, an mptcp context, > and likely some lsm context, I think is very unlikely all of them will > be successfully. I agree, but I don't know if we should change the behaviour here with a return. Note that mptcp_pm_alloc_anno_list() can also return 'false' if the address was already in the list. It should not happen, but I didn't check if there were cases we didn't handle correctly there. > > The diff reduction will be just mostly for the trimmed comment, am I > correct? If so I think we can keep the code this way. Note that > most/all of comment above could be moved into the commit message > itself. For this fix, I would have only added these two lines after having called mptcp_pm_nl_addr_send_ack(): if (local->flags & MPTCP_PM_ADDR_FLAG_SUBFLOW) signal_and_subflow = local; Or we take this opportunity to reduce the level of indentations, but we use a "continue" if mptcp_pm_alloc_anno_list() failed. WDYT? >>> + * Likely worthy incrementing a new MIB here >> >> I think there are many other places where we don't have such MIB >> counters for not being able to allocate memory. Do you think we would >> need one here? > > Thinking again about it, we don't need them. The warn in dmsg should be > enough ;) Sounds good to me! Cheers, Matt -- Sponsored by the NGI0 Core fund.
On Tue, 2024-07-09 at 19:43 +0200, Matthieu Baerts wrote: > On 09/07/2024 18:47, Paolo Abeni wrote: > > The announce list requires a very small amount of memory from a single > > slab. A subflow creation will require a tcp socket, an mptcp context, > > and likely some lsm context, I think is very unlikely all of them will > > be successfully. > > I agree, but I don't know if we should change the behaviour here with a > return. > > Note that mptcp_pm_alloc_anno_list() can also return 'false' if the > address was already in the list. It should not happen, but I didn't > check if there were cases we didn't handle correctly there. AFAICS, we should never enter that branch. In fact I think we could add a WARN_ON_ONCE there: if (WARN_ON_ONCE(mptcp_pm_is_kernel(msk))) return false; because the bitmap should prevent us from adding an already announce address. > > > > > The diff reduction will be just mostly for the trimmed comment, am I > > correct? If so I think we can keep the code this way. Note that > > most/all of comment above could be moved into the commit message > > itself. > > For this fix, I would have only added these two lines after having > called mptcp_pm_nl_addr_send_ack(): > > if (local->flags & MPTCP_PM_ADDR_FLAG_SUBFLOW) > signal_and_subflow = local; > > Or we take this opportunity to reduce the level of indentations, but we > use a "continue" if mptcp_pm_alloc_anno_list() failed. WDYT? I would go for reducing the level of indentation. I think that in practice the behavior will be the same with both 'continue' and 'return': on 'continue' the next iteration will very likely fail. Perhaps it's better to be avoid wasting cycles and/or put extreme memory pressure on the system and consistently 'return'? Thanks! Paolo
Hi Matthieu, Thank you for your modifications, that's great! Our CI did some validations and here is its report: - KVM Validation: normal: Success! ✅ - KVM Validation: debug: Success! ✅ - KVM Validation: btf (only bpftest_all): Success! ✅ - Task: https://github.com/multipath-tcp/mptcp_net-next/actions/runs/9613128523 Initiator: Patchew Applier Commits: https://github.com/multipath-tcp/mptcp_net-next/commits/5b1c305d4d2d Patchwork: https://patchwork.kernel.org/project/mptcp/list/?series=864250 If there are some issues, you can reproduce them using the same environment as the one used by the CI thanks to a docker image, e.g.: $ cd [kernel source code] $ docker run -v "${PWD}:${PWD}:rw" -w "${PWD}" --privileged --rm -it \ --pull always mptcp/mptcp-upstream-virtme-docker:latest \ auto-normal For more details: https://github.com/multipath-tcp/mptcp-upstream-virtme-docker Please note that despite all the efforts that have been already done to have a stable tests suite when executed on a public CI like here, it is possible some reported issues are not due to your modifications. Still, do not hesitate to help us improve that ;-) Cheers, MPTCP GH Action bot Bot operated by Matthieu Baerts (NGI0 Core)
© 2016 - 2024 Red Hat, Inc.