The 'mptcp_subflow_context' structure has two items related to the
backup flags:

- 'backup': the subflow has been marked as backup by the other peer

- 'request_bkup': the backup flag has been set by the host

Before this patch, the scheduler was only looking at the 'backup' flag.
That can make sense in some cases, but it looks like that's not what we
wanted for the general use, because either the path-manager was setting
both of them when sending an MP_PRIO, or the receiver was duplicating
the 'backup' flag in the subflow request.

Note that the use of these two flags in the path-manager is going to be
fixed in the next commits, but this change here is needed to avoid
modifying the behaviour.

Fixes: f296234c98a8 ("mptcp: Add handling of incoming MP_JOIN requests")
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
---
 include/trace/events/mptcp.h |  2 +-
 net/mptcp/protocol.c         | 10 ++++++----
 2 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/include/trace/events/mptcp.h b/include/trace/events/mptcp.h
index 09e72215b9f9..085b749cdd97 100644
--- a/include/trace/events/mptcp.h
+++ b/include/trace/events/mptcp.h
@@ -34,7 +34,7 @@ TRACE_EVENT(mptcp_subflow_get_send,
 		struct sock *ssk;

 		__entry->active = mptcp_subflow_active(subflow);
-		__entry->backup = subflow->backup;
+		__entry->backup = subflow->backup || subflow->request_bkup;

 		if (subflow->tcp_sock && sk_fullsock(subflow->tcp_sock))
 			__entry->free = sk_stream_memory_free(subflow->tcp_sock);
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index ac94225489f8..b3a48d97f009 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -1422,13 +1422,15 @@ struct sock *mptcp_subflow_get_send(struct mptcp_sock *msk)
 	}

 	mptcp_for_each_subflow(msk, subflow) {
+		bool backup = subflow->backup || subflow->request_bkup;
+
 		trace_mptcp_subflow_get_send(subflow);
 		ssk =  mptcp_subflow_tcp_sock(subflow);
 		if (!mptcp_subflow_active(subflow))
 			continue;

 		tout = max(tout, mptcp_timeout_from_subflow(subflow));
-		nr_active += !subflow->backup;
+		nr_active += !backup;
 		pace = subflow->avg_pacing_rate;
 		if (unlikely(!pace)) {
 			/* init pacing rate from socket */
@@ -1439,9 +1441,9 @@ struct sock *mptcp_subflow_get_send(struct mptcp_sock *msk)
 		}

 		linger_time = div_u64((u64)READ_ONCE(ssk->sk_wmem_queued) << 32, pace);
-		if (linger_time < send_info[subflow->backup].linger_time) {
-			send_info[subflow->backup].ssk = ssk;
-			send_info[subflow->backup].linger_time = linger_time;
+		if (linger_time < send_info[backup].linger_time) {
+			send_info[backup].ssk = ssk;
+			send_info[backup].linger_time = linger_time;
 		}
 	}
 	__mptcp_set_timeout(sk, tout);
--
2.45.2
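
For readers following the thread, the selection logic touched by this
patch can be approximated in user space. The sketch below is only an
illustration, not the kernel code: 'struct toy_subflow', 'struct
toy_pick' and 'toy_pick_subflows()' are hypothetical names, and
'linger_time' is supplied directly instead of being derived from the
pacing rate. It shows how a subflow is treated as backup when either
flag is set, and how the best candidate is tracked per class.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the relevant mptcp_subflow_context fields. */
struct toy_subflow {
	bool active;
	bool backup;		/* marked as backup by the other peer */
	bool request_bkup;	/* backup flag set by this host */
	uint64_t linger_time;	/* assumed precomputed for the sketch */
};

/* Best subflow index per class: idx[0] = non-backup, idx[1] = backup. */
struct toy_pick {
	int idx[2];		/* -1 when the class has no candidate */
	unsigned int nr_active;	/* active subflows not considered backup */
};

static struct toy_pick toy_pick_subflows(const struct toy_subflow *sf, size_t n)
{
	struct toy_pick pick = { .idx = { -1, -1 }, .nr_active = 0 };
	uint64_t best[2] = { UINT64_MAX, UINT64_MAX };

	assert(sf != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		/* Either side asking for backup is enough. */
		bool backup = sf[i].backup || sf[i].request_bkup;

		if (!sf[i].active)
			continue;

		pick.nr_active += !backup;
		if (sf[i].linger_time < best[backup]) {
			best[backup] = sf[i].linger_time;
			pick.idx[backup] = (int)i;
		}
	}

	return pick;
}
```

With only 'backup' checked, a subflow flagged locally through
'request_bkup' would compete in the non-backup class; combining both
flags keeps it in the backup class.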

On Tue, 2024-07-16 at 22:53 +0200, Matthieu Baerts (NGI0) wrote:
> The 'mptcp_subflow_context' structure has two items related to the
> backup flags:
>
> - 'backup': the subflow has been marked as backup by the other peer
>
> - 'request_bkup': the backup flag has been set by the host

The two lines are not aligned.

>
> Before this patch, the scheduler was only looking at the 'backup' flag.
> That can make sense in some cases, but it looks like that's not what we
> wanted for the general use, because either the path-manager was setting
> both of them when sending an MP_PRIO, or the receiver was duplicating
> the 'backup' flag in the subflow request.
>
> Note that the use of these two flags in the path-manager are going to be
> fixed in the next commits, but this change here is needed not to modify
> the behaviour.
>
> Fixes: f296234c98a8 ("mptcp: Add handling of incoming MP_JOIN requests")

Patch 3 can be squashed into this one, with two "Fixes" tags here. WDYT?

Thanks,
-Geliang

> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
> ---
>  include/trace/events/mptcp.h |  2 +-
>  net/mptcp/protocol.c         | 10 ++++++----
>  2 files changed, 7 insertions(+), 5 deletions(-)
>
> diff --git a/include/trace/events/mptcp.h b/include/trace/events/mptcp.h
> index 09e72215b9f9..085b749cdd97 100644
> --- a/include/trace/events/mptcp.h
> +++ b/include/trace/events/mptcp.h
> @@ -34,7 +34,7 @@ TRACE_EVENT(mptcp_subflow_get_send,
>  		struct sock *ssk;
>
>  		__entry->active = mptcp_subflow_active(subflow);
> -		__entry->backup = subflow->backup;
> +		__entry->backup = subflow->backup || subflow->request_bkup;
>
>  		if (subflow->tcp_sock && sk_fullsock(subflow->tcp_sock))
>  			__entry->free = sk_stream_memory_free(subflow->tcp_sock);
> diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
> index ac94225489f8..b3a48d97f009 100644
> --- a/net/mptcp/protocol.c
> +++ b/net/mptcp/protocol.c
> @@ -1422,13 +1422,15 @@ struct sock *mptcp_subflow_get_send(struct mptcp_sock *msk)
>  	}
>
>  	mptcp_for_each_subflow(msk, subflow) {
> +		bool backup = subflow->backup || subflow->request_bkup;
> +
>  		trace_mptcp_subflow_get_send(subflow);
>  		ssk =  mptcp_subflow_tcp_sock(subflow);
>  		if (!mptcp_subflow_active(subflow))
>  			continue;
>
>  		tout = max(tout, mptcp_timeout_from_subflow(subflow));
> -		nr_active += !subflow->backup;
> +		nr_active += !backup;
>  		pace = subflow->avg_pacing_rate;
>  		if (unlikely(!pace)) {
>  			/* init pacing rate from socket */
> @@ -1439,9 +1441,9 @@ struct sock *mptcp_subflow_get_send(struct mptcp_sock *msk)
>  		}
>
>  		linger_time = div_u64((u64)READ_ONCE(ssk->sk_wmem_queued) << 32, pace);
> -		if (linger_time < send_info[subflow->backup].linger_time) {
> -			send_info[subflow->backup].ssk = ssk;
> -			send_info[subflow->backup].linger_time = linger_time;
> +		if (linger_time < send_info[backup].linger_time) {
> +			send_info[backup].ssk = ssk;
> +			send_info[backup].linger_time = linger_time;
>  		}
>  	}
>  	__mptcp_set_timeout(sk, tout);
>

Hi Geliang,

Thank you for the review!

On 17/07/2024 06:25, Geliang Tang wrote:
> On Tue, 2024-07-16 at 22:53 +0200, Matthieu Baerts (NGI0) wrote:
>> The 'mptcp_subflow_context' structure has two items related to the
>> backup flags:
>>
>> - 'backup': the subflow has been marked as backup by the other peer
>>
>> - 'request_bkup': the backup flag has been set by the host
>
> The two lines are not aligned.

Good catch!

>> Before this patch, the scheduler was only looking at the 'backup' flag.
>> That can make sense in some cases, but it looks like that's not what we
>> wanted for the general use, because either the path-manager was setting
>> both of them when sending an MP_PRIO, or the receiver was duplicating
>> the 'backup' flag in the subflow request.
>>
>> Note that the use of these two flags in the path-manager are going to be
>> fixed in the next commits, but this change here is needed not to modify
>> the behaviour.
>>
>> Fixes: f296234c98a8 ("mptcp: Add handling of incoming MP_JOIN requests")
>
> Patch 3 can be squashed into this one, with two "Fixes" tags here.

I think we should avoid putting two Fixes tags, because it makes the
backports harder.

Also, the issue has really been there since MP_JOIN got supported: at
the beginning, the extra subflows were flagged as backup, and since the
beginning, the backup flag in the SYN+MPJ is reflected in the
SYN+ACK+MPJ (patch 2). The scheduler was then always looking at both
sides by accident since the beginning.

By fixing the use of the two flags internally, we should also modify the
scheduler to keep the same behaviour (to me, looking at both sides is
what we should have done from the beginning).

No?

Cheers,
Matt
--
Sponsored by the NGI0 Core fund.