[PATCH mptcp-net v2 1/9] mptcp: sched: check both directions for backup

Matthieu Baerts (NGI0) posted 9 patches 2 months ago
There is a newer version of this series
Posted by Matthieu Baerts (NGI0) 2 months ago
The 'mptcp_subflow_context' structure has two items related to the
backup flags:

 - 'backup': the subflow has been marked as backup by the other peer

- 'request_bkup': the backup flag has been set by the host

Before this patch, the scheduler was only looking at the 'backup' flag.
That can make sense in some cases, but it looks like that's not what we
wanted for the general use, because either the path-manager was setting
both of them when sending an MP_PRIO, or the receiver was duplicating
the 'backup' flag in the subflow request.

Note that the use of these two flags in the path-manager is going to be
fixed in the next commits, but this change here is needed not to modify
the behaviour.

Fixes: f296234c98a8 ("mptcp: Add handling of incoming MP_JOIN requests")
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
---
 include/trace/events/mptcp.h |  2 +-
 net/mptcp/protocol.c         | 10 ++++++----
 2 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/include/trace/events/mptcp.h b/include/trace/events/mptcp.h
index 09e72215b9f9..085b749cdd97 100644
--- a/include/trace/events/mptcp.h
+++ b/include/trace/events/mptcp.h
@@ -34,7 +34,7 @@ TRACE_EVENT(mptcp_subflow_get_send,
 		struct sock *ssk;
 
 		__entry->active = mptcp_subflow_active(subflow);
-		__entry->backup = subflow->backup;
+		__entry->backup = subflow->backup || subflow->request_bkup;
 
 		if (subflow->tcp_sock && sk_fullsock(subflow->tcp_sock))
 			__entry->free = sk_stream_memory_free(subflow->tcp_sock);
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index ac94225489f8..b3a48d97f009 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -1422,13 +1422,15 @@ struct sock *mptcp_subflow_get_send(struct mptcp_sock *msk)
 	}
 
 	mptcp_for_each_subflow(msk, subflow) {
+		bool backup = subflow->backup || subflow->request_bkup;
+
 		trace_mptcp_subflow_get_send(subflow);
 		ssk =  mptcp_subflow_tcp_sock(subflow);
 		if (!mptcp_subflow_active(subflow))
 			continue;
 
 		tout = max(tout, mptcp_timeout_from_subflow(subflow));
-		nr_active += !subflow->backup;
+		nr_active += !backup;
 		pace = subflow->avg_pacing_rate;
 		if (unlikely(!pace)) {
 			/* init pacing rate from socket */
@@ -1439,9 +1441,9 @@ struct sock *mptcp_subflow_get_send(struct mptcp_sock *msk)
 		}
 
 		linger_time = div_u64((u64)READ_ONCE(ssk->sk_wmem_queued) << 32, pace);
-		if (linger_time < send_info[subflow->backup].linger_time) {
-			send_info[subflow->backup].ssk = ssk;
-			send_info[subflow->backup].linger_time = linger_time;
+		if (linger_time < send_info[backup].linger_time) {
+			send_info[backup].ssk = ssk;
+			send_info[backup].linger_time = linger_time;
 		}
 	}
 	__mptcp_set_timeout(sk, tout);

-- 
2.45.2
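
For readers following the thread, the selection logic touched by this patch
can be modelled in userspace. The sketch below is not the kernel code: the
'model_subflow' and 'model_send_info' types and the 'model_pick()' helper are
hypothetical names, the linger time is taken as a precomputed input instead
of being derived from the pacing rate, and the initial "no candidate" value
is an assumption. It only shows how a subflow flagged as backup in either
direction lands in the backup slot and stops counting as a regular active
subflow.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for 'struct mptcp_subflow_context'. */
struct model_subflow {
	bool active;          /* stands in for mptcp_subflow_active() */
	bool backup;          /* marked as backup by the other peer */
	bool request_bkup;    /* backup flag set by this host */
	uint64_t linger_time; /* precomputed here, an assumption of this model */
};

/* Best candidate per class: index 0 is regular, index 1 is backup. */
struct model_send_info {
	const struct model_subflow *subflow;
	uint64_t linger_time;
};

/*
 * Fill send_info[0] and send_info[1] with the active subflow having the
 * lowest linger time in each class, and return the number of active
 * subflows that are not backup in either direction.
 */
static int model_pick(const struct model_subflow *subflows, size_t n,
		      struct model_send_info send_info[2])
{
	int nr_active = 0;
	size_t i;

	for (i = 0; i < 2; i++) {
		send_info[i].subflow = NULL;
		send_info[i].linger_time = UINT64_MAX; /* assumed sentinel */
	}

	for (i = 0; i < n; i++) {
		const struct model_subflow *sf = &subflows[i];
		/* The change from the patch: look at both directions. */
		bool backup = sf->backup || sf->request_bkup;

		if (!sf->active)
			continue;

		nr_active += !backup;
		if (sf->linger_time < send_info[backup].linger_time) {
			send_info[backup].subflow = sf;
			send_info[backup].linger_time = sf->linger_time;
		}
	}

	return nr_active;
}
```

With this model, a subflow that only has 'request_bkup' set is compared
against the other backup subflows, whereas a check on 'backup' alone would
have treated it as a regular one.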
Re: [PATCH mptcp-net v2 1/9] mptcp: sched: check both directions for backup
Posted by Geliang Tang 2 months ago
On Tue, 2024-07-16 at 22:53 +0200, Matthieu Baerts (NGI0) wrote:
> The 'mptcp_subflow_context' structure has two items related to the
> backup flags:
> 
>  - 'backup': the subflow has been marked as backup by the other peer
> 
> - 'request_bkup': the backup flag has been set by the host

The two lines are not aligned.

> 
> Before this patch, the scheduler was only looking at the 'backup' flag.
> That can make sense in some cases, but it looks like that's not what we
> wanted for the general use, because either the path-manager was setting
> both of them when sending an MP_PRIO, or the receiver was duplicating
> the 'backup' flag in the subflow request.
> 
> Note that the use of these two flags in the path-manager is going to be
> fixed in the next commits, but this change here is needed not to modify
> the behaviour.
> 
> Fixes: f296234c98a8 ("mptcp: Add handling of incoming MP_JOIN requests")

Patch 3 can be squashed into this one, with two "Fixes" tags here.

WDYT?

Thanks,
-Geliang

> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
> ---
>  include/trace/events/mptcp.h |  2 +-
>  net/mptcp/protocol.c         | 10 ++++++----
>  2 files changed, 7 insertions(+), 5 deletions(-)
> 
> diff --git a/include/trace/events/mptcp.h b/include/trace/events/mptcp.h
> index 09e72215b9f9..085b749cdd97 100644
> --- a/include/trace/events/mptcp.h
> +++ b/include/trace/events/mptcp.h
> @@ -34,7 +34,7 @@ TRACE_EVENT(mptcp_subflow_get_send,
>  		struct sock *ssk;
>  
>  		__entry->active = mptcp_subflow_active(subflow);
> -		__entry->backup = subflow->backup;
> +		__entry->backup = subflow->backup || subflow->request_bkup;
>  
>  		if (subflow->tcp_sock && sk_fullsock(subflow->tcp_sock))
>  			__entry->free = sk_stream_memory_free(subflow->tcp_sock);
> diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
> index ac94225489f8..b3a48d97f009 100644
> --- a/net/mptcp/protocol.c
> +++ b/net/mptcp/protocol.c
> @@ -1422,13 +1422,15 @@ struct sock *mptcp_subflow_get_send(struct mptcp_sock *msk)
>  	}
>  
>  	mptcp_for_each_subflow(msk, subflow) {
> +		bool backup = subflow->backup || subflow->request_bkup;
> +
>  		trace_mptcp_subflow_get_send(subflow);
>  		ssk =  mptcp_subflow_tcp_sock(subflow);
>  		if (!mptcp_subflow_active(subflow))
>  			continue;
>  
>  		tout = max(tout, mptcp_timeout_from_subflow(subflow));
> -		nr_active += !subflow->backup;
> +		nr_active += !backup;
>  		pace = subflow->avg_pacing_rate;
>  		if (unlikely(!pace)) {
>  			/* init pacing rate from socket */
> @@ -1439,9 +1441,9 @@ struct sock *mptcp_subflow_get_send(struct mptcp_sock *msk)
>  		}
>  
>  		linger_time = div_u64((u64)READ_ONCE(ssk->sk_wmem_queued) << 32, pace);
> -		if (linger_time < send_info[subflow->backup].linger_time) {
> -			send_info[subflow->backup].ssk = ssk;
> -			send_info[subflow->backup].linger_time = linger_time;
> +		if (linger_time < send_info[backup].linger_time) {
> +			send_info[backup].ssk = ssk;
> +			send_info[backup].linger_time = linger_time;
>  		}
>  	}
>  	__mptcp_set_timeout(sk, tout);
> 

Re: [PATCH mptcp-net v2 1/9] mptcp: sched: check both directions for backup
Posted by Matthieu Baerts 2 months ago
Hi Geliang,

Thank you for the review!

On 17/07/2024 06:25, Geliang Tang wrote:
> On Tue, 2024-07-16 at 22:53 +0200, Matthieu Baerts (NGI0) wrote:
>> The 'mptcp_subflow_context' structure has two items related to the
>> backup flags:
>>
>>  - 'backup': the subflow has been marked as backup by the other peer
>>
>> - 'request_bkup': the backup flag has been set by the host
> 
> The two lines are not aligned.

Good catch!

>> Before this patch, the scheduler was only looking at the 'backup' flag.
>> That can make sense in some cases, but it looks like that's not what we
>> wanted for the general use, because either the path-manager was setting
>> both of them when sending an MP_PRIO, or the receiver was duplicating
>> the 'backup' flag in the subflow request.
>>
>> Note that the use of these two flags in the path-manager is going to be
>> fixed in the next commits, but this change here is needed not to modify
>> the behaviour.
>>
>> Fixes: f296234c98a8 ("mptcp: Add handling of incoming MP_JOIN requests")
> 
> Patch 3 can be squashed into this one, with two "Fixes" tags here.

I think we should avoid putting two Fixes tags, because it makes the
backports harder. Also, the issue has really been there since MP_JOIN
support was added: at the beginning, the extra subflows were flagged as
backup, and since the beginning, the backup flag in the SYN+MPJ has been
reflected in the SYN+ACK+MPJ (patch 2). The scheduler was then always
looking at both sides by accident since the beginning. By fixing the use
of the two flags internally, we should also modify the scheduler to keep
the same behaviour (to me, looking at both sides is what we should have
done from the beginning). No?

Cheers,
Matt
-- 
Sponsored by the NGI0 Core fund.