[PATCH v2 mptcp-net 1/2] mptcp: do not fallback when OoO is present

Paolo Abeni posted 2 patches 2 weeks, 2 days ago
There is a newer version of this series
[PATCH v2 mptcp-net 1/2] mptcp: do not fallback when OoO is present
Posted by Paolo Abeni 2 weeks, 2 days ago
In case of DSS corruption, the MPTCP protocol tries to avoid the
subflow reset if fallback is possible. Such corruptions happen in
the receive path; to ensure fallback is possible the stack additionally
needs to check for OoO data, otherwise the fallback will break the data
stream.

Fixes: e32d262c89e2 ("mptcp: handle consistently DSS corruption")
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/598
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
Note: this does not avoid the WARN(), but fixes the inconsistent
read() behavior; the ingress data is OoO, we should not ack it.
---
 net/mptcp/protocol.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index d6b08e1de358..7b966f105f89 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -646,7 +646,8 @@ static void mptcp_check_data_fin(struct sock *sk)
 
 static void mptcp_dss_corruption(struct mptcp_sock *msk, struct sock *ssk)
 {
-	if (!mptcp_try_fallback(ssk, MPTCP_MIB_DSSCORRUPTIONFALLBACK)) {
+	if (!RB_EMPTY_ROOT(&msk->out_of_order_queue) ||
+	    !mptcp_try_fallback(ssk, MPTCP_MIB_DSSCORRUPTIONFALLBACK)) {
 		MPTCP_INC_STATS(sock_net(ssk), MPTCP_MIB_DSSCORRUPTIONRESET);
 		mptcp_subflow_reset(ssk);
 	}
-- 
2.51.1
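
For readers following the thread, the effect of the new condition can be
sketched in userspace C. This is a minimal model under stated assumptions,
not kernel code: struct model_msk, model_try_fallback() and
model_dss_corruption() are hypothetical stand-ins, and the real
mptcp_try_fallback() does more than what is modeled here. The sketch only
shows that, because of the short-circuit '||', a non-empty OoO queue leads to
a reset without the fallback being attempted at all.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for the relevant msk state. */
struct model_msk {
	int  ooo_queued;        /* > 0 models !RB_EMPTY_ROOT(&msk->out_of_order_queue) */
	bool fallback_allowed;  /* models whether mptcp_try_fallback() would succeed */
	int  fallback_attempts; /* how many times the fallback was attempted */
};

enum dss_action { DSS_FALLBACK, DSS_RESET };

/* Models mptcp_try_fallback(): records the attempt and reports success. */
bool model_try_fallback(struct model_msk *msk)
{
	assert(msk != NULL);
	msk->fallback_attempts++;
	return msk->fallback_allowed;
}

/* Mirrors the patched condition: pending OoO data forces a reset, and the
 * fallback is only attempted when the OoO queue is empty. */
enum dss_action model_dss_corruption(struct model_msk *msk)
{
	assert(msk != NULL);
	if (msk->ooo_queued > 0 || !model_try_fallback(msk))
		return DSS_RESET;
	return DSS_FALLBACK;
}
```

Calling model_dss_corruption() with ooo_queued > 0 returns DSS_RESET and
leaves fallback_attempts untouched; with an empty queue the outcome depends
only on the modeled fallback result, as before the patch.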
Re: [PATCH v2 mptcp-net 1/2] mptcp: do not fallback when OoO is present
Posted by Matthieu Baerts 2 weeks, 1 day ago
Hi Paolo,

On 11/11/2025 08:24, Paolo Abeni wrote:
> In case of DSS corruption, the MPTCP protocol tries to avoid the
> subflow reset if fallback is possible. Such corruptions happen in
> the receive path; to ensure fallback is possible the stack additionally
> needs to check for OoO data, otherwise the fallback will break the data
> stream.

Thank you for the fix!

> Fixes: e32d262c89e2 ("mptcp: handle consistently DSS corruption")
> Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/598
> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
> ---
> Note: this does not avoid the WARN(), but fixes the inconsistent
> read() behavior; the ingress data is OoO, we should not ack it
> ---
>  net/mptcp/protocol.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
> index d6b08e1de358..7b966f105f89 100644
> --- a/net/mptcp/protocol.c
> +++ b/net/mptcp/protocol.c
> @@ -646,7 +646,8 @@ static void mptcp_check_data_fin(struct sock *sk)
>  
>  static void mptcp_dss_corruption(struct mptcp_sock *msk, struct sock *ssk)
>  {
> -	if (!mptcp_try_fallback(ssk, MPTCP_MIB_DSSCORRUPTIONFALLBACK)) {
> +	if (!RB_EMPTY_ROOT(&msk->out_of_order_queue) ||
> +	    !mptcp_try_fallback(ssk, MPTCP_MIB_DSSCORRUPTIONFALLBACK)) {

Does it mean we should check the OoO queue each time mptcp_try_fallback
is called?

Should we not eventually set msk->allow_infinite_fallback to false in
mptcp_data_queue_ofo()?

>  		MPTCP_INC_STATS(sock_net(ssk), MPTCP_MIB_DSSCORRUPTIONRESET);
>  		mptcp_subflow_reset(ssk);
>  	}

Cheers,
Matt
-- 
Sponsored by the NGI0 Core fund.
Re: [PATCH v2 mptcp-net 1/2] mptcp: do not fallback when OoO is present
Posted by Paolo Abeni 2 weeks ago
On 11/11/25 6:50 PM, Matthieu Baerts wrote:
> On 11/11/2025 08:24, Paolo Abeni wrote:
>> In case of DSS corruption, the MPTCP protocol tries to avoid the
>> subflow reset if fallback is possible. Such corruptions happen in
>> the receive path; to ensure fallback is possible the stack additionally
>> needs to check for OoO data, otherwise the fallback will break the data
>> stream.
> 
> Thank you for the fix!
>> Fixes: e32d262c89e2 ("mptcp: handle consistently DSS corruption")
>> Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/598
>> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
>> ---
>> Note: this does not avoid the WARN(), but fixes the inconsistent
>> read() behavior; the ingress data is OoO, we should not ack it
>> ---
>>  net/mptcp/protocol.c | 3 ++-
>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
>> index d6b08e1de358..7b966f105f89 100644
>> --- a/net/mptcp/protocol.c
>> +++ b/net/mptcp/protocol.c
>> @@ -646,7 +646,8 @@ static void mptcp_check_data_fin(struct sock *sk)
>>  
>>  static void mptcp_dss_corruption(struct mptcp_sock *msk, struct sock *ssk)
>>  {
>> -	if (!mptcp_try_fallback(ssk, MPTCP_MIB_DSSCORRUPTIONFALLBACK)) {
>> +	if (!RB_EMPTY_ROOT(&msk->out_of_order_queue) ||
>> +	    !mptcp_try_fallback(ssk, MPTCP_MIB_DSSCORRUPTIONFALLBACK)) {
> 
> Does it mean we should check the OoO queue each time mptcp_try_fallback
> is called?
> 
> Should we not eventually set msk->allow_infinite_fallback to false in
> mptcp_data_queue_ofo()?

Good question. According to the RFC, here we should unconditionally reset
the subflow. Historically, we have tried hard to fall back, also because in
early releases fallback was a bit too easy to obtain.

Setting msk->allow_infinite_fallback = false in mptcp_data_queue_ofo()
could hurt performance: queue_ofo is basically the fast path with
multiple streams, and we would likely need to acquire the fallback lock
to do it race-free.

I'm tempted to just do a plain reset here. WDYT?

Thanks,

Paolo
Re: [PATCH v2 mptcp-net 1/2] mptcp: do not fallback when OoO is present
Posted by Matthieu Baerts 2 weeks ago
Hi Paolo,

On 13/11/2025 01:02, Paolo Abeni wrote:
> On 11/11/25 6:50 PM, Matthieu Baerts wrote:
>> On 11/11/2025 08:24, Paolo Abeni wrote:
>>> In case of DSS corruption, the MPTCP protocol tries to avoid the
>>> subflow reset if fallback is possible. Such corruptions happen in
>>> the receive path; to ensure fallback is possible the stack additionally
>>> needs to check for OoO data, otherwise the fallback will break the data
>>> stream.
>>
>> Thank you for the fix!
>>> Fixes: e32d262c89e2 ("mptcp: handle consistently DSS corruption")
>>> Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/598
>>> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
>>> ---
>>> Note: this does not avoid the WARN(), but fixes the inconsistent
>>> read() behavior; the ingress data is OoO, we should not ack it
>>> ---
>>>  net/mptcp/protocol.c | 3 ++-
>>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
>>> index d6b08e1de358..7b966f105f89 100644
>>> --- a/net/mptcp/protocol.c
>>> +++ b/net/mptcp/protocol.c
>>> @@ -646,7 +646,8 @@ static void mptcp_check_data_fin(struct sock *sk)
>>>  
>>>  static void mptcp_dss_corruption(struct mptcp_sock *msk, struct sock *ssk)
>>>  {
>>> -	if (!mptcp_try_fallback(ssk, MPTCP_MIB_DSSCORRUPTIONFALLBACK)) {
>>> +	if (!RB_EMPTY_ROOT(&msk->out_of_order_queue) ||
>>> +	    !mptcp_try_fallback(ssk, MPTCP_MIB_DSSCORRUPTIONFALLBACK)) {
>>
>> Does it mean we should check the OoO queue each time mptcp_try_fallback
>> is called?
>>
>> Should we not eventually set msk->allow_infinite_fallback to false in
>> mptcp_data_queue_ofo()?
> 
> Good question. According to the RFC, here we should unconditionally reset
> the subflow. Historically, we have tried hard to fall back, also because in
> early releases fallback was a bit too easy to obtain.
> 
> Setting msk->allow_infinite_fallback = false in mptcp_data_queue_ofo()
> could hurt performance: queue_ofo is basically the fast path with
> multiple streams, and we would likely need to acquire the fallback lock
> to do it race-free.
> 
> I'm tempted to just do a plain reset here. WDYT?

Can mptcp_dss_corruption() not be called before being in fully
established mode? In this case, should we not fall back instead? E.g. if
a middlebox starts to alter MPTCP options after the 3WHS?

Or maybe moving this RB_EMPTY_ROOT(&msk->out_of_order_queue) check to
mptcp_try_fallback()? (mmh, we don't have the msk)

Or maybe just a plain reset is OK here?

Cheers,
Matt
-- 
Sponsored by the NGI0 Core fund.
Re: [PATCH v2 mptcp-net 1/2] mptcp: do not fallback when OoO is present
Posted by Paolo Abeni 1 week, 6 days ago
On 11/13/25 10:08 AM, Matthieu Baerts wrote:
> On 13/11/2025 01:02, Paolo Abeni wrote:
>> On 11/11/25 6:50 PM, Matthieu Baerts wrote:
>>> On 11/11/2025 08:24, Paolo Abeni wrote:
>>>> In case of DSS corruption, the MPTCP protocol tries to avoid the
>>>> subflow reset if fallback is possible. Such corruptions happen in
>>>> the receive path; to ensure fallback is possible the stack additionally
>>>> needs to check for OoO data, otherwise the fallback will break the data
>>>> stream.
>>>
>>> Thank you for the fix!
>>>> Fixes: e32d262c89e2 ("mptcp: handle consistently DSS corruption")
>>>> Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/598
>>>> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
>>>> ---
>>>> Note: this does not avoid the WARN(), but fixes the inconsistent
>>>> read() behavior; the ingress data is OoO, we should not ack it
>>>> ---
>>>>  net/mptcp/protocol.c | 3 ++-
>>>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
>>>> index d6b08e1de358..7b966f105f89 100644
>>>> --- a/net/mptcp/protocol.c
>>>> +++ b/net/mptcp/protocol.c
>>>> @@ -646,7 +646,8 @@ static void mptcp_check_data_fin(struct sock *sk)
>>>>  
>>>>  static void mptcp_dss_corruption(struct mptcp_sock *msk, struct sock *ssk)
>>>>  {
>>>> -	if (!mptcp_try_fallback(ssk, MPTCP_MIB_DSSCORRUPTIONFALLBACK)) {
>>>> +	if (!RB_EMPTY_ROOT(&msk->out_of_order_queue) ||
>>>> +	    !mptcp_try_fallback(ssk, MPTCP_MIB_DSSCORRUPTIONFALLBACK)) {
>>>
>>> Does it mean we should check the OoO queue each time mptcp_try_fallback
>>> is called?
>>>
>>> Should we not eventually set msk->allow_infinite_fallback to false in
>>> mptcp_data_queue_ofo()?
>>
>> Good question. According to the RFC, here we should unconditionally reset
>> the subflow. Historically, we have tried hard to fall back, also because in
>> early releases fallback was a bit too easy to obtain.
>>
>> Setting msk->allow_infinite_fallback = false in mptcp_data_queue_ofo()
>> could hurt performance: queue_ofo is basically the fast path with
>> multiple streams, and we would likely need to acquire the fallback lock
>> to do it race-free.
>>
>> I'm tempted to just do a plain reset here. WDYT?
> 
> Can mptcp_dss_corruption() not be called before being in fully
> established mode? In this case, should we not fall back instead? E.g. if
> a middlebox starts to alter MPTCP options after the 3WHS?

Even if we were not yet in fully established status, the DSS
corruption caused by the packetdrill test causes MPTCP-level OoO with data
already acked at the TCP level. We can neither drop it nor safely
fall back: we have to reset if the OoO queue is not empty.

> Or maybe moving this RB_EMPTY_ROOT(&msk->out_of_order_queue) check to
> mptcp_try_fallback()? (mmh, we don't have the msk)
> 
> Or maybe just a plain reset is OK here?

I think the main question is whether to constrain such resets to the OoO
cases (attempting fallback in the other scenarios) or to always reset here.

I propose to keep it simple, minimize the difference from the current status
and always check OoO.

/P
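
The concern above (OoO data at the MPTCP level that cannot be dropped, so a
fallback is not safe) can be illustrated with a toy receiver. This is only a
sketch under loud assumptions: struct toy_rx and its helpers are hypothetical
and do not model the kernel data structures or the exact kernel behavior. It
assumes that, after a fallback, subflow bytes are taken as the next in-order
data, and shows that the hole in front of the parked OoO chunk is then
silently skipped.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy receiver, purely illustrative: names and layout are hypothetical. */
struct toy_rx {
	char   delivered[32]; /* bytes handed to the application, in order */
	size_t next_seq;      /* next in-order data sequence expected */
	size_t ooo_seq;       /* data sequence where the parked OoO chunk starts */
	size_t ooo_len;       /* length of the parked OoO chunk, 0 if none */
};

/* Number of bytes missing between the in-order point and the OoO chunk. */
size_t toy_rx_hole(const struct toy_rx *rx)
{
	if (rx->ooo_len == 0 || rx->ooo_seq <= rx->next_seq)
		return 0;
	return rx->ooo_seq - rx->next_seq;
}

/* Assumed post-fallback behavior: append subflow bytes as in-order data,
 * ignoring whatever is still parked in the OoO chunk. */
void toy_rx_after_fallback(struct toy_rx *rx, const char *buf, size_t len)
{
	/* Keep room for the terminating NUL in the toy buffer. */
	assert(rx->next_seq + len < sizeof(rx->delivered));
	memcpy(rx->delivered + rx->next_seq, buf, len);
	rx->next_seq += len;
	rx->delivered[rx->next_seq] = '\0';
}
```

With "AB" already delivered and an OoO chunk parked at sequence 4, the hole
is 2 bytes wide; appending "GH" after a fallback makes the application see
"ABGH", with the missing bytes and the parked chunk never delivered. A reset
makes that failure visible instead.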
Re: [PATCH v2 mptcp-net 1/2] mptcp: do not fallback when OoO is present
Posted by Matthieu Baerts 1 week, 6 days ago
On 13/11/2025 18:32, Paolo Abeni wrote:
> On 11/13/25 10:08 AM, Matthieu Baerts wrote:
>> On 13/11/2025 01:02, Paolo Abeni wrote:
>>> On 11/11/25 6:50 PM, Matthieu Baerts wrote:
>>>> On 11/11/2025 08:24, Paolo Abeni wrote:
>>>>> In case of DSS corruption, the MPTCP protocol tries to avoid the
>>>>> subflow reset if fallback is possible. Such corruptions happen in
>>>>> the receive path; to ensure fallback is possible the stack additionally
>>>>> needs to check for OoO data, otherwise the fallback will break the data
>>>>> stream.
>>>>
>>>> Thank you for the fix!
>>>>> Fixes: e32d262c89e2 ("mptcp: handle consistently DSS corruption")
>>>>> Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/598
>>>>> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
>>>>> ---
>>>>> Note: this does not avoid the WARN(), but fixes the inconsistent
>>>>> read() behavior; the ingress data is OoO, we should not ack it
>>>>> ---
>>>>>  net/mptcp/protocol.c | 3 ++-
>>>>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
>>>>> index d6b08e1de358..7b966f105f89 100644
>>>>> --- a/net/mptcp/protocol.c
>>>>> +++ b/net/mptcp/protocol.c
>>>>> @@ -646,7 +646,8 @@ static void mptcp_check_data_fin(struct sock *sk)
>>>>>  
>>>>>  static void mptcp_dss_corruption(struct mptcp_sock *msk, struct sock *ssk)
>>>>>  {
>>>>> -	if (!mptcp_try_fallback(ssk, MPTCP_MIB_DSSCORRUPTIONFALLBACK)) {
>>>>> +	if (!RB_EMPTY_ROOT(&msk->out_of_order_queue) ||
>>>>> +	    !mptcp_try_fallback(ssk, MPTCP_MIB_DSSCORRUPTIONFALLBACK)) {
>>>>
>>>> Does it mean we should check the OoO queue each time mptcp_try_fallback
>>>> is called?
>>>>
>>>> Should we not eventually set msk->allow_infinite_fallback to false in
>>>> mptcp_data_queue_ofo()?
>>>
>>> Good question. According to the RFC, here we should unconditionally reset
>>> the subflow. Historically, we have tried hard to fall back, also because in
>>> early releases fallback was a bit too easy to obtain.
>>>
>>> Setting msk->allow_infinite_fallback = false in mptcp_data_queue_ofo()
>>> could hurt performance: queue_ofo is basically the fast path with
>>> multiple streams, and we would likely need to acquire the fallback lock
>>> to do it race-free.
>>>
>>> I'm tempted to just do a plain reset here. WDYT?
>>
>> Can mptcp_dss_corruption() not be called before being in fully
>> established mode? In this case, should we not fall back instead? E.g. if
>> a middlebox starts to alter MPTCP options after the 3WHS?
> 
> Even if we were not yet in fully established status, the DSS
> corruption caused by the packetdrill test causes MPTCP-level OoO with data
> already acked at the TCP level. We can neither drop it nor safely
> fall back: we have to reset if the OoO queue is not empty.

Yes, OK to reset in this case.

>> Or maybe moving this RB_EMPTY_ROOT(&msk->out_of_order_queue) check to
>> mptcp_try_fallback()? (mmh, we don't have the msk)
>>
>> Or maybe just a plain reset is OK here?
>
> I think the main question is whether to constrain such resets to the OoO
> cases (attempting fallback in the other scenarios) or to always reset here.
>
> I propose to keep it simple, minimize the difference from the current status
> and always check OoO.

Just to be sure I understand this correctly: you prefer keeping the
patch as it is, and there is no need to move the OoO check to the
__mptcp_try_fallback() helper to check it in other cases?

Cheers,
Matt
-- 
Sponsored by the NGI0 Core fund.