[v4] mptcp: fix fallback-related races

[PATCH v4 mptcp-net 0/5] mptcp: fix fallback-related races

Posted by Paolo Abeni 7 months ago

This series contains 3 fixes somewhat related to various races we have
while handling fallback and 2 small follow-up likely more suited for
net-next.

The root cause of the issues addressed here is that the check for
"we can fallback to tcp now" and the related action are not atomic. That
also applies to fallback due to MP_FAIL - where the window race is even
wider.

Address the issue introducing an additional spinlock to bundle together
all the relevant events, as per patch 1 and 2.

Note that mptcp_disconnect() unconditionally
clears the fallback status (zeroing msk->flags) but don't tuch the
`allows_infinite_fallback` flag. Such issue is addressed in patch 3.

Patch 4 cleans up a bit the fallback code, introducing specific MIB for
each FB reason, and patch 5 drops the, hopefully now redundant
pr_fallback().
---
v3 -> v4:
 - fic MIB exposed names

v2 -> v3:
 - mptcp_do_fallback  -> mptcp_try_fallback
 - refactored patch 3/5
 - changed mibs names, increment fail only when the protocol mandate it
 - fix W=1 warn in patch 5/5

Paolo Abeni (5):
  mptcp: make fallback action and fallback decision atomic
  mptcp: plug races between subflow fail and subflow creation
  mptcp: reset fallback status gracefully at disconnect() time
  mptcp: track fallbacks accurately via mibs
  mptcp: remove pr_fallback()

 net/mptcp/ctrl.c     |  4 +-
 net/mptcp/mib.c      |  5 +++
 net/mptcp/mib.h      |  7 ++++
 net/mptcp/options.c  |  6 ++-
 net/mptcp/pm.c       |  8 +++-
 net/mptcp/protocol.c | 97 ++++++++++++++++++++++++++++++++++----------
 net/mptcp/protocol.h | 35 ++++++++--------
 net/mptcp/subflow.c  | 40 ++++++++++--------
 8 files changed, 140 insertions(+), 62 deletions(-)

-- 
2.50.0

Re: [PATCH v4 mptcp-net 0/5] mptcp: fix fallback-related races

Posted by Matthieu Baerts 7 months ago

Hi Paolo,

On 11/07/2025 11:29, Paolo Abeni wrote:
> This series contains 3 fixes somewhat related to various races we have
> while handling fallback and 2 small follow-up likely more suited for
> net-next.
> 
> The root cause of the issues addressed here is that the check for
> "we can fallback to tcp now" and the related action are not atomic. That
> also applies to fallback due to MP_FAIL - where the window race is even
> wider.
> 
> Address the issue introducing an additional spinlock to bundle together
> all the relevant events, as per patch 1 and 2.
> 
> Note that mptcp_disconnect() unconditionally
> clears the fallback status (zeroing msk->flags) but don't tuch the
> `allows_infinite_fallback` flag. Such issue is addressed in patch 3.
> 
> Patch 4 cleans up a bit the fallback code, introducing specific MIB for
> each FB reason, and patch 5 drops the, hopefully now redundant
> pr_fallback().
> ---
> v3 -> v4:
>  - fic MIB exposed names

Thank you for the new version, it looks good to me!

Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>

I can apply this series, restart syzkaller on top of it, and send
patches 1-3 on Monday if everything is OK. WDYT?

Cheers,
Matt
-- 
Sponsored by the NGI0 Core fund.

Re: [PATCH v4 mptcp-net 0/5] mptcp: fix fallback-related races

Posted by Matthieu Baerts 7 months ago

On 11/07/2025 11:36, Matthieu Baerts wrote:
> Hi Paolo,
> 
> On 11/07/2025 11:29, Paolo Abeni wrote:
>> This series contains 3 fixes somewhat related to various races we have
>> while handling fallback and 2 small follow-up likely more suited for
>> net-next.
>>
>> The root cause of the issues addressed here is that the check for
>> "we can fallback to tcp now" and the related action are not atomic. That
>> also applies to fallback due to MP_FAIL - where the window race is even
>> wider.
>>
>> Address the issue introducing an additional spinlock to bundle together
>> all the relevant events, as per patch 1 and 2.
>>
>> Note that mptcp_disconnect() unconditionally
>> clears the fallback status (zeroing msk->flags) but don't tuch the
>> `allows_infinite_fallback` flag. Such issue is addressed in patch 3.
>>
>> Patch 4 cleans up a bit the fallback code, introducing specific MIB for
>> each FB reason, and patch 5 drops the, hopefully now redundant
>> pr_fallback().
>> ---
>> v3 -> v4:
>>  - fic MIB exposed names
> 
> Thank you for the new version, it looks good to me!
> 
> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
> 
> I can apply this series, restart syzkaller on top of it, and send
> patches 1-3 on Monday if everything is OK. WDYT?

I just applied them in our tree, I will now restart syzkaller. On
Monday, I will send them upstream if there are no objections.

New patches for t/upstream-net and t/upstream:
- 158a4c639b9b: mptcp: make fallback action and fallback decision atomic
- afbefa78d049: mptcp: plug races between subflow fail and subflow creation
- a741374bcb64: mptcp: reset fallback status gracefully at disconnect() time
- Results: 46fddb9ffaee..67147340a318 (export-net)
- Results: ac3befbcaa5b..65264ae05855 (export)

New patches for t/upstream:
- 54d5646bfb71: mptcp: track fallbacks accurately via mibs
- 58f3909466f6: mptcp: remove pr_fallback()
- Results: 65264ae05855..712a5aad742a (export)

Tests are now in progress:

- export-net:
https://github.com/multipath-tcp/mptcp_net-next/commit/f5d6eefb3351a27f96c3d62119772169e4be6552/checks
- export:
https://github.com/multipath-tcp/mptcp_net-next/commit/455f606172fae38e46b6bc4b5aa6ee1e5564f0ad/checks

Cheers,
Matt
-- 
Sponsored by the NGI0 Core fund.

Re: [PATCH v4 mptcp-net 0/5] mptcp: fix fallback-related races

Posted by Paolo Abeni 7 months ago

On 7/11/25 11:36 AM, Matthieu Baerts wrote:
> On 11/07/2025 11:29, Paolo Abeni wrote:
>> This series contains 3 fixes somewhat related to various races we have
>> while handling fallback and 2 small follow-up likely more suited for
>> net-next.
>>
>> The root cause of the issues addressed here is that the check for
>> "we can fallback to tcp now" and the related action are not atomic. That
>> also applies to fallback due to MP_FAIL - where the window race is even
>> wider.
>>
>> Address the issue introducing an additional spinlock to bundle together
>> all the relevant events, as per patch 1 and 2.
>>
>> Note that mptcp_disconnect() unconditionally
>> clears the fallback status (zeroing msk->flags) but don't tuch the
>> `allows_infinite_fallback` flag. Such issue is addressed in patch 3.
>>
>> Patch 4 cleans up a bit the fallback code, introducing specific MIB for
>> each FB reason, and patch 5 drops the, hopefully now redundant
>> pr_fallback().
>> ---
>> v3 -> v4:
>>  - fic MIB exposed names
> 
> Thank you for the new version, it looks good to me!
> 
> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>

Thank you!

> I can apply this series, restart syzkaller on top of it, and send
> patches 1-3 on Monday if everything is OK. WDYT?

I would wait a little more for syzkaller. The underlying issue is not
very recent and the fix is not that invasive, I guess it's not a tragedy
if it reaches Linus' tree the following week.

But no strong opinion either ways.

/P

Re: [PATCH v4 mptcp-net 0/5] mptcp: fix fallback-related races

Posted by Matthieu Baerts 7 months ago

Hi Paolo,

On 11/07/2025 15:51, Paolo Abeni wrote:
> On 7/11/25 11:36 AM, Matthieu Baerts wrote:
>> On 11/07/2025 11:29, Paolo Abeni wrote:
>>> This series contains 3 fixes somewhat related to various races we have
>>> while handling fallback and 2 small follow-up likely more suited for
>>> net-next.
>>>
>>> The root cause of the issues addressed here is that the check for
>>> "we can fallback to tcp now" and the related action are not atomic. That
>>> also applies to fallback due to MP_FAIL - where the window race is even
>>> wider.
>>>
>>> Address the issue introducing an additional spinlock to bundle together
>>> all the relevant events, as per patch 1 and 2.
>>>
>>> Note that mptcp_disconnect() unconditionally
>>> clears the fallback status (zeroing msk->flags) but don't tuch the
>>> `allows_infinite_fallback` flag. Such issue is addressed in patch 3.
>>>
>>> Patch 4 cleans up a bit the fallback code, introducing specific MIB for
>>> each FB reason, and patch 5 drops the, hopefully now redundant
>>> pr_fallback().
>>> ---
>>> v3 -> v4:
>>>  - fic MIB exposed names
>>
>> Thank you for the new version, it looks good to me!
>>
>> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
> 
> Thank you!
> 
>> I can apply this series, restart syzkaller on top of it, and send
>> patches 1-3 on Monday if everything is OK. WDYT?
> 
> I would wait a little more for syzkaller. The underlying issue is not
> very recent and the fix is not that invasive, I guess it's not a tragedy
> if it reaches Linus' tree the following week.
> 
> But no strong opinion either ways.

Sure, I can wait, but it was just to be able to flush the new MIB
counters in next before the new release :)

Note that syzkaller was validating your previous versions since Monday,
not only from today then :)

Cheers,
Matt
-- 
Sponsored by the NGI0 Core fund.