mptcp: really cope with fastopen race.

[PATCH mptcp-net] mptcp: really cope with fastopen race.

Posted by Paolo Abeni 6 months ago

Fastopen and PM-trigger subflow shutdown can race, as reported by
syzkaller.

In my first attempt to close such race, I missed the fact that
the subflow status can change again before the subflow_state_change
callback is invoked.

Address the issue additionally copying with all the states directly
reachable from TCP_FIN_WAIT1.

Fixes: 1e777f39b4d7 ("mptcp: add MSG_FASTOPEN sendmsg flag support")
Fixes: 4fd19a307016 ("mptcp: fix inconsistent state on fastopen race")
Reported-by: syzbot+c53d4d3ddb327e80bc51@syzkaller.appspotmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
 net/mptcp/protocol.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
index 4a32d3d11fb6..de04b97e8dd1 100644
--- a/net/mptcp/protocol.h
+++ b/net/mptcp/protocol.h
@@ -1171,7 +1171,8 @@ static inline bool subflow_simultaneous_connect(struct sock *sk)
 {
 	struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(sk);
 
-	return (1 << sk->sk_state) & (TCPF_ESTABLISHED | TCPF_FIN_WAIT1) &&
+	return (1 << sk->sk_state) &
+	       (TCPF_ESTABLISHED | TCPF_FIN_WAIT1 | TCPF_FIN_WAIT2 | TCPF_CLOSING) &&
 	       is_active_ssk(subflow) &&
 	       !subflow->conn_finished;
 }
-- 
2.43.0

Re: [PATCH mptcp-net] mptcp: really cope with fastopen race.

Posted by Paolo Abeni 5 months, 3 weeks ago

On Wed, 2024-01-24 at 22:36 +0100, Paolo Abeni wrote:
> Fastopen and PM-trigger subflow shutdown can race, as reported by
> syzkaller.
> 
> In my first attempt to close such race, I missed the fact that
> the subflow status can change again before the subflow_state_change
> callback is invoked.
> 
> Address the issue additionally copying with all the states directly
> reachable from TCP_FIN_WAIT1.
> 
> Fixes: 1e777f39b4d7 ("mptcp: add MSG_FASTOPEN sendmsg flag support")
> Fixes: 4fd19a307016 ("mptcp: fix inconsistent state on fastopen race")
> Reported-by: syzbot+c53d4d3ddb327e80bc51@syzkaller.appspotmail.com

Hopefully it's not too late...

The above should be:

Reported-by: syzbot+732ab7be796ec0d104ac@syzkaller.appspotmail.com

@Mat(s), could you please update the path directly in the export
branch?

Thanks!

Paolo

Re: [PATCH mptcp-net] mptcp: really cope with fastopen race.

Posted by Matthieu Baerts 5 months, 4 weeks ago

Hi Paolo, Mat,

On 24/01/2024 22:36, Paolo Abeni wrote:
> Fastopen and PM-trigger subflow shutdown can race, as reported by
> syzkaller.
> 
> In my first attempt to close such race, I missed the fact that
> the subflow status can change again before the subflow_state_change
> callback is invoked.
> 
> Address the issue additionally copying with all the states directly
> reachable from TCP_FIN_WAIT1.

Thank you for the patch and the review!

Now in our tree (fix for -net) without the '.' at the end of the commit
title and a Closes tag ;)

New patches for t/upstream-net and t/upstream:
- d59403d0c91b: mptcp: really cope with fastopen race
- Results: ca82f8ccb3cc..6d56aa24e733 (export-net)
- Results: 4b8f1ec22243..cdd4fe00f4c5 (export)

Tests are now in progress:

https://cirrus-ci.com/github/multipath-tcp/mptcp_net-next/export-net/20240130T120521
https://cirrus-ci.com/github/multipath-tcp/mptcp_net-next/export/20240130T120521

Cheers,
Matt
-- 
Sponsored by the NGI0 Core fund.

Re: [PATCH mptcp-net] mptcp: really cope with fastopen race.

Posted by Mat Martineau 6 months ago

On Wed, 24 Jan 2024, Paolo Abeni wrote:

> Fastopen and PM-trigger subflow shutdown can race, as reported by
> syzkaller.
>
> In my first attempt to close such race, I missed the fact that
> the subflow status can change again before the subflow_state_change
> callback is invoked.
>
> Address the issue additionally copying with all the states directly
> reachable from TCP_FIN_WAIT1.
>
> Fixes: 1e777f39b4d7 ("mptcp: add MSG_FASTOPEN sendmsg flag support")
> Fixes: 4fd19a307016 ("mptcp: fix inconsistent state on fastopen race")
> Reported-by: syzbot+c53d4d3ddb327e80bc51@syzkaller.appspotmail.com
> Signed-off-by: Paolo Abeni <pabeni@redhat.com>

I wasn't able to reproduce the CI failure, and the change looks good to 
me. We'll see if CI is happy when this is applied to export-net.

Reviewed-by: Mat Martineau <martineau@kernel.org>

> ---
> net/mptcp/protocol.h | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
> index 4a32d3d11fb6..de04b97e8dd1 100644
> --- a/net/mptcp/protocol.h
> +++ b/net/mptcp/protocol.h
> @@ -1171,7 +1171,8 @@ static inline bool subflow_simultaneous_connect(struct sock *sk)
> {
> 	struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(sk);
>
> -	return (1 << sk->sk_state) & (TCPF_ESTABLISHED | TCPF_FIN_WAIT1) &&
> +	return (1 << sk->sk_state) &
> +	       (TCPF_ESTABLISHED | TCPF_FIN_WAIT1 | TCPF_FIN_WAIT2 | TCPF_CLOSING) &&
> 	       is_active_ssk(subflow) &&
> 	       !subflow->conn_finished;
> }
> -- 
> 2.43.0
>
>
>

Re: [PATCH mptcp-net] mptcp: really cope with fastopen race.

Posted by Paolo Abeni 5 months, 4 weeks ago

On Fri, 2024-01-26 at 17:18 -0800, Mat Martineau wrote:
> On Wed, 24 Jan 2024, Paolo Abeni wrote:
> 
> > Fastopen and PM-trigger subflow shutdown can race, as reported by
> > syzkaller.
> > 
> > In my first attempt to close such race, I missed the fact that
> > the subflow status can change again before the subflow_state_change
> > callback is invoked.
> > 
> > Address the issue additionally copying with all the states directly
> > reachable from TCP_FIN_WAIT1.
> > 
> > Fixes: 1e777f39b4d7 ("mptcp: add MSG_FASTOPEN sendmsg flag support")
> > Fixes: 4fd19a307016 ("mptcp: fix inconsistent state on fastopen race")
> > Reported-by: syzbot+c53d4d3ddb327e80bc51@syzkaller.appspotmail.com
> > Signed-off-by: Paolo Abeni <pabeni@redhat.com>
> 
> I wasn't able to reproduce the CI failure, and the change looks good to 
> me. We'll see if CI is happy when this is applied to export-net.

I used the repro provided by syzbot upstream:

https://lore.kernel.org/netdev/000000000000e8ecae060fae7a47@google.com/
https://syzkaller.appspot.com/x/repro.c?x=17abc23be80000

with my default debug config, and here it reproduced the issue reliably
on the unpatched kernel (and no splat with the patched one).

> 
> Reviewed-by: Mat Martineau <martineau@kernel.org>

Thanks,

Paolo

Re: [PATCH mptcp-net] mptcp: really cope with fastopen race.

Posted by Mat Martineau 5 months, 4 weeks ago

On Mon, 29 Jan 2024, Paolo Abeni wrote:

> On Fri, 2024-01-26 at 17:18 -0800, Mat Martineau wrote:
>> On Wed, 24 Jan 2024, Paolo Abeni wrote:
>>
>>> Fastopen and PM-trigger subflow shutdown can race, as reported by
>>> syzkaller.
>>>
>>> In my first attempt to close such race, I missed the fact that
>>> the subflow status can change again before the subflow_state_change
>>> callback is invoked.
>>>
>>> Address the issue additionally copying with all the states directly
>>> reachable from TCP_FIN_WAIT1.
>>>
>>> Fixes: 1e777f39b4d7 ("mptcp: add MSG_FASTOPEN sendmsg flag support")
>>> Fixes: 4fd19a307016 ("mptcp: fix inconsistent state on fastopen race")
>>> Reported-by: syzbot+c53d4d3ddb327e80bc51@syzkaller.appspotmail.com
>>> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
>>
>> I wasn't able to reproduce the CI failure, and the change looks good to
>> me. We'll see if CI is happy when this is applied to export-net.
>
> I used the repro provided by syzbot upstream:
>
> https://lore.kernel.org/netdev/000000000000e8ecae060fae7a47@google.com/
> https://syzkaller.appspot.com/x/repro.c?x=17abc23be80000
>
> with my default debug config, and here it reproduced the issue reliably
> on the unpatched kernel (and no splat with the patched one).
>

I was not very specific in my wording: I was referring to the CI failures 
when it applied this patch to export-net:

https://lore.kernel.org/mptcp/482549332bcc96a8b4dd37292cb289b271b49511.camel@redhat.com/T/#mfb81b40b8a42912362eede622b5a15b17313d2a2

...not the failure reported by syzbot that this patch is intended to fix.

I did run export-net plus this patch through the 
mptcp-upstream-virtme-docker environment and it worked fine, so I think 
it's ok to apply.

- Mat

Re: [PATCH mptcp-net] mptcp: really cope with fastopen race.

Posted by Paolo Abeni 5 months, 4 weeks ago

On Mon, 2024-01-29 at 11:16 -0800, Mat Martineau wrote:
> On Mon, 29 Jan 2024, Paolo Abeni wrote:
> 
> > On Fri, 2024-01-26 at 17:18 -0800, Mat Martineau wrote:
> > > On Wed, 24 Jan 2024, Paolo Abeni wrote:
> > > 
> > > > Fastopen and PM-trigger subflow shutdown can race, as reported by
> > > > syzkaller.
> > > > 
> > > > In my first attempt to close such race, I missed the fact that
> > > > the subflow status can change again before the subflow_state_change
> > > > callback is invoked.
> > > > 
> > > > Address the issue additionally copying with all the states directly
> > > > reachable from TCP_FIN_WAIT1.
> > > > 
> > > > Fixes: 1e777f39b4d7 ("mptcp: add MSG_FASTOPEN sendmsg flag support")
> > > > Fixes: 4fd19a307016 ("mptcp: fix inconsistent state on fastopen race")
> > > > Reported-by: syzbot+c53d4d3ddb327e80bc51@syzkaller.appspotmail.com
> > > > Signed-off-by: Paolo Abeni <pabeni@redhat.com>
> > > 
> > > I wasn't able to reproduce the CI failure, and the change looks good to
> > > me. We'll see if CI is happy when this is applied to export-net.
> > 
> > I used the repro provided by syzbot upstream:
> > 
> > https://lore.kernel.org/netdev/000000000000e8ecae060fae7a47@google.com/
> > https://syzkaller.appspot.com/x/repro.c?x=17abc23be80000
> > 
> > with my default debug config, and here it reproduced the issue reliably
> > on the unpatched kernel (and no splat with the patched one).
> > 
> 
> I was not very specific in my wording: I was referring to the CI failures 
> when it applied this patch to export-net:
> 
> https://lore.kernel.org/mptcp/482549332bcc96a8b4dd37292cb289b271b49511.camel@redhat.com/T/#mfb81b40b8a42912362eede622b5a15b17313d2a2

Ah, that one!

I think it's caused by:

commit 198bc90e0e734e5f98c3d2833e8390cac3df61b2
Author: Zhengchao Shao <shaozhengchao@huawei.com>
Date:   Thu Jan 18 09:20:19 2024 +0800

    tcp: make sure init the accept_queue's spinlocks once

and fixed by:

commit 435e202d645c197dcfd39d7372eb2a56529b6640 (net68/tmp_png)
Author: Zhengchao Shao <shaozhengchao@huawei.com>
Date:   Mon Jan 22 18:20:01 2024 +0800

    ipv6: init the accept_queue's spinlocks in inet6_create

Cheers,

Paolo