When entering tcp_splice_read(), it calls lock_sock() on sk to
prevent other threads from acquiring sk, and releases it before
returning.

But in the while (tss.len) loop, it releases and re-locks sk, giving
other threads a small window in which to lock sk.

As a result, the release_sock()/lock_sock() pair in the while loop of
tcp_splice_read() may cause a race condition.
Fixes: 9c55e01c0cc8 ("[TCP]: Splice receive support.")
Signed-off-by: sunyiqi <sunyiqixm@gmail.com>
---
net/ipv4/tcp.c | 2 --
1 file changed, 2 deletions(-)
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index e03a342c9162..7a2ce0e2e5be 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -856,8 +856,6 @@ ssize_t tcp_splice_read(struct socket *sock, loff_t *ppos,
 
 		if (!tss.len || !timeo)
 			break;
-		release_sock(sk);
-		lock_sock(sk);
 
 		if (sk->sk_err || sk->sk_state == TCP_CLOSE ||
 		    (sk->sk_shutdown & RCV_SHUTDOWN) ||
--
2.34.1
On Thu, Aug 15, 2024 at 5:46 PM sunyiqi <sunyiqixm@gmail.com> wrote:
>
> When entering tcp_splice_read(), it calls lock_sock() on sk to
> prevent other threads from acquiring sk, and releases it before
> returning.
>
> But in the while (tss.len) loop, it releases and re-locks sk, giving
> other threads a small window in which to lock sk.
>
> As a result, the release_sock()/lock_sock() pair in the while loop of
> tcp_splice_read() may cause a race condition.
>
> Fixes: 9c55e01c0cc8 ("[TCP]: Splice receive support.")
> Signed-off-by: sunyiqi <sunyiqixm@gmail.com>
It's more of an optimization than a BUG, no?

I don't consider it a bug unless you can prove it... Let me ask:
what kind of race issue could the re-lock cause?

I think holding the socket lock for too long is not a good idea,
because releasing the lock gives others breathing room (the chance to
finish their own work). Please see release_sock().
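For reference, release_sock() (net/core/sock.c) looks roughly like
this, simplified:

void release_sock(struct sock *sk)
{
	spin_lock_bh(&sk->sk_lock.slock);
	if (sk->sk_backlog.tail)
		__release_sock(sk);	/* run backlogged skbs through sk_backlog_rcv() */

	if (sk->sk_prot->release_cb)
		sk->sk_prot->release_cb(sk);

	sock_release_ownership(sk);
	if (waitqueue_active(&sk->sk_lock.wq))
		wake_up(&sk->sk_lock.wq);	/* wake threads sleeping in lock_sock() */
	spin_unlock_bh(&sk->sk_lock.slock);
}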
Thanks,
Jason
> ---
> net/ipv4/tcp.c | 2 --
> 1 file changed, 2 deletions(-)
>
> diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
> index e03a342c9162..7a2ce0e2e5be 100644
> --- a/net/ipv4/tcp.c
> +++ b/net/ipv4/tcp.c
> @@ -856,8 +856,6 @@ ssize_t tcp_splice_read(struct socket *sock, loff_t *ppos,
>
>  		if (!tss.len || !timeo)
>  			break;
> -		release_sock(sk);
> -		lock_sock(sk);
>
>  		if (sk->sk_err || sk->sk_state == TCP_CLOSE ||
>  		    (sk->sk_shutdown & RCV_SHUTDOWN) ||
> --
> 2.34.1
>
>
On 8/15/24 10:43, sunyiqi wrote:
> When entering tcp_splice_read(), it calls lock_sock() on sk to
> prevent other threads from acquiring sk, and releases it before
> returning.
>
> But in the while (tss.len) loop, it releases and re-locks sk, giving
> other threads a small window in which to lock sk.
>
> As a result, the release_sock()/lock_sock() pair in the while loop of
> tcp_splice_read() may cause a race condition.
Which race condition, exactly? Do you have a backtrace?
>
> Fixes: 9c55e01c0cc8 ("[TCP]: Splice receive support.")
> Signed-off-by: sunyiqi <sunyiqixm@gmail.com>
> ---
> net/ipv4/tcp.c | 2 --
> 1 file changed, 2 deletions(-)
>
> diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
> index e03a342c9162..7a2ce0e2e5be 100644
> --- a/net/ipv4/tcp.c
> +++ b/net/ipv4/tcp.c
> @@ -856,8 +856,6 @@ ssize_t tcp_splice_read(struct socket *sock, loff_t *ppos,
>
>  		if (!tss.len || !timeo)
>  			break;
> -		release_sock(sk);
> -		lock_sock(sk);
This is needed to flush the sk backlog.
Somewhat related, I think we could replace the pair with sk_flush_backlog().
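For reference, sk_flush_backlog() (include/net/sock.h) processes the
backlog without giving up ownership of the socket lock, roughly:

static inline bool sk_flush_backlog(struct sock *sk)
{
	if (unlikely(READ_ONCE(sk->sk_backlog.tail))) {
		__sk_flush_backlog(sk);
		return true;
	}
	return false;
}

So, as an untested sketch, the loop could keep the lock and replace
the pair with a single call:

		if (!tss.len || !timeo)
			break;
		sk_flush_backlog(sk);	/* untested: flush backlog without bouncing the lock */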
Thanks,
Paolo
Hello Paolo,
On Thu, Aug 15, 2024 at 6:40 PM Paolo Abeni <pabeni@redhat.com> wrote:
[...]
> > -		release_sock(sk);
> > -		lock_sock(sk);
>
> This is needed to flush the sk backlog.
>
> Somewhat related, I think we could replace the pair with sk_flush_backlog().
>
Do you think we could handle this the way the following commit did:
commit d41a69f1d390fa3f2546498103cdcd78b30676ff
Author: Eric Dumazet <edumazet@google.com>
Date: Fri Apr 29 14:16:53 2016 -0700
tcp: make tcp_sendmsg() aware of socket backlog
Large sendmsg()/write() hold socket lock for the duration of the call,
unless sk->sk_sndbuf limit is hit. This is bad because incoming packets
are parked into socket backlog for a long time.
?
Then we can avoid holding the lock for too long, which leaves too many
packets parked in the backlog.
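If I remember the shape of that change correctly, tcp_sendmsg() now
flushes the backlog itself instead of bouncing the lock, along these
lines (a paraphrased excerpt, not the exact upstream hunk):

restart:
	mss_now = tcp_send_mss(sk, &size_goal, flags);
	...
	/* before allocating a new skb in the copy loop */
	if (process_backlog && sk_flush_backlog(sk)) {
		process_backlog = false;
		goto restart;	/* mss/size_goal may have changed while flushing */
	}
	...
	process_backlog = true;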
Thanks,
Jason
On 8/15/24 12:55, Jason Xing wrote:
> On Thu, Aug 15, 2024 at 6:40 PM Paolo Abeni <pabeni@redhat.com> wrote:
> [...]
>>> -		release_sock(sk);
>>> -		lock_sock(sk);
>>
>> This is needed to flush the sk backlog.
>>
>> Somewhat related, I think we could replace the pair with sk_flush_backlog().
>>
>
> Do you think we could handle this the way the following commit did:
>
> commit d41a69f1d390fa3f2546498103cdcd78b30676ff
> Author: Eric Dumazet <edumazet@google.com>
> Date:   Fri Apr 29 14:16:53 2016 -0700
>
>     tcp: make tcp_sendmsg() aware of socket backlog
>
>     Large sendmsg()/write() hold socket lock for the duration of the call,
>     unless sk->sk_sndbuf limit is hit. This is bad because incoming packets
>     are parked into socket backlog for a long time.
>
> ?

Yep. To be more accurate, I was looking at commit
93afcfd1db35882921b2521a637c78755c27b02c.

In any case, this should be unrelated to the supposed issue.

Cheers,

Paolo
On Thu, Aug 15, 2024 at 7:23 PM Paolo Abeni <pabeni@redhat.com> wrote:
>
> On 8/15/24 12:55, Jason Xing wrote:
> > On Thu, Aug 15, 2024 at 6:40 PM Paolo Abeni <pabeni@redhat.com> wrote:
> > [...]
> >>> -		release_sock(sk);
> >>> -		lock_sock(sk);
> >>
> >> This is needed to flush the sk backlog.
> >>
> >> Somewhat related, I think we could replace the pair with sk_flush_backlog().
> >>
> >
> > Do you think we could handle this the way the following commit did:
> >
> > commit d41a69f1d390fa3f2546498103cdcd78b30676ff
> > Author: Eric Dumazet <edumazet@google.com>
> > Date:   Fri Apr 29 14:16:53 2016 -0700
> >
> >     tcp: make tcp_sendmsg() aware of socket backlog
> >
> >     Large sendmsg()/write() hold socket lock for the duration of the call,
> >     unless sk->sk_sndbuf limit is hit. This is bad because incoming packets
> >     are parked into socket backlog for a long time.
> >
> > ?
>
> Yep. To be more accurate, I was looking at commit
> 93afcfd1db35882921b2521a637c78755c27b02c.

Thanks. It piques my interest. Now I do believe we can do such an
optimization in this function.

> In any case, this should be unrelated to the supposed issue.

Sure.

Thanks,
Jason