When entering tcp_splice_read(), it calls lock_sock() on sk to
prevent other threads from acquiring sk, and releases it before
returning.

But in the while (tss.len) loop, it releases and re-locks sk, giving
other threads a small window in which to lock sk.

As a result, the release_sock()/lock_sock() pair in the while loop of
tcp_splice_read() may cause a race condition.
Fixes: 9c55e01c0cc8 ("[TCP]: Splice receive support.")
Signed-off-by: sunyiqi <sunyiqixm@gmail.com>
---
net/ipv4/tcp.c | 2 --
1 file changed, 2 deletions(-)
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index e03a342c9162..7a2ce0e2e5be 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -856,8 +856,6 @@ ssize_t tcp_splice_read(struct socket *sock, loff_t *ppos,
 
 		if (!tss.len || !timeo)
 			break;
-		release_sock(sk);
-		lock_sock(sk);
 
 		if (sk->sk_err || sk->sk_state == TCP_CLOSE ||
 		    (sk->sk_shutdown & RCV_SHUTDOWN) ||
--
2.34.1
On Thu, Aug 15, 2024 at 5:46 PM sunyiqi <sunyiqixm@gmail.com> wrote:
>
> When entering tcp_splice_read(), it calls lock_sock() on sk to
> prevent other threads from acquiring sk, and releases it before
> returning.
>
> But in the while (tss.len) loop, it releases and re-locks sk, giving
> other threads a small window in which to lock sk.
>
> As a result, the release_sock()/lock_sock() pair in the while loop of
> tcp_splice_read() may cause a race condition.
>
> Fixes: 9c55e01c0cc8 ("[TCP]: Splice receive support.")
> Signed-off-by: sunyiqi <sunyiqixm@gmail.com>
It's more of an optimization than a BUG, no?

I don't consider it a bug unless you can prove it... Let me ask:
what kind of race issue could the re-lock cause?

I think holding the socket lock for too long is not a good idea,
because releasing the lock gives others breathing room (the chance to
finish their own work). Please see release_sock().
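For reference, release_sock() (net/core/sock.c) looks roughly like
this, simplified:

void release_sock(struct sock *sk)
{
	spin_lock_bh(&sk->sk_lock.slock);
	if (sk->sk_backlog.tail)
		__release_sock(sk);	/* run backlogged skbs through sk_backlog_rcv() */

	if (sk->sk_prot->release_cb)
		sk->sk_prot->release_cb(sk);

	sock_release_ownership(sk);
	if (waitqueue_active(&sk->sk_lock.wq))
		wake_up(&sk->sk_lock.wq);	/* wake threads sleeping in lock_sock() */
	spin_unlock_bh(&sk->sk_lock.slock);
}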
Thanks,
Jason
> ---
> net/ipv4/tcp.c | 2 --
> 1 file changed, 2 deletions(-)
>
> diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
> index e03a342c9162..7a2ce0e2e5be 100644
> --- a/net/ipv4/tcp.c
> +++ b/net/ipv4/tcp.c
> @@ -856,8 +856,6 @@ ssize_t tcp_splice_read(struct socket *sock, loff_t *ppos,
>
>  		if (!tss.len || !timeo)
>  			break;
> -		release_sock(sk);
> -		lock_sock(sk);
>
>  		if (sk->sk_err || sk->sk_state == TCP_CLOSE ||
>  		    (sk->sk_shutdown & RCV_SHUTDOWN) ||
> --
> 2.34.1
>
>
On 8/15/24 10:43, sunyiqi wrote:
> When entering tcp_splice_read(), it calls lock_sock() on sk to
> prevent other threads from acquiring sk, and releases it before
> returning.
>
> But in the while (tss.len) loop, it releases and re-locks sk, giving
> other threads a small window in which to lock sk.
>
> As a result, the release_sock()/lock_sock() pair in the while loop of
> tcp_splice_read() may cause a race condition.
Which race condition, exactly? Do you have a backtrace?
>
> Fixes: 9c55e01c0cc8 ("[TCP]: Splice receive support.")
> Signed-off-by: sunyiqi <sunyiqixm@gmail.com>
> ---
> net/ipv4/tcp.c | 2 --
> 1 file changed, 2 deletions(-)
>
> diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
> index e03a342c9162..7a2ce0e2e5be 100644
> --- a/net/ipv4/tcp.c
> +++ b/net/ipv4/tcp.c
> @@ -856,8 +856,6 @@ ssize_t tcp_splice_read(struct socket *sock, loff_t *ppos,
>
>  		if (!tss.len || !timeo)
>  			break;
> -		release_sock(sk);
> -		lock_sock(sk);
This is needed to flush the sk backlog.
Somewhat related, I think we could replace the pair with sk_flush_backlog().
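For reference, sk_flush_backlog() (include/net/sock.h) processes the
backlog without giving up ownership of the socket lock, roughly:

static inline bool sk_flush_backlog(struct sock *sk)
{
	if (unlikely(READ_ONCE(sk->sk_backlog.tail))) {
		__sk_flush_backlog(sk);
		return true;
	}
	return false;
}

So, as an untested sketch, the loop could keep the lock and replace
the pair with a single call:

		if (!tss.len || !timeo)
			break;
		sk_flush_backlog(sk);	/* untested: flush backlog without bouncing the lock */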
Thanks,
Paolo
Hello Paolo,
On Thu, Aug 15, 2024 at 6:40 PM Paolo Abeni <pabeni@redhat.com> wrote:
[...]
> > -		release_sock(sk);
> > -		lock_sock(sk);
>
> This is needed to flush the sk backlog.
>
> Somewhat related, I think we could replace the pair with sk_flush_backlog().
>
Do you think we could handle this the way the following commit did:
commit d41a69f1d390fa3f2546498103cdcd78b30676ff
Author: Eric Dumazet <edumazet@google.com>
Date: Fri Apr 29 14:16:53 2016 -0700
tcp: make tcp_sendmsg() aware of socket backlog
Large sendmsg()/write() hold socket lock for the duration of the call,
unless sk->sk_sndbuf limit is hit. This is bad because incoming packets
are parked into socket backlog for a long time.
?
Then we can avoid holding the lock for too long, which leaves too many
packets parked in the backlog.
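If I remember the shape of that change correctly, tcp_sendmsg() now
flushes the backlog itself instead of bouncing the lock, along these
lines (a paraphrased excerpt, not the exact upstream hunk):

restart:
	mss_now = tcp_send_mss(sk, &size_goal, flags);
	...
	/* before allocating a new skb in the copy loop */
	if (process_backlog && sk_flush_backlog(sk)) {
		process_backlog = false;
		goto restart;	/* mss/size_goal may have changed while flushing */
	}
	...
	process_backlog = true;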
Thanks,
Jason
On 8/15/24 12:55, Jason Xing wrote:
> On Thu, Aug 15, 2024 at 6:40 PM Paolo Abeni <pabeni@redhat.com> wrote:
> [...]
>>> -		release_sock(sk);
>>> -		lock_sock(sk);
>>
>> This is needed to flush the sk backlog.
>>
>> Somewhat related, I think we could replace the pair with sk_flush_backlog().
>>
>
> Do you think we could handle this the way the following commit did:
>
> commit d41a69f1d390fa3f2546498103cdcd78b30676ff
> Author: Eric Dumazet <edumazet@google.com>
> Date:   Fri Apr 29 14:16:53 2016 -0700
>
>     tcp: make tcp_sendmsg() aware of socket backlog
>
>     Large sendmsg()/write() hold socket lock for the duration of the call,
>     unless sk->sk_sndbuf limit is hit. This is bad because incoming packets
>     are parked into socket backlog for a long time.
>
> ?

Yep. To be more accurate, I was looking at commit
93afcfd1db35882921b2521a637c78755c27b02c.

In any case, this should be unrelated to the supposed issue.

Cheers,

Paolo
On Thu, Aug 15, 2024 at 7:23 PM Paolo Abeni <pabeni@redhat.com> wrote:
>
> On 8/15/24 12:55, Jason Xing wrote:
> > On Thu, Aug 15, 2024 at 6:40 PM Paolo Abeni <pabeni@redhat.com> wrote:
> > [...]
> >>> -		release_sock(sk);
> >>> -		lock_sock(sk);
> >>
> >> This is needed to flush the sk backlog.
> >>
> >> Somewhat related, I think we could replace the pair with sk_flush_backlog().
> >>
> >
> > Do you think we could handle this the way the following commit did:
> >
> > commit d41a69f1d390fa3f2546498103cdcd78b30676ff
> > Author: Eric Dumazet <edumazet@google.com>
> > Date:   Fri Apr 29 14:16:53 2016 -0700
> >
> >     tcp: make tcp_sendmsg() aware of socket backlog
> >
> >     Large sendmsg()/write() hold socket lock for the duration of the call,
> >     unless sk->sk_sndbuf limit is hit. This is bad because incoming packets
> >     are parked into socket backlog for a long time.
> >
> > ?
>
> Yep. To be more accurate, I was looking at commit
> 93afcfd1db35882921b2521a637c78755c27b02c.

Thanks. It piques my interest. Now I do believe we can do such an
optimization in this function.

> In any case, this should be unrelated to the supposed issue.

Sure.

Thanks,
Jason