net/ipv4/tcp_bpf.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-)
The tcp_bpf_recvmsg_parser() function, running in user context,
retrieves seq_copied from tcp_sk without holding the socket lock, and
stores it in a local variable seq. However, the softirq context can
modify tcp_sk->seq_copied concurrently, for example, n tcp_read_sock().
As a result, the seq value is stale when it is assigned back to
tcp_sk->copied_seq at the end of tcp_bpf_recvmsg_parser(), leading to
incorrect behavior.
Signed-off-by: mrpre <mrpre@163.com>
---
net/ipv4/tcp_bpf.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/net/ipv4/tcp_bpf.c b/net/ipv4/tcp_bpf.c
index e7658c5d6b79..7b44d4ece8b2 100644
--- a/net/ipv4/tcp_bpf.c
+++ b/net/ipv4/tcp_bpf.c
@@ -221,9 +221,9 @@ static int tcp_bpf_recvmsg_parser(struct sock *sk,
int flags,
int *addr_len)
{
- struct tcp_sock *tcp = tcp_sk(sk);
+ struct tcp_sock *tcp;
+ u32 seq;
int peek = flags & MSG_PEEK;
- u32 seq = tcp->copied_seq;
struct sk_psock *psock;
int copied = 0;
@@ -238,7 +238,8 @@ static int tcp_bpf_recvmsg_parser(struct sock *sk,
return tcp_recvmsg(sk, msg, len, flags, addr_len);
lock_sock(sk);
-
+ tcp = tcp_sk(sk);
+ seq = tcp->copied_seq;
/* We may have received data on the sk_receive_queue pre-accept and
* then we can not use read_skb in this context because we haven't
* assigned a sk_socket yet so have no link to the ops. The work-around
--
2.43.5
On Mon, Oct 21, 2024 at 09:37:05AM +0800, mrpre wrote: > The tcp_bpf_recvmsg_parser() function, running in user context, > retrieves seq_copied from tcp_sk without holding the socket lock, and > stores it in a local variable seq. However, the softirq context can > modify tcp_sk->seq_copied concurrently, for example, n tcp_read_sock(). > > As a result, the seq value is stale when it is assigned back to > tcp_sk->copied_seq at the end of tcp_bpf_recvmsg_parser(), leading to > incorrect behavior. Good catch! This makes sense to me. Mind to be more specific on the "incorrect behavior" here? What error or misbehavior did you see? > > Signed-off-by: mrpre <mrpre@163.com> Please use your real name for SoB, see https://docs.kernel.org/process/submitting-patches.html Thanks.
The tcp_bpf_recvmsg_parser() function, running in user context,
retrieves seq_copied from tcp_sk without holding the socket lock, and
stores it in a local variable seq. However, the softirq context can
modify tcp_sk->seq_copied concurrently, for example, n tcp_read_sock().
As a result, the seq value is stale when it is assigned back to
tcp_sk->copied_seq at the end of tcp_bpf_recvmsg_parser(), leading to
incorrect behavior.
Due to concurrency, the copied_seq field in tcp_bpf_recvmsg_parser()
might be set to an incorrect value (less than the actual copied_seq) at
the end of function: 'WRITE_ONCE(tcp->copied_seq, seq)'. This causes the
'offset' to be negative in tcp_read_sock()->tcp_recv_skb() when
processing new incoming packets (sk->copied_seq - skb->seq becomes less
than 0), and all subsequent packets will be dropped.
Signed-off-by: Jiayuan Chen <mrpre@163.com>
---
V1 -> V2: add more commit message to describle the issue
---
net/ipv4/tcp_bpf.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/net/ipv4/tcp_bpf.c b/net/ipv4/tcp_bpf.c
index e7658c5d6b79..7b44d4ece8b2 100644
--- a/net/ipv4/tcp_bpf.c
+++ b/net/ipv4/tcp_bpf.c
@@ -221,9 +221,9 @@ static int tcp_bpf_recvmsg_parser(struct sock *sk,
int flags,
int *addr_len)
{
- struct tcp_sock *tcp = tcp_sk(sk);
+ struct tcp_sock *tcp;
+ u32 seq;
int peek = flags & MSG_PEEK;
- u32 seq = tcp->copied_seq;
struct sk_psock *psock;
int copied = 0;
@@ -238,7 +238,8 @@ static int tcp_bpf_recvmsg_parser(struct sock *sk,
return tcp_recvmsg(sk, msg, len, flags, addr_len);
lock_sock(sk);
-
+ tcp = tcp_sk(sk);
+ seq = tcp->copied_seq;
/* We may have received data on the sk_receive_queue pre-accept and
* then we can not use read_skb in this context because we haven't
* assigned a sk_socket yet so have no link to the ops. The work-around
--
2.43.5
© 2016 - 2024 Red Hat, Inc.