From nobody Mon May 25 06:41:57 2026
Received: from passt.top (passt.top [88.198.0.164])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id 246553D1CB1;
	Sun, 17 May 2026 18:48:15 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=88.198.0.164
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116;
	t=1779043699; cv=none;
	b=n23bXVlJWdt8EH26eC9aloMzZKgSNu9aa1vFyMy47UzWqI2NRPiKrYkRsJG/ogqYSh5nXMPlZX1oXGCe6FaqighJLvvqNSEY4l+xkruCSyUFqtb1QXf6KnTJ9om38m5ZIHhE/bhb/mTXwBIwv33Q0ZjYvk0vAJRjneHKVgU0V+A=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org;
	s=arc-20240116; t=1779043699; c=relaxed/simple;
	bh=fI26Y5U4Ia4oGDKWxYsJZYTJ4HXI5HkCkxk3KlTwKik=;
	h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References:
	 MIME-Version;
	b=CIvL9wdLFWe/y4k5nfP5tHRW96d7Rmzpu/EabRgfKARNBtgdH4cET+e12ekentB5ZyoheudykUWphPh2W9+ChYgDflFb/CdDqPRXDZk+LBsfiMUmj1+Uoo2eXCWW91++JEbj6KErUU9Crxrrm/s+7oWWWHZCpdrCRBU4Q5T5Y1w=
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org;
	dmarc=fail (p=quarantine dis=none) header.from=redhat.com;
	spf=pass smtp.mailfrom=passt.top; arc=none smtp.client-ip=88.198.0.164
Authentication-Results: smtp.subspace.kernel.org;
	dmarc=fail (p=quarantine dis=none) header.from=redhat.com
Authentication-Results: smtp.subspace.kernel.org;
	spf=pass smtp.mailfrom=passt.top
Received: by passt.top (Postfix, from userid 1000)
	id CCE2E5A061A; Sun, 17 May 2026 20:41:58 +0200 (CEST)
From: Stefano Brivio
To: "David S. Miller", Eric Dumazet, Jakub Kicinski, Paolo Abeni
Cc: Pavel Emelyanov, Laurent Vivier, Jon Maloy, Dmitry Safonov,
	Andrei Vagin, netdev@vger.kernel.org,
	linux-kselftest@vger.kernel.org, linux-kernel@vger.kernel.org,
	Neal Cardwell, Kuniyuki Iwashima, Simon Horman, Shuah Khan
Subject: [PATCH net 1/2] tcp: Don't accept data when socket is in repair mode
Date: Sun, 17 May 2026 20:41:57 +0200
Message-ID: <20260517184158.2757505-2-sbrivio@redhat.com>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20260517184158.2757505-1-sbrivio@redhat.com>
References: <20260517184158.2757505-1-sbrivio@redhat.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

Once a socket enters repair mode (TCP_REPAIR socket option with
TCP_REPAIR_ON value), it's possible to dump the receive sequence
number (TCP_QUEUE_SEQ) and the contents of the receive queue itself
(using TCP_REPAIR_QUEUE to select it).

If we receive data after the application fetched the sequence number
or saved the contents of the queue, though, the application will now
have outdated information, which defeats the whole functionality,
because this leads to gaps in sequence and data once they're restored
by the target instance of the application, resulting in a hanging or
otherwise non-functional TCP connection.

This type of race condition was discovered in the KubeVirt integration
of passt(1), using a remote iperf3 client connected to an iperf3 server
running in the guest which is being migrated. The setup allows traffic
to reach the origin node hosting the guest during the migration.

If passt dumps sequence number and contents of the queue *before*
further data is received and acknowledged to the peer by the kernel,
once the TCP data connection is migrated to the target node, the remote
client becomes unable to continue sending, because a portion of the data
it sent *and received an acknowledgement for* is now lost.

Schematically:

1. client --seq 1:100--> origin host --> passt --> guest --> server

2. client <--ACK: 100-- origin host

3. migration starts, passt enables repair mode, dumps the sequence
   number (101) and sends it to the target node of the guest migration

4. client --seq 101:201--> origin host (passt not receiving anymore)

5. client <--ACK: 201-- origin host

6. migration completes, and passt restores sequence number 101 on the
   migrated socket

7. client --seq 201:301--> target host (now seeing a sequence jump)

8. client <--ACK: 100-- target host

...and the connection can't recover anymore, because the client can't
resend data that was already (erroneously) acknowledged. We need to
avoid step 5. above.

This would equally affect CRIU (the other known user of TCP_REPAIR),
should data be received while the original container is frozen: the
sequence dumped and the contents of the saved incoming queue would then
depend on the timing.

The race condition is also illustrated in the kselftests introduced by
the next patch.

To prevent this issue, discard data received for a socket in repair
mode, with a new reason, SKB_DROP_REASON_SOCKET_REPAIR.
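
The sequence of events in the schematic can be followed with a toy
model. This is only a minimal sketch in plain C, not kernel code:
struct toy_rx, toy_rx_segment() and toy_lost_bytes() are hypothetical
names introduced for illustration, every accepted segment is assumed to
be acknowledged immediately, and the acknowledgement is modelled as the
next expected sequence number (so 100 bytes starting at sequence 1 move
it to 101, the value dumped in step 3). The sketch counts how many bytes
are acknowledged to the peer after the dump, with and without dropping
segments in repair mode.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy receiver state, names are illustrative and not the kernel's */
struct toy_rx {
	uint32_t rcv_nxt;	/* next expected sequence, also what we ACK */
	bool repair;		/* socket is in repair mode */
	bool drop_in_repair;	/* true: model the behaviour with the fix */
};

/* Deliver a segment starting at seq, len bytes long.
 * Returns true if it was accepted (and, in this model, acknowledged).
 */
bool toy_rx_segment(struct toy_rx *rx, uint32_t seq, uint32_t len)
{
	if (rx->repair && rx->drop_in_repair)
		return false;		/* dropped, no ACK goes out */

	if (seq != rx->rcv_nxt)
		return false;		/* out of order, ignored by the model */

	rx->rcv_nxt += len;
	return true;
}

/* Replay steps 1 to 5 of the schematic.
 * Returns bytes acknowledged to the peer but missing from the dump.
 */
uint32_t toy_lost_bytes(bool drop_in_repair)
{
	struct toy_rx rx = { 1, false, drop_in_repair };
	uint32_t dumped;

	toy_rx_segment(&rx, 1, 100);	/* steps 1-2 */

	rx.repair = true;		/* step 3: enter repair mode... */
	dumped = rx.rcv_nxt;		/* ...and dump the sequence (101) */

	toy_rx_segment(&rx, 101, 100);	/* steps 4-5 */

	return rx.rcv_nxt - dumped;
}
```

In this model, toy_lost_bytes(false) gives 100, that is, the bytes sent
in step 4 and acknowledged in step 5 that the dump doesn't cover, while
toy_lost_bytes(true) gives 0, because the segment is dropped and never
acknowledged.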

Fixes: ee9952831cfd ("tcp: Initial repair mode")
Tested-by: Laurent Vivier
Signed-off-by: Stefano Brivio
---
 include/net/dropreason-core.h |  3 +++
 net/ipv4/tcp_input.c          | 14 +++++++++++++-
 2 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/include/net/dropreason-core.h b/include/net/dropreason-core.h
index 2f312d1f67d6..19ab9e6ffc33 100644
--- a/include/net/dropreason-core.h
+++ b/include/net/dropreason-core.h
@@ -9,6 +9,7 @@
 	FN(SOCKET_CLOSE)		\
 	FN(SOCKET_FILTER)		\
 	FN(SOCKET_RCVBUFF)		\
+	FN(SOCKET_REPAIR)		\
 	FN(UNIX_DISCONNECT)		\
 	FN(UNIX_SKIP_OOB)		\
 	FN(PKT_TOO_SMALL)		\
@@ -158,6 +159,8 @@ enum skb_drop_reason {
 	SKB_DROP_REASON_SOCKET_FILTER,
 	/** @SKB_DROP_REASON_SOCKET_RCVBUFF: socket receive buff is full */
 	SKB_DROP_REASON_SOCKET_RCVBUFF,
+	/** @SKB_DROP_REASON_SOCKET_REPAIR: socket is in repair mode */
+	SKB_DROP_REASON_SOCKET_REPAIR,
 	/**
 	 * @SKB_DROP_REASON_UNIX_DISCONNECT: recv queue is purged when SOCK_DGRAM
 	 * or SOCK_SEQPACKET socket re-connect()s to another socket or notices
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index d5c9e65d9760..6eca34274f97 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -6457,6 +6457,7 @@ static bool tcp_validate_incoming(struct sock *sk, struct sk_buff *skb,
  *	  or pure receivers (this means either the sequence number or the ack
  *	  value must stay constant)
  *	- Unexpected TCP option.
+ *	- Socket is in repair mode.
  *
  *	When these conditions are not satisfied it drops into a standard
  *	receive procedure patterned after RFC793 to handle all cases.
@@ -6506,7 +6507,8 @@ void tcp_rcv_established(struct sock *sk, struct sk_buff *skb)
 
 	if ((tcp_flag_word(th) & TCP_HP_BITS) == tp->pred_flags &&
 	    TCP_SKB_CB(skb)->seq == tp->rcv_nxt &&
-	    !after(TCP_SKB_CB(skb)->ack_seq, tp->snd_nxt)) {
+	    !after(TCP_SKB_CB(skb)->ack_seq, tp->snd_nxt) &&
+	    !tp->repair) {
 		int tcp_header_len = tp->tcp_header_len;
 		s32 delta = 0;
 		int flag = 0;
@@ -6632,6 +6634,11 @@ void tcp_rcv_established(struct sock *sk, struct sk_buff *skb)
 		goto discard;
 	}
 
+	if (tp->repair) {
+		reason = SKB_DROP_REASON_SOCKET_REPAIR;
+		goto discard;
+	}
+
 	/*
 	 *	Standard slow path.
 	 */
@@ -7125,6 +7132,11 @@ tcp_rcv_state_process(struct sock *sk, struct sk_buff *skb)
 	int queued = 0;
 	SKB_DR(reason);
 
+	if (tp->repair) {
+		SKB_DR_SET(reason, SOCKET_REPAIR);
+		goto discard;
+	}
+
 	switch (sk->sk_state) {
 	case TCP_CLOSE:
 		SKB_DR_SET(reason, TCP_CLOSE);
-- 
2.43.0

From nobody Mon May 25 06:41:57 2026
Received: from passt.top (passt.top [88.198.0.164])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id 250A43BB12E;
	Sun, 17 May 2026 18:48:15 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=88.198.0.164
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116;
	t=1779043701; cv=none;
	b=FsoF62spSvOPiWvjJ2IvqCePWH/eWbloB+ZQTQG7yZiXedO3yeja0XYDu/h64rtt3tnYdM8fFbKK8sX8yN/9kd81dBWQliIHsIRgo/utkybGapw9amYUDygKUU9HsMHjNEyRpSTiwRl3eeZmHxltY1MntoqUUDG8E4mUGiA1h1Y=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org;
	s=arc-20240116; t=1779043701; c=relaxed/simple;
	bh=+MypbuIEfLMl4PKjMXEMqONpWrswGgRkBYEzylkgilU=;
	h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References:
	 MIME-Version;
	b=g4PiptnJRr7xwrLC6fd5Xxg/YeeZqKYGBC17Xb2wy2Aqh/fYJc2oIxawxCvSphlw9jpzVuEcBgY9U3BHnH6jYohXA5tNkpymLK4FlaOtLYb46EOaagqEwHjVmckCtK4mDMWodB/R8vAh6dqOk+Yz374+1r3fRPG2y0RFC1PHEeY=
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org;
	dmarc=fail (p=quarantine dis=none) header.from=redhat.com;
	spf=pass smtp.mailfrom=passt.top; arc=none smtp.client-ip=88.198.0.164
Authentication-Results: smtp.subspace.kernel.org;
	dmarc=fail (p=quarantine dis=none) header.from=redhat.com
Authentication-Results: smtp.subspace.kernel.org;
	spf=pass smtp.mailfrom=passt.top
Received: by passt.top (Postfix, from userid 1000)
	id CFC7B5A061B; Sun, 17 May 2026 20:41:58 +0200 (CEST)
From: Stefano Brivio
To: "David S. Miller", Eric Dumazet, Jakub Kicinski, Paolo Abeni
Cc: Pavel Emelyanov, Laurent Vivier, Jon Maloy, Dmitry Safonov,
	Andrei Vagin, netdev@vger.kernel.org,
	linux-kselftest@vger.kernel.org, linux-kernel@vger.kernel.org,
	Neal Cardwell, Kuniyuki Iwashima, Simon Horman, Shuah Khan
Subject: [PATCH net 2/2] selftests: Add data path tests for TCP_REPAIR mode
Date: Sun, 17 May 2026 20:41:58 +0200
Message-ID: <20260517184158.2757505-3-sbrivio@redhat.com>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20260517184158.2757505-1-sbrivio@redhat.com>
References: <20260517184158.2757505-1-sbrivio@redhat.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

The new tests check that, once TCP_REPAIR is enabled on a socket:

- additional incoming data queued to it don't increase the sequence
  number dumped via TCP_QUEUE_SEQ socket option

- a connected endpoint will not receive ACK segments after sending more
  data

These tests are implemented by a client, sending data and commands
(accept connection, enter repair mode, dump sequences, receive data,
and exit) describing the test sequence, and a server receiving data and
implementing those commands. This way, the client can accurately
synchronise repair modes with data exchanges.

In order to avoid using loopback connections, where data would be
immediately acknowledged by the server side without packets being
actually sent or received, the client resides in a separate network
namespace ("inner") compared to the server, and a veth interface pair
connects them.

The tests can be run unprivileged as the outer network and user
namespaces are also detached as a first step, so that the veth
interfaces can be created in outer and inner namespaces without
capabilities.

Signed-off-by: Stefano Brivio
---
 tools/testing/selftests/Makefile              |   1 +
 .../selftests/net/tcp_repair/.gitignore       |   3 +
 .../testing/selftests/net/tcp_repair/Makefile |  23 ++
 .../testing/selftests/net/tcp_repair/client.c | 273 ++++++++++++++++++
 .../testing/selftests/net/tcp_repair/inner.sh |  32 ++
 .../testing/selftests/net/tcp_repair/outer.sh |  44 +++
 tools/testing/selftests/net/tcp_repair/run.sh |  12 +
 .../testing/selftests/net/tcp_repair/server.c | 155 ++++++++++
 tools/testing/selftests/net/tcp_repair/talk.h |  26 ++
 9 files changed, 569 insertions(+)
 create mode 100644 tools/testing/selftests/net/tcp_repair/.gitignore
 create mode 100644 tools/testing/selftests/net/tcp_repair/Makefile
 create mode 100644 tools/testing/selftests/net/tcp_repair/client.c
 create mode 100755 tools/testing/selftests/net/tcp_repair/inner.sh
 create mode 100755 tools/testing/selftests/net/tcp_repair/outer.sh
 create mode 100755 tools/testing/selftests/net/tcp_repair/run.sh
 create mode 100644 tools/testing/selftests/net/tcp_repair/server.c
 create mode 100644 tools/testing/selftests/net/tcp_repair/talk.h

diff --git a/tools/testing/selftests/Makefile b/tools/testing/selftests/Makefile
index 6e59b8f63e41..e119abe5c78e 100644
--- a/tools/testing/selftests/Makefile
+++ b/tools/testing/selftests/Makefile
@@ -84,6 +84,7 @@ TARGETS += net/packetdrill
 TARGETS += net/ppp
 TARGETS += net/rds
 TARGETS += net/tcp_ao
+TARGETS += net/tcp_repair
 TARGETS += nolibc
 TARGETS += nsfs
 TARGETS += pci_endpoint
diff --git a/tools/testing/selftests/net/tcp_repair/.gitignore b/tools/testing/selftests/net/tcp_repair/.gitignore
new file mode 100644
index 000000000000..9756c86770b9
--- /dev/null
+++ b/tools/testing/selftests/net/tcp_repair/.gitignore
@@ -0,0 +1,3 @@
+# SPDX-License-Identifier: GPL-2.0
+client
+server
diff --git a/tools/testing/selftests/net/tcp_repair/Makefile b/tools/testing/selftests/net/tcp_repair/Makefile
new file mode 100644
index 000000000000..d01d0a20b6df
--- /dev/null
+++ b/tools/testing/selftests/net/tcp_repair/Makefile
@@ -0,0 +1,23 @@
+# SPDX-License-Identifier: GPL-2.0
+#
+# selftests/net/tcp_repair: TCP_REPAIR connection tests
+#
+# Makefile - Build test client and server, declare run.sh entrypoint
+#
+# Copyright (c) 2026 Red Hat GmbH
+#
+# Author: Stefano Brivio
+
+top_srcdir = ../../../../..
+
+CFLAGS += -Wall -Wextra -pedantic
+
+TEST_PROGS := \
+	run.sh
+
+TEST_GEN_FILES := \
+	client \
+	server \
+# end of TEST_GEN_FILES
+
+include ../../lib.mk
diff --git a/tools/testing/selftests/net/tcp_repair/client.c b/tools/testing/selftests/net/tcp_repair/client.c
new file mode 100644
index 000000000000..b5106bf922b1
--- /dev/null
+++ b/tools/testing/selftests/net/tcp_repair/client.c
@@ -0,0 +1,273 @@
+// SPDX-License-Identifier: GPL-2.0-only
+
+/* selftests/net/tcp_repair: TCP_REPAIR connection tests
+ *
+ * client.c - Run list of tests, send commands and data to server
+ *
+ * Copyright (c) 2026 Red Hat GmbH
+ *
+ * Author: Stefano Brivio
+ */
+
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+#include		/* latest and greatest struct tcp_info, but */
+#define SOL_TCP	6	/* we can't include netinet/tcp.h as a result */
+
+#include "talk.h"
+
+/**
+ * srv() - Send command to server, return received value (not for ACCEPT)
+ * @ctl:	Control socket
+ * @op:		Command type
+ * @arg:	Optional argument (always sent, might be zero)
+ *
+ * Return: integer value received by client as response
+ */
+int srv(int ctl, enum op op, int arg)
+{
+	int cmd[2] = { op, arg }, ret;
+
+	send(ctl, cmd, sizeof(cmd), 0);
+	if (op != ACCEPT)
+		recv(ctl, &ret, sizeof(ret), 0);
+
+	return ret;
+}
+
+/**
+ * test_seq_slow_path() - Sequence doesn't change after sending one byte
+ * @ctl:	Control socket
+ * @data:	Data socket
+ *
+ * Return: 0 if the test passes, -1 if it fails
+ */
+int test_seq_slow_path(int ctl, int data)
+{
+	uint32_t seq1, seq2;
+
+	(void)ctl;
+	(void)data;
+
+	srv(ctl, REPAIR, TCP_REPAIR_ON);
+	seq1 = (uint32_t)srv(ctl, DUMP_RECV_SEQ, 0);
+
+	send(data, (char *)("a"), 1, 0);
+
+	seq2 = (uint32_t)srv(ctl, DUMP_RECV_SEQ, 0);
+
+	if (seq1 != seq2) {
+		fprintf(stderr, "Sequence changed in repair mode, %u -> %u\n",
+			seq1, seq2);
+		return -1;
+	}
+
+	return 0;
+}
+
+/**
+ * test_seq_fast_path() - Sequence doesn't change after a large transfer
+ * @ctl:	Control socket
+ * @data:	Data socket
+ *
+ * Return: 0 if the test passes, -1 if it fails
+ */
+int test_seq_fast_path(int ctl, int data)
+{
+	char buf[1000] = { 0 };
+	uint32_t seq1, seq2;
+	int i;
+
+	(void)ctl;
+	(void)data;
+
+	for (i = 0; i < 50; i++) {
+		send(data, buf, sizeof(buf), 0);
+		srv(ctl, RECV, sizeof(buf));
+	}
+
+	srv(ctl, REPAIR, TCP_REPAIR_ON);
+	seq1 = (uint32_t)srv(ctl, DUMP_RECV_SEQ, 0);
+
+	fcntl(data, F_SETFL, O_NONBLOCK);
+	for (i = 0; i < 50; i++)
+		send(data, buf, sizeof(buf), 0);
+
+	seq2 = (uint32_t)srv(ctl, DUMP_RECV_SEQ, 0);
+
+	if (seq1 != seq2) {
+		fprintf(stderr, "Sequence changed in repair mode, %u -> %u\n",
+			seq1, seq2);
+		return -1;
+	}
+
+	return 0;
+}
+
+/**
+ * test_acked_slow_path() - Our ACK sequence doesn't change after sending byte
+ * @ctl:	Control socket
+ * @data:	Data socket
+ *
+ * Return: 0 if the test passes, -1 if it fails
+ */
+int test_acked_slow_path(int ctl, int data)
+{
+	unsigned long acked1, acked2;
+	struct tcp_info tinfo;
+	socklen_t sl;
+
+	(void)ctl;
+	(void)data;
+
+	srv(ctl, REPAIR, TCP_REPAIR_ON);
+
+	sl = sizeof(tinfo);
+	getsockopt(data, SOL_TCP, TCP_INFO, &tinfo, &sl);
+	acked1 = tinfo.tcpi_bytes_acked;
+
+	send(data, (char *)("a"), 1, 0);
+
+	getsockopt(data, SOL_TCP, TCP_INFO, &tinfo, &sl);
+	acked2 = tinfo.tcpi_bytes_acked;
+
+	if (acked1 != acked2) {
+		fprintf(stderr, "ACK received in repair mode, %lu -> %lu\n",
+			acked1, acked2);
+		return -1;
+	}
+
+	return 0;
+}
+
+/**
+ * test_acked_fast_path() - Our ACK sequence doesn't change after large transfer
+ * @ctl:	Control socket
+ * @data:	Data socket
+ *
+ * Return: 0 if the test passes, -1 if it fails
+ */
+int test_acked_fast_path(int ctl, int data)
+{
+	unsigned long acked1, acked2;
+	char buf[1000] = { 0 };
+	struct tcp_info tinfo;
+	socklen_t sl;
+	int i;
+
+	(void)ctl;
+	(void)data;
+
+	for (i = 0; i < 50; i++) {
+		send(data, buf, sizeof(buf), 0);
+		srv(ctl, RECV, sizeof(buf));
+	}
+
+	srv(ctl, REPAIR, TCP_REPAIR_ON);
+
+	sl = sizeof(tinfo);
+	getsockopt(data, SOL_TCP, TCP_INFO, &tinfo, &sl);
+	acked1 = tinfo.tcpi_bytes_acked;
+
+	fcntl(data, F_SETFL, O_NONBLOCK);
+	for (i = 0; i < 50; i++)
+		send(data, buf, sizeof(buf), 0);
+
+	getsockopt(data, SOL_TCP, TCP_INFO, &tinfo, &sl);
+	acked2 = tinfo.tcpi_bytes_acked;
+
+	if (acked1 != acked2) {
+		fprintf(stderr, "ACK received in repair mode, %lu -> %lu\n",
+			acked1, acked2);
+		return -1;
+	}
+
+	return 0;
+}
+
+/**
+ * struct test - List of tests
+ * @desc:	Test description
+ * @f:		Function executing the test
+ */
+struct {
+	char *desc;
+	int (*f)(int ctl, int data);
+} test[] = {
+	{
+		"Sequence freezes in repair mode, slow path TCP input",
+		test_seq_slow_path,
+	},
+	{
+		"Sequence freezes in repair mode, fast path TCP input",
+		test_seq_fast_path,
+	},
+	{
+		"No ACKs in repair mode, slow path TCP input",
+		test_acked_slow_path,
+	},
+	{
+		"No ACKs in repair mode, fast path TCP input",
+		test_acked_fast_path,
+	},
+};
+
+/**
+ * main() - Entry point, connect control socket to server and run list of tests
+ * @argc:	Argument count, must be 3 (two options)
+ * @argv:	Options: server address and port
+ *
+ * Return: -1 on bad usage, 0 on success, 1 if at least one test fails
+ */
+int main(int argc, char **argv)
+{
+	struct addrinfo hints = { 0, AF_UNSPEC, SOCK_STREAM, 0, 0,
+				  NULL, NULL, NULL };
+	int ctl, data, ret = 0;
+	struct addrinfo *r;
+	unsigned i;
+
+	if (argc != 3) {
+		fprintf(stderr, "%s DST_ADDR DST_PORT\n", argv[0]);
+		return -1;
+	}
+
+	getaddrinfo(argv[1], argv[2], &hints, &r);
+
+	ctl = socket(r->ai_family, SOCK_STREAM, IPPROTO_TCP);
+	setsockopt(ctl, SOL_SOCKET, SO_REUSEADDR, &((int){ 1 }), sizeof(int));
+	connect(ctl, r->ai_addr, r->ai_addrlen);
+
+	for (i = 0; i < sizeof(test) / sizeof(test[0]); i++) {
+		int rc;
+
+		data = socket(r->ai_family, SOCK_STREAM, IPPROTO_TCP);
+		setsockopt(data, SOL_SOCKET, SO_REUSEADDR,
+			   &((int){ 1 }), sizeof(int));
+		srv(ctl, ACCEPT, 0);
+		connect(data, r->ai_addr, r->ai_addrlen);
+
+		rc = test[i].f(ctl, data);
+
+		close(data);
+
+		if (rc) {
+			fprintf(stdout, "TEST: %-60s [FAIL]\n", test[i].desc);
+			ret = 1;
+		} else {
+			fprintf(stdout, "TEST: %-60s [ OK ]\n", test[i].desc);
+		}
+	}
+
+	return ret;
+}
diff --git a/tools/testing/selftests/net/tcp_repair/inner.sh b/tools/testing/selftests/net/tcp_repair/inner.sh
new file mode 100755
index 000000000000..3987dc0514a8
--- /dev/null
+++ b/tools/testing/selftests/net/tcp_repair/inner.sh
@@ -0,0 +1,32 @@
+#!/bin/sh -euf
+# SPDX-License-Identifier: GPL-2.0
+#
+# selftests/net/tcp_repair: TCP_REPAIR connection tests
+#
+# inner.sh - Set up link to outer namespace, run test client in inner namespace
+#
+# Copyright (c) 2026 Red Hat GmbH
+#
+# Author: Stefano Brivio
+
+ns_inner_dir="${1}"
+
+# Tell the parent shell about our PID
+echo "${$}" > "${ns_inner_dir}/pid"
+mkdir "${ns_inner_dir}/pid_ready"
+
+# Wait for veth to appear
+while [ -z "$(sed -n '4p' /proc/net/dev)" ]; do
+	sleep 0.1 || sleep 1
+done
+
+# Set up link to outer namespace
+ip link set dev veth0 up
+ip addr add 169.254.2.2 dev veth0
+ip ro add default dev veth0
+
+# Finally run tests
+set +e
+./client 169.254.2.1 1024
+echo "${?}" > "${ns_inner_dir}/result"
+mkdir "${ns_inner_dir}/result_ready"
diff --git a/tools/testing/selftests/net/tcp_repair/outer.sh b/tools/testing/selftests/net/tcp_repair/outer.sh
new file mode 100755
index 000000000000..17ae1230f0e5
--- /dev/null
+++ b/tools/testing/selftests/net/tcp_repair/outer.sh
@@ -0,0 +1,44 @@
+#!/bin/sh -euf
+# SPDX-License-Identifier: GPL-2.0
+#
+# selftests/net/tcp_repair: TCP_REPAIR connection tests
+#
+# outer.sh - Set up outer namespace, run test server there
+#
+# Copyright (c) 2026 Red Hat GmbH
+#
+# Author: Stefano Brivio
+
+ns_inner_dir="$(mktemp -d)"
+
+cleanup() {
+	rm -rf "${ns_inner_dir}"
+}
+
+trap cleanup EXIT
+
+# Detach inner namespace in a subshell, tests start from there
+unshare -rUn -- ./inner.sh "${ns_inner_dir}" &
+
+# Wait for inner namespace
+while [ ! -d "${ns_inner_dir}/pid_ready" ]; do
+	sleep 0.1 || sleep 1
+done
+
+# Set up link to inner namespace
+ip link add veth0 type veth peer name veth0 netns "$(cat "${ns_inner_dir}/pid")"
+ip link set dev veth0 up
+ip addr add 169.254.2.1 dev veth0
+ip ro add default dev veth0
+
+# Run test server
+./server 1024
+
+# Wait for test results
+while [ ! -d "${ns_inner_dir}/result_ready" ]; do
+	sleep 0.1 || sleep 1
+done
+
+# Clean up and return results
+ret="$(cat "${ns_inner_dir}/result")"
+exit "${ret}"
diff --git a/tools/testing/selftests/net/tcp_repair/run.sh b/tools/testing/selftests/net/tcp_repair/run.sh
new file mode 100755
index 000000000000..f87ff0a8a6f6
--- /dev/null
+++ b/tools/testing/selftests/net/tcp_repair/run.sh
@@ -0,0 +1,12 @@
+#!/bin/sh -euf
+# SPDX-License-Identifier: GPL-2.0
+#
+# selftests/net/tcp_repair: TCP_REPAIR connection tests
+#
+# run.sh - Test entry point: detach outer namespace and run outer.sh in it
+#
+# Copyright (c) 2026 Red Hat GmbH
+#
+# Author: Stefano Brivio
+
+unshare -rUn -- ./outer.sh
diff --git a/tools/testing/selftests/net/tcp_repair/server.c b/tools/testing/selftests/net/tcp_repair/server.c
new file mode 100644
index 000000000000..256c321320b7
--- /dev/null
+++ b/tools/testing/selftests/net/tcp_repair/server.c
@@ -0,0 +1,155 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/* selftests/net/tcp_repair: TCP_REPAIR connection tests
+ *
+ * server.c - Receive commands and data, set TCP_REPAIR options on data socket
+ *
+ * Copyright (c) 2026 Red Hat GmbH
+ *
+ * Author: Stefano Brivio
+ */
+
+#include
+#include
+#include
+#include
+#include
+#include
+
+#include		/* needed for TCP_REPAIR constants but */
+#define SOL_TCP	6	/* we can't include netinet/tcp.h as a result */
+
+#include "talk.h"
+
+/**
+ * cmd_accept() - Accept data connection (must be first command in test)
+ * @unused:	Not used
+ * @listen:	Listening socket
+ * @data:	Return value from accept(), set on return
+ *
+ * Return: 0
+ */
+int cmd_accept(int unused, int listen, int *data)
+{
+	(void)unused;
+
+	if (*data != -1)
+		close(*data);
+
+	*data = accept(listen, NULL, NULL);
+
+	return 0;
+}
+
+/**
+ * cmd_dump_recv_seq() - Dump receive sequence of data socket
+ * @unused:	Not used
+ * @unused2:	Not used
+ * @data:	File descriptor for data socket
+ *
+ * Return: receive sequence of data socket
+ */
+int cmd_dump_recv_seq(int unused, int unused2, int *data)
+{
+	socklen_t sl;
+	int v;
+
+	(void)unused;
+	(void)unused2;
+
+	v = TCP_RECV_QUEUE;
+	setsockopt(*data, SOL_TCP, TCP_REPAIR_QUEUE, &v, sizeof(v));
+
+	sl = sizeof(v);
+	getsockopt(*data, SOL_TCP, TCP_QUEUE_SEQ, &v, &sl);
+	return v;
+}
+
+/**
+ * cmd_exit() - Exit successfully
+ * @unused:	Not used
+ * @unused2:	Not used
+ * @unused3:	Not used
+ *
+ * Return: this function doesn't actually return
+ */
+int cmd_exit(int unused, int unused2, int *unused3)
+{
+	(void)unused;
+	(void)unused2;
+	(void)unused3;
+
+	exit(EXIT_SUCCESS);
+	return 0;
+}
+
+/**
+ * cmd_recv() - Receive (discard) a given amount of bytes
+ * @len:	Amount of bytes the client wants us to receive
+ * @unused:	Not used
+ * @data:	File descriptor for data socket
+ *
+ * Return: return code from recv()
+ */
+int cmd_recv(int len, int unused, int *data)
+{
+	(void)unused;
+
+	return recv(*data, NULL, len, MSG_TRUNC);
+}
+
+/**
+ * cmd_repair() - Set repair mode to mode supplied by client
+ * @mode:	Value for socket option provided by the client
+ * @unused:	Not used
+ * @data:	File descriptor for data socket
+ *
+ * Return: return code from setsockopt()
+ */
+int cmd_repair(int mode, int unused, int *data)
+{
+	(void)unused;
+
+	return setsockopt(*data, SOL_TCP, TCP_REPAIR, &mode, sizeof(mode));
+}
+
+/* List of commands and their handlers */
+int (*fn[])(int arg, int listen, int *data) = {
+	[ACCEPT]	= cmd_accept,
+	[DUMP_RECV_SEQ]	= cmd_dump_recv_seq,
+	[EXIT]		= cmd_exit,
+	[RECV]		= cmd_recv,
+	[REPAIR]	= cmd_repair,
+};
+
+/**
+ * main() - Entry point, accept control connection and dispatch commands
+ * @argc:	Argument count, must be 2 (one option)
+ * @argv:	Options: server port
+ *
+ * Return: 0 on success, exit on failure
+ */
+int main(int argc, char **argv)
+{
+	struct sockaddr_in a = { AF_INET, htons(atoi(argv[1])), { 0 }, { 0 } };
+	int s, ctl, data = -1, cmd[2];
+
+	s = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
+	setsockopt(s, SOL_SOCKET, SO_REUSEADDR, &((int){ 1 }), sizeof(int));
+
+	if (argc != 2) {
+		fprintf(stderr, "%s PORT\n", argv[0]);
+		exit(EXIT_FAILURE);
+	}
+
+	bind(s, (struct sockaddr *)&a, sizeof(a));
+	listen(s, 0);
+	ctl = accept(s, NULL, NULL);
+
+	while (recv(ctl, cmd, sizeof(cmd), 0) == sizeof(cmd)) {
+		int ret = fn[cmd[0]](cmd[1], s, &data);
+		if (cmd[0] != ACCEPT)
+			send(ctl, &ret, sizeof(ret), 0);
+	}
+
+	return 0;
+}
diff --git a/tools/testing/selftests/net/tcp_repair/talk.h b/tools/testing/selftests/net/tcp_repair/talk.h
new file mode 100644
index 000000000000..e2fbad7fae07
--- /dev/null
+++ b/tools/testing/selftests/net/tcp_repair/talk.h
@@ -0,0 +1,26 @@
+// SPDX-License-Identifier: GPL-2.0-only
+
+/* selftests/net/tcp_repair: TCP_REPAIR connection tests
+ *
+ * talk.h - Communication protocol for client and server
+ *
+ * Copyright (c) 2026 Red Hat GmbH
+ *
+ * Author: Stefano Brivio
+ */
+
+/**
+ * enum op - Server command type (taking optional int argument, returning int)
+ * @ACCEPT		Accept connection on data socket (doesn't return int)
+ * @DUMP_RECV_SEQ	Dump receive sequence, return it to the client
+ * @EXIT		Exit, return 0 to the client
+ * @RECV		Try receiving given amount of bytes, return received
+ * @REPAIR		Set repair mode to argument, return setsockopt() value
+ */
+enum op {
+	ACCEPT,
+	DUMP_RECV_SEQ,
+	EXIT,
+	RECV,
+	REPAIR,
+};
-- 
2.43.0