From: Michal Luczaj
Date: Fri, 11 Apr 2025 13:32:41 +0200
Subject: [PATCH bpf-next v2 5/9] selftests/bpf: Add selftest for sockmap/hashmap redirection
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Message-Id: <20250411-selftests-sockmap-redir-v2-5-5f9b018d6704@rbox.co>
References: <20250411-selftests-sockmap-redir-v2-0-5f9b018d6704@rbox.co>
In-Reply-To: <20250411-selftests-sockmap-redir-v2-0-5f9b018d6704@rbox.co>
To: Andrii Nakryiko, Eduard Zingerman, Mykola Lysenko, Alexei Starovoitov,
 Daniel Borkmann, Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
 KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, Shuah Khan, Jonathan Corbet
Cc: bpf@vger.kernel.org, linux-kselftest@vger.kernel.org,
 linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, Jakub Sitnicki,
 Michal Luczaj
X-Mailer: b4 0.14.2

Test redirection logic.
All supported and unsupported redirect combinations are tested for success
and failure respectively.

BPF_MAP_TYPE_SOCKMAP
BPF_MAP_TYPE_SOCKHASH
x
sk_msg-to-egress
sk_msg-to-ingress
sk_skb-to-egress
sk_skb-to-ingress
x
AF_INET, SOCK_STREAM
AF_INET6, SOCK_STREAM
AF_INET, SOCK_DGRAM
AF_INET6, SOCK_DGRAM
AF_UNIX, SOCK_STREAM
AF_UNIX, SOCK_DGRAM
AF_VSOCK, SOCK_STREAM
AF_VSOCK, SOCK_SEQPACKET

Suggested-by: Jakub Sitnicki
Signed-off-by: Michal Luczaj
---
 .../selftests/bpf/prog_tests/sockmap_redir.c       | 461 +++++++++++++++++++++
 1 file changed, 461 insertions(+)

diff --git a/tools/testing/selftests/bpf/prog_tests/sockmap_redir.c b/tools/testing/selftests/bpf/prog_tests/sockmap_redir.c
new file mode 100644
index 0000000000000000000000000000000000000000..df550759c7e50d248322be3655b02b3a21267b4a
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/sockmap_redir.c
@@ -0,0 +1,461 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Test for sockmap/sockhash redirection.
+ *
+ * BPF_MAP_TYPE_SOCKMAP
+ * BPF_MAP_TYPE_SOCKHASH
+ * x
+ * sk_msg-to-egress
+ * sk_msg-to-ingress
+ * sk_skb-to-egress
+ * sk_skb-to-ingress
+ * x
+ * AF_INET, SOCK_STREAM
+ * AF_INET6, SOCK_STREAM
+ * AF_INET, SOCK_DGRAM
+ * AF_INET6, SOCK_DGRAM
+ * AF_UNIX, SOCK_STREAM
+ * AF_UNIX, SOCK_DGRAM
+ * AF_VSOCK, SOCK_STREAM
+ * AF_VSOCK, SOCK_SEQPACKET
+ */
+
+#include
+#include
+#include
+#include
+#include
+
+#include
+#include
+#include
+#include
+#include
+#include
+
+#include
+#include
+
+#include "linux/const.h"
+#include "test_progs.h"
+#include "sockmap_helpers.h"
+#include "test_sockmap_listen.skel.h"
+
+/* The meaning of SUPPORTED is "will redirect packet as expected".
+ */
+#define SUPPORTED _BITUL(0)
+
+/* Note on sk_skb-to-ingress ->af_vsock:
+ *
+ * Peer socket may receive the packet some time after the return from sendmsg().
+ * In a typical usage scenario, recvmsg() will block until the redirected packet
+ * appears in the destination queue, or timeout if the packet was dropped. By
+ * that point, the verdict map has already been updated to reflect what has
+ * happened.
+ *
+ * But sk_skb-to-ingress/af_vsock is an unsupported combination, so no recvmsg()
+ * takes place. Which means we may race the execution of the verdict logic and
+ * read map_verd before it has been updated, i.e. we might observe
+ * map_verd[SK_DROP]=0 instead of map_verd[SK_DROP]=1.
+ *
+ * This confuses the selftest logic: if there was no packet dropped, where's the
+ * packet? So here's a heuristic: on map_verd[SK_DROP]=map_verd[SK_PASS]=0
+ * (which implies the verdict program has not been run) just re-read the verdict
+ * map again.
+ */
+#define UNSUPPORTED_RACY_VERD _BITUL(1)
+
+enum prog_type {
+	SK_MSG_EGRESS,
+	SK_MSG_INGRESS,
+	SK_SKB_EGRESS,
+	SK_SKB_INGRESS,
+};
+
+enum {
+	SEND_INNER = 0,
+	SEND_OUTER,
+};
+
+enum {
+	RECV_INNER = 0,
+	RECV_OUTER,
+};
+
+struct maps {
+	int in;
+	int out;
+	int verd;
+};
+
+struct combo_spec {
+	enum prog_type prog_type;
+	const char *in, *out;
+};
+
+struct redir_spec {
+	const char *name;
+	int idx_send;
+	int idx_recv;
+	enum prog_type prog_type;
+};
+
+struct socket_spec {
+	int family;
+	int sotype;
+	int send_flags;
+	int in[2];
+	int out[2];
+};
+
+static int socket_spec_pairs(struct socket_spec *s)
+{
+	return create_socket_pairs(s->family, s->sotype,
+				   &s->in[0], &s->out[0],
+				   &s->in[1], &s->out[1]);
+}
+
+static void socket_spec_close(struct socket_spec *s)
+{
+	xclose(s->in[0]);
+	xclose(s->in[1]);
+	xclose(s->out[0]);
+	xclose(s->out[1]);
+}
+
+static void get_redir_params(struct redir_spec *redir,
+			     struct test_sockmap_listen *skel,
+			     int *prog_fd, enum bpf_attach_type *attach_type,
+			     bool *ingress_flag)
+{
+	enum prog_type type = redir->prog_type;
+	struct bpf_program *prog;
+	bool sk_msg;
+
+	sk_msg = type == SK_MSG_INGRESS || type == SK_MSG_EGRESS;
+	prog = sk_msg ? skel->progs.prog_msg_verdict : skel->progs.prog_skb_verdict;
+
+	*prog_fd = bpf_program__fd(prog);
+	*attach_type = sk_msg ? BPF_SK_MSG_VERDICT : BPF_SK_SKB_VERDICT;
+	*ingress_flag = type == SK_MSG_INGRESS || type == SK_SKB_INGRESS;
+}
+
+static void try_recv(const char *prefix, int fd, int flags, bool expect_success)
+{
+	ssize_t n;
+	char buf;
+
+	errno = 0;
+	n = recv(fd, &buf, 1, flags);
+	if (n < 0 && expect_success)
+		FAIL_ERRNO("%s: unexpected failure: retval=%zd", prefix, n);
+	if (!n && !expect_success)
+		FAIL("%s: expected failure: retval=%zd", prefix, n);
+}
+
+static void handle_unsupported(int sd_send, int sd_peer, int sd_in, int sd_out,
+			       int sd_recv, int map_verd, int status)
+{
+	unsigned int drop, pass;
+	char recv_buf;
+	ssize_t n;
+
+get_verdict:
+	if (xbpf_map_lookup_elem(map_verd, &u32(SK_DROP), &drop) ||
+	    xbpf_map_lookup_elem(map_verd, &u32(SK_PASS), &pass))
+		return;
+
+	if (pass == 0 && drop == 0 && (status & UNSUPPORTED_RACY_VERD)) {
+		sched_yield();
+		goto get_verdict;
+	}
+
+	if (pass != 0) {
+		FAIL("unsupported: wanted verdict pass 0, have %u", pass);
+		return;
+	}
+
+	/* If nothing was dropped, packet should have reached the peer */
+	if (drop == 0) {
+		errno = 0;
+		n = recv_timeout(sd_peer, &recv_buf, 1, 0, IO_TIMEOUT_SEC);
+		if (n != 1)
+			FAIL_ERRNO("unsupported: packet missing, retval=%zd", n);
+	}
+
+	/* Ensure queues are empty */
+	try_recv("bpf.recv(sd_send)", sd_send, MSG_DONTWAIT, false);
+	if (sd_in != sd_send)
+		try_recv("bpf.recv(sd_in)", sd_in, MSG_DONTWAIT, false);
+
+	try_recv("bpf.recv(sd_out)", sd_out, MSG_DONTWAIT, false);
+	if (sd_recv != sd_out)
+		try_recv("bpf.recv(sd_recv)", sd_recv, MSG_DONTWAIT, false);
+}
+
+static void test_send_redir_recv(int sd_send, int send_flags, int sd_peer,
+				 int sd_in, int sd_out, int sd_recv,
+				 struct maps *maps, int status)
+{
+	unsigned int drop, pass;
+	char *send_buf = "ab";
+	char recv_buf = '\0';
+	ssize_t n, len = 1;
+
+	/* Zero out the verdict map */
+	if (xbpf_map_update_elem(maps->verd, &u32(SK_DROP), &u32(0), BPF_ANY) ||
+	    xbpf_map_update_elem(maps->verd, &u32(SK_PASS), &u32(0), BPF_ANY))
+		return;
+
+	if (xbpf_map_update_elem(maps->in, &u32(0), &u64(sd_in), BPF_NOEXIST))
+		return;
+
+	if (xbpf_map_update_elem(maps->out, &u32(0), &u64(sd_out), BPF_NOEXIST))
+		goto del_in;
+
+	/* Last byte is OOB data when send_flags has MSG_OOB bit set */
+	if (send_flags & MSG_OOB)
+		len++;
+	n = send(sd_send, send_buf, len, send_flags);
+	if (n >= 0 && n < len)
+		FAIL("incomplete send");
+	if (n < 0) {
+		/* sk_msg redirect combo not supported? */
+		if (status & SUPPORTED || errno != EACCES)
+			FAIL_ERRNO("send");
+		goto out;
+	}
+
+	if (!(status & SUPPORTED)) {
+		handle_unsupported(sd_send, sd_peer, sd_in, sd_out, sd_recv,
+				   maps->verd, status);
+		goto out;
+	}
+
+	errno = 0;
+	n = recv_timeout(sd_recv, &recv_buf, 1, 0, IO_TIMEOUT_SEC);
+	if (n != 1) {
+		FAIL_ERRNO("recv_timeout()");
+		goto out;
+	}
+
+	/* Check verdict _after_ recv(); af_vsock may need time to catch up */
+	if (xbpf_map_lookup_elem(maps->verd, &u32(SK_DROP), &drop) ||
+	    xbpf_map_lookup_elem(maps->verd, &u32(SK_PASS), &pass))
+		goto out;
+
+	if (drop != 0 || pass != 1)
+		FAIL("unexpected verdict drop/pass: wanted 0/1, have %u/%u",
+		     drop, pass);
+
+	if (recv_buf != send_buf[0])
+		FAIL("recv(): payload check, %02x != %02x", recv_buf, send_buf[0]);
+
+	if (send_flags & MSG_OOB) {
+		/* Fail reading OOB while in sockmap */
+		try_recv("bpf.recv(sd_out, MSG_OOB)", sd_out,
+			 MSG_OOB | MSG_DONTWAIT, false);
+
+		/* Remove sd_out from sockmap */
+		xbpf_map_delete_elem(maps->out, &u32(0));
+
+		/* Check that OOB was dropped on redirect */
+		try_recv("recv(sd_out, MSG_OOB)", sd_out,
+			 MSG_OOB | MSG_DONTWAIT, false);
+
+		goto del_in;
+	}
+out:
+	xbpf_map_delete_elem(maps->out, &u32(0));
+del_in:
+	xbpf_map_delete_elem(maps->in, &u32(0));
+}
+
+static int is_redir_supported(enum prog_type type, const char *in,
+			      const char *out)
+{
+	/* Matching based on strings returned by socket_kind_to_str():
+	 * tcp4, udp4, tcp6, udp6, u_str, u_dgr, v_str, v_seq
+	 * Plus a wildcard: any
+	 * Not in use: u_seq, v_dgr
+	 */
+	struct combo_spec *c, combos[] = {
+		/* Send to local: TCP -> any, but vsock */
+		{ SK_MSG_INGRESS, "tcp", "tcp" },
+		{ SK_MSG_INGRESS, "tcp", "udp" },
+		{ SK_MSG_INGRESS, "tcp", "u_str" },
+		{ SK_MSG_INGRESS, "tcp", "u_dgr" },
+
+		/* Send to egress: TCP -> TCP */
+		{ SK_MSG_EGRESS, "tcp", "tcp" },
+
+		/* Ingress to egress: any -> any */
+		{ SK_SKB_EGRESS, "any", "any" },
+
+		/* Ingress to local: any -> any, but vsock */
+		{ SK_SKB_INGRESS, "any", "tcp" },
+		{ SK_SKB_INGRESS, "any", "udp" },
+		{ SK_SKB_INGRESS, "any", "u_str" },
+		{ SK_SKB_INGRESS, "any", "u_dgr" },
+	};
+
+	for (c = combos; c < combos + ARRAY_SIZE(combos); c++) {
+		if (c->prog_type == type &&
+		    (!strcmp(c->in, "any") || strstarts(in, c->in)) &&
+		    (!strcmp(c->out, "any") || strstarts(out, c->out)))
+			return SUPPORTED;
+	}
+
+	return 0;
+}
+
+static int get_support_status(enum prog_type type, const char *in,
+			      const char *out)
+{
+	int status = is_redir_supported(type, in, out);
+
+	if (type == SK_SKB_INGRESS && strstarts(out, "v_"))
+		status |= UNSUPPORTED_RACY_VERD;
+
+	return status;
+}
+
+static void test_socket(enum bpf_map_type type, struct redir_spec *redir,
+			struct maps *maps, struct socket_spec *s_in,
+			struct socket_spec *s_out)
+{
+	int fd_in, fd_out, fd_send, fd_peer, fd_recv, flags, status;
+	const char *in_str, *out_str;
+	char s[MAX_TEST_NAME];
+
+	fd_in = s_in->in[0];
+	fd_out = s_out->out[0];
+	fd_send = s_in->in[redir->idx_send];
+	fd_peer = s_in->in[redir->idx_send ^ 1];
+	fd_recv = s_out->out[redir->idx_recv];
+	flags = s_in->send_flags;
+
+	in_str = socket_kind_to_str(fd_in);
+	out_str = socket_kind_to_str(fd_out);
+	status = get_support_status(redir->prog_type, in_str, out_str);
+
+	snprintf(s, sizeof(s),
+		 "%-4s %-17s %-5s %s %-5s%6s",
+		 /* hash sk_skb-to-ingress u_str → v_str (OOB) */
+		 type == BPF_MAP_TYPE_SOCKMAP ? "map" : "hash",
+		 redir->name,
+		 in_str,
+		 status & SUPPORTED ? "→" : " ",
+		 out_str,
+		 (flags & MSG_OOB) ? "(OOB)" : "");
+
+	if (!test__start_subtest(s))
+		return;
+
+	test_send_redir_recv(fd_send, flags, fd_peer, fd_in, fd_out, fd_recv,
+			     maps, status);
+}
+
+static void test_redir(enum bpf_map_type type, struct redir_spec *redir,
+		       struct maps *maps)
+{
+	struct socket_spec *s, sockets[] = {
+		{ AF_INET, SOCK_STREAM },
+		// { AF_INET, SOCK_STREAM, MSG_OOB }, /* Known to be broken */
+		{ AF_INET6, SOCK_STREAM },
+		{ AF_INET, SOCK_DGRAM },
+		{ AF_INET6, SOCK_DGRAM },
+		{ AF_UNIX, SOCK_STREAM },
+		{ AF_UNIX, SOCK_STREAM, MSG_OOB },
+		{ AF_UNIX, SOCK_DGRAM },
+		// { AF_UNIX, SOCK_SEQPACKET}, /* Unsupported BPF_MAP_UPDATE_ELEM */
+		{ AF_VSOCK, SOCK_STREAM },
+		// { AF_VSOCK, SOCK_DGRAM }, /* Unsupported socket() */
+		{ AF_VSOCK, SOCK_SEQPACKET },
+	};
+
+	for (s = sockets; s < sockets + ARRAY_SIZE(sockets); s++)
+		if (socket_spec_pairs(s))
+			goto out;
+
+	/* Intra-proto */
+	for (s = sockets; s < sockets + ARRAY_SIZE(sockets); s++)
+		test_socket(type, redir, maps, s, s);
+
+	/* Cross-proto */
+	for (int i = 0; i < ARRAY_SIZE(sockets); i++) {
+		for (int j = 0; j < ARRAY_SIZE(sockets); j++) {
+			struct socket_spec *out = &sockets[j];
+			struct socket_spec *in = &sockets[i];
+
+			/* Skip intra-proto and between variants */
+			if (out->send_flags ||
+			    (in->family == out->family &&
+			     in->sotype == out->sotype))
+				continue;
+
+			test_socket(type, redir, maps, in, out);
+		}
+	}
+out:
+	while (--s >= sockets)
+		socket_spec_close(s);
+}
+
+static void test_map(enum bpf_map_type type)
+{
+	struct redir_spec *r, redirs[] = {
+		{ "sk_msg-to-ingress", SEND_INNER, RECV_INNER, SK_MSG_INGRESS },
+		{ "sk_msg-to-egress", SEND_INNER, RECV_OUTER, SK_MSG_EGRESS },
+		{ "sk_skb-to-egress", SEND_OUTER, RECV_OUTER, SK_SKB_EGRESS },
+		{ "sk_skb-to-ingress", SEND_OUTER, RECV_INNER, SK_SKB_INGRESS },
+	};
+
+	for (r = redirs; r < redirs + ARRAY_SIZE(redirs); r++) {
+		enum bpf_attach_type attach_type;
+		struct test_sockmap_listen *skel;
+		struct maps maps;
+		int prog_fd;
+
+		skel = test_sockmap_listen__open_and_load();
+		if (!skel) {
+			FAIL("open_and_load");
+			return;
+		}
+
+		switch (type) {
+		case BPF_MAP_TYPE_SOCKMAP:
+			skel->bss->test_sockmap = true;
+			maps.out = bpf_map__fd(skel->maps.sock_map);
+			break;
+		case BPF_MAP_TYPE_SOCKHASH:
+			skel->bss->test_sockmap = false;
+			maps.out = bpf_map__fd(skel->maps.sock_hash);
+			break;
+		default:
+			FAIL("Unsupported bpf_map_type");
+			return;
+		}
+
+		maps.in = bpf_map__fd(skel->maps.nop_map);
+		maps.verd = bpf_map__fd(skel->maps.verdict_map);
+		get_redir_params(r, skel, &prog_fd, &attach_type,
+				 &skel->bss->test_ingress);
+
+		if (xbpf_prog_attach(prog_fd, maps.in, attach_type, 0))
+			return;
+
+		test_redir(type, r, &maps);
+
+		if (xbpf_prog_detach2(prog_fd, maps.in, attach_type))
+			return;
+
+		test_sockmap_listen__destroy(skel);
+	}
+}
+
+void serial_test_sockmap_redir(void)
+{
+	test_map(BPF_MAP_TYPE_SOCKMAP);
+	test_map(BPF_MAP_TYPE_SOCKHASH);
+}
-- 
2.49.0