.../selftests/bpf/prog_tests/sockmap_basic.c | 28 ++-- .../selftests/bpf/prog_tests/sockmap_helpers.h | 149 ++++++++++++++------- .../selftests/bpf/prog_tests/sockmap_listen.c | 117 ++-------------- 3 files changed, 124 insertions(+), 170 deletions(-)
Series takes care of few bugs and missing features with the aim to improve
the test coverage of sockmap/sockhash.
Last patch is a create_pair() rewrite making use of
__attribute__((cleanup)) to handle socket fd lifetime.
Signed-off-by: Michal Luczaj <mhal@rbox.co>
---
Changes in v2:
- Rebase on bpf-next (Jakub)
- Use cleanup helpers from kernel's cleanup.h (Jakub)
- Fix subject of patch 3, rephrase patch 4, use correct prefix
- Link to v1: https://lore.kernel.org/r/20240724-sockmap-selftest-fixes-v1-0-46165d224712@rbox.co
Changes in v1:
- No declarations in function body (Jakub)
- Don't touch output arguments until function succeeds (Jakub)
- Link to v0: https://lore.kernel.org/netdev/027fdb41-ee11-4be0-a493-22f28a1abd7c@rbox.co/
---
Michal Luczaj (6):
selftests/bpf: Support more socket types in create_pair()
selftests/bpf: Socket pair creation, cleanups
selftests/bpf: Simplify inet_socketpair() and vsock_socketpair_connectible()
selftests/bpf: Honour the sotype of af_unix redir tests
selftests/bpf: Exercise SOCK_STREAM unix_inet_redir_to_connected()
selftests/bpf: Introduce __attribute__((cleanup)) in create_pair()
.../selftests/bpf/prog_tests/sockmap_basic.c | 28 ++--
.../selftests/bpf/prog_tests/sockmap_helpers.h | 149 ++++++++++++++-------
.../selftests/bpf/prog_tests/sockmap_listen.c | 117 ++--------------
3 files changed, 124 insertions(+), 170 deletions(-)
---
base-commit: 92cc2456e9775dc4333fb4aa430763ae4ac2f2d9
change-id: 20240729-selftest-sockmap-fixes-bcca996e143b
Best regards,
--
Michal Luczaj <mhal@rbox.co>
On 7/31/24 3:01 AM, Michal Luczaj wrote: > Series takes care of few bugs and missing features with the aim to improve > the test coverage of sockmap/sockhash. > > Last patch is a create_pair() rewrite making use of > __attribute__((cleanup)) to handle socket fd lifetime. Applied to bpf-next/net. Thanks.
On Wed, Jul 31, 2024 at 12:01 PM +02, Michal Luczaj wrote: > Series takes care of few bugs and missing features with the aim to improve > the test coverage of sockmap/sockhash. > > Last patch is a create_pair() rewrite making use of > __attribute__((cleanup)) to handle socket fd lifetime. > > Signed-off-by: Michal Luczaj <mhal@rbox.co> > --- > Changes in v2: > - Rebase on bpf-next (Jakub) > - Use cleanup helpers from kernel's cleanup.h (Jakub) > - Fix subject of patch 3, rephrase patch 4, use correct prefix > - Link to v1: https://lore.kernel.org/r/20240724-sockmap-selftest-fixes-v1-0-46165d224712@rbox.co > > Changes in v1: > - No declarations in function body (Jakub) > - Don't touch output arguments until function succeeds (Jakub) > - Link to v0: https://lore.kernel.org/netdev/027fdb41-ee11-4be0-a493-22f28a1abd7c@rbox.co/ > > --- > Michal Luczaj (6): > selftests/bpf: Support more socket types in create_pair() > selftests/bpf: Socket pair creation, cleanups > selftests/bpf: Simplify inet_socketpair() and vsock_socketpair_connectible() > selftests/bpf: Honour the sotype of af_unix redir tests > selftests/bpf: Exercise SOCK_STREAM unix_inet_redir_to_connected() > selftests/bpf: Introduce __attribute__((cleanup)) in create_pair() > > .../selftests/bpf/prog_tests/sockmap_basic.c | 28 ++-- > .../selftests/bpf/prog_tests/sockmap_helpers.h | 149 ++++++++++++++------- > .../selftests/bpf/prog_tests/sockmap_listen.c | 117 ++-------------- > 3 files changed, 124 insertions(+), 170 deletions(-) > --- > base-commit: 92cc2456e9775dc4333fb4aa430763ae4ac2f2d9 > change-id: 20240729-selftest-sockmap-fixes-bcca996e143b > > Best regards, Thanks again for these fixes. For the series: Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com> Tested-by: Jakub Sitnicki <jakub@cloudflare.com>
On 8/6/24 14:01, Jakub Sitnicki wrote: > On Wed, Jul 31, 2024 at 12:01 PM +02, Michal Luczaj wrote: >> Series takes care of few bugs and missing features with the aim to improve >> the test coverage of sockmap/sockhash. >> >> Last patch is a create_pair() rewrite making use of >> __attribute__((cleanup)) to handle socket fd lifetime. >> >> Signed-off-by: Michal Luczaj <mhal@rbox.co> >> --- >> Changes in v2: >> - Rebase on bpf-next (Jakub) >> - Use cleanup helpers from kernel's cleanup.h (Jakub) >> - Fix subject of patch 3, rephrase patch 4, use correct prefix >> - Link to v1: https://lore.kernel.org/r/20240724-sockmap-selftest-fixes-v1-0-46165d224712@rbox.co >> >> Changes in v1: >> - No declarations in function body (Jakub) >> - Don't touch output arguments until function succeeds (Jakub) >> - Link to v0: https://lore.kernel.org/netdev/027fdb41-ee11-4be0-a493-22f28a1abd7c@rbox.co/ >> >> --- >> Michal Luczaj (6): >> selftests/bpf: Support more socket types in create_pair() >> selftests/bpf: Socket pair creation, cleanups >> selftests/bpf: Simplify inet_socketpair() and vsock_socketpair_connectible() >> selftests/bpf: Honour the sotype of af_unix redir tests >> selftests/bpf: Exercise SOCK_STREAM unix_inet_redir_to_connected() >> selftests/bpf: Introduce __attribute__((cleanup)) in create_pair() >> >> .../selftests/bpf/prog_tests/sockmap_basic.c | 28 ++-- >> .../selftests/bpf/prog_tests/sockmap_helpers.h | 149 ++++++++++++++------- >> .../selftests/bpf/prog_tests/sockmap_listen.c | 117 ++-------------- >> 3 files changed, 124 insertions(+), 170 deletions(-) >> --- >> base-commit: 92cc2456e9775dc4333fb4aa430763ae4ac2f2d9 >> change-id: 20240729-selftest-sockmap-fixes-bcca996e143b >> >> Best regards, > > Thanks again for these fixes. For the series: > > Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com> > Tested-by: Jakub Sitnicki <jakub@cloudflare.com> Great, thanks for the review. With this completed, I guess we can unwind the (mail) stack to [1]. Is that ingress-to-local et al. something you wanted to take care of yourself or can I give it a try? [1] https://lore.kernel.org/netdev/87msmqn9ws.fsf@cloudflare.com/ Also, I've noticed patchwork still lists (besides this one) the old version of this series. Am I supposed to tell the bot to disregard it?
On Tue, Aug 06, 2024 at 07:18 PM +02, Michal Luczaj wrote: > On 8/6/24 14:01, Jakub Sitnicki wrote: >> On Wed, Jul 31, 2024 at 12:01 PM +02, Michal Luczaj wrote: >>> Series takes care of few bugs and missing features with the aim to improve >>> the test coverage of sockmap/sockhash. >>> >>> Last patch is a create_pair() rewrite making use of >>> __attribute__((cleanup)) to handle socket fd lifetime. >>> >>> Signed-off-by: Michal Luczaj <mhal@rbox.co> >>> --- >>> Changes in v2: >>> - Rebase on bpf-next (Jakub) >>> - Use cleanup helpers from kernel's cleanup.h (Jakub) >>> - Fix subject of patch 3, rephrase patch 4, use correct prefix >>> - Link to v1: https://lore.kernel.org/r/20240724-sockmap-selftest-fixes-v1-0-46165d224712@rbox.co >>> >>> Changes in v1: >>> - No declarations in function body (Jakub) >>> - Don't touch output arguments until function succeeds (Jakub) >>> - Link to v0: https://lore.kernel.org/netdev/027fdb41-ee11-4be0-a493-22f28a1abd7c@rbox.co/ >>> >>> --- >>> Michal Luczaj (6): >>> selftests/bpf: Support more socket types in create_pair() >>> selftests/bpf: Socket pair creation, cleanups >>> selftests/bpf: Simplify inet_socketpair() and vsock_socketpair_connectible() >>> selftests/bpf: Honour the sotype of af_unix redir tests >>> selftests/bpf: Exercise SOCK_STREAM unix_inet_redir_to_connected() >>> selftests/bpf: Introduce __attribute__((cleanup)) in create_pair() >>> >>> .../selftests/bpf/prog_tests/sockmap_basic.c | 28 ++-- >>> .../selftests/bpf/prog_tests/sockmap_helpers.h | 149 ++++++++++++++------- >>> .../selftests/bpf/prog_tests/sockmap_listen.c | 117 ++-------------- >>> 3 files changed, 124 insertions(+), 170 deletions(-) >>> --- >>> base-commit: 92cc2456e9775dc4333fb4aa430763ae4ac2f2d9 >>> change-id: 20240729-selftest-sockmap-fixes-bcca996e143b >>> >>> Best regards, >> >> Thanks again for these fixes. For the series: >> >> Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com> >> Tested-by: Jakub Sitnicki <jakub@cloudflare.com> > > Great, thanks for the review. With this completed, I guess we can unwind > the (mail) stack to [1]. Is that ingress-to-local et al. something you > wanted to take care of yourself or can I give it a try? I haven't stated any work on. You're welcome to tackle that. All I have is a toy test that I've used to generate the redirect matrix. Perhaps it can serve as inspiration: https://github.com/jsitnicki/sockmap-redir-matrix > > [1] https://lore.kernel.org/netdev/87msmqn9ws.fsf@cloudflare.com/ > > Also, I've noticed patchwork still lists (besides this one) the old version > of this series. Am I supposed to tell the bot to disregard it? Only the maintainers can change patch set status in patchwork: https://docs.kernel.org/process/maintainer-netdev.html#updating-patch-status
On 8/6/24 19:45, Jakub Sitnicki wrote:
> On Tue, Aug 06, 2024 at 07:18 PM +02, Michal Luczaj wrote:
>> Great, thanks for the review. With this completed, I guess we can unwind
>> the (mail) stack to [1]. Is that ingress-to-local et al. something you
>> wanted to take care of yourself or can I give it a try?
>> [1] https://lore.kernel.org/netdev/87msmqn9ws.fsf@cloudflare.com/
>
> I haven't stated any work on. You're welcome to tackle that.
>
> All I have is a toy test that I've used to generate the redirect matrix.
> Perhaps it can serve as inspiration:
>
> https://github.com/jsitnicki/sockmap-redir-matrix
All right, please let me know if this is more or less what you meant and
I'll post the whole series for a review (+patch to purge sockmap_listen of
redir tests, fix misnomers). Mostly I've just copypasted your code
(mangling it terribly along the way), so I feel silly claiming the
authorship. Should I assign you as an author?
Note that the patches are based on [2], which has not reached bpf-next
(patchwork says: "Needs ACK").
[2] [PATCH bpf-next v2 0/6] selftests/bpf: Various sockmap-related fixes
https://lore.kernel.org/bpf/20240731-selftest-sockmap-fixes-v2-0-08a0c73abed2@rbox.co/
On Wed, Aug 14, 2024 at 06:14 PM +02, Michal Luczaj wrote: > On 8/6/24 19:45, Jakub Sitnicki wrote: >> On Tue, Aug 06, 2024 at 07:18 PM +02, Michal Luczaj wrote: >>> Great, thanks for the review. With this completed, I guess we can unwind >>> the (mail) stack to [1]. Is that ingress-to-local et al. something you >>> wanted to take care of yourself or can I give it a try? >>> [1] https://lore.kernel.org/netdev/87msmqn9ws.fsf@cloudflare.com/ >> >> I haven't stated any work on. You're welcome to tackle that. >> >> All I have is a toy test that I've used to generate the redirect matrix. >> Perhaps it can serve as inspiration: >> >> https://github.com/jsitnicki/sockmap-redir-matrix > > All right, please let me know if this is more or less what you meant and > I'll post the whole series for a review (+patch to purge sockmap_listen of > redir tests, fix misnomers). [...] Gave it a look as promised. It makes sense to me as well to put these tests in a new module. There will be some overlap with sockmap_listen, which has diverged from its inital scope, but we can dedup that later. One thought that I had is that it could make sense to test the not supported redirect combos (and expect an error). Sometimes folks make changes and enable some parts of the API by accient. Just a suggestion. This will be a nice improvement to the test coverage even without the negative tests. > Note that the patches are based on [2], which has not reached bpf-next > (patchwork says: "Needs ACK"). I think it might be fair to resend the series to attract the maintainers attention at this point. Thanks, Jakub
On 8/19/24 22:05, Jakub Sitnicki wrote: > On Wed, Aug 14, 2024 at 06:14 PM +02, Michal Luczaj wrote: >> On 8/6/24 19:45, Jakub Sitnicki wrote: >>> On Tue, Aug 06, 2024 at 07:18 PM +02, Michal Luczaj wrote: >>>> Great, thanks for the review. With this completed, I guess we can unwind >>>> the (mail) stack to [1]. Is that ingress-to-local et al. something you >>>> wanted to take care of yourself or can I give it a try? >>>> [1] https://lore.kernel.org/netdev/87msmqn9ws.fsf@cloudflare.com/ >>> >>> I haven't stated any work on. You're welcome to tackle that. >>> >>> All I have is a toy test that I've used to generate the redirect matrix. >>> Perhaps it can serve as inspiration: >>> >>> https://github.com/jsitnicki/sockmap-redir-matrix >> >> All right, please let me know if this is more or less what you meant and >> I'll post the whole series for a review (+patch to purge sockmap_listen of >> redir tests, fix misnomers). [...] > > Gave it a look as promised. It makes sense to me as well to put these > tests in a new module. There will be some overlap with sockmap_listen, > which has diverged from its inital scope, but we can dedup that later. >> One thought that I had is that it could make sense to test the not > supported redirect combos (and expect an error). Sometimes folks make > changes and enable some parts of the API by accient. > > Just a suggestion. This will be a nice improvement to the test coverage > even without the negative tests. > ... So this took me for a bit of a journey (sorry for delay), but after tweaking some things here and there, here's v2 [1]. There are not that many changes, but I did try introducing negative tests and deduping vs. sockmap_listen. Please let me know what you think. https://lore.kernel.org/bpf/20250411-selftests-sockmap-redir-v2-0-5f9b018d6704@rbox.co/ Thanks, Michal
On 8/19/24 22:05, Jakub Sitnicki wrote:
> On Wed, Aug 14, 2024 at 06:14 PM +02, Michal Luczaj wrote:
>> On 8/6/24 19:45, Jakub Sitnicki wrote:
>>> On Tue, Aug 06, 2024 at 07:18 PM +02, Michal Luczaj wrote:
>>>> Great, thanks for the review. With this completed, I guess we can unwind
>>>> the (mail) stack to [1]. Is that ingress-to-local et al. something you
>>>> wanted to take care of yourself or can I give it a try?
>>>> [1] https://lore.kernel.org/netdev/87msmqn9ws.fsf@cloudflare.com/
>>>
>>> I haven't stated any work on. You're welcome to tackle that.
>>>
>>> All I have is a toy test that I've used to generate the redirect matrix.
>>> Perhaps it can serve as inspiration:
>>>
>>> https://github.com/jsitnicki/sockmap-redir-matrix
>>
>> All right, please let me know if this is more or less what you meant and
>> I'll post the whole series for a review (+patch to purge sockmap_listen of
>> redir tests, fix misnomers). [...]
>
> Gave it a look as promised. It makes sense to me as well to put these
> tests in a new module. There will be some overlap with sockmap_listen,
> which has diverged from its inital scope, but we can dedup that later.
>
> One thought that I had is that it could make sense to test the not
> supported redirect combos (and expect an error). Sometimes folks make
> changes and enable some parts of the API by accient.
All right, so I did what sockmap_listen does: check
test_sockmap_listen.c:verdict_map[SK_PASS] to see if the redirect took
place for a given combo. And that works well... except for skb/msg to
ingress af_vsock. Even though this is unsupported and no redirect
actually happens, verdict appears to be SK_PASS. Is this correct?
Maybe I'm missing something, so below is a crude testcase I've cobbled
together.
And sorry for the delay, I was away from keyboard.
Michal
All error logs:
./test_progs:unix_vsock_redir_fail:1600: want pass=0 / drop=1, have 1 / 0
unix_vsock_redir_fail:FAIL:1600
diff --git a/tools/testing/selftests/bpf/prog_tests/sockmap_listen.c b/tools/testing/selftests/bpf/prog_tests/sockmap_listen.c
index 4ee1148d22be..e59e1654f110 100644
--- a/tools/testing/selftests/bpf/prog_tests/sockmap_listen.c
+++ b/tools/testing/selftests/bpf/prog_tests/sockmap_listen.c
@@ -1561,6 +1561,45 @@ static void vsock_unix_redir_connectible(int sock_mapfd, int verd_mapfd,
close(u1);
}
+static void unix_vsock_redir_fail(int sock_mapfd, int verd_mapfd)
+{
+ int v0, v1, u[2], pass, drop;
+ char a = 'a';
+
+ bpf_map_delete_elem(sock_mapfd, &(int){0});
+ bpf_map_delete_elem(sock_mapfd, &(int){1});
+ zero_verdict_count(verd_mapfd);
+
+ if (socketpair(AF_UNIX, SOCK_STREAM, 0, u)) {
+ FAIL_ERRNO("socketpair(af_unix)");
+ return;
+ }
+
+ if (create_pair(AF_VSOCK, SOCK_STREAM, &v0, &v1))
+ return;
+
+ if (add_to_sockmap(sock_mapfd, v0, u[0]))
+ return;
+
+ if (write(u[1], &a, sizeof(a)) != 1) {
+ FAIL_ERRNO("write()");
+ return;
+ }
+
+ errno = 0;
+ if (recv_timeout(v0, &a, sizeof(a), 0, 1) >= 0 ||
+ recv_timeout(v1, &a, sizeof(a), 0, 1) >= 0 ||
+ recv_timeout(u[0], &a, sizeof(a), 0, 1) >= 0 ||
+ recv_timeout(u[1], &a, sizeof(a), 0, 1) >= 0)
+ FAIL("recv() returned >=0, errno=%d", errno);
+
+ if (xbpf_map_lookup_elem(verd_mapfd, &(int){SK_PASS}, &pass) ||
+ xbpf_map_lookup_elem(verd_mapfd, &(int){SK_DROP}, &drop))
+ return;
+ if (pass != 0 || drop != 1)
+ FAIL("want pass=0 / drop=1, have %d / %d", pass, drop);
+}
+
static void vsock_unix_skb_redir_connectible(struct test_sockmap_listen *skel,
struct bpf_map *inner_map,
int sotype)
@@ -1582,6 +1621,23 @@ static void vsock_unix_skb_redir_connectible(struct test_sockmap_listen *skel,
xbpf_prog_detach2(verdict, sock_map, BPF_SK_SKB_VERDICT);
}
+static void unix_vsock_redir(struct test_sockmap_listen *skel, struct bpf_map *inner_map)
+{
+ int verdict = bpf_program__fd(skel->progs.prog_skb_verdict);
+ int verdict_map = bpf_map__fd(skel->maps.verdict_map);
+ int sock_map = bpf_map__fd(inner_map);
+ int err;
+
+ err = xbpf_prog_attach(verdict, sock_map, BPF_SK_SKB_VERDICT, 0);
+ if (err)
+ return;
+
+ skel->bss->test_ingress = true;
+ unix_vsock_redir_fail(sock_map, verdict_map);
+
+ xbpf_prog_detach2(verdict, sock_map, BPF_SK_SKB_VERDICT);
+}
+
static void test_vsock_redir(struct test_sockmap_listen *skel, struct bpf_map *map)
{
const char *family_name, *map_name;
@@ -1883,6 +1939,7 @@ void serial_test_sockmap_listen(void)
test_unix_redir(skel, skel->maps.sock_map, SOCK_DGRAM);
test_unix_redir(skel, skel->maps.sock_map, SOCK_STREAM);
test_vsock_redir(skel, skel->maps.sock_map);
+ unix_vsock_redir(skel, skel->maps.sock_map);
skel->bss->test_sockmap = false;
run_tests(skel, skel->maps.sock_hash, AF_INET);
On 9/24/24 12:25, Michal Luczaj wrote:
> On 8/19/24 22:05, Jakub Sitnicki wrote:
>> On Wed, Aug 14, 2024 at 06:14 PM +02, Michal Luczaj wrote:
>>> On 8/6/24 19:45, Jakub Sitnicki wrote:
>>>> On Tue, Aug 06, 2024 at 07:18 PM +02, Michal Luczaj wrote:
>>>>> Great, thanks for the review. With this completed, I guess we can unwind
>>>>> the (mail) stack to [1]. Is that ingress-to-local et al. something you
>>>>> wanted to take care of yourself or can I give it a try?
>>>>> [1] https://lore.kernel.org/netdev/87msmqn9ws.fsf@cloudflare.com/
>>>>
>>>> I haven't stated any work on. You're welcome to tackle that.
>>>>
>>>> All I have is a toy test that I've used to generate the redirect matrix.
>>>> Perhaps it can serve as inspiration:
>>>>
>>>> https://github.com/jsitnicki/sockmap-redir-matrix
>>>
>>> All right, please let me know if this is more or less what you meant and
>>> I'll post the whole series for a review (+patch to purge sockmap_listen of
>>> redir tests, fix misnomers). [...]
>>
>> Gave it a look as promised. It makes sense to me as well to put these
>> tests in a new module. There will be some overlap with sockmap_listen,
>> which has diverged from its inital scope, but we can dedup that later.
>>
>> One thought that I had is that it could make sense to test the not
>> supported redirect combos (and expect an error). Sometimes folks make
>> changes and enable some parts of the API by accient.
>
> All right, so I did what sockmap_listen does: check
> test_sockmap_listen.c:verdict_map[SK_PASS] to see if the redirect took
> place for a given combo. And that works well... except for skb/msg to
> ingress af_vsock. Even though this is unsupported and no redirect
> actually happens, verdict appears to be SK_PASS. Is this correct?
Here's a follow up: my guess is that some checks are missing. I'm not sure
if it's the best approach, but this fixes things for me:
diff --git a/include/net/sock.h b/include/net/sock.h
index c58ca8dd561b..c87295f3476d 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -2715,6 +2715,11 @@ static inline bool sk_is_stream_unix(const struct sock *sk)
return sk->sk_family == AF_UNIX && sk->sk_type == SOCK_STREAM;
}
+static inline bool sk_is_vsock(const struct sock *sk)
+{
+ return sk->sk_family == AF_VSOCK;
+}
+
/**
* sk_eat_skb - Release a skb if it is no longer needed
* @sk: socket to eat this skb from
diff --git a/net/core/sock_map.c b/net/core/sock_map.c
index 242c91a6e3d3..07d6aa4e39ef 100644
--- a/net/core/sock_map.c
+++ b/net/core/sock_map.c
@@ -647,6 +647,8 @@ BPF_CALL_4(bpf_sk_redirect_map, struct sk_buff *, skb,
sk = __sock_map_lookup_elem(map, key);
if (unlikely(!sk || !sock_map_redirect_allowed(sk)))
return SK_DROP;
+ if ((flags & BPF_F_INGRESS) && sk_is_vsock(sk))
+ return SK_DROP;
skb_bpf_set_redir(skb, sk, flags & BPF_F_INGRESS);
return SK_PASS;
@@ -675,6 +677,8 @@ BPF_CALL_4(bpf_msg_redirect_map, struct sk_msg *, msg,
return SK_DROP;
if (!(flags & BPF_F_INGRESS) && !sk_is_tcp(sk))
return SK_DROP;
+ if (sk_is_vsock(sk))
+ return SK_DROP;
msg->flags = flags;
msg->sk_redir = sk;
@@ -1249,6 +1253,8 @@ BPF_CALL_4(bpf_sk_redirect_hash, struct sk_buff *, skb,
sk = __sock_hash_lookup_elem(map, key);
if (unlikely(!sk || !sock_map_redirect_allowed(sk)))
return SK_DROP;
+ if ((flags & BPF_F_INGRESS) && sk_is_vsock(sk))
+ return SK_DROP;
skb_bpf_set_redir(skb, sk, flags & BPF_F_INGRESS);
return SK_PASS;
@@ -1277,6 +1283,8 @@ BPF_CALL_4(bpf_msg_redirect_hash, struct sk_msg *, msg,
return SK_DROP;
if (!(flags & BPF_F_INGRESS) && !sk_is_tcp(sk))
return SK_DROP;
+ if (sk_is_vsock(sk))
+ return SK_DROP;
msg->flags = flags;
msg->sk_redir = sk;
On Fri, Sep 27, 2024 at 12:54 AM +02, Michal Luczaj wrote:
> On 9/24/24 12:25, Michal Luczaj wrote:
>> On 8/19/24 22:05, Jakub Sitnicki wrote:
>>> On Wed, Aug 14, 2024 at 06:14 PM +02, Michal Luczaj wrote:
>>>> On 8/6/24 19:45, Jakub Sitnicki wrote:
>>>>> On Tue, Aug 06, 2024 at 07:18 PM +02, Michal Luczaj wrote:
>>>>>> Great, thanks for the review. With this completed, I guess we can unwind
>>>>>> the (mail) stack to [1]. Is that ingress-to-local et al. something you
>>>>>> wanted to take care of yourself or can I give it a try?
>>>>>> [1] https://lore.kernel.org/netdev/87msmqn9ws.fsf@cloudflare.com/
>>>>>
>>>>> I haven't stated any work on. You're welcome to tackle that.
>>>>>
>>>>> All I have is a toy test that I've used to generate the redirect matrix.
>>>>> Perhaps it can serve as inspiration:
>>>>>
>>>>> https://github.com/jsitnicki/sockmap-redir-matrix
>>>>
>>>> All right, please let me know if this is more or less what you meant and
>>>> I'll post the whole series for a review (+patch to purge sockmap_listen of
>>>> redir tests, fix misnomers). [...]
>>>
>>> Gave it a look as promised. It makes sense to me as well to put these
>>> tests in a new module. There will be some overlap with sockmap_listen,
>>> which has diverged from its inital scope, but we can dedup that later.
>>>
>>> One thought that I had is that it could make sense to test the not
>>> supported redirect combos (and expect an error). Sometimes folks make
>>> changes and enable some parts of the API by accient.
>>
>> All right, so I did what sockmap_listen does: check
>> test_sockmap_listen.c:verdict_map[SK_PASS] to see if the redirect took
>> place for a given combo. And that works well... except for skb/msg to
>> ingress af_vsock. Even though this is unsupported and no redirect
>> actually happens, verdict appears to be SK_PASS. Is this correct?
>
> Here's a follow up: my guess is that some checks are missing. I'm not sure
> if it's the best approach, but this fixes things for me:
So you have already found a bug with a negative test. Nice.
Your patch makes sense to me.
FYI, I've started a GH repo for anciallary materials for sockmap.
Code samples, pointers to resources, a backlog of stuff to do (?).
Inspired by the xdp-project repo:
https://github.com/xdp-project/xdp-project
We can move it to a dedicated project namespace, right now it's at:
https://github.com/jsitnicki/sockmap-project/
Feel free to add stuff there.
> diff --git a/include/net/sock.h b/include/net/sock.h
> index c58ca8dd561b..c87295f3476d 100644
> --- a/include/net/sock.h
> +++ b/include/net/sock.h
> @@ -2715,6 +2715,11 @@ static inline bool sk_is_stream_unix(const struct sock *sk)
> return sk->sk_family == AF_UNIX && sk->sk_type == SOCK_STREAM;
> }
>
> +static inline bool sk_is_vsock(const struct sock *sk)
> +{
> + return sk->sk_family == AF_VSOCK;
> +}
> +
> /**
> * sk_eat_skb - Release a skb if it is no longer needed
> * @sk: socket to eat this skb from
> diff --git a/net/core/sock_map.c b/net/core/sock_map.c
> index 242c91a6e3d3..07d6aa4e39ef 100644
> --- a/net/core/sock_map.c
> +++ b/net/core/sock_map.c
> @@ -647,6 +647,8 @@ BPF_CALL_4(bpf_sk_redirect_map, struct sk_buff *, skb,
> sk = __sock_map_lookup_elem(map, key);
> if (unlikely(!sk || !sock_map_redirect_allowed(sk)))
> return SK_DROP;
> + if ((flags & BPF_F_INGRESS) && sk_is_vsock(sk))
> + return SK_DROP;
>
> skb_bpf_set_redir(skb, sk, flags & BPF_F_INGRESS);
> return SK_PASS;
> @@ -675,6 +677,8 @@ BPF_CALL_4(bpf_msg_redirect_map, struct sk_msg *, msg,
> return SK_DROP;
> if (!(flags & BPF_F_INGRESS) && !sk_is_tcp(sk))
> return SK_DROP;
> + if (sk_is_vsock(sk))
> + return SK_DROP;
>
> msg->flags = flags;
> msg->sk_redir = sk;
> @@ -1249,6 +1253,8 @@ BPF_CALL_4(bpf_sk_redirect_hash, struct sk_buff *, skb,
> sk = __sock_hash_lookup_elem(map, key);
> if (unlikely(!sk || !sock_map_redirect_allowed(sk)))
> return SK_DROP;
> + if ((flags & BPF_F_INGRESS) && sk_is_vsock(sk))
> + return SK_DROP;
>
> skb_bpf_set_redir(skb, sk, flags & BPF_F_INGRESS);
> return SK_PASS;
> @@ -1277,6 +1283,8 @@ BPF_CALL_4(bpf_msg_redirect_hash, struct sk_msg *, msg,
> return SK_DROP;
> if (!(flags & BPF_F_INGRESS) && !sk_is_tcp(sk))
> return SK_DROP;
> + if (sk_is_vsock(sk))
> + return SK_DROP;
>
> msg->flags = flags;
> msg->sk_redir = sk;
On 9/27/24 11:15, Jakub Sitnicki wrote: > On Fri, Sep 27, 2024 at 12:54 AM +02, Michal Luczaj wrote: >> ... >> Here's a follow up: my guess is that some checks are missing. I'm not sure >> if it's the best approach, but this fixes things for me: > > So you have already found a bug with a negative test. Nice. > > Your patch makes sense to me. Great, I'll submit it properly. Another thing I've noticed is that unsupported (non-TCP) sk_msg redirects fail silently, i.e. send() is successful, then packet appears to be dropped, but because the BPF_SK_MSG_VERDICT program is never run, the verdict[SK_DROP] isn't updated. Is this by design? Also, for unsupported af_vsock sk_skb-to-ingress we hit the warning: [ 233.396654] rx_queue is empty, but rx_bytes is non-zero [ 233.396702] WARNING: CPU: 11 PID: 40601 at net/vmw_vsock/virtio_transport_common.c:589 virtio_transport_stream_dequeue+0x2e5/0x2f0 I'll try to fix that. Now, the series begin to grow long. Should the fixes come separately? > FYI, I've started a GH repo for anciallary materials for sockmap. > Code samples, pointers to resources, a backlog of stuff to do (?). > Inspired by the xdp-project repo: > > https://github.com/xdp-project/xdp-project > > We can move it to a dedicated project namespace, right now it's at: > > https://github.com/jsitnicki/sockmap-project/ > > Feel free to add stuff there. Not that I have anything to add, but sure, thanks :) Michal
I'm back after a short break. Sorry for delay. On Wed, Oct 02, 2024 at 10:27 AM +02, Michal Luczaj wrote: > On 9/27/24 11:15, Jakub Sitnicki wrote: >> On Fri, Sep 27, 2024 at 12:54 AM +02, Michal Luczaj wrote: >>> ... >>> Here's a follow up: my guess is that some checks are missing. I'm not sure >>> if it's the best approach, but this fixes things for me: >> >> So you have already found a bug with a negative test. Nice. >> >> Your patch makes sense to me. > > Great, I'll submit it properly. > > Another thing I've noticed is that unsupported (non-TCP) sk_msg redirects > fail silently, i.e. send() is successful, then packet appears to be > dropped, but because the BPF_SK_MSG_VERDICT program is never run, the > verdict[SK_DROP] isn't updated. Is this by design? That's curious. We don't override the proto::sendmsg callback for protocols which don't support sk_msg redirects, like UDP: https://elixir.bootlin.com/linux/v6.12-rc2/source/net/ipv4/udp_bpf.c#L114 The packet should get delivered to the peer socket as w/o sockmap. I will have to double check that. > Also, for unsupported af_vsock sk_skb-to-ingress we hit the warning: > > [ 233.396654] rx_queue is empty, but rx_bytes is non-zero > [ 233.396702] WARNING: CPU: 11 PID: 40601 at net/vmw_vsock/virtio_transport_common.c:589 virtio_transport_stream_dequeue+0x2e5/0x2f0 > > I'll try to fix that. Now, the series begin to grow long. Should the fixes > come separately? Thanks. And yes - if possible, better to push fixes separately. Because they go through the bpf tree, and they will still land in the upcoming -rc releases (and get backported). While improvements go through bpf-next. Of course that sometimes makes life more difficult if the improvements depend on some fixes... Not sure if anything from bpf-next gets backported if it has a Fixes tag. We can ask the stable kernel maintainers, if needed.
On 10/9/24 11:46, Jakub Sitnicki wrote: > That's curious. We don't override the proto::sendmsg callback for > protocols which don't support sk_msg redirects, like UDP: > > https://elixir.bootlin.com/linux/v6.12-rc2/source/net/ipv4/udp_bpf.c#L114 > > The packet should get delivered to the peer socket as w/o sockmap. > I will have to double check that. Ugh, no, you're right. I was checking the wrong queue all that time... Sorry for the confusion. > Thanks. And yes - if possible, better to push fixes separately. Because > they go through the bpf tree, and they will still land in the upcoming > -rc releases (and get backported). > > While improvements go through bpf-next. Of course that sometimes makes > life more difficult if the improvements depend on some fixes... I'm afraid that's the case for the redir selftest to run cleanly. Anyway, so those are the fixes mentioned, targeting bpf: https://lore.kernel.org/bpf/20241009-vsock-fixes-for-redir-v1-0-e455416f6d78@rbox.co/ > Not sure if anything from bpf-next gets backported if it has a Fixes > tag. We can ask the stable kernel maintainers, if needed.
On Wed, Aug 14, 2024 at 06:14 PM +02, Michal Luczaj wrote: > On 8/6/24 19:45, Jakub Sitnicki wrote: >> On Tue, Aug 06, 2024 at 07:18 PM +02, Michal Luczaj wrote: >>> Great, thanks for the review. With this completed, I guess we can unwind >>> the (mail) stack to [1]. Is that ingress-to-local et al. something you >>> wanted to take care of yourself or can I give it a try? >>> [1] https://lore.kernel.org/netdev/87msmqn9ws.fsf@cloudflare.com/ >> >> I haven't stated any work on. You're welcome to tackle that. >> >> All I have is a toy test that I've used to generate the redirect matrix. >> Perhaps it can serve as inspiration: >> >> https://github.com/jsitnicki/sockmap-redir-matrix > > All right, please let me know if this is more or less what you meant and > I'll post the whole series for a review (+patch to purge sockmap_listen of > redir tests, fix misnomers). Mostly I've just copypasted your code > (mangling it terribly along the way), so I feel silly claiming the > authorship. Should I assign you as an author? Don't worry about it. I appreciate the help. I will take a look at the redirect tests this weekend. > Note that the patches are based on [2], which has not reached bpf-next > (patchwork says: "Needs ACK"). > > [2] [PATCH bpf-next v2 0/6] selftests/bpf: Various sockmap-related fixes > https://lore.kernel.org/bpf/20240731-selftest-sockmap-fixes-v2-0-08a0c73abed2@rbox.co/ Might have slipped throught the cracks... Andrii, Martin, The patch set still applies cleanly to bpf-next. Would you be able to a look at this series? Anything we need to do? Thanks, (the other) Jakub
On 8/16/24 12:03 PM, Jakub Sitnicki wrote: > On Wed, Aug 14, 2024 at 06:14 PM +02, Michal Luczaj wrote: >> On 8/6/24 19:45, Jakub Sitnicki wrote: >>> On Tue, Aug 06, 2024 at 07:18 PM +02, Michal Luczaj wrote: >>>> Great, thanks for the review. With this completed, I guess we can unwind >>>> the (mail) stack to [1]. Is that ingress-to-local et al. something you >>>> wanted to take care of yourself or can I give it a try? >>>> [1] https://lore.kernel.org/netdev/87msmqn9ws.fsf@cloudflare.com/ >>> >>> I haven't stated any work on. You're welcome to tackle that. >>> >>> All I have is a toy test that I've used to generate the redirect matrix. >>> Perhaps it can serve as inspiration: >>> >>> https://github.com/jsitnicki/sockmap-redir-matrix >> >> All right, please let me know if this is more or less what you meant and >> I'll post the whole series for a review (+patch to purge sockmap_listen of >> redir tests, fix misnomers). Mostly I've just copypasted your code >> (mangling it terribly along the way), so I feel silly claiming the >> authorship. Should I assign you as an author? > > Don't worry about it. I appreciate the help. > > I will take a look at the redirect tests this weekend. > >> Note that the patches are based on [2], which has not reached bpf-next >> (patchwork says: "Needs ACK"). >> >> [2] [PATCH bpf-next v2 0/6] selftests/bpf: Various sockmap-related fixes >> https://lore.kernel.org/bpf/20240731-selftest-sockmap-fixes-v2-0-08a0c73abed2@rbox.co/ > > Might have slipped throught the cracks... > > > Andrii, Martin, > > The patch set still applies cleanly to bpf-next. > > Would you be able to a look at this series? Anything we need to do? will take a look. no need to resend.
Handle AF_UNIX in init_addr_loopback(). For pair creation, bind() the peer
socket to make SOCK_DGRAM connect() happy.
Signed-off-by: Michal Luczaj <mhal@rbox.co>
---
.../bpf/prog_tests/sockmap_helpers.h | 29 +++++++++++++++----
1 file changed, 24 insertions(+), 5 deletions(-)
diff --git a/tools/testing/selftests/bpf/prog_tests/sockmap_helpers.h b/tools/testing/selftests/bpf/prog_tests/sockmap_helpers.h
index 38e35c72bdaa..c50efa834a11 100644
--- a/tools/testing/selftests/bpf/prog_tests/sockmap_helpers.h
+++ b/tools/testing/selftests/bpf/prog_tests/sockmap_helpers.h
@@ -1,6 +1,7 @@
#ifndef __SOCKMAP_HELPERS__
#define __SOCKMAP_HELPERS__
+#include <sys/un.h>
#include <linux/vm_sockets.h>
/* include/linux/net.h */
@@ -283,6 +284,15 @@ static inline void init_addr_loopback6(struct sockaddr_storage *ss,
*len = sizeof(*addr6);
}
+static inline void init_addr_loopback_unix(struct sockaddr_storage *ss,
+ socklen_t *len)
+{
+ struct sockaddr_un *addr = memset(ss, 0, sizeof(*ss));
+
+ addr->sun_family = AF_UNIX;
+ *len = sizeof(sa_family_t);
+}
+
static inline void init_addr_loopback_vsock(struct sockaddr_storage *ss,
socklen_t *len)
{
@@ -304,6 +314,9 @@ static inline void init_addr_loopback(int family, struct sockaddr_storage *ss,
case AF_INET6:
init_addr_loopback6(ss, len);
return;
+ case AF_UNIX:
+ init_addr_loopback_unix(ss, len);
+ return;
case AF_VSOCK:
init_addr_loopback_vsock(ss, len);
return;
@@ -390,21 +403,27 @@ static inline int create_pair(int family, int sotype, int *p0, int *p1)
{
__close_fd int s, c = -1, p = -1;
struct sockaddr_storage addr;
- socklen_t len = sizeof(addr);
+ socklen_t len;
int err;
s = socket_loopback(family, sotype);
if (s < 0)
return s;
- err = xgetsockname(s, sockaddr(&addr), &len);
- if (err)
- return err;
-
c = xsocket(family, sotype, 0);
if (c < 0)
return c;
+ init_addr_loopback(family, &addr, &len);
+ err = xbind(c, sockaddr(&addr), len);
+ if (err)
+ return err;
+
+ len = sizeof(addr);
+ err = xgetsockname(s, sockaddr(&addr), &len);
+ if (err)
+ return err;
+
err = connect(c, sockaddr(&addr), len);
if (err) {
if (errno != EINPROGRESS) {
--
2.46.0
Let a selftest set BPF_F_INGRESS for map/hash redirect.
In run_tests(), explicitly reset skel->bss->test_ingress to false. Earlier
tests might have left it flipped.
Signed-off-by: Michal Luczaj <mhal@rbox.co>
---
tools/testing/selftests/bpf/prog_tests/sockmap_listen.c | 2 ++
tools/testing/selftests/bpf/progs/test_sockmap_listen.c | 6 ++++--
2 files changed, 6 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/bpf/prog_tests/sockmap_listen.c b/tools/testing/selftests/bpf/prog_tests/sockmap_listen.c
index da5a6fb03b69..a5e7d27760cf 100644
--- a/tools/testing/selftests/bpf/prog_tests/sockmap_listen.c
+++ b/tools/testing/selftests/bpf/prog_tests/sockmap_listen.c
@@ -1850,6 +1850,8 @@ static void test_udp_unix_redir(struct test_sockmap_listen *skel, struct bpf_map
static void run_tests(struct test_sockmap_listen *skel, struct bpf_map *map,
int family)
{
+ skel->bss->test_ingress = false;
+
test_ops(skel, map, family, SOCK_STREAM);
test_ops(skel, map, family, SOCK_DGRAM);
test_redir(skel, map, family, SOCK_STREAM);
diff --git a/tools/testing/selftests/bpf/progs/test_sockmap_listen.c b/tools/testing/selftests/bpf/progs/test_sockmap_listen.c
index b7250eb9c30c..5a3504d5dfc3 100644
--- a/tools/testing/selftests/bpf/progs/test_sockmap_listen.c
+++ b/tools/testing/selftests/bpf/progs/test_sockmap_listen.c
@@ -106,9 +106,11 @@ int prog_msg_verdict(struct sk_msg_md *msg)
int verdict;
if (test_sockmap)
- verdict = bpf_msg_redirect_map(msg, &sock_map, zero, 0);
+ verdict = bpf_msg_redirect_map(msg, &sock_map, zero,
+ test_ingress ? BPF_F_INGRESS : 0);
else
- verdict = bpf_msg_redirect_hash(msg, &sock_hash, &zero, 0);
+ verdict = bpf_msg_redirect_hash(msg, &sock_hash, &zero,
+ test_ingress ? BPF_F_INGRESS : 0);
count = bpf_map_lookup_elem(&verdict_map, &verdict);
if (count)
--
2.46.0
Test redirection logic.
BPF_MAP_TYPE_SOCKMAP
BPF_MAP_TYPE_SOCKHASH
✕
sk_msg-to-egress
sk_msg-to-ingress
sk_skb-to-egress
sk_skb-to-ingress
✕
AF_INET, SOCK_STREAM
AF_INET6, SOCK_STREAM
AF_INET, SOCK_DGRAM
AF_INET6, SOCK_DGRAM
AF_UNIX, SOCK_STREAM
AF_UNIX, SOCK_DGRAM
AF_VSOCK, SOCK_STREAM
AF_VSOCK, SOCK_SEQPACKET
Suggested-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Michal Luczaj <mhal@rbox.co>
---
.../bpf/prog_tests/sockmap_helpers.h | 58 ++++
.../selftests/bpf/prog_tests/sockmap_redir.c | 315 ++++++++++++++++++
2 files changed, 373 insertions(+)
create mode 100644 tools/testing/selftests/bpf/prog_tests/sockmap_redir.c
diff --git a/tools/testing/selftests/bpf/prog_tests/sockmap_helpers.h b/tools/testing/selftests/bpf/prog_tests/sockmap_helpers.h
index c50efa834a11..db0a7b4dc8be 100644
--- a/tools/testing/selftests/bpf/prog_tests/sockmap_helpers.h
+++ b/tools/testing/selftests/bpf/prog_tests/sockmap_helpers.h
@@ -16,6 +16,9 @@
#define VMADDR_CID_LOCAL 1
#endif
+#define u32(v) ((u32){(v)})
+#define u64(v) ((u64){(v)})
+
#define __always_unused __attribute__((__unused__))
/* include/linux/cleanup.h */
@@ -485,4 +488,59 @@ static inline int create_socket_pairs(int family, int sotype, int *c0, int *c1,
return err;
}
+static inline const char *socket_kind_to_str(int sock_fd)
+{
+ int domain = 0, type = 0;
+ socklen_t opt_len;
+
+ opt_len = sizeof(domain);
+ if (getsockopt(sock_fd, SOL_SOCKET, SO_DOMAIN, &domain, &opt_len))
+ FAIL_ERRNO("getsockopt(SO_DOMAIN)");
+
+ opt_len = sizeof(type);
+ if (getsockopt(sock_fd, SOL_SOCKET, SO_TYPE, &type, &opt_len))
+ FAIL_ERRNO("getsockopt(SO_TYPE)");
+
+ switch (domain) {
+ case AF_INET:
+ switch (type) {
+ case SOCK_STREAM:
+ return "tcp4";
+ case SOCK_DGRAM:
+ return "udp4";
+ }
+ break;
+ case AF_INET6:
+ switch (type) {
+ case SOCK_STREAM:
+ return "tcp6";
+ case SOCK_DGRAM:
+ return "udp6";
+ }
+ break;
+ case AF_UNIX:
+ switch (type) {
+ case SOCK_STREAM:
+ return "u_str";
+ case SOCK_DGRAM:
+ return "u_dgr";
+ case SOCK_SEQPACKET:
+ return "u_seq";
+ }
+ break;
+ case AF_VSOCK:
+ switch (type) {
+ case SOCK_STREAM:
+ return "v_str";
+ case SOCK_DGRAM:
+ return "v_dgr";
+ case SOCK_SEQPACKET:
+ return "v_seq";
+ }
+ break;
+ }
+
+ return "???";
+}
+
#endif // __SOCKMAP_HELPERS__
diff --git a/tools/testing/selftests/bpf/prog_tests/sockmap_redir.c b/tools/testing/selftests/bpf/prog_tests/sockmap_redir.c
new file mode 100644
index 000000000000..335dd348b019
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/sockmap_redir.c
@@ -0,0 +1,315 @@
+#include <stdio.h>
+#include <errno.h>
+#include <error.h>
+#include <sys/types.h>
+
+#include <sys/socket.h>
+#include <sys/un.h>
+#include <netinet/in.h>
+#include <linux/vm_sockets.h>
+
+#include <bpf/bpf.h>
+#include <bpf/libbpf.h>
+
+#include "test_progs.h"
+#include "sockmap_helpers.h"
+#include "test_sockmap_listen.skel.h"
+
+enum prog_kind {
+ SK_MSG_EGRESS,
+ SK_MSG_INGRESS,
+ SK_SKB_EGRESS,
+ SK_SKB_INGRESS,
+};
+
+struct {
+ enum prog_kind prog_kind;
+ const char *in, *out;
+} supported[] = {
+ /* Send to local: TCP -> any but vsock */
+ { SK_MSG_INGRESS, "tcp", "tcp" },
+ { SK_MSG_INGRESS, "tcp", "udp" },
+ { SK_MSG_INGRESS, "tcp", "u_str" },
+ { SK_MSG_INGRESS, "tcp", "u_dgr" },
+ /* Send to egress: TCP -> TCP */
+ { SK_MSG_EGRESS, "tcp", "tcp" },
+ /* Ingress to egress: any -> any */
+ { SK_SKB_EGRESS, "any", "any" },
+ /* Ingress to local: any -> any but vsock */
+ { SK_SKB_INGRESS, "any", "tcp" },
+ { SK_SKB_INGRESS, "any", "udp" },
+ { SK_SKB_INGRESS, "any", "u_str" },
+ { SK_SKB_INGRESS, "any", "u_dgr" },
+};
+
+enum {
+ SEND_INNER = 0,
+ SEND_OUTER,
+};
+
+enum {
+ RECV_INNER = 0,
+ RECV_OUTER,
+};
+
+enum map_kind {
+ SOCKMAP,
+ SOCKHASH,
+};
+
+struct redir_spec {
+ const char *name;
+ int idx_send;
+ int idx_recv;
+ enum prog_kind prog_kind;
+};
+
+struct socket_spec {
+ int family;
+ int sotype;
+ int send_flags;
+ int in[2];
+ int out[2];
+};
+
+static int socket_spec_pairs(struct socket_spec *s)
+{
+ return create_socket_pairs(s->family, s->sotype,
+ &s->in[0], &s->out[0],
+ &s->in[1], &s->out[1]);
+}
+
+static void socket_spec_close(struct socket_spec *s)
+{
+ xclose(s->in[0]);
+ xclose(s->in[1]);
+ xclose(s->out[0]);
+ xclose(s->out[1]);
+}
+
+static void get_redir_params(struct redir_spec *redir,
+ struct test_sockmap_listen *skel,
+ int *prog_fd, enum bpf_attach_type *attach_type,
+ bool *ingress_flag)
+{
+ enum prog_kind kind = redir->prog_kind;
+ struct bpf_program *prog;
+ bool sk_msg;
+
+ sk_msg = kind == SK_MSG_INGRESS || kind == SK_MSG_EGRESS;
+ prog = sk_msg ? skel->progs.prog_msg_verdict : skel->progs.prog_skb_verdict;
+
+ *prog_fd = bpf_program__fd(prog);
+ *attach_type = sk_msg ? BPF_SK_MSG_VERDICT : BPF_SK_SKB_VERDICT;
+ *ingress_flag = kind == SK_MSG_INGRESS || kind == SK_SKB_INGRESS;
+}
+
+static void test_send_redir_recv(int sd_send, int send_flags, int sd_in,
+ int sd_out, int sd_recv, int map_in, int map_out)
+{
+ char *send_buf = "ab";
+ char recv_buf = '\0';
+ ssize_t n, len = 1;
+
+ if (xbpf_map_update_elem(map_in, &u32(0), &u64(sd_in), BPF_NOEXIST))
+ return;
+
+ if (xbpf_map_update_elem(map_out, &u32(0), &u64(sd_out), BPF_NOEXIST))
+ goto del_in;
+
+ /* Last byte is OOB data when send_flags has MSG_OOB bit set */
+ if (send_flags & MSG_OOB)
+ len++;
+ n = send(sd_send, send_buf, len, send_flags);
+ if (n >= 0 && n < len)
+ FAIL("incomplete send");
+ if (n < len && errno != EACCES) {
+ FAIL_ERRNO("send");
+ goto out;
+ }
+
+ /* sk_msg redirect combo not supported */
+ if (errno == EACCES) {
+ test__skip();
+ goto out;
+ }
+
+ n = recv_timeout(sd_recv, &recv_buf, 1, 0, IO_TIMEOUT_SEC);
+ if (n != 1) {
+ FAIL_ERRNO("recv");
+ goto out;
+ }
+ if (recv_buf != send_buf[0])
+ FAIL("recv: payload check, %02x != %02x", recv_buf, send_buf[0]);
+
+ if (send_flags & MSG_OOB) {
+ /* Check that we can't read OOB while in sockmap */
+ errno = 0;
+ n = recv(sd_out, &recv_buf, 1, MSG_OOB | MSG_DONTWAIT);
+ if (n != -1)
+ FAIL("recv(MSG_OOB): expected failure: retval=%zd errno=%d",
+ n, errno);
+
+ /* Remove sd_out from sockmap */
+ xbpf_map_delete_elem(map_out, &u32(0));
+
+ /* Check that OOB was dropped on redirect */
+ errno = 0;
+ n = recv(sd_out, &recv_buf, 1, MSG_OOB | MSG_DONTWAIT);
+ if (n != -1)
+ FAIL("recv(MSG_OOB): expected failure: retval=%zd errno=%d",
+ n, errno);
+
+ goto del_in;
+ }
+out:
+ xbpf_map_delete_elem(map_out, &u32(0));
+del_in:
+ xbpf_map_delete_elem(map_in, &u32(0));
+}
+
+static bool is_supported(enum prog_kind prog_kind, const char *in, const char *out)
+{
+ for (int i = 0; i < ARRAY_SIZE(supported); ++i) {
+ if (supported[i].prog_kind == prog_kind &&
+ (!strcmp(supported[i].in, "any") || strstr(in, supported[i].in)) &&
+ (!strcmp(supported[i].out, "any") || strstr(out, supported[i].out)))
+ return true;
+ }
+
+ return false;
+}
+
+static void test_socket(enum map_kind map_kind, struct redir_spec *redir,
+ int map_in, int map_out, struct socket_spec *s_in,
+ struct socket_spec *s_out)
+{
+ int fd_in, fd_out, fd_send, fd_recv, send_flags;
+ const char *in_str, *out_str;
+ char s[MAX_TEST_NAME];
+
+ fd_in = s_in->in[0];
+ fd_out = s_out->out[0];
+ fd_send = s_in->in[redir->idx_send];
+ fd_recv = s_out->out[redir->idx_recv];
+ send_flags = s_in->send_flags;
+
+ in_str = socket_kind_to_str(fd_in);
+ out_str = socket_kind_to_str(fd_out);
+
+ snprintf(s, sizeof(s),
+ "%-4s %-17s %-5s → %-5s%6s",
+ /* hash sk_skb-to-ingress u_str → v_str (OOB) */
+ map_kind == SOCKMAP ? "map" : "hash",
+ redir->name,
+ in_str,
+ out_str,
+ send_flags & MSG_OOB ? "(OOB)" : "");
+
+ if (!test__start_subtest(s))
+ return;
+
+ if (!is_supported(redir->prog_kind, in_str, out_str)) {
+ test__skip();
+ return;
+ }
+
+ test_send_redir_recv(fd_send, send_flags, fd_in, fd_out, fd_recv,
+ map_in, map_out);
+}
+
+static void test_redir(enum map_kind map_kind, struct redir_spec *redir,
+ int map_in, int map_out)
+{
+ struct socket_spec *s, sockets[] = {
+ { AF_INET, SOCK_STREAM },
+ // { AF_INET, SOCK_STREAM, MSG_OOB }, /* Known to be broken */
+ { AF_INET6, SOCK_STREAM },
+ { AF_INET, SOCK_DGRAM },
+ { AF_INET6, SOCK_DGRAM },
+ { AF_UNIX, SOCK_STREAM },
+ { AF_UNIX, SOCK_STREAM, MSG_OOB },
+ { AF_UNIX, SOCK_DGRAM },
+ // { AF_UNIX, SOCK_SEQPACKET}, /* Not supported */
+ { AF_VSOCK, SOCK_STREAM },
+ // { AF_VSOCK, SOCK_DGRAM }, /* Not supported */
+ { AF_VSOCK, SOCK_SEQPACKET },
+ };
+
+ for (s = sockets; s < sockets + ARRAY_SIZE(sockets); s++)
+ if (socket_spec_pairs(s))
+ goto out;
+
+ /* Intra-proto */
+ for (s = sockets; s < sockets + ARRAY_SIZE(sockets); s++)
+ test_socket(map_kind, redir, map_in, map_out, s, s);
+
+ /* Cross-proto */
+ for (int i = 0; i < ARRAY_SIZE(sockets); i++) {
+ for (int j = 0; j < ARRAY_SIZE(sockets); j++) {
+ struct socket_spec *in = &sockets[i];
+ struct socket_spec *out = &sockets[j];
+
+ /* Skip intra-proto and between variants */
+ if (out->send_flags ||
+ (in->family == out->family &&
+ in->sotype == out->sotype))
+ continue;
+
+ test_socket(map_kind, redir, map_in, map_out, in, out);
+ }
+ }
+out:
+ while (--s >= sockets)
+ socket_spec_close(s);
+}
+
+static void test_map(enum map_kind map_kind)
+{
+ struct redir_spec *r, redirs[] = {
+ { "sk_msg-to-egress", SEND_INNER, RECV_OUTER, SK_MSG_EGRESS },
+ { "sk_msg-to-ingress", SEND_INNER, RECV_INNER, SK_MSG_INGRESS },
+ { "sk_skb-to-egress", SEND_OUTER, RECV_OUTER, SK_SKB_EGRESS },
+ { "sk_skb-to-ingress", SEND_OUTER, RECV_INNER, SK_SKB_INGRESS },
+ };
+
+ for (r = redirs; r < redirs + ARRAY_SIZE(redirs); r++) {
+ struct test_sockmap_listen *skel;
+ enum bpf_attach_type attach_type;
+ int prog, map_in, map_out;
+
+ skel = test_sockmap_listen__open_and_load();
+ if (!skel) {
+ FAIL("open_and_load");
+ return;
+ }
+
+ if (map_kind == SOCKMAP) {
+ skel->bss->test_sockmap = true;
+ map_out = bpf_map__fd(skel->maps.sock_map);
+ } else {
+ skel->bss->test_sockmap = false;
+ map_out = bpf_map__fd(skel->maps.sock_hash);
+ }
+
+ map_in = bpf_map__fd(skel->maps.nop_map);
+ get_redir_params(r, skel, &prog, &attach_type,
+ &skel->bss->test_ingress);
+
+ if (xbpf_prog_attach(prog, map_in, attach_type, 0))
+ return;
+
+ test_redir(map_kind, r, map_in, map_out);
+
+ if (xbpf_prog_detach2(prog, map_in, attach_type))
+ return;
+
+ test_sockmap_listen__destroy(skel);
+ }
+}
+
+void serial_test_sockmap_redir(void)
+{
+ test_map(SOCKMAP);
+ test_map(SOCKHASH);
+}
--
2.46.0
On Wed, Jul 31, 2024 at 12:01 PM +02, Michal Luczaj wrote: > Series takes care of few bugs and missing features with the aim to improve > the test coverage of sockmap/sockhash. > > Last patch is a create_pair() rewrite making use of > __attribute__((cleanup)) to handle socket fd lifetime. > > Signed-off-by: Michal Luczaj <mhal@rbox.co> > --- Sorry for the long turn-around time. I have opened some kind of Pandora's box with a recent USO change and been battling a regression even since. Also it was CfP deadline week. I will run & review this today / tomorrow latest.
On 8/5/24 17:22, Jakub Sitnicki wrote: > On Wed, Jul 31, 2024 at 12:01 PM +02, Michal Luczaj wrote: >> Series takes care of few bugs and missing features with the aim to improve >> the test coverage of sockmap/sockhash. >> >> Last patch is a create_pair() rewrite making use of >> __attribute__((cleanup)) to handle socket fd lifetime. >> >> Signed-off-by: Michal Luczaj <mhal@rbox.co> >> --- > > Sorry for the long turn-around time. > > I have opened some kind of Pandora's box with a recent USO change and > been battling a regression even since. Also it was CfP deadline week. > > I will run & review this today / tomorrow latest. Thanks for the update. But, really, as far as I'm concerned, no need for any rush.
© 2016 - 2025 Red Hat, Inc.