When the XDP metadata area was used, it is expected that the same
metadata can also be accessed from TC, as can be read in the description
of the bpf_xdp_adjust_meta helper function. In the tun driver, this was
not yet implemented.
To make this work, the skb that is being built on XDP_PASS should know
of the current size of the metadata area. This is ensured by adding
calls to skb_metadata_set. For the tun_xdp_one code path, an additional
check is necessary to handle the case where the externally initialized
xdp_buff has no metadata support (xdp->data_meta == xdp->data + 1).
More information about this feature can be found in the commit message
of commit de8f3a83b0a0 ("bpf: add meta pointer for direct access").
Signed-off-by: Marcus Wichelmann <marcus.wichelmann@hetzner-cloud.de>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Acked-by: Jason Wang <jasowang@redhat.com>
---
drivers/net/tun.c | 25 ++++++++++++++++++++++---
1 file changed, 22 insertions(+), 3 deletions(-)
diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index 4ec8fbd93c8d..70208b3a2e93 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -1600,7 +1600,8 @@ static bool tun_can_build_skb(struct tun_struct *tun, struct tun_file *tfile,
static struct sk_buff *__tun_build_skb(struct tun_file *tfile,
struct page_frag *alloc_frag, char *buf,
- int buflen, int len, int pad)
+ int buflen, int len, int pad,
+ int metasize)
{
struct sk_buff *skb = build_skb(buf, buflen);
@@ -1609,6 +1610,8 @@ static struct sk_buff *__tun_build_skb(struct tun_file *tfile,
skb_reserve(skb, pad);
skb_put(skb, len);
+ if (metasize)
+ skb_metadata_set(skb, metasize);
skb_set_owner_w(skb, tfile->socket.sk);
get_page(alloc_frag->page);
@@ -1668,6 +1671,7 @@ static struct sk_buff *tun_build_skb(struct tun_struct *tun,
char *buf;
size_t copied;
int pad = TUN_RX_PAD;
+ int metasize = 0;
int err = 0;
rcu_read_lock();
@@ -1695,7 +1699,7 @@ static struct sk_buff *tun_build_skb(struct tun_struct *tun,
if (hdr->gso_type || !xdp_prog) {
*skb_xdp = 1;
return __tun_build_skb(tfile, alloc_frag, buf, buflen, len,
- pad);
+ pad, metasize);
}
*skb_xdp = 0;
@@ -1730,12 +1734,18 @@ static struct sk_buff *tun_build_skb(struct tun_struct *tun,
pad = xdp.data - xdp.data_hard_start;
len = xdp.data_end - xdp.data;
+
+ /* It is known that the xdp_buff was prepared with metadata
+ * support, so the metasize will never be negative.
+ */
+ metasize = xdp.data - xdp.data_meta;
}
bpf_net_ctx_clear(bpf_net_ctx);
rcu_read_unlock();
local_bh_enable();
- return __tun_build_skb(tfile, alloc_frag, buf, buflen, len, pad);
+ return __tun_build_skb(tfile, alloc_frag, buf, buflen, len, pad,
+ metasize);
out:
bpf_net_ctx_clear(bpf_net_ctx);
@@ -2452,6 +2462,7 @@ static int tun_xdp_one(struct tun_struct *tun,
struct sk_buff_head *queue;
u32 rxhash = 0, act;
int buflen = hdr->buflen;
+ int metasize = 0;
int ret = 0;
bool skb_xdp = false;
struct page *page;
@@ -2506,6 +2517,14 @@ static int tun_xdp_one(struct tun_struct *tun,
skb_reserve(skb, xdp->data - xdp->data_hard_start);
skb_put(skb, xdp->data_end - xdp->data);
+ /* The externally provided xdp_buff may have no metadata support, which
+ * is marked by xdp->data_meta being xdp->data + 1. This will lead to a
+ * metasize of -1 and is the reason why the condition checks for > 0.
+ */
+ metasize = xdp->data - xdp->data_meta;
+ if (metasize > 0)
+ skb_metadata_set(skb, metasize);
+
if (virtio_net_hdr_to_skb(skb, gso, tun_is_little_endian(tun))) {
atomic_long_inc(&tun->rx_frame_errors);
kfree_skb(skb);
--
2.43.0
On 2/27/25 6:23 AM, Marcus Wichelmann wrote:
> When the XDP metadata area was used, it is expected that the same
> metadata can also be accessed from TC, as can be read in the description
> of the bpf_xdp_adjust_meta helper function. In the tun driver, this was
> not yet implemented.
>
> To make this work, the skb that is being built on XDP_PASS should know
> of the current size of the metadata area. This is ensured by adding
> calls to skb_metadata_set. For the tun_xdp_one code path, an additional
> check is necessary to handle the case where the externally initialized
> xdp_buff has no metadata support (xdp->data_meta == xdp->data + 1).
>
> More information about this feature can be found in the commit message
> of commit de8f3a83b0a0 ("bpf: add meta pointer for direct access").
> > Signed-off-by: Marcus Wichelmann <marcus.wichelmann@hetzner-cloud.de>
> Reviewed-by: Willem de Bruijn <willemb@google.com>
> Acked-by: Jason Wang <jasowang@redhat.com>
> ---
> drivers/net/tun.c | 25 ++++++++++++++++++++++---
> 1 file changed, 22 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/net/tun.c b/drivers/net/tun.c
> index 4ec8fbd93c8d..70208b3a2e93 100644
> --- a/drivers/net/tun.c
> +++ b/drivers/net/tun.c
The changes have conflicts with the commit 2506251e81d1 ("tun: Decouple vnet handling").
It is better to rebase the works onto the bpf-next/net,
i.e. the "net" branch instead of the "master" branch.
Am 28.02.25 um 20:49 schrieb Martin KaFai Lau:
> On 2/27/25 6:23 AM, Marcus Wichelmann wrote:
>> When the XDP metadata area was used, it is expected that the same
>> metadata can also be accessed from TC, as can be read in the description
>> of the bpf_xdp_adjust_meta helper function. In the tun driver, this was
>> not yet implemented.
>>
>> To make this work, the skb that is being built on XDP_PASS should know
>> of the current size of the metadata area. This is ensured by adding
>> calls to skb_metadata_set. For the tun_xdp_one code path, an additional
>> check is necessary to handle the case where the externally initialized
>> xdp_buff has no metadata support (xdp->data_meta == xdp->data + 1).
>>
>> More information about this feature can be found in the commit message
>> of commit de8f3a83b0a0 ("bpf: add meta pointer for direct access").
>> > Signed-off-by: Marcus Wichelmann <marcus.wichelmann@hetzner-cloud.de>
>> Reviewed-by: Willem de Bruijn <willemb@google.com>
>> Acked-by: Jason Wang <jasowang@redhat.com>
>> ---
>> drivers/net/tun.c | 25 ++++++++++++++++++++++---
>> 1 file changed, 22 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/net/tun.c b/drivers/net/tun.c
>> index 4ec8fbd93c8d..70208b3a2e93 100644
>> --- a/drivers/net/tun.c
>> +++ b/drivers/net/tun.c
>
> The changes have conflicts with the commit 2506251e81d1 ("tun: Decouple vnet handling").
>
> It is better to rebase the works onto the bpf-next/net,
> i.e. the "net" branch instead of the "master" branch.
Alright, will do that. Should I send it as a v5 and still with "PATCH bpf-next"
in the header or something else?
Thanks!
Marcus
On 3/3/25 8:13 AM, Marcus Wichelmann wrote:
> Am 28.02.25 um 20:49 schrieb Martin KaFai Lau:
>> On 2/27/25 6:23 AM, Marcus Wichelmann wrote:
>>> When the XDP metadata area was used, it is expected that the same
>>> metadata can also be accessed from TC, as can be read in the description
>>> of the bpf_xdp_adjust_meta helper function. In the tun driver, this was
>>> not yet implemented.
>>>
>>> To make this work, the skb that is being built on XDP_PASS should know
>>> of the current size of the metadata area. This is ensured by adding
>>> calls to skb_metadata_set. For the tun_xdp_one code path, an additional
>>> check is necessary to handle the case where the externally initialized
>>> xdp_buff has no metadata support (xdp->data_meta == xdp->data + 1).
>>>
>>> More information about this feature can be found in the commit message
>>> of commit de8f3a83b0a0 ("bpf: add meta pointer for direct access").
>>>> Signed-off-by: Marcus Wichelmann <marcus.wichelmann@hetzner-cloud.de>
>>> Reviewed-by: Willem de Bruijn <willemb@google.com>
>>> Acked-by: Jason Wang <jasowang@redhat.com>
>>> ---
>>> drivers/net/tun.c | 25 ++++++++++++++++++++++---
>>> 1 file changed, 22 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/drivers/net/tun.c b/drivers/net/tun.c
>>> index 4ec8fbd93c8d..70208b3a2e93 100644
>>> --- a/drivers/net/tun.c
>>> +++ b/drivers/net/tun.c
>>
>> The changes have conflicts with the commit 2506251e81d1 ("tun: Decouple vnet handling").
>>
>> It is better to rebase the works onto the bpf-next/net,
>> i.e. the "net" branch instead of the "master" branch.
>
> Alright, will do that. Should I send it as a v5 and still with "PATCH bpf-next"
> in the header or something else?
That should do. The bpf CI should pick up the bpf-next/net if it fails to apply
to the bpf-next/master because of the conflict mentioned above.
For patch 3, it should help to avoid the future merge conflict if the
open_tuntap() is added a few lines above in the network_helpers.h. For patch 6,
the "test_ns_" naming is not in bpf-next/net yet. Other tests in the same file
is doing netns_new. May be just do the same and cleanup all at once of this file
later.
© 2016 - 2026 Red Hat, Inc.