[PATCH bpf-next,v3 1/2] doc: enhance explanation of XDP Rx metadata layout and METADATA_SIZE

Song Yoong Siang posted 2 patches 3 months, 1 week ago
Posted by Song Yoong Siang 3 months, 1 week ago
Add a diagram showing the metadata layout of devices that use the
data_meta area for their own purposes. Also expand the documentation on
selecting an appropriate METADATA_SIZE for XDP Rx metadata, so that it
accommodates both device-reserved and custom metadata, and describe the
alignment and size constraints that apply. The updated guidance helps
users correctly allocate and access metadata in AF_XDP scenarios.

Signed-off-by: Song Yoong Siang <yoong.siang.song@intel.com>
---
 Documentation/networking/xdp-rx-metadata.rst | 36 ++++++++++++++++----
 1 file changed, 30 insertions(+), 6 deletions(-)

diff --git a/Documentation/networking/xdp-rx-metadata.rst b/Documentation/networking/xdp-rx-metadata.rst
index a6e0ece18be5..65a1a6e0f7a2 100644
--- a/Documentation/networking/xdp-rx-metadata.rst
+++ b/Documentation/networking/xdp-rx-metadata.rst
@@ -54,6 +54,19 @@ area in whichever format it chooses. Later consumers of the metadata
 will have to agree on the format by some out of band contract (like for
 the AF_XDP use case, see below).
 
+It is important to note that some devices may utilize the ``data_meta`` area for
+their own purposes. For example, the IGC device utilizes ``IGC_TS_HDR_LEN``
+bytes of the ``data_meta`` area for receiving hardware timestamps. Therefore,
+the XDP program should ensure that it does not overwrite any existing metadata.
+The metadata layout of such device is depicted below::
+
+  +----------+-----------------+--------------------------+------+
+  | headroom | custom metadata | device-reserved metadata | data |
+  +----------+-----------------+--------------------------+------+
+             ^                                            ^
+             |                                            |
+   xdp_buff->data_meta                              xdp_buff->data
+
 AF_XDP
 ======
 
@@ -69,12 +82,23 @@ descriptor does _not_ explicitly carry the size of the metadata).
 
 Here is the ``AF_XDP`` consumer layout (note missing ``data_meta`` pointer)::
 
-  +----------+-----------------+------+
-  | headroom | custom metadata | data |
-  +----------+-----------------+------+
-                               ^
-                               |
-                        rx_desc->address
+             |<--------------METADATA_SIZE--------------->|
+  +----------+-----------------+--------------------------+------+
+  | headroom | custom metadata | device-reserved metadata | data |
+  +----------+-----------------+--------------------------+------+
+                                                          ^
+                                                          |
+                                                   rx_desc->address
+
+It is crucial that the agreed ``METADATA_SIZE`` between the BPF program and the
+final consumer is sufficient to accommodate both device-reserved metadata and
+the data the BPF program needs to populate.
+
+``bpf_xdp_adjust_meta`` ensures that ``METADATA_SIZE`` is aligned to 4 bytes,
+does not exceed 252 bytes, and leaves sufficient space for building the
+xdp_frame. If these conditions are not met, it returns a negative error. In this
+case, the BPF program should not proceed to populate data into the ``data_meta``
+area.
 
 XDP_PASS
 ========
-- 
2.34.1
Re: [PATCH bpf-next,v3 1/2] doc: enhance explanation of XDP Rx metadata layout and METADATA_SIZE
Posted by Daniel Borkmann 3 months ago
On 7/2/25 6:57 PM, Song Yoong Siang wrote:
[...]
> +It is important to note that some devices may utilize the ``data_meta`` area for
> +their own purposes. For example, the IGC device utilizes ``IGC_TS_HDR_LEN``
> +bytes of the ``data_meta`` area for receiving hardware timestamps. Therefore,
> +the XDP program should ensure that it does not overwrite any existing metadata.
> +The metadata layout of such device is depicted below::
> +
> +  +----------+-----------------+--------------------------+------+
> +  | headroom | custom metadata | device-reserved metadata | data |
> +  +----------+-----------------+--------------------------+------+
> +             ^                                            ^
> +             |                                            |
> +   xdp_buff->data_meta                              xdp_buff->data

Imho, this section is misleading to developers. Suppose you're an XDP program writer
and you want to implement a generic native BPF program (independent of the underlying
NIC). Does this mean the expectation is to dig into driver code to gather whether
or not a driver is prepopulating and how much of it? What are the implications if the
data is overwritten? For example, in Cilium today we use the buffer described here
as device-reserved metadata and override it. How will users know what breaks?
RE: [PATCH bpf-next,v3 1/2] doc: enhance explanation of XDP Rx metadata layout and METADATA_SIZE
Posted by Song, Yoong Siang 3 months ago
On Thursday, July 3, 2025 11:58 PM, Daniel Borkmann <daniel@iogearbox.net> wrote:
>On 7/2/25 6:57 PM, Song Yoong Siang wrote:
>[...]
>> +It is important to note that some devices may utilize the ``data_meta`` area for
>> +their own purposes. For example, the IGC device utilizes ``IGC_TS_HDR_LEN``
>> +bytes of the ``data_meta`` area for receiving hardware timestamps. Therefore,
>> +the XDP program should ensure that it does not overwrite any existing metadata.
>> +The metadata layout of such device is depicted below::
>> +
>> +  +----------+-----------------+--------------------------+------+
>> +  | headroom | custom metadata | device-reserved metadata | data |
>> +  +----------+-----------------+--------------------------+------+
>> +             ^                                            ^
>> +             |                                            |
>> +   xdp_buff->data_meta                              xdp_buff->data
>
>Imho, this section is misleading to developers. Suppose you're an XDP program writer
>and you want to implement a generic native BPF program (independent of the underlying
>NIC). Does this mean the expectation is to dig into driver code to gather whether
>or not a driver is prepopulating and how much of it? What are the implications if the
>data is overwritten? For example, in Cilium today we use the buffer described here
>as device-reserved metadata and override it. How will users know what breaks?

Thanks for your input.

A generic XDP program can always check the size of the device-reserved metadata
with "ctx->data - ctx->data_meta" and avoid overwriting it, as shown in the code
below from my v1 submission [1]. This requires the driver to expose the metadata
length it uses [2]. However, I did not have a good justification for making the
metadata length user-visible, so I submitted this v3 to keep it simple. Any thoughts?

+	metalen_used = ctx->data - ctx->data_meta;
+	metalen_to_adjust = XDP_METADATA_SIZE - metalen_used;
+	if (metalen_to_adjust < (int)sizeof(struct xdp_meta))
+		return XDP_DROP;
+
+	ret = bpf_xdp_adjust_meta(ctx, -metalen_to_adjust);
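
As a rough illustration only (not part of either submission), the arithmetic
in the snippet above can be sketched in plain userspace C. The function name
meta_bytes_to_grow() and the META_MAX_LEN macro are hypothetical; the 4-byte
alignment and 252-byte limit are taken from the documentation text in this
patch, and the real checks are done by bpf_xdp_adjust_meta() in the kernel:

```c
/* Upper bound on the metadata area, as stated in the patch text. */
#define META_MAX_LEN 252

/*
 * Hypothetical helper, for illustration only.
 *
 * metadata_size: total size agreed between BPF program and consumer
 * metalen_used:  bytes already reserved by the device
 *                (ctx->data - ctx->data_meta on entry)
 * custom_len:    bytes the BPF program wants to populate
 *
 * Returns how many bytes data_meta must move towards the headroom so
 * that the whole metadata area equals metadata_size, or -1 if the agreed
 * size is invalid or too small for both parts.
 */
int meta_bytes_to_grow(int metadata_size, int metalen_used, int custom_len)
{
	int to_adjust;

	if (metadata_size % 4 != 0 || metadata_size > META_MAX_LEN)
		return -1;
	if (metalen_used < 0 || custom_len < 0)
		return -1;

	to_adjust = metadata_size - metalen_used;
	if (to_adjust < custom_len)
		return -1;

	return to_adjust;
}
```

In an XDP program, a non-negative result would be negated and passed to
bpf_xdp_adjust_meta(), as in the v1 snippet; on -1 the program would skip
populating the data_meta area.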

[1] https://lore.kernel.org/netdev/20250701042940.3272325-3-yoong.siang.song@intel.com/
[2] https://lore.kernel.org/netdev/20250701080955.3273137-1-yoong.siang.song@intel.com/