Add a diagram showing the metadata layout of devices that use the
data_meta area for their own purposes. Also expand the documentation on
selecting an appropriate METADATA_SIZE for XDP Rx metadata so that it
accommodates both device-reserved and custom metadata, including the
alignment and size constraints. The updated guidance helps users
correctly allocate and access metadata in AF_XDP scenarios.
Signed-off-by: Song Yoong Siang <yoong.siang.song@intel.com>
---
Documentation/networking/xdp-rx-metadata.rst | 36 ++++++++++++++++----
1 file changed, 30 insertions(+), 6 deletions(-)
diff --git a/Documentation/networking/xdp-rx-metadata.rst b/Documentation/networking/xdp-rx-metadata.rst
index a6e0ece18be5..65a1a6e0f7a2 100644
--- a/Documentation/networking/xdp-rx-metadata.rst
+++ b/Documentation/networking/xdp-rx-metadata.rst
@@ -54,6 +54,19 @@ area in whichever format it chooses. Later consumers of the metadata
will have to agree on the format by some out of band contract (like for
the AF_XDP use case, see below).
+It is important to note that some devices may utilize the ``data_meta`` area for
+their own purposes. For example, the IGC device utilizes ``IGC_TS_HDR_LEN``
+bytes of the ``data_meta`` area for receiving hardware timestamps. Therefore,
+the XDP program should ensure that it does not overwrite any existing metadata.
+The metadata layout of such device is depicted below::
+
+  +----------+-----------------+--------------------------+------+
+  | headroom | custom metadata | device-reserved metadata | data |
+  +----------+-----------------+--------------------------+------+
+             ^                                            ^
+             |                                            |
+   xdp_buff->data_meta                              xdp_buff->data
+
AF_XDP
======
@@ -69,12 +82,23 @@ descriptor does _not_ explicitly carry the size of the metadata).
Here is the ``AF_XDP`` consumer layout (note missing ``data_meta`` pointer)::
-  +----------+-----------------+------+
-  | headroom | custom metadata | data |
-  +----------+-----------------+------+
-                               ^
-                               |
-                        rx_desc->address
+             |<--------------METADATA_SIZE--------------->|
+  +----------+-----------------+--------------------------+------+
+  | headroom | custom metadata | device-reserved metadata | data |
+  +----------+-----------------+--------------------------+------+
+                                                          ^
+                                                          |
+                                                   rx_desc->address
+
+It is crucial that the agreed ``METADATA_SIZE`` between the BPF program and the
+final consumer is sufficient to accommodate both device-reserved metadata and
+the data the BPF program needs to populate.
+
+``bpf_xdp_adjust_meta`` ensures that ``METADATA_SIZE`` is aligned to 4 bytes,
+does not exceed 252 bytes, and leaves sufficient space for building the
+xdp_frame. If these conditions are not met, it returns a negative error. In this
+case, the BPF program should not proceed to populate data into the ``data_meta``
+area.
XDP_PASS
========
--
2.34.1
On 7/2/25 6:57 PM, Song Yoong Siang wrote:
[...]
> +It is important to note that some devices may utilize the ``data_meta`` area for
> +their own purposes. For example, the IGC device utilizes ``IGC_TS_HDR_LEN``
> +bytes of the ``data_meta`` area for receiving hardware timestamps. Therefore,
> +the XDP program should ensure that it does not overwrite any existing metadata.
> +The metadata layout of such device is depicted below::
> +
> +  +----------+-----------------+--------------------------+------+
> +  | headroom | custom metadata | device-reserved metadata | data |
> +  +----------+-----------------+--------------------------+------+
> +             ^                                            ^
> +             |                                            |
> +   xdp_buff->data_meta                              xdp_buff->data

Imho, this section is misleading to developers. Suppose you're a XDP program writer
and you want to implement a generic native BPF program (independent of the underlying
NIC). Does this mean, the expectation is to dig into driver code to gather whether
or not a driver is prepopulating and how much of it? What are the implications if the
data is overwritten? For example, in Cilium today we use the buffer described here
as device-reserved metadata and override it. How will users know what breaks?
On Thursday, July 3, 2025 11:58 PM, Daniel Borkmann <daniel@iogearbox.net> wrote:
>On 7/2/25 6:57 PM, Song Yoong Siang wrote:
>[...]
>> +It is important to note that some devices may utilize the ``data_meta`` area for
>> +their own purposes. For example, the IGC device utilizes ``IGC_TS_HDR_LEN``
>> +bytes of the ``data_meta`` area for receiving hardware timestamps. Therefore,
>> +the XDP program should ensure that it does not overwrite any existing metadata.
>> +The metadata layout of such device is depicted below::
>> +
>> +  +----------+-----------------+--------------------------+------+
>> +  | headroom | custom metadata | device-reserved metadata | data |
>> +  +----------+-----------------+--------------------------+------+
>> +             ^                                            ^
>> +             |                                            |
>> +   xdp_buff->data_meta                              xdp_buff->data
>
>Imho, this section is misleading to developers. Suppose you're a XDP program writer
>and you want to implement a generic native BPF program (independent of the underlying
>NIC). Does this mean, the expectation is to dig into driver code to gather whether
>or not a driver is prepopulating and how much of it? What are the implications if the
>data is overwritten? For example, in Cilium today we use the buffer described here
>as device-reserved metadata and override it. How will users know what breaks?

Thanks for your input.

A generic XDP program can always check the size of the device-reserved metadata
with "ctx->data - ctx->data_meta" and avoid overwriting it, as shown in the code
below from my v1 submission [1]. This requires the driver to expose the metadata
length used [2]. However, I didn't have a good justification for making the
metadata length user-visible, so I submitted this v3 to keep it simple.

Any thoughts?

+	metalen_used = ctx->data - ctx->data_meta;
+	metalen_to_adjust = XDP_METADATA_SIZE - metalen_used;
+	if (metalen_to_adjust < (int)sizeof(struct xdp_meta))
+		return XDP_DROP;
+
+	ret = bpf_xdp_adjust_meta(ctx, -metalen_to_adjust);

[1] https://lore.kernel.org/netdev/20250701042940.3272325-3-yoong.siang.song@intel.com/
[2] https://lore.kernel.org/netdev/20250701080955.3273137-1-yoong.siang.song@intel.com/
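
The arithmetic in the v1 snippet above can be tried out in plain user-space C. What follows is only a rough sketch, not kernel or driver code: `XDP_METADATA_SIZE` is set to a hypothetical 32 bytes, the helper names `metalen_valid` and `metalen_to_adjust` are made up for illustration, and the size rule is taken from the patch text (4-byte aligned, at most 252 bytes). The headroom check that `bpf_xdp_adjust_meta` also performs is not modelled.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical size agreed between the BPF program and the AF_XDP consumer. */
#define XDP_METADATA_SIZE 32

/*
 * Size rule as described in the patch text: the total metadata length
 * must be a multiple of 4 bytes and must not exceed 252 bytes.
 * (Illustrative helper, not a kernel function.)
 */
static bool metalen_valid(size_t metalen)
{
	return (metalen % 4 == 0) && metalen <= 252;
}

/*
 * Given the bytes already used by the device (ctx->data - ctx->data_meta
 * in the BPF program) and the bytes the program wants to add, return how
 * far data_meta has to move back, or -1 if the agreed size cannot hold
 * both. A BPF program would pass the negated result to
 * bpf_xdp_adjust_meta().
 */
static int metalen_to_adjust(size_t metalen_used, size_t custom_len)
{
	size_t room;

	if (!metalen_valid(XDP_METADATA_SIZE) ||
	    metalen_used > XDP_METADATA_SIZE)
		return -1;

	room = XDP_METADATA_SIZE - metalen_used;
	if (room < custom_len)
		return -1;

	return (int)room;
}
```

With these assumed values, a device that already reserved 16 bytes leaves 16 bytes for custom metadata, so `metalen_to_adjust(16, 16)` yields 16 while `metalen_to_adjust(16, 24)` yields -1, which corresponds to the `XDP_DROP` path in the snippet.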