[PATCH bpf-next 0/2] Clarify and Enhance XDP Rx Metadata Handling

Song Yoong Siang posted 2 patches 3 months, 1 week ago
There is a newer version of this series
Documentation/networking/xdp-rx-metadata.rst  | 38 +++++++++++++++++++
.../selftests/bpf/prog_tests/xdp_metadata.c   |  2 +-
.../selftests/bpf/progs/xdp_hw_metadata.c     | 10 ++++-
.../selftests/bpf/progs/xdp_metadata.c        |  8 +++-
tools/testing/selftests/bpf/xdp_hw_metadata.c |  2 +-
tools/testing/selftests/bpf/xdp_metadata.h    |  7 ++++
6 files changed, 63 insertions(+), 4 deletions(-)
[PATCH bpf-next 0/2] Clarify and Enhance XDP Rx Metadata Handling
Posted by Song Yoong Siang 3 months, 1 week ago
This patch set improves the documentation and selftests for XDP Rx metadata
handling. The first patch clarifies the documentation around XDP metadata
layout and the use of bpf_xdp_adjust_meta. The second patch enhances the
BPF selftests to make XDP metadata handling more robust and portable across
different NICs.

Prior to this patch set, the user application retrieved the xdp_meta by
calculating backward from the data pointer, while the XDP program fill in
the xdp_meta by calculating backward from data_meta. This approach will
cause mismatch if there is device-reserved metadata.

                        |<---sizeof(xdp_meta)--|
                        |                      |
                 struct xdp_meta               rx_desc->address
                        ^                      ^
                        |                      |
+----------+----------------------+------------+------+
| headroom |    custom metadata   |  reserved  | data |
+----------+----------------------+------------+------+
           ^                      ^            ^
           |                      |            |
    struct xdp_meta     xdp_buff->data_meta    xdp_buff->data
           |                      |
           |<---sizeof(xdp_meta)--|

Song Yoong Siang (2):
  doc: clarify XDP Rx metadata layout and bpf_xdp_adjust_meta usage
  selftests/bpf: Enhance XDP Rx Metadata Handling

 Documentation/networking/xdp-rx-metadata.rst  | 38 +++++++++++++++++++
 .../selftests/bpf/prog_tests/xdp_metadata.c   |  2 +-
 .../selftests/bpf/progs/xdp_hw_metadata.c     | 10 ++++-
 .../selftests/bpf/progs/xdp_metadata.c        |  8 +++-
 tools/testing/selftests/bpf/xdp_hw_metadata.c |  2 +-
 tools/testing/selftests/bpf/xdp_metadata.h    |  7 ++++
 6 files changed, 63 insertions(+), 4 deletions(-)

-- 
2.34.1
Re: [PATCH bpf-next 0/2] Clarify and Enhance XDP Rx Metadata Handling
Posted by Jakub Kicinski 3 months ago
On Tue,  1 Jul 2025 12:29:38 +0800 Song Yoong Siang wrote:
>                         |<---sizeof(xdp_meta)--|
>                         |                      |
>                  struct xdp_meta               rx_desc->address
>                         ^                      ^
>                         |                      |
> +----------+----------------------+------------+------+
> | headroom |    custom metadata   |  reserved  | data |
> +----------+----------------------+------------+------+
>            ^                      ^            ^
>            |                      |            |
>     struct xdp_meta     xdp_buff->data_meta    xdp_buff->data
>            |                      |
>            |<---sizeof(xdp_meta)--|

Huh. Did AF_XDP maintainers explicitly sign off on this or it's just how
IGC implementation works and nobody noticed?

For normal XDP my understanding is that its the driver's responsibility
to move the "reserved" stuff out of place before presenting the frame to
program.
RE: [PATCH bpf-next 0/2] Clarify and Enhance XDP Rx Metadata Handling
Posted by Song, Yoong Siang 3 months ago
On Tuesday, July 8, 2025 4:55 AM, Jakub Kicinski <kuba@kernel.org> wrote:
>On Tue,  1 Jul 2025 12:29:38 +0800 Song Yoong Siang wrote:
>>                         |<---sizeof(xdp_meta)--|
>>                         |                      |
>>                  struct xdp_meta               rx_desc->address
>>                         ^                      ^
>>                         |                      |
>> +----------+----------------------+------------+------+
>> | headroom |    custom metadata   |  reserved  | data |
>> +----------+----------------------+------------+------+
>>            ^                      ^            ^
>>            |                      |            |
>>     struct xdp_meta     xdp_buff->data_meta    xdp_buff->data
>>            |                      |
>>            |<---sizeof(xdp_meta)--|
>
>Huh. Did AF_XDP maintainers explicitly sign off on this or it's just how
>IGC implementation works and nobody noticed?
>
Previously,  IGC do copy out the Rx hwts from metadata area,
so no problem when implementing XDP Rx metadata.

After that, net_device_ops.ndo_get_tstamp() is added into IGC to
support timestamping from both free-running clock and adjustable clock.
The 2 timers are stored in the metadata area, thus causing the issue.

>For normal XDP my understanding is that its the driver's responsibility
>to move the "reserved" stuff out of place before presenting the frame to
>program.

Is it means that driver needs to move out the "reserved" stuff before XDP program
and then move back the stuff after XDP program for certain situation, like XDP_PASS?

IMHO, if driver is allowed to use some portion of the metadata area, then
the packet processing will be more efficiency and also align with the "zero-copy" idea.

Any thoughts?
  
Re: [PATCH bpf-next 0/2] Clarify and Enhance XDP Rx Metadata Handling
Posted by Jakub Kicinski 3 months ago
On Tue, 8 Jul 2025 01:34:13 +0000 Song, Yoong Siang wrote:
> >For normal XDP my understanding is that its the driver's responsibility
> >to move the "reserved" stuff out of place before presenting the frame to
> >program.  
> 
> Is it means that driver needs to move out the "reserved" stuff before XDP program
> and then move back the stuff after XDP program for certain situation, like XDP_PASS?

Why would the driver need to move it back?
On XDP_PASS an skb is constructed, so the metadata should 
be transferred to the skb. There is no need to copy it back
as a prepend.
RE: [PATCH bpf-next 0/2] Clarify and Enhance XDP Rx Metadata Handling
Posted by Song, Yoong Siang 3 months ago
On Tuesday, July 8, 2025 9:45 AM, Jakub Kicinski <kuba@kernel.org> wrote:
>On Tue, 8 Jul 2025 01:34:13 +0000 Song, Yoong Siang wrote:
>> >For normal XDP my understanding is that its the driver's responsibility
>> >to move the "reserved" stuff out of place before presenting the frame to
>> >program.
>>
>> Is it means that driver needs to move out the "reserved" stuff before XDP program
>> and then move back the stuff after XDP program for certain situation, like
>XDP_PASS?
>
>Why would the driver need to move it back?
>On XDP_PASS an skb is constructed, so the metadata should
>be transferred to the skb. There is no need to copy it back
>as a prepend.

I said so because I thought need to put back the timestamp
as prepend and then point skb_shared_hwtstamps.netdev_data to it
to support the ndo_get_tstamp().

I haven't study the code flow in detail, so I might be missing something.
Re: [PATCH bpf-next 0/2] Clarify and Enhance XDP Rx Metadata Handling
Posted by Jakub Kicinski 3 months ago
On Tue, 8 Jul 2025 02:06:11 +0000 Song, Yoong Siang wrote:
>> Why would the driver need to move it back?
>> On XDP_PASS an skb is constructed, so the metadata should
>> be transferred to the skb. There is no need to copy it back
>> as a prepend.  
> 
> I said so because I thought need to put back the timestamp
> as prepend and then point skb_shared_hwtstamps.netdev_data to it
> to support the ndo_get_tstamp().

No need, the timestamps are set in shared info directly.
There are multiple drivers which use the metadata prepend
method, so I'm pretty sure it should work.
RE: [PATCH bpf-next 0/2] Clarify and Enhance XDP Rx Metadata Handling
Posted by Song, Yoong Siang 3 months ago
On Tuesday, July 8, 2025 10:18 AM, Jakub Kicinski <kuba@kernel.org> wrote:
>On Tue, 8 Jul 2025 02:06:11 +0000 Song, Yoong Siang wrote:
>>> Why would the driver need to move it back?
>>> On XDP_PASS an skb is constructed, so the metadata should
>>> be transferred to the skb. There is no need to copy it back
>>> as a prepend.
>>
>> I said so because I thought need to put back the timestamp
>> as prepend and then point skb_shared_hwtstamps.netdev_data to it
>> to support the ndo_get_tstamp().
>
>No need, the timestamps are set in shared info directly.
>There are multiple drivers which use the metadata prepend
>method, so I'm pretty sure it should work.

Thanks for pointing me in the right direction.
I'll proceed with updating the IGC driver and conduct tests
to ensure everything works as expected.