[PATCH net-next V7 0/2] net/mlx5: Avoid payload in skb's linear part for better GRO-processing

Tariq Toukan posted 2 patches 6 days, 23 hours ago
Hi,

This is V7 of a series originally submitted by Christoph.

When LRO is enabled on mlx5 devices, mlx5e_skb_from_cqe_mpwrq_nonlinear()
copies part of the payload into the linear part of the skb.

GRO handles such skbs suboptimally, which reduces throughput.

This patch series addresses this by using eth_get_headlen() to compute
the size of the protocol headers and copying only those bytes into the
linear part. This results in a significant throughput improvement
(detailed results are in the commit message of the second patch).
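
To illustrate the idea of "copy only the headers", here is a minimal
userspace sketch. It is not the kernel's eth_get_headlen(), which relies
on the flow dissector and handles many more protocols; sketch_headlen()
is a hypothetical helper that only understands untagged Ethernet, IPv4,
and TCP/UDP, and ignores VLAN tags, IPv6, tunnels and fragments.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical userspace helper, NOT the kernel's eth_get_headlen().
 * Returns how many leading bytes of the frame hold Ethernet + IPv4 +
 * TCP/UDP headers. If the frame is truncated or uses a protocol this
 * sketch does not know, it returns the length parsed so far.
 */
static size_t sketch_headlen(const uint8_t *buf, size_t len)
{
	size_t off = 14;			/* Ethernet header (ETH_HLEN) */
	size_t ihl, doff;
	uint8_t proto;

	if (len < off)
		return len;

	/* Only EtherType 0x0800 (IPv4) is parsed further. */
	if (buf[12] != 0x08 || buf[13] != 0x00)
		return off;

	if (len < off + 20 || (buf[off] >> 4) != 4)
		return off;

	ihl = (size_t)(buf[off] & 0x0f) * 4;	/* IPv4 header length */
	if (ihl < 20 || len < off + ihl)
		return off;

	proto = buf[off + 9];			/* IPv4 protocol field */
	off += ihl;

	if (proto == 6) {			/* TCP */
		if (len < off + 20)
			return off;
		doff = (size_t)(buf[off + 12] >> 4) * 4; /* TCP data offset */
		if (doff < 20 || len < off + doff)
			return off;
		off += doff;
	} else if (proto == 17) {		/* UDP */
		if (len < off + 8)
			return off;
		off += 8;
	}

	return off;
}
```

For a plain TCP/IPv4 frame without options this yields 54 bytes
(14 + 20 + 20), so only those bytes would be candidates for the linear
part and the payload would stay in the fragments.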

Regards,
Tariq

---

V7:
- Drop the cache-aligned memcpy patch, as further testing on other
  hosts no longer shows a benefit.
- For XDP, pull at most ETH_HLEN bytes into the linear part.
- Fix skb pull length calculation for XDP (Amery Hung).
- Switched from min_t() to min() to avoid 16-bit truncation of
  skb->data_len (David Laight).
- Improved commit message for last patch to make it clear
  that the benchmark is not on native XDP (Sashiko).
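
The min_t() truncation mentioned above can be reproduced in userspace
with a small sketch. SKETCH_MIN_T() is a hypothetical stand-in that
mimics how the kernel's min_t() casts both operands to the given type
before comparing. The 16-bit type is an assumption for illustration;
the type used in the earlier revision is not stated here.

```c
#include <stdint.h>

/*
 * Userspace stand-in for min_t(): both operands are cast to 'type'
 * before the comparison. (Evaluates its arguments more than once,
 * which is acceptable for this sketch only.)
 */
#define SKETCH_MIN_T(type, a, b) \
	((type)(a) < (type)(b) ? (type)(a) : (type)(b))

/* Casting to a 16-bit type silently drops the upper bits of data_len. */
static unsigned int pull_len_min_t(unsigned int headlen, unsigned int data_len)
{
	return SKETCH_MIN_T(uint16_t, headlen, data_len);
}

/* Comparing in the operands' own type keeps the full value. */
static unsigned int pull_len_min(unsigned int headlen, unsigned int data_len)
{
	return headlen < data_len ? headlen : data_len;
}
```

With headlen = 54 and data_len = 65546 (65536 + 10), the cast variant
compares 54 against 10 and returns 10, while the plain comparison
returns 54.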

V6:
https://lore.kernel.org/all/20260507095330.318892-1-tariqt@nvidia.com/

Christoph Paasch (2):
  net/mlx5e: DMA-sync earlier in mlx5e_skb_from_cqe_mpwrq_nonlinear
  net/mlx5e: Avoid copying payload to the skb's linear part

 .../net/ethernet/mellanox/mlx5/core/en_rx.c   | 33 ++++++++++++-------
 1 file changed, 22 insertions(+), 11 deletions(-)


base-commit: 8415598365503ced2e3d019491b0a2756c85c494
-- 
2.44.0