[RFC net-next v2 00/12] Add TSO map-once DMA helpers and bnxt SW USO support

Joe Damato posted 12 patches 3 weeks, 4 days ago
There is a newer version of this series
[RFC net-next v2 00/12] Add TSO map-once DMA helpers and bnxt SW USO support
Posted by Joe Damato 3 weeks, 4 days ago
Greetings:

This series extends net/tso to add a data structure and some helpers allowing
drivers to DMA map headers and packet payloads a single time. The helpers can
then be used to reference slices of the shared mapping for each segment. This
helps to avoid the cost of repeated DMA mappings, especially on systems which
use an IOMMU. N per-packet DMA maps are replaced with a single map for the
entire GSO skb.
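
To illustrate the map-once idea, here is a minimal userspace sketch, not the
code from this series. It assumes the payload has been mapped once and that
each segment's descriptor only needs an (address, length) slice of that
mapping. The demo_* names and the flat layout are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one DMA mapping covering the whole GSO payload. */
struct demo_map {
	uint64_t base;	/* bus address returned by the single map call */
	size_t len;	/* total payload bytes covered by the mapping */
};

/* The part of the shared mapping that one wire segment points at. */
struct demo_slice {
	uint64_t addr;
	size_t len;
};

/* Number of wire segments needed for the payload at gso_size bytes each. */
static size_t demo_nr_segs(const struct demo_map *m, size_t gso_size)
{
	if (gso_size == 0)
		return 0;
	return (m->len + gso_size - 1) / gso_size;
}

/*
 * Compute the slice for segment idx without creating a new mapping.
 * Returns 0 on success, -1 if idx is out of range.
 */
static int demo_seg_slice(const struct demo_map *m, size_t gso_size,
			  size_t idx, struct demo_slice *out)
{
	size_t off;

	assert(m != NULL && out != NULL);
	if (idx >= demo_nr_segs(m, gso_size))
		return -1;

	off = idx * gso_size;
	out->addr = m->base + off;
	/* The last segment may be shorter than gso_size. */
	out->len = (m->len - off < gso_size) ? m->len - off : gso_size;
	return 0;
}
```

For a 3000-byte payload and a gso_size of 1400, this yields three slices of
1400, 1400 and 200 bytes, all derived from one mapping.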

The added helpers are then used in bnxt to add support for software UDP
Segmentation Offloading (SW USO) for older bnxt devices which do not have
support for USO in hardware. Since the helpers are generic, other drivers
can be extended similarly.

Testing on a production UDP workload shows a ~4x reduction in DMA mapping
calls at the same wire packet rate.

Special care is taken to make bnxt ethtool operations work correctly: the ring
size cannot be reduced below a minimum threshold while USO is enabled, and
growing the ring automatically re-enables USO if it was previously blocked.

I've extended netdevsim to have support for SW USO, but I used
tso_build_hdr/tso_build_data in netdevsim because I couldn't figure out if
there was a way to test the DMA helpers added by this series. If anyone has
suggestions, let me know. I think to test the DMA helpers you probably need
to use real hardware.

I ran the added uso.py test on both netdevsim and a real bnxt and the test
passed. I've also let this run in a production environment for ~24 hours.

Thanks,
Joe

RFCv2:
  - Some bugs were discovered shortly after sending: incorrect handling of the
    shared header space and a bug in the unmap path in the TX completion.
    Sorry about that; I was more careful this time.
  - On that note: this RFC includes a test.

RFCv1: https://lore.kernel.org/netdev/20260310212209.2263939-1-joe@dama.to/

Joe Damato (12):
  net: tso: Introduce tso_dma_map
  net: tso: Add tso_dma_map helpers
  net: bnxt: Export bnxt_xmit_get_cfa_action
  net: bnxt: Add a helper for tx_bd_ext
  net: bnxt: Use dma_unmap_len for TX completion unmapping
  net: bnxt: Add TX inline buffer infrastructure
  net: bnxt: Add boilerplate GSO code
  net: bnxt: Implement software USO
  net: bnxt: Add SW GSO completion and teardown support
  net: bnxt: Dispatch to SW USO
  net: netdevsim: Add support for SW USO
  selftests: drv-net: Add USO test

 drivers/net/ethernet/broadcom/bnxt/Makefile   |   2 +-
 drivers/net/ethernet/broadcom/bnxt/bnxt.c     | 177 +++++++++++---
 drivers/net/ethernet/broadcom/bnxt/bnxt.h     |  29 +++
 .../net/ethernet/broadcom/bnxt/bnxt_ethtool.c |  19 +-
 drivers/net/ethernet/broadcom/bnxt/bnxt_gso.c | 230 ++++++++++++++++++
 drivers/net/ethernet/broadcom/bnxt/bnxt_gso.h |  31 +++
 drivers/net/netdevsim/netdev.c                | 100 +++++++-
 include/net/tso.h                             |  42 ++++
 net/core/tso.c                                | 165 +++++++++++++
 tools/testing/selftests/drivers/net/Makefile  |   1 +
 tools/testing/selftests/drivers/net/uso.py    |  87 +++++++
 11 files changed, 843 insertions(+), 40 deletions(-)
 create mode 100644 drivers/net/ethernet/broadcom/bnxt/bnxt_gso.c
 create mode 100644 drivers/net/ethernet/broadcom/bnxt/bnxt_gso.h
 create mode 100755 tools/testing/selftests/drivers/net/uso.py


base-commit: 8e7adcf81564a3fe886a6270eea7558f063e5538
-- 
2.52.0
Re: [RFC net-next v2 00/12] Add TSO map-once DMA helpers and bnxt SW USO support
Posted by Leon Romanovsky 3 weeks ago
On Thu, Mar 12, 2026 at 03:34:37PM -0700, Joe Damato wrote:
> Greetings:
> 
> This series extends net/tso to add a data structure and some helpers allowing
> drivers to DMA map headers and packet payloads a single time. The helpers can
> then be used to reference slices of shared mapping for each segment. This
> helps to avoid the cost of repeated DMA mappings, especially on systems which
> use an IOMMU.

In modern kernels, this is done with the DMA IOVA API; see the NVMe
driver/block layer for the most comprehensive example.

The pseudo code is:
 if (with_iommu)
    use dma_iova_link/dma_iova_unlink
 else
    use dma_map_phys()

https://lore.kernel.org/all/cover.1746424934.git.leon@kernel.org/
https://lore.kernel.org/all/20250623141259.76767-1-hch@lst.de/
https://lwn.net/Articles/997563/

Thanks
Re: [RFC net-next v2 00/12] Add TSO map-once DMA helpers and bnxt SW USO support
Posted by Joe Damato 3 weeks ago
On Mon, Mar 16, 2026 at 09:44:19PM +0200, Leon Romanovsky wrote:
> On Thu, Mar 12, 2026 at 03:34:37PM -0700, Joe Damato wrote:
> > Greetings:
> > 
> > This series extends net/tso to add a data structure and some helpers allowing
> > drivers to DMA map headers and packet payloads a single time. The helpers can
> > then be used to reference slices of shared mapping for each segment. This
> > helps to avoid the cost of repeated DMA mappings, especially on systems which
> > use an IOMMU.
> 
> In modern kernels, it is done by using DMA IOVA API, see NVMe
> driver/block layer for the most comprehensive example.
> 
> The pseudo code is:
>  if (with_iommu)
>     use dma_iova_link/dma_iova_unlink
>  else
>     use dma_map_phys()
> 
> https://lore.kernel.org/all/cover.1746424934.git.leon@kernel.org/
> https://lore.kernel.org/all/20250623141259.76767-1-hch@lst.de/
> https://lwn.net/Articles/997563/

Thanks for the pointer.

I agree it's the right approach. Batching the IOVA allocation and IOTLB sync
across all regions is a clear win over the per-region
dma_map_single/skb_frag_dma_map calls I had in v2.

I'll submit a v3 with the tso_dma_map internals updated to use
dma_iova_try_alloc + dma_iova_link + dma_iova_sync, with a
dma_map_phys fallback.
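
The control flow discussed above can be modelled in userspace roughly as
follows. This is only a sketch: the demo_* functions are hypothetical
stand-ins and do not match the kernel's dma_iova_* or dma_map_phys
signatures. It assumes the IOMMU path links every region back to back into
one IOVA range and syncs once, and that the fallback maps each region on its
own (modelled here as an identity mapping).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_REGIONS 8

/* One physically contiguous piece of the skb (linear area or a frag). */
struct demo_region {
	uint64_t phys;
	size_t len;
};

/* Per-region bus addresses plus a counter to compare the two paths. */
struct demo_result {
	uint64_t dma[DEMO_MAX_REGIONS];
	unsigned int syncs;	/* IOTLB-sync-like steps performed */
	bool contiguous;	/* one IOVA range covers all regions */
};

/*
 * Map n regions. With an IOMMU, place them at running offsets inside one
 * range starting at iova_base and sync once. Without one, fall back to
 * mapping each region individually. Returns 0 on success, -1 if n is too
 * large for the result array.
 */
static int demo_map_all(bool with_iommu, uint64_t iova_base,
			const struct demo_region *r, size_t n,
			struct demo_result *res)
{
	size_t off = 0;

	assert(r != NULL && res != NULL);
	if (n > DEMO_MAX_REGIONS)
		return -1;

	for (size_t i = 0; i < n; i++) {
		if (with_iommu) {
			/* "Link" the region at the next offset in the range. */
			res->dma[i] = iova_base + off;
			off += r[i].len;
		} else {
			/* Fallback: one independent mapping per region. */
			res->dma[i] = r[i].phys;
		}
	}

	res->contiguous = with_iommu;
	res->syncs = with_iommu ? 1 : 0;	/* single batched sync */
	return 0;
}
```

The point of the model is the batching: however many regions the skb has,
the IOMMU path performs one allocation-like step and one sync rather than
one per region.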