[PATCH net V2 0/3] net/mlx5: Fixes for Socket-Direct

Tariq Toukan posted 3 patches 2 months ago
There is a newer version of this series
Hi,

This series fixes several race conditions and bugs in the mlx5
Socket-Direct (SD) single netdev flow.

Patch 1 serializes mlx5_sd_init()/mlx5_sd_cleanup() with
mlx5_devcom_comp_lock() and tracks the SD group state on the primary
device, preventing concurrent or duplicate bring-up/tear-down.
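
As a rough userspace illustration of the idea in patch 1 (not the actual
driver code), the sketch below guards a group state with one lock so that
only the first caller performs bring-up or tear-down. A pthread mutex
stands in for mlx5_devcom_comp_lock(), and every sd_sketch_* name, the
DOWN/UP states and the counters are hypothetical. Only the existence of a
"destroying" state comes from this series; how the patch uses it is not
shown here.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical states; only a "destroying" state is named in the series. */
enum sd_sketch_state {
	SD_SKETCH_DOWN,
	SD_SKETCH_UP,
	SD_SKETCH_DESTROYING,
};

/* Stand-in for the SD group state tracked on the primary device. */
struct sd_sketch_group {
	pthread_mutex_t lock;	/* plays the role of the devcom component lock */
	enum sd_sketch_state state;
	int bringups;		/* how many times bring-up really ran */
	int teardowns;		/* how many times tear-down really ran */
};

#define SD_SKETCH_GROUP_INIT \
	{ PTHREAD_MUTEX_INITIALIZER, SD_SKETCH_DOWN, 0, 0 }

/* Returns true only for the caller that actually performed bring-up. */
bool sd_sketch_init(struct sd_sketch_group *g)
{
	bool did_work = false;

	pthread_mutex_lock(&g->lock);
	if (g->state == SD_SKETCH_DOWN) {
		g->bringups++;
		g->state = SD_SKETCH_UP;
		did_work = true;
	}
	pthread_mutex_unlock(&g->lock);
	return did_work;
}

/* Returns true only for the caller that actually performed tear-down. */
bool sd_sketch_cleanup(struct sd_sketch_group *g)
{
	bool did_work = false;

	pthread_mutex_lock(&g->lock);
	if (g->state == SD_SKETCH_UP) {
		/* Mark the group as going away before undoing the bring-up. */
		g->state = SD_SKETCH_DESTROYING;
		g->teardowns++;
		g->state = SD_SKETCH_DOWN;
		did_work = true;
	}
	pthread_mutex_unlock(&g->lock);
	return did_work;
}
```

With a group initialized from SD_SKETCH_GROUP_INIT, a second
sd_sketch_init() or sd_sketch_cleanup() call in a row returns false and
leaves the counters unchanged.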

Patch 2 fixes the debugfs "multi-pf" directory being stored on the
calling device's sd struct instead of the primary's, which caused
memory leaks and recreation errors when cleanup ran from a different PF.
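
The ownership rule in patch 2 can be modeled in a few lines of userspace
C. This is only a sketch under assumed names: struct sd_sketch, its
fields and both helpers are hypothetical, and a heap allocation stands in
for the debugfs "multi-pf" directory. The point it shows is that create
and remove always resolve to the primary's struct, whichever PF calls
them.

```c
#include <stdlib.h>

/* Hypothetical per-PF struct; not the driver's real layout. */
struct sd_sketch {
	struct sd_sketch *primary;	/* points to itself on the primary PF */
	void *dfs;			/* stand-in for the "multi-pf" directory */
};

/* Create the directory once, always recording it on the primary. */
int sd_sketch_dfs_create(struct sd_sketch *caller)
{
	struct sd_sketch *p = caller->primary;

	if (p->dfs)
		return -1;	/* already exists, do not recreate */
	p->dfs = malloc(1);
	return p->dfs ? 0 : -2;
}

/* Remove the directory via the primary, whichever PF runs cleanup. */
void sd_sketch_dfs_remove(struct sd_sketch *caller)
{
	struct sd_sketch *p = caller->primary;

	free(p->dfs);
	p->dfs = NULL;
}
```

Because the handle never lands on a secondary's struct, cleanup from any
PF releases it and a later create does not collide with a stale entry.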

Patch 3 fixes a race where a secondary PF could access the primary's
auxiliary device after it had been unbound, by holding the primary's
device lock while operating on its auxiliary device.
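
The locking pattern described for patch 3 can be sketched in userspace as
follows. It is an illustration under stated assumptions, not the driver
code: a pthread mutex stands in for the primary's device lock, a pointer
that is non-NULL only while bound stands in for the auxiliary device, and
all *_sketch names are hypothetical.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical model of the primary PF as seen by a secondary. */
struct primary_sketch {
	pthread_mutex_t dev_lock;	/* stand-in for the primary's device lock */
	void *adev_priv;		/* non-NULL only while the aux device is bound */
};

/* Example operation a secondary might run on the primary's private data. */
int sketch_read_int(void *priv)
{
	return *(int *)priv;
}

/*
 * Check the binding and run the operation under the same lock, so an
 * unbind cannot slip in between the check and the use.
 * Returns -1 if the auxiliary device is not bound.
 */
int secondary_sketch_use_primary(struct primary_sketch *p,
				 int (*op)(void *priv))
{
	int ret = -1;

	pthread_mutex_lock(&p->dev_lock);
	if (p->adev_priv)
		ret = op(p->adev_priv);
	pthread_mutex_unlock(&p->dev_lock);
	return ret;
}

/* Unbind takes the same lock before clearing the pointer. */
void primary_sketch_unbind(struct primary_sketch *p)
{
	pthread_mutex_lock(&p->dev_lock);
	p->adev_priv = NULL;
	pthread_mutex_unlock(&p->dev_lock);
}
```

After primary_sketch_unbind() returns, secondary_sketch_use_primary()
reports -1 instead of touching the stale pointer.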

Regards,
Tariq

V2:
- Link to V1:
  https://lore.kernel.org/all/20260330193412.53408-1-tariqt@nvidia.com/
- Reorder the patches so that "net/mlx5: SD: Serialize init/cleanup"
  is first.
- Add MLX5_SD_STATE_DESTROYING to the patch above to solve a concurrent
  edge case.
- Expand commit message of "net/mlx5e: SD, Fix race condition in
  secondary device probe/remove".

Shay Drory (3):
  net/mlx5: SD: Serialize init/cleanup
  net/mlx5: SD, Keep multi-pf debugfs entries on primary
  net/mlx5e: SD, Fix race condition in secondary device probe/remove

 .../net/ethernet/mellanox/mlx5/core/en_main.c | 18 +++--
 .../net/ethernet/mellanox/mlx5/core/lib/sd.c  | 70 ++++++++++++++++---
 .../net/ethernet/mellanox/mlx5/core/lib/sd.h  |  2 +
 3 files changed, 77 insertions(+), 13 deletions(-)


base-commit: 2dddb34dd0d07b01fa770eca89480a4da4f13153
-- 
2.44.0