[PATCH v9 0/2] IB/mlx5: Fix loopback rollback and locking

Prathamesh Deshpande posted 2 patches 2 months, 1 week ago
There is a newer version of this series
drivers/infiniband/hw/mlx5/main.c | 81 +++++++++++++++++++++++--------
1 file changed, 62 insertions(+), 19 deletions(-)
Posted by Prathamesh Deshpande 2 months, 1 week ago
This series fixes transport-domain rollback and loopback state
consistency in mlx5 IB.

Patch 1 fixes TD rollback on mlx5_ib_enable_lb() failure, makes the
success return path explicit, and initializes lb.mutex earlier.
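
The rollback pattern described for Patch 1 can be sketched in standalone C
as below. This is only an illustration under stated assumptions, not the
driver code: mock_alloc_td(), mock_dealloc_td(), mock_enable_lb(),
alloc_td_with_lb() and the fail_enable_lb knob are hypothetical stand-ins
for the firmware commands and for mlx5_ib_enable_lb(), and the handle value
is arbitrary.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the firmware commands; not the mlx5 API. */
static int td_allocated;       /* number of live transport domains */
static bool fail_enable_lb;    /* knob: force the enable step to fail */

static int mock_alloc_td(uint32_t *tdn)
{
	td_allocated++;
	*tdn = 42;             /* arbitrary handle for illustration */
	return 0;
}

static void mock_dealloc_td(uint32_t tdn)
{
	(void)tdn;
	td_allocated--;
}

static int mock_enable_lb(void)
{
	return fail_enable_lb ? -EIO : 0;
}

/* Allocate a TD, then enable loopback; undo the allocation on failure. */
static int alloc_td_with_lb(uint32_t *tdn)
{
	int err;

	err = mock_alloc_td(tdn);
	if (err)
		return err;

	err = mock_enable_lb();
	if (err) {
		/* Roll back so the TD does not leak. */
		mock_dealloc_td(*tdn);
		return err;
	}

	/* Explicit success return rather than falling through with err. */
	return 0;
}
```

In this sketch, when the enable step fails the caller gets the error back
and td_allocated returns to its previous value, which is the property the
rollback is meant to guarantee.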

Patch 2 serializes MP force-enable state updates with lb.mutex and
implements capability-aware thresholds (td_base) to ensure correct
loopback behavior on both TD-capable and no-TD hardware.

v9:
- Address race/state issues around force_enable and enabled.
- Fix TD leak on failure after successful allocation.
- Implement hardware-aware thresholds via mlx5_ib_lb_td_base() to
  handle both TD-capable and no-TD hardware correctly.
- Serialize MP force-enable transitions under lb.mutex.

v8:
- Resubmitted as a fresh, independent thread per maintainer request.
- No functional changes since v7.

v7:
- Split the series into two patches to isolate the return-value/mutex 
  initialization fix from the refcounting logic.
- Moved force_enable check after increments/decrements to fix leaks.
- Updated hardware disable condition to a strict zero-check.

v1-v6:
- Initial combined versions.
- Added deallocation of tdn on failure.
- Moved mutex_init to stage_init_init to prevent crashes on non-ETH.
- Implemented atomic rollback in enable/disable paths.

Prathamesh Deshpande (2):
  IB/mlx5: Fix transport-domain rollback and initialize lb mutex earlier
  IB/mlx5: Serialize force-enable state and preserve loopback accounting

 drivers/infiniband/hw/mlx5/main.c | 81 +++++++++++++++++++++++--------
 1 file changed, 62 insertions(+), 19 deletions(-)

-- 
2.43.0
Re: [PATCH v9 0/2] IB/mlx5: Fix loopback rollback and locking
Posted by Leon Romanovsky 1 month, 1 week ago
On Fri, Apr 10, 2026 at 01:52:16AM +0100, Prathamesh Deshpande wrote:
> This series fixes transport-domain rollback and loopback state
> consistency in mlx5 IB.
> 
> Patch 1 fixes TD rollback on mlx5_ib_enable_lb() failure, makes the
> success return path explicit, and initializes lb.mutex earlier.
> 
> Patch 2 serializes MP force-enable state updates with lb.mutex and
> implements capability-aware thresholds (td_base) to ensure correct
> loopback behavior on both TD-capable and no-TD hardware.
> 
> v9:
> - Address race/state issues around force_enable and enabled.
> - Fix TD leak on failure after successful allocation.
> - Implement hardware-aware thresholds via mlx5_ib_lb_td_base() to
>   handle both TD-capable and no-TD hardware correctly.
> - Serialize MP force-enable transitions under lb.mutex.
> 
> v8:
> - Resubmitted as a fresh, independent thread per maintainer request.
> - No functional changes since v7.
> 
> v7:
> - Split the series into two patches to isolate the return-value/mutex 
>   initialization fix from the refcounting logic.
> - Moved force_enable check after increments/decrements to fix leaks.
> - Updated hardware disable condition to a strict zero-check.
> 
> v1-v6:
> - Initial combined versions.
> - Added deallocation of tdn on failure.
> - Moved mutex_init to stage_init_init to prevent crashes on non-ETH.
> - Implemented atomic rollback in enable/disable paths.
> 
> Prathamesh Deshpande (2):
>   IB/mlx5: Fix transport-domain rollback and initialize lb mutex earlier

I agree that this patch is needed.

>   IB/mlx5: Serialize force-enable state and preserve loopback accounting

This change does not appear to be justified. The commit message provides no
clear explanation of why it is needed.

Thanks

> 
>  drivers/infiniband/hw/mlx5/main.c | 81 +++++++++++++++++++++++--------
>  1 file changed, 62 insertions(+), 19 deletions(-)
> 
> -- 
> 2.43.0
>
Re: [PATCH v9 0/2] IB/mlx5: Fix loopback rollback and locking
Posted by Prathamesh Deshpande 1 month, 1 week ago
On Sun, May 10, 2026 at 13:55:31 +0300, Leon Romanovsky wrote:
> On Fri, Apr 10, 2026 at 01:52:16AM +0100, Prathamesh Deshpande wrote:
> > Prathamesh Deshpande (2):
> >   IB/mlx5: Fix transport-domain rollback and initialize lb mutex earlier
> 
> I agree that this patch is needed.
> 
> >   IB/mlx5: Serialize force-enable state and preserve loopback accounting
> 
> This change does not appear to be justified. The commit message provides no
> clear explanation of why it is needed.
> 
> Thanks
 
Thanks, Leon.
 
v11 dropped the MP force-enable locking changes and kept MP helper behavior
unchanged. Patch 2 is now limited to the regular-path threshold/accounting
fixes.

I have also just sent a fresh v12 series that addresses your Patch 1
review regarding the missing mutex cleanups. You can find the updated
series here:
https://lore.kernel.org/all/20260510222258.6654-1-prathameshdeshpande7@gmail.com/