[PATCH net] octeon_ep: Fix host hang issue during device reboot

Sathesh B Edara posted 1 patch 9 months, 2 weeks ago
drivers/net/ethernet/marvell/octeon_ep/octep_main.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
[PATCH net] octeon_ep: Fix host hang issue during device reboot
Posted by Sathesh B Edara 9 months, 2 weeks ago
When the host loses heartbeat messages from the device,
the driver calls the device-specific ndo_stop function,
octep_stop(), directly. This frees the resources but
leaves the interface marked as up. If the driver is then
unloaded, ndo_stop is called again and attempts to free
resources that have already been freed, which hangs the
host.

To resolve this, call dev_close() instead of the
device-specific stop function. dev_close() internally
calls ndo_stop to stop the network interface and also
marks the interface as down. During driver unload,
ndo_stop is not called again if the device is already
down.

Fixes: 5cb96c29aa0e ("octeon_ep: add heartbeat monitor")
Signed-off-by: Sathesh B Edara <sedara@marvell.com>
---
 drivers/net/ethernet/marvell/octeon_ep/octep_main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/marvell/octeon_ep/octep_main.c b/drivers/net/ethernet/marvell/octeon_ep/octep_main.c
index 0a679e95196f..24499bb36c00 100644
--- a/drivers/net/ethernet/marvell/octeon_ep/octep_main.c
+++ b/drivers/net/ethernet/marvell/octeon_ep/octep_main.c
@@ -1223,7 +1223,7 @@ static void octep_hb_timeout_task(struct work_struct *work)
 		miss_cnt);
 	rtnl_lock();
 	if (netif_running(oct->netdev))
-		octep_stop(oct->netdev);
+		dev_close(oct->netdev);
 	rtnl_unlock();
 }
 
-- 
2.36.0
Re: [PATCH net] octeon_ep: Fix host hang issue during device reboot
Posted by Simon Horman 9 months, 1 week ago
On Tue, Apr 29, 2025 at 04:46:24AM -0700, Sathesh B Edara wrote:
> When the host loses heartbeat messages from the device,
> the driver calls the device-specific ndo_stop function,
> which frees the resources. If the driver is unloaded in
> this scenario, it calls ndo_stop again, attempting to free
> resources that have already been freed, leading to a host
> hang issue. To resolve this, dev_close should be called
> instead of the device-specific stop function. dev_close
> internally calls ndo_stop to stop the network interface
> and performs additional cleanup tasks. During the driver
> unload process, if the device is already down, ndo_stop
> is not called.
> 
> Fixes: 5cb96c29aa0e ("octeon_ep: add heartbeat monitor")
> Signed-off-by: Sathesh B Edara <sedara@marvell.com>

Reviewed-by: Simon Horman <horms@kernel.org>
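
For readers following the thread, the double-stop scenario from the commit message can be sketched as a small user-space model. This is not kernel code: struct model_netdev and every model_* function are hypothetical stand-ins for struct net_device, ndo_open, ndo_stop, dev_close() and the unregister path, and the model assumes only what the commit message states, namely that dev_close() skips ndo_stop when the device is already down.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct net_device; not the kernel type. */
struct model_netdev {
	bool up;         /* models the core's view of the interface state */
	int *ring;       /* models driver resources released by ndo_stop */
	int stop_calls;  /* how many times the stop callback has run */
};

/* Models ndo_open: allocate resources and mark the interface up. */
static int model_open(struct model_netdev *dev)
{
	dev->ring = calloc(16, sizeof(*dev->ring));
	if (!dev->ring)
		return -1;
	dev->up = true;
	return 0;
}

/* Models the driver's ndo_stop: frees resources but leaves 'up' alone. */
static void model_ndo_stop(struct model_netdev *dev)
{
	dev->stop_calls++;
	free(dev->ring);
	dev->ring = NULL; /* the model avoids a real double free */
}

/* Models dev_close(): does nothing if down, else stops and marks down. */
static void model_dev_close(struct model_netdev *dev)
{
	if (!dev->up)
		return;
	model_ndo_stop(dev);
	dev->up = false;
}

/* Models driver unload: the device is closed only if it is still up. */
static void model_unregister(struct model_netdev *dev)
{
	model_dev_close(dev);
}

/*
 * Runs a heartbeat timeout followed by a driver unload and returns how
 * many times the stop callback ran (-1 if the model could not allocate).
 */
static int run_scenario(bool use_dev_close)
{
	struct model_netdev dev = { 0 };

	if (model_open(&dev))
		return -1;

	if (use_dev_close)
		model_dev_close(&dev);  /* behaviour after the patch */
	else
		model_ndo_stop(&dev);   /* behaviour before the patch */

	model_unregister(&dev);
	return dev.stop_calls;
}
```

In this model, run_scenario(false) reports that the stop callback ran twice, which corresponds to the second ndo_stop operating on already-freed resources, while run_scenario(true) reports a single call because the unload path finds the device already down.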