[[PATCH v2 iwl-next] v2 1/4] idpf: address an rtnl lock splat in tx timeout recovery path

Manoj Vishwanathan posted 1 patch 1 year, 3 months ago
drivers/net/ethernet/intel/idpf/idpf_txrx.c | 14 +++++++++++++-
1 file changed, 13 insertions(+), 1 deletion(-)
[[PATCH v2 iwl-next] v2 1/4] idpf: address an rtnl lock splat in tx timeout recovery path
Posted by Manoj Vishwanathan 1 year, 3 months ago
From: David Decotigny <decot@google.com>

Adopt the same pattern as in other places in the code to take the rtnl
lock during hard resets.
Tested the patch by injecting tx timeout in IDPF , observe that idpf
recovers and IDPF comes back reachable

Without this patch causes there is a splat:
[  270.145214] WARNING: CPU:  PID:  at net/sched/sch_generic.c:534 dev_watchdog

Fixes: d4d5587182664 (idpf: initialize interrupts and enable vport)
Signed-off-by: Manoj Vishwanathan <manojvishy@google.com>
---
 drivers/net/ethernet/intel/idpf/idpf_txrx.c | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/idpf/idpf_txrx.c b/drivers/net/ethernet/intel/idpf/idpf_txrx.c
index af2879f03b8d..806a8b6ea5c5 100644
--- a/drivers/net/ethernet/intel/idpf/idpf_txrx.c
+++ b/drivers/net/ethernet/intel/idpf/idpf_txrx.c
@@ -4326,6 +4326,7 @@ int idpf_vport_intr_alloc(struct idpf_vport *vport)
  */
 int idpf_vport_intr_init(struct idpf_vport *vport)
 {
+	bool hr_reset_in_prog;
 	char *int_name;
 	int err;
 
@@ -4334,8 +4335,19 @@ int idpf_vport_intr_init(struct idpf_vport *vport)
 		return err;
 
 	idpf_vport_intr_map_vector_to_qs(vport);
+	/** 
+	 * If we're in normal up path, the stack already takes the
+	 * rtnl_lock for us, however, if we're doing up as a part of a
+	 * hard reset, we'll need to take the lock ourself before
+	 * touching the netdev.
+	 */
+	hr_reset_in_prog = test_bit(IDPF_HR_RESET_IN_PROG,
+					vport->adapter->flags);
+	if (hr_reset_in_prog)
+		rtnl_lock();
 	idpf_vport_intr_napi_add_all(vport);
-
+	if (hr_reset_in_prog)
+		rtnl_unlock();
 	err = vport->adapter->dev_ops.reg_ops.intr_reg_init(vport);
 	if (err)
 		goto unroll_vectors_alloc;
-- 
2.46.0.295.g3b9ea8a38a-goog
Re: [Intel-wired-lan] [[PATCH v2 iwl-next] v2 1/4] idpf: address an rtnl lock splat in tx timeout recovery path
Posted by Jacob Keller 1 year, 3 months ago

On 8/26/2024 11:10 AM, Manoj Vishwanathan wrote:
> From: David Decotigny <decot@google.com>
> 
> Adopt the same pattern as in other places in the code to take the rtnl
> lock during hard resets.
> Tested the patch by injecting tx timeout in IDPF , observe that idpf
> recovers and IDPF comes back reachable
> 
> Without this patch causes there is a splat:
> [  270.145214] WARNING: CPU:  PID:  at net/sched/sch_generic.c:534 dev_watchdog
> 
> Fixes: d4d5587182664 (idpf: initialize interrupts and enable vport)
> Signed-off-by: Manoj Vishwanathan <manojvishy@google.com>
> ---
>  drivers/net/ethernet/intel/idpf/idpf_txrx.c | 14 +++++++++++++-
>  1 file changed, 13 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/ethernet/intel/idpf/idpf_txrx.c b/drivers/net/ethernet/intel/idpf/idpf_txrx.c
> index af2879f03b8d..806a8b6ea5c5 100644
> --- a/drivers/net/ethernet/intel/idpf/idpf_txrx.c
> +++ b/drivers/net/ethernet/intel/idpf/idpf_txrx.c
> @@ -4326,6 +4326,7 @@ int idpf_vport_intr_alloc(struct idpf_vport *vport)
>   */
>  int idpf_vport_intr_init(struct idpf_vport *vport)
>  {
> +	bool hr_reset_in_prog;
>  	char *int_name;
>  	int err;
>  
> @@ -4334,8 +4335,19 @@ int idpf_vport_intr_init(struct idpf_vport *vport)
>  		return err;
>  
>  	idpf_vport_intr_map_vector_to_qs(vport);
> +	/** 
> +	 * If we're in normal up path, the stack already takes the
> +	 * rtnl_lock for us, however, if we're doing up as a part of a
> +	 * hard reset, we'll need to take the lock ourself before
> +	 * touching the netdev.
> +	 */
> +	hr_reset_in_prog = test_bit(IDPF_HR_RESET_IN_PROG,
> +					vport->adapter->flags);
> +	if (hr_reset_in_prog)
> +		rtnl_lock();
>  	idpf_vport_intr_napi_add_all(vport);
> -
> +	if (hr_reset_in_prog)
> +		rtnl_unlock();

This feels a little fragile. Why not pass the reset in progress as a
flag from the caller? Surely the caller knows whether this is happening
due to an interface up or due to a reset?