[PATCH] ice: wait for reset completion in ice_resume()

Aaron Ma posted 1 patch 2 months, 1 week ago
There is a newer version of this series
ice_resume() schedules an asynchronous PF reset and returns
immediately. The reset runs later in ice_service_task(). If
userspace tries to bring up the net device before the reset
finishes, ice_open() fails with -EBUSY:

  ice_resume()
    ice_schedule_reset()          # sets ICE_PFR_REQ, returns
  ...
  ice_open()
    ice_is_reset_in_progress()    # ICE_PFR_REQ still set, -EBUSY
  ...
  ice_service_task()
    ice_do_reset()
      ice_rebuild()               # clears ICE_PFR_REQ, too late

Reproduced on E800 series NICs during suspend/resume with irdma
enabled, where the aux device probe widens the race window.

Wait for the reset to complete before returning from ice_resume().
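
The ordering problem in the trace above can be replayed with a small userspace model. This is only a sketch: the model_* names and struct model_pf are hypothetical stand-ins, not driver code, and a single bool stands in for the ICE_PFR_REQ state bit.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the driver's PF state; not the real struct ice_pf. */
struct model_pf {
	bool pfr_req;	/* models the ICE_PFR_REQ state bit */
};

/* Models ice_schedule_reset(): only records the request and returns. */
static void model_schedule_reset(struct model_pf *pf)
{
	pf->pfr_req = true;
}

/* Models the service task finishing the reset and rebuild. */
static void model_service_task(struct model_pf *pf)
{
	pf->pfr_req = false;
}

/* Models ice_open(): refuses to open while a reset is still pending. */
static int model_open(const struct model_pf *pf)
{
	return pf->pfr_req ? -EBUSY : 0;
}

/* Replays the trace: one open before the service task runs, one after. */
static void model_resume_race(int *early, int *late)
{
	struct model_pf pf = { .pfr_req = false };

	model_schedule_reset(&pf);
	*early = model_open(&pf);	/* userspace wins the race */
	model_service_task(&pf);
	*late = model_open(&pf);	/* reset has completed */
}
```

In this model the early open fails with -EBUSY and the late one succeeds, which is the behaviour the wait in ice_resume() is meant to avoid by delaying the return until the flag is cleared.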

Fixes: 769c500dcc1e ("ice: Add advanced power mgmt for WoL")
Cc: stable@vger.kernel.org
Signed-off-by: Aaron Ma <aaron.ma@canonical.com>
---
 drivers/net/ethernet/intel/ice/ice_main.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
index 3c36e3641b9e9..a029c247510fd 100644
--- a/drivers/net/ethernet/intel/ice/ice_main.c
+++ b/drivers/net/ethernet/intel/ice/ice_main.c
@@ -5702,6 +5702,16 @@ static int ice_resume(struct device *dev)
 	/* Restart the service task */
 	mod_timer(&pf->serv_tmr, round_jiffies(jiffies + pf->serv_tmr_period));
 
+	/* Wait for the scheduled reset to finish so that the device is fully
+	 * operational before returning. Without this, userspace (e.g.
+	 * NetworkManager) may try to open the net device while the
+	 * asynchronous reset and rebuild is still in progress, resulting in
+	 * "can't open net device while reset is in progress" errors.
+	 */
+	ret = ice_wait_for_reset(pf, 10 * HZ);
+	if (ret)
+		dev_err(dev, "Wait for reset failed during resume: %d\n", ret);
+
 	return 0;
 }
 
-- 
2.43.0
Re: [Intel-wired-lan] [PATCH] ice: wait for reset completion in ice_resume()
Posted by Kohei Enju 2 months, 1 week ago
On 04/02 10:42, Aaron Ma via Intel-wired-lan wrote:
> diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
> index 3c36e3641b9e9..a029c247510fd 100644
> --- a/drivers/net/ethernet/intel/ice/ice_main.c
> +++ b/drivers/net/ethernet/intel/ice/ice_main.c
> @@ -5702,6 +5702,16 @@ static int ice_resume(struct device *dev)
>  	/* Restart the service task */
>  	mod_timer(&pf->serv_tmr, round_jiffies(jiffies + pf->serv_tmr_period));
>  
> +	/* Wait for the scheduled reset to finish so that the device is fully
> +	 * operational before returning. Without this, userspace (e.g.
> +	 * NetworkManager) may try to open the net device while the
> +	 * asynchronous reset and rebuild is still in progress, resulting in
> +	 * "can't open net device while reset is in progress" errors.
> +	 */

nit:
IIUC, this change is best-effort, since ice_resume() still returns
success even if ice_wait_for_reset() fails. If so, the new comment may
be better phrased to reflect that.
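
To make the distinction concrete, here is a minimal userspace sketch contrasting the best-effort behaviour of the posted hunk with a strict alternative. All model_* names are hypothetical, the wait outcome is passed in rather than computed, and the strict variant is an assumption for comparison only, not something the patch does.

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical stand-in for ice_wait_for_reset(): returns 0 when the reset
 * completed in time, or a negative errno otherwise. The outcome is passed in
 * so that both paths can be exercised without a real device.
 */
static int model_wait_for_reset(int outcome)
{
	return outcome;
}

/* Best-effort, as in the posted hunk: log the failure, still report success. */
static int model_resume_best_effort(int wait_outcome)
{
	int ret = model_wait_for_reset(wait_outcome);

	if (ret)
		fprintf(stderr, "Wait for reset failed during resume: %d\n", ret);

	return 0;
}

/* Strict variant (assumption, not in the patch): propagate the failure. */
static int model_resume_strict(int wait_outcome)
{
	int ret = model_wait_for_reset(wait_outcome);

	if (ret)
		fprintf(stderr, "Wait for reset failed during resume: %d\n", ret);

	return ret;
}
```

With the best-effort form, a failed wait is visible only in the log, so a comment promising a "fully operational" device on return overstates what the code guarantees.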

Otherwise, it looks good to me.