[v2] scsi: ufs: core: handle PM SSU timeout before SCSI EH

[PATCH v2] scsi: ufs: core: handle PM SSU timeout before SCSI EH

Posted by Hongjie Fang 1 week, 4 days ago

A PM START STOP sent from the UFS well-known LU resume path can race with
SCSI EH.

The "wl resume" task flow is:
  __ufshcd_wl_resume()
    ufshcd_set_dev_pwr_mode(UFS_ACTIVE_PWR_MODE)
      ufshcd_execute_start_stop()
        scsi_execute_cmd()
          blk_execute_rq           <-- wait
          scsi_check_passthrough() <-- may retry START STOP

If the first START STOP time out, SCSI EH may already recover the link and
reset the device before scsi_execute_cmd() returns:
  scsi_timeout()
    scsi_eh_scmd_add()
      scsi_error_handler()
        scsi_unjam_host()
          scsi_eh_ready_devs()
            scsi_eh_host_reset()
              ufshcd_eh_host_reset_handler()
                if (hba->pm_op_in_progress)
                  ufshcd_link_recovery()
                    ufshcd_device_reset()
                    ufshcd_host_reset_and_restore()
          ...
          scsi_eh_flush_done_q()   <-- wakeup "wl resume" task
        ...                        <-- host still in SHOST_RECOVERY
        scsi_restart_operations()

A later passthrough retry can then run while the host is still in
SHOST_RECOVERY and hit the SCMD_FAIL_IF_RECOVERING path:
  scsi_queue_rq()
    if (scsi_host_in_recovery(shost) &&
        cmd->flags & SCMD_FAIL_IF_RECOVERING)
      return BLK_STS_OFFLINE

That retry completes with DID_ERROR or DID_NO_CONNECT even though EH may
already have restored the device to an operational ACTIVE state.

Handle PM SSU timeout directly from ufshcd_eh_timed_out() instead of
letting these commands enter regular SCSI EH. Limit this path to SSU
commands for the UFS device WLUN while a PM operation is in progress.
If link recovery fails, return SCSI_EH_NOT_HANDLED so regular SCSI
timeout handling can take over.

Since this path bypasses scsi_eh_scmd_add(), UFS reset/restore must also
complete the timed-out request itself. MCQ mode already force-completes
requests without CQEs when force_compl is true. Add the same behavior for
the legacy single-doorbell path: first process requests whose doorbell has
already been cleared, then complete the remaining outstanding SCSI requests
with DID_REQUEUE so callers can re-issue commands whose outcome became
unknown after the host reset.

The system_suspending flag is no longer needed because PM SSU timeout
handling now uses pm_op_in_progress and command filtering.

Fixes: b8c3a7bac9b6 ("scsi: ufs: Have midlayer retry start stop errors")
Signed-off-by: Hongjie Fang <hongjiefang@asrmicro.com>
---

v2: handle PM SSU timeout directly from ufshcd_eh_timed_out() suggested
by Bart Van Assche

drivers/ufs/core/ufshcd.c | 56 ++++++++++++++++++++++++++++++++-------
 include/ufs/ufshcd.h      |  3 ---
 2 files changed, 47 insertions(+), 12 deletions(-)

diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c
index c3f08957d179..ce79c0f30b46 100644
--- a/drivers/ufs/core/ufshcd.c
+++ b/drivers/ufs/core/ufshcd.c
@@ -5953,6 +5953,37 @@ static irqreturn_t ufshcd_transfer_req_compl(struct ufs_hba *hba)
 	return IRQ_HANDLED;
 }
 
+static void ufshcd_force_compl_pending_transfer(struct ufs_hba *hba)
+{
+	unsigned long completed_reqs;
+	unsigned long flags;
+	int tag;
+
+	ufshcd_transfer_req_compl(hba);
+
+	spin_lock_irqsave(&hba->outstanding_lock, flags);
+	completed_reqs = hba->outstanding_reqs;
+	hba->outstanding_reqs = 0;
+	spin_unlock_irqrestore(&hba->outstanding_lock, flags);
+
+	for_each_set_bit(tag, &completed_reqs, hba->nutrs) {
+		struct scsi_cmnd *cmd = ufshcd_tag_to_cmd(hba, tag);
+
+		if (cmd && ufshcd_is_scsi_cmd(cmd) &&
+		    !test_bit(SCMD_STATE_COMPLETE, &cmd->state)) {
+			/*
+			 * The host has been reset and the original command
+			 * outcome is unknown. Requeue SCSI commands so callers
+			 * such as ufshcd_set_dev_pwr_mode() can re-issue START
+			 * STOP UNIT and converge the device power mode.
+			 */
+			set_host_byte(cmd, DID_REQUEUE);
+			ufshcd_release_scsi_cmd(hba, cmd);
+			scsi_done(cmd);
+		}
+	}
+}
+
 int __ufshcd_write_ee_control(struct ufs_hba *hba, u32 ee_ctrl_mask)
 {
 	return ufshcd_query_attr_retry(hba, UPIU_QUERY_OPCODE_WRITE_ATTR,
@@ -6517,6 +6548,8 @@ static void ufshcd_complete_requests(struct ufs_hba *hba, bool force_compl)
 {
 	if (hba->mcq_enabled)
 		ufshcd_mcq_compl_pending_transfer(hba, force_compl);
+	else if (force_compl)
+		ufshcd_force_compl_pending_transfer(hba);
 	else
 		ufshcd_transfer_req_compl(hba);
 
@@ -9465,23 +9498,30 @@ static void ufshcd_async_scan(void *data, async_cookie_t cookie)
 static enum scsi_timeout_action ufshcd_eh_timed_out(struct scsi_cmnd *scmd)
 {
 	struct ufs_hba *hba = shost_priv(scmd->device->host);
+	int ret;
 
-	if (!hba->system_suspending) {
+	if (!hba->pm_op_in_progress || scmd->device != hba->ufs_device_wlun ||
+	    scmd->cmnd[0] != START_STOP) {
 		/* Activate the error handler in the SCSI core. */
 		return SCSI_EH_NOT_HANDLED;
 	}
 
 	/*
-	 * If we get here we know that no TMFs are outstanding and also that
-	 * the only pending command is a START STOP UNIT command. Handle the
-	 * timeout of that command directly to prevent a deadlock between
-	 * ufshcd_set_dev_pwr_mode() and ufshcd_err_handler().
+	 * PM START STOP UNIT commands are issued while a PM operation is in
+	 * progress. Handle such timeouts directly to avoid entering regular
+	 * SCSI EH, which may deadlock with the PM operation and may also make
+	 * scsi_execute_cmd() retries fail while the host is still in recovery.
 	 */
-	ufshcd_link_recovery(hba);
+	ret = ufshcd_link_recovery(hba);
 	dev_info(hba->dev, "%s() finished; outstanding_tasks = %#lx.\n",
 		 __func__, hba->outstanding_tasks);
 
-	return scsi_host_busy(hba->host) ? SCSI_EH_RESET_TIMER : SCSI_EH_DONE;
+	if (ret)
+		return SCSI_EH_NOT_HANDLED;
+
+	WARN_ON_ONCE(!test_bit(SCMD_STATE_COMPLETE, &scmd->state));
+
+	return SCSI_EH_DONE;
 }
 
 static const struct attribute_group *ufshcd_driver_groups[] = {
@@ -10518,7 +10558,6 @@ static int ufshcd_wl_suspend(struct device *dev)
 
 	hba = shost_priv(sdev->host);
 	down(&hba->host_sem);
-	hba->system_suspending = true;
 
 	if (pm_runtime_suspended(dev))
 		goto out;
@@ -10560,7 +10599,6 @@ static int ufshcd_wl_resume(struct device *dev)
 		hba->curr_dev_pwr_mode, hba->uic_link_state);
 	if (!ret)
 		hba->is_sys_suspended = false;
-	hba->system_suspending = false;
 	up(&hba->host_sem);
 	return ret;
 }
diff --git a/include/ufs/ufshcd.h b/include/ufs/ufshcd.h
index cfbc75d8df83..8280a95c00c7 100644
--- a/include/ufs/ufshcd.h
+++ b/include/ufs/ufshcd.h
@@ -1020,8 +1020,6 @@ enum ufshcd_mcq_opr {
  * @caps: bitmask with information about UFS controller capabilities
  * @devfreq: frequency scaling information owned by the devfreq core
  * @clk_scaling: frequency scaling information owned by the UFS driver
- * @system_suspending: system suspend has been started and system resume has
- *	not yet finished.
  * @is_sys_suspended: UFS device has been suspended because of system suspend
  * @urgent_bkops_lvl: keeps track of urgent bkops level for device
  * @is_urgent_bkops_lvl_checked: keeps track if the urgent bkops level for
@@ -1197,7 +1195,6 @@ struct ufs_hba {
 
 	struct devfreq *devfreq;
 	struct ufs_clk_scaling clk_scaling;
-	bool system_suspending;
 	bool is_sys_suspended;
 
 	enum bkops_status urgent_bkops_lvl;
-- 
2.25.1

Re: [PATCH v2] scsi: ufs: core: handle PM SSU timeout before SCSI EH

Posted by Bart Van Assche 1 week, 2 days ago

On 5/28/26 4:34 AM, Hongjie Fang wrote:
> @@ -9465,23 +9498,30 @@ static void ufshcd_async_scan(void *data, async_cookie_t cookie)
>   static enum scsi_timeout_action ufshcd_eh_timed_out(struct scsi_cmnd *scmd)
>   {
>   	struct ufs_hba *hba = shost_priv(scmd->device->host);
> +	int ret;
>   
> -	if (!hba->system_suspending) {
> +	if (!hba->pm_op_in_progress || scmd->device != hba->ufs_device_wlun ||
> +	    scmd->cmnd[0] != START_STOP) {
>   		/* Activate the error handler in the SCSI core. */
>   		return SCSI_EH_NOT_HANDLED;
>   	}

Please don't make the code any more complex than necessary. If a power 
management operation is in progress it is guaranteed that no other SCSI
commands are in progress. See also blk_pre_runtime_suspend() and
blk_try_enter_queue(). Hence, for the above if-condition, testing
hba->pm_op_in_progress is sufficient.

>   	/*
> -	 * If we get here we know that no TMFs are outstanding and also that
> -	 * the only pending command is a START STOP UNIT command. Handle the
> -	 * timeout of that command directly to prevent a deadlock between
> -	 * ufshcd_set_dev_pwr_mode() and ufshcd_err_handler().
> +	 * PM START STOP UNIT commands are issued while a PM operation is in
> +	 * progress. Handle such timeouts directly to avoid entering regular
> +	 * SCSI EH, which may deadlock with the PM operation and may also make
> +	 * scsi_execute_cmd() retries fail while the host is still in recovery.
>   	 */

The original comment is fine, isn't it?

> -	return scsi_host_busy(hba->host) ? SCSI_EH_RESET_TIMER : SCSI_EH_DONE;
> +	if (ret)
> +		return SCSI_EH_NOT_HANDLED;

This is wrong because it may activate the SCSI error handler for a START
STOP UNIT command. We don't want this - we want the error to be
propagated to the scsi_execute_cmd() caller.

> +	WARN_ON_ONCE(!test_bit(SCMD_STATE_COMPLETE, &scmd->state));

This is also wrong. According to Documentation/process/coding-style.rst,
WARN*() should only be used for this-should-never-happen situations. If
the above statement is reached it is almost a certainty that
SCMD_STATE_COMPLETE is not set.

The rest of this patch looks good to me.

Thanks,

Bart.

RE: [PATCH v2] scsi: ufs: core: handle PM SSU timeout before SCSI EH

Posted by Fang Hongjie(方洪杰) 1 week, 2 days ago

> From: Bart Van Assche [mailto:bvanassche@acm.org]
> Sent: Saturday, May 30, 2026 5:46 AM
> To: Fang Hongjie(方洪杰) <hongjiefang@asrmicro.com>;
> alim.akhtar@samsung.com; avri.altman@wdc.com;
> James.Bottomley@HansenPartnership.com; martin.petersen@oracle.com;
> peter.wang@mediatek.com; beanhuo@micron.com
> Cc: linux-scsi@vger.kernel.org; linux-kernel@vger.kernel.org
> Subject: Re: [PATCH v2] scsi: ufs: core: handle PM SSU timeout before SCSI
> EH
> 
> On 5/28/26 4:34 AM, Hongjie Fang wrote:
> > @@ -9465,23 +9498,30 @@ static void ufshcd_async_scan(void *data,
> async_cookie_t cookie)
> >   static enum scsi_timeout_action ufshcd_eh_timed_out(struct scsi_cmnd
> *scmd)
> >   {
> >   	struct ufs_hba *hba = shost_priv(scmd->device->host);
> > +	int ret;
> >
> > -	if (!hba->system_suspending) {
> > +	if (!hba->pm_op_in_progress || scmd->device != hba-
> >ufs_device_wlun ||
> > +	    scmd->cmnd[0] != START_STOP) {
> >   		/* Activate the error handler in the SCSI core. */
> >   		return SCSI_EH_NOT_HANDLED;
> >   	}
> 
> Please don't make the code any more complex than necessary. If a power
> management operation is in progress it is guaranteed that no other SCSI
> commands are in progress. See also blk_pre_runtime_suspend() and
> blk_try_enter_queue(). Hence, for the above if-condition, testing
> hba->pm_op_in_progress is sufficient.
> 

I agree with your comments. It will only check whether a PM operation is
in progress for normal SCSI commands:
    if (!hba->pm_op_in_progress || !ufshcd_is_scsi_cmd(scmd))
            return SCSI_EH_NOT_HANDLED;
			
The reason for keeping ufshcd_is_scsi_cmd() is to avoid applying this
special PM timeout path to UFS internal device-management commands.


> >   	/*
> > -	 * If we get here we know that no TMFs are outstanding and also that
> > -	 * the only pending command is a START STOP UNIT command.
> Handle the
> > -	 * timeout of that command directly to prevent a deadlock between
> > -	 * ufshcd_set_dev_pwr_mode() and ufshcd_err_handler().
> > +	 * PM START STOP UNIT commands are issued while a PM operation
> is in
> > +	 * progress. Handle such timeouts directly to avoid entering regular
> > +	 * SCSI EH, which may deadlock with the PM operation and may also
> make
> > +	 * scsi_execute_cmd() retries fail while the host is still in recovery.
> >   	 */
> 
> The original comment is fine, isn't it?
> 
> > -	return scsi_host_busy(hba->host) ? SCSI_EH_RESET_TIMER :
> SCSI_EH_DONE;
> > +	if (ret)
> > +		return SCSI_EH_NOT_HANDLED;
> 
> This is wrong because it may activate the SCSI error handler for a START
> STOP UNIT command. We don't want this - we want the error to be
> propagated to the scsi_execute_cmd() caller.
> 
> > +	WARN_ON_ONCE(!test_bit(SCMD_STATE_COMPLETE, &scmd-
> >state));
> 
> This is also wrong. According to Documentation/process/coding-style.rst,
> WARN*() should only be used for this-should-never-happen situations. If
> the above statement is reached it is almost a certainty that
> SCMD_STATE_COMPLETE is not set.
> 

It will keep the original comment in ufshcd_eh_timed_out(), remove the
SCSI_EH_NOT_HANDLED fallback after link recovery, and remove the WARN_ON
as suggested.


> The rest of this patch looks good to me.
> 
> Thanks,
> 
> Bart.


Best.