[PATCH 00/14] scsi: scsi_error: Introduce new error handle mechanism

JiangJianJun posted 14 patches 1 month, 2 weeks ago
There is a newer version of this series
drivers/scsi/iscsi_tcp.c   |   4 +
drivers/scsi/scsi_debug.c  |  26 +-
drivers/scsi/scsi_error.c  | 800 ++++++++++++++++++++++++++++++++++---
drivers/scsi/scsi_lib.c    |  17 +-
drivers/scsi/scsi_priv.h   |  15 +
drivers/scsi/scsi_scan.c   |  19 +
drivers/scsi/scsi_sysfs.c  |   2 +
drivers/scsi/virtio_scsi.c |   3 +
include/scsi/scsi_device.h |  87 ++++
include/scsi/scsi_eh.h     |   8 +
include/scsi/scsi_host.h   |  33 +-
11 files changed, 952 insertions(+), 62 deletions(-)
[PATCH 00/14] scsi: scsi_error: Introduce new error handle mechanism
Posted by JiangJianJun 1 month, 2 weeks ago
It's unbearable for systems with large scale scsi devices share HBAs to
block all devices' IOs when handle error commands, we need a new error
handle mechanism to address this issue.

I consulted about this issue a year ago, the discuss link can be found in
refenence. Hannes replied about why we have to block the SCSI host
then perform error recovery kindly. I think it's unnecessary to block
SCSI host for all drivers and can try a small level recovery(LUN based for
example) first to avoid block the SCSI host.

The new error handle mechanism introduced in this patchset has been
developed and tested with out self developed hardware since one year
ago, now we want this mechanism can be used by more drivers.

Drivers can decide if using the new error handle mechanism and how to
handle error commands when scsi_device are scanned,the new mechanism
makes SCSI error handle more flexible.

SCSI error recovery strategy after blocking host's IO is mainly
following steps:

- LUN reset
- Target reset
- Bus reset
- Host reset

Some drivers did not implement callbacks for host reset, it's unnecessary
to block host's IO for these drivers. For example, virtio_scsi only
registered device reset, if device reset failed, it's meaningless to
fallback to target reset, bus reset or host reset any more, because these
steps would also failed.

Here are some drivers we concerned:(there are too many kinds of drivers
to figure out, so here I just list some drivers I am familiar with)

+-------------+--------------+--------------+-----------+------------+
|  drivers    | device_reset | target_reset | bus_reset | host_reset |
+-------------+--------------+--------------+-----------+------------+
| mpt3sas     |     Y        |     Y        |    N      |    Y       |
+-------------+--------------+--------------+-----------+------------+
| smartpqi    |     Y        |     N        |    N      |    N       |
+-------------+--------------+--------------+-----------+------------+
| megaraidsas |     N        |     Y        |    N      |    Y       |
+-------------+--------------+--------------+-----------+------------+
| virtioscsi  |     Y        |     N        |    N      |    N       |
+-------------+--------------+--------------+-----------+------------+
| iscsi_tcp   |     Y        |     Y        |    N      |    N       |
+-------------+--------------+--------------+-----------+------------+
| hisisas     |     Y        |     Y        |    N      |    N       |
+-------------+--------------+--------------+-----------+------------+

For LUN based error handle, when scsi command is classified as error,
we would block the scsi device's IO and try to recover this scsi
device, if still can not recover all error commands, it might
fallback to target or host level recovery.

It's same for target based error handle, but target based error handle
would block the scsi target's IO then try to recover the error commands
of this target.

The first patch defines basic framework to support LUN/target based error
handle mechanism, three key operations are abstracted which are:
 - add error command
 - wake up error handle
 - block IOs when error command is added and recoverying.

The driver can implement these three function callbacks and set them in
the SCSI midlayer. I have added callbacks for setup/clear implementations:
 - setup/clear LUN based error handler
 - setup/clear target based error handler
As well as a generic implementation that the driver can directly reference. 

The changes of SCSI middle level's error handle are tested with scsi_debug
which support single LUN error injection, the scsi_debug patches can be
found in reference, following scenarios are tested.

Scenario1: LUN based error handle is enabled:
+-----------+---------+-------------------------------------------------------+
| lun reset | TUR     | Desired result                                        |
+ --------- + ------- + ------------------------------------------------------+
| success   | success | retry or finish with  EIO(may offline disk)           |
+ --------- + ------- + ------------------------------------------------------+
| success   | fail    | fallback to host  recovery, retry or finish with      |
|           |         | EIO(may offline disk)                                 |
+ --------- + ------- + ------------------------------------------------------+
| fail      | NA      | fallback to host  recovery, retry or finish with      |
|           |         | EIO(may offline disk)                                 |
+ --------- + ------- + ------------------------------------------------------+

Scenario2: target based error handle is enabled:
+-----------+---------+--------------+---------+------------------------------+
| lun reset | TUR     | target reset | TUR     | Desired result               |
+-----------+---------+--------------+---------+------------------------------+
| success   | success | NA           | NA      | retry or finish with         |
|           |         |              |         | EIO(may offline disk)        |
+-----------+---------+--------------+---------+------------------------------+
| success   | fail    | success      | success | retry or finish with         |
|           |         |              |         | EIO(may offline disk)        |
+-----------+---------+--------------+---------+------------------------------+
| fail      | NA      | success      | success | retry or finish with         |
|           |         |              |         | EIO(may offline disk)        |
+-----------+---------+--------------+---------+------------------------------+
| fail      | NA      | success      | fail    | fallback to host recovery,   |
|           |         |              |         | retry or finish with EIO(may |
|           |         |              |         | offline disk)                |
+-----------+---------+--------------+---------+------------------------------+
| fail      | NA      | fail         | NA      | fallback to host  recovery,  |
|           |         |              |         | retry or finish with EIO(may |
|           |         |              |         | offline disk)                |
+-----------+---------+--------------+---------+------------------------------+

Scenario3: both LUN and target based error handle are enabled:
+-----------+---------+--------------+---------+------------------------------+
| lun reset | TUR     | target reset | TUR     | Desired result               |
+-----------+---------+--------------+---------+------------------------------+
| success   | success | NA           | NA      | retry or finish with         |
|           |         |              |         | EIO(may offline disk)        |
+-----------+---------+--------------+---------+------------------------------+
| success   | fail    | success      | success | lun recovery fallback to     |
|           |         |              |         | target recovery, retry or    |
|           |         |              |         | finish with EIO(may offline  |
|           |         |              |         | disk                         |
+-----------+---------+--------------+---------+------------------------------+
| fail      | NA      | success      | success | lun recovery fallback to     |
|           |         |              |         | target recovery, retry or    |
|           |         |              |         | finish with EIO(may offline  |
|           |         |              |         | disk                         |
+-----------+---------+--------------+---------+------------------------------+
| fail      | NA      | success      | fail    | lun recovery fallback to     |
|           |         |              |         | target recovery, then fall   |
|           |         |              |         | back to host recovery, retry |
|           |         |              |         | or fhinsi with EIO(may       |
|           |         |              |         | offline disk)                |
+-----------+---------+--------------+---------+------------------------------+
| fail      | NA      | fail         | NA      | lun recovery fallback to     |
|           |         |              |         | target recovery, then fall   |
|           |         |              |         | back to host recovery, retry |
|           |         |              |         | or fhinsi with EIO(may       |
|           |         |              |         | offline disk)                |
+-----------+---------+--------------+---------+------------------------------+

References: https://lore.kernel.org/linux-scsi/20230815122316.4129333-1-haowenchao2@huawei.com/
References: https://lore.kernel.org/linux-scsi/71e09bb4-ff0a-23fe-38b4-fe6425670efa@huawei.com/

In this version of the push, I have made revisions based on the previous review comments. There
is previous version link:
References: https://lore.kernel.org/linux-scsi/20230901094127.2010873-1-haowenchao2@huawei.com/
References: https://lore.kernel.org/all/20250314012927.150860-1-jiangjianjun3@huawei.com/

JiangJianJun (4):
  scsi: core: increase/decrease target_busy if set error handler
  scsi: scsi_debug: Add params for configuring the error handler
  scsi: virtio_scsi: enable LUN based error handlers
  scsi: iscsi_tcp: enable LUN-based and target-based error handlers

Wenchao Hao (10):
  scsi: scsi_error: Define framework for LUN/target based error handle
  scsi: scsi_error: Move complete variable eh_action from shost to
    sdevice
  scsi: scsi_error: Check if to do reset in scsi_try_xxx_reset
  scsi: scsi_error: Add helper scsi_eh_sdev_stu to do START_UNIT
  scsi: scsi_error: Add helper scsi_eh_sdev_reset to do lun reset
  scsi: scsi_error: Add flags to mark error handle steps has done
  scsi: scsi_error: Add helper to handle scsi device's error command
    list
  scsi: scsi_error: Add a general LUN based error handler
  scsi: scsi_error: Add helper to handle scsi target's error command
    list
  scsi: scsi_error: Add a general target based error handler

 drivers/scsi/iscsi_tcp.c   |   4 +
 drivers/scsi/scsi_debug.c  |  26 +-
 drivers/scsi/scsi_error.c  | 800 ++++++++++++++++++++++++++++++++++---
 drivers/scsi/scsi_lib.c    |  17 +-
 drivers/scsi/scsi_priv.h   |  15 +
 drivers/scsi/scsi_scan.c   |  19 +
 drivers/scsi/scsi_sysfs.c  |   2 +
 drivers/scsi/virtio_scsi.c |   3 +
 include/scsi/scsi_device.h |  87 ++++
 include/scsi/scsi_eh.h     |   8 +
 include/scsi/scsi_host.h   |  33 +-
 11 files changed, 952 insertions(+), 62 deletions(-)

-- 
2.33.0
Re: [PATCH 00/14] scsi: scsi_error: Introduce new error handle mechanism
Posted by Hannes Reinecke 1 month, 2 weeks ago
On 8/16/25 13:24, JiangJianJun wrote:
> It's unbearable for systems with large scale scsi devices share HBAs to
> block all devices' IOs when handle error commands, we need a new error
> handle mechanism to address this issue.
> 
> I consulted about this issue a year ago, the discuss link can be found in
> refenence. Hannes replied about why we have to block the SCSI host
> then perform error recovery kindly. I think it's unnecessary to block
> SCSI host for all drivers and can try a small level recovery(LUN based for
> example) first to avoid block the SCSI host.
> 
> The new error handle mechanism introduced in this patchset has been
> developed and tested with out self developed hardware since one year
> ago, now we want this mechanism can be used by more drivers.
> 
> Drivers can decide if using the new error handle mechanism and how to
> handle error commands when scsi_device are scanned,the new mechanism
> makes SCSI error handle more flexible.
> 
Hmm. Yes, and no.

I fully agree that SCSI EH is in need of reworking. But adding
another layer of complexity on top of the existing one ... not sure.

Additionally: TARGET RESET TMF is dead, and has been removed from SAM
since several years. It really is not worthwhile implementing.

Can't we take a simple step, and just try to have a non-blocking version
of device reset?
I think that should cover quite some issues already.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                  Kernel Storage Architect
hare@suse.de                                +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich
[RFC PATCH v4 0/9] scsi: scsi_error: Introduce new error handle mechanism
Posted by JiangJianJun 2 weeks, 6 days ago
> Additionally: TARGET RESET TMF is dead, and has been removed from SAM
> since several years. It really is not worthwhile implementing.

Hannes suggested removing the faulty handler on the target, so I revised
a version and submitted it as a proposal.

> Also, was this all tested with libata and libsas attached devices as well ?
> They all depend on scsi EH.

And I removed the settings for iscsi_tcp and virtio_scsi, maybe that
adding this proposal to specific drivers before the plan is finalized is
not appropriate. Keep the setting of scsi_debug for test the solution.

The introduction can be simplified as follows: This tmf only blocks IO
on the current device and does not interfere with IO or tmf on other
devices. 


JiangJianJun (1):
  scsi: scsi_debug: Add params for configuring the error handler

Wenchao Hao (8):
  scsi: scsi_error: Define framework for LUN based error handle
  scsi: scsi_error: Move complete variable eh_action from shost to
    sdevice
  scsi: scsi_error: Check if to do reset in scsi_try_xxx_reset
  scsi: scsi_error: Add helper scsi_eh_sdev_stu to do START_UNIT
  scsi: scsi_error: Add helper scsi_eh_sdev_reset to do lun reset
  scsi: scsi_error: Add flags to mark error handle steps has done
  scsi: scsi_error: Add helper to handle scsi device's error command
    list
  scsi: scsi_error: Add a general LUN based error handler

 drivers/scsi/scsi_debug.c  |  15 +-
 drivers/scsi/scsi_error.c  | 422 ++++++++++++++++++++++++++++++++-----
 drivers/scsi/scsi_lib.c    |   7 +
 drivers/scsi/scsi_priv.h   |   8 +
 drivers/scsi/scsi_scan.c   |   7 +
 drivers/scsi/scsi_sysfs.c  |   2 +
 include/scsi/scsi_device.h |  49 +++++
 include/scsi/scsi_eh.h     |   4 +
 include/scsi/scsi_host.h   |  18 +-
 9 files changed, 476 insertions(+), 56 deletions(-)

-- 
2.33.0
[RFC PATCH v4 1/9] scsi: scsi_error: Define framework for LUN based error handle
Posted by JiangJianJun 2 weeks, 6 days ago
From: Wenchao Hao <haowenchao2@huawei.com>

The old scsi error handle logic is based on host, once a scsi command
in one LUN of this host is classfied as failed, SCSI mid-level would
set the whole host to recovery state, and no IO can be submitted to
all LUNs of this host any more before recovery finished, while the
recovery process might take a long time to finish.
It's unreasonable when there are a lot of LUNs in one host.

This change introduce a way for driver to implement its own
error handle logic which can be based on scsi LUN as minimum unit.

scsi_device_eh is defined for error handle based on scsi LUN, and
pointer struct scsi_device_eh "eh" is added in scsi_device, which
is NULL by default.
LLDs can initialize the sdev->eh in hostt->slave_alloc to implement an
scsi LUN based error handle. If this member is not NULL, SCSI mid-level
would branch to drivers' error handler rather than the old one which
block whole host's IO.

Signed-off-by: Wenchao Hao <haowenchao2@huawei.com>
Co-developed-by: JiangJianJun <jiangjianjun3@huawei.com>
Signed-off-by: JiangJianJun <jiangjianjun3@huawei.com>
---
 drivers/scsi/scsi_error.c  | 42 +++++++++++++++++++++++++++++++++++---
 drivers/scsi/scsi_lib.c    |  7 +++++++
 drivers/scsi/scsi_priv.h   |  8 ++++++++
 include/scsi/scsi_device.h | 29 ++++++++++++++++++++++++++
 4 files changed, 83 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index 746ff6a1f309..b5b04f2c5d62 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -291,11 +291,28 @@ static void scsi_eh_inc_host_failed(struct rcu_head *head)
 	spin_unlock_irqrestore(shost->host_lock, flags);
 }
 
+static bool scsi_eh_scmd_add_sdev(struct scsi_cmnd *scmd)
+{
+	struct scsi_device *sdev = scmd->device;
+	struct scsi_device_eh *eh = sdev->eh;
+
+	if (!eh || !eh->add_cmnd)
+		return true;
+
+	scsi_eh_reset(scmd);
+	eh->add_cmnd(scmd);
+
+	if (eh->wakeup)
+		eh->wakeup(sdev);
+
+	return false;
+}
+
 /**
- * scsi_eh_scmd_add - add scsi cmd to error handling.
- * @scmd:	scmd to run eh on.
+ * Add scsi cmd to error handling of Scsi_Host.
+ * This is default action of error handle.
  */
-void scsi_eh_scmd_add(struct scsi_cmnd *scmd)
+static void scsi_eh_scmd_add_shost(struct scsi_cmnd *scmd)
 {
 	struct Scsi_Host *shost = scmd->device->host;
 	unsigned long flags;
@@ -322,6 +339,25 @@ void scsi_eh_scmd_add(struct scsi_cmnd *scmd)
 	call_rcu_hurry(&scmd->rcu, scsi_eh_inc_host_failed);
 }
 
+/**
+ * scsi_eh_scmd_add - add scsi cmd to error handling.
+ * @scmd:	scmd to run eh on.
+ */
+void scsi_eh_scmd_add(struct scsi_cmnd *scmd)
+{
+	struct Scsi_Host *shost = scmd->device->host;
+
+	if (unlikely(scsi_host_in_recovery(shost)) ||
+		scsi_eh_scmd_add_sdev(scmd))
+		scsi_eh_scmd_add_shost(scmd);
+}
+
+void scsi_eh_try_wakeup_sdev(struct scsi_device *sdev)
+{
+	if (sdev->eh && sdev->eh->wakeup)
+		sdev->eh->wakeup(sdev);
+}
+
 /**
  * scsi_timeout - Timeout function for normal scsi commands.
  * @req:	request that is timing out.
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 0c65ecfedfbd..ac8fb801aa82 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -398,6 +398,8 @@ void scsi_device_unbusy(struct scsi_device *sdev, struct scsi_cmnd *cmd)
 
 	sbitmap_put(&sdev->budget_map, cmd->budget_token);
 	cmd->budget_token = -1;
+
+	scsi_eh_try_wakeup_sdev(sdev);
 }
 
 /*
@@ -1360,6 +1362,9 @@ static inline int scsi_dev_queue_ready(struct request_queue *q,
 {
 	int token;
 
+	if (scsi_device_in_recovery(sdev))
+		return -1;
+
 	token = sbitmap_get(&sdev->budget_map);
 	if (token < 0)
 		return -1;
@@ -1374,6 +1379,7 @@ static inline int scsi_dev_queue_ready(struct request_queue *q,
 	if (scsi_device_busy(sdev) > 1 ||
 	    atomic_dec_return(&sdev->device_blocked) > 0) {
 		sbitmap_put(&sdev->budget_map, token);
+		scsi_eh_try_wakeup_sdev(sdev);
 		return -1;
 	}
 
@@ -1882,6 +1888,7 @@ static blk_status_t scsi_queue_rq(struct blk_mq_hw_ctx *hctx,
 out_put_budget:
 	scsi_mq_put_budget(q, cmd->budget_token);
 	cmd->budget_token = -1;
+	scsi_eh_try_wakeup_sdev(sdev);
 	switch (ret) {
 	case BLK_STS_OK:
 		break;
diff --git a/drivers/scsi/scsi_priv.h b/drivers/scsi/scsi_priv.h
index 5b2b19f5e8ec..906f090d917c 100644
--- a/drivers/scsi/scsi_priv.h
+++ b/drivers/scsi/scsi_priv.h
@@ -92,6 +92,7 @@ extern int scsi_error_handler(void *host);
 extern enum scsi_disposition scsi_decide_disposition(struct scsi_cmnd *cmd);
 extern void scsi_eh_wakeup(struct Scsi_Host *shost, unsigned int busy);
 extern void scsi_eh_scmd_add(struct scsi_cmnd *);
+extern void scsi_eh_try_wakeup_sdev(struct scsi_device *sdev);
 void scsi_eh_ready_devs(struct Scsi_Host *shost,
 			struct list_head *work_q,
 			struct list_head *done_q);
@@ -191,6 +192,13 @@ static inline void scsi_dh_add_device(struct scsi_device *sdev) { }
 static inline void scsi_dh_release_device(struct scsi_device *sdev) { }
 #endif
 
+static inline bool scsi_device_in_recovery(struct scsi_device *sdev)
+{
+	struct scsi_device_eh *eh = sdev->eh;
+
+	return eh && eh->is_busy && eh->is_busy(sdev);
+}
+
 struct bsg_device *scsi_bsg_register_queue(struct scsi_device *sdev);
 
 extern int scsi_device_max_queue_depth(struct scsi_device *sdev);
diff --git a/include/scsi/scsi_device.h b/include/scsi/scsi_device.h
index 6d6500148c4b..c21b0a84bbd2 100644
--- a/include/scsi/scsi_device.h
+++ b/include/scsi/scsi_device.h
@@ -100,6 +100,34 @@ struct scsi_vpd {
 	unsigned char	data[];
 };
 
+struct scsi_device;
+
+struct scsi_device_eh {
+	/*
+	 * add scsi command to error handler so it would be handuled by
+	 * driver's error handle strategy
+	 */
+	void (*add_cmnd)(struct scsi_cmnd *scmd);
+
+	/*
+	 * to judge if the device is busy handling errors, called before
+	 * dispatch scsi cmnd
+	 *
+	 * return 0 if it's ready to accepy scsi cmnd
+	 * return 1 if it's in error handle, command's would not be dispatched
+	 */
+	bool (*is_busy)(struct scsi_device *sdev);
+
+	/*
+	 * wakeup device's error handle
+	 *
+	 * usually the error handler strategy would not run at once when
+	 * error command is added. This function would be called when any
+	 * scsi cmnd is finished or when scsi cmnd is added.
+	 */
+	void (*wakeup)(struct scsi_device *sdev);
+};
+
 struct scsi_device {
 	struct Scsi_Host *host;
 	struct request_queue *request_queue;
@@ -289,6 +317,7 @@ struct scsi_device {
 	struct mutex		state_mutex;
 	enum scsi_device_state sdev_state;
 	struct task_struct	*quiesced_by;
+	struct scsi_device_eh	*eh;
 	unsigned long		sdev_data[];
 } __attribute__((aligned(sizeof(unsigned long))));
 
-- 
2.33.0
[RFC PATCH v4 2/9] scsi: scsi_error: Move complete variable eh_action from shost to sdevice
Posted by JiangJianJun 2 weeks, 6 days ago
From: Wenchao Hao <haowenchao2@huawei.com>

eh_action is used to wait for error handle command's completion if scsi
command is send in error handle. Now the error handler might based on
scsi_device, so move it to scsi_device.

This is preparation for a genernal LUN based error handle strategy.

Signed-off-by: Wenchao Hao <haowenchao2@huawei.com>
Co-developed-by: JiangJianJun <jiangjianjun3@huawei.com>
Signed-off-by: JiangJianJun <jiangjianjun3@huawei.com>
---
 drivers/scsi/scsi_error.c  | 6 +++---
 include/scsi/scsi_device.h | 2 ++
 include/scsi/scsi_host.h   | 2 --
 3 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index b5b04f2c5d62..d8b3f5b0fd47 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -917,7 +917,7 @@ void scsi_eh_done(struct scsi_cmnd *scmd)
 	SCSI_LOG_ERROR_RECOVERY(3, scmd_printk(KERN_INFO, scmd,
 			"%s result: %x\n", __func__, scmd->result));
 
-	eh_action = scmd->device->host->eh_action;
+	eh_action = scmd->device->eh_action;
 	if (eh_action)
 		complete(eh_action);
 }
@@ -1206,7 +1206,7 @@ static enum scsi_disposition scsi_send_eh_cmnd(struct scsi_cmnd *scmd,
 
 retry:
 	scsi_eh_prep_cmnd(scmd, &ses, cmnd, cmnd_size, sense_bytes);
-	shost->eh_action = &done;
+	sdev->eh_action = &done;
 
 	scsi_log_send(scmd);
 	scmd->submitter = SUBMITTED_BY_SCSI_ERROR_HANDLER;
@@ -1250,7 +1250,7 @@ static enum scsi_disposition scsi_send_eh_cmnd(struct scsi_cmnd *scmd,
 		rtn = SUCCESS;
 	}
 
-	shost->eh_action = NULL;
+	sdev->eh_action = NULL;
 
 	scsi_log_completion(scmd, rtn);
 
diff --git a/include/scsi/scsi_device.h b/include/scsi/scsi_device.h
index c21b0a84bbd2..9d42858035ed 100644
--- a/include/scsi/scsi_device.h
+++ b/include/scsi/scsi_device.h
@@ -318,6 +318,8 @@ struct scsi_device {
 	enum scsi_device_state sdev_state;
 	struct task_struct	*quiesced_by;
 	struct scsi_device_eh	*eh;
+	struct completion	*eh_action;	/* Wait for specific actions */
+						/* on the device. */
 	unsigned long		sdev_data[];
 } __attribute__((aligned(sizeof(unsigned long))));
 
diff --git a/include/scsi/scsi_host.h b/include/scsi/scsi_host.h
index c53812b9026f..46f57fe78505 100644
--- a/include/scsi/scsi_host.h
+++ b/include/scsi/scsi_host.h
@@ -558,8 +558,6 @@ struct Scsi_Host {
 	struct list_head	eh_abort_list;
 	struct list_head	eh_cmd_q;
 	struct task_struct    * ehandler;  /* Error recovery thread. */
-	struct completion     * eh_action; /* Wait for specific actions on the
-					      host. */
 	wait_queue_head_t       host_wait;
 	const struct scsi_host_template *hostt;
 	struct scsi_transport_template *transportt;
-- 
2.33.0
[RFC PATCH v4 3/9] scsi: scsi_error: Check if to do reset in scsi_try_xxx_reset
Posted by JiangJianJun 2 weeks, 6 days ago
From: Wenchao Hao <haowenchao2@huawei.com>

This is preparation for a genernal LUN based error handle
strategy, the strategy would reuse some error handler APIs,
but some steps of these function should not be performed. For
example, we should not perform bus/host reset if we just stop IOs
on one single LUN.

This change add checks in scsi_try_xxx_reset to make sure
the reset operations would not be performed only if the condition
is not satisfied.

Signed-off-by: Wenchao Hao <haowenchao2@huawei.com>
Co-developed-by: JiangJianJun <jiangjianjun3@huawei.com>
Signed-off-by: JiangJianJun <jiangjianjun3@huawei.com>
---
 drivers/scsi/scsi_error.c | 24 ++++++++++++++++++++----
 1 file changed, 20 insertions(+), 4 deletions(-)

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index d8b3f5b0fd47..80a85b387068 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -926,7 +926,7 @@ void scsi_eh_done(struct scsi_cmnd *scmd)
  * scsi_try_host_reset - ask host adapter to reset itself
  * @scmd:	SCSI cmd to send host reset.
  */
-static enum scsi_disposition scsi_try_host_reset(struct scsi_cmnd *scmd)
+static enum scsi_disposition __scsi_try_host_reset(struct scsi_cmnd *scmd)
 {
 	unsigned long flags;
 	enum scsi_disposition rtn;
@@ -952,11 +952,19 @@ static enum scsi_disposition scsi_try_host_reset(struct scsi_cmnd *scmd)
 	return rtn;
 }
 
+static enum scsi_disposition scsi_try_host_reset(struct scsi_cmnd *scmd)
+{
+	if (!scsi_host_in_recovery(scmd->device->host))
+		return FAILED;
+
+	return __scsi_try_host_reset(scmd);
+}
+
 /**
  * scsi_try_bus_reset - ask host to perform a bus reset
  * @scmd:	SCSI cmd to send bus reset.
  */
-static enum scsi_disposition scsi_try_bus_reset(struct scsi_cmnd *scmd)
+static enum scsi_disposition __scsi_try_bus_reset(struct scsi_cmnd *scmd)
 {
 	unsigned long flags;
 	enum scsi_disposition rtn;
@@ -982,6 +990,14 @@ static enum scsi_disposition scsi_try_bus_reset(struct scsi_cmnd *scmd)
 	return rtn;
 }
 
+static enum scsi_disposition scsi_try_bus_reset(struct scsi_cmnd *scmd)
+{
+	if (!scsi_host_in_recovery(scmd->device->host))
+		return FAILED;
+
+	return __scsi_try_bus_reset(scmd);
+}
+
 static void __scsi_report_device_reset(struct scsi_device *sdev, void *data)
 {
 	sdev->was_reset = 1;
@@ -2547,12 +2563,12 @@ scsi_ioctl_reset(struct scsi_device *dev, int __user *arg)
 			break;
 		fallthrough;
 	case SG_SCSI_RESET_BUS:
-		rtn = scsi_try_bus_reset(scmd);
+		rtn = __scsi_try_bus_reset(scmd);
 		if (rtn == SUCCESS || (val & SG_SCSI_RESET_NO_ESCALATE))
 			break;
 		fallthrough;
 	case SG_SCSI_RESET_HOST:
-		rtn = scsi_try_host_reset(scmd);
+		rtn = __scsi_try_host_reset(scmd);
 		if (rtn == SUCCESS)
 			break;
 		fallthrough;
-- 
2.33.0
[RFC PATCH v4 4/9] scsi: scsi_error: Add helper scsi_eh_sdev_stu to do START_UNIT
Posted by JiangJianJun 2 weeks, 6 days ago
From: Wenchao Hao <haowenchao2@huawei.com>

Add helper function scsi_eh_sdev_stu() to perform START_UNIT and check
if to finish some error commands.

This is preparation for a genernal LUN based error handle
strategy and did not change original logic.

Signed-off-by: Wenchao Hao <haowenchao2@huawei.com>
Co-developed-by: JiangJianJun <jiangjianjun3@huawei.com>
Signed-off-by: JiangJianJun <jiangjianjun3@huawei.com>
---
 drivers/scsi/scsi_error.c | 50 +++++++++++++++++++++++----------------
 1 file changed, 29 insertions(+), 21 deletions(-)

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index 80a85b387068..a1b700dbb0ec 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -1559,6 +1559,31 @@ static int scsi_eh_try_stu(struct scsi_cmnd *scmd)
 	return 1;
 }
 
+static int scsi_eh_sdev_stu(struct scsi_cmnd *scmd,
+			      struct list_head *work_q,
+			      struct list_head *done_q)
+{
+	struct scsi_device *sdev = scmd->device;
+	struct scsi_cmnd *next;
+
+	SCSI_LOG_ERROR_RECOVERY(3, sdev_printk(KERN_INFO, sdev,
+				"%s: Sending START_UNIT\n", current->comm));
+
+	if (scsi_eh_try_stu(scmd)) {
+		SCSI_LOG_ERROR_RECOVERY(3, sdev_printk(KERN_INFO, sdev,
+				    "%s: START_UNIT failed\n", current->comm));
+		return 0;
+	}
+
+	if (!scsi_device_online(sdev) || !scsi_eh_tur(scmd))
+		list_for_each_entry_safe(scmd, next, work_q, eh_entry)
+			if (scmd->device == sdev &&
+			    scsi_eh_action(scmd, SUCCESS) == SUCCESS)
+				scsi_eh_finish_cmd(scmd, done_q);
+
+	return list_empty(work_q);
+}
+
  /**
  * scsi_eh_stu - send START_UNIT if needed
  * @shost:	&scsi host being recovered.
@@ -1573,7 +1598,7 @@ static int scsi_eh_stu(struct Scsi_Host *shost,
 			      struct list_head *work_q,
 			      struct list_head *done_q)
 {
-	struct scsi_cmnd *scmd, *stu_scmd, *next;
+	struct scsi_cmnd *scmd, *stu_scmd;
 	struct scsi_device *sdev;
 
 	shost_for_each_device(sdev, shost) {
@@ -1596,26 +1621,9 @@ static int scsi_eh_stu(struct Scsi_Host *shost,
 		if (!stu_scmd)
 			continue;
 
-		SCSI_LOG_ERROR_RECOVERY(3,
-			sdev_printk(KERN_INFO, sdev,
-				     "%s: Sending START_UNIT\n",
-				    current->comm));
-
-		if (!scsi_eh_try_stu(stu_scmd)) {
-			if (!scsi_device_online(sdev) ||
-			    !scsi_eh_tur(stu_scmd)) {
-				list_for_each_entry_safe(scmd, next,
-							  work_q, eh_entry) {
-					if (scmd->device == sdev &&
-					    scsi_eh_action(scmd, SUCCESS) == SUCCESS)
-						scsi_eh_finish_cmd(scmd, done_q);
-				}
-			}
-		} else {
-			SCSI_LOG_ERROR_RECOVERY(3,
-				sdev_printk(KERN_INFO, sdev,
-					    "%s: START_UNIT failed\n",
-					    current->comm));
+		if (scsi_eh_sdev_stu(stu_scmd, work_q, done_q)) {
+			scsi_device_put(sdev);
+			break;
 		}
 	}
 
-- 
2.33.0
[RFC PATCH v4 5/9] scsi: scsi_error: Add helper scsi_eh_sdev_reset to do lun reset
Posted by JiangJianJun 2 weeks, 6 days ago
From: Wenchao Hao <haowenchao2@huawei.com>

Add helper function scsi_eh_sdev_reset() to perform lun reset and check
if to finish some error commands.

This is preparation for a genernal LUN based error handle
strategy and did not change original logic.

Signed-off-by: Wenchao Hao <haowenchao2@huawei.com>
Co-developed-by: JiangJianJun <jiangjianjun3@huawei.com>
Signed-off-by: JiangJianJun <jiangjianjun3@huawei.com>
---
 drivers/scsi/scsi_error.c | 54 +++++++++++++++++++++++----------------
 1 file changed, 32 insertions(+), 22 deletions(-)

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index a1b700dbb0ec..f58aad351463 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -1630,6 +1630,34 @@ static int scsi_eh_stu(struct Scsi_Host *shost,
 	return list_empty(work_q);
 }
 
+static int scsi_eh_sdev_reset(struct scsi_cmnd *scmd,
+			      struct list_head *work_q,
+			      struct list_head *done_q)
+{
+	struct scsi_cmnd *next;
+	struct scsi_device *sdev = scmd->device;
+	enum scsi_disposition rtn;
+
+	SCSI_LOG_ERROR_RECOVERY(3, sdev_printk(KERN_INFO, sdev,
+			     "%s: Sending BDR\n", current->comm));
+
+	rtn = scsi_try_bus_device_reset(scmd);
+	if (rtn != SUCCESS && rtn != FAST_IO_FAIL) {
+		SCSI_LOG_ERROR_RECOVERY(3,
+			sdev_printk(KERN_INFO, sdev,
+				    "%s: BDR failed\n", current->comm));
+		return 0;
+	}
+
+	if (!scsi_device_online(sdev) || rtn == FAST_IO_FAIL ||
+	    !scsi_eh_tur(scmd))
+		list_for_each_entry_safe(scmd, next, work_q, eh_entry)
+			if (scmd->device == sdev &&
+			    scsi_eh_action(scmd, rtn) != FAILED)
+				scsi_eh_finish_cmd(scmd, done_q);
+
+	return list_empty(work_q);
+}
 
 /**
  * scsi_eh_bus_device_reset - send bdr if needed
@@ -1647,9 +1675,8 @@ static int scsi_eh_bus_device_reset(struct Scsi_Host *shost,
 				    struct list_head *work_q,
 				    struct list_head *done_q)
 {
-	struct scsi_cmnd *scmd, *bdr_scmd, *next;
+	struct scsi_cmnd *scmd, *bdr_scmd;
 	struct scsi_device *sdev;
-	enum scsi_disposition rtn;
 
 	shost_for_each_device(sdev, shost) {
 		if (scsi_host_eh_past_deadline(shost)) {
@@ -1670,26 +1697,9 @@ static int scsi_eh_bus_device_reset(struct Scsi_Host *shost,
 		if (!bdr_scmd)
 			continue;
 
-		SCSI_LOG_ERROR_RECOVERY(3,
-			sdev_printk(KERN_INFO, sdev,
-				     "%s: Sending BDR\n", current->comm));
-		rtn = scsi_try_bus_device_reset(bdr_scmd);
-		if (rtn == SUCCESS || rtn == FAST_IO_FAIL) {
-			if (!scsi_device_online(sdev) ||
-			    rtn == FAST_IO_FAIL ||
-			    !scsi_eh_tur(bdr_scmd)) {
-				list_for_each_entry_safe(scmd, next,
-							 work_q, eh_entry) {
-					if (scmd->device == sdev &&
-					    scsi_eh_action(scmd, rtn) != FAILED)
-						scsi_eh_finish_cmd(scmd,
-								   done_q);
-				}
-			}
-		} else {
-			SCSI_LOG_ERROR_RECOVERY(3,
-				sdev_printk(KERN_INFO, sdev,
-					    "%s: BDR failed\n", current->comm));
+		if (scsi_eh_sdev_reset(bdr_scmd, work_q, done_q)) {
+			scsi_device_put(sdev);
+			break;
 		}
 	}
 
-- 
2.33.0
[RFC PATCH v4 6/9] scsi: scsi_error: Add flags to mark error handle steps has done
Posted by JiangJianJun 2 weeks, 6 days ago
From: Wenchao Hao <haowenchao2@huawei.com>

LUN based error handle would mainly do three steps to recover
commands which are check sense, start unit, and reset lun. It might
fallback to host based error handler which would do these steps
too.

Add some flags to mark these steps are done to avoid repeating
these steps.

The flags should be cleared when LUN based error handler is waked up
or when host based error handler finished, and set when fallback to
host based error handle.

scsi_eh_get_sense, scsi_eh_stu, scsi_eh_bus_device_reset would check
these flags before actually action.

Signed-off-by: Wenchao Hao <haowenchao2@huawei.com>
Co-developed-by: JiangJianJun <jiangjianjun3@huawei.com>
Signed-off-by: JiangJianJun <jiangjianjun3@huawei.com>
---
 drivers/scsi/scsi_error.c  | 33 +++++++++++++++++++++++++++++++++
 include/scsi/scsi_device.h | 18 ++++++++++++++++++
 2 files changed, 51 insertions(+)

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index f58aad351463..239f8231c3ff 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -57,10 +57,36 @@
 #define BUS_RESET_SETTLE_TIME   (10)
 #define HOST_RESET_SETTLE_TIME  (10)
 
+#define sdev_flags_done(flag)					\
+static inline int sdev_##flag(struct scsi_device *sdev)		\
+{								\
+	struct scsi_device_eh *eh = sdev->eh;			\
+	if (!eh)						\
+		return 0;					\
+	return eh->flag;					\
+}
+
 static int scsi_eh_try_stu(struct scsi_cmnd *scmd);
 static enum scsi_disposition scsi_try_to_abort_cmd(const struct scsi_host_template *,
 						   struct scsi_cmnd *);
 
+sdev_flags_done(get_sense_done);
+sdev_flags_done(stu_done);
+sdev_flags_done(reset_done);
+
+static inline void shost_clear_eh_done(struct Scsi_Host *shost)
+{
+	struct scsi_device *sdev;
+
+	shost_for_each_device(sdev, shost) {
+		if (!sdev->eh)
+			continue;
+		sdev->eh->get_sense_done = 0;
+		sdev->eh->stu_done	 = 0;
+		sdev->eh->reset_done	 = 0;
+	}
+}
+
 void scsi_eh_wakeup(struct Scsi_Host *shost, unsigned int busy)
 {
 	lockdep_assert_held(shost->host_lock);
@@ -1397,6 +1423,8 @@ int scsi_eh_get_sense(struct list_head *work_q,
 					     current->comm));
 			break;
 		}
+		if (sdev_get_sense_done(scmd->device))
+			continue;
 		if (!scsi_status_is_check_condition(scmd->result))
 			/*
 			 * don't request sense if there's no check condition
@@ -1610,6 +1638,8 @@ static int scsi_eh_stu(struct Scsi_Host *shost,
 			scsi_device_put(sdev);
 			break;
 		}
+		if (sdev_stu_done(sdev))
+			continue;
 		stu_scmd = NULL;
 		list_for_each_entry(scmd, work_q, eh_entry)
 			if (scmd->device == sdev && SCSI_SENSE_VALID(scmd) &&
@@ -1693,6 +1723,8 @@ static int scsi_eh_bus_device_reset(struct Scsi_Host *shost,
 				bdr_scmd = scmd;
 				break;
 			}
+		if (sdev_reset_done(sdev))
+			continue;
 
 		if (!bdr_scmd)
 			continue;
@@ -2357,6 +2389,7 @@ static void scsi_unjam_host(struct Scsi_Host *shost)
 	if (!scsi_eh_get_sense(&eh_work_q, &eh_done_q))
 		scsi_eh_ready_devs(shost, &eh_work_q, &eh_done_q);
 
+	shost_clear_eh_done(shost);
 	spin_lock_irqsave(shost->host_lock, flags);
 	if (shost->eh_deadline != -1)
 		shost->last_reset = 0;
diff --git a/include/scsi/scsi_device.h b/include/scsi/scsi_device.h
index 9d42858035ed..9fc052e48a3b 100644
--- a/include/scsi/scsi_device.h
+++ b/include/scsi/scsi_device.h
@@ -103,6 +103,24 @@ struct scsi_vpd {
 struct scsi_device;
 
 struct scsi_device_eh {
+	/*
+	 * LUN rebased error handle would mainly do three
+	 * steps to recovery commands which are
+	 *   check sense
+	 *   start unit
+	 *   reset lun
+	 * While we would fallback to host based error handler which would
+	 * do these steps too. Add flags to mark thes steps are done to
+	 * avoid repeating these steps.
+	 *
+	 * The flags should be cleared when LUN based error handler is
+	 * wakedup or when host based error handler finished, set when
+	 * fallback to host based error handle.
+	 */
+	unsigned get_sense_done:1;
+	unsigned stu_done:1;
+	unsigned reset_done:1;
+
 	/*
 	 * add scsi command to error handler so it would be handuled by
 	 * driver's error handle strategy
-- 
2.33.0
[RFC PATCH v4 7/9] scsi: scsi_error: Add helper to handle scsi device's error command list
Posted by JiangJianJun 2 weeks, 6 days ago
From: Wenchao Hao <haowenchao2@huawei.com>

Add helper scsi_sdev_eh() to handle scsi device's error command list,
it would perform some steps which can be done with LUN's IO blocked,
including check sense, start unit and reset lun.

Signed-off-by: Wenchao Hao <haowenchao2@huawei.com>
Co-developed-by: JiangJianJun <jiangjianjun3@huawei.com>
Signed-off-by: JiangJianJun <jiangjianjun3@huawei.com>
---
 drivers/scsi/scsi_error.c | 37 +++++++++++++++++++++++++++++++++++++
 include/scsi/scsi_eh.h    |  2 ++
 2 files changed, 39 insertions(+)

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index 239f8231c3ff..1f2b3deace32 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -2486,6 +2486,43 @@ int scsi_error_handler(void *data)
 	return 0;
 }
 
+/*
+ * Single LUN error handle
+ *
+ * @work_q: list of scsi commands need to recovery
+ * @done_q: list of scsi commands handled
+ *
+ * return: return 1 if all commands in work_q is recoveryed, else 0 is returned
+ */
+int scsi_sdev_eh(struct scsi_device *sdev,
+		 struct list_head *work_q,
+		 struct list_head *done_q)
+{
+	int ret = 0;
+	struct scsi_cmnd *scmd;
+
+	SCSI_LOG_ERROR_RECOVERY(2, sdev_printk(KERN_INFO, sdev,
+		"%s:luneh: checking sense\n", current->comm));
+	ret = scsi_eh_get_sense(work_q, done_q);
+	if (ret)
+		return ret;
+
+	SCSI_LOG_ERROR_RECOVERY(2, sdev_printk(KERN_INFO, sdev,
+		"%s:luneh: start unit\n", current->comm));
+	scmd = list_first_entry(work_q, struct scsi_cmnd, eh_entry);
+	ret = scsi_eh_sdev_stu(scmd, work_q, done_q);
+	if (ret)
+		return ret;
+
+	SCSI_LOG_ERROR_RECOVERY(2, sdev_printk(KERN_INFO, sdev,
+		"%s:luneh: reset LUN\n", current->comm));
+	scmd = list_first_entry(work_q, struct scsi_cmnd, eh_entry);
+	ret = scsi_eh_sdev_reset(scmd, work_q, done_q);
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(scsi_sdev_eh);
+
 /**
  * scsi_report_bus_reset() - report bus reset observed
  *
diff --git a/include/scsi/scsi_eh.h b/include/scsi/scsi_eh.h
index 1ae08e81339f..5ce791063baf 100644
--- a/include/scsi/scsi_eh.h
+++ b/include/scsi/scsi_eh.h
@@ -18,6 +18,8 @@ extern int scsi_block_when_processing_errors(struct scsi_device *);
 extern bool scsi_command_normalize_sense(const struct scsi_cmnd *cmd,
 					 struct scsi_sense_hdr *sshdr);
 extern enum scsi_disposition scsi_check_sense(struct scsi_cmnd *);
+extern int scsi_sdev_eh(struct scsi_device *sdev, struct list_head *workq,
+			struct list_head *doneq);
 
 static inline bool scsi_sense_is_deferred(const struct scsi_sense_hdr *sshdr)
 {
-- 
2.33.0
[RFC PATCH v4 8/9] scsi: scsi_error: Add a general LUN based error handler
Posted by JiangJianJun 2 weeks, 6 days ago
From: Wenchao Hao <haowenchao2@huawei.com>

Add a general LUN based error handler which can be used by drivers
directly. This error handler implements an scsi_device_eh, when handling
error commands, it would call helper function scsi_sdev_eh() added before
to try recover error commands.

The behavior if scsi_sdev_eh() can not recover all error commands
depends on fallback flag, which is initialized when scsi_device is
allocated. If fallback is set, it would fallback to further error
recover strategy like old host based error handle; else it would
mark this scsi device offline and flush all error commands. Add a
flag for controlling rollback in scsi_host_template.

Add interface sdev_setup_eh/sdev_clear_eh in scsi_host_template, used
for setup/clear LUN based error handler. Drivers can implements them
custom, or use inner implements:
  scsi_device_setup_eh/scsi_device_clear_eh.

Signed-off-by: Wenchao Hao <haowenchao2@huawei.com>
Co-developed-by: JiangJianJun <jiangjianjun3@huawei.com>
Signed-off-by: JiangJianJun <jiangjianjun3@huawei.com>
---
 drivers/scsi/scsi_error.c | 176 ++++++++++++++++++++++++++++++++++++++
 drivers/scsi/scsi_scan.c  |   7 ++
 drivers/scsi/scsi_sysfs.c |   2 +
 include/scsi/scsi_eh.h    |   2 +
 include/scsi/scsi_host.h  |  16 ++++
 5 files changed, 203 insertions(+)

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index 1f2b3deace32..c1b4cd10216b 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -2736,3 +2736,179 @@ bool scsi_get_sense_info_fld(const u8 *sense_buffer, int sb_len,
 	}
 }
 EXPORT_SYMBOL(scsi_get_sense_info_fld);
+
+static inline void *scsi_eh_device_priv(struct scsi_device_eh *eh)
+{
+	return eh + 1;
+}
+
+struct scsi_lun_eh {
+	spinlock_t		eh_lock;
+	unsigned int		eh_num;
+	struct list_head	eh_cmd_q;
+	struct scsi_device	*sdev;
+	struct work_struct	eh_handle_work;
+	unsigned int		fallback:1;
+};
+
+/*
+ * error handle strategy based on LUN, following steps
+ * is applied to recovery error commands in list:
+ *    check sense data
+ *    send start unit
+ *    reset lun
+ * if there are still error commands, it would fallback to
+ * host based error handler for further recovery.
+ */
+static void sdev_eh_work(struct work_struct *work)
+{
+	unsigned long flags;
+	struct scsi_lun_eh *luneh =
+			container_of(work, struct scsi_lun_eh, eh_handle_work);
+	struct scsi_device *sdev = luneh->sdev;
+	struct scsi_device_eh *eh = sdev->eh;
+	struct Scsi_Host *shost = sdev->host;
+	struct scsi_cmnd *scmd, *next;
+	LIST_HEAD(eh_work_q);
+	LIST_HEAD(eh_done_q);
+
+	spin_lock_irqsave(&luneh->eh_lock, flags);
+	list_splice_init(&luneh->eh_cmd_q, &eh_work_q);
+	spin_unlock_irqrestore(&luneh->eh_lock, flags);
+
+	if (scsi_sdev_eh(sdev, &eh_work_q, &eh_done_q))
+		goto out_flush_done;
+
+	if (!luneh->fallback) {
+		list_for_each_entry_safe(scmd, next, &eh_work_q, eh_entry)
+			scsi_eh_finish_cmd(scmd, &eh_done_q);
+
+		sdev_printk(KERN_INFO, sdev,
+			"%s:luneh: Device offlined - not ready after error recovery\n",
+			current->comm);
+
+		mutex_lock(&sdev->state_mutex);
+		scsi_device_set_state(sdev, SDEV_OFFLINE);
+		mutex_unlock(&sdev->state_mutex);
+
+		goto out_flush_done;
+	}
+
+	/*
+	 * fallback to host based error handler
+	 */
+	SCSI_LOG_ERROR_RECOVERY(2, sdev_printk(KERN_INFO, sdev,
+		"%s:luneh fallback to further recovery\n", current->comm));
+	list_for_each_entry_safe(scmd, next, &eh_work_q, eh_entry) {
+		list_del_init(&scmd->eh_entry);
+		scsi_eh_scmd_add_shost(scmd);
+	}
+
+	eh->get_sense_done = 1;
+	eh->stu_done = 1;
+	eh->reset_done = 1;
+
+out_flush_done:
+	scsi_eh_flush_done_q(&eh_done_q);
+	spin_lock_irqsave(&luneh->eh_lock, flags);
+	luneh->eh_num = 0;
+	spin_unlock_irqrestore(&luneh->eh_lock, flags);
+}
+static void sdev_eh_add_cmnd(struct scsi_cmnd *scmd)
+{
+	unsigned long flags;
+	struct scsi_lun_eh *luneh;
+	struct scsi_device *sdev = scmd->device;
+
+	luneh = scsi_eh_device_priv(sdev->eh);
+
+	spin_lock_irqsave(&luneh->eh_lock, flags);
+	list_add_tail(&scmd->eh_entry, &luneh->eh_cmd_q);
+	luneh->eh_num++;
+	spin_unlock_irqrestore(&luneh->eh_lock, flags);
+}
+static bool sdev_eh_is_busy(struct scsi_device *sdev)
+{
+	int ret = 0;
+	unsigned long flags;
+	struct scsi_lun_eh *luneh;
+
+	if (!sdev->eh)
+		return false;
+
+	luneh = scsi_eh_device_priv(sdev->eh);
+
+	spin_lock_irqsave(&luneh->eh_lock, flags);
+	ret = luneh->eh_num;
+	spin_unlock_irqrestore(&luneh->eh_lock, flags);
+
+	return ret != 0;
+}
+static void sdev_eh_wakeup(struct scsi_device *sdev)
+{
+	unsigned long flags;
+	unsigned int nr_error;
+	unsigned int nr_busy;
+	struct scsi_lun_eh *luneh;
+
+	luneh = scsi_eh_device_priv(sdev->eh);
+
+	spin_lock_irqsave(&luneh->eh_lock, flags);
+	nr_error = luneh->eh_num;
+	spin_unlock_irqrestore(&luneh->eh_lock, flags);
+
+	nr_busy = scsi_device_busy(sdev);
+
+	if (!nr_error || nr_busy != nr_error) {
+		SCSI_LOG_ERROR_RECOVERY(5, sdev_printk(KERN_INFO, sdev,
+			"%s:luneh: do not wake up, busy/error: %d/%d\n",
+			current->comm, nr_busy, nr_error));
+		return;
+	}
+
+	SCSI_LOG_ERROR_RECOVERY(2, sdev_printk(KERN_INFO, sdev,
+		"%s:luneh: waking up, busy/error: %d/%d\n",
+		current->comm, nr_busy, nr_error));
+
+	schedule_work(&luneh->eh_handle_work);
+}
+
+/*
+ * This is default implement of Scsi_Host.sdev_setup_eh.
+ */
+int scsi_device_setup_eh(struct scsi_device *sdev)
+{
+	struct scsi_device_eh *eh;
+	struct scsi_lun_eh *luneh;
+
+	eh = kzalloc(sizeof(struct scsi_device_eh) + sizeof(struct scsi_lun_eh),
+		GFP_KERNEL);
+	if (!eh)
+		return -ENOMEM;
+	luneh = scsi_eh_device_priv(eh);
+
+	eh->add_cmnd = sdev_eh_add_cmnd;
+	eh->is_busy  = sdev_eh_is_busy;
+	eh->wakeup   = sdev_eh_wakeup;
+
+	luneh->fallback  = sdev->host->hostt->sdev_eh_fallback;
+	luneh->sdev  = sdev;
+	spin_lock_init(&luneh->eh_lock);
+	INIT_LIST_HEAD(&luneh->eh_cmd_q);
+	INIT_WORK(&luneh->eh_handle_work, sdev_eh_work);
+
+	sdev->eh = eh;
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(scsi_device_setup_eh);
+
+/*
+ * This is default implement of Scsi_Host.sdev_clear_eh.
+ */
+void scsi_device_clear_eh(struct scsi_device *sdev)
+{
+	kfree(sdev->eh);
+	sdev->eh = NULL;
+}
+EXPORT_SYMBOL_GPL(scsi_device_clear_eh);
diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
index 3c6e089e80c3..1c5cb77dfb22 100644
--- a/drivers/scsi/scsi_scan.c
+++ b/drivers/scsi/scsi_scan.c
@@ -377,9 +377,16 @@ static struct scsi_device *scsi_alloc_sdev(struct scsi_target *starget,
 			goto out_device_destroy;
 		}
 	}
+	if (shost->hostt->sdev_setup_eh) {
+		ret = shost->hostt->sdev_setup_eh(sdev);
+		if (ret)
+			goto out_device_eh;
+	}
 
 	return sdev;
 
+out_device_eh:
+	shost->hostt->sdev_destroy(sdev);
 out_device_destroy:
 	__scsi_remove_device(sdev);
 out:
diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
index 15ba493d2138..76e788450345 100644
--- a/drivers/scsi/scsi_sysfs.c
+++ b/drivers/scsi/scsi_sysfs.c
@@ -1513,6 +1513,8 @@ void __scsi_remove_device(struct scsi_device *sdev)
 	kref_put(&sdev->host->tagset_refcnt, scsi_mq_free_tags);
 	cancel_work_sync(&sdev->requeue_work);
 
+	if (sdev->host->hostt->sdev_clear_eh)
+		sdev->host->hostt->sdev_clear_eh(sdev);
 	if (sdev->host->hostt->sdev_destroy)
 		sdev->host->hostt->sdev_destroy(sdev);
 	transport_destroy_device(dev);
diff --git a/include/scsi/scsi_eh.h b/include/scsi/scsi_eh.h
index 5ce791063baf..d8e4475ff004 100644
--- a/include/scsi/scsi_eh.h
+++ b/include/scsi/scsi_eh.h
@@ -20,6 +20,8 @@ extern bool scsi_command_normalize_sense(const struct scsi_cmnd *cmd,
 extern enum scsi_disposition scsi_check_sense(struct scsi_cmnd *);
 extern int scsi_sdev_eh(struct scsi_device *sdev, struct list_head *workq,
 			struct list_head *doneq);
+extern int scsi_device_setup_eh(struct scsi_device *sdev);
+extern void scsi_device_clear_eh(struct scsi_device *sdev);
 
 static inline bool scsi_sense_is_deferred(const struct scsi_sense_hdr *sshdr)
 {
diff --git a/include/scsi/scsi_host.h b/include/scsi/scsi_host.h
index 46f57fe78505..9cc34bfff3f7 100644
--- a/include/scsi/scsi_host.h
+++ b/include/scsi/scsi_host.h
@@ -225,6 +225,19 @@ struct scsi_host_template {
 	 */
 	void (* sdev_destroy)(struct scsi_device *);
 
+	/*
+	 * Setup or clear error handler field scsi_device.eh.
+	 * This error handler is working on designated device, it will only
+	 * operate the designated device, do not affect other devices.
+	 * If not set, error handle will fallback.
+	 * LLDD can use custom error handler, or use inner defined:
+	 *   scsi_device_setup_eh/scsi_device_clear_eh
+	 *
+	 * Status: OPTIONAL
+	 */
+	int (*sdev_setup_eh)(struct scsi_device *sdev);
+	void (*sdev_clear_eh)(struct scsi_device *sdev);
+
 	/*
 	 * Before the mid layer attempts to scan for a new device attached
 	 * to a target where no target currently exists, it will call this
@@ -468,6 +481,9 @@ struct scsi_host_template {
 	/* The queuecommand callback may block. See also BLK_MQ_F_BLOCKING. */
 	unsigned queuecommand_may_block:1;
 
+	/* The error handle of scsi_device will fallback when failed. */
+	unsigned sdev_eh_fallback:1;
+
 	/*
 	 * Countdown for host blocking with no commands outstanding.
 	 */
-- 
2.33.0
[RFC PATCH v4 9/9] scsi: scsi_debug: Add params for configuring the error handler
Posted by JiangJianJun 2 weeks, 6 days ago
Add a new module parameter to configure error handlers based on
LUN and toggle the enable/disable fallback functionality.

Signed-off-by: JiangJianJun <jiangjianjun3@huawei.com>
---
 drivers/scsi/scsi_debug.c | 15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c
index 353cb60e1abe..f221ad31b44d 100644
--- a/drivers/scsi/scsi_debug.c
+++ b/drivers/scsi/scsi_debug.c
@@ -961,6 +961,8 @@ static bool write_since_sync;
 static bool sdebug_statistics = DEF_STATISTICS;
 static bool sdebug_wp;
 static bool sdebug_allow_restart;
+static bool sdebug_lun_eh;
+static bool sdebug_lun_eh_fallback;
 static enum {
 	BLK_ZONED_NONE	= 0,
 	BLK_ZONED_HA	= 1,
@@ -7385,6 +7387,8 @@ module_param_named(zone_max_open, sdeb_zbc_max_open, int, S_IRUGO);
 module_param_named(zone_nr_conv, sdeb_zbc_nr_conv, int, S_IRUGO);
 module_param_named(zone_size_mb, sdeb_zbc_zone_size_mb, int, S_IRUGO);
 module_param_named(allow_restart, sdebug_allow_restart, bool, S_IRUGO | S_IWUSR);
+module_param_named(lun_eh, sdebug_lun_eh, bool, 0444);
+module_param_named(lun_eh_fallback, sdebug_lun_eh_fallback, bool, 0444);
 
 MODULE_AUTHOR("Eric Youngdale + Douglas Gilbert");
 MODULE_DESCRIPTION("SCSI debug adapter driver");
@@ -7464,6 +7468,8 @@ MODULE_PARM_DESC(zone_max_open, "Maximum number of open zones; [0] for no limit
 MODULE_PARM_DESC(zone_nr_conv, "Number of conventional zones (def=1)");
 MODULE_PARM_DESC(zone_size_mb, "Zone size in MiB (def=auto)");
 MODULE_PARM_DESC(allow_restart, "Set scsi_device's allow_restart flag(def=0)");
+MODULE_PARM_DESC(lun_eh, "LUN based error handle (def=0)");
+MODULE_PARM_DESC(lun_eh_fallback, "Fallback to further recovery if LUN recovery failed (def=0)");
 
 #define SDEBUG_INFO_LEN 256
 static char sdebug_info[SDEBUG_INFO_LEN];
@@ -8450,6 +8456,7 @@ static struct attribute *sdebug_drv_attrs[] = {
 ATTRIBUTE_GROUPS(sdebug_drv);
 
 static struct device *pseudo_primary;
+static struct scsi_host_template sdebug_driver_template;
 
 static int __init scsi_debug_init(void)
 {
@@ -8458,6 +8465,12 @@ static int __init scsi_debug_init(void)
 	int k, ret, hosts_to_add;
 	int idx = -1;
 
+	if (sdebug_lun_eh) {
+		sdebug_driver_template.sdev_setup_eh = scsi_device_setup_eh;
+		sdebug_driver_template.sdev_clear_eh = scsi_device_clear_eh;
+		sdebug_driver_template.sdev_eh_fallback = sdebug_lun_eh_fallback;
+	}
+
 	if (sdebug_ndelay >= 1000 * 1000 * 1000) {
 		pr_warn("ndelay must be less than 1 second, ignored\n");
 		sdebug_ndelay = 0;
@@ -9435,7 +9448,7 @@ static int sdebug_init_cmd_priv(struct Scsi_Host *shost, struct scsi_cmnd *cmd)
 	return 0;
 }
 
-static const struct scsi_host_template sdebug_driver_template = {
+static struct scsi_host_template sdebug_driver_template = {
 	.show_info =		scsi_debug_show_info,
 	.write_info =		scsi_debug_write_info,
 	.proc_name =		sdebug_proc_name,
-- 
2.33.0
[PATCH 00/14] scsi: scsi_error: Introduce new error handle mechanism
Posted by JiangJianJun 1 month ago
>I fully agree that SCSI EH is in need of reworking. But adding 
>another layer of complexity on top of the existing one ... not sure.

Perhaps it would have been better to use only the error handler on the
device from the start. Users might wonder why a single disk failure
could cause other disks to become blocking.

>Additionally: TARGET RESET TMF is dead, and has been removed from SAM
>since several years. It really is not worthwhile implementing.

Hmm.

>Can't we take a simple step, and just try to have a non-blocking version
>of device reset?
>I think that should cover quite some issues already.

Do you think it's necessary to escalate the issue after the device reset
fails? Should we reset the bus or the host? 
Moreover, a failed device reset does not necessarily indicate a fault
with the target or host. 
And what means of "non-blocking"?
Re: [PATCH 00/14] scsi: scsi_error: Introduce new error handle mechanism
Posted by Hannes Reinecke 1 month ago
On 9/2/25 07:56, JiangJianJun wrote:
>> I fully agree that SCSI EH is in need of reworking. But adding
>> another layer of complexity on top of the existing one ... not sure.
> 
> Perhaps it would have been better to use only the error handler on the
> device from the start. Users might wonder why a single disk failure
> could cause other disks to become blocking.
> 
>> Additionally: TARGET RESET TMF is dead, and has been removed from SAM
>> since several years. It really is not worthwhile implementing.
> 
> Hmm.
> 
>> Can't we take a simple step, and just try to have a non-blocking version
>> of device reset?
>> I think that should cover quite some issues already.
> 
> Do you think it's necessary to escalate the issue after the device reset
> fails? Should we reset the bus or the host?
> Moreover, a failed device reset does not necessarily indicate a fault
> with the target or host.
> And what means of "non-blocking"?
> 
On the contrary, a failed device reset _always_ needs to be escalated.
The problem is that all EH issues start with a failed command (ignoring
the sg_reset case for now).
And a command typically is associated with data buffers / memory areas.
So when a command is failed we need to know when these buffers can be
released. If the device reset fails the command could not be reset,
and the buffers cannot be released. And without further escalation the
buffers remain locked until the next reboot.
That's why host reset is so important: that typically resets the entire
HBA (via a PCI-level reset or similar), so we can be sure that
afterwards all buffers are released and the command can be completed.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                  Kernel Storage Architect
hare@suse.de                                +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich
[PATCH 00/14] scsi: scsi_error: Introduce new error handle mechanism
Posted by JiangJianJun 1 month, 2 weeks ago
I apologize for taking too long to test and modify these patches.Here
are the revision points, with the first two being the main ones:
  1. The scsi_target.can_queue value can control the increment and
    decrement operations of scsi_target.target_busy. The new error
    handler also needs to adjust target_busy. Both Bart and Mike opposed
    removing this control condition. In this version, I have only added
    the error handler check when scsi_target.can_queue <= 0.
    link: https://lore.kernel.org/linux-scsi/daba5c92-2395-4eee-b212-978fbe83b56f@oracle.com/
  2. I have added callbacks for setting and clearing the error handler
    in scsi_host_template. Drivers can support device or target error
    handlers by setting these callbacks. I believe the advantage of this
    approach is that driver developers will be aware of this feature
    when they see the callback prototypes and comments, and it can be
    used even without device or target initialization callbacks.
    However, this means that the modparam controlling the enablement can
    only be removed. I have considered adding configurations only for
    the virtio_scsi and iscsi_tcp drivers. 
    link: https://lore.kernel.org/linux-scsi/b8350de1-6ac8-4d5f-aaa7-7b03e2f7aa93@oracle.com/
  3. In scsi_eh_scmd_add, a return statement was added under the
    condition of xxx_in_recovery because each branch ultimately calls
    scsi_eh_scmd_add_shost, which does not fail. Therefore, continuing
    further would result in duplicate additions.
  4. The return type of ->is_busy was changed to bool. 

I have retained the original content of Wenchao's email in cover letter.
Re: [PATCH 00/14] scsi: scsi_error: Introduce new error handle mechanism
Posted by Damien Le Moal 1 month, 1 week ago
On 8/16/25 20:24, JiangJianJun wrote:
> It's unbearable for systems with large scale scsi devices share HBAs to
> block all devices' IOs when handle error commands, we need a new error
> handle mechanism to address this issue.
> 
> I consulted about this issue a year ago, the discuss link can be found in
> refenence. Hannes replied about why we have to block the SCSI host
> then perform error recovery kindly. I think it's unnecessary to block
> SCSI host for all drivers and can try a small level recovery(LUN based for
> example) first to avoid block the SCSI host.
> 
> The new error handle mechanism introduced in this patchset has been
> developed and tested with out self developed hardware since one year
> ago, now we want this mechanism can be used by more drivers.
> 
> Drivers can decide if using the new error handle mechanism and how to
> handle error commands when scsi_device are scanned,the new mechanism
> makes SCSI error handle more flexible.

Barely half of your emails have made it through for me and they landed in my
spam folder. So please check your email setup.

Also, was this all tested with libata and libsas attached devices as well ?
They all depend on scsi EH.


-- 
Damien Le Moal
Western Digital Research
[PATCH 00/14] scsi: scsi_error: Introduce new error handle mechanism
Posted by JiangJianJun 1 month ago
>Barely half of your emails have made it through for me and they landed in my
>spam folder. So please check your email setup.

I also find it strange, but my colleague can receive it. Maybe i reset email
and send again? 

>Also, was this all tested with libata and libsas attached devices as well ?
>They all depend on scsi EH.

There is currently no tool available for injecting faults into hard drives,
but we have implemented this solution in our company's products. So i just
test with scsi_debug.
Re: [PATCH 00/14] scsi: scsi_error: Introduce new error handle mechanism
Posted by Damien Le Moal 1 month ago
On 9/2/25 2:30 PM, JiangJianJun wrote:
>> Barely half of your emails have made it through for me and they landed in my
>> spam folder. So please check your email setup.
> 
> I also find it strange, but my colleague can receive it. Maybe i reset email
> and send again? 
> 
>> Also, was this all tested with libata and libsas attached devices as well ?
>> They all depend on scsi EH.
> 
> There is currently no tool available for injecting faults into hard drives,
> but we have implemented this solution in our company's products. So i just
> test with scsi_debug.

Use write long command to "destroy" sectors. Then try to read them. That will
generate uncorrectable read errors.

See sg_write_long (sg3utils).


-- 
Damien Le Moal
Western Digital Research
[PATCH 00/14] scsi: scsi_error: Introduce new error handle mechanism
Posted by JiangJianJun 1 month ago
>Use write long command to "destroy" sectors. Then try to read them. That will
>generate uncorrectable read errors.

There is a misunderstanding here, the condition that triggers this error
handler is when the device is slow or unresponsive, or fails to start;
bad blocks refer to data errors rather than faults.
Re: [PATCH 00/14] scsi: scsi_error: Introduce new error handle mechanism
Posted by Damien Le Moal 1 month ago
On 9/2/25 3:03 PM, JiangJianJun wrote:
>> Use write long command to "destroy" sectors. Then try to read them. That will
>> generate uncorrectable read errors.
> 
> There is a misunderstanding here, the condition that triggers this error
> handler is when the device is slow or unresponsive, or fails to start;
> bad blocks refer to data errors rather than faults.

You probably easilly can simulate an unresponsive device using either qemu or
tcmu-runner.


-- 
Damien Le Moal
Western Digital Research