From nobody Thu Oct 2 16:48:56 2025 Received: from szxga01-in.huawei.com (szxga01-in.huawei.com [45.249.212.187]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 15516235063; Sun, 14 Sep 2025 10:11:24 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=45.249.212.187 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1757844688; cv=none; b=eQCGNBW1In8W+67d2IGZxUpzXGVyY4Kc+G3vJgmUff7TSJl9vxN3aa18RwJfBw0vOond0IttvLyotyhdTbhUzxaVKyhqu/g+L5RiEQ8fkVVSpvorFxgDx778RxB/0PeGxPAnkLLzpotEYrAqf8v2+gJrshih1JcHSP+yAJ31S+s= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1757844688; c=relaxed/simple; bh=dGMhuHtWhO6mFY5j8wH09YxCtZDEgNPiyFnCcKgMkaM=; h=From:To:CC:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=r3KRktsDl6uSyscUgJNl35Px4aqP+0ZDNzbpBX9N06hIpRKQDwRUJqE9SQsIPF4XGFY2XgPayvoAbeYBtkq896yePMF95LYoeYUVHOKIPmnOiN8+6CUIp+QZKigHNVBoDH/OvSGj+JQTdvuWFKXfhMxnk1KNUiCgvMKZVVL6IiI= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=fail (p=quarantine dis=none) header.from=huawei.com; spf=pass smtp.mailfrom=h-partners.com; arc=none smtp.client-ip=45.249.212.187 Authentication-Results: smtp.subspace.kernel.org; dmarc=fail (p=quarantine dis=none) header.from=huawei.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=h-partners.com Received: from mail.maildlp.com (unknown [172.19.163.48]) by szxga01-in.huawei.com (SkyGuard) with ESMTP id 4cPkT13HQRz14MbG; Sun, 14 Sep 2025 18:11:01 +0800 (CST) Received: from kwepemk500001.china.huawei.com (unknown [7.202.194.86]) by mail.maildlp.com (Postfix) with ESMTPS id 7E71418006C; Sun, 14 Sep 2025 18:11:16 +0800 (CST) Received: from localhost.localdomain (10.175.104.170) by kwepemk500001.china.huawei.com (7.202.194.86) with Microsoft SMTP Server (version=TLS1_2, 
cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1544.11; Sun, 14 Sep 2025 18:11:15 +0800 From: JiangJianJun To: , , CC: , , , , , , Subject: [RFC PATCH v4 1/9] scsi: scsi_error: Define framework for LUN based error handle Date: Sun, 14 Sep 2025 18:41:37 +0800 Message-ID: <20250914104145.2239901-2-jiangjianjun3@huawei.com> X-Mailer: git-send-email 2.33.0 In-Reply-To: <20250914104145.2239901-1-jiangjianjun3@huawei.com> References: <17230842-0a7a-403e-abc7-a15e3aa5d424@suse.de> <20250914104145.2239901-1-jiangjianjun3@huawei.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-ClientProxiedBy: kwepems200001.china.huawei.com (7.221.188.67) To kwepemk500001.china.huawei.com (7.202.194.86) Content-Type: text/plain; charset="utf-8"

From: Wenchao Hao

The existing SCSI error handling logic is host based: once a SCSI command on one LUN of a host is classified as failed, the SCSI mid-level puts the whole host into recovery state, and no I/O can be submitted to any LUN of that host until recovery finishes. Recovery may take a long time, which is unreasonable when a host has many LUNs.

This change introduces a way for a driver to implement its own error handling logic with the SCSI LUN as the minimum unit.

struct scsi_device_eh is defined for LUN based error handling, and a pointer to it, "eh", is added to struct scsi_device; it is NULL by default. LLDs can initialize sdev->eh in hostt->slave_alloc to implement LUN based error handling. If this member is not NULL, the SCSI mid-level branches to the driver's error handler rather than the old one, which blocks I/O for the whole host.

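The dispatch described above can be illustrated with a minimal userspace sketch. It is not kernel code: the struct definitions are cut-down stand-ins for the kernel types, the demo_* callbacks and the failed/wakeups counters are hypothetical, and eh_add_sdev()/device_in_recovery() only mirror the shape of scsi_eh_scmd_add_sdev() and scsi_device_in_recovery() from this patch (without the scsi_eh_reset() call).

```c
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-ins for the kernel types; only the fields used here. */
struct scsi_device;
struct scsi_cmnd;

struct scsi_device_eh {
	void (*add_cmnd)(struct scsi_cmnd *scmd);
	bool (*is_busy)(struct scsi_device *sdev);
	void (*wakeup)(struct scsi_device *sdev);
};

struct scsi_device {
	struct scsi_device_eh *eh;	/* NULL: host based handling is used */
	int failed;			/* hypothetical: commands queued for LUN EH */
	int wakeups;			/* hypothetical: number of wakeup calls */
};

struct scsi_cmnd {
	struct scsi_device *device;
};

/* Hypothetical LLD callbacks, for illustration only. */
static void demo_add_cmnd(struct scsi_cmnd *scmd)
{
	scmd->device->failed++;
}

static bool demo_is_busy(struct scsi_device *sdev)
{
	return sdev->failed > 0;
}

static void demo_wakeup(struct scsi_device *sdev)
{
	sdev->wakeups++;
}

/* What an LLD might point sdev->eh at from its slave_alloc callback. */
static struct scsi_device_eh demo_eh = {
	.add_cmnd = demo_add_cmnd,
	.is_busy = demo_is_busy,
	.wakeup = demo_wakeup,
};

/* Returns true when the caller must fall back to the host based handler. */
static bool eh_add_sdev(struct scsi_cmnd *scmd)
{
	struct scsi_device_eh *eh = scmd->device->eh;

	if (!eh || !eh->add_cmnd)
		return true;

	eh->add_cmnd(scmd);
	if (eh->wakeup)
		eh->wakeup(scmd->device);

	return false;
}

/* Dispatch is refused while the LUN's own handler reports it is busy. */
static bool device_in_recovery(struct scsi_device *sdev)
{
	struct scsi_device_eh *eh = sdev->eh;

	return eh && eh->is_busy && eh->is_busy(sdev);
}
```

A device without sdev->eh keeps the existing behaviour, because eh_add_sdev() returns true and the caller adds the command to the host's error handler.
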
Signed-off-by: Wenchao Hao Co-developed-by: JiangJianJun Signed-off-by: JiangJianJun --- drivers/scsi/scsi_error.c | 42 +++++++++++++++++++++++++++++++++++--- drivers/scsi/scsi_lib.c | 7 +++++++ drivers/scsi/scsi_priv.h | 8 ++++++++ include/scsi/scsi_device.h | 29 ++++++++++++++++++++++++++ 4 files changed, 83 insertions(+), 3 deletions(-) diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c index 746ff6a1f309..b5b04f2c5d62 100644 --- a/drivers/scsi/scsi_error.c +++ b/drivers/scsi/scsi_error.c @@ -291,11 +291,28 @@ static void scsi_eh_inc_host_failed(struct rcu_head *= head) spin_unlock_irqrestore(shost->host_lock, flags); } =20 +static bool scsi_eh_scmd_add_sdev(struct scsi_cmnd *scmd) +{ + struct scsi_device *sdev =3D scmd->device; + struct scsi_device_eh *eh =3D sdev->eh; + + if (!eh || !eh->add_cmnd) + return true; + + scsi_eh_reset(scmd); + eh->add_cmnd(scmd); + + if (eh->wakeup) + eh->wakeup(sdev); + + return false; +} + /** - * scsi_eh_scmd_add - add scsi cmd to error handling. - * @scmd: scmd to run eh on. + * Add scsi cmd to error handling of Scsi_Host. + * This is default action of error handle. */ -void scsi_eh_scmd_add(struct scsi_cmnd *scmd) +static void scsi_eh_scmd_add_shost(struct scsi_cmnd *scmd) { struct Scsi_Host *shost =3D scmd->device->host; unsigned long flags; @@ -322,6 +339,25 @@ void scsi_eh_scmd_add(struct scsi_cmnd *scmd) call_rcu_hurry(&scmd->rcu, scsi_eh_inc_host_failed); } =20 +/** + * scsi_eh_scmd_add - add scsi cmd to error handling. + * @scmd: scmd to run eh on. + */ +void scsi_eh_scmd_add(struct scsi_cmnd *scmd) +{ + struct Scsi_Host *shost =3D scmd->device->host; + + if (unlikely(scsi_host_in_recovery(shost)) || + scsi_eh_scmd_add_sdev(scmd)) + scsi_eh_scmd_add_shost(scmd); +} + +void scsi_eh_try_wakeup_sdev(struct scsi_device *sdev) +{ + if (sdev->eh && sdev->eh->wakeup) + sdev->eh->wakeup(sdev); +} + /** * scsi_timeout - Timeout function for normal scsi commands. * @req: request that is timing out. 
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c index 0c65ecfedfbd..ac8fb801aa82 100644 --- a/drivers/scsi/scsi_lib.c +++ b/drivers/scsi/scsi_lib.c @@ -398,6 +398,8 @@ void scsi_device_unbusy(struct scsi_device *sdev, struc= t scsi_cmnd *cmd) =20 sbitmap_put(&sdev->budget_map, cmd->budget_token); cmd->budget_token =3D -1; + + scsi_eh_try_wakeup_sdev(sdev); } =20 /* @@ -1360,6 +1362,9 @@ static inline int scsi_dev_queue_ready(struct request= _queue *q, { int token; =20 + if (scsi_device_in_recovery(sdev)) + return -1; + token =3D sbitmap_get(&sdev->budget_map); if (token < 0) return -1; @@ -1374,6 +1379,7 @@ static inline int scsi_dev_queue_ready(struct request= _queue *q, if (scsi_device_busy(sdev) > 1 || atomic_dec_return(&sdev->device_blocked) > 0) { sbitmap_put(&sdev->budget_map, token); + scsi_eh_try_wakeup_sdev(sdev); return -1; } =20 @@ -1882,6 +1888,7 @@ static blk_status_t scsi_queue_rq(struct blk_mq_hw_ct= x *hctx, out_put_budget: scsi_mq_put_budget(q, cmd->budget_token); cmd->budget_token =3D -1; + scsi_eh_try_wakeup_sdev(sdev); switch (ret) { case BLK_STS_OK: break; diff --git a/drivers/scsi/scsi_priv.h b/drivers/scsi/scsi_priv.h index 5b2b19f5e8ec..906f090d917c 100644 --- a/drivers/scsi/scsi_priv.h +++ b/drivers/scsi/scsi_priv.h @@ -92,6 +92,7 @@ extern int scsi_error_handler(void *host); extern enum scsi_disposition scsi_decide_disposition(struct scsi_cmnd *cmd= ); extern void scsi_eh_wakeup(struct Scsi_Host *shost, unsigned int busy); extern void scsi_eh_scmd_add(struct scsi_cmnd *); +extern void scsi_eh_try_wakeup_sdev(struct scsi_device *sdev); void scsi_eh_ready_devs(struct Scsi_Host *shost, struct list_head *work_q, struct list_head *done_q); @@ -191,6 +192,13 @@ static inline void scsi_dh_add_device(struct scsi_devi= ce *sdev) { } static inline void scsi_dh_release_device(struct scsi_device *sdev) { } #endif =20 +static inline bool scsi_device_in_recovery(struct scsi_device *sdev) +{ + struct scsi_device_eh *eh =3D sdev->eh; + + 
return eh && eh->is_busy && eh->is_busy(sdev); +} + struct bsg_device *scsi_bsg_register_queue(struct scsi_device *sdev); =20 extern int scsi_device_max_queue_depth(struct scsi_device *sdev); diff --git a/include/scsi/scsi_device.h b/include/scsi/scsi_device.h index 6d6500148c4b..c21b0a84bbd2 100644 --- a/include/scsi/scsi_device.h +++ b/include/scsi/scsi_device.h @@ -100,6 +100,34 @@ struct scsi_vpd { unsigned char data[]; }; =20 +struct scsi_device; + +struct scsi_device_eh { + /* + * add scsi command to error handler so it would be handled by + * driver's error handle strategy + */ + void (*add_cmnd)(struct scsi_cmnd *scmd); + + /* + * to judge if the device is busy handling errors, called before + * dispatch scsi cmnd + * + * return false if it's ready to accept scsi cmnd + * return true if it's in error handle, commands would not be dispatched + */ + bool (*is_busy)(struct scsi_device *sdev); + + /* + * wakeup device's error handle + * + * usually the error handler strategy would not run at once when + * error command is added. This function would be called when any + * scsi cmnd is finished or when scsi cmnd is added. 
+ */ + void (*wakeup)(struct scsi_device *sdev); +}; + struct scsi_device { struct Scsi_Host *host; struct request_queue *request_queue; @@ -289,6 +317,7 @@ struct scsi_device { struct mutex state_mutex; enum scsi_device_state sdev_state; struct task_struct *quiesced_by; + struct scsi_device_eh *eh; unsigned long sdev_data[]; } __attribute__((aligned(sizeof(unsigned long)))); =20 --=20 2.33.0 From nobody Thu Oct 2 16:48:56 2025 Received: from szxga06-in.huawei.com (szxga06-in.huawei.com [45.249.212.32]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id BEB4B238C0D; Sun, 14 Sep 2025 10:11:25 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=45.249.212.32 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1757844687; cv=none; b=CCX1JyHopM1m57GabsIBE6LLjwZzTAL+R/XFYlocBc2JCcuZt8HMM72laBoZi9OzgrQ/VpMLKsPaMCeuNg14Eo8STIoi0916a4IHxH7HBGtOF/M170ylyFC7LSKzRBmQT3Qn2uyySeEHpY+eY86C4Ophgvwh7AhJ2WPzpg+4FCg= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1757844687; c=relaxed/simple; bh=Z4FKAKyAGyU1IB9jcpQbKJ1PUYxMFhEnjWUtXfsR3L0=; h=From:To:CC:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=sGnfoqDN0qfg26JjzPHql5cqaMLR2Q7/Qx2mDQfevHmzMZqbR1cz7vqWCLBMybxyLMCrqZERoX47YaAKsIdSsxnpEX+nDyj1/T1iYTwY7YVpEWvJ0mYd8X11hSiX7ogRUCA4C55N4qu6ozW1FLwF2bMRiit+0LcZJQWitOaBioE= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=fail (p=quarantine dis=none) header.from=huawei.com; spf=pass smtp.mailfrom=h-partners.com; arc=none smtp.client-ip=45.249.212.32 Authentication-Results: smtp.subspace.kernel.org; dmarc=fail (p=quarantine dis=none) header.from=huawei.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=h-partners.com Received: from mail.maildlp.com (unknown [172.19.163.17]) by szxga06-in.huawei.com (SkyGuard) with ESMTP id 
4cPkVf6tNxz27j0x; Sun, 14 Sep 2025 18:12:26 +0800 (CST) Received: from kwepemk500001.china.huawei.com (unknown [7.202.194.86]) by mail.maildlp.com (Postfix) with ESMTPS id 2A4781A0188; Sun, 14 Sep 2025 18:11:17 +0800 (CST) Received: from localhost.localdomain (10.175.104.170) by kwepemk500001.china.huawei.com (7.202.194.86) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1544.11; Sun, 14 Sep 2025 18:11:16 +0800 From: JiangJianJun To: , , CC: , , , , , , Subject: [RFC PATCH v4 2/9] scsi: scsi_error: Move complete variable eh_action from shost to sdevice Date: Sun, 14 Sep 2025 18:41:38 +0800 Message-ID: <20250914104145.2239901-3-jiangjianjun3@huawei.com> X-Mailer: git-send-email 2.33.0 In-Reply-To: <20250914104145.2239901-1-jiangjianjun3@huawei.com> References: <17230842-0a7a-403e-abc7-a15e3aa5d424@suse.de> <20250914104145.2239901-1-jiangjianjun3@huawei.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-ClientProxiedBy: kwepems200001.china.huawei.com (7.221.188.67) To kwepemk500001.china.huawei.com (7.202.194.86) Content-Type: text/plain; charset="utf-8"

From: Wenchao Hao

eh_action is used to wait for the completion of a SCSI command that is sent from the error handler. Since the error handler may now be based on scsi_device, move eh_action to scsi_device.

This is preparation for a general LUN based error handling strategy.

Signed-off-by: Wenchao Hao Co-developed-by: JiangJianJun Signed-off-by: JiangJianJun --- drivers/scsi/scsi_error.c | 6 +++--- include/scsi/scsi_device.h | 2 ++ include/scsi/scsi_host.h | 2 -- 3 files changed, 5 insertions(+), 5 deletions(-) diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c index b5b04f2c5d62..d8b3f5b0fd47 100644 --- a/drivers/scsi/scsi_error.c +++ b/drivers/scsi/scsi_error.c @@ -917,7 +917,7 @@ void scsi_eh_done(struct scsi_cmnd *scmd) SCSI_LOG_ERROR_RECOVERY(3, scmd_printk(KERN_INFO, scmd, "%s result: %x\n", __func__, scmd->result)); =20 - eh_action =3D scmd->device->host->eh_action; + eh_action =3D scmd->device->eh_action; if (eh_action) complete(eh_action); } @@ -1206,7 +1206,7 @@ static enum scsi_disposition scsi_send_eh_cmnd(struct= scsi_cmnd *scmd, =20 retry: scsi_eh_prep_cmnd(scmd, &ses, cmnd, cmnd_size, sense_bytes); - shost->eh_action =3D &done; + sdev->eh_action =3D &done; =20 scsi_log_send(scmd); scmd->submitter =3D SUBMITTED_BY_SCSI_ERROR_HANDLER; @@ -1250,7 +1250,7 @@ static enum scsi_disposition scsi_send_eh_cmnd(struct= scsi_cmnd *scmd, rtn =3D SUCCESS; } =20 - shost->eh_action =3D NULL; + sdev->eh_action =3D NULL; =20 scsi_log_completion(scmd, rtn); =20 diff --git a/include/scsi/scsi_device.h b/include/scsi/scsi_device.h index c21b0a84bbd2..9d42858035ed 100644 --- a/include/scsi/scsi_device.h +++ b/include/scsi/scsi_device.h @@ -318,6 +318,8 @@ struct scsi_device { enum scsi_device_state sdev_state; struct task_struct *quiesced_by; struct scsi_device_eh *eh; + struct completion *eh_action; /* Wait for specific actions */ + /* on the device. 
*/ unsigned long sdev_data[]; } __attribute__((aligned(sizeof(unsigned long)))); =20 diff --git a/include/scsi/scsi_host.h b/include/scsi/scsi_host.h index c53812b9026f..46f57fe78505 100644 --- a/include/scsi/scsi_host.h +++ b/include/scsi/scsi_host.h @@ -558,8 +558,6 @@ struct Scsi_Host { struct list_head eh_abort_list; struct list_head eh_cmd_q; struct task_struct * ehandler; /* Error recovery thread. */ - struct completion * eh_action; /* Wait for specific actions on the - host. */ wait_queue_head_t host_wait; const struct scsi_host_template *hostt; struct scsi_transport_template *transportt; --=20 2.33.0 From nobody Thu Oct 2 16:48:56 2025 Received: from szxga04-in.huawei.com (szxga04-in.huawei.com [45.249.212.190]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A049C237713; Sun, 14 Sep 2025 10:11:25 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=45.249.212.190 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1757844687; cv=none; b=R5WAKnmNML3BBD1B6rRI4L9bvHyDxPB/bvLFtgXlaO24bg5lhvRzaR8huy0x3xBjCJxisju3IhP0w0LagYZ33tPLKAUrJpDqCUwIurpYgcrKARaVdYapB1YXBOf7km+URJrO97JNZQlCYgopRiHpakKDo7tLR17MfeGO7NtcxlI= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1757844687; c=relaxed/simple; bh=vL+au5akwb6Zlvmhtnpk9H6gQNiqpw3WdYDKasAzjRA=; h=From:To:CC:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=a5EewklItlVXji7NIekwdiD7EyOh/Dn4Pi4KWZy6XHFzQhjKJwegQ9m8DIT2ps+6JkaXvcbqAT3y5wXoOUwINKR1wBg/elJ/TmhSQ3ENIM+aZNn0aPgWVpRv1CbE5ZVPimi/VFewWDBn5/+41m4PxvQ3sdrgbj5PAtAr+7z1iZk= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=fail (p=quarantine dis=none) header.from=huawei.com; spf=pass smtp.mailfrom=h-partners.com; arc=none smtp.client-ip=45.249.212.190 Authentication-Results: smtp.subspace.kernel.org; dmarc=fail (p=quarantine dis=none) 
header.from=huawei.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=h-partners.com Received: from mail.maildlp.com (unknown [172.19.88.163]) by szxga04-in.huawei.com (SkyGuard) with ESMTP id 4cPkN35ch4z2CgCX; Sun, 14 Sep 2025 18:06:43 +0800 (CST) Received: from kwepemk500001.china.huawei.com (unknown [7.202.194.86]) by mail.maildlp.com (Postfix) with ESMTPS id C5C4C180043; Sun, 14 Sep 2025 18:11:17 +0800 (CST) Received: from localhost.localdomain (10.175.104.170) by kwepemk500001.china.huawei.com (7.202.194.86) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1544.11; Sun, 14 Sep 2025 18:11:16 +0800 From: JiangJianJun To: , , CC: , , , , , , Subject: [RFC PATCH v4 3/9] scsi: scsi_error: Check if to do reset in scsi_try_xxx_reset Date: Sun, 14 Sep 2025 18:41:39 +0800 Message-ID: <20250914104145.2239901-4-jiangjianjun3@huawei.com> X-Mailer: git-send-email 2.33.0 In-Reply-To: <20250914104145.2239901-1-jiangjianjun3@huawei.com> References: <17230842-0a7a-403e-abc7-a15e3aa5d424@suse.de> <20250914104145.2239901-1-jiangjianjun3@huawei.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-ClientProxiedBy: kwepems200001.china.huawei.com (7.221.188.67) To kwepemk500001.china.huawei.com (7.202.194.86) Content-Type: text/plain; charset="utf-8"

From: Wenchao Hao

This is preparation for a general LUN based error handling strategy. The strategy reuses some error handler APIs, but some steps of these functions should not be performed. For example, a bus or host reset should not be performed when only the I/O on a single LUN is stopped.

This change adds checks to scsi_try_xxx_reset so that the reset is performed only when the host is in recovery; otherwise FAILED is returned without resetting anything.

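The guard can be pictured with the following userspace sketch. It assumes cut-down stand-ins for the kernel types: the in_recovery field replaces scsi_host_in_recovery(), the enum values are placeholders rather than the kernel's constants, and do_bus_reset()/try_bus_reset() are hypothetical names playing the roles of __scsi_try_bus_reset() and scsi_try_bus_reset().

```c
#include <stdbool.h>

/* Placeholder values; the kernel's enum scsi_disposition differs. */
enum scsi_disposition { SUCCESS, FAILED };

/* Stand-in for struct Scsi_Host with only what the sketch needs. */
struct Scsi_Host {
	bool in_recovery;	/* stands in for scsi_host_in_recovery() */
	int bus_resets;		/* hypothetical counter of resets performed */
};

/* Unconditional reset, as still used by the ioctl path in this patch. */
static enum scsi_disposition do_bus_reset(struct Scsi_Host *shost)
{
	shost->bus_resets++;
	return SUCCESS;
}

/* Guarded reset: refuse unless the whole host is already in recovery. */
static enum scsi_disposition try_bus_reset(struct Scsi_Host *shost)
{
	if (!shost->in_recovery)
		return FAILED;

	return do_bus_reset(shost);
}
```

With this split, a LUN based handler that reuses the escalation helpers gets FAILED from the guarded variant and never resets the bus, while scsi_ioctl_reset() keeps its behaviour by calling the unguarded variant directly.
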
Signed-off-by: Wenchao Hao Co-developed-by: JiangJianJun Signed-off-by: JiangJianJun --- drivers/scsi/scsi_error.c | 24 ++++++++++++++++++++---- 1 file changed, 20 insertions(+), 4 deletions(-) diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c index d8b3f5b0fd47..80a85b387068 100644 --- a/drivers/scsi/scsi_error.c +++ b/drivers/scsi/scsi_error.c @@ -926,7 +926,7 @@ void scsi_eh_done(struct scsi_cmnd *scmd) * scsi_try_host_reset - ask host adapter to reset itself * @scmd: SCSI cmd to send host reset. */ -static enum scsi_disposition scsi_try_host_reset(struct scsi_cmnd *scmd) +static enum scsi_disposition __scsi_try_host_reset(struct scsi_cmnd *scmd) { unsigned long flags; enum scsi_disposition rtn; @@ -952,11 +952,19 @@ static enum scsi_disposition scsi_try_host_reset(stru= ct scsi_cmnd *scmd) return rtn; } =20 +static enum scsi_disposition scsi_try_host_reset(struct scsi_cmnd *scmd) +{ + if (!scsi_host_in_recovery(scmd->device->host)) + return FAILED; + + return __scsi_try_host_reset(scmd); +} + /** * scsi_try_bus_reset - ask host to perform a bus reset * @scmd: SCSI cmd to send bus reset. 
*/ -static enum scsi_disposition scsi_try_bus_reset(struct scsi_cmnd *scmd) +static enum scsi_disposition __scsi_try_bus_reset(struct scsi_cmnd *scmd) { unsigned long flags; enum scsi_disposition rtn; @@ -982,6 +990,14 @@ static enum scsi_disposition scsi_try_bus_reset(struct= scsi_cmnd *scmd) return rtn; } =20 +static enum scsi_disposition scsi_try_bus_reset(struct scsi_cmnd *scmd) +{ + if (!scsi_host_in_recovery(scmd->device->host)) + return FAILED; + + return __scsi_try_bus_reset(scmd); +} + static void __scsi_report_device_reset(struct scsi_device *sdev, void *dat= a) { sdev->was_reset =3D 1; @@ -2547,12 +2563,12 @@ scsi_ioctl_reset(struct scsi_device *dev, int __use= r *arg) break; fallthrough; case SG_SCSI_RESET_BUS: - rtn =3D scsi_try_bus_reset(scmd); + rtn =3D __scsi_try_bus_reset(scmd); if (rtn =3D=3D SUCCESS || (val & SG_SCSI_RESET_NO_ESCALATE)) break; fallthrough; case SG_SCSI_RESET_HOST: - rtn =3D scsi_try_host_reset(scmd); + rtn =3D __scsi_try_host_reset(scmd); if (rtn =3D=3D SUCCESS) break; fallthrough; --=20 2.33.0 From nobody Thu Oct 2 16:48:56 2025 Received: from szxga06-in.huawei.com (szxga06-in.huawei.com [45.249.212.32]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id BEAD522DF9E; Sun, 14 Sep 2025 10:11:25 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=45.249.212.32 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1757844688; cv=none; b=CcBasS393uahtuo5WpDK0GZsetFpg77RBjNgz2YScBBPc+qKWFbEb1ED9axTQrtoFnT4dMeWhtZfxa47PfTRk8u1cMvpM2fNuuzwjM9NjLTnq/1KCIhf3kRX3TpgTzDXOyXpQGEe34EkieUhmsxmJV4qgIotfZLxiXNJV8uBtvg= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1757844688; c=relaxed/simple; bh=wgQ3RYY9DNiXsBqjzIz+5B5okUFc/bdGT33SjQ1d4kY=; h=From:To:CC:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; 
b=meHbKn2tAPt+0Wp6aEyiO8Wti3QbZEHmn1/xP3TgRfYUdDaXluYN8KWCMT0hjLlCmwzsCrCFTTtY2cuOOvfNmNdi85xDqLBafqd/75/tr54rTyqBUJaqqtZ86/TApguyDIEnfcFQiKDOtwDdo/gPwYgo9B66QgjqB/9DX2kImX8= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=fail (p=quarantine dis=none) header.from=huawei.com; spf=pass smtp.mailfrom=h-partners.com; arc=none smtp.client-ip=45.249.212.32 Authentication-Results: smtp.subspace.kernel.org; dmarc=fail (p=quarantine dis=none) header.from=huawei.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=h-partners.com Received: from mail.maildlp.com (unknown [172.19.163.44]) by szxga06-in.huawei.com (SkyGuard) with ESMTP id 4cPkVh1svQz27j5N; Sun, 14 Sep 2025 18:12:28 +0800 (CST) Received: from kwepemk500001.china.huawei.com (unknown [7.202.194.86]) by mail.maildlp.com (Postfix) with ESMTPS id 715971402CA; Sun, 14 Sep 2025 18:11:18 +0800 (CST) Received: from localhost.localdomain (10.175.104.170) by kwepemk500001.china.huawei.com (7.202.194.86) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1544.11; Sun, 14 Sep 2025 18:11:17 +0800 From: JiangJianJun To: , , CC: , , , , , , Subject: [RFC PATCH v4 4/9] scsi: scsi_error: Add helper scsi_eh_sdev_stu to do START_UNIT Date: Sun, 14 Sep 2025 18:41:40 +0800 Message-ID: <20250914104145.2239901-5-jiangjianjun3@huawei.com> X-Mailer: git-send-email 2.33.0 In-Reply-To: <20250914104145.2239901-1-jiangjianjun3@huawei.com> References: <17230842-0a7a-403e-abc7-a15e3aa5d424@suse.de> <20250914104145.2239901-1-jiangjianjun3@huawei.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-ClientProxiedBy: kwepems200001.china.huawei.com (7.221.188.67) To kwepemk500001.china.huawei.com (7.202.194.86) Content-Type: text/plain; charset="utf-8" From: Wenchao Hao Add helper function scsi_eh_sdev_stu() to perform START_UNIT and check if 
to finish some error commands. This is preparation for a general LUN based error handling strategy and does not change the original logic. Signed-off-by: Wenchao Hao Co-developed-by: JiangJianJun Signed-off-by: JiangJianJun --- drivers/scsi/scsi_error.c | 50 +++++++++++++++++++++++---------------- 1 file changed, 29 insertions(+), 21 deletions(-) diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c index 80a85b387068..a1b700dbb0ec 100644 --- a/drivers/scsi/scsi_error.c +++ b/drivers/scsi/scsi_error.c @@ -1559,6 +1559,31 @@ static int scsi_eh_try_stu(struct scsi_cmnd *scmd) return 1; } =20 +static int scsi_eh_sdev_stu(struct scsi_cmnd *scmd, + struct list_head *work_q, + struct list_head *done_q) +{ + struct scsi_device *sdev =3D scmd->device; + struct scsi_cmnd *next; + + SCSI_LOG_ERROR_RECOVERY(3, sdev_printk(KERN_INFO, sdev, + "%s: Sending START_UNIT\n", current->comm)); + + if (scsi_eh_try_stu(scmd)) { + SCSI_LOG_ERROR_RECOVERY(3, sdev_printk(KERN_INFO, sdev, + "%s: START_UNIT failed\n", current->comm)); + return 0; + } + + if (!scsi_device_online(sdev) || !scsi_eh_tur(scmd)) + list_for_each_entry_safe(scmd, next, work_q, eh_entry) + if (scmd->device =3D=3D sdev && + scsi_eh_action(scmd, SUCCESS) =3D=3D SUCCESS) + scsi_eh_finish_cmd(scmd, done_q); + + return list_empty(work_q); +} + /** * scsi_eh_stu - send START_UNIT if needed * @shost: &scsi host being recovered. 
@@ -1573,7 +1598,7 @@ static int scsi_eh_stu(struct Scsi_Host *shost, struct list_head *work_q, struct list_head *done_q) { - struct scsi_cmnd *scmd, *stu_scmd, *next; + struct scsi_cmnd *scmd, *stu_scmd; struct scsi_device *sdev; =20 shost_for_each_device(sdev, shost) { @@ -1596,26 +1621,9 @@ static int scsi_eh_stu(struct Scsi_Host *shost, if (!stu_scmd) continue; =20 - SCSI_LOG_ERROR_RECOVERY(3, - sdev_printk(KERN_INFO, sdev, - "%s: Sending START_UNIT\n", - current->comm)); - - if (!scsi_eh_try_stu(stu_scmd)) { - if (!scsi_device_online(sdev) || - !scsi_eh_tur(stu_scmd)) { - list_for_each_entry_safe(scmd, next, - work_q, eh_entry) { - if (scmd->device =3D=3D sdev && - scsi_eh_action(scmd, SUCCESS) =3D=3D SUCCESS) - scsi_eh_finish_cmd(scmd, done_q); - } - } - } else { - SCSI_LOG_ERROR_RECOVERY(3, - sdev_printk(KERN_INFO, sdev, - "%s: START_UNIT failed\n", - current->comm)); + if (scsi_eh_sdev_stu(stu_scmd, work_q, done_q)) { + scsi_device_put(sdev); + break; } } =20 --=20 2.33.0 From nobody Thu Oct 2 16:48:56 2025 Received: from szxga02-in.huawei.com (szxga02-in.huawei.com [45.249.212.188]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id C7C1A234973; Sun, 14 Sep 2025 10:11:26 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=45.249.212.188 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1757844689; cv=none; b=Q8v2AKBopM+HtUqr7ZNaXjdIbVBBDGNkojOaQD9IWbb2Z1B1PAb2JrrhYq//hPC2WfDSCyBePMaeYjKKOWil5gD9o9LqrR9c6FtcTMqUAk39fWUwOADhOrv4U/zhj5FEBXkULFzVDt6heqC2OGZ0H2vO2T6pzPqHebvGOWYJSzo= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1757844689; c=relaxed/simple; bh=Lnr6tGUtlbXLP+Z/ARbSnGTNMadVel0cOI/IQcIzfJU=; h=From:To:CC:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; 
b=ldNti6EDLcHv37ddaFKZ1Buij0kWklXY0o2dWQimfq35ZZTlA+yyBo029Y3xVN7NOh6BfQjnXDuWbM4B9pnjoYSzfUtyuqwNcIhJmf9vQBN2u5mFX50E4gX87n1EDGUCsntPhLRthEjYy2vZJcbBdXQW4sqaBk0WzM9+xrWWN+U= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=fail (p=quarantine dis=none) header.from=huawei.com; spf=pass smtp.mailfrom=h-partners.com; arc=none smtp.client-ip=45.249.212.188 Authentication-Results: smtp.subspace.kernel.org; dmarc=fail (p=quarantine dis=none) header.from=huawei.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=h-partners.com Received: from mail.maildlp.com (unknown [172.19.88.194]) by szxga02-in.huawei.com (SkyGuard) with ESMTP id 4cPkSJ4jpTztTSW; Sun, 14 Sep 2025 18:10:24 +0800 (CST) Received: from kwepemk500001.china.huawei.com (unknown [7.202.194.86]) by mail.maildlp.com (Postfix) with ESMTPS id 1D954140275; Sun, 14 Sep 2025 18:11:19 +0800 (CST) Received: from localhost.localdomain (10.175.104.170) by kwepemk500001.china.huawei.com (7.202.194.86) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1544.11; Sun, 14 Sep 2025 18:11:18 +0800 From: JiangJianJun To: , , CC: , , , , , , Subject: [RFC PATCH v4 5/9] scsi: scsi_error: Add helper scsi_eh_sdev_reset to do lun reset Date: Sun, 14 Sep 2025 18:41:41 +0800 Message-ID: <20250914104145.2239901-6-jiangjianjun3@huawei.com> X-Mailer: git-send-email 2.33.0 In-Reply-To: <20250914104145.2239901-1-jiangjianjun3@huawei.com> References: <17230842-0a7a-403e-abc7-a15e3aa5d424@suse.de> <20250914104145.2239901-1-jiangjianjun3@huawei.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-ClientProxiedBy: kwepems200001.china.huawei.com (7.221.188.67) To kwepemk500001.china.huawei.com (7.202.194.86) Content-Type: text/plain; charset="utf-8" From: Wenchao Hao Add helper function scsi_eh_sdev_reset() to perform lun reset and check if 
to finish some error commands. This is preparation for a general LUN based error handling strategy and does not change the original logic. Signed-off-by: Wenchao Hao Co-developed-by: JiangJianJun Signed-off-by: JiangJianJun --- drivers/scsi/scsi_error.c | 54 +++++++++++++++++++++++---------------- 1 file changed, 32 insertions(+), 22 deletions(-) diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c index a1b700dbb0ec..f58aad351463 100644 --- a/drivers/scsi/scsi_error.c +++ b/drivers/scsi/scsi_error.c @@ -1630,6 +1630,34 @@ static int scsi_eh_stu(struct Scsi_Host *shost, return list_empty(work_q); } =20 +static int scsi_eh_sdev_reset(struct scsi_cmnd *scmd, + struct list_head *work_q, + struct list_head *done_q) +{ + struct scsi_cmnd *next; + struct scsi_device *sdev =3D scmd->device; + enum scsi_disposition rtn; + + SCSI_LOG_ERROR_RECOVERY(3, sdev_printk(KERN_INFO, sdev, + "%s: Sending BDR\n", current->comm)); + + rtn =3D scsi_try_bus_device_reset(scmd); + if (rtn !=3D SUCCESS && rtn !=3D FAST_IO_FAIL) { + SCSI_LOG_ERROR_RECOVERY(3, + sdev_printk(KERN_INFO, sdev, + "%s: BDR failed\n", current->comm)); + return 0; + } + + if (!scsi_device_online(sdev) || rtn =3D=3D FAST_IO_FAIL || + !scsi_eh_tur(scmd)) + list_for_each_entry_safe(scmd, next, work_q, eh_entry) + if (scmd->device =3D=3D sdev && + scsi_eh_action(scmd, rtn) !=3D FAILED) + scsi_eh_finish_cmd(scmd, done_q); + + return list_empty(work_q); +} =20 /** * scsi_eh_bus_device_reset - send bdr if needed @@ -1647,9 +1675,8 @@ static int scsi_eh_bus_device_reset(struct Scsi_Host = *shost, struct list_head *work_q, struct list_head *done_q) { - struct scsi_cmnd *scmd, *bdr_scmd, *next; + struct scsi_cmnd *scmd, *bdr_scmd; struct scsi_device *sdev; - enum scsi_disposition rtn; =20 shost_for_each_device(sdev, shost) { if (scsi_host_eh_past_deadline(shost)) { @@ -1670,26 +1697,9 @@ static int scsi_eh_bus_device_reset(struct Scsi_Host= *shost, if (!bdr_scmd) continue; =20 - SCSI_LOG_ERROR_RECOVERY(3, - 
sdev_printk(KERN_INFO, sdev, - "%s: Sending BDR\n", current->comm)); - rtn =3D scsi_try_bus_device_reset(bdr_scmd); - if (rtn =3D=3D SUCCESS || rtn =3D=3D FAST_IO_FAIL) { - if (!scsi_device_online(sdev) || - rtn =3D=3D FAST_IO_FAIL || - !scsi_eh_tur(bdr_scmd)) { - list_for_each_entry_safe(scmd, next, - work_q, eh_entry) { - if (scmd->device =3D=3D sdev && - scsi_eh_action(scmd, rtn) !=3D FAILED) - scsi_eh_finish_cmd(scmd, - done_q); - } - } - } else { - SCSI_LOG_ERROR_RECOVERY(3, - sdev_printk(KERN_INFO, sdev, - "%s: BDR failed\n", current->comm)); + if (scsi_eh_sdev_reset(bdr_scmd, work_q, done_q)) { + scsi_device_put(sdev); + break; } } =20 --=20 2.33.0
From nobody Thu Oct 2 16:48:56 2025
From: JiangJianJun
Subject: [RFC PATCH v4 6/9] scsi: scsi_error: Add flags to mark error handle steps has done
Date: Sun, 14 Sep 2025 18:41:42 +0800
Message-ID: <20250914104145.2239901-7-jiangjianjun3@huawei.com>
In-Reply-To: <20250914104145.2239901-1-jiangjianjun3@huawei.com>
References: <17230842-0a7a-403e-abc7-a15e3aa5d424@suse.de> <20250914104145.2239901-1-jiangjianjun3@huawei.com>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

From: Wenchao Hao

LUN based error handling mainly performs three steps to recover commands: check sense, start unit, and reset LUN. It may fall back to the host based error handler, which performs these steps too. Add flags that mark these steps as done so that they are not repeated. The flags should be cleared when the LUN based error handler is woken up or when the host based error handler finishes, and set when falling back to the host based error handler.
scsi_eh_get_sense(), scsi_eh_stu() and scsi_eh_bus_device_reset() check these flags before taking any action.

Signed-off-by: Wenchao Hao Co-developed-by: JiangJianJun Signed-off-by: JiangJianJun --- drivers/scsi/scsi_error.c | 33 +++++++++++++++++++++++++++++++++ include/scsi/scsi_device.h | 18 ++++++++++++++++++ 2 files changed, 51 insertions(+) diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c index f58aad351463..239f8231c3ff 100644 --- a/drivers/scsi/scsi_error.c +++ b/drivers/scsi/scsi_error.c @@ -57,10 +57,36 @@ #define BUS_RESET_SETTLE_TIME (10) #define HOST_RESET_SETTLE_TIME (10) =20 +#define sdev_flags_done(flag) \ +static inline int sdev_##flag(struct scsi_device *sdev) \ +{ \ + struct scsi_device_eh *eh =3D sdev->eh; \ + if (!eh) \ + return 0; \ + return eh->flag; \ +} + static int scsi_eh_try_stu(struct scsi_cmnd *scmd); static enum scsi_disposition scsi_try_to_abort_cmd(const struct scsi_host_= template *, struct scsi_cmnd *); =20 +sdev_flags_done(get_sense_done); +sdev_flags_done(stu_done); +sdev_flags_done(reset_done); + +static inline void shost_clear_eh_done(struct Scsi_Host *shost) +{ + struct scsi_device *sdev; + + shost_for_each_device(sdev, shost) { + if (!sdev->eh) + continue; + sdev->eh->get_sense_done =3D 0; + sdev->eh->stu_done =3D 0; + sdev->eh->reset_done =3D 0; + } +} + void scsi_eh_wakeup(struct Scsi_Host *shost, unsigned int busy) { lockdep_assert_held(shost->host_lock); @@ -1397,6 +1423,8 @@ int scsi_eh_get_sense(struct list_head *work_q, current->comm)); break; } + if (sdev_get_sense_done(scmd->device)) + continue; if (!scsi_status_is_check_condition(scmd->result)) /* * don't request sense if there's no check condition @@ -1610,6 +1638,8 @@ static int scsi_eh_stu(struct Scsi_Host *shost, scsi_device_put(sdev); break; } + if (sdev_stu_done(sdev)) + continue; stu_scmd =3D NULL; list_for_each_entry(scmd, work_q, eh_entry) if (scmd->device =3D=3D sdev && SCSI_SENSE_VALID(scmd) && @@ -1693,6 +1723,8 @@ static int 
scsi_eh_bus_device_reset(struct Scsi_Host = *shost, bdr_scmd =3D scmd; break; } + if (sdev_reset_done(sdev)) + continue; =20 if (!bdr_scmd) continue; @@ -2357,6 +2389,7 @@ static void scsi_unjam_host(struct Scsi_Host *shost) if (!scsi_eh_get_sense(&eh_work_q, &eh_done_q)) scsi_eh_ready_devs(shost, &eh_work_q, &eh_done_q); =20 + shost_clear_eh_done(shost); spin_lock_irqsave(shost->host_lock, flags); if (shost->eh_deadline !=3D -1) shost->last_reset =3D 0; diff --git a/include/scsi/scsi_device.h b/include/scsi/scsi_device.h index 9d42858035ed..9fc052e48a3b 100644 --- a/include/scsi/scsi_device.h +++ b/include/scsi/scsi_device.h @@ -103,6 +103,24 @@ struct scsi_vpd { struct scsi_device; =20 struct scsi_device_eh { + /* + * LUN based error handling mainly performs three + * steps to recover commands: + * check sense + * start unit + * reset lun + * It may fall back to the host based error handler, which + * performs these steps too. These flags mark the steps as done + * to avoid repeating them. + * + * The flags should be cleared when the LUN based error handler is + * woken up or when the host based error handler finishes, and set + * when falling back to the host based error handler. 
+ */ + unsigned get_sense_done:1; + unsigned stu_done:1; + unsigned reset_done:1; + /* * add scsi command to error handler so it would be handuled by * driver's error handle strategy --=20 2.33.0
From nobody Thu Oct 2 16:48:56 2025
From: JiangJianJun
Subject: [RFC PATCH v4 7/9] scsi: scsi_error: Add helper to handle scsi device's error command list
Date: Sun, 14 Sep 2025 18:41:43 +0800
Message-ID: <20250914104145.2239901-8-jiangjianjun3@huawei.com>
In-Reply-To: <20250914104145.2239901-1-jiangjianjun3@huawei.com>
References: <17230842-0a7a-403e-abc7-a15e3aa5d424@suse.de> <20250914104145.2239901-1-jiangjianjun3@huawei.com>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

From: Wenchao Hao

Add the helper scsi_sdev_eh() to handle a SCSI device's error command list. It performs the recovery steps that can be done while the LUN's I/O is blocked: check sense, start unit, and reset LUN.
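
The staged recovery described above can be sketched as a small userspace model. This is only an illustration, not kernel code: cmd_queue, recover_step, staged_recover() and the step_* functions are hypothetical names, and each step merely stands in for one of check sense, start unit, and reset LUN. It shows the control flow only: run the steps in order and stop as soon as the work queue is empty, returning 1 when everything was recovered and 0 otherwise.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
enum { MAX_CMDS = 8 };

struct cmd_queue {
	int ids[MAX_CMDS];	/* command identifiers */
	size_t len;		/* number of valid entries in ids[] */
};

/*
 * A recovery step moves every command it recovered from work to done
 * and returns true once the work queue is empty.
 */
typedef bool (*recover_step)(struct cmd_queue *work, struct cmd_queue *done);

/* Move the first n commands from work to done, keeping the rest in order. */
static void move_first(struct cmd_queue *work, struct cmd_queue *done, size_t n)
{
	size_t i;

	assert(n <= work->len && done->len + n <= MAX_CMDS);
	for (i = 0; i < n; i++)
		done->ids[done->len++] = work->ids[i];
	for (i = n; i < work->len; i++)
		work->ids[i - n] = work->ids[i];
	work->len -= n;
}

/* Stand-in for a step that recovers nothing. */
static bool step_no_progress(struct cmd_queue *work, struct cmd_queue *done)
{
	(void)done;
	return work->len == 0;
}

/* Stand-in for a step that recovers only the first queued command. */
static bool step_recover_one(struct cmd_queue *work, struct cmd_queue *done)
{
	if (work->len > 0)
		move_first(work, done, 1);
	return work->len == 0;
}

/* Stand-in for a step that recovers every queued command. */
static bool step_recover_all(struct cmd_queue *work, struct cmd_queue *done)
{
	move_first(work, done, work->len);
	return work->len == 0;
}

/*
 * Run the steps in order and stop at the first one that empties the
 * work queue. Returns 1 if all commands were recovered, otherwise 0.
 * *steps_run reports how many steps were attempted.
 */
static int staged_recover(struct cmd_queue *work, struct cmd_queue *done,
			  const recover_step *steps, size_t nsteps,
			  size_t *steps_run)
{
	size_t i;

	*steps_run = 0;
	for (i = 0; i < nsteps; i++) {
		(*steps_run)++;
		if (steps[i](work, done))
			return 1;
	}
	return 0;
}
```

In this model, a caller that gets 0 back still owns whatever is left in the work queue and decides how to escalate.
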
Signed-off-by: Wenchao Hao Co-developed-by: JiangJianJun Signed-off-by: JiangJianJun --- drivers/scsi/scsi_error.c | 37 +++++++++++++++++++++++++++++++++++++ include/scsi/scsi_eh.h | 2 ++ 2 files changed, 39 insertions(+) diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c index 239f8231c3ff..1f2b3deace32 100644 --- a/drivers/scsi/scsi_error.c +++ b/drivers/scsi/scsi_error.c @@ -2486,6 +2486,43 @@ int scsi_error_handler(void *data) return 0; } =20 +/* + * Single LUN error handling + * + * @work_q: list of SCSI commands that need recovery + * @done_q: list of SCSI commands already handled + * + * Return: 1 if all commands in work_q are recovered, otherwise 0 + */ +int scsi_sdev_eh(struct scsi_device *sdev, + struct list_head *work_q, + struct list_head *done_q) +{ + int ret =3D 0; + struct scsi_cmnd *scmd; + + SCSI_LOG_ERROR_RECOVERY(2, sdev_printk(KERN_INFO, sdev, + "%s:luneh: checking sense\n", current->comm)); + ret =3D scsi_eh_get_sense(work_q, done_q); + if (ret) + return ret; + + SCSI_LOG_ERROR_RECOVERY(2, sdev_printk(KERN_INFO, sdev, + "%s:luneh: start unit\n", current->comm)); + scmd =3D list_first_entry(work_q, struct scsi_cmnd, eh_entry); + ret =3D scsi_eh_sdev_stu(scmd, work_q, done_q); + if (ret) + return ret; + + SCSI_LOG_ERROR_RECOVERY(2, sdev_printk(KERN_INFO, sdev, + "%s:luneh: reset LUN\n", current->comm)); + scmd =3D list_first_entry(work_q, struct scsi_cmnd, eh_entry); + ret =3D scsi_eh_sdev_reset(scmd, work_q, done_q); + + return ret; +} +EXPORT_SYMBOL_GPL(scsi_sdev_eh); + /** * scsi_report_bus_reset() - report bus reset observed * diff --git a/include/scsi/scsi_eh.h b/include/scsi/scsi_eh.h index 1ae08e81339f..5ce791063baf 100644 --- a/include/scsi/scsi_eh.h +++ b/include/scsi/scsi_eh.h @@ -18,6 +18,8 @@ extern int scsi_block_when_processing_errors(struct scsi_= device *); extern bool scsi_command_normalize_sense(const struct scsi_cmnd *cmd, struct scsi_sense_hdr *sshdr); extern enum scsi_disposition scsi_check_sense(struct 
scsi_cmnd *); +extern int scsi_sdev_eh(struct scsi_device *sdev, struct list_head *workq, + struct list_head *doneq); =20 static inline bool scsi_sense_is_deferred(const struct scsi_sense_hdr *ssh= dr) { --=20 2.33.0
From nobody Thu Oct 2 16:48:56 2025
From: JiangJianJun
Subject: [RFC PATCH v4 8/9] scsi: scsi_error: Add a general LUN based error handler
Date: Sun, 14 Sep 2025 18:41:44 +0800
Message-ID: <20250914104145.2239901-9-jiangjianjun3@huawei.com>
In-Reply-To: <20250914104145.2239901-1-jiangjianjun3@huawei.com>
References: <17230842-0a7a-403e-abc7-a15e3aa5d424@suse.de> <20250914104145.2239901-1-jiangjianjun3@huawei.com>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

From: Wenchao Hao

Add a general LUN based error handler that drivers can use directly. It implements a scsi_device_eh; when handling error commands, it calls the helper scsi_sdev_eh() added earlier to try to recover them.

What happens when scsi_sdev_eh() cannot recover all error commands depends on the fallback flag, which is initialized when the scsi_device is allocated. If fallback is set, the handler falls back to a further recovery strategy such as the old host based error handling; otherwise it marks the SCSI device offline and flushes all error commands.

Add a flag to scsi_host_template for controlling the fallback.

Add the sdev_setup_eh/sdev_clear_eh callbacks to scsi_host_template for setting up and clearing the LUN based error handler. Drivers can provide their own implementations or use the built-in ones: scsi_device_setup_eh/scsi_device_clear_eh.
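
The fallback policy described above can be sketched as a small userspace model. It is an illustration under stated assumptions, not the kernel implementation: lun_model, lun_state and lun_recovery_finish() are hypothetical names, and the counters stand in for the command lists. It shows what happens to commands that LUN level recovery could not recover: with fallback set they are handed to the host level handler and the LUN level steps are marked as done; without it they are flushed and the device goes offline.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model of the policy; not kernel code. */
enum lun_state { LUN_RUNNING, LUN_OFFLINE };

struct lun_model {
	enum lun_state state;
	bool fallback;		/* stands in for the template's fallback flag */
	bool steps_done;	/* sense/STU/reset were already tried per LUN */
	size_t failed;		/* commands left over after LUN level recovery */
	size_t to_host;		/* commands handed to the host level handler */
	size_t flushed;		/* commands finished without being recovered */
};

/* Decide what to do with the commands LUN level recovery left behind. */
static void lun_recovery_finish(struct lun_model *lun)
{
	assert(lun != NULL);

	if (lun->failed == 0)
		return;		/* everything recovered, nothing to escalate */

	if (lun->fallback) {
		/* Escalate, and note that the per-LUN steps need no repeat. */
		lun->to_host += lun->failed;
		lun->steps_done = true;
	} else {
		/* Give up on this LUN only; other LUNs are not touched. */
		lun->flushed += lun->failed;
		lun->state = LUN_OFFLINE;
	}
	lun->failed = 0;
}
```

The point of the split is that only the fallback path involves the host; without fallback the failure stays confined to the one device.
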
Signed-off-by: Wenchao Hao Co-developed-by: JiangJianJun Signed-off-by: JiangJianJun --- drivers/scsi/scsi_error.c | 176 ++++++++++++++++++++++++++++++++++++++ drivers/scsi/scsi_scan.c | 7 ++ drivers/scsi/scsi_sysfs.c | 2 + include/scsi/scsi_eh.h | 2 + include/scsi/scsi_host.h | 16 ++++ 5 files changed, 203 insertions(+) diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c index 1f2b3deace32..c1b4cd10216b 100644 --- a/drivers/scsi/scsi_error.c +++ b/drivers/scsi/scsi_error.c @@ -2736,3 +2736,179 @@ bool scsi_get_sense_info_fld(const u8 *sense_buffer= , int sb_len, } } EXPORT_SYMBOL(scsi_get_sense_info_fld); + +static inline void *scsi_eh_device_priv(struct scsi_device_eh *eh) +{ + return eh + 1; +} + +struct scsi_lun_eh { + spinlock_t eh_lock; + unsigned int eh_num; + struct list_head eh_cmd_q; + struct scsi_device *sdev; + struct work_struct eh_handle_work; + unsigned int fallback:1; +}; + +/* + * Error handling strategy based on LUN. The following steps + * are applied to recover the error commands in the list: + * check sense data + * send start unit + * reset lun + * If error commands still remain, fall back to the + * host based error handler for further recovery. 
+ */ +static void sdev_eh_work(struct work_struct *work) +{ + unsigned long flags; + struct scsi_lun_eh *luneh =3D + container_of(work, struct scsi_lun_eh, eh_handle_work); + struct scsi_device *sdev =3D luneh->sdev; + struct scsi_device_eh *eh =3D sdev->eh; + struct Scsi_Host *shost =3D sdev->host; + struct scsi_cmnd *scmd, *next; + LIST_HEAD(eh_work_q); + LIST_HEAD(eh_done_q); + + spin_lock_irqsave(&luneh->eh_lock, flags); + list_splice_init(&luneh->eh_cmd_q, &eh_work_q); + spin_unlock_irqrestore(&luneh->eh_lock, flags); + + if (scsi_sdev_eh(sdev, &eh_work_q, &eh_done_q)) + goto out_flush_done; + + if (!luneh->fallback) { + list_for_each_entry_safe(scmd, next, &eh_work_q, eh_entry) + scsi_eh_finish_cmd(scmd, &eh_done_q); + + sdev_printk(KERN_INFO, sdev, + "%s:luneh: Device offlined - not ready after error recovery\n", + current->comm); + + mutex_lock(&sdev->state_mutex); + scsi_device_set_state(sdev, SDEV_OFFLINE); + mutex_unlock(&sdev->state_mutex); + + goto out_flush_done; + } + + /* + * fallback to host based error handler + */ + SCSI_LOG_ERROR_RECOVERY(2, sdev_printk(KERN_INFO, sdev, + "%s:luneh fallback to further recovery\n", current->comm)); + list_for_each_entry_safe(scmd, next, &eh_work_q, eh_entry) { + list_del_init(&scmd->eh_entry); + scsi_eh_scmd_add_shost(scmd); + } + + eh->get_sense_done =3D 1; + eh->stu_done =3D 1; + eh->reset_done =3D 1; + +out_flush_done: + scsi_eh_flush_done_q(&eh_done_q); + spin_lock_irqsave(&luneh->eh_lock, flags); + luneh->eh_num =3D 0; + spin_unlock_irqrestore(&luneh->eh_lock, flags); +} +static void sdev_eh_add_cmnd(struct scsi_cmnd *scmd) +{ + unsigned long flags; + struct scsi_lun_eh *luneh; + struct scsi_device *sdev =3D scmd->device; + + luneh =3D scsi_eh_device_priv(sdev->eh); + + spin_lock_irqsave(&luneh->eh_lock, flags); + list_add_tail(&scmd->eh_entry, &luneh->eh_cmd_q); + luneh->eh_num++; + spin_unlock_irqrestore(&luneh->eh_lock, flags); +} +static bool sdev_eh_is_busy(struct scsi_device *sdev) +{ + int ret =3D 0; 
+ unsigned long flags; + struct scsi_lun_eh *luneh; + + if (!sdev->eh) + return false; + + luneh =3D scsi_eh_device_priv(sdev->eh); + + spin_lock_irqsave(&luneh->eh_lock, flags); + ret =3D luneh->eh_num; + spin_unlock_irqrestore(&luneh->eh_lock, flags); + + return ret !=3D 0; +} +static void sdev_eh_wakeup(struct scsi_device *sdev) +{ + unsigned long flags; + unsigned int nr_error; + unsigned int nr_busy; + struct scsi_lun_eh *luneh; + + luneh =3D scsi_eh_device_priv(sdev->eh); + + spin_lock_irqsave(&luneh->eh_lock, flags); + nr_error =3D luneh->eh_num; + spin_unlock_irqrestore(&luneh->eh_lock, flags); + + nr_busy =3D scsi_device_busy(sdev); + + if (!nr_error || nr_busy !=3D nr_error) { + SCSI_LOG_ERROR_RECOVERY(5, sdev_printk(KERN_INFO, sdev, + "%s:luneh: do not wake up, busy/error: %u/%u\n", + current->comm, nr_busy, nr_error)); + return; + } + + SCSI_LOG_ERROR_RECOVERY(2, sdev_printk(KERN_INFO, sdev, + "%s:luneh: waking up, busy/error: %u/%u\n", + current->comm, nr_busy, nr_error)); + + schedule_work(&luneh->eh_handle_work); +} + +/* + * This is the default implementation of scsi_host_template.sdev_setup_eh. + */ +int scsi_device_setup_eh(struct scsi_device *sdev) +{ + struct scsi_device_eh *eh; + struct scsi_lun_eh *luneh; + + eh =3D kzalloc(sizeof(struct scsi_device_eh) + sizeof(struct scsi_lun_eh), + GFP_KERNEL); + if (!eh) + return -ENOMEM; + luneh =3D scsi_eh_device_priv(eh); + + eh->add_cmnd =3D sdev_eh_add_cmnd; + eh->is_busy =3D sdev_eh_is_busy; + eh->wakeup =3D sdev_eh_wakeup; + + luneh->fallback =3D sdev->host->hostt->sdev_eh_fallback; + luneh->sdev =3D sdev; + spin_lock_init(&luneh->eh_lock); + INIT_LIST_HEAD(&luneh->eh_cmd_q); + INIT_WORK(&luneh->eh_handle_work, sdev_eh_work); + + sdev->eh =3D eh; + + return 0; +} +EXPORT_SYMBOL_GPL(scsi_device_setup_eh); + +/* + * This is the default implementation of scsi_host_template.sdev_clear_eh. 
+ */ +void scsi_device_clear_eh(struct scsi_device *sdev) +{ + kfree(sdev->eh); + sdev->eh =3D NULL; +} +EXPORT_SYMBOL_GPL(scsi_device_clear_eh); diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c index 3c6e089e80c3..1c5cb77dfb22 100644 --- a/drivers/scsi/scsi_scan.c +++ b/drivers/scsi/scsi_scan.c @@ -377,9 +377,16 @@ static struct scsi_device *scsi_alloc_sdev(struct scsi= _target *starget, goto out_device_destroy; } } + if (shost->hostt->sdev_setup_eh) { + ret =3D shost->hostt->sdev_setup_eh(sdev); + if (ret) + goto out_device_eh; + } =20 return sdev; =20 +out_device_eh: + shost->hostt->sdev_destroy(sdev); out_device_destroy: __scsi_remove_device(sdev); out: diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c index 15ba493d2138..76e788450345 100644 --- a/drivers/scsi/scsi_sysfs.c +++ b/drivers/scsi/scsi_sysfs.c @@ -1513,6 +1513,8 @@ void __scsi_remove_device(struct scsi_device *sdev) kref_put(&sdev->host->tagset_refcnt, scsi_mq_free_tags); cancel_work_sync(&sdev->requeue_work); =20 + if (sdev->host->hostt->sdev_clear_eh) + sdev->host->hostt->sdev_clear_eh(sdev); if (sdev->host->hostt->sdev_destroy) sdev->host->hostt->sdev_destroy(sdev); transport_destroy_device(dev); diff --git a/include/scsi/scsi_eh.h b/include/scsi/scsi_eh.h index 5ce791063baf..d8e4475ff004 100644 --- a/include/scsi/scsi_eh.h +++ b/include/scsi/scsi_eh.h @@ -20,6 +20,8 @@ extern bool scsi_command_normalize_sense(const struct scs= i_cmnd *cmd, extern enum scsi_disposition scsi_check_sense(struct scsi_cmnd *); extern int scsi_sdev_eh(struct scsi_device *sdev, struct list_head *workq, struct list_head *doneq); +extern int scsi_device_setup_eh(struct scsi_device *sdev); +extern void scsi_device_clear_eh(struct scsi_device *sdev); =20 static inline bool scsi_sense_is_deferred(const struct scsi_sense_hdr *ssh= dr) { diff --git a/include/scsi/scsi_host.h b/include/scsi/scsi_host.h index 46f57fe78505..9cc34bfff3f7 100644 --- a/include/scsi/scsi_host.h +++ 
b/include/scsi/scsi_host.h @@ -225,6 +225,19 @@ struct scsi_host_template { */ void (* sdev_destroy)(struct scsi_device *); =20 + /* + * Set up or clear the error handler field scsi_device.eh. + * This error handler works on the designated device only and + * does not affect other devices. + * If not set, the host based error handler is used. + * An LLD can supply a custom error handler or use the built-in + * scsi_device_setup_eh/scsi_device_clear_eh. + * + * Status: OPTIONAL + */ + int (*sdev_setup_eh)(struct scsi_device *sdev); + void (*sdev_clear_eh)(struct scsi_device *sdev); + /* * Before the mid layer attempts to scan for a new device attached * to a target where no target currently exists, it will call this @@ -468,6 +481,9 @@ struct scsi_host_template { /* The queuecommand callback may block. See also BLK_MQ_F_BLOCKING. */ unsigned queuecommand_may_block:1; =20 + /* Fall back to further recovery if LUN based error handling fails. */ + unsigned sdev_eh_fallback:1; + /* * Countdown for host blocking with no commands outstanding. 
*/ --=20 2.33.0
From nobody Thu Oct 2 16:48:56 2025
From: JiangJianJun
Subject: [RFC PATCH v4 9/9] scsi: scsi_debug: Add params for configuring the error handler
Date: Sun, 14 Sep 2025 18:41:45 +0800
Message-ID: <20250914104145.2239901-10-jiangjianjun3@huawei.com>
In-Reply-To: <20250914104145.2239901-1-jiangjianjun3@huawei.com>
References: <17230842-0a7a-403e-abc7-a15e3aa5d424@suse.de> <20250914104145.2239901-1-jiangjianjun3@huawei.com>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

Add new module parameters to enable the LUN based error handler and to toggle its fallback behavior.

Signed-off-by: JiangJianJun --- drivers/scsi/scsi_debug.c | 15 ++++++++++++++- 1 file changed, 14 insertions(+), 1 deletion(-) diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c index 353cb60e1abe..f221ad31b44d 100644 --- a/drivers/scsi/scsi_debug.c +++ b/drivers/scsi/scsi_debug.c @@ -961,6 +961,8 @@ static bool write_since_sync; static bool sdebug_statistics =3D DEF_STATISTICS; static bool sdebug_wp; static bool sdebug_allow_restart; +static bool sdebug_lun_eh; +static bool sdebug_lun_eh_fallback; static enum { BLK_ZONED_NONE =3D 0, BLK_ZONED_HA =3D 1, @@ -7385,6 +7387,8 @@ module_param_named(zone_max_open, sdeb_zbc_max_open, = int, S_IRUGO); module_param_named(zone_nr_conv, sdeb_zbc_nr_conv, int, S_IRUGO); module_param_named(zone_size_mb, sdeb_zbc_zone_size_mb, int, S_IRUGO); module_param_named(allow_restart, sdebug_allow_restart, bool, S_IRUGO | S_= IWUSR); +module_param_named(lun_eh, sdebug_lun_eh, bool, 0444); +module_param_named(lun_eh_fallback, sdebug_lun_eh_fallback, bool, 
0444); =20 MODULE_AUTHOR("Eric Youngdale + Douglas Gilbert"); MODULE_DESCRIPTION("SCSI debug adapter driver"); @@ -7464,6 +7468,8 @@ MODULE_PARM_DESC(zone_max_open, "Maximum number of op= en zones; [0] for no limit MODULE_PARM_DESC(zone_nr_conv, "Number of conventional zones (def=3D1)"); MODULE_PARM_DESC(zone_size_mb, "Zone size in MiB (def=3Dauto)"); MODULE_PARM_DESC(allow_restart, "Set scsi_device's allow_restart flag(def= =3D0)"); +MODULE_PARM_DESC(lun_eh, "LUN based error handle (def=3D0)"); +MODULE_PARM_DESC(lun_eh_fallback, "Fallback to further recovery if LUN rec= overy failed (def=3D0)"); =20 #define SDEBUG_INFO_LEN 256 static char sdebug_info[SDEBUG_INFO_LEN]; @@ -8450,6 +8456,7 @@ static struct attribute *sdebug_drv_attrs[] =3D { ATTRIBUTE_GROUPS(sdebug_drv); =20 static struct device *pseudo_primary; +static struct scsi_host_template sdebug_driver_template; =20 static int __init scsi_debug_init(void) { @@ -8458,6 +8465,12 @@ static int __init scsi_debug_init(void) int k, ret, hosts_to_add; int idx =3D -1; =20 + if (sdebug_lun_eh) { + sdebug_driver_template.sdev_setup_eh =3D scsi_device_setup_eh; + sdebug_driver_template.sdev_clear_eh =3D scsi_device_clear_eh; + sdebug_driver_template.sdev_eh_fallback =3D sdebug_lun_eh_fallback; + } + if (sdebug_ndelay >=3D 1000 * 1000 * 1000) { pr_warn("ndelay must be less than 1 second, ignored\n"); sdebug_ndelay =3D 0; @@ -9435,7 +9448,7 @@ static int sdebug_init_cmd_priv(struct Scsi_Host *sho= st, struct scsi_cmnd *cmd) return 0; } =20 -static const struct scsi_host_template sdebug_driver_template =3D { +static struct scsi_host_template sdebug_driver_template =3D { .show_info =3D scsi_debug_show_info, .write_info =3D scsi_debug_write_info, .proc_name =3D sdebug_proc_name, --=20 2.33.0