[PATCH v2 0/4] SCSI: Fix issues between removing device and error handle

Wenchao Hao posted 4 patches 2 years, 4 months ago
There is a newer version of this series
[PATCH v2 0/4] SCSI: Fix issues between removing device and error handle
Posted by Wenchao Hao 2 years, 4 months ago
While testing SCSI error handling with my earlier scsi_debug error
injection patches, I found several issues that occur when device removal
and the error handler run at the same time.

These issues arise because shost_for_each_device() skips devices that
are in the process of being removed.

Three issues were found:
1. The statistics printed at the beginning of scsi_error_handler() are wrong
2. Device reset is not triggered
3. I/O requeued to the request_queue hangs after error handling
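
To illustrate the first issue, here is a small userspace sketch, not kernel code. It assumes a simplified model in which an iterator can only visit devices it can take a reference on, and taking a reference fails for devices being removed. All type and function names (model_dev, model_dev_get, count_failed_skipping, count_failed_all) are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel's device states. */
enum model_state { MODEL_RUNNING, MODEL_CANCEL, MODEL_DEL };

struct model_dev {
	enum model_state state;
	int failed_cmds;	/* commands waiting for error handling */
};

/* Assumption: a reference cannot be taken on a device being removed. */
static int model_dev_get(const struct model_dev *d)
{
	return d->state == MODEL_RUNNING;
}

/* Sum failed commands the way a skipping iterator would see them. */
int count_failed_skipping(const struct model_dev *devs, size_t n)
{
	int total = 0;

	assert(devs != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (!model_dev_get(&devs[i]))
			continue;	/* device in removal is silently skipped */
		total += devs[i].failed_cmds;
	}
	return total;
}

/* Sum failed commands over every device, regardless of state. */
int count_failed_all(const struct model_dev *devs, size_t n)
{
	int total = 0;

	assert(devs != NULL || n == 0);
	for (size_t i = 0; i < n; i++)
		total += devs[i].failed_cmds;
	return total;
}
```

In this model, a host with one running device and two devices in removal reports fewer failed commands through the skipping iterator than it actually has, which is the kind of mismatch described in issue 1.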

V2:
  - Fix the I/O hang by running all devices' queues after the error handler
  - Do not modify shost_for_each_device() directly; instead add a new
    helper that iterates over devices without skipping those being removed
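
The effect of the first V2 item can be sketched in userspace as follows. This is only a model under stated assumptions, not the code in the patches: "running a queue" is reduced to clearing a counter of requeued requests, and the names model_dev and run_queues, as well as the include_removing flag, are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified device states for this model. */
enum model_state { MODEL_RUNNING, MODEL_CANCEL };

struct model_dev {
	enum model_state state;
	int requeued;	/* I/O put back on the queue during error handling */
};

/*
 * Restart queues after error handling and return how many requests
 * are still stuck. With include_removing == false, devices being
 * removed are skipped, so their requeued I/O is never dispatched.
 */
int run_queues(struct model_dev *devs, size_t n, bool include_removing)
{
	int stuck = 0;

	assert(devs != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (include_removing || devs[i].state == MODEL_RUNNING)
			devs[i].requeued = 0;	/* queue restarted */
		stuck += devs[i].requeued;
	}
	return stuck;
}
```

With include_removing set to false, requests on the device in removal stay queued forever in this model, which mirrors the hang in issue 3. Visiting every device drains them.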

Wenchao Hao (4):
  scsi: core: Add new helper to iterate all devices of host
  scsi: scsi_error: Fix wrong statistic when print error info
  scsi: scsi_error: Fix device reset is not triggered
  scsi: scsi_core:  Fix IO hang when device removing

 drivers/scsi/scsi.c        | 43 +++++++++++++++++++++++++-------------
 drivers/scsi/scsi_error.c  |  4 ++--
 drivers/scsi/scsi_lib.c    |  2 +-
 include/scsi/scsi_device.h | 25 +++++++++++++++++++---
 4 files changed, 53 insertions(+), 21 deletions(-)

-- 
2.32.0
Re: [PATCH v2 0/4] SCSI: Fix issues between removing device and error handle
Posted by Wenchao Hao 2 years, 4 months ago
On 2023/9/28 15:35, Wenchao Hao wrote:
> I am testing SCSI error handle with my previous scsi_debug error
> injection patches, and found some issues when removing device and
> error handler happened together.
> 
> These issues are triggered because devices in removing would be skipped
> when calling shost_for_each_device().
> 
> Three issues are found:
> 1. statistic info printed at beginning of scsi_error_handler is wrong
> 2. device reset is not triggered
> 3. IO requeued to request_queue would be hang after error handle
> 

These patches fix bugs that are easy to reproduce when device removal
and error handling happen together, so a friendly ping again...

Re: [PATCH v2 0/4] SCSI: Fix issues between removing device and error handle
Posted by Wenchao Hao 2 years, 4 months ago
On 2023/9/28 15:35, Wenchao Hao wrote:
> I am testing SCSI error handle with my previous scsi_debug error
> injection patches, and found some issues when removing device and
> error handler happened together.
> 
> These issues are triggered because devices in removing would be skipped
> when calling shost_for_each_device().
> 

ping...

Re: [PATCH v2 0/4] SCSI: Fix issues between removing device and error handle
Posted by Wenchao Hao 2 years, 4 months ago
On 2023/9/28 15:35, Wenchao Hao wrote:
> I am testing SCSI error handle with my previous scsi_debug error
> injection patches, and found some issues when removing device and
> error handler happened together.
> 
> These issues are triggered because devices in removing would be skipped
> when calling shost_for_each_device().
> 
> Three issues are found:
> 1. statistic info printed at beginning of scsi_error_handler is wrong
> 2. device reset is not triggered
> 3. IO requeued to request_queue would be hang after error handle
> 

Hi Martin, would you help review these patches?
