A PCIe bus reset (e.g. "echo 1 > /sys/bus/pci/devices/<bdf>/reset") on a
controller without FLR support leaves the HPE SR932i-p Gen10+ unusable
until reboot: smartpqi registers no pci_error_handlers, so the driver is
not notified, firmware reverts to SIS mode, and all queue mappings are
dropped while the driver still drives PQI.

Patch 1 adds .reset_prepare / .reset_done reusing
pqi_ofa_ctrl_quiesce() / _unquiesce() / pqi_ctrl_init_resume().

Patch 2 raises SIS_CTRL_READY_RESUME_TIMEOUT_SECS from 90s to 180s,
matching the cold-boot path; without this, patch 1 fails at the SIS
ready check because firmware boot after reset takes ~125s on the
SR932i-p Gen10+.

Tested on HPE SR932i-p Gen10+ against Linus' master at 74fe02ce122a.

Note: the From: header is my Posteo address because my employer's SMTP
is unavailable for external mailing lists. The Signed-off-by carries
the Microchip attribution.

Mateusz Nowicki (2):
  scsi: smartpqi: add pci_error_handlers for bus reset recovery
  scsi: smartpqi: increase SIS ctrl ready resume timeout to 180s

 drivers/scsi/smartpqi/smartpqi_init.c | 47 +++++++++++++++++++++++++++
 drivers/scsi/smartpqi/smartpqi_sis.c  |  2 +-
 2 files changed, 48 insertions(+), 1 deletion(-)

-- 
2.43.0
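For readers following the thread, the interaction between the two patches can be sketched as a small userspace model. This is only an illustration under stated assumptions, not the code from the series: every model_* name below is hypothetical, the firmware is assumed to become ready exactly 125 seconds after the reset (the approximate figure from the cover letter), and the real driver work done by pqi_ofa_ctrl_quiesce() and pqi_ctrl_init_resume() is reduced to flag updates.

```c
#include <assert.h>
#include <stdbool.h>

/* Figures quoted in the cover letter. */
#define OLD_RESUME_TIMEOUT_SECS  90  /* SIS_CTRL_READY_RESUME_TIMEOUT_SECS before patch 2 */
#define NEW_RESUME_TIMEOUT_SECS 180  /* value after patch 2 */
#define FW_BOOT_SECS            125  /* approximate firmware boot time after reset */

/* Hypothetical userspace model; none of these names exist in smartpqi. */
enum model_mode { MODEL_MODE_SIS, MODEL_MODE_PQI };

struct model_ctrl {
	enum model_mode fw_mode;     /* mode the firmware is actually in */
	enum model_mode driver_mode; /* mode the driver believes it is in */
	bool quiesced;               /* true while I/O is held off */
};

/* A bus reset drops the firmware back to SIS mode; the driver is not told. */
void model_bus_reset(struct model_ctrl *ctrl)
{
	ctrl->fw_mode = MODEL_MODE_SIS;
}

/* Poll once per modelled second; 0 if firmware is ready in time, -1 on timeout. */
int model_wait_ctrl_ready(unsigned int timeout_secs, unsigned int fw_boot_secs)
{
	unsigned int elapsed;

	for (elapsed = 0; elapsed <= timeout_secs; elapsed++) {
		if (elapsed >= fw_boot_secs)
			return 0;
	}
	return -1;
}

/* Stand-in for .reset_prepare: stop issuing I/O before the reset happens. */
void model_reset_prepare(struct model_ctrl *ctrl)
{
	ctrl->quiesced = true;
}

/* Stand-in for .reset_done: wait for firmware, re-enter PQI, resume I/O. */
int model_reset_done(struct model_ctrl *ctrl, unsigned int timeout_secs)
{
	assert(ctrl->quiesced); /* reset_prepare must have run first */

	if (model_wait_ctrl_ready(timeout_secs, FW_BOOT_SECS) != 0)
		return -1; /* ready check timed out; controller stays unusable */

	ctrl->fw_mode = MODEL_MODE_PQI;
	ctrl->driver_mode = MODEL_MODE_PQI;
	ctrl->quiesced = false;
	return 0;
}

/* Usable only when I/O is flowing and both sides agree on the mode. */
bool model_ctrl_usable(const struct model_ctrl *ctrl)
{
	return !ctrl->quiesced && ctrl->fw_mode == ctrl->driver_mode;
}
```

In this model, calling model_bus_reset() alone leaves the controller unusable (the unpatched case), model_reset_done() with the 90s timeout fails because the modelled firmware needs 125s, and the 180s timeout lets the sequence complete.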
On Wed, 2026-05-06 at 14:01 +0000, Mateusz Nowicki wrote:
> A PCIe bus reset (e.g. "echo 1 > /sys/bus/pci/devices/<bdf>/reset")
> on a controller without FLR support leaves the HPE SR932i-p Gen10+
> unusable until reboot: smartpqi registers no pci_error_handlers, so
> the driver is not notified, firmware reverts to SIS mode, and all
> queue mappings are dropped while the driver still drives PQI.
>
> [...]

Hello

I did reproduce this so I am testing the patches as well.
They look correct to me, I will reply again after testing with a
review.

Thanks
Laurence

[2513778.140012] smartpqi 0000:64:00.0: no heartbeat detected - last heartbeat count: 4207808511
[2513778.140031] smartpqi 0000:64:00.0: controller offline: reason code 0x4 (no controller heartbeat detected)
[2513778.141346] sd 1:0:0:0: [sda] tag#549 FAILED Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK cmd_age=18s
[2513778.141355] sd 1:0:0:0: [sda] tag#550 FAILED Result:
"xfs_buf_ioend_handle_error+0xd5/0x3f0 [xfs]" at daddr 0x9f78 len 8 error 5
[2513778.141526] XFS (dm-0): log I/O error -5
On Wed, 2026-05-06 at 18:21 -0400, Laurence Oberman wrote:
> On Wed, 2026-05-06 at 14:01 +0000, Mateusz Nowicki wrote:
> > [...]
>
> I did reproduce this so I am testing the patches as well.
> They look correct to me, I will reply again after testing with a
> review.
>
> [...]

Hello

For the series:

I tested the patches and it recovers with them applied.
The patches look good.

Tested-by: Laurence Oberman <loberman@redhat.com>
Reviewed-by: Laurence Oberman <loberman@redhat.com>
Hello,

Thank you Laurence, appreciated. I'll add your Tested-by and
Reviewed-by to both patches in v2 if the series needs a respin;
otherwise Don or Martin will pick it up on apply.

Thanks,
Mateusz

On 07.05.2026 03:45, Laurence Oberman wrote:
> On Wed, 2026-05-06 at 18:21 -0400, Laurence Oberman wrote:
>> [...]
>
> Hello
>
> For the series:
>
> I tested the patches and it recovers with them applied.
> The patches look good.
>
> Tested-by: Laurence Oberman <loberman@redhat.com>
> Reviewed-by: Laurence Oberman <loberman@redhat.com>