If a device is in an error state, then any reads of device registers can
return error value. Add addtional checks to validate if a device is in an
error state before doing an flr reset.
Signed-off-by: Farhan Ali <alifm@linux.ibm.com>
---
drivers/pci/pci.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index a3d93d1baee7..327fefc6a1eb 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -4576,12 +4576,19 @@ EXPORT_SYMBOL_GPL(pcie_flr);
*/
int pcie_reset_flr(struct pci_dev *dev, bool probe)
{
+ u32 reg;
+
if (dev->dev_flags & PCI_DEV_FLAGS_NO_FLR_RESET)
return -ENOTTY;
if (!(dev->devcap & PCI_EXP_DEVCAP_FLR))
return -ENOTTY;
+ if (pcie_capability_read_dword(dev, PCI_EXP_DEVCAP, ®)) {
+ pci_warn(dev, "Device unable to do an FLR\n");
+ return -ENOTTY;
+ }
+
if (probe)
return 0;
--
2.43.0
On Wed, Sep 24, 2025 at 10:16:20AM -0700, Farhan Ali wrote: > If a device is in an error state, then any reads of device registers can > return error value. Add addtional checks to validate if a device is in an > error state before doing an flr reset. > > Signed-off-by: Farhan Ali <alifm@linux.ibm.com> > --- > drivers/pci/pci.c | 7 +++++++ > 1 file changed, 7 insertions(+) > Apart from the discussion about a stable tag this looks good to me. Reviewed-by: Benjamin Block <bblock@linux.ibm.com> -- Best Regards, Benjamin Block / Linux on IBM Z Kernel Development IBM Deutschland Research & Development GmbH / https://www.ibm.com/privacy Vors. Aufs.-R.: Wolfgang Wendt / Geschäftsführung: David Faller Sitz der Ges.: Böblingen / Registergericht: AmtsG Stuttgart, HRB 243294
On Wed, Sep 24, 2025 at 10:16:20AM -0700, Farhan Ali wrote: > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c > index a3d93d1baee7..327fefc6a1eb 100644 > --- a/drivers/pci/pci.c > +++ b/drivers/pci/pci.c > @@ -4576,12 +4576,19 @@ EXPORT_SYMBOL_GPL(pcie_flr); > */ > int pcie_reset_flr(struct pci_dev *dev, bool probe) > { > + u32 reg; > + > if (dev->dev_flags & PCI_DEV_FLAGS_NO_FLR_RESET) > return -ENOTTY; > > if (!(dev->devcap & PCI_EXP_DEVCAP_FLR)) > return -ENOTTY; > > + if (pcie_capability_read_dword(dev, PCI_EXP_DEVCAP, ®)) { > + pci_warn(dev, "Device unable to do an FLR\n"); > + return -ENOTTY; > + } Just thinking out loud, not sure whether it make sense, but since you already read an up-to-date value from the config space, would it make sense to pull the check above `dev->devcap & PCI_EXP_DEVCAP_FLR` below this read, and check on the just read `reg`? Also wondering whether it makes sense to stable-tag this? We've recently seen "unpleasant" recovery attempts that look like this in the kernel logs: [ 663.330053] vfio-pci 0007:00:00.1: timed out waiting for pending transaction; performing function level reset anyway [ 664.730051] vfio-pci 0007:00:00.1: not ready 1023ms after FLR; waiting [ 665.830023] vfio-pci 0007:00:00.1: not ready 2047ms after FLR; waiting [ 667.910023] vfio-pci 0007:00:00.1: not ready 4095ms after FLR; waiting [ 672.070022] vfio-pci 0007:00:00.1: not ready 8191ms after FLR; waiting [ 680.550025] vfio-pci 0007:00:00.1: not ready 16383ms after FLR; waiting [ 697.190023] vfio-pci 0007:00:00.1: not ready 32767ms after FLR; waiting [ 730.470021] vfio-pci 0007:00:00.1: not ready 65535ms after FLR; giving up The VF here was already dead in the water at that point, so I suspect `pci_read_config_dword()` might have failed, and so this check would have failed, and we wouldn't have "wasted" the minute waiting for a FLR that was never going to happen anyway. -- Best Regards, Benjamin Block / Linux on IBM Z Kernel Development IBM Deutschland Research & Development GmbH / https://www.ibm.com/privacy Vors. Aufs.-R.: Wolfgang Wendt / Geschäftsführung: David Faller Sitz der Ges.: Böblingen / Registergericht: AmtsG Stuttgart, HRB 243294
On 9/30/2025 3:03 AM, Benjamin Block wrote: > On Wed, Sep 24, 2025 at 10:16:20AM -0700, Farhan Ali wrote: >> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c >> index a3d93d1baee7..327fefc6a1eb 100644 >> --- a/drivers/pci/pci.c >> +++ b/drivers/pci/pci.c >> @@ -4576,12 +4576,19 @@ EXPORT_SYMBOL_GPL(pcie_flr); >> */ >> int pcie_reset_flr(struct pci_dev *dev, bool probe) >> { >> + u32 reg; >> + >> if (dev->dev_flags & PCI_DEV_FLAGS_NO_FLR_RESET) >> return -ENOTTY; >> >> if (!(dev->devcap & PCI_EXP_DEVCAP_FLR)) >> return -ENOTTY; >> >> + if (pcie_capability_read_dword(dev, PCI_EXP_DEVCAP, ®)) { >> + pci_warn(dev, "Device unable to do an FLR\n"); >> + return -ENOTTY; >> + } > Just thinking out loud, not sure whether it make sense, but since you already > read an up-to-date value from the config space, would it make sense to > pull the check above `dev->devcap & PCI_EXP_DEVCAP_FLR` below this read, and > check on the just read `reg`? My thinking was we could exit early if the device never had FLR capability (and so was not cached in devcap). This way we avoid an extra PCI read. > > Also wondering whether it makes sense to stable-tag this? We've recently seen > "unpleasant" recovery attempts that look like this in the kernel logs: > > [ 663.330053] vfio-pci 0007:00:00.1: timed out waiting for pending transaction; performing function level reset anyway > [ 664.730051] vfio-pci 0007:00:00.1: not ready 1023ms after FLR; waiting > [ 665.830023] vfio-pci 0007:00:00.1: not ready 2047ms after FLR; waiting > [ 667.910023] vfio-pci 0007:00:00.1: not ready 4095ms after FLR; waiting > [ 672.070022] vfio-pci 0007:00:00.1: not ready 8191ms after FLR; waiting > [ 680.550025] vfio-pci 0007:00:00.1: not ready 16383ms after FLR; waiting > [ 697.190023] vfio-pci 0007:00:00.1: not ready 32767ms after FLR; waiting > [ 730.470021] vfio-pci 0007:00:00.1: not ready 65535ms after FLR; giving up > > The VF here was already dead in the water at that point, so I suspect > `pci_read_config_dword()` might have failed, and so this check would have > failed, and we wouldn't have "wasted" the minute waiting for a FLR that was > never going to happen anyway. I think maybe we could? I don't think this patch fixes anything that's "broken" but rather improves the behavior to escalate to other reset method if the device is already in a bad state. I will cc stable. Thanks Farhan
On Tue, Sep 30, 2025 at 10:04:25AM -0700, Farhan Ali wrote: > > On 9/30/2025 3:03 AM, Benjamin Block wrote: > > On Wed, Sep 24, 2025 at 10:16:20AM -0700, Farhan Ali wrote: > >> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c > >> index a3d93d1baee7..327fefc6a1eb 100644 > >> --- a/drivers/pci/pci.c > >> +++ b/drivers/pci/pci.c --8<-- > >> + if (pcie_capability_read_dword(dev, PCI_EXP_DEVCAP, ®)) { > >> + pci_warn(dev, "Device unable to do an FLR\n"); > >> + return -ENOTTY; > >> + } > > Just thinking out loud, not sure whether it make sense, but since you already > > read an up-to-date value from the config space, would it make sense to > > pull the check above `dev->devcap & PCI_EXP_DEVCAP_FLR` below this read, and > > check on the just read `reg`? > > My thinking was we could exit early if the device never had FLR > capability (and so was not cached in devcap). This way we avoid an extra > PCI read. That makes sense. > > Also wondering whether it makes sense to stable-tag this? We've recently seen > > "unpleasant" recovery attempts that look like this in the kernel logs: > > > > [ 663.330053] vfio-pci 0007:00:00.1: timed out waiting for pending transaction; performing function level reset anyway > > [ 664.730051] vfio-pci 0007:00:00.1: not ready 1023ms after FLR; waiting > > [ 665.830023] vfio-pci 0007:00:00.1: not ready 2047ms after FLR; waiting > > [ 667.910023] vfio-pci 0007:00:00.1: not ready 4095ms after FLR; waiting > > [ 672.070022] vfio-pci 0007:00:00.1: not ready 8191ms after FLR; waiting > > [ 680.550025] vfio-pci 0007:00:00.1: not ready 16383ms after FLR; waiting > > [ 697.190023] vfio-pci 0007:00:00.1: not ready 32767ms after FLR; waiting > > [ 730.470021] vfio-pci 0007:00:00.1: not ready 65535ms after FLR; giving up > > > > The VF here was already dead in the water at that point, so I suspect > > `pci_read_config_dword()` might have failed, and so this check would have > > failed, and we wouldn't have "wasted" the minute waiting for a FLR that was > > never going to happen anyway. > > I think maybe we could? I don't think this patch fixes anything that's > "broken" but rather improves the behavior to escalate to other reset > method if the device is already in a bad state. I will cc stable. Right, adding a Fixes tag doesn't really make sense. But it does help with accelerating recoveries, so maybe a stable tag will work :) -- Best Regards, Benjamin Block / Linux on IBM Z Kernel Development IBM Deutschland Research & Development GmbH / https://www.ibm.com/privacy Vors. Aufs.-R.: Wolfgang Wendt / Geschäftsführung: David Faller Sitz der Ges.: Böblingen / Registergericht: AmtsG Stuttgart, HRB 243294
© 2016 - 2025 Red Hat, Inc.