drivers/pci/pci.c | 8 ++++++-- drivers/pci/pci.h | 2 ++ drivers/pci/quirks.c | 20 ++++++++++++++++++++ include/linux/pci.h | 1 + 4 files changed, 29 insertions(+), 2 deletions(-)
From: Graham Whyte <grwhyte@linux.microsoft.com> Add a new flr_delay member of the pci_dev struct to allow customization of the delay after FLR for devices that do not support immediate readiness or readiness time reporting. The main scenario this addresses is VF removal and rescan during runtime repairs and driver updates, which, if fixed to 100ms, introduces significant delays across multiple VFs. These delays are unnecessary for devices that complete the FLR well within this timeframe. Patch 1 adds the flr_delay member to the pci_dev struct Patch 2 adds the msft device specific quirk to utilize the flr_delay --- v2->v3: - Removed Microsoft specific pcie reset reset, replaced with customizable flr_delay parameter - Changed msleep in pcie_flr to usleep_range to support flr delays of under 20ms v1->v2: - Removed unnecessary EXPORT_SYMBOL_GPL for function pci_dev_wait - Link to thread:https://lore.kernel.org/linux-pci/?q=f%3Agrwhyte&x=t#m7453647902a1b22840f5e39434a631fd7b2515ce' Link to V1: https://lore.kernel.org/linux-pci/20250522085253.GN7435@unreal/T/#m7453647902a1b22840f5e39434a631fd7b2515ce Graham Whyte (2): PCI: Add flr_delay parameter to pci_dev struct PCI: Reduce FLR delay to 10ms for MSFT devices drivers/pci/pci.c | 8 ++++++-- drivers/pci/pci.h | 2 ++ drivers/pci/quirks.c | 20 ++++++++++++++++++++ include/linux/pci.h | 1 + 4 files changed, 29 insertions(+), 2 deletions(-) -- 2.25.1
On Wed, Jun 11, 2025 at 12:05:50AM +0000, grwhyte@linux.microsoft.com wrote: > From: Graham Whyte <grwhyte@linux.microsoft.com> > > Add a new flr_delay member of the pci_dev struct to allow customization of > the delay after FLR for devices that do not support immediate readiness > or readiness time reporting. The main scenario this addresses is VF > removal and rescan during runtime repairs and driver updates, which, > if fixed to 100ms, introduces significant delays across multiple VFs. > These delays are unnecessary for devices that complete the FLR well > within this timeframe. > I don't think it is acceptable to *reduce* the standard delay just because your device completes it more quickly. Proper way to reduce the timing would be to support FRS as you said, but we cannot have arbitrary delays for random devices. - Mani > Patch 1 adds the flr_delay member to the pci_dev struct > Patch 2 adds the msft device specific quirk to utilize the flr_delay > > --- > v2->v3: > - Removed Microsoft specific pcie reset reset, replaced with customizable flr_delay parameter > - Changed msleep in pcie_flr to usleep_range to support flr delays of under 20ms > v1->v2: > - Removed unnecessary EXPORT_SYMBOL_GPL for function pci_dev_wait > - Link to thread:https://lore.kernel.org/linux-pci/?q=f%3Agrwhyte&x=t#m7453647902a1b22840f5e39434a631fd7b2515ce' > > Link to V1: https://lore.kernel.org/linux-pci/20250522085253.GN7435@unreal/T/#m7453647902a1b22840f5e39434a631fd7b2515ce > > Graham Whyte (2): > PCI: Add flr_delay parameter to pci_dev struct > PCI: Reduce FLR delay to 10ms for MSFT devices > > drivers/pci/pci.c | 8 ++++++-- > drivers/pci/pci.h | 2 ++ > drivers/pci/quirks.c | 20 ++++++++++++++++++++ > include/linux/pci.h | 1 + > 4 files changed, 29 insertions(+), 2 deletions(-) > > -- > 2.25.1 > > -- மணிவண்ணன் சதாசிவம்
On Fri, Jun 13, 2025 at 05:12:48PM +0530, Manivannan Sadhasivam wrote: > On Wed, Jun 11, 2025 at 12:05:50AM +0000, grwhyte@linux.microsoft.com wrote: > > Add a new flr_delay member of the pci_dev struct to allow customization of > > the delay after FLR for devices that do not support immediate readiness > > or readiness time reporting. The main scenario this addresses is VF > > removal and rescan during runtime repairs and driver updates, which, > > if fixed to 100ms, introduces significant delays across multiple VFs. > > These delays are unnecessary for devices that complete the FLR well > > within this timeframe. > > > > I don't think it is acceptable to *reduce* the standard delay just > because your device completes it more quickly. Proper way to reduce > the timing would be to support FRS as you said, but we cannot have > arbitrary delays for random devices. To be fair, we already have that for certain devices: The quirk delay_250ms_after_flr() is referenced by three different Vendor ID / Device ID combos and *lengthens* the delay after FLR. It's probably difficult to justify rejecting custom delays for certain MANA devices, even though we allowed them for three other devices. The proposed patch introduces a generic solution which avoids further cluttering up pci_dev_reset_methods[] with extra entries, so I think it's an approach worth considering. There are a bunch of nits in the proposed patches, such as "pci" not being capitalized, but the general approach seems fine to me. Thanks, Lukas
On Fri, Jun 13, 2025 at 03:45:35PM +0200, Lukas Wunner wrote: > On Fri, Jun 13, 2025 at 05:12:48PM +0530, Manivannan Sadhasivam wrote: > > On Wed, Jun 11, 2025 at 12:05:50AM +0000, grwhyte@linux.microsoft.com wrote: > > > Add a new flr_delay member of the pci_dev struct to allow customization of > > > the delay after FLR for devices that do not support immediate readiness > > > or readiness time reporting. The main scenario this addresses is VF > > > removal and rescan during runtime repairs and driver updates, which, > > > if fixed to 100ms, introduces significant delays across multiple VFs. > > > These delays are unnecessary for devices that complete the FLR well > > > within this timeframe. > > > > > > > I don't think it is acceptable to *reduce* the standard delay just > > because your device completes it more quickly. Proper way to reduce > > the timing would be to support FRS as you said, but we cannot have > > arbitrary delays for random devices. > > To be fair, we already have that for certain devices: > > The quirk delay_250ms_after_flr() is referenced by three different > Vendor ID / Device ID combos and *lengthens* the delay after FLR. > This quirk is fine as it works around an issue in the device. But this patch is not fixing/working around an issue in the device, but rather optimizing the delay for performance, which is not what quirks are used for AFAIK. > It's probably difficult to justify rejecting custom delays for > certain MANA devices, even though we allowed them for three other > devices. > If the MANA devices require extended delay, then a quirk indeed makes sense. But it is the other way around. > The proposed patch introduces a generic solution which avoids > further cluttering up pci_dev_reset_methods[] with extra entries, > so I think it's an approach worth considering. > > There are a bunch of nits in the proposed patches, such as "pci" > not being capitalized, but the general approach seems fine to me. > I honestly don't know if there is any other way to handle this. So I think it is upto Bjorn to take a call. - Mani -- மணிவண்ணன் சதாசிவம்
On Wed, Jun 11, 2025 at 12:05:50AM +0000, grwhyte@linux.microsoft.com wrote: > From: Graham Whyte <grwhyte@linux.microsoft.com> > > Add a new flr_delay member of the pci_dev struct to allow customization of > the delay after FLR for devices that do not support immediate readiness > or readiness time reporting. The main scenario this addresses is VF > removal and rescan during runtime repairs and driver updates, which, > if fixed to 100ms, introduces significant delays across multiple VFs. > These delays are unnecessary for devices that complete the FLR well > within this timeframe. Please work with the PCIe SIG to have a standard capability for this instead of piling up hacks like this quirk.
On Tue, Jun 10, 2025 at 08:27:12PM -0700, Christoph Hellwig wrote: > On Wed, Jun 11, 2025 at 12:05:50AM +0000, grwhyte@linux.microsoft.com wrote: > > From: Graham Whyte <grwhyte@linux.microsoft.com> > > > > Add a new flr_delay member of the pci_dev struct to allow customization of > > the delay after FLR for devices that do not support immediate readiness > > or readiness time reporting. The main scenario this addresses is VF > > removal and rescan during runtime repairs and driver updates, which, > > if fixed to 100ms, introduces significant delays across multiple VFs. > > These delays are unnecessary for devices that complete the FLR well > > within this timeframe. > > Please work with the PCIe SIG to have a standard capability for this > instead of piling up hacks like this quirk. There is already some support in PCIe for this. For Conventional Reset, see PCIe 6.0, section 6.6.1, there is the "Readiness Time Reporting Extended Capability": "For a Device that implements the Readiness Time Reporting Extended Capability, and has reported a Reset Time shorter than 100ms, software is permitted to send a Configuration Request to the Device after waiting the reported Reset Time from Conventional Reset." For FLR, see PCIe 6.0, section 6.6.2, there is the "Function Readiness Status": "After an FLR has been initiated by writing a 1b to the Initiate Function Level Reset bit, the Function must complete the FLR within 100 ms. [...] If Function Readiness Status (FRS - see § Section 6.22.2 ) is implemented, then system software is permitted to issue Configuration Requests to the Function immediately following receipt of an FRS Message indicating Configuration-Ready, however, this does not necessarily indicate that outstanding Requests initiated by the Function have completed." Kind regards, Niklas
On 6/11/2025 12:23 AM, Niklas Cassel wrote: > On Tue, Jun 10, 2025 at 08:27:12PM -0700, Christoph Hellwig wrote: >> On Wed, Jun 11, 2025 at 12:05:50AM +0000, grwhyte@linux.microsoft.com wrote: >>> From: Graham Whyte <grwhyte@linux.microsoft.com> >>> >>> Add a new flr_delay member of the pci_dev struct to allow customization of >>> the delay after FLR for devices that do not support immediate readiness >>> or readiness time reporting. The main scenario this addresses is VF >>> removal and rescan during runtime repairs and driver updates, which, >>> if fixed to 100ms, introduces significant delays across multiple VFs. >>> These delays are unnecessary for devices that complete the FLR well >>> within this timeframe. >> >> Please work with the PCIe SIG to have a standard capability for this >> instead of piling up hacks like this quirk. > > There is already some support in PCIe for this. > > For Conventional Reset, see PCIe 6.0, section 6.6.1, there is the > "Readiness Time Reporting Extended Capability": > "For a Device that implements the Readiness Time Reporting Extended Capability, > and has reported a Reset Time shorter than 100ms, software is permitted to send > a Configuration Request to the Device after waiting the reported Reset Time > from Conventional Reset." > > > For FLR, see PCIe 6.0, section 6.6.2, there is the "Function Readiness Status": > "After an FLR has been initiated by writing a 1b to the Initiate Function Level > Reset bit, the Function must complete the FLR within 100 ms. [...] If Function > Readiness Status (FRS - see § Section 6.22.2 ) is implemented, then system > software is permitted to issue Configuration Requests to the Function > immediately following receipt of an FRS Message indicating Configuration-Ready, > however, this does not necessarily indicate that outstanding Requests initiated > by the Function have completed." > > > Kind regards, > Niklas We can ask our HW engineers to implement function readiness but we need to be able to support exiting products, hence why posting it as a quirk.
On Wed, Jun 11, 2025 at 01:08:21PM -0700, Graham Whyte wrote: > We can ask our HW engineers to implement function readiness but we need > to be able to support exiting products, hence why posting it as a quirk. Your report sounds like it works perfectly fine, it's just that you want to reduce the delay. For that you'll need to stick to the standard methods instead of adding quirks, which are for buggy hardware that does not otherwise work.
On 6/11/2025 11:31 PM, Christoph Hellwig wrote: > On Wed, Jun 11, 2025 at 01:08:21PM -0700, Graham Whyte wrote: >> We can ask our HW engineers to implement function readiness but we need >> to be able to support exiting products, hence why posting it as a quirk. > > Your report sounds like it works perfectly fine, it's just that you > want to reduce the delay. For that you'll need to stick to the standard > methods instead of adding quirks, which are for buggy hardware that does > not otherwise work. Bjorn, what would you recommend as next steps here?
On Thu, Jun 12, 2025 at 09:41:45AM -0700, Graham Whyte wrote: > On 6/11/2025 11:31 PM, Christoph Hellwig wrote: > > On Wed, Jun 11, 2025 at 01:08:21PM -0700, Graham Whyte wrote: > >> We can ask our HW engineers to implement function readiness but we need > >> to be able to support exiting products, hence why posting it as a quirk. > > > > Your report sounds like it works perfectly fine, it's just that you > > want to reduce the delay. For that you'll need to stick to the standard > > methods instead of adding quirks, which are for buggy hardware that does > > not otherwise work. > > Bjorn, what would you recommend as next steps here? This is a tough call and I don't pretend to have an obvious answer. I understand the desire to improve performance. On the other hand, PCI has been successful over the long term because devices adhere to standardized ways of doing things, which makes generic software possible. Quirks degrade that story, of course, especially when there is an existing standardized solution that isn't being used. I'm not at all happy about vendors that decide against the standard solution and then ask OS folks to do extra work to compensate. Bjorn
On 6/13/2025 8:33 AM, Bjorn Helgaas wrote: > On Thu, Jun 12, 2025 at 09:41:45AM -0700, Graham Whyte wrote: >> On 6/11/2025 11:31 PM, Christoph Hellwig wrote: >>> On Wed, Jun 11, 2025 at 01:08:21PM -0700, Graham Whyte wrote: >>>> We can ask our HW engineers to implement function readiness but we need >>>> to be able to support exiting products, hence why posting it as a quirk. >>> >>> Your report sounds like it works perfectly fine, it's just that you >>> want to reduce the delay. For that you'll need to stick to the standard >>> methods instead of adding quirks, which are for buggy hardware that does >>> not otherwise work. >> >> Bjorn, what would you recommend as next steps here? > > This is a tough call and I don't pretend to have an obvious answer. I > understand the desire to improve performance. On the other hand, PCI > has been successful over the long term because devices adhere to > standardized ways of doing things, which makes generic software > possible. Quirks degrade that story, of course, especially when there > is an existing standardized solution that isn't being used. I'm not > at all happy about vendors that decide against the standard solution > and then ask OS folks to do extra work to compensate. > > Bjorn Hi Bjorn, Should someone want to implement readiness time reporting down the road, they'll need to do the same work as patch 1 in this series (making the flr delay a configurable parameter). This change lays the groundwork for that work, while also supporting devices that can't use readiness time reporting. Alternatively could we use a sysfs file to make this parameter configurable via the user space application? Similar to switching between d3hot and d3cold by writing to /sys/bus/pci/slots/$DEVICE/power. Thanks, Graham
On Mon, Jun 16, 2025 at 12:02:41PM -0700, Graham Whyte wrote: > On 6/13/2025 8:33 AM, Bjorn Helgaas wrote: > > On Thu, Jun 12, 2025 at 09:41:45AM -0700, Graham Whyte wrote: > >> On 6/11/2025 11:31 PM, Christoph Hellwig wrote: > >>> On Wed, Jun 11, 2025 at 01:08:21PM -0700, Graham Whyte wrote: > >>>> We can ask our HW engineers to implement function readiness but we need > >>>> to be able to support exiting products, hence why posting it as a quirk. > >>> > >>> Your report sounds like it works perfectly fine, it's just that you > >>> want to reduce the delay. For that you'll need to stick to the standard > >>> methods instead of adding quirks, which are for buggy hardware that does > >>> not otherwise work. > >> > >> Bjorn, what would you recommend as next steps here? > > > > This is a tough call and I don't pretend to have an obvious answer. I > > understand the desire to improve performance. On the other hand, PCI > > has been successful over the long term because devices adhere to > > standardized ways of doing things, which makes generic software > > possible. Quirks degrade that story, of course, especially when there > > is an existing standardized solution that isn't being used. I'm not > > at all happy about vendors that decide against the standard solution > > and then ask OS folks to do extra work to compensate. > > Should someone want to implement readiness time reporting down the road, > they'll need to do the same work as patch 1 in this series (making the > flr delay a configurable parameter). Sure. That's a trivial change. The problem is the quirk itself. The Readiness Time Reporting Extended Capability is read-only with no control bits in it so it requires no actual logic in the device. Maybe you can just implement that capability with a firmware change on the device and add the corresponding Linux support for it.
On 6/16/2025 2:05 PM, Bjorn Helgaas wrote: > On Mon, Jun 16, 2025 at 12:02:41PM -0700, Graham Whyte wrote: >> On 6/13/2025 8:33 AM, Bjorn Helgaas wrote: >>> On Thu, Jun 12, 2025 at 09:41:45AM -0700, Graham Whyte wrote: >>>> On 6/11/2025 11:31 PM, Christoph Hellwig wrote: >>>>> On Wed, Jun 11, 2025 at 01:08:21PM -0700, Graham Whyte wrote: >>>>>> We can ask our HW engineers to implement function readiness but we need >>>>>> to be able to support exiting products, hence why posting it as a quirk. >>>>> >>>>> Your report sounds like it works perfectly fine, it's just that you >>>>> want to reduce the delay. For that you'll need to stick to the standard >>>>> methods instead of adding quirks, which are for buggy hardware that does >>>>> not otherwise work. >>>> >>>> Bjorn, what would you recommend as next steps here? >>> >>> This is a tough call and I don't pretend to have an obvious answer. I >>> understand the desire to improve performance. On the other hand, PCI >>> has been successful over the long term because devices adhere to >>> standardized ways of doing things, which makes generic software >>> possible. Quirks degrade that story, of course, especially when there >>> is an existing standardized solution that isn't being used. I'm not >>> at all happy about vendors that decide against the standard solution >>> and then ask OS folks to do extra work to compensate. >> >> Should someone want to implement readiness time reporting down the road, >> they'll need to do the same work as patch 1 in this series (making the >> flr delay a configurable parameter). > > Sure. That's a trivial change. The problem is the quirk itself. > > The Readiness Time Reporting Extended Capability is read-only with no > control bits in it so it requires no actual logic in the device. > Maybe you can just implement that capability with a firmware change on > the device and add the corresponding Linux support for it. Hi Bjorn, We checked with our HW folks, it's not possible for us to update the pci register components with this particular card, they are read only. What are your thoughts on the sysfs approach mentioned in the previous email? Thanks, Graham
On 6/18/2025 9:42 AM, Graham Whyte wrote: > > > On 6/16/2025 2:05 PM, Bjorn Helgaas wrote: >> On Mon, Jun 16, 2025 at 12:02:41PM -0700, Graham Whyte wrote: >>> On 6/13/2025 8:33 AM, Bjorn Helgaas wrote: >>>> On Thu, Jun 12, 2025 at 09:41:45AM -0700, Graham Whyte wrote: >>>>> On 6/11/2025 11:31 PM, Christoph Hellwig wrote: >>>>>> On Wed, Jun 11, 2025 at 01:08:21PM -0700, Graham Whyte wrote: >>>>>>> We can ask our HW engineers to implement function readiness but we need >>>>>>> to be able to support exiting products, hence why posting it as a quirk. >>>>>> >>>>>> Your report sounds like it works perfectly fine, it's just that you >>>>>> want to reduce the delay. For that you'll need to stick to the standard >>>>>> methods instead of adding quirks, which are for buggy hardware that does >>>>>> not otherwise work. >>>>> >>>>> Bjorn, what would you recommend as next steps here? >>>> >>>> This is a tough call and I don't pretend to have an obvious answer. I >>>> understand the desire to improve performance. On the other hand, PCI >>>> has been successful over the long term because devices adhere to >>>> standardized ways of doing things, which makes generic software >>>> possible. Quirks degrade that story, of course, especially when there >>>> is an existing standardized solution that isn't being used. I'm not >>>> at all happy about vendors that decide against the standard solution >>>> and then ask OS folks to do extra work to compensate. >>> >>> Should someone want to implement readiness time reporting down the road, >>> they'll need to do the same work as patch 1 in this series (making the >>> flr delay a configurable parameter). >> >> Sure. That's a trivial change. The problem is the quirk itself. >> >> The Readiness Time Reporting Extended Capability is read-only with no >> control bits in it so it requires no actual logic in the device. >> Maybe you can just implement that capability with a firmware change on >> the device and add the corresponding Linux support for it. > > Hi Bjorn, > > We checked with our HW folks, it's not possible for us to update the pci > register components with this particular card, they are read only. What > are your thoughts on the sysfs approach mentioned in the previous email? > > Thanks, > Graham Hi Bjorn, just wanted to follow up on this here.
© 2016 - 2025 Red Hat, Inc.