The msleep() function with small values (less than 20ms) may not sleep
for the exact duration due to the kernel's timer wheel design. According
to the comment in kernel/time/sleep_timeout.c:
"The slack of timers which will end up in level 0 depends on sleep
duration (msecs) and HZ configuration. For example, with HZ=1000 and
a requested sleep of 2ms, the slack can be as high as 50% (1ms) because
the minimum slack is 12.5% but the actual calculation for level 0 timers
is slack = MSECS_PER_TICK / msecs. This means that msleep(2) can
actually take up to 3ms (2ms + 1ms) on a system with HZ=1000."
This unnecessary delay can impact system responsiveness during PCI
operations, especially since the PCIe r7.0 specification, section
7.5.1.3.13, requires only a minimum Trst of 1ms. We double this to 2ms
to ensure we meet the minimum requirement, but using msleep(2) may
actually wait longer than needed.
Using fsleep() provides a more precise delay that matches the stated
intent of the code. The fsleep() function uses high-resolution timers
where available to achieve microsecond precision.
Replace msleep(2 * PCI_T_RST_SEC_BUS_DELAY_MS) with
fsleep(2 * PCI_T_RST_SEC_BUS_DELAY_US) to ensure the actual delay is
closer to the intended 2ms delay.
Signed-off-by: Hans Zhang <18255117159@163.com>
---
drivers/pci/pci.c | 2 +-
drivers/pci/pci.h | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index c05a4c2fa643..81105dfc2f62 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -4964,7 +4964,7 @@ void pci_reset_secondary_bus(struct pci_dev *dev)
pci_write_config_word(dev, PCI_BRIDGE_CONTROL, ctrl);
/* Double this to 2ms to ensure that we meet the minimum requirement */
- msleep(2 * PCI_T_RST_SEC_BUS_DELAY_MS);
+ fsleep(2 * PCI_T_RST_SEC_BUS_DELAY_US);
ctrl &= ~PCI_BRIDGE_CTL_BUS_RESET;
pci_write_config_word(dev, PCI_BRIDGE_CONTROL, ctrl);
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 4d7e9c3f3453..9d38ef26c6a9 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -61,7 +61,7 @@ struct pcie_tlp_log;
#define PCIE_LINK_WAIT_SLEEP_MS 90
/* PCIe r7.0, sec 7.5.1.3.13, requires minimum Trst of 1ms */
-#define PCI_T_RST_SEC_BUS_DELAY_MS 1
+#define PCI_T_RST_SEC_BUS_DELAY_US 1000
/* Message Routing (r[2:0]); PCIe r6.0, sec 2.2.8 */
#define PCIE_MSG_TYPE_R_RC 0
--
2.25.1
On Wed, Aug 27, 2025 at 01:03:09AM +0800, Hans Zhang wrote: > The msleep() function with small values (less than 20ms) may not sleep > for the exact duration due to the kernel's timer wheel design. According > to the comment in kernel/time/sleep_timeout.c: > > "The slack of timers which will end up in level 0 depends on sleep > duration (msecs) and HZ configuration. For example, with HZ=1000 and > a requested sleep of 2ms, the slack can be as high as 50% (1ms) because > the minimum slack is 12.5% but the actual calculation for level 0 timers > is slack = MSECS_PER_TICK / msecs. This means that msleep(2) can > actually take up to 3ms (2ms + 1ms) on a system with HZ=1000." I thought I heard something about 20ms being the minimum actual delay for small msleeps. I suppose the error is larger for HZ=100 systems. The fsleep() would turn into something between 2ms and 2.5ms, so if we're talking about reducing 3ms to 2.5ms, I have a hard time getting worried about that. And we're going to wait at least 100ms before touching the device below the bridge anyway. > This unnecessary delay can impact system responsiveness during PCI > operations, especially since the PCIe r7.0 specification, section > 7.5.1.3.13, requires only a minimum Trst of 1ms. We double this to 2ms > to ensure we meet the minimum requirement, but using msleep(2) may > actually wait longer than needed. > > Using fsleep() provides a more precise delay that matches the stated > intent of the code. The fsleep() function uses high-resolution timers > where available to achieve microsecond precision. > > Replace msleep(2 * PCI_T_RST_SEC_BUS_DELAY_MS) with > fsleep(2 * PCI_T_RST_SEC_BUS_DELAY_US) to ensure the actual delay is > closer to the intended 2ms delay. > > Signed-off-by: Hans Zhang <18255117159@163.com> > --- > drivers/pci/pci.c | 2 +- > drivers/pci/pci.h | 2 +- > 2 files changed, 2 insertions(+), 2 deletions(-) > > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c > index c05a4c2fa643..81105dfc2f62 100644 > --- a/drivers/pci/pci.c > +++ b/drivers/pci/pci.c > @@ -4964,7 +4964,7 @@ void pci_reset_secondary_bus(struct pci_dev *dev) > pci_write_config_word(dev, PCI_BRIDGE_CONTROL, ctrl); > > /* Double this to 2ms to ensure that we meet the minimum requirement */ > - msleep(2 * PCI_T_RST_SEC_BUS_DELAY_MS); > + fsleep(2 * PCI_T_RST_SEC_BUS_DELAY_US); > > ctrl &= ~PCI_BRIDGE_CTL_BUS_RESET; > pci_write_config_word(dev, PCI_BRIDGE_CONTROL, ctrl); > diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h > index 4d7e9c3f3453..9d38ef26c6a9 100644 > --- a/drivers/pci/pci.h > +++ b/drivers/pci/pci.h > @@ -61,7 +61,7 @@ struct pcie_tlp_log; > #define PCIE_LINK_WAIT_SLEEP_MS 90 > > /* PCIe r7.0, sec 7.5.1.3.13, requires minimum Trst of 1ms */ > -#define PCI_T_RST_SEC_BUS_DELAY_MS 1 > +#define PCI_T_RST_SEC_BUS_DELAY_US 1000 > > /* Message Routing (r[2:0]); PCIe r6.0, sec 2.2.8 */ > #define PCIE_MSG_TYPE_R_RC 0 > -- > 2.25.1 >
On 2025/8/27 07:05, Bjorn Helgaas wrote: > On Wed, Aug 27, 2025 at 01:03:09AM +0800, Hans Zhang wrote: >> The msleep() function with small values (less than 20ms) may not sleep >> for the exact duration due to the kernel's timer wheel design. According >> to the comment in kernel/time/sleep_timeout.c: >> >> "The slack of timers which will end up in level 0 depends on sleep >> duration (msecs) and HZ configuration. For example, with HZ=1000 and >> a requested sleep of 2ms, the slack can be as high as 50% (1ms) because >> the minimum slack is 12.5% but the actual calculation for level 0 timers >> is slack = MSECS_PER_TICK / msecs. This means that msleep(2) can >> actually take up to 3ms (2ms + 1ms) on a system with HZ=1000." > > I thought I heard something about 20ms being the minimum actual delay > for small msleeps. I suppose the error is larger for HZ=100 systems. > Yes. > The fsleep() would turn into something between 2ms and 2.5ms, so if > we're talking about reducing 3ms to 2.5ms, I have a hard time getting > worried about that. > > And we're going to wait at least 100ms before touching the device > below the bridge anyway. int pci_bridge_secondary_bus_reset(struct pci_dev *dev) pcibios_reset_secondary_bus(dev); pci_reset_secondary_bus(dev); pci_read_config_word(dev, PCI_BRIDGE_CONTROL, &ctrl); ctrl |= PCI_BRIDGE_CTL_BUS_RESET; pci_write_config_word(dev, PCI_BRIDGE_CONTROL, ctrl); /* * PCI spec v3.0 7.6.4.2 requires minimum Trst of 1ms. Double * this to 2ms to ensure that we meet the minimum requirement. */ msleep(2); // As you mentioned, is it necessary to be very precise here? Please decide whether you want to make the modification. ctrl &= ~PCI_BRIDGE_CTL_BUS_RESET; pci_write_config_word(dev, PCI_BRIDGE_CONTROL, ctrl); pci_bridge_wait_for_secondary_bus(dev, "bus reset"); // There are also some delays in between. // The delay here is also long enough. pci_dev_wait(child, reset_type, PCIE_RESET_READY_POLL_MS - delay); If it's not necessary to do so, do I still need to resubmit the version? Or do you choose a few acceptable ones or the first patch? Best regards, Hans > >> This unnecessary delay can impact system responsiveness during PCI >> operations, especially since the PCIe r7.0 specification, section >> 7.5.1.3.13, requires only a minimum Trst of 1ms. We double this to 2ms >> to ensure we meet the minimum requirement, but using msleep(2) may >> actually wait longer than needed. >> >> Using fsleep() provides a more precise delay that matches the stated >> intent of the code. The fsleep() function uses high-resolution timers >> where available to achieve microsecond precision. >> >> Replace msleep(2 * PCI_T_RST_SEC_BUS_DELAY_MS) with >> fsleep(2 * PCI_T_RST_SEC_BUS_DELAY_US) to ensure the actual delay is >> closer to the intended 2ms delay. >> >> Signed-off-by: Hans Zhang <18255117159@163.com> >> --- >> drivers/pci/pci.c | 2 +- >> drivers/pci/pci.h | 2 +- >> 2 files changed, 2 insertions(+), 2 deletions(-) >> >> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c >> index c05a4c2fa643..81105dfc2f62 100644 >> --- a/drivers/pci/pci.c >> +++ b/drivers/pci/pci.c >> @@ -4964,7 +4964,7 @@ void pci_reset_secondary_bus(struct pci_dev *dev) >> pci_write_config_word(dev, PCI_BRIDGE_CONTROL, ctrl); >> >> /* Double this to 2ms to ensure that we meet the minimum requirement */ >> - msleep(2 * PCI_T_RST_SEC_BUS_DELAY_MS); >> + fsleep(2 * PCI_T_RST_SEC_BUS_DELAY_US); >> >> ctrl &= ~PCI_BRIDGE_CTL_BUS_RESET; >> pci_write_config_word(dev, PCI_BRIDGE_CONTROL, ctrl); >> diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h >> index 4d7e9c3f3453..9d38ef26c6a9 100644 >> --- a/drivers/pci/pci.h >> +++ b/drivers/pci/pci.h >> @@ -61,7 +61,7 @@ struct pcie_tlp_log; >> #define PCIE_LINK_WAIT_SLEEP_MS 90 >> >> /* PCIe r7.0, sec 7.5.1.3.13, requires minimum Trst of 1ms */ >> -#define PCI_T_RST_SEC_BUS_DELAY_MS 1 >> +#define PCI_T_RST_SEC_BUS_DELAY_US 1000 >> >> /* Message Routing (r[2:0]); PCIe r6.0, sec 2.2.8 */ >> #define PCIE_MSG_TYPE_R_RC 0 >> -- >> 2.25.1 >>
© 2016 - 2025 Red Hat, Inc.