The msleep() function with small values (less than 20ms) may not sleep
for the exact duration due to the kernel's timer wheel design. According
to the comment in kernel/time/sleep_timeout.c:

"The slack of timers which will end up in level 0 depends on sleep
duration (msecs) and HZ configuration. For example, with HZ=1000 and
a requested sleep of 2ms, the slack can be as high as 50% (1ms) because
the minimum slack is 12.5% but the actual calculation for level 0 timers
is slack = MSECS_PER_TICK / msecs. This means that msleep(2) can
actually take up to 3ms (2ms + 1ms) on a system with HZ=1000."

This unnecessary delay can impact system responsiveness during PCI
operations, especially since the PCIe r7.0 specification, section
7.5.1.3.13, requires only a minimum Trst of 1ms. We double this to 2ms
to ensure we meet the minimum requirement, but using msleep(2) may
actually wait longer than needed.

Using fsleep() provides a more precise delay that matches the stated
intent of the code. The fsleep() function uses high-resolution timers
where available to achieve microsecond precision.

Replace msleep(2 * PCI_T_RST_SEC_BUS_DELAY_MS) with
fsleep(2 * PCI_T_RST_SEC_BUS_DELAY_US) to ensure the actual delay is
closer to the intended 2ms delay.

Signed-off-by: Hans Zhang <18255117159@163.com>
---
drivers/pci/pci.c | 2 +-
drivers/pci/pci.h | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index c05a4c2fa643..81105dfc2f62 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -4964,7 +4964,7 @@ void pci_reset_secondary_bus(struct pci_dev *dev)
 	pci_write_config_word(dev, PCI_BRIDGE_CONTROL, ctrl);
 
 	/* Double this to 2ms to ensure that we meet the minimum requirement */
-	msleep(2 * PCI_T_RST_SEC_BUS_DELAY_MS);
+	fsleep(2 * PCI_T_RST_SEC_BUS_DELAY_US);
 
 	ctrl &= ~PCI_BRIDGE_CTL_BUS_RESET;
 	pci_write_config_word(dev, PCI_BRIDGE_CONTROL, ctrl);
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 4d7e9c3f3453..9d38ef26c6a9 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -61,7 +61,7 @@ struct pcie_tlp_log;
 #define PCIE_LINK_WAIT_SLEEP_MS 90
 
 /* PCIe r7.0, sec 7.5.1.3.13, requires minimum Trst of 1ms */
-#define PCI_T_RST_SEC_BUS_DELAY_MS 1
+#define PCI_T_RST_SEC_BUS_DELAY_US 1000
 
 /* Message Routing (r[2:0]); PCIe r6.0, sec 2.2.8 */
 #define PCIE_MSG_TYPE_R_RC 0
--
2.25.1
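
The arithmetic quoted in the commit message can be sketched in userspace C. This is a simplified model, not kernel code: the function names are hypothetical, it assumes HZ divides 1000 evenly, and it assumes a request is rounded up to whole ticks and may then overshoot by at most one tick.

```c
#include <stdio.h>

/* Length of one tick in milliseconds (assumption: HZ divides 1000). */
unsigned int model_msecs_per_tick(unsigned int hz)
{
	return 1000 / hz;
}

/* Slack as a percentage of the request: MSECS_PER_TICK / msecs. */
unsigned int model_slack_percent(unsigned int hz, unsigned int msecs)
{
	return 100 * model_msecs_per_tick(hz) / msecs;
}

/*
 * Upper bound of the modelled sleep in milliseconds: the request rounded
 * up to whole ticks, plus one tick of slack (an assumption of this model).
 */
unsigned int model_msleep_upper_ms(unsigned int hz, unsigned int msecs)
{
	unsigned int tick = model_msecs_per_tick(hz);
	unsigned int ticks = (msecs + tick - 1) / tick;

	return (ticks + 1) * tick;
}

/* Print one row of the model for a given HZ and requested sleep. */
void model_print(unsigned int hz, unsigned int msecs)
{
	printf("HZ=%u msleep(%u): slack %u%%, up to %u ms\n",
	       hz, msecs, model_slack_percent(hz, msecs),
	       model_msleep_upper_ms(hz, msecs));
}
```

Under these assumptions, model_msleep_upper_ms(1000, 2) gives 3, which matches the "2ms + 1ms" figure in the commit message, and model_msleep_upper_ms(100, 2) gives 20, since a 2ms request cannot be shorter than one 10ms tick at HZ=100.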
On Wed, Aug 27, 2025 at 01:03:09AM +0800, Hans Zhang wrote:
> The msleep() function with small values (less than 20ms) may not sleep
> for the exact duration due to the kernel's timer wheel design. According
> to the comment in kernel/time/sleep_timeout.c:
>
> "The slack of timers which will end up in level 0 depends on sleep
> duration (msecs) and HZ configuration. For example, with HZ=1000 and
> a requested sleep of 2ms, the slack can be as high as 50% (1ms) because
> the minimum slack is 12.5% but the actual calculation for level 0 timers
> is slack = MSECS_PER_TICK / msecs. This means that msleep(2) can
> actually take up to 3ms (2ms + 1ms) on a system with HZ=1000."

I thought I heard something about 20ms being the minimum actual delay
for small msleeps. I suppose the error is larger for HZ=100 systems.

The fsleep() would turn into something between 2ms and 2.5ms, so if
we're talking about reducing 3ms to 2.5ms, I have a hard time getting
worried about that.

And we're going to wait at least 100ms before touching the device
below the bridge anyway.

> This unnecessary delay can impact system responsiveness during PCI
> operations, especially since the PCIe r7.0 specification, section
> 7.5.1.3.13, requires only a minimum Trst of 1ms. We double this to 2ms
> to ensure we meet the minimum requirement, but using msleep(2) may
> actually wait longer than needed.
>
> Using fsleep() provides a more precise delay that matches the stated
> intent of the code. The fsleep() function uses high-resolution timers
> where available to achieve microsecond precision.
>
> Replace msleep(2 * PCI_T_RST_SEC_BUS_DELAY_MS) with
> fsleep(2 * PCI_T_RST_SEC_BUS_DELAY_US) to ensure the actual delay is
> closer to the intended 2ms delay.
>
> Signed-off-by: Hans Zhang <18255117159@163.com>
> ---
> drivers/pci/pci.c | 2 +-
> drivers/pci/pci.h | 2 +-
> 2 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index c05a4c2fa643..81105dfc2f62 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -4964,7 +4964,7 @@ void pci_reset_secondary_bus(struct pci_dev *dev)
> pci_write_config_word(dev, PCI_BRIDGE_CONTROL, ctrl);
>
> /* Double this to 2ms to ensure that we meet the minimum requirement */
> - msleep(2 * PCI_T_RST_SEC_BUS_DELAY_MS);
> + fsleep(2 * PCI_T_RST_SEC_BUS_DELAY_US);
>
> ctrl &= ~PCI_BRIDGE_CTL_BUS_RESET;
> pci_write_config_word(dev, PCI_BRIDGE_CONTROL, ctrl);
> diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
> index 4d7e9c3f3453..9d38ef26c6a9 100644
> --- a/drivers/pci/pci.h
> +++ b/drivers/pci/pci.h
> @@ -61,7 +61,7 @@ struct pcie_tlp_log;
> #define PCIE_LINK_WAIT_SLEEP_MS 90
>
> /* PCIe r7.0, sec 7.5.1.3.13, requires minimum Trst of 1ms */
> -#define PCI_T_RST_SEC_BUS_DELAY_MS 1
> +#define PCI_T_RST_SEC_BUS_DELAY_US 1000
>
> /* Message Routing (r[2:0]); PCIe r6.0, sec 2.2.8 */
> #define PCIE_MSG_TYPE_R_RC 0
> --
> 2.25.1
>

On 2025/8/27 07:05, Bjorn Helgaas wrote:
> On Wed, Aug 27, 2025 at 01:03:09AM +0800, Hans Zhang wrote:
>> The msleep() function with small values (less than 20ms) may not sleep
>> for the exact duration due to the kernel's timer wheel design. According
>> to the comment in kernel/time/sleep_timeout.c:
>>
>> "The slack of timers which will end up in level 0 depends on sleep
>> duration (msecs) and HZ configuration. For example, with HZ=1000 and
>> a requested sleep of 2ms, the slack can be as high as 50% (1ms) because
>> the minimum slack is 12.5% but the actual calculation for level 0 timers
>> is slack = MSECS_PER_TICK / msecs. This means that msleep(2) can
>> actually take up to 3ms (2ms + 1ms) on a system with HZ=1000."
>
> I thought I heard something about 20ms being the minimum actual delay
> for small msleeps. I suppose the error is larger for HZ=100 systems.
>
Yes.
> The fsleep() would turn into something between 2ms and 2.5ms, so if
> we're talking about reducing 3ms to 2.5ms, I have a hard time getting
> worried about that.
>
> And we're going to wait at least 100ms before touching the device
> below the bridge anyway.

int pci_bridge_secondary_bus_reset(struct pci_dev *dev)
  pcibios_reset_secondary_bus(dev);
    pci_reset_secondary_bus(dev);
      pci_read_config_word(dev, PCI_BRIDGE_CONTROL, &ctrl);
      ctrl |= PCI_BRIDGE_CTL_BUS_RESET;
      pci_write_config_word(dev, PCI_BRIDGE_CONTROL, ctrl);

      /*
       * PCI spec v3.0 7.6.4.2 requires minimum Trst of 1ms. Double
       * this to 2ms to ensure that we meet the minimum requirement.
       */
      // As you mentioned, is it necessary to be very precise here?
      // Please decide whether you want to make the modification.
      msleep(2);

      ctrl &= ~PCI_BRIDGE_CTL_BUS_RESET;
      pci_write_config_word(dev, PCI_BRIDGE_CONTROL, ctrl);

  pci_bridge_wait_for_secondary_bus(dev, "bus reset");
    // There are also some delays in between.
    // The delay here is also long enough.
    pci_dev_wait(child, reset_type, PCIE_RESET_READY_POLL_MS - delay);

If this change is not necessary, do I still need to resubmit the series?
Or would you rather pick the patches you find acceptable, or only the
first patch?

Best regards,
Hans
>
>> This unnecessary delay can impact system responsiveness during PCI
>> operations, especially since the PCIe r7.0 specification, section
>> 7.5.1.3.13, requires only a minimum Trst of 1ms. We double this to 2ms
>> to ensure we meet the minimum requirement, but using msleep(2) may
>> actually wait longer than needed.
>>
>> Using fsleep() provides a more precise delay that matches the stated
>> intent of the code. The fsleep() function uses high-resolution timers
>> where available to achieve microsecond precision.
>>
>> Replace msleep(2 * PCI_T_RST_SEC_BUS_DELAY_MS) with
>> fsleep(2 * PCI_T_RST_SEC_BUS_DELAY_US) to ensure the actual delay is
>> closer to the intended 2ms delay.
>>
>> Signed-off-by: Hans Zhang <18255117159@163.com>
>> ---
>> drivers/pci/pci.c | 2 +-
>> drivers/pci/pci.h | 2 +-
>> 2 files changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
>> index c05a4c2fa643..81105dfc2f62 100644
>> --- a/drivers/pci/pci.c
>> +++ b/drivers/pci/pci.c
>> @@ -4964,7 +4964,7 @@ void pci_reset_secondary_bus(struct pci_dev *dev)
>> pci_write_config_word(dev, PCI_BRIDGE_CONTROL, ctrl);
>>
>> /* Double this to 2ms to ensure that we meet the minimum requirement */
>> - msleep(2 * PCI_T_RST_SEC_BUS_DELAY_MS);
>> + fsleep(2 * PCI_T_RST_SEC_BUS_DELAY_US);
>>
>> ctrl &= ~PCI_BRIDGE_CTL_BUS_RESET;
>> pci_write_config_word(dev, PCI_BRIDGE_CONTROL, ctrl);
>> diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
>> index 4d7e9c3f3453..9d38ef26c6a9 100644
>> --- a/drivers/pci/pci.h
>> +++ b/drivers/pci/pci.h
>> @@ -61,7 +61,7 @@ struct pcie_tlp_log;
>> #define PCIE_LINK_WAIT_SLEEP_MS 90
>>
>> /* PCIe r7.0, sec 7.5.1.3.13, requires minimum Trst of 1ms */
>> -#define PCI_T_RST_SEC_BUS_DELAY_MS 1
>> +#define PCI_T_RST_SEC_BUS_DELAY_US 1000
>>
>> /* Message Routing (r[2:0]); PCIe r6.0, sec 2.2.8 */
>> #define PCIE_MSG_TYPE_R_RC 0
>> --
>> 2.25.1
>>
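
The "between 2ms and 2.5ms" figure from the review can be illustrated with a small userspace model of how a flexible sleep might pick its mechanism. This is a hedged sketch, not the kernel's fsleep(): all names are hypothetical, the 25% slack is inferred from the 2ms to 2.5ms figure in the thread, and the 10us busy-wait cutoff and the caller-supplied hrtimer limit are assumptions.

```c
/* Which mechanism the model would choose for a given request. */
enum sleep_kind {
	SLEEP_BUSY_WAIT,	/* stands in for udelay() */
	SLEEP_HRTIMER_RANGE,	/* stands in for usleep_range() */
	SLEEP_TIMER_WHEEL	/* stands in for msleep() */
};

struct sleep_plan {
	enum sleep_kind kind;
	unsigned long min_us;	/* earliest wakeup in this model */
	unsigned long max_us;	/* latest wakeup in this model */
};

/*
 * Plan a sleep of 'usecs' microseconds. Requests at or above
 * 'hrtimer_limit_us' (assumed, caller-supplied) fall back to a
 * millisecond sleep; timer wheel slack is not modelled for that case.
 */
struct sleep_plan plan_flexible_sleep(unsigned long usecs,
				      unsigned long hrtimer_limit_us)
{
	struct sleep_plan plan;

	if (usecs <= 10) {
		/* Very short delays: busy-wait, no slack. */
		plan.kind = SLEEP_BUSY_WAIT;
		plan.min_us = usecs;
		plan.max_us = usecs;
	} else if (usecs < hrtimer_limit_us) {
		/* High-resolution range: allow 25% slack (assumed). */
		plan.kind = SLEEP_HRTIMER_RANGE;
		plan.min_us = usecs;
		plan.max_us = usecs + (usecs >> 2);
	} else {
		/* Long delays: round up to whole milliseconds. */
		plan.kind = SLEEP_TIMER_WHEEL;
		plan.min_us = ((usecs + 999) / 1000) * 1000;
		plan.max_us = plan.min_us;
	}

	return plan;
}
```

With a limit of 20000us, plan_flexible_sleep(2000, 20000) selects the high-resolution range with min_us = 2000 and max_us = 2500, which is the window discussed in the review.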