From: Julien Grall <jgrall@amazon.com>
Currently timer_irq_works() will wait the full 100ms before checking
that pit0_ticks has been incremented at least 4 times.
However, the bulk of the BIOS/platform should not have a buggy timer.
So waiting for the full 100ms is a bit harsh.
Rework the logic to only wait until 100ms passed or we saw more than
4 ticks. So now, in the good case, this will reduce the wait time
to ~50ms.
Signed-off-by: Julien Grall <jgrall@amazon.com>
---
xen/arch/x86/io_apic.c | 30 +++++++++++++++++++-----------
1 file changed, 19 insertions(+), 11 deletions(-)
diff --git a/xen/arch/x86/io_apic.c b/xen/arch/x86/io_apic.c
index 4875bb97003f..0bd774962a68 100644
--- a/xen/arch/x86/io_apic.c
+++ b/xen/arch/x86/io_apic.c
@@ -1509,6 +1509,8 @@ static void __init setup_ioapic_ids_from_mpc(void)
static int __init timer_irq_works(void)
{
unsigned long t1, flags;
+ /* Wait for maximum 10 ticks */
+ unsigned long msec = (10 * 1000) / HZ;
if ( no_timer_check )
return 1;
@@ -1517,19 +1519,25 @@ static int __init timer_irq_works(void)
local_save_flags(flags);
local_irq_enable();
- /* Let ten ticks pass... */
- mdelay((10 * 1000) / HZ);
- local_irq_restore(flags);
- /*
- * Expect a few ticks at least, to be sure some possible
- * glue logic does not lock up after one or two first
- * ticks in a non-ExtINT mode. Also the local APIC
- * might have cached one ExtINT interrupt. Finally, at
- * least one tick may be lost due to delays.
- */
- if ( (ACCESS_ONCE(pit0_ticks) - t1) > 4 )
+ while ( msec-- )
+ {
+ mdelay(1);
+ /*
+ * Expect a few ticks at least, to be sure some possible
+ * glue logic does not lock up after one or two first
+ * ticks in a non-ExtINT mode. Also the local APIC
+ * might have cached one ExtINT interrupt. Finally, at
+ * least one tick may be lost due to delays.
+ */
+ if ( (ACCESS_ONCE(pit0_ticks) - t1) <= 4 )
+ continue;
+
+ local_irq_restore(flags);
return 1;
+ }
+
+ local_irq_restore(flags);
return 0;
}
--
2.40.1
On 18.08.2023 15:44, Julien Grall wrote:
> From: Julien Grall <jgrall@amazon.com>
>
> Currently timer_irq_works() will wait the full 100ms before checking
> that pit0_ticks has been incremented at least 4 times.
>
> However, the bulk of the BIOS/platform should not have a buggy timer.
> So waiting for the full 100ms is a bit harsh.
>
> Rework the logic to only wait until 100ms passed or we saw more than
> 4 ticks. So now, in the good case, this will reduce the wait time
> to ~50ms.
>
> Signed-off-by: Julien Grall <jgrall@amazon.com>
In principle this is all fine. There's a secondary aspect though which
may call for a slight rework of the patch.
> --- a/xen/arch/x86/io_apic.c
> +++ b/xen/arch/x86/io_apic.c
> @@ -1509,6 +1509,8 @@ static void __init setup_ioapic_ids_from_mpc(void)
> static int __init timer_irq_works(void)
> {
> unsigned long t1, flags;
> + /* Wait for maximum 10 ticks */
> + unsigned long msec = (10 * 1000) / HZ;
(Minor remark: I don't think this needs to be unsigned long; unsigned
in will suffice.)
> @@ -1517,19 +1519,25 @@ static int __init timer_irq_works(void)
>
> local_save_flags(flags);
> local_irq_enable();
> - /* Let ten ticks pass... */
> - mdelay((10 * 1000) / HZ);
> - local_irq_restore(flags);
>
> - /*
> - * Expect a few ticks at least, to be sure some possible
> - * glue logic does not lock up after one or two first
> - * ticks in a non-ExtINT mode. Also the local APIC
> - * might have cached one ExtINT interrupt. Finally, at
> - * least one tick may be lost due to delays.
> - */
> - if ( (ACCESS_ONCE(pit0_ticks) - t1) > 4 )
> + while ( msec-- )
> + {
> + mdelay(1);
> + /*
> + * Expect a few ticks at least, to be sure some possible
> + * glue logic does not lock up after one or two first
> + * ticks in a non-ExtINT mode. Also the local APIC
> + * might have cached one ExtINT interrupt. Finally, at
> + * least one tick may be lost due to delays.
> + */
> + if ( (ACCESS_ONCE(pit0_ticks) - t1) <= 4 )
> + continue;
> +
> + local_irq_restore(flags);
> return 1;
> + }
> +
> + local_irq_restore(flags);
>
> return 0;
> }
While Andrew has a patch pending (not sure why it didn't go in yet)
to simplify local_irq_restore(), and while further it shouldn't really
need using here (local_irq_disable() ought to be fine), I can see that
you don't want to make such an adjustment here. But then I'd prefer if
we got away with just a single instance, adjusting the final return
statement accordingly (easiest would likely be to go from the value of
"msec").
Jan
Hi Jan,
On 07/09/2023 15:28, Jan Beulich wrote:
> On 18.08.2023 15:44, Julien Grall wrote:
>> From: Julien Grall <jgrall@amazon.com>
>>
>> Currently timer_irq_works() will wait the full 100ms before checking
>> that pit0_ticks has been incremented at least 4 times.
>>
>> However, the bulk of the BIOS/platform should not have a buggy timer.
>> So waiting for the full 100ms is a bit harsh.
>>
>> Rework the logic to only wait until 100ms passed or we saw more than
>> 4 ticks. So now, in the good case, this will reduce the wait time
>> to ~50ms.
>>
>> Signed-off-by: Julien Grall <jgrall@amazon.com>
>
> In principle this is all fine. There's a secondary aspect though which
> may call for a slight rework of the patch.
>
>> --- a/xen/arch/x86/io_apic.c
>> +++ b/xen/arch/x86/io_apic.c
>> @@ -1509,6 +1509,8 @@ static void __init setup_ioapic_ids_from_mpc(void)
>> static int __init timer_irq_works(void)
>> {
>> unsigned long t1, flags;
>> + /* Wait for maximum 10 ticks */
>> + unsigned long msec = (10 * 1000) / HZ;
>
> (Minor remark: I don't think this needs to be unsigned long; unsigned
> in will suffice.)
You are right. I can switch to unsigned int.
>
>> @@ -1517,19 +1519,25 @@ static int __init timer_irq_works(void)
>>
>> local_save_flags(flags);
>> local_irq_enable();
>> - /* Let ten ticks pass... */
>> - mdelay((10 * 1000) / HZ);
>> - local_irq_restore(flags);
>>
>> - /*
>> - * Expect a few ticks at least, to be sure some possible
>> - * glue logic does not lock up after one or two first
>> - * ticks in a non-ExtINT mode. Also the local APIC
>> - * might have cached one ExtINT interrupt. Finally, at
>> - * least one tick may be lost due to delays.
>> - */
>> - if ( (ACCESS_ONCE(pit0_ticks) - t1) > 4 )
>> + while ( msec-- )
>> + {
>> + mdelay(1);
>> + /*
>> + * Expect a few ticks at least, to be sure some possible
>> + * glue logic does not lock up after one or two first
>> + * ticks in a non-ExtINT mode. Also the local APIC
>> + * might have cached one ExtINT interrupt. Finally, at
>> + * least one tick may be lost due to delays.
>> + */
>> + if ( (ACCESS_ONCE(pit0_ticks) - t1) <= 4 )
>> + continue;
>> +
>> + local_irq_restore(flags);
>> return 1;
>> + }
>> +
>> + local_irq_restore(flags);
>>
>> return 0;
>> }
>
> While Andrew has a patch pending (not sure why it didn't go in yet)
> to simplify local_irq_restore(), and while further it shouldn't really
> need using here (local_irq_disable() ought to be fine)
Skimming through the code, the last call of timer_irq_works() in
check_timer() happens after the interrupts masking state have been restored:
local_irq_restore(flags);
if ( timer_irq_works() )
...
So I think timer_irq_works() can be called with interrupts enabled and
therefore we can't use local_irq_disable().
> I can see that
> you don't want to make such an adjustment here. But then I'd prefer if
> we got away with just a single instance, adjusting the final return
> statement accordingly (easiest would likely be to go from the value of
> "msec").
I was thinking to use 'msec > 0' as a condition to determine if the test
passed. However, it would consider a failure if it tooks 10ms for the
test to pass.
I will see if I can rework the loop.
Cheers,
--
Julien Grall
On 15.09.2023 16:00, Julien Grall wrote:
> Hi Jan,
>
> On 07/09/2023 15:28, Jan Beulich wrote:
>> On 18.08.2023 15:44, Julien Grall wrote:
>>> From: Julien Grall <jgrall@amazon.com>
>>>
>>> Currently timer_irq_works() will wait the full 100ms before checking
>>> that pit0_ticks has been incremented at least 4 times.
>>>
>>> However, the bulk of the BIOS/platform should not have a buggy timer.
>>> So waiting for the full 100ms is a bit harsh.
>>>
>>> Rework the logic to only wait until 100ms passed or we saw more than
>>> 4 ticks. So now, in the good case, this will reduce the wait time
>>> to ~50ms.
>>>
>>> Signed-off-by: Julien Grall <jgrall@amazon.com>
>>
>> In principle this is all fine. There's a secondary aspect though which
>> may call for a slight rework of the patch.
>>
>>> --- a/xen/arch/x86/io_apic.c
>>> +++ b/xen/arch/x86/io_apic.c
>>> @@ -1509,6 +1509,8 @@ static void __init setup_ioapic_ids_from_mpc(void)
>>> static int __init timer_irq_works(void)
>>> {
>>> unsigned long t1, flags;
>>> + /* Wait for maximum 10 ticks */
>>> + unsigned long msec = (10 * 1000) / HZ;
>>
>> (Minor remark: I don't think this needs to be unsigned long; unsigned
>> in will suffice.)
>
> You are right. I can switch to unsigned int.
>
>>
>>> @@ -1517,19 +1519,25 @@ static int __init timer_irq_works(void)
>>>
>>> local_save_flags(flags);
>>> local_irq_enable();
>>> - /* Let ten ticks pass... */
>>> - mdelay((10 * 1000) / HZ);
>>> - local_irq_restore(flags);
>>>
>>> - /*
>>> - * Expect a few ticks at least, to be sure some possible
>>> - * glue logic does not lock up after one or two first
>>> - * ticks in a non-ExtINT mode. Also the local APIC
>>> - * might have cached one ExtINT interrupt. Finally, at
>>> - * least one tick may be lost due to delays.
>>> - */
>>> - if ( (ACCESS_ONCE(pit0_ticks) - t1) > 4 )
>>> + while ( msec-- )
>>> + {
>>> + mdelay(1);
>>> + /*
>>> + * Expect a few ticks at least, to be sure some possible
>>> + * glue logic does not lock up after one or two first
>>> + * ticks in a non-ExtINT mode. Also the local APIC
>>> + * might have cached one ExtINT interrupt. Finally, at
>>> + * least one tick may be lost due to delays.
>>> + */
>>> + if ( (ACCESS_ONCE(pit0_ticks) - t1) <= 4 )
>>> + continue;
>>> +
>>> + local_irq_restore(flags);
>>> return 1;
>>> + }
>>> +
>>> + local_irq_restore(flags);
>>>
>>> return 0;
>>> }
>>
>> While Andrew has a patch pending (not sure why it didn't go in yet)
>> to simplify local_irq_restore(), and while further it shouldn't really
>> need using here (local_irq_disable() ought to be fine)
>
> Skimming through the code, the last call of timer_irq_works() in
> check_timer() happens after the interrupts masking state have been restored:
>
> local_irq_restore(flags);
>
> if ( timer_irq_works() )
> ...
>
>
> So I think timer_irq_works() can be called with interrupts enabled and
> therefore we can't use local_irq_disable().
Hmm, yes, you're right. That's inconsistent, but dealing with that is a
separate task.
Jan
© 2016 - 2026 Red Hat, Inc.