[v2] xen/x86: Optimize timer_irq_works()

[PATCH v2 1/2] xen/x86: io_apic: Introduce a command line option to skip timer check

Posted by Julien Grall 2 years, 2 months ago

From: Julien Grall <jgrall@amazon.com>

Currently, Xen spends ~100ms checking whether the timer works. If the
admin knows their platform has a working timer, then it would be
handy to be able to bypass the check.

Introduce a command line option 'pit-irq-works' for this purpose.

Signed-off-by: Julien Grall <jgrall@amazon.com>

---

Changelog since v1:
    - Rename the command line option. I went with pit-irq-works rather
      than timer-irq-works because Roger thought it would be better suited
    - Rework the command line description
---
 docs/misc/xen-command-line.pandoc | 11 +++++++++++
 xen/arch/x86/io_apic.c            | 11 +++++++++++
 2 files changed, 22 insertions(+)

diff --git a/docs/misc/xen-command-line.pandoc b/docs/misc/xen-command-line.pandoc
index 8e65f8bd18bf..c382b061b302 100644
--- a/docs/misc/xen-command-line.pandoc
+++ b/docs/misc/xen-command-line.pandoc
@@ -2535,6 +2535,17 @@ pages) must also be specified via the tbuf_size parameter.
 ### tickle_one_idle_cpu
 > `= <boolean>`
 
+### pit-irq-works (x86)
+> `=<boolean>`
+
+> Default: `false`
+
+Disables the code which tests for broken timer IRQ sources. Enabling
+this option will reduce boot time on HW where the timer works properly.
+
+If the system is unstable when enabling the option, then it means you
+may have broken HW and therefore the testing cannot be skipped.
+
 ### timer_slop
 > `= <integer>`
 
diff --git a/xen/arch/x86/io_apic.c b/xen/arch/x86/io_apic.c
index d11c880544e6..238b6c1c2837 100644
--- a/xen/arch/x86/io_apic.c
+++ b/xen/arch/x86/io_apic.c
@@ -57,6 +57,14 @@ bool __initdata ioapic_ack_forced;
 int __read_mostly nr_ioapic_entries[MAX_IO_APICS];
 int __read_mostly nr_ioapics;
 
+/*
+ * The logic to check if the timer is working is expensive. So allow
+ * the admin to bypass it if they know their platform doesn't have
+ * a buggy timer.
+ */
+static bool __initdata pit_irq_works;
+boolean_param("pit-irq-works", pit_irq_works);
+
 /*
  * Rough estimation of how many shared IRQs there are, can
  * be changed anytime.
@@ -1502,6 +1510,9 @@ static int __init timer_irq_works(void)
 {
     unsigned long t1, flags;
 
+    if ( pit_irq_works )
+        return 1;
+
     t1 = ACCESS_ONCE(pit0_ticks);
 
     local_save_flags(flags);
-- 
2.40.1

Re: [PATCH v2 1/2] xen/x86: io_apic: Introduce a command line option to skip timer check

Posted by Jan Beulich 2 years, 1 month ago

On 11.12.2023 13:23, Julien Grall wrote:
> --- a/docs/misc/xen-command-line.pandoc
> +++ b/docs/misc/xen-command-line.pandoc
> @@ -2535,6 +2535,17 @@ pages) must also be specified via the tbuf_size parameter.
>  ### tickle_one_idle_cpu
>  > `= <boolean>`
>  
> +### pit-irq-works (x86)
> +> `=<boolean>`
> +
> +> Default: `false`
> +
> +Disables the code which tests for broken timer IRQ sources. Enabling
> +this option will reduce boot time on HW where the timer works properly.
> +
> +If the system is unstable when enabling the option, then it means you
> +may have broken HW and therefore the testing cannot be skipped.
> +
>  ### timer_slop
>  > `= <integer>`

With the rename this now needs to move up to retain sorting.

> --- a/xen/arch/x86/io_apic.c
> +++ b/xen/arch/x86/io_apic.c
> @@ -57,6 +57,14 @@ bool __initdata ioapic_ack_forced;
>  int __read_mostly nr_ioapic_entries[MAX_IO_APICS];
>  int __read_mostly nr_ioapics;
>  
> +/*
> + * The logic to check if the timer is working is expensive. So allow
> + * the admin to bypass it if they know their platform doesn't have
> + * a buggy timer.
> + */
> +static bool __initdata pit_irq_works;
> +boolean_param("pit-irq-works", pit_irq_works);
> +
>  /*
>   * Rough estimation of how many shared IRQs there are, can
>   * be changed anytime.
> @@ -1502,6 +1510,9 @@ static int __init timer_irq_works(void)
>  {
>      unsigned long t1, flags;
>  
> +    if ( pit_irq_works )
> +        return 1;

When the check is placed here, exactly what using the option means is
system dependent. I consider this somewhat risky, so I'd prefer that the
check be put on the "normal" path in check_timer(). That way it'll
affect only the one case which we can generally consider "known good",
but not the cases where the virtual wire setups are being probed. I.e.

    if (pin1 != -1) {
        /*
         * Ok, does IRQ0 through the IOAPIC work?
         */
        unmask_IO_APIC_irq(irq_to_desc(0));
        if (pit_irq_works || timer_irq_works()) {
            local_irq_restore(flags);
            return;
        }

Plus this way changes to the various fallback paths can also be done
without needing to consider users who might be making use of the new
option.

Jan

Re: [PATCH v2 1/2] xen/x86: io_apic: Introduce a command line option to skip timer check

Posted by Julien Grall 2 years, 1 month ago

Hi,

On 14/12/2023 10:10, Jan Beulich wrote:
> On 11.12.2023 13:23, Julien Grall wrote:
>> --- a/docs/misc/xen-command-line.pandoc
>> +++ b/docs/misc/xen-command-line.pandoc
>> @@ -2535,6 +2535,17 @@ pages) must also be specified via the tbuf_size parameter.
>>   ### tickle_one_idle_cpu
>>   > `= <boolean>`
>>   
>> +### pit-irq-works (x86)
>> +> `=<boolean>`
>> +
>> +> Default: `false`
>> +
>> +Disables the code which tests for broken timer IRQ sources. Enabling
>> +this option will reduce boot time on HW where the timer works properly.
>> +
>> +If the system is unstable when enabling the option, then it means you
>> +may have broken HW and therefore the testing cannot be skipped.
>> +
>>   ### timer_slop
>>   > `= <integer>`
> 
> With the rename this now needs to move up to retain sorting.

Ok.

> 
>> --- a/xen/arch/x86/io_apic.c
>> +++ b/xen/arch/x86/io_apic.c
>> @@ -57,6 +57,14 @@ bool __initdata ioapic_ack_forced;
>>   int __read_mostly nr_ioapic_entries[MAX_IO_APICS];
>>   int __read_mostly nr_ioapics;
>>   
>> +/*
>> + * The logic to check if the timer is working is expensive. So allow
>> + * the admin to bypass it if they know their platform doesn't have
>> + * a buggy timer.
>> + */
>> +static bool __initdata pit_irq_works;
>> +boolean_param("pit-irq-works", pit_irq_works);
>> +
>>   /*
>>    * Rough estimation of how many shared IRQs there are, can
>>    * be changed anytime.
>> @@ -1502,6 +1510,9 @@ static int __init timer_irq_works(void)
>>   {
>>       unsigned long t1, flags;
>>   
>> +    if ( pit_irq_works )
>> +        return 1;
> 
> When the check is placed here, what exactly use of the option means is
> system dependent. I consider this somewhat risky, so I'd prefer if the
> check was put on the "normal" path in check_timer(). That way it'll
> affect only the one case which we can generally consider "known good",
> but not the cases where the virtual wire setups are being probed. I.e.

I am not against restricting when we allow skipping the timer check. But 
in that case, I wonder why Linux is doing it differently?

After all, this code is heavily borrowed from Linux. So shouldn't we 
follow what they are doing?

Cheers,

-- 
Julien Grall

Re: [PATCH v2 1/2] xen/x86: io_apic: Introduce a command line option to skip timer check

Posted by Jan Beulich 2 years, 1 month ago

On 14.12.2023 11:14, Julien Grall wrote:
> On 14/12/2023 10:10, Jan Beulich wrote:
>> On 11.12.2023 13:23, Julien Grall wrote:
>>> --- a/xen/arch/x86/io_apic.c
>>> +++ b/xen/arch/x86/io_apic.c
>>> @@ -57,6 +57,14 @@ bool __initdata ioapic_ack_forced;
>>>   int __read_mostly nr_ioapic_entries[MAX_IO_APICS];
>>>   int __read_mostly nr_ioapics;
>>>   
>>> +/*
>>> + * The logic to check if the timer is working is expensive. So allow
>>> + * the admin to bypass it if they know their platform doesn't have
>>> + * a buggy timer.
>>> + */
>>> +static bool __initdata pit_irq_works;
>>> +boolean_param("pit-irq-works", pit_irq_works);
>>> +
>>>   /*
>>>    * Rough estimation of how many shared IRQs there are, can
>>>    * be changed anytime.
>>> @@ -1502,6 +1510,9 @@ static int __init timer_irq_works(void)
>>>   {
>>>       unsigned long t1, flags;
>>>   
>>> +    if ( pit_irq_works )
>>> +        return 1;
>>
>> When the check is placed here, what exactly use of the option means is
>> system dependent. I consider this somewhat risky, so I'd prefer if the
>> check was put on the "normal" path in check_timer(). That way it'll
>> affect only the one case which we can generally consider "known good",
>> but not the cases where the virtual wire setups are being probed. I.e.
> 
> I am not against restricting when we allow skipping the timer check. But 
> in that case, I wonder why Linux is doing it differently?

Sadly Linux's git history doesn't go back far enough (it begins only after
2.6.11), so I can't (easily) find the patch (and description) for the x86-64
change. The later i386 change is justified mainly by paravirt needs, so
isn't applicable here. I therefore wouldn't exclude that my point above
wasn't even taken into consideration. Furthermore their command line option
is "no_timer_check", which to me firmly says "don't check" without regard to
whether the source (PIT) is actually okay. That's different from the option
name you (imo validly) chose.

Jan

Re: [PATCH v2 1/2] xen/x86: io_apic: Introduce a command line option to skip timer check

Posted by Julien Grall 2 years, 1 month ago

Hi Jan,

On 14/12/2023 10:35, Jan Beulich wrote:
> On 14.12.2023 11:14, Julien Grall wrote:
>> On 14/12/2023 10:10, Jan Beulich wrote:
>>> On 11.12.2023 13:23, Julien Grall wrote:
>>>> --- a/xen/arch/x86/io_apic.c
>>>> +++ b/xen/arch/x86/io_apic.c
>>>> @@ -57,6 +57,14 @@ bool __initdata ioapic_ack_forced;
>>>>    int __read_mostly nr_ioapic_entries[MAX_IO_APICS];
>>>>    int __read_mostly nr_ioapics;
>>>>    
>>>> +/*
>>>> + * The logic to check if the timer is working is expensive. So allow
>>>> + * the admin to bypass it if they know their platform doesn't have
>>>> + * a buggy timer.
>>>> + */
>>>> +static bool __initdata pit_irq_works;
>>>> +boolean_param("pit-irq-works", pit_irq_works);
>>>> +
>>>>    /*
>>>>     * Rough estimation of how many shared IRQs there are, can
>>>>     * be changed anytime.
>>>> @@ -1502,6 +1510,9 @@ static int __init timer_irq_works(void)
>>>>    {
>>>>        unsigned long t1, flags;
>>>>    
>>>> +    if ( pit_irq_works )
>>>> +        return 1;
>>>
>>> When the check is placed here, what exactly use of the option means is
>>> system dependent. I consider this somewhat risky, so I'd prefer if the
>>> check was put on the "normal" path in check_timer(). That way it'll
>>> affect only the one case which we can generally consider "known good",
>>> but not the cases where the virtual wire setups are being probed. I.e.

By "known good", do you mean the following:

diff --git a/xen/arch/x86/io_apic.c b/xen/arch/x86/io_apic.c
index c89fbed8d675..c39d39ee951a 100644
--- a/xen/arch/x86/io_apic.c
+++ b/xen/arch/x86/io_apic.c
@@ -1960,7 +1959,8 @@ static void __init check_timer(void)
           * Ok, does IRQ0 through the IOAPIC work?
           */
          unmask_IO_APIC_irq(irq_to_desc(0));
-        if (timer_irq_works()) {
+        if (pit_irq_works || timer_irq_works()) {
+            printk("====== pirq_irq_works %d =====\n", pit_irq_works);
              local_irq_restore(flags);
              return;
          }

>>
>> I am not against restricting when we allow skipping the timer check. But
>> in that case, I wonder why Linux is doing it differently?
> 
> Sadly Linux'es git history doesn't go back far enough (begins only at past
> 2.6.11), so I can't (easily) find the patch (and description) for the x86-64
> change. The later i386 change is justified mainly by paravirt needs, so
> isn't applicable here. I wouldn't therefore exclude that my point above
> wasn't even taken into consideration. Furthermore their command line option
> is "no_timer_check", which to me firmly says "don't check" without regard to
> whether the source (PIT) is actually okay. That's different with the option
> name you (imo validly) chose.

Just to note that the name was suggested by Roger. I have to admit that 
I didn't check if this made sense for the existing placement.

Anyway, I tested the change on the HW where I wanted to skip the timer 
check. And I can confirm this is still skipping the timer check.

So I will send a new version with the diff above and some updated comments.

Cheers,

-- 
Julien Grall

Re: [PATCH v2 1/2] xen/x86: io_apic: Introduce a command line option to skip timer check

Posted by Jan Beulich 2 years, 1 month ago

On 02.01.2024 20:09, Julien Grall wrote:
> Hi Jan,
> 
> On 14/12/2023 10:35, Jan Beulich wrote:
>> On 14.12.2023 11:14, Julien Grall wrote:
>>> On 14/12/2023 10:10, Jan Beulich wrote:
>>>> On 11.12.2023 13:23, Julien Grall wrote:
>>>>> --- a/xen/arch/x86/io_apic.c
>>>>> +++ b/xen/arch/x86/io_apic.c
>>>>> @@ -57,6 +57,14 @@ bool __initdata ioapic_ack_forced;
>>>>>    int __read_mostly nr_ioapic_entries[MAX_IO_APICS];
>>>>>    int __read_mostly nr_ioapics;
>>>>>    
>>>>> +/*
>>>>> + * The logic to check if the timer is working is expensive. So allow
>>>>> + * the admin to bypass it if they know their platform doesn't have
>>>>> + * a buggy timer.
>>>>> + */
>>>>> +static bool __initdata pit_irq_works;
>>>>> +boolean_param("pit-irq-works", pit_irq_works);
>>>>> +
>>>>>    /*
>>>>>     * Rough estimation of how many shared IRQs there are, can
>>>>>     * be changed anytime.
>>>>> @@ -1502,6 +1510,9 @@ static int __init timer_irq_works(void)
>>>>>    {
>>>>>        unsigned long t1, flags;
>>>>>    
>>>>> +    if ( pit_irq_works )
>>>>> +        return 1;
>>>>
>>>> When the check is placed here, what exactly use of the option means is
>>>> system dependent. I consider this somewhat risky, so I'd prefer if the
>>>> check was put on the "normal" path in check_timer(). That way it'll
>>>> affect only the one case which we can generally consider "known good",
>>>> but not the cases where the virtual wire setups are being probed. I.e.
> 
> By "known good", do you mean the following:
> 
> diff --git a/xen/arch/x86/io_apic.c b/xen/arch/x86/io_apic.c
> index c89fbed8d675..c39d39ee951a 100644
> --- a/xen/arch/x86/io_apic.c
> +++ b/xen/arch/x86/io_apic.c
> @@ -1960,7 +1959,8 @@ static void __init check_timer(void)
>            * Ok, does IRQ0 through the IOAPIC work?
>            */
>           unmask_IO_APIC_irq(irq_to_desc(0));
> -        if (timer_irq_works()) {
> +        if (pit_irq_works || timer_irq_works()) {
> +            printk("====== pirq_irq_works %d =====\n", pit_irq_works);
>               local_irq_restore(flags);
>               return;
>           }

Yes.

>>> I am not against restricting when we allow skipping the timer check. But
>>> in that case, I wonder why Linux is doing it differently?
>>
>> Sadly Linux'es git history doesn't go back far enough (begins only at past
>> 2.6.11), so I can't (easily) find the patch (and description) for the x86-64
>> change. The later i386 change is justified mainly by paravirt needs, so
>> isn't applicable here. I wouldn't therefore exclude that my point above
>> wasn't even taken into consideration. Furthermore their command line option
>> is "no_timer_check", which to me firmly says "don't check" without regard to
>> whether the source (PIT) is actually okay. That's different with the option
>> name you (imo validly) chose.
> 
> Just to note that the name was suggested by Roger. I have to admit that 
> I didn't check if this made sense for the existing placement.

Roger, thoughts?

Jan

> Anyway, I tested the change on the HW where I wanted to skip the timer 
> check. And I can confirm this is still skipping the timer check.
> 
> So I will send a new version with the diff above and some updated comments.
> 
> Cheers,
>

Re: [PATCH v2 1/2] xen/x86: io_apic: Introduce a command line option to skip timer check

Posted by Roger Pau Monné 2 years ago

On Thu, Jan 04, 2024 at 09:54:30AM +0100, Jan Beulich wrote:
> On 02.01.2024 20:09, Julien Grall wrote:
> > Hi Jan,
> > 
> > On 14/12/2023 10:35, Jan Beulich wrote:
> >> On 14.12.2023 11:14, Julien Grall wrote:
> >>> On 14/12/2023 10:10, Jan Beulich wrote:
> >>>> On 11.12.2023 13:23, Julien Grall wrote:
> >>>>> --- a/xen/arch/x86/io_apic.c
> >>>>> +++ b/xen/arch/x86/io_apic.c
> >>>>> @@ -57,6 +57,14 @@ bool __initdata ioapic_ack_forced;
> >>>>>    int __read_mostly nr_ioapic_entries[MAX_IO_APICS];
> >>>>>    int __read_mostly nr_ioapics;
> >>>>>    
> >>>>> +/*
> >>>>> + * The logic to check if the timer is working is expensive. So allow
> >>>>> + * the admin to bypass it if they know their platform doesn't have
> >>>>> + * a buggy timer.
> >>>>> + */
> >>>>> +static bool __initdata pit_irq_works;
> >>>>> +boolean_param("pit-irq-works", pit_irq_works);
> >>>>> +
> >>>>>    /*
> >>>>>     * Rough estimation of how many shared IRQs there are, can
> >>>>>     * be changed anytime.
> >>>>> @@ -1502,6 +1510,9 @@ static int __init timer_irq_works(void)
> >>>>>    {
> >>>>>        unsigned long t1, flags;
> >>>>>    
> >>>>> +    if ( pit_irq_works )
> >>>>> +        return 1;
> >>>>
> >>>> When the check is placed here, what exactly use of the option means is
> >>>> system dependent. I consider this somewhat risky, so I'd prefer if the
> >>>> check was put on the "normal" path in check_timer(). That way it'll
> >>>> affect only the one case which we can generally consider "known good",
> >>>> but not the cases where the virtual wire setups are being probed. I.e.
> > 
> > By "known good", do you mean the following:
> > 
> > diff --git a/xen/arch/x86/io_apic.c b/xen/arch/x86/io_apic.c
> > index c89fbed8d675..c39d39ee951a 100644
> > --- a/xen/arch/x86/io_apic.c
> > +++ b/xen/arch/x86/io_apic.c
> > @@ -1960,7 +1959,8 @@ static void __init check_timer(void)
> >            * Ok, does IRQ0 through the IOAPIC work?
> >            */
> >           unmask_IO_APIC_irq(irq_to_desc(0));
> > -        if (timer_irq_works()) {
> > +        if (pit_irq_works || timer_irq_works()) {
> > +            printk("====== pirq_irq_works %d =====\n", pit_irq_works);
> >               local_irq_restore(flags);
> >               return;
> >           }
> 
> Yes.
> 
> >>> I am not against restricting when we allow skipping the timer check. But
> >>> in that case, I wonder why Linux is doing it differently?
> >>
> >> Sadly Linux'es git history doesn't go back far enough (begins only at past
> >> 2.6.11), so I can't (easily) find the patch (and description) for the x86-64
> >> change. The later i386 change is justified mainly by paravirt needs, so
> >> isn't applicable here. I wouldn't therefore exclude that my point above
> >> wasn't even taken into consideration. Furthermore their command line option
> >> is "no_timer_check", which to me firmly says "don't check" without regard to
> >> whether the source (PIT) is actually okay. That's different with the option
> >> name you (imo validly) chose.
> > 
> > Just to note that the name was suggested by Roger. I have to admit that 
> > I didn't check if this made sense for the existing placement.
> 
> Roger, thoughts?

Right, with the usage of HPET in legacy replacement mode we are no
longer exclusively testing the PIT, so it might make sense to use a more
generic name, timer-irq-works or some such.

Thanks, Roger.