xen/irq: Perf improvements

[PATCH 1/2] xen/ioapic: Don't use local_irq_restore() to disable irqs

Posted by Andrew Cooper 2 years, 11 months ago

Despite its name, the irq{save,restore}() APIs are only intended to
conditionally disable and re-enable interrupts.

IO-APIC's timer_irq_works() violates this intention.  As it is init code,
switch to simple irq enable/disable().

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
CC: Wei Liu <wl@xen.org>

This is a logical equivelent to Linux's 058df195c2 "x86/ioapic: Cleanup the
timer_works() irqflags mess", but we've diverged far enough for the patch to
not be remotely relevant.
---
 xen/arch/x86/io_apic.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/xen/arch/x86/io_apic.c b/xen/arch/x86/io_apic.c
index 9b8a972cf570..199098fa3e0f 100644
--- a/xen/arch/x86/io_apic.c
+++ b/xen/arch/x86/io_apic.c
@@ -1502,15 +1502,14 @@ static void __init setup_ioapic_ids_from_mpc(void)
  */
 static int __init timer_irq_works(void)
 {
-    unsigned long t1, flags;
+    unsigned long t1;
 
     t1 = ACCESS_ONCE(pit0_ticks);
 
-    local_save_flags(flags);
     local_irq_enable();
     /* Let ten ticks pass... */
     mdelay((10 * 1000) / HZ);
-    local_irq_restore(flags);
+    local_irq_disable();
 
     /*
      * Expect a few ticks at least, to be sure some possible
-- 
2.30.2

Re: [PATCH 1/2] xen/ioapic: Don't use local_irq_restore() to disable irqs

Posted by Jan Beulich 2 years, 11 months ago

On 20.02.2023 20:47, Andrew Cooper wrote:
> Despite its name, the irq{save,restore}() APIs are only intended to
> conditionally disable and re-enable interrupts.

Are they? Maybe nowadays they indeed are, but I couldn't spot any wording
to this effect in Linux'es Documentation/ (and I don't think we have
anywhere such could be put). Yet I'd expect such a statement to be backed
by something.

> IO-APIC's timer_irq_works() violates this intention.  As it is init code,
> switch to simple irq enable/disable().

If all callers of the function obviously did have interrupts off, I might
agree. But the last of them sits _after_ local_irq_restore(), and all of
this is called from underneath smp_prepare_cpus(), i.e. after IRQs were
already enabled. I'm willing to believe that the local_irq_restore()
there really comes too early, but then, ...

> No functional change.

... in order for this to be true, that'll need fixing at the same time (or
in a prereq patch).

Jan

Re: [PATCH 1/2] xen/ioapic: Don't use local_irq_restore() to disable irqs

Posted by Andrew Cooper 2 years, 11 months ago

On 21/02/2023 1:40 pm, Jan Beulich wrote:
> On 20.02.2023 20:47, Andrew Cooper wrote:
>> Despite its name, the irq{save,restore}() APIs are only intended to
>> conditionally disable and re-enable interrupts.
> Are they?

Yes, absolutely.

And as said before, the potentially dubious naming does not give us
permission or the right to interpret the behaviour differently.

>  Maybe nowadays they indeed are, but I couldn't spot any wording
> to this effect in Linux'es Documentation/ (and I don't think we have
> anywhere such could be put). Yet I'd expect such a statement to be backed
> by something.

It is backed by the rude words the owners of this API had to say on
discovering this particular use.

Conditionally enabling with a construct like this is bogus everywhere. 
It is only ever safe to enable irqs if you know exactly why they were
disabled previously, and that can never be the case with a construct
like this.

It only reason this one example doesn't explode is because its an init
path not nested within an irq/irqsave lock or other critical region.

>
>> IO-APIC's timer_irq_works() violates this intention.  As it is init code,
>> switch to simple irq enable/disable().
> If all callers of the function obviously did have interrupts off, I might
> agree. But the last of them sits _after_ local_irq_restore(), and all of
> this is called from underneath smp_prepare_cpus()

Which do you mean "the last of them" ?

There is a large amount of genuinely dead logic here.

~Andrew

Re: [PATCH 1/2] xen/ioapic: Don't use local_irq_restore() to disable irqs

Posted by Jan Beulich 2 years, 11 months ago

On 21.02.2023 19:14, Andrew Cooper wrote:
> On 21/02/2023 1:40 pm, Jan Beulich wrote:
>> On 20.02.2023 20:47, Andrew Cooper wrote:
>>> Despite its name, the irq{save,restore}() APIs are only intended to
>>> conditionally disable and re-enable interrupts.
>> Are they?
> 
> Yes, absolutely.
> 
> And as said before, the potentially dubious naming does not give us
> permission or the right to interpret the behaviour differently.

When I started to work with Linux, I seem to recall there were more uses
of this nature, not considered dubious at the time. Views change, and if
such views aren't expressed in the function names, then it is even more
so important that this is spelled out in some other prominent way. So ...

>>  Maybe nowadays they indeed are, but I couldn't spot any wording
>> to this effect in Linux'es Documentation/ (and I don't think we have
>> anywhere such could be put). Yet I'd expect such a statement to be backed
>> by something.
> 
> It is backed by the rude words the owners of this API had to say on
> discovering this particular use.

... have those rude words found their way into documentation, in a
place I simply didn't spot?

> Conditionally enabling with a construct like this is bogus everywhere. 
> It is only ever safe to enable irqs if you know exactly why they were
> disabled previously, and that can never be the case with a construct
> like this.
> 
> It only reason this one example doesn't explode is because its an init
> path not nested within an irq/irqsave lock or other critical region.

Which is why, I assume, it was deemed fine to code that way back then.

>>> IO-APIC's timer_irq_works() violates this intention.  As it is init code,
>>> switch to simple irq enable/disable().
>> If all callers of the function obviously did have interrupts off, I might
>> agree. But the last of them sits _after_ local_irq_restore(), and all of
>> this is called from underneath smp_prepare_cpus()
> 
> Which do you mean "the last of them" ?

Is "last" ambiguous here in any way? If so, this code fragment is the one:

    unlock_ExtINT_logic();

    local_irq_restore(flags);

    if (timer_irq_works()) {
        printk(" works.\n");
        return;
    }
    printk(" failed :(.\n");
    panic("IO-APIC + timer doesn't work!  Boot with apic_verbosity=debug and send a report\n");
}

Besides aspects one may deem benign and/or pre-existing, your change leads
to the "works" path now running (returning) with IRQs off, when the caller
expects them on.

> There is a large amount of genuinely dead logic here.

Possibly.

Jan