[v1] x86/i8259: do not assume interrupts always target CPU0

[PATCH] x86/i8259: do not assume interrupts always target CPU0

Posted by Roger Pau Monne 7 months, 1 week ago

Sporadically we have seen the following during AP bringup on AMD platforms
only:

microcode: CPU59 updated from revision 0x830107a to 0x830107a, date = 2023-05-17
microcode: CPU60 updated from revision 0x830104d to 0x830107a, date = 2023-05-17
CPU60: No irq handler for vector 27 (IRQ -2147483648)
microcode: CPU61 updated from revision 0x830107a to 0x830107a, date = 2023-05-17

This is similar to the issue raised on Linux commit 36e9e1eab777e, where they
also observed i8259 (active) vectors getting delivered to CPUs different than
0.

Adjust the target CPU mask of i8259 interrupt descriptors to contain all
possible CPUs, so that APs will reserve the vector at startup if any legacy IRQ
is still delivered through the i8259.  Note that if the IO-APIC takes over
those interrupt descriptors the CPU mask will be reset.

Spurious i8259 interrupt vectors however (IRQ7 and IRQ15) can be injected even
when all i8259 pins are masked, and hence need to be handled on all CPUs.
Reserve such vectors in order to prevent dynamic interrupt sources from using
them.

Finally, handle spurious i8259 interrupts on all CPUs and adjust the printed
message to display the CPU where the spurious interrupt has been received, so
it looks like:

microcode: CPU1 updated from revision 0x830107a to 0x830107a, date = 2023-05-17
cpu1: spurious 8259A interrupt: IRQ7
microcode: CPU2 updated from revision 0x830104d to 0x830107a, date = 2023-05-17

Fixes: 3fba06ba9f8b ('x86/IRQ: re-use legacy vector ranges on APs')
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
One theory I have is that the APs at some point (before jumping into Xen code)
have the local APIC hardware-disabled, and hence are considered valid targets
by the i8259, but by the time the vector is fetched from the i8259 the
interrupt has either been masked, or already consumed by a different CPU.
---
 xen/arch/x86/i8259.c | 18 ++++++++++++++++--
 xen/arch/x86/irq.c   |  9 ++++++++-
 2 files changed, 24 insertions(+), 3 deletions(-)

diff --git a/xen/arch/x86/i8259.c b/xen/arch/x86/i8259.c
index ed9f55abe51e..ad3bca9895d0 100644
--- a/xen/arch/x86/i8259.c
+++ b/xen/arch/x86/i8259.c
@@ -222,7 +222,8 @@ static bool _mask_and_ack_8259A_irq(unsigned int irq)
         is_real_irq = false;
         /* Report spurious IRQ, once per IRQ line. */
         if (!(spurious_irq_mask & irqmask)) {
-            printk("spurious 8259A interrupt: IRQ%d.\n", irq);
+            printk("cpu%u: spurious 8259A interrupt: IRQ%u\n",
+                   smp_processor_id(), irq);
             spurious_irq_mask |= irqmask;
         }
         /*
@@ -349,7 +350,20 @@ void __init init_IRQ(void)
             continue;
         desc->handler = &i8259A_irq_type;
         per_cpu(vector_irq, cpu)[LEGACY_VECTOR(irq)] = irq;
-        cpumask_copy(desc->arch.cpu_mask, cpumask_of(cpu));
+
+        /*
+         * The interrupt affinity logic never targets interrupts to offline
+         * CPUs, hence it's safe to use cpumask_all here.
+         *
+         * Legacy PIC interrupts are only targeted to CPU0, but depending on
+         * the platform they can be distributed to any online CPU in hardware.
+         * The kernel has no influence on that. So all active legacy vectors
+         * must be installed on all CPUs.
+         *
+         * IO-APIC will change the destination mask if/when taking ownership of
+         * the interrupt.
+         */
+        cpumask_copy(desc->arch.cpu_mask, &cpumask_all);
         desc->arch.vector = LEGACY_VECTOR(irq);
     }
     
diff --git a/xen/arch/x86/irq.c b/xen/arch/x86/irq.c
index 6abfd8162120..2379fdda3a7e 100644
--- a/xen/arch/x86/irq.c
+++ b/xen/arch/x86/irq.c
@@ -466,6 +466,14 @@ int __init init_irq_data(void)
           vector++ )
         __set_bit(vector, used_vectors);
 
+    /*
+     * Mark i8259 spurious vectors as used to avoid (re)using them.  Otherwise
+     * it won't be possible to distinguish between device triggered interrupts
+     * or spurious i8259 ones.
+     */
+    __set_bit(LEGACY_VECTOR(7), used_vectors);
+    __set_bit(LEGACY_VECTOR(15), used_vectors);
+
     return 0;
 }
 
@@ -1920,7 +1928,6 @@ void do_IRQ(struct cpu_user_regs *regs)
                 kind = "";
             if ( !(vector >= FIRST_LEGACY_VECTOR &&
                    vector <= LAST_LEGACY_VECTOR &&
-                   !smp_processor_id() &&
                    bogus_8259A_irq(vector - FIRST_LEGACY_VECTOR)) )
             {
                 printk("CPU%u: No irq handler for vector %02x (IRQ %d%s)\n",
-- 
2.42.0

Re: [PATCH] x86/i8259: do not assume interrupts always target CPU0

Posted by Jan Beulich 7 months ago

On 19.09.2023 13:06, Roger Pau Monne wrote:
> Sporadically we have seen the following during AP bringup on AMD platforms
> only:
> 
> microcode: CPU59 updated from revision 0x830107a to 0x830107a, date = 2023-05-17
> microcode: CPU60 updated from revision 0x830104d to 0x830107a, date = 2023-05-17
> CPU60: No irq handler for vector 27 (IRQ -2147483648)
> microcode: CPU61 updated from revision 0x830107a to 0x830107a, date = 2023-05-17
> 
> This is similar to the issue raised on Linux commit 36e9e1eab777e, where they
> also observed i8259 (active) vectors getting delivered to CPUs different than
> 0.
> 
> Adjust the target CPU mask of i8259 interrupt descriptors to contain all
> possible CPUs, so that APs will reserve the vector at startup if any legacy IRQ
> is still delivered through the i8259.  Note that if the IO-APIC takes over
> those interrupt descriptors the CPU mask will be reset.
> 
> Spurious i8259 interrupt vectors however (IRQ7 and IRQ15) can be injected even
> when all i8259 pins are masked, and hence need to be handled on all CPUs.
> Reserve such vectors in order to prevent dynamic interrupt sources from using
> them.
> 
> Finally, handle spurious i8259 interrupts on all CPUs and adjust the printed
> message to display the CPU where the spurious interrupt has been received, so
> it looks like:
> 
> microcode: CPU1 updated from revision 0x830107a to 0x830107a, date = 2023-05-17
> cpu1: spurious 8259A interrupt: IRQ7
> microcode: CPU2 updated from revision 0x830104d to 0x830107a, date = 2023-05-17
> 
> Fixes: 3fba06ba9f8b ('x86/IRQ: re-use legacy vector ranges on APs')
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> ---
> One theory I have is that the APs at some point (before jumping into Xen code)
> have the local APIC hardware-disabled, and hence are considered valid targets
> by the i8259, but by the time the vector is fetched from the i8259 the
> interrupt has either been masked, or already consumed by a different CPU.

Aiui with LAPIC disabled, IRQs should only be possible to go to CPU0,
for there simply not being any SMP without LAPICs.

Did you check that there are unmasked ExtINT LVT0 on APs? And unmasked
PIC IRQs? (Although, for the latter, it could of course be that by the
time we gain control, they're all masked again, but an IRQ was in the
meantime classified as spurious.)

> --- a/xen/arch/x86/i8259.c
> +++ b/xen/arch/x86/i8259.c
> @@ -222,7 +222,8 @@ static bool _mask_and_ack_8259A_irq(unsigned int irq)
>          is_real_irq = false;
>          /* Report spurious IRQ, once per IRQ line. */
>          if (!(spurious_irq_mask & irqmask)) {
> -            printk("spurious 8259A interrupt: IRQ%d.\n", irq);
> +            printk("cpu%u: spurious 8259A interrupt: IRQ%u\n",
> +                   smp_processor_id(), irq);
>              spurious_irq_mask |= irqmask;
>          }

Nit: Perhaps, to be in line with the other message in context below,
"CPU%u: ..."?

> @@ -349,7 +350,20 @@ void __init init_IRQ(void)
>              continue;
>          desc->handler = &i8259A_irq_type;
>          per_cpu(vector_irq, cpu)[LEGACY_VECTOR(irq)] = irq;
> -        cpumask_copy(desc->arch.cpu_mask, cpumask_of(cpu));
> +
> +        /*
> +         * The interrupt affinity logic never targets interrupts to offline
> +         * CPUs, hence it's safe to use cpumask_all here.
> +         *
> +         * Legacy PIC interrupts are only targeted to CPU0, but depending on
> +         * the platform they can be distributed to any online CPU in hardware.

I'm unaware of such a distribution mechanism. Do you have a reference? (If
I recall correctly, there needs to be a unique entity in the system that
runs the INTA protocol with the [master] PIC.)

> +         * The kernel has no influence on that. So all active legacy vectors
> +         * must be installed on all CPUs.
> +         *
> +         * IO-APIC will change the destination mask if/when taking ownership of
> +         * the interrupt.
> +         */
> +        cpumask_copy(desc->arch.cpu_mask, &cpumask_all);
>          desc->arch.vector = LEGACY_VECTOR(irq);
>      }
>      
> --- a/xen/arch/x86/irq.c
> +++ b/xen/arch/x86/irq.c
> @@ -466,6 +466,14 @@ int __init init_irq_data(void)
>            vector++ )
>          __set_bit(vector, used_vectors);
>  
> +    /*
> +     * Mark i8259 spurious vectors as used to avoid (re)using them.  Otherwise
> +     * it won't be possible to distinguish between device triggered interrupts
> +     * or spurious i8259 ones.
> +     */

You certainly mean {L,IO}APIC device triggered interrupts here? If so, why
would they not be distinguishable? ExtINT IRQs don't set ISR bits, iirc.

> +    __set_bit(LEGACY_VECTOR(7), used_vectors);
> +    __set_bit(LEGACY_VECTOR(15), used_vectors);
> +
>      return 0;
>  }

Assuming no IRQs are handled through the PIC (which ought to be the case on
virtually all systems), we'd have masked all of them before any APs are
brought up. With them masked, aiui even spurious interrupts cannot occur.
Still you're permanently removing close to 1% of an already known scarce
resource. Can we, if at all, do this just temporarily, such that later on
the vectors can become usable again? (See also setup_local_APIC() for
ExtINT setup - only the BSP would ever have LVT0 unmasked once we've
finished with our setup.)

Jan

> @@ -1920,7 +1928,6 @@ void do_IRQ(struct cpu_user_regs *regs)
>                  kind = "";
>              if ( !(vector >= FIRST_LEGACY_VECTOR &&
>                     vector <= LAST_LEGACY_VECTOR &&
> -                   !smp_processor_id() &&
>                     bogus_8259A_irq(vector - FIRST_LEGACY_VECTOR)) )
>              {
>                  printk("CPU%u: No irq handler for vector %02x (IRQ %d%s)\n",

Re: [PATCH] x86/i8259: do not assume interrupts always target CPU0

Posted by Roger Pau Monné 7 months ago

On Mon, Sep 25, 2023 at 11:55:25AM +0200, Jan Beulich wrote:
> On 19.09.2023 13:06, Roger Pau Monne wrote:
> > Sporadically we have seen the following during AP bringup on AMD platforms
> > only:
> > 
> > microcode: CPU59 updated from revision 0x830107a to 0x830107a, date = 2023-05-17
> > microcode: CPU60 updated from revision 0x830104d to 0x830107a, date = 2023-05-17
> > CPU60: No irq handler for vector 27 (IRQ -2147483648)
> > microcode: CPU61 updated from revision 0x830107a to 0x830107a, date = 2023-05-17
> > 
> > This is similar to the issue raised on Linux commit 36e9e1eab777e, where they
> > also observed i8259 (active) vectors getting delivered to CPUs different than
> > 0.
> > 
> > Adjust the target CPU mask of i8259 interrupt descriptors to contain all
> > possible CPUs, so that APs will reserve the vector at startup if any legacy IRQ
> > is still delivered through the i8259.  Note that if the IO-APIC takes over
> > those interrupt descriptors the CPU mask will be reset.
> > 
> > Spurious i8259 interrupt vectors however (IRQ7 and IRQ15) can be injected even
> > when all i8259 pins are masked, and hence need to be handled on all CPUs.
> > Reserve such vectors in order to prevent dynamic interrupt sources from using
> > them.
> > 
> > Finally, handle spurious i8259 interrupts on all CPUs and adjust the printed
> > message to display the CPU where the spurious interrupt has been received, so
> > it looks like:
> > 
> > microcode: CPU1 updated from revision 0x830107a to 0x830107a, date = 2023-05-17
> > cpu1: spurious 8259A interrupt: IRQ7
> > microcode: CPU2 updated from revision 0x830104d to 0x830107a, date = 2023-05-17
> > 
> > Fixes: 3fba06ba9f8b ('x86/IRQ: re-use legacy vector ranges on APs')
> > Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> > ---
> > One theory I have is that the APs at some point (before jumping into Xen code)
> > have the local APIC hardware-disabled, and hence are considered valid targets
> > by the i8259, but by the time the vector is fetched from the i8259 the
> > interrupt has either been masked, or already consumed by a different CPU.
> 
> Aiui with LAPIC disabled, IRQs should only be possible to go to CPU0,
> for there simply not being any SMP without LAPICs.
> 
> Did you check that there are unmasked ExtINT LVT0 on APs?

Yes, I did.  None of the APs LVT0 have ExtINT set, I was really hoping
they would have ExtINT set, as this would then make sense.

Maybe there's a point before handling control to Xen (in firmware)
where APs have ExtINT LVT0 set, I certainly have no idea about that.

Seeing Linux commit 36e9e1eab777e I get the impression that on AMD
systems PIC interrupts are delivered to any CPU that either has LVT0
ExtINT or the lapic hardware-disabled.  I have however found no
information about this at all in the PM or PPRs.

> And unmasked
> PIC IRQs? (Although, for the latter, it could of course be that by the
> time we gain control, they're all masked again, but an IRQ was in the
> meantime classified as spurious.)

Even if there are unmasked PIC IRQs (which is not the case AFAICT), I
would expect them to be delivered to CPU#0 because that's the only one
that has ExtINT set in LVT0.

> > @@ -349,7 +350,20 @@ void __init init_IRQ(void)
> >              continue;
> >          desc->handler = &i8259A_irq_type;
> >          per_cpu(vector_irq, cpu)[LEGACY_VECTOR(irq)] = irq;
> > -        cpumask_copy(desc->arch.cpu_mask, cpumask_of(cpu));
> > +
> > +        /*
> > +         * The interrupt affinity logic never targets interrupts to offline
> > +         * CPUs, hence it's safe to use cpumask_all here.
> > +         *
> > +         * Legacy PIC interrupts are only targeted to CPU0, but depending on
> > +         * the platform they can be distributed to any online CPU in hardware.
> 
> I'm unaware of such a distribution mechanism. Do you have a reference? (If
> I recall correctly, there needs to be a unique entity in the system that
> runs the INTA protocol with the [master] PIC.)

This comment is mostly picked from Linux.  See:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/x86/kernel/apic/vector.c#n842

As said, I haven't been able to find any reference to this in manuals,
but given commit 36e9e1eab777e it does seem Linux has also experienced
PIC interrupts being delivered to CPUs different than the BSP at least
on AMD.

> > +         * The kernel has no influence on that. So all active legacy vectors
> > +         * must be installed on all CPUs.
> > +         *
> > +         * IO-APIC will change the destination mask if/when taking ownership of
> > +         * the interrupt.
> > +         */
> > +        cpumask_copy(desc->arch.cpu_mask, &cpumask_all);
> >          desc->arch.vector = LEGACY_VECTOR(irq);
> >      }
> >      
> > --- a/xen/arch/x86/irq.c
> > +++ b/xen/arch/x86/irq.c
> > @@ -466,6 +466,14 @@ int __init init_irq_data(void)
> >            vector++ )
> >          __set_bit(vector, used_vectors);
> >  
> > +    /*
> > +     * Mark i8259 spurious vectors as used to avoid (re)using them.  Otherwise
> > +     * it won't be possible to distinguish between device triggered interrupts
> > +     * or spurious i8259 ones.
> > +     */
> 
> You certainly mean {L,IO}APIC device triggered interrupts here? If so, why
> would they not be distinguishable? ExtINT IRQs don't set ISR bits, iirc.

Right, but we don't really check any of this when servicing the
interrupt in do_IRQ if the vector has a valid irq associated.

So if we assign the vector used by PIC pin 7 (IRQ7) to an {L,IO}-APIC
triggered interrupt, how would Xen distinguish a spurious PIC
interrupt from a L,IO}-APIC originated one?

Check the LAPIC ISR for each vector in do_IRQ() seems like an
unnecessary burden, and that would be a non-trivial overhead when
running virtualized.

So yes, it's possible to distinguish them, but I'm not sure we want to
engage in that logic in do_IRQ().

> > +    __set_bit(LEGACY_VECTOR(7), used_vectors);
> > +    __set_bit(LEGACY_VECTOR(15), used_vectors);
> > +
> >      return 0;
> >  }
> 
> Assuming no IRQs are handled through the PIC (which ought to be the case on
> virtually all systems), we'd have masked all of them before any APs are
> brought up. With them masked, aiui even spurious interrupts cannot occur.

It can occur if there was a pending interrupt in the PIC before
the masking happens, however such spurious interrupt would be handled
by the BSP then, as that's the only online CPU at that point (at least
from Xen PoV).

> Still you're permanently removing close to 1% of an already known scarce
> resource.

I don't have more arguments that what I've already exposed here.
I could try to bring this up with AMD, but I'm unsure this will shed
any light on the issue.

Another option is to not reserve the vectors, and assume that if the
vectors are used for {IO,L]-APIC interrupts some spurious PIC
interrupts might be incorrectly handled as device triggered
interrupts.

> Can we, if at all, do this just temporarily, such that later on
> the vectors can become usable again? (See also setup_local_APIC() for
> ExtINT setup - only the BSP would ever have LVT0 unmasked once we've
> finished with our setup.)

Yeah, I know, believe me I've spent a non-trivial amount of time trying
to figure out what's going on.

I even modified iommu_dte_add_device_entry() to block ExtINT on all
IOMMU devices (on the platform I was testing only the FCH had
ACPI_IVHD_EINT_PASS set in IVRS device flags), and I would still get
those spurious interrupts at AP bring up.

Thanks, Roger.

Re: [PATCH] x86/i8259: do not assume interrupts always target CPU0

Posted by Jan Beulich 7 months ago

On 25.09.2023 12:42, Roger Pau Monné wrote:
> On Mon, Sep 25, 2023 at 11:55:25AM +0200, Jan Beulich wrote:
>> On 19.09.2023 13:06, Roger Pau Monne wrote:
>>> Sporadically we have seen the following during AP bringup on AMD platforms
>>> only:
>>>
>>> microcode: CPU59 updated from revision 0x830107a to 0x830107a, date = 2023-05-17
>>> microcode: CPU60 updated from revision 0x830104d to 0x830107a, date = 2023-05-17
>>> CPU60: No irq handler for vector 27 (IRQ -2147483648)
>>> microcode: CPU61 updated from revision 0x830107a to 0x830107a, date = 2023-05-17
>>>
>>> This is similar to the issue raised on Linux commit 36e9e1eab777e, where they
>>> also observed i8259 (active) vectors getting delivered to CPUs different than
>>> 0.
>>>
>>> Adjust the target CPU mask of i8259 interrupt descriptors to contain all
>>> possible CPUs, so that APs will reserve the vector at startup if any legacy IRQ
>>> is still delivered through the i8259.  Note that if the IO-APIC takes over
>>> those interrupt descriptors the CPU mask will be reset.
>>>
>>> Spurious i8259 interrupt vectors however (IRQ7 and IRQ15) can be injected even
>>> when all i8259 pins are masked, and hence need to be handled on all CPUs.
>>> Reserve such vectors in order to prevent dynamic interrupt sources from using
>>> them.
>>>
>>> Finally, handle spurious i8259 interrupts on all CPUs and adjust the printed
>>> message to display the CPU where the spurious interrupt has been received, so
>>> it looks like:
>>>
>>> microcode: CPU1 updated from revision 0x830107a to 0x830107a, date = 2023-05-17
>>> cpu1: spurious 8259A interrupt: IRQ7
>>> microcode: CPU2 updated from revision 0x830104d to 0x830107a, date = 2023-05-17
>>>
>>> Fixes: 3fba06ba9f8b ('x86/IRQ: re-use legacy vector ranges on APs')
>>> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
>>> ---
>>> One theory I have is that the APs at some point (before jumping into Xen code)
>>> have the local APIC hardware-disabled, and hence are considered valid targets
>>> by the i8259, but by the time the vector is fetched from the i8259 the
>>> interrupt has either been masked, or already consumed by a different CPU.
>>
>> Aiui with LAPIC disabled, IRQs should only be possible to go to CPU0,
>> for there simply not being any SMP without LAPICs.
>>
>> Did you check that there are unmasked ExtINT LVT0 on APs?
> 
> Yes, I did.  None of the APs LVT0 have ExtINT set, I was really hoping
> they would have ExtINT set, as this would then make sense.
> 
> Maybe there's a point before handling control to Xen (in firmware)
> where APs have ExtINT LVT0 set, I certainly have no idea about that.
> 
> Seeing Linux commit 36e9e1eab777e I get the impression that on AMD
> systems PIC interrupts are delivered to any CPU that either has LVT0
> ExtINT or the lapic hardware-disabled.  I have however found no
> information about this at all in the PM or PPRs.

Just to mention it: While that commit make some relevant remarks, its
purpose was to work with "noapic", and one of its chunks is even
wrapped in "#ifndef CONFIG_X86_IO_APIC".

>> And unmasked
>> PIC IRQs? (Although, for the latter, it could of course be that by the
>> time we gain control, they're all masked again, but an IRQ was in the
>> meantime classified as spurious.)
> 
> Even if there are unmasked PIC IRQs (which is not the case AFAICT), I
> would expect them to be delivered to CPU#0 because that's the only one
> that has ExtINT set in LVT0.
> 
>>> @@ -349,7 +350,20 @@ void __init init_IRQ(void)
>>>              continue;
>>>          desc->handler = &i8259A_irq_type;
>>>          per_cpu(vector_irq, cpu)[LEGACY_VECTOR(irq)] = irq;
>>> -        cpumask_copy(desc->arch.cpu_mask, cpumask_of(cpu));
>>> +
>>> +        /*
>>> +         * The interrupt affinity logic never targets interrupts to offline
>>> +         * CPUs, hence it's safe to use cpumask_all here.
>>> +         *
>>> +         * Legacy PIC interrupts are only targeted to CPU0, but depending on
>>> +         * the platform they can be distributed to any online CPU in hardware.
>>
>> I'm unaware of such a distribution mechanism. Do you have a reference? (If
>> I recall correctly, there needs to be a unique entity in the system that
>> runs the INTA protocol with the [master] PIC.)
> 
> This comment is mostly picked from Linux.  See:
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/x86/kernel/apic/vector.c#n842
> 
> As said, I haven't been able to find any reference to this in manuals,
> but given commit 36e9e1eab777e it does seem Linux has also experienced
> PIC interrupts being delivered to CPUs different than the BSP at least
> on AMD.
> 
>>> +         * The kernel has no influence on that. So all active legacy vectors
>>> +         * must be installed on all CPUs.
>>> +         *
>>> +         * IO-APIC will change the destination mask if/when taking ownership of
>>> +         * the interrupt.
>>> +         */
>>> +        cpumask_copy(desc->arch.cpu_mask, &cpumask_all);
>>>          desc->arch.vector = LEGACY_VECTOR(irq);
>>>      }
>>>      
>>> --- a/xen/arch/x86/irq.c
>>> +++ b/xen/arch/x86/irq.c
>>> @@ -466,6 +466,14 @@ int __init init_irq_data(void)
>>>            vector++ )
>>>          __set_bit(vector, used_vectors);
>>>  
>>> +    /*
>>> +     * Mark i8259 spurious vectors as used to avoid (re)using them.  Otherwise
>>> +     * it won't be possible to distinguish between device triggered interrupts
>>> +     * or spurious i8259 ones.
>>> +     */
>>
>> You certainly mean {L,IO}APIC device triggered interrupts here? If so, why
>> would they not be distinguishable? ExtINT IRQs don't set ISR bits, iirc.
> 
> Right, but we don't really check any of this when servicing the
> interrupt in do_IRQ if the vector has a valid irq associated.
> 
> So if we assign the vector used by PIC pin 7 (IRQ7) to an {L,IO}-APIC
> triggered interrupt, how would Xen distinguish a spurious PIC
> interrupt from a L,IO}-APIC originated one?
> 
> Check the LAPIC ISR for each vector in do_IRQ() seems like an
> unnecessary burden, and that would be a non-trivial overhead when
> running virtualized.
> 
> So yes, it's possible to distinguish them, but I'm not sure we want to
> engage in that logic in do_IRQ().

Right, and here my comment was merely to ask that the code comment please
make correct statements. Saying "we don't want to go as far" is quite
okay, as long as we're (kind of) happy with the undesirable effects of
the actual code change.

>>> +    __set_bit(LEGACY_VECTOR(7), used_vectors);
>>> +    __set_bit(LEGACY_VECTOR(15), used_vectors);
>>> +
>>>      return 0;
>>>  }
>>
>> Assuming no IRQs are handled through the PIC (which ought to be the case on
>> virtually all systems), we'd have masked all of them before any APs are
>> brought up. With them masked, aiui even spurious interrupts cannot occur.
> 
> It can occur if there was a pending interrupt in the PIC before
> the masking happens, however such spurious interrupt would be handled
> by the BSP then, as that's the only online CPU at that point (at least
> from Xen PoV).
> 
>> Still you're permanently removing close to 1% of an already known scarce
>> resource.
> 
> I don't have more arguments that what I've already exposed here.
> I could try to bring this up with AMD, but I'm unsure this will shed
> any light on the issue.
> 
> Another option is to not reserve the vectors, and assume that if the
> vectors are used for {IO,L]-APIC interrupts some spurious PIC
> interrupts might be incorrectly handled as device triggered
> interrupts.
> 
>> Can we, if at all, do this just temporarily, such that later on
>> the vectors can become usable again? (See also setup_local_APIC() for
>> ExtINT setup - only the BSP would ever have LVT0 unmasked once we've
>> finished with our setup.)
> 
> Yeah, I know, believe me I've spent a non-trivial amount of time trying
> to figure out what's going on.
> 
> I even modified iommu_dte_add_device_entry() to block ExtINT on all
> IOMMU devices (on the platform I was testing only the FCH had
> ACPI_IVHD_EINT_PASS set in IVRS device flags), and I would still get
> those spurious interrupts at AP bring up.

Which means there's pretty clearly something they're not telling us.

So if this affects AMD (and by implication Hygon) only, can we make
reserving those vectors at least AMD-only? That said, to be honest
from your reply it didn't really become clear what you think of trying
to not permanently remove those vectors. Of course I understand that
with the behavior unexplained and undocumented, there's some guesswork
involved in trying such a less aggressive approach.

Jan