[for-6.2] hw/intc/arm_gicv3: Update cached state after acknowledging LPI

Peter Maydell posted 1 patch 2 weeks ago
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/qemu tags/patchew/20211123171031.975367-1-peter.maydell@linaro.org
Maintainers: Peter Maydell <peter.maydell@linaro.org>
hw/intc/arm_gicv3_cpuif.c | 1 +
1 file changed, 1 insertion(+)

[for-6.2] hw/intc/arm_gicv3: Update cached state after acknowledging LPI

Posted by Peter Maydell 2 weeks ago
In gicv3_redist_lpi_pending() we update cs->hpplpi to indicate the
new highest priority pending LPI after changing the requested LPI
pending bit.  However the overall highest priority pending interrupt
information won't be updated unless we call gicv3_redist_update().
We do that from the callsite in gicv3-redist_process_lpi(), but not
from the callsite in icc_activate_irq().  The effect is that when the
guest acknowledges an LPI by reading ICC_IAR1_EL1, we mark it as not
pending in the data structure but still leave it in cs->hppi so will
offer it to the guest again.

The effect is that if we are using an emulated GICv3 and ITS and
using devices which use LPIs (ie PCI devices) then Linux will
complain "irq 54: nobody cared" and then hang (probably because the
stale bogus interrupt info meant we never tried to deliver some other
real interrupt).

Add the missing gicv3_redist_update() call.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
Marked for-6.2 because this is a bug fix to the ITS support
which is new in this release. At least for me this is necessary
to boot Debian on the virt board, since the ITS is default-enabled.
The failure seemed to be somewhat intermittent; I haven't bothered
to try to work out why.

 hw/intc/arm_gicv3_cpuif.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/hw/intc/arm_gicv3_cpuif.c b/hw/intc/arm_gicv3_cpuif.c
index 3fe5de8ad7d..2d9f2ad2b6c 100644
--- a/hw/intc/arm_gicv3_cpuif.c
+++ b/hw/intc/arm_gicv3_cpuif.c
@@ -906,6 +906,7 @@ static void icc_activate_irq(GICv3CPUState *cs, int irq)
         gicv3_update(cs->gic, irq, 1);
     } else {
         gicv3_redist_lpi_pending(cs, irq, 0);
+        gicv3_redist_update(cs);
     }
 }
 
-- 
2.25.1


Re: [for-6.2] hw/intc/arm_gicv3: Update cached state after acknowledging LPI

Posted by Richard Henderson 2 weeks ago
On 11/23/21 6:10 PM, Peter Maydell wrote:
> In gicv3_redist_lpi_pending() we update cs->hpplpi to indicate the
> new highest priority pending LPI after changing the requested LPI
> pending bit.  However the overall highest priority pending interrupt
> information won't be updated unless we call gicv3_redist_update().
> We do that from the callsite in gicv3-redist_process_lpi(), but not
> from the callsite in icc_activate_irq().  The effect is that when the
> guest acknowledges an LPI by reading ICC_IAR1_EL1, we mark it as not
> pending in the data structure but still leave it in cs->hppi so will
> offer it to the guest again.
> 
> The effect is that if we are using an emulated GICv3 and ITS and
> using devices which use LPIs (ie PCI devices) then Linux will
> complain "irq 54: nobody cared" and then hang (probably because the
> stale bogus interrupt info meant we never tried to deliver some other
> real interrupt).
> 
> Add the missing gicv3_redist_update() call.
> 
> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
> ---
> Marked for-6.2 because this is a bug fix to the ITS support
> which is new in this release. At least for me this is necessary
> to boot Debian on the virt board, since the ITS is default-enabled.
> The failure seemed to be somewhat intermittent; I haven't bothered
> to try to work out why.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>

r~

Re: [for-6.2] hw/intc/arm_gicv3: Update cached state after acknowledging LPI

Posted by Peter Maydell 2 weeks ago
On Tue, 23 Nov 2021 at 17:10, Peter Maydell <peter.maydell@linaro.org> wrote:
>
> In gicv3_redist_lpi_pending() we update cs->hpplpi to indicate the
> new highest priority pending LPI after changing the requested LPI
> pending bit.  However the overall highest priority pending interrupt
> information won't be updated unless we call gicv3_redist_update().
> We do that from the callsite in gicv3-redist_process_lpi(), but not
> from the callsite in icc_activate_irq().  The effect is that when the
> guest acknowledges an LPI by reading ICC_IAR1_EL1, we mark it as not
> pending in the data structure but still leave it in cs->hppi so will
> offer it to the guest again.
>
> The effect is that if we are using an emulated GICv3 and ITS and
> using devices which use LPIs (ie PCI devices) then Linux will
> complain "irq 54: nobody cared" and then hang (probably because the
> stale bogus interrupt info meant we never tried to deliver some other
> real interrupt).

Hmm; this is definitely a bug, but maybe it's not the cause of
the symptoms listed above -- I've just seen them again even
with this fix. I'll keep digging...

-- PMM

RE: [for-6.2] hw/intc/arm_gicv3: Update cached state after acknowledging LPI

Posted by Shashi Mallela 2 weeks ago

                
            

Re: [for-6.2] hw/intc/arm_gicv3: Update cached state after acknowledging LPI

Posted by Peter Maydell 1 week, 6 days ago
On Tue, 23 Nov 2021 at 19:22, Shashi Mallela <shashi.mallela@linaro.org> wrote:
>
> Since LPIs do not have an active or active and pending state,the current implementation only clears the LPI pending state from the pending table once ICC_IAR1_EL1 acknowledges the interrupt.
>
> But, as part of gicv3_lpi_pending() processing, cs->hpplpi is updated with the next best priotiy lpi (only if the current acknowledged irq was best priority irq).

Yes. But we don't update cs->hppi there, and the GIC code assumes
that that cs->hppi always indicates the highest priority pending
interrupt, so leaving it stale will break things.

> By calling gicv3_redist_update() in icc_activate_irq(), we are
> re-initiating high priority irqs scan in redistributor and if
> applicable trigger of next best pending lpi from the latest
> cs->hpplpi info (which otherwise would have happened on next
> irq trigger from source).

We will figure out which the next best pending interrupt is
(which might be an LPI or might be some other interrupt).
But we won't actually trigger it, because it must be lower
priority than the LPI that we are activating.

(The way this works is that activating the LPI in icc_activate_irq()
writes into the Active Priority Registers to indicate the priority
of the current active interrupt. When gicv3_cpuif_update() is
deciding whether to set IRQ/FIQ to tell the CPU it has an interrupt
it calls icc_hppi_can_preempt(), which checks the priority of the
pending interrupt recorded in cs->hppi against the priority
of the active interrupt as calculated by icc_highest_active_prio().
So we won't take another interrupt until either (a) this one is
deactivated or (b) a fresh one arrives at a higher priority.)

-- PMM

Re: [for-6.2] hw/intc/arm_gicv3: Update cached state after acknowledging LPI

Posted by Alex Bennée 2 weeks ago
Peter Maydell <peter.maydell@linaro.org> writes:

> On Tue, 23 Nov 2021 at 17:10, Peter Maydell <peter.maydell@linaro.org> wrote:
>>
>> In gicv3_redist_lpi_pending() we update cs->hpplpi to indicate the
>> new highest priority pending LPI after changing the requested LPI
>> pending bit.  However the overall highest priority pending interrupt
>> information won't be updated unless we call gicv3_redist_update().
>> We do that from the callsite in gicv3-redist_process_lpi(), but not
>> from the callsite in icc_activate_irq().  The effect is that when the
>> guest acknowledges an LPI by reading ICC_IAR1_EL1, we mark it as not
>> pending in the data structure but still leave it in cs->hppi so will
>> offer it to the guest again.
>>
>> The effect is that if we are using an emulated GICv3 and ITS and
>> using devices which use LPIs (ie PCI devices) then Linux will
>> complain "irq 54: nobody cared" and then hang (probably because the
>> stale bogus interrupt info meant we never tried to deliver some other
>> real interrupt).
>
> Hmm; this is definitely a bug, but maybe it's not the cause of
> the symptoms listed above -- I've just seen them again even
> with this fix. I'll keep digging...

Hmm interesting - does this affect the kvm-unit-tests for GICv3?

>
> -- PMM


-- 
Alex Bennée