[PATCH v6 06/13] xen/arm: irq: Restore state of local IRQs during system resume

Mykola Kvach posted 13 patches 1 month, 4 weeks ago
[PATCH v6 06/13] xen/arm: irq: Restore state of local IRQs during system resume
Posted by Mykola Kvach 1 month, 4 weeks ago
From: Mykola Kvach <mykola_kvach@epam.com>

On ARM, the first 32 interrupts (SGIs and PPIs) are banked per-CPU
and not restored by gic_resume (for secondary cpus).

This patch introduces restore_local_irqs_on_resume, a function that
restores the state of local interrupts on the target CPU during
system resume.

It iterates over all local IRQs and re-enables those that were not
disabled, reprogramming their routing and affinity accordingly.

The function is invoked from start_secondary, ensuring that local IRQ
state is restored early during CPU bring-up after suspend.

Signed-off-by: Mykola Kvach <mykola_kvach@epam.com>
---
Changes in V6:
- Call handler->disable() instead of just setting the _IRQ_DISABLED flag
- Move the system state check outside of restore_local_irqs_on_resume()
---
 xen/arch/arm/irq.c | 39 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 39 insertions(+)

diff --git a/xen/arch/arm/irq.c b/xen/arch/arm/irq.c
index 6c899347ca..ddd2940554 100644
--- a/xen/arch/arm/irq.c
+++ b/xen/arch/arm/irq.c
@@ -116,6 +116,41 @@ static int init_local_irq_data(unsigned int cpu)
     return 0;
 }
 
+/*
+ * The first 32 interrupts (PPIs and SGIs) are per-CPU,
+ * so call this function on the target CPU to restore them.
+ *
+ * SPIs are restored via gic_resume.
+ */
+static void restore_local_irqs_on_resume(void)
+{
+    int irq;
+
+    spin_lock(&local_irqs_type_lock);
+
+    for ( irq = 0; irq < NR_LOCAL_IRQS; irq++ )
+    {
+        struct irq_desc *desc = irq_to_desc(irq);
+
+        spin_lock(&desc->lock);
+
+        if ( test_bit(_IRQ_DISABLED, &desc->status) )
+        {
+            spin_unlock(&desc->lock);
+            continue;
+        }
+
+        /* Disable the IRQ to avoid assertions in the following calls */
+        desc->handler->disable(desc);
+        gic_route_irq_to_xen(desc, GIC_PRI_IRQ);
+        desc->handler->startup(desc);
+
+        spin_unlock(&desc->lock);
+    }
+
+    spin_unlock(&local_irqs_type_lock);
+}
+
 static int cpu_callback(struct notifier_block *nfb, unsigned long action,
                         void *hcpu)
 {
@@ -134,6 +169,10 @@ static int cpu_callback(struct notifier_block *nfb, unsigned long action,
             printk(XENLOG_ERR "Unable to allocate local IRQ for CPU%u\n",
                    cpu);
         break;
+    case CPU_STARTING:
+        if ( system_state == SYS_STATE_resume )
+            restore_local_irqs_on_resume();
+        break;
     }
 
     return notifier_from_errno(rc);
-- 
2.48.1
Re: [PATCH v6 06/13] xen/arm: irq: Restore state of local IRQs during system resume
Posted by Oleksandr Tyshchenko 1 month, 4 weeks ago

On 02.09.25 01:10, Mykola Kvach wrote:

Hello Mykola

> From: Mykola Kvach <mykola_kvach@epam.com>
> 
> On ARM, the first 32 interrupts (SGIs and PPIs) are banked per-CPU
> and not restored by gic_resume (for secondary cpus).
> 
> This patch introduces restore_local_irqs_on_resume, a function that
> restores the state of local interrupts on the target CPU during
> system resume.
> 
> It iterates over all local IRQs and re-enables those that were not
> disabled, reprogramming their routing and affinity accordingly.
> 
> The function is invoked from start_secondary, ensuring that local IRQ
> state is restored early during CPU bring-up after suspend.
> 
> Signed-off-by: Mykola Kvach <mykola_kvach@epam.com>
> ---
> Changes in V6:
> - Call handler->disable() instead of just setting the _IRQ_DISABLED flag
> - Move the system state check outside of restore_local_irqs_on_resume()
> ---
>   xen/arch/arm/irq.c | 39 +++++++++++++++++++++++++++++++++++++++
>   1 file changed, 39 insertions(+)
> 
> diff --git a/xen/arch/arm/irq.c b/xen/arch/arm/irq.c
> index 6c899347ca..ddd2940554 100644
> --- a/xen/arch/arm/irq.c
> +++ b/xen/arch/arm/irq.c
> @@ -116,6 +116,41 @@ static int init_local_irq_data(unsigned int cpu)
>       return 0;
>   }
>   
> +/*
> + * The first 32 interrupts (PPIs and SGIs) are per-CPU,
> + * so call this function on the target CPU to restore them.
> + *
> + * SPIs are restored via gic_resume.
> + */
> +static void restore_local_irqs_on_resume(void)
> +{
> +    int irq;

NIT: Please, use "unsigned int" if irq cannot be negative

> +
> +    spin_lock(&local_irqs_type_lock);
> +
> +    for ( irq = 0; irq < NR_LOCAL_IRQS; irq++ )
> +    {
> +        struct irq_desc *desc = irq_to_desc(irq);
> +
> +        spin_lock(&desc->lock);
> +
> +        if ( test_bit(_IRQ_DISABLED, &desc->status) )
> +        {
> +            spin_unlock(&desc->lock);
> +            continue;
> +        }
> +
> +        /* Disable the IRQ to avoid assertions in the following calls */
> +        desc->handler->disable(desc);
> +        gic_route_irq_to_xen(desc, GIC_PRI_IRQ);

Shouldn't we use GIC_PRI_IPI for SGIs?


> +        desc->handler->startup(desc);
> +
> +        spin_unlock(&desc->lock);
> +    }
> +
> +    spin_unlock(&local_irqs_type_lock);
> +}
> +
>   static int cpu_callback(struct notifier_block *nfb, unsigned long action,
>                           void *hcpu)
>   {
> @@ -134,6 +169,10 @@ static int cpu_callback(struct notifier_block *nfb, unsigned long action,
>               printk(XENLOG_ERR "Unable to allocate local IRQ for CPU%u\n",
>                      cpu);
>           break;
> +    case CPU_STARTING:
> +        if ( system_state == SYS_STATE_resume )
> +            restore_local_irqs_on_resume();
> +        break;

May I please ask, why all this new code (i.e. 
restore_local_irqs_on_resume()) is not covered by #ifdef 
CONFIG_SYSTEM_SUSPEND?

>       }
>   
>       return notifier_from_errno(rc);
Re: [PATCH v6 06/13] xen/arm: irq: Restore state of local IRQs during system resume
Posted by Mykola Kvach 1 month, 4 weeks ago
On Tue, Sep 2, 2025 at 7:49 PM Oleksandr Tyshchenko <olekstysh@gmail.com> wrote:
>
>
>
> On 02.09.25 01:10, Mykola Kvach wrote:
>
> Hello Mykola
>
> > From: Mykola Kvach <mykola_kvach@epam.com>
> >
> > On ARM, the first 32 interrupts (SGIs and PPIs) are banked per-CPU
> > and not restored by gic_resume (for secondary cpus).
> >
> > This patch introduces restore_local_irqs_on_resume, a function that
> > restores the state of local interrupts on the target CPU during
> > system resume.
> >
> > It iterates over all local IRQs and re-enables those that were not
> > disabled, reprogramming their routing and affinity accordingly.
> >
> > The function is invoked from start_secondary, ensuring that local IRQ
> > state is restored early during CPU bring-up after suspend.
> >
> > Signed-off-by: Mykola Kvach <mykola_kvach@epam.com>
> > ---
> > Changes in V6:
> > - Call handler->disable() instead of just setting the _IRQ_DISABLED flag
> > - Move the system state check outside of restore_local_irqs_on_resume()
> > ---
> >   xen/arch/arm/irq.c | 39 +++++++++++++++++++++++++++++++++++++++
> >   1 file changed, 39 insertions(+)
> >
> > diff --git a/xen/arch/arm/irq.c b/xen/arch/arm/irq.c
> > index 6c899347ca..ddd2940554 100644
> > --- a/xen/arch/arm/irq.c
> > +++ b/xen/arch/arm/irq.c
> > @@ -116,6 +116,41 @@ static int init_local_irq_data(unsigned int cpu)
> >       return 0;
> >   }
> >
> > +/*
> > + * The first 32 interrupts (PPIs and SGIs) are per-CPU,
> > + * so call this function on the target CPU to restore them.
> > + *
> > + * SPIs are restored via gic_resume.
> > + */
> > +static void restore_local_irqs_on_resume(void)
> > +{
> > +    int irq;
>
> NIT: Please, use "unsigned int" if irq cannot be negative
>
> > +
> > +    spin_lock(&local_irqs_type_lock);
> > +
> > +    for ( irq = 0; irq < NR_LOCAL_IRQS; irq++ )
> > +    {
> > +        struct irq_desc *desc = irq_to_desc(irq);
> > +
> > +        spin_lock(&desc->lock);
> > +
> > +        if ( test_bit(_IRQ_DISABLED, &desc->status) )
> > +        {
> > +            spin_unlock(&desc->lock);
> > +            continue;
> > +        }
> > +
> > +        /* Disable the IRQ to avoid assertions in the following calls */
> > +        desc->handler->disable(desc);
> > +        gic_route_irq_to_xen(desc, GIC_PRI_IRQ);
>
> Shouldn't we use GIC_PRI_IPI for SGIs?

I'll update the priority value in the next version.

Initially, I assumed gic_route_irq_to_xen() was used for all
interrupts with the same priority. But looking more closely, it
doesn't appear to be called for SGIs at all.

In fact, SGI configuration, including priority, is handled during CPU
initialization in gic_init_secondary_cpu(), which is called before
the CPU_STARTING notifier.

Given that, it's probably better to avoid updating SGI priorities here
entirely and rely on their boot-time configuration instead.

>
>
> > +        desc->handler->startup(desc);
> > +
> > +        spin_unlock(&desc->lock);
> > +    }
> > +
> > +    spin_unlock(&local_irqs_type_lock);
> > +}
> > +
> >   static int cpu_callback(struct notifier_block *nfb, unsigned long action,
> >                           void *hcpu)
> >   {
> > @@ -134,6 +169,10 @@ static int cpu_callback(struct notifier_block *nfb, unsigned long action,
> >               printk(XENLOG_ERR "Unable to allocate local IRQ for CPU%u\n",
> >                      cpu);
> >           break;
> > +    case CPU_STARTING:
> > +        if ( system_state == SYS_STATE_resume )
> > +            restore_local_irqs_on_resume();
> > +        break;
>
> May I please ask, why all this new code (i.e.
> restore_local_irqs_on_resume()) is not covered by #ifdef
> CONFIG_SYSTEM_SUSPEND?
>
> >       }
> >
> >       return notifier_from_errno(rc);
>
Re: [PATCH v6 06/13] xen/arm: irq: Restore state of local IRQs during system resume
Posted by Mykola Kvach 1 month, 4 weeks ago
Hi Oleksandr,

On Tue, Sep 2, 2025 at 7:49 PM Oleksandr Tyshchenko <olekstysh@gmail.com> wrote:
>
>
>
> On 02.09.25 01:10, Mykola Kvach wrote:
>
> Hello Mykola
>
> > From: Mykola Kvach <mykola_kvach@epam.com>
> >
> > On ARM, the first 32 interrupts (SGIs and PPIs) are banked per-CPU
> > and not restored by gic_resume (for secondary cpus).
> >
> > This patch introduces restore_local_irqs_on_resume, a function that
> > restores the state of local interrupts on the target CPU during
> > system resume.
> >
> > It iterates over all local IRQs and re-enables those that were not
> > disabled, reprogramming their routing and affinity accordingly.
> >
> > The function is invoked from start_secondary, ensuring that local IRQ
> > state is restored early during CPU bring-up after suspend.
> >
> > Signed-off-by: Mykola Kvach <mykola_kvach@epam.com>
> > ---
> > Changes in V6:
> > - Call handler->disable() instead of just setting the _IRQ_DISABLED flag
> > - Move the system state check outside of restore_local_irqs_on_resume()
> > ---
> >   xen/arch/arm/irq.c | 39 +++++++++++++++++++++++++++++++++++++++
> >   1 file changed, 39 insertions(+)
> >
> > diff --git a/xen/arch/arm/irq.c b/xen/arch/arm/irq.c
> > index 6c899347ca..ddd2940554 100644
> > --- a/xen/arch/arm/irq.c
> > +++ b/xen/arch/arm/irq.c
> > @@ -116,6 +116,41 @@ static int init_local_irq_data(unsigned int cpu)
> >       return 0;
> >   }
> >
> > +/*
> > + * The first 32 interrupts (PPIs and SGIs) are per-CPU,
> > + * so call this function on the target CPU to restore them.
> > + *
> > + * SPIs are restored via gic_resume.
> > + */
> > +static void restore_local_irqs_on_resume(void)
> > +{
> > +    int irq;
>
> NIT: Please, use "unsigned int" if irq cannot be negative

ok

>
> > +
> > +    spin_lock(&local_irqs_type_lock);
> > +
> > +    for ( irq = 0; irq < NR_LOCAL_IRQS; irq++ )
> > +    {
> > +        struct irq_desc *desc = irq_to_desc(irq);
> > +
> > +        spin_lock(&desc->lock);
> > +
> > +        if ( test_bit(_IRQ_DISABLED, &desc->status) )
> > +        {
> > +            spin_unlock(&desc->lock);
> > +            continue;
> > +        }
> > +
> > +        /* Disable the IRQ to avoid assertions in the following calls */
> > +        desc->handler->disable(desc);
> > +        gic_route_irq_to_xen(desc, GIC_PRI_IRQ);
>
> Shouldn't we use GIC_PRI_IPI for SGIs?

Yes, we should. But currently I am restoring the same value
as it was before suspend...

I definitely agree that this needs to be fixed at the original
place where the issue was introduced, but I was planning to
address it in a future patch.

>
>
> > +        desc->handler->startup(desc);
> > +
> > +        spin_unlock(&desc->lock);
> > +    }
> > +
> > +    spin_unlock(&local_irqs_type_lock);
> > +}
> > +
> >   static int cpu_callback(struct notifier_block *nfb, unsigned long action,
> >                           void *hcpu)
> >   {
> > @@ -134,6 +169,10 @@ static int cpu_callback(struct notifier_block *nfb, unsigned long action,
> >               printk(XENLOG_ERR "Unable to allocate local IRQ for CPU%u\n",
> >                      cpu);
> >           break;
> > +    case CPU_STARTING:
> > +        if ( system_state == SYS_STATE_resume )
> > +            restore_local_irqs_on_resume();
> > +        break;
>
> May I please ask, why all this new code (i.e.
> restore_local_irqs_on_resume()) is not covered by #ifdef
> CONFIG_SYSTEM_SUSPEND?

I don’t see a reason to introduce such "macaron-style" code. On ARM, the
system suspend state is only set when CONFIG_SYSTEM_SUSPEND is defined
anyway.

If you would prefer me to wrap all relevant code with this define, please
let me know and I’ll make the change. In this case, I think the current
approach is cleaner, but I’m open to your opinion.

>
> >       }
> >
> >       return notifier_from_errno(rc);
>

Best regards,
Mykola
Re: [PATCH v6 06/13] xen/arm: irq: Restore state of local IRQs during system resume
Posted by Oleksandr Tyshchenko 1 month, 4 weeks ago

On 02.09.25 20:43, Mykola Kvach wrote:
> Hi Oleksandr,

Hello Mykola

> 
> On Tue, Sep 2, 2025 at 7:49 PM Oleksandr Tyshchenko <olekstysh@gmail.com> wrote:
>>
>>
>>
>> On 02.09.25 01:10, Mykola Kvach wrote:
>>
>> Hello Mykola
>>
>>> From: Mykola Kvach <mykola_kvach@epam.com>
>>>
>>> On ARM, the first 32 interrupts (SGIs and PPIs) are banked per-CPU
>>> and not restored by gic_resume (for secondary cpus).
>>>
>>> This patch introduces restore_local_irqs_on_resume, a function that
>>> restores the state of local interrupts on the target CPU during
>>> system resume.
>>>
>>> It iterates over all local IRQs and re-enables those that were not
>>> disabled, reprogramming their routing and affinity accordingly.
>>>
>>> The function is invoked from start_secondary, ensuring that local IRQ
>>> state is restored early during CPU bring-up after suspend.
>>>
>>> Signed-off-by: Mykola Kvach <mykola_kvach@epam.com>
>>> ---
>>> Changes in V6:
>>> - Call handler->disable() instead of just setting the _IRQ_DISABLED flag
>>> - Move the system state check outside of restore_local_irqs_on_resume()
>>> ---
>>>    xen/arch/arm/irq.c | 39 +++++++++++++++++++++++++++++++++++++++
>>>    1 file changed, 39 insertions(+)
>>>
>>> diff --git a/xen/arch/arm/irq.c b/xen/arch/arm/irq.c
>>> index 6c899347ca..ddd2940554 100644
>>> --- a/xen/arch/arm/irq.c
>>> +++ b/xen/arch/arm/irq.c
>>> @@ -116,6 +116,41 @@ static int init_local_irq_data(unsigned int cpu)
>>>        return 0;
>>>    }
>>>
>>> +/*
>>> + * The first 32 interrupts (PPIs and SGIs) are per-CPU,
>>> + * so call this function on the target CPU to restore them.
>>> + *
>>> + * SPIs are restored via gic_resume.
>>> + */
>>> +static void restore_local_irqs_on_resume(void)
>>> +{
>>> +    int irq;
>>
>> NIT: Please, use "unsigned int" if irq cannot be negative
> 
> ok
> 
>>
>>> +
>>> +    spin_lock(&local_irqs_type_lock);
>>> +
>>> +    for ( irq = 0; irq < NR_LOCAL_IRQS; irq++ )
>>> +    {
>>> +        struct irq_desc *desc = irq_to_desc(irq);
>>> +
>>> +        spin_lock(&desc->lock);
>>> +
>>> +        if ( test_bit(_IRQ_DISABLED, &desc->status) )
>>> +        {
>>> +            spin_unlock(&desc->lock);
>>> +            continue;
>>> +        }
>>> +
>>> +        /* Disable the IRQ to avoid assertions in the following calls */
>>> +        desc->handler->disable(desc);
>>> +        gic_route_irq_to_xen(desc, GIC_PRI_IRQ);
>>
>> Shouldn't we use GIC_PRI_IPI for SGIs?
> 
> Yes, we should. But currently I am restoring the same value
> as it was before suspend...
> 
> I definitely agree that this needs to be fixed at the original
> place where the issue was introduced, but I was planning to
> address it in a future patch.
> 
>>
>>
>>> +        desc->handler->startup(desc);
>>> +
>>> +        spin_unlock(&desc->lock);
>>> +    }
>>> +
>>> +    spin_unlock(&local_irqs_type_lock);
>>> +}
>>> +
>>>    static int cpu_callback(struct notifier_block *nfb, unsigned long action,
>>>                            void *hcpu)
>>>    {
>>> @@ -134,6 +169,10 @@ static int cpu_callback(struct notifier_block *nfb, unsigned long action,
>>>                printk(XENLOG_ERR "Unable to allocate local IRQ for CPU%u\n",
>>>                       cpu);
>>>            break;
>>> +    case CPU_STARTING:
>>> +        if ( system_state == SYS_STATE_resume )
>>> +            restore_local_irqs_on_resume();
>>> +        break;
>>
>> May I please ask, why all this new code (i.e.
>> restore_local_irqs_on_resume()) is not covered by #ifdef
>> CONFIG_SYSTEM_SUSPEND?
> 
> I don’t see a reason to introduce such "macaron-style" code. On ARM, the
> system suspend state is only set when CONFIG_SYSTEM_SUSPEND is defined
> anyway.

right

> 
> If you would prefer me to wrap all relevant code with this define, please
> let me know and I’ll make the change. In this case, I think the current
> approach is cleaner, but I’m open to your opinion.

In other patches, you seem to wrap functions/code that only get called 
during suspend/resume with #ifdef CONFIG_SYSTEM_SUSPEND, so I wondered 
why restore_local_irqs_on_resume() could not be compiled out
if the feature is not enabled. But if you still think it would be 
cleaner this way (w/o #ifdef), I would be ok.

> 
>>
>>>        }
>>>
>>>        return notifier_from_errno(rc);
>>
> 
> Best regards,
> Mykola


Re: [PATCH v6 06/13] xen/arm: irq: Restore state of local IRQs during system resume
Posted by Mykola Kvach 1 month, 4 weeks ago
On Tue, Sep 2, 2025 at 9:16 PM Oleksandr Tyshchenko <olekstysh@gmail.com> wrote:
>
>
>
> On 02.09.25 20:43, Mykola Kvach wrote:
> > Hi Oleksandr,
>
> Hello Mykola
>
> >
> > On Tue, Sep 2, 2025 at 7:49 PM Oleksandr Tyshchenko <olekstysh@gmail.com> wrote:
> >>
> >>
> >>
> >> On 02.09.25 01:10, Mykola Kvach wrote:
> >>
> >> Hello Mykola
> >>
> >>> From: Mykola Kvach <mykola_kvach@epam.com>
> >>>
> >>> On ARM, the first 32 interrupts (SGIs and PPIs) are banked per-CPU
> >>> and not restored by gic_resume (for secondary cpus).
> >>>
> >>> This patch introduces restore_local_irqs_on_resume, a function that
> >>> restores the state of local interrupts on the target CPU during
> >>> system resume.
> >>>
> >>> It iterates over all local IRQs and re-enables those that were not
> >>> disabled, reprogramming their routing and affinity accordingly.
> >>>
> >>> The function is invoked from start_secondary, ensuring that local IRQ
> >>> state is restored early during CPU bring-up after suspend.
> >>>
> >>> Signed-off-by: Mykola Kvach <mykola_kvach@epam.com>
> >>> ---
> >>> Changes in V6:
> >>> - Call handler->disable() instead of just setting the _IRQ_DISABLED flag
> >>> - Move the system state check outside of restore_local_irqs_on_resume()
> >>> ---
> >>>    xen/arch/arm/irq.c | 39 +++++++++++++++++++++++++++++++++++++++
> >>>    1 file changed, 39 insertions(+)
> >>>
> >>> diff --git a/xen/arch/arm/irq.c b/xen/arch/arm/irq.c
> >>> index 6c899347ca..ddd2940554 100644
> >>> --- a/xen/arch/arm/irq.c
> >>> +++ b/xen/arch/arm/irq.c
> >>> @@ -116,6 +116,41 @@ static int init_local_irq_data(unsigned int cpu)
> >>>        return 0;
> >>>    }
> >>>
> >>> +/*
> >>> + * The first 32 interrupts (PPIs and SGIs) are per-CPU,
> >>> + * so call this function on the target CPU to restore them.
> >>> + *
> >>> + * SPIs are restored via gic_resume.
> >>> + */
> >>> +static void restore_local_irqs_on_resume(void)
> >>> +{
> >>> +    int irq;
> >>
> >> NIT: Please, use "unsigned int" if irq cannot be negative
> >
> > ok
> >
> >>
> >>> +
> >>> +    spin_lock(&local_irqs_type_lock);
> >>> +
> >>> +    for ( irq = 0; irq < NR_LOCAL_IRQS; irq++ )
> >>> +    {
> >>> +        struct irq_desc *desc = irq_to_desc(irq);
> >>> +
> >>> +        spin_lock(&desc->lock);
> >>> +
> >>> +        if ( test_bit(_IRQ_DISABLED, &desc->status) )
> >>> +        {
> >>> +            spin_unlock(&desc->lock);
> >>> +            continue;
> >>> +        }
> >>> +
> >>> +        /* Disable the IRQ to avoid assertions in the following calls */
> >>> +        desc->handler->disable(desc);
> >>> +        gic_route_irq_to_xen(desc, GIC_PRI_IRQ);
> >>
> >> Shouldn't we use GIC_PRI_IPI for SGIs?
> >
> > Yes, we should. But currently I am restoring the same value
> > as it was before suspend...
> >
> > I definitely agree that this needs to be fixed at the original
> > place where the issue was introduced, but I was planning to
> > address it in a future patch.
> >
> >>
> >>
> >>> +        desc->handler->startup(desc);
> >>> +
> >>> +        spin_unlock(&desc->lock);
> >>> +    }
> >>> +
> >>> +    spin_unlock(&local_irqs_type_lock);
> >>> +}
> >>> +
> >>>    static int cpu_callback(struct notifier_block *nfb, unsigned long action,
> >>>                            void *hcpu)
> >>>    {
> >>> @@ -134,6 +169,10 @@ static int cpu_callback(struct notifier_block *nfb, unsigned long action,
> >>>                printk(XENLOG_ERR "Unable to allocate local IRQ for CPU%u\n",
> >>>                       cpu);
> >>>            break;
> >>> +    case CPU_STARTING:
> >>> +        if ( system_state == SYS_STATE_resume )
> >>> +            restore_local_irqs_on_resume();
> >>> +        break;
> >>
> >> May I please ask, why all this new code (i.e.
> >> restore_local_irqs_on_resume()) is not covered by #ifdef
> >> CONFIG_SYSTEM_SUSPEND?
> >
> > I don’t see a reason to introduce such "macaron-style" code. On ARM, the
> > system suspend state is only set when CONFIG_SYSTEM_SUSPEND is defined
> > anyway.
>
> right
>
> >
> > If you would prefer me to wrap all relevant code with this define, please
> > let me know and I’ll make the change. In this case, I think the current
> > approach is cleaner, but I’m open to your opinion.
>
> In other patches, you seem to wrap functions/code that only get called
> during suspend/resume with #ifdef CONFIG_SYSTEM_SUSPEND, so I wondered
> why restore_local_irqs_on_resume() could not be compiled out
> if the feature is not enabled. But if you still think it would be
> cleaner this way (w/o #ifdef), I would be ok.

It’s not entirely true -- I only wrapped code that has a direct dependency
on host_system_suspend(), either being called from it or required for its
correct operation.

If you look through this patch series for the pattern:
SYS_STATE_(suspend|resume)

you’ll see that not all suspend/resume-related code is wrapped in
#ifdef CONFIG_SYSTEM_SUSPEND. This is intentional -- the same applies to
some code already merged into the common parts of Xen.

So restore_local_irqs_on_resume is consistent with the existing approach
in all cpu notifier blocks.

>
> >
> >>
> >>>        }
> >>>
> >>>        return notifier_from_errno(rc);
> >>
> >
> > Best regards,
> > Mykola
>
Re: [PATCH v6 06/13] xen/arm: irq: Restore state of local IRQs during system resume
Posted by Mykola Kvach 1 month, 4 weeks ago
On Tue, Sep 2, 2025 at 11:08 PM Mykola Kvach <xakep.amatop@gmail.com> wrote:
>
> On Tue, Sep 2, 2025 at 9:16 PM Oleksandr Tyshchenko <olekstysh@gmail.com> wrote:
> >
> >
> >
> > On 02.09.25 20:43, Mykola Kvach wrote:
> > > Hi Oleksandr,
> >
> > Hello Mykola
> >
> > >
> > > On Tue, Sep 2, 2025 at 7:49 PM Oleksandr Tyshchenko <olekstysh@gmail.com> wrote:
> > >>
> > >>
> > >>
> > >> On 02.09.25 01:10, Mykola Kvach wrote:
> > >>
> > >> Hello Mykola
> > >>
> > >>> From: Mykola Kvach <mykola_kvach@epam.com>
> > >>>
> > >>> On ARM, the first 32 interrupts (SGIs and PPIs) are banked per-CPU
> > >>> and not restored by gic_resume (for secondary cpus).
> > >>>
> > >>> This patch introduces restore_local_irqs_on_resume, a function that
> > >>> restores the state of local interrupts on the target CPU during
> > >>> system resume.
> > >>>
> > >>> It iterates over all local IRQs and re-enables those that were not
> > >>> disabled, reprogramming their routing and affinity accordingly.
> > >>>
> > >>> The function is invoked from start_secondary, ensuring that local IRQ
> > >>> state is restored early during CPU bring-up after suspend.
> > >>>
> > >>> Signed-off-by: Mykola Kvach <mykola_kvach@epam.com>
> > >>> ---
> > >>> Changes in V6:
> > >>> - Call handler->disable() instead of just setting the _IRQ_DISABLED flag
> > >>> - Move the system state check outside of restore_local_irqs_on_resume()
> > >>> ---
> > >>>    xen/arch/arm/irq.c | 39 +++++++++++++++++++++++++++++++++++++++
> > >>>    1 file changed, 39 insertions(+)
> > >>>
> > >>> diff --git a/xen/arch/arm/irq.c b/xen/arch/arm/irq.c
> > >>> index 6c899347ca..ddd2940554 100644
> > >>> --- a/xen/arch/arm/irq.c
> > >>> +++ b/xen/arch/arm/irq.c
> > >>> @@ -116,6 +116,41 @@ static int init_local_irq_data(unsigned int cpu)
> > >>>        return 0;
> > >>>    }
> > >>>
> > >>> +/*
> > >>> + * The first 32 interrupts (PPIs and SGIs) are per-CPU,
> > >>> + * so call this function on the target CPU to restore them.
> > >>> + *
> > >>> + * SPIs are restored via gic_resume.
> > >>> + */
> > >>> +static void restore_local_irqs_on_resume(void)
> > >>> +{
> > >>> +    int irq;
> > >>
> > >> NIT: Please, use "unsigned int" if irq cannot be negative
> > >
> > > ok
> > >
> > >>
> > >>> +
> > >>> +    spin_lock(&local_irqs_type_lock);
> > >>> +
> > >>> +    for ( irq = 0; irq < NR_LOCAL_IRQS; irq++ )
> > >>> +    {
> > >>> +        struct irq_desc *desc = irq_to_desc(irq);
> > >>> +
> > >>> +        spin_lock(&desc->lock);
> > >>> +
> > >>> +        if ( test_bit(_IRQ_DISABLED, &desc->status) )
> > >>> +        {
> > >>> +            spin_unlock(&desc->lock);
> > >>> +            continue;
> > >>> +        }
> > >>> +
> > >>> +        /* Disable the IRQ to avoid assertions in the following calls */
> > >>> +        desc->handler->disable(desc);
> > >>> +        gic_route_irq_to_xen(desc, GIC_PRI_IRQ);
> > >>
> > >> Shouldn't we use GIC_PRI_IPI for SGIs?
> > >
> > > Yes, we should. But currently I am restoring the same value
> > > as it was before suspend...
> > >
> > > I definitely agree that this needs to be fixed at the original
> > > place where the issue was introduced, but I was planning to
> > > address it in a future patch.
> > >
> > >>
> > >>
> > >>> +        desc->handler->startup(desc);
> > >>> +
> > >>> +        spin_unlock(&desc->lock);
> > >>> +    }
> > >>> +
> > >>> +    spin_unlock(&local_irqs_type_lock);
> > >>> +}
> > >>> +
> > >>>    static int cpu_callback(struct notifier_block *nfb, unsigned long action,
> > >>>                            void *hcpu)
> > >>>    {
> > >>> @@ -134,6 +169,10 @@ static int cpu_callback(struct notifier_block *nfb, unsigned long action,
> > >>>                printk(XENLOG_ERR "Unable to allocate local IRQ for CPU%u\n",
> > >>>                       cpu);
> > >>>            break;
> > >>> +    case CPU_STARTING:
> > >>> +        if ( system_state == SYS_STATE_resume )
> > >>> +            restore_local_irqs_on_resume();
> > >>> +        break;
> > >>
> > >> May I please ask, why all this new code (i.e.
> > >> restore_local_irqs_on_resume()) is not covered by #ifdef
> > >> CONFIG_SYSTEM_SUSPEND?
> > >
> > > I don’t see a reason to introduce such "macaron-style" code. On ARM, the
> > > system suspend state is only set when CONFIG_SYSTEM_SUSPEND is defined
> > > anyway.
> >
> > right
> >
> > >
> > > If you would prefer me to wrap all relevant code with this define, please
> > > let me know and I’ll make the change. In this case, I think the current
> > > approach is cleaner, but I’m open to your opinion.
> >
> > In other patches, you seem to wrap functions/code that only get called
> > during suspend/resume with #ifdef CONFIG_SYSTEM_SUSPEND, so I wondered
> > why restore_local_irqs_on_resume() could not be compiled out
> > if the feature is not enabled. But if you still think it would be
> > cleaner this way (w/o #ifdef), I would be ok.
>
> It’s not entirely true -- I only wrapped code that has a direct dependency
> on host_system_suspend(), either being called from it or required for its
> correct operation.
>
> If you look through this patch series for the pattern:
> SYS_STATE_(suspend|resume)
>
> you’ll see that not all suspend/resume-related code is wrapped in
> #ifdef CONFIG_SYSTEM_SUSPEND. This is intentional -- the same applies to
> some code already merged into the common parts of Xen.
>
> So restore_local_irqs_on_resume is consistent with the existing approach
> in all cpu notifier blocks.

Of course, I can wrap all code in this patch series if needed. For me, the
current approach looks clearer and aligns with existing code. On the other
hand, I introduced this config option not so long ago, so that may be why
some parts in common code and even in some architectures like x86 are still
uncovered.

In any case, I don't mind covering all the code if you think it would be better.
Right now, this implementation is mainly my preference and aligns with the
existing code. There isn't any other reasoning behind this decision.


>
> >
> > >
> > >>
> > >>>        }
> > >>>
> > >>>        return notifier_from_errno(rc);
> > >>
> > >
> > > Best regards,
> > > Mykola
> >