kernel/irq/chip.c | 3 ++- kernel/irq/cpuhotplug.c | 4 +++- kernel/irq/manage.c | 29 ++++++++++++++++------------- kernel/irq/pm.c | 2 +- 4 files changed, 22 insertions(+), 16 deletions(-)
The IRQCHIP_ENABLE_WAKEUP_ON_SUSPEND flag doesn't work properly if the
IRQ disable depth is not 0 or 1 at suspend time. In this case, the
IRQ's depth will be decremented but the IRQ won't be enabled to wake
the system up. Add a special case for when we're suspending and always
enable the IRQ in that case.
A few notes:
* As part of this, irq_startup() can no longer force the depth to 0.
Change irq_startup() to decrement by 1 and make all other callers of
irq_startup() initialize the depth to 1 to keep them working the
same.
* In order to avoid turning __enable_irq() into spaghetti code for the
special case, swap from a "switch" statement to a handful of
"if/else". Nicely, this eliminates an old "goto".
Fixes: 90428a8eb494 ("genirq/PM: Introduce IRQCHIP_ENABLE_WAKEUP_ON_SUSPEND flag")
Signed-off-by: Douglas Anderson <dianders@chromium.org>
---
I fully realize there are rough edges to this patch. Specifically,
it's a bit ugly to init the depth to 1 in all the callers. Things
could also get dicey of any code tries to enable/disable the IRQ
_after_ we've set the "IRQD_IRQ_ENABLED_ON_SUSPEND", though I hope
that doesn't happen.
If you hate this patch, feel free to simply treat it as a bug report.
I'm happy if someone wants to write an alternative patch or I can make
whatever cleanups are needed and send a v2/v3/etc. Let me know!
This patch has only been lightly tested.
kernel/irq/chip.c | 3 ++-
kernel/irq/cpuhotplug.c | 4 +++-
kernel/irq/manage.c | 29 ++++++++++++++++-------------
kernel/irq/pm.c | 2 +-
4 files changed, 22 insertions(+), 16 deletions(-)
diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c
index 36cf1b09cc84..24a5958c09e4 100644
--- a/kernel/irq/chip.c
+++ b/kernel/irq/chip.c
@@ -272,7 +272,7 @@ int irq_startup(struct irq_desc *desc, bool resend, bool force)
const struct cpumask *aff = irq_data_get_affinity_mask(d);
int ret = 0;
- desc->depth = 0;
+ desc->depth--;
if (irqd_is_started(d)) {
irq_enable(desc);
@@ -313,6 +313,7 @@ int irq_activate_and_startup(struct irq_desc *desc, bool resend)
{
if (WARN_ON(irq_activate(desc)))
return 0;
+ desc->depth = 1;
return irq_startup(desc, resend, IRQ_START_FORCE);
}
diff --git a/kernel/irq/cpuhotplug.c b/kernel/irq/cpuhotplug.c
index 15a7654eff68..8d98a2961e91 100644
--- a/kernel/irq/cpuhotplug.c
+++ b/kernel/irq/cpuhotplug.c
@@ -218,8 +218,10 @@ static void irq_restore_affinity_of_irq(struct irq_desc *desc, unsigned int cpu)
if (desc->istate & IRQS_SUSPENDED)
return;
- if (irqd_is_managed_and_shutdown(data))
+ if (irqd_is_managed_and_shutdown(data)) {
+ desc->depth = 1;
irq_startup(desc, IRQ_RESEND, IRQ_START_COND);
+ }
/*
* If the interrupt can only be directed to a single target
diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c
index 753eef8e041c..a8c9b645fe49 100644
--- a/kernel/irq/manage.c
+++ b/kernel/irq/manage.c
@@ -688,7 +688,14 @@ EXPORT_SYMBOL_GPL(irq_set_vcpu_affinity);
void __disable_irq(struct irq_desc *desc)
{
- if (!desc->depth++)
+ struct irq_data *irqd = &desc->irq_data;
+
+ /*
+ * Always increase the disable depth; actually do the disable if
+ * the depth was 0 _or_ we temporarily enabled during the suspend
+ * path.
+ */
+ if (!desc->depth++ || irqd_is_enabled_on_suspend(irqd))
irq_disable(desc);
}
@@ -786,15 +793,12 @@ void disable_nmi_nosync(unsigned int irq)
void __enable_irq(struct irq_desc *desc)
{
- switch (desc->depth) {
- case 0:
- err_out:
+ struct irq_data *irqd = &desc->irq_data;
+
+ if (desc->depth == 0 || desc->istate & IRQS_SUSPENDED) {
WARN(1, KERN_WARNING "Unbalanced enable for IRQ %d\n",
irq_desc_get_irq(desc));
- break;
- case 1: {
- if (desc->istate & IRQS_SUSPENDED)
- goto err_out;
+ } else if (desc->depth == 1 || irqd_is_enabled_on_suspend(irqd)) {
/* Prevent probing on this irq: */
irq_settings_set_noprobe(desc);
/*
@@ -809,9 +813,7 @@ void __enable_irq(struct irq_desc *desc)
* invoke irq_enable() under the hood.
*/
irq_startup(desc, IRQ_RESEND, IRQ_START_FORCE);
- break;
- }
- default:
+ } else {
desc->depth--;
}
}
@@ -1774,6 +1776,9 @@ __setup_irq(unsigned int irq, struct irq_desc *desc, struct irqaction *new)
irqd_set(&desc->irq_data, IRQD_NO_BALANCING);
}
+ /* Undo nested disables: */
+ desc->depth = 1;
+
if (!(new->flags & IRQF_NO_AUTOEN) &&
irq_settings_can_autoenable(desc)) {
irq_startup(desc, IRQ_RESEND, IRQ_START_COND);
@@ -1785,8 +1790,6 @@ __setup_irq(unsigned int irq, struct irq_desc *desc, struct irqaction *new)
* interrupts forever.
*/
WARN_ON_ONCE(new->flags & IRQF_SHARED);
- /* Undo nested disables: */
- desc->depth = 1;
}
} else if (new->flags & IRQF_TRIGGER_MASK) {
diff --git a/kernel/irq/pm.c b/kernel/irq/pm.c
index c556bc49d213..b151a94f6233 100644
--- a/kernel/irq/pm.c
+++ b/kernel/irq/pm.c
@@ -86,8 +86,8 @@ static bool suspend_device_irq(struct irq_desc *desc)
* Enable interrupt here to unmask/enable in irqchip
* to be able to resume with such interrupts.
*/
- __enable_irq(desc);
irqd_set(irqd, IRQD_IRQ_ENABLED_ON_SUSPEND);
+ __enable_irq(desc);
}
/*
* We return true here to force the caller to issue
--
2.49.0.1045.g170613ef41-goog
On Mon, May 12, 2025 at 05:32:52PM -0700, Doug Anderson wrote:
> --- a/kernel/irq/chip.c
> +++ b/kernel/irq/chip.c
> @@ -272,7 +272,7 @@ int irq_startup(struct irq_desc *desc, bool resend, bool force)
> const struct cpumask *aff = irq_data_get_affinity_mask(d);
> int ret = 0;
>
> - desc->depth = 0;
> + desc->depth--;
I'm certainly no expert here, but I'm treading on the same code and ran
into a problem with this line too. It seems wise to make this a
decrement, and not an unconditional 0. But I'm not sure that we should
then proceed to unmask an interrupt that might have depth==1 in the
general case. I think we should defer the unmask until we actually reach
0.
In fact, that's one aspect of the very problem I'm dealing with here:
Subject: [PATCH 0/2] genirq: Retain disable-depth across irq_{shutdown,startup}()
https://lore.kernel.org/all/20250513224402.864767-1-briannorris@chromium.org/
It seems to me (again, not an expert) that maybe you need to solve your
problems by dodging the disable-depth entirely. But I'm not sure the
best way to do that.
>
> if (irqd_is_started(d)) {
> irq_enable(desc);
Brian
Hi,
On Tue, May 13, 2025 at 4:02 PM Brian Norris <briannorris@chromium.org> wrote:
>
> On Mon, May 12, 2025 at 05:32:52PM -0700, Doug Anderson wrote:
> > --- a/kernel/irq/chip.c
> > +++ b/kernel/irq/chip.c
> > @@ -272,7 +272,7 @@ int irq_startup(struct irq_desc *desc, bool resend, bool force)
> > const struct cpumask *aff = irq_data_get_affinity_mask(d);
> > int ret = 0;
> >
> > - desc->depth = 0;
> > + desc->depth--;
>
> I'm certainly no expert here, but I'm treading on the same code and ran
> into a problem with this line too. It seems wise to make this a
> decrement, and not an unconditional 0. But I'm not sure that we should
> then proceed to unmask an interrupt that might have depth==1 in the
> general case. I think we should defer the unmask until we actually reach
> 0.
>
> In fact, that's one aspect of the very problem I'm dealing with here:
>
> Subject: [PATCH 0/2] genirq: Retain disable-depth across irq_{shutdown,startup}()
> https://lore.kernel.org/all/20250513224402.864767-1-briannorris@chromium.org/
>
> It seems to me (again, not an expert) that maybe you need to solve your
> problems by dodging the disable-depth entirely. But I'm not sure the
> best way to do that.
I can give a shot at spinning the patch, but before doing so I'd love
to get agreement that this problem is worth solving. As I said above,
we're not actually hitting this in any real cases and the issue was
just found during code review. To me it feels like it's a real
(potential) bug and worth solving before it bites someone in the
future, but I won't force the issue and I'll drop the patch if that's
what everyone wants.
If it's agreed that I should move forward, I'd love advice on which
approach I should use. Should I do as Brian says and try to sidestep
disable-depth entirely in this case? I could factor out the "case 1"
case of __enable_irq() and call it directly and then make sure that
all I do is count "depth" while `IRQD_IRQ_ENABLED_ON_SUSPEND` is set.
That doesn't seem like it would be too ugly...
-Doug
On Tue, Jun 10 2025 at 09:43, Doug Anderson wrote:
> On Tue, May 13, 2025 at 4:02 PM Brian Norris <briannorris@chromium.org> wrote:
>> It seems to me (again, not an expert) that maybe you need to solve your
>> problems by dodging the disable-depth entirely. But I'm not sure the
>> best way to do that.
>
> I can give a shot at spinning the patch, but before doing so I'd love
> to get agreement that this problem is worth solving. As I said above,
> we're not actually hitting this in any real cases and the issue was
> just found during code review. To me it feels like it's a real
> (potential) bug and worth solving before it bites someone in the
> future, but I won't force the issue and I'll drop the patch if that's
> what everyone wants.
I don't have a strong opinion either way.
> If it's agreed that I should move forward, I'd love advice on which
> approach I should use. Should I do as Brian says and try to sidestep
> disable-depth entirely in this case? I could factor out the "case 1"
> case of __enable_irq() and call it directly and then make sure that
> all I do is count "depth" while `IRQD_IRQ_ENABLED_ON_SUSPEND` is set.
> That doesn't seem like it would be too ugly...
No. That's creating inconstent state. It's already ugly enough. So if we
go there then we make it explicit like we did for the managed case,
i.e. something like this
void enable_wakeup_irq(struct irq_desc *desc)
{
irqd_set(irqd, IRQD_IRQ_ENABLED_ON_SUSPEND);
desc->saved_depth = desc->depth;
desc->depth = 1;
__enable_irq(desc);
}
and then have a counterpart which disables and restores the state.
Thanks,
tglx
Hey Doug, On Tue, 13 May 2025 01:32:52 +0100, Douglas Anderson <dianders@chromium.org> wrote: > > The IRQCHIP_ENABLE_WAKEUP_ON_SUSPEND flag doesn't work properly if the > IRQ disable depth is not 0 or 1 at suspend time. In this case, the > IRQ's depth will be decremented but the IRQ won't be enabled to wake > the system up. Add a special case for when we're suspending and always > enable the IRQ in that case. For my own edification, can you explain why you end-up in this situation? Because I think I can see a use case for the current behaviour, where a driver controls whether it wants to use this interrupt as a wake-up source or not dynamically, irrespective of the underlying chip's capabilities (which I suspect is pinctrl-msm). This is also consistent with the way disable_irq() nests, making IRQCHIP_ENABLE_WAKEUP_ON_SUSPEND the equivalent of an enable_irq() call. I have the feeling this outlines an issue in the endpoint driver, more than a core code deficiency. Thanks, M. -- Without deviation from the norm, progress is not possible.
Hi, On Tue, May 13, 2025 at 12:49 AM Marc Zyngier <maz@kernel.org> wrote: > > Hey Doug, > > On Tue, 13 May 2025 01:32:52 +0100, > Douglas Anderson <dianders@chromium.org> wrote: > > > > The IRQCHIP_ENABLE_WAKEUP_ON_SUSPEND flag doesn't work properly if the > > IRQ disable depth is not 0 or 1 at suspend time. In this case, the > > IRQ's depth will be decremented but the IRQ won't be enabled to wake > > the system up. Add a special case for when we're suspending and always > > enable the IRQ in that case. > > For my own edification, can you explain why you end-up in this > situation? Dang, I meant to put it in the commit message. ...but this patch doesn't fix any problems we're experiencing. The issue was found by code inspection during a code review. > Because I think I can see a use case for the current > behaviour, where a driver controls whether it wants to use this > interrupt as a wake-up source or not dynamically, irrespective of the > underlying chip's capabilities (which I suspect is pinctrl-msm). If a driver wants to dynamically choose whether it's a wake-up source, wouldn't it just call disable_irq_wake() and enable_irq_wake()? > This is also consistent with the way disable_irq() nests, making > IRQCHIP_ENABLE_WAKEUP_ON_SUSPEND the equivalent of an enable_irq() > call. To me it feels like we need to go back to what we were trying to accomplish when we added IRQCHIP_ENABLE_WAKEUP_ON_SUSPEND. The main issue we were trying to solve was to make it so that an interrupt that was masked by disable_irq() could still wake the system. IMO an interrupt that has been masked _twice_ by disable_irq() is logically in the same state and should still be able to wake the system up. I think we'd also have to compare this to how it would work on systems where there is a different bit for enabling an interrupt and enabling wakeup. Those are the kinds of controllers that _don't_ need IRQCHIP_ENABLE_WAKEUP_ON_SUSPEND. In those controllers, we're not going to look at how many disable_irq() refcounts we have when we decide to set the bit enabling the wakeup... -Doug
© 2016 - 2026 Red Hat, Inc.