[PATCH v1] gpio: pca953x: fix IRQ storm on system wake up

Francesco Dolcini posted 1 patch 8 months, 3 weeks ago
There is a newer version of this series
drivers/gpio/gpio-pca953x.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
[PATCH v1] gpio: pca953x: fix IRQ storm on system wake up
Posted by Francesco Dolcini 8 months, 3 weeks ago
From: Emanuele Ghidoli <emanuele.ghidoli@toradex.com>

If an input changes state during wake-up and is used as an interrupt
source, the IRQ handler reads the volatile input register to clear the
interrupt mask and deassert the IRQ line. However, the IRQ handler is
triggered before access to the register is granted, causing the read
operation to fail.

As a result, the IRQ handler enters a loop, repeatedly printing the
"failed reading register" message, until `pca953x_resume` is eventually
called, which restores the driver context and enables access to
registers.

Fix by using DEFINE_NOIRQ_DEV_PM_OPS which ensures that `pca953x_resume`
is called before the IRQ handler is called.

Fixes: b76574300504 ("gpio: pca953x: Restore registers after suspend/resume cycle")
Cc: stable@vger.kernel.org
Signed-off-by: Emanuele Ghidoli <emanuele.ghidoli@toradex.com>
Signed-off-by: Francesco Dolcini <francesco.dolcini@toradex.com>
---
 drivers/gpio/gpio-pca953x.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpio/gpio-pca953x.c b/drivers/gpio/gpio-pca953x.c
index d63c1030e6ac..d39bdc125cfc 100644
--- a/drivers/gpio/gpio-pca953x.c
+++ b/drivers/gpio/gpio-pca953x.c
@@ -1252,7 +1252,7 @@ static int pca953x_resume(struct device *dev)
 	return ret;
 }
 
-static DEFINE_SIMPLE_DEV_PM_OPS(pca953x_pm_ops, pca953x_suspend, pca953x_resume);
+static DEFINE_NOIRQ_DEV_PM_OPS(pca953x_pm_ops, pca953x_suspend, pca953x_resume);
 
 /* convenience to stop overlong match-table lines */
 #define OF_653X(__nrgpio, __int) ((void *)(__nrgpio | PCAL653X_TYPE | __int))
-- 
2.39.5
Re: [PATCH v1] gpio: pca953x: fix IRQ storm on system wake up
Posted by Bartosz Golaszewski 8 months, 2 weeks ago
From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>


On Wed, 26 Mar 2025 18:38:38 +0100, Francesco Dolcini wrote:
> If an input changes state during wake-up and is used as an interrupt
> source, the IRQ handler reads the volatile input register to clear the
> interrupt mask and deassert the IRQ line. However, the IRQ handler is
> triggered before access to the register is granted, causing the read
> operation to fail.
> 
> As a result, the IRQ handler enters a loop, repeatedly printing the
> "failed reading register" message, until `pca953x_resume` is eventually
> called, which restores the driver context and enables access to
> registers.
> 
> [...]

Applied, thanks!

[1/1] gpio: pca953x: fix IRQ storm on system wake up
      commit: 23334dfbeec89bf79f2ab893034b50612d039594

Best regards,
-- 
Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Re: [PATCH v1] gpio: pca953x: fix IRQ storm on system wake up
Posted by Andy Shevchenko 8 months, 2 weeks ago
+Cc: Geert

On Thu, Apr 03, 2025 at 02:07:05PM +0200, Bartosz Golaszewski wrote:
> On Wed, 26 Mar 2025 18:38:38 +0100, Francesco Dolcini wrote:

> > If an input changes state during wake-up and is used as an interrupt
> > source, the IRQ handler reads the volatile input register to clear the
> > interrupt mask and deassert the IRQ line. However, the IRQ handler is
> > triggered before access to the register is granted, causing the read
> > operation to fail.
> > 
> > As a result, the IRQ handler enters a loop, repeatedly printing the
> > "failed reading register" message, until `pca953x_resume` is eventually
> > called, which restores the driver context and enables access to
> > registers.

[...]

> Applied, thanks!

Won't this regress as it happens the last time [1]?

[1]: https://lore.kernel.org/linux-gpio/CAMuHMdVnKX23yi7ir1LVxfXAMeeWMFzM+cdgSSTNjpn1OnC2xw@mail.gmail.com/

-- 
With Best Regards,
Andy Shevchenko
Re: [PATCH v1] gpio: pca953x: fix IRQ storm on system wake up
Posted by Bartosz Golaszewski 8 months, 2 weeks ago
On Thu, Apr 3, 2025 at 3:54 PM Andy Shevchenko
<andriy.shevchenko@intel.com> wrote:
>
> +Cc: Geert
>
> On Thu, Apr 03, 2025 at 02:07:05PM +0200, Bartosz Golaszewski wrote:
> > On Wed, 26 Mar 2025 18:38:38 +0100, Francesco Dolcini wrote:
>
> > > If an input changes state during wake-up and is used as an interrupt
> > > source, the IRQ handler reads the volatile input register to clear the
> > > interrupt mask and deassert the IRQ line. However, the IRQ handler is
> > > triggered before access to the register is granted, causing the read
> > > operation to fail.
> > >
> > > As a result, the IRQ handler enters a loop, repeatedly printing the
> > > "failed reading register" message, until `pca953x_resume` is eventually
> > > called, which restores the driver context and enables access to
> > > registers.
>
> [...]
>
> > Applied, thanks!
>
> Won't this regress as it happens the last time [1]?
>
> [1]: https://lore.kernel.org/linux-gpio/CAMuHMdVnKX23yi7ir1LVxfXAMeeWMFzM+cdgSSTNjpn1OnC2xw@mail.gmail.com/
>

Ah, good catch. I'm wondering what the right fix here is but don't
really have any ideas at the moment. Any hints are appreciated.

For now, I'm dropping it.

Bart
Re: [PATCH v1] gpio: pca953x: fix IRQ storm on system wake up
Posted by Emanuele Ghidoli 8 months, 2 weeks ago
On 03/04/2025 15:56, Bartosz Golaszewski wrote:
> On Thu, Apr 3, 2025 at 3:54 PM Andy Shevchenko
> <andriy.shevchenko@intel.com> wrote:
>>
>> +Cc: Geert
>>
>> On Thu, Apr 03, 2025 at 02:07:05PM +0200, Bartosz Golaszewski wrote:
>>> On Wed, 26 Mar 2025 18:38:38 +0100, Francesco Dolcini wrote:
>>
>>>> If an input changes state during wake-up and is used as an interrupt
>>>> source, the IRQ handler reads the volatile input register to clear the
>>>> interrupt mask and deassert the IRQ line. However, the IRQ handler is
>>>> triggered before access to the register is granted, causing the read
>>>> operation to fail.
>>>>
>>>> As a result, the IRQ handler enters a loop, repeatedly printing the
>>>> "failed reading register" message, until `pca953x_resume` is eventually
>>>> called, which restores the driver context and enables access to
>>>> registers.
>>
>> [...]
>>
>>> Applied, thanks!
>>
>> Won't this regress as it happens the last time [1]?
>>
>> [1]: https://lore.kernel.org/linux-gpio/CAMuHMdVnKX23yi7ir1LVxfXAMeeWMFzM+cdgSSTNjpn1OnC2xw@mail.gmail.com/
>>
> 
> Ah, good catch. I'm wondering what the right fix here is but don't
> really have any ideas at the moment. Any hints are appreciated.
> 
> For now, I'm dropping it.
> 
> Bart

I’ve found another possible solution: disable the PCA953x IRQ in
pca953x_suspend() and re-enable it in pca953x_resume().
This would prevent the ISR from being triggered while the regmap is in
cache-only mode.
The wake-up capability is preserved, since an IRQ can still wake the system
even when disabled with disable_irq(), as long as it has wake enabled.

This should avoid introducing regressions and still handle Geert’s use case
properly.

Andy, Bart, Geert - what do you think?
Re: [PATCH v1] gpio: pca953x: fix IRQ storm on system wake up
Posted by Andy Shevchenko 8 months, 2 weeks ago
On Mon, Apr 07, 2025 at 05:11:14PM +0200, Emanuele Ghidoli wrote:
> On 03/04/2025 15:56, Bartosz Golaszewski wrote:
> > On Thu, Apr 3, 2025 at 3:54 PM Andy Shevchenko
> > <andriy.shevchenko@intel.com> wrote:
> >>
> >> +Cc: Geert
> >>
> >> On Thu, Apr 03, 2025 at 02:07:05PM +0200, Bartosz Golaszewski wrote:
> >>> On Wed, 26 Mar 2025 18:38:38 +0100, Francesco Dolcini wrote:
> >>
> >>>> If an input changes state during wake-up and is used as an interrupt
> >>>> source, the IRQ handler reads the volatile input register to clear the
> >>>> interrupt mask and deassert the IRQ line. However, the IRQ handler is
> >>>> triggered before access to the register is granted, causing the read
> >>>> operation to fail.
> >>>>
> >>>> As a result, the IRQ handler enters a loop, repeatedly printing the
> >>>> "failed reading register" message, until `pca953x_resume` is eventually
> >>>> called, which restores the driver context and enables access to
> >>>> registers.

[...]

> >>> Applied, thanks!
> >>
> >> Won't this regress as it happens the last time [1]?
> >>
> >> [1]: https://lore.kernel.org/linux-gpio/CAMuHMdVnKX23yi7ir1LVxfXAMeeWMFzM+cdgSSTNjpn1OnC2xw@mail.gmail.com/
> >>
> > 
> > Ah, good catch. I'm wondering what the right fix here is but don't
> > really have any ideas at the moment. Any hints are appreciated.
> > 
> > For now, I'm dropping it.
> > 
> > Bart
> 
> I’ve found another possible solution: disable the PCA953x IRQ in
> pca953x_suspend() and re-enable it in pca953x_resume().
> This would prevent the ISR from being triggered while the regmap is in
> cache-only mode.
> The wake-up capability is preserved, since an IRQ can still wake the system
> even when disabled with disable_irq(), as long as it has wake enabled.

Can you enable IRQ debugfs and dump the state of the wake* nodes for the
respective interrupts? In this case we will be 100% sure it works as expected.

> This should avoid introducing regressions and still handle Geert’s use case
> properly.
> 
> Andy, Bart, Geert - what do you think?

Sounds okay, but please double check the above.

-- 
With Best Regards,
Andy Shevchenko


Re: [PATCH v1] gpio: pca953x: fix IRQ storm on system wake up
Posted by Emanuele Ghidoli 8 months, 1 week ago
On 07/04/2025 17:40, Andy Shevchenko wrote:
>> I’ve found another possible solution: disable the PCA953x IRQ in
>> pca953x_suspend() and re-enable it in pca953x_resume().
>> This would prevent the ISR from being triggered while the regmap is in
>> cache-only mode.
>> The wake-up capability is preserved, since an IRQ can still wake the system
>> even when disabled with disable_irq(), as long as it has wake enabled.
> 
> Can you enable IRQ debugfs and dump the state of the wake* nodes for the
> respective interrupts? In this case we will be 100% sure it works as expected.
> 
# cat /sys/kernel/debug/irq/irqs/124

handler:  handle_level_irq


device:   (null)


status:   0x00000508


            _IRQ_NOPROBE


istate:   0x00004020


            IRQS_ONESHOT


ddepth:   0


wdepth:   0


dstate:   0x02402208


            IRQ_TYPE_LEVEL_LOW


            IRQD_LEVEL


            IRQD_ACTIVATED


            IRQD_IRQ_STARTED


            IRQD_DEFAULT_TRIGGER_SET


node:     0


affinity: 0-5


effectiv:


domain:  :soc:gpio@47400000


 hwirq:   0xb


 chip:    gpio-vf610


  flags:   0xa04


             IRQCHIP_MASK_ON_SUSPEND


             IRQCHIP_ENABLE_WAKEUP_ON_SUSPEND


             IRQCHIP_IMMUTABLE


# cat /sys/kernel/debug/irq/irqs/209

handler:  handle_simple_irq


device:   (null)


status:   0x00008403


            _IRQ_NOPROBE


            _IRQ_NESTED_THREAD


istate:   0x00004000


ddepth:   0


wdepth:   0


dstate:   0x00400203


            IRQ_TYPE_EDGE_RISING


            IRQ_TYPE_EDGE_FALLING


            IRQD_ACTIVATED


            IRQD_IRQ_STARTED


node:     0


affinity: 0-5


effectiv:


domain:  :soc:bus@42000000:i2c@42540000:gpio-expander@29


 hwirq:   0x4


 chip:    3-0029


  flags:   0x800


             IRQCHIP_IMMUTABLE

And these just for confirmation (4 interrupt triggered by pushing the
SMARC_SLEEP# button):
# cat /proc/interrupts |grep 0029

124:          4          0          0          0          0          0
gpio-vf610  11 Level     3-0029

209:          0          4          0          0          0          0 3-0029
 4 Edge      SMARC_SLEEP#


# cat /sys/kernel/debug/wakeup_sources

name            active_count    event_count     wakeup_count    expire_count
 active_since    total_time      max_time        last_change
prevent_suspend_time
gpio-keys       4               4               0               0
 0               43              14              293116          0

>> This should avoid introducing regressions and still handle Geert’s use case
>> properly.
>>
>> Andy, Bart, Geert - what do you think?
> 
> Sounds okay, but please double check the above.
> 
It took me a while to realize that the relevant information is only available
when CONFIG_GENERIC_IRQ_DEBUGFS is enabled.
All /sys/kernel/irq/*/wakeup always reports "disabled", even if wakeup is
actually configured. I guess if this is the information you were asking for.

Regards,
Emanuele