[PATCH v3 4/7] gpio: gpiolib: fix allocation order in hierarchical IRQ domains

Oleksij Rempel posted 8 patches 1 month ago
Only 7 patches received!
There is a newer version of this series
[PATCH v3 4/7] gpio: gpiolib: fix allocation order in hierarchical IRQ domains
Posted by Oleksij Rempel 1 month ago
In gpiochip_hierarchy_irq_domain_alloc(), calling irq_domain_set_info()
before irq_domain_alloc_irqs_parent() causes a NULL pointer dereference
for slow-bus (SPI/I2C) IRQ chips.

irq_domain_set_info() locks the child descriptor, triggering .irq_bus_lock.
If the child proxies this lock to the parent, it crashes because
parent->chip is not yet allocated.

Fix this by allocating the parent IRQs first, ensuring parent->chip is
populated before the child's .irq_bus_lock is invoked.

Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
---
changes v3
- new patch
---
 drivers/gpio/gpiolib.c | 32 +++++++++++++++++---------------
 1 file changed, 17 insertions(+), 15 deletions(-)

diff --git a/drivers/gpio/gpiolib.c b/drivers/gpio/gpiolib.c
index ada572aaebd6..1ea9531934cc 100644
--- a/drivers/gpio/gpiolib.c
+++ b/drivers/gpio/gpiolib.c
@@ -1628,19 +1628,6 @@ static int gpiochip_hierarchy_irq_domain_alloc(struct irq_domain *d,
 	}
 	gpiochip_dbg(gc, "found parent hwirq %u\n", parent_hwirq);
 
-	/*
-	 * We set handle_bad_irq because the .set_type() should
-	 * always be invoked and set the right type of handler.
-	 */
-	irq_domain_set_info(d,
-			    irq,
-			    hwirq,
-			    gc->irq.chip,
-			    gc,
-			    girq->handler,
-			    NULL, NULL);
-	irq_set_probe(irq);
-
 	/* This parent only handles asserted level IRQs */
 	ret = girq->populate_parent_alloc_arg(gc, &gpio_parent_fwspec,
 					      parent_hwirq, parent_type);
@@ -1657,12 +1644,27 @@ static int gpiochip_hierarchy_irq_domain_alloc(struct irq_domain *d,
 	 */
 	if (irq_domain_is_msi(d->parent) && (ret == -EEXIST))
 		ret = 0;
-	if (ret)
+	if (ret) {
 		gpiochip_err(gc,
 			     "failed to allocate parent hwirq %d for hwirq %lu\n",
 			     parent_hwirq, hwirq);
+		return ret;
+	}
 
-	return ret;
+	/*
+	 * We set handle_bad_irq because the .set_type() should
+	 * always be invoked and set the right type of handler.
+	 */
+	irq_domain_set_info(d,
+			    irq,
+			    hwirq,
+			    gc->irq.chip,
+			    gc,
+			    girq->handler,
+			    NULL, NULL);
+	irq_set_probe(irq);
+
+	return 0;
 }
 
 static unsigned int gpiochip_child_offset_to_irq_noop(struct gpio_chip *gc,
-- 
2.47.3
Re: [PATCH v3 4/7] gpio: gpiolib: fix allocation order in hierarchical IRQ domains
Posted by Tommaso Merciai 3 weeks, 5 days ago
Hi Oleksij,
Thanks for your patch.

I'm working on DSI support for RZ/G3E.

Since this morning, after rebasing on top of next-20260312, I'm seeing
the following:


[   19.230966] [drm] Initialized rzg2l-du 1.0.0 for 16490000.display on minor 2
[   19.240377] rzg2l-du 16490000.display: [drm] Device 16490000.display probed
[   19.250504] irq 165, desc: 000000004f0a321f, depth: 0, count: 0, unhandled: 0
[   19.257630] ->handle_irq():  00000000a74f5df5, handle_bad_irq+0x0/0x25c
[   19.264223] ->irq_data.chip(): 0000000057261646, rzg2l_gpio_irqchip+0x0/0x118
[   19.271328] ->action(): 0000000027be85a3
[   19.275227] ->action->handler(): 00000000e5c70c61, irq_default_primary_handler+0x0/0x8
[   20.645894] ov5645 0-003c: ov5645_write_reg: write reg error -110: reg=300e, val=58
[   40.257787] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
[   40.263915] rcu:     (detected by 2, t=5253 jiffies, g=3325, q=241 ncpus=4)
[   40.270632] rcu: All QSes seen, last rcu_preempt kthread activity 5255 (4294902363-4294897108), jiffies_till_next_fqs=1, root ->qsmask 0x0
[   40.283054] rcu: rcu_preempt kthread timer wakeup didn't happen for 5257 jiffies! g3325 f0x2 RCU_GP_WAIT_FQS(5) ->state=0x200
[   40.294351] rcu:     Possible timer handling issue on cpu=0 timer-softirq=1342
[   40.301309] rcu: rcu_preempt kthread starved for 5262 jiffies! g3325 f0x2 RCU_GP_WAIT_FQS(5) ->state=0x200 ->cpu=0
[   40.311657] rcu:     Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior.
[   40.320771] rcu: RCU grace-period kthread stack dump:
[   40.325821] task:rcu_preempt     state:R stack:0     pid:15    tgid:15    ppid:2      task_flags:0x208040 flags:0x00000010
[   40.336886] Call trace:
[   40.339345]  __switch_to+0xec/0x1a8 (T)
[   40.343236]  __schedule+0x360/0xe18
[   40.346763]  schedule+0x34/0x110
[   40.350029]  schedule_timeout+0x84/0x100
[   40.353997]  rcu_gp_fqs_loop+0x114/0x420
[   40.357963]  rcu_gp_kthread+0x100/0x114
[   40.361843]  kthread+0x118/0x124
[   40.365122]  ret_from_fork+0x10/0x20
[   40.368740] rcu: Stack dump where RCU GP kthread last ran:
[   40.374223] Sending NMI from CPU 2 to CPUs 0:
[  113.405786] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
[  113.411908] rcu:     (detected by 3, t=23540 jiffies, g=3325, q=259 ncpus=4)
[  113.418711] rcu: All QSes seen, last rcu_preempt kthread activity 23542 (4294920650-4294897108), jiffies_till_next_fqs=1, root ->qsmask 0x0
[  113.431220] rcu: rcu_preempt kthread timer wakeup didn't happen for 23544 jiffies! g3325 f0x2 RCU_GP_WAIT_FQS(5) ->state=0x200
[  113.442606] rcu:     Possible timer handling issue on cpu=0 timer-softirq=1342
[  113.449564] rcu: rcu_preempt kthread starved for 23549 jiffies! g3325 f0x2 RCU_GP_WAIT_FQS(5) ->state=0x200 ->cpu=0
[  113.459998] rcu:     Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior.
[  113.469112] rcu: RCU grace-period kthread stack dump:
[  113.474163] task:rcu_preempt     state:R stack:0     pid:15    tgid:15    ppid:2      task_flags:0x208040 flags:0x00000010
[  113.485229] Call trace:
[  113.487688]  __switch_to+0xec/0x1a8 (T)
[  113.491581]  __schedule+0x360/0xe18
[  113.495109]  schedule+0x34/0x110
[  113.498374]  schedule_timeout+0x84/0x100
[  113.502342]  rcu_gp_fqs_loop+0x114/0x420
[  113.506308]  rcu_gp_kthread+0x100/0x114
[  113.510188]  kthread+0x118/0x124
[  113.513466]  ret_from_fork+0x10/0x20
[  113.517082] rcu: Stack dump where RCU GP kthread last ran:
[  113.522566] Sending NMI from CPU 3 to CPUs 0:
[  186.553784] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
[  186.559904] rcu:     (detected by 3, t=41827 jiffies, g=3325, q=268 ncpus=4)
[  186.566706] rcu: All QSes seen, last rcu_preempt kthread activity 41829 (4294938937-4294897108), jiffies_till_next_fqs=1, root ->qsmask 0x0
[  186.579213] rcu: rcu_preempt kthread timer wakeup didn't happen for 41831 jiffies! g3325 f0x2 RCU_GP_WAIT_FQS(5) ->state=0x200
[  186.590599] rcu:     Possible timer handling issue on cpu=0 timer-softirq=1342
[  186.597556] rcu: rcu_preempt kthread starved for 41836 jiffies! g3325 f0x2 RCU_GP_WAIT_FQS(5) ->state=0x200 ->cpu=0
[  186.607990] rcu:     Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior.
[  186.617105] rcu: RCU grace-period kthread stack dump:
[  186.622155] task:rcu_preempt     state:R stack:0     pid:15    tgid:15    ppid:2      task_flags:0x208040 flags:0x00000010
[  186.633219] Call trace:


I found out that the issue is related to the interrupt of the adv7535
bridge:

        adv7535: hdmi1@3d {
                compatible = "adi,adv7535";
                ...
                ...
                interrupts-extended = <&pinctrl RZG3E_GPIO(L, 4) IRQ_TYPE_EDGE_FALLING>;

RZ/G3E is using:
 - drivers/pinctrl/renesas/pinctrl-rzg2l.c

Reverting this patch fixes the issue.
(git revert a23463beb3d5)

I'm wondering if anyone else is seeing the same.
Thanks in advance.

Kind Regards,
Tommaso


On Mon, Mar 09, 2026 at 02:49:15PM +0100, Oleksij Rempel wrote:
> In gpiochip_hierarchy_irq_domain_alloc(), calling irq_domain_set_info()
> before irq_domain_alloc_irqs_parent() causes a NULL pointer dereference
> for slow-bus (SPI/I2C) IRQ chips.
> 
> irq_domain_set_info() locks the child descriptor, triggering .irq_bus_lock.
> If the child proxies this lock to the parent, it crashes because
> parent->chip is not yet allocated.
> 
> Fix this by allocating the parent IRQs first, ensuring parent->chip is
> populated before the child's .irq_bus_lock is invoked.
> 
> Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
> ---
> changes v3
> - new patch
> ---
>  drivers/gpio/gpiolib.c | 32 +++++++++++++++++---------------
>  1 file changed, 17 insertions(+), 15 deletions(-)
> 
> diff --git a/drivers/gpio/gpiolib.c b/drivers/gpio/gpiolib.c
> index ada572aaebd6..1ea9531934cc 100644
> --- a/drivers/gpio/gpiolib.c
> +++ b/drivers/gpio/gpiolib.c
> @@ -1628,19 +1628,6 @@ static int gpiochip_hierarchy_irq_domain_alloc(struct irq_domain *d,
>  	}
>  	gpiochip_dbg(gc, "found parent hwirq %u\n", parent_hwirq);
>  
> -	/*
> -	 * We set handle_bad_irq because the .set_type() should
> -	 * always be invoked and set the right type of handler.
> -	 */
> -	irq_domain_set_info(d,
> -			    irq,
> -			    hwirq,
> -			    gc->irq.chip,
> -			    gc,
> -			    girq->handler,
> -			    NULL, NULL);
> -	irq_set_probe(irq);
> -
>  	/* This parent only handles asserted level IRQs */
>  	ret = girq->populate_parent_alloc_arg(gc, &gpio_parent_fwspec,
>  					      parent_hwirq, parent_type);
> @@ -1657,12 +1644,27 @@ static int gpiochip_hierarchy_irq_domain_alloc(struct irq_domain *d,
>  	 */
>  	if (irq_domain_is_msi(d->parent) && (ret == -EEXIST))
>  		ret = 0;
> -	if (ret)
> +	if (ret) {
>  		gpiochip_err(gc,
>  			     "failed to allocate parent hwirq %d for hwirq %lu\n",
>  			     parent_hwirq, hwirq);
> +		return ret;
> +	}
>  
> -	return ret;
> +	/*
> +	 * We set handle_bad_irq because the .set_type() should
> +	 * always be invoked and set the right type of handler.
> +	 */
> +	irq_domain_set_info(d,
> +			    irq,
> +			    hwirq,
> +			    gc->irq.chip,
> +			    gc,
> +			    girq->handler,
> +			    NULL, NULL);
> +	irq_set_probe(irq);
> +
> +	return 0;
>  }
>  
>  static unsigned int gpiochip_child_offset_to_irq_noop(struct gpio_chip *gc,
> -- 
> 2.47.3
>
Re: [PATCH v3 4/7] gpio: gpiolib: fix allocation order in hierarchical IRQ domains
Posted by Oleksij Rempel 3 weeks, 5 days ago
Hi Tommaso,

On Fri, Mar 13, 2026 at 11:42:34AM +0100, Tommaso Merciai wrote:
> Hi Oleksij,
> Thanks for your patch.
> 
> I'm working on DSI support for RZ/G3E
> 
> Since this morning, after rebasing on top of next-20260312, I'm seeing
> the following:
> I found out that the issue is related to the interrupt of the adv7535
> bridge:
> 
>         adv7535: hdmi1@3d {
>                 compatible = "adi,adv7535";
>                 ...
>                 ...
>                 interrupts-extended = <&pinctrl RZG3E_GPIO(L, 4) IRQ_TYPE_EDGE_FALLING>;
> 
> RZ/G3E is using:
>  - drivers/pinctrl/renesas/pinctrl-rzg2l.c
> 
> Reverting this patch fixes the issue.
> (git revert a23463beb3d5)

Thank you for the feedback! If I understand the problem correctly, the
adv7535 is asserting its IRQ line early during probe, which creates an
irq storm due to a missing handler.

My patch moved irq_domain_set_info() after the parent allocation. When
the parent allocates the IRQ, the pending hardware interrupt fires
immediately. Because the child descriptor isn't fully configured yet, it
routes to handle_bad_irq. This fails to acknowledge the hardware
interrupt, locking up the CPU and causing the RCU stall.
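
To make the ordering constraints concrete, here is a toy userspace model
of the three orderings discussed in this thread. It is only a sketch, not
kernel code: every type and function name below is hypothetical, and it
encodes two assumptions taken from the descriptions above. First, taking
the bus lock can only proxy to the parent once the child chip is
installed. Second, a line that is already asserted fires as soon as the
parent level is allocated.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model types; none of these exist in the kernel. */
enum outcome { OUT_OK, OUT_NULL_DEREF, OUT_BAD_IRQ };
enum order {
	ORDER_SET_INFO_FIRST,      /* chip + handler, then parent (before the patch) */
	ORDER_PARENT_FIRST,        /* parent, then chip + handler (patch v3) */
	ORDER_HANDLER_PARENT_CHIP, /* handler, parent, then chip (split variant) */
};

struct model_irq {
	bool proxies_bus_lock; /* child .irq_bus_lock forwards to the parent chip */
	bool child_chip_set;   /* child chip installed on the descriptor */
	bool handler_set;      /* a real flow handler replaced handle_bad_irq */
	bool parent_ready;     /* parent level allocated, parent chip populated */
};

static enum outcome model_set_chip(struct model_irq *m)
{
	assert(m != NULL);
	m->child_chip_set = true;
	return OUT_OK;
}

/* Installing the handler takes the bus lock of whatever chip is installed. */
static enum outcome model_set_handler(struct model_irq *m)
{
	assert(m != NULL);
	if (m->child_chip_set && m->proxies_bus_lock && !m->parent_ready)
		return OUT_NULL_DEREF; /* proxy reaches a parent chip that is still NULL */
	m->handler_set = true;
	return OUT_OK;
}

/* Assumption: an already asserted line is delivered right after allocation. */
static enum outcome model_alloc_parent(struct model_irq *m, bool line_asserted)
{
	assert(m != NULL);
	m->parent_ready = true;
	if (line_asserted && !m->handler_set)
		return OUT_BAD_IRQ; /* delivered while the handler is still handle_bad_irq */
	return OUT_OK;
}

enum outcome run_order(enum order o, bool proxies_bus_lock, bool line_asserted)
{
	struct model_irq m = { .proxies_bus_lock = proxies_bus_lock };
	enum outcome r;

	switch (o) {
	case ORDER_SET_INFO_FIRST:
		model_set_chip(&m);
		r = model_set_handler(&m);
		if (r != OUT_OK)
			return r;
		return model_alloc_parent(&m, line_asserted);
	case ORDER_PARENT_FIRST:
		r = model_alloc_parent(&m, line_asserted);
		if (r != OUT_OK)
			return r;
		model_set_chip(&m);
		return model_set_handler(&m);
	case ORDER_HANDLER_PARENT_CHIP:
		r = model_set_handler(&m);
		if (r != OUT_OK)
			return r;
		r = model_alloc_parent(&m, line_asserted);
		if (r != OUT_OK)
			return r;
		return model_set_chip(&m);
	}
	return OUT_OK;
}
```

In this model, the first ordering fails only for chips that proxy the bus
lock, the second fails only when the line is already asserted, and the
third passes in both cases. Whether real hardware behaves like the model
is exactly what needs testing.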

I hope that splitting irq_domain_set_info() will fix the regression.
Can you please test whether this change resolves the RCU stalls on your setup:

diff --git a/drivers/gpio/gpiolib.c b/drivers/gpio/gpiolib.c
index 13dd97344b26..376daeddbbbb 100644
--- a/drivers/gpio/gpiolib.c
+++ b/drivers/gpio/gpiolib.c
@@ -1628,6 +1628,9 @@ static int gpiochip_hierarchy_irq_domain_alloc(struct irq_domain *d,
 	}
 	gpiochip_dbg(gc, "found parent hwirq %u\n", parent_hwirq);
 
+	irq_set_handler(irq, girq->handler);
+	irq_set_handler_data(irq, gc);
+
 	/* This parent only handles asserted level IRQs */
 	ret = girq->populate_parent_alloc_arg(gc, &gpio_parent_fwspec,
 					      parent_hwirq, parent_type);
@@ -1655,13 +1658,7 @@ static int gpiochip_hierarchy_irq_domain_alloc(struct irq_domain *d,
 	 * We set handle_bad_irq because the .set_type() should
 	 * always be invoked and set the right type of handler.
 	 */
-	irq_domain_set_info(d,
-			    irq,
-			    hwirq,
-			    gc->irq.chip,
-			    gc,
-			    girq->handler,
-			    NULL, NULL);
+	irq_domain_set_hwirq_and_chip(d, irq, hwirq, gc->irq.chip, gc);
 	irq_set_probe(irq);
 
 	return 0;
-- 
Pengutronix e.K.                           |                             |
Steuerwalder Str. 21                       | http://www.pengutronix.de/  |
31137 Hildesheim, Germany                  | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |
Re: [PATCH v3 4/7] gpio: gpiolib: fix allocation order in hierarchical IRQ domains
Posted by Tommaso Merciai 3 weeks, 5 days ago
Hi Oleksij,

On 3/13/26 14:08, Oleksij Rempel wrote:
> Hi Tommaso,
> 
> On Fri, Mar 13, 2026 at 11:42:34AM +0100, Tommaso Merciai wrote:
>> Hi Oleksij,
>> Thanks for your patch.
>>
>> I'm working on DSI support for RZ/G3E
>>
>> Since this morning, after rebasing on top of next-20260312, I'm seeing
>> the following:
>> I found out that the issue is related to the interrupt of the adv7535
>> bridge:
>>
>>          adv7535: hdmi1@3d {
>>                  compatible = "adi,adv7535";
>>                  ...
>>                  ...
>>                  interrupts-extended = <&pinctrl RZG3E_GPIO(L, 4) IRQ_TYPE_EDGE_FALLING>;
>>
>> RZ/G3E is using:
>>   - drivers/pinctrl/renesas/pinctrl-rzg2l.c
>>
>> Reverting this patch fixes the issue.
>> (git revert a23463beb3d5)
> 
> Thank you for the feedback! If I understand the problem correctly, the
> adv7535 is asserting its IRQ line early during probe, which creates an
> irq storm due to a missing handler.
> 
> My patch moved irq_domain_set_info() after the parent allocation. When
> the parent allocates the IRQ, the pending hardware interrupt fires
> immediately. Because the child descriptor isn't fully configured yet, it
> routes to handle_bad_irq. This fails to acknowledge the hardware
> interrupt, locking up the CPU and causing the RCU stall.
> 
> I hope that splitting irq_domain_set_info() will fix the regression.
> Can you please test whether this change resolves the RCU stalls on your setup:

Thanks for sharing.

> 
> diff --git a/drivers/gpio/gpiolib.c b/drivers/gpio/gpiolib.c
> index 13dd97344b26..376daeddbbbb 100644
> --- a/drivers/gpio/gpiolib.c
> +++ b/drivers/gpio/gpiolib.c
> @@ -1628,6 +1628,9 @@ static int gpiochip_hierarchy_irq_domain_alloc(struct irq_domain *d,
>   	}
>   	gpiochip_dbg(gc, "found parent hwirq %u\n", parent_hwirq);
>   
> +	irq_set_handler(irq, girq->handler);
> +	irq_set_handler_data(irq, gc);
> +
>   	/* This parent only handles asserted level IRQs */
>   	ret = girq->populate_parent_alloc_arg(gc, &gpio_parent_fwspec,
>   					      parent_hwirq, parent_type);
> @@ -1655,13 +1658,7 @@ static int gpiochip_hierarchy_irq_domain_alloc(struct irq_domain *d,
>   	 * We set handle_bad_irq because the .set_type() should
>   	 * always be invoked and set the right type of handler.
>   	 */
> -	irq_domain_set_info(d,
> -			    irq,
> -			    hwirq,
> -			    gc->irq.chip,
> -			    gc,
> -			    girq->handler,
> -			    NULL, NULL);
> +	irq_domain_set_hwirq_and_chip(d, irq, hwirq, gc->irq.chip, gc);
>   	irq_set_probe(irq);
>   
>   	return 0;

Tested on RZ/G3E + adv7535.

With this fix everything is working fine on my side.
I'm no longer seeing the RCU stall.

Thanks.

Kind Regards,
Tommaso
Re: [PATCH v3 4/7] gpio: gpiolib: fix allocation order in hierarchical IRQ domains
Posted by Bartosz Golaszewski 3 weeks, 5 days ago
On Fri, Mar 13, 2026 at 3:05 PM Tommaso Merciai
<tommaso.merciai.xr@bp.renesas.com> wrote:
>
> Hi Oleksij,
>
> On 3/13/26 14:08, Oleksij Rempel wrote:
> > Hi Tommaso,
> >
> > On Fri, Mar 13, 2026 at 11:42:34AM +0100, Tommaso Merciai wrote:
> >> Hi Oleksij,
> >> Thanks for your patch.
> >>
> >> I'm working on DSI support for RZ/G3E
> >>
> >> Since this morning, after rebasing on top of next-20260312, I'm seeing
> >> the following:
> >> I found out that the issue is related to the interrupt of the adv7535
> >> bridge:
> >>
> >>          adv7535: hdmi1@3d {
> >>                  compatible = "adi,adv7535";
> >>                  ...
> >>                  ...
> >>                  interrupts-extended = <&pinctrl RZG3E_GPIO(L, 4) IRQ_TYPE_EDGE_FALLING>;
> >>
> >> RZ/G3E is using:
> >>   - drivers/pinctrl/renesas/pinctrl-rzg2l.c
> >>
> >> Reverting this patch fixes the issue.
> >> (git revert a23463beb3d5)
> >
> > Thank you for the feedback! If I understand the problem correctly, the
> > adv7535 is asserting its IRQ line early during probe, which creates an
> > irq storm due to a missing handler.
> >
> > My patch moved irq_domain_set_info() after the parent allocation. When
> > the parent allocates the IRQ, the pending hardware interrupt fires
> > immediately. Because the child descriptor isn't fully configured yet, it
> > routes to handle_bad_irq. This fails to acknowledge the hardware
> > interrupt, locking up the CPU and causing the RCU stall.
> >
> > I hope that splitting irq_domain_set_info() will fix the regression.
> > Can you please test whether this change resolves the RCU stalls on your setup:
>
> Thanks for sharing.
>
> >
> > diff --git a/drivers/gpio/gpiolib.c b/drivers/gpio/gpiolib.c
> > index 13dd97344b26..376daeddbbbb 100644
> > --- a/drivers/gpio/gpiolib.c
> > +++ b/drivers/gpio/gpiolib.c
> > @@ -1628,6 +1628,9 @@ static int gpiochip_hierarchy_irq_domain_alloc(struct irq_domain *d,
> >       }
> >       gpiochip_dbg(gc, "found parent hwirq %u\n", parent_hwirq);
> >
> > +     irq_set_handler(irq, girq->handler);
> > +     irq_set_handler_data(irq, gc);
> > +
> >       /* This parent only handles asserted level IRQs */
> >       ret = girq->populate_parent_alloc_arg(gc, &gpio_parent_fwspec,
> >                                             parent_hwirq, parent_type);
> > @@ -1655,13 +1658,7 @@ static int gpiochip_hierarchy_irq_domain_alloc(struct irq_domain *d,
> >        * We set handle_bad_irq because the .set_type() should
> >        * always be invoked and set the right type of handler.
> >        */
> > -     irq_domain_set_info(d,
> > -                         irq,
> > -                         hwirq,
> > -                         gc->irq.chip,
> > -                         gc,
> > -                         girq->handler,
> > -                         NULL, NULL);
> > +     irq_domain_set_hwirq_and_chip(d, irq, hwirq, gc->irq.chip, gc);
> >       irq_set_probe(irq);
> >
> >       return 0;
>
> Tested on RZ/G3E + adv7535.
>
> With this fix everything is working fine on my side.
> I'm no longer seeing the RCU stall.
>

Ah, I sent this patch upstream for v7.0. I will tell Linus to not pull
it. How do we want to handle it then? Should this patch go together
with the rest of the series?

Bartosz
Re: [PATCH v3 4/7] gpio: gpiolib: fix allocation order in hierarchical IRQ domains
Posted by Oleksij Rempel 3 weeks, 5 days ago
On Fri, Mar 13, 2026 at 03:35:34PM +0100, Bartosz Golaszewski wrote:
> > With this fix everything is working fine on my side.
> > I'm no longer seeing the RCU stall.
> >
> 
> Ah, I sent this patch upstream for v7.0. I will tell Linus to not pull
> it. How do we want to handle it then? Should this patch go together
> with the rest of the series?

Yes, better let's go the slow way. I can't guarantee it will avoid other
regressions.

Should I include an updated version in the next patch series?

Best Regards,
Oleksij
-- 
Pengutronix e.K.                           |                             |
Steuerwalder Str. 21                       | http://www.pengutronix.de/  |
31137 Hildesheim, Germany                  | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |
Re: [PATCH v3 4/7] gpio: gpiolib: fix allocation order in hierarchical IRQ domains
Posted by Bartosz Golaszewski 3 weeks, 3 days ago
On Sat, Mar 14, 2026 at 8:08 AM Oleksij Rempel <o.rempel@pengutronix.de> wrote:
>
> On Fri, Mar 13, 2026 at 03:35:34PM +0100, Bartosz Golaszewski wrote:
> > > With this fix everything is working fine on my side.
> > > I'm no longer seeing the RCU stall.
> > >
> >
> > Ah, I sent this patch upstream for v7.0. I will tell Linus to not pull
> > it. How do we want to handle it then? Should this patch go together
> > with the rest of the series?
>
> Yes, better let's go the slow way. I can't guarantee it will avoid other
> regressions.
>
> Should I include an updated version in the next patch series?
>

Yes, I dropped this from my queue.

Acked-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
Re: [PATCH v3 4/7] gpio: gpiolib: fix allocation order in hierarchical IRQ domains
Posted by Linus Walleij 1 month ago
On Mon, Mar 9, 2026 at 2:49 PM Oleksij Rempel <o.rempel@pengutronix.de> wrote:

> In gpiochip_hierarchy_irq_domain_alloc(), calling irq_domain_set_info()
> before irq_domain_alloc_irqs_parent() causes a NULL pointer dereference
> for slow-bus (SPI/I2C) IRQ chips.
>
> irq_domain_set_info() locks the child descriptor, triggering .irq_bus_lock.
> If the child proxies this lock to the parent, it crashes because
> parent->chip is not yet allocated.
>
> Fix this by allocating the parent IRQs first, ensuring parent->chip is
> populated before the child's .irq_bus_lock is invoked.
>
> Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
> ---
> changes v3
> - new patch

Bartosz, tglx: is this something we should apply for fixes?

I think it needs to go into gpiolib for next at minimum, unless
there is some semantic problem with the patch.

Verbose quote:

> -       /*
> -        * We set handle_bad_irq because the .set_type() should
> -        * always be invoked and set the right type of handler.
> -        */
> -       irq_domain_set_info(d,
> -                           irq,
> -                           hwirq,
> -                           gc->irq.chip,
> -                           gc,
> -                           girq->handler,
> -                           NULL, NULL);
> -       irq_set_probe(irq);
> -
>         /* This parent only handles asserted level IRQs */
>         ret = girq->populate_parent_alloc_arg(gc, &gpio_parent_fwspec,
>                                               parent_hwirq, parent_type);
> @@ -1657,12 +1644,27 @@ static int gpiochip_hierarchy_irq_domain_alloc(struct irq_domain *d,
>          */
>         if (irq_domain_is_msi(d->parent) && (ret == -EEXIST))
>                 ret = 0;
> -       if (ret)
> +       if (ret) {
>                 gpiochip_err(gc,
>                              "failed to allocate parent hwirq %d for hwirq %lu\n",
>                              parent_hwirq, hwirq);
> +               return ret;
> +       }
>
> -       return ret;
> +       /*
> +        * We set handle_bad_irq because the .set_type() should
> +        * always be invoked and set the right type of handler.
> +        */
> +       irq_domain_set_info(d,
> +                           irq,
> +                           hwirq,
> +                           gc->irq.chip,
> +                           gc,
> +                           girq->handler,
> +                           NULL, NULL);
> +       irq_set_probe(irq);
> +
> +       return 0;

Yours,
Linus Walleij
Re: [PATCH v3 4/7] gpio: gpiolib: fix allocation order in hierarchical IRQ domains
Posted by Bartosz Golaszewski 1 month ago
On Tue, Mar 10, 2026 at 10:05 AM Linus Walleij <linusw@kernel.org> wrote:
>
> On Mon, Mar 9, 2026 at 2:49 PM Oleksij Rempel <o.rempel@pengutronix.de> wrote:
>
> > In gpiochip_hierarchy_irq_domain_alloc(), calling irq_domain_set_info()
> > before irq_domain_alloc_irqs_parent() causes a NULL pointer dereference
> > for slow-bus (SPI/I2C) IRQ chips.
> >
> > irq_domain_set_info() locks the child descriptor, triggering .irq_bus_lock.
> > If the child proxies this lock to the parent, it crashes because
> > parent->chip is not yet allocated.
> >
> > Fix this by allocating the parent IRQs first, ensuring parent->chip is
> > populated before the child's .irq_bus_lock is invoked.
> >
> > Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
> > ---
> > changes v3
> > - new patch
>
> Bartosz, tglx: is this something we should apply for fixes?
>
> I think it needs to go into gpiolib for next at minimum, unless
> there is some semantic problem with the patch.
>

Looks good to me. I can take it into v7.0-rc4 via the GPIO tree and
tglx can pull the tag once it's out as a base for the rest of the
series?

Bart
Re: [PATCH v3 4/7] gpio: gpiolib: fix allocation order in hierarchical IRQ domains
Posted by Bartosz Golaszewski 4 weeks, 1 day ago
On Tue, 10 Mar 2026 10:14:59 +0100, Bartosz Golaszewski <brgl@kernel.org> said:
> On Tue, Mar 10, 2026 at 10:05 AM Linus Walleij <linusw@kernel.org> wrote:
>>
>> On Mon, Mar 9, 2026 at 2:49 PM Oleksij Rempel <o.rempel@pengutronix.de> wrote:
>>
>> > In gpiochip_hierarchy_irq_domain_alloc(), calling irq_domain_set_info()
>> > before irq_domain_alloc_irqs_parent() causes a NULL pointer dereference
>> > for slow-bus (SPI/I2C) IRQ chips.
>> >
>> > irq_domain_set_info() locks the child descriptor, triggering .irq_bus_lock.
>> > If the child proxies this lock to the parent, it crashes because
>> > parent->chip is not yet allocated.
>> >
>> > Fix this by allocating the parent IRQs first, ensuring parent->chip is
>> > populated before the child's .irq_bus_lock is invoked.
>> >
>> > Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
>> > ---
>> > changes v3
>> > - new patch
>>
>> Bartosz, tglx: is this something we should apply for fixes?
>>
>> I think it needs to go into gpiolib for next at minimum, unless
>> there is some semantic problem with the patch.
>>
>
> Looks good to me. I can take it into v7.0-rc4 via the GPIO tree and
> tglx can pull the tag once it's out as a base for the rest of the
> series?
>
> Bart
>

Oleksij: it doesn't look like there are any dependencies for this patch, is
it ok if I queue it for the next RC?

Could you add the Fixes tag as well?

Bart
Re: [PATCH v3 4/7] gpio: gpiolib: fix allocation order in hierarchical IRQ domains
Posted by Oleksij Rempel 4 weeks ago
On Wed, Mar 11, 2026 at 02:18:05AM -0700, Bartosz Golaszewski wrote:
> On Tue, 10 Mar 2026 10:14:59 +0100, Bartosz Golaszewski <brgl@kernel.org> said:
> > On Tue, Mar 10, 2026 at 10:05 AM Linus Walleij <linusw@kernel.org> wrote:
> >>
> >> On Mon, Mar 9, 2026 at 2:49 PM Oleksij Rempel <o.rempel@pengutronix.de> wrote:
> >>
> >> > In gpiochip_hierarchy_irq_domain_alloc(), calling irq_domain_set_info()
> >> > before irq_domain_alloc_irqs_parent() causes a NULL pointer dereference
> >> > for slow-bus (SPI/I2C) IRQ chips.
> >> >
> >> > irq_domain_set_info() locks the child descriptor, triggering .irq_bus_lock.
> >> > If the child proxies this lock to the parent, it crashes because
> >> > parent->chip is not yet allocated.
> >> >
> >> > Fix this by allocating the parent IRQs first, ensuring parent->chip is
> >> > populated before the child's .irq_bus_lock is invoked.
> >> >
> >> > Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
> >> > ---
> >> > changes v3
> >> > - new patch
> >>
> >> Bartosz, tglx: is this something we should apply for fixes?
> >>
> >> I think it needs to go into gpiolib for next at minimum, unless
> >> there is some semantic problem with the patch.
> >>
> >
> > Looks good to me. I can take it into v7.0-rc4 via the GPIO tree and
> > tglx can pull the tag once it's out as a base for the rest of the
> > series?
> >
> > Bart
> >
> 
> Oleksij: it doesn't look like there are any dependencies for this patch, is
> it ok if I queue it for the next RC?

ACK, there are no dependencies.

> Could you add the Fixes tag as well?

Fixes: fdd61a013a24 ("gpio: Add support for hierarchical IRQ domains")

Should I add it in the next round? There are some pending comments for
this patch set.

Best Regards,
Oleksij
-- 
Pengutronix e.K.                           |                             |
Steuerwalder Str. 21                       | http://www.pengutronix.de/  |
31137 Hildesheim, Germany                  | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |
Re: [PATCH v3 4/7] gpio: gpiolib: fix allocation order in hierarchical IRQ domains
Posted by Bartosz Golaszewski 4 weeks ago
On Wed, Mar 11, 2026 at 2:40 PM Oleksij Rempel <o.rempel@pengutronix.de> wrote:
>
> On Wed, Mar 11, 2026 at 02:18:05AM -0700, Bartosz Golaszewski wrote:
> > On Tue, 10 Mar 2026 10:14:59 +0100, Bartosz Golaszewski <brgl@kernel.org> said:
> > > On Tue, Mar 10, 2026 at 10:05 AM Linus Walleij <linusw@kernel.org> wrote:
> > >>
> > >> On Mon, Mar 9, 2026 at 2:49 PM Oleksij Rempel <o.rempel@pengutronix.de> wrote:
> > >>
> > >> > In gpiochip_hierarchy_irq_domain_alloc(), calling irq_domain_set_info()
> > >> > before irq_domain_alloc_irqs_parent() causes a NULL pointer dereference
> > >> > for slow-bus (SPI/I2C) IRQ chips.
> > >> >
> > >> > irq_domain_set_info() locks the child descriptor, triggering .irq_bus_lock.
> > >> > If the child proxies this lock to the parent, it crashes because
> > >> > parent->chip is not yet allocated.
> > >> >
> > >> > Fix this by allocating the parent IRQs first, ensuring parent->chip is
> > >> > populated before the child's .irq_bus_lock is invoked.
> > >> >
> > >> > Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
> > >> > ---
> > >> > changes v3
> > >> > - new patch
> > >>
> > >> Bartosz, tglx: is this something we should apply for fixes?
> > >>
> > >> I think it needs to go into gpiolib for next at minimum, unless
> > >> there is some semantic problem with the patch.
> > >>
> > >
> > > Looks good to me. I can take it into v7.0-rc4 via the GPIO tree and
> > > tglx can pull the tag once it's out as a base for the rest of the
> > > series?
> > >
> > > Bart
> > >
> >
> > Oleksij: it doesn't look like there are any dependencies for this patch, is
> > it ok if I queue it for the next RC?
>
> ACK, there are no dependencies.
>
> > Could you add the Fixes tag as well?
>
> Fixes: fdd61a013a24 ("gpio: Add support for hierarchical IRQ domains")
>
> Should I add it in the next round? There are some pending comments for
> this patch set.
>

No, you can drop it, I'll send it for v7.0-rc4.

Bart