[v2] irqchip/sifive-plic: Fix insufficient irq_groups allocation

[PATCH v2] irqchip/sifive-plic: Fix insufficient irq_groups allocation

Posted by Yangyu Chen 2 weeks, 3 days ago

Since the first irq source is 1 instead of 0, when the number of
irqs is multiple of 32, the last irq group will be ignored during
allocation, saving, and restoring. This lead to memory corruption
when accessing enable_save beyond allocated memory after commit
14ff9e54dd14 ("irqchip/sifive-plic: Cache the interrupt enable state")
which will access enable_save for all sources during plic_probe.
Thus, we should allocate irq_groups based on (nr_irqs + 1) instead of
nr_irqs to avoid this issue.

This is an long standing bug since Linux v6.4 but since the last irq
source is rarely used, it may not be triggered in practice until commit
14ff9e54dd14 ("irqchip/sifive-plic: Cache the interrupt enable state").

Fixes: e80f0b6a2cf3 ("irqchip/irq-sifive-plic: Add syscore callbacks for hibernation")
Fixes: 4d936f10ff80 ("irqchip/sifive-plic: Probe plic driver early for Allwinner D1 platform")
Fixes: 539d147ef69c ("irqchip/sifive-plic: Add support for UltraRISC DP1000 PLIC")
Fixes: 14ff9e54dd14 ("irqchip/sifive-plic: Cache the interrupt enable state")
Signed-off-by: Yangyu Chen <cyy@cyyself.name>
---
Changes since v1:
- Add more Fixes tags for earlier commits that are also affected by this
  bug.
- Add more explanation about the bug's history.

This can be reproduced by modifying the PLIC node in DT to have ndev as
exactly multiple of 32, e.g., 32, 64, etc., then triggering some interrupts
and checking dmesg for memory corruption:

plic: plic@3c000000 {
	compatible = "riscv,plic0";
	reg = <0x0 0x3c000000 0x0 0x4000000>;
	#interrupt-cells = <1>;
	interrupt-controller;
	interrupts-extended = <&cpu0_intc 11>, <&cpu0_intc 9>;
	riscv,max-priority = <7>;
	riscv,ndev = <64>;
};

Here is an example dmesg log when ndev is 64:
[    0.077196] Unable to handle kernel paging request at virtual address ffffaf8000000000
[    0.077205] Current swapper/0 pgtable: 4K pagesize, 48-bit VAs, pgdp=0x0000000081c2d000
[    0.077215] [ffffaf8000000000] pgd=000000009ffffc01, p4d=000000009ffffc01, pud=000000009ffff801, pmd=000000009ffff401, pte=0000000000000000
[    0.077240] Oops [#1]
[    0.077246] Modules linked in:
[    0.077254] CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.19.0-rc6 #36 NONE 
[    0.077266] Hardware name: XiangShan (DT)
[    0.077273] epc : __kmalloc_node_track_caller_noprof+0x1a0/0x524
[    0.077284]  ra : kstrdup+0x32/0x60
[    0.077293] epc : ffffffff80253c70 ra : ffffffff801fa70e sp : ffff8f800000b700
[    0.077304]  gp : ffffffff81a1b580 tp : ffffaf8080158000 t0 : 0000000000000264
[    0.077313]  t1 : 0000000000000003 t2 : 0000000000000000 s0 : ffff8f800000b750
[    0.077323]  s1 : 0000000000000002 a0 : ffffaf8000000000 a1 : 0000000000000cc0
[    0.077332]  a2 : ffff8d800200bfc0 a3 : ffffffff81a5c5e0 a4 : ffffaf8000000000
[    0.077342]  a5 : 0000000000000003 a6 : ffffffffffffffff a7 : ffffaf8080001400
[    0.077352]  s2 : ffffaf80802ff178 s3 : ffffffff810107f0 s4 : 0000000000000000
[    0.077362]  s5 : 0000000000000000 s6 : ffff8f800000b9c0 s7 : ffffaf8080823200
[    0.077372]  s8 : ffffffff81a20580 s9 : ffffaf808012c990 s10: ffffffffffffffff
[    0.077382]  s11: 0000000000000000 t3 : 0000000000000cc0 t4 : ffffffff801fa764
[    0.077391]  t5 : 0000000000000000 t6 : 0000000000000263
[    0.077399] status: 0000000200000120 badaddr: ffffaf8000000000 cause: 000000000000000d
[    0.077409] [<ffffffff80253c70>] __kmalloc_node_track_caller_noprof+0x1a0/0x524
[    0.077422] [<ffffffff801fa70e>] kstrdup+0x32/0x60
[    0.077433] [<ffffffff801fa764>] kstrdup_const+0x28/0x34
[    0.077444] [<ffffffff80318438>] __kernfs_new_node+0x3c/0x274
[    0.077457] [<ffffffff80318a90>] kernfs_new_node+0x44/0x6c
[    0.077470] [<ffffffff80318f40>] kernfs_create_dir_ns+0x20/0x7c
[    0.077483] [<ffffffff8031b8f8>] sysfs_create_dir_ns+0x60/0xcc
[    0.077497] [<ffffffff80b41bea>] kobject_add_internal+0xae/0x2d8
[    0.077509] [<ffffffff80b422d6>] kobject_add+0x52/0xb8
[    0.077520] [<ffffffff80b6401c>] __irq_alloc_descs+0x190/0x328
[    0.077534] [<ffffffff800976de>] irq_domain_alloc_descs.part.0+0x46/0x78
[    0.077549] [<ffffffff8009827a>] irq_create_mapping_affinity+0x72/0xcc
[    0.077561] [<ffffffff805d27d2>] plic_probe+0x2e2/0x6c8
[    0.077573] [<ffffffff805d2bc8>] plic_platform_probe+0x10/0x18
[    0.077586] [<ffffffff80723aca>] platform_probe+0x46/0x80
[    0.077600] [<ffffffff807212ec>] really_probe+0x84/0x22c
[    0.077612] [<ffffffff807214f0>] __driver_probe_device+0x5c/0xd4
[    0.077625] [<ffffffff8072162e>] driver_probe_device+0x2e/0xf4
[    0.077639] [<ffffffff80721856>] __driver_attach+0x6e/0x14c
[    0.077652] [<ffffffff8071f434>] bus_for_each_dev+0x60/0xb0
[    0.077664] [<ffffffff80720e22>] driver_attach+0x1a/0x24
[    0.077676] [<ffffffff8072068a>] bus_add_driver+0xca/0x1d8
[    0.077689] [<ffffffff80722622>] driver_register+0x3e/0xdc
[    0.077702] [<ffffffff80723810>] __platform_driver_register+0x1c/0x24
[    0.077716] [<ffffffff80c2946e>] plic_driver_init+0x1a/0x24
[    0.077728] [<ffffffff800108aa>] do_one_initcall+0x4e/0x1b8
[    0.077740] [<ffffffff80c01358>] kernel_init_freeable+0x25c/0x2dc
[    0.077754] [<ffffffff80b639cc>] kernel_init+0x1c/0x144
[    0.077768] [<ffffffff8001239a>] ret_from_fork_kernel+0xe/0xf4
[    0.077781] [<ffffffff80b6e4ee>] ret_from_fork_kernel_asm+0x16/0x18
[    0.077800] Code: 6308 0c63 2a06 0a63 2a05 e703 0308 8293 001f 972a (630c) 7f73 
[    0.077809] ---[ end trace 0000000000000000 ]---
[    0.077817] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
[    0.077827] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b ]---
---
 drivers/irqchip/irq-sifive-plic.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/irqchip/irq-sifive-plic.c b/drivers/irqchip/irq-sifive-plic.c
index 210a57959637..d6515bf76444 100644
--- a/drivers/irqchip/irq-sifive-plic.c
+++ b/drivers/irqchip/irq-sifive-plic.c
@@ -293,7 +293,7 @@ static void plic_irq_resume(void *data)
 			continue;
 
 		raw_spin_lock_irqsave(&handler->enable_lock, flags);
-		for (i = 0; i < DIV_ROUND_UP(priv->nr_irqs, 32); i++) {
+		for (i = 0; i < DIV_ROUND_UP(priv->nr_irqs + 1, 32); i++) {
 			reg = handler->enable_base + i * sizeof(u32);
 			writel(handler->enable_save[i], reg);
 		}
@@ -431,7 +431,7 @@ static u32 cp100_isolate_pending_irq(int nr_irq_groups, struct plic_handler *han
 
 static irq_hw_number_t cp100_get_hwirq(struct plic_handler *handler, void __iomem *claim)
 {
-	int nr_irq_groups = DIV_ROUND_UP(handler->priv->nr_irqs, 32);
+	int nr_irq_groups = DIV_ROUND_UP(handler->priv->nr_irqs + 1, 32);
 	u32 __iomem *enable = handler->enable_base;
 	irq_hw_number_t hwirq = 0;
 	u32 iso_mask;
@@ -718,7 +718,7 @@ static int plic_probe(struct fwnode_handle *fwnode)
 			context_id * CONTEXT_ENABLE_SIZE;
 		handler->priv = priv;
 
-		handler->enable_save = kcalloc(DIV_ROUND_UP(nr_irqs, 32),
+		handler->enable_save = kcalloc(DIV_ROUND_UP(nr_irqs + 1, 32),
 					       sizeof(*handler->enable_save), GFP_KERNEL);
 		if (!handler->enable_save) {
 			error = -ENOMEM;
-- 
2.51.0

Re: [PATCH v2] irqchip/sifive-plic: Fix insufficient irq_groups allocation

Posted by Thomas Gleixner 1 week, 4 days ago

On Wed, Jan 21 2026 at 13:45, Yangyu Chen wrote:
> Since the first irq source is 1 instead of 0, when the number of
> irqs is multiple of 32, the last irq group will be ignored during
> allocation, saving, and restoring. This lead to memory corruption
> when accessing enable_save beyond allocated memory after commit
> 14ff9e54dd14 ("irqchip/sifive-plic: Cache the interrupt enable state")
> which will access enable_save for all sources during plic_probe.
> Thus, we should allocate irq_groups based on (nr_irqs + 1) instead of
> nr_irqs to avoid this issue.
>
> This is an long standing bug since Linux v6.4 but since the last irq
> source is rarely used, it may not be triggered in practice until commit
> 14ff9e54dd14 ("irqchip/sifive-plic: Cache the interrupt enable state").

I'm absolutely not convinced that this is the right fix. The handling of
nr_irqs in this driver is completely inconsistent:

1)

   static int plic_irq_suspend(void *data)
	/* irq ID 0 is reserved */
	for (unsigned int i = 1; i < priv->nr_irqs; i++) {

2)
   static void plic_irq_resume(void *data)
	/* irq ID 0 is reserved */
	for (unsigned int i = 1; i < priv->nr_irqs; i++) {
        ...
	for (i = 0; i < DIV_ROUND_UP(priv->nr_irqs, 32); i++) {

3)

   static irq_hw_number_t cp100_get_hwirq(struct plic_handler *handler, void __iomem *claim)
	int nr_irq_groups = DIV_ROUND_UP(handler->priv->nr_irqs, 32);

        for (i = 0; i < nr_irq_groups; i++) {
4)

   static int plic_probe(struct fwnode_handle *fwnode)
  
	priv->prio_save = bitmap_zalloc(nr_irqs, GFP_KERNEL);
        ...
		for (int j = 0; j <= nr_irqs / 32; j++)
			writel(0, enable_base + j);
        ...                        
                handler->enable_save = kcalloc(DIV_ROUND_UP(nr_irqs, 32),...
	...
	for (hwirq = 1; hwirq <= nr_irqs; hwirq++) {
        ...
	priv->irqdomain = irq_domain_create_linear(fwnode, nr_irqs + 1,

So can the SIFIVE people please clarify once and forever what's the
actual meaning of nr_irqs:

   A) The actual number of hardware interrupts available for drivers
      starting from hwirq=1

   B) The total number of hardware interrupts, where hwirq=0 is reserved
      and therefore the number of interrupts available to drivers is
      nr_irqs - 1.

That needs to be clarified first and then subsequently every single
instance which deals with nr_irqs has to be updated accordingly.

A random fix which cures the symptom of an out of bounds memory access
is not helpful at all to cure the underlying root cause of
inconsistency.

Thanks,

        tglx

Re: [PATCH v2] irqchip/sifive-plic: Fix insufficient irq_groups allocation

Posted by Yangyu Chen 4 days, 19 hours ago


On 27/1/2026 23:58, Thomas Gleixner wrote:
> On Wed, Jan 21 2026 at 13:45, Yangyu Chen wrote:
>> Since the first irq source is 1 instead of 0, when the number of
>> irqs is multiple of 32, the last irq group will be ignored during
>> allocation, saving, and restoring. This lead to memory corruption
>> when accessing enable_save beyond allocated memory after commit
>> 14ff9e54dd14 ("irqchip/sifive-plic: Cache the interrupt enable state")
>> which will access enable_save for all sources during plic_probe.
>> Thus, we should allocate irq_groups based on (nr_irqs + 1) instead of
>> nr_irqs to avoid this issue.
>>
>> This is an long standing bug since Linux v6.4 but since the last irq
>> source is rarely used, it may not be triggered in practice until commit
>> 14ff9e54dd14 ("irqchip/sifive-plic: Cache the interrupt enable state").
> 
> I'm absolutely not convinced that this is the right fix. The handling of
> nr_irqs in this driver is completely inconsistent:
> 
> 1)
> 
>     static int plic_irq_suspend(void *data)
> 	/* irq ID 0 is reserved */
> 	for (unsigned int i = 1; i < priv->nr_irqs; i++) {
> 
> 2)
>     static void plic_irq_resume(void *data)
> 	/* irq ID 0 is reserved */
> 	for (unsigned int i = 1; i < priv->nr_irqs; i++) {
>          ...
> 	for (i = 0; i < DIV_ROUND_UP(priv->nr_irqs, 32); i++) {
> 
> 3)
> 
>     static irq_hw_number_t cp100_get_hwirq(struct plic_handler *handler, void __iomem *claim)
> 	int nr_irq_groups = DIV_ROUND_UP(handler->priv->nr_irqs, 32);
> 
>          for (i = 0; i < nr_irq_groups; i++) {
> 4)
> 
>     static int plic_probe(struct fwnode_handle *fwnode)
>    
> 	priv->prio_save = bitmap_zalloc(nr_irqs, GFP_KERNEL);
>          ...
> 		for (int j = 0; j <= nr_irqs / 32; j++)
> 			writel(0, enable_base + j);
>          ...
>                  handler->enable_save = kcalloc(DIV_ROUND_UP(nr_irqs, 32),...
> 	...
> 	for (hwirq = 1; hwirq <= nr_irqs; hwirq++) {
>          ...
> 	priv->irqdomain = irq_domain_create_linear(fwnode, nr_irqs + 1,
> 
> So can the SIFIVE people please clarify once and forever what's the
> actual meaning of nr_irqs:
> 
>     A) The actual number of hardware interrupts available for drivers
>        starting from hwirq=1
> 
>     B) The total number of hardware interrupts, where hwirq=0 is reserved
>        and therefore the number of interrupts available to drivers is
>        nr_irqs - 1.
> 
> That needs to be clarified first and then subsequently every single
> instance which deals with nr_irqs has to be updated accordingly.
> 

Totally agree! But there is no comment after 1 week.

There's also some consideration of the actual device tree. I discovered 
that the Canaan K210 has only 64 interrupt sources (excluding 0), but 
its device tree (dts) has PLIC with ndev=65. [1] On the other hand, the 
ESWIN EIC7700 has 520 interrupt sources, but its dts has ndev=520. [2] 
This indicates that existing device trees already have some issues. 
Fortunately, other platforms like the cv1800b [3] have ndev=101 and the 
source ID 101 is used for mbox. Popular platforms like the JH7110 [4] 
and SG2042 [5] also show additional interrupts in their device trees 
that the SoC doesn’t actually use. In my opinion, the best practice 
would be option A and:

1. Note that the "riscv,ndev" value for PLIC represents the actual 
number of hardware interrupts available for drivers in the dt-binding. 
(This aligns with your option A.)
2. Fix all existing drivers to meet option A.
3. Since the K210 is only used in M-Mode with its built-in dts, we can 
modify its dts to change the "riscv,ndev" value from 65 to 64.

I will make such changes for this patch revision if there are no objections.

[1] 
https://cdn.hackaday.io/files/1654127076987008/kendryte_datasheet_20181011163248_en.pdf
[2] 
https://github.com/eswincomputing/EIC7700X-SoC-Technical-Reference-Manual/releases/download/v1.0.0-20250103/EIC7700X_SoC_Technical_Reference_Manual_Part1.pdf
[3] 
https://github.com/milkv-duo/duo-files/blob/main/duo/datasheet/CV1800B-CV1801B-Preliminary-Datasheet-full-en.pdf
[4] https://doc-en.rvspace.org/JH7110/PDF/JH7110_Datasheet.pdf
[5] 
https://github.com/milkv-pioneer/pioneer-files/blob/main/hardware/SG2042-TRM.pdf

Thanks,
Yangyu Chen

> A random fix which cures the symptom of an out of bounds memory access
> is not helpful at all to cure the underlying root cause of
> inconsistency.
> 
> Thanks,
> 
>          tglx
>