drivers/irqchip/irq-gic-v2m.c | 12 ++++++++---- 1 file changed, 8 insertions(+), 4 deletions(-)
The PCI Local Bus Specification (section 6.8.3.4 in Rev 3) permits
modifying the low-order bits of the DATA register to encode the interrupt
number. These bits must be reserved, but the base SPI may not be aligned to
the requested number of SPIs.
For example, with an initial MSI_TYPER base SPI of 0x16A and allocating a
multiple MSI of size 8, the offset returned is 8, resulting in an MSI DATA
base of 0x172.
This causes the endpoint device to send interrupt 3 wrong interrupt number:
1st MSI = 0x172 | 0x0 = 0x172
2nd MSI = 0x172 | 0x1 = 0x173
3rd MSI = 0x172 | 0x2 = 0x172 wrongly triggers the 1st MSI
...
To fix this, use bitmap_find_next_zero_area_off() instead of
bitmap_find_free_region() applying an initial offset of
base_spi - rounded(base_spi, nr_irqs) to accommodate the required alignment
for the first MSI.
With the above case, the returned bitmap offset is 6 which results in the
correct interrupts number encoding:
1st MSI = 0x170 | 0x0 = 0x170
2nd MSI = 0x170 | 0x1 = 0x171
3rd MSI = 0x170 | 0x2 = 0x172
...
Signed-off-by: Christian Bruel <christian.bruel@foss.st.com>
---
Changes in v1:
(Marc Zyngier)
- Replace the incorrect usage of msi_attrib.multiple with nr_irqs
- Reworked changelog
---
drivers/irqchip/irq-gic-v2m.c | 12 ++++++++----
1 file changed, 8 insertions(+), 4 deletions(-)
diff --git a/drivers/irqchip/irq-gic-v2m.c b/drivers/irqchip/irq-gic-v2m.c
index 24ef5af569fe..2d5cf36340b1 100644
--- a/drivers/irqchip/irq-gic-v2m.c
+++ b/drivers/irqchip/irq-gic-v2m.c
@@ -153,14 +153,18 @@ static int gicv2m_irq_domain_alloc(struct irq_domain *domain, unsigned int virq,
{
msi_alloc_info_t *info = args;
struct v2m_data *v2m = NULL, *tmp;
- int hwirq, offset, i, err = 0;
+ int hwirq, i, err = 0;
+ unsigned long align_off, offset;
+ unsigned long align_mask = nr_irqs - 1;
spin_lock(&v2m_lock);
list_for_each_entry(tmp, &v2m_nodes, entry) {
- offset = bitmap_find_free_region(tmp->bm, tmp->nr_spis,
- get_count_order(nr_irqs));
- if (offset >= 0) {
+ align_off = tmp->spi_start - (tmp->spi_start & ~align_mask);
+ offset = bitmap_find_next_zero_area_off(tmp->bm, tmp->nr_spis, 0,
+ nr_irqs, align_mask, align_off);
+ if (offset < tmp->nr_spis) {
v2m = tmp;
+ bitmap_set(v2m->bm, offset, nr_irqs);
break;
}
}
--
2.34.1
On Mon, 11 Aug 2025 11:39:42 +0100, Christian Bruel <christian.bruel@foss.st.com> wrote: > > The PCI Local Bus Specification (section 6.8.3.4 in Rev 3) permits > modifying the low-order bits of the DATA register to encode the interrupt > number. These bits must be reserved, but the base SPI may not be aligned to > the requested number of SPIs. The PCI spec knows nothing of SPIs. > For example, with an initial MSI_TYPER base SPI of 0x16A and allocating a > multiple MSI of size 8, the offset returned is 8, resulting in an MSI DATA > base of 0x172. > This causes the endpoint device to send interrupt 3 wrong interrupt number: > Please use the correct terminology: an interrupt is a signal delivered to the CPU. A *message* is what the device sends, which the GIC turns into an interrupt. Here, you are using the same word for two different things. The problem is that the device encodes a delta from a base message, that the delta is encoded with log2(nr_vectors) bits, OR'ing the vector number with the base message. If the base message is not correctly aligned, shit happens. > 1st MSI = 0x172 | 0x0 = 0x172 > 2nd MSI = 0x172 | 0x1 = 0x173 > 3rd MSI = 0x172 | 0x2 = 0x172 wrongly triggers the 1st MSI > ... > > To fix this, use bitmap_find_next_zero_area_off() instead of > bitmap_find_free_region() applying an initial offset of > base_spi - rounded(base_spi, nr_irqs) to accommodate the required alignment > for the first MSI. > > With the above case, the returned bitmap offset is 6 which results in the > correct interrupts number encoding: > > 1st MSI = 0x170 | 0x0 = 0x170 > 2nd MSI = 0x170 | 0x1 = 0x171 > 3rd MSI = 0x170 | 0x2 = 0x172 > ... Please rephrase this commit message so that it actually makes sense. We shouldn't need examples if the commit message was correctly written. > > Signed-off-by: Christian Bruel <christian.bruel@foss.st.com> > --- > Changes in v1: > (Marc Zyngier) > - Replace the incorrect usage of msi_attrib.multiple with nr_irqs > - Reworked changelog > --- > drivers/irqchip/irq-gic-v2m.c | 12 ++++++++---- > 1 file changed, 8 insertions(+), 4 deletions(-) > > diff --git a/drivers/irqchip/irq-gic-v2m.c b/drivers/irqchip/irq-gic-v2m.c > index 24ef5af569fe..2d5cf36340b1 100644 > --- a/drivers/irqchip/irq-gic-v2m.c > +++ b/drivers/irqchip/irq-gic-v2m.c > @@ -153,14 +153,18 @@ static int gicv2m_irq_domain_alloc(struct irq_domain *domain, unsigned int virq, > { > msi_alloc_info_t *info = args; > struct v2m_data *v2m = NULL, *tmp; > - int hwirq, offset, i, err = 0; > + int hwirq, i, err = 0; > + unsigned long align_off, offset; Move the definition of align_off inside the loop. > + unsigned long align_mask = nr_irqs - 1; > > spin_lock(&v2m_lock); > list_for_each_entry(tmp, &v2m_nodes, entry) { > - offset = bitmap_find_free_region(tmp->bm, tmp->nr_spis, > - get_count_order(nr_irqs)); > - if (offset >= 0) { > + align_off = tmp->spi_start - (tmp->spi_start & ~align_mask); > + offset = bitmap_find_next_zero_area_off(tmp->bm, tmp->nr_spis, 0, > + nr_irqs, align_mask, align_off); > + if (offset < tmp->nr_spis) { > v2m = tmp; > + bitmap_set(v2m->bm, offset, nr_irqs); > break; > } > } Isn't the GICv3 MBI driver affected by the same issue? M. -- Jazz isn't dead. It just smells funny.
On 8/18/25 18:23, Marc Zyngier wrote: > On Mon, 11 Aug 2025 11:39:42 +0100, > Christian Bruel <christian.bruel@foss.st.com> wrote: >> >> The PCI Local Bus Specification (section 6.8.3.4 in Rev 3) permits >> modifying the low-order bits of the DATA register to encode the interrupt >> number. These bits must be reserved, but the base SPI may not be aligned to >> the requested number of SPIs. > > The PCI spec knows nothing of SPIs. OK, admittedly unclear. I was referring to the base SPI from the GICV2M register and the MSI Message DATA register together. Need to rephrase. > >> For example, with an initial MSI_TYPER base SPI of 0x16A and allocating a >> multiple MSI of size 8, the offset returned is 8, resulting in an MSI DATA >> base of 0x172. >> This causes the endpoint device to send interrupt 3 wrong interrupt number: >> > > Please use the correct terminology: an interrupt is a signal delivered > to the CPU. A *message* is what the device sends, which the GIC turns > into an interrupt. Here, you are using the same word for two different > things. OK. by 'interrupt number' I reused the vocabulary used by dw_pcie_ep_raise_msi_irq() when writing the message. Which is a shortcut for the "MSI interrupt number" identifier. I will clarify. > > The problem is that the device encodes a delta from a base message, > that the delta is encoded with log2(nr_vectors) bits, OR'ing the > vector number with the base message. If the base message is not > correctly aligned, shit happens. > >> 1st MSI = 0x172 | 0x0 = 0x172 >> 2nd MSI = 0x172 | 0x1 = 0x173 >> 3rd MSI = 0x172 | 0x2 = 0x172 wrongly triggers the 1st MSI >> ... >> >> To fix this, use bitmap_find_next_zero_area_off() instead of >> bitmap_find_free_region() applying an initial offset of >> base_spi - rounded(base_spi, nr_irqs) to accommodate the required alignment >> for the first MSI. >> >> With the above case, the returned bitmap offset is 6 which results in the >> correct interrupts number encoding: >> >> 1st MSI = 0x170 | 0x0 = 0x170 >> 2nd MSI = 0x170 | 0x1 = 0x171 >> 3rd MSI = 0x170 | 0x2 = 0x172 >> ... > > Please rephrase this commit message so that it actually makes sense. > We shouldn't need examples if the commit message was correctly > written. OK > >> >> Signed-off-by: Christian Bruel <christian.bruel@foss.st.com> >> --- >> Changes in v1: >> (Marc Zyngier) >> - Replace the incorrect usage of msi_attrib.multiple with nr_irqs >> - Reworked changelog >> --- >> drivers/irqchip/irq-gic-v2m.c | 12 ++++++++---- >> 1 file changed, 8 insertions(+), 4 deletions(-) >> >> diff --git a/drivers/irqchip/irq-gic-v2m.c b/drivers/irqchip/irq-gic-v2m.c >> index 24ef5af569fe..2d5cf36340b1 100644 >> --- a/drivers/irqchip/irq-gic-v2m.c >> +++ b/drivers/irqchip/irq-gic-v2m.c >> @@ -153,14 +153,18 @@ static int gicv2m_irq_domain_alloc(struct irq_domain *domain, unsigned int virq, >> { >> msi_alloc_info_t *info = args; >> struct v2m_data *v2m = NULL, *tmp; >> - int hwirq, offset, i, err = 0; >> + int hwirq, i, err = 0; >> + unsigned long align_off, offset; > > Move the definition of align_off inside the loop. > >> + unsigned long align_mask = nr_irqs - 1; >> >> spin_lock(&v2m_lock); >> list_for_each_entry(tmp, &v2m_nodes, entry) { >> - offset = bitmap_find_free_region(tmp->bm, tmp->nr_spis, >> - get_count_order(nr_irqs)); >> - if (offset >= 0) { >> + align_off = tmp->spi_start - (tmp->spi_start & ~align_mask); >> + offset = bitmap_find_next_zero_area_off(tmp->bm, tmp->nr_spis, 0, >> + nr_irqs, align_mask, align_off); >> + if (offset < tmp->nr_spis) { >> v2m = tmp; >> + bitmap_set(v2m->bm, offset, nr_irqs); >> break; >> } >> } > > Isn't the GICv3 MBI driver affected by the same issue? I’m not sure. From a brief look at the code, it seems that the GICv3 obtains spi_start from the Device Tree (mbi-ranges), not from the hardware. I noticed one platform that looks suspicious: rockchip/rk3568-lubancat-2.dts with mbi-ranges = <94 31>, <229 31>, <289 31>; For this platform, spi_start might not be aligned for multiple MSIs. Could it be a latent issue ? We can apply a similar fix in mbi_irq_domain_alloc(), but I would appreciate a Tested-By, as unable to test myself... thank you Christian > > M. >
© 2016 - 2025 Red Hat, Inc.