Change pci_set_irq() to call qemu_set_irq() on the PCI device IRQ rather than
calling the PCI bus IRQ handler function directly. In order to preserve the
existing behaviour, update pci_qdev_realize() so that it automatically connects
the PCI device IRQ to the PCI bus IRQ handler.

Finally, add a "QEMU interface" description documenting the new PCI device IRQ
gpio next to the declaration of TYPE_PCI_DEVICE.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
---
hw/pci/pci.c | 12 ++++++++++--
1 file changed, 10 insertions(+), 2 deletions(-)
diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index 9471f996a7..3da1481eb5 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -1680,8 +1680,7 @@ qemu_irq pci_allocate_irq(PCIDevice *pci_dev)
 
 void pci_set_irq(PCIDevice *pci_dev, int level)
 {
-    int intx = pci_intx(pci_dev);
-    pci_irq_handler(pci_dev, intx, level);
+    qemu_set_irq(pci_dev->irq, level);
 }
 
 /* Special hooks used by device assignment */
@@ -2193,6 +2192,10 @@ static void pci_qdev_realize(DeviceState *qdev, Error **errp)
     pci_set_power(pci_dev, true);
 
     pci_dev->msi_trigger = pci_msi_trigger;
+
+    /* Connect device IRQ to bus */
+    qdev_connect_gpio_out(DEVICE(pci_dev), 0,
+                          pci_get_bus(pci_dev)->irq_in[pci_dev->devfn]);
 }
 
 static void pci_device_init(Object *obj)
@@ -2850,6 +2853,11 @@ void pci_set_power(PCIDevice *d, bool state)
     }
 }
 
+/*
+ * QEMU interface:
+ * + Unnamed GPIO output: set to 1 if the PCI Device has asserted its irq
+ */
+
 static const TypeInfo pci_device_type_info = {
     .name = TYPE_PCI_DEVICE,
     .parent = TYPE_DEVICE,
--
2.30.2
On 11 May 2023 08:57:16 UTC, Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> wrote:
>Change pci_set_irq() to call qemu_set_irq() on the PCI device IRQ rather than
>calling PCI bus IRQ handler function directly. In order to preserve the
>existing behaviour update pci_qdev_realize() so that it automatically connects
>the PCI device IRQ to the PCI bus IRQ handler.
>
>Finally add a "QEMU interface" description documenting the new PCI device IRQ
>gpio next to the declaration of TYPE_PCI_DEVICE.
>
>Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
>---
> hw/pci/pci.c | 12 ++++++++++--
> 1 file changed, 10 insertions(+), 2 deletions(-)
>
>diff --git a/hw/pci/pci.c b/hw/pci/pci.c
>index 9471f996a7..3da1481eb5 100644
>--- a/hw/pci/pci.c
>+++ b/hw/pci/pci.c
>@@ -1680,8 +1680,7 @@ qemu_irq pci_allocate_irq(PCIDevice *pci_dev)
>
> void pci_set_irq(PCIDevice *pci_dev, int level)
> {
>- int intx = pci_intx(pci_dev);
>- pci_irq_handler(pci_dev, intx, level);
>+ qemu_set_irq(pci_dev->irq, level);
> }
>
> /* Special hooks used by device assignment */
>@@ -2193,6 +2192,10 @@ static void pci_qdev_realize(DeviceState *qdev, Error **errp)
> pci_set_power(pci_dev, true);
>
> pci_dev->msi_trigger = pci_msi_trigger;
>+
>+ /* Connect device IRQ to bus */
>+ qdev_connect_gpio_out(DEVICE(pci_dev), 0,
>+ pci_get_bus(pci_dev)->irq_in[pci_dev->devfn]);
I think this is confusing a few things. In my understanding -- unlike ISA -- PCI considers interrupt lanes only for PCI slots but not for buses. So for example each PCI slot could have its own direct connections (up to four, intA..intD) to the interrupt controller. IOW interrupt lanes and PCI buses are unrelated, thus PCIBus shouldn't really have IRQs.
Moreover, in case the interrupt lines are shared between multiple PCI slots, a usual pattern is to swizzle these lines such that the intAs from the slots don't all occupy just one IRQ line. That means that depending on the slot the device is plugged into, a different lane is triggered. Above code, however, would always trigger the same line and wouldn't even allow for modeling the swizzling.
Also, above code would cause out of bounds array accesses if a PCI device had more functions than there are on "the bus":
For example, consider PIIX, which has four PIRQs, so ARRAY_SIZE(irq_fn) == 4, right? devfn can be up to 8 according to the PCI spec, which would cause an out of bounds array access above.
I think that this commit does actually re-define how PCI buses work in QEMU although the cover letter claims to save this for another day. We should probably not apply the series in its current form.
Best regards,
Bernhard
> }
>
> static void pci_device_init(Object *obj)
>@@ -2850,6 +2853,11 @@ void pci_set_power(PCIDevice *d, bool state)
> }
> }
>
>+/*
>+ * QEMU interface:
>+ * + Unnamed GPIO output: set to 1 if the PCI Device has asserted its irq
>+ */
>+
> static const TypeInfo pci_device_type_info = {
> .name = TYPE_PCI_DEVICE,
> .parent = TYPE_DEVICE,
On 11/05/2023 22:44, Bernhard Beschow wrote:
> On 11 May 2023 08:57:16 UTC, Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> wrote:
>> Change pci_set_irq() to call qemu_set_irq() on the PCI device IRQ rather than
>> calling PCI bus IRQ handler function directly. In order to preserve the
>> existing behaviour update pci_qdev_realize() so that it automatically connects
>> the PCI device IRQ to the PCI bus IRQ handler.
>>
>> Finally add a "QEMU interface" description documenting the new PCI device IRQ
>> gpio next to the declaration of TYPE_PCI_DEVICE.
>>
>> Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
>> ---
>> hw/pci/pci.c | 12 ++++++++++--
>> 1 file changed, 10 insertions(+), 2 deletions(-)
>>
>> diff --git a/hw/pci/pci.c b/hw/pci/pci.c
>> index 9471f996a7..3da1481eb5 100644
>> --- a/hw/pci/pci.c
>> +++ b/hw/pci/pci.c
>> @@ -1680,8 +1680,7 @@ qemu_irq pci_allocate_irq(PCIDevice *pci_dev)
>>
>> void pci_set_irq(PCIDevice *pci_dev, int level)
>> {
>> - int intx = pci_intx(pci_dev);
>> - pci_irq_handler(pci_dev, intx, level);
>> + qemu_set_irq(pci_dev->irq, level);
>> }
>>
>> /* Special hooks used by device assignment */
>> @@ -2193,6 +2192,10 @@ static void pci_qdev_realize(DeviceState *qdev, Error **errp)
>> pci_set_power(pci_dev, true);
>>
>> pci_dev->msi_trigger = pci_msi_trigger;
>> +
>> + /* Connect device IRQ to bus */
>> + qdev_connect_gpio_out(DEVICE(pci_dev), 0,
>> + pci_get_bus(pci_dev)->irq_in[pci_dev->devfn]);
>
> I think this is confusing a few things. In my understanding -- unlike ISA -- PCI considers interrupt lanes only for PCI slots but not for buses. So for example each PCI slot could have its own direct connections (up to four, intA..intD) to the interrupt controller. IOW interrupt lanes and PCI buses are unrelated, thus PCIBus shouldn't really have IRQs.
That's definitely true: the restriction here is caused by the fact that QEMU's PCI
IRQ routing is already deeply integrated into the PCIBus object. This is visible from
the fact that pci_bus_map_irqs() is used to set the IRQ swizzling for each PCIBus, so
I don't see a way to untangle this without a massive redesign of the PCI buses in
QEMU, which is certainly outside the scope of this series.
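For reference, the board-side wiring looks roughly like this today (a from-memory
sketch of the usual PIIX-style pattern, not code lifted from the tree, so please
double-check the names against hw/pci/pci.h):

#include "hw/pci/pci.h"

/* Typical slot-based swizzle: which PIRQ fires depends on the slot. */
static int example_slot_get_pirq(PCIDevice *pci_dev, int pci_intx)
{
    return (pci_intx + PCI_SLOT(pci_dev->devfn)) % PCI_NUM_PINS;
}

/* Host bridge realize code registers both the PIRQ handler and the map. */
static void example_wire_host_bridge(PCIBus *bus, void *pirq_opaque,
                                     pci_set_irq_fn set_pirq)
{
    pci_bus_irqs(bus, set_pirq, pirq_opaque, PCI_NUM_PINS);
    pci_bus_map_irqs(bus, example_slot_get_pirq);
}

It's that per-bus map_irq callback which makes the routing hard to move out of
PCIBus without the bigger redesign.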
> Moreover, in case the interrupt lines are shared between multiple PCI slots, a usual pattern is to swizzle these lines such that the intAs from the slots don't all occupy just one IRQ line. That means that depending on the slot the device is plugged into, a different lane is triggered. Above code, however, would always trigger the same line and wouldn't even allow for modeling the swizzling.
>
> Also, above code would cause out of bounds array accesses if a PCI device had more functions than there are on "the bus":
> For example, consider PIIX, which has four PIRQs, so ARRAY_SIZE(irq_fn) == 4, right? devfn can be up to 8 according to the PCI spec, which would cause an out of bounds array access above.
Another restriction in the current QEMU design of PCI devices is the implicit
assumption that there is only one IRQ, i.e. the current INTx, rather than a
representation of the 4 separate PIRQ lines. Again this is something that people have
expressed interest in resolving, but as soon as you go there you end up with the same
problem as above, in that you need to revisit the PCI IRQ swizzling code.
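To make the "only one IRQ" point concrete: the pin used by the existing handler comes
straight from config space, roughly as below (paraphrased from memory and given an
example_ name, so treat it as a sketch rather than the real definition):

#include "hw/pci/pci.h"

/*
 * Roughly what pci_intx() does today: a device's single INTx pin is
 * whatever its PCI_INTERRUPT_PIN config register says, 1-based in config
 * space and 0-based for the IRQ handler.
 */
static inline int example_pci_intx(PCIDevice *pci_dev)
{
    return pci_get_byte(pci_dev->config + PCI_INTERRUPT_PIN) - 1;
}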
In particular there are some interesting use cases such as SPARC64 sabre whereby we
use virtual interrupt numbers during IRQ swizzling because we lose the PCIDevice of
the originating device as the IRQ propagates upwards through PCI bridges. So that
would also need a redesign assuming we move towards a more physical PCI model.
For the moment the PCI input IRQs are stored within PCIBus to ensure the existing
interrupt code works without having to touch any IRQ swizzling code, and since the
current design assumes a single interrupt for each PCIDevice, we only need a single
IRQ for each devfn. That's why there is no overflow of the array, and I can confirm
the code passed both gitlab CI and a local --enable-sanitizers "make check" without
introducing any additional failures.
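To spell out the sizing assumption (the actual declaration lives in an earlier patch
of the series, so the snippet below is only illustrative): there is one input IRQ per
devfn, matching the existing per-devfn bookkeeping on PCIBus, and devfn itself only
ranges over 0..255.

#include "hw/pci/pci.h"

/*
 * Illustrative shape only, not quoted from the series: devfn encodes
 * slot * 8 + function, so it ranges over 0..PCI_DEVFN_MAX - 1 (0..255),
 * and an array with one qemu_irq per devfn can safely be indexed by
 * pci_dev->devfn.
 */
typedef struct ExamplePerDevfnIrqs {
    qemu_irq irq_in[PCI_DEVFN_MAX];   /* PCI_DEVFN_MAX is 256 in QEMU */
} ExamplePerDevfnIrqs;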
> I think that this commit does actually re-define how PCI buses work in QEMU although the cover letter claims to save this for another day. We should probably not apply the series in its current form.
I hope the above explanation gives a bit more background as to why the series is
structured in the way it is. In effect it makes no attempt to alter the existing
PCIBus routing, but starts by considering that PCI devices have their own unique IRQ
handling using pci_set_irq(). So let's take a baby step which is to convert PCI
devices to use a standard qdev gpio instead of their own custom PCI IRQ handler,
which then allows for the possibility of using the standard qdev APIs for PCI IRQ
routing in future.
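As a purely hypothetical illustration of where that could lead (the interrupt
controller, its input number and the helper name below are invented for the example):
once the IRQ is an ordinary qdev gpio, a board could wire a PCI device's interrupt
with the standard qdev calls rather than anything PCI-specific.

#include "hw/pci/pci.h"
#include "hw/qdev-core.h"

/*
 * Hypothetical future board code: "pic" and input 5 are made up, but
 * qdev_get_gpio_in()/qdev_connect_gpio_out() are the standard qdev
 * gpio API.
 */
static void example_wire_pci_irq(PCIDevice *pci_dev, DeviceState *pic)
{
    qdev_connect_gpio_out(DEVICE(pci_dev), 0, qdev_get_gpio_in(pic, 5));
}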
ATB,
Mark.
On Thu, May 11, 2023 at 09:44:51PM +0000, Bernhard Beschow wrote:
>
>
> On 11 May 2023 08:57:16 UTC, Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> wrote:
> >Change pci_set_irq() to call qemu_set_irq() on the PCI device IRQ rather than
> >calling PCI bus IRQ handler function directly. In order to preserve the
> >existing behaviour update pci_qdev_realize() so that it automatically connects
> >the PCI device IRQ to the PCI bus IRQ handler.
> >
> >Finally add a "QEMU interface" description documenting the new PCI device IRQ
> >gpio next to the declaration of TYPE_PCI_DEVICE.
> >
> >Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
> >---
> > hw/pci/pci.c | 12 ++++++++++--
> > 1 file changed, 10 insertions(+), 2 deletions(-)
> >
> >diff --git a/hw/pci/pci.c b/hw/pci/pci.c
> >index 9471f996a7..3da1481eb5 100644
> >--- a/hw/pci/pci.c
> >+++ b/hw/pci/pci.c
> >@@ -1680,8 +1680,7 @@ qemu_irq pci_allocate_irq(PCIDevice *pci_dev)
> >
> > void pci_set_irq(PCIDevice *pci_dev, int level)
> > {
> >- int intx = pci_intx(pci_dev);
> >- pci_irq_handler(pci_dev, intx, level);
> >+ qemu_set_irq(pci_dev->irq, level);
> > }
> >
> > /* Special hooks used by device assignment */
> >@@ -2193,6 +2192,10 @@ static void pci_qdev_realize(DeviceState *qdev, Error **errp)
> > pci_set_power(pci_dev, true);
> >
> > pci_dev->msi_trigger = pci_msi_trigger;
> >+
> >+ /* Connect device IRQ to bus */
> >+ qdev_connect_gpio_out(DEVICE(pci_dev), 0,
> >+ pci_get_bus(pci_dev)->irq_in[pci_dev->devfn]);
>
> I think this is confusing a few things. In my understanding -- unlike
> ISA -- PCI considers interrupt lanes only for PCI slots but not for
> buses.
> So for example each PCI slot could have its own direct
> connections (up to four, intA..intD) to the interrupt controller. IOW
> interrupt lanes and PCI buses are unrelated, thus PCIBus shouldn't
> really have IRQs.
True, interrupt lines (not lanes I think - lanes are a PCI Express concept
unrelated to interrupts, since interrupts are just messages under PCIe)
bypass the PCI bus. They are in fact even used outside the
normal GNT#/REQ# protocol.
The system vendor is free to combine the various INTx# signals from the PCI connector(s)
in any way to connect them to the interrupt controller. They may be wire-ORed or
electronically switched under program control, or any combination thereof. The system
designer must insure that each INTx# signal from each connector is connected to an input
on the interrupt controller. This means the device driver may not make any assumptions
about interrupt sharing. All PCI device drivers must be able to share an interrupt (chaining)
with any other logical device including devices in the same multi-function package.
>
> Moreover, in case the interrupt lines are shared between multiple PCI slots, a usual pattern is to swizzle these lines such that the intAs from the slots don't all occupy just one IRQ line. That means that depending on the slot the device is plugged into, a different lane is triggered. Above code, however, would always trigger the same line and wouldn't even allow for modeling the swizzling.
the swizzling always applies in the case of PCI bridges:
However, since bridges will be used on add-in cards, the BIOS will assume an association
between device location and which INTx# line it uses when requesting an interrupt.
...
The BIOS code will assume the following binding behind the bridge and will
write the IRQ number in each device as described in Table 9-1. The interrupt binding
defined in this table is mandatory for add-in cards utilizing a bridge.
> Also, above code would cause out of bounds array accesses if a PCI device had more functions than there are on "the bus":
> For example, consider PIIX, which has four PIRQs, so ARRAY_SIZE(irq_fn) == 4, right? devfn can be up to 8 according to the PCI spec, which would cause an out of bounds array access above.
>
> I think that this commit does actually re-define how PCI buses work in QEMU although the cover letter claims to save this for another day. We should probably not apply the series in its current form.
>
> Best regards,
> Bernhard
>
> > }
> >
> > static void pci_device_init(Object *obj)
> >@@ -2850,6 +2853,11 @@ void pci_set_power(PCIDevice *d, bool state)
> > }
> > }
> >
> >+/*
> >+ * QEMU interface:
> >+ * + Unnamed GPIO output: set to 1 if the PCI Device has asserted its irq
> >+ */
> >+
> > static const TypeInfo pci_device_type_info = {
> > .name = TYPE_PCI_DEVICE,
> > .parent = TYPE_DEVICE,
On 12/05/2023 06:51, Michael S. Tsirkin wrote:
> On Thu, May 11, 2023 at 09:44:51PM +0000, Bernhard Beschow wrote:
>>
>>
>> On 11 May 2023 08:57:16 UTC, Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> wrote:
>>> Change pci_set_irq() to call qemu_set_irq() on the PCI device IRQ rather than
>>> calling PCI bus IRQ handler function directly. In order to preserve the
>>> existing behaviour update pci_qdev_realize() so that it automatically connects
>>> the PCI device IRQ to the PCI bus IRQ handler.
>>>
>>> Finally add a "QEMU interface" description documenting the new PCI device IRQ
>>> gpio next to the declaration of TYPE_PCI_DEVICE.
>>>
>>> Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
>>> ---
>>> hw/pci/pci.c | 12 ++++++++++--
>>> 1 file changed, 10 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/hw/pci/pci.c b/hw/pci/pci.c
>>> index 9471f996a7..3da1481eb5 100644
>>> --- a/hw/pci/pci.c
>>> +++ b/hw/pci/pci.c
>>> @@ -1680,8 +1680,7 @@ qemu_irq pci_allocate_irq(PCIDevice *pci_dev)
>>>
>>> void pci_set_irq(PCIDevice *pci_dev, int level)
>>> {
>>> - int intx = pci_intx(pci_dev);
>>> - pci_irq_handler(pci_dev, intx, level);
>>> + qemu_set_irq(pci_dev->irq, level);
>>> }
>>>
>>> /* Special hooks used by device assignment */
>>> @@ -2193,6 +2192,10 @@ static void pci_qdev_realize(DeviceState *qdev, Error **errp)
>>> pci_set_power(pci_dev, true);
>>>
>>> pci_dev->msi_trigger = pci_msi_trigger;
>>> +
>>> + /* Connect device IRQ to bus */
>>> + qdev_connect_gpio_out(DEVICE(pci_dev), 0,
>>> + pci_get_bus(pci_dev)->irq_in[pci_dev->devfn]);
>>
>> I think this is confusing a few things. In my understanding -- unlike
>> ISA -- PCI considers interrupt lanes only for PCI slots but not for
>> buses.
>> So for example each PCI slot could have its own direct
>> connections (up to four, intA..intD) to the interrupt controller. IOW
>> interrupt lanes and PCI buses are unrelated, thus PCIBus shouldn't
>> really have IRQs.
>
> True, interrupt lines (not lanes I think - lanes are a PCI Express concept
> unrelated to interrupts, since interrupts are just messages under PCIe)
> bypass the PCI bus. They are in fact even used outside the
> normal GNT#/REQ# protocol.
>
> The system vendor is free to combine the various INTx# signals from the PCI connector(s)
> in any way to connect them to the interrupt controller. They may be wire-ORed or
> electronically switched under program control, or any combination thereof. The system
> designer must insure that each INTx# signal from each connector is connected to an input
> on the interrupt controller. This means the device driver may not make any assumptions
> about interrupt sharing. All PCI device drivers must be able to share an interrupt (chaining)
> with any other logical device including devices in the same multi-function package.
I hope I covered this in my reply to Bernhard, but I think this supports the idea
that using a gpio is the right approach here: in the case of PCI IRQ the gpio
represents the physical signal and can be wired using standard qdev APIs, whereas for
PCIe the GPIOs can be wired to an internal message handler in exactly the same way.
>> Moreover, in case the interrupt lines are shared between multiple PCI slots, a usual pattern is to swizzle these lines such that the intAs from the slots don't all occupy just one IRQ line. That means that depending on the slot the device is plugged into, a different lane is triggered. Above code, however, would always trigger the same line and wouldn't even allow for modeling the swizzling.
>
> the swizzling always applies in the case of PCI bridges:
>
> However, since bridges will be used on add-in cards, the BIOS will assume an association
> between device location and which INTx# line it uses when requesting an interrupt.
> ...
> The BIOS code will assume the following binding behind the bridge and will
> write the IRQ number in each device as described in Table 9-1. The interrupt binding
> defined in this table is mandatory for add-in cards utilizing a bridge.
Also just to clarify in line with my previous reply: this series only changes the PCI
device IRQ endpoint to use a gpio as a starting point to facilitate using standard
qdev APIs to enable physical PCI device routing in future, as opposed to using a
bespoke pci_set_irq() function just for PCI.
If this makes sense, I'd be interested to hear whether you think the current approach
of using a named gpio "pci-input-irq" (which appears in "info qom-tree") is suitable,
or whether it makes sense to embed the PCI IRQ directly in the device, which is the
normal qdev approach but will take more work, as it involves updating all the relevant
PCIDevices.
ATB,
Mark.