[PATCH v7] xen/pt: reserve PCI slot 2 for Intel igd-passthru

Chuck Zmudzinski posted 1 patch 1 year, 3 months ago
Failed in applying to current master (apply log)
There is a newer version of this series
hw/i386/pc_piix.c    | 127 ++++++++++++++++++++++++++++++++-----------
hw/xen/xen_pt.c      |  49 ++++++++++++++---
hw/xen/xen_pt.h      |  16 ++++++
hw/xen/xen_pt_stub.c |   4 ++
4 files changed, 154 insertions(+), 42 deletions(-)
[PATCH v7] xen/pt: reserve PCI slot 2 for Intel igd-passthru
Posted by Chuck Zmudzinski 1 year, 3 months ago
Intel specifies that the Intel IGD must occupy slot 2 on the PCI bus,
as noted in docs/igd-assign.txt in the Qemu source code.

Currently, when the xl toolstack is used to configure a Xen HVM guest with
Intel IGD passthrough to the guest with the Qemu upstream device model,
a Qemu emulated PCI device will occupy slot 2 and the Intel IGD will occupy
a different slot. This problem often prevents the guest from booting.

The only available workaround is not good: Configure Xen HVM guests to use
the old and no longer maintained Qemu traditional device model available
from xenbits.xen.org which does reserve slot 2 for the Intel IGD.

To implement this feature in the Qemu upstream device model for Xen HVM
guests, introduce the following new functions, types, and macros:

* XEN_PT_DEVICE_CLASS declaration, based on the existing TYPE_XEN_PT_DEVICE
* XEN_PT_DEVICE_GET_CLASS macro helper function for XEN_PT_DEVICE_CLASS
* typedef XenPTQdevRealize function pointer
* XEN_PCI_IGD_SLOT_MASK, the value of slot_reserved_mask to reserve slot 2
* xen_igd_reserve_slot and xen_igd_clear_slot functions

The new xen_igd_reserve_slot function uses the existing slot_reserved_mask
member of PCIBus to reserve PCI slot 2 for Xen HVM guests configured using
the xl toolstack with the gfx_passthru option enabled, which sets the
igd-passthru=on option to Qemu for the Xen HVM machine type.

The new xen_igd_reserve_slot function also needs to be implemented in
hw/xen/xen_pt_stub.c to prevent FTBFS during the link stage for the case
when Qemu is configured with --enable-xen and --disable-xen-pci-passthrough,
in which case it does nothing.

The new xen_igd_clear_slot function overrides qdev->realize of the parent
PCI device class to enable the Intel IGD to occupy slot 2 on the PCI bus
since slot 2 was reserved by xen_igd_reserve_slot when the PCI bus was
created in hw/i386/pc_piix.c for the case when igd-passthru=on.

Move the call to xen_host_pci_device_get, and the associated error
handling, from xen_pt_realize to the new xen_igd_clear_slot function to
initialize the device class and vendor values which enables the checks for
the Intel IGD to succeed. The verification that the host device is an
Intel IGD to be passed through is done by checking the domain, bus, slot,
and function values as well as by checking that gfx_passthru is enabled,
the device class is VGA, and the device vendor in Intel.

Signed-off-by: Chuck Zmudzinski <brchuckz@aol.com>
---
Notes that might be helpful to reviewers of patched code in hw/xen:

The new functions and types are based on recommendations from Qemu docs:
https://qemu.readthedocs.io/en/latest/devel/qom.html

Notes that might be helpful to reviewers of patched code in hw/i386:

The small patch to hw/i386/pc_piix.c is protected by CONFIG_XEN so it does
not affect builds that do not have CONFIG_XEN defined.

xen_igd_gfx_pt_enabled() in the patched hw/i386/pc_piix.c file is an
existing function that is only true when Qemu is built with
xen-pci-passthrough enabled and the administrator has configured the Xen
HVM guest with Qemu's igd-passthru=on option.

v2: Remove From: <email address> tag at top of commit message

v3: Changed the test for the Intel IGD in xen_igd_clear_slot:

    if (is_igd_vga_passthrough(&s->real_device) &&
        (s->real_device.vendor_id == PCI_VENDOR_ID_INTEL)) {

    is changed to

    if (xen_igd_gfx_pt_enabled() && (s->hostaddr.slot == 2)
        && (s->hostaddr.function == 0)) {

    I hoped that I could use the test in v2, since it matches the
    other tests for the Intel IGD in Qemu and Xen, but those tests
    do not work because the necessary data structures are not set with
    their values yet. So instead use the test that the administrator
    has enabled gfx_passthru and the device address on the host is
    02.0. This test does detect the Intel IGD correctly.

v4: Use brchuckz@aol.com instead of brchuckz@netscape.net for the author's
    email address to match the address used by the same author in commits
    be9c61da and c0e86b76
    
    Change variable for XEN_PT_DEVICE_CLASS: xptc changed to xpdc

v5: The patch of xen_pt.c was re-worked to allow a more consistent test
    for the Intel IGD that uses the same criteria as in other places.
    This involved moving the call to xen_host_pci_device_get from
    xen_pt_realize to xen_igd_clear_slot and updating the checks for the
    Intel IGD in xen_igd_clear_slot:
    
    if (xen_igd_gfx_pt_enabled() && (s->hostaddr.slot == 2)
        && (s->hostaddr.function == 0)) {

    is changed to

    if (is_igd_vga_passthrough(&s->real_device) &&
        s->real_device.domain == 0 && s->real_device.bus == 0 &&
        s->real_device.dev == 2 && s->real_device.func == 0 &&
        s->real_device.vendor_id == PCI_VENDOR_ID_INTEL) {

    Added an explanation for the move of xen_host_pci_device_get from
    xen_pt_realize to xen_igd_clear_slot to the commit message.

    Rebase.

v6: Fix logging by removing these lines from the move from xen_pt_realize
    to xen_igd_clear_slot that was done in v5:

    XEN_PT_LOG(d, "Assigning real physical device %02x:%02x.%d"
               " to devfn 0x%x\n",
               s->hostaddr.bus, s->hostaddr.slot, s->hostaddr.function,
               s->dev.devfn);

    This log needs to be in xen_pt_realize because s->dev.devfn is not
    set yet in xen_igd_clear_slot.

v7: Inhibit out of context log message and needless processing by
    adding 2 lines at the top of the new xen_igd_clear_slot function:

    if (!(pci_bus->slot_reserved_mask & XEN_PCI_IGD_SLOT_MASK))
        return;

    Rebase. This removed an unnecessary header file from xen_pt.h 

 hw/i386/pc_piix.c    | 127 ++++++++++++++++++++++++++++++++-----------
 hw/xen/xen_pt.c      |  49 ++++++++++++++---
 hw/xen/xen_pt.h      |  16 ++++++
 hw/xen/xen_pt_stub.c |   4 ++
 4 files changed, 154 insertions(+), 42 deletions(-)

diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
index b48047f50c..34a9736b5e 100644
--- a/hw/i386/pc_piix.c
+++ b/hw/i386/pc_piix.c
@@ -32,6 +32,7 @@
 #include "hw/i386/pc.h"
 #include "hw/i386/apic.h"
 #include "hw/pci-host/i440fx.h"
+#include "hw/rtc/mc146818rtc.h"
 #include "hw/southbridge/piix.h"
 #include "hw/display/ramfb.h"
 #include "hw/firmware/smbios.h"
@@ -40,16 +41,16 @@
 #include "hw/usb.h"
 #include "net/net.h"
 #include "hw/ide/pci.h"
-#include "hw/ide/piix.h"
 #include "hw/irq.h"
 #include "sysemu/kvm.h"
 #include "hw/kvm/clock.h"
 #include "hw/sysbus.h"
+#include "hw/i2c/i2c.h"
 #include "hw/i2c/smbus_eeprom.h"
 #include "hw/xen/xen-x86.h"
+#include "hw/xen/xen.h"
 #include "exec/memory.h"
 #include "hw/acpi/acpi.h"
-#include "hw/acpi/piix4.h"
 #include "qapi/error.h"
 #include "qemu/error-report.h"
 #include "sysemu/xen.h"
@@ -66,6 +67,7 @@
 #include "kvm/kvm-cpu.h"
 
 #define MAX_IDE_BUS 2
+#define XEN_IOAPIC_NUM_PIRQS 128ULL
 
 #ifdef CONFIG_IDE_ISA
 static const int ide_iobase[MAX_IDE_BUS] = { 0x1f0, 0x170 };
@@ -73,6 +75,32 @@ static const int ide_iobase2[MAX_IDE_BUS] = { 0x3f6, 0x376 };
 static const int ide_irq[MAX_IDE_BUS] = { 14, 15 };
 #endif
 
+/*
+ * Return the global irq number corresponding to a given device irq
+ * pin. We could also use the bus number to have a more precise mapping.
+ */
+static int pci_slot_get_pirq(PCIDevice *pci_dev, int pci_intx)
+{
+    int slot_addend;
+    slot_addend = PCI_SLOT(pci_dev->devfn) - 1;
+    return (pci_intx + slot_addend) & 3;
+}
+
+static void piix_intx_routing_notifier_xen(PCIDevice *dev)
+{
+    int i;
+
+    /* Scan for updates to PCI link routes (0x60-0x63). */
+    for (i = 0; i < PIIX_NUM_PIRQS; i++) {
+        uint8_t v = dev->config_read(dev, PIIX_PIRQCA + i, 1);
+        if (v & 0x80) {
+            v = 0;
+        }
+        v &= 0xf;
+        xen_set_pci_link_route(i, v);
+    }
+}
+
 /* PC hardware initialisation */
 static void pc_init1(MachineState *machine,
                      const char *host_type, const char *pci_type)
@@ -84,7 +112,7 @@ static void pc_init1(MachineState *machine,
     MemoryRegion *system_io = get_system_io();
     PCIBus *pci_bus;
     ISABus *isa_bus;
-    int piix3_devfn = -1;
+    Object *piix4_pm;
     qemu_irq smi_irq;
     GSIState *gsi_state;
     BusState *idebus[MAX_IDE_BUS];
@@ -205,10 +233,9 @@ static void pc_init1(MachineState *machine,
     gsi_state = pc_gsi_create(&x86ms->gsi, pcmc->pci_enabled);
 
     if (pcmc->pci_enabled) {
-        PIIX3State *piix3;
+        DeviceState *dev;
         PCIDevice *pci_dev;
-        const char *type = xen_enabled() ? TYPE_PIIX3_XEN_DEVICE
-                                         : TYPE_PIIX3_DEVICE;
+        int i;
 
         pci_bus = i440fx_init(pci_type,
                               i440fx_host,
@@ -216,21 +243,65 @@ static void pc_init1(MachineState *machine,
                               x86ms->below_4g_mem_size,
                               x86ms->above_4g_mem_size,
                               pci_memory, ram_memory);
+        pci_bus_map_irqs(pci_bus,
+                         xen_enabled() ? xen_pci_slot_get_pirq
+                                       : pci_slot_get_pirq);
         pcms->bus = pci_bus;
 
-        pci_dev = pci_create_simple_multifunction(pci_bus, -1, true, type);
-        piix3 = PIIX3_PCI_DEVICE(pci_dev);
-        piix3->pic = x86ms->gsi;
-        piix3_devfn = piix3->dev.devfn;
-        isa_bus = ISA_BUS(qdev_get_child_bus(DEVICE(piix3), "isa.0"));
+        pci_dev = pci_new_multifunction(-1, true, TYPE_PIIX3_DEVICE);
+        object_property_set_bool(OBJECT(pci_dev), "has-usb",
+                                 machine_usb(machine), &error_abort);
+        object_property_set_bool(OBJECT(pci_dev), "has-acpi",
+                                 x86_machine_is_acpi_enabled(x86ms),
+                                 &error_abort);
+        qdev_prop_set_uint32(DEVICE(pci_dev), "smb_io_base", 0xb100);
+        object_property_set_bool(OBJECT(pci_dev), "smm-enabled",
+                                 x86_machine_is_smm_enabled(x86ms),
+                                 &error_abort);
+        pci_realize_and_unref(pci_dev, pci_bus, &error_fatal);
+
+        if (xen_enabled()) {
+            pci_device_set_intx_routing_notifier(
+                        pci_dev, piix_intx_routing_notifier_xen);
+
+            /*
+             * Xen supports additional interrupt routes from the PCI devices to
+             * the IOAPIC: the four pins of each PCI device on the bus are also
+             * connected to the IOAPIC directly.
+             * These additional routes can be discovered through ACPI.
+             */
+            pci_bus_irqs(pci_bus, xen_intx_set_irq, pci_dev,
+                         XEN_IOAPIC_NUM_PIRQS);
+        }
+
+        dev = DEVICE(object_resolve_path_component(OBJECT(pci_dev), "pic"));
+        for (i = 0; i < ISA_NUM_IRQS; i++) {
+            qdev_connect_gpio_out(dev, i, x86ms->gsi[i]);
+        }
+        isa_bus = ISA_BUS(qdev_get_child_bus(DEVICE(pci_dev), "isa.0"));
+        rtc_state = ISA_DEVICE(object_resolve_path_component(OBJECT(pci_dev),
+                                                             "rtc"));
+        piix4_pm = object_resolve_path_component(OBJECT(pci_dev), "pm");
+        dev = DEVICE(object_resolve_path_component(OBJECT(pci_dev), "ide"));
+        pci_ide_create_devs(PCI_DEVICE(dev));
+        idebus[0] = qdev_get_child_bus(dev, "ide.0");
+        idebus[1] = qdev_get_child_bus(dev, "ide.1");
     } else {
         pci_bus = NULL;
+        piix4_pm = NULL;
         isa_bus = isa_bus_new(NULL, get_system_memory(), system_io,
                               &error_abort);
+        isa_bus_irqs(isa_bus, x86ms->gsi);
+
+        rtc_state = isa_new(TYPE_MC146818_RTC);
+        qdev_prop_set_int32(DEVICE(rtc_state), "base_year", 2000);
+        isa_realize_and_unref(rtc_state, isa_bus, &error_fatal);
+
         i8257_dma_init(isa_bus, 0);
         pcms->hpet_enabled = false;
+        idebus[0] = NULL;
+        idebus[1] = NULL;
     }
-    isa_bus_irqs(isa_bus, x86ms->gsi);
 
     if (x86ms->pic == ON_OFF_AUTO_ON || x86ms->pic == ON_OFF_AUTO_AUTO) {
         pc_i8259_create(isa_bus, gsi_state->i8259_irq);
@@ -252,18 +323,12 @@ static void pc_init1(MachineState *machine,
     }
 
     /* init basic PC hardware */
-    pc_basic_device_init(pcms, isa_bus, x86ms->gsi, &rtc_state, true,
+    pc_basic_device_init(pcms, isa_bus, x86ms->gsi, rtc_state, true,
                          0x4);
 
     pc_nic_init(pcmc, isa_bus, pci_bus);
 
     if (pcmc->pci_enabled) {
-        PCIDevice *dev;
-
-        dev = pci_create_simple(pci_bus, piix3_devfn + 1, TYPE_PIIX3_IDE);
-        pci_ide_create_devs(dev);
-        idebus[0] = qdev_get_child_bus(&dev->qdev, "ide.0");
-        idebus[1] = qdev_get_child_bus(&dev->qdev, "ide.1");
         pc_cmos_init(pcms, idebus[0], idebus[1], rtc_state);
     }
 #ifdef CONFIG_IDE_ISA
@@ -289,21 +354,9 @@ static void pc_init1(MachineState *machine,
     }
 #endif
 
-    if (pcmc->pci_enabled && machine_usb(machine)) {
-        pci_create_simple(pci_bus, piix3_devfn + 2, "piix3-usb-uhci");
-    }
-
-    if (pcmc->pci_enabled && x86_machine_is_acpi_enabled(X86_MACHINE(pcms))) {
-        PCIDevice *piix4_pm;
-
+    if (piix4_pm) {
         smi_irq = qemu_allocate_irq(pc_acpi_smi_interrupt, first_cpu, 0);
-        piix4_pm = pci_new(piix3_devfn + 3, TYPE_PIIX4_PM);
-        qdev_prop_set_uint32(DEVICE(piix4_pm), "smb_io_base", 0xb100);
-        qdev_prop_set_bit(DEVICE(piix4_pm), "smm-enabled",
-                          x86_machine_is_smm_enabled(x86ms));
-        pci_realize_and_unref(piix4_pm, pci_bus, &error_fatal);
 
-        qdev_connect_gpio_out(DEVICE(piix4_pm), 0, x86ms->gsi[9]);
         qdev_connect_gpio_out_named(DEVICE(piix4_pm), "smi-irq", 0, smi_irq);
         pcms->smbus = I2C_BUS(qdev_get_child_bus(DEVICE(piix4_pm), "i2c"));
         /* TODO: Populate SPD eeprom data.  */
@@ -315,7 +368,7 @@ static void pc_init1(MachineState *machine,
                                  object_property_allow_set_link,
                                  OBJ_PROP_LINK_STRONG);
         object_property_set_link(OBJECT(machine), PC_MACHINE_ACPI_DEVICE_PROP,
-                                 OBJECT(piix4_pm), &error_abort);
+                                 piix4_pm, &error_abort);
     }
 
     if (machine->nvdimms_state->is_enabled) {
@@ -405,6 +458,9 @@ static void pc_xen_hvm_init(MachineState *machine)
     }
 
     pc_xen_hvm_init_pci(machine);
+    if (xen_igd_gfx_pt_enabled()) {
+        xen_igd_reserve_slot(pcms->bus);
+    }
     pci_create_simple(pcms->bus, -1, "xen-platform");
 }
 #endif
@@ -441,6 +497,11 @@ static void pc_i440fx_8_0_machine_options(MachineClass *m)
     pc_i440fx_machine_options(m);
     m->alias = "pc";
     m->is_default = true;
+#ifdef CONFIG_MICROVM_DEFAULT
+    m->is_default = false;
+#else
+    m->is_default = true;
+#endif
 }
 
 DEFINE_I440FX_MACHINE(v8_0, "pc-i440fx-8.0", NULL,
diff --git a/hw/xen/xen_pt.c b/hw/xen/xen_pt.c
index 0ec7e52183..eff38155ef 100644
--- a/hw/xen/xen_pt.c
+++ b/hw/xen/xen_pt.c
@@ -780,15 +780,6 @@ static void xen_pt_realize(PCIDevice *d, Error **errp)
                s->hostaddr.bus, s->hostaddr.slot, s->hostaddr.function,
                s->dev.devfn);
 
-    xen_host_pci_device_get(&s->real_device,
-                            s->hostaddr.domain, s->hostaddr.bus,
-                            s->hostaddr.slot, s->hostaddr.function,
-                            errp);
-    if (*errp) {
-        error_append_hint(errp, "Failed to \"open\" the real pci device");
-        return;
-    }
-
     s->is_virtfn = s->real_device.is_virtfn;
     if (s->is_virtfn) {
         XEN_PT_LOG(d, "%04x:%02x:%02x.%d is a SR-IOV Virtual Function\n",
@@ -950,11 +941,50 @@ static void xen_pci_passthrough_instance_init(Object *obj)
     PCI_DEVICE(obj)->cap_present |= QEMU_PCI_CAP_EXPRESS;
 }
 
+void xen_igd_reserve_slot(PCIBus *pci_bus)
+{
+    XEN_PT_LOG(0, "Reserving PCI slot 2 for IGD\n");
+    pci_bus->slot_reserved_mask |= XEN_PCI_IGD_SLOT_MASK;
+}
+
+static void xen_igd_clear_slot(DeviceState *qdev, Error **errp)
+{
+    ERRP_GUARD();
+    PCIDevice *pci_dev = (PCIDevice *)qdev;
+    XenPCIPassthroughState *s = XEN_PT_DEVICE(pci_dev);
+    XenPTDeviceClass *xpdc = XEN_PT_DEVICE_GET_CLASS(s);
+    PCIBus *pci_bus = pci_get_bus(pci_dev);
+
+    if (!(pci_bus->slot_reserved_mask & XEN_PCI_IGD_SLOT_MASK))
+        return;
+
+    xen_host_pci_device_get(&s->real_device,
+                            s->hostaddr.domain, s->hostaddr.bus,
+                            s->hostaddr.slot, s->hostaddr.function,
+                            errp);
+    if (*errp) {
+        error_append_hint(errp, "Failed to \"open\" the real pci device");
+        return;
+    }
+
+    if (is_igd_vga_passthrough(&s->real_device) &&
+        s->real_device.domain == 0 && s->real_device.bus == 0 &&
+        s->real_device.dev == 2 && s->real_device.func == 0 &&
+        s->real_device.vendor_id == PCI_VENDOR_ID_INTEL) {
+        pci_bus->slot_reserved_mask &= ~XEN_PCI_IGD_SLOT_MASK;
+        XEN_PT_LOG(pci_dev, "Intel IGD found, using slot 2\n");
+    }
+    xpdc->pci_qdev_realize(qdev, errp);
+}
+
 static void xen_pci_passthrough_class_init(ObjectClass *klass, void *data)
 {
     DeviceClass *dc = DEVICE_CLASS(klass);
     PCIDeviceClass *k = PCI_DEVICE_CLASS(klass);
 
+    XenPTDeviceClass *xpdc = XEN_PT_DEVICE_CLASS(klass);
+    xpdc->pci_qdev_realize = dc->realize;
+    dc->realize = xen_igd_clear_slot;
     k->realize = xen_pt_realize;
     k->exit = xen_pt_unregister_device;
     k->config_read = xen_pt_pci_read_config;
@@ -977,6 +1007,7 @@ static const TypeInfo xen_pci_passthrough_info = {
     .instance_size = sizeof(XenPCIPassthroughState),
     .instance_finalize = xen_pci_passthrough_finalize,
     .class_init = xen_pci_passthrough_class_init,
+    .class_size = sizeof(XenPTDeviceClass),
     .instance_init = xen_pci_passthrough_instance_init,
     .interfaces = (InterfaceInfo[]) {
         { INTERFACE_CONVENTIONAL_PCI_DEVICE },
diff --git a/hw/xen/xen_pt.h b/hw/xen/xen_pt.h
index cf10fc7bbf..8c25932b4b 100644
--- a/hw/xen/xen_pt.h
+++ b/hw/xen/xen_pt.h
@@ -2,6 +2,7 @@
 #define XEN_PT_H
 
 #include "hw/xen/xen_common.h"
+#include "hw/pci/pci_bus.h"
 #include "xen-host-pci-device.h"
 #include "qom/object.h"
 
@@ -40,7 +41,20 @@ typedef struct XenPTReg XenPTReg;
 #define TYPE_XEN_PT_DEVICE "xen-pci-passthrough"
 OBJECT_DECLARE_SIMPLE_TYPE(XenPCIPassthroughState, XEN_PT_DEVICE)
 
+#define XEN_PT_DEVICE_CLASS(klass) \
+    OBJECT_CLASS_CHECK(XenPTDeviceClass, klass, TYPE_XEN_PT_DEVICE)
+#define XEN_PT_DEVICE_GET_CLASS(obj) \
+    OBJECT_GET_CLASS(XenPTDeviceClass, obj, TYPE_XEN_PT_DEVICE)
+
+typedef void (*XenPTQdevRealize)(DeviceState *qdev, Error **errp);
+
+typedef struct XenPTDeviceClass {
+    PCIDeviceClass parent_class;
+    XenPTQdevRealize pci_qdev_realize;
+} XenPTDeviceClass;
+
 uint32_t igd_read_opregion(XenPCIPassthroughState *s);
+void xen_igd_reserve_slot(PCIBus *pci_bus);
 void igd_write_opregion(XenPCIPassthroughState *s, uint32_t val);
 void xen_igd_passthrough_isa_bridge_create(XenPCIPassthroughState *s,
                                            XenHostPCIDevice *dev);
@@ -75,6 +89,8 @@ typedef int (*xen_pt_conf_byte_read)
 
 #define XEN_PCI_INTEL_OPREGION 0xfc
 
+#define XEN_PCI_IGD_SLOT_MASK 0x4UL /* Intel IGD slot_reserved_mask */
+
 typedef enum {
     XEN_PT_GRP_TYPE_HARDWIRED = 0,  /* 0 Hardwired reg group */
     XEN_PT_GRP_TYPE_EMU,            /* emul reg group */
diff --git a/hw/xen/xen_pt_stub.c b/hw/xen/xen_pt_stub.c
index 2d8cac8d54..5c108446a8 100644
--- a/hw/xen/xen_pt_stub.c
+++ b/hw/xen/xen_pt_stub.c
@@ -20,3 +20,7 @@ void xen_igd_gfx_pt_set(bool value, Error **errp)
         error_setg(errp, "Xen PCI passthrough support not built in");
     }
 }
+
+void xen_igd_reserve_slot(PCIBus *pci_bus)
+{
+}
-- 
2.39.0
Re: [PATCH v7] xen/pt: reserve PCI slot 2 for Intel igd-passthru
Posted by Michael S. Tsirkin 1 year, 3 months ago
On Mon, Jan 09, 2023 at 04:55:42PM -0500, Chuck Zmudzinski wrote:
> Intel specifies that the Intel IGD must occupy slot 2 on the PCI bus,
> as noted in docs/igd-assign.txt in the Qemu source code.
> 
> Currently, when the xl toolstack is used to configure a Xen HVM guest with
> Intel IGD passthrough to the guest with the Qemu upstream device model,
> a Qemu emulated PCI device will occupy slot 2 and the Intel IGD will occupy
> a different slot. This problem often prevents the guest from booting.
> 
> The only available workaround is not good: Configure Xen HVM guests to use
> the old and no longer maintained Qemu traditional device model available
> from xenbits.xen.org which does reserve slot 2 for the Intel IGD.
> 
> To implement this feature in the Qemu upstream device model for Xen HVM
> guests, introduce the following new functions, types, and macros:
> 
> * XEN_PT_DEVICE_CLASS declaration, based on the existing TYPE_XEN_PT_DEVICE
> * XEN_PT_DEVICE_GET_CLASS macro helper function for XEN_PT_DEVICE_CLASS
> * typedef XenPTQdevRealize function pointer
> * XEN_PCI_IGD_SLOT_MASK, the value of slot_reserved_mask to reserve slot 2
> * xen_igd_reserve_slot and xen_igd_clear_slot functions
> 
> The new xen_igd_reserve_slot function uses the existing slot_reserved_mask
> member of PCIBus to reserve PCI slot 2 for Xen HVM guests configured using
> the xl toolstack with the gfx_passthru option enabled, which sets the
> igd-passthru=on option to Qemu for the Xen HVM machine type.
> 
> The new xen_igd_reserve_slot function also needs to be implemented in
> hw/xen/xen_pt_stub.c to prevent FTBFS during the link stage for the case
> when Qemu is configured with --enable-xen and --disable-xen-pci-passthrough,
> in which case it does nothing.
> 
> The new xen_igd_clear_slot function overrides qdev->realize of the parent
> PCI device class to enable the Intel IGD to occupy slot 2 on the PCI bus
> since slot 2 was reserved by xen_igd_reserve_slot when the PCI bus was
> created in hw/i386/pc_piix.c for the case when igd-passthru=on.
> 
> Move the call to xen_host_pci_device_get, and the associated error
> handling, from xen_pt_realize to the new xen_igd_clear_slot function to
> initialize the device class and vendor values which enables the checks for
> the Intel IGD to succeed. The verification that the host device is an
> Intel IGD to be passed through is done by checking the domain, bus, slot,
> and function values as well as by checking that gfx_passthru is enabled,
> the device class is VGA, and the device vendor in Intel.
> 
> Signed-off-by: Chuck Zmudzinski <brchuckz@aol.com>
> ---
> Notes that might be helpful to reviewers of patched code in hw/xen:
> 
> The new functions and types are based on recommendations from Qemu docs:
> https://qemu.readthedocs.io/en/latest/devel/qom.html
> 
> Notes that might be helpful to reviewers of patched code in hw/i386:
> 
> The small patch to hw/i386/pc_piix.c is protected by CONFIG_XEN so it does
> not affect builds that do not have CONFIG_XEN defined.

I'm not sure how you can claim that.

...

> diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
> index b48047f50c..34a9736b5e 100644
> --- a/hw/i386/pc_piix.c
> +++ b/hw/i386/pc_piix.c
> @@ -32,6 +32,7 @@
>  #include "hw/i386/pc.h"
>  #include "hw/i386/apic.h"
>  #include "hw/pci-host/i440fx.h"
> +#include "hw/rtc/mc146818rtc.h"
>  #include "hw/southbridge/piix.h"
>  #include "hw/display/ramfb.h"
>  #include "hw/firmware/smbios.h"
> @@ -40,16 +41,16 @@
>  #include "hw/usb.h"
>  #include "net/net.h"
>  #include "hw/ide/pci.h"
> -#include "hw/ide/piix.h"
>  #include "hw/irq.h"
>  #include "sysemu/kvm.h"
>  #include "hw/kvm/clock.h"
>  #include "hw/sysbus.h"
> +#include "hw/i2c/i2c.h"
>  #include "hw/i2c/smbus_eeprom.h"
>  #include "hw/xen/xen-x86.h"
> +#include "hw/xen/xen.h"
>  #include "exec/memory.h"
>  #include "hw/acpi/acpi.h"
> -#include "hw/acpi/piix4.h"
>  #include "qapi/error.h"
>  #include "qemu/error-report.h"
>  #include "sysemu/xen.h"
> @@ -66,6 +67,7 @@
>  #include "kvm/kvm-cpu.h"
>  
>  #define MAX_IDE_BUS 2
> +#define XEN_IOAPIC_NUM_PIRQS 128ULL
>  
>  #ifdef CONFIG_IDE_ISA
>  static const int ide_iobase[MAX_IDE_BUS] = { 0x1f0, 0x170 };
> @@ -73,6 +75,32 @@ static const int ide_iobase2[MAX_IDE_BUS] = { 0x3f6, 0x376 };
>  static const int ide_irq[MAX_IDE_BUS] = { 14, 15 };
>  #endif
>  
> +/*
> + * Return the global irq number corresponding to a given device irq
> + * pin. We could also use the bus number to have a more precise mapping.
> + */
> +static int pci_slot_get_pirq(PCIDevice *pci_dev, int pci_intx)
> +{
> +    int slot_addend;
> +    slot_addend = PCI_SLOT(pci_dev->devfn) - 1;
> +    return (pci_intx + slot_addend) & 3;
> +}
> +
> +static void piix_intx_routing_notifier_xen(PCIDevice *dev)
> +{
> +    int i;
> +
> +    /* Scan for updates to PCI link routes (0x60-0x63). */
> +    for (i = 0; i < PIIX_NUM_PIRQS; i++) {
> +        uint8_t v = dev->config_read(dev, PIIX_PIRQCA + i, 1);
> +        if (v & 0x80) {
> +            v = 0;
> +        }
> +        v &= 0xf;
> +        xen_set_pci_link_route(i, v);
> +    }
> +}
> +
>  /* PC hardware initialisation */
>  static void pc_init1(MachineState *machine,
>                       const char *host_type, const char *pci_type)
> @@ -84,7 +112,7 @@ static void pc_init1(MachineState *machine,
>      MemoryRegion *system_io = get_system_io();
>      PCIBus *pci_bus;
>      ISABus *isa_bus;
> -    int piix3_devfn = -1;
> +    Object *piix4_pm;
>      qemu_irq smi_irq;
>      GSIState *gsi_state;
>      BusState *idebus[MAX_IDE_BUS];
> @@ -205,10 +233,9 @@ static void pc_init1(MachineState *machine,
>      gsi_state = pc_gsi_create(&x86ms->gsi, pcmc->pci_enabled);
>  
>      if (pcmc->pci_enabled) {
> -        PIIX3State *piix3;
> +        DeviceState *dev;
>          PCIDevice *pci_dev;
> -        const char *type = xen_enabled() ? TYPE_PIIX3_XEN_DEVICE
> -                                         : TYPE_PIIX3_DEVICE;
> +        int i;
>  
>          pci_bus = i440fx_init(pci_type,
>                                i440fx_host,
> @@ -216,21 +243,65 @@ static void pc_init1(MachineState *machine,
>                                x86ms->below_4g_mem_size,
>                                x86ms->above_4g_mem_size,
>                                pci_memory, ram_memory);
> +        pci_bus_map_irqs(pci_bus,
> +                         xen_enabled() ? xen_pci_slot_get_pirq
> +                                       : pci_slot_get_pirq);
>          pcms->bus = pci_bus;
>  
> -        pci_dev = pci_create_simple_multifunction(pci_bus, -1, true, type);
> -        piix3 = PIIX3_PCI_DEVICE(pci_dev);
> -        piix3->pic = x86ms->gsi;
> -        piix3_devfn = piix3->dev.devfn;
> -        isa_bus = ISA_BUS(qdev_get_child_bus(DEVICE(piix3), "isa.0"));
> +        pci_dev = pci_new_multifunction(-1, true, TYPE_PIIX3_DEVICE);
> +        object_property_set_bool(OBJECT(pci_dev), "has-usb",
> +                                 machine_usb(machine), &error_abort);
> +        object_property_set_bool(OBJECT(pci_dev), "has-acpi",
> +                                 x86_machine_is_acpi_enabled(x86ms),
> +                                 &error_abort);
> +        qdev_prop_set_uint32(DEVICE(pci_dev), "smb_io_base", 0xb100);
> +        object_property_set_bool(OBJECT(pci_dev), "smm-enabled",
> +                                 x86_machine_is_smm_enabled(x86ms),
> +                                 &error_abort);
> +        pci_realize_and_unref(pci_dev, pci_bus, &error_fatal);
> +
> +        if (xen_enabled()) {
> +            pci_device_set_intx_routing_notifier(
> +                        pci_dev, piix_intx_routing_notifier_xen);
> +
> +            /*
> +             * Xen supports additional interrupt routes from the PCI devices to
> +             * the IOAPIC: the four pins of each PCI device on the bus are also
> +             * connected to the IOAPIC directly.
> +             * These additional routes can be discovered through ACPI.
> +             */
> +            pci_bus_irqs(pci_bus, xen_intx_set_irq, pci_dev,
> +                         XEN_IOAPIC_NUM_PIRQS);
> +        }
> +
> +        dev = DEVICE(object_resolve_path_component(OBJECT(pci_dev), "pic"));
> +        for (i = 0; i < ISA_NUM_IRQS; i++) {
> +            qdev_connect_gpio_out(dev, i, x86ms->gsi[i]);
> +        }
> +        isa_bus = ISA_BUS(qdev_get_child_bus(DEVICE(pci_dev), "isa.0"));
> +        rtc_state = ISA_DEVICE(object_resolve_path_component(OBJECT(pci_dev),
> +                                                             "rtc"));
> +        piix4_pm = object_resolve_path_component(OBJECT(pci_dev), "pm");
> +        dev = DEVICE(object_resolve_path_component(OBJECT(pci_dev), "ide"));
> +        pci_ide_create_devs(PCI_DEVICE(dev));
> +        idebus[0] = qdev_get_child_bus(dev, "ide.0");
> +        idebus[1] = qdev_get_child_bus(dev, "ide.1");
>      } else {
>          pci_bus = NULL;
> +        piix4_pm = NULL;
>          isa_bus = isa_bus_new(NULL, get_system_memory(), system_io,
>                                &error_abort);
> +        isa_bus_irqs(isa_bus, x86ms->gsi);
> +
> +        rtc_state = isa_new(TYPE_MC146818_RTC);
> +        qdev_prop_set_int32(DEVICE(rtc_state), "base_year", 2000);
> +        isa_realize_and_unref(rtc_state, isa_bus, &error_fatal);
> +
>          i8257_dma_init(isa_bus, 0);
>          pcms->hpet_enabled = false;
> +        idebus[0] = NULL;
> +        idebus[1] = NULL;
>      }
> -    isa_bus_irqs(isa_bus, x86ms->gsi);
>  
>      if (x86ms->pic == ON_OFF_AUTO_ON || x86ms->pic == ON_OFF_AUTO_AUTO) {
>          pc_i8259_create(isa_bus, gsi_state->i8259_irq);
> @@ -252,18 +323,12 @@ static void pc_init1(MachineState *machine,
>      }
>  
>      /* init basic PC hardware */
> -    pc_basic_device_init(pcms, isa_bus, x86ms->gsi, &rtc_state, true,
> +    pc_basic_device_init(pcms, isa_bus, x86ms->gsi, rtc_state, true,
>                           0x4);
>  
>      pc_nic_init(pcmc, isa_bus, pci_bus);
>  
>      if (pcmc->pci_enabled) {
> -        PCIDevice *dev;
> -
> -        dev = pci_create_simple(pci_bus, piix3_devfn + 1, TYPE_PIIX3_IDE);
> -        pci_ide_create_devs(dev);
> -        idebus[0] = qdev_get_child_bus(&dev->qdev, "ide.0");
> -        idebus[1] = qdev_get_child_bus(&dev->qdev, "ide.1");
>          pc_cmos_init(pcms, idebus[0], idebus[1], rtc_state);
>      }
>  #ifdef CONFIG_IDE_ISA
> @@ -289,21 +354,9 @@ static void pc_init1(MachineState *machine,
>      }
>  #endif
>  
> -    if (pcmc->pci_enabled && machine_usb(machine)) {
> -        pci_create_simple(pci_bus, piix3_devfn + 2, "piix3-usb-uhci");
> -    }
> -
> -    if (pcmc->pci_enabled && x86_machine_is_acpi_enabled(X86_MACHINE(pcms))) {
> -        PCIDevice *piix4_pm;
> -
> +    if (piix4_pm) {
>          smi_irq = qemu_allocate_irq(pc_acpi_smi_interrupt, first_cpu, 0);
> -        piix4_pm = pci_new(piix3_devfn + 3, TYPE_PIIX4_PM);
> -        qdev_prop_set_uint32(DEVICE(piix4_pm), "smb_io_base", 0xb100);
> -        qdev_prop_set_bit(DEVICE(piix4_pm), "smm-enabled",
> -                          x86_machine_is_smm_enabled(x86ms));
> -        pci_realize_and_unref(piix4_pm, pci_bus, &error_fatal);
>  
> -        qdev_connect_gpio_out(DEVICE(piix4_pm), 0, x86ms->gsi[9]);
>          qdev_connect_gpio_out_named(DEVICE(piix4_pm), "smi-irq", 0, smi_irq);
>          pcms->smbus = I2C_BUS(qdev_get_child_bus(DEVICE(piix4_pm), "i2c"));
>          /* TODO: Populate SPD eeprom data.  */
> @@ -315,7 +368,7 @@ static void pc_init1(MachineState *machine,
>                                   object_property_allow_set_link,
>                                   OBJ_PROP_LINK_STRONG);
>          object_property_set_link(OBJECT(machine), PC_MACHINE_ACPI_DEVICE_PROP,
> -                                 OBJECT(piix4_pm), &error_abort);
> +                                 piix4_pm, &error_abort);
>      }
>  
>      if (machine->nvdimms_state->is_enabled) {
> @@ -405,6 +458,9 @@ static void pc_xen_hvm_init(MachineState *machine)
>      }
>  
>      pc_xen_hvm_init_pci(machine);
> +    if (xen_igd_gfx_pt_enabled()) {
> +        xen_igd_reserve_slot(pcms->bus);
> +    }
>      pci_create_simple(pcms->bus, -1, "xen-platform");
>  }
>  #endif
> @@ -441,6 +497,11 @@ static void pc_i440fx_8_0_machine_options(MachineClass *m)
>      pc_i440fx_machine_options(m);
>      m->alias = "pc";
>      m->is_default = true;
> +#ifdef CONFIG_MICROVM_DEFAULT
> +    m->is_default = false;
> +#else
> +    m->is_default = true;
> +#endif
>  }
>  
>  DEFINE_I440FX_MACHINE(v8_0, "pc-i440fx-8.0", NULL,


Lots of changes here not guarded by CONFIG_XEN.

-- 
MST
Re: [PATCH v7] xen/pt: reserve PCI slot 2 for Intel igd-passthru
Posted by Chuck Zmudzinski 1 year, 3 months ago
On 1/9/23 6:33 PM, Michael S. Tsirkin wrote:
> On Mon, Jan 09, 2023 at 04:55:42PM -0500, Chuck Zmudzinski wrote:
>> Intel specifies that the Intel IGD must occupy slot 2 on the PCI bus,
>> as noted in docs/igd-assign.txt in the Qemu source code.
>> 
>> Currently, when the xl toolstack is used to configure a Xen HVM guest with
>> Intel IGD passthrough to the guest with the Qemu upstream device model,
>> a Qemu emulated PCI device will occupy slot 2 and the Intel IGD will occupy
>> a different slot. This problem often prevents the guest from booting.
>> 
>> The only available workaround is not good: Configure Xen HVM guests to use
>> the old and no longer maintained Qemu traditional device model available
>> from xenbits.xen.org which does reserve slot 2 for the Intel IGD.
>> 
>> To implement this feature in the Qemu upstream device model for Xen HVM
>> guests, introduce the following new functions, types, and macros:
>> 
>> * XEN_PT_DEVICE_CLASS declaration, based on the existing TYPE_XEN_PT_DEVICE
>> * XEN_PT_DEVICE_GET_CLASS macro helper function for XEN_PT_DEVICE_CLASS
>> * typedef XenPTQdevRealize function pointer
>> * XEN_PCI_IGD_SLOT_MASK, the value of slot_reserved_mask to reserve slot 2
>> * xen_igd_reserve_slot and xen_igd_clear_slot functions
>> 
>> The new xen_igd_reserve_slot function uses the existing slot_reserved_mask
>> member of PCIBus to reserve PCI slot 2 for Xen HVM guests configured using
>> the xl toolstack with the gfx_passthru option enabled, which sets the
>> igd-passthru=on option to Qemu for the Xen HVM machine type.
>> 
>> The new xen_igd_reserve_slot function also needs to be implemented in
>> hw/xen/xen_pt_stub.c to prevent FTBFS during the link stage for the case
>> when Qemu is configured with --enable-xen and --disable-xen-pci-passthrough,
>> in which case it does nothing.
>> 
>> The new xen_igd_clear_slot function overrides qdev->realize of the parent
>> PCI device class to enable the Intel IGD to occupy slot 2 on the PCI bus
>> since slot 2 was reserved by xen_igd_reserve_slot when the PCI bus was
>> created in hw/i386/pc_piix.c for the case when igd-passthru=on.
>> 
>> Move the call to xen_host_pci_device_get, and the associated error
>> handling, from xen_pt_realize to the new xen_igd_clear_slot function to
>> initialize the device class and vendor values which enables the checks for
>> the Intel IGD to succeed. The verification that the host device is an
>> Intel IGD to be passed through is done by checking the domain, bus, slot,
>> and function values as well as by checking that gfx_passthru is enabled,
>> the device class is VGA, and the device vendor in Intel.
>> 
>> Signed-off-by: Chuck Zmudzinski <brchuckz@aol.com>
>> ---
>> Notes that might be helpful to reviewers of patched code in hw/xen:
>> 
>> The new functions and types are based on recommendations from Qemu docs:
>> https://qemu.readthedocs.io/en/latest/devel/qom.html
>> 
>> Notes that might be helpful to reviewers of patched code in hw/i386:
>> 
>> The small patch to hw/i386/pc_piix.c is protected by CONFIG_XEN so it does
>> not affect builds that do not have CONFIG_XEN defined.
> 
> I'm not sure how you can claim that.

I mean the small patch to pc_piix.c in this patch sits
between an "#ifdef CONFIG_XEN" and the corresponding
"#endif" so the preprocessor will exclude it when CONFIG_XEN
is not defined. In other words, my patch is part of the
xen-specific code in pc_piix.c. Or am I missing something?


> 
> ...
> 
>> diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
>> index b48047f50c..34a9736b5e 100644
>> --- a/hw/i386/pc_piix.c
>> +++ b/hw/i386/pc_piix.c
>> @@ -32,6 +32,7 @@
>>  #include "hw/i386/pc.h"
>>  #include "hw/i386/apic.h"
>>  #include "hw/pci-host/i440fx.h"
>> +#include "hw/rtc/mc146818rtc.h"
>>  #include "hw/southbridge/piix.h"
>>  #include "hw/display/ramfb.h"
>>  #include "hw/firmware/smbios.h"
>> @@ -40,16 +41,16 @@
>>  #include "hw/usb.h"
>>  #include "net/net.h"
>>  #include "hw/ide/pci.h"
>> -#include "hw/ide/piix.h"
>>  #include "hw/irq.h"
>>  #include "sysemu/kvm.h"
>>  #include "hw/kvm/clock.h"
>>  #include "hw/sysbus.h"
>> +#include "hw/i2c/i2c.h"
>>  #include "hw/i2c/smbus_eeprom.h"
>>  #include "hw/xen/xen-x86.h"
>> +#include "hw/xen/xen.h"
>>  #include "exec/memory.h"
>>  #include "hw/acpi/acpi.h"
>> -#include "hw/acpi/piix4.h"
>>  #include "qapi/error.h"
>>  #include "qemu/error-report.h"
>>  #include "sysemu/xen.h"
>> @@ -66,6 +67,7 @@
>>  #include "kvm/kvm-cpu.h"
>>  
>>  #define MAX_IDE_BUS 2
>> +#define XEN_IOAPIC_NUM_PIRQS 128ULL
>>  
>>  #ifdef CONFIG_IDE_ISA
>>  static const int ide_iobase[MAX_IDE_BUS] = { 0x1f0, 0x170 };
>> @@ -73,6 +75,32 @@ static const int ide_iobase2[MAX_IDE_BUS] = { 0x3f6, 0x376 };
>>  static const int ide_irq[MAX_IDE_BUS] = { 14, 15 };
>>  #endif
>>  
>> +/*
>> + * Return the global irq number corresponding to a given device irq
>> + * pin. We could also use the bus number to have a more precise mapping.
>> + */
>> +static int pci_slot_get_pirq(PCIDevice *pci_dev, int pci_intx)
>> +{
>> +    int slot_addend;
>> +    slot_addend = PCI_SLOT(pci_dev->devfn) - 1;
>> +    return (pci_intx + slot_addend) & 3;
>> +}
>> +
>> +static void piix_intx_routing_notifier_xen(PCIDevice *dev)
>> +{
>> +    int i;
>> +
>> +    /* Scan for updates to PCI link routes (0x60-0x63). */
>> +    for (i = 0; i < PIIX_NUM_PIRQS; i++) {
>> +        uint8_t v = dev->config_read(dev, PIIX_PIRQCA + i, 1);
>> +        if (v & 0x80) {
>> +            v = 0;
>> +        }
>> +        v &= 0xf;
>> +        xen_set_pci_link_route(i, v);
>> +    }
>> +}
>> +
>>  /* PC hardware initialisation */
>>  static void pc_init1(MachineState *machine,
>>                       const char *host_type, const char *pci_type)
>> @@ -84,7 +112,7 @@ static void pc_init1(MachineState *machine,
>>      MemoryRegion *system_io = get_system_io();
>>      PCIBus *pci_bus;
>>      ISABus *isa_bus;
>> -    int piix3_devfn = -1;
>> +    Object *piix4_pm;
>>      qemu_irq smi_irq;
>>      GSIState *gsi_state;
>>      BusState *idebus[MAX_IDE_BUS];
>> @@ -205,10 +233,9 @@ static void pc_init1(MachineState *machine,
>>      gsi_state = pc_gsi_create(&x86ms->gsi, pcmc->pci_enabled);
>>  
>>      if (pcmc->pci_enabled) {
>> -        PIIX3State *piix3;
>> +        DeviceState *dev;
>>          PCIDevice *pci_dev;
>> -        const char *type = xen_enabled() ? TYPE_PIIX3_XEN_DEVICE
>> -                                         : TYPE_PIIX3_DEVICE;
>> +        int i;
>>  
>>          pci_bus = i440fx_init(pci_type,
>>                                i440fx_host,
>> @@ -216,21 +243,65 @@ static void pc_init1(MachineState *machine,
>>                                x86ms->below_4g_mem_size,
>>                                x86ms->above_4g_mem_size,
>>                                pci_memory, ram_memory);
>> +        pci_bus_map_irqs(pci_bus,
>> +                         xen_enabled() ? xen_pci_slot_get_pirq
>> +                                       : pci_slot_get_pirq);
>>          pcms->bus = pci_bus;
>>  
>> -        pci_dev = pci_create_simple_multifunction(pci_bus, -1, true, type);
>> -        piix3 = PIIX3_PCI_DEVICE(pci_dev);
>> -        piix3->pic = x86ms->gsi;
>> -        piix3_devfn = piix3->dev.devfn;
>> -        isa_bus = ISA_BUS(qdev_get_child_bus(DEVICE(piix3), "isa.0"));
>> +        pci_dev = pci_new_multifunction(-1, true, TYPE_PIIX3_DEVICE);
>> +        object_property_set_bool(OBJECT(pci_dev), "has-usb",
>> +                                 machine_usb(machine), &error_abort);
>> +        object_property_set_bool(OBJECT(pci_dev), "has-acpi",
>> +                                 x86_machine_is_acpi_enabled(x86ms),
>> +                                 &error_abort);
>> +        qdev_prop_set_uint32(DEVICE(pci_dev), "smb_io_base", 0xb100);
>> +        object_property_set_bool(OBJECT(pci_dev), "smm-enabled",
>> +                                 x86_machine_is_smm_enabled(x86ms),
>> +                                 &error_abort);
>> +        pci_realize_and_unref(pci_dev, pci_bus, &error_fatal);
>> +
>> +        if (xen_enabled()) {
>> +            pci_device_set_intx_routing_notifier(
>> +                        pci_dev, piix_intx_routing_notifier_xen);
>> +
>> +            /*
>> +             * Xen supports additional interrupt routes from the PCI devices to
>> +             * the IOAPIC: the four pins of each PCI device on the bus are also
>> +             * connected to the IOAPIC directly.
>> +             * These additional routes can be discovered through ACPI.
>> +             */
>> +            pci_bus_irqs(pci_bus, xen_intx_set_irq, pci_dev,
>> +                         XEN_IOAPIC_NUM_PIRQS);
>> +        }
>> +
>> +        dev = DEVICE(object_resolve_path_component(OBJECT(pci_dev), "pic"));
>> +        for (i = 0; i < ISA_NUM_IRQS; i++) {
>> +            qdev_connect_gpio_out(dev, i, x86ms->gsi[i]);
>> +        }
>> +        isa_bus = ISA_BUS(qdev_get_child_bus(DEVICE(pci_dev), "isa.0"));
>> +        rtc_state = ISA_DEVICE(object_resolve_path_component(OBJECT(pci_dev),
>> +                                                             "rtc"));
>> +        piix4_pm = object_resolve_path_component(OBJECT(pci_dev), "pm");
>> +        dev = DEVICE(object_resolve_path_component(OBJECT(pci_dev), "ide"));
>> +        pci_ide_create_devs(PCI_DEVICE(dev));
>> +        idebus[0] = qdev_get_child_bus(dev, "ide.0");
>> +        idebus[1] = qdev_get_child_bus(dev, "ide.1");
>>      } else {
>>          pci_bus = NULL;
>> +        piix4_pm = NULL;
>>          isa_bus = isa_bus_new(NULL, get_system_memory(), system_io,
>>                                &error_abort);
>> +        isa_bus_irqs(isa_bus, x86ms->gsi);
>> +
>> +        rtc_state = isa_new(TYPE_MC146818_RTC);
>> +        qdev_prop_set_int32(DEVICE(rtc_state), "base_year", 2000);
>> +        isa_realize_and_unref(rtc_state, isa_bus, &error_fatal);
>> +
>>          i8257_dma_init(isa_bus, 0);
>>          pcms->hpet_enabled = false;
>> +        idebus[0] = NULL;
>> +        idebus[1] = NULL;
>>      }
>> -    isa_bus_irqs(isa_bus, x86ms->gsi);
>>  
>>      if (x86ms->pic == ON_OFF_AUTO_ON || x86ms->pic == ON_OFF_AUTO_AUTO) {
>>          pc_i8259_create(isa_bus, gsi_state->i8259_irq);
>> @@ -252,18 +323,12 @@ static void pc_init1(MachineState *machine,
>>      }
>>  
>>      /* init basic PC hardware */
>> -    pc_basic_device_init(pcms, isa_bus, x86ms->gsi, &rtc_state, true,
>> +    pc_basic_device_init(pcms, isa_bus, x86ms->gsi, rtc_state, true,
>>                           0x4);
>>  
>>      pc_nic_init(pcmc, isa_bus, pci_bus);
>>  
>>      if (pcmc->pci_enabled) {
>> -        PCIDevice *dev;
>> -
>> -        dev = pci_create_simple(pci_bus, piix3_devfn + 1, TYPE_PIIX3_IDE);
>> -        pci_ide_create_devs(dev);
>> -        idebus[0] = qdev_get_child_bus(&dev->qdev, "ide.0");
>> -        idebus[1] = qdev_get_child_bus(&dev->qdev, "ide.1");
>>          pc_cmos_init(pcms, idebus[0], idebus[1], rtc_state);
>>      }
>>  #ifdef CONFIG_IDE_ISA
>> @@ -289,21 +354,9 @@ static void pc_init1(MachineState *machine,
>>      }
>>  #endif
>>  
>> -    if (pcmc->pci_enabled && machine_usb(machine)) {
>> -        pci_create_simple(pci_bus, piix3_devfn + 2, "piix3-usb-uhci");
>> -    }
>> -
>> -    if (pcmc->pci_enabled && x86_machine_is_acpi_enabled(X86_MACHINE(pcms))) {
>> -        PCIDevice *piix4_pm;
>> -
>> +    if (piix4_pm) {
>>          smi_irq = qemu_allocate_irq(pc_acpi_smi_interrupt, first_cpu, 0);
>> -        piix4_pm = pci_new(piix3_devfn + 3, TYPE_PIIX4_PM);
>> -        qdev_prop_set_uint32(DEVICE(piix4_pm), "smb_io_base", 0xb100);
>> -        qdev_prop_set_bit(DEVICE(piix4_pm), "smm-enabled",
>> -                          x86_machine_is_smm_enabled(x86ms));
>> -        pci_realize_and_unref(piix4_pm, pci_bus, &error_fatal);
>>  
>> -        qdev_connect_gpio_out(DEVICE(piix4_pm), 0, x86ms->gsi[9]);
>>          qdev_connect_gpio_out_named(DEVICE(piix4_pm), "smi-irq", 0, smi_irq);
>>          pcms->smbus = I2C_BUS(qdev_get_child_bus(DEVICE(piix4_pm), "i2c"));
>>          /* TODO: Populate SPD eeprom data.  */
>> @@ -315,7 +368,7 @@ static void pc_init1(MachineState *machine,
>>                                   object_property_allow_set_link,
>>                                   OBJ_PROP_LINK_STRONG);
>>          object_property_set_link(OBJECT(machine), PC_MACHINE_ACPI_DEVICE_PROP,
>> -                                 OBJECT(piix4_pm), &error_abort);
>> +                                 piix4_pm, &error_abort);
>>      }
>>  
>>      if (machine->nvdimms_state->is_enabled) {
>> @@ -405,6 +458,9 @@ static void pc_xen_hvm_init(MachineState *machine)
>>      }
>>  
>>      pc_xen_hvm_init_pci(machine);
>> +    if (xen_igd_gfx_pt_enabled()) {
>> +        xen_igd_reserve_slot(pcms->bus);
>> +    }
>>      pci_create_simple(pcms->bus, -1, "xen-platform");
>>  }
>>  #endif
>> @@ -441,6 +497,11 @@ static void pc_i440fx_8_0_machine_options(MachineClass *m)
>>      pc_i440fx_machine_options(m);
>>      m->alias = "pc";
>>      m->is_default = true;
>> +#ifdef CONFIG_MICROVM_DEFAULT
>> +    m->is_default = false;
>> +#else
>> +    m->is_default = true;
>> +#endif
>>  }
>>  
>>  DEFINE_I440FX_MACHINE(v8_0, "pc-i440fx-8.0", NULL,
> 
> 
> Lots of changes here not guarded by CONFIG_XEN.
> 

What diff is this? How is my patch related to it?


Re: [PATCH v7] xen/pt: reserve PCI slot 2 for Intel igd-passthru
Posted by Michael S. Tsirkin 1 year, 3 months ago
On Mon, Jan 09, 2023 at 07:05:35PM -0500, Chuck Zmudzinski wrote:
> On 1/9/23 6:33 PM, Michael S. Tsirkin wrote:
> > On Mon, Jan 09, 2023 at 04:55:42PM -0500, Chuck Zmudzinski wrote:
> >> Intel specifies that the Intel IGD must occupy slot 2 on the PCI bus,
> >> as noted in docs/igd-assign.txt in the Qemu source code.
> >> 
> >> Currently, when the xl toolstack is used to configure a Xen HVM guest with
> >> Intel IGD passthrough to the guest with the Qemu upstream device model,
> >> a Qemu emulated PCI device will occupy slot 2 and the Intel IGD will occupy
> >> a different slot. This problem often prevents the guest from booting.
> >> 
> >> The only available workaround is not good: Configure Xen HVM guests to use
> >> the old and no longer maintained Qemu traditional device model available
> >> from xenbits.xen.org which does reserve slot 2 for the Intel IGD.
> >> 
> >> To implement this feature in the Qemu upstream device model for Xen HVM
> >> guests, introduce the following new functions, types, and macros:
> >> 
> >> * XEN_PT_DEVICE_CLASS declaration, based on the existing TYPE_XEN_PT_DEVICE
> >> * XEN_PT_DEVICE_GET_CLASS macro helper function for XEN_PT_DEVICE_CLASS
> >> * typedef XenPTQdevRealize function pointer
> >> * XEN_PCI_IGD_SLOT_MASK, the value of slot_reserved_mask to reserve slot 2
> >> * xen_igd_reserve_slot and xen_igd_clear_slot functions
> >> 
> >> The new xen_igd_reserve_slot function uses the existing slot_reserved_mask
> >> member of PCIBus to reserve PCI slot 2 for Xen HVM guests configured using
> >> the xl toolstack with the gfx_passthru option enabled, which sets the
> >> igd-passthru=on option to Qemu for the Xen HVM machine type.
> >> 
> >> The new xen_igd_reserve_slot function also needs to be implemented in
> >> hw/xen/xen_pt_stub.c to prevent FTBFS during the link stage for the case
> >> when Qemu is configured with --enable-xen and --disable-xen-pci-passthrough,
> >> in which case it does nothing.
> >> 
> >> The new xen_igd_clear_slot function overrides qdev->realize of the parent
> >> PCI device class to enable the Intel IGD to occupy slot 2 on the PCI bus
> >> since slot 2 was reserved by xen_igd_reserve_slot when the PCI bus was
> >> created in hw/i386/pc_piix.c for the case when igd-passthru=on.
> >> 
> >> Move the call to xen_host_pci_device_get, and the associated error
> >> handling, from xen_pt_realize to the new xen_igd_clear_slot function to
> >> initialize the device class and vendor values which enables the checks for
> >> the Intel IGD to succeed. The verification that the host device is an
> >> Intel IGD to be passed through is done by checking the domain, bus, slot,
> >> and function values as well as by checking that gfx_passthru is enabled,
> >> the device class is VGA, and the device vendor in Intel.
> >> 
> >> Signed-off-by: Chuck Zmudzinski <brchuckz@aol.com>
> >> ---
> >> Notes that might be helpful to reviewers of patched code in hw/xen:
> >> 
> >> The new functions and types are based on recommendations from Qemu docs:
> >> https://qemu.readthedocs.io/en/latest/devel/qom.html
> >> 
> >> Notes that might be helpful to reviewers of patched code in hw/i386:
> >> 
> >> The small patch to hw/i386/pc_piix.c is protected by CONFIG_XEN so it does
> >> not affect builds that do not have CONFIG_XEN defined.
> > 
> > I'm not sure how you can claim that.
> 
> I mean the small patch to pc_piix.c in this patch sits
> between an "#ifdef CONFIG_XEN" and the corresponding
> "#endif" so the preprocessor will exclude it when CONFIG_XEN
> is not defined. In other words, my patch is part of the
> xen-specific code in pc_piix.c. Or am I missing something?
> 
> 
> > 
> > ...
> > 
> >> diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
> >> index b48047f50c..34a9736b5e 100644
> >> --- a/hw/i386/pc_piix.c
> >> +++ b/hw/i386/pc_piix.c
> >> @@ -32,6 +32,7 @@
> >>  #include "hw/i386/pc.h"
> >>  #include "hw/i386/apic.h"
> >>  #include "hw/pci-host/i440fx.h"
> >> +#include "hw/rtc/mc146818rtc.h"
> >>  #include "hw/southbridge/piix.h"
> >>  #include "hw/display/ramfb.h"
> >>  #include "hw/firmware/smbios.h"
> >> @@ -40,16 +41,16 @@
> >>  #include "hw/usb.h"
> >>  #include "net/net.h"
> >>  #include "hw/ide/pci.h"
> >> -#include "hw/ide/piix.h"
> >>  #include "hw/irq.h"
> >>  #include "sysemu/kvm.h"
> >>  #include "hw/kvm/clock.h"
> >>  #include "hw/sysbus.h"
> >> +#include "hw/i2c/i2c.h"
> >>  #include "hw/i2c/smbus_eeprom.h"
> >>  #include "hw/xen/xen-x86.h"
> >> +#include "hw/xen/xen.h"
> >>  #include "exec/memory.h"
> >>  #include "hw/acpi/acpi.h"
> >> -#include "hw/acpi/piix4.h"
> >>  #include "qapi/error.h"
> >>  #include "qemu/error-report.h"
> >>  #include "sysemu/xen.h"
> >> @@ -66,6 +67,7 @@
> >>  #include "kvm/kvm-cpu.h"
> >>  
> >>  #define MAX_IDE_BUS 2
> >> +#define XEN_IOAPIC_NUM_PIRQS 128ULL
> >>  
> >>  #ifdef CONFIG_IDE_ISA
> >>  static const int ide_iobase[MAX_IDE_BUS] = { 0x1f0, 0x170 };
> >> @@ -73,6 +75,32 @@ static const int ide_iobase2[MAX_IDE_BUS] = { 0x3f6, 0x376 };
> >>  static const int ide_irq[MAX_IDE_BUS] = { 14, 15 };
> >>  #endif
> >>  
> >> +/*
> >> + * Return the global irq number corresponding to a given device irq
> >> + * pin. We could also use the bus number to have a more precise mapping.
> >> + */
> >> +static int pci_slot_get_pirq(PCIDevice *pci_dev, int pci_intx)
> >> +{
> >> +    int slot_addend;
> >> +    slot_addend = PCI_SLOT(pci_dev->devfn) - 1;
> >> +    return (pci_intx + slot_addend) & 3;
> >> +}
> >> +
> >> +static void piix_intx_routing_notifier_xen(PCIDevice *dev)
> >> +{
> >> +    int i;
> >> +
> >> +    /* Scan for updates to PCI link routes (0x60-0x63). */
> >> +    for (i = 0; i < PIIX_NUM_PIRQS; i++) {
> >> +        uint8_t v = dev->config_read(dev, PIIX_PIRQCA + i, 1);
> >> +        if (v & 0x80) {
> >> +            v = 0;
> >> +        }
> >> +        v &= 0xf;
> >> +        xen_set_pci_link_route(i, v);
> >> +    }
> >> +}
> >> +
> >>  /* PC hardware initialisation */
> >>  static void pc_init1(MachineState *machine,
> >>                       const char *host_type, const char *pci_type)
> >> @@ -84,7 +112,7 @@ static void pc_init1(MachineState *machine,
> >>      MemoryRegion *system_io = get_system_io();
> >>      PCIBus *pci_bus;
> >>      ISABus *isa_bus;
> >> -    int piix3_devfn = -1;
> >> +    Object *piix4_pm;
> >>      qemu_irq smi_irq;
> >>      GSIState *gsi_state;
> >>      BusState *idebus[MAX_IDE_BUS];
> >> @@ -205,10 +233,9 @@ static void pc_init1(MachineState *machine,
> >>      gsi_state = pc_gsi_create(&x86ms->gsi, pcmc->pci_enabled);
> >>  
> >>      if (pcmc->pci_enabled) {
> >> -        PIIX3State *piix3;
> >> +        DeviceState *dev;
> >>          PCIDevice *pci_dev;
> >> -        const char *type = xen_enabled() ? TYPE_PIIX3_XEN_DEVICE
> >> -                                         : TYPE_PIIX3_DEVICE;
> >> +        int i;
> >>  
> >>          pci_bus = i440fx_init(pci_type,
> >>                                i440fx_host,
> >> @@ -216,21 +243,65 @@ static void pc_init1(MachineState *machine,
> >>                                x86ms->below_4g_mem_size,
> >>                                x86ms->above_4g_mem_size,
> >>                                pci_memory, ram_memory);
> >> +        pci_bus_map_irqs(pci_bus,
> >> +                         xen_enabled() ? xen_pci_slot_get_pirq
> >> +                                       : pci_slot_get_pirq);
> >>          pcms->bus = pci_bus;
> >>  
> >> -        pci_dev = pci_create_simple_multifunction(pci_bus, -1, true, type);
> >> -        piix3 = PIIX3_PCI_DEVICE(pci_dev);
> >> -        piix3->pic = x86ms->gsi;
> >> -        piix3_devfn = piix3->dev.devfn;
> >> -        isa_bus = ISA_BUS(qdev_get_child_bus(DEVICE(piix3), "isa.0"));
> >> +        pci_dev = pci_new_multifunction(-1, true, TYPE_PIIX3_DEVICE);
> >> +        object_property_set_bool(OBJECT(pci_dev), "has-usb",
> >> +                                 machine_usb(machine), &error_abort);
> >> +        object_property_set_bool(OBJECT(pci_dev), "has-acpi",
> >> +                                 x86_machine_is_acpi_enabled(x86ms),
> >> +                                 &error_abort);
> >> +        qdev_prop_set_uint32(DEVICE(pci_dev), "smb_io_base", 0xb100);
> >> +        object_property_set_bool(OBJECT(pci_dev), "smm-enabled",
> >> +                                 x86_machine_is_smm_enabled(x86ms),
> >> +                                 &error_abort);
> >> +        pci_realize_and_unref(pci_dev, pci_bus, &error_fatal);
> >> +
> >> +        if (xen_enabled()) {
> >> +            pci_device_set_intx_routing_notifier(
> >> +                        pci_dev, piix_intx_routing_notifier_xen);
> >> +
> >> +            /*
> >> +             * Xen supports additional interrupt routes from the PCI devices to
> >> +             * the IOAPIC: the four pins of each PCI device on the bus are also
> >> +             * connected to the IOAPIC directly.
> >> +             * These additional routes can be discovered through ACPI.
> >> +             */
> >> +            pci_bus_irqs(pci_bus, xen_intx_set_irq, pci_dev,
> >> +                         XEN_IOAPIC_NUM_PIRQS);
> >> +        }
> >> +
> >> +        dev = DEVICE(object_resolve_path_component(OBJECT(pci_dev), "pic"));
> >> +        for (i = 0; i < ISA_NUM_IRQS; i++) {
> >> +            qdev_connect_gpio_out(dev, i, x86ms->gsi[i]);
> >> +        }
> >> +        isa_bus = ISA_BUS(qdev_get_child_bus(DEVICE(pci_dev), "isa.0"));
> >> +        rtc_state = ISA_DEVICE(object_resolve_path_component(OBJECT(pci_dev),
> >> +                                                             "rtc"));
> >> +        piix4_pm = object_resolve_path_component(OBJECT(pci_dev), "pm");
> >> +        dev = DEVICE(object_resolve_path_component(OBJECT(pci_dev), "ide"));
> >> +        pci_ide_create_devs(PCI_DEVICE(dev));
> >> +        idebus[0] = qdev_get_child_bus(dev, "ide.0");
> >> +        idebus[1] = qdev_get_child_bus(dev, "ide.1");
> >>      } else {
> >>          pci_bus = NULL;
> >> +        piix4_pm = NULL;
> >>          isa_bus = isa_bus_new(NULL, get_system_memory(), system_io,
> >>                                &error_abort);
> >> +        isa_bus_irqs(isa_bus, x86ms->gsi);
> >> +
> >> +        rtc_state = isa_new(TYPE_MC146818_RTC);
> >> +        qdev_prop_set_int32(DEVICE(rtc_state), "base_year", 2000);
> >> +        isa_realize_and_unref(rtc_state, isa_bus, &error_fatal);
> >> +
> >>          i8257_dma_init(isa_bus, 0);
> >>          pcms->hpet_enabled = false;
> >> +        idebus[0] = NULL;
> >> +        idebus[1] = NULL;
> >>      }
> >> -    isa_bus_irqs(isa_bus, x86ms->gsi);
> >>  
> >>      if (x86ms->pic == ON_OFF_AUTO_ON || x86ms->pic == ON_OFF_AUTO_AUTO) {
> >>          pc_i8259_create(isa_bus, gsi_state->i8259_irq);
> >> @@ -252,18 +323,12 @@ static void pc_init1(MachineState *machine,
> >>      }
> >>  
> >>      /* init basic PC hardware */
> >> -    pc_basic_device_init(pcms, isa_bus, x86ms->gsi, &rtc_state, true,
> >> +    pc_basic_device_init(pcms, isa_bus, x86ms->gsi, rtc_state, true,
> >>                           0x4);
> >>  
> >>      pc_nic_init(pcmc, isa_bus, pci_bus);
> >>  
> >>      if (pcmc->pci_enabled) {
> >> -        PCIDevice *dev;
> >> -
> >> -        dev = pci_create_simple(pci_bus, piix3_devfn + 1, TYPE_PIIX3_IDE);
> >> -        pci_ide_create_devs(dev);
> >> -        idebus[0] = qdev_get_child_bus(&dev->qdev, "ide.0");
> >> -        idebus[1] = qdev_get_child_bus(&dev->qdev, "ide.1");
> >>          pc_cmos_init(pcms, idebus[0], idebus[1], rtc_state);
> >>      }
> >>  #ifdef CONFIG_IDE_ISA
> >> @@ -289,21 +354,9 @@ static void pc_init1(MachineState *machine,
> >>      }
> >>  #endif
> >>  
> >> -    if (pcmc->pci_enabled && machine_usb(machine)) {
> >> -        pci_create_simple(pci_bus, piix3_devfn + 2, "piix3-usb-uhci");
> >> -    }
> >> -
> >> -    if (pcmc->pci_enabled && x86_machine_is_acpi_enabled(X86_MACHINE(pcms))) {
> >> -        PCIDevice *piix4_pm;
> >> -
> >> +    if (piix4_pm) {
> >>          smi_irq = qemu_allocate_irq(pc_acpi_smi_interrupt, first_cpu, 0);
> >> -        piix4_pm = pci_new(piix3_devfn + 3, TYPE_PIIX4_PM);
> >> -        qdev_prop_set_uint32(DEVICE(piix4_pm), "smb_io_base", 0xb100);
> >> -        qdev_prop_set_bit(DEVICE(piix4_pm), "smm-enabled",
> >> -                          x86_machine_is_smm_enabled(x86ms));
> >> -        pci_realize_and_unref(piix4_pm, pci_bus, &error_fatal);
> >>  
> >> -        qdev_connect_gpio_out(DEVICE(piix4_pm), 0, x86ms->gsi[9]);
> >>          qdev_connect_gpio_out_named(DEVICE(piix4_pm), "smi-irq", 0, smi_irq);
> >>          pcms->smbus = I2C_BUS(qdev_get_child_bus(DEVICE(piix4_pm), "i2c"));
> >>          /* TODO: Populate SPD eeprom data.  */
> >> @@ -315,7 +368,7 @@ static void pc_init1(MachineState *machine,
> >>                                   object_property_allow_set_link,
> >>                                   OBJ_PROP_LINK_STRONG);
> >>          object_property_set_link(OBJECT(machine), PC_MACHINE_ACPI_DEVICE_PROP,
> >> -                                 OBJECT(piix4_pm), &error_abort);
> >> +                                 piix4_pm, &error_abort);
> >>      }
> >>  
> >>      if (machine->nvdimms_state->is_enabled) {
> >> @@ -405,6 +458,9 @@ static void pc_xen_hvm_init(MachineState *machine)
> >>      }
> >>  
> >>      pc_xen_hvm_init_pci(machine);
> >> +    if (xen_igd_gfx_pt_enabled()) {
> >> +        xen_igd_reserve_slot(pcms->bus);
> >> +    }
> >>      pci_create_simple(pcms->bus, -1, "xen-platform");
> >>  }
> >>  #endif
> >> @@ -441,6 +497,11 @@ static void pc_i440fx_8_0_machine_options(MachineClass *m)
> >>      pc_i440fx_machine_options(m);
> >>      m->alias = "pc";
> >>      m->is_default = true;
> >> +#ifdef CONFIG_MICROVM_DEFAULT
> >> +    m->is_default = false;
> >> +#else
> >> +    m->is_default = true;
> >> +#endif
> >>  }
> >>  
> >>  DEFINE_I440FX_MACHINE(v8_0, "pc-i440fx-8.0", NULL,
> > 
> > 
> > Lots of changes here not guarded by CONFIG_XEN.
> > 
> 
> What diff is this? How is my patch related to it?


This is what you posted, take a look:
https://lore.kernel.org/all/8349506149de6d81b0762f17623552c248439e93.1673297742.git.brchuckz@aol.com/


-- 
MST


Re: [PATCH v7] xen/pt: reserve PCI slot 2 for Intel igd-passthru
Posted by Chuck Zmudzinski 1 year, 3 months ago
On 1/10/2023 12:27 AM, Michael S. Tsirkin wrote:
> On Mon, Jan 09, 2023 at 07:05:35PM -0500, Chuck Zmudzinski wrote:
> > On 1/9/23 6:33 PM, Michael S. Tsirkin wrote:
> > > On Mon, Jan 09, 2023 at 04:55:42PM -0500, Chuck Zmudzinski wrote:
> > >> Intel specifies that the Intel IGD must occupy slot 2 on the PCI bus,
> > >> as noted in docs/igd-assign.txt in the Qemu source code.
> > >> 
> > >> Currently, when the xl toolstack is used to configure a Xen HVM guest with
> > >> Intel IGD passthrough to the guest with the Qemu upstream device model,
> > >> a Qemu emulated PCI device will occupy slot 2 and the Intel IGD will occupy
> > >> a different slot. This problem often prevents the guest from booting.
> > >> 
> > >> The only available workaround is not good: Configure Xen HVM guests to use
> > >> the old and no longer maintained Qemu traditional device model available
> > >> from xenbits.xen.org which does reserve slot 2 for the Intel IGD.
> > >> 
> > >> To implement this feature in the Qemu upstream device model for Xen HVM
> > >> guests, introduce the following new functions, types, and macros:
> > >> 
> > >> * XEN_PT_DEVICE_CLASS declaration, based on the existing TYPE_XEN_PT_DEVICE
> > >> * XEN_PT_DEVICE_GET_CLASS macro helper function for XEN_PT_DEVICE_CLASS
> > >> * typedef XenPTQdevRealize function pointer
> > >> * XEN_PCI_IGD_SLOT_MASK, the value of slot_reserved_mask to reserve slot 2
> > >> * xen_igd_reserve_slot and xen_igd_clear_slot functions
> > >> 
> > >> The new xen_igd_reserve_slot function uses the existing slot_reserved_mask
> > >> member of PCIBus to reserve PCI slot 2 for Xen HVM guests configured using
> > >> the xl toolstack with the gfx_passthru option enabled, which sets the
> > >> igd-passthru=on option to Qemu for the Xen HVM machine type.
> > >> 
> > >> The new xen_igd_reserve_slot function also needs to be implemented in
> > >> hw/xen/xen_pt_stub.c to prevent FTBFS during the link stage for the case
> > >> when Qemu is configured with --enable-xen and --disable-xen-pci-passthrough,
> > >> in which case it does nothing.
> > >> 
> > >> The new xen_igd_clear_slot function overrides qdev->realize of the parent
> > >> PCI device class to enable the Intel IGD to occupy slot 2 on the PCI bus
> > >> since slot 2 was reserved by xen_igd_reserve_slot when the PCI bus was
> > >> created in hw/i386/pc_piix.c for the case when igd-passthru=on.
> > >> 
> > >> Move the call to xen_host_pci_device_get, and the associated error
> > >> handling, from xen_pt_realize to the new xen_igd_clear_slot function to
> > >> initialize the device class and vendor values which enables the checks for
> > >> the Intel IGD to succeed. The verification that the host device is an
> > >> Intel IGD to be passed through is done by checking the domain, bus, slot,
> > >> and function values as well as by checking that gfx_passthru is enabled,
> > >> the device class is VGA, and the device vendor in Intel.
> > >> 
> > >> Signed-off-by: Chuck Zmudzinski <brchuckz@aol.com>
> > >> ---
> > >> Notes that might be helpful to reviewers of patched code in hw/xen:
> > >> 
> > >> The new functions and types are based on recommendations from Qemu docs:
> > >> https://qemu.readthedocs.io/en/latest/devel/qom.html
> > >> 
> > >> Notes that might be helpful to reviewers of patched code in hw/i386:
> > >> 
> > >> The small patch to hw/i386/pc_piix.c is protected by CONFIG_XEN so it does
> > >> not affect builds that do not have CONFIG_XEN defined.
> > > 
> > > I'm not sure how you can claim that.
> > 
> > I mean the small patch to pc_piix.c in this patch sits
> > between an "#ifdef CONFIG_XEN" and the corresponding
> > "#endif" so the preprocessor will exclude it when CONFIG_XEN
> > is not defined. In other words, my patch is part of the
> > xen-specific code in pc_piix.c. Or am I missing something?
> > 
> > 
> > > 
> > > ...
> > > 
> > >> diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
> > >> index b48047f50c..34a9736b5e 100644
> > >> --- a/hw/i386/pc_piix.c
> > >> +++ b/hw/i386/pc_piix.c
> > >> @@ -32,6 +32,7 @@
> > >>  #include "hw/i386/pc.h"
> > >>  #include "hw/i386/apic.h"
> > >>  #include "hw/pci-host/i440fx.h"
> > >> +#include "hw/rtc/mc146818rtc.h"
> > >>  #include "hw/southbridge/piix.h"
> > >>  #include "hw/display/ramfb.h"
> > >>  #include "hw/firmware/smbios.h"
> > >> @@ -40,16 +41,16 @@
> > >>  #include "hw/usb.h"
> > >>  #include "net/net.h"
> > >>  #include "hw/ide/pci.h"
> > >> -#include "hw/ide/piix.h"
> > >>  #include "hw/irq.h"
> > >>  #include "sysemu/kvm.h"
> > >>  #include "hw/kvm/clock.h"
> > >>  #include "hw/sysbus.h"
> > >> +#include "hw/i2c/i2c.h"
> > >>  #include "hw/i2c/smbus_eeprom.h"
> > >>  #include "hw/xen/xen-x86.h"
> > >> +#include "hw/xen/xen.h"
> > >>  #include "exec/memory.h"
> > >>  #include "hw/acpi/acpi.h"
> > >> -#include "hw/acpi/piix4.h"
> > >>  #include "qapi/error.h"
> > >>  #include "qemu/error-report.h"
> > >>  #include "sysemu/xen.h"
> > >> @@ -66,6 +67,7 @@
> > >>  #include "kvm/kvm-cpu.h"
> > >>  
> > >>  #define MAX_IDE_BUS 2
> > >> +#define XEN_IOAPIC_NUM_PIRQS 128ULL
> > >>  
> > >>  #ifdef CONFIG_IDE_ISA
> > >>  static const int ide_iobase[MAX_IDE_BUS] = { 0x1f0, 0x170 };
> > >> @@ -73,6 +75,32 @@ static const int ide_iobase2[MAX_IDE_BUS] = { 0x3f6, 0x376 };
> > >>  static const int ide_irq[MAX_IDE_BUS] = { 14, 15 };
> > >>  #endif
> > >>  
> > >> +/*
> > >> + * Return the global irq number corresponding to a given device irq
> > >> + * pin. We could also use the bus number to have a more precise mapping.
> > >> + */
> > >> +static int pci_slot_get_pirq(PCIDevice *pci_dev, int pci_intx)
> > >> +{
> > >> +    int slot_addend;
> > >> +    slot_addend = PCI_SLOT(pci_dev->devfn) - 1;
> > >> +    return (pci_intx + slot_addend) & 3;
> > >> +}
> > >> +
> > >> +static void piix_intx_routing_notifier_xen(PCIDevice *dev)
> > >> +{
> > >> +    int i;
> > >> +
> > >> +    /* Scan for updates to PCI link routes (0x60-0x63). */
> > >> +    for (i = 0; i < PIIX_NUM_PIRQS; i++) {
> > >> +        uint8_t v = dev->config_read(dev, PIIX_PIRQCA + i, 1);
> > >> +        if (v & 0x80) {
> > >> +            v = 0;
> > >> +        }
> > >> +        v &= 0xf;
> > >> +        xen_set_pci_link_route(i, v);
> > >> +    }
> > >> +}
> > >> +
> > >>  /* PC hardware initialisation */
> > >>  static void pc_init1(MachineState *machine,
> > >>                       const char *host_type, const char *pci_type)
> > >> @@ -84,7 +112,7 @@ static void pc_init1(MachineState *machine,
> > >>      MemoryRegion *system_io = get_system_io();
> > >>      PCIBus *pci_bus;
> > >>      ISABus *isa_bus;
> > >> -    int piix3_devfn = -1;
> > >> +    Object *piix4_pm;
> > >>      qemu_irq smi_irq;
> > >>      GSIState *gsi_state;
> > >>      BusState *idebus[MAX_IDE_BUS];
> > >> @@ -205,10 +233,9 @@ static void pc_init1(MachineState *machine,
> > >>      gsi_state = pc_gsi_create(&x86ms->gsi, pcmc->pci_enabled);
> > >>  
> > >>      if (pcmc->pci_enabled) {
> > >> -        PIIX3State *piix3;
> > >> +        DeviceState *dev;
> > >>          PCIDevice *pci_dev;
> > >> -        const char *type = xen_enabled() ? TYPE_PIIX3_XEN_DEVICE
> > >> -                                         : TYPE_PIIX3_DEVICE;
> > >> +        int i;
> > >>  
> > >>          pci_bus = i440fx_init(pci_type,
> > >>                                i440fx_host,
> > >> @@ -216,21 +243,65 @@ static void pc_init1(MachineState *machine,
> > >>                                x86ms->below_4g_mem_size,
> > >>                                x86ms->above_4g_mem_size,
> > >>                                pci_memory, ram_memory);
> > >> +        pci_bus_map_irqs(pci_bus,
> > >> +                         xen_enabled() ? xen_pci_slot_get_pirq
> > >> +                                       : pci_slot_get_pirq);
> > >>          pcms->bus = pci_bus;
> > >>  
> > >> -        pci_dev = pci_create_simple_multifunction(pci_bus, -1, true, type);
> > >> -        piix3 = PIIX3_PCI_DEVICE(pci_dev);
> > >> -        piix3->pic = x86ms->gsi;
> > >> -        piix3_devfn = piix3->dev.devfn;
> > >> -        isa_bus = ISA_BUS(qdev_get_child_bus(DEVICE(piix3), "isa.0"));
> > >> +        pci_dev = pci_new_multifunction(-1, true, TYPE_PIIX3_DEVICE);
> > >> +        object_property_set_bool(OBJECT(pci_dev), "has-usb",
> > >> +                                 machine_usb(machine), &error_abort);
> > >> +        object_property_set_bool(OBJECT(pci_dev), "has-acpi",
> > >> +                                 x86_machine_is_acpi_enabled(x86ms),
> > >> +                                 &error_abort);
> > >> +        qdev_prop_set_uint32(DEVICE(pci_dev), "smb_io_base", 0xb100);
> > >> +        object_property_set_bool(OBJECT(pci_dev), "smm-enabled",
> > >> +                                 x86_machine_is_smm_enabled(x86ms),
> > >> +                                 &error_abort);
> > >> +        pci_realize_and_unref(pci_dev, pci_bus, &error_fatal);
> > >> +
> > >> +        if (xen_enabled()) {
> > >> +            pci_device_set_intx_routing_notifier(
> > >> +                        pci_dev, piix_intx_routing_notifier_xen);
> > >> +
> > >> +            /*
> > >> +             * Xen supports additional interrupt routes from the PCI devices to
> > >> +             * the IOAPIC: the four pins of each PCI device on the bus are also
> > >> +             * connected to the IOAPIC directly.
> > >> +             * These additional routes can be discovered through ACPI.
> > >> +             */
> > >> +            pci_bus_irqs(pci_bus, xen_intx_set_irq, pci_dev,
> > >> +                         XEN_IOAPIC_NUM_PIRQS);
> > >> +        }
> > >> +
> > >> +        dev = DEVICE(object_resolve_path_component(OBJECT(pci_dev), "pic"));
> > >> +        for (i = 0; i < ISA_NUM_IRQS; i++) {
> > >> +            qdev_connect_gpio_out(dev, i, x86ms->gsi[i]);
> > >> +        }
> > >> +        isa_bus = ISA_BUS(qdev_get_child_bus(DEVICE(pci_dev), "isa.0"));
> > >> +        rtc_state = ISA_DEVICE(object_resolve_path_component(OBJECT(pci_dev),
> > >> +                                                             "rtc"));
> > >> +        piix4_pm = object_resolve_path_component(OBJECT(pci_dev), "pm");
> > >> +        dev = DEVICE(object_resolve_path_component(OBJECT(pci_dev), "ide"));
> > >> +        pci_ide_create_devs(PCI_DEVICE(dev));
> > >> +        idebus[0] = qdev_get_child_bus(dev, "ide.0");
> > >> +        idebus[1] = qdev_get_child_bus(dev, "ide.1");
> > >>      } else {
> > >>          pci_bus = NULL;
> > >> +        piix4_pm = NULL;
> > >>          isa_bus = isa_bus_new(NULL, get_system_memory(), system_io,
> > >>                                &error_abort);
> > >> +        isa_bus_irqs(isa_bus, x86ms->gsi);
> > >> +
> > >> +        rtc_state = isa_new(TYPE_MC146818_RTC);
> > >> +        qdev_prop_set_int32(DEVICE(rtc_state), "base_year", 2000);
> > >> +        isa_realize_and_unref(rtc_state, isa_bus, &error_fatal);
> > >> +
> > >>          i8257_dma_init(isa_bus, 0);
> > >>          pcms->hpet_enabled = false;
> > >> +        idebus[0] = NULL;
> > >> +        idebus[1] = NULL;
> > >>      }
> > >> -    isa_bus_irqs(isa_bus, x86ms->gsi);
> > >>  
> > >>      if (x86ms->pic == ON_OFF_AUTO_ON || x86ms->pic == ON_OFF_AUTO_AUTO) {
> > >>          pc_i8259_create(isa_bus, gsi_state->i8259_irq);
> > >> @@ -252,18 +323,12 @@ static void pc_init1(MachineState *machine,
> > >>      }
> > >>  
> > >>      /* init basic PC hardware */
> > >> -    pc_basic_device_init(pcms, isa_bus, x86ms->gsi, &rtc_state, true,
> > >> +    pc_basic_device_init(pcms, isa_bus, x86ms->gsi, rtc_state, true,
> > >>                           0x4);
> > >>  
> > >>      pc_nic_init(pcmc, isa_bus, pci_bus);
> > >>  
> > >>      if (pcmc->pci_enabled) {
> > >> -        PCIDevice *dev;
> > >> -
> > >> -        dev = pci_create_simple(pci_bus, piix3_devfn + 1, TYPE_PIIX3_IDE);
> > >> -        pci_ide_create_devs(dev);
> > >> -        idebus[0] = qdev_get_child_bus(&dev->qdev, "ide.0");
> > >> -        idebus[1] = qdev_get_child_bus(&dev->qdev, "ide.1");
> > >>          pc_cmos_init(pcms, idebus[0], idebus[1], rtc_state);
> > >>      }
> > >>  #ifdef CONFIG_IDE_ISA
> > >> @@ -289,21 +354,9 @@ static void pc_init1(MachineState *machine,
> > >>      }
> > >>  #endif
> > >>  
> > >> -    if (pcmc->pci_enabled && machine_usb(machine)) {
> > >> -        pci_create_simple(pci_bus, piix3_devfn + 2, "piix3-usb-uhci");
> > >> -    }
> > >> -
> > >> -    if (pcmc->pci_enabled && x86_machine_is_acpi_enabled(X86_MACHINE(pcms))) {
> > >> -        PCIDevice *piix4_pm;
> > >> -
> > >> +    if (piix4_pm) {
> > >>          smi_irq = qemu_allocate_irq(pc_acpi_smi_interrupt, first_cpu, 0);
> > >> -        piix4_pm = pci_new(piix3_devfn + 3, TYPE_PIIX4_PM);
> > >> -        qdev_prop_set_uint32(DEVICE(piix4_pm), "smb_io_base", 0xb100);
> > >> -        qdev_prop_set_bit(DEVICE(piix4_pm), "smm-enabled",
> > >> -                          x86_machine_is_smm_enabled(x86ms));
> > >> -        pci_realize_and_unref(piix4_pm, pci_bus, &error_fatal);
> > >>  
> > >> -        qdev_connect_gpio_out(DEVICE(piix4_pm), 0, x86ms->gsi[9]);
> > >>          qdev_connect_gpio_out_named(DEVICE(piix4_pm), "smi-irq", 0, smi_irq);
> > >>          pcms->smbus = I2C_BUS(qdev_get_child_bus(DEVICE(piix4_pm), "i2c"));
> > >>          /* TODO: Populate SPD eeprom data.  */
> > >> @@ -315,7 +368,7 @@ static void pc_init1(MachineState *machine,
> > >>                                   object_property_allow_set_link,
> > >>                                   OBJ_PROP_LINK_STRONG);
> > >>          object_property_set_link(OBJECT(machine), PC_MACHINE_ACPI_DEVICE_PROP,
> > >> -                                 OBJECT(piix4_pm), &error_abort);
> > >> +                                 piix4_pm, &error_abort);
> > >>      }
> > >>  
> > >>      if (machine->nvdimms_state->is_enabled) {
> > >> @@ -405,6 +458,9 @@ static void pc_xen_hvm_init(MachineState *machine)
> > >>      }
> > >>  
> > >>      pc_xen_hvm_init_pci(machine);
> > >> +    if (xen_igd_gfx_pt_enabled()) {
> > >> +        xen_igd_reserve_slot(pcms->bus);
> > >> +    }
> > >>      pci_create_simple(pcms->bus, -1, "xen-platform");
> > >>  }
> > >>  #endif
> > >> @@ -441,6 +497,11 @@ static void pc_i440fx_8_0_machine_options(MachineClass *m)
> > >>      pc_i440fx_machine_options(m);
> > >>      m->alias = "pc";
> > >>      m->is_default = true;
> > >> +#ifdef CONFIG_MICROVM_DEFAULT
> > >> +    m->is_default = false;
> > >> +#else
> > >> +    m->is_default = true;
> > >> +#endif
> > >>  }
> > >>  
> > >>  DEFINE_I440FX_MACHINE(v8_0, "pc-i440fx-8.0", NULL,
> > > 
> > > 
> > > Lots of changes here not guarded by CONFIG_XEN.
> > > 
> > 
> > What diff is this? How is my patch related to it?
>
>
> This is what you posted, take a look:
> https://lore.kernel.org/all/8349506149de6d81b0762f17623552c248439e93.1673297742.git.brchuckz@aol.com/
>
>

Oops, I think I sent the wrong patch here. I must have used the wrong git
branch. Sorry.

I wouldn't blame you if you ignore future messages from me.

I will get it right next time if there is a next time with v8. Linus
named git a stupid content tracker. And in this case, it was
too stupid to warn me that the patch it sent on my behalf is
not what I expected it to be. Or maybe I am the stupid one, LOL.

I hope no one is stupid enough to consider v7 of my patch any
further! And the fact that this stupidity of mine is preserved for
all future generations is so sweet - one of the benefits of trying
to help out in FLOSS projects, LOL.

Kind regards,

Chuck

Re: [PATCH v7] xen/pt: reserve PCI slot 2 for Intel igd-passthru
Posted by Chuck Zmudzinski 1 year, 3 months ago
On 1/9/2023 7:05 PM, Chuck Zmudzinski wrote:
> On 1/9/23 6:33 PM, Michael S. Tsirkin wrote:
> > On Mon, Jan 09, 2023 at 04:55:42PM -0500, Chuck Zmudzinski wrote:
> >> Intel specifies that the Intel IGD must occupy slot 2 on the PCI bus,
> >> as noted in docs/igd-assign.txt in the Qemu source code.
> >> 
> >> Currently, when the xl toolstack is used to configure a Xen HVM guest with
> >> Intel IGD passthrough to the guest with the Qemu upstream device model,
> >> a Qemu emulated PCI device will occupy slot 2 and the Intel IGD will occupy
> >> a different slot. This problem often prevents the guest from booting.
> >> 
> >> The only available workaround is not good: Configure Xen HVM guests to use
> >> the old and no longer maintained Qemu traditional device model available
> >> from xenbits.xen.org which does reserve slot 2 for the Intel IGD.
> >> 
> >> To implement this feature in the Qemu upstream device model for Xen HVM
> >> guests, introduce the following new functions, types, and macros:
> >> 
> >> * XEN_PT_DEVICE_CLASS declaration, based on the existing TYPE_XEN_PT_DEVICE
> >> * XEN_PT_DEVICE_GET_CLASS macro helper function for XEN_PT_DEVICE_CLASS
> >> * typedef XenPTQdevRealize function pointer
> >> * XEN_PCI_IGD_SLOT_MASK, the value of slot_reserved_mask to reserve slot 2
> >> * xen_igd_reserve_slot and xen_igd_clear_slot functions
> >> 
> >> The new xen_igd_reserve_slot function uses the existing slot_reserved_mask
> >> member of PCIBus to reserve PCI slot 2 for Xen HVM guests configured using
> >> the xl toolstack with the gfx_passthru option enabled, which sets the
> >> igd-passthru=on option to Qemu for the Xen HVM machine type.
> >> 
> >> The new xen_igd_reserve_slot function also needs to be implemented in
> >> hw/xen/xen_pt_stub.c to prevent FTBFS during the link stage for the case
> >> when Qemu is configured with --enable-xen and --disable-xen-pci-passthrough,
> >> in which case it does nothing.
> >> 
> >> The new xen_igd_clear_slot function overrides qdev->realize of the parent
> >> PCI device class to enable the Intel IGD to occupy slot 2 on the PCI bus
> >> since slot 2 was reserved by xen_igd_reserve_slot when the PCI bus was
> >> created in hw/i386/pc_piix.c for the case when igd-passthru=on.
> >> 
> >> Move the call to xen_host_pci_device_get, and the associated error
> >> handling, from xen_pt_realize to the new xen_igd_clear_slot function to
> >> initialize the device class and vendor values which enables the checks for
> >> the Intel IGD to succeed. The verification that the host device is an
> >> Intel IGD to be passed through is done by checking the domain, bus, slot,
> >> and function values as well as by checking that gfx_passthru is enabled,
> >> the device class is VGA, and the device vendor in Intel.
> >> 
> >> Signed-off-by: Chuck Zmudzinski <brchuckz@aol.com>
> >> ---
> >> Notes that might be helpful to reviewers of patched code in hw/xen:
> >> 
> >> The new functions and types are based on recommendations from Qemu docs:
> >> https://qemu.readthedocs.io/en/latest/devel/qom.html
> >> 
> >> Notes that might be helpful to reviewers of patched code in hw/i386:
> >> 
> >> The small patch to hw/i386/pc_piix.c is protected by CONFIG_XEN so it does
> >> not affect builds that do not have CONFIG_XEN defined.
> > 
> > I'm not sure how you can claim that.
>
> I mean the small patch to pc_piix.c in this patch sits
> between an "#ifdef CONFIG_XEN" and the corresponding
> "#endif" so the preprocessor will exclude it when CONFIG_XEN
> is not defined. In other words, my patch is part of the
> xen-specific code in pc_piix.c. Or am I missing something?
>
>
> > 
> > ...
> > 
> >> diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
> >> index b48047f50c..34a9736b5e 100644
> >> --- a/hw/i386/pc_piix.c
> >> +++ b/hw/i386/pc_piix.c
> >> @@ -32,6 +32,7 @@
> >>  #include "hw/i386/pc.h"
> >>  #include "hw/i386/apic.h"
> >>  #include "hw/pci-host/i440fx.h"
> >> +#include "hw/rtc/mc146818rtc.h"
> >>  #include "hw/southbridge/piix.h"
> >>  #include "hw/display/ramfb.h"
> >>  #include "hw/firmware/smbios.h"
> >> @@ -40,16 +41,16 @@
> >>  #include "hw/usb.h"
> >>  #include "net/net.h"
> >>  #include "hw/ide/pci.h"
> >> -#include "hw/ide/piix.h"
> >>  #include "hw/irq.h"
> >>  #include "sysemu/kvm.h"
> >>  #include "hw/kvm/clock.h"
> >>  #include "hw/sysbus.h"
> >> +#include "hw/i2c/i2c.h"
> >>  #include "hw/i2c/smbus_eeprom.h"
> >>  #include "hw/xen/xen-x86.h"
> >> +#include "hw/xen/xen.h"
> >>  #include "exec/memory.h"
> >>  #include "hw/acpi/acpi.h"
> >> -#include "hw/acpi/piix4.h"
> >>  #include "qapi/error.h"
> >>  #include "qemu/error-report.h"
> >>  #include "sysemu/xen.h"
> >> @@ -66,6 +67,7 @@
> >>  #include "kvm/kvm-cpu.h"
> >>  
> >>  #define MAX_IDE_BUS 2
> >> +#define XEN_IOAPIC_NUM_PIRQS 128ULL
> >>  
> >>  #ifdef CONFIG_IDE_ISA
> >>  static const int ide_iobase[MAX_IDE_BUS] = { 0x1f0, 0x170 };
> >> @@ -73,6 +75,32 @@ static const int ide_iobase2[MAX_IDE_BUS] = { 0x3f6, 0x376 };
> >>  static const int ide_irq[MAX_IDE_BUS] = { 14, 15 };
> >>  #endif
> >>  
> >> +/*
> >> + * Return the global irq number corresponding to a given device irq
> >> + * pin. We could also use the bus number to have a more precise mapping.
> >> + */
> >> +static int pci_slot_get_pirq(PCIDevice *pci_dev, int pci_intx)
> >> +{
> >> +    int slot_addend;
> >> +    slot_addend = PCI_SLOT(pci_dev->devfn) - 1;
> >> +    return (pci_intx + slot_addend) & 3;
> >> +}
> >> +
> >> +static void piix_intx_routing_notifier_xen(PCIDevice *dev)
> >> +{
> >> +    int i;
> >> +
> >> +    /* Scan for updates to PCI link routes (0x60-0x63). */
> >> +    for (i = 0; i < PIIX_NUM_PIRQS; i++) {
> >> +        uint8_t v = dev->config_read(dev, PIIX_PIRQCA + i, 1);
> >> +        if (v & 0x80) {
> >> +            v = 0;
> >> +        }
> >> +        v &= 0xf;
> >> +        xen_set_pci_link_route(i, v);
> >> +    }
> >> +}
> >> +
> >>  /* PC hardware initialisation */
> >>  static void pc_init1(MachineState *machine,
> >>                       const char *host_type, const char *pci_type)
> >> @@ -84,7 +112,7 @@ static void pc_init1(MachineState *machine,
> >>      MemoryRegion *system_io = get_system_io();
> >>      PCIBus *pci_bus;
> >>      ISABus *isa_bus;
> >> -    int piix3_devfn = -1;
> >> +    Object *piix4_pm;
> >>      qemu_irq smi_irq;
> >>      GSIState *gsi_state;
> >>      BusState *idebus[MAX_IDE_BUS];
> >> @@ -205,10 +233,9 @@ static void pc_init1(MachineState *machine,
> >>      gsi_state = pc_gsi_create(&x86ms->gsi, pcmc->pci_enabled);
> >>  
> >>      if (pcmc->pci_enabled) {
> >> -        PIIX3State *piix3;
> >> +        DeviceState *dev;
> >>          PCIDevice *pci_dev;
> >> -        const char *type = xen_enabled() ? TYPE_PIIX3_XEN_DEVICE
> >> -                                         : TYPE_PIIX3_DEVICE;
> >> +        int i;
> >>  
> >>          pci_bus = i440fx_init(pci_type,
> >>                                i440fx_host,
> >> @@ -216,21 +243,65 @@ static void pc_init1(MachineState *machine,
> >>                                x86ms->below_4g_mem_size,
> >>                                x86ms->above_4g_mem_size,
> >>                                pci_memory, ram_memory);
> >> +        pci_bus_map_irqs(pci_bus,
> >> +                         xen_enabled() ? xen_pci_slot_get_pirq
> >> +                                       : pci_slot_get_pirq);
> >>          pcms->bus = pci_bus;
> >>  
> >> -        pci_dev = pci_create_simple_multifunction(pci_bus, -1, true, type);
> >> -        piix3 = PIIX3_PCI_DEVICE(pci_dev);
> >> -        piix3->pic = x86ms->gsi;
> >> -        piix3_devfn = piix3->dev.devfn;
> >> -        isa_bus = ISA_BUS(qdev_get_child_bus(DEVICE(piix3), "isa.0"));
> >> +        pci_dev = pci_new_multifunction(-1, true, TYPE_PIIX3_DEVICE);
> >> +        object_property_set_bool(OBJECT(pci_dev), "has-usb",
> >> +                                 machine_usb(machine), &error_abort);
> >> +        object_property_set_bool(OBJECT(pci_dev), "has-acpi",
> >> +                                 x86_machine_is_acpi_enabled(x86ms),
> >> +                                 &error_abort);
> >> +        qdev_prop_set_uint32(DEVICE(pci_dev), "smb_io_base", 0xb100);
> >> +        object_property_set_bool(OBJECT(pci_dev), "smm-enabled",
> >> +                                 x86_machine_is_smm_enabled(x86ms),
> >> +                                 &error_abort);
> >> +        pci_realize_and_unref(pci_dev, pci_bus, &error_fatal);
> >> +
> >> +        if (xen_enabled()) {
> >> +            pci_device_set_intx_routing_notifier(
> >> +                        pci_dev, piix_intx_routing_notifier_xen);
> >> +
> >> +            /*
> >> +             * Xen supports additional interrupt routes from the PCI devices to
> >> +             * the IOAPIC: the four pins of each PCI device on the bus are also
> >> +             * connected to the IOAPIC directly.
> >> +             * These additional routes can be discovered through ACPI.
> >> +             */
> >> +            pci_bus_irqs(pci_bus, xen_intx_set_irq, pci_dev,
> >> +                         XEN_IOAPIC_NUM_PIRQS);
> >> +        }
> >> +
> >> +        dev = DEVICE(object_resolve_path_component(OBJECT(pci_dev), "pic"));
> >> +        for (i = 0; i < ISA_NUM_IRQS; i++) {
> >> +            qdev_connect_gpio_out(dev, i, x86ms->gsi[i]);
> >> +        }
> >> +        isa_bus = ISA_BUS(qdev_get_child_bus(DEVICE(pci_dev), "isa.0"));
> >> +        rtc_state = ISA_DEVICE(object_resolve_path_component(OBJECT(pci_dev),
> >> +                                                             "rtc"));
> >> +        piix4_pm = object_resolve_path_component(OBJECT(pci_dev), "pm");
> >> +        dev = DEVICE(object_resolve_path_component(OBJECT(pci_dev), "ide"));
> >> +        pci_ide_create_devs(PCI_DEVICE(dev));
> >> +        idebus[0] = qdev_get_child_bus(dev, "ide.0");
> >> +        idebus[1] = qdev_get_child_bus(dev, "ide.1");
> >>      } else {
> >>          pci_bus = NULL;
> >> +        piix4_pm = NULL;
> >>          isa_bus = isa_bus_new(NULL, get_system_memory(), system_io,
> >>                                &error_abort);
> >> +        isa_bus_irqs(isa_bus, x86ms->gsi);
> >> +
> >> +        rtc_state = isa_new(TYPE_MC146818_RTC);
> >> +        qdev_prop_set_int32(DEVICE(rtc_state), "base_year", 2000);
> >> +        isa_realize_and_unref(rtc_state, isa_bus, &error_fatal);
> >> +
> >>          i8257_dma_init(isa_bus, 0);
> >>          pcms->hpet_enabled = false;
> >> +        idebus[0] = NULL;
> >> +        idebus[1] = NULL;
> >>      }
> >> -    isa_bus_irqs(isa_bus, x86ms->gsi);
> >>  
> >>      if (x86ms->pic == ON_OFF_AUTO_ON || x86ms->pic == ON_OFF_AUTO_AUTO) {
> >>          pc_i8259_create(isa_bus, gsi_state->i8259_irq);
> >> @@ -252,18 +323,12 @@ static void pc_init1(MachineState *machine,
> >>      }
> >>  
> >>      /* init basic PC hardware */
> >> -    pc_basic_device_init(pcms, isa_bus, x86ms->gsi, &rtc_state, true,
> >> +    pc_basic_device_init(pcms, isa_bus, x86ms->gsi, rtc_state, true,
> >>                           0x4);
> >>  
> >>      pc_nic_init(pcmc, isa_bus, pci_bus);
> >>  
> >>      if (pcmc->pci_enabled) {
> >> -        PCIDevice *dev;
> >> -
> >> -        dev = pci_create_simple(pci_bus, piix3_devfn + 1, TYPE_PIIX3_IDE);
> >> -        pci_ide_create_devs(dev);
> >> -        idebus[0] = qdev_get_child_bus(&dev->qdev, "ide.0");
> >> -        idebus[1] = qdev_get_child_bus(&dev->qdev, "ide.1");
> >>          pc_cmos_init(pcms, idebus[0], idebus[1], rtc_state);
> >>      }
> >>  #ifdef CONFIG_IDE_ISA
> >> @@ -289,21 +354,9 @@ static void pc_init1(MachineState *machine,
> >>      }
> >>  #endif
> >>  
> >> -    if (pcmc->pci_enabled && machine_usb(machine)) {
> >> -        pci_create_simple(pci_bus, piix3_devfn + 2, "piix3-usb-uhci");
> >> -    }
> >> -
> >> -    if (pcmc->pci_enabled && x86_machine_is_acpi_enabled(X86_MACHINE(pcms))) {
> >> -        PCIDevice *piix4_pm;
> >> -
> >> +    if (piix4_pm) {
> >>          smi_irq = qemu_allocate_irq(pc_acpi_smi_interrupt, first_cpu, 0);
> >> -        piix4_pm = pci_new(piix3_devfn + 3, TYPE_PIIX4_PM);
> >> -        qdev_prop_set_uint32(DEVICE(piix4_pm), "smb_io_base", 0xb100);
> >> -        qdev_prop_set_bit(DEVICE(piix4_pm), "smm-enabled",
> >> -                          x86_machine_is_smm_enabled(x86ms));
> >> -        pci_realize_and_unref(piix4_pm, pci_bus, &error_fatal);
> >>  
> >> -        qdev_connect_gpio_out(DEVICE(piix4_pm), 0, x86ms->gsi[9]);
> >>          qdev_connect_gpio_out_named(DEVICE(piix4_pm), "smi-irq", 0, smi_irq);
> >>          pcms->smbus = I2C_BUS(qdev_get_child_bus(DEVICE(piix4_pm), "i2c"));
> >>          /* TODO: Populate SPD eeprom data.  */
> >> @@ -315,7 +368,7 @@ static void pc_init1(MachineState *machine,
> >>                                   object_property_allow_set_link,
> >>                                   OBJ_PROP_LINK_STRONG);
> >>          object_property_set_link(OBJECT(machine), PC_MACHINE_ACPI_DEVICE_PROP,
> >> -                                 OBJECT(piix4_pm), &error_abort);
> >> +                                 piix4_pm, &error_abort);
> >>      }
> >>  
> >>      if (machine->nvdimms_state->is_enabled) {
> >> @@ -405,6 +458,9 @@ static void pc_xen_hvm_init(MachineState *machine)
> >>      }
> >>  
> >>      pc_xen_hvm_init_pci(machine);
> >> +    if (xen_igd_gfx_pt_enabled()) {
> >> +        xen_igd_reserve_slot(pcms->bus);
> >> +    }

These are the three lines from my patch that are added to
pc_piix.c. IIUC, this diff is showing the differences in pc_piix.c
without and with CONFIG_XEN defined. If so, then the fact that
the lines in my patch are added when CONFIG_XEN is defined
means they are not included as part of the build when CONFIG_XEN
is not defined. That is how I can claim that my small patch to
pc_piix.c does not affect builds without CONFIG_XEN defined.

I also think the rest of the files touched by my patch are only included
by the meson build system with the --enable-xen an option to configure.
So the entire patch is xen-specific and can easily be eliminated from
any build with the --disable-xen configure option.

Kind regards,

Chuck

> >>      pci_create_simple(pcms->bus, -1, "xen-platform");
> >>  }
> >>  #endif
> >> @@ -441,6 +497,11 @@ static void pc_i440fx_8_0_machine_options(MachineClass *m)
> >>      pc_i440fx_machine_options(m);
> >>      m->alias = "pc";
> >>      m->is_default = true;
> >> +#ifdef CONFIG_MICROVM_DEFAULT
> >> +    m->is_default = false;
> >> +#else
> >> +    m->is_default = true;
> >> +#endif
> >>  }
> >>  
> >>  DEFINE_I440FX_MACHINE(v8_0, "pc-i440fx-8.0", NULL,
> > 
> > 
> > Lots of changes here not guarded by CONFIG_XEN.
> > 
>
> What diff is this? How is my patch related to it?
>


Re: [PATCH v7] xen/pt: reserve PCI slot 2 for Intel igd-passthru
Posted by Michael S. Tsirkin 1 year, 3 months ago
On Mon, Jan 09, 2023 at 09:11:22PM -0500, Chuck Zmudzinski wrote:
> On 1/9/2023 7:05 PM, Chuck Zmudzinski wrote:
> > On 1/9/23 6:33 PM, Michael S. Tsirkin wrote:
> > > On Mon, Jan 09, 2023 at 04:55:42PM -0500, Chuck Zmudzinski wrote:
> > >> Intel specifies that the Intel IGD must occupy slot 2 on the PCI bus,
> > >> as noted in docs/igd-assign.txt in the Qemu source code.
> > >> 
> > >> Currently, when the xl toolstack is used to configure a Xen HVM guest with
> > >> Intel IGD passthrough to the guest with the Qemu upstream device model,
> > >> a Qemu emulated PCI device will occupy slot 2 and the Intel IGD will occupy
> > >> a different slot. This problem often prevents the guest from booting.
> > >> 
> > >> The only available workaround is not good: Configure Xen HVM guests to use
> > >> the old and no longer maintained Qemu traditional device model available
> > >> from xenbits.xen.org which does reserve slot 2 for the Intel IGD.
> > >> 
> > >> To implement this feature in the Qemu upstream device model for Xen HVM
> > >> guests, introduce the following new functions, types, and macros:
> > >> 
> > >> * XEN_PT_DEVICE_CLASS declaration, based on the existing TYPE_XEN_PT_DEVICE
> > >> * XEN_PT_DEVICE_GET_CLASS macro helper function for XEN_PT_DEVICE_CLASS
> > >> * typedef XenPTQdevRealize function pointer
> > >> * XEN_PCI_IGD_SLOT_MASK, the value of slot_reserved_mask to reserve slot 2
> > >> * xen_igd_reserve_slot and xen_igd_clear_slot functions
> > >> 
> > >> The new xen_igd_reserve_slot function uses the existing slot_reserved_mask
> > >> member of PCIBus to reserve PCI slot 2 for Xen HVM guests configured using
> > >> the xl toolstack with the gfx_passthru option enabled, which sets the
> > >> igd-passthru=on option to Qemu for the Xen HVM machine type.
> > >> 
> > >> The new xen_igd_reserve_slot function also needs to be implemented in
> > >> hw/xen/xen_pt_stub.c to prevent FTBFS during the link stage for the case
> > >> when Qemu is configured with --enable-xen and --disable-xen-pci-passthrough,
> > >> in which case it does nothing.
> > >> 
> > >> The new xen_igd_clear_slot function overrides qdev->realize of the parent
> > >> PCI device class to enable the Intel IGD to occupy slot 2 on the PCI bus
> > >> since slot 2 was reserved by xen_igd_reserve_slot when the PCI bus was
> > >> created in hw/i386/pc_piix.c for the case when igd-passthru=on.
> > >> 
> > >> Move the call to xen_host_pci_device_get, and the associated error
> > >> handling, from xen_pt_realize to the new xen_igd_clear_slot function to
> > >> initialize the device class and vendor values which enables the checks for
> > >> the Intel IGD to succeed. The verification that the host device is an
> > >> Intel IGD to be passed through is done by checking the domain, bus, slot,
> > >> and function values as well as by checking that gfx_passthru is enabled,
> > >> the device class is VGA, and the device vendor in Intel.
> > >> 
> > >> Signed-off-by: Chuck Zmudzinski <brchuckz@aol.com>
> > >> ---
> > >> Notes that might be helpful to reviewers of patched code in hw/xen:
> > >> 
> > >> The new functions and types are based on recommendations from Qemu docs:
> > >> https://qemu.readthedocs.io/en/latest/devel/qom.html
> > >> 
> > >> Notes that might be helpful to reviewers of patched code in hw/i386:
> > >> 
> > >> The small patch to hw/i386/pc_piix.c is protected by CONFIG_XEN so it does
> > >> not affect builds that do not have CONFIG_XEN defined.
> > > 
> > > I'm not sure how you can claim that.
> >
> > I mean the small patch to pc_piix.c in this patch sits
> > between an "#ifdef CONFIG_XEN" and the corresponding
> > "#endif" so the preprocessor will exclude it when CONFIG_XEN
> > is not defined. In other words, my patch is part of the
> > xen-specific code in pc_piix.c. Or am I missing something?
> >
> >
> > > 
> > > ...
> > > 
> > >> diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
> > >> index b48047f50c..34a9736b5e 100644
> > >> --- a/hw/i386/pc_piix.c
> > >> +++ b/hw/i386/pc_piix.c
> > >> @@ -32,6 +32,7 @@
> > >>  #include "hw/i386/pc.h"
> > >>  #include "hw/i386/apic.h"
> > >>  #include "hw/pci-host/i440fx.h"
> > >> +#include "hw/rtc/mc146818rtc.h"
> > >>  #include "hw/southbridge/piix.h"
> > >>  #include "hw/display/ramfb.h"
> > >>  #include "hw/firmware/smbios.h"
> > >> @@ -40,16 +41,16 @@
> > >>  #include "hw/usb.h"
> > >>  #include "net/net.h"
> > >>  #include "hw/ide/pci.h"
> > >> -#include "hw/ide/piix.h"
> > >>  #include "hw/irq.h"
> > >>  #include "sysemu/kvm.h"
> > >>  #include "hw/kvm/clock.h"
> > >>  #include "hw/sysbus.h"
> > >> +#include "hw/i2c/i2c.h"
> > >>  #include "hw/i2c/smbus_eeprom.h"
> > >>  #include "hw/xen/xen-x86.h"
> > >> +#include "hw/xen/xen.h"
> > >>  #include "exec/memory.h"
> > >>  #include "hw/acpi/acpi.h"
> > >> -#include "hw/acpi/piix4.h"
> > >>  #include "qapi/error.h"
> > >>  #include "qemu/error-report.h"
> > >>  #include "sysemu/xen.h"
> > >> @@ -66,6 +67,7 @@
> > >>  #include "kvm/kvm-cpu.h"
> > >>  
> > >>  #define MAX_IDE_BUS 2
> > >> +#define XEN_IOAPIC_NUM_PIRQS 128ULL
> > >>  
> > >>  #ifdef CONFIG_IDE_ISA
> > >>  static const int ide_iobase[MAX_IDE_BUS] = { 0x1f0, 0x170 };
> > >> @@ -73,6 +75,32 @@ static const int ide_iobase2[MAX_IDE_BUS] = { 0x3f6, 0x376 };
> > >>  static const int ide_irq[MAX_IDE_BUS] = { 14, 15 };
> > >>  #endif
> > >>  
> > >> +/*
> > >> + * Return the global irq number corresponding to a given device irq
> > >> + * pin. We could also use the bus number to have a more precise mapping.
> > >> + */
> > >> +static int pci_slot_get_pirq(PCIDevice *pci_dev, int pci_intx)
> > >> +{
> > >> +    int slot_addend;
> > >> +    slot_addend = PCI_SLOT(pci_dev->devfn) - 1;
> > >> +    return (pci_intx + slot_addend) & 3;
> > >> +}
> > >> +
> > >> +static void piix_intx_routing_notifier_xen(PCIDevice *dev)
> > >> +{
> > >> +    int i;
> > >> +
> > >> +    /* Scan for updates to PCI link routes (0x60-0x63). */
> > >> +    for (i = 0; i < PIIX_NUM_PIRQS; i++) {
> > >> +        uint8_t v = dev->config_read(dev, PIIX_PIRQCA + i, 1);
> > >> +        if (v & 0x80) {
> > >> +            v = 0;
> > >> +        }
> > >> +        v &= 0xf;
> > >> +        xen_set_pci_link_route(i, v);
> > >> +    }
> > >> +}
> > >> +
> > >>  /* PC hardware initialisation */
> > >>  static void pc_init1(MachineState *machine,
> > >>                       const char *host_type, const char *pci_type)
> > >> @@ -84,7 +112,7 @@ static void pc_init1(MachineState *machine,
> > >>      MemoryRegion *system_io = get_system_io();
> > >>      PCIBus *pci_bus;
> > >>      ISABus *isa_bus;
> > >> -    int piix3_devfn = -1;
> > >> +    Object *piix4_pm;
> > >>      qemu_irq smi_irq;
> > >>      GSIState *gsi_state;
> > >>      BusState *idebus[MAX_IDE_BUS];
> > >> @@ -205,10 +233,9 @@ static void pc_init1(MachineState *machine,
> > >>      gsi_state = pc_gsi_create(&x86ms->gsi, pcmc->pci_enabled);
> > >>  
> > >>      if (pcmc->pci_enabled) {
> > >> -        PIIX3State *piix3;
> > >> +        DeviceState *dev;
> > >>          PCIDevice *pci_dev;
> > >> -        const char *type = xen_enabled() ? TYPE_PIIX3_XEN_DEVICE
> > >> -                                         : TYPE_PIIX3_DEVICE;
> > >> +        int i;
> > >>  
> > >>          pci_bus = i440fx_init(pci_type,
> > >>                                i440fx_host,
> > >> @@ -216,21 +243,65 @@ static void pc_init1(MachineState *machine,
> > >>                                x86ms->below_4g_mem_size,
> > >>                                x86ms->above_4g_mem_size,
> > >>                                pci_memory, ram_memory);
> > >> +        pci_bus_map_irqs(pci_bus,
> > >> +                         xen_enabled() ? xen_pci_slot_get_pirq
> > >> +                                       : pci_slot_get_pirq);
> > >>          pcms->bus = pci_bus;
> > >>  
> > >> -        pci_dev = pci_create_simple_multifunction(pci_bus, -1, true, type);
> > >> -        piix3 = PIIX3_PCI_DEVICE(pci_dev);
> > >> -        piix3->pic = x86ms->gsi;
> > >> -        piix3_devfn = piix3->dev.devfn;
> > >> -        isa_bus = ISA_BUS(qdev_get_child_bus(DEVICE(piix3), "isa.0"));
> > >> +        pci_dev = pci_new_multifunction(-1, true, TYPE_PIIX3_DEVICE);
> > >> +        object_property_set_bool(OBJECT(pci_dev), "has-usb",
> > >> +                                 machine_usb(machine), &error_abort);
> > >> +        object_property_set_bool(OBJECT(pci_dev), "has-acpi",
> > >> +                                 x86_machine_is_acpi_enabled(x86ms),
> > >> +                                 &error_abort);
> > >> +        qdev_prop_set_uint32(DEVICE(pci_dev), "smb_io_base", 0xb100);
> > >> +        object_property_set_bool(OBJECT(pci_dev), "smm-enabled",
> > >> +                                 x86_machine_is_smm_enabled(x86ms),
> > >> +                                 &error_abort);
> > >> +        pci_realize_and_unref(pci_dev, pci_bus, &error_fatal);
> > >> +
> > >> +        if (xen_enabled()) {
> > >> +            pci_device_set_intx_routing_notifier(
> > >> +                        pci_dev, piix_intx_routing_notifier_xen);
> > >> +
> > >> +            /*
> > >> +             * Xen supports additional interrupt routes from the PCI devices to
> > >> +             * the IOAPIC: the four pins of each PCI device on the bus are also
> > >> +             * connected to the IOAPIC directly.
> > >> +             * These additional routes can be discovered through ACPI.
> > >> +             */
> > >> +            pci_bus_irqs(pci_bus, xen_intx_set_irq, pci_dev,
> > >> +                         XEN_IOAPIC_NUM_PIRQS);
> > >> +        }
> > >> +
> > >> +        dev = DEVICE(object_resolve_path_component(OBJECT(pci_dev), "pic"));
> > >> +        for (i = 0; i < ISA_NUM_IRQS; i++) {
> > >> +            qdev_connect_gpio_out(dev, i, x86ms->gsi[i]);
> > >> +        }
> > >> +        isa_bus = ISA_BUS(qdev_get_child_bus(DEVICE(pci_dev), "isa.0"));
> > >> +        rtc_state = ISA_DEVICE(object_resolve_path_component(OBJECT(pci_dev),
> > >> +                                                             "rtc"));
> > >> +        piix4_pm = object_resolve_path_component(OBJECT(pci_dev), "pm");
> > >> +        dev = DEVICE(object_resolve_path_component(OBJECT(pci_dev), "ide"));
> > >> +        pci_ide_create_devs(PCI_DEVICE(dev));
> > >> +        idebus[0] = qdev_get_child_bus(dev, "ide.0");
> > >> +        idebus[1] = qdev_get_child_bus(dev, "ide.1");
> > >>      } else {
> > >>          pci_bus = NULL;
> > >> +        piix4_pm = NULL;
> > >>          isa_bus = isa_bus_new(NULL, get_system_memory(), system_io,
> > >>                                &error_abort);
> > >> +        isa_bus_irqs(isa_bus, x86ms->gsi);
> > >> +
> > >> +        rtc_state = isa_new(TYPE_MC146818_RTC);
> > >> +        qdev_prop_set_int32(DEVICE(rtc_state), "base_year", 2000);
> > >> +        isa_realize_and_unref(rtc_state, isa_bus, &error_fatal);
> > >> +
> > >>          i8257_dma_init(isa_bus, 0);
> > >>          pcms->hpet_enabled = false;
> > >> +        idebus[0] = NULL;
> > >> +        idebus[1] = NULL;
> > >>      }
> > >> -    isa_bus_irqs(isa_bus, x86ms->gsi);
> > >>  
> > >>      if (x86ms->pic == ON_OFF_AUTO_ON || x86ms->pic == ON_OFF_AUTO_AUTO) {
> > >>          pc_i8259_create(isa_bus, gsi_state->i8259_irq);
> > >> @@ -252,18 +323,12 @@ static void pc_init1(MachineState *machine,
> > >>      }
> > >>  
> > >>      /* init basic PC hardware */
> > >> -    pc_basic_device_init(pcms, isa_bus, x86ms->gsi, &rtc_state, true,
> > >> +    pc_basic_device_init(pcms, isa_bus, x86ms->gsi, rtc_state, true,
> > >>                           0x4);
> > >>  
> > >>      pc_nic_init(pcmc, isa_bus, pci_bus);
> > >>  
> > >>      if (pcmc->pci_enabled) {
> > >> -        PCIDevice *dev;
> > >> -
> > >> -        dev = pci_create_simple(pci_bus, piix3_devfn + 1, TYPE_PIIX3_IDE);
> > >> -        pci_ide_create_devs(dev);
> > >> -        idebus[0] = qdev_get_child_bus(&dev->qdev, "ide.0");
> > >> -        idebus[1] = qdev_get_child_bus(&dev->qdev, "ide.1");
> > >>          pc_cmos_init(pcms, idebus[0], idebus[1], rtc_state);
> > >>      }
> > >>  #ifdef CONFIG_IDE_ISA
> > >> @@ -289,21 +354,9 @@ static void pc_init1(MachineState *machine,
> > >>      }
> > >>  #endif
> > >>  
> > >> -    if (pcmc->pci_enabled && machine_usb(machine)) {
> > >> -        pci_create_simple(pci_bus, piix3_devfn + 2, "piix3-usb-uhci");
> > >> -    }
> > >> -
> > >> -    if (pcmc->pci_enabled && x86_machine_is_acpi_enabled(X86_MACHINE(pcms))) {
> > >> -        PCIDevice *piix4_pm;
> > >> -
> > >> +    if (piix4_pm) {
> > >>          smi_irq = qemu_allocate_irq(pc_acpi_smi_interrupt, first_cpu, 0);
> > >> -        piix4_pm = pci_new(piix3_devfn + 3, TYPE_PIIX4_PM);
> > >> -        qdev_prop_set_uint32(DEVICE(piix4_pm), "smb_io_base", 0xb100);
> > >> -        qdev_prop_set_bit(DEVICE(piix4_pm), "smm-enabled",
> > >> -                          x86_machine_is_smm_enabled(x86ms));
> > >> -        pci_realize_and_unref(piix4_pm, pci_bus, &error_fatal);
> > >>  
> > >> -        qdev_connect_gpio_out(DEVICE(piix4_pm), 0, x86ms->gsi[9]);
> > >>          qdev_connect_gpio_out_named(DEVICE(piix4_pm), "smi-irq", 0, smi_irq);
> > >>          pcms->smbus = I2C_BUS(qdev_get_child_bus(DEVICE(piix4_pm), "i2c"));
> > >>          /* TODO: Populate SPD eeprom data.  */
> > >> @@ -315,7 +368,7 @@ static void pc_init1(MachineState *machine,
> > >>                                   object_property_allow_set_link,
> > >>                                   OBJ_PROP_LINK_STRONG);
> > >>          object_property_set_link(OBJECT(machine), PC_MACHINE_ACPI_DEVICE_PROP,
> > >> -                                 OBJECT(piix4_pm), &error_abort);
> > >> +                                 piix4_pm, &error_abort);
> > >>      }
> > >>  
> > >>      if (machine->nvdimms_state->is_enabled) {
> > >> @@ -405,6 +458,9 @@ static void pc_xen_hvm_init(MachineState *machine)
> > >>      }
> > >>  
> > >>      pc_xen_hvm_init_pci(machine);
> > >> +    if (xen_igd_gfx_pt_enabled()) {
> > >> +        xen_igd_reserve_slot(pcms->bus);
> > >> +    }
> 
> These are the three lines from my patch that are added to
> pc_piix.c. IIUC, this diff is showing the differences in pc_piix.c
> without and with CONFIG_XEN defined.

changing configure flags does not modify pc_piix.c IIUC.  configure
changes compiler flags and some generated files (mostly headers).
AFAIK pc_piix.c is not one of them.

> If so, then the fact that
> the lines in my patch are added when CONFIG_XEN is defined
> means they are not included as part of the build when CONFIG_XEN
> is not defined. That is how I can claim that my small patch to
> pc_piix.c does not affect builds without CONFIG_XEN defined.
>

Doesn't make sense to me, sorry.

> I also think the rest of the files touched by my patch are only included
> by the meson build system with the --enable-xen an option to configure.
> So the entire patch is xen-specific and can easily be eliminated from
> any build with the --disable-xen configure option.
> 
> Kind regards,
> 
> Chuck
> 
> > >>      pci_create_simple(pcms->bus, -1, "xen-platform");
> > >>  }
> > >>  #endif
> > >> @@ -441,6 +497,11 @@ static void pc_i440fx_8_0_machine_options(MachineClass *m)
> > >>      pc_i440fx_machine_options(m);
> > >>      m->alias = "pc";
> > >>      m->is_default = true;
> > >> +#ifdef CONFIG_MICROVM_DEFAULT
> > >> +    m->is_default = false;
> > >> +#else
> > >> +    m->is_default = true;
> > >> +#endif
> > >>  }
> > >>  
> > >>  DEFINE_I440FX_MACHINE(v8_0, "pc-i440fx-8.0", NULL,
> > > 
> > > 
> > > Lots of changes here not guarded by CONFIG_XEN.
> > > 
> >
> > What diff is this? How is my patch related to it?
> >


Re: [PATCH v7] xen/pt: reserve PCI slot 2 for Intel igd-passthru
Posted by Michael S. Tsirkin 1 year, 3 months ago
On Mon, Jan 09, 2023 at 04:55:42PM -0500, Chuck Zmudzinski wrote:
> Intel specifies that the Intel IGD must occupy slot 2 on the PCI bus,
> as noted in docs/igd-assign.txt in the Qemu source code.
> 
> Currently, when the xl toolstack is used to configure a Xen HVM guest with
> Intel IGD passthrough to the guest with the Qemu upstream device model,
> a Qemu emulated PCI device will occupy slot 2 and the Intel IGD will occupy
> a different slot. This problem often prevents the guest from booting.
> 
> The only available workaround is not good: Configure Xen HVM guests to use
> the old and no longer maintained Qemu traditional device model available
> from xenbits.xen.org which does reserve slot 2 for the Intel IGD.
> 
> To implement this feature in the Qemu upstream device model for Xen HVM
> guests, introduce the following new functions, types, and macros:
> 
> * XEN_PT_DEVICE_CLASS declaration, based on the existing TYPE_XEN_PT_DEVICE
> * XEN_PT_DEVICE_GET_CLASS macro helper function for XEN_PT_DEVICE_CLASS
> * typedef XenPTQdevRealize function pointer
> * XEN_PCI_IGD_SLOT_MASK, the value of slot_reserved_mask to reserve slot 2
> * xen_igd_reserve_slot and xen_igd_clear_slot functions
> 
> The new xen_igd_reserve_slot function uses the existing slot_reserved_mask
> member of PCIBus to reserve PCI slot 2 for Xen HVM guests configured using
> the xl toolstack with the gfx_passthru option enabled, which sets the
> igd-passthru=on option to Qemu for the Xen HVM machine type.

I don't like how slot_reserved_mask is set initially then cleared on
device realize.
To me this looks like a fragile hack. I suggest one of the following
1. adding a new mask
"slot-manual-mask" or some such blocking auto-allocation of a given
slot without blocking its use if address is specified on command line.
2. adding a special property that overrides slot_reserved_mask
for a given device.

both need changes in pci core but look like something generally
useful.


> 
> The new xen_igd_reserve_slot function also needs to be implemented in
> hw/xen/xen_pt_stub.c to prevent FTBFS during the link stage for the case
> when Qemu is configured with --enable-xen and --disable-xen-pci-passthrough,
> in which case it does nothing.
> 
> The new xen_igd_clear_slot function overrides qdev->realize of the parent
> PCI device class to enable the Intel IGD to occupy slot 2 on the PCI bus
> since slot 2 was reserved by xen_igd_reserve_slot when the PCI bus was
> created in hw/i386/pc_piix.c for the case when igd-passthru=on.
> 
> Move the call to xen_host_pci_device_get, and the associated error
> handling, from xen_pt_realize to the new xen_igd_clear_slot function to
> initialize the device class and vendor values which enables the checks for
> the Intel IGD to succeed. The verification that the host device is an
> Intel IGD to be passed through is done by checking the domain, bus, slot,
> and function values as well as by checking that gfx_passthru is enabled,
> the device class is VGA, and the device vendor in Intel.
> 
> Signed-off-by: Chuck Zmudzinski <brchuckz@aol.com>
> ---
> Notes that might be helpful to reviewers of patched code in hw/xen:
> 
> The new functions and types are based on recommendations from Qemu docs:
> https://qemu.readthedocs.io/en/latest/devel/qom.html
> 
> Notes that might be helpful to reviewers of patched code in hw/i386:
> 
> The small patch to hw/i386/pc_piix.c is protected by CONFIG_XEN so it does
> not affect builds that do not have CONFIG_XEN defined.
> 
> xen_igd_gfx_pt_enabled() in the patched hw/i386/pc_piix.c file is an
> existing function that is only true when Qemu is built with
> xen-pci-passthrough enabled and the administrator has configured the Xen
> HVM guest with Qemu's igd-passthru=on option.
> 
> v2: Remove From: <email address> tag at top of commit message
> 
> v3: Changed the test for the Intel IGD in xen_igd_clear_slot:
> 
>     if (is_igd_vga_passthrough(&s->real_device) &&
>         (s->real_device.vendor_id == PCI_VENDOR_ID_INTEL)) {
> 
>     is changed to
> 
>     if (xen_igd_gfx_pt_enabled() && (s->hostaddr.slot == 2)
>         && (s->hostaddr.function == 0)) {
> 
>     I hoped that I could use the test in v2, since it matches the
>     other tests for the Intel IGD in Qemu and Xen, but those tests
>     do not work because the necessary data structures are not set with
>     their values yet. So instead use the test that the administrator
>     has enabled gfx_passthru and the device address on the host is
>     02.0. This test does detect the Intel IGD correctly.
> 
> v4: Use brchuckz@aol.com instead of brchuckz@netscape.net for the author's
>     email address to match the address used by the same author in commits
>     be9c61da and c0e86b76
>     
>     Change variable for XEN_PT_DEVICE_CLASS: xptc changed to xpdc
> 
> v5: The patch of xen_pt.c was re-worked to allow a more consistent test
>     for the Intel IGD that uses the same criteria as in other places.
>     This involved moving the call to xen_host_pci_device_get from
>     xen_pt_realize to xen_igd_clear_slot and updating the checks for the
>     Intel IGD in xen_igd_clear_slot:
>     
>     if (xen_igd_gfx_pt_enabled() && (s->hostaddr.slot == 2)
>         && (s->hostaddr.function == 0)) {
> 
>     is changed to
> 
>     if (is_igd_vga_passthrough(&s->real_device) &&
>         s->real_device.domain == 0 && s->real_device.bus == 0 &&
>         s->real_device.dev == 2 && s->real_device.func == 0 &&
>         s->real_device.vendor_id == PCI_VENDOR_ID_INTEL) {
> 
>     Added an explanation for the move of xen_host_pci_device_get from
>     xen_pt_realize to xen_igd_clear_slot to the commit message.
> 
>     Rebase.
> 
> v6: Fix logging by removing these lines from the move from xen_pt_realize
>     to xen_igd_clear_slot that was done in v5:
> 
>     XEN_PT_LOG(d, "Assigning real physical device %02x:%02x.%d"
>                " to devfn 0x%x\n",
>                s->hostaddr.bus, s->hostaddr.slot, s->hostaddr.function,
>                s->dev.devfn);
> 
>     This log needs to be in xen_pt_realize because s->dev.devfn is not
>     set yet in xen_igd_clear_slot.
> 
> v7: Inhibit out of context log message and needless processing by
>     adding 2 lines at the top of the new xen_igd_clear_slot function:
> 
>     if (!(pci_bus->slot_reserved_mask & XEN_PCI_IGD_SLOT_MASK))
>         return;
> 
>     Rebase. This removed an unnecessary header file from xen_pt.h 
> 
>  hw/i386/pc_piix.c    | 127 ++++++++++++++++++++++++++++++++-----------
>  hw/xen/xen_pt.c      |  49 ++++++++++++++---
>  hw/xen/xen_pt.h      |  16 ++++++
>  hw/xen/xen_pt_stub.c |   4 ++
>  4 files changed, 154 insertions(+), 42 deletions(-)
> 
> diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
> index b48047f50c..34a9736b5e 100644
> --- a/hw/i386/pc_piix.c
> +++ b/hw/i386/pc_piix.c
> @@ -32,6 +32,7 @@
>  #include "hw/i386/pc.h"
>  #include "hw/i386/apic.h"
>  #include "hw/pci-host/i440fx.h"
> +#include "hw/rtc/mc146818rtc.h"
>  #include "hw/southbridge/piix.h"
>  #include "hw/display/ramfb.h"
>  #include "hw/firmware/smbios.h"
> @@ -40,16 +41,16 @@
>  #include "hw/usb.h"
>  #include "net/net.h"
>  #include "hw/ide/pci.h"
> -#include "hw/ide/piix.h"
>  #include "hw/irq.h"
>  #include "sysemu/kvm.h"
>  #include "hw/kvm/clock.h"
>  #include "hw/sysbus.h"
> +#include "hw/i2c/i2c.h"
>  #include "hw/i2c/smbus_eeprom.h"
>  #include "hw/xen/xen-x86.h"
> +#include "hw/xen/xen.h"
>  #include "exec/memory.h"
>  #include "hw/acpi/acpi.h"
> -#include "hw/acpi/piix4.h"
>  #include "qapi/error.h"
>  #include "qemu/error-report.h"
>  #include "sysemu/xen.h"
> @@ -66,6 +67,7 @@
>  #include "kvm/kvm-cpu.h"
>  
>  #define MAX_IDE_BUS 2
> +#define XEN_IOAPIC_NUM_PIRQS 128ULL
>  
>  #ifdef CONFIG_IDE_ISA
>  static const int ide_iobase[MAX_IDE_BUS] = { 0x1f0, 0x170 };
> @@ -73,6 +75,32 @@ static const int ide_iobase2[MAX_IDE_BUS] = { 0x3f6, 0x376 };
>  static const int ide_irq[MAX_IDE_BUS] = { 14, 15 };
>  #endif
>  
> +/*
> + * Return the global irq number corresponding to a given device irq
> + * pin. We could also use the bus number to have a more precise mapping.
> + */
> +static int pci_slot_get_pirq(PCIDevice *pci_dev, int pci_intx)
> +{
> +    int slot_addend;
> +    slot_addend = PCI_SLOT(pci_dev->devfn) - 1;
> +    return (pci_intx + slot_addend) & 3;
> +}
> +
> +static void piix_intx_routing_notifier_xen(PCIDevice *dev)
> +{
> +    int i;
> +
> +    /* Scan for updates to PCI link routes (0x60-0x63). */
> +    for (i = 0; i < PIIX_NUM_PIRQS; i++) {
> +        uint8_t v = dev->config_read(dev, PIIX_PIRQCA + i, 1);
> +        if (v & 0x80) {
> +            v = 0;
> +        }
> +        v &= 0xf;
> +        xen_set_pci_link_route(i, v);
> +    }
> +}
> +
>  /* PC hardware initialisation */
>  static void pc_init1(MachineState *machine,
>                       const char *host_type, const char *pci_type)
> @@ -84,7 +112,7 @@ static void pc_init1(MachineState *machine,
>      MemoryRegion *system_io = get_system_io();
>      PCIBus *pci_bus;
>      ISABus *isa_bus;
> -    int piix3_devfn = -1;
> +    Object *piix4_pm;
>      qemu_irq smi_irq;
>      GSIState *gsi_state;
>      BusState *idebus[MAX_IDE_BUS];
> @@ -205,10 +233,9 @@ static void pc_init1(MachineState *machine,
>      gsi_state = pc_gsi_create(&x86ms->gsi, pcmc->pci_enabled);
>  
>      if (pcmc->pci_enabled) {
> -        PIIX3State *piix3;
> +        DeviceState *dev;
>          PCIDevice *pci_dev;
> -        const char *type = xen_enabled() ? TYPE_PIIX3_XEN_DEVICE
> -                                         : TYPE_PIIX3_DEVICE;
> +        int i;
>  
>          pci_bus = i440fx_init(pci_type,
>                                i440fx_host,
> @@ -216,21 +243,65 @@ static void pc_init1(MachineState *machine,
>                                x86ms->below_4g_mem_size,
>                                x86ms->above_4g_mem_size,
>                                pci_memory, ram_memory);
> +        pci_bus_map_irqs(pci_bus,
> +                         xen_enabled() ? xen_pci_slot_get_pirq
> +                                       : pci_slot_get_pirq);
>          pcms->bus = pci_bus;
>  
> -        pci_dev = pci_create_simple_multifunction(pci_bus, -1, true, type);
> -        piix3 = PIIX3_PCI_DEVICE(pci_dev);
> -        piix3->pic = x86ms->gsi;
> -        piix3_devfn = piix3->dev.devfn;
> -        isa_bus = ISA_BUS(qdev_get_child_bus(DEVICE(piix3), "isa.0"));
> +        pci_dev = pci_new_multifunction(-1, true, TYPE_PIIX3_DEVICE);
> +        object_property_set_bool(OBJECT(pci_dev), "has-usb",
> +                                 machine_usb(machine), &error_abort);
> +        object_property_set_bool(OBJECT(pci_dev), "has-acpi",
> +                                 x86_machine_is_acpi_enabled(x86ms),
> +                                 &error_abort);
> +        qdev_prop_set_uint32(DEVICE(pci_dev), "smb_io_base", 0xb100);
> +        object_property_set_bool(OBJECT(pci_dev), "smm-enabled",
> +                                 x86_machine_is_smm_enabled(x86ms),
> +                                 &error_abort);
> +        pci_realize_and_unref(pci_dev, pci_bus, &error_fatal);
> +
> +        if (xen_enabled()) {
> +            pci_device_set_intx_routing_notifier(
> +                        pci_dev, piix_intx_routing_notifier_xen);
> +
> +            /*
> +             * Xen supports additional interrupt routes from the PCI devices to
> +             * the IOAPIC: the four pins of each PCI device on the bus are also
> +             * connected to the IOAPIC directly.
> +             * These additional routes can be discovered through ACPI.
> +             */
> +            pci_bus_irqs(pci_bus, xen_intx_set_irq, pci_dev,
> +                         XEN_IOAPIC_NUM_PIRQS);
> +        }
> +
> +        dev = DEVICE(object_resolve_path_component(OBJECT(pci_dev), "pic"));
> +        for (i = 0; i < ISA_NUM_IRQS; i++) {
> +            qdev_connect_gpio_out(dev, i, x86ms->gsi[i]);
> +        }
> +        isa_bus = ISA_BUS(qdev_get_child_bus(DEVICE(pci_dev), "isa.0"));
> +        rtc_state = ISA_DEVICE(object_resolve_path_component(OBJECT(pci_dev),
> +                                                             "rtc"));
> +        piix4_pm = object_resolve_path_component(OBJECT(pci_dev), "pm");
> +        dev = DEVICE(object_resolve_path_component(OBJECT(pci_dev), "ide"));
> +        pci_ide_create_devs(PCI_DEVICE(dev));
> +        idebus[0] = qdev_get_child_bus(dev, "ide.0");
> +        idebus[1] = qdev_get_child_bus(dev, "ide.1");
>      } else {
>          pci_bus = NULL;
> +        piix4_pm = NULL;
>          isa_bus = isa_bus_new(NULL, get_system_memory(), system_io,
>                                &error_abort);
> +        isa_bus_irqs(isa_bus, x86ms->gsi);
> +
> +        rtc_state = isa_new(TYPE_MC146818_RTC);
> +        qdev_prop_set_int32(DEVICE(rtc_state), "base_year", 2000);
> +        isa_realize_and_unref(rtc_state, isa_bus, &error_fatal);
> +
>          i8257_dma_init(isa_bus, 0);
>          pcms->hpet_enabled = false;
> +        idebus[0] = NULL;
> +        idebus[1] = NULL;
>      }
> -    isa_bus_irqs(isa_bus, x86ms->gsi);
>  
>      if (x86ms->pic == ON_OFF_AUTO_ON || x86ms->pic == ON_OFF_AUTO_AUTO) {
>          pc_i8259_create(isa_bus, gsi_state->i8259_irq);
> @@ -252,18 +323,12 @@ static void pc_init1(MachineState *machine,
>      }
>  
>      /* init basic PC hardware */
> -    pc_basic_device_init(pcms, isa_bus, x86ms->gsi, &rtc_state, true,
> +    pc_basic_device_init(pcms, isa_bus, x86ms->gsi, rtc_state, true,
>                           0x4);
>  
>      pc_nic_init(pcmc, isa_bus, pci_bus);
>  
>      if (pcmc->pci_enabled) {
> -        PCIDevice *dev;
> -
> -        dev = pci_create_simple(pci_bus, piix3_devfn + 1, TYPE_PIIX3_IDE);
> -        pci_ide_create_devs(dev);
> -        idebus[0] = qdev_get_child_bus(&dev->qdev, "ide.0");
> -        idebus[1] = qdev_get_child_bus(&dev->qdev, "ide.1");
>          pc_cmos_init(pcms, idebus[0], idebus[1], rtc_state);
>      }
>  #ifdef CONFIG_IDE_ISA
> @@ -289,21 +354,9 @@ static void pc_init1(MachineState *machine,
>      }
>  #endif
>  
> -    if (pcmc->pci_enabled && machine_usb(machine)) {
> -        pci_create_simple(pci_bus, piix3_devfn + 2, "piix3-usb-uhci");
> -    }
> -
> -    if (pcmc->pci_enabled && x86_machine_is_acpi_enabled(X86_MACHINE(pcms))) {
> -        PCIDevice *piix4_pm;
> -
> +    if (piix4_pm) {
>          smi_irq = qemu_allocate_irq(pc_acpi_smi_interrupt, first_cpu, 0);
> -        piix4_pm = pci_new(piix3_devfn + 3, TYPE_PIIX4_PM);
> -        qdev_prop_set_uint32(DEVICE(piix4_pm), "smb_io_base", 0xb100);
> -        qdev_prop_set_bit(DEVICE(piix4_pm), "smm-enabled",
> -                          x86_machine_is_smm_enabled(x86ms));
> -        pci_realize_and_unref(piix4_pm, pci_bus, &error_fatal);
>  
> -        qdev_connect_gpio_out(DEVICE(piix4_pm), 0, x86ms->gsi[9]);
>          qdev_connect_gpio_out_named(DEVICE(piix4_pm), "smi-irq", 0, smi_irq);
>          pcms->smbus = I2C_BUS(qdev_get_child_bus(DEVICE(piix4_pm), "i2c"));
>          /* TODO: Populate SPD eeprom data.  */
> @@ -315,7 +368,7 @@ static void pc_init1(MachineState *machine,
>                                   object_property_allow_set_link,
>                                   OBJ_PROP_LINK_STRONG);
>          object_property_set_link(OBJECT(machine), PC_MACHINE_ACPI_DEVICE_PROP,
> -                                 OBJECT(piix4_pm), &error_abort);
> +                                 piix4_pm, &error_abort);
>      }
>  
>      if (machine->nvdimms_state->is_enabled) {
> @@ -405,6 +458,9 @@ static void pc_xen_hvm_init(MachineState *machine)
>      }
>  
>      pc_xen_hvm_init_pci(machine);
> +    if (xen_igd_gfx_pt_enabled()) {
> +        xen_igd_reserve_slot(pcms->bus);
> +    }
>      pci_create_simple(pcms->bus, -1, "xen-platform");
>  }
>  #endif
> @@ -441,6 +497,11 @@ static void pc_i440fx_8_0_machine_options(MachineClass *m)
>      pc_i440fx_machine_options(m);
>      m->alias = "pc";
>      m->is_default = true;
> +#ifdef CONFIG_MICROVM_DEFAULT
> +    m->is_default = false;
> +#else
> +    m->is_default = true;
> +#endif
>  }
>  
>  DEFINE_I440FX_MACHINE(v8_0, "pc-i440fx-8.0", NULL,
> diff --git a/hw/xen/xen_pt.c b/hw/xen/xen_pt.c
> index 0ec7e52183..eff38155ef 100644
> --- a/hw/xen/xen_pt.c
> +++ b/hw/xen/xen_pt.c
> @@ -780,15 +780,6 @@ static void xen_pt_realize(PCIDevice *d, Error **errp)
>                 s->hostaddr.bus, s->hostaddr.slot, s->hostaddr.function,
>                 s->dev.devfn);
>  
> -    xen_host_pci_device_get(&s->real_device,
> -                            s->hostaddr.domain, s->hostaddr.bus,
> -                            s->hostaddr.slot, s->hostaddr.function,
> -                            errp);
> -    if (*errp) {
> -        error_append_hint(errp, "Failed to \"open\" the real pci device");
> -        return;
> -    }
> -
>      s->is_virtfn = s->real_device.is_virtfn;
>      if (s->is_virtfn) {
>          XEN_PT_LOG(d, "%04x:%02x:%02x.%d is a SR-IOV Virtual Function\n",
> @@ -950,11 +941,50 @@ static void xen_pci_passthrough_instance_init(Object *obj)
>      PCI_DEVICE(obj)->cap_present |= QEMU_PCI_CAP_EXPRESS;
>  }
>  
> +void xen_igd_reserve_slot(PCIBus *pci_bus)
> +{
> +    XEN_PT_LOG(0, "Reserving PCI slot 2 for IGD\n");
> +    pci_bus->slot_reserved_mask |= XEN_PCI_IGD_SLOT_MASK;
> +}
> +
> +static void xen_igd_clear_slot(DeviceState *qdev, Error **errp)
> +{
> +    ERRP_GUARD();
> +    PCIDevice *pci_dev = (PCIDevice *)qdev;
> +    XenPCIPassthroughState *s = XEN_PT_DEVICE(pci_dev);
> +    XenPTDeviceClass *xpdc = XEN_PT_DEVICE_GET_CLASS(s);
> +    PCIBus *pci_bus = pci_get_bus(pci_dev);
> +
> +    if (!(pci_bus->slot_reserved_mask & XEN_PCI_IGD_SLOT_MASK))
> +        return;
> +
> +    xen_host_pci_device_get(&s->real_device,
> +                            s->hostaddr.domain, s->hostaddr.bus,
> +                            s->hostaddr.slot, s->hostaddr.function,
> +                            errp);
> +    if (*errp) {
> +        error_append_hint(errp, "Failed to \"open\" the real pci device");
> +        return;
> +    }
> +
> +    if (is_igd_vga_passthrough(&s->real_device) &&
> +        s->real_device.domain == 0 && s->real_device.bus == 0 &&
> +        s->real_device.dev == 2 && s->real_device.func == 0 &&
> +        s->real_device.vendor_id == PCI_VENDOR_ID_INTEL) {
> +        pci_bus->slot_reserved_mask &= ~XEN_PCI_IGD_SLOT_MASK;
> +        XEN_PT_LOG(pci_dev, "Intel IGD found, using slot 2\n");
> +    }
> +    xpdc->pci_qdev_realize(qdev, errp);
> +}
> +
>  static void xen_pci_passthrough_class_init(ObjectClass *klass, void *data)
>  {
>      DeviceClass *dc = DEVICE_CLASS(klass);
>      PCIDeviceClass *k = PCI_DEVICE_CLASS(klass);
>  
> +    XenPTDeviceClass *xpdc = XEN_PT_DEVICE_CLASS(klass);
> +    xpdc->pci_qdev_realize = dc->realize;
> +    dc->realize = xen_igd_clear_slot;
>      k->realize = xen_pt_realize;
>      k->exit = xen_pt_unregister_device;
>      k->config_read = xen_pt_pci_read_config;
> @@ -977,6 +1007,7 @@ static const TypeInfo xen_pci_passthrough_info = {
>      .instance_size = sizeof(XenPCIPassthroughState),
>      .instance_finalize = xen_pci_passthrough_finalize,
>      .class_init = xen_pci_passthrough_class_init,
> +    .class_size = sizeof(XenPTDeviceClass),
>      .instance_init = xen_pci_passthrough_instance_init,
>      .interfaces = (InterfaceInfo[]) {
>          { INTERFACE_CONVENTIONAL_PCI_DEVICE },
> diff --git a/hw/xen/xen_pt.h b/hw/xen/xen_pt.h
> index cf10fc7bbf..8c25932b4b 100644
> --- a/hw/xen/xen_pt.h
> +++ b/hw/xen/xen_pt.h
> @@ -2,6 +2,7 @@
>  #define XEN_PT_H
>  
>  #include "hw/xen/xen_common.h"
> +#include "hw/pci/pci_bus.h"
>  #include "xen-host-pci-device.h"
>  #include "qom/object.h"
>  
> @@ -40,7 +41,20 @@ typedef struct XenPTReg XenPTReg;
>  #define TYPE_XEN_PT_DEVICE "xen-pci-passthrough"
>  OBJECT_DECLARE_SIMPLE_TYPE(XenPCIPassthroughState, XEN_PT_DEVICE)
>  
> +#define XEN_PT_DEVICE_CLASS(klass) \
> +    OBJECT_CLASS_CHECK(XenPTDeviceClass, klass, TYPE_XEN_PT_DEVICE)
> +#define XEN_PT_DEVICE_GET_CLASS(obj) \
> +    OBJECT_GET_CLASS(XenPTDeviceClass, obj, TYPE_XEN_PT_DEVICE)
> +
> +typedef void (*XenPTQdevRealize)(DeviceState *qdev, Error **errp);
> +
> +typedef struct XenPTDeviceClass {
> +    PCIDeviceClass parent_class;
> +    XenPTQdevRealize pci_qdev_realize;
> +} XenPTDeviceClass;
> +
>  uint32_t igd_read_opregion(XenPCIPassthroughState *s);
> +void xen_igd_reserve_slot(PCIBus *pci_bus);
>  void igd_write_opregion(XenPCIPassthroughState *s, uint32_t val);
>  void xen_igd_passthrough_isa_bridge_create(XenPCIPassthroughState *s,
>                                             XenHostPCIDevice *dev);
> @@ -75,6 +89,8 @@ typedef int (*xen_pt_conf_byte_read)
>  
>  #define XEN_PCI_INTEL_OPREGION 0xfc
>  
> +#define XEN_PCI_IGD_SLOT_MASK 0x4UL /* Intel IGD slot_reserved_mask */
> +
>  typedef enum {
>      XEN_PT_GRP_TYPE_HARDWIRED = 0,  /* 0 Hardwired reg group */
>      XEN_PT_GRP_TYPE_EMU,            /* emul reg group */
> diff --git a/hw/xen/xen_pt_stub.c b/hw/xen/xen_pt_stub.c
> index 2d8cac8d54..5c108446a8 100644
> --- a/hw/xen/xen_pt_stub.c
> +++ b/hw/xen/xen_pt_stub.c
> @@ -20,3 +20,7 @@ void xen_igd_gfx_pt_set(bool value, Error **errp)
>          error_setg(errp, "Xen PCI passthrough support not built in");
>      }
>  }
> +
> +void xen_igd_reserve_slot(PCIBus *pci_bus)
> +{
> +}
> -- 
> 2.39.0
Re: [PATCH v7] xen/pt: reserve PCI slot 2 for Intel igd-passthru
Posted by Chuck Zmudzinski 1 year, 3 months ago
On 1/9/23 5:33 PM, Michael S. Tsirkin wrote:
> On Mon, Jan 09, 2023 at 04:55:42PM -0500, Chuck Zmudzinski wrote:
>> Intel specifies that the Intel IGD must occupy slot 2 on the PCI bus,
>> as noted in docs/igd-assign.txt in the Qemu source code.
>> 
>> Currently, when the xl toolstack is used to configure a Xen HVM guest with
>> Intel IGD passthrough to the guest with the Qemu upstream device model,
>> a Qemu emulated PCI device will occupy slot 2 and the Intel IGD will occupy
>> a different slot. This problem often prevents the guest from booting.
>> 
>> The only available workaround is not good: Configure Xen HVM guests to use
>> the old and no longer maintained Qemu traditional device model available
>> from xenbits.xen.org which does reserve slot 2 for the Intel IGD.
>> 
>> To implement this feature in the Qemu upstream device model for Xen HVM
>> guests, introduce the following new functions, types, and macros:
>> 
>> * XEN_PT_DEVICE_CLASS declaration, based on the existing TYPE_XEN_PT_DEVICE
>> * XEN_PT_DEVICE_GET_CLASS macro helper function for XEN_PT_DEVICE_CLASS
>> * typedef XenPTQdevRealize function pointer
>> * XEN_PCI_IGD_SLOT_MASK, the value of slot_reserved_mask to reserve slot 2
>> * xen_igd_reserve_slot and xen_igd_clear_slot functions
>> 
>> The new xen_igd_reserve_slot function uses the existing slot_reserved_mask
>> member of PCIBus to reserve PCI slot 2 for Xen HVM guests configured using
>> the xl toolstack with the gfx_passthru option enabled, which sets the
>> igd-passthru=on option to Qemu for the Xen HVM machine type.
> 
> I don't like how slot_reserved_mask is set initially then cleared on
> device realize.
> To me this looks like a fragile hack. I suggest one of the following
> 1. adding a new mask
> "slot-manual-mask" or some such blocking auto-allocation of a given
> slot without blocking its use if address is specified on command line.
> 2. adding a special property that overrides slot_reserved_mask
> for a given device.
> 
> both need changes in pci core but look like something generally
> useful.

I was hoping to not need to touch pci core but I understand it would be
better for this patch to not affect machines that are manually configured
on the command line.

However, keep in mind that this patch will only actually reserve the slot
initially for xen hvm machines (machine type "xenfv") that also are configured
with the qemu igd-passthru=on option which, AFAIK, only applies to machines
witn accel=xen. It will not affect kvm users at all. So I don't think this patch
will break many machines out there that manually specify the pci slots. The
only machines it could affect are machines configured for igd-passthru on xen.
This patch also does *not* reserve the slot initially for "xenfv" machines that
are not configured with igd passthrough which I am sure is the vast majority
of all the xen virtual machines out in the wild.

> 
> 
>> 
>> The new xen_igd_reserve_slot function also needs to be implemented in
>> hw/xen/xen_pt_stub.c to prevent FTBFS during the link stage for the case
>> when Qemu is configured with --enable-xen and --disable-xen-pci-passthrough,
>> in which case it does nothing.
>> 
>> The new xen_igd_clear_slot function overrides qdev->realize of the parent
>> PCI device class to enable the Intel IGD to occupy slot 2 on the PCI bus
>> since slot 2 was reserved by xen_igd_reserve_slot when the PCI bus was
>> created in hw/i386/pc_piix.c for the case when igd-passthru=on.
>> 
>> Move the call to xen_host_pci_device_get, and the associated error
>> handling, from xen_pt_realize to the new xen_igd_clear_slot function to
>> initialize the device class and vendor values which enables the checks for
>> the Intel IGD to succeed. The verification that the host device is an
>> Intel IGD to be passed through is done by checking the domain, bus, slot,
>> and function values as well as by checking that gfx_passthru is enabled,
>> the device class is VGA, and the device vendor in Intel.
>> 
>> Signed-off-by: Chuck Zmudzinski <brchuckz@aol.com>
>> ---
>> Notes that might be helpful to reviewers of patched code in hw/xen:
>> 
>> The new functions and types are based on recommendations from Qemu docs:
>> https://qemu.readthedocs.io/en/latest/devel/qom.html
>> 
>> Notes that might be helpful to reviewers of patched code in hw/i386:
>> 
>> The small patch to hw/i386/pc_piix.c is protected by CONFIG_XEN so it does
>> not affect builds that do not have CONFIG_XEN defined.
>> 
>> xen_igd_gfx_pt_enabled() in the patched hw/i386/pc_piix.c file is an
>> existing function that is only true when Qemu is built with
>> xen-pci-passthrough enabled and the administrator has configured the Xen
>> HVM guest with Qemu's igd-passthru=on option.
>> 
>> v2: Remove From: <email address> tag at top of commit message
>> 
>> v3: Changed the test for the Intel IGD in xen_igd_clear_slot:
>> 
>>     if (is_igd_vga_passthrough(&s->real_device) &&
>>         (s->real_device.vendor_id == PCI_VENDOR_ID_INTEL)) {
>> 
>>     is changed to
>> 
>>     if (xen_igd_gfx_pt_enabled() && (s->hostaddr.slot == 2)
>>         && (s->hostaddr.function == 0)) {
>> 
>>     I hoped that I could use the test in v2, since it matches the
>>     other tests for the Intel IGD in Qemu and Xen, but those tests
>>     do not work because the necessary data structures are not set with
>>     their values yet. So instead use the test that the administrator
>>     has enabled gfx_passthru and the device address on the host is
>>     02.0. This test does detect the Intel IGD correctly.
>> 
>> v4: Use brchuckz@aol.com instead of brchuckz@netscape.net for the author's
>>     email address to match the address used by the same author in commits
>>     be9c61da and c0e86b76
>>     
>>     Change variable for XEN_PT_DEVICE_CLASS: xptc changed to xpdc
>> 
>> v5: The patch of xen_pt.c was re-worked to allow a more consistent test
>>     for the Intel IGD that uses the same criteria as in other places.
>>     This involved moving the call to xen_host_pci_device_get from
>>     xen_pt_realize to xen_igd_clear_slot and updating the checks for the
>>     Intel IGD in xen_igd_clear_slot:
>>     
>>     if (xen_igd_gfx_pt_enabled() && (s->hostaddr.slot == 2)
>>         && (s->hostaddr.function == 0)) {
>> 
>>     is changed to
>> 
>>     if (is_igd_vga_passthrough(&s->real_device) &&
>>         s->real_device.domain == 0 && s->real_device.bus == 0 &&
>>         s->real_device.dev == 2 && s->real_device.func == 0 &&
>>         s->real_device.vendor_id == PCI_VENDOR_ID_INTEL) {
>> 
>>     Added an explanation for the move of xen_host_pci_device_get from
>>     xen_pt_realize to xen_igd_clear_slot to the commit message.
>> 
>>     Rebase.
>> 
>> v6: Fix logging by removing these lines from the move from xen_pt_realize
>>     to xen_igd_clear_slot that was done in v5:
>> 
>>     XEN_PT_LOG(d, "Assigning real physical device %02x:%02x.%d"
>>                " to devfn 0x%x\n",
>>                s->hostaddr.bus, s->hostaddr.slot, s->hostaddr.function,
>>                s->dev.devfn);
>> 
>>     This log needs to be in xen_pt_realize because s->dev.devfn is not
>>     set yet in xen_igd_clear_slot.
>> 
>> v7: Inhibit out of context log message and needless processing by
>>     adding 2 lines at the top of the new xen_igd_clear_slot function:
>> 
>>     if (!(pci_bus->slot_reserved_mask & XEN_PCI_IGD_SLOT_MASK))
>>         return;
>> 
>>     Rebase. This removed an unnecessary header file from xen_pt.h 
>> 
>>  hw/i386/pc_piix.c    | 127 ++++++++++++++++++++++++++++++++-----------
>>  hw/xen/xen_pt.c      |  49 ++++++++++++++---
>>  hw/xen/xen_pt.h      |  16 ++++++
>>  hw/xen/xen_pt_stub.c |   4 ++
>>  4 files changed, 154 insertions(+), 42 deletions(-)
>> 
>> diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
>> index b48047f50c..34a9736b5e 100644
>> --- a/hw/i386/pc_piix.c
>> +++ b/hw/i386/pc_piix.c
>> @@ -32,6 +32,7 @@
>>  #include "hw/i386/pc.h"
>>  #include "hw/i386/apic.h"
>>  #include "hw/pci-host/i440fx.h"
>> +#include "hw/rtc/mc146818rtc.h"
>>  #include "hw/southbridge/piix.h"
>>  #include "hw/display/ramfb.h"
>>  #include "hw/firmware/smbios.h"
>> @@ -40,16 +41,16 @@
>>  #include "hw/usb.h"
>>  #include "net/net.h"
>>  #include "hw/ide/pci.h"
>> -#include "hw/ide/piix.h"
>>  #include "hw/irq.h"
>>  #include "sysemu/kvm.h"
>>  #include "hw/kvm/clock.h"
>>  #include "hw/sysbus.h"
>> +#include "hw/i2c/i2c.h"
>>  #include "hw/i2c/smbus_eeprom.h"
>>  #include "hw/xen/xen-x86.h"
>> +#include "hw/xen/xen.h"
>>  #include "exec/memory.h"
>>  #include "hw/acpi/acpi.h"
>> -#include "hw/acpi/piix4.h"
>>  #include "qapi/error.h"
>>  #include "qemu/error-report.h"
>>  #include "sysemu/xen.h"
>> @@ -66,6 +67,7 @@
>>  #include "kvm/kvm-cpu.h"
>>  
>>  #define MAX_IDE_BUS 2
>> +#define XEN_IOAPIC_NUM_PIRQS 128ULL
>>  
>>  #ifdef CONFIG_IDE_ISA
>>  static const int ide_iobase[MAX_IDE_BUS] = { 0x1f0, 0x170 };
>> @@ -73,6 +75,32 @@ static const int ide_iobase2[MAX_IDE_BUS] = { 0x3f6, 0x376 };
>>  static const int ide_irq[MAX_IDE_BUS] = { 14, 15 };
>>  #endif
>>  
>> +/*
>> + * Return the global irq number corresponding to a given device irq
>> + * pin. We could also use the bus number to have a more precise mapping.
>> + */
>> +static int pci_slot_get_pirq(PCIDevice *pci_dev, int pci_intx)
>> +{
>> +    int slot_addend;
>> +    slot_addend = PCI_SLOT(pci_dev->devfn) - 1;
>> +    return (pci_intx + slot_addend) & 3;
>> +}
>> +
>> +static void piix_intx_routing_notifier_xen(PCIDevice *dev)
>> +{
>> +    int i;
>> +
>> +    /* Scan for updates to PCI link routes (0x60-0x63). */
>> +    for (i = 0; i < PIIX_NUM_PIRQS; i++) {
>> +        uint8_t v = dev->config_read(dev, PIIX_PIRQCA + i, 1);
>> +        if (v & 0x80) {
>> +            v = 0;
>> +        }
>> +        v &= 0xf;
>> +        xen_set_pci_link_route(i, v);
>> +    }
>> +}
>> +
>>  /* PC hardware initialisation */
>>  static void pc_init1(MachineState *machine,
>>                       const char *host_type, const char *pci_type)
>> @@ -84,7 +112,7 @@ static void pc_init1(MachineState *machine,
>>      MemoryRegion *system_io = get_system_io();
>>      PCIBus *pci_bus;
>>      ISABus *isa_bus;
>> -    int piix3_devfn = -1;
>> +    Object *piix4_pm;
>>      qemu_irq smi_irq;
>>      GSIState *gsi_state;
>>      BusState *idebus[MAX_IDE_BUS];
>> @@ -205,10 +233,9 @@ static void pc_init1(MachineState *machine,
>>      gsi_state = pc_gsi_create(&x86ms->gsi, pcmc->pci_enabled);
>>  
>>      if (pcmc->pci_enabled) {
>> -        PIIX3State *piix3;
>> +        DeviceState *dev;
>>          PCIDevice *pci_dev;
>> -        const char *type = xen_enabled() ? TYPE_PIIX3_XEN_DEVICE
>> -                                         : TYPE_PIIX3_DEVICE;
>> +        int i;
>>  
>>          pci_bus = i440fx_init(pci_type,
>>                                i440fx_host,
>> @@ -216,21 +243,65 @@ static void pc_init1(MachineState *machine,
>>                                x86ms->below_4g_mem_size,
>>                                x86ms->above_4g_mem_size,
>>                                pci_memory, ram_memory);
>> +        pci_bus_map_irqs(pci_bus,
>> +                         xen_enabled() ? xen_pci_slot_get_pirq
>> +                                       : pci_slot_get_pirq);
>>          pcms->bus = pci_bus;
>>  
>> -        pci_dev = pci_create_simple_multifunction(pci_bus, -1, true, type);
>> -        piix3 = PIIX3_PCI_DEVICE(pci_dev);
>> -        piix3->pic = x86ms->gsi;
>> -        piix3_devfn = piix3->dev.devfn;
>> -        isa_bus = ISA_BUS(qdev_get_child_bus(DEVICE(piix3), "isa.0"));
>> +        pci_dev = pci_new_multifunction(-1, true, TYPE_PIIX3_DEVICE);
>> +        object_property_set_bool(OBJECT(pci_dev), "has-usb",
>> +                                 machine_usb(machine), &error_abort);
>> +        object_property_set_bool(OBJECT(pci_dev), "has-acpi",
>> +                                 x86_machine_is_acpi_enabled(x86ms),
>> +                                 &error_abort);
>> +        qdev_prop_set_uint32(DEVICE(pci_dev), "smb_io_base", 0xb100);
>> +        object_property_set_bool(OBJECT(pci_dev), "smm-enabled",
>> +                                 x86_machine_is_smm_enabled(x86ms),
>> +                                 &error_abort);
>> +        pci_realize_and_unref(pci_dev, pci_bus, &error_fatal);
>> +
>> +        if (xen_enabled()) {
>> +            pci_device_set_intx_routing_notifier(
>> +                        pci_dev, piix_intx_routing_notifier_xen);
>> +
>> +            /*
>> +             * Xen supports additional interrupt routes from the PCI devices to
>> +             * the IOAPIC: the four pins of each PCI device on the bus are also
>> +             * connected to the IOAPIC directly.
>> +             * These additional routes can be discovered through ACPI.
>> +             */
>> +            pci_bus_irqs(pci_bus, xen_intx_set_irq, pci_dev,
>> +                         XEN_IOAPIC_NUM_PIRQS);
>> +        }
>> +
>> +        dev = DEVICE(object_resolve_path_component(OBJECT(pci_dev), "pic"));
>> +        for (i = 0; i < ISA_NUM_IRQS; i++) {
>> +            qdev_connect_gpio_out(dev, i, x86ms->gsi[i]);
>> +        }
>> +        isa_bus = ISA_BUS(qdev_get_child_bus(DEVICE(pci_dev), "isa.0"));
>> +        rtc_state = ISA_DEVICE(object_resolve_path_component(OBJECT(pci_dev),
>> +                                                             "rtc"));
>> +        piix4_pm = object_resolve_path_component(OBJECT(pci_dev), "pm");
>> +        dev = DEVICE(object_resolve_path_component(OBJECT(pci_dev), "ide"));
>> +        pci_ide_create_devs(PCI_DEVICE(dev));
>> +        idebus[0] = qdev_get_child_bus(dev, "ide.0");
>> +        idebus[1] = qdev_get_child_bus(dev, "ide.1");
>>      } else {
>>          pci_bus = NULL;
>> +        piix4_pm = NULL;
>>          isa_bus = isa_bus_new(NULL, get_system_memory(), system_io,
>>                                &error_abort);
>> +        isa_bus_irqs(isa_bus, x86ms->gsi);
>> +
>> +        rtc_state = isa_new(TYPE_MC146818_RTC);
>> +        qdev_prop_set_int32(DEVICE(rtc_state), "base_year", 2000);
>> +        isa_realize_and_unref(rtc_state, isa_bus, &error_fatal);
>> +
>>          i8257_dma_init(isa_bus, 0);
>>          pcms->hpet_enabled = false;
>> +        idebus[0] = NULL;
>> +        idebus[1] = NULL;
>>      }
>> -    isa_bus_irqs(isa_bus, x86ms->gsi);
>>  
>>      if (x86ms->pic == ON_OFF_AUTO_ON || x86ms->pic == ON_OFF_AUTO_AUTO) {
>>          pc_i8259_create(isa_bus, gsi_state->i8259_irq);
>> @@ -252,18 +323,12 @@ static void pc_init1(MachineState *machine,
>>      }
>>  
>>      /* init basic PC hardware */
>> -    pc_basic_device_init(pcms, isa_bus, x86ms->gsi, &rtc_state, true,
>> +    pc_basic_device_init(pcms, isa_bus, x86ms->gsi, rtc_state, true,
>>                           0x4);
>>  
>>      pc_nic_init(pcmc, isa_bus, pci_bus);
>>  
>>      if (pcmc->pci_enabled) {
>> -        PCIDevice *dev;
>> -
>> -        dev = pci_create_simple(pci_bus, piix3_devfn + 1, TYPE_PIIX3_IDE);
>> -        pci_ide_create_devs(dev);
>> -        idebus[0] = qdev_get_child_bus(&dev->qdev, "ide.0");
>> -        idebus[1] = qdev_get_child_bus(&dev->qdev, "ide.1");
>>          pc_cmos_init(pcms, idebus[0], idebus[1], rtc_state);
>>      }
>>  #ifdef CONFIG_IDE_ISA
>> @@ -289,21 +354,9 @@ static void pc_init1(MachineState *machine,
>>      }
>>  #endif
>>  
>> -    if (pcmc->pci_enabled && machine_usb(machine)) {
>> -        pci_create_simple(pci_bus, piix3_devfn + 2, "piix3-usb-uhci");
>> -    }
>> -
>> -    if (pcmc->pci_enabled && x86_machine_is_acpi_enabled(X86_MACHINE(pcms))) {
>> -        PCIDevice *piix4_pm;
>> -
>> +    if (piix4_pm) {
>>          smi_irq = qemu_allocate_irq(pc_acpi_smi_interrupt, first_cpu, 0);
>> -        piix4_pm = pci_new(piix3_devfn + 3, TYPE_PIIX4_PM);
>> -        qdev_prop_set_uint32(DEVICE(piix4_pm), "smb_io_base", 0xb100);
>> -        qdev_prop_set_bit(DEVICE(piix4_pm), "smm-enabled",
>> -                          x86_machine_is_smm_enabled(x86ms));
>> -        pci_realize_and_unref(piix4_pm, pci_bus, &error_fatal);
>>  
>> -        qdev_connect_gpio_out(DEVICE(piix4_pm), 0, x86ms->gsi[9]);
>>          qdev_connect_gpio_out_named(DEVICE(piix4_pm), "smi-irq", 0, smi_irq);
>>          pcms->smbus = I2C_BUS(qdev_get_child_bus(DEVICE(piix4_pm), "i2c"));
>>          /* TODO: Populate SPD eeprom data.  */
>> @@ -315,7 +368,7 @@ static void pc_init1(MachineState *machine,
>>                                   object_property_allow_set_link,
>>                                   OBJ_PROP_LINK_STRONG);
>>          object_property_set_link(OBJECT(machine), PC_MACHINE_ACPI_DEVICE_PROP,
>> -                                 OBJECT(piix4_pm), &error_abort);
>> +                                 piix4_pm, &error_abort);
>>      }
>>  
>>      if (machine->nvdimms_state->is_enabled) {
>> @@ -405,6 +458,9 @@ static void pc_xen_hvm_init(MachineState *machine)
>>      }
>>  
>>      pc_xen_hvm_init_pci(machine);
>> +    if (xen_igd_gfx_pt_enabled()) {
>> +        xen_igd_reserve_slot(pcms->bus);
>> +    }
>>      pci_create_simple(pcms->bus, -1, "xen-platform");
>>  }
>>  #endif
>> @@ -441,6 +497,11 @@ static void pc_i440fx_8_0_machine_options(MachineClass *m)
>>      pc_i440fx_machine_options(m);
>>      m->alias = "pc";
>>      m->is_default = true;
>> +#ifdef CONFIG_MICROVM_DEFAULT
>> +    m->is_default = false;
>> +#else
>> +    m->is_default = true;
>> +#endif
>>  }
>>  
>>  DEFINE_I440FX_MACHINE(v8_0, "pc-i440fx-8.0", NULL,
>> diff --git a/hw/xen/xen_pt.c b/hw/xen/xen_pt.c
>> index 0ec7e52183..eff38155ef 100644
>> --- a/hw/xen/xen_pt.c
>> +++ b/hw/xen/xen_pt.c
>> @@ -780,15 +780,6 @@ static void xen_pt_realize(PCIDevice *d, Error **errp)
>>                 s->hostaddr.bus, s->hostaddr.slot, s->hostaddr.function,
>>                 s->dev.devfn);
>>  
>> -    xen_host_pci_device_get(&s->real_device,
>> -                            s->hostaddr.domain, s->hostaddr.bus,
>> -                            s->hostaddr.slot, s->hostaddr.function,
>> -                            errp);
>> -    if (*errp) {
>> -        error_append_hint(errp, "Failed to \"open\" the real pci device");
>> -        return;
>> -    }
>> -
>>      s->is_virtfn = s->real_device.is_virtfn;
>>      if (s->is_virtfn) {
>>          XEN_PT_LOG(d, "%04x:%02x:%02x.%d is a SR-IOV Virtual Function\n",
>> @@ -950,11 +941,50 @@ static void xen_pci_passthrough_instance_init(Object *obj)
>>      PCI_DEVICE(obj)->cap_present |= QEMU_PCI_CAP_EXPRESS;
>>  }
>>  
>> +void xen_igd_reserve_slot(PCIBus *pci_bus)
>> +{
>> +    XEN_PT_LOG(0, "Reserving PCI slot 2 for IGD\n");
>> +    pci_bus->slot_reserved_mask |= XEN_PCI_IGD_SLOT_MASK;
>> +}
>> +
>> +static void xen_igd_clear_slot(DeviceState *qdev, Error **errp)
>> +{
>> +    ERRP_GUARD();
>> +    PCIDevice *pci_dev = (PCIDevice *)qdev;
>> +    XenPCIPassthroughState *s = XEN_PT_DEVICE(pci_dev);
>> +    XenPTDeviceClass *xpdc = XEN_PT_DEVICE_GET_CLASS(s);
>> +    PCIBus *pci_bus = pci_get_bus(pci_dev);
>> +
>> +    if (!(pci_bus->slot_reserved_mask & XEN_PCI_IGD_SLOT_MASK))
>> +        return;
>> +
>> +    xen_host_pci_device_get(&s->real_device,
>> +                            s->hostaddr.domain, s->hostaddr.bus,
>> +                            s->hostaddr.slot, s->hostaddr.function,
>> +                            errp);
>> +    if (*errp) {
>> +        error_append_hint(errp, "Failed to \"open\" the real pci device");
>> +        return;
>> +    }
>> +
>> +    if (is_igd_vga_passthrough(&s->real_device) &&
>> +        s->real_device.domain == 0 && s->real_device.bus == 0 &&
>> +        s->real_device.dev == 2 && s->real_device.func == 0 &&
>> +        s->real_device.vendor_id == PCI_VENDOR_ID_INTEL) {
>> +        pci_bus->slot_reserved_mask &= ~XEN_PCI_IGD_SLOT_MASK;
>> +        XEN_PT_LOG(pci_dev, "Intel IGD found, using slot 2\n");
>> +    }
>> +    xpdc->pci_qdev_realize(qdev, errp);
>> +}
>> +
>>  static void xen_pci_passthrough_class_init(ObjectClass *klass, void *data)
>>  {
>>      DeviceClass *dc = DEVICE_CLASS(klass);
>>      PCIDeviceClass *k = PCI_DEVICE_CLASS(klass);
>>  
>> +    XenPTDeviceClass *xpdc = XEN_PT_DEVICE_CLASS(klass);
>> +    xpdc->pci_qdev_realize = dc->realize;
>> +    dc->realize = xen_igd_clear_slot;
>>      k->realize = xen_pt_realize;
>>      k->exit = xen_pt_unregister_device;
>>      k->config_read = xen_pt_pci_read_config;
>> @@ -977,6 +1007,7 @@ static const TypeInfo xen_pci_passthrough_info = {
>>      .instance_size = sizeof(XenPCIPassthroughState),
>>      .instance_finalize = xen_pci_passthrough_finalize,
>>      .class_init = xen_pci_passthrough_class_init,
>> +    .class_size = sizeof(XenPTDeviceClass),
>>      .instance_init = xen_pci_passthrough_instance_init,
>>      .interfaces = (InterfaceInfo[]) {
>>          { INTERFACE_CONVENTIONAL_PCI_DEVICE },
>> diff --git a/hw/xen/xen_pt.h b/hw/xen/xen_pt.h
>> index cf10fc7bbf..8c25932b4b 100644
>> --- a/hw/xen/xen_pt.h
>> +++ b/hw/xen/xen_pt.h
>> @@ -2,6 +2,7 @@
>>  #define XEN_PT_H
>>  
>>  #include "hw/xen/xen_common.h"
>> +#include "hw/pci/pci_bus.h"
>>  #include "xen-host-pci-device.h"
>>  #include "qom/object.h"
>>  
>> @@ -40,7 +41,20 @@ typedef struct XenPTReg XenPTReg;
>>  #define TYPE_XEN_PT_DEVICE "xen-pci-passthrough"
>>  OBJECT_DECLARE_SIMPLE_TYPE(XenPCIPassthroughState, XEN_PT_DEVICE)
>>  
>> +#define XEN_PT_DEVICE_CLASS(klass) \
>> +    OBJECT_CLASS_CHECK(XenPTDeviceClass, klass, TYPE_XEN_PT_DEVICE)
>> +#define XEN_PT_DEVICE_GET_CLASS(obj) \
>> +    OBJECT_GET_CLASS(XenPTDeviceClass, obj, TYPE_XEN_PT_DEVICE)
>> +
>> +typedef void (*XenPTQdevRealize)(DeviceState *qdev, Error **errp);
>> +
>> +typedef struct XenPTDeviceClass {
>> +    PCIDeviceClass parent_class;
>> +    XenPTQdevRealize pci_qdev_realize;
>> +} XenPTDeviceClass;
>> +
>>  uint32_t igd_read_opregion(XenPCIPassthroughState *s);
>> +void xen_igd_reserve_slot(PCIBus *pci_bus);
>>  void igd_write_opregion(XenPCIPassthroughState *s, uint32_t val);
>>  void xen_igd_passthrough_isa_bridge_create(XenPCIPassthroughState *s,
>>                                             XenHostPCIDevice *dev);
>> @@ -75,6 +89,8 @@ typedef int (*xen_pt_conf_byte_read)
>>  
>>  #define XEN_PCI_INTEL_OPREGION 0xfc
>>  
>> +#define XEN_PCI_IGD_SLOT_MASK 0x4UL /* Intel IGD slot_reserved_mask */
>> +
>>  typedef enum {
>>      XEN_PT_GRP_TYPE_HARDWIRED = 0,  /* 0 Hardwired reg group */
>>      XEN_PT_GRP_TYPE_EMU,            /* emul reg group */
>> diff --git a/hw/xen/xen_pt_stub.c b/hw/xen/xen_pt_stub.c
>> index 2d8cac8d54..5c108446a8 100644
>> --- a/hw/xen/xen_pt_stub.c
>> +++ b/hw/xen/xen_pt_stub.c
>> @@ -20,3 +20,7 @@ void xen_igd_gfx_pt_set(bool value, Error **errp)
>>          error_setg(errp, "Xen PCI passthrough support not built in");
>>      }
>>  }
>> +
>> +void xen_igd_reserve_slot(PCIBus *pci_bus)
>> +{
>> +}
>> -- 
>> 2.39.0
> 


Re: [PATCH v7] xen/pt: reserve PCI slot 2 for Intel igd-passthru
Posted by Michael S. Tsirkin 1 year, 3 months ago
On Mon, Jan 09, 2023 at 06:28:44PM -0500, Chuck Zmudzinski wrote:
> On 1/9/23 5:33 PM, Michael S. Tsirkin wrote:
> > On Mon, Jan 09, 2023 at 04:55:42PM -0500, Chuck Zmudzinski wrote:
> >> Intel specifies that the Intel IGD must occupy slot 2 on the PCI bus,
> >> as noted in docs/igd-assign.txt in the Qemu source code.
> >> 
> >> Currently, when the xl toolstack is used to configure a Xen HVM guest with
> >> Intel IGD passthrough to the guest with the Qemu upstream device model,
> >> a Qemu emulated PCI device will occupy slot 2 and the Intel IGD will occupy
> >> a different slot. This problem often prevents the guest from booting.
> >> 
> >> The only available workaround is not good: Configure Xen HVM guests to use
> >> the old and no longer maintained Qemu traditional device model available
> >> from xenbits.xen.org which does reserve slot 2 for the Intel IGD.
> >> 
> >> To implement this feature in the Qemu upstream device model for Xen HVM
> >> guests, introduce the following new functions, types, and macros:
> >> 
> >> * XEN_PT_DEVICE_CLASS declaration, based on the existing TYPE_XEN_PT_DEVICE
> >> * XEN_PT_DEVICE_GET_CLASS macro helper function for XEN_PT_DEVICE_CLASS
> >> * typedef XenPTQdevRealize function pointer
> >> * XEN_PCI_IGD_SLOT_MASK, the value of slot_reserved_mask to reserve slot 2
> >> * xen_igd_reserve_slot and xen_igd_clear_slot functions
> >> 
> >> The new xen_igd_reserve_slot function uses the existing slot_reserved_mask
> >> member of PCIBus to reserve PCI slot 2 for Xen HVM guests configured using
> >> the xl toolstack with the gfx_passthru option enabled, which sets the
> >> igd-passthru=on option to Qemu for the Xen HVM machine type.
> > 
> > I don't like how slot_reserved_mask is set initially then cleared on
> > device realize.
> > To me this looks like a fragile hack. I suggest one of the following
> > 1. adding a new mask
> > "slot-manual-mask" or some such blocking auto-allocation of a given
> > slot without blocking its use if address is specified on command line.
> > 2. adding a special property that overrides slot_reserved_mask
> > for a given device.
> > 
> > both need changes in pci core but look like something generally
> > useful.
> 
> I was hoping to not need to touch pci core but I understand it would be
> better for this patch to not affect machines that are manually configured
> on the command line.
> 
> However, keep in mind that this patch will only actually reserve the slot
> initially for xen hvm machines (machine type "xenfv") that also are configured
> with the qemu igd-passthru=on option which, AFAIK, only applies to machines
> witn accel=xen. It will not affect kvm users at all. So I don't think this patch
> will break many machines out there that manually specify the pci slots. The
> only machines it could affect are machines configured for igd-passthru on xen.
> This patch also does *not* reserve the slot initially for "xenfv" machines that
> are not configured with igd passthrough which I am sure is the vast majority
> of all the xen virtual machines out in the wild.

I'm just saying that adding a capability that is generally useful
as opposed to xen specific means less technical debt.

-- 
MST


Re: [PATCH v7] xen/pt: reserve PCI slot 2 for Intel igd-passthru
Posted by Chuck Zmudzinski 1 year, 3 months ago
On 1/9/23 6:35 PM, Michael S. Tsirkin wrote:
> On Mon, Jan 09, 2023 at 06:28:44PM -0500, Chuck Zmudzinski wrote:
>> On 1/9/23 5:33 PM, Michael S. Tsirkin wrote:
>> > On Mon, Jan 09, 2023 at 04:55:42PM -0500, Chuck Zmudzinski wrote:
>> >> Intel specifies that the Intel IGD must occupy slot 2 on the PCI bus,
>> >> as noted in docs/igd-assign.txt in the Qemu source code.
>> >> 
>> >> Currently, when the xl toolstack is used to configure a Xen HVM guest with
>> >> Intel IGD passthrough to the guest with the Qemu upstream device model,
>> >> a Qemu emulated PCI device will occupy slot 2 and the Intel IGD will occupy
>> >> a different slot. This problem often prevents the guest from booting.
>> >> 
>> >> The only available workaround is not good: Configure Xen HVM guests to use
>> >> the old and no longer maintained Qemu traditional device model available
>> >> from xenbits.xen.org which does reserve slot 2 for the Intel IGD.
>> >> 
>> >> To implement this feature in the Qemu upstream device model for Xen HVM
>> >> guests, introduce the following new functions, types, and macros:
>> >> 
>> >> * XEN_PT_DEVICE_CLASS declaration, based on the existing TYPE_XEN_PT_DEVICE
>> >> * XEN_PT_DEVICE_GET_CLASS macro helper function for XEN_PT_DEVICE_CLASS
>> >> * typedef XenPTQdevRealize function pointer
>> >> * XEN_PCI_IGD_SLOT_MASK, the value of slot_reserved_mask to reserve slot 2
>> >> * xen_igd_reserve_slot and xen_igd_clear_slot functions
>> >> 
>> >> The new xen_igd_reserve_slot function uses the existing slot_reserved_mask
>> >> member of PCIBus to reserve PCI slot 2 for Xen HVM guests configured using
>> >> the xl toolstack with the gfx_passthru option enabled, which sets the
>> >> igd-passthru=on option to Qemu for the Xen HVM machine type.
>> > 
>> > I don't like how slot_reserved_mask is set initially then cleared on
>> > device realize.
>> > To me this looks like a fragile hack. I suggest one of the following
>> > 1. adding a new mask
>> > "slot-manual-mask" or some such blocking auto-allocation of a given
>> > slot without blocking its use if address is specified on command line.
>> > 2. adding a special property that overrides slot_reserved_mask
>> > for a given device.
>> > 
>> > both need changes in pci core but look like something generally
>> > useful.
>> 
>> I was hoping to not need to touch pci core but I understand it would be
>> better for this patch to not affect machines that are manually configured
>> on the command line.
>> 
>> However, keep in mind that this patch will only actually reserve the slot
>> initially for xen hvm machines (machine type "xenfv") that also are configured
>> with the qemu igd-passthru=on option which, AFAIK, only applies to machines
>> witn accel=xen. It will not affect kvm users at all. So I don't think this patch
>> will break many machines out there that manually specify the pci slots. The
>> only machines it could affect are machines configured for igd-passthru on xen.
>> This patch also does *not* reserve the slot initially for "xenfv" machines that
>> are not configured with igd passthrough which I am sure is the vast majority
>> of all the xen virtual machines out in the wild.
> 
> I'm just saying that adding a capability that is generally useful
> as opposed to xen specific means less technical debt.
> 

I agree with that.