[v8] xen/pt: reserve PCI slot 2 for Intel igd-passthru

[PATCH v8] xen/pt: reserve PCI slot 2 for Intel igd-passthru

Posted by Chuck Zmudzinski 1 year, 3 months ago

Intel specifies that the Intel IGD must occupy slot 2 on the PCI bus,
as noted in docs/igd-assign.txt in the Qemu source code.

Currently, when the xl toolstack is used to configure a Xen HVM guest with
Intel IGD passthrough to the guest with the Qemu upstream device model,
a Qemu emulated PCI device will occupy slot 2 and the Intel IGD will occupy
a different slot. This problem often prevents the guest from booting.

The only available workaround is not good: Configure Xen HVM guests to use
the old and no longer maintained Qemu traditional device model available
from xenbits.xen.org which does reserve slot 2 for the Intel IGD.

To implement this feature in the Qemu upstream device model for Xen HVM
guests, introduce the following new functions, types, and macros:

* XEN_PT_DEVICE_CLASS declaration, based on the existing TYPE_XEN_PT_DEVICE
* XEN_PT_DEVICE_GET_CLASS macro helper function for XEN_PT_DEVICE_CLASS
* typedef XenPTQdevRealize function pointer
* XEN_PCI_IGD_SLOT_MASK, the value of slot_reserved_mask to reserve slot 2
* xen_igd_reserve_slot and xen_igd_clear_slot functions

The new xen_igd_reserve_slot function uses the existing slot_reserved_mask
member of PCIBus to reserve PCI slot 2 for Xen HVM guests configured using
the xl toolstack with the gfx_passthru option enabled, which sets the
igd-passthru=on option to Qemu for the Xen HVM machine type.

The new xen_igd_reserve_slot function also needs to be implemented in
hw/xen/xen_pt_stub.c to prevent FTBFS during the link stage for the case
when Qemu is configured with --enable-xen and --disable-xen-pci-passthrough,
in which case it does nothing.

The new xen_igd_clear_slot function overrides qdev->realize of the parent
PCI device class to enable the Intel IGD to occupy slot 2 on the PCI bus
since slot 2 was reserved by xen_igd_reserve_slot when the PCI bus was
created in hw/i386/pc_piix.c for the case when igd-passthru=on.

Move the call to xen_host_pci_device_get, and the associated error
handling, from xen_pt_realize to the new xen_igd_clear_slot function to
initialize the device class and vendor values which enables the checks for
the Intel IGD to succeed. The verification that the host device is an
Intel IGD to be passed through is done by checking the domain, bus, slot,
and function values as well as by checking that gfx_passthru is enabled,
the device class is VGA, and the device vendor in Intel.

Signed-off-by: Chuck Zmudzinski <brchuckz@aol.com>
---
Notes that might be helpful to reviewers of patched code in hw/xen:

The new functions and types are based on recommendations from Qemu docs:
https://qemu.readthedocs.io/en/latest/devel/qom.html

Notes that might be helpful to reviewers of patched code in hw/i386:

The small patch to hw/i386/pc_piix.c is protected by CONFIG_XEN so it does
not affect builds that do not have CONFIG_XEN defined.

xen_igd_gfx_pt_enabled() in the patched hw/i386/pc_piix.c file is an
existing function that is only true when Qemu is built with
xen-pci-passthrough enabled and the administrator has configured the Xen
HVM guest with Qemu's igd-passthru=on option.

v2: Remove From: <email address> tag at top of commit message

v3: Changed the test for the Intel IGD in xen_igd_clear_slot:

    if (is_igd_vga_passthrough(&s->real_device) &&
        (s->real_device.vendor_id == PCI_VENDOR_ID_INTEL)) {

    is changed to

    if (xen_igd_gfx_pt_enabled() && (s->hostaddr.slot == 2)
        && (s->hostaddr.function == 0)) {

    I hoped that I could use the test in v2, since it matches the
    other tests for the Intel IGD in Qemu and Xen, but those tests
    do not work because the necessary data structures are not set with
    their values yet. So instead use the test that the administrator
    has enabled gfx_passthru and the device address on the host is
    02.0. This test does detect the Intel IGD correctly.

v4: Use brchuckz@aol.com instead of brchuckz@netscape.net for the author's
    email address to match the address used by the same author in commits
    be9c61da and c0e86b76
    
    Change variable for XEN_PT_DEVICE_CLASS: xptc changed to xpdc

v5: The patch of xen_pt.c was re-worked to allow a more consistent test
    for the Intel IGD that uses the same criteria as in other places.
    This involved moving the call to xen_host_pci_device_get from
    xen_pt_realize to xen_igd_clear_slot and updating the checks for the
    Intel IGD in xen_igd_clear_slot:
    
    if (xen_igd_gfx_pt_enabled() && (s->hostaddr.slot == 2)
        && (s->hostaddr.function == 0)) {

    is changed to

    if (is_igd_vga_passthrough(&s->real_device) &&
        s->real_device.domain == 0 && s->real_device.bus == 0 &&
        s->real_device.dev == 2 && s->real_device.func == 0 &&
        s->real_device.vendor_id == PCI_VENDOR_ID_INTEL) {

    Added an explanation for the move of xen_host_pci_device_get from
    xen_pt_realize to xen_igd_clear_slot to the commit message.

    Rebase.

v6: Fix logging by removing these lines from the move from xen_pt_realize
    to xen_igd_clear_slot that was done in v5:

    XEN_PT_LOG(d, "Assigning real physical device %02x:%02x.%d"
               " to devfn 0x%x\n",
               s->hostaddr.bus, s->hostaddr.slot, s->hostaddr.function,
               s->dev.devfn);

    This log needs to be in xen_pt_realize because s->dev.devfn is not
    set yet in xen_igd_clear_slot.

v7: The v7 that was posted to the mailing list was incorrect. v8 is what
    v7 was intended to be.

v8: Inhibit out of context log message and needless processing by
    adding 2 lines at the top of the new xen_igd_clear_slot function:

    if (!(pci_bus->slot_reserved_mask & XEN_PCI_IGD_SLOT_MASK))
        return;

    Rebase. This removed an unnecessary header file from xen_pt.h 

 hw/i386/pc_piix.c    |  3 +++
 hw/xen/xen_pt.c      | 49 ++++++++++++++++++++++++++++++++++++--------
 hw/xen/xen_pt.h      | 16 +++++++++++++++
 hw/xen/xen_pt_stub.c |  4 ++++
 4 files changed, 63 insertions(+), 9 deletions(-)

diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
index b48047f50c..bc5efa4f59 100644
--- a/hw/i386/pc_piix.c
+++ b/hw/i386/pc_piix.c
@@ -405,6 +405,9 @@ static void pc_xen_hvm_init(MachineState *machine)
     }
 
     pc_xen_hvm_init_pci(machine);
+    if (xen_igd_gfx_pt_enabled()) {
+        xen_igd_reserve_slot(pcms->bus);
+    }
     pci_create_simple(pcms->bus, -1, "xen-platform");
 }
 #endif
diff --git a/hw/xen/xen_pt.c b/hw/xen/xen_pt.c
index 0ec7e52183..eff38155ef 100644
--- a/hw/xen/xen_pt.c
+++ b/hw/xen/xen_pt.c
@@ -780,15 +780,6 @@ static void xen_pt_realize(PCIDevice *d, Error **errp)
                s->hostaddr.bus, s->hostaddr.slot, s->hostaddr.function,
                s->dev.devfn);
 
-    xen_host_pci_device_get(&s->real_device,
-                            s->hostaddr.domain, s->hostaddr.bus,
-                            s->hostaddr.slot, s->hostaddr.function,
-                            errp);
-    if (*errp) {
-        error_append_hint(errp, "Failed to \"open\" the real pci device");
-        return;
-    }
-
     s->is_virtfn = s->real_device.is_virtfn;
     if (s->is_virtfn) {
         XEN_PT_LOG(d, "%04x:%02x:%02x.%d is a SR-IOV Virtual Function\n",
@@ -950,11 +941,50 @@ static void xen_pci_passthrough_instance_init(Object *obj)
     PCI_DEVICE(obj)->cap_present |= QEMU_PCI_CAP_EXPRESS;
 }
 
+void xen_igd_reserve_slot(PCIBus *pci_bus)
+{
+    XEN_PT_LOG(0, "Reserving PCI slot 2 for IGD\n");
+    pci_bus->slot_reserved_mask |= XEN_PCI_IGD_SLOT_MASK;
+}
+
+static void xen_igd_clear_slot(DeviceState *qdev, Error **errp)
+{
+    ERRP_GUARD();
+    PCIDevice *pci_dev = (PCIDevice *)qdev;
+    XenPCIPassthroughState *s = XEN_PT_DEVICE(pci_dev);
+    XenPTDeviceClass *xpdc = XEN_PT_DEVICE_GET_CLASS(s);
+    PCIBus *pci_bus = pci_get_bus(pci_dev);
+
+    if (!(pci_bus->slot_reserved_mask & XEN_PCI_IGD_SLOT_MASK))
+        return;
+
+    xen_host_pci_device_get(&s->real_device,
+                            s->hostaddr.domain, s->hostaddr.bus,
+                            s->hostaddr.slot, s->hostaddr.function,
+                            errp);
+    if (*errp) {
+        error_append_hint(errp, "Failed to \"open\" the real pci device");
+        return;
+    }
+
+    if (is_igd_vga_passthrough(&s->real_device) &&
+        s->real_device.domain == 0 && s->real_device.bus == 0 &&
+        s->real_device.dev == 2 && s->real_device.func == 0 &&
+        s->real_device.vendor_id == PCI_VENDOR_ID_INTEL) {
+        pci_bus->slot_reserved_mask &= ~XEN_PCI_IGD_SLOT_MASK;
+        XEN_PT_LOG(pci_dev, "Intel IGD found, using slot 2\n");
+    }
+    xpdc->pci_qdev_realize(qdev, errp);
+}
+
 static void xen_pci_passthrough_class_init(ObjectClass *klass, void *data)
 {
     DeviceClass *dc = DEVICE_CLASS(klass);
     PCIDeviceClass *k = PCI_DEVICE_CLASS(klass);
 
+    XenPTDeviceClass *xpdc = XEN_PT_DEVICE_CLASS(klass);
+    xpdc->pci_qdev_realize = dc->realize;
+    dc->realize = xen_igd_clear_slot;
     k->realize = xen_pt_realize;
     k->exit = xen_pt_unregister_device;
     k->config_read = xen_pt_pci_read_config;
@@ -977,6 +1007,7 @@ static const TypeInfo xen_pci_passthrough_info = {
     .instance_size = sizeof(XenPCIPassthroughState),
     .instance_finalize = xen_pci_passthrough_finalize,
     .class_init = xen_pci_passthrough_class_init,
+    .class_size = sizeof(XenPTDeviceClass),
     .instance_init = xen_pci_passthrough_instance_init,
     .interfaces = (InterfaceInfo[]) {
         { INTERFACE_CONVENTIONAL_PCI_DEVICE },
diff --git a/hw/xen/xen_pt.h b/hw/xen/xen_pt.h
index cf10fc7bbf..8c25932b4b 100644
--- a/hw/xen/xen_pt.h
+++ b/hw/xen/xen_pt.h
@@ -2,6 +2,7 @@
 #define XEN_PT_H
 
 #include "hw/xen/xen_common.h"
+#include "hw/pci/pci_bus.h"
 #include "xen-host-pci-device.h"
 #include "qom/object.h"
 
@@ -40,7 +41,20 @@ typedef struct XenPTReg XenPTReg;
 #define TYPE_XEN_PT_DEVICE "xen-pci-passthrough"
 OBJECT_DECLARE_SIMPLE_TYPE(XenPCIPassthroughState, XEN_PT_DEVICE)
 
+#define XEN_PT_DEVICE_CLASS(klass) \
+    OBJECT_CLASS_CHECK(XenPTDeviceClass, klass, TYPE_XEN_PT_DEVICE)
+#define XEN_PT_DEVICE_GET_CLASS(obj) \
+    OBJECT_GET_CLASS(XenPTDeviceClass, obj, TYPE_XEN_PT_DEVICE)
+
+typedef void (*XenPTQdevRealize)(DeviceState *qdev, Error **errp);
+
+typedef struct XenPTDeviceClass {
+    PCIDeviceClass parent_class;
+    XenPTQdevRealize pci_qdev_realize;
+} XenPTDeviceClass;
+
 uint32_t igd_read_opregion(XenPCIPassthroughState *s);
+void xen_igd_reserve_slot(PCIBus *pci_bus);
 void igd_write_opregion(XenPCIPassthroughState *s, uint32_t val);
 void xen_igd_passthrough_isa_bridge_create(XenPCIPassthroughState *s,
                                            XenHostPCIDevice *dev);
@@ -75,6 +89,8 @@ typedef int (*xen_pt_conf_byte_read)
 
 #define XEN_PCI_INTEL_OPREGION 0xfc
 
+#define XEN_PCI_IGD_SLOT_MASK 0x4UL /* Intel IGD slot_reserved_mask */
+
 typedef enum {
     XEN_PT_GRP_TYPE_HARDWIRED = 0,  /* 0 Hardwired reg group */
     XEN_PT_GRP_TYPE_EMU,            /* emul reg group */
diff --git a/hw/xen/xen_pt_stub.c b/hw/xen/xen_pt_stub.c
index 2d8cac8d54..5c108446a8 100644
--- a/hw/xen/xen_pt_stub.c
+++ b/hw/xen/xen_pt_stub.c
@@ -20,3 +20,7 @@ void xen_igd_gfx_pt_set(bool value, Error **errp)
         error_setg(errp, "Xen PCI passthrough support not built in");
     }
 }
+
+void xen_igd_reserve_slot(PCIBus *pci_bus)
+{
+}
-- 
2.39.0

Re: [PATCH v8] xen/pt: reserve PCI slot 2 for Intel igd-passthru

Posted by Michael S. Tsirkin 1 year, 3 months ago

On Tue, Jan 10, 2023 at 02:08:34AM -0500, Chuck Zmudzinski wrote:
> Intel specifies that the Intel IGD must occupy slot 2 on the PCI bus,
> as noted in docs/igd-assign.txt in the Qemu source code.
> 
> Currently, when the xl toolstack is used to configure a Xen HVM guest with
> Intel IGD passthrough to the guest with the Qemu upstream device model,
> a Qemu emulated PCI device will occupy slot 2 and the Intel IGD will occupy
> a different slot. This problem often prevents the guest from booting.
> 
> The only available workaround is not good: Configure Xen HVM guests to use
> the old and no longer maintained Qemu traditional device model available
> from xenbits.xen.org which does reserve slot 2 for the Intel IGD.
> 
> To implement this feature in the Qemu upstream device model for Xen HVM
> guests, introduce the following new functions, types, and macros:
> 
> * XEN_PT_DEVICE_CLASS declaration, based on the existing TYPE_XEN_PT_DEVICE
> * XEN_PT_DEVICE_GET_CLASS macro helper function for XEN_PT_DEVICE_CLASS
> * typedef XenPTQdevRealize function pointer
> * XEN_PCI_IGD_SLOT_MASK, the value of slot_reserved_mask to reserve slot 2
> * xen_igd_reserve_slot and xen_igd_clear_slot functions
> 
> The new xen_igd_reserve_slot function uses the existing slot_reserved_mask
> member of PCIBus to reserve PCI slot 2 for Xen HVM guests configured using
> the xl toolstack with the gfx_passthru option enabled, which sets the
> igd-passthru=on option to Qemu for the Xen HVM machine type.
> 
> The new xen_igd_reserve_slot function also needs to be implemented in
> hw/xen/xen_pt_stub.c to prevent FTBFS during the link stage for the case
> when Qemu is configured with --enable-xen and --disable-xen-pci-passthrough,
> in which case it does nothing.
> 
> The new xen_igd_clear_slot function overrides qdev->realize of the parent
> PCI device class to enable the Intel IGD to occupy slot 2 on the PCI bus
> since slot 2 was reserved by xen_igd_reserve_slot when the PCI bus was
> created in hw/i386/pc_piix.c for the case when igd-passthru=on.
> 
> Move the call to xen_host_pci_device_get, and the associated error
> handling, from xen_pt_realize to the new xen_igd_clear_slot function to
> initialize the device class and vendor values which enables the checks for
> the Intel IGD to succeed. The verification that the host device is an
> Intel IGD to be passed through is done by checking the domain, bus, slot,
> and function values as well as by checking that gfx_passthru is enabled,
> the device class is VGA, and the device vendor in Intel.
> 
> Signed-off-by: Chuck Zmudzinski <brchuckz@aol.com>
> ---
> Notes that might be helpful to reviewers of patched code in hw/xen:
> 
> The new functions and types are based on recommendations from Qemu docs:
> https://qemu.readthedocs.io/en/latest/devel/qom.html
> 
> Notes that might be helpful to reviewers of patched code in hw/i386:
> 
> The small patch to hw/i386/pc_piix.c is protected by CONFIG_XEN so it does
> not affect builds that do not have CONFIG_XEN defined.
> 
> xen_igd_gfx_pt_enabled() in the patched hw/i386/pc_piix.c file is an
> existing function that is only true when Qemu is built with
> xen-pci-passthrough enabled and the administrator has configured the Xen
> HVM guest with Qemu's igd-passthru=on option.
> 
> v2: Remove From: <email address> tag at top of commit message
> 
> v3: Changed the test for the Intel IGD in xen_igd_clear_slot:
> 
>     if (is_igd_vga_passthrough(&s->real_device) &&
>         (s->real_device.vendor_id == PCI_VENDOR_ID_INTEL)) {
> 
>     is changed to
> 
>     if (xen_igd_gfx_pt_enabled() && (s->hostaddr.slot == 2)
>         && (s->hostaddr.function == 0)) {
> 
>     I hoped that I could use the test in v2, since it matches the
>     other tests for the Intel IGD in Qemu and Xen, but those tests
>     do not work because the necessary data structures are not set with
>     their values yet. So instead use the test that the administrator
>     has enabled gfx_passthru and the device address on the host is
>     02.0. This test does detect the Intel IGD correctly.
> 
> v4: Use brchuckz@aol.com instead of brchuckz@netscape.net for the author's
>     email address to match the address used by the same author in commits
>     be9c61da and c0e86b76
>     
>     Change variable for XEN_PT_DEVICE_CLASS: xptc changed to xpdc
> 
> v5: The patch of xen_pt.c was re-worked to allow a more consistent test
>     for the Intel IGD that uses the same criteria as in other places.
>     This involved moving the call to xen_host_pci_device_get from
>     xen_pt_realize to xen_igd_clear_slot and updating the checks for the
>     Intel IGD in xen_igd_clear_slot:
>     
>     if (xen_igd_gfx_pt_enabled() && (s->hostaddr.slot == 2)
>         && (s->hostaddr.function == 0)) {
> 
>     is changed to
> 
>     if (is_igd_vga_passthrough(&s->real_device) &&
>         s->real_device.domain == 0 && s->real_device.bus == 0 &&
>         s->real_device.dev == 2 && s->real_device.func == 0 &&
>         s->real_device.vendor_id == PCI_VENDOR_ID_INTEL) {
> 
>     Added an explanation for the move of xen_host_pci_device_get from
>     xen_pt_realize to xen_igd_clear_slot to the commit message.
> 
>     Rebase.
> 
> v6: Fix logging by removing these lines from the move from xen_pt_realize
>     to xen_igd_clear_slot that was done in v5:
> 
>     XEN_PT_LOG(d, "Assigning real physical device %02x:%02x.%d"
>                " to devfn 0x%x\n",
>                s->hostaddr.bus, s->hostaddr.slot, s->hostaddr.function,
>                s->dev.devfn);
> 
>     This log needs to be in xen_pt_realize because s->dev.devfn is not
>     set yet in xen_igd_clear_slot.
> 
> v7: The v7 that was posted to the mailing list was incorrect. v8 is what
>     v7 was intended to be.
> 
> v8: Inhibit out of context log message and needless processing by
>     adding 2 lines at the top of the new xen_igd_clear_slot function:
> 
>     if (!(pci_bus->slot_reserved_mask & XEN_PCI_IGD_SLOT_MASK))
>         return;
> 
>     Rebase. This removed an unnecessary header file from xen_pt.h 
> 
>  hw/i386/pc_piix.c    |  3 +++
>  hw/xen/xen_pt.c      | 49 ++++++++++++++++++++++++++++++++++++--------
>  hw/xen/xen_pt.h      | 16 +++++++++++++++
>  hw/xen/xen_pt_stub.c |  4 ++++
>  4 files changed, 63 insertions(+), 9 deletions(-)
> 
> diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
> index b48047f50c..bc5efa4f59 100644
> --- a/hw/i386/pc_piix.c
> +++ b/hw/i386/pc_piix.c
> @@ -405,6 +405,9 @@ static void pc_xen_hvm_init(MachineState *machine)
>      }
>  
>      pc_xen_hvm_init_pci(machine);
> +    if (xen_igd_gfx_pt_enabled()) {
> +        xen_igd_reserve_slot(pcms->bus);
> +    }
>      pci_create_simple(pcms->bus, -1, "xen-platform");
>  }
>  #endif

I would even maybe go further and move the whole logic into
xen_igd_reserve_slot. And I would even just name it
xen_hvm_init_reserved_slots without worrying about the what
or why at the pc level.  At this point it will be up to Xen maintainers.

> diff --git a/hw/xen/xen_pt.c b/hw/xen/xen_pt.c
> index 0ec7e52183..eff38155ef 100644
> --- a/hw/xen/xen_pt.c
> +++ b/hw/xen/xen_pt.c
> @@ -780,15 +780,6 @@ static void xen_pt_realize(PCIDevice *d, Error **errp)
>                 s->hostaddr.bus, s->hostaddr.slot, s->hostaddr.function,
>                 s->dev.devfn);
>  
> -    xen_host_pci_device_get(&s->real_device,
> -                            s->hostaddr.domain, s->hostaddr.bus,
> -                            s->hostaddr.slot, s->hostaddr.function,
> -                            errp);
> -    if (*errp) {
> -        error_append_hint(errp, "Failed to \"open\" the real pci device");
> -        return;
> -    }
> -
>      s->is_virtfn = s->real_device.is_virtfn;
>      if (s->is_virtfn) {
>          XEN_PT_LOG(d, "%04x:%02x:%02x.%d is a SR-IOV Virtual Function\n",
> @@ -950,11 +941,50 @@ static void xen_pci_passthrough_instance_init(Object *obj)
>      PCI_DEVICE(obj)->cap_present |= QEMU_PCI_CAP_EXPRESS;
>  }
>  
> +void xen_igd_reserve_slot(PCIBus *pci_bus)
> +{
> +    XEN_PT_LOG(0, "Reserving PCI slot 2 for IGD\n");
> +    pci_bus->slot_reserved_mask |= XEN_PCI_IGD_SLOT_MASK;
> +}
> +
> +static void xen_igd_clear_slot(DeviceState *qdev, Error **errp)
> +{
> +    ERRP_GUARD();
> +    PCIDevice *pci_dev = (PCIDevice *)qdev;
> +    XenPCIPassthroughState *s = XEN_PT_DEVICE(pci_dev);
> +    XenPTDeviceClass *xpdc = XEN_PT_DEVICE_GET_CLASS(s);
> +    PCIBus *pci_bus = pci_get_bus(pci_dev);
> +
> +    if (!(pci_bus->slot_reserved_mask & XEN_PCI_IGD_SLOT_MASK))
> +        return;
> +
> +    xen_host_pci_device_get(&s->real_device,
> +                            s->hostaddr.domain, s->hostaddr.bus,
> +                            s->hostaddr.slot, s->hostaddr.function,
> +                            errp);
> +    if (*errp) {
> +        error_append_hint(errp, "Failed to \"open\" the real pci device");
> +        return;
> +    }
> +
> +    if (is_igd_vga_passthrough(&s->real_device) &&
> +        s->real_device.domain == 0 && s->real_device.bus == 0 &&
> +        s->real_device.dev == 2 && s->real_device.func == 0 &&
> +        s->real_device.vendor_id == PCI_VENDOR_ID_INTEL) {

how about macros for these?

#define XEN_PCI_IGD_DOMAIN 0
#define XEN_PCI_IGD_BUS 0
#define XEN_PCI_IGD_DEV 2
#define XEN_PCI_IGD_FN 0

> +        pci_bus->slot_reserved_mask &= ~XEN_PCI_IGD_SLOT_MASK;

If you are going to do this, you should set it back
either after pci_qdev_realize or in unrealize,
for symmetry.

> +        XEN_PT_LOG(pci_dev, "Intel IGD found, using slot 2\n");
> +    }


> +    xpdc->pci_qdev_realize(qdev, errp);
> +}
> +



>  static void xen_pci_passthrough_class_init(ObjectClass *klass, void *data)
>  {
>      DeviceClass *dc = DEVICE_CLASS(klass);
>      PCIDeviceClass *k = PCI_DEVICE_CLASS(klass);
>  
> +    XenPTDeviceClass *xpdc = XEN_PT_DEVICE_CLASS(klass);
> +    xpdc->pci_qdev_realize = dc->realize;
> +    dc->realize = xen_igd_clear_slot;
>      k->realize = xen_pt_realize;
>      k->exit = xen_pt_unregister_device;
>      k->config_read = xen_pt_pci_read_config;
> @@ -977,6 +1007,7 @@ static const TypeInfo xen_pci_passthrough_info = {
>      .instance_size = sizeof(XenPCIPassthroughState),
>      .instance_finalize = xen_pci_passthrough_finalize,
>      .class_init = xen_pci_passthrough_class_init,
> +    .class_size = sizeof(XenPTDeviceClass),
>      .instance_init = xen_pci_passthrough_instance_init,
>      .interfaces = (InterfaceInfo[]) {
>          { INTERFACE_CONVENTIONAL_PCI_DEVICE },
> diff --git a/hw/xen/xen_pt.h b/hw/xen/xen_pt.h
> index cf10fc7bbf..8c25932b4b 100644
> --- a/hw/xen/xen_pt.h
> +++ b/hw/xen/xen_pt.h
> @@ -2,6 +2,7 @@
>  #define XEN_PT_H
>  
>  #include "hw/xen/xen_common.h"
> +#include "hw/pci/pci_bus.h"
>  #include "xen-host-pci-device.h"
>  #include "qom/object.h"
>  
> @@ -40,7 +41,20 @@ typedef struct XenPTReg XenPTReg;
>  #define TYPE_XEN_PT_DEVICE "xen-pci-passthrough"
>  OBJECT_DECLARE_SIMPLE_TYPE(XenPCIPassthroughState, XEN_PT_DEVICE)
>  
> +#define XEN_PT_DEVICE_CLASS(klass) \
> +    OBJECT_CLASS_CHECK(XenPTDeviceClass, klass, TYPE_XEN_PT_DEVICE)
> +#define XEN_PT_DEVICE_GET_CLASS(obj) \
> +    OBJECT_GET_CLASS(XenPTDeviceClass, obj, TYPE_XEN_PT_DEVICE)
> +
> +typedef void (*XenPTQdevRealize)(DeviceState *qdev, Error **errp);
> +
> +typedef struct XenPTDeviceClass {
> +    PCIDeviceClass parent_class;
> +    XenPTQdevRealize pci_qdev_realize;
> +} XenPTDeviceClass;
> +
>  uint32_t igd_read_opregion(XenPCIPassthroughState *s);
> +void xen_igd_reserve_slot(PCIBus *pci_bus);
>  void igd_write_opregion(XenPCIPassthroughState *s, uint32_t val);
>  void xen_igd_passthrough_isa_bridge_create(XenPCIPassthroughState *s,
>                                             XenHostPCIDevice *dev);
> @@ -75,6 +89,8 @@ typedef int (*xen_pt_conf_byte_read)
>  
>  #define XEN_PCI_INTEL_OPREGION 0xfc
>  
> +#define XEN_PCI_IGD_SLOT_MASK 0x4UL /* Intel IGD slot_reserved_mask */
> +

I think you want to calculate this based on dev fn:

#define XEN_PCI_IGD_SLOT_MASK \
	(0x1 << PCI_SLOT(PCI_DEVFN(XEN_PCI_IGD_DEV, XEN_PCI_IGD_FN)))


>  typedef enum {
>      XEN_PT_GRP_TYPE_HARDWIRED = 0,  /* 0 Hardwired reg group */
>      XEN_PT_GRP_TYPE_EMU,            /* emul reg group */
> diff --git a/hw/xen/xen_pt_stub.c b/hw/xen/xen_pt_stub.c
> index 2d8cac8d54..5c108446a8 100644
> --- a/hw/xen/xen_pt_stub.c
> +++ b/hw/xen/xen_pt_stub.c
> @@ -20,3 +20,7 @@ void xen_igd_gfx_pt_set(bool value, Error **errp)
>          error_setg(errp, "Xen PCI passthrough support not built in");
>      }
>  }
> +
> +void xen_igd_reserve_slot(PCIBus *pci_bus)
> +{
> +}
> -- 
> 2.39.0

Re: [PATCH v8] xen/pt: reserve PCI slot 2 for Intel igd-passthru

Posted by Chuck Zmudzinski 1 year, 3 months ago

On 1/10/23 3:16 AM, Michael S. Tsirkin wrote:
> On Tue, Jan 10, 2023 at 02:08:34AM -0500, Chuck Zmudzinski wrote:
>> Intel specifies that the Intel IGD must occupy slot 2 on the PCI bus,
>> as noted in docs/igd-assign.txt in the Qemu source code.
>> 
>> Currently, when the xl toolstack is used to configure a Xen HVM guest with
>> Intel IGD passthrough to the guest with the Qemu upstream device model,
>> a Qemu emulated PCI device will occupy slot 2 and the Intel IGD will occupy
>> a different slot. This problem often prevents the guest from booting.
>> 
>> The only available workaround is not good: Configure Xen HVM guests to use
>> the old and no longer maintained Qemu traditional device model available
>> from xenbits.xen.org which does reserve slot 2 for the Intel IGD.
>> 
>> To implement this feature in the Qemu upstream device model for Xen HVM
>> guests, introduce the following new functions, types, and macros:
>> 
>> * XEN_PT_DEVICE_CLASS declaration, based on the existing TYPE_XEN_PT_DEVICE
>> * XEN_PT_DEVICE_GET_CLASS macro helper function for XEN_PT_DEVICE_CLASS
>> * typedef XenPTQdevRealize function pointer
>> * XEN_PCI_IGD_SLOT_MASK, the value of slot_reserved_mask to reserve slot 2
>> * xen_igd_reserve_slot and xen_igd_clear_slot functions
>> 
>> The new xen_igd_reserve_slot function uses the existing slot_reserved_mask
>> member of PCIBus to reserve PCI slot 2 for Xen HVM guests configured using
>> the xl toolstack with the gfx_passthru option enabled, which sets the
>> igd-passthru=on option to Qemu for the Xen HVM machine type.
>> 
>> The new xen_igd_reserve_slot function also needs to be implemented in
>> hw/xen/xen_pt_stub.c to prevent FTBFS during the link stage for the case
>> when Qemu is configured with --enable-xen and --disable-xen-pci-passthrough,
>> in which case it does nothing.
>> 
>> The new xen_igd_clear_slot function overrides qdev->realize of the parent
>> PCI device class to enable the Intel IGD to occupy slot 2 on the PCI bus
>> since slot 2 was reserved by xen_igd_reserve_slot when the PCI bus was
>> created in hw/i386/pc_piix.c for the case when igd-passthru=on.
>> 
>> Move the call to xen_host_pci_device_get, and the associated error
>> handling, from xen_pt_realize to the new xen_igd_clear_slot function to
>> initialize the device class and vendor values which enables the checks for
>> the Intel IGD to succeed. The verification that the host device is an
>> Intel IGD to be passed through is done by checking the domain, bus, slot,
>> and function values as well as by checking that gfx_passthru is enabled,
>> the device class is VGA, and the device vendor in Intel.
>> 
>> Signed-off-by: Chuck Zmudzinski <brchuckz@aol.com>
>> ---
>> Notes that might be helpful to reviewers of patched code in hw/xen:
>> 
>> The new functions and types are based on recommendations from Qemu docs:
>> https://qemu.readthedocs.io/en/latest/devel/qom.html
>> 
>> Notes that might be helpful to reviewers of patched code in hw/i386:
>> 
>> The small patch to hw/i386/pc_piix.c is protected by CONFIG_XEN so it does
>> not affect builds that do not have CONFIG_XEN defined.
>> 
>> xen_igd_gfx_pt_enabled() in the patched hw/i386/pc_piix.c file is an
>> existing function that is only true when Qemu is built with
>> xen-pci-passthrough enabled and the administrator has configured the Xen
>> HVM guest with Qemu's igd-passthru=on option.
>> 
>> v2: Remove From: <email address> tag at top of commit message
>> 
>> v3: Changed the test for the Intel IGD in xen_igd_clear_slot:
>> 
>>     if (is_igd_vga_passthrough(&s->real_device) &&
>>         (s->real_device.vendor_id == PCI_VENDOR_ID_INTEL)) {
>> 
>>     is changed to
>> 
>>     if (xen_igd_gfx_pt_enabled() && (s->hostaddr.slot == 2)
>>         && (s->hostaddr.function == 0)) {
>> 
>>     I hoped that I could use the test in v2, since it matches the
>>     other tests for the Intel IGD in Qemu and Xen, but those tests
>>     do not work because the necessary data structures are not set with
>>     their values yet. So instead use the test that the administrator
>>     has enabled gfx_passthru and the device address on the host is
>>     02.0. This test does detect the Intel IGD correctly.
>> 
>> v4: Use brchuckz@aol.com instead of brchuckz@netscape.net for the author's
>>     email address to match the address used by the same author in commits
>>     be9c61da and c0e86b76
>>     
>>     Change variable for XEN_PT_DEVICE_CLASS: xptc changed to xpdc
>> 
>> v5: The patch of xen_pt.c was re-worked to allow a more consistent test
>>     for the Intel IGD that uses the same criteria as in other places.
>>     This involved moving the call to xen_host_pci_device_get from
>>     xen_pt_realize to xen_igd_clear_slot and updating the checks for the
>>     Intel IGD in xen_igd_clear_slot:
>>     
>>     if (xen_igd_gfx_pt_enabled() && (s->hostaddr.slot == 2)
>>         && (s->hostaddr.function == 0)) {
>> 
>>     is changed to
>> 
>>     if (is_igd_vga_passthrough(&s->real_device) &&
>>         s->real_device.domain == 0 && s->real_device.bus == 0 &&
>>         s->real_device.dev == 2 && s->real_device.func == 0 &&
>>         s->real_device.vendor_id == PCI_VENDOR_ID_INTEL) {
>> 
>>     Added an explanation for the move of xen_host_pci_device_get from
>>     xen_pt_realize to xen_igd_clear_slot to the commit message.
>> 
>>     Rebase.
>> 
>> v6: Fix logging by removing these lines from the move from xen_pt_realize
>>     to xen_igd_clear_slot that was done in v5:
>> 
>>     XEN_PT_LOG(d, "Assigning real physical device %02x:%02x.%d"
>>                " to devfn 0x%x\n",
>>                s->hostaddr.bus, s->hostaddr.slot, s->hostaddr.function,
>>                s->dev.devfn);
>> 
>>     This log needs to be in xen_pt_realize because s->dev.devfn is not
>>     set yet in xen_igd_clear_slot.
>> 
>> v7: The v7 that was posted to the mailing list was incorrect. v8 is what
>>     v7 was intended to be.
>> 
>> v8: Inhibit out of context log message and needless processing by
>>     adding 2 lines at the top of the new xen_igd_clear_slot function:
>> 
>>     if (!(pci_bus->slot_reserved_mask & XEN_PCI_IGD_SLOT_MASK))
>>         return;
>> 
>>     Rebase. This removed an unnecessary header file from xen_pt.h 
>> 
>>  hw/i386/pc_piix.c    |  3 +++
>>  hw/xen/xen_pt.c      | 49 ++++++++++++++++++++++++++++++++++++--------
>>  hw/xen/xen_pt.h      | 16 +++++++++++++++
>>  hw/xen/xen_pt_stub.c |  4 ++++
>>  4 files changed, 63 insertions(+), 9 deletions(-)
>> 
>> diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
>> index b48047f50c..bc5efa4f59 100644
>> --- a/hw/i386/pc_piix.c
>> +++ b/hw/i386/pc_piix.c
>> @@ -405,6 +405,9 @@ static void pc_xen_hvm_init(MachineState *machine)
>>      }
>>  
>>      pc_xen_hvm_init_pci(machine);
>> +    if (xen_igd_gfx_pt_enabled()) {
>> +        xen_igd_reserve_slot(pcms->bus);
>> +    }
>>      pci_create_simple(pcms->bus, -1, "xen-platform");
>>  }
>>  #endif
> 
> I would even maybe go further and move the whole logic into
> xen_igd_reserve_slot. And I would even just name it
> xen_hvm_init_reserved_slots without worrying about the what
> or why at the pc level.  At this point it will be up to Xen maintainers.
> 
>> diff --git a/hw/xen/xen_pt.c b/hw/xen/xen_pt.c
>> index 0ec7e52183..eff38155ef 100644
>> --- a/hw/xen/xen_pt.c
>> +++ b/hw/xen/xen_pt.c
>> @@ -780,15 +780,6 @@ static void xen_pt_realize(PCIDevice *d, Error **errp)
>>                 s->hostaddr.bus, s->hostaddr.slot, s->hostaddr.function,
>>                 s->dev.devfn);
>>  
>> -    xen_host_pci_device_get(&s->real_device,
>> -                            s->hostaddr.domain, s->hostaddr.bus,
>> -                            s->hostaddr.slot, s->hostaddr.function,
>> -                            errp);
>> -    if (*errp) {
>> -        error_append_hint(errp, "Failed to \"open\" the real pci device");
>> -        return;
>> -    }
>> -
>>      s->is_virtfn = s->real_device.is_virtfn;
>>      if (s->is_virtfn) {
>>          XEN_PT_LOG(d, "%04x:%02x:%02x.%d is a SR-IOV Virtual Function\n",
>> @@ -950,11 +941,50 @@ static void xen_pci_passthrough_instance_init(Object *obj)
>>      PCI_DEVICE(obj)->cap_present |= QEMU_PCI_CAP_EXPRESS;
>>  }
>>  
>> +void xen_igd_reserve_slot(PCIBus *pci_bus)
>> +{
>> +    XEN_PT_LOG(0, "Reserving PCI slot 2 for IGD\n");
>> +    pci_bus->slot_reserved_mask |= XEN_PCI_IGD_SLOT_MASK;
>> +}
>> +
>> +static void xen_igd_clear_slot(DeviceState *qdev, Error **errp)
>> +{
>> +    ERRP_GUARD();
>> +    PCIDevice *pci_dev = (PCIDevice *)qdev;
>> +    XenPCIPassthroughState *s = XEN_PT_DEVICE(pci_dev);
>> +    XenPTDeviceClass *xpdc = XEN_PT_DEVICE_GET_CLASS(s);
>> +    PCIBus *pci_bus = pci_get_bus(pci_dev);
>> +
>> +    if (!(pci_bus->slot_reserved_mask & XEN_PCI_IGD_SLOT_MASK))
>> +        return;
>> +
>> +    xen_host_pci_device_get(&s->real_device,
>> +                            s->hostaddr.domain, s->hostaddr.bus,
>> +                            s->hostaddr.slot, s->hostaddr.function,
>> +                            errp);
>> +    if (*errp) {
>> +        error_append_hint(errp, "Failed to \"open\" the real pci device");
>> +        return;
>> +    }
>> +
>> +    if (is_igd_vga_passthrough(&s->real_device) &&
>> +        s->real_device.domain == 0 && s->real_device.bus == 0 &&
>> +        s->real_device.dev == 2 && s->real_device.func == 0 &&
>> +        s->real_device.vendor_id == PCI_VENDOR_ID_INTEL) {
> 
> how about macros for these?
> 
> #define XEN_PCI_IGD_DOMAIN 0
> #define XEN_PCI_IGD_BUS 0
> #define XEN_PCI_IGD_DEV 2
> #define XEN_PCI_IGD_FN 0
> 
>> +        pci_bus->slot_reserved_mask &= ~XEN_PCI_IGD_SLOT_MASK;
> 
> If you are going to do this, you should set it back
> either after pci_qdev_realize or in unrealize,
> for symmetry.

I presume you are talking about the log here. The clearing of
the bit must be done before pci_qdev_realize because the slot
is assigned in pci_qdev_realize. If the bit is not cleared *before*
calling pci_qdev_realize, the igd will not be assigned slot 2.
Doing that would defeat the whole purpose of the patch.

I suppose I could move the log message after pci_qdev_realize
for symmetry, that would not change how the patch works. 

> 
>> +        XEN_PT_LOG(pci_dev, "Intel IGD found, using slot 2\n");
>> +    }
> 
> 
>> +    xpdc->pci_qdev_realize(qdev, errp);
>> +}
>> +
> 
> 
> 
>>  static void xen_pci_passthrough_class_init(ObjectClass *klass, void *data)
>>  {
>>      DeviceClass *dc = DEVICE_CLASS(klass);
>>      PCIDeviceClass *k = PCI_DEVICE_CLASS(klass);
>>  
>> +    XenPTDeviceClass *xpdc = XEN_PT_DEVICE_CLASS(klass);
>> +    xpdc->pci_qdev_realize = dc->realize;
>> +    dc->realize = xen_igd_clear_slot;
>>      k->realize = xen_pt_realize;
>>      k->exit = xen_pt_unregister_device;
>>      k->config_read = xen_pt_pci_read_config;
>> @@ -977,6 +1007,7 @@ static const TypeInfo xen_pci_passthrough_info = {
>>      .instance_size = sizeof(XenPCIPassthroughState),
>>      .instance_finalize = xen_pci_passthrough_finalize,
>>      .class_init = xen_pci_passthrough_class_init,
>> +    .class_size = sizeof(XenPTDeviceClass),
>>      .instance_init = xen_pci_passthrough_instance_init,
>>      .interfaces = (InterfaceInfo[]) {
>>          { INTERFACE_CONVENTIONAL_PCI_DEVICE },
>> diff --git a/hw/xen/xen_pt.h b/hw/xen/xen_pt.h
>> index cf10fc7bbf..8c25932b4b 100644
>> --- a/hw/xen/xen_pt.h
>> +++ b/hw/xen/xen_pt.h
>> @@ -2,6 +2,7 @@
>>  #define XEN_PT_H
>>  
>>  #include "hw/xen/xen_common.h"
>> +#include "hw/pci/pci_bus.h"
>>  #include "xen-host-pci-device.h"
>>  #include "qom/object.h"
>>  
>> @@ -40,7 +41,20 @@ typedef struct XenPTReg XenPTReg;
>>  #define TYPE_XEN_PT_DEVICE "xen-pci-passthrough"
>>  OBJECT_DECLARE_SIMPLE_TYPE(XenPCIPassthroughState, XEN_PT_DEVICE)
>>  
>> +#define XEN_PT_DEVICE_CLASS(klass) \
>> +    OBJECT_CLASS_CHECK(XenPTDeviceClass, klass, TYPE_XEN_PT_DEVICE)
>> +#define XEN_PT_DEVICE_GET_CLASS(obj) \
>> +    OBJECT_GET_CLASS(XenPTDeviceClass, obj, TYPE_XEN_PT_DEVICE)
>> +
>> +typedef void (*XenPTQdevRealize)(DeviceState *qdev, Error **errp);
>> +
>> +typedef struct XenPTDeviceClass {
>> +    PCIDeviceClass parent_class;
>> +    XenPTQdevRealize pci_qdev_realize;
>> +} XenPTDeviceClass;
>> +
>>  uint32_t igd_read_opregion(XenPCIPassthroughState *s);
>> +void xen_igd_reserve_slot(PCIBus *pci_bus);
>>  void igd_write_opregion(XenPCIPassthroughState *s, uint32_t val);
>>  void xen_igd_passthrough_isa_bridge_create(XenPCIPassthroughState *s,
>>                                             XenHostPCIDevice *dev);
>> @@ -75,6 +89,8 @@ typedef int (*xen_pt_conf_byte_read)
>>  
>>  #define XEN_PCI_INTEL_OPREGION 0xfc
>>  
>> +#define XEN_PCI_IGD_SLOT_MASK 0x4UL /* Intel IGD slot_reserved_mask */
>> +
> 
> I think you want to calculate this based on dev fn:
> 
> #define XEN_PCI_IGD_SLOT_MASK \
> 	(0x1 << PCI_SLOT(PCI_DEVFN(XEN_PCI_IGD_DEV, XEN_PCI_IGD_FN)))
> 
> 
>>  typedef enum {
>>      XEN_PT_GRP_TYPE_HARDWIRED = 0,  /* 0 Hardwired reg group */
>>      XEN_PT_GRP_TYPE_EMU,            /* emul reg group */
>> diff --git a/hw/xen/xen_pt_stub.c b/hw/xen/xen_pt_stub.c
>> index 2d8cac8d54..5c108446a8 100644
>> --- a/hw/xen/xen_pt_stub.c
>> +++ b/hw/xen/xen_pt_stub.c
>> @@ -20,3 +20,7 @@ void xen_igd_gfx_pt_set(bool value, Error **errp)
>>          error_setg(errp, "Xen PCI passthrough support not built in");
>>      }
>>  }
>> +
>> +void xen_igd_reserve_slot(PCIBus *pci_bus)
>> +{
>> +}
>> -- 
>> 2.39.0
>

Re: [PATCH v8] xen/pt: reserve PCI slot 2 for Intel igd-passthru

Posted by Chuck Zmudzinski 1 year, 3 months ago

On 1/10/23 3:16 AM, Michael S. Tsirkin wrote:
> On Tue, Jan 10, 2023 at 02:08:34AM -0500, Chuck Zmudzinski wrote:
>> Intel specifies that the Intel IGD must occupy slot 2 on the PCI bus,
>> as noted in docs/igd-assign.txt in the Qemu source code.
>> 
>> Currently, when the xl toolstack is used to configure a Xen HVM guest with
>> Intel IGD passthrough to the guest with the Qemu upstream device model,
>> a Qemu emulated PCI device will occupy slot 2 and the Intel IGD will occupy
>> a different slot. This problem often prevents the guest from booting.
>> 
>> The only available workaround is not good: Configure Xen HVM guests to use
>> the old and no longer maintained Qemu traditional device model available
>> from xenbits.xen.org which does reserve slot 2 for the Intel IGD.
>> 
>> To implement this feature in the Qemu upstream device model for Xen HVM
>> guests, introduce the following new functions, types, and macros:
>> 
>> * XEN_PT_DEVICE_CLASS declaration, based on the existing TYPE_XEN_PT_DEVICE
>> * XEN_PT_DEVICE_GET_CLASS macro helper function for XEN_PT_DEVICE_CLASS
>> * typedef XenPTQdevRealize function pointer
>> * XEN_PCI_IGD_SLOT_MASK, the value of slot_reserved_mask to reserve slot 2
>> * xen_igd_reserve_slot and xen_igd_clear_slot functions
>> 
>> The new xen_igd_reserve_slot function uses the existing slot_reserved_mask
>> member of PCIBus to reserve PCI slot 2 for Xen HVM guests configured using
>> the xl toolstack with the gfx_passthru option enabled, which sets the
>> igd-passthru=on option to Qemu for the Xen HVM machine type.
>> 
>> The new xen_igd_reserve_slot function also needs to be implemented in
>> hw/xen/xen_pt_stub.c to prevent FTBFS during the link stage for the case
>> when Qemu is configured with --enable-xen and --disable-xen-pci-passthrough,
>> in which case it does nothing.
>> 
>> The new xen_igd_clear_slot function overrides qdev->realize of the parent
>> PCI device class to enable the Intel IGD to occupy slot 2 on the PCI bus
>> since slot 2 was reserved by xen_igd_reserve_slot when the PCI bus was
>> created in hw/i386/pc_piix.c for the case when igd-passthru=on.
>> 
>> Move the call to xen_host_pci_device_get, and the associated error
>> handling, from xen_pt_realize to the new xen_igd_clear_slot function to
>> initialize the device class and vendor values which enables the checks for
>> the Intel IGD to succeed. The verification that the host device is an
>> Intel IGD to be passed through is done by checking the domain, bus, slot,
>> and function values as well as by checking that gfx_passthru is enabled,
>> the device class is VGA, and the device vendor in Intel.
>> 
>> Signed-off-by: Chuck Zmudzinski <brchuckz@aol.com>
>> ---
>> Notes that might be helpful to reviewers of patched code in hw/xen:
>> 
>> The new functions and types are based on recommendations from Qemu docs:
>> https://qemu.readthedocs.io/en/latest/devel/qom.html
>> 
>> Notes that might be helpful to reviewers of patched code in hw/i386:
>> 
>> The small patch to hw/i386/pc_piix.c is protected by CONFIG_XEN so it does
>> not affect builds that do not have CONFIG_XEN defined.
>> 
>> xen_igd_gfx_pt_enabled() in the patched hw/i386/pc_piix.c file is an
>> existing function that is only true when Qemu is built with
>> xen-pci-passthrough enabled and the administrator has configured the Xen
>> HVM guest with Qemu's igd-passthru=on option.
>> 
>> v2: Remove From: <email address> tag at top of commit message
>> 
>> v3: Changed the test for the Intel IGD in xen_igd_clear_slot:
>> 
>>     if (is_igd_vga_passthrough(&s->real_device) &&
>>         (s->real_device.vendor_id == PCI_VENDOR_ID_INTEL)) {
>> 
>>     is changed to
>> 
>>     if (xen_igd_gfx_pt_enabled() && (s->hostaddr.slot == 2)
>>         && (s->hostaddr.function == 0)) {
>> 
>>     I hoped that I could use the test in v2, since it matches the
>>     other tests for the Intel IGD in Qemu and Xen, but those tests
>>     do not work because the necessary data structures are not set with
>>     their values yet. So instead use the test that the administrator
>>     has enabled gfx_passthru and the device address on the host is
>>     02.0. This test does detect the Intel IGD correctly.
>> 
>> v4: Use brchuckz@aol.com instead of brchuckz@netscape.net for the author's
>>     email address to match the address used by the same author in commits
>>     be9c61da and c0e86b76
>>     
>>     Change variable for XEN_PT_DEVICE_CLASS: xptc changed to xpdc
>> 
>> v5: The patch of xen_pt.c was re-worked to allow a more consistent test
>>     for the Intel IGD that uses the same criteria as in other places.
>>     This involved moving the call to xen_host_pci_device_get from
>>     xen_pt_realize to xen_igd_clear_slot and updating the checks for the
>>     Intel IGD in xen_igd_clear_slot:
>>     
>>     if (xen_igd_gfx_pt_enabled() && (s->hostaddr.slot == 2)
>>         && (s->hostaddr.function == 0)) {
>> 
>>     is changed to
>> 
>>     if (is_igd_vga_passthrough(&s->real_device) &&
>>         s->real_device.domain == 0 && s->real_device.bus == 0 &&
>>         s->real_device.dev == 2 && s->real_device.func == 0 &&
>>         s->real_device.vendor_id == PCI_VENDOR_ID_INTEL) {
>> 
>>     Added an explanation for the move of xen_host_pci_device_get from
>>     xen_pt_realize to xen_igd_clear_slot to the commit message.
>> 
>>     Rebase.
>> 
>> v6: Fix logging by removing these lines from the move from xen_pt_realize
>>     to xen_igd_clear_slot that was done in v5:
>> 
>>     XEN_PT_LOG(d, "Assigning real physical device %02x:%02x.%d"
>>                " to devfn 0x%x\n",
>>                s->hostaddr.bus, s->hostaddr.slot, s->hostaddr.function,
>>                s->dev.devfn);
>> 
>>     This log needs to be in xen_pt_realize because s->dev.devfn is not
>>     set yet in xen_igd_clear_slot.
>> 
>> v7: The v7 that was posted to the mailing list was incorrect. v8 is what
>>     v7 was intended to be.
>> 
>> v8: Inhibit out of context log message and needless processing by
>>     adding 2 lines at the top of the new xen_igd_clear_slot function:
>> 
>>     if (!(pci_bus->slot_reserved_mask & XEN_PCI_IGD_SLOT_MASK))
>>         return;
>> 
>>     Rebase. This removed an unnecessary header file from xen_pt.h 
>> 
>>  hw/i386/pc_piix.c    |  3 +++
>>  hw/xen/xen_pt.c      | 49 ++++++++++++++++++++++++++++++++++++--------
>>  hw/xen/xen_pt.h      | 16 +++++++++++++++
>>  hw/xen/xen_pt_stub.c |  4 ++++
>>  4 files changed, 63 insertions(+), 9 deletions(-)
>> 
>> diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
>> index b48047f50c..bc5efa4f59 100644
>> --- a/hw/i386/pc_piix.c
>> +++ b/hw/i386/pc_piix.c
>> @@ -405,6 +405,9 @@ static void pc_xen_hvm_init(MachineState *machine)
>>      }
>>  
>>      pc_xen_hvm_init_pci(machine);
>> +    if (xen_igd_gfx_pt_enabled()) {
>> +        xen_igd_reserve_slot(pcms->bus);
>> +    }
>>      pci_create_simple(pcms->bus, -1, "xen-platform");
>>  }
>>  #endif
> 
> I would even maybe go further and move the whole logic into
> xen_igd_reserve_slot. And I would even just name it
> xen_hvm_init_reserved_slots without worrying about the what
> or why at the pc level.  At this point it will be up to Xen maintainers.

I see to do that would be to resolve the two pc_xen_hvm*
functions in pc_piix.c that are guarded by CONFIG_XEN and
move them to an appropriate place such as xen-hvm.c.

That is along the lines of the work that Bernhard and Philippe
are doing, so I am Cc'ing them. My first inclination is just
to defer to them: I think eventually the little patch I propose
here to pc_piix.c is eventually going to be moved out of pc_piix.c
by Bernhard in a future patch.

What they have been doing is very conservative, and I expect
if and when Bernhard gets here to resolve those functions, they
will do it in a way that keeps the dependency of the xenfv machine
type on the pc machine type and the pc_init1 function.

What I would propose would be to break the dependency of xenfv
on the pc_init1 function. That is, I would propose having a
xenfv_init function in xen-hvm.c, and the first version would
be the current version of pc_init1, so xenfv would still depend
on many i440fx type things, but with the change xen developers
would be free to tweak xenfv_init without affecting the users
of the pc machine type.

Would that be a good idea? If I get posiive feedback for this
idea, I will put it on the table, probably initially as an RFC
patch.

Also, thanks, Michael, for your other suggestions for this patch
about using macros for the devfn constants.

Chuck

> 
>> diff --git a/hw/xen/xen_pt.c b/hw/xen/xen_pt.c
>> index 0ec7e52183..eff38155ef 100644
>> --- a/hw/xen/xen_pt.c
>> +++ b/hw/xen/xen_pt.c
>> @@ -780,15 +780,6 @@ static void xen_pt_realize(PCIDevice *d, Error **errp)
>>                 s->hostaddr.bus, s->hostaddr.slot, s->hostaddr.function,
>>                 s->dev.devfn);
>>  
>> -    xen_host_pci_device_get(&s->real_device,
>> -                            s->hostaddr.domain, s->hostaddr.bus,
>> -                            s->hostaddr.slot, s->hostaddr.function,
>> -                            errp);
>> -    if (*errp) {
>> -        error_append_hint(errp, "Failed to \"open\" the real pci device");
>> -        return;
>> -    }
>> -
>>      s->is_virtfn = s->real_device.is_virtfn;
>>      if (s->is_virtfn) {
>>          XEN_PT_LOG(d, "%04x:%02x:%02x.%d is a SR-IOV Virtual Function\n",
>> @@ -950,11 +941,50 @@ static void xen_pci_passthrough_instance_init(Object *obj)
>>      PCI_DEVICE(obj)->cap_present |= QEMU_PCI_CAP_EXPRESS;
>>  }
>>  
>> +void xen_igd_reserve_slot(PCIBus *pci_bus)
>> +{
>> +    XEN_PT_LOG(0, "Reserving PCI slot 2 for IGD\n");
>> +    pci_bus->slot_reserved_mask |= XEN_PCI_IGD_SLOT_MASK;
>> +}
>> +
>> +static void xen_igd_clear_slot(DeviceState *qdev, Error **errp)
>> +{
>> +    ERRP_GUARD();
>> +    PCIDevice *pci_dev = (PCIDevice *)qdev;
>> +    XenPCIPassthroughState *s = XEN_PT_DEVICE(pci_dev);
>> +    XenPTDeviceClass *xpdc = XEN_PT_DEVICE_GET_CLASS(s);
>> +    PCIBus *pci_bus = pci_get_bus(pci_dev);
>> +
>> +    if (!(pci_bus->slot_reserved_mask & XEN_PCI_IGD_SLOT_MASK))
>> +        return;
>> +
>> +    xen_host_pci_device_get(&s->real_device,
>> +                            s->hostaddr.domain, s->hostaddr.bus,
>> +                            s->hostaddr.slot, s->hostaddr.function,
>> +                            errp);
>> +    if (*errp) {
>> +        error_append_hint(errp, "Failed to \"open\" the real pci device");
>> +        return;
>> +    }
>> +
>> +    if (is_igd_vga_passthrough(&s->real_device) &&
>> +        s->real_device.domain == 0 && s->real_device.bus == 0 &&
>> +        s->real_device.dev == 2 && s->real_device.func == 0 &&
>> +        s->real_device.vendor_id == PCI_VENDOR_ID_INTEL) {
> 
> how about macros for these?
> 
> #define XEN_PCI_IGD_DOMAIN 0
> #define XEN_PCI_IGD_BUS 0
> #define XEN_PCI_IGD_DEV 2
> #define XEN_PCI_IGD_FN 0
> 
>> +        pci_bus->slot_reserved_mask &= ~XEN_PCI_IGD_SLOT_MASK;
> 
> If you are going to do this, you should set it back
> either after pci_qdev_realize or in unrealize,
> for symmetry.
> 
>> +        XEN_PT_LOG(pci_dev, "Intel IGD found, using slot 2\n");
>> +    }
> 
> 
>> +    xpdc->pci_qdev_realize(qdev, errp);
>> +}
>> +
> 
> 
> 
>>  static void xen_pci_passthrough_class_init(ObjectClass *klass, void *data)
>>  {
>>      DeviceClass *dc = DEVICE_CLASS(klass);
>>      PCIDeviceClass *k = PCI_DEVICE_CLASS(klass);
>>  
>> +    XenPTDeviceClass *xpdc = XEN_PT_DEVICE_CLASS(klass);
>> +    xpdc->pci_qdev_realize = dc->realize;
>> +    dc->realize = xen_igd_clear_slot;
>>      k->realize = xen_pt_realize;
>>      k->exit = xen_pt_unregister_device;
>>      k->config_read = xen_pt_pci_read_config;
>> @@ -977,6 +1007,7 @@ static const TypeInfo xen_pci_passthrough_info = {
>>      .instance_size = sizeof(XenPCIPassthroughState),
>>      .instance_finalize = xen_pci_passthrough_finalize,
>>      .class_init = xen_pci_passthrough_class_init,
>> +    .class_size = sizeof(XenPTDeviceClass),
>>      .instance_init = xen_pci_passthrough_instance_init,
>>      .interfaces = (InterfaceInfo[]) {
>>          { INTERFACE_CONVENTIONAL_PCI_DEVICE },
>> diff --git a/hw/xen/xen_pt.h b/hw/xen/xen_pt.h
>> index cf10fc7bbf..8c25932b4b 100644
>> --- a/hw/xen/xen_pt.h
>> +++ b/hw/xen/xen_pt.h
>> @@ -2,6 +2,7 @@
>>  #define XEN_PT_H
>>  
>>  #include "hw/xen/xen_common.h"
>> +#include "hw/pci/pci_bus.h"
>>  #include "xen-host-pci-device.h"
>>  #include "qom/object.h"
>>  
>> @@ -40,7 +41,20 @@ typedef struct XenPTReg XenPTReg;
>>  #define TYPE_XEN_PT_DEVICE "xen-pci-passthrough"
>>  OBJECT_DECLARE_SIMPLE_TYPE(XenPCIPassthroughState, XEN_PT_DEVICE)
>>  
>> +#define XEN_PT_DEVICE_CLASS(klass) \
>> +    OBJECT_CLASS_CHECK(XenPTDeviceClass, klass, TYPE_XEN_PT_DEVICE)
>> +#define XEN_PT_DEVICE_GET_CLASS(obj) \
>> +    OBJECT_GET_CLASS(XenPTDeviceClass, obj, TYPE_XEN_PT_DEVICE)
>> +
>> +typedef void (*XenPTQdevRealize)(DeviceState *qdev, Error **errp);
>> +
>> +typedef struct XenPTDeviceClass {
>> +    PCIDeviceClass parent_class;
>> +    XenPTQdevRealize pci_qdev_realize;
>> +} XenPTDeviceClass;
>> +
>>  uint32_t igd_read_opregion(XenPCIPassthroughState *s);
>> +void xen_igd_reserve_slot(PCIBus *pci_bus);
>>  void igd_write_opregion(XenPCIPassthroughState *s, uint32_t val);
>>  void xen_igd_passthrough_isa_bridge_create(XenPCIPassthroughState *s,
>>                                             XenHostPCIDevice *dev);
>> @@ -75,6 +89,8 @@ typedef int (*xen_pt_conf_byte_read)
>>  
>>  #define XEN_PCI_INTEL_OPREGION 0xfc
>>  
>> +#define XEN_PCI_IGD_SLOT_MASK 0x4UL /* Intel IGD slot_reserved_mask */
>> +
> 
> I think you want to calculate this based on dev fn:
> 
> #define XEN_PCI_IGD_SLOT_MASK \
> 	(0x1 << PCI_SLOT(PCI_DEVFN(XEN_PCI_IGD_DEV, XEN_PCI_IGD_FN)))
> 
> 
>>  typedef enum {
>>      XEN_PT_GRP_TYPE_HARDWIRED = 0,  /* 0 Hardwired reg group */
>>      XEN_PT_GRP_TYPE_EMU,            /* emul reg group */
>> diff --git a/hw/xen/xen_pt_stub.c b/hw/xen/xen_pt_stub.c
>> index 2d8cac8d54..5c108446a8 100644
>> --- a/hw/xen/xen_pt_stub.c
>> +++ b/hw/xen/xen_pt_stub.c
>> @@ -20,3 +20,7 @@ void xen_igd_gfx_pt_set(bool value, Error **errp)
>>          error_setg(errp, "Xen PCI passthrough support not built in");
>>      }
>>  }
>> +
>> +void xen_igd_reserve_slot(PCIBus *pci_bus)
>> +{
>> +}
>> -- 
>> 2.39.0
>

Re: [PATCH v8] xen/pt: reserve PCI slot 2 for Intel igd-passthru

Posted by Bernhard Beschow 1 year, 3 months ago


Am 11. Januar 2023 15:40:24 UTC schrieb Chuck Zmudzinski <brchuckz@aol.com>:
>On 1/10/23 3:16 AM, Michael S. Tsirkin wrote:
>> On Tue, Jan 10, 2023 at 02:08:34AM -0500, Chuck Zmudzinski wrote:
>>> Intel specifies that the Intel IGD must occupy slot 2 on the PCI bus,
>>> as noted in docs/igd-assign.txt in the Qemu source code.
>>> 
>>> Currently, when the xl toolstack is used to configure a Xen HVM guest with
>>> Intel IGD passthrough to the guest with the Qemu upstream device model,
>>> a Qemu emulated PCI device will occupy slot 2 and the Intel IGD will occupy
>>> a different slot. This problem often prevents the guest from booting.
>>> 
>>> The only available workaround is not good: Configure Xen HVM guests to use
>>> the old and no longer maintained Qemu traditional device model available
>>> from xenbits.xen.org which does reserve slot 2 for the Intel IGD.
>>> 
>>> To implement this feature in the Qemu upstream device model for Xen HVM
>>> guests, introduce the following new functions, types, and macros:
>>> 
>>> * XEN_PT_DEVICE_CLASS declaration, based on the existing TYPE_XEN_PT_DEVICE
>>> * XEN_PT_DEVICE_GET_CLASS macro helper function for XEN_PT_DEVICE_CLASS
>>> * typedef XenPTQdevRealize function pointer
>>> * XEN_PCI_IGD_SLOT_MASK, the value of slot_reserved_mask to reserve slot 2
>>> * xen_igd_reserve_slot and xen_igd_clear_slot functions
>>> 
>>> The new xen_igd_reserve_slot function uses the existing slot_reserved_mask
>>> member of PCIBus to reserve PCI slot 2 for Xen HVM guests configured using
>>> the xl toolstack with the gfx_passthru option enabled, which sets the
>>> igd-passthru=on option to Qemu for the Xen HVM machine type.
>>> 
>>> The new xen_igd_reserve_slot function also needs to be implemented in
>>> hw/xen/xen_pt_stub.c to prevent FTBFS during the link stage for the case
>>> when Qemu is configured with --enable-xen and --disable-xen-pci-passthrough,
>>> in which case it does nothing.
>>> 
>>> The new xen_igd_clear_slot function overrides qdev->realize of the parent
>>> PCI device class to enable the Intel IGD to occupy slot 2 on the PCI bus
>>> since slot 2 was reserved by xen_igd_reserve_slot when the PCI bus was
>>> created in hw/i386/pc_piix.c for the case when igd-passthru=on.
>>> 
>>> Move the call to xen_host_pci_device_get, and the associated error
>>> handling, from xen_pt_realize to the new xen_igd_clear_slot function to
>>> initialize the device class and vendor values which enables the checks for
>>> the Intel IGD to succeed. The verification that the host device is an
>>> Intel IGD to be passed through is done by checking the domain, bus, slot,
>>> and function values as well as by checking that gfx_passthru is enabled,
>>> the device class is VGA, and the device vendor in Intel.
>>> 
>>> Signed-off-by: Chuck Zmudzinski <brchuckz@aol.com>
>>> ---
>>> Notes that might be helpful to reviewers of patched code in hw/xen:
>>> 
>>> The new functions and types are based on recommendations from Qemu docs:
>>> https://qemu.readthedocs.io/en/latest/devel/qom.html
>>> 
>>> Notes that might be helpful to reviewers of patched code in hw/i386:
>>> 
>>> The small patch to hw/i386/pc_piix.c is protected by CONFIG_XEN so it does
>>> not affect builds that do not have CONFIG_XEN defined.
>>> 
>>> xen_igd_gfx_pt_enabled() in the patched hw/i386/pc_piix.c file is an
>>> existing function that is only true when Qemu is built with
>>> xen-pci-passthrough enabled and the administrator has configured the Xen
>>> HVM guest with Qemu's igd-passthru=on option.
>>> 
>>> v2: Remove From: <email address> tag at top of commit message
>>> 
>>> v3: Changed the test for the Intel IGD in xen_igd_clear_slot:
>>> 
>>>     if (is_igd_vga_passthrough(&s->real_device) &&
>>>         (s->real_device.vendor_id == PCI_VENDOR_ID_INTEL)) {
>>> 
>>>     is changed to
>>> 
>>>     if (xen_igd_gfx_pt_enabled() && (s->hostaddr.slot == 2)
>>>         && (s->hostaddr.function == 0)) {
>>> 
>>>     I hoped that I could use the test in v2, since it matches the
>>>     other tests for the Intel IGD in Qemu and Xen, but those tests
>>>     do not work because the necessary data structures are not set with
>>>     their values yet. So instead use the test that the administrator
>>>     has enabled gfx_passthru and the device address on the host is
>>>     02.0. This test does detect the Intel IGD correctly.
>>> 
>>> v4: Use brchuckz@aol.com instead of brchuckz@netscape.net for the author's
>>>     email address to match the address used by the same author in commits
>>>     be9c61da and c0e86b76
>>>     
>>>     Change variable for XEN_PT_DEVICE_CLASS: xptc changed to xpdc
>>> 
>>> v5: The patch of xen_pt.c was re-worked to allow a more consistent test
>>>     for the Intel IGD that uses the same criteria as in other places.
>>>     This involved moving the call to xen_host_pci_device_get from
>>>     xen_pt_realize to xen_igd_clear_slot and updating the checks for the
>>>     Intel IGD in xen_igd_clear_slot:
>>>     
>>>     if (xen_igd_gfx_pt_enabled() && (s->hostaddr.slot == 2)
>>>         && (s->hostaddr.function == 0)) {
>>> 
>>>     is changed to
>>> 
>>>     if (is_igd_vga_passthrough(&s->real_device) &&
>>>         s->real_device.domain == 0 && s->real_device.bus == 0 &&
>>>         s->real_device.dev == 2 && s->real_device.func == 0 &&
>>>         s->real_device.vendor_id == PCI_VENDOR_ID_INTEL) {
>>> 
>>>     Added an explanation for the move of xen_host_pci_device_get from
>>>     xen_pt_realize to xen_igd_clear_slot to the commit message.
>>> 
>>>     Rebase.
>>> 
>>> v6: Fix logging by removing these lines from the move from xen_pt_realize
>>>     to xen_igd_clear_slot that was done in v5:
>>> 
>>>     XEN_PT_LOG(d, "Assigning real physical device %02x:%02x.%d"
>>>                " to devfn 0x%x\n",
>>>                s->hostaddr.bus, s->hostaddr.slot, s->hostaddr.function,
>>>                s->dev.devfn);
>>> 
>>>     This log needs to be in xen_pt_realize because s->dev.devfn is not
>>>     set yet in xen_igd_clear_slot.
>>> 
>>> v7: The v7 that was posted to the mailing list was incorrect. v8 is what
>>>     v7 was intended to be.
>>> 
>>> v8: Inhibit out of context log message and needless processing by
>>>     adding 2 lines at the top of the new xen_igd_clear_slot function:
>>> 
>>>     if (!(pci_bus->slot_reserved_mask & XEN_PCI_IGD_SLOT_MASK))
>>>         return;
>>> 
>>>     Rebase. This removed an unnecessary header file from xen_pt.h 
>>> 
>>>  hw/i386/pc_piix.c    |  3 +++
>>>  hw/xen/xen_pt.c      | 49 ++++++++++++++++++++++++++++++++++++--------
>>>  hw/xen/xen_pt.h      | 16 +++++++++++++++
>>>  hw/xen/xen_pt_stub.c |  4 ++++
>>>  4 files changed, 63 insertions(+), 9 deletions(-)
>>> 
>>> diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
>>> index b48047f50c..bc5efa4f59 100644
>>> --- a/hw/i386/pc_piix.c
>>> +++ b/hw/i386/pc_piix.c
>>> @@ -405,6 +405,9 @@ static void pc_xen_hvm_init(MachineState *machine)
>>>      }
>>>  
>>>      pc_xen_hvm_init_pci(machine);
>>> +    if (xen_igd_gfx_pt_enabled()) {
>>> +        xen_igd_reserve_slot(pcms->bus);
>>> +    }
>>>      pci_create_simple(pcms->bus, -1, "xen-platform");
>>>  }
>>>  #endif
>> 
>> I would even maybe go further and move the whole logic into
>> xen_igd_reserve_slot. And I would even just name it
>> xen_hvm_init_reserved_slots without worrying about the what
>> or why at the pc level.  At this point it will be up to Xen maintainers.
>
>I see to do that would be to resolve the two pc_xen_hvm*
>functions in pc_piix.c that are guarded by CONFIG_XEN and
>move them to an appropriate place such as xen-hvm.c.
>
>That is along the lines of the work that Bernhard and Philippe
>are doing, so I am Cc'ing them. My first inclination is just
>to defer to them: I think eventually the little patch I propose
>here to pc_piix.c is eventually going to be moved out of pc_piix.c
>by Bernhard in a future patch.
>
>What they have been doing is very conservative, and I expect
>if and when Bernhard gets here to resolve those functions, they
>will do it in a way that keeps the dependency of the xenfv machine
>type on the pc machine type and the pc_init1 function.
>
>What I would propose would be to break the dependency of xenfv
>on the pc_init1 function. That is, I would propose having a
>xenfv_init function in xen-hvm.c, and the first version would
>be the current version of pc_init1, so xenfv would still depend
>on many i440fx type things, but with the change xen developers
>would be free to tweak xenfv_init without affecting the users
>of the pc machine type.
>
>Would that be a good idea? If I get posiive feedback for this
>idea, I will put it on the table, probably initially as an RFC
>patch.

In various patches I've been decoupling 1/ PIIX3 from Xen and 2/ QEMU's Xen integration code from the PC machine. My idea is to confine all wiring for a PIIX based PC machine using Xen in pc_piix.c. The pc_xen_hvm* functions seem to do exactly that, so I'd leave them there, at least for now.

What I would like to avoid is for the Xen integration code to make assumptions that an x86 or PC machine is always based on i440fx or PIIX3.

I like Michael's idea of going one step further, both in terms of the approach and the reasoning.

Best regards,
Bernhard

>Also, thanks, Michael, for your other suggestions for this patch
>about using macros for the devfn constants.
>
>Chuck
>
>> 
>>> diff --git a/hw/xen/xen_pt.c b/hw/xen/xen_pt.c
>>> index 0ec7e52183..eff38155ef 100644
>>> --- a/hw/xen/xen_pt.c
>>> +++ b/hw/xen/xen_pt.c
>>> @@ -780,15 +780,6 @@ static void xen_pt_realize(PCIDevice *d, Error **errp)
>>>                 s->hostaddr.bus, s->hostaddr.slot, s->hostaddr.function,
>>>                 s->dev.devfn);
>>>  
>>> -    xen_host_pci_device_get(&s->real_device,
>>> -                            s->hostaddr.domain, s->hostaddr.bus,
>>> -                            s->hostaddr.slot, s->hostaddr.function,
>>> -                            errp);
>>> -    if (*errp) {
>>> -        error_append_hint(errp, "Failed to \"open\" the real pci device");
>>> -        return;
>>> -    }
>>> -
>>>      s->is_virtfn = s->real_device.is_virtfn;
>>>      if (s->is_virtfn) {
>>>          XEN_PT_LOG(d, "%04x:%02x:%02x.%d is a SR-IOV Virtual Function\n",
>>> @@ -950,11 +941,50 @@ static void xen_pci_passthrough_instance_init(Object *obj)
>>>      PCI_DEVICE(obj)->cap_present |= QEMU_PCI_CAP_EXPRESS;
>>>  }
>>>  
>>> +void xen_igd_reserve_slot(PCIBus *pci_bus)
>>> +{
>>> +    XEN_PT_LOG(0, "Reserving PCI slot 2 for IGD\n");
>>> +    pci_bus->slot_reserved_mask |= XEN_PCI_IGD_SLOT_MASK;
>>> +}
>>> +
>>> +static void xen_igd_clear_slot(DeviceState *qdev, Error **errp)
>>> +{
>>> +    ERRP_GUARD();
>>> +    PCIDevice *pci_dev = (PCIDevice *)qdev;
>>> +    XenPCIPassthroughState *s = XEN_PT_DEVICE(pci_dev);
>>> +    XenPTDeviceClass *xpdc = XEN_PT_DEVICE_GET_CLASS(s);
>>> +    PCIBus *pci_bus = pci_get_bus(pci_dev);
>>> +
>>> +    if (!(pci_bus->slot_reserved_mask & XEN_PCI_IGD_SLOT_MASK))
>>> +        return;
>>> +
>>> +    xen_host_pci_device_get(&s->real_device,
>>> +                            s->hostaddr.domain, s->hostaddr.bus,
>>> +                            s->hostaddr.slot, s->hostaddr.function,
>>> +                            errp);
>>> +    if (*errp) {
>>> +        error_append_hint(errp, "Failed to \"open\" the real pci device");
>>> +        return;
>>> +    }
>>> +
>>> +    if (is_igd_vga_passthrough(&s->real_device) &&
>>> +        s->real_device.domain == 0 && s->real_device.bus == 0 &&
>>> +        s->real_device.dev == 2 && s->real_device.func == 0 &&
>>> +        s->real_device.vendor_id == PCI_VENDOR_ID_INTEL) {
>> 
>> how about macros for these?
>> 
>> #define XEN_PCI_IGD_DOMAIN 0
>> #define XEN_PCI_IGD_BUS 0
>> #define XEN_PCI_IGD_DEV 2
>> #define XEN_PCI_IGD_FN 0
>> 
>>> +        pci_bus->slot_reserved_mask &= ~XEN_PCI_IGD_SLOT_MASK;
>> 
>> If you are going to do this, you should set it back
>> either after pci_qdev_realize or in unrealize,
>> for symmetry.
>> 
>>> +        XEN_PT_LOG(pci_dev, "Intel IGD found, using slot 2\n");
>>> +    }
>> 
>> 
>>> +    xpdc->pci_qdev_realize(qdev, errp);
>>> +}
>>> +
>> 
>> 
>> 
>>>  static void xen_pci_passthrough_class_init(ObjectClass *klass, void *data)
>>>  {
>>>      DeviceClass *dc = DEVICE_CLASS(klass);
>>>      PCIDeviceClass *k = PCI_DEVICE_CLASS(klass);
>>>  
>>> +    XenPTDeviceClass *xpdc = XEN_PT_DEVICE_CLASS(klass);
>>> +    xpdc->pci_qdev_realize = dc->realize;
>>> +    dc->realize = xen_igd_clear_slot;
>>>      k->realize = xen_pt_realize;
>>>      k->exit = xen_pt_unregister_device;
>>>      k->config_read = xen_pt_pci_read_config;
>>> @@ -977,6 +1007,7 @@ static const TypeInfo xen_pci_passthrough_info = {
>>>      .instance_size = sizeof(XenPCIPassthroughState),
>>>      .instance_finalize = xen_pci_passthrough_finalize,
>>>      .class_init = xen_pci_passthrough_class_init,
>>> +    .class_size = sizeof(XenPTDeviceClass),
>>>      .instance_init = xen_pci_passthrough_instance_init,
>>>      .interfaces = (InterfaceInfo[]) {
>>>          { INTERFACE_CONVENTIONAL_PCI_DEVICE },
>>> diff --git a/hw/xen/xen_pt.h b/hw/xen/xen_pt.h
>>> index cf10fc7bbf..8c25932b4b 100644
>>> --- a/hw/xen/xen_pt.h
>>> +++ b/hw/xen/xen_pt.h
>>> @@ -2,6 +2,7 @@
>>>  #define XEN_PT_H
>>>  
>>>  #include "hw/xen/xen_common.h"
>>> +#include "hw/pci/pci_bus.h"
>>>  #include "xen-host-pci-device.h"
>>>  #include "qom/object.h"
>>>  
>>> @@ -40,7 +41,20 @@ typedef struct XenPTReg XenPTReg;
>>>  #define TYPE_XEN_PT_DEVICE "xen-pci-passthrough"
>>>  OBJECT_DECLARE_SIMPLE_TYPE(XenPCIPassthroughState, XEN_PT_DEVICE)
>>>  
>>> +#define XEN_PT_DEVICE_CLASS(klass) \
>>> +    OBJECT_CLASS_CHECK(XenPTDeviceClass, klass, TYPE_XEN_PT_DEVICE)
>>> +#define XEN_PT_DEVICE_GET_CLASS(obj) \
>>> +    OBJECT_GET_CLASS(XenPTDeviceClass, obj, TYPE_XEN_PT_DEVICE)
>>> +
>>> +typedef void (*XenPTQdevRealize)(DeviceState *qdev, Error **errp);
>>> +
>>> +typedef struct XenPTDeviceClass {
>>> +    PCIDeviceClass parent_class;
>>> +    XenPTQdevRealize pci_qdev_realize;
>>> +} XenPTDeviceClass;
>>> +
>>>  uint32_t igd_read_opregion(XenPCIPassthroughState *s);
>>> +void xen_igd_reserve_slot(PCIBus *pci_bus);
>>>  void igd_write_opregion(XenPCIPassthroughState *s, uint32_t val);
>>>  void xen_igd_passthrough_isa_bridge_create(XenPCIPassthroughState *s,
>>>                                             XenHostPCIDevice *dev);
>>> @@ -75,6 +89,8 @@ typedef int (*xen_pt_conf_byte_read)
>>>  
>>>  #define XEN_PCI_INTEL_OPREGION 0xfc
>>>  
>>> +#define XEN_PCI_IGD_SLOT_MASK 0x4UL /* Intel IGD slot_reserved_mask */
>>> +
>> 
>> I think you want to calculate this based on dev fn:
>> 
>> #define XEN_PCI_IGD_SLOT_MASK \
>> 	(0x1 << PCI_SLOT(PCI_DEVFN(XEN_PCI_IGD_DEV, XEN_PCI_IGD_FN)))
>> 
>> 
>>>  typedef enum {
>>>      XEN_PT_GRP_TYPE_HARDWIRED = 0,  /* 0 Hardwired reg group */
>>>      XEN_PT_GRP_TYPE_EMU,            /* emul reg group */
>>> diff --git a/hw/xen/xen_pt_stub.c b/hw/xen/xen_pt_stub.c
>>> index 2d8cac8d54..5c108446a8 100644
>>> --- a/hw/xen/xen_pt_stub.c
>>> +++ b/hw/xen/xen_pt_stub.c
>>> @@ -20,3 +20,7 @@ void xen_igd_gfx_pt_set(bool value, Error **errp)
>>>          error_setg(errp, "Xen PCI passthrough support not built in");
>>>      }
>>>  }
>>> +
>>> +void xen_igd_reserve_slot(PCIBus *pci_bus)
>>> +{
>>> +}
>>> -- 
>>> 2.39.0
>> 
>

Re: [PATCH v8] xen/pt: reserve PCI slot 2 for Intel igd-passthru

Posted by Chuck Zmudzinski 1 year, 3 months ago

On 1/12/23 2:18 PM, Bernhard Beschow wrote:
> 
> 
> Am 11. Januar 2023 15:40:24 UTC schrieb Chuck Zmudzinski <brchuckz@aol.com>:
>>On 1/10/23 3:16 AM, Michael S. Tsirkin wrote:
>>> On Tue, Jan 10, 2023 at 02:08:34AM -0500, Chuck Zmudzinski wrote:
>>>> Intel specifies that the Intel IGD must occupy slot 2 on the PCI bus,
>>>> as noted in docs/igd-assign.txt in the Qemu source code.
>>>> 
>>>> Currently, when the xl toolstack is used to configure a Xen HVM guest with
>>>> Intel IGD passthrough to the guest with the Qemu upstream device model,
>>>> a Qemu emulated PCI device will occupy slot 2 and the Intel IGD will occupy
>>>> a different slot. This problem often prevents the guest from booting.
>>>> 
>>>> The only available workaround is not good: Configure Xen HVM guests to use
>>>> the old and no longer maintained Qemu traditional device model available
>>>> from xenbits.xen.org which does reserve slot 2 for the Intel IGD.
>>>> 
>>>> To implement this feature in the Qemu upstream device model for Xen HVM
>>>> guests, introduce the following new functions, types, and macros:
>>>> 
>>>> * XEN_PT_DEVICE_CLASS declaration, based on the existing TYPE_XEN_PT_DEVICE
>>>> * XEN_PT_DEVICE_GET_CLASS macro helper function for XEN_PT_DEVICE_CLASS
>>>> * typedef XenPTQdevRealize function pointer
>>>> * XEN_PCI_IGD_SLOT_MASK, the value of slot_reserved_mask to reserve slot 2
>>>> * xen_igd_reserve_slot and xen_igd_clear_slot functions
>>>> 
>>>> The new xen_igd_reserve_slot function uses the existing slot_reserved_mask
>>>> member of PCIBus to reserve PCI slot 2 for Xen HVM guests configured using
>>>> the xl toolstack with the gfx_passthru option enabled, which sets the
>>>> igd-passthru=on option to Qemu for the Xen HVM machine type.
>>>> 
>>>> The new xen_igd_reserve_slot function also needs to be implemented in
>>>> hw/xen/xen_pt_stub.c to prevent FTBFS during the link stage for the case
>>>> when Qemu is configured with --enable-xen and --disable-xen-pci-passthrough,
>>>> in which case it does nothing.
>>>> 
>>>> The new xen_igd_clear_slot function overrides qdev->realize of the parent
>>>> PCI device class to enable the Intel IGD to occupy slot 2 on the PCI bus
>>>> since slot 2 was reserved by xen_igd_reserve_slot when the PCI bus was
>>>> created in hw/i386/pc_piix.c for the case when igd-passthru=on.
>>>> 
>>>> Move the call to xen_host_pci_device_get, and the associated error
>>>> handling, from xen_pt_realize to the new xen_igd_clear_slot function to
>>>> initialize the device class and vendor values which enables the checks for
>>>> the Intel IGD to succeed. The verification that the host device is an
>>>> Intel IGD to be passed through is done by checking the domain, bus, slot,
>>>> and function values as well as by checking that gfx_passthru is enabled,
>>>> the device class is VGA, and the device vendor in Intel.
>>>> 
>>>> Signed-off-by: Chuck Zmudzinski <brchuckz@aol.com>
>>>> ---
>>>> Notes that might be helpful to reviewers of patched code in hw/xen:
>>>> 
>>>> The new functions and types are based on recommendations from Qemu docs:
>>>> https://qemu.readthedocs.io/en/latest/devel/qom.html
>>>> 
>>>> Notes that might be helpful to reviewers of patched code in hw/i386:
>>>> 
>>>> The small patch to hw/i386/pc_piix.c is protected by CONFIG_XEN so it does
>>>> not affect builds that do not have CONFIG_XEN defined.
>>>> 
>>>> xen_igd_gfx_pt_enabled() in the patched hw/i386/pc_piix.c file is an
>>>> existing function that is only true when Qemu is built with
>>>> xen-pci-passthrough enabled and the administrator has configured the Xen
>>>> HVM guest with Qemu's igd-passthru=on option.
>>>> 
>>>> v2: Remove From: <email address> tag at top of commit message
>>>> 
>>>> v3: Changed the test for the Intel IGD in xen_igd_clear_slot:
>>>> 
>>>>     if (is_igd_vga_passthrough(&s->real_device) &&
>>>>         (s->real_device.vendor_id == PCI_VENDOR_ID_INTEL)) {
>>>> 
>>>>     is changed to
>>>> 
>>>>     if (xen_igd_gfx_pt_enabled() && (s->hostaddr.slot == 2)
>>>>         && (s->hostaddr.function == 0)) {
>>>> 
>>>>     I hoped that I could use the test in v2, since it matches the
>>>>     other tests for the Intel IGD in Qemu and Xen, but those tests
>>>>     do not work because the necessary data structures are not set with
>>>>     their values yet. So instead use the test that the administrator
>>>>     has enabled gfx_passthru and the device address on the host is
>>>>     02.0. This test does detect the Intel IGD correctly.
>>>> 
>>>> v4: Use brchuckz@aol.com instead of brchuckz@netscape.net for the author's
>>>>     email address to match the address used by the same author in commits
>>>>     be9c61da and c0e86b76
>>>>     
>>>>     Change variable for XEN_PT_DEVICE_CLASS: xptc changed to xpdc
>>>> 
>>>> v5: The patch of xen_pt.c was re-worked to allow a more consistent test
>>>>     for the Intel IGD that uses the same criteria as in other places.
>>>>     This involved moving the call to xen_host_pci_device_get from
>>>>     xen_pt_realize to xen_igd_clear_slot and updating the checks for the
>>>>     Intel IGD in xen_igd_clear_slot:
>>>>     
>>>>     if (xen_igd_gfx_pt_enabled() && (s->hostaddr.slot == 2)
>>>>         && (s->hostaddr.function == 0)) {
>>>> 
>>>>     is changed to
>>>> 
>>>>     if (is_igd_vga_passthrough(&s->real_device) &&
>>>>         s->real_device.domain == 0 && s->real_device.bus == 0 &&
>>>>         s->real_device.dev == 2 && s->real_device.func == 0 &&
>>>>         s->real_device.vendor_id == PCI_VENDOR_ID_INTEL) {
>>>> 
>>>>     Added an explanation for the move of xen_host_pci_device_get from
>>>>     xen_pt_realize to xen_igd_clear_slot to the commit message.
>>>> 
>>>>     Rebase.
>>>> 
>>>> v6: Fix logging by removing these lines from the move from xen_pt_realize
>>>>     to xen_igd_clear_slot that was done in v5:
>>>> 
>>>>     XEN_PT_LOG(d, "Assigning real physical device %02x:%02x.%d"
>>>>                " to devfn 0x%x\n",
>>>>                s->hostaddr.bus, s->hostaddr.slot, s->hostaddr.function,
>>>>                s->dev.devfn);
>>>> 
>>>>     This log needs to be in xen_pt_realize because s->dev.devfn is not
>>>>     set yet in xen_igd_clear_slot.
>>>> 
>>>> v7: The v7 that was posted to the mailing list was incorrect. v8 is what
>>>>     v7 was intended to be.
>>>> 
>>>> v8: Inhibit out of context log message and needless processing by
>>>>     adding 2 lines at the top of the new xen_igd_clear_slot function:
>>>> 
>>>>     if (!(pci_bus->slot_reserved_mask & XEN_PCI_IGD_SLOT_MASK))
>>>>         return;
>>>> 
>>>>     Rebase. This removed an unnecessary header file from xen_pt.h 
>>>> 
>>>>  hw/i386/pc_piix.c    |  3 +++
>>>>  hw/xen/xen_pt.c      | 49 ++++++++++++++++++++++++++++++++++++--------
>>>>  hw/xen/xen_pt.h      | 16 +++++++++++++++
>>>>  hw/xen/xen_pt_stub.c |  4 ++++
>>>>  4 files changed, 63 insertions(+), 9 deletions(-)
>>>> 
>>>> diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
>>>> index b48047f50c..bc5efa4f59 100644
>>>> --- a/hw/i386/pc_piix.c
>>>> +++ b/hw/i386/pc_piix.c
>>>> @@ -405,6 +405,9 @@ static void pc_xen_hvm_init(MachineState *machine)
>>>>      }
>>>>  
>>>>      pc_xen_hvm_init_pci(machine);
>>>> +    if (xen_igd_gfx_pt_enabled()) {
>>>> +        xen_igd_reserve_slot(pcms->bus);
>>>> +    }
>>>>      pci_create_simple(pcms->bus, -1, "xen-platform");
>>>>  }
>>>>  #endif
>>> 
>>> I would even maybe go further and move the whole logic into
>>> xen_igd_reserve_slot. And I would even just name it
>>> xen_hvm_init_reserved_slots without worrying about the what
>>> or why at the pc level.  At this point it will be up to Xen maintainers.
>>
>>I see to do that would be to resolve the two pc_xen_hvm*
>>functions in pc_piix.c that are guarded by CONFIG_XEN and
>>move them to an appropriate place such as xen-hvm.c.
>>
>>That is along the lines of the work that Bernhard and Philippe
>>are doing, so I am Cc'ing them. My first inclination is just
>>to defer to them: I think eventually the little patch I propose
>>here to pc_piix.c is eventually going to be moved out of pc_piix.c
>>by Bernhard in a future patch.
>>
>>What they have been doing is very conservative, and I expect
>>if and when Bernhard gets here to resolve those functions, they
>>will do it in a way that keeps the dependency of the xenfv machine
>>type on the pc machine type and the pc_init1 function.
>>
>>What I would propose would be to break the dependency of xenfv
>>on the pc_init1 function. That is, I would propose having a
>>xenfv_init function in xen-hvm.c, and the first version would
>>be the current version of pc_init1, so xenfv would still depend
>>on many i440fx type things, but with the change xen developers
>>would be free to tweak xenfv_init without affecting the users
>>of the pc machine type.
>>
>>Would that be a good idea? If I get posiive feedback for this
>>idea, I will put it on the table, probably initially as an RFC
>>patch.
> 
> In various patches I've been decoupling 1/ PIIX3 from Xen and 2/ QEMU's Xen integration code from the PC machine. My idea is to confine all wiring for a PIIX based PC machine using Xen in pc_piix.c. The pc_xen_hvm* functions seem to do exactly that, so I'd leave them there, at least for now.
> 
> What I would like to avoid is for the Xen integration code to make assumptions that an x86 or PC machine is always based on i440fx or PIIX3.

I think what you are saying is that if I try to move the logic of my patch to xen-hvm.c, as Michael suggests, I should not move or copy any piix3 code to the Xen integration code, but access it via an appropriate qom interface to the code in pc_piix.c and only move Xen specific things to the Xen integration code such as the content of my patch. I can try to do that for a v9 of my patch. It might take me a little while (I am not a professional coder), so I will just leave v8 of my patch as is for now until I have a patch ready to move it out of pc_piix.c the qom way.

Thanks,

Chuck

> 
> I like Michael's idea of going one step further, both in terms of the approach and the reasoning.
> 
> Best regards,
> Bernhard
> 
>>Also, thanks, Michael, for your other suggestions for this patch
>>about using macros for the devfn constants.
>>
>>Chuck
>>
>>> 
>>>> diff --git a/hw/xen/xen_pt.c b/hw/xen/xen_pt.c
>>>> index 0ec7e52183..eff38155ef 100644
>>>> --- a/hw/xen/xen_pt.c
>>>> +++ b/hw/xen/xen_pt.c
>>>> @@ -780,15 +780,6 @@ static void xen_pt_realize(PCIDevice *d, Error **errp)
>>>>                 s->hostaddr.bus, s->hostaddr.slot, s->hostaddr.function,
>>>>                 s->dev.devfn);
>>>>  
>>>> -    xen_host_pci_device_get(&s->real_device,
>>>> -                            s->hostaddr.domain, s->hostaddr.bus,
>>>> -                            s->hostaddr.slot, s->hostaddr.function,
>>>> -                            errp);
>>>> -    if (*errp) {
>>>> -        error_append_hint(errp, "Failed to \"open\" the real pci device");
>>>> -        return;
>>>> -    }
>>>> -
>>>>      s->is_virtfn = s->real_device.is_virtfn;
>>>>      if (s->is_virtfn) {
>>>>          XEN_PT_LOG(d, "%04x:%02x:%02x.%d is a SR-IOV Virtual Function\n",
>>>> @@ -950,11 +941,50 @@ static void xen_pci_passthrough_instance_init(Object *obj)
>>>>      PCI_DEVICE(obj)->cap_present |= QEMU_PCI_CAP_EXPRESS;
>>>>  }
>>>>  
>>>> +void xen_igd_reserve_slot(PCIBus *pci_bus)
>>>> +{
>>>> +    XEN_PT_LOG(0, "Reserving PCI slot 2 for IGD\n");
>>>> +    pci_bus->slot_reserved_mask |= XEN_PCI_IGD_SLOT_MASK;
>>>> +}
>>>> +
>>>> +static void xen_igd_clear_slot(DeviceState *qdev, Error **errp)
>>>> +{
>>>> +    ERRP_GUARD();
>>>> +    PCIDevice *pci_dev = (PCIDevice *)qdev;
>>>> +    XenPCIPassthroughState *s = XEN_PT_DEVICE(pci_dev);
>>>> +    XenPTDeviceClass *xpdc = XEN_PT_DEVICE_GET_CLASS(s);
>>>> +    PCIBus *pci_bus = pci_get_bus(pci_dev);
>>>> +
>>>> +    if (!(pci_bus->slot_reserved_mask & XEN_PCI_IGD_SLOT_MASK))
>>>> +        return;
>>>> +
>>>> +    xen_host_pci_device_get(&s->real_device,
>>>> +                            s->hostaddr.domain, s->hostaddr.bus,
>>>> +                            s->hostaddr.slot, s->hostaddr.function,
>>>> +                            errp);
>>>> +    if (*errp) {
>>>> +        error_append_hint(errp, "Failed to \"open\" the real pci device");
>>>> +        return;
>>>> +    }
>>>> +
>>>> +    if (is_igd_vga_passthrough(&s->real_device) &&
>>>> +        s->real_device.domain == 0 && s->real_device.bus == 0 &&
>>>> +        s->real_device.dev == 2 && s->real_device.func == 0 &&
>>>> +        s->real_device.vendor_id == PCI_VENDOR_ID_INTEL) {
>>> 
>>> how about macros for these?
>>> 
>>> #define XEN_PCI_IGD_DOMAIN 0
>>> #define XEN_PCI_IGD_BUS 0
>>> #define XEN_PCI_IGD_DEV 2
>>> #define XEN_PCI_IGD_FN 0
>>> 
>>>> +        pci_bus->slot_reserved_mask &= ~XEN_PCI_IGD_SLOT_MASK;
>>> 
>>> If you are going to do this, you should set it back
>>> either after pci_qdev_realize or in unrealize,
>>> for symmetry.
>>> 
>>>> +        XEN_PT_LOG(pci_dev, "Intel IGD found, using slot 2\n");
>>>> +    }
>>> 
>>> 
>>>> +    xpdc->pci_qdev_realize(qdev, errp);
>>>> +}
>>>> +
>>> 
>>> 
>>> 
>>>>  static void xen_pci_passthrough_class_init(ObjectClass *klass, void *data)
>>>>  {
>>>>      DeviceClass *dc = DEVICE_CLASS(klass);
>>>>      PCIDeviceClass *k = PCI_DEVICE_CLASS(klass);
>>>>  
>>>> +    XenPTDeviceClass *xpdc = XEN_PT_DEVICE_CLASS(klass);
>>>> +    xpdc->pci_qdev_realize = dc->realize;
>>>> +    dc->realize = xen_igd_clear_slot;
>>>>      k->realize = xen_pt_realize;
>>>>      k->exit = xen_pt_unregister_device;
>>>>      k->config_read = xen_pt_pci_read_config;
>>>> @@ -977,6 +1007,7 @@ static const TypeInfo xen_pci_passthrough_info = {
>>>>      .instance_size = sizeof(XenPCIPassthroughState),
>>>>      .instance_finalize = xen_pci_passthrough_finalize,
>>>>      .class_init = xen_pci_passthrough_class_init,
>>>> +    .class_size = sizeof(XenPTDeviceClass),
>>>>      .instance_init = xen_pci_passthrough_instance_init,
>>>>      .interfaces = (InterfaceInfo[]) {
>>>>          { INTERFACE_CONVENTIONAL_PCI_DEVICE },
>>>> diff --git a/hw/xen/xen_pt.h b/hw/xen/xen_pt.h
>>>> index cf10fc7bbf..8c25932b4b 100644
>>>> --- a/hw/xen/xen_pt.h
>>>> +++ b/hw/xen/xen_pt.h
>>>> @@ -2,6 +2,7 @@
>>>>  #define XEN_PT_H
>>>>  
>>>>  #include "hw/xen/xen_common.h"
>>>> +#include "hw/pci/pci_bus.h"
>>>>  #include "xen-host-pci-device.h"
>>>>  #include "qom/object.h"
>>>>  
>>>> @@ -40,7 +41,20 @@ typedef struct XenPTReg XenPTReg;
>>>>  #define TYPE_XEN_PT_DEVICE "xen-pci-passthrough"
>>>>  OBJECT_DECLARE_SIMPLE_TYPE(XenPCIPassthroughState, XEN_PT_DEVICE)
>>>>  
>>>> +#define XEN_PT_DEVICE_CLASS(klass) \
>>>> +    OBJECT_CLASS_CHECK(XenPTDeviceClass, klass, TYPE_XEN_PT_DEVICE)
>>>> +#define XEN_PT_DEVICE_GET_CLASS(obj) \
>>>> +    OBJECT_GET_CLASS(XenPTDeviceClass, obj, TYPE_XEN_PT_DEVICE)
>>>> +
>>>> +typedef void (*XenPTQdevRealize)(DeviceState *qdev, Error **errp);
>>>> +
>>>> +typedef struct XenPTDeviceClass {
>>>> +    PCIDeviceClass parent_class;
>>>> +    XenPTQdevRealize pci_qdev_realize;
>>>> +} XenPTDeviceClass;
>>>> +
>>>>  uint32_t igd_read_opregion(XenPCIPassthroughState *s);
>>>> +void xen_igd_reserve_slot(PCIBus *pci_bus);
>>>>  void igd_write_opregion(XenPCIPassthroughState *s, uint32_t val);
>>>>  void xen_igd_passthrough_isa_bridge_create(XenPCIPassthroughState *s,
>>>>                                             XenHostPCIDevice *dev);
>>>> @@ -75,6 +89,8 @@ typedef int (*xen_pt_conf_byte_read)
>>>>  
>>>>  #define XEN_PCI_INTEL_OPREGION 0xfc
>>>>  
>>>> +#define XEN_PCI_IGD_SLOT_MASK 0x4UL /* Intel IGD slot_reserved_mask */
>>>> +
>>> 
>>> I think you want to calculate this based on dev fn:
>>> 
>>> #define XEN_PCI_IGD_SLOT_MASK \
>>> 	(0x1 << PCI_SLOT(PCI_DEVFN(XEN_PCI_IGD_DEV, XEN_PCI_IGD_FN)))
>>> 
>>> 
>>>>  typedef enum {
>>>>      XEN_PT_GRP_TYPE_HARDWIRED = 0,  /* 0 Hardwired reg group */
>>>>      XEN_PT_GRP_TYPE_EMU,            /* emul reg group */
>>>> diff --git a/hw/xen/xen_pt_stub.c b/hw/xen/xen_pt_stub.c
>>>> index 2d8cac8d54..5c108446a8 100644
>>>> --- a/hw/xen/xen_pt_stub.c
>>>> +++ b/hw/xen/xen_pt_stub.c
>>>> @@ -20,3 +20,7 @@ void xen_igd_gfx_pt_set(bool value, Error **errp)
>>>>          error_setg(errp, "Xen PCI passthrough support not built in");
>>>>      }
>>>>  }
>>>> +
>>>> +void xen_igd_reserve_slot(PCIBus *pci_bus)
>>>> +{
>>>> +}
>>>> -- 
>>>> 2.39.0
>>> 
>>

Re: [PATCH v8] xen/pt: reserve PCI slot 2 for Intel igd-passthru

Posted by Bernhard Beschow 1 year, 3 months ago


Am 12. Januar 2023 20:11:54 UTC schrieb Chuck Zmudzinski <brchuckz@aol.com>:
>On 1/12/23 2:18 PM, Bernhard Beschow wrote:
>> 
>> 
>> Am 11. Januar 2023 15:40:24 UTC schrieb Chuck Zmudzinski <brchuckz@aol.com>:
>>>On 1/10/23 3:16 AM, Michael S. Tsirkin wrote:
>>>> On Tue, Jan 10, 2023 at 02:08:34AM -0500, Chuck Zmudzinski wrote:
>>>>> Intel specifies that the Intel IGD must occupy slot 2 on the PCI bus,
>>>>> as noted in docs/igd-assign.txt in the Qemu source code.
>>>>> 
>>>>> Currently, when the xl toolstack is used to configure a Xen HVM guest with
>>>>> Intel IGD passthrough to the guest with the Qemu upstream device model,
>>>>> a Qemu emulated PCI device will occupy slot 2 and the Intel IGD will occupy
>>>>> a different slot. This problem often prevents the guest from booting.
>>>>> 
>>>>> The only available workaround is not good: Configure Xen HVM guests to use
>>>>> the old and no longer maintained Qemu traditional device model available
>>>>> from xenbits.xen.org which does reserve slot 2 for the Intel IGD.
>>>>> 
>>>>> To implement this feature in the Qemu upstream device model for Xen HVM
>>>>> guests, introduce the following new functions, types, and macros:
>>>>> 
>>>>> * XEN_PT_DEVICE_CLASS declaration, based on the existing TYPE_XEN_PT_DEVICE
>>>>> * XEN_PT_DEVICE_GET_CLASS macro helper function for XEN_PT_DEVICE_CLASS
>>>>> * typedef XenPTQdevRealize function pointer
>>>>> * XEN_PCI_IGD_SLOT_MASK, the value of slot_reserved_mask to reserve slot 2
>>>>> * xen_igd_reserve_slot and xen_igd_clear_slot functions
>>>>> 
>>>>> The new xen_igd_reserve_slot function uses the existing slot_reserved_mask
>>>>> member of PCIBus to reserve PCI slot 2 for Xen HVM guests configured using
>>>>> the xl toolstack with the gfx_passthru option enabled, which sets the
>>>>> igd-passthru=on option to Qemu for the Xen HVM machine type.
>>>>> 
>>>>> The new xen_igd_reserve_slot function also needs to be implemented in
>>>>> hw/xen/xen_pt_stub.c to prevent FTBFS during the link stage for the case
>>>>> when Qemu is configured with --enable-xen and --disable-xen-pci-passthrough,
>>>>> in which case it does nothing.
>>>>> 
>>>>> The new xen_igd_clear_slot function overrides qdev->realize of the parent
>>>>> PCI device class to enable the Intel IGD to occupy slot 2 on the PCI bus
>>>>> since slot 2 was reserved by xen_igd_reserve_slot when the PCI bus was
>>>>> created in hw/i386/pc_piix.c for the case when igd-passthru=on.
>>>>> 
>>>>> Move the call to xen_host_pci_device_get, and the associated error
>>>>> handling, from xen_pt_realize to the new xen_igd_clear_slot function to
>>>>> initialize the device class and vendor values which enables the checks for
>>>>> the Intel IGD to succeed. The verification that the host device is an
>>>>> Intel IGD to be passed through is done by checking the domain, bus, slot,
>>>>> and function values as well as by checking that gfx_passthru is enabled,
>>>>> the device class is VGA, and the device vendor in Intel.
>>>>> 
>>>>> Signed-off-by: Chuck Zmudzinski <brchuckz@aol.com>
>>>>> ---
>>>>> Notes that might be helpful to reviewers of patched code in hw/xen:
>>>>> 
>>>>> The new functions and types are based on recommendations from Qemu docs:
>>>>> https://qemu.readthedocs.io/en/latest/devel/qom.html
>>>>> 
>>>>> Notes that might be helpful to reviewers of patched code in hw/i386:
>>>>> 
>>>>> The small patch to hw/i386/pc_piix.c is protected by CONFIG_XEN so it does
>>>>> not affect builds that do not have CONFIG_XEN defined.
>>>>> 
>>>>> xen_igd_gfx_pt_enabled() in the patched hw/i386/pc_piix.c file is an
>>>>> existing function that is only true when Qemu is built with
>>>>> xen-pci-passthrough enabled and the administrator has configured the Xen
>>>>> HVM guest with Qemu's igd-passthru=on option.
>>>>> 
>>>>> v2: Remove From: <email address> tag at top of commit message
>>>>> 
>>>>> v3: Changed the test for the Intel IGD in xen_igd_clear_slot:
>>>>> 
>>>>>     if (is_igd_vga_passthrough(&s->real_device) &&
>>>>>         (s->real_device.vendor_id == PCI_VENDOR_ID_INTEL)) {
>>>>> 
>>>>>     is changed to
>>>>> 
>>>>>     if (xen_igd_gfx_pt_enabled() && (s->hostaddr.slot == 2)
>>>>>         && (s->hostaddr.function == 0)) {
>>>>> 
>>>>>     I hoped that I could use the test in v2, since it matches the
>>>>>     other tests for the Intel IGD in Qemu and Xen, but those tests
>>>>>     do not work because the necessary data structures are not set with
>>>>>     their values yet. So instead use the test that the administrator
>>>>>     has enabled gfx_passthru and the device address on the host is
>>>>>     02.0. This test does detect the Intel IGD correctly.
>>>>> 
>>>>> v4: Use brchuckz@aol.com instead of brchuckz@netscape.net for the author's
>>>>>     email address to match the address used by the same author in commits
>>>>>     be9c61da and c0e86b76
>>>>>     
>>>>>     Change variable for XEN_PT_DEVICE_CLASS: xptc changed to xpdc
>>>>> 
>>>>> v5: The patch of xen_pt.c was re-worked to allow a more consistent test
>>>>>     for the Intel IGD that uses the same criteria as in other places.
>>>>>     This involved moving the call to xen_host_pci_device_get from
>>>>>     xen_pt_realize to xen_igd_clear_slot and updating the checks for the
>>>>>     Intel IGD in xen_igd_clear_slot:
>>>>>     
>>>>>     if (xen_igd_gfx_pt_enabled() && (s->hostaddr.slot == 2)
>>>>>         && (s->hostaddr.function == 0)) {
>>>>> 
>>>>>     is changed to
>>>>> 
>>>>>     if (is_igd_vga_passthrough(&s->real_device) &&
>>>>>         s->real_device.domain == 0 && s->real_device.bus == 0 &&
>>>>>         s->real_device.dev == 2 && s->real_device.func == 0 &&
>>>>>         s->real_device.vendor_id == PCI_VENDOR_ID_INTEL) {
>>>>> 
>>>>>     Added an explanation for the move of xen_host_pci_device_get from
>>>>>     xen_pt_realize to xen_igd_clear_slot to the commit message.
>>>>> 
>>>>>     Rebase.
>>>>> 
>>>>> v6: Fix logging by removing these lines from the move from xen_pt_realize
>>>>>     to xen_igd_clear_slot that was done in v5:
>>>>> 
>>>>>     XEN_PT_LOG(d, "Assigning real physical device %02x:%02x.%d"
>>>>>                " to devfn 0x%x\n",
>>>>>                s->hostaddr.bus, s->hostaddr.slot, s->hostaddr.function,
>>>>>                s->dev.devfn);
>>>>> 
>>>>>     This log needs to be in xen_pt_realize because s->dev.devfn is not
>>>>>     set yet in xen_igd_clear_slot.
>>>>> 
>>>>> v7: The v7 that was posted to the mailing list was incorrect. v8 is what
>>>>>     v7 was intended to be.
>>>>> 
>>>>> v8: Inhibit out of context log message and needless processing by
>>>>>     adding 2 lines at the top of the new xen_igd_clear_slot function:
>>>>> 
>>>>>     if (!(pci_bus->slot_reserved_mask & XEN_PCI_IGD_SLOT_MASK))
>>>>>         return;
>>>>> 
>>>>>     Rebase. This removed an unnecessary header file from xen_pt.h 
>>>>> 
>>>>>  hw/i386/pc_piix.c    |  3 +++
>>>>>  hw/xen/xen_pt.c      | 49 ++++++++++++++++++++++++++++++++++++--------
>>>>>  hw/xen/xen_pt.h      | 16 +++++++++++++++
>>>>>  hw/xen/xen_pt_stub.c |  4 ++++
>>>>>  4 files changed, 63 insertions(+), 9 deletions(-)
>>>>> 
>>>>> diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
>>>>> index b48047f50c..bc5efa4f59 100644
>>>>> --- a/hw/i386/pc_piix.c
>>>>> +++ b/hw/i386/pc_piix.c
>>>>> @@ -405,6 +405,9 @@ static void pc_xen_hvm_init(MachineState *machine)
>>>>>      }
>>>>>  
>>>>>      pc_xen_hvm_init_pci(machine);
>>>>> +    if (xen_igd_gfx_pt_enabled()) {
>>>>> +        xen_igd_reserve_slot(pcms->bus);
>>>>> +    }
>>>>>      pci_create_simple(pcms->bus, -1, "xen-platform");
>>>>>  }
>>>>>  #endif
>>>> 
>>>> I would even maybe go further and move the whole logic into
>>>> xen_igd_reserve_slot. And I would even just name it
>>>> xen_hvm_init_reserved_slots without worrying about the what
>>>> or why at the pc level.  At this point it will be up to Xen maintainers.
>>>
>>>I see to do that would be to resolve the two pc_xen_hvm*
>>>functions in pc_piix.c that are guarded by CONFIG_XEN and
>>>move them to an appropriate place such as xen-hvm.c.
>>>
>>>That is along the lines of the work that Bernhard and Philippe
>>>are doing, so I am Cc'ing them. My first inclination is just
>>>to defer to them: I think eventually the little patch I propose
>>>here to pc_piix.c is eventually going to be moved out of pc_piix.c
>>>by Bernhard in a future patch.
>>>
>>>What they have been doing is very conservative, and I expect
>>>if and when Bernhard gets here to resolve those functions, they
>>>will do it in a way that keeps the dependency of the xenfv machine
>>>type on the pc machine type and the pc_init1 function.
>>>
>>>What I would propose would be to break the dependency of xenfv
>>>on the pc_init1 function. That is, I would propose having a
>>>xenfv_init function in xen-hvm.c, and the first version would
>>>be the current version of pc_init1, so xenfv would still depend
>>>on many i440fx type things, but with the change xen developers
>>>would be free to tweak xenfv_init without affecting the users
>>>of the pc machine type.
>>>
>>>Would that be a good idea? If I get posiive feedback for this
>>>idea, I will put it on the table, probably initially as an RFC
>>>patch.
>> 
>> In various patches I've been decoupling 1/ PIIX3 from Xen and 2/ QEMU's Xen integration code from the PC machine. My idea is to confine all wiring for a PIIX based PC machine using Xen in pc_piix.c. The pc_xen_hvm* functions seem to do exactly that, so I'd leave them there, at least for now.
>> 
>> What I would like to avoid is for the Xen integration code to make assumptions that an x86 or PC machine is always based on i440fx or PIIX3.
>
>I think what you are saying is that if I try to move the logic of my patch to xen-hvm.c, as Michael suggests, I should not move or copy any piix3 code to the Xen integration code, but access it via an appropriate qom interface to the code in pc_piix.c and only move Xen specific things to the Xen integration code such as the content of my patch. I can try to do that for a v9 of my patch. It might take me a little while (I am not a professional coder), so I will just leave v8 of my patch as is for now until I have a patch ready to move it out of pc_piix.c the qom way.

I think the change Michael suggests is very minimalistic: Move the if condition around xen_igd_reserve_slot() into the function itself and always call it there unconditionally -- basically turning three lines into one. Since xen_igd_reserve_slot() seems very problem specific, Michael further suggests to rename it to something more general. All in all no big changes required.

Best regards,
Bernhard

>
>Thanks,
>
>Chuck
>
>> 
>> I like Michael's idea of going one step further, both in terms of the approach and the reasoning.
>> 
>> Best regards,
>> Bernhard
>> 
>>>Also, thanks, Michael, for your other suggestions for this patch
>>>about using macros for the devfn constants.
>>>
>>>Chuck
>>>
>>>> 
>>>>> diff --git a/hw/xen/xen_pt.c b/hw/xen/xen_pt.c
>>>>> index 0ec7e52183..eff38155ef 100644
>>>>> --- a/hw/xen/xen_pt.c
>>>>> +++ b/hw/xen/xen_pt.c
>>>>> @@ -780,15 +780,6 @@ static void xen_pt_realize(PCIDevice *d, Error **errp)
>>>>>                 s->hostaddr.bus, s->hostaddr.slot, s->hostaddr.function,
>>>>>                 s->dev.devfn);
>>>>>  
>>>>> -    xen_host_pci_device_get(&s->real_device,
>>>>> -                            s->hostaddr.domain, s->hostaddr.bus,
>>>>> -                            s->hostaddr.slot, s->hostaddr.function,
>>>>> -                            errp);
>>>>> -    if (*errp) {
>>>>> -        error_append_hint(errp, "Failed to \"open\" the real pci device");
>>>>> -        return;
>>>>> -    }
>>>>> -
>>>>>      s->is_virtfn = s->real_device.is_virtfn;
>>>>>      if (s->is_virtfn) {
>>>>>          XEN_PT_LOG(d, "%04x:%02x:%02x.%d is a SR-IOV Virtual Function\n",
>>>>> @@ -950,11 +941,50 @@ static void xen_pci_passthrough_instance_init(Object *obj)
>>>>>      PCI_DEVICE(obj)->cap_present |= QEMU_PCI_CAP_EXPRESS;
>>>>>  }
>>>>>  
>>>>> +void xen_igd_reserve_slot(PCIBus *pci_bus)
>>>>> +{
>>>>> +    XEN_PT_LOG(0, "Reserving PCI slot 2 for IGD\n");
>>>>> +    pci_bus->slot_reserved_mask |= XEN_PCI_IGD_SLOT_MASK;
>>>>> +}
>>>>> +
>>>>> +static void xen_igd_clear_slot(DeviceState *qdev, Error **errp)
>>>>> +{
>>>>> +    ERRP_GUARD();
>>>>> +    PCIDevice *pci_dev = (PCIDevice *)qdev;
>>>>> +    XenPCIPassthroughState *s = XEN_PT_DEVICE(pci_dev);
>>>>> +    XenPTDeviceClass *xpdc = XEN_PT_DEVICE_GET_CLASS(s);
>>>>> +    PCIBus *pci_bus = pci_get_bus(pci_dev);
>>>>> +
>>>>> +    if (!(pci_bus->slot_reserved_mask & XEN_PCI_IGD_SLOT_MASK))
>>>>> +        return;
>>>>> +
>>>>> +    xen_host_pci_device_get(&s->real_device,
>>>>> +                            s->hostaddr.domain, s->hostaddr.bus,
>>>>> +                            s->hostaddr.slot, s->hostaddr.function,
>>>>> +                            errp);
>>>>> +    if (*errp) {
>>>>> +        error_append_hint(errp, "Failed to \"open\" the real pci device");
>>>>> +        return;
>>>>> +    }
>>>>> +
>>>>> +    if (is_igd_vga_passthrough(&s->real_device) &&
>>>>> +        s->real_device.domain == 0 && s->real_device.bus == 0 &&
>>>>> +        s->real_device.dev == 2 && s->real_device.func == 0 &&
>>>>> +        s->real_device.vendor_id == PCI_VENDOR_ID_INTEL) {
>>>> 
>>>> how about macros for these?
>>>> 
>>>> #define XEN_PCI_IGD_DOMAIN 0
>>>> #define XEN_PCI_IGD_BUS 0
>>>> #define XEN_PCI_IGD_DEV 2
>>>> #define XEN_PCI_IGD_FN 0
>>>> 
>>>>> +        pci_bus->slot_reserved_mask &= ~XEN_PCI_IGD_SLOT_MASK;
>>>> 
>>>> If you are going to do this, you should set it back
>>>> either after pci_qdev_realize or in unrealize,
>>>> for symmetry.
>>>> 
>>>>> +        XEN_PT_LOG(pci_dev, "Intel IGD found, using slot 2\n");
>>>>> +    }
>>>> 
>>>> 
>>>>> +    xpdc->pci_qdev_realize(qdev, errp);
>>>>> +}
>>>>> +
>>>> 
>>>> 
>>>> 
>>>>>  static void xen_pci_passthrough_class_init(ObjectClass *klass, void *data)
>>>>>  {
>>>>>      DeviceClass *dc = DEVICE_CLASS(klass);
>>>>>      PCIDeviceClass *k = PCI_DEVICE_CLASS(klass);
>>>>>  
>>>>> +    XenPTDeviceClass *xpdc = XEN_PT_DEVICE_CLASS(klass);
>>>>> +    xpdc->pci_qdev_realize = dc->realize;
>>>>> +    dc->realize = xen_igd_clear_slot;
>>>>>      k->realize = xen_pt_realize;
>>>>>      k->exit = xen_pt_unregister_device;
>>>>>      k->config_read = xen_pt_pci_read_config;
>>>>> @@ -977,6 +1007,7 @@ static const TypeInfo xen_pci_passthrough_info = {
>>>>>      .instance_size = sizeof(XenPCIPassthroughState),
>>>>>      .instance_finalize = xen_pci_passthrough_finalize,
>>>>>      .class_init = xen_pci_passthrough_class_init,
>>>>> +    .class_size = sizeof(XenPTDeviceClass),
>>>>>      .instance_init = xen_pci_passthrough_instance_init,
>>>>>      .interfaces = (InterfaceInfo[]) {
>>>>>          { INTERFACE_CONVENTIONAL_PCI_DEVICE },
>>>>> diff --git a/hw/xen/xen_pt.h b/hw/xen/xen_pt.h
>>>>> index cf10fc7bbf..8c25932b4b 100644
>>>>> --- a/hw/xen/xen_pt.h
>>>>> +++ b/hw/xen/xen_pt.h
>>>>> @@ -2,6 +2,7 @@
>>>>>  #define XEN_PT_H
>>>>>  
>>>>>  #include "hw/xen/xen_common.h"
>>>>> +#include "hw/pci/pci_bus.h"
>>>>>  #include "xen-host-pci-device.h"
>>>>>  #include "qom/object.h"
>>>>>  
>>>>> @@ -40,7 +41,20 @@ typedef struct XenPTReg XenPTReg;
>>>>>  #define TYPE_XEN_PT_DEVICE "xen-pci-passthrough"
>>>>>  OBJECT_DECLARE_SIMPLE_TYPE(XenPCIPassthroughState, XEN_PT_DEVICE)
>>>>>  
>>>>> +#define XEN_PT_DEVICE_CLASS(klass) \
>>>>> +    OBJECT_CLASS_CHECK(XenPTDeviceClass, klass, TYPE_XEN_PT_DEVICE)
>>>>> +#define XEN_PT_DEVICE_GET_CLASS(obj) \
>>>>> +    OBJECT_GET_CLASS(XenPTDeviceClass, obj, TYPE_XEN_PT_DEVICE)
>>>>> +
>>>>> +typedef void (*XenPTQdevRealize)(DeviceState *qdev, Error **errp);
>>>>> +
>>>>> +typedef struct XenPTDeviceClass {
>>>>> +    PCIDeviceClass parent_class;
>>>>> +    XenPTQdevRealize pci_qdev_realize;
>>>>> +} XenPTDeviceClass;
>>>>> +
>>>>>  uint32_t igd_read_opregion(XenPCIPassthroughState *s);
>>>>> +void xen_igd_reserve_slot(PCIBus *pci_bus);
>>>>>  void igd_write_opregion(XenPCIPassthroughState *s, uint32_t val);
>>>>>  void xen_igd_passthrough_isa_bridge_create(XenPCIPassthroughState *s,
>>>>>                                             XenHostPCIDevice *dev);
>>>>> @@ -75,6 +89,8 @@ typedef int (*xen_pt_conf_byte_read)
>>>>>  
>>>>>  #define XEN_PCI_INTEL_OPREGION 0xfc
>>>>>  
>>>>> +#define XEN_PCI_IGD_SLOT_MASK 0x4UL /* Intel IGD slot_reserved_mask */
>>>>> +
>>>> 
>>>> I think you want to calculate this based on dev fn:
>>>> 
>>>> #define XEN_PCI_IGD_SLOT_MASK \
>>>> 	(0x1 << PCI_SLOT(PCI_DEVFN(XEN_PCI_IGD_DEV, XEN_PCI_IGD_FN)))
>>>> 
>>>> 
>>>>>  typedef enum {
>>>>>      XEN_PT_GRP_TYPE_HARDWIRED = 0,  /* 0 Hardwired reg group */
>>>>>      XEN_PT_GRP_TYPE_EMU,            /* emul reg group */
>>>>> diff --git a/hw/xen/xen_pt_stub.c b/hw/xen/xen_pt_stub.c
>>>>> index 2d8cac8d54..5c108446a8 100644
>>>>> --- a/hw/xen/xen_pt_stub.c
>>>>> +++ b/hw/xen/xen_pt_stub.c
>>>>> @@ -20,3 +20,7 @@ void xen_igd_gfx_pt_set(bool value, Error **errp)
>>>>>          error_setg(errp, "Xen PCI passthrough support not built in");
>>>>>      }
>>>>>  }
>>>>> +
>>>>> +void xen_igd_reserve_slot(PCIBus *pci_bus)
>>>>> +{
>>>>> +}
>>>>> -- 
>>>>> 2.39.0
>>>> 
>>>
>

Re: [PATCH v8] xen/pt: reserve PCI slot 2 for Intel igd-passthru

Posted by Michael S. Tsirkin 1 year, 3 months ago

On Thu, Jan 12, 2023 at 10:55:25PM +0000, Bernhard Beschow wrote:
> I think the change Michael suggests is very minimalistic: Move the if
> condition around xen_igd_reserve_slot() into the function itself and
> always call it there unconditionally -- basically turning three lines
> into one. Since xen_igd_reserve_slot() seems very problem specific,
> Michael further suggests to rename it to something more general. All
> in all no big changes required.

yes, exactly.

-- 
MST

Re: [PATCH v8] xen/pt: reserve PCI slot 2 for Intel igd-passthru

Posted by Chuck Zmudzinski 1 year, 3 months ago

On 1/12/23 6:03 PM, Michael S. Tsirkin wrote:
> On Thu, Jan 12, 2023 at 10:55:25PM +0000, Bernhard Beschow wrote:
>> I think the change Michael suggests is very minimalistic: Move the if
>> condition around xen_igd_reserve_slot() into the function itself and
>> always call it there unconditionally -- basically turning three lines
>> into one. Since xen_igd_reserve_slot() seems very problem specific,
>> Michael further suggests to rename it to something more general. All
>> in all no big changes required.
> 
> yes, exactly.
> 

OK, got it. I can do that along with the other suggestions.

Thanks.

Re: [PATCH v8] xen/pt: reserve PCI slot 2 for Intel igd-passthru

Posted by Igor Mammedov 1 year, 3 months ago

On Thu, 12 Jan 2023 23:14:26 -0500
Chuck Zmudzinski <brchuckz@aol.com> wrote:

> On 1/12/23 6:03 PM, Michael S. Tsirkin wrote:
> > On Thu, Jan 12, 2023 at 10:55:25PM +0000, Bernhard Beschow wrote:  
> >> I think the change Michael suggests is very minimalistic: Move the if
> >> condition around xen_igd_reserve_slot() into the function itself and
> >> always call it there unconditionally -- basically turning three lines
> >> into one. Since xen_igd_reserve_slot() seems very problem specific,
> >> Michael further suggests to rename it to something more general. All
> >> in all no big changes required.  
> > 
> > yes, exactly.
> >   
> 
> OK, got it. I can do that along with the other suggestions.

have you considered instead of reservation, putting a slot check in device model
and if it's intel igd being passed through, fail at realize time  if it can't take
required slot (with a error directing user to fix command line)?
That could be less complicated than dealing with slot reservations at the cost of
being less convenient.

> 
> Thanks.
>

Re: [PATCH v8] xen/pt: reserve PCI slot 2 for Intel igd-passthru

Posted by Chuck Zmudzinski 1 year, 3 months ago

On 1/13/23 4:33 AM, Igor Mammedov wrote:
> On Thu, 12 Jan 2023 23:14:26 -0500
> Chuck Zmudzinski <brchuckz@aol.com> wrote:
> 
>> On 1/12/23 6:03 PM, Michael S. Tsirkin wrote:
>> > On Thu, Jan 12, 2023 at 10:55:25PM +0000, Bernhard Beschow wrote:  
>> >> I think the change Michael suggests is very minimalistic: Move the if
>> >> condition around xen_igd_reserve_slot() into the function itself and
>> >> always call it there unconditionally -- basically turning three lines
>> >> into one. Since xen_igd_reserve_slot() seems very problem specific,
>> >> Michael further suggests to rename it to something more general. All
>> >> in all no big changes required.  
>> > 
>> > yes, exactly.
>> >   
>> 
>> OK, got it. I can do that along with the other suggestions.
> 
> have you considered instead of reservation, putting a slot check in device model
> and if it's intel igd being passed through, fail at realize time  if it can't take
> required slot (with a error directing user to fix command line)?

Yes, but the core pci code currently already fails at realize time
with a useful error message if the user tries to use slot 2 for the
igd, because of the xen platform device which has slot 2. The user
can fix this without patching qemu, but having the user fix it on
the command line is not the best way to solve the problem, primarily
because the user would need to hotplug the xen platform device via a
command line option instead of having the xen platform device added by
pc_xen_hvm_init functions almost immediately after creating the pci
bus, and that delay in adding the xen platform device degrades
startup performance of the guest.

> That could be less complicated than dealing with slot reservations at the cost of
> being less convenient.

And also a cost of reduced startup performance.

However, the performance hit can be prevented by assigning slot
3 instead of slot 2 for the xen platform device if igd passthrough
is configured on the command line instead of doing slot reservation,
but there would still be less convenience and, for libxl users, an
inability to easily configure the command line so that the igd can
still have slot 2 without a hacky and error-prone patch to libxl to
deal with this problem.

I did post a patch on xen-devel to fix this using libxl, but so far
it has not yet been reviewed and I mentioned in that patch that the
approach of patching qemu so qemu reserves slot 2 for the igd is less
prone to coding errors and is easier to maintain than the patch that
would be required to implement the fix in libxl.

Re: [PATCH v8] xen/pt: reserve PCI slot 2 for Intel igd-passthru

Posted by Igor Mammedov 1 year, 3 months ago

On Fri, 13 Jan 2023 16:31:26 -0500
Chuck Zmudzinski <brchuckz@aol.com> wrote:

> On 1/13/23 4:33 AM, Igor Mammedov wrote:
> > On Thu, 12 Jan 2023 23:14:26 -0500
> > Chuck Zmudzinski <brchuckz@aol.com> wrote:
> >   
> >> On 1/12/23 6:03 PM, Michael S. Tsirkin wrote:  
> >> > On Thu, Jan 12, 2023 at 10:55:25PM +0000, Bernhard Beschow wrote:    
> >> >> I think the change Michael suggests is very minimalistic: Move the if
> >> >> condition around xen_igd_reserve_slot() into the function itself and
> >> >> always call it there unconditionally -- basically turning three lines
> >> >> into one. Since xen_igd_reserve_slot() seems very problem specific,
> >> >> Michael further suggests to rename it to something more general. All
> >> >> in all no big changes required.    
> >> > 
> >> > yes, exactly.
> >> >     
> >> 
> >> OK, got it. I can do that along with the other suggestions.  
> > 
> > have you considered instead of reservation, putting a slot check in device model
> > and if it's intel igd being passed through, fail at realize time  if it can't take
> > required slot (with a error directing user to fix command line)?  
> 
> Yes, but the core pci code currently already fails at realize time
> with a useful error message if the user tries to use slot 2 for the
> igd, because of the xen platform device which has slot 2. The user
> can fix this without patching qemu, but having the user fix it on
> the command line is not the best way to solve the problem, primarily
> because the user would need to hotplug the xen platform device via a
> command line option instead of having the xen platform device added by
> pc_xen_hvm_init functions almost immediately after creating the pci
> bus, and that delay in adding the xen platform device degrades
> startup performance of the guest.
> 
> > That could be less complicated than dealing with slot reservations at the cost of
> > being less convenient.  
> 
> And also a cost of reduced startup performance

Could you clarify how it affects performance (and how much).
(as I see, setup done at board_init time is roughly the same
as with '-device foo' CLI options, modulo time needed to parse
options which should be negligible. and both ways are done before
guest runs)

> However, the performance hit can be prevented by assigning slot
> 3 instead of slot 2 for the xen platform device if igd passthrough
> is configured on the command line instead of doing slot reservation,
> but there would still be less convenience and, for libxl users, an
> inability to easily configure the command line so that the igd can
> still have slot 2 without a hacky and error-prone patch to libxl to
> deal with this problem.
libvirt manages to get it right on management side without quirks on
QEMU side.

> I did post a patch on xen-devel to fix this using libxl, but so far
> it has not yet been reviewed and I mentioned in that patch that the
> approach of patching qemu so qemu reserves slot 2 for the igd is less
> prone to coding errors and is easier to maintain than the patch that
> would be required to implement the fix in libxl.

the patch is not trivial, and adds maintenance on QEMU.
Though I don't object to it as long as it's constrained to xen only
code and doesn't spill into generic PCI.
All I wanted is just point out there are other approach to problem
(i.e. do force user to user to provide correct configuration instead
of adding quirks whenever it's possible).

Re: [PATCH v8] xen/pt: reserve PCI slot 2 for Intel igd-passthru

Posted by Chuck Zmudzinski 1 year, 3 months ago

On 1/16/23 10:33, Igor Mammedov wrote:
> On Fri, 13 Jan 2023 16:31:26 -0500
> Chuck Zmudzinski <brchuckz@aol.com> wrote:
> 
>> On 1/13/23 4:33 AM, Igor Mammedov wrote:
>> > On Thu, 12 Jan 2023 23:14:26 -0500
>> > Chuck Zmudzinski <brchuckz@aol.com> wrote:
>> >   
>> >> On 1/12/23 6:03 PM, Michael S. Tsirkin wrote:  
>> >> > On Thu, Jan 12, 2023 at 10:55:25PM +0000, Bernhard Beschow wrote:    
>> >> >> I think the change Michael suggests is very minimalistic: Move the if
>> >> >> condition around xen_igd_reserve_slot() into the function itself and
>> >> >> always call it there unconditionally -- basically turning three lines
>> >> >> into one. Since xen_igd_reserve_slot() seems very problem specific,
>> >> >> Michael further suggests to rename it to something more general. All
>> >> >> in all no big changes required.    
>> >> > 
>> >> > yes, exactly.
>> >> >     
>> >> 
>> >> OK, got it. I can do that along with the other suggestions.  
>> > 
>> > have you considered instead of reservation, putting a slot check in device model
>> > and if it's intel igd being passed through, fail at realize time  if it can't take
>> > required slot (with a error directing user to fix command line)?  
>> 
>> Yes, but the core pci code currently already fails at realize time
>> with a useful error message if the user tries to use slot 2 for the
>> igd, because of the xen platform device which has slot 2. The user
>> can fix this without patching qemu, but having the user fix it on
>> the command line is not the best way to solve the problem, primarily
>> because the user would need to hotplug the xen platform device via a
>> command line option instead of having the xen platform device added by
>> pc_xen_hvm_init functions almost immediately after creating the pci
>> bus, and that delay in adding the xen platform device degrades
>> startup performance of the guest.
>> 
>> > That could be less complicated than dealing with slot reservations at the cost of
>> > being less convenient.  
>> 
>> And also a cost of reduced startup performance
> 
> Could you clarify how it affects performance (and how much).
> (as I see, setup done at board_init time is roughly the same
> as with '-device foo' CLI options, modulo time needed to parse
> options which should be negligible. and both ways are done before
> guest runs)

I preface my answer by saying there is a v9, but you don't
need to look at that. I will answer all your questions here.

I am going by what I observe on the main HDMI display with the
different approaches. With the approach of not patching Qemu
to fix this, which requires adding the Xen platform device a
little later, the length of time it takes to fully load the
guest is increased. I also noticed with Linux guests that use
the grub bootoader, the grub vga driver cannot display the
grub boot menu at the native resolution of the display, which
in the tested case is 1920x1080, when the Xen platform device
is added via a command line option instead of by the
pc_xen_hvm_init_pci fucntion in pc_piix.c, but with this patch
to Qemu, the grub menu is displayed at the full, 1920x1080
native resolution of the display. Once the guest fully loads,
there is no noticeable difference in performance. It is mainly
a degradation in startup performance, not performance once
the guest OS is fully loaded.

> 
>> However, the performance hit can be prevented by assigning slot
>> 3 instead of slot 2 for the xen platform device if igd passthrough
>> is configured on the command line instead of doing slot reservation,
>> but there would still be less convenience and, for libxl users, an
>> inability to easily configure the command line so that the igd can
>> still have slot 2 without a hacky and error-prone patch to libxl to
>> deal with this problem.
> libvirt manages to get it right on management side without quirks on
> QEMU side.

I think the reason libvirt/kvm gets it right is simply because the
code implementing the libvirt/kvm approach got more attention and testing
than the code that implements the libxl/Xen approach. This patch
really represents what should have been done when support for the
igd-passthru=on option for xenfv machines was added seven years ago,
but the code was apparently added without much testing and is stale now
and needs this fix which is entirely implemented in either files maintained
by Xen maintainers or, in the case of the small patch to pc_piix.c,
entirely within a section guarded by #ifdef CONFIG_XEN. Not much
maintenance burden for hw/i386 maintainers.

> 
>> I did post a patch on xen-devel to fix this using libxl, but so far
>> it has not yet been reviewed and I mentioned in that patch that the
>> approach of patching qemu so qemu reserves slot 2 for the igd is less
>> prone to coding errors and is easier to maintain than the patch that
>> would be required to implement the fix in libxl.
> 
> the patch is not trivial, and adds maintenance on QEMU.

For all practical purposes, the only additional maintenance would
be handled by Xen maintainers, and the Xen maintainer of the Xen
files being patched gave a Reviewed-by to an earlier iteration of
this patch. So I think the decision about the maintenance cost of
this patch should be made by the Xen maintainers. In fact, if I
were a Xen maintainer, I think this patch to Qemu would be much
easier for the Xen maintainers to maintain than the proposed patch
to libxl to fix this. So ultimately, I think it makes sense for
the Xen maintainers to decide on the maintenance cost. So far
they have not weighed in since the Reviewed-by that Anthony
gave to an earlier iteration of this patch. So far, they have
not responded to my patch to libxl, and I don't blame them because
that would be more difficult for them to maintain than this patch
to some of the Xen-specific code within Qemu.

For reference, the patch for libxl that fixes this is here:

https://lore.kernel.org/qemu-devel/20230110073201.mdUvSjy1vKtxPriqMQuWAxIjQzf1eAqIlZgal1u3GBI@z/

> Though I don't object to it as long as it's constrained to xen only
> code

It already is constrained to Xen only code - the small patch to
pc_piix.c is entirely guarded by #ifdef CONFIG_XEN.

and doesn't spill into generic PCI.

In comments on an earlier iteration of this patch, Michael indicated
he would not object a patch to core pci if it added some useful
functionality.

Michael, do I misunderstand you?

I have already proposed a patch that does that, which, if accepted,
would address the objection that unconditionally reserving the slot
during initialization is not desirable. He pointed out that a patch
to core pci could fix that, and I have proposed such a patch,
independent of this patch, here:

https://lore.kernel.org/qemu-devel/ad5f5cf8bc4bd4a720724ed41e47565a7f27adf5.1673829387.git.brchuckz@aol.com/

> All I wanted is just point out there are other approach to problem
> (i.e. do force user to user to provide correct configuration instead
> of adding quirks whenever it's possible).
> 

I disagree that the default configuration should configure the hardware
in a way that does not conform to the requirements of the device and thereby
force users to add non-default settings to configure the machine correctly.
That is simply not being friendly to Xen users of Qemu, and that,
unfortunately is what Qemu code currently does and has done for the
past seven years as regards the configuration by Qemu of igd passthru on
Xen. IMO, it is unreasonable to not fix this, and since the fix can be
implemented in entirely Xen-specific code, I hope and expect that
eventually the Xen maintainers will fix this. I hope they are just waiting
until I implement the fixes that you and Michael have requested which
are mostly reasonable and admittedly, not completed yet.

Perhaps this approach is what you call a "quirk" because of the limitations
of how slot_reserved_mask works. That can be fixed by patching core pci.
That, IMO, is the best and most maintainable way to fix this.

So my plan is to wait and see how my proposed patch to core pci is received.
If it gets accepted, I will do a v10 of this patch which will use the
improved management capability added by the patch to core pci that addresses
the concerns that this patch will interfere with the libvirt/kvm approach
of manually assigning the slots by causing the slot_reserved_mask to
only take effect when the device being added is configured for auto
assignment of the slot address. When libvirt adds a pci device to a xenfv
machine configured for igd-patthru, my proposed v10, with the patch to core
pci as a prerequisite, will not introduce any change to how Qemu configures
the machine in response to a libvirt configuration that manually assigns the
slot addresses.

I do accept that v8/v9 of the patch is stalled, and I am working to address
all the concerns being raised here. Thanks for your comments!

Kind regards,

Chuck

Re: [PATCH v8] xen/pt: reserve PCI slot 2 for Intel igd-passthru

Posted by Igor Mammedov 1 year, 3 months ago

On Mon, 16 Jan 2023 13:00:53 -0500
Chuck Zmudzinski <brchuckz@netscape.net> wrote:

> On 1/16/23 10:33, Igor Mammedov wrote:
> > On Fri, 13 Jan 2023 16:31:26 -0500
> > Chuck Zmudzinski <brchuckz@aol.com> wrote:
> >   
> >> On 1/13/23 4:33 AM, Igor Mammedov wrote:  
> >> > On Thu, 12 Jan 2023 23:14:26 -0500
> >> > Chuck Zmudzinski <brchuckz@aol.com> wrote:
> >> >     
> >> >> On 1/12/23 6:03 PM, Michael S. Tsirkin wrote:    
> >> >> > On Thu, Jan 12, 2023 at 10:55:25PM +0000, Bernhard Beschow wrote:      
> >> >> >> I think the change Michael suggests is very minimalistic: Move the if
> >> >> >> condition around xen_igd_reserve_slot() into the function itself and
> >> >> >> always call it there unconditionally -- basically turning three lines
> >> >> >> into one. Since xen_igd_reserve_slot() seems very problem specific,
> >> >> >> Michael further suggests to rename it to something more general. All
> >> >> >> in all no big changes required.      
> >> >> > 
> >> >> > yes, exactly.
> >> >> >       
> >> >> 
> >> >> OK, got it. I can do that along with the other suggestions.    
> >> > 
> >> > have you considered instead of reservation, putting a slot check in device model
> >> > and if it's intel igd being passed through, fail at realize time  if it can't take
> >> > required slot (with a error directing user to fix command line)?    
> >> 
> >> Yes, but the core pci code currently already fails at realize time
> >> with a useful error message if the user tries to use slot 2 for the
> >> igd, because of the xen platform device which has slot 2. The user
> >> can fix this without patching qemu, but having the user fix it on
> >> the command line is not the best way to solve the problem, primarily
> >> because the user would need to hotplug the xen platform device via a
> >> command line option instead of having the xen platform device added by
> >> pc_xen_hvm_init functions almost immediately after creating the pci
> >> bus, and that delay in adding the xen platform device degrades
> >> startup performance of the guest.
> >>   
> >> > That could be less complicated than dealing with slot reservations at the cost of
> >> > being less convenient.    
> >> 
> >> And also a cost of reduced startup performance  
> > 
> > Could you clarify how it affects performance (and how much).
> > (as I see, setup done at board_init time is roughly the same
> > as with '-device foo' CLI options, modulo time needed to parse
> > options which should be negligible. and both ways are done before
> > guest runs)  
> 
> I preface my answer by saying there is a v9, but you don't
> need to look at that. I will answer all your questions here.
> 
> I am going by what I observe on the main HDMI display with the
> different approaches. With the approach of not patching Qemu
> to fix this, which requires adding the Xen platform device a
> little later, the length of time it takes to fully load the
> guest is increased. I also noticed with Linux guests that use
> the grub bootoader, the grub vga driver cannot display the
> grub boot menu at the native resolution of the display, which
> in the tested case is 1920x1080, when the Xen platform device
> is added via a command line option instead of by the
> pc_xen_hvm_init_pci fucntion in pc_piix.c, but with this patch
> to Qemu, the grub menu is displayed at the full, 1920x1080
> native resolution of the display. Once the guest fully loads,
> there is no noticeable difference in performance. It is mainly
> a degradation in startup performance, not performance once
> the guest OS is fully loaded.

Looking at igd-assign.txt, it recommends to add IGD using '-device' CLI
option, and actually drop at least graphics defaults explicitly.
So it is expected to work fine even when IGD is constructed with
'-device'.

Could you provide full CLI current xen starts QEMU with and then
a CLI you used (with explicit -device for IGD) that leads
to reduced performance?

CCing vfio folks who might have an idea what could be wrong based
on vfio experience.

Re: [PATCH v8] xen/pt: reserve PCI slot 2 for Intel igd-passthru

Posted by Chuck Zmudzinski 1 year, 3 months ago

On 1/17/2023 6:04 AM, Igor Mammedov wrote:
> On Mon, 16 Jan 2023 13:00:53 -0500
> Chuck Zmudzinski <brchuckz@netscape.net> wrote:
>
> > On 1/16/23 10:33, Igor Mammedov wrote:
> > > On Fri, 13 Jan 2023 16:31:26 -0500
> > > Chuck Zmudzinski <brchuckz@aol.com> wrote:
> > >   
> > >> On 1/13/23 4:33 AM, Igor Mammedov wrote:  
> > >> > On Thu, 12 Jan 2023 23:14:26 -0500
> > >> > Chuck Zmudzinski <brchuckz@aol.com> wrote:
> > >> >     
> > >> >> On 1/12/23 6:03 PM, Michael S. Tsirkin wrote:    
> > >> >> > On Thu, Jan 12, 2023 at 10:55:25PM +0000, Bernhard Beschow wrote:      
> > >> >> >> I think the change Michael suggests is very minimalistic: Move the if
> > >> >> >> condition around xen_igd_reserve_slot() into the function itself and
> > >> >> >> always call it there unconditionally -- basically turning three lines
> > >> >> >> into one. Since xen_igd_reserve_slot() seems very problem specific,
> > >> >> >> Michael further suggests to rename it to something more general. All
> > >> >> >> in all no big changes required.      
> > >> >> > 
> > >> >> > yes, exactly.
> > >> >> >       
> > >> >> 
> > >> >> OK, got it. I can do that along with the other suggestions.    
> > >> > 
> > >> > have you considered instead of reservation, putting a slot check in device model
> > >> > and if it's intel igd being passed through, fail at realize time  if it can't take
> > >> > required slot (with a error directing user to fix command line)?    
> > >> 
> > >> Yes, but the core pci code currently already fails at realize time
> > >> with a useful error message if the user tries to use slot 2 for the
> > >> igd, because of the xen platform device which has slot 2. The user
> > >> can fix this without patching qemu, but having the user fix it on
> > >> the command line is not the best way to solve the problem, primarily
> > >> because the user would need to hotplug the xen platform device via a
> > >> command line option instead of having the xen platform device added by
> > >> pc_xen_hvm_init functions almost immediately after creating the pci
> > >> bus, and that delay in adding the xen platform device degrades
> > >> startup performance of the guest.
> > >>   
> > >> > That could be less complicated than dealing with slot reservations at the cost of
> > >> > being less convenient.    
> > >> 
> > >> And also a cost of reduced startup performance  
> > > 
> > > Could you clarify how it affects performance (and how much).
> > > (as I see, setup done at board_init time is roughly the same
> > > as with '-device foo' CLI options, modulo time needed to parse
> > > options which should be negligible. and both ways are done before
> > > guest runs)  
> > 
> > I preface my answer by saying there is a v9, but you don't
> > need to look at that. I will answer all your questions here.
> > 
> > I am going by what I observe on the main HDMI display with the
> > different approaches. With the approach of not patching Qemu
> > to fix this, which requires adding the Xen platform device a
> > little later, the length of time it takes to fully load the
> > guest is increased. I also noticed with Linux guests that use
> > the grub bootoader, the grub vga driver cannot display the
> > grub boot menu at the native resolution of the display, which
> > in the tested case is 1920x1080, when the Xen platform device
> > is added via a command line option instead of by the
> > pc_xen_hvm_init_pci fucntion in pc_piix.c, but with this patch
> > to Qemu, the grub menu is displayed at the full, 1920x1080
> > native resolution of the display. Once the guest fully loads,
> > there is no noticeable difference in performance. It is mainly
> > a degradation in startup performance, not performance once
> > the guest OS is fully loaded.
>
> Looking at igd-assign.txt, it recommends to add IGD using '-device' CLI
> option, and actually drop at least graphics defaults explicitly.
> So it is expected to work fine even when IGD is constructed with
> '-device'.
>
> Could you provide full CLI current xen starts QEMU with and then
> a CLI you used (with explicit -device for IGD) that leads
> to reduced performance?
>
> CCing vfio folks who might have an idea what could be wrong based
> on vfio experience.

Actually, the igd is not added with an explicit -device option using Xen:

   1573 ?        Ssl    0:42 /usr/bin/qemu-system-i386 -xen-domid 1 -no-shutdown -chardev socket,id=libxl-cmd,path=/var/run/xen/qmp-libxl-1,server,nowait -mon chardev=libxl-cmd,mode=control -chardev socket,id=libxenstat-cmd,path=/var/run/xen/qmp-libxenstat-1,server,nowait -mon chardev=libxenstat-cmd,mode=control -nodefaults -no-user-config -name windows -vnc none -display none -serial pty -boot order=c -smp 4,maxcpus=4 -net none -machine xenfv,max-ram-below-4g=3758096384,igd-passthru=on -m 6144 -drive file=/dev/loop0,if=ide,index=0,media=disk,format=raw,cache=writeback -drive file=/dev/disk/by-uuid/A44AA4984AA468AE,if=ide,index=1,media=disk,format=raw,cache=writeback

I think it is added by xl (libxl management tool) when the guest is created
using the qmp-libxl socket that appears on the command line, but I am not 100
percent sure. So, with libxl, the command line alone does not tell the whole
story. The xl.cfg file has a line like this to define the pci devices passed through,
and in qemu they are type XEN_PT devices, not VFIO devices:

pci = [ '00:1b.0','00:14.0','00:02.0@02' ]

This means three host pci devices are passed through, the ones on the
host at slot 1b.0, 14.0, and 02.0. Of course the device at 02 is the igd.
The @02 means libxl is requesting slot 2 in the guest for the igd, the
other 2 devices are just auto assigned a slot by Qemu. Qemu cannot
assign the igd to slot 2 for xenfv machines without a patch that prevents
the Xen platform device from grabbing slot 2. That is what this patch
accomplishes. The workaround involves using the Qemu pc machine
instead of the Qemu xenfv machine, in which case the code in Qemu
that adds the Xen platform device at slot 2 is avoided, and in that case
the Xen platform device is added via a command line option instead
at slot 3 instead of slot 2.

The differences between vfio and the Xen pci passthrough device
might make a difference in Xen vs. kvm for igd-pasthru.

Also, kvm does not use the Xen platform device, and it seems the
Xen guests behave better at startup when the Xen platform device
is added very early, during the initialization of the emulated devices
of the machine, which is based on the i440fx piix3 machine, instead
of being added using a command line option. Perhaps the performance
at startup could be improved by adding the igd via a command line
option using vfio instead of the canonical way that libxl does pci
passthrough on Xen, but I have no idea if vfio works on Xen the way it
works on kvm. I am willing to investigate and experiment with it, though.

So if any vfio people can shed some light on this, that would help.

Thanks,

Chuck

Re: [PATCH v8] xen/pt: reserve PCI slot 2 for Intel igd-passthru

Posted by Stefano Stabellini 1 year, 3 months ago

On Tue, 17 Jan 2023, Chuck Zmudzinski wrote:
> On 1/17/2023 6:04 AM, Igor Mammedov wrote:
> > On Mon, 16 Jan 2023 13:00:53 -0500
> > Chuck Zmudzinski <brchuckz@netscape.net> wrote:
> >
> > > On 1/16/23 10:33, Igor Mammedov wrote:
> > > > On Fri, 13 Jan 2023 16:31:26 -0500
> > > > Chuck Zmudzinski <brchuckz@aol.com> wrote:
> > > >   
> > > >> On 1/13/23 4:33 AM, Igor Mammedov wrote:  
> > > >> > On Thu, 12 Jan 2023 23:14:26 -0500
> > > >> > Chuck Zmudzinski <brchuckz@aol.com> wrote:
> > > >> >     
> > > >> >> On 1/12/23 6:03 PM, Michael S. Tsirkin wrote:    
> > > >> >> > On Thu, Jan 12, 2023 at 10:55:25PM +0000, Bernhard Beschow wrote:      
> > > >> >> >> I think the change Michael suggests is very minimalistic: Move the if
> > > >> >> >> condition around xen_igd_reserve_slot() into the function itself and
> > > >> >> >> always call it there unconditionally -- basically turning three lines
> > > >> >> >> into one. Since xen_igd_reserve_slot() seems very problem specific,
> > > >> >> >> Michael further suggests to rename it to something more general. All
> > > >> >> >> in all no big changes required.      
> > > >> >> > 
> > > >> >> > yes, exactly.
> > > >> >> >       
> > > >> >> 
> > > >> >> OK, got it. I can do that along with the other suggestions.    
> > > >> > 
> > > >> > have you considered instead of reservation, putting a slot check in device model
> > > >> > and if it's intel igd being passed through, fail at realize time  if it can't take
> > > >> > required slot (with a error directing user to fix command line)?    
> > > >> 
> > > >> Yes, but the core pci code currently already fails at realize time
> > > >> with a useful error message if the user tries to use slot 2 for the
> > > >> igd, because of the xen platform device which has slot 2. The user
> > > >> can fix this without patching qemu, but having the user fix it on
> > > >> the command line is not the best way to solve the problem, primarily
> > > >> because the user would need to hotplug the xen platform device via a
> > > >> command line option instead of having the xen platform device added by
> > > >> pc_xen_hvm_init functions almost immediately after creating the pci
> > > >> bus, and that delay in adding the xen platform device degrades
> > > >> startup performance of the guest.
> > > >>   
> > > >> > That could be less complicated than dealing with slot reservations at the cost of
> > > >> > being less convenient.    
> > > >> 
> > > >> And also a cost of reduced startup performance  
> > > > 
> > > > Could you clarify how it affects performance (and how much).
> > > > (as I see, setup done at board_init time is roughly the same
> > > > as with '-device foo' CLI options, modulo time needed to parse
> > > > options which should be negligible. and both ways are done before
> > > > guest runs)  
> > > 
> > > I preface my answer by saying there is a v9, but you don't
> > > need to look at that. I will answer all your questions here.
> > > 
> > > I am going by what I observe on the main HDMI display with the
> > > different approaches. With the approach of not patching Qemu
> > > to fix this, which requires adding the Xen platform device a
> > > little later, the length of time it takes to fully load the
> > > guest is increased. I also noticed with Linux guests that use
> > > the grub bootoader, the grub vga driver cannot display the
> > > grub boot menu at the native resolution of the display, which
> > > in the tested case is 1920x1080, when the Xen platform device
> > > is added via a command line option instead of by the
> > > pc_xen_hvm_init_pci fucntion in pc_piix.c, but with this patch
> > > to Qemu, the grub menu is displayed at the full, 1920x1080
> > > native resolution of the display. Once the guest fully loads,
> > > there is no noticeable difference in performance. It is mainly
> > > a degradation in startup performance, not performance once
> > > the guest OS is fully loaded.
> >
> > Looking at igd-assign.txt, it recommends to add IGD using '-device' CLI
> > option, and actually drop at least graphics defaults explicitly.
> > So it is expected to work fine even when IGD is constructed with
> > '-device'.
> >
> > Could you provide full CLI current xen starts QEMU with and then
> > a CLI you used (with explicit -device for IGD) that leads
> > to reduced performance?
> >
> > CCing vfio folks who might have an idea what could be wrong based
> > on vfio experience.
> 
> Actually, the igd is not added with an explicit -device option using Xen:
> 
>    1573 ?        Ssl    0:42 /usr/bin/qemu-system-i386 -xen-domid 1 -no-shutdown -chardev socket,id=libxl-cmd,path=/var/run/xen/qmp-libxl-1,server,nowait -mon chardev=libxl-cmd,mode=control -chardev socket,id=libxenstat-cmd,path=/var/run/xen/qmp-libxenstat-1,server,nowait -mon chardev=libxenstat-cmd,mode=control -nodefaults -no-user-config -name windows -vnc none -display none -serial pty -boot order=c -smp 4,maxcpus=4 -net none -machine xenfv,max-ram-below-4g=3758096384,igd-passthru=on -m 6144 -drive file=/dev/loop0,if=ide,index=0,media=disk,format=raw,cache=writeback -drive file=/dev/disk/by-uuid/A44AA4984AA468AE,if=ide,index=1,media=disk,format=raw,cache=writeback
> 
> I think it is added by xl (libxl management tool) when the guest is created
> using the qmp-libxl socket that appears on the command line, but I am not 100
> percent sure. So, with libxl, the command line alone does not tell the whole
> story. The xl.cfg file has a line like this to define the pci devices passed through,
> and in qemu they are type XEN_PT devices, not VFIO devices:
> 
> pci = [ '00:1b.0','00:14.0','00:02.0@02' ]
> 
> This means three host pci devices are passed through, the ones on the
> host at slot 1b.0, 14.0, and 02.0. Of course the device at 02 is the igd.
> The @02 means libxl is requesting slot 2 in the guest for the igd, the
> other 2 devices are just auto assigned a slot by Qemu. Qemu cannot
> assign the igd to slot 2 for xenfv machines without a patch that prevents
> the Xen platform device from grabbing slot 2. That is what this patch
> accomplishes.

In principle I think this change is OK. Apologies that this patch is at
v9 and none of the Xen/QEMU maintainers took a look at it yet. I'll try
to look at it today.

Re: [PATCH v8] xen/pt: reserve PCI slot 2 for Intel igd-passthru

Posted by Chuck Zmudzinski 1 year, 3 months ago

On 1/20/2023 11:57 AM, Stefano Stabellini wrote:
> On Tue, 17 Jan 2023, Chuck Zmudzinski wrote:
> > On 1/17/2023 6:04 AM, Igor Mammedov wrote:
> > > On Mon, 16 Jan 2023 13:00:53 -0500
> > > Chuck Zmudzinski <brchuckz@netscape.net> wrote:
> > >
> > > > On 1/16/23 10:33, Igor Mammedov wrote:
> > > > > On Fri, 13 Jan 2023 16:31:26 -0500
> > > > > Chuck Zmudzinski <brchuckz@aol.com> wrote:
> > > > >   
> > > > >> On 1/13/23 4:33 AM, Igor Mammedov wrote:  
> > > > >> > On Thu, 12 Jan 2023 23:14:26 -0500
> > > > >> > Chuck Zmudzinski <brchuckz@aol.com> wrote:
> > > > >> >     
> > > > >> >> On 1/12/23 6:03 PM, Michael S. Tsirkin wrote:    
> > > > >> >> > On Thu, Jan 12, 2023 at 10:55:25PM +0000, Bernhard Beschow wrote:      
> > > > >> >> >> I think the change Michael suggests is very minimalistic: Move the if
> > > > >> >> >> condition around xen_igd_reserve_slot() into the function itself and
> > > > >> >> >> always call it there unconditionally -- basically turning three lines
> > > > >> >> >> into one. Since xen_igd_reserve_slot() seems very problem specific,
> > > > >> >> >> Michael further suggests to rename it to something more general. All
> > > > >> >> >> in all no big changes required.      
> > > > >> >> > 
> > > > >> >> > yes, exactly.
> > > > >> >> >       
> > > > >> >> 
> > > > >> >> OK, got it. I can do that along with the other suggestions.    
> > > > >> > 
> > > > >> > have you considered instead of reservation, putting a slot check in device model
> > > > >> > and if it's intel igd being passed through, fail at realize time  if it can't take
> > > > >> > required slot (with a error directing user to fix command line)?    
> > > > >> 
> > > > >> Yes, but the core pci code currently already fails at realize time
> > > > >> with a useful error message if the user tries to use slot 2 for the
> > > > >> igd, because of the xen platform device which has slot 2. The user
> > > > >> can fix this without patching qemu, but having the user fix it on
> > > > >> the command line is not the best way to solve the problem, primarily
> > > > >> because the user would need to hotplug the xen platform device via a
> > > > >> command line option instead of having the xen platform device added by
> > > > >> pc_xen_hvm_init functions almost immediately after creating the pci
> > > > >> bus, and that delay in adding the xen platform device degrades
> > > > >> startup performance of the guest.
> > > > >>   
> > > > >> > That could be less complicated than dealing with slot reservations at the cost of
> > > > >> > being less convenient.    
> > > > >> 
> > > > >> And also a cost of reduced startup performance  
> > > > > 
> > > > > Could you clarify how it affects performance (and how much).
> > > > > (as I see, setup done at board_init time is roughly the same
> > > > > as with '-device foo' CLI options, modulo time needed to parse
> > > > > options which should be negligible. and both ways are done before
> > > > > guest runs)  
> > > > 
> > > > I preface my answer by saying there is a v9, but you don't
> > > > need to look at that. I will answer all your questions here.
> > > > 
> > > > I am going by what I observe on the main HDMI display with the
> > > > different approaches. With the approach of not patching Qemu
> > > > to fix this, which requires adding the Xen platform device a
> > > > little later, the length of time it takes to fully load the
> > > > guest is increased. I also noticed with Linux guests that use
> > > > the grub bootoader, the grub vga driver cannot display the
> > > > grub boot menu at the native resolution of the display, which
> > > > in the tested case is 1920x1080, when the Xen platform device
> > > > is added via a command line option instead of by the
> > > > pc_xen_hvm_init_pci fucntion in pc_piix.c, but with this patch
> > > > to Qemu, the grub menu is displayed at the full, 1920x1080
> > > > native resolution of the display. Once the guest fully loads,
> > > > there is no noticeable difference in performance. It is mainly
> > > > a degradation in startup performance, not performance once
> > > > the guest OS is fully loaded.
> > >
> > > Looking at igd-assign.txt, it recommends to add IGD using '-device' CLI
> > > option, and actually drop at least graphics defaults explicitly.
> > > So it is expected to work fine even when IGD is constructed with
> > > '-device'.
> > >
> > > Could you provide full CLI current xen starts QEMU with and then
> > > a CLI you used (with explicit -device for IGD) that leads
> > > to reduced performance?
> > >
> > > CCing vfio folks who might have an idea what could be wrong based
> > > on vfio experience.
> > 
> > Actually, the igd is not added with an explicit -device option using Xen:
> > 
> >    1573 ?        Ssl    0:42 /usr/bin/qemu-system-i386 -xen-domid 1 -no-shutdown -chardev socket,id=libxl-cmd,path=/var/run/xen/qmp-libxl-1,server,nowait -mon chardev=libxl-cmd,mode=control -chardev socket,id=libxenstat-cmd,path=/var/run/xen/qmp-libxenstat-1,server,nowait -mon chardev=libxenstat-cmd,mode=control -nodefaults -no-user-config -name windows -vnc none -display none -serial pty -boot order=c -smp 4,maxcpus=4 -net none -machine xenfv,max-ram-below-4g=3758096384,igd-passthru=on -m 6144 -drive file=/dev/loop0,if=ide,index=0,media=disk,format=raw,cache=writeback -drive file=/dev/disk/by-uuid/A44AA4984AA468AE,if=ide,index=1,media=disk,format=raw,cache=writeback
> > 
> > I think it is added by xl (libxl management tool) when the guest is created
> > using the qmp-libxl socket that appears on the command line, but I am not 100
> > percent sure. So, with libxl, the command line alone does not tell the whole
> > story. The xl.cfg file has a line like this to define the pci devices passed through,
> > and in qemu they are type XEN_PT devices, not VFIO devices:
> > 
> > pci = [ '00:1b.0','00:14.0','00:02.0@02' ]
> > 
> > This means three host pci devices are passed through, the ones on the
> > host at slot 1b.0, 14.0, and 02.0. Of course the device at 02 is the igd.
> > The @02 means libxl is requesting slot 2 in the guest for the igd, the
> > other 2 devices are just auto assigned a slot by Qemu. Qemu cannot
> > assign the igd to slot 2 for xenfv machines without a patch that prevents
> > the Xen platform device from grabbing slot 2. That is what this patch
> > accomplishes.
>
> In principle I think this change is OK. Apologies that this patch is at
> v9 and none of the Xen/QEMU maintainers took a look at it yet. I'll try
> to look at it today.

Thanks!

Re: [PATCH v8] xen/pt: reserve PCI slot 2 for Intel igd-passthru

Posted by Alex Williamson 1 year, 3 months ago

On Tue, 17 Jan 2023 19:15:57 -0500
Chuck Zmudzinski <brchuckz@aol.com> wrote:

> On 1/17/2023 6:04 AM, Igor Mammedov wrote:
> > On Mon, 16 Jan 2023 13:00:53 -0500
> > Chuck Zmudzinski <brchuckz@netscape.net> wrote:
> >  
> > > On 1/16/23 10:33, Igor Mammedov wrote:  
> > > > On Fri, 13 Jan 2023 16:31:26 -0500
> > > > Chuck Zmudzinski <brchuckz@aol.com> wrote:
> > > >     
> > > >> On 1/13/23 4:33 AM, Igor Mammedov wrote:    
> > > >> > On Thu, 12 Jan 2023 23:14:26 -0500
> > > >> > Chuck Zmudzinski <brchuckz@aol.com> wrote:
> > > >> >       
> > > >> >> On 1/12/23 6:03 PM, Michael S. Tsirkin wrote:      
> > > >> >> > On Thu, Jan 12, 2023 at 10:55:25PM +0000, Bernhard Beschow wrote:        
> > > >> >> >> I think the change Michael suggests is very minimalistic: Move the if
> > > >> >> >> condition around xen_igd_reserve_slot() into the function itself and
> > > >> >> >> always call it there unconditionally -- basically turning three lines
> > > >> >> >> into one. Since xen_igd_reserve_slot() seems very problem specific,
> > > >> >> >> Michael further suggests to rename it to something more general. All
> > > >> >> >> in all no big changes required.        
> > > >> >> > 
> > > >> >> > yes, exactly.
> > > >> >> >         
> > > >> >> 
> > > >> >> OK, got it. I can do that along with the other suggestions.      
> > > >> > 
> > > >> > have you considered instead of reservation, putting a slot check in device model
> > > >> > and if it's intel igd being passed through, fail at realize time  if it can't take
> > > >> > required slot (with a error directing user to fix command line)?      
> > > >> 
> > > >> Yes, but the core pci code currently already fails at realize time
> > > >> with a useful error message if the user tries to use slot 2 for the
> > > >> igd, because of the xen platform device which has slot 2. The user
> > > >> can fix this without patching qemu, but having the user fix it on
> > > >> the command line is not the best way to solve the problem, primarily
> > > >> because the user would need to hotplug the xen platform device via a
> > > >> command line option instead of having the xen platform device added by
> > > >> pc_xen_hvm_init functions almost immediately after creating the pci
> > > >> bus, and that delay in adding the xen platform device degrades
> > > >> startup performance of the guest.
> > > >>     
> > > >> > That could be less complicated than dealing with slot reservations at the cost of
> > > >> > being less convenient.      
> > > >> 
> > > >> And also a cost of reduced startup performance    
> > > > 
> > > > Could you clarify how it affects performance (and how much).
> > > > (as I see, setup done at board_init time is roughly the same
> > > > as with '-device foo' CLI options, modulo time needed to parse
> > > > options which should be negligible. and both ways are done before
> > > > guest runs)    
> > > 
> > > I preface my answer by saying there is a v9, but you don't
> > > need to look at that. I will answer all your questions here.
> > > 
> > > I am going by what I observe on the main HDMI display with the
> > > different approaches. With the approach of not patching Qemu
> > > to fix this, which requires adding the Xen platform device a
> > > little later, the length of time it takes to fully load the
> > > guest is increased. I also noticed with Linux guests that use
> > > the grub bootoader, the grub vga driver cannot display the
> > > grub boot menu at the native resolution of the display, which
> > > in the tested case is 1920x1080, when the Xen platform device
> > > is added via a command line option instead of by the
> > > pc_xen_hvm_init_pci fucntion in pc_piix.c, but with this patch
> > > to Qemu, the grub menu is displayed at the full, 1920x1080
> > > native resolution of the display. Once the guest fully loads,
> > > there is no noticeable difference in performance. It is mainly
> > > a degradation in startup performance, not performance once
> > > the guest OS is fully loaded.  
> >
> > Looking at igd-assign.txt, it recommends to add IGD using '-device' CLI
> > option, and actually drop at least graphics defaults explicitly.
> > So it is expected to work fine even when IGD is constructed with
> > '-device'.
> >
> > Could you provide full CLI current xen starts QEMU with and then
> > a CLI you used (with explicit -device for IGD) that leads
> > to reduced performance?
> >
> > CCing vfio folks who might have an idea what could be wrong based
> > on vfio experience.  
> 
> Actually, the igd is not added with an explicit -device option using Xen:
> 
>    1573 ?        Ssl    0:42 /usr/bin/qemu-system-i386 -xen-domid 1 -no-shutdown -chardev socket,id=libxl-cmd,path=/var/run/xen/qmp-libxl-1,server,nowait -mon chardev=libxl-cmd,mode=control -chardev socket,id=libxenstat-cmd,path=/var/run/xen/qmp-libxenstat-1,server,nowait -mon chardev=libxenstat-cmd,mode=control -nodefaults -no-user-config -name windows -vnc none -display none -serial pty -boot order=c -smp 4,maxcpus=4 -net none -machine xenfv,max-ram-below-4g=3758096384,igd-passthru=on -m 6144 -drive file=/dev/loop0,if=ide,index=0,media=disk,format=raw,cache=writeback -drive file=/dev/disk/by-uuid/A44AA4984AA468AE,if=ide,index=1,media=disk,format=raw,cache=writeback
> 
> I think it is added by xl (libxl management tool) when the guest is created
> using the qmp-libxl socket that appears on the command line, but I am not 100
> percent sure. So, with libxl, the command line alone does not tell the whole
> story. The xl.cfg file has a line like this to define the pci devices passed through,
> and in qemu they are type XEN_PT devices, not VFIO devices:
> 
> pci = [ '00:1b.0','00:14.0','00:02.0@02' ]
> 
> This means three host pci devices are passed through, the ones on the
> host at slot 1b.0, 14.0, and 02.0. Of course the device at 02 is the igd.
> The @02 means libxl is requesting slot 2 in the guest for the igd, the
> other 2 devices are just auto assigned a slot by Qemu. Qemu cannot
> assign the igd to slot 2 for xenfv machines without a patch that prevents
> the Xen platform device from grabbing slot 2. That is what this patch
> accomplishes. The workaround involves using the Qemu pc machine
> instead of the Qemu xenfv machine, in which case the code in Qemu
> that adds the Xen platform device at slot 2 is avoided, and in that case
> the Xen platform device is added via a command line option instead
> at slot 3 instead of slot 2.
> 
> The differences between vfio and the Xen pci passthrough device
> might make a difference in Xen vs. kvm for igd-pasthru.
> 
> Also, kvm does not use the Xen platform device, and it seems the
> Xen guests behave better at startup when the Xen platform device
> is added very early, during the initialization of the emulated devices
> of the machine, which is based on the i440fx piix3 machine, instead
> of being added using a command line option. Perhaps the performance
> at startup could be improved by adding the igd via a command line
> option using vfio instead of the canonical way that libxl does pci
> passthrough on Xen, but I have no idea if vfio works on Xen the way it
> works on kvm. I am willing to investigate and experiment with it, though.
> 
> So if any vfio people can shed some light on this, that would help.

ISTR some rumors of Xen thinking about vfio, but AFAIK there is no
combination of vfio with Xen, nor is there any sharing of device quirks
to assign IGD.  IGD assignment has various VM platform dependencies, of
which the bus address is just one.  Vfio support for IGD assignment
takes a much different approach, the user or management tool is
responsible for configuring the VM correctly for IGD assignment,
including assigning the device to the correct bus address and using the
right VM chipset, with the correct slot free for the LPC controller.  If
all the user configuration of the VM aligns correctly, we'll activate
"legacy mode" by inserting the opregion and instantiate the LPC bridge
with the correct fields to make the BIOS and drivers happy.  Otherwise,
maybe the user is trying to use the mythical UPT mode, but given
Intel's lack of follow-through, it probably just won't work.  Or maybe
it's a vGPU and we don't need the legacy features anyway.

Working with an expectation that QEMU needs to do the heavy lifting to
make it all work automatically, with no support from the management
tool for putting devices in the right place, I'm afraid there might not
be much to leverage from the vfio vesion.  Thanks,

Alex

Re: [PATCH v8] xen/pt: reserve PCI slot 2 for Intel igd-passthru

Posted by Chuck Zmudzinski 1 year, 3 months ago

On 1/17/2023 11:27 PM, Alex Williamson wrote:
> On Tue, 17 Jan 2023 19:15:57 -0500
> Chuck Zmudzinski <brchuckz@aol.com> wrote:
>
> > On 1/17/2023 6:04 AM, Igor Mammedov wrote:
> > > On Mon, 16 Jan 2023 13:00:53 -0500
> > > Chuck Zmudzinski <brchuckz@netscape.net> wrote:
> > >  
> > > > On 1/16/23 10:33, Igor Mammedov wrote:  
> > > > > On Fri, 13 Jan 2023 16:31:26 -0500
> > > > > Chuck Zmudzinski <brchuckz@aol.com> wrote:
> > > > >     
> > > > >> On 1/13/23 4:33 AM, Igor Mammedov wrote:    
> > > > >> > On Thu, 12 Jan 2023 23:14:26 -0500
> > > > >> > Chuck Zmudzinski <brchuckz@aol.com> wrote:
> > > > >> >       
> > > > >> >> On 1/12/23 6:03 PM, Michael S. Tsirkin wrote:      
> > > > >> >> > On Thu, Jan 12, 2023 at 10:55:25PM +0000, Bernhard Beschow wrote:        
> > > > >> >> >> I think the change Michael suggests is very minimalistic: Move the if
> > > > >> >> >> condition around xen_igd_reserve_slot() into the function itself and
> > > > >> >> >> always call it there unconditionally -- basically turning three lines
> > > > >> >> >> into one. Since xen_igd_reserve_slot() seems very problem specific,
> > > > >> >> >> Michael further suggests to rename it to something more general. All
> > > > >> >> >> in all no big changes required.        
> > > > >> >> > 
> > > > >> >> > yes, exactly.
> > > > >> >> >         
> > > > >> >> 
> > > > >> >> OK, got it. I can do that along with the other suggestions.      
> > > > >> > 
> > > > >> > have you considered instead of reservation, putting a slot check in device model
> > > > >> > and if it's intel igd being passed through, fail at realize time  if it can't take
> > > > >> > required slot (with a error directing user to fix command line)?      
> > > > >> 
> > > > >> Yes, but the core pci code currently already fails at realize time
> > > > >> with a useful error message if the user tries to use slot 2 for the
> > > > >> igd, because of the xen platform device which has slot 2. The user
> > > > >> can fix this without patching qemu, but having the user fix it on
> > > > >> the command line is not the best way to solve the problem, primarily
> > > > >> because the user would need to hotplug the xen platform device via a
> > > > >> command line option instead of having the xen platform device added by
> > > > >> pc_xen_hvm_init functions almost immediately after creating the pci
> > > > >> bus, and that delay in adding the xen platform device degrades
> > > > >> startup performance of the guest.
> > > > >>     
> > > > >> > That could be less complicated than dealing with slot reservations at the cost of
> > > > >> > being less convenient.      
> > > > >> 
> > > > >> And also a cost of reduced startup performance    
> > > > > 
> > > > > Could you clarify how it affects performance (and how much).
> > > > > (as I see, setup done at board_init time is roughly the same
> > > > > as with '-device foo' CLI options, modulo time needed to parse
> > > > > options which should be negligible. and both ways are done before
> > > > > guest runs)    
> > > > 
> > > > I preface my answer by saying there is a v9, but you don't
> > > > need to look at that. I will answer all your questions here.
> > > > 
> > > > I am going by what I observe on the main HDMI display with the
> > > > different approaches. With the approach of not patching Qemu
> > > > to fix this, which requires adding the Xen platform device a
> > > > little later, the length of time it takes to fully load the
> > > > guest is increased. I also noticed with Linux guests that use
> > > > the grub bootoader, the grub vga driver cannot display the
> > > > grub boot menu at the native resolution of the display, which
> > > > in the tested case is 1920x1080, when the Xen platform device
> > > > is added via a command line option instead of by the
> > > > pc_xen_hvm_init_pci fucntion in pc_piix.c, but with this patch
> > > > to Qemu, the grub menu is displayed at the full, 1920x1080
> > > > native resolution of the display. Once the guest fully loads,
> > > > there is no noticeable difference in performance. It is mainly
> > > > a degradation in startup performance, not performance once
> > > > the guest OS is fully loaded.  
> > >
> > > Looking at igd-assign.txt, it recommends to add IGD using '-device' CLI
> > > option, and actually drop at least graphics defaults explicitly.
> > > So it is expected to work fine even when IGD is constructed with
> > > '-device'.
> > >
> > > Could you provide full CLI current xen starts QEMU with and then
> > > a CLI you used (with explicit -device for IGD) that leads
> > > to reduced performance?
> > >
> > > CCing vfio folks who might have an idea what could be wrong based
> > > on vfio experience.  
> > 
> > Actually, the igd is not added with an explicit -device option using Xen:
> > 
> >    1573 ?        Ssl    0:42 /usr/bin/qemu-system-i386 -xen-domid 1 -no-shutdown -chardev socket,id=libxl-cmd,path=/var/run/xen/qmp-libxl-1,server,nowait -mon chardev=libxl-cmd,mode=control -chardev socket,id=libxenstat-cmd,path=/var/run/xen/qmp-libxenstat-1,server,nowait -mon chardev=libxenstat-cmd,mode=control -nodefaults -no-user-config -name windows -vnc none -display none -serial pty -boot order=c -smp 4,maxcpus=4 -net none -machine xenfv,max-ram-below-4g=3758096384,igd-passthru=on -m 6144 -drive file=/dev/loop0,if=ide,index=0,media=disk,format=raw,cache=writeback -drive file=/dev/disk/by-uuid/A44AA4984AA468AE,if=ide,index=1,media=disk,format=raw,cache=writeback
> > 
> > I think it is added by xl (libxl management tool) when the guest is created
> > using the qmp-libxl socket that appears on the command line, but I am not 100
> > percent sure. So, with libxl, the command line alone does not tell the whole
> > story. The xl.cfg file has a line like this to define the pci devices passed through,
> > and in qemu they are type XEN_PT devices, not VFIO devices:
> > 
> > pci = [ '00:1b.0','00:14.0','00:02.0@02' ]
> > 
> > This means three host pci devices are passed through, the ones on the
> > host at slot 1b.0, 14.0, and 02.0. Of course the device at 02 is the igd.
> > The @02 means libxl is requesting slot 2 in the guest for the igd, the
> > other 2 devices are just auto assigned a slot by Qemu. Qemu cannot
> > assign the igd to slot 2 for xenfv machines without a patch that prevents
> > the Xen platform device from grabbing slot 2. That is what this patch
> > accomplishes. The workaround involves using the Qemu pc machine
> > instead of the Qemu xenfv machine, in which case the code in Qemu
> > that adds the Xen platform device at slot 2 is avoided, and in that case
> > the Xen platform device is added via a command line option instead
> > at slot 3 instead of slot 2.
> > 
> > The differences between vfio and the Xen pci passthrough device
> > might make a difference in Xen vs. kvm for igd-pasthru.
> > 
> > Also, kvm does not use the Xen platform device, and it seems the
> > Xen guests behave better at startup when the Xen platform device
> > is added very early, during the initialization of the emulated devices
> > of the machine, which is based on the i440fx piix3 machine, instead
> > of being added using a command line option. Perhaps the performance
> > at startup could be improved by adding the igd via a command line
> > option using vfio instead of the canonical way that libxl does pci
> > passthrough on Xen, but I have no idea if vfio works on Xen the way it
> > works on kvm. I am willing to investigate and experiment with it, though.
> > 
> > So if any vfio people can shed some light on this, that would help.
>
> ISTR some rumors of Xen thinking about vfio, but AFAIK there is no
> combination of vfio with Xen, nor is there any sharing of device quirks
> to assign IGD.  IGD assignment has various VM platform dependencies, of
> which the bus address is just one.  Vfio support for IGD assignment
> takes a much different approach, the user or management tool is
> responsible for configuring the VM correctly for IGD assignment,
> including assigning the device to the correct bus address and using the
> right VM chipset, with the correct slot free for the LPC controller.  If
> all the user configuration of the VM aligns correctly, we'll activate
> "legacy mode" by inserting the opregion and instantiate the LPC bridge
> with the correct fields to make the BIOS and drivers happy.  Otherwise,
> maybe the user is trying to use the mythical UPT mode, but given
> Intel's lack of follow-through, it probably just won't work.  Or maybe
> it's a vGPU and we don't need the legacy features anyway.
>
> Working with an expectation that QEMU needs to do the heavy lifting to
> make it all work automatically, with no support from the management
> tool for putting devices in the right place, I'm afraid there might not
> be much to leverage from the vfio vesion.  Thanks,
>
> Alex

Thanks for answering my question. I thought vfio's implementation was
distinct and probably incompatible from Xen's implementation.

Chuck

Re: [PATCH v8] xen/pt: reserve PCI slot 2 for Intel igd-passthru

Posted by Igor Mammedov 1 year, 3 months ago

On Mon, 16 Jan 2023 13:00:53 -0500
Chuck Zmudzinski <brchuckz@netscape.net> wrote:

> On 1/16/23 10:33, Igor Mammedov wrote:
> > On Fri, 13 Jan 2023 16:31:26 -0500
> > Chuck Zmudzinski <brchuckz@aol.com> wrote:
> >   
> >> On 1/13/23 4:33 AM, Igor Mammedov wrote:  
> >> > On Thu, 12 Jan 2023 23:14:26 -0500
> >> > Chuck Zmudzinski <brchuckz@aol.com> wrote:
> >> >     
> >> >> On 1/12/23 6:03 PM, Michael S. Tsirkin wrote:    
> >> >> > On Thu, Jan 12, 2023 at 10:55:25PM +0000, Bernhard Beschow wrote:      
> >> >> >> I think the change Michael suggests is very minimalistic: Move the if
> >> >> >> condition around xen_igd_reserve_slot() into the function itself and
> >> >> >> always call it there unconditionally -- basically turning three lines
> >> >> >> into one. Since xen_igd_reserve_slot() seems very problem specific,
> >> >> >> Michael further suggests to rename it to something more general. All
> >> >> >> in all no big changes required.      
> >> >> > 
> >> >> > yes, exactly.
> >> >> >       
> >> >> 
> >> >> OK, got it. I can do that along with the other suggestions.    
> >> > 
> >> > have you considered instead of reservation, putting a slot check in device model
> >> > and if it's intel igd being passed through, fail at realize time  if it can't take
> >> > required slot (with a error directing user to fix command line)?    
> >> 
> >> Yes, but the core pci code currently already fails at realize time
> >> with a useful error message if the user tries to use slot 2 for the
> >> igd, because of the xen platform device which has slot 2. The user
> >> can fix this without patching qemu, but having the user fix it on
> >> the command line is not the best way to solve the problem, primarily
> >> because the user would need to hotplug the xen platform device via a
> >> command line option instead of having the xen platform device added by
> >> pc_xen_hvm_init functions almost immediately after creating the pci
> >> bus, and that delay in adding the xen platform device degrades
> >> startup performance of the guest.
> >>   
> >> > That could be less complicated than dealing with slot reservations at the cost of
> >> > being less convenient.    
> >> 
> >> And also a cost of reduced startup performance  
> > 
> > Could you clarify how it affects performance (and how much).
> > (as I see, setup done at board_init time is roughly the same
> > as with '-device foo' CLI options, modulo time needed to parse
> > options which should be negligible. and both ways are done before
> > guest runs)  
> 
> I preface my answer by saying there is a v9, but you don't
> need to look at that. I will answer all your questions here.
> 
> I am going by what I observe on the main HDMI display with the
> different approaches. With the approach of not patching Qemu
> to fix this, which requires adding the Xen platform device a
> little later, the length of time it takes to fully load the
> guest is increased. I also noticed with Linux guests that use
> the grub bootoader, the grub vga driver cannot display the
> grub boot menu at the native resolution of the display, which
> in the tested case is 1920x1080, when the Xen platform device
> is added via a command line option instead of by the
> pc_xen_hvm_init_pci fucntion in pc_piix.c, but with this patch
> to Qemu, the grub menu is displayed at the full, 1920x1080
> native resolution of the display. Once the guest fully loads,
> there is no noticeable difference in performance. It is mainly
> a degradation in startup performance, not performance once
> the guest OS is fully loaded.
above hints on presence of bug[s] in igd-passthru implementation,
and this patch effectively hides problem instead of trying to figure
out what's wrong and fixing it.

Re: [PATCH v8] xen/pt: reserve PCI slot 2 for Intel igd-passthru

Posted by Chuck Zmudzinski 1 year, 3 months ago

On 1/17/2023 5:35 AM, Igor Mammedov wrote:
> On Mon, 16 Jan 2023 13:00:53 -0500
> Chuck Zmudzinski <brchuckz@netscape.net> wrote:
>
> > On 1/16/23 10:33, Igor Mammedov wrote:
> > > On Fri, 13 Jan 2023 16:31:26 -0500
> > > Chuck Zmudzinski <brchuckz@aol.com> wrote:
> > >   
> > >> On 1/13/23 4:33 AM, Igor Mammedov wrote:  
> > >> > On Thu, 12 Jan 2023 23:14:26 -0500
> > >> > Chuck Zmudzinski <brchuckz@aol.com> wrote:
> > >> >     
> > >> >> On 1/12/23 6:03 PM, Michael S. Tsirkin wrote:    
> > >> >> > On Thu, Jan 12, 2023 at 10:55:25PM +0000, Bernhard Beschow wrote:      
> > >> >> >> I think the change Michael suggests is very minimalistic: Move the if
> > >> >> >> condition around xen_igd_reserve_slot() into the function itself and
> > >> >> >> always call it there unconditionally -- basically turning three lines
> > >> >> >> into one. Since xen_igd_reserve_slot() seems very problem specific,
> > >> >> >> Michael further suggests to rename it to something more general. All
> > >> >> >> in all no big changes required.      
> > >> >> > 
> > >> >> > yes, exactly.
> > >> >> >       
> > >> >> 
> > >> >> OK, got it. I can do that along with the other suggestions.    
> > >> > 
> > >> > have you considered instead of reservation, putting a slot check in device model
> > >> > and if it's intel igd being passed through, fail at realize time  if it can't take
> > >> > required slot (with a error directing user to fix command line)?    
> > >> 
> > >> Yes, but the core pci code currently already fails at realize time
> > >> with a useful error message if the user tries to use slot 2 for the
> > >> igd, because of the xen platform device which has slot 2. The user
> > >> can fix this without patching qemu, but having the user fix it on
> > >> the command line is not the best way to solve the problem, primarily
> > >> because the user would need to hotplug the xen platform device via a
> > >> command line option instead of having the xen platform device added by
> > >> pc_xen_hvm_init functions almost immediately after creating the pci
> > >> bus, and that delay in adding the xen platform device degrades
> > >> startup performance of the guest.
> > >>   
> > >> > That could be less complicated than dealing with slot reservations at the cost of
> > >> > being less convenient.    
> > >> 
> > >> And also a cost of reduced startup performance  
> > > 
> > > Could you clarify how it affects performance (and how much).
> > > (as I see, setup done at board_init time is roughly the same
> > > as with '-device foo' CLI options, modulo time needed to parse
> > > options which should be negligible. and both ways are done before
> > > guest runs)  
> > 
> > I preface my answer by saying there is a v9, but you don't
> > need to look at that. I will answer all your questions here.
> > 
> > I am going by what I observe on the main HDMI display with the
> > different approaches. With the approach of not patching Qemu
> > to fix this, which requires adding the Xen platform device a
> > little later, the length of time it takes to fully load the
> > guest is increased. I also noticed with Linux guests that use
> > the grub bootoader, the grub vga driver cannot display the
> > grub boot menu at the native resolution of the display, which
> > in the tested case is 1920x1080, when the Xen platform device
> > is added via a command line option instead of by the
> > pc_xen_hvm_init_pci fucntion in pc_piix.c, but with this patch
> > to Qemu, the grub menu is displayed at the full, 1920x1080
> > native resolution of the display. Once the guest fully loads,
> > there is no noticeable difference in performance. It is mainly
> > a degradation in startup performance, not performance once
> > the guest OS is fully loaded.
> above hints on presence of bug[s] in igd-passthru implementation,
> and this patch effectively hides problem instead of trying to figure
> out what's wrong and fixing it.
>

I agree that the with the current Qemu there is a bug in the igd-passthru
implementation. My proposed patch fixes it. So why not support the
proposed patch with a Reviewed-by tag?

Re: [PATCH v8] xen/pt: reserve PCI slot 2 for Intel igd-passthru

Posted by Igor Mammedov 1 year, 3 months ago

On Tue, 17 Jan 2023 09:50:23 -0500
Chuck Zmudzinski <brchuckz@aol.com> wrote:

> On 1/17/2023 5:35 AM, Igor Mammedov wrote:
> > On Mon, 16 Jan 2023 13:00:53 -0500
> > Chuck Zmudzinski <brchuckz@netscape.net> wrote:
> >  
> > > On 1/16/23 10:33, Igor Mammedov wrote:  
> > > > On Fri, 13 Jan 2023 16:31:26 -0500
> > > > Chuck Zmudzinski <brchuckz@aol.com> wrote:
> > > >     
> > > >> On 1/13/23 4:33 AM, Igor Mammedov wrote:    
> > > >> > On Thu, 12 Jan 2023 23:14:26 -0500
> > > >> > Chuck Zmudzinski <brchuckz@aol.com> wrote:
> > > >> >       
> > > >> >> On 1/12/23 6:03 PM, Michael S. Tsirkin wrote:      
> > > >> >> > On Thu, Jan 12, 2023 at 10:55:25PM +0000, Bernhard Beschow wrote:        
> > > >> >> >> I think the change Michael suggests is very minimalistic: Move the if
> > > >> >> >> condition around xen_igd_reserve_slot() into the function itself and
> > > >> >> >> always call it there unconditionally -- basically turning three lines
> > > >> >> >> into one. Since xen_igd_reserve_slot() seems very problem specific,
> > > >> >> >> Michael further suggests to rename it to something more general. All
> > > >> >> >> in all no big changes required.        
> > > >> >> > 
> > > >> >> > yes, exactly.
> > > >> >> >         
> > > >> >> 
> > > >> >> OK, got it. I can do that along with the other suggestions.      
> > > >> > 
> > > >> > have you considered instead of reservation, putting a slot check in device model
> > > >> > and if it's intel igd being passed through, fail at realize time  if it can't take
> > > >> > required slot (with a error directing user to fix command line)?      
> > > >> 
> > > >> Yes, but the core pci code currently already fails at realize time
> > > >> with a useful error message if the user tries to use slot 2 for the
> > > >> igd, because of the xen platform device which has slot 2. The user
> > > >> can fix this without patching qemu, but having the user fix it on
> > > >> the command line is not the best way to solve the problem, primarily
> > > >> because the user would need to hotplug the xen platform device via a
> > > >> command line option instead of having the xen platform device added by
> > > >> pc_xen_hvm_init functions almost immediately after creating the pci
> > > >> bus, and that delay in adding the xen platform device degrades
> > > >> startup performance of the guest.
> > > >>     
> > > >> > That could be less complicated than dealing with slot reservations at the cost of
> > > >> > being less convenient.      
> > > >> 
> > > >> And also a cost of reduced startup performance    
> > > > 
> > > > Could you clarify how it affects performance (and how much).
> > > > (as I see, setup done at board_init time is roughly the same
> > > > as with '-device foo' CLI options, modulo time needed to parse
> > > > options which should be negligible. and both ways are done before
> > > > guest runs)    
> > > 
> > > I preface my answer by saying there is a v9, but you don't
> > > need to look at that. I will answer all your questions here.
> > > 
> > > I am going by what I observe on the main HDMI display with the
> > > different approaches. With the approach of not patching Qemu
> > > to fix this, which requires adding the Xen platform device a
> > > little later, the length of time it takes to fully load the
> > > guest is increased. I also noticed with Linux guests that use
> > > the grub bootoader, the grub vga driver cannot display the
> > > grub boot menu at the native resolution of the display, which
> > > in the tested case is 1920x1080, when the Xen platform device
> > > is added via a command line option instead of by the
> > > pc_xen_hvm_init_pci fucntion in pc_piix.c, but with this patch
> > > to Qemu, the grub menu is displayed at the full, 1920x1080
> > > native resolution of the display. Once the guest fully loads,
> > > there is no noticeable difference in performance. It is mainly
> > > a degradation in startup performance, not performance once
> > > the guest OS is fully loaded.  
> > above hints on presence of bug[s] in igd-passthru implementation,
> > and this patch effectively hides problem instead of trying to figure
> > out what's wrong and fixing it.
> >  
> 
> I agree that the with the current Qemu there is a bug in the igd-passthru
> implementation. My proposed patch fixes it. So why not support the
> proposed patch with a Reviewed-by tag?
I see the patch as workaround (it might be a reasonable one) and
it's upto xen maintainers to give Reviewed-by/accept it.

Though I'd add to commit message something about performance issues
you are experiencing when trying to passthrough IGD manually
as one of justifications for workaround (it might push Xen folks
to investigate the issue and it won't be lost/forgotten on mail-list).

Re: [PATCH v8] xen/pt: reserve PCI slot 2 for Intel igd-passthru

Posted by Chuck Zmudzinski 1 year, 3 months ago

On 1/20/2023 11:01 AM, Igor Mammedov wrote:
> On Tue, 17 Jan 2023 09:50:23 -0500
> Chuck Zmudzinski <brchuckz@aol.com> wrote:
>
> > On 1/17/2023 5:35 AM, Igor Mammedov wrote:
> > > On Mon, 16 Jan 2023 13:00:53 -0500
> > > Chuck Zmudzinski <brchuckz@netscape.net> wrote:
> > >  
> > > > On 1/16/23 10:33, Igor Mammedov wrote:  
> > > > > On Fri, 13 Jan 2023 16:31:26 -0500
> > > > > Chuck Zmudzinski <brchuckz@aol.com> wrote:
> > > > >     
> > > > >> On 1/13/23 4:33 AM, Igor Mammedov wrote:    
> > > > >> > On Thu, 12 Jan 2023 23:14:26 -0500
> > > > >> > Chuck Zmudzinski <brchuckz@aol.com> wrote:
> > > > >> >       
> > > > >> >> On 1/12/23 6:03 PM, Michael S. Tsirkin wrote:      
> > > > >> >> > On Thu, Jan 12, 2023 at 10:55:25PM +0000, Bernhard Beschow wrote:        
> > > > >> >> >> I think the change Michael suggests is very minimalistic: Move the if
> > > > >> >> >> condition around xen_igd_reserve_slot() into the function itself and
> > > > >> >> >> always call it there unconditionally -- basically turning three lines
> > > > >> >> >> into one. Since xen_igd_reserve_slot() seems very problem specific,
> > > > >> >> >> Michael further suggests to rename it to something more general. All
> > > > >> >> >> in all no big changes required.        
> > > > >> >> > 
> > > > >> >> > yes, exactly.
> > > > >> >> >         
> > > > >> >> 
> > > > >> >> OK, got it. I can do that along with the other suggestions.      
> > > > >> > 
> > > > >> > have you considered instead of reservation, putting a slot check in device model
> > > > >> > and if it's intel igd being passed through, fail at realize time  if it can't take
> > > > >> > required slot (with a error directing user to fix command line)?      
> > > > >> 
> > > > >> Yes, but the core pci code currently already fails at realize time
> > > > >> with a useful error message if the user tries to use slot 2 for the
> > > > >> igd, because of the xen platform device which has slot 2. The user
> > > > >> can fix this without patching qemu, but having the user fix it on
> > > > >> the command line is not the best way to solve the problem, primarily
> > > > >> because the user would need to hotplug the xen platform device via a
> > > > >> command line option instead of having the xen platform device added by
> > > > >> pc_xen_hvm_init functions almost immediately after creating the pci
> > > > >> bus, and that delay in adding the xen platform device degrades
> > > > >> startup performance of the guest.
> > > > >>     
> > > > >> > That could be less complicated than dealing with slot reservations at the cost of
> > > > >> > being less convenient.      
> > > > >> 
> > > > >> And also a cost of reduced startup performance    
> > > > > 
> > > > > Could you clarify how it affects performance (and how much).
> > > > > (as I see, setup done at board_init time is roughly the same
> > > > > as with '-device foo' CLI options, modulo time needed to parse
> > > > > options which should be negligible. and both ways are done before
> > > > > guest runs)    
> > > > 
> > > > I preface my answer by saying there is a v9, but you don't
> > > > need to look at that. I will answer all your questions here.
> > > > 
> > > > I am going by what I observe on the main HDMI display with the
> > > > different approaches. With the approach of not patching Qemu
> > > > to fix this, which requires adding the Xen platform device a
> > > > little later, the length of time it takes to fully load the
> > > > guest is increased. I also noticed with Linux guests that use
> > > > the grub bootoader, the grub vga driver cannot display the
> > > > grub boot menu at the native resolution of the display, which
> > > > in the tested case is 1920x1080, when the Xen platform device
> > > > is added via a command line option instead of by the
> > > > pc_xen_hvm_init_pci fucntion in pc_piix.c, but with this patch
> > > > to Qemu, the grub menu is displayed at the full, 1920x1080
> > > > native resolution of the display. Once the guest fully loads,
> > > > there is no noticeable difference in performance. It is mainly
> > > > a degradation in startup performance, not performance once
> > > > the guest OS is fully loaded.  
> > > above hints on presence of bug[s] in igd-passthru implementation,
> > > and this patch effectively hides problem instead of trying to figure
> > > out what's wrong and fixing it.
> > >  
> > 
> > I agree that the with the current Qemu there is a bug in the igd-passthru
> > implementation. My proposed patch fixes it. So why not support the
> > proposed patch with a Reviewed-by tag?
> I see the patch as workaround (it might be a reasonable one) and
> it's upto xen maintainers to give Reviewed-by/accept it.
>
> Though I'd add to commit message something about performance issues
> you are experiencing when trying to passthrough IGD manually

No, the device that needs to be passed through manually with the
workaround is the Xen platform device, and the workaround (as I
understand it) is the patch to libxl, not this patch to Qemu. The igd
is passed through the same way whether or not Qemu or libxl is
patched, with these patches I have proposed, and IIUC that is by
using QMP and the Qemu device listener.

> as one of justifications for workaround (it might push Xen folks
> to investigate the issue and it won't be lost/forgotten on mail-list).
>

I could add that on the next iteration.

As far as Xen folks investigating further, that would be welcome.
I think the best way would be for the pc_xen_hvm_init functions
to add the igd at slot 2 and the xen platform device at slot 3
when igd-passthru=on is set on the command line. But I don't
know if it is possible for the pc_xen_hvm_init functions to
attach a type XEN_PT, which is the device type of the igd when
using xen, without changing the interface between libxl and
Qemu. My patches fix this without changing that interface,
but the Qemu patch does it in a way that achieves better
startup performance that the patch to libxl because of the
fact that the xen platform device can be added sooner
during guest creation by patching Qemu than by patching
libxl.

Currently, IIUC, the XEN_PT devices are added through QMP
and the Qemu device listener. If there was a way for Qemu
(the pc_xen_hvm_init_pci function) to actively attach the igd
instead of having libxl add it via QMP and the Qemu device
listener, that would also work and might improve performance
more because then both the igd and the xen platform device
would be added early during the creation of the guest.

Thanks.

Re: [PATCH v8] xen/pt: reserve PCI slot 2 for Intel igd-passthru

Posted by Chuck Zmudzinski 1 year, 3 months ago

On 1/17/2023 5:35 AM, Igor Mammedov wrote:
> On Mon, 16 Jan 2023 13:00:53 -0500
> Chuck Zmudzinski <brchuckz@netscape.net> wrote:
>
> > On 1/16/23 10:33, Igor Mammedov wrote:
> > > On Fri, 13 Jan 2023 16:31:26 -0500
> > > Chuck Zmudzinski <brchuckz@aol.com> wrote:
> > >   
> > >> On 1/13/23 4:33 AM, Igor Mammedov wrote:  
> > >> > On Thu, 12 Jan 2023 23:14:26 -0500
> > >> > Chuck Zmudzinski <brchuckz@aol.com> wrote:
> > >> >     
> > >> >> On 1/12/23 6:03 PM, Michael S. Tsirkin wrote:    
> > >> >> > On Thu, Jan 12, 2023 at 10:55:25PM +0000, Bernhard Beschow wrote:      
> > >> >> >> I think the change Michael suggests is very minimalistic: Move the if
> > >> >> >> condition around xen_igd_reserve_slot() into the function itself and
> > >> >> >> always call it there unconditionally -- basically turning three lines
> > >> >> >> into one. Since xen_igd_reserve_slot() seems very problem specific,
> > >> >> >> Michael further suggests to rename it to something more general. All
> > >> >> >> in all no big changes required.      
> > >> >> > 
> > >> >> > yes, exactly.
> > >> >> >       
> > >> >> 
> > >> >> OK, got it. I can do that along with the other suggestions.    
> > >> > 
> > >> > have you considered instead of reservation, putting a slot check in device model
> > >> > and if it's intel igd being passed through, fail at realize time  if it can't take
> > >> > required slot (with a error directing user to fix command line)?    
> > >> 
> > >> Yes, but the core pci code currently already fails at realize time
> > >> with a useful error message if the user tries to use slot 2 for the
> > >> igd, because of the xen platform device which has slot 2. The user
> > >> can fix this without patching qemu, but having the user fix it on
> > >> the command line is not the best way to solve the problem, primarily
> > >> because the user would need to hotplug the xen platform device via a
> > >> command line option instead of having the xen platform device added by
> > >> pc_xen_hvm_init functions almost immediately after creating the pci
> > >> bus, and that delay in adding the xen platform device degrades
> > >> startup performance of the guest.
> > >>   
> > >> > That could be less complicated than dealing with slot reservations at the cost of
> > >> > being less convenient.    
> > >> 
> > >> And also a cost of reduced startup performance  
> > > 
> > > Could you clarify how it affects performance (and how much).
> > > (as I see, setup done at board_init time is roughly the same
> > > as with '-device foo' CLI options, modulo time needed to parse
> > > options which should be negligible. and both ways are done before
> > > guest runs)  
> > 
> > I preface my answer by saying there is a v9, but you don't
> > need to look at that. I will answer all your questions here.
> > 
> > I am going by what I observe on the main HDMI display with the
> > different approaches. With the approach of not patching Qemu
> > to fix this, which requires adding the Xen platform device a
> > little later, the length of time it takes to fully load the
> > guest is increased. I also noticed with Linux guests that use
> > the grub bootoader, the grub vga driver cannot display the
> > grub boot menu at the native resolution of the display, which
> > in the tested case is 1920x1080, when the Xen platform device
> > is added via a command line option instead of by the
> > pc_xen_hvm_init_pci fucntion in pc_piix.c, but with this patch
> > to Qemu, the grub menu is displayed at the full, 1920x1080
> > native resolution of the display. Once the guest fully loads,
> > there is no noticeable difference in performance. It is mainly
> > a degradation in startup performance, not performance once
> > the guest OS is fully loaded.
> above hints on presence of bug[s] in igd-passthru implementation,
> and this patch effectively hides problem instead of trying to figure
> out what's wrong and fixing it.
>

Why did you delete the rest of my answers to your other observations and
not respond to them?

Re: [PATCH v8] xen/pt: reserve PCI slot 2 for Intel igd-passthru

Posted by Igor Mammedov 1 year, 3 months ago

On Tue, 17 Jan 2023 09:43:52 -0500
Chuck Zmudzinski <brchuckz@aol.com> wrote:

> On 1/17/2023 5:35 AM, Igor Mammedov wrote:
> > On Mon, 16 Jan 2023 13:00:53 -0500
> > Chuck Zmudzinski <brchuckz@netscape.net> wrote:
> >  
> > > On 1/16/23 10:33, Igor Mammedov wrote:  
> > > > On Fri, 13 Jan 2023 16:31:26 -0500
> > > > Chuck Zmudzinski <brchuckz@aol.com> wrote:
> > > >     
> > > >> On 1/13/23 4:33 AM, Igor Mammedov wrote:    
> > > >> > On Thu, 12 Jan 2023 23:14:26 -0500
> > > >> > Chuck Zmudzinski <brchuckz@aol.com> wrote:
> > > >> >       
> > > >> >> On 1/12/23 6:03 PM, Michael S. Tsirkin wrote:      
> > > >> >> > On Thu, Jan 12, 2023 at 10:55:25PM +0000, Bernhard Beschow wrote:        
> > > >> >> >> I think the change Michael suggests is very minimalistic: Move the if
> > > >> >> >> condition around xen_igd_reserve_slot() into the function itself and
> > > >> >> >> always call it there unconditionally -- basically turning three lines
> > > >> >> >> into one. Since xen_igd_reserve_slot() seems very problem specific,
> > > >> >> >> Michael further suggests to rename it to something more general. All
> > > >> >> >> in all no big changes required.        
> > > >> >> > 
> > > >> >> > yes, exactly.
> > > >> >> >         
> > > >> >> 
> > > >> >> OK, got it. I can do that along with the other suggestions.      
> > > >> > 
> > > >> > have you considered instead of reservation, putting a slot check in device model
> > > >> > and if it's intel igd being passed through, fail at realize time  if it can't take
> > > >> > required slot (with a error directing user to fix command line)?      
> > > >> 
> > > >> Yes, but the core pci code currently already fails at realize time
> > > >> with a useful error message if the user tries to use slot 2 for the
> > > >> igd, because of the xen platform device which has slot 2. The user
> > > >> can fix this without patching qemu, but having the user fix it on
> > > >> the command line is not the best way to solve the problem, primarily
> > > >> because the user would need to hotplug the xen platform device via a
> > > >> command line option instead of having the xen platform device added by
> > > >> pc_xen_hvm_init functions almost immediately after creating the pci
> > > >> bus, and that delay in adding the xen platform device degrades
> > > >> startup performance of the guest.
> > > >>     
> > > >> > That could be less complicated than dealing with slot reservations at the cost of
> > > >> > being less convenient.      
> > > >> 
> > > >> And also a cost of reduced startup performance    
> > > > 
> > > > Could you clarify how it affects performance (and how much).
> > > > (as I see, setup done at board_init time is roughly the same
> > > > as with '-device foo' CLI options, modulo time needed to parse
> > > > options which should be negligible. and both ways are done before
> > > > guest runs)    
> > > 
> > > I preface my answer by saying there is a v9, but you don't
> > > need to look at that. I will answer all your questions here.
> > > 
> > > I am going by what I observe on the main HDMI display with the
> > > different approaches. With the approach of not patching Qemu
> > > to fix this, which requires adding the Xen platform device a
> > > little later, the length of time it takes to fully load the
> > > guest is increased. I also noticed with Linux guests that use
> > > the grub bootoader, the grub vga driver cannot display the
> > > grub boot menu at the native resolution of the display, which
> > > in the tested case is 1920x1080, when the Xen platform device
> > > is added via a command line option instead of by the
> > > pc_xen_hvm_init_pci fucntion in pc_piix.c, but with this patch
> > > to Qemu, the grub menu is displayed at the full, 1920x1080
> > > native resolution of the display. Once the guest fully loads,
> > > there is no noticeable difference in performance. It is mainly
> > > a degradation in startup performance, not performance once
> > > the guest OS is fully loaded.  
> > above hints on presence of bug[s] in igd-passthru implementation,
> > and this patch effectively hides problem instead of trying to figure
> > out what's wrong and fixing it.
> >  
> 
> Why did you delete the rest of my answers to your other observations and
> not respond to them?

they are preserved on mail-list in you previous email.
It's usually fine to trim non relevant parts and keep only part/context
that is being replied to.

I didn't want to argue point that it's usually job of management
arrange correct IGD passthrough for QEMU like Alex pointed out.
Hence those part got trimmed.


>

Re: [PATCH v8] xen/pt: reserve PCI slot 2 for Intel igd-passthru

Posted by Chuck Zmudzinski 1 year, 3 months ago

On 1/10/2023 3:16 AM, Michael S. Tsirkin wrote:
> On Tue, Jan 10, 2023 at 02:08:34AM -0500, Chuck Zmudzinski wrote:
> > Intel specifies that the Intel IGD must occupy slot 2 on the PCI bus,
> > as noted in docs/igd-assign.txt in the Qemu source code.
> > 
> > Currently, when the xl toolstack is used to configure a Xen HVM guest with
> > Intel IGD passthrough to the guest with the Qemu upstream device model,
> > a Qemu emulated PCI device will occupy slot 2 and the Intel IGD will occupy
> > a different slot. This problem often prevents the guest from booting.
> > 
> > The only available workaround is not good: Configure Xen HVM guests to use
> > the old and no longer maintained Qemu traditional device model available
> > from xenbits.xen.org which does reserve slot 2 for the Intel IGD.
> > 
> > To implement this feature in the Qemu upstream device model for Xen HVM
> > guests, introduce the following new functions, types, and macros:
> > 
> > * XEN_PT_DEVICE_CLASS declaration, based on the existing TYPE_XEN_PT_DEVICE
> > * XEN_PT_DEVICE_GET_CLASS macro helper function for XEN_PT_DEVICE_CLASS
> > * typedef XenPTQdevRealize function pointer
> > * XEN_PCI_IGD_SLOT_MASK, the value of slot_reserved_mask to reserve slot 2
> > * xen_igd_reserve_slot and xen_igd_clear_slot functions
> > 
> > The new xen_igd_reserve_slot function uses the existing slot_reserved_mask
> > member of PCIBus to reserve PCI slot 2 for Xen HVM guests configured using
> > the xl toolstack with the gfx_passthru option enabled, which sets the
> > igd-passthru=on option to Qemu for the Xen HVM machine type.
> > 
> > The new xen_igd_reserve_slot function also needs to be implemented in
> > hw/xen/xen_pt_stub.c to prevent FTBFS during the link stage for the case
> > when Qemu is configured with --enable-xen and --disable-xen-pci-passthrough,
> > in which case it does nothing.
> > 
> > The new xen_igd_clear_slot function overrides qdev->realize of the parent
> > PCI device class to enable the Intel IGD to occupy slot 2 on the PCI bus
> > since slot 2 was reserved by xen_igd_reserve_slot when the PCI bus was
> > created in hw/i386/pc_piix.c for the case when igd-passthru=on.
> > 
> > Move the call to xen_host_pci_device_get, and the associated error
> > handling, from xen_pt_realize to the new xen_igd_clear_slot function to
> > initialize the device class and vendor values which enables the checks for
> > the Intel IGD to succeed. The verification that the host device is an
> > Intel IGD to be passed through is done by checking the domain, bus, slot,
> > and function values as well as by checking that gfx_passthru is enabled,
> > the device class is VGA, and the device vendor in Intel.
> > 
> > Signed-off-by: Chuck Zmudzinski <brchuckz@aol.com>
> > ---
> > Notes that might be helpful to reviewers of patched code in hw/xen:
> > 
> > The new functions and types are based on recommendations from Qemu docs:
> > https://qemu.readthedocs.io/en/latest/devel/qom.html
> > 
> > Notes that might be helpful to reviewers of patched code in hw/i386:
> > 
> > The small patch to hw/i386/pc_piix.c is protected by CONFIG_XEN so it does
> > not affect builds that do not have CONFIG_XEN defined.
> > 
> > xen_igd_gfx_pt_enabled() in the patched hw/i386/pc_piix.c file is an
> > existing function that is only true when Qemu is built with
> > xen-pci-passthrough enabled and the administrator has configured the Xen
> > HVM guest with Qemu's igd-passthru=on option.
> > 
> > v2: Remove From: <email address> tag at top of commit message
> > 
> > v3: Changed the test for the Intel IGD in xen_igd_clear_slot:
> > 
> >     if (is_igd_vga_passthrough(&s->real_device) &&
> >         (s->real_device.vendor_id == PCI_VENDOR_ID_INTEL)) {
> > 
> >     is changed to
> > 
> >     if (xen_igd_gfx_pt_enabled() && (s->hostaddr.slot == 2)
> >         && (s->hostaddr.function == 0)) {
> > 
> >     I hoped that I could use the test in v2, since it matches the
> >     other tests for the Intel IGD in Qemu and Xen, but those tests
> >     do not work because the necessary data structures are not set with
> >     their values yet. So instead use the test that the administrator
> >     has enabled gfx_passthru and the device address on the host is
> >     02.0. This test does detect the Intel IGD correctly.
> > 
> > v4: Use brchuckz@aol.com instead of brchuckz@netscape.net for the author's
> >     email address to match the address used by the same author in commits
> >     be9c61da and c0e86b76
> >     
> >     Change variable for XEN_PT_DEVICE_CLASS: xptc changed to xpdc
> > 
> > v5: The patch of xen_pt.c was re-worked to allow a more consistent test
> >     for the Intel IGD that uses the same criteria as in other places.
> >     This involved moving the call to xen_host_pci_device_get from
> >     xen_pt_realize to xen_igd_clear_slot and updating the checks for the
> >     Intel IGD in xen_igd_clear_slot:
> >     
> >     if (xen_igd_gfx_pt_enabled() && (s->hostaddr.slot == 2)
> >         && (s->hostaddr.function == 0)) {
> > 
> >     is changed to
> > 
> >     if (is_igd_vga_passthrough(&s->real_device) &&
> >         s->real_device.domain == 0 && s->real_device.bus == 0 &&
> >         s->real_device.dev == 2 && s->real_device.func == 0 &&
> >         s->real_device.vendor_id == PCI_VENDOR_ID_INTEL) {
> > 
> >     Added an explanation for the move of xen_host_pci_device_get from
> >     xen_pt_realize to xen_igd_clear_slot to the commit message.
> > 
> >     Rebase.
> > 
> > v6: Fix logging by removing these lines from the move from xen_pt_realize
> >     to xen_igd_clear_slot that was done in v5:
> > 
> >     XEN_PT_LOG(d, "Assigning real physical device %02x:%02x.%d"
> >                " to devfn 0x%x\n",
> >                s->hostaddr.bus, s->hostaddr.slot, s->hostaddr.function,
> >                s->dev.devfn);
> > 
> >     This log needs to be in xen_pt_realize because s->dev.devfn is not
> >     set yet in xen_igd_clear_slot.
> > 
> > v7: The v7 that was posted to the mailing list was incorrect. v8 is what
> >     v7 was intended to be.
> > 
> > v8: Inhibit out of context log message and needless processing by
> >     adding 2 lines at the top of the new xen_igd_clear_slot function:
> > 
> >     if (!(pci_bus->slot_reserved_mask & XEN_PCI_IGD_SLOT_MASK))
> >         return;
> > 
> >     Rebase. This removed an unnecessary header file from xen_pt.h 
> > 
> >  hw/i386/pc_piix.c    |  3 +++
> >  hw/xen/xen_pt.c      | 49 ++++++++++++++++++++++++++++++++++++--------
> >  hw/xen/xen_pt.h      | 16 +++++++++++++++
> >  hw/xen/xen_pt_stub.c |  4 ++++
> >  4 files changed, 63 insertions(+), 9 deletions(-)
> > 
> > diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
> > index b48047f50c..bc5efa4f59 100644
> > --- a/hw/i386/pc_piix.c
> > +++ b/hw/i386/pc_piix.c
> > @@ -405,6 +405,9 @@ static void pc_xen_hvm_init(MachineState *machine)
> >      }
> >  
> >      pc_xen_hvm_init_pci(machine);
> > +    if (xen_igd_gfx_pt_enabled()) {
> > +        xen_igd_reserve_slot(pcms->bus);
> > +    }
> >      pci_create_simple(pcms->bus, -1, "xen-platform");
> >  }
> >  #endif
>
> I would even maybe go further and move the whole logic into
> xen_igd_reserve_slot. And I would even just name it
> xen_hvm_init_reserved_slots without worrying about the what
> or why at the pc level.  At this point it will be up to Xen maintainers.

I will try to do that for v9. That would reduce, rather than increase, the
technical debt. I actually wanted to move all the xen specific stuff in
pc_piix.c to xen-hvm.c. That would reduce the technical debt that
has accumulated over the years. I just couldn't see how to do it easily.
It looked I would need to do violence to pc_init1. But, I suppose,
pc_init1 is the kind of function that has accumulated lots of technical
debt and needs some violence done to it! I will give it a try, but it
might take me a while. I think if it was easy, someone would have
done it by now.

Thanks,

Chuck

>
> > diff --git a/hw/xen/xen_pt.c b/hw/xen/xen_pt.c
> > index 0ec7e52183..eff38155ef 100644
> > --- a/hw/xen/xen_pt.c
> > +++ b/hw/xen/xen_pt.c
> > @@ -780,15 +780,6 @@ static void xen_pt_realize(PCIDevice *d, Error **errp)
> >                 s->hostaddr.bus, s->hostaddr.slot, s->hostaddr.function,
> >                 s->dev.devfn);
> >  
> > -    xen_host_pci_device_get(&s->real_device,
> > -                            s->hostaddr.domain, s->hostaddr.bus,
> > -                            s->hostaddr.slot, s->hostaddr.function,
> > -                            errp);
> > -    if (*errp) {
> > -        error_append_hint(errp, "Failed to \"open\" the real pci device");
> > -        return;
> > -    }
> > -
> >      s->is_virtfn = s->real_device.is_virtfn;
> >      if (s->is_virtfn) {
> >          XEN_PT_LOG(d, "%04x:%02x:%02x.%d is a SR-IOV Virtual Function\n",
> > @@ -950,11 +941,50 @@ static void xen_pci_passthrough_instance_init(Object *obj)
> >      PCI_DEVICE(obj)->cap_present |= QEMU_PCI_CAP_EXPRESS;
> >  }
> >  
> > +void xen_igd_reserve_slot(PCIBus *pci_bus)
> > +{
> > +    XEN_PT_LOG(0, "Reserving PCI slot 2 for IGD\n");
> > +    pci_bus->slot_reserved_mask |= XEN_PCI_IGD_SLOT_MASK;
> > +}
> > +
> > +static void xen_igd_clear_slot(DeviceState *qdev, Error **errp)
> > +{
> > +    ERRP_GUARD();
> > +    PCIDevice *pci_dev = (PCIDevice *)qdev;
> > +    XenPCIPassthroughState *s = XEN_PT_DEVICE(pci_dev);
> > +    XenPTDeviceClass *xpdc = XEN_PT_DEVICE_GET_CLASS(s);
> > +    PCIBus *pci_bus = pci_get_bus(pci_dev);
> > +
> > +    if (!(pci_bus->slot_reserved_mask & XEN_PCI_IGD_SLOT_MASK))
> > +        return;
> > +
> > +    xen_host_pci_device_get(&s->real_device,
> > +                            s->hostaddr.domain, s->hostaddr.bus,
> > +                            s->hostaddr.slot, s->hostaddr.function,
> > +                            errp);
> > +    if (*errp) {
> > +        error_append_hint(errp, "Failed to \"open\" the real pci device");
> > +        return;
> > +    }
> > +
> > +    if (is_igd_vga_passthrough(&s->real_device) &&
> > +        s->real_device.domain == 0 && s->real_device.bus == 0 &&
> > +        s->real_device.dev == 2 && s->real_device.func == 0 &&
> > +        s->real_device.vendor_id == PCI_VENDOR_ID_INTEL) {
>
> how about macros for these?
>
> #define XEN_PCI_IGD_DOMAIN 0
> #define XEN_PCI_IGD_BUS 0
> #define XEN_PCI_IGD_DEV 2
> #define XEN_PCI_IGD_FN 0
>
> > +        pci_bus->slot_reserved_mask &= ~XEN_PCI_IGD_SLOT_MASK;
>
> If you are going to do this, you should set it back
> either after pci_qdev_realize or in unrealize,
> for symmetry.
>
> > +        XEN_PT_LOG(pci_dev, "Intel IGD found, using slot 2\n");
> > +    }
>
>
> > +    xpdc->pci_qdev_realize(qdev, errp);
> > +}
> > +
>
>
>
> >  static void xen_pci_passthrough_class_init(ObjectClass *klass, void *data)
> >  {
> >      DeviceClass *dc = DEVICE_CLASS(klass);
> >      PCIDeviceClass *k = PCI_DEVICE_CLASS(klass);
> >  
> > +    XenPTDeviceClass *xpdc = XEN_PT_DEVICE_CLASS(klass);
> > +    xpdc->pci_qdev_realize = dc->realize;
> > +    dc->realize = xen_igd_clear_slot;
> >      k->realize = xen_pt_realize;
> >      k->exit = xen_pt_unregister_device;
> >      k->config_read = xen_pt_pci_read_config;
> > @@ -977,6 +1007,7 @@ static const TypeInfo xen_pci_passthrough_info = {
> >      .instance_size = sizeof(XenPCIPassthroughState),
> >      .instance_finalize = xen_pci_passthrough_finalize,
> >      .class_init = xen_pci_passthrough_class_init,
> > +    .class_size = sizeof(XenPTDeviceClass),
> >      .instance_init = xen_pci_passthrough_instance_init,
> >      .interfaces = (InterfaceInfo[]) {
> >          { INTERFACE_CONVENTIONAL_PCI_DEVICE },
> > diff --git a/hw/xen/xen_pt.h b/hw/xen/xen_pt.h
> > index cf10fc7bbf..8c25932b4b 100644
> > --- a/hw/xen/xen_pt.h
> > +++ b/hw/xen/xen_pt.h
> > @@ -2,6 +2,7 @@
> >  #define XEN_PT_H
> >  
> >  #include "hw/xen/xen_common.h"
> > +#include "hw/pci/pci_bus.h"
> >  #include "xen-host-pci-device.h"
> >  #include "qom/object.h"
> >  
> > @@ -40,7 +41,20 @@ typedef struct XenPTReg XenPTReg;
> >  #define TYPE_XEN_PT_DEVICE "xen-pci-passthrough"
> >  OBJECT_DECLARE_SIMPLE_TYPE(XenPCIPassthroughState, XEN_PT_DEVICE)
> >  
> > +#define XEN_PT_DEVICE_CLASS(klass) \
> > +    OBJECT_CLASS_CHECK(XenPTDeviceClass, klass, TYPE_XEN_PT_DEVICE)
> > +#define XEN_PT_DEVICE_GET_CLASS(obj) \
> > +    OBJECT_GET_CLASS(XenPTDeviceClass, obj, TYPE_XEN_PT_DEVICE)
> > +
> > +typedef void (*XenPTQdevRealize)(DeviceState *qdev, Error **errp);
> > +
> > +typedef struct XenPTDeviceClass {
> > +    PCIDeviceClass parent_class;
> > +    XenPTQdevRealize pci_qdev_realize;
> > +} XenPTDeviceClass;
> > +
> >  uint32_t igd_read_opregion(XenPCIPassthroughState *s);
> > +void xen_igd_reserve_slot(PCIBus *pci_bus);
> >  void igd_write_opregion(XenPCIPassthroughState *s, uint32_t val);
> >  void xen_igd_passthrough_isa_bridge_create(XenPCIPassthroughState *s,
> >                                             XenHostPCIDevice *dev);
> > @@ -75,6 +89,8 @@ typedef int (*xen_pt_conf_byte_read)
> >  
> >  #define XEN_PCI_INTEL_OPREGION 0xfc
> >  
> > +#define XEN_PCI_IGD_SLOT_MASK 0x4UL /* Intel IGD slot_reserved_mask */
> > +
>
> I think you want to calculate this based on dev fn:
>
> #define XEN_PCI_IGD_SLOT_MASK \
> 	(0x1 << PCI_SLOT(PCI_DEVFN(XEN_PCI_IGD_DEV, XEN_PCI_IGD_FN)))
>
>
> >  typedef enum {
> >      XEN_PT_GRP_TYPE_HARDWIRED = 0,  /* 0 Hardwired reg group */
> >      XEN_PT_GRP_TYPE_EMU,            /* emul reg group */
> > diff --git a/hw/xen/xen_pt_stub.c b/hw/xen/xen_pt_stub.c
> > index 2d8cac8d54..5c108446a8 100644
> > --- a/hw/xen/xen_pt_stub.c
> > +++ b/hw/xen/xen_pt_stub.c
> > @@ -20,3 +20,7 @@ void xen_igd_gfx_pt_set(bool value, Error **errp)
> >          error_setg(errp, "Xen PCI passthrough support not built in");
> >      }
> >  }
> > +
> > +void xen_igd_reserve_slot(PCIBus *pci_bus)
> > +{
> > +}
> > -- 
> > 2.39.0
>