[Qemu-devel] [PATCH qemu] RFC: vfio-pci: Allow mmap of MSIX BAR

Alexey Kardashevskiy posted 1 patch 6 years, 4 months ago
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/qemu tags/patchew/20171212052131.24649-1-aik@ozlabs.ru
Test checkpatch passed
Test docker passed
Test ppc passed
Test s390x passed
There is a newer version of this series
hw/vfio/pci.h                 |  1 +
include/hw/vfio/vfio-common.h |  1 +
linux-headers/linux/vfio.h    |  5 +++++
hw/ppc/spapr.c                | 10 +++++++++-
hw/vfio/common.c              | 15 +++++++++++++++
hw/vfio/pci.c                 | 11 +++++++++++
6 files changed, 42 insertions(+), 1 deletion(-)
[Qemu-devel] [PATCH qemu] RFC: vfio-pci: Allow mmap of MSIX BAR
Posted by Alexey Kardashevskiy 6 years, 4 months ago
This makes use of a new VFIO_REGION_INFO_CAP_MSIX_MAPPABLE capability
which tells that a region with MSIX data can be mapped entirely, i.e.
the VFIO PCI driver won't prevent MSIX vectors area from being mapped.

This adds a "msix-no-mmap" property to the vfio-pci device, it is "true"
by default and "false" for pseries-2.12+ machines.

This requites kernel's "vfio-pci: Allow mapping MSIX BAR"
https://www.spinics.net/lists/kvm/msg160282.html

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
---

This is an RFC as it requires kernel headers update which is not there yet.

I'd like to make it "msix-mmap" (without "no") but could not find a way
of enabling a device property for machine versions newer than some value.

I changed 2.11 machine just for the demonstration purpose.


---
 hw/vfio/pci.h                 |  1 +
 include/hw/vfio/vfio-common.h |  1 +
 linux-headers/linux/vfio.h    |  5 +++++
 hw/ppc/spapr.c                | 10 +++++++++-
 hw/vfio/common.c              | 15 +++++++++++++++
 hw/vfio/pci.c                 | 11 +++++++++++
 6 files changed, 42 insertions(+), 1 deletion(-)

diff --git a/hw/vfio/pci.h b/hw/vfio/pci.h
index a8fb3b3..53912ef 100644
--- a/hw/vfio/pci.h
+++ b/hw/vfio/pci.h
@@ -142,6 +142,7 @@ typedef struct VFIOPCIDevice {
     bool no_kvm_intx;
     bool no_kvm_msi;
     bool no_kvm_msix;
+    bool msix_no_mmap;
 } VFIOPCIDevice;
 
 uint32_t vfio_pci_read_config(PCIDevice *pdev, uint32_t addr, int len);
diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
index f3a2ac9..927d600 100644
--- a/include/hw/vfio/vfio-common.h
+++ b/include/hw/vfio/vfio-common.h
@@ -171,6 +171,7 @@ int vfio_get_region_info(VFIODevice *vbasedev, int index,
                          struct vfio_region_info **info);
 int vfio_get_dev_region_info(VFIODevice *vbasedev, uint32_t type,
                              uint32_t subtype, struct vfio_region_info **info);
+bool vfio_is_cap_present(VFIODevice *vbasedev, uint16_t cap_type, int region);
 #endif
 extern const MemoryListener vfio_prereg_listener;
 
diff --git a/linux-headers/linux/vfio.h b/linux-headers/linux/vfio.h
index 4e7ab4c..bce9baf 100644
--- a/linux-headers/linux/vfio.h
+++ b/linux-headers/linux/vfio.h
@@ -300,6 +300,11 @@ struct vfio_region_info_cap_type {
 #define VFIO_REGION_SUBTYPE_INTEL_IGD_HOST_CFG	(2)
 #define VFIO_REGION_SUBTYPE_INTEL_IGD_LPC_CFG	(3)
 
+/*
+ * The MSIX mappable capability informs that MSIX data of a BAR can be mmapped.
+ */
+#define VFIO_REGION_INFO_CAP_MSIX_MAPPABLE	3
+
 /**
  * VFIO_DEVICE_GET_IRQ_INFO - _IOWR(VFIO_TYPE, VFIO_BASE + 9,
  *				    struct vfio_irq_info)
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 9de63f0..1dfc386 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -3742,13 +3742,21 @@ static const TypeInfo spapr_machine_info = {
 /*
  * pseries-2.11
  */
+#define SPAPR_COMPAT_2_11                                             \
+    HW_COMPAT_2_10                                                    \
+    {                                                                 \
+        .driver = "vfio-pci",                                         \
+        .property = "msix-no-mmap",                                   \
+        .value    = "on",                                             \
+    },                                                                \
+
 static void spapr_machine_2_11_instance_options(MachineState *machine)
 {
 }
 
 static void spapr_machine_2_11_class_options(MachineClass *mc)
 {
-    /* Defaults for the latest behaviour inherited from the base class */
+    SET_MACHINE_COMPAT(mc, SPAPR_COMPAT_2_11);
 }
 
 DEFINE_SPAPR_MACHINE(2_11, "2.11", true);
diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index ed7717d..593514c 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -1408,6 +1408,21 @@ int vfio_get_dev_region_info(VFIODevice *vbasedev, uint32_t type,
     return -ENODEV;
 }
 
+bool vfio_is_cap_present(VFIODevice *vbasedev, uint16_t cap_type, int region)
+{
+        struct vfio_region_info *info = NULL;
+        bool ret = false;
+
+        if (!vfio_get_region_info(vbasedev, region, &info)) {
+            if (vfio_get_region_info_cap(info, cap_type)) {
+                ret = true;
+            }
+            g_free(info);
+        }
+
+        return ret;
+}
+
 /*
  * Interfaces for IBM EEH (Enhanced Error Handling)
  */
diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index c977ee3..d9aeae8 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -1289,6 +1289,12 @@ static void vfio_pci_fixup_msix_region(VFIOPCIDevice *vdev)
     off_t start, end;
     VFIORegion *region = &vdev->bars[vdev->msix->table_bar].region;
 
+    if (!vdev->msix_no_mmap &&
+        vfio_is_cap_present(&vdev->vbasedev, VFIO_REGION_INFO_CAP_MSIX_MAPPABLE,
+                            vdev->msix->table_bar)) {
+        return;
+    }
+
     /*
      * We expect to find a single mmap covering the whole BAR, anything else
      * means it's either unsupported or already setup.
@@ -1473,6 +1479,10 @@ static int vfio_msix_setup(VFIOPCIDevice *vdev, int pos, Error **errp)
      */
     memory_region_set_enabled(&vdev->pdev.msix_pba_mmio, false);
 
+    if (!vdev->msix_no_mmap) {
+        memory_region_set_enabled(&vdev->pdev.msix_table_mmio, false);
+    }
+
     return 0;
 }
 
@@ -2986,6 +2996,7 @@ static Property vfio_pci_dev_properties[] = {
     DEFINE_PROP_BIT("x-igd-opregion", VFIOPCIDevice, features,
                     VFIO_FEATURE_ENABLE_IGD_OPREGION_BIT, false),
     DEFINE_PROP_BOOL("x-no-mmap", VFIOPCIDevice, vbasedev.no_mmap, false),
+    DEFINE_PROP_BOOL("msix-no-mmap", VFIOPCIDevice, msix_no_mmap, true),
     DEFINE_PROP_BOOL("x-no-kvm-intx", VFIOPCIDevice, no_kvm_intx, false),
     DEFINE_PROP_BOOL("x-no-kvm-msi", VFIOPCIDevice, no_kvm_msi, false),
     DEFINE_PROP_BOOL("x-no-kvm-msix", VFIOPCIDevice, no_kvm_msix, false),
-- 
2.11.0


Re: [Qemu-devel] [PATCH qemu] RFC: vfio-pci: Allow mmap of MSIX BAR
Posted by Alexey Kardashevskiy 6 years, 4 months ago
On 12/12/17 16:21, Alexey Kardashevskiy wrote:
> This makes use of a new VFIO_REGION_INFO_CAP_MSIX_MAPPABLE capability
> which tells that a region with MSIX data can be mapped entirely, i.e.
> the VFIO PCI driver won't prevent MSIX vectors area from being mapped.
> 
> This adds a "msix-no-mmap" property to the vfio-pci device, it is "true"
> by default and "false" for pseries-2.12+ machines.
> 
> This requites kernel's "vfio-pci: Allow mapping MSIX BAR"
> https://www.spinics.net/lists/kvm/msg160282.html
> 
> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> ---
> 
> This is an RFC as it requires kernel headers update which is not there yet.
> 
> I'd like to make it "msix-mmap" (without "no") but could not find a way
> of enabling a device property for machine versions newer than some value.

Ah, this remark is wrong, making it "no" property does not help.

How do we enforce some property on some device depending on a machine type?



> 
> I changed 2.11 machine just for the demonstration purpose.
> 
> 
> ---
>  hw/vfio/pci.h                 |  1 +
>  include/hw/vfio/vfio-common.h |  1 +
>  linux-headers/linux/vfio.h    |  5 +++++
>  hw/ppc/spapr.c                | 10 +++++++++-
>  hw/vfio/common.c              | 15 +++++++++++++++
>  hw/vfio/pci.c                 | 11 +++++++++++
>  6 files changed, 42 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/vfio/pci.h b/hw/vfio/pci.h
> index a8fb3b3..53912ef 100644
> --- a/hw/vfio/pci.h
> +++ b/hw/vfio/pci.h
> @@ -142,6 +142,7 @@ typedef struct VFIOPCIDevice {
>      bool no_kvm_intx;
>      bool no_kvm_msi;
>      bool no_kvm_msix;
> +    bool msix_no_mmap;
>  } VFIOPCIDevice;
>  
>  uint32_t vfio_pci_read_config(PCIDevice *pdev, uint32_t addr, int len);
> diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
> index f3a2ac9..927d600 100644
> --- a/include/hw/vfio/vfio-common.h
> +++ b/include/hw/vfio/vfio-common.h
> @@ -171,6 +171,7 @@ int vfio_get_region_info(VFIODevice *vbasedev, int index,
>                           struct vfio_region_info **info);
>  int vfio_get_dev_region_info(VFIODevice *vbasedev, uint32_t type,
>                               uint32_t subtype, struct vfio_region_info **info);
> +bool vfio_is_cap_present(VFIODevice *vbasedev, uint16_t cap_type, int region);
>  #endif
>  extern const MemoryListener vfio_prereg_listener;
>  
> diff --git a/linux-headers/linux/vfio.h b/linux-headers/linux/vfio.h
> index 4e7ab4c..bce9baf 100644
> --- a/linux-headers/linux/vfio.h
> +++ b/linux-headers/linux/vfio.h
> @@ -300,6 +300,11 @@ struct vfio_region_info_cap_type {
>  #define VFIO_REGION_SUBTYPE_INTEL_IGD_HOST_CFG	(2)
>  #define VFIO_REGION_SUBTYPE_INTEL_IGD_LPC_CFG	(3)
>  
> +/*
> + * The MSIX mappable capability informs that MSIX data of a BAR can be mmapped.
> + */
> +#define VFIO_REGION_INFO_CAP_MSIX_MAPPABLE	3
> +
>  /**
>   * VFIO_DEVICE_GET_IRQ_INFO - _IOWR(VFIO_TYPE, VFIO_BASE + 9,
>   *				    struct vfio_irq_info)
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 9de63f0..1dfc386 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -3742,13 +3742,21 @@ static const TypeInfo spapr_machine_info = {
>  /*
>   * pseries-2.11
>   */
> +#define SPAPR_COMPAT_2_11                                             \
> +    HW_COMPAT_2_10                                                    \
> +    {                                                                 \
> +        .driver = "vfio-pci",                                         \
> +        .property = "msix-no-mmap",                                   \
> +        .value    = "on",                                             \
> +    },                                                                \
> +
>  static void spapr_machine_2_11_instance_options(MachineState *machine)
>  {
>  }
>  
>  static void spapr_machine_2_11_class_options(MachineClass *mc)
>  {
> -    /* Defaults for the latest behaviour inherited from the base class */
> +    SET_MACHINE_COMPAT(mc, SPAPR_COMPAT_2_11);
>  }
>  
>  DEFINE_SPAPR_MACHINE(2_11, "2.11", true);
> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
> index ed7717d..593514c 100644
> --- a/hw/vfio/common.c
> +++ b/hw/vfio/common.c
> @@ -1408,6 +1408,21 @@ int vfio_get_dev_region_info(VFIODevice *vbasedev, uint32_t type,
>      return -ENODEV;
>  }
>  
> +bool vfio_is_cap_present(VFIODevice *vbasedev, uint16_t cap_type, int region)
> +{
> +        struct vfio_region_info *info = NULL;
> +        bool ret = false;
> +
> +        if (!vfio_get_region_info(vbasedev, region, &info)) {
> +            if (vfio_get_region_info_cap(info, cap_type)) {
> +                ret = true;
> +            }
> +            g_free(info);
> +        }
> +
> +        return ret;
> +}
> +
>  /*
>   * Interfaces for IBM EEH (Enhanced Error Handling)
>   */
> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
> index c977ee3..d9aeae8 100644
> --- a/hw/vfio/pci.c
> +++ b/hw/vfio/pci.c
> @@ -1289,6 +1289,12 @@ static void vfio_pci_fixup_msix_region(VFIOPCIDevice *vdev)
>      off_t start, end;
>      VFIORegion *region = &vdev->bars[vdev->msix->table_bar].region;
>  
> +    if (!vdev->msix_no_mmap &&
> +        vfio_is_cap_present(&vdev->vbasedev, VFIO_REGION_INFO_CAP_MSIX_MAPPABLE,
> +                            vdev->msix->table_bar)) {
> +        return;
> +    }
> +
>      /*
>       * We expect to find a single mmap covering the whole BAR, anything else
>       * means it's either unsupported or already setup.
> @@ -1473,6 +1479,10 @@ static int vfio_msix_setup(VFIOPCIDevice *vdev, int pos, Error **errp)
>       */
>      memory_region_set_enabled(&vdev->pdev.msix_pba_mmio, false);
>  
> +    if (!vdev->msix_no_mmap) {
> +        memory_region_set_enabled(&vdev->pdev.msix_table_mmio, false);
> +    }
> +
>      return 0;
>  }
>  
> @@ -2986,6 +2996,7 @@ static Property vfio_pci_dev_properties[] = {
>      DEFINE_PROP_BIT("x-igd-opregion", VFIOPCIDevice, features,
>                      VFIO_FEATURE_ENABLE_IGD_OPREGION_BIT, false),
>      DEFINE_PROP_BOOL("x-no-mmap", VFIOPCIDevice, vbasedev.no_mmap, false),
> +    DEFINE_PROP_BOOL("msix-no-mmap", VFIOPCIDevice, msix_no_mmap, true),
>      DEFINE_PROP_BOOL("x-no-kvm-intx", VFIOPCIDevice, no_kvm_intx, false),
>      DEFINE_PROP_BOOL("x-no-kvm-msi", VFIOPCIDevice, no_kvm_msi, false),
>      DEFINE_PROP_BOOL("x-no-kvm-msix", VFIOPCIDevice, no_kvm_msix, false),
> 


-- 
Alexey

Re: [Qemu-devel] [PATCH qemu] RFC: vfio-pci: Allow mmap of MSIX BAR
Posted by Alex Williamson 6 years, 4 months ago
On Tue, 12 Dec 2017 16:21:31 +1100
Alexey Kardashevskiy <aik@ozlabs.ru> wrote:

> This makes use of a new VFIO_REGION_INFO_CAP_MSIX_MAPPABLE capability
> which tells that a region with MSIX data can be mapped entirely, i.e.
> the VFIO PCI driver won't prevent MSIX vectors area from being mapped.
> 
> This adds a "msix-no-mmap" property to the vfio-pci device, it is "true"
> by default and "false" for pseries-2.12+ machines.
> 
> This requites kernel's "vfio-pci: Allow mapping MSIX BAR"
> https://www.spinics.net/lists/kvm/msg160282.html
> 
> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> ---
> 
> This is an RFC as it requires kernel headers update which is not there yet.
> 
> I'd like to make it "msix-mmap" (without "no") but could not find a way
> of enabling a device property for machine versions newer than some value.
> 
> I changed 2.11 machine just for the demonstration purpose.
> 
> 
> ---
>  hw/vfio/pci.h                 |  1 +
>  include/hw/vfio/vfio-common.h |  1 +
>  linux-headers/linux/vfio.h    |  5 +++++
>  hw/ppc/spapr.c                | 10 +++++++++-
>  hw/vfio/common.c              | 15 +++++++++++++++
>  hw/vfio/pci.c                 | 11 +++++++++++
>  6 files changed, 42 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/vfio/pci.h b/hw/vfio/pci.h
> index a8fb3b3..53912ef 100644
> --- a/hw/vfio/pci.h
> +++ b/hw/vfio/pci.h
> @@ -142,6 +142,7 @@ typedef struct VFIOPCIDevice {
>      bool no_kvm_intx;
>      bool no_kvm_msi;
>      bool no_kvm_msix;
> +    bool msix_no_mmap;
>  } VFIOPCIDevice;
>  
>  uint32_t vfio_pci_read_config(PCIDevice *pdev, uint32_t addr, int len);
> diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
> index f3a2ac9..927d600 100644
> --- a/include/hw/vfio/vfio-common.h
> +++ b/include/hw/vfio/vfio-common.h
> @@ -171,6 +171,7 @@ int vfio_get_region_info(VFIODevice *vbasedev, int index,
>                           struct vfio_region_info **info);
>  int vfio_get_dev_region_info(VFIODevice *vbasedev, uint32_t type,
>                               uint32_t subtype, struct vfio_region_info **info);
> +bool vfio_is_cap_present(VFIODevice *vbasedev, uint16_t cap_type, int region);
>  #endif
>  extern const MemoryListener vfio_prereg_listener;
>  
> diff --git a/linux-headers/linux/vfio.h b/linux-headers/linux/vfio.h
> index 4e7ab4c..bce9baf 100644
> --- a/linux-headers/linux/vfio.h
> +++ b/linux-headers/linux/vfio.h
> @@ -300,6 +300,11 @@ struct vfio_region_info_cap_type {
>  #define VFIO_REGION_SUBTYPE_INTEL_IGD_HOST_CFG	(2)
>  #define VFIO_REGION_SUBTYPE_INTEL_IGD_LPC_CFG	(3)
>  
> +/*
> + * The MSIX mappable capability informs that MSIX data of a BAR can be mmapped.
> + */
> +#define VFIO_REGION_INFO_CAP_MSIX_MAPPABLE	3
> +
>  /**
>   * VFIO_DEVICE_GET_IRQ_INFO - _IOWR(VFIO_TYPE, VFIO_BASE + 9,
>   *				    struct vfio_irq_info)
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 9de63f0..1dfc386 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -3742,13 +3742,21 @@ static const TypeInfo spapr_machine_info = {
>  /*
>   * pseries-2.11
>   */
> +#define SPAPR_COMPAT_2_11                                             \
> +    HW_COMPAT_2_10                                                    \
> +    {                                                                 \
> +        .driver = "vfio-pci",                                         \
> +        .property = "msix-no-mmap",                                   \
> +        .value    = "on",                                             \
> +    },                                                                \
> +
>  static void spapr_machine_2_11_instance_options(MachineState *machine)
>  {
>  }
>  
>  static void spapr_machine_2_11_class_options(MachineClass *mc)
>  {
> -    /* Defaults for the latest behaviour inherited from the base class */
> +    SET_MACHINE_COMPAT(mc, SPAPR_COMPAT_2_11);
>  }
>  
>  DEFINE_SPAPR_MACHINE(2_11, "2.11", true);
> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
> index ed7717d..593514c 100644
> --- a/hw/vfio/common.c
> +++ b/hw/vfio/common.c
> @@ -1408,6 +1408,21 @@ int vfio_get_dev_region_info(VFIODevice *vbasedev, uint32_t type,
>      return -ENODEV;
>  }
>  
> +bool vfio_is_cap_present(VFIODevice *vbasedev, uint16_t cap_type, int region)
> +{
> +        struct vfio_region_info *info = NULL;
> +        bool ret = false;
> +
> +        if (!vfio_get_region_info(vbasedev, region, &info)) {
> +            if (vfio_get_region_info_cap(info, cap_type)) {
> +                ret = true;
> +            }
> +            g_free(info);
> +        }
> +
> +        return ret;
> +}
> +
>  /*
>   * Interfaces for IBM EEH (Enhanced Error Handling)
>   */
> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
> index c977ee3..d9aeae8 100644
> --- a/hw/vfio/pci.c
> +++ b/hw/vfio/pci.c
> @@ -1289,6 +1289,12 @@ static void vfio_pci_fixup_msix_region(VFIOPCIDevice *vdev)
>      off_t start, end;
>      VFIORegion *region = &vdev->bars[vdev->msix->table_bar].region;
>  
> +    if (!vdev->msix_no_mmap &&
> +        vfio_is_cap_present(&vdev->vbasedev, VFIO_REGION_INFO_CAP_MSIX_MAPPABLE,
> +                            vdev->msix->table_bar)) {
> +        return;
> +    }
> +
>      /*
>       * We expect to find a single mmap covering the whole BAR, anything else
>       * means it's either unsupported or already setup.
> @@ -1473,6 +1479,10 @@ static int vfio_msix_setup(VFIOPCIDevice *vdev, int pos, Error **errp)
>       */
>      memory_region_set_enabled(&vdev->pdev.msix_pba_mmio, false);
>  
> +    if (!vdev->msix_no_mmap) {
> +        memory_region_set_enabled(&vdev->pdev.msix_table_mmio, false);
> +    }

No, you're conflating issues.  There's (a) can we mmap over the MSI-X
vector table and (b) do we require QEMU emulation of the MSI-X vector
table.  (a) does NOT imply (b).  AFAICT, (a) should be enabled any time
the kernel supports it, (b) should never be enabled on ppc, regardless
of (a).  Thanks,

Alex

> +
>      return 0;
>  }
>  
> @@ -2986,6 +2996,7 @@ static Property vfio_pci_dev_properties[] = {
>      DEFINE_PROP_BIT("x-igd-opregion", VFIOPCIDevice, features,
>                      VFIO_FEATURE_ENABLE_IGD_OPREGION_BIT, false),
>      DEFINE_PROP_BOOL("x-no-mmap", VFIOPCIDevice, vbasedev.no_mmap, false),
> +    DEFINE_PROP_BOOL("msix-no-mmap", VFIOPCIDevice, msix_no_mmap, true),
>      DEFINE_PROP_BOOL("x-no-kvm-intx", VFIOPCIDevice, no_kvm_intx, false),
>      DEFINE_PROP_BOOL("x-no-kvm-msi", VFIOPCIDevice, no_kvm_msi, false),
>      DEFINE_PROP_BOOL("x-no-kvm-msix", VFIOPCIDevice, no_kvm_msix, false),


Re: [Qemu-devel] [PATCH qemu] RFC: vfio-pci: Allow mmap of MSIX BAR
Posted by Alexey Kardashevskiy 6 years, 4 months ago
On 12/12/17 16:54, Alex Williamson wrote:
> On Tue, 12 Dec 2017 16:21:31 +1100
> Alexey Kardashevskiy <aik@ozlabs.ru> wrote:
> 
>> This makes use of a new VFIO_REGION_INFO_CAP_MSIX_MAPPABLE capability
>> which tells that a region with MSIX data can be mapped entirely, i.e.
>> the VFIO PCI driver won't prevent MSIX vectors area from being mapped.
>>
>> This adds a "msix-no-mmap" property to the vfio-pci device, it is "true"
>> by default and "false" for pseries-2.12+ machines.
>>
>> This requites kernel's "vfio-pci: Allow mapping MSIX BAR"
>> https://www.spinics.net/lists/kvm/msg160282.html
>>
>> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
>> ---
>>
>> This is an RFC as it requires kernel headers update which is not there yet.
>>
>> I'd like to make it "msix-mmap" (without "no") but could not find a way
>> of enabling a device property for machine versions newer than some value.
>>
>> I changed 2.11 machine just for the demonstration purpose.
>>
>>
>> ---
>>  hw/vfio/pci.h                 |  1 +
>>  include/hw/vfio/vfio-common.h |  1 +
>>  linux-headers/linux/vfio.h    |  5 +++++
>>  hw/ppc/spapr.c                | 10 +++++++++-
>>  hw/vfio/common.c              | 15 +++++++++++++++
>>  hw/vfio/pci.c                 | 11 +++++++++++
>>  6 files changed, 42 insertions(+), 1 deletion(-)
>>
>> diff --git a/hw/vfio/pci.h b/hw/vfio/pci.h
>> index a8fb3b3..53912ef 100644
>> --- a/hw/vfio/pci.h
>> +++ b/hw/vfio/pci.h
>> @@ -142,6 +142,7 @@ typedef struct VFIOPCIDevice {
>>      bool no_kvm_intx;
>>      bool no_kvm_msi;
>>      bool no_kvm_msix;
>> +    bool msix_no_mmap;
>>  } VFIOPCIDevice;
>>  
>>  uint32_t vfio_pci_read_config(PCIDevice *pdev, uint32_t addr, int len);
>> diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
>> index f3a2ac9..927d600 100644
>> --- a/include/hw/vfio/vfio-common.h
>> +++ b/include/hw/vfio/vfio-common.h
>> @@ -171,6 +171,7 @@ int vfio_get_region_info(VFIODevice *vbasedev, int index,
>>                           struct vfio_region_info **info);
>>  int vfio_get_dev_region_info(VFIODevice *vbasedev, uint32_t type,
>>                               uint32_t subtype, struct vfio_region_info **info);
>> +bool vfio_is_cap_present(VFIODevice *vbasedev, uint16_t cap_type, int region);
>>  #endif
>>  extern const MemoryListener vfio_prereg_listener;
>>  
>> diff --git a/linux-headers/linux/vfio.h b/linux-headers/linux/vfio.h
>> index 4e7ab4c..bce9baf 100644
>> --- a/linux-headers/linux/vfio.h
>> +++ b/linux-headers/linux/vfio.h
>> @@ -300,6 +300,11 @@ struct vfio_region_info_cap_type {
>>  #define VFIO_REGION_SUBTYPE_INTEL_IGD_HOST_CFG	(2)
>>  #define VFIO_REGION_SUBTYPE_INTEL_IGD_LPC_CFG	(3)
>>  
>> +/*
>> + * The MSIX mappable capability informs that MSIX data of a BAR can be mmapped.
>> + */
>> +#define VFIO_REGION_INFO_CAP_MSIX_MAPPABLE	3
>> +
>>  /**
>>   * VFIO_DEVICE_GET_IRQ_INFO - _IOWR(VFIO_TYPE, VFIO_BASE + 9,
>>   *				    struct vfio_irq_info)
>> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
>> index 9de63f0..1dfc386 100644
>> --- a/hw/ppc/spapr.c
>> +++ b/hw/ppc/spapr.c
>> @@ -3742,13 +3742,21 @@ static const TypeInfo spapr_machine_info = {
>>  /*
>>   * pseries-2.11
>>   */
>> +#define SPAPR_COMPAT_2_11                                             \
>> +    HW_COMPAT_2_10                                                    \
>> +    {                                                                 \
>> +        .driver = "vfio-pci",                                         \
>> +        .property = "msix-no-mmap",                                   \
>> +        .value    = "on",                                             \
>> +    },                                                                \
>> +
>>  static void spapr_machine_2_11_instance_options(MachineState *machine)
>>  {
>>  }
>>  
>>  static void spapr_machine_2_11_class_options(MachineClass *mc)
>>  {
>> -    /* Defaults for the latest behaviour inherited from the base class */
>> +    SET_MACHINE_COMPAT(mc, SPAPR_COMPAT_2_11);
>>  }
>>  
>>  DEFINE_SPAPR_MACHINE(2_11, "2.11", true);
>> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
>> index ed7717d..593514c 100644
>> --- a/hw/vfio/common.c
>> +++ b/hw/vfio/common.c
>> @@ -1408,6 +1408,21 @@ int vfio_get_dev_region_info(VFIODevice *vbasedev, uint32_t type,
>>      return -ENODEV;
>>  }
>>  
>> +bool vfio_is_cap_present(VFIODevice *vbasedev, uint16_t cap_type, int region)
>> +{
>> +        struct vfio_region_info *info = NULL;
>> +        bool ret = false;
>> +
>> +        if (!vfio_get_region_info(vbasedev, region, &info)) {
>> +            if (vfio_get_region_info_cap(info, cap_type)) {
>> +                ret = true;
>> +            }
>> +            g_free(info);
>> +        }
>> +
>> +        return ret;
>> +}
>> +
>>  /*
>>   * Interfaces for IBM EEH (Enhanced Error Handling)
>>   */
>> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
>> index c977ee3..d9aeae8 100644
>> --- a/hw/vfio/pci.c
>> +++ b/hw/vfio/pci.c
>> @@ -1289,6 +1289,12 @@ static void vfio_pci_fixup_msix_region(VFIOPCIDevice *vdev)
>>      off_t start, end;
>>      VFIORegion *region = &vdev->bars[vdev->msix->table_bar].region;
>>  
>> +    if (!vdev->msix_no_mmap &&
>> +        vfio_is_cap_present(&vdev->vbasedev, VFIO_REGION_INFO_CAP_MSIX_MAPPABLE,
>> +                            vdev->msix->table_bar)) {
>> +        return;
>> +    }
>> +
>>      /*
>>       * We expect to find a single mmap covering the whole BAR, anything else
>>       * means it's either unsupported or already setup.
>> @@ -1473,6 +1479,10 @@ static int vfio_msix_setup(VFIOPCIDevice *vdev, int pos, Error **errp)
>>       */
>>      memory_region_set_enabled(&vdev->pdev.msix_pba_mmio, false);
>>  
>> +    if (!vdev->msix_no_mmap) {
>> +        memory_region_set_enabled(&vdev->pdev.msix_table_mmio, false);
>> +    }
> 
> No, you're conflating issues.  There's (a) can we mmap over the MSI-X
> vector table and (b) do we require QEMU emulation of the MSI-X vector
> table.  (a) does NOT imply (b).  AFAICT, (a) should be enabled any time
> the kernel supports it,

This is the default setting, or you do not want it to be a property so the
user cannot shoot himself in a foot?

> (b) should never be enabled on ppc, regardless
> of (a).  Thanks,


The intention is to have a property - msix_no_mmap=true, except a single
case - ppc-pseries, I just do not know how to enforce it for a specific
machine type. It could also be a machine property, like "kernel_irqchip".



> 
> Alex
> 
>> +
>>      return 0;
>>  }
>>  
>> @@ -2986,6 +2996,7 @@ static Property vfio_pci_dev_properties[] = {
>>      DEFINE_PROP_BIT("x-igd-opregion", VFIOPCIDevice, features,
>>                      VFIO_FEATURE_ENABLE_IGD_OPREGION_BIT, false),
>>      DEFINE_PROP_BOOL("x-no-mmap", VFIOPCIDevice, vbasedev.no_mmap, false),
>> +    DEFINE_PROP_BOOL("msix-no-mmap", VFIOPCIDevice, msix_no_mmap, true),
>>      DEFINE_PROP_BOOL("x-no-kvm-intx", VFIOPCIDevice, no_kvm_intx, false),
>>      DEFINE_PROP_BOOL("x-no-kvm-msi", VFIOPCIDevice, no_kvm_msi, false),
>>      DEFINE_PROP_BOOL("x-no-kvm-msix", VFIOPCIDevice, no_kvm_msix, false),
> 


-- 
Alexey

Re: [Qemu-devel] [PATCH qemu] RFC: vfio-pci: Allow mmap of MSIX BAR
Posted by Alexey Kardashevskiy 6 years, 4 months ago
On 12/12/17 17:06, Alexey Kardashevskiy wrote:
> On 12/12/17 16:54, Alex Williamson wrote:
>> On Tue, 12 Dec 2017 16:21:31 +1100
>> Alexey Kardashevskiy <aik@ozlabs.ru> wrote:
>>
>>> This makes use of a new VFIO_REGION_INFO_CAP_MSIX_MAPPABLE capability
>>> which tells that a region with MSIX data can be mapped entirely, i.e.
>>> the VFIO PCI driver won't prevent MSIX vectors area from being mapped.
>>>
>>> This adds a "msix-no-mmap" property to the vfio-pci device, it is "true"
>>> by default and "false" for pseries-2.12+ machines.
>>>
>>> This requites kernel's "vfio-pci: Allow mapping MSIX BAR"
>>> https://www.spinics.net/lists/kvm/msg160282.html
>>>
>>> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
>>> ---
>>>
>>> This is an RFC as it requires kernel headers update which is not there yet.
>>>
>>> I'd like to make it "msix-mmap" (without "no") but could not find a way
>>> of enabling a device property for machine versions newer than some value.
>>>
>>> I changed 2.11 machine just for the demonstration purpose.
>>>
>>>
>>> ---
>>>  hw/vfio/pci.h                 |  1 +
>>>  include/hw/vfio/vfio-common.h |  1 +
>>>  linux-headers/linux/vfio.h    |  5 +++++
>>>  hw/ppc/spapr.c                | 10 +++++++++-
>>>  hw/vfio/common.c              | 15 +++++++++++++++
>>>  hw/vfio/pci.c                 | 11 +++++++++++
>>>  6 files changed, 42 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/hw/vfio/pci.h b/hw/vfio/pci.h
>>> index a8fb3b3..53912ef 100644
>>> --- a/hw/vfio/pci.h
>>> +++ b/hw/vfio/pci.h
>>> @@ -142,6 +142,7 @@ typedef struct VFIOPCIDevice {
>>>      bool no_kvm_intx;
>>>      bool no_kvm_msi;
>>>      bool no_kvm_msix;
>>> +    bool msix_no_mmap;
>>>  } VFIOPCIDevice;
>>>  
>>>  uint32_t vfio_pci_read_config(PCIDevice *pdev, uint32_t addr, int len);
>>> diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
>>> index f3a2ac9..927d600 100644
>>> --- a/include/hw/vfio/vfio-common.h
>>> +++ b/include/hw/vfio/vfio-common.h
>>> @@ -171,6 +171,7 @@ int vfio_get_region_info(VFIODevice *vbasedev, int index,
>>>                           struct vfio_region_info **info);
>>>  int vfio_get_dev_region_info(VFIODevice *vbasedev, uint32_t type,
>>>                               uint32_t subtype, struct vfio_region_info **info);
>>> +bool vfio_is_cap_present(VFIODevice *vbasedev, uint16_t cap_type, int region);
>>>  #endif
>>>  extern const MemoryListener vfio_prereg_listener;
>>>  
>>> diff --git a/linux-headers/linux/vfio.h b/linux-headers/linux/vfio.h
>>> index 4e7ab4c..bce9baf 100644
>>> --- a/linux-headers/linux/vfio.h
>>> +++ b/linux-headers/linux/vfio.h
>>> @@ -300,6 +300,11 @@ struct vfio_region_info_cap_type {
>>>  #define VFIO_REGION_SUBTYPE_INTEL_IGD_HOST_CFG	(2)
>>>  #define VFIO_REGION_SUBTYPE_INTEL_IGD_LPC_CFG	(3)
>>>  
>>> +/*
>>> + * The MSIX mappable capability informs that MSIX data of a BAR can be mmapped.
>>> + */
>>> +#define VFIO_REGION_INFO_CAP_MSIX_MAPPABLE	3
>>> +
>>>  /**
>>>   * VFIO_DEVICE_GET_IRQ_INFO - _IOWR(VFIO_TYPE, VFIO_BASE + 9,
>>>   *				    struct vfio_irq_info)
>>> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
>>> index 9de63f0..1dfc386 100644
>>> --- a/hw/ppc/spapr.c
>>> +++ b/hw/ppc/spapr.c
>>> @@ -3742,13 +3742,21 @@ static const TypeInfo spapr_machine_info = {
>>>  /*
>>>   * pseries-2.11
>>>   */
>>> +#define SPAPR_COMPAT_2_11                                             \
>>> +    HW_COMPAT_2_10                                                    \
>>> +    {                                                                 \
>>> +        .driver = "vfio-pci",                                         \
>>> +        .property = "msix-no-mmap",                                   \
>>> +        .value    = "on",                                             \
>>> +    },                                                                \
>>> +
>>>  static void spapr_machine_2_11_instance_options(MachineState *machine)
>>>  {
>>>  }
>>>  
>>>  static void spapr_machine_2_11_class_options(MachineClass *mc)
>>>  {
>>> -    /* Defaults for the latest behaviour inherited from the base class */
>>> +    SET_MACHINE_COMPAT(mc, SPAPR_COMPAT_2_11);
>>>  }
>>>  
>>>  DEFINE_SPAPR_MACHINE(2_11, "2.11", true);
>>> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
>>> index ed7717d..593514c 100644
>>> --- a/hw/vfio/common.c
>>> +++ b/hw/vfio/common.c
>>> @@ -1408,6 +1408,21 @@ int vfio_get_dev_region_info(VFIODevice *vbasedev, uint32_t type,
>>>      return -ENODEV;
>>>  }
>>>  
>>> +bool vfio_is_cap_present(VFIODevice *vbasedev, uint16_t cap_type, int region)
>>> +{
>>> +        struct vfio_region_info *info = NULL;
>>> +        bool ret = false;
>>> +
>>> +        if (!vfio_get_region_info(vbasedev, region, &info)) {
>>> +            if (vfio_get_region_info_cap(info, cap_type)) {
>>> +                ret = true;
>>> +            }
>>> +            g_free(info);
>>> +        }
>>> +
>>> +        return ret;
>>> +}
>>> +
>>>  /*
>>>   * Interfaces for IBM EEH (Enhanced Error Handling)
>>>   */
>>> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
>>> index c977ee3..d9aeae8 100644
>>> --- a/hw/vfio/pci.c
>>> +++ b/hw/vfio/pci.c
>>> @@ -1289,6 +1289,12 @@ static void vfio_pci_fixup_msix_region(VFIOPCIDevice *vdev)
>>>      off_t start, end;
>>>      VFIORegion *region = &vdev->bars[vdev->msix->table_bar].region;
>>>  
>>> +    if (!vdev->msix_no_mmap &&
>>> +        vfio_is_cap_present(&vdev->vbasedev, VFIO_REGION_INFO_CAP_MSIX_MAPPABLE,
>>> +                            vdev->msix->table_bar)) {
>>> +        return;
>>> +    }
>>> +
>>>      /*
>>>       * We expect to find a single mmap covering the whole BAR, anything else
>>>       * means it's either unsupported or already setup.
>>> @@ -1473,6 +1479,10 @@ static int vfio_msix_setup(VFIOPCIDevice *vdev, int pos, Error **errp)
>>>       */
>>>      memory_region_set_enabled(&vdev->pdev.msix_pba_mmio, false);
>>>  
>>> +    if (!vdev->msix_no_mmap) {
>>> +        memory_region_set_enabled(&vdev->pdev.msix_table_mmio, false);
>>> +    }
>>
>> No, you're conflating issues.  There's (a) can we mmap over the MSI-X
>> vector table and (b) do we require QEMU emulation of the MSI-X vector
>> table.  (a) does NOT imply (b).  AFAICT, (a) should be enabled any time
>> the kernel supports it,
> 
> This is the default setting, or you do not want it to be a property so the
> user cannot shoot himself in a foot?
> 
>> (b) should never be enabled on ppc, regardless
>> of (a).  Thanks,
> 
> 
> The intention is to have a property - msix_no_mmap=true, except a single
> case - ppc-pseries, I just do not know how to enforce it for a specific
> machine type.


I do know now. For example,

static void spapr_machine_2_11_instance_options(MachineState *machine)
{
    register_compat_prop("vfio-pci", "msix-no-mmap", "off");
}


which is basically "-global vfio-pci.msix-no-mmap=off" but in the code.



> It could also be a machine property, like "kernel_irqchip".
> 
> 
> 
>>
>> Alex
>>
>>> +
>>>      return 0;
>>>  }
>>>  
>>> @@ -2986,6 +2996,7 @@ static Property vfio_pci_dev_properties[] = {
>>>      DEFINE_PROP_BIT("x-igd-opregion", VFIOPCIDevice, features,
>>>                      VFIO_FEATURE_ENABLE_IGD_OPREGION_BIT, false),
>>>      DEFINE_PROP_BOOL("x-no-mmap", VFIOPCIDevice, vbasedev.no_mmap, false),
>>> +    DEFINE_PROP_BOOL("msix-no-mmap", VFIOPCIDevice, msix_no_mmap, true),
>>>      DEFINE_PROP_BOOL("x-no-kvm-intx", VFIOPCIDevice, no_kvm_intx, false),
>>>      DEFINE_PROP_BOOL("x-no-kvm-msi", VFIOPCIDevice, no_kvm_msi, false),
>>>      DEFINE_PROP_BOOL("x-no-kvm-msix", VFIOPCIDevice, no_kvm_msix, false),
>>
> 
> 


-- 
Alexey

Re: [Qemu-devel] [PATCH qemu] RFC: vfio-pci: Allow mmap of MSIX BAR
Posted by Alex Williamson 6 years, 4 months ago
On Tue, 12 Dec 2017 18:01:40 +1100
Alexey Kardashevskiy <aik@ozlabs.ru> wrote:

> On 12/12/17 17:06, Alexey Kardashevskiy wrote:
> > On 12/12/17 16:54, Alex Williamson wrote:  
> >> On Tue, 12 Dec 2017 16:21:31 +1100
> >> Alexey Kardashevskiy <aik@ozlabs.ru> wrote:
> >>  
> >>> This makes use of a new VFIO_REGION_INFO_CAP_MSIX_MAPPABLE capability
> >>> which tells that a region with MSIX data can be mapped entirely, i.e.
> >>> the VFIO PCI driver won't prevent MSIX vectors area from being mapped.
> >>>
> >>> This adds a "msix-no-mmap" property to the vfio-pci device, it is "true"
> >>> by default and "false" for pseries-2.12+ machines.
> >>>
> >>> This requites kernel's "vfio-pci: Allow mapping MSIX BAR"
> >>> https://www.spinics.net/lists/kvm/msg160282.html
> >>>
> >>> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> >>> ---
> >>>
> >>> This is an RFC as it requires kernel headers update which is not there yet.
> >>>
> >>> I'd like to make it "msix-mmap" (without "no") but could not find a way
> >>> of enabling a device property for machine versions newer than some value.
> >>>
> >>> I changed 2.11 machine just for the demonstration purpose.
> >>>
> >>>
> >>> ---
> >>>  hw/vfio/pci.h                 |  1 +
> >>>  include/hw/vfio/vfio-common.h |  1 +
> >>>  linux-headers/linux/vfio.h    |  5 +++++
> >>>  hw/ppc/spapr.c                | 10 +++++++++-
> >>>  hw/vfio/common.c              | 15 +++++++++++++++
> >>>  hw/vfio/pci.c                 | 11 +++++++++++
> >>>  6 files changed, 42 insertions(+), 1 deletion(-)
> >>>
> >>> diff --git a/hw/vfio/pci.h b/hw/vfio/pci.h
> >>> index a8fb3b3..53912ef 100644
> >>> --- a/hw/vfio/pci.h
> >>> +++ b/hw/vfio/pci.h
> >>> @@ -142,6 +142,7 @@ typedef struct VFIOPCIDevice {
> >>>      bool no_kvm_intx;
> >>>      bool no_kvm_msi;
> >>>      bool no_kvm_msix;
> >>> +    bool msix_no_mmap;
> >>>  } VFIOPCIDevice;
> >>>  
> >>>  uint32_t vfio_pci_read_config(PCIDevice *pdev, uint32_t addr, int len);
> >>> diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
> >>> index f3a2ac9..927d600 100644
> >>> --- a/include/hw/vfio/vfio-common.h
> >>> +++ b/include/hw/vfio/vfio-common.h
> >>> @@ -171,6 +171,7 @@ int vfio_get_region_info(VFIODevice *vbasedev, int index,
> >>>                           struct vfio_region_info **info);
> >>>  int vfio_get_dev_region_info(VFIODevice *vbasedev, uint32_t type,
> >>>                               uint32_t subtype, struct vfio_region_info **info);
> >>> +bool vfio_is_cap_present(VFIODevice *vbasedev, uint16_t cap_type, int region);
> >>>  #endif
> >>>  extern const MemoryListener vfio_prereg_listener;
> >>>  
> >>> diff --git a/linux-headers/linux/vfio.h b/linux-headers/linux/vfio.h
> >>> index 4e7ab4c..bce9baf 100644
> >>> --- a/linux-headers/linux/vfio.h
> >>> +++ b/linux-headers/linux/vfio.h
> >>> @@ -300,6 +300,11 @@ struct vfio_region_info_cap_type {
> >>>  #define VFIO_REGION_SUBTYPE_INTEL_IGD_HOST_CFG	(2)
> >>>  #define VFIO_REGION_SUBTYPE_INTEL_IGD_LPC_CFG	(3)
> >>>  
> >>> +/*
> >>> + * The MSIX mappable capability informs that MSIX data of a BAR can be mmapped.
> >>> + */
> >>> +#define VFIO_REGION_INFO_CAP_MSIX_MAPPABLE	3
> >>> +
> >>>  /**
> >>>   * VFIO_DEVICE_GET_IRQ_INFO - _IOWR(VFIO_TYPE, VFIO_BASE + 9,
> >>>   *				    struct vfio_irq_info)
> >>> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> >>> index 9de63f0..1dfc386 100644
> >>> --- a/hw/ppc/spapr.c
> >>> +++ b/hw/ppc/spapr.c
> >>> @@ -3742,13 +3742,21 @@ static const TypeInfo spapr_machine_info = {
> >>>  /*
> >>>   * pseries-2.11
> >>>   */
> >>> +#define SPAPR_COMPAT_2_11                                             \
> >>> +    HW_COMPAT_2_10                                                    \
> >>> +    {                                                                 \
> >>> +        .driver = "vfio-pci",                                         \
> >>> +        .property = "msix-no-mmap",                                   \
> >>> +        .value    = "on",                                             \
> >>> +    },                                                                \
> >>> +
> >>>  static void spapr_machine_2_11_instance_options(MachineState *machine)
> >>>  {
> >>>  }
> >>>  
> >>>  static void spapr_machine_2_11_class_options(MachineClass *mc)
> >>>  {
> >>> -    /* Defaults for the latest behaviour inherited from the base class */
> >>> +    SET_MACHINE_COMPAT(mc, SPAPR_COMPAT_2_11);
> >>>  }
> >>>  
> >>>  DEFINE_SPAPR_MACHINE(2_11, "2.11", true);
> >>> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
> >>> index ed7717d..593514c 100644
> >>> --- a/hw/vfio/common.c
> >>> +++ b/hw/vfio/common.c
> >>> @@ -1408,6 +1408,21 @@ int vfio_get_dev_region_info(VFIODevice *vbasedev, uint32_t type,
> >>>      return -ENODEV;
> >>>  }
> >>>  
> >>> +bool vfio_is_cap_present(VFIODevice *vbasedev, uint16_t cap_type, int region)
> >>> +{
> >>> +        struct vfio_region_info *info = NULL;
> >>> +        bool ret = false;
> >>> +
> >>> +        if (!vfio_get_region_info(vbasedev, region, &info)) {
> >>> +            if (vfio_get_region_info_cap(info, cap_type)) {
> >>> +                ret = true;
> >>> +            }
> >>> +            g_free(info);
> >>> +        }
> >>> +
> >>> +        return ret;
> >>> +}
> >>> +
> >>>  /*
> >>>   * Interfaces for IBM EEH (Enhanced Error Handling)
> >>>   */
> >>> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
> >>> index c977ee3..d9aeae8 100644
> >>> --- a/hw/vfio/pci.c
> >>> +++ b/hw/vfio/pci.c
> >>> @@ -1289,6 +1289,12 @@ static void vfio_pci_fixup_msix_region(VFIOPCIDevice *vdev)
> >>>      off_t start, end;
> >>>      VFIORegion *region = &vdev->bars[vdev->msix->table_bar].region;
> >>>  
> >>> +    if (!vdev->msix_no_mmap &&
> >>> +        vfio_is_cap_present(&vdev->vbasedev, VFIO_REGION_INFO_CAP_MSIX_MAPPABLE,
> >>> +                            vdev->msix->table_bar)) {
> >>> +        return;
> >>> +    }
> >>> +
> >>>      /*
> >>>       * We expect to find a single mmap covering the whole BAR, anything else
> >>>       * means it's either unsupported or already setup.
> >>> @@ -1473,6 +1479,10 @@ static int vfio_msix_setup(VFIOPCIDevice *vdev, int pos, Error **errp)
> >>>       */
> >>>      memory_region_set_enabled(&vdev->pdev.msix_pba_mmio, false);
> >>>  
> >>> +    if (!vdev->msix_no_mmap) {
> >>> +        memory_region_set_enabled(&vdev->pdev.msix_table_mmio, false);
> >>> +    }  
> >>
> >> No, you're conflating issues.  There's (a) can we mmap over the MSI-X
> >> vector table and (b) do we require QEMU emulation of the MSI-X vector
> >> table.  (a) does NOT imply (b).  AFAICT, (a) should be enabled any time
> >> the kernel supports it,  
> > 
> > This is the default setting, or you do not want it to be a property so the
> > user cannot shoot himself in a foot?

If the kernel allows mmap, other than debugging, why would it ever need
to be disabled?

   
> >> (b) should never be enabled on ppc, regardless
> >> of (a).  Thanks,  
> > 
> > 
> > The intention is to have a property - msix_no_mmap=true, except a single
> > case - ppc-pseries, I just do not know how to enforce it for a specific
> > machine type.  

The intention is wrong.  (a) should be done any time the kernel allows
it.  (b) is an independent concept of disabling QEMU MSI-X emulation
for platforms, ie. machine types, that do not require it.  (b) has
nothing to do with the mmap'ability of the msix table area.  So far (b)
includes only the ppc spapr machine and I don't see a reason to allow
the user to control this.  Thanks,

Alex

Re: [Qemu-devel] [PATCH qemu] RFC: vfio-pci: Allow mmap of MSIX BAR
Posted by David Gibson 6 years, 4 months ago
On Tue, Dec 12, 2017 at 09:05:25AM -0700, Alex Williamson wrote:
> On Tue, 12 Dec 2017 18:01:40 +1100
> Alexey Kardashevskiy <aik@ozlabs.ru> wrote:
> 
> > On 12/12/17 17:06, Alexey Kardashevskiy wrote:
> > > On 12/12/17 16:54, Alex Williamson wrote:  
> > >> On Tue, 12 Dec 2017 16:21:31 +1100
> > >> Alexey Kardashevskiy <aik@ozlabs.ru> wrote:
> > >>  
> > >>> This makes use of a new VFIO_REGION_INFO_CAP_MSIX_MAPPABLE capability
> > >>> which tells that a region with MSIX data can be mapped entirely, i.e.
> > >>> the VFIO PCI driver won't prevent MSIX vectors area from being mapped.
> > >>>
> > >>> This adds a "msix-no-mmap" property to the vfio-pci device, it is "true"
> > >>> by default and "false" for pseries-2.12+ machines.
> > >>>
> > >>> This requites kernel's "vfio-pci: Allow mapping MSIX BAR"
> > >>> https://www.spinics.net/lists/kvm/msg160282.html
> > >>>
> > >>> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> > >>> ---
> > >>>
> > >>> This is an RFC as it requires kernel headers update which is not there yet.
> > >>>
> > >>> I'd like to make it "msix-mmap" (without "no") but could not find a way
> > >>> of enabling a device property for machine versions newer than some value.
> > >>>
> > >>> I changed 2.11 machine just for the demonstration purpose.
> > >>>
> > >>>
> > >>> ---
> > >>>  hw/vfio/pci.h                 |  1 +
> > >>>  include/hw/vfio/vfio-common.h |  1 +
> > >>>  linux-headers/linux/vfio.h    |  5 +++++
> > >>>  hw/ppc/spapr.c                | 10 +++++++++-
> > >>>  hw/vfio/common.c              | 15 +++++++++++++++
> > >>>  hw/vfio/pci.c                 | 11 +++++++++++
> > >>>  6 files changed, 42 insertions(+), 1 deletion(-)
> > >>>
> > >>> diff --git a/hw/vfio/pci.h b/hw/vfio/pci.h
> > >>> index a8fb3b3..53912ef 100644
> > >>> --- a/hw/vfio/pci.h
> > >>> +++ b/hw/vfio/pci.h
> > >>> @@ -142,6 +142,7 @@ typedef struct VFIOPCIDevice {
> > >>>      bool no_kvm_intx;
> > >>>      bool no_kvm_msi;
> > >>>      bool no_kvm_msix;
> > >>> +    bool msix_no_mmap;
> > >>>  } VFIOPCIDevice;
> > >>>  
> > >>>  uint32_t vfio_pci_read_config(PCIDevice *pdev, uint32_t addr, int len);
> > >>> diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
> > >>> index f3a2ac9..927d600 100644
> > >>> --- a/include/hw/vfio/vfio-common.h
> > >>> +++ b/include/hw/vfio/vfio-common.h
> > >>> @@ -171,6 +171,7 @@ int vfio_get_region_info(VFIODevice *vbasedev, int index,
> > >>>                           struct vfio_region_info **info);
> > >>>  int vfio_get_dev_region_info(VFIODevice *vbasedev, uint32_t type,
> > >>>                               uint32_t subtype, struct vfio_region_info **info);
> > >>> +bool vfio_is_cap_present(VFIODevice *vbasedev, uint16_t cap_type, int region);
> > >>>  #endif
> > >>>  extern const MemoryListener vfio_prereg_listener;
> > >>>  
> > >>> diff --git a/linux-headers/linux/vfio.h b/linux-headers/linux/vfio.h
> > >>> index 4e7ab4c..bce9baf 100644
> > >>> --- a/linux-headers/linux/vfio.h
> > >>> +++ b/linux-headers/linux/vfio.h
> > >>> @@ -300,6 +300,11 @@ struct vfio_region_info_cap_type {
> > >>>  #define VFIO_REGION_SUBTYPE_INTEL_IGD_HOST_CFG	(2)
> > >>>  #define VFIO_REGION_SUBTYPE_INTEL_IGD_LPC_CFG	(3)
> > >>>  
> > >>> +/*
> > >>> + * The MSIX mappable capability informs that MSIX data of a BAR can be mmapped.
> > >>> + */
> > >>> +#define VFIO_REGION_INFO_CAP_MSIX_MAPPABLE	3
> > >>> +
> > >>>  /**
> > >>>   * VFIO_DEVICE_GET_IRQ_INFO - _IOWR(VFIO_TYPE, VFIO_BASE + 9,
> > >>>   *				    struct vfio_irq_info)
> > >>> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > >>> index 9de63f0..1dfc386 100644
> > >>> --- a/hw/ppc/spapr.c
> > >>> +++ b/hw/ppc/spapr.c
> > >>> @@ -3742,13 +3742,21 @@ static const TypeInfo spapr_machine_info = {
> > >>>  /*
> > >>>   * pseries-2.11
> > >>>   */
> > >>> +#define SPAPR_COMPAT_2_11                                             \
> > >>> +    HW_COMPAT_2_10                                                    \
> > >>> +    {                                                                 \
> > >>> +        .driver = "vfio-pci",                                         \
> > >>> +        .property = "msix-no-mmap",                                   \
> > >>> +        .value    = "on",                                             \
> > >>> +    },                                                                \
> > >>> +
> > >>>  static void spapr_machine_2_11_instance_options(MachineState *machine)
> > >>>  {
> > >>>  }
> > >>>  
> > >>>  static void spapr_machine_2_11_class_options(MachineClass *mc)
> > >>>  {
> > >>> -    /* Defaults for the latest behaviour inherited from the base class */
> > >>> +    SET_MACHINE_COMPAT(mc, SPAPR_COMPAT_2_11);
> > >>>  }
> > >>>  
> > >>>  DEFINE_SPAPR_MACHINE(2_11, "2.11", true);
> > >>> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
> > >>> index ed7717d..593514c 100644
> > >>> --- a/hw/vfio/common.c
> > >>> +++ b/hw/vfio/common.c
> > >>> @@ -1408,6 +1408,21 @@ int vfio_get_dev_region_info(VFIODevice *vbasedev, uint32_t type,
> > >>>      return -ENODEV;
> > >>>  }
> > >>>  
> > >>> +bool vfio_is_cap_present(VFIODevice *vbasedev, uint16_t cap_type, int region)
> > >>> +{
> > >>> +        struct vfio_region_info *info = NULL;
> > >>> +        bool ret = false;
> > >>> +
> > >>> +        if (!vfio_get_region_info(vbasedev, region, &info)) {
> > >>> +            if (vfio_get_region_info_cap(info, cap_type)) {
> > >>> +                ret = true;
> > >>> +            }
> > >>> +            g_free(info);
> > >>> +        }
> > >>> +
> > >>> +        return ret;
> > >>> +}
> > >>> +
> > >>>  /*
> > >>>   * Interfaces for IBM EEH (Enhanced Error Handling)
> > >>>   */
> > >>> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
> > >>> index c977ee3..d9aeae8 100644
> > >>> --- a/hw/vfio/pci.c
> > >>> +++ b/hw/vfio/pci.c
> > >>> @@ -1289,6 +1289,12 @@ static void vfio_pci_fixup_msix_region(VFIOPCIDevice *vdev)
> > >>>      off_t start, end;
> > >>>      VFIORegion *region = &vdev->bars[vdev->msix->table_bar].region;
> > >>>  
> > >>> +    if (!vdev->msix_no_mmap &&
> > >>> +        vfio_is_cap_present(&vdev->vbasedev, VFIO_REGION_INFO_CAP_MSIX_MAPPABLE,
> > >>> +                            vdev->msix->table_bar)) {
> > >>> +        return;
> > >>> +    }
> > >>> +
> > >>>      /*
> > >>>       * We expect to find a single mmap covering the whole BAR, anything else
> > >>>       * means it's either unsupported or already setup.
> > >>> @@ -1473,6 +1479,10 @@ static int vfio_msix_setup(VFIOPCIDevice *vdev, int pos, Error **errp)
> > >>>       */
> > >>>      memory_region_set_enabled(&vdev->pdev.msix_pba_mmio, false);
> > >>>  
> > >>> +    if (!vdev->msix_no_mmap) {
> > >>> +        memory_region_set_enabled(&vdev->pdev.msix_table_mmio, false);
> > >>> +    }  
> > >>
> > >> No, you're conflating issues.  There's (a) can we mmap over the MSI-X
> > >> vector table and (b) do we require QEMU emulation of the MSI-X vector
> > >> table.  (a) does NOT imply (b).  AFAICT, (a) should be enabled any time
> > >> the kernel supports it,  
> > > 
> > > This is the default setting, or you do not want it to be a property so the
> > > user cannot shoot himself in a foot?
> 
> If the kernel allows mmap, other than debugging, why would it ever need
> to be disabled?
> 
>    
> > >> (b) should never be enabled on ppc, regardless
> > >> of (a).  Thanks,  
> > > 
> > > 
> > > The intention is to have a property - msix_no_mmap=true, except a single
> > > case - ppc-pseries, I just do not know how to enforce it for a specific
> > > machine type.  
> 
> The intention is wrong.  (a) should be done any time the kernel allows
> it.  (b) is an independent concept of disabling QEMU MSI-X emulation
> for platforms, ie. machine types, that do not require it.  (b) has
> nothing to do with the mmap'ability of the msix table area.  So far (b)
> includes only the ppc spapr machine and I don't see a reason to allow
> the user to control this.  Thanks,

I don't either, but setting properties like this seems to be the
more-or-less standard way for the machine type (and/or version) to
affect operation of other devices.

Allowing the user to shoot themselves in the foot is a side effect of
that, but seems to be one we're ok with.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson
Re: [Qemu-devel] [PATCH qemu] RFC: vfio-pci: Allow mmap of MSIX BAR
Posted by David Gibson 6 years, 4 months ago
On Tue, Dec 12, 2017 at 04:21:31PM +1100, Alexey Kardashevskiy wrote:
> This makes use of a new VFIO_REGION_INFO_CAP_MSIX_MAPPABLE capability
> which tells that a region with MSIX data can be mapped entirely, i.e.
> the VFIO PCI driver won't prevent MSIX vectors area from being mapped.
> 
> This adds a "msix-no-mmap" property to the vfio-pci device, it is "true"
> by default and "false" for pseries-2.12+ machines.
> 
> This requites kernel's "vfio-pci: Allow mapping MSIX BAR"
> https://www.spinics.net/lists/kvm/msg160282.html
> 
> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> ---
> 
> This is an RFC as it requires kernel headers update which is not there yet.
> 
> I'd like to make it "msix-mmap" (without "no") but could not find a way
> of enabling a device property for machine versions newer than some value.
> 
> I changed 2.11 machine just for the demonstration purpose.

As Alex says, the mmap()ability of the MSI-X BAR isn't really the
point.  The point is whether we need to intercept guest MMIOs to the
MSI-X region.  Still, the logic's basically right, just rename your
property to, say, "intercept_msix_mmio".  It would be true by default,
set to false by the pseries machine type.

I don't think you actually need to make it vary depending on the
version of the pseries machine type: whether the BAR is mmap()ed or
qemu emulated shouldn't be a guest visible change.  No PAPR guest
should have been directly poking the MSI-X region (ever), so we
shouldn't need to intercept the region even for old versions.

> 
> 
> ---
>  hw/vfio/pci.h                 |  1 +
>  include/hw/vfio/vfio-common.h |  1 +
>  linux-headers/linux/vfio.h    |  5 +++++
>  hw/ppc/spapr.c                | 10 +++++++++-
>  hw/vfio/common.c              | 15 +++++++++++++++
>  hw/vfio/pci.c                 | 11 +++++++++++
>  6 files changed, 42 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/vfio/pci.h b/hw/vfio/pci.h
> index a8fb3b3..53912ef 100644
> --- a/hw/vfio/pci.h
> +++ b/hw/vfio/pci.h
> @@ -142,6 +142,7 @@ typedef struct VFIOPCIDevice {
>      bool no_kvm_intx;
>      bool no_kvm_msi;
>      bool no_kvm_msix;
> +    bool msix_no_mmap;
>  } VFIOPCIDevice;
>  
>  uint32_t vfio_pci_read_config(PCIDevice *pdev, uint32_t addr, int len);
> diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
> index f3a2ac9..927d600 100644
> --- a/include/hw/vfio/vfio-common.h
> +++ b/include/hw/vfio/vfio-common.h
> @@ -171,6 +171,7 @@ int vfio_get_region_info(VFIODevice *vbasedev, int index,
>                           struct vfio_region_info **info);
>  int vfio_get_dev_region_info(VFIODevice *vbasedev, uint32_t type,
>                               uint32_t subtype, struct vfio_region_info **info);
> +bool vfio_is_cap_present(VFIODevice *vbasedev, uint16_t cap_type, int region);
>  #endif
>  extern const MemoryListener vfio_prereg_listener;
>  
> diff --git a/linux-headers/linux/vfio.h b/linux-headers/linux/vfio.h
> index 4e7ab4c..bce9baf 100644
> --- a/linux-headers/linux/vfio.h
> +++ b/linux-headers/linux/vfio.h
> @@ -300,6 +300,11 @@ struct vfio_region_info_cap_type {
>  #define VFIO_REGION_SUBTYPE_INTEL_IGD_HOST_CFG	(2)
>  #define VFIO_REGION_SUBTYPE_INTEL_IGD_LPC_CFG	(3)
>  
> +/*
> + * The MSIX mappable capability informs that MSIX data of a BAR can be mmapped.
> + */
> +#define VFIO_REGION_INFO_CAP_MSIX_MAPPABLE	3
> +
>  /**
>   * VFIO_DEVICE_GET_IRQ_INFO - _IOWR(VFIO_TYPE, VFIO_BASE + 9,
>   *				    struct vfio_irq_info)
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 9de63f0..1dfc386 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -3742,13 +3742,21 @@ static const TypeInfo spapr_machine_info = {
>  /*
>   * pseries-2.11
>   */
> +#define SPAPR_COMPAT_2_11                                             \
> +    HW_COMPAT_2_10                                                    \
> +    {                                                                 \
> +        .driver = "vfio-pci",                                         \
> +        .property = "msix-no-mmap",                                   \
> +        .value    = "on",                                             \
> +    },                                                                \
> +
>  static void spapr_machine_2_11_instance_options(MachineState *machine)
>  {
>  }
>  
>  static void spapr_machine_2_11_class_options(MachineClass *mc)
>  {
> -    /* Defaults for the latest behaviour inherited from the base class */
> +    SET_MACHINE_COMPAT(mc, SPAPR_COMPAT_2_11);
>  }
>  
>  DEFINE_SPAPR_MACHINE(2_11, "2.11", true);
> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
> index ed7717d..593514c 100644
> --- a/hw/vfio/common.c
> +++ b/hw/vfio/common.c
> @@ -1408,6 +1408,21 @@ int vfio_get_dev_region_info(VFIODevice *vbasedev, uint32_t type,
>      return -ENODEV;
>  }
>  
> +bool vfio_is_cap_present(VFIODevice *vbasedev, uint16_t cap_type, int region)
> +{
> +        struct vfio_region_info *info = NULL;
> +        bool ret = false;
> +
> +        if (!vfio_get_region_info(vbasedev, region, &info)) {
> +            if (vfio_get_region_info_cap(info, cap_type)) {
> +                ret = true;
> +            }
> +            g_free(info);
> +        }
> +
> +        return ret;
> +}
> +
>  /*
>   * Interfaces for IBM EEH (Enhanced Error Handling)
>   */
> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
> index c977ee3..d9aeae8 100644
> --- a/hw/vfio/pci.c
> +++ b/hw/vfio/pci.c
> @@ -1289,6 +1289,12 @@ static void vfio_pci_fixup_msix_region(VFIOPCIDevice *vdev)
>      off_t start, end;
>      VFIORegion *region = &vdev->bars[vdev->msix->table_bar].region;
>  
> +    if (!vdev->msix_no_mmap &&
> +        vfio_is_cap_present(&vdev->vbasedev, VFIO_REGION_INFO_CAP_MSIX_MAPPABLE,
> +                            vdev->msix->table_bar)) {
> +        return;
> +    }
> +
>      /*
>       * We expect to find a single mmap covering the whole BAR, anything else
>       * means it's either unsupported or already setup.
> @@ -1473,6 +1479,10 @@ static int vfio_msix_setup(VFIOPCIDevice *vdev, int pos, Error **errp)
>       */
>      memory_region_set_enabled(&vdev->pdev.msix_pba_mmio, false);
>  
> +    if (!vdev->msix_no_mmap) {
> +        memory_region_set_enabled(&vdev->pdev.msix_table_mmio, false);
> +    }
> +
>      return 0;
>  }
>  
> @@ -2986,6 +2996,7 @@ static Property vfio_pci_dev_properties[] = {
>      DEFINE_PROP_BIT("x-igd-opregion", VFIOPCIDevice, features,
>                      VFIO_FEATURE_ENABLE_IGD_OPREGION_BIT, false),
>      DEFINE_PROP_BOOL("x-no-mmap", VFIOPCIDevice, vbasedev.no_mmap, false),
> +    DEFINE_PROP_BOOL("msix-no-mmap", VFIOPCIDevice, msix_no_mmap, true),
>      DEFINE_PROP_BOOL("x-no-kvm-intx", VFIOPCIDevice, no_kvm_intx, false),
>      DEFINE_PROP_BOOL("x-no-kvm-msi", VFIOPCIDevice, no_kvm_msi, false),
>      DEFINE_PROP_BOOL("x-no-kvm-msix", VFIOPCIDevice, no_kvm_msix, false),

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson
Re: [Qemu-devel] [PATCH qemu] RFC: vfio-pci: Allow mmap of MSIX BAR
Posted by Alex Williamson 6 years, 4 months ago
On Fri, 15 Dec 2017 15:07:31 +1100
David Gibson <david@gibson.dropbear.id.au> wrote:

> On Tue, Dec 12, 2017 at 04:21:31PM +1100, Alexey Kardashevskiy wrote:
> > This makes use of a new VFIO_REGION_INFO_CAP_MSIX_MAPPABLE capability
> > which tells that a region with MSIX data can be mapped entirely, i.e.
> > the VFIO PCI driver won't prevent MSIX vectors area from being mapped.
> > 
> > This adds a "msix-no-mmap" property to the vfio-pci device, it is "true"
> > by default and "false" for pseries-2.12+ machines.
> > 
> > This requites kernel's "vfio-pci: Allow mapping MSIX BAR"
> > https://www.spinics.net/lists/kvm/msg160282.html
> > 
> > Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> > ---
> > 
> > This is an RFC as it requires kernel headers update which is not there yet.
> > 
> > I'd like to make it "msix-mmap" (without "no") but could not find a way
> > of enabling a device property for machine versions newer than some value.
> > 
> > I changed 2.11 machine just for the demonstration purpose.  
> 
> As Alex says, the mmap()ability of the MSI-X BAR isn't really the
> point.  The point is whether we need to intercept guest MMIOs to the
> MSI-X region.  Still, the logic's basically right, just rename your
> property to, say, "intercept_msix_mmio".  It would be true by default,
> set to false by the pseries machine type.
> 
> I don't think you actually need to make it vary depending on the
> version of the pseries machine type: whether the BAR is mmap()ed or
> qemu emulated shouldn't be a guest visible change.  No PAPR guest
> should have been directly poking the MSI-X region (ever), so we
> shouldn't need to intercept the region even for old versions.

I have to ask, is the vfio-pci driver really the right point in the VM
to be understanding whether the platform requires MSI-X MMIO
emulation?  vfio-pci is only unique here in that enabling that
emulation harms performance, but AIUI it's unused on any device and
there may eventually be other devices affected in the same way as
vfio-pci.  So should there be some post-realize platform code that
disables MSI-X MemoryRegions or should the MSI-X code call out to some
platform hook to determine whether to enable emulation?  Seems like a
case where the impact might be unique to vfio, but the root of the
problem is not.  Thanks,

Alex

Re: [Qemu-devel] [PATCH qemu] RFC: vfio-pci: Allow mmap of MSIX BAR
Posted by David Gibson 6 years, 4 months ago
On Fri, Dec 15, 2017 at 09:04:28AM -0700, Alex Williamson wrote:
> On Fri, 15 Dec 2017 15:07:31 +1100
> David Gibson <david@gibson.dropbear.id.au> wrote:
> 
> > On Tue, Dec 12, 2017 at 04:21:31PM +1100, Alexey Kardashevskiy wrote:
> > > This makes use of a new VFIO_REGION_INFO_CAP_MSIX_MAPPABLE capability
> > > which tells that a region with MSIX data can be mapped entirely, i.e.
> > > the VFIO PCI driver won't prevent MSIX vectors area from being mapped.
> > > 
> > > This adds a "msix-no-mmap" property to the vfio-pci device, it is "true"
> > > by default and "false" for pseries-2.12+ machines.
> > > 
> > > This requites kernel's "vfio-pci: Allow mapping MSIX BAR"
> > > https://www.spinics.net/lists/kvm/msg160282.html
> > > 
> > > Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> > > ---
> > > 
> > > This is an RFC as it requires kernel headers update which is not there yet.
> > > 
> > > I'd like to make it "msix-mmap" (without "no") but could not find a way
> > > of enabling a device property for machine versions newer than some value.
> > > 
> > > I changed 2.11 machine just for the demonstration purpose.  
> > 
> > As Alex says, the mmap()ability of the MSI-X BAR isn't really the
> > point.  The point is whether we need to intercept guest MMIOs to the
> > MSI-X region.  Still, the logic's basically right, just rename your
> > property to, say, "intercept_msix_mmio".  It would be true by default,
> > set to false by the pseries machine type.
> > 
> > I don't think you actually need to make it vary depending on the
> > version of the pseries machine type: whether the BAR is mmap()ed or
> > qemu emulated shouldn't be a guest visible change.  No PAPR guest
> > should have been directly poking the MSI-X region (ever), so we
> > shouldn't need to intercept the region even for old versions.
> 
> I have to ask, is the vfio-pci driver really the right point in the VM
> to be understanding whether the platform requires MSI-X MMIO
> emulation?  vfio-pci is only unique here in that enabling that
> emulation harms performance, but AIUI it's unused on any device and
> there may eventually be other devices affected in the same way as
> vfio-pci.  So should there be some post-realize platform code that
> disables MSI-X MemoryRegions or should the MSI-X code call out to some
> platform hook to determine whether to enable emulation?  Seems like a
> case where the impact might be unique to vfio, but the root of the
> problem is not.  Thanks,

That's a good point.  If we can reasonably do it at the level of a
generic PCI device, that would be preferable.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson