[RFC XEN PATCH v14 5/5] tools: Add new function to do PIRQ (un)map on PVH dom0

Jiqian Chen posted 5 patches 2 weeks, 1 day ago
There is a newer version of this series
[RFC XEN PATCH v14 5/5] tools: Add new function to do PIRQ (un)map on PVH dom0
Posted by Jiqian Chen 2 weeks, 1 day ago
When dom0 is PVH, and passthrough a device to dumU, xl will
use the gsi number of device to do a pirq mapping, see
pci_add_dm_done->xc_physdev_map_pirq, but the gsi number is
got from file /sys/bus/pci/devices/<sbdf>/irq, that confuses
irq and gsi, they are in different space and are not equal,
so it will fail when mapping.
To solve this issue, use xc_physdev_gsi_from_dev to get the
real gsi and add a new function xc_physdev_map_pirq_gsi to get
a free pirq for gsi(why not use current function
xc_physdev_map_pirq, because it doesn't support to allocate a
free pirq, what's more, to prevent changing it and affecting
its callers, so add xc_physdev_map_pirq_gsi).

Besides, PVH dom0 doesn't have PIRQ flag, it doesn't do
PHYSDEVOP_map_pirq for each gsi. So grant function callstack
pci_add_dm_done->XEN_DOMCTL_irq_permission will fail at function
domain_pirq_to_irq. And old hypercall XEN_DOMCTL_irq_permission
requires passing in pirq, it is not suitable for dom0 that
doesn't have PIRQs to grant irq permission.
To solve this issue, use the new hypercall
XEN_DOMCTL_gsi_permission to grant the permission of irq(
translate from gsi) to dumU when dom0 has no PIRQs.

Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
Signed-off-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Chen Jiqian <Jiqian.Chen@amd.com>
---
RFC: it needs to wait for the corresponding third patch on linux kernel side to be merged.
https://lore.kernel.org/xen-devel/20240607075109.126277-4-Jiqian.Chen@amd.com/
---
v13->v14 changes:
No.

v12->v13 changes:
Deleted patch #6 of v12, and added function xc_physdev_map_pirq_gsi to map pirq for gsi.
For functions that generate libxl error, changed the return value from -1 to ERROR_*.
Instead of declaring "ctx", use the macro "CTX".
Add the function libxl__arch_local_romain_ has_pirq_notion to determine if there is a concept of pirq in the domain where xl is located.
In the function libxl__arch_hvm_unmap_gsi, before unmap_pirq, use map_pirq to obtain the pirq corresponding to gsi.

v11->v12 changes:
Nothing.

v10->v11 changes:
New patch
Modification of the tools part of patches#4 and #5 of v10, use privcmd_gsi_from_dev to get gsi, and use XEN_DOMCTL_gsi_permission to grant gsi.
Change the hard-coded 0 to use LIBXL_TOOLSTACK_DOMID.
Add libxl__arch_hvm_map_gsi to distinguish x86 related implementations.
Add a list pcidev_pirq_list to record the relationship between sbdf and pirq, which can be used to obtain the corresponding pirq when unmap PIRQ.
---
 tools/include/xenctrl.h       |  10 +++
 tools/libs/ctrl/xc_domain.c   |  15 +++++
 tools/libs/ctrl/xc_physdev.c  |  27 ++++++++
 tools/libs/light/libxl_arch.h |   6 ++
 tools/libs/light/libxl_arm.c  |  15 +++++
 tools/libs/light/libxl_pci.c  | 112 ++++++++++++++++++++--------------
 tools/libs/light/libxl_x86.c  |  72 ++++++++++++++++++++++
 7 files changed, 212 insertions(+), 45 deletions(-)

diff --git a/tools/include/xenctrl.h b/tools/include/xenctrl.h
index 924f9a35f790..29617585c535 100644
--- a/tools/include/xenctrl.h
+++ b/tools/include/xenctrl.h
@@ -1383,6 +1383,11 @@ int xc_domain_irq_permission(xc_interface *xch,
                              uint32_t pirq,
                              bool allow_access);
 
+int xc_domain_gsi_permission(xc_interface *xch,
+                             uint32_t domid,
+                             uint32_t gsi,
+                             uint32_t flags);
+
 int xc_domain_iomem_permission(xc_interface *xch,
                                uint32_t domid,
                                unsigned long first_mfn,
@@ -1638,6 +1643,11 @@ int xc_physdev_map_pirq_msi(xc_interface *xch,
                             int entry_nr,
                             uint64_t table_base);
 
+int xc_physdev_map_pirq_gsi(xc_interface *xch,
+                            uint32_t domid,
+                            int gsi,
+                            int *pirq);
+
 int xc_physdev_unmap_pirq(xc_interface *xch,
                           uint32_t domid,
                           int pirq);
diff --git a/tools/libs/ctrl/xc_domain.c b/tools/libs/ctrl/xc_domain.c
index f2d9d14b4d9f..e3538ec0ba80 100644
--- a/tools/libs/ctrl/xc_domain.c
+++ b/tools/libs/ctrl/xc_domain.c
@@ -1394,6 +1394,21 @@ int xc_domain_irq_permission(xc_interface *xch,
     return do_domctl(xch, &domctl);
 }
 
+int xc_domain_gsi_permission(xc_interface *xch,
+                             uint32_t domid,
+                             uint32_t gsi,
+                             uint32_t flags)
+{
+    struct xen_domctl domctl = {
+        .cmd = XEN_DOMCTL_gsi_permission,
+        .domain = domid,
+        .u.gsi_permission.gsi = gsi,
+        .u.gsi_permission.flags = flags,
+    };
+
+    return do_domctl(xch, &domctl);
+}
+
 int xc_domain_iomem_permission(xc_interface *xch,
                                uint32_t domid,
                                unsigned long first_mfn,
diff --git a/tools/libs/ctrl/xc_physdev.c b/tools/libs/ctrl/xc_physdev.c
index 460a8e779ce8..c752cd1f4410 100644
--- a/tools/libs/ctrl/xc_physdev.c
+++ b/tools/libs/ctrl/xc_physdev.c
@@ -95,6 +95,33 @@ int xc_physdev_map_pirq_msi(xc_interface *xch,
     return rc;
 }
 
+int xc_physdev_map_pirq_gsi(xc_interface *xch,
+                            uint32_t domid,
+                            int gsi,
+                            int *pirq)
+{
+    int rc;
+    struct physdev_map_pirq map;
+
+    if ( !pirq )
+    {
+        errno = EINVAL;
+        return -1;
+    }
+    memset(&map, 0, sizeof(struct physdev_map_pirq));
+    map.domid = domid;
+    map.type = MAP_PIRQ_TYPE_GSI;
+    map.index = gsi;
+    map.pirq = *pirq;
+
+    rc = do_physdev_op(xch, PHYSDEVOP_map_pirq, &map, sizeof(map));
+
+    if ( !rc )
+        *pirq = map.pirq;
+
+    return rc;
+}
+
 int xc_physdev_unmap_pirq(xc_interface *xch,
                           uint32_t domid,
                           int pirq)
diff --git a/tools/libs/light/libxl_arch.h b/tools/libs/light/libxl_arch.h
index f88f11d6de1d..c8ef52ddbe7f 100644
--- a/tools/libs/light/libxl_arch.h
+++ b/tools/libs/light/libxl_arch.h
@@ -91,6 +91,12 @@ void libxl__arch_update_domain_config(libxl__gc *gc,
                                       libxl_domain_config *dst,
                                       const libxl_domain_config *src);
 
+_hidden
+int libxl__arch_hvm_map_gsi(libxl__gc *gc, uint32_t sbdf, uint32_t domid);
+_hidden
+int libxl__arch_hvm_unmap_gsi(libxl__gc *gc, uint32_t sbdf, uint32_t domid);
+_hidden
+bool libxl__arch_local_domain_has_pirq_notion(libxl__gc *gc);
 #if defined(__i386__) || defined(__x86_64__)
 
 #define LAPIC_BASE_ADDRESS  0xfee00000
diff --git a/tools/libs/light/libxl_arm.c b/tools/libs/light/libxl_arm.c
index a4029e3ac810..5a9db5e85f6f 100644
--- a/tools/libs/light/libxl_arm.c
+++ b/tools/libs/light/libxl_arm.c
@@ -1774,6 +1774,21 @@ void libxl__arch_update_domain_config(libxl__gc *gc,
 {
 }
 
+int libxl__arch_hvm_map_gsi(libxl__gc *gc, uint32_t sbdf, uint32_t domid)
+{
+    return ERROR_INVAL;
+}
+
+int libxl__arch_hvm_unmap_gsi(libxl__gc *gc, uint32_t sbdf, uint32_t domid)
+{
+    return ERROR_INVAL;
+}
+
+bool libxl__arch_local_domain_has_pirq_notion(libxl__gc *gc)
+{
+    return true;
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/tools/libs/light/libxl_pci.c b/tools/libs/light/libxl_pci.c
index 96cb4da0794e..2014a67e6e56 100644
--- a/tools/libs/light/libxl_pci.c
+++ b/tools/libs/light/libxl_pci.c
@@ -17,6 +17,7 @@
 #include "libxl_osdeps.h" /* must come before any other headers */
 
 #include "libxl_internal.h"
+#include "libxl_arch.h"
 
 #define PCI_BDF                "%04x:%02x:%02x.%01x"
 #define PCI_BDF_SHORT          "%02x:%02x.%01x"
@@ -1478,32 +1479,43 @@ static void pci_add_dm_done(libxl__egc *egc,
     fclose(f);
     if (!pci_supp_legacy_irq())
         goto out_no_irq;
-    sysfs_path = GCSPRINTF(SYSFS_PCI_DEV"/"PCI_BDF"/irq", pci->domain,
-                                pci->bus, pci->dev, pci->func);
-    f = fopen(sysfs_path, "r");
-    if (f == NULL) {
-        LOGED(ERROR, domainid, "Couldn't open %s", sysfs_path);
-        goto out_no_irq;
-    }
-    if ((fscanf(f, "%u", &irq) == 1) && irq) {
-        r = xc_physdev_map_pirq(ctx->xch, domid, irq, &irq);
-        if (r < 0) {
-            LOGED(ERROR, domainid, "xc_physdev_map_pirq irq=%d (error=%d)",
-                  irq, r);
-            fclose(f);
-            rc = ERROR_FAIL;
+
+    /* When dom0 is PVH, should use gsi to map pirq and grant permission */
+    rc = libxl__arch_local_domain_has_pirq_notion(gc);
+    if (!rc) {
+        rc = libxl__arch_hvm_map_gsi(gc, pci_encode_bdf(pci), domid);
+        if (rc) {
+            LOGED(ERROR, domainid, "libxl__arch_hvm_map_gsi failed");
             goto out;
         }
-        r = xc_domain_irq_permission(ctx->xch, domid, irq, 1);
-        if (r < 0) {
-            LOGED(ERROR, domainid,
-                  "xc_domain_irq_permission irq=%d (error=%d)", irq, r);
-            fclose(f);
-            rc = ERROR_FAIL;
-            goto out;
+    } else {
+        sysfs_path = GCSPRINTF(SYSFS_PCI_DEV"/"PCI_BDF"/irq", pci->domain,
+                                    pci->bus, pci->dev, pci->func);
+        f = fopen(sysfs_path, "r");
+        if (f == NULL) {
+            LOGED(ERROR, domainid, "Couldn't open %s", sysfs_path);
+            goto out_no_irq;
+        }
+        if ((fscanf(f, "%u", &irq) == 1) && irq) {
+            r = xc_physdev_map_pirq(ctx->xch, domid, irq, &irq);
+            if (r < 0) {
+                LOGED(ERROR, domainid, "xc_physdev_map_pirq irq=%d (error=%d)",
+                    irq, r);
+                fclose(f);
+                rc = ERROR_FAIL;
+                goto out;
+            }
+            r = xc_domain_irq_permission(ctx->xch, domid, irq, 1);
+            if (r < 0) {
+                LOGED(ERROR, domainid,
+                    "xc_domain_irq_permission irq=%d (error=%d)", irq, r);
+                fclose(f);
+                rc = ERROR_FAIL;
+                goto out;
+            }
         }
+        fclose(f);
     }
-    fclose(f);
 
     /* Don't restrict writes to the PCI config space from this VM */
     if (pci->permissive) {
@@ -2229,33 +2241,43 @@ skip_bar:
     if (!pci_supp_legacy_irq())
         goto skip_legacy_irq;
 
-    sysfs_path = GCSPRINTF(SYSFS_PCI_DEV"/"PCI_BDF"/irq", pci->domain,
-                           pci->bus, pci->dev, pci->func);
-
-    f = fopen(sysfs_path, "r");
-    if (f == NULL) {
-        LOGED(ERROR, domid, "Couldn't open %s", sysfs_path);
-        goto skip_legacy_irq;
-    }
+    /* When dom0 is PVH, should use gsi to unmap pirq and deny permission */
+    rc = libxl__arch_local_domain_has_pirq_notion(gc);
+    if (!rc) {
+        rc = libxl__arch_hvm_unmap_gsi(gc, pci_encode_bdf(pci), domid);
+        if (rc) {
+            LOGED(ERROR, domid, "libxl__arch_hvm_unmap_gsi failed");
+            goto out;
+        }
+    } else {
+        sysfs_path = GCSPRINTF(SYSFS_PCI_DEV"/"PCI_BDF"/irq", pci->domain,
+                            pci->bus, pci->dev, pci->func);
 
-    if ((fscanf(f, "%u", &irq) == 1) && irq) {
-        rc = xc_physdev_unmap_pirq(ctx->xch, domid, irq);
-        if (rc < 0) {
-            /*
-             * QEMU may have already unmapped the IRQ. So the error
-             * may be spurious. For now, still print an error message as
-             * it is not easy to distinguished between valid and
-             * spurious error.
-             */
-            LOGED(ERROR, domid, "xc_physdev_unmap_pirq irq=%d", irq);
+        f = fopen(sysfs_path, "r");
+        if (f == NULL) {
+            LOGED(ERROR, domid, "Couldn't open %s", sysfs_path);
+            goto skip_legacy_irq;
         }
-        rc = xc_domain_irq_permission(ctx->xch, domid, irq, 0);
-        if (rc < 0) {
-            LOGED(ERROR, domid, "xc_domain_irq_permission irq=%d", irq);
+
+        if ((fscanf(f, "%u", &irq) == 1) && irq) {
+            rc = xc_physdev_unmap_pirq(ctx->xch, domid, irq);
+            if (rc < 0) {
+                /*
+                * QEMU may have already unmapped the IRQ. So the error
+                * may be spurious. For now, still print an error message as
+                * it is not easy to distinguished between valid and
+                * spurious error.
+                */
+                LOGED(ERROR, domid, "xc_physdev_unmap_pirq irq=%d", irq);
+            }
+            rc = xc_domain_irq_permission(ctx->xch, domid, irq, 0);
+            if (rc < 0) {
+                LOGED(ERROR, domid, "xc_domain_irq_permission irq=%d", irq);
+            }
         }
-    }
 
-    fclose(f);
+        fclose(f);
+    }
 
 skip_legacy_irq:
 
diff --git a/tools/libs/light/libxl_x86.c b/tools/libs/light/libxl_x86.c
index 60643d6f5376..20e3956f09b8 100644
--- a/tools/libs/light/libxl_x86.c
+++ b/tools/libs/light/libxl_x86.c
@@ -879,6 +879,78 @@ void libxl__arch_update_domain_config(libxl__gc *gc,
                                  libxl_defbool_val(src->b_info.u.hvm.pirq));
 }
 
+bool libxl__arch_local_domain_has_pirq_notion(libxl__gc *gc)
+{
+    int r;
+    xc_domaininfo_t info;
+
+    r = xc_domain_getinfo_single(CTX->xch, LIBXL_TOOLSTACK_DOMID, &info);
+    if (r == 0) {
+        if (!(info.flags & XEN_DOMINF_hvm_guest) ||
+            (info.arch_config.emulation_flags & XEN_X86_EMU_USE_PIRQ))
+            return true;
+    } else {
+        LOGE(ERROR, "getdomaininfo failed ret=%d", r);
+    }
+
+    return false;
+}
+
+int libxl__arch_hvm_map_gsi(libxl__gc *gc, uint32_t sbdf, uint32_t domid)
+{
+    int pirq = -1, gsi, r;
+
+    gsi = xc_pcidev_get_gsi(CTX->xch, sbdf);
+    if (gsi < 0) {
+        return ERROR_FAIL;
+    }
+
+    r = xc_physdev_map_pirq_gsi(CTX->xch, domid, gsi, &pirq);
+    if (r < 0) {
+        LOGED(ERROR, domid, "xc_physdev_map_pirq_gsi gsi=%d ret=%d", gsi, r);
+        return ERROR_FAIL;
+    }
+
+    r = xc_domain_gsi_permission(CTX->xch, domid, gsi, XEN_DOMCTL_GSI_GRANT);
+    if (r < 0) {
+        LOGED(ERROR, domid, "xc_domain_gsi_permission gsi=%d ret=%d", gsi, r);
+        return ERROR_FAIL;
+    }
+
+    return 0;
+}
+
+int libxl__arch_hvm_unmap_gsi(libxl__gc *gc, uint32_t sbdf, uint32_t domid)
+{
+    int pirq = -1, gsi, r;
+
+    gsi = xc_pcidev_get_gsi(CTX->xch, sbdf);
+    if (gsi < 0) {
+        return ERROR_FAIL;
+    }
+
+    /* Before unmapping, use mapping to get the already mapped pirq first */
+    r = xc_physdev_map_pirq_gsi(CTX->xch, domid, gsi, &pirq);
+    if (r < 0) {
+        LOGED(ERROR, domid, "xc_physdev_map_pirq_gsi gsi=%d ret=%d", gsi, r);
+        return ERROR_FAIL;
+    }
+
+    r = xc_physdev_unmap_pirq(CTX->xch, domid, pirq);
+    if (r < 0) {
+        LOGED(ERROR, domid, "xc_physdev_unmap_pirq gsi=%d ret=%d", gsi, r);
+        return ERROR_FAIL;
+    }
+
+    r = xc_domain_gsi_permission(CTX->xch, domid, gsi, XEN_DOMCTL_GSI_REVOKE);
+    if (r < 0) {
+        LOGED(ERROR, domid, "xc_domain_gsi_permission gsi=%d ret=%d", gsi, r);
+        return ERROR_FAIL;
+    }
+
+    return 0;
+}
+
 /*
  * Local variables:
  * mode: C
-- 
2.34.1
Re: [RFC XEN PATCH v14 5/5] tools: Add new function to do PIRQ (un)map on PVH dom0
Posted by Anthony PERARD 2 weeks, 1 day ago
On Tue, Sep 03, 2024 at 03:04:24PM +0800, Jiqian Chen wrote:
> When dom0 is PVH, and passthrough a device to dumU, xl will
> use the gsi number of device to do a pirq mapping, see
> pci_add_dm_done->xc_physdev_map_pirq, but the gsi number is
> got from file /sys/bus/pci/devices/<sbdf>/irq, that confuses
> irq and gsi, they are in different space and are not equal,
> so it will fail when mapping.
> To solve this issue, use xc_physdev_gsi_from_dev to get the

The function name has changed, it's not xc_physdev_gsi_from_dev anymore
;-). It is always possible to write commit description without naming
the functions that are going to be use, they are in the patch anyway.
But, the description should be updated to reflect changes in previous
patches.

> real gsi and add a new function xc_physdev_map_pirq_gsi to get
> a free pirq for gsi(why not use current function
> xc_physdev_map_pirq, because it doesn't support to allocate a
> free pirq, what's more, to prevent changing it and affecting
> its callers, so add xc_physdev_map_pirq_gsi).
> 
> Besides, PVH dom0 doesn't have PIRQ flag, it doesn't do
> PHYSDEVOP_map_pirq for each gsi. So grant function callstack
> pci_add_dm_done->XEN_DOMCTL_irq_permission will fail at function
> domain_pirq_to_irq. And old hypercall XEN_DOMCTL_irq_permission
> requires passing in pirq, it is not suitable for dom0 that
> doesn't have PIRQs to grant irq permission.
> To solve this issue, use the new hypercall
> XEN_DOMCTL_gsi_permission to grant the permission of irq(
> translate from gsi) to dumU when dom0 has no PIRQs.
>
> Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
> Signed-off-by: Huang Rui <ray.huang@amd.com>
> Signed-off-by: Chen Jiqian <Jiqian.Chen@amd.com>
> ---
> diff --git a/tools/libs/ctrl/xc_physdev.c b/tools/libs/ctrl/xc_physdev.c
> index 460a8e779ce8..c752cd1f4410 100644
> --- a/tools/libs/ctrl/xc_physdev.c
> +++ b/tools/libs/ctrl/xc_physdev.c
> @@ -95,6 +95,33 @@ int xc_physdev_map_pirq_msi(xc_interface *xch,
>      return rc;
>  }
>  
> +int xc_physdev_map_pirq_gsi(xc_interface *xch,
> +                            uint32_t domid,
> +                            int gsi,
> +                            int *pirq)
> +{
> +    int rc;
> +    struct physdev_map_pirq map;
> +
> +    if ( !pirq )
> +    {
> +        errno = EINVAL;
> +        return -1;
> +    }
> +    memset(&map, 0, sizeof(struct physdev_map_pirq));
> +    map.domid = domid;
> +    map.type = MAP_PIRQ_TYPE_GSI;
> +    map.index = gsi;

You can write instead, when declaring `map`:

    struct physdev_map_pirq map = {
        .domid = domid,
        .type = MAP_PIRQ_TYPE_GSI,
        .index = gsi,
    };

Then there's no need to call memset(), the compiler will know what to
do.

> +    map.pirq = *pirq;
> +
> +    rc = do_physdev_op(xch, PHYSDEVOP_map_pirq, &map, sizeof(map));
> +
> +    if ( !rc )
> +        *pirq = map.pirq;
> +
> +    return rc;
> +}
> +
>  int xc_physdev_unmap_pirq(xc_interface *xch,
>                            uint32_t domid,
>                            int pirq)
> diff --git a/tools/libs/light/libxl_pci.c b/tools/libs/light/libxl_pci.c
> index 96cb4da0794e..2014a67e6e56 100644
> --- a/tools/libs/light/libxl_pci.c
> +++ b/tools/libs/light/libxl_pci.c
> @@ -17,6 +17,7 @@
>  #include "libxl_osdeps.h" /* must come before any other headers */
>  
>  #include "libxl_internal.h"
> +#include "libxl_arch.h"
>  
>  #define PCI_BDF                "%04x:%02x:%02x.%01x"
>  #define PCI_BDF_SHORT          "%02x:%02x.%01x"
> @@ -1478,32 +1479,43 @@ static void pci_add_dm_done(libxl__egc *egc,
>      fclose(f);
>      if (!pci_supp_legacy_irq())
>          goto out_no_irq;
> -    sysfs_path = GCSPRINTF(SYSFS_PCI_DEV"/"PCI_BDF"/irq", pci->domain,
> -                                pci->bus, pci->dev, pci->func);
> -    f = fopen(sysfs_path, "r");
> -    if (f == NULL) {
> -        LOGED(ERROR, domainid, "Couldn't open %s", sysfs_path);
> -        goto out_no_irq;
> -    }
> -    if ((fscanf(f, "%u", &irq) == 1) && irq) {
> -        r = xc_physdev_map_pirq(ctx->xch, domid, irq, &irq);
> -        if (r < 0) {
> -            LOGED(ERROR, domainid, "xc_physdev_map_pirq irq=%d (error=%d)",
> -                  irq, r);
> -            fclose(f);
> -            rc = ERROR_FAIL;
> +
> +    /* When dom0 is PVH, should use gsi to map pirq and grant permission */
> +    rc = libxl__arch_local_domain_has_pirq_notion(gc);
> +    if (!rc) {

Here, you can libxl__arch_local_domain_has_pirq_notion() within the if()
condition because the function returns a bool. (Also, `rc` is for libxl
error code, so we can make the mistake here in thinking that there an
error code been been store.) Alternatively, you could declare "bool ok"
and use that.

After that, it will be easier to read the condition has "if has pirq" or
"if ok" instead of "if no error".



> +        rc = libxl__arch_hvm_map_gsi(gc, pci_encode_bdf(pci), domid);
> +        if (rc) {
> +            LOGED(ERROR, domainid, "libxl__arch_hvm_map_gsi failed");

I think LOGD() instead of LOGED() would be enough here.
libxl__arch_hvm_map_gsi() already logs `strerror(errno)` so there's no
need to print it again.

>              goto out;
>          }
> -        r = xc_domain_irq_permission(ctx->xch, domid, irq, 1);
> -        if (r < 0) {
> -            LOGED(ERROR, domainid,
> -                  "xc_domain_irq_permission irq=%d (error=%d)", irq, r);
> -            fclose(f);
> -            rc = ERROR_FAIL;
> -            goto out;
> +    } else {
> +        sysfs_path = GCSPRINTF(SYSFS_PCI_DEV"/"PCI_BDF"/irq", pci->domain,
> +                                    pci->bus, pci->dev, pci->func);
> +        f = fopen(sysfs_path, "r");
> +        if (f == NULL) {
> +            LOGED(ERROR, domainid, "Couldn't open %s", sysfs_path);
> +            goto out_no_irq;
> +        }
> +        if ((fscanf(f, "%u", &irq) == 1) && irq) {
> +            r = xc_physdev_map_pirq(ctx->xch, domid, irq, &irq);
> +            if (r < 0) {
> +                LOGED(ERROR, domainid, "xc_physdev_map_pirq irq=%d (error=%d)",
> +                    irq, r);
> +                fclose(f);
> +                rc = ERROR_FAIL;
> +                goto out;
> +            }
> +            r = xc_domain_irq_permission(ctx->xch, domid, irq, 1);
> +            if (r < 0) {
> +                LOGED(ERROR, domainid,
> +                    "xc_domain_irq_permission irq=%d (error=%d)", irq, r);
> +                fclose(f);
> +                rc = ERROR_FAIL;
> +                goto out;
> +            }
>          }
> +        fclose(f);
>      }
> -    fclose(f);
>  
>      /* Don't restrict writes to the PCI config space from this VM */
>      if (pci->permissive) {
> @@ -2229,33 +2241,43 @@ skip_bar:
>      if (!pci_supp_legacy_irq())
>          goto skip_legacy_irq;
>  
> -    sysfs_path = GCSPRINTF(SYSFS_PCI_DEV"/"PCI_BDF"/irq", pci->domain,
> -                           pci->bus, pci->dev, pci->func);
> -
> -    f = fopen(sysfs_path, "r");
> -    if (f == NULL) {
> -        LOGED(ERROR, domid, "Couldn't open %s", sysfs_path);
> -        goto skip_legacy_irq;
> -    }
> +    /* When dom0 is PVH, should use gsi to unmap pirq and deny permission */
> +    rc = libxl__arch_local_domain_has_pirq_notion(gc);
> +    if (!rc) {
> +        rc = libxl__arch_hvm_unmap_gsi(gc, pci_encode_bdf(pci), domid);
> +        if (rc) {
> +            LOGED(ERROR, domid, "libxl__arch_hvm_unmap_gsi failed");
> +            goto out;
> +        }
> +    } else {
> +        sysfs_path = GCSPRINTF(SYSFS_PCI_DEV"/"PCI_BDF"/irq", pci->domain,
> +                            pci->bus, pci->dev, pci->func);
>  
> -    if ((fscanf(f, "%u", &irq) == 1) && irq) {
> -        rc = xc_physdev_unmap_pirq(ctx->xch, domid, irq);
> -        if (rc < 0) {
> -            /*
> -             * QEMU may have already unmapped the IRQ. So the error
> -             * may be spurious. For now, still print an error message as
> -             * it is not easy to distinguished between valid and
> -             * spurious error.
> -             */
> -            LOGED(ERROR, domid, "xc_physdev_unmap_pirq irq=%d", irq);
> +        f = fopen(sysfs_path, "r");
> +        if (f == NULL) {
> +            LOGED(ERROR, domid, "Couldn't open %s", sysfs_path);
> +            goto skip_legacy_irq;
>          }
> -        rc = xc_domain_irq_permission(ctx->xch, domid, irq, 0);
> -        if (rc < 0) {
> -            LOGED(ERROR, domid, "xc_domain_irq_permission irq=%d", irq);
> +
> +        if ((fscanf(f, "%u", &irq) == 1) && irq) {
> +            rc = xc_physdev_unmap_pirq(ctx->xch, domid, irq);
> +            if (rc < 0) {
> +                /*
> +                * QEMU may have already unmapped the IRQ. So the error

The * ...     here ^ should be aligned with the * on the previous line.

> +                * may be spurious. For now, still print an error message as
> +                * it is not easy to distinguished between valid and
> +                * spurious error.
> +                */
> +                LOGED(ERROR, domid, "xc_physdev_unmap_pirq irq=%d", irq);
> +            }
> +            rc = xc_domain_irq_permission(ctx->xch, domid, irq, 0);
> +            if (rc < 0) {
> +                LOGED(ERROR, domid, "xc_domain_irq_permission irq=%d", irq);
> +            }
>          }
> -    }
>  
> -    fclose(f);
> +        fclose(f);
> +    }
>  
>  skip_legacy_irq:
>  
> diff --git a/tools/libs/light/libxl_x86.c b/tools/libs/light/libxl_x86.c
> index 60643d6f5376..20e3956f09b8 100644
> --- a/tools/libs/light/libxl_x86.c
> +++ b/tools/libs/light/libxl_x86.c
> +int libxl__arch_hvm_map_gsi(libxl__gc *gc, uint32_t sbdf, uint32_t domid)
> +{
> +    int pirq = -1, gsi, r;
> +
> +    gsi = xc_pcidev_get_gsi(CTX->xch, sbdf);
> +    if (gsi < 0) {
> +        return ERROR_FAIL;
> +    }
> +
> +    r = xc_physdev_map_pirq_gsi(CTX->xch, domid, gsi, &pirq);
> +    if (r < 0) {
> +        LOGED(ERROR, domid, "xc_physdev_map_pirq_gsi gsi=%d ret=%d", gsi, r);

`r` should be -1, I don't think loggin it is useful..

> +        return ERROR_FAIL;
> +    }
> +
> +    r = xc_domain_gsi_permission(CTX->xch, domid, gsi, XEN_DOMCTL_GSI_GRANT);
> +    if (r < 0) {
> +        LOGED(ERROR, domid, "xc_domain_gsi_permission gsi=%d ret=%d", gsi, r);

Same here. And in the next function.

> +        return ERROR_FAIL;
> +    }
> +
> +    return 0;
> +}
> +
> +int libxl__arch_hvm_unmap_gsi(libxl__gc *gc, uint32_t sbdf, uint32_t domid)
> +{
> +    int pirq = -1, gsi, r;
> +
> +    gsi = xc_pcidev_get_gsi(CTX->xch, sbdf);
> +    if (gsi < 0) {
> +        return ERROR_FAIL;
> +    }
> +
> +    /* Before unmapping, use mapping to get the already mapped pirq first */
> +    r = xc_physdev_map_pirq_gsi(CTX->xch, domid, gsi, &pirq);
> +    if (r < 0) {
> +        LOGED(ERROR, domid, "xc_physdev_map_pirq_gsi gsi=%d ret=%d", gsi, r);
> +        return ERROR_FAIL;
> +    }
> +
> +    r = xc_physdev_unmap_pirq(CTX->xch, domid, pirq);
> +    if (r < 0) {
> +        LOGED(ERROR, domid, "xc_physdev_unmap_pirq gsi=%d ret=%d", gsi, r);
> +        return ERROR_FAIL;
> +    }
> +
> +    r = xc_domain_gsi_permission(CTX->xch, domid, gsi, XEN_DOMCTL_GSI_REVOKE);
> +    if (r < 0) {
> +        LOGED(ERROR, domid, "xc_domain_gsi_permission gsi=%d ret=%d", gsi, r);
> +        return ERROR_FAIL;
> +    }
> +
> +    return 0;
> +}


Thanks,

-- 

Anthony Perard | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech
Re: [RFC XEN PATCH v14 5/5] tools: Add new function to do PIRQ (un)map on PVH dom0
Posted by Chen, Jiqian 2 weeks ago
On 2024/9/4 01:16, Anthony PERARD wrote:
> On Tue, Sep 03, 2024 at 03:04:24PM +0800, Jiqian Chen wrote:
>> When dom0 is PVH, and passthrough a device to dumU, xl will
>> use the gsi number of device to do a pirq mapping, see
>> pci_add_dm_done->xc_physdev_map_pirq, but the gsi number is
>> got from file /sys/bus/pci/devices/<sbdf>/irq, that confuses
>> irq and gsi, they are in different space and are not equal,
>> so it will fail when mapping.
>> To solve this issue, use xc_physdev_gsi_from_dev to get the
> 
> The function name has changed, it's not xc_physdev_gsi_from_dev anymore
> ;-). It is always possible to write commit description without naming
> the functions that are going to be use, they are in the patch anyway.
> But, the description should be updated to reflect changes in previous
> patches.
Will change in next version, and include all your below comments, thank you very much!

> 
>> real gsi and add a new function xc_physdev_map_pirq_gsi to get
>> a free pirq for gsi(why not use current function
>> xc_physdev_map_pirq, because it doesn't support to allocate a
>> free pirq, what's more, to prevent changing it and affecting
>> its callers, so add xc_physdev_map_pirq_gsi).
>>
>> Besides, PVH dom0 doesn't have PIRQ flag, it doesn't do
>> PHYSDEVOP_map_pirq for each gsi. So grant function callstack
>> pci_add_dm_done->XEN_DOMCTL_irq_permission will fail at function
>> domain_pirq_to_irq. And old hypercall XEN_DOMCTL_irq_permission
>> requires passing in pirq, it is not suitable for dom0 that
>> doesn't have PIRQs to grant irq permission.
>> To solve this issue, use the new hypercall
>> XEN_DOMCTL_gsi_permission to grant the permission of irq(
>> translate from gsi) to dumU when dom0 has no PIRQs.
>>
>> Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
>> Signed-off-by: Huang Rui <ray.huang@amd.com>
>> Signed-off-by: Chen Jiqian <Jiqian.Chen@amd.com>
>> ---
>> diff --git a/tools/libs/ctrl/xc_physdev.c b/tools/libs/ctrl/xc_physdev.c
>> index 460a8e779ce8..c752cd1f4410 100644
>> --- a/tools/libs/ctrl/xc_physdev.c
>> +++ b/tools/libs/ctrl/xc_physdev.c
>> @@ -95,6 +95,33 @@ int xc_physdev_map_pirq_msi(xc_interface *xch,
>>      return rc;
>>  }
>>  
>> +int xc_physdev_map_pirq_gsi(xc_interface *xch,
>> +                            uint32_t domid,
>> +                            int gsi,
>> +                            int *pirq)
>> +{
>> +    int rc;
>> +    struct physdev_map_pirq map;
>> +
>> +    if ( !pirq )
>> +    {
>> +        errno = EINVAL;
>> +        return -1;
>> +    }
>> +    memset(&map, 0, sizeof(struct physdev_map_pirq));
>> +    map.domid = domid;
>> +    map.type = MAP_PIRQ_TYPE_GSI;
>> +    map.index = gsi;
> 
> You can write instead, when declaring `map`:
> 
>     struct physdev_map_pirq map = {
>         .domid = domid,
>         .type = MAP_PIRQ_TYPE_GSI,
>         .index = gsi,
>     };
> 
> Then there's no need to call memset(), the compiler will know what to
> do.
> 
>> +    map.pirq = *pirq;
>> +
>> +    rc = do_physdev_op(xch, PHYSDEVOP_map_pirq, &map, sizeof(map));
>> +
>> +    if ( !rc )
>> +        *pirq = map.pirq;
>> +
>> +    return rc;
>> +}
>> +
>>  int xc_physdev_unmap_pirq(xc_interface *xch,
>>                            uint32_t domid,
>>                            int pirq)
>> diff --git a/tools/libs/light/libxl_pci.c b/tools/libs/light/libxl_pci.c
>> index 96cb4da0794e..2014a67e6e56 100644
>> --- a/tools/libs/light/libxl_pci.c
>> +++ b/tools/libs/light/libxl_pci.c
>> @@ -17,6 +17,7 @@
>>  #include "libxl_osdeps.h" /* must come before any other headers */
>>  
>>  #include "libxl_internal.h"
>> +#include "libxl_arch.h"
>>  
>>  #define PCI_BDF                "%04x:%02x:%02x.%01x"
>>  #define PCI_BDF_SHORT          "%02x:%02x.%01x"
>> @@ -1478,32 +1479,43 @@ static void pci_add_dm_done(libxl__egc *egc,
>>      fclose(f);
>>      if (!pci_supp_legacy_irq())
>>          goto out_no_irq;
>> -    sysfs_path = GCSPRINTF(SYSFS_PCI_DEV"/"PCI_BDF"/irq", pci->domain,
>> -                                pci->bus, pci->dev, pci->func);
>> -    f = fopen(sysfs_path, "r");
>> -    if (f == NULL) {
>> -        LOGED(ERROR, domainid, "Couldn't open %s", sysfs_path);
>> -        goto out_no_irq;
>> -    }
>> -    if ((fscanf(f, "%u", &irq) == 1) && irq) {
>> -        r = xc_physdev_map_pirq(ctx->xch, domid, irq, &irq);
>> -        if (r < 0) {
>> -            LOGED(ERROR, domainid, "xc_physdev_map_pirq irq=%d (error=%d)",
>> -                  irq, r);
>> -            fclose(f);
>> -            rc = ERROR_FAIL;
>> +
>> +    /* When dom0 is PVH, should use gsi to map pirq and grant permission */
>> +    rc = libxl__arch_local_domain_has_pirq_notion(gc);
>> +    if (!rc) {
> 
> Here, you can libxl__arch_local_domain_has_pirq_notion() within the if()
> condition because the function returns a bool. (Also, `rc` is for libxl
> error code, so we can make the mistake here in thinking that there an
> error code been been store.) Alternatively, you could declare "bool ok"
> and use that.
> 
> After that, it will be easier to read the condition has "if has pirq" or
> "if ok" instead of "if no error".
> 
> 
> 
>> +        rc = libxl__arch_hvm_map_gsi(gc, pci_encode_bdf(pci), domid);
>> +        if (rc) {
>> +            LOGED(ERROR, domainid, "libxl__arch_hvm_map_gsi failed");
> 
> I think LOGD() instead of LOGED() would be enough here.
> libxl__arch_hvm_map_gsi() already logs `strerror(errno)` so there's no
> need to print it again.
> 
>>              goto out;
>>          }
>> -        r = xc_domain_irq_permission(ctx->xch, domid, irq, 1);
>> -        if (r < 0) {
>> -            LOGED(ERROR, domainid,
>> -                  "xc_domain_irq_permission irq=%d (error=%d)", irq, r);
>> -            fclose(f);
>> -            rc = ERROR_FAIL;
>> -            goto out;
>> +    } else {
>> +        sysfs_path = GCSPRINTF(SYSFS_PCI_DEV"/"PCI_BDF"/irq", pci->domain,
>> +                                    pci->bus, pci->dev, pci->func);
>> +        f = fopen(sysfs_path, "r");
>> +        if (f == NULL) {
>> +            LOGED(ERROR, domainid, "Couldn't open %s", sysfs_path);
>> +            goto out_no_irq;
>> +        }
>> +        if ((fscanf(f, "%u", &irq) == 1) && irq) {
>> +            r = xc_physdev_map_pirq(ctx->xch, domid, irq, &irq);
>> +            if (r < 0) {
>> +                LOGED(ERROR, domainid, "xc_physdev_map_pirq irq=%d (error=%d)",
>> +                    irq, r);
>> +                fclose(f);
>> +                rc = ERROR_FAIL;
>> +                goto out;
>> +            }
>> +            r = xc_domain_irq_permission(ctx->xch, domid, irq, 1);
>> +            if (r < 0) {
>> +                LOGED(ERROR, domainid,
>> +                    "xc_domain_irq_permission irq=%d (error=%d)", irq, r);
>> +                fclose(f);
>> +                rc = ERROR_FAIL;
>> +                goto out;
>> +            }
>>          }
>> +        fclose(f);
>>      }
>> -    fclose(f);
>>  
>>      /* Don't restrict writes to the PCI config space from this VM */
>>      if (pci->permissive) {
>> @@ -2229,33 +2241,43 @@ skip_bar:
>>      if (!pci_supp_legacy_irq())
>>          goto skip_legacy_irq;
>>  
>> -    sysfs_path = GCSPRINTF(SYSFS_PCI_DEV"/"PCI_BDF"/irq", pci->domain,
>> -                           pci->bus, pci->dev, pci->func);
>> -
>> -    f = fopen(sysfs_path, "r");
>> -    if (f == NULL) {
>> -        LOGED(ERROR, domid, "Couldn't open %s", sysfs_path);
>> -        goto skip_legacy_irq;
>> -    }
>> +    /* When dom0 is PVH, should use gsi to unmap pirq and deny permission */
>> +    rc = libxl__arch_local_domain_has_pirq_notion(gc);
>> +    if (!rc) {
>> +        rc = libxl__arch_hvm_unmap_gsi(gc, pci_encode_bdf(pci), domid);
>> +        if (rc) {
>> +            LOGED(ERROR, domid, "libxl__arch_hvm_unmap_gsi failed");
>> +            goto out;
>> +        }
>> +    } else {
>> +        sysfs_path = GCSPRINTF(SYSFS_PCI_DEV"/"PCI_BDF"/irq", pci->domain,
>> +                            pci->bus, pci->dev, pci->func);
>>  
>> -    if ((fscanf(f, "%u", &irq) == 1) && irq) {
>> -        rc = xc_physdev_unmap_pirq(ctx->xch, domid, irq);
>> -        if (rc < 0) {
>> -            /*
>> -             * QEMU may have already unmapped the IRQ. So the error
>> -             * may be spurious. For now, still print an error message as
>> -             * it is not easy to distinguished between valid and
>> -             * spurious error.
>> -             */
>> -            LOGED(ERROR, domid, "xc_physdev_unmap_pirq irq=%d", irq);
>> +        f = fopen(sysfs_path, "r");
>> +        if (f == NULL) {
>> +            LOGED(ERROR, domid, "Couldn't open %s", sysfs_path);
>> +            goto skip_legacy_irq;
>>          }
>> -        rc = xc_domain_irq_permission(ctx->xch, domid, irq, 0);
>> -        if (rc < 0) {
>> -            LOGED(ERROR, domid, "xc_domain_irq_permission irq=%d", irq);
>> +
>> +        if ((fscanf(f, "%u", &irq) == 1) && irq) {
>> +            rc = xc_physdev_unmap_pirq(ctx->xch, domid, irq);
>> +            if (rc < 0) {
>> +                /*
>> +                * QEMU may have already unmapped the IRQ. So the error
> 
> The * ...     here ^ should be aligned with the * on the previous line.
> 
>> +                * may be spurious. For now, still print an error message as
>> +                * it is not easy to distinguished between valid and
>> +                * spurious error.
>> +                */
>> +                LOGED(ERROR, domid, "xc_physdev_unmap_pirq irq=%d", irq);
>> +            }
>> +            rc = xc_domain_irq_permission(ctx->xch, domid, irq, 0);
>> +            if (rc < 0) {
>> +                LOGED(ERROR, domid, "xc_domain_irq_permission irq=%d", irq);
>> +            }
>>          }
>> -    }
>>  
>> -    fclose(f);
>> +        fclose(f);
>> +    }
>>  
>>  skip_legacy_irq:
>>  
>> diff --git a/tools/libs/light/libxl_x86.c b/tools/libs/light/libxl_x86.c
>> index 60643d6f5376..20e3956f09b8 100644
>> --- a/tools/libs/light/libxl_x86.c
>> +++ b/tools/libs/light/libxl_x86.c
>> +int libxl__arch_hvm_map_gsi(libxl__gc *gc, uint32_t sbdf, uint32_t domid)
>> +{
>> +    int pirq = -1, gsi, r;
>> +
>> +    gsi = xc_pcidev_get_gsi(CTX->xch, sbdf);
>> +    if (gsi < 0) {
>> +        return ERROR_FAIL;
>> +    }
>> +
>> +    r = xc_physdev_map_pirq_gsi(CTX->xch, domid, gsi, &pirq);
>> +    if (r < 0) {
>> +        LOGED(ERROR, domid, "xc_physdev_map_pirq_gsi gsi=%d ret=%d", gsi, r);
> 
> `r` should be -1, I don't think loggin it is useful..
> 
>> +        return ERROR_FAIL;
>> +    }
>> +
>> +    r = xc_domain_gsi_permission(CTX->xch, domid, gsi, XEN_DOMCTL_GSI_GRANT);
>> +    if (r < 0) {
>> +        LOGED(ERROR, domid, "xc_domain_gsi_permission gsi=%d ret=%d", gsi, r);
> 
> Same here. And in the next function.
> 
>> +        return ERROR_FAIL;
>> +    }
>> +
>> +    return 0;
>> +}
>> +
>> +int libxl__arch_hvm_unmap_gsi(libxl__gc *gc, uint32_t sbdf, uint32_t domid)
>> +{
>> +    int pirq = -1, gsi, r;
>> +
>> +    gsi = xc_pcidev_get_gsi(CTX->xch, sbdf);
>> +    if (gsi < 0) {
>> +        return ERROR_FAIL;
>> +    }
>> +
>> +    /* Before unmapping, use mapping to get the already mapped pirq first */
>> +    r = xc_physdev_map_pirq_gsi(CTX->xch, domid, gsi, &pirq);
>> +    if (r < 0) {
>> +        LOGED(ERROR, domid, "xc_physdev_map_pirq_gsi gsi=%d ret=%d", gsi, r);
>> +        return ERROR_FAIL;
>> +    }
>> +
>> +    r = xc_physdev_unmap_pirq(CTX->xch, domid, pirq);
>> +    if (r < 0) {
>> +        LOGED(ERROR, domid, "xc_physdev_unmap_pirq gsi=%d ret=%d", gsi, r);
>> +        return ERROR_FAIL;
>> +    }
>> +
>> +    r = xc_domain_gsi_permission(CTX->xch, domid, gsi, XEN_DOMCTL_GSI_REVOKE);
>> +    if (r < 0) {
>> +        LOGED(ERROR, domid, "xc_domain_gsi_permission gsi=%d ret=%d", gsi, r);
>> +        return ERROR_FAIL;
>> +    }
>> +
>> +    return 0;
>> +}
> 
> 
> Thanks,
> 

-- 
Best regards,
Jiqian Chen.