[PATCH v7 2/8] vpci: Refactor REGISTER_VPCI_INIT

Jiqian Chen posted 8 patches 7 months, 1 week ago
There is a newer version of this series
[PATCH v7 2/8] vpci: Refactor REGISTER_VPCI_INIT
Posted by Jiqian Chen 7 months, 1 week ago
Refactor REGISTER_VPCI_INIT to contain more capability specific
information, this will benefit further follow-on changes to hide
capability when initialization fails.

What's more, change the definition of init_header() since it is
not a capability and it is needed for all devices' PCI config space.

After refactor, the "priority" of initializing capabilities isn't
needed anymore, so delete its related codes.

Note:
Call vpci_make_msix_hole() in the end of init_msix() since the change
of sequence of init_header() and init_msix(). And delete the call of
vpci_make_msix_hole() in modify_decoding() since it is not needed.

The cleanup hook is also added in this change, even if it's still
unused. Further changes will make use of it.

Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
---
There is a byte alignment problem in the array __start_vpci_array, which can be solved after
"[PATCH] x86: don't have gcc over-align data" is merged.
---
cc: "Roger Pau Monné" <roger.pau@citrix.com>
cc: Andrew Cooper <andrew.cooper3@citrix.com>
cc: Anthony PERARD <anthony.perard@vates.tech>
cc: Michal Orzel <michal.orzel@amd.com>
cc: Jan Beulich <jbeulich@suse.com>
cc: Julien Grall <julien@xen.org>
cc: Stefano Stabellini <sstabellini@kernel.org>
---
v6->v7 changes:
* Change the pointer parameter of cleanup hook of vpci_capability_t to be const.
  If change parameter of init hook to be const will affect init_msix, and it assigns pdev
  to struct vpci_msix, so keep no const to expanding the impact.
* Delete the vpci_make_msix_hole() call in modify_decoding().
* Change __start_vpci_array from vpci_capability_t* array to vpci_capability_t array.
* Change the name "finit##_t" to be "name##_entry" and add a "name" parameter to macro
  REGISTER_VPCI_CAPABILITY.

v5->v6 changes:
* Rename REGISTER_PCI_CAPABILITY to REGISTER_VPCI_CAPABILITY.
* Move vpci_capability_t entry from ".data.vpci" to ".data.rel.ro.vpci" and
  move the instances of VPCI_ARRAY in the linker scripts before *(.data.rel.ro).
* Change _start/end_vpci_array[] to be const pointer array.

v4->v5 changes:
* Rename REGISTER_VPCI_CAP to REGISTER_PCI_CAPABILITY, rename REGISTER_VPCI_LEGACY_CAP to
  REGISTER_VPCI_CAP, rename REGISTER_VPCI_EXTENDED_CAP to REGISTER_VPCI_EXTCAP.
* Change cleanup hook of vpci_capability_t from void to int.

v3->v4 changes
* Delete the useless trailing dot of section ".data.vpci".
* Add description about priority since this patch removes the initializing priority of
  capabilities and priority is not needed anymore.
* Change the hook name from fini to cleanup.
* Change the name x and y to be finit and fclean.
* Remove the unnecessary check "!capability->init"

v2->v3 changes:
* This is separated from patch "vpci: Hide capability when it fails to initialize" of v2.
* Delete __maybe_unused attribute of "out" in function vpci_assign_devic().
* Rename REGISTER_VPCI_EXTEND_CAP to REGISTER_VPCI_EXTENDED_CAP.

v1->v2 changes:
* Removed the "priorities" of initializing capabilities since it isn't used anymore.
* Added new function vpci_capability_mask() and vpci_ext_capability_mask() to remove
  failed capability from list.
* Called vpci_make_msix_hole() in the end of init_msix().

Best regards,
Jiqian Chen.
---
 xen/arch/arm/xen.lds.S    |  3 +--
 xen/arch/ppc/xen.lds.S    |  3 +--
 xen/arch/riscv/xen.lds.S  |  3 +--
 xen/arch/x86/xen.lds.S    |  2 +-
 xen/drivers/vpci/header.c | 16 +-------------
 xen/drivers/vpci/msi.c    |  2 +-
 xen/drivers/vpci/msix.c   | 11 +++++++---
 xen/drivers/vpci/rebar.c  |  2 +-
 xen/drivers/vpci/vpci.c   | 44 ++++++++++++++++++++++++++++++---------
 xen/include/xen/vpci.h    | 32 ++++++++++++++++++----------
 xen/include/xen/xen.lds.h |  2 +-
 11 files changed, 71 insertions(+), 49 deletions(-)

diff --git a/xen/arch/arm/xen.lds.S b/xen/arch/arm/xen.lds.S
index 5bfbe1e92c1e..9f30c3a13ed1 100644
--- a/xen/arch/arm/xen.lds.S
+++ b/xen/arch/arm/xen.lds.S
@@ -57,6 +57,7 @@ SECTIONS
 
        *(.rodata)
        *(.rodata.*)
+       VPCI_ARRAY
        *(.data.rel.ro)
        *(.data.rel.ro.*)
 
@@ -64,8 +65,6 @@ SECTIONS
        __proc_info_start = .;
        *(.proc.info)
        __proc_info_end = .;
-
-       VPCI_ARRAY
   } :text
 
 #if defined(BUILD_ID)
diff --git a/xen/arch/ppc/xen.lds.S b/xen/arch/ppc/xen.lds.S
index 1366e2819eed..1de0b77fc6b9 100644
--- a/xen/arch/ppc/xen.lds.S
+++ b/xen/arch/ppc/xen.lds.S
@@ -51,11 +51,10 @@ SECTIONS
 
         *(.rodata)
         *(.rodata.*)
+        VPCI_ARRAY
         *(.data.rel.ro)
         *(.data.rel.ro.*)
 
-        VPCI_ARRAY
-
         . = ALIGN(POINTER_ALIGN);
     } :text
 
diff --git a/xen/arch/riscv/xen.lds.S b/xen/arch/riscv/xen.lds.S
index 8c3c06de01f6..edcadff90bfe 100644
--- a/xen/arch/riscv/xen.lds.S
+++ b/xen/arch/riscv/xen.lds.S
@@ -46,11 +46,10 @@ SECTIONS
 
         *(.rodata)
         *(.rodata.*)
+        VPCI_ARRAY
         *(.data.rel.ro)
         *(.data.rel.ro.*)
 
-        VPCI_ARRAY
-
         . = ALIGN(POINTER_ALIGN);
     } :text
 
diff --git a/xen/arch/x86/xen.lds.S b/xen/arch/x86/xen.lds.S
index 636c7768aa3c..8e9cac75b09e 100644
--- a/xen/arch/x86/xen.lds.S
+++ b/xen/arch/x86/xen.lds.S
@@ -135,6 +135,7 @@ SECTIONS
 
        *(.rodata)
        *(.rodata.*)
+       VPCI_ARRAY
        *(.data.rel.ro)
        *(.data.rel.ro.*)
 
@@ -148,7 +149,6 @@ SECTIONS
        *(.note.gnu.build-id)
        __note_gnu_build_id_end = .;
 #endif
-       VPCI_ARRAY
   } PHDR(text)
 
 #if defined(CONFIG_PVH_GUEST) && !defined(EFI)
diff --git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c
index 8ee8052cd4a3..069253b5f721 100644
--- a/xen/drivers/vpci/header.c
+++ b/xen/drivers/vpci/header.c
@@ -122,19 +122,6 @@ static void modify_decoding(const struct pci_dev *pdev, uint16_t cmd,
     bool map = cmd & PCI_COMMAND_MEMORY;
     unsigned int i;
 
-    /*
-     * Make sure there are no mappings in the MSIX MMIO areas, so that accesses
-     * can be trapped (and emulated) by Xen when the memory decoding bit is
-     * enabled.
-     *
-     * FIXME: punching holes after the p2m has been set up might be racy for
-     * DomU usage, needs to be revisited.
-     */
-#ifdef CONFIG_HAS_PCI_MSI
-    if ( map && !rom_only && vpci_make_msix_hole(pdev) )
-        return;
-#endif
-
     for ( i = 0; i < ARRAY_SIZE(header->bars); i++ )
     {
         struct vpci_bar *bar = &header->bars[i];
@@ -869,7 +856,7 @@ static int vpci_init_ext_capability_list(const struct pci_dev *pdev)
     return 0;
 }
 
-static int cf_check init_header(struct pci_dev *pdev)
+int vpci_init_header(struct pci_dev *pdev)
 {
     uint16_t cmd;
     uint64_t addr, size;
@@ -1065,7 +1052,6 @@ static int cf_check init_header(struct pci_dev *pdev)
     pci_conf_write16(pdev->sbdf, PCI_COMMAND, cmd);
     return rc;
 }
-REGISTER_VPCI_INIT(init_header, VPCI_PRIORITY_MIDDLE);
 
 /*
  * Local variables:
diff --git a/xen/drivers/vpci/msi.c b/xen/drivers/vpci/msi.c
index 66e5a8a116be..c3eba4e14870 100644
--- a/xen/drivers/vpci/msi.c
+++ b/xen/drivers/vpci/msi.c
@@ -270,7 +270,7 @@ static int cf_check init_msi(struct pci_dev *pdev)
 
     return 0;
 }
-REGISTER_VPCI_INIT(init_msi, VPCI_PRIORITY_LOW);
+REGISTER_VPCI_CAP(MSI, init_msi, NULL);
 
 void vpci_dump_msi(void)
 {
diff --git a/xen/drivers/vpci/msix.c b/xen/drivers/vpci/msix.c
index 74211301ba10..a1692b9d9f6a 100644
--- a/xen/drivers/vpci/msix.c
+++ b/xen/drivers/vpci/msix.c
@@ -589,7 +589,8 @@ static const struct hvm_mmio_ops vpci_msix_table_ops = {
     .write = msix_write,
 };
 
-int vpci_make_msix_hole(const struct pci_dev *pdev)
+/* Make sure there's a hole in the p2m for the MSIX mmio areas. */
+static int vpci_make_msix_hole(const struct pci_dev *pdev)
 {
     struct domain *d = pdev->domain;
     unsigned int i;
@@ -703,9 +704,13 @@ static int cf_check init_msix(struct pci_dev *pdev)
     pdev->vpci->msix = msix;
     list_add(&msix->next, &d->arch.hvm.msix_tables);
 
-    return 0;
+    spin_lock(&pdev->vpci->lock);
+    rc = vpci_make_msix_hole(pdev);
+    spin_unlock(&pdev->vpci->lock);
+
+    return rc;
 }
-REGISTER_VPCI_INIT(init_msix, VPCI_PRIORITY_HIGH);
+REGISTER_VPCI_CAP(MSIX, init_msix, NULL);
 
 /*
  * Local variables:
diff --git a/xen/drivers/vpci/rebar.c b/xen/drivers/vpci/rebar.c
index 793937449af7..3c18792d9bcd 100644
--- a/xen/drivers/vpci/rebar.c
+++ b/xen/drivers/vpci/rebar.c
@@ -118,7 +118,7 @@ static int cf_check init_rebar(struct pci_dev *pdev)
 
     return 0;
 }
-REGISTER_VPCI_INIT(init_rebar, VPCI_PRIORITY_LOW);
+REGISTER_VPCI_EXTCAP(REBAR, init_rebar, NULL);
 
 /*
  * Local variables:
diff --git a/xen/drivers/vpci/vpci.c b/xen/drivers/vpci/vpci.c
index 8474c0e3b995..e7e5b64f1be4 100644
--- a/xen/drivers/vpci/vpci.c
+++ b/xen/drivers/vpci/vpci.c
@@ -36,8 +36,8 @@ struct vpci_register {
 };
 
 #ifdef __XEN__
-extern vpci_register_init_t *const __start_vpci_array[];
-extern vpci_register_init_t *const __end_vpci_array[];
+extern const vpci_capability_t __start_vpci_array[];
+extern const vpci_capability_t __end_vpci_array[];
 #define NUM_VPCI_INIT (__end_vpci_array - __start_vpci_array)
 
 #ifdef CONFIG_HAS_VPCI_GUEST_SUPPORT
@@ -83,6 +83,32 @@ static int assign_virtual_sbdf(struct pci_dev *pdev)
 
 #endif /* CONFIG_HAS_VPCI_GUEST_SUPPORT */
 
+static int vpci_init_capabilities(struct pci_dev *pdev)
+{
+    for ( unsigned int i = 0; i < NUM_VPCI_INIT; i++ )
+    {
+        const vpci_capability_t *capability = &__start_vpci_array[i];
+        const unsigned int cap = capability->id;
+        const bool is_ext = capability->is_ext;
+        unsigned int pos = 0;
+        int rc;
+
+        if ( !is_ext )
+            pos = pci_find_cap_offset(pdev->sbdf, cap);
+        else if ( is_hardware_domain(pdev->domain) )
+            pos = pci_find_ext_capability(pdev->sbdf, cap);
+
+        if ( !pos )
+            continue;
+
+        rc = capability->init(pdev);
+        if ( rc )
+            return rc;
+    }
+
+    return 0;
+}
+
 void vpci_deassign_device(struct pci_dev *pdev)
 {
     unsigned int i;
@@ -128,7 +154,6 @@ void vpci_deassign_device(struct pci_dev *pdev)
 
 int vpci_assign_device(struct pci_dev *pdev)
 {
-    unsigned int i;
     const unsigned long *ro_map;
     int rc = 0;
 
@@ -159,14 +184,13 @@ int vpci_assign_device(struct pci_dev *pdev)
         goto out;
 #endif
 
-    for ( i = 0; i < NUM_VPCI_INIT; i++ )
-    {
-        rc = __start_vpci_array[i](pdev);
-        if ( rc )
-            break;
-    }
+    rc = vpci_init_header(pdev);
+    if ( rc )
+        goto out;
+
+    rc = vpci_init_capabilities(pdev);
 
- out: __maybe_unused;
+ out:
     if ( rc )
         vpci_deassign_device(pdev);
 
diff --git a/xen/include/xen/vpci.h b/xen/include/xen/vpci.h
index 61d16cc8b897..61287e5d2e12 100644
--- a/xen/include/xen/vpci.h
+++ b/xen/include/xen/vpci.h
@@ -13,11 +13,12 @@ typedef uint32_t vpci_read_t(const struct pci_dev *pdev, unsigned int reg,
 typedef void vpci_write_t(const struct pci_dev *pdev, unsigned int reg,
                           uint32_t val, void *data);
 
-typedef int vpci_register_init_t(struct pci_dev *dev);
-
-#define VPCI_PRIORITY_HIGH      "1"
-#define VPCI_PRIORITY_MIDDLE    "5"
-#define VPCI_PRIORITY_LOW       "9"
+typedef struct {
+    unsigned int id;
+    bool is_ext;
+    int (* init)(struct pci_dev *pdev);
+    int (* cleanup)(const struct pci_dev *pdev);
+} vpci_capability_t;
 
 #define VPCI_ECAM_BDF(addr)     (((addr) & 0x0ffff000) >> 12)
 
@@ -29,9 +30,21 @@ typedef int vpci_register_init_t(struct pci_dev *dev);
  */
 #define VPCI_MAX_VIRT_DEV       (PCI_SLOT(~0) + 1)
 
-#define REGISTER_VPCI_INIT(x, p)                \
-  static vpci_register_init_t *const x##_entry  \
-               __used_section(".data.vpci." p) = (x)
+#define REGISTER_VPCI_CAPABILITY(cap, name, finit, fclean, ext) \
+    static const vpci_capability_t name##_entry \
+        __used_section(".data.rel.ro.vpci") = { \
+        .id = (cap), \
+        .init = (finit), \
+        .cleanup = (fclean), \
+        .is_ext = (ext), \
+    }
+
+#define REGISTER_VPCI_CAP(name, finit, fclean) \
+    REGISTER_VPCI_CAPABILITY(PCI_CAP_ID_##name, name, finit, fclean, false)
+#define REGISTER_VPCI_EXTCAP(name, finit, fclean) \
+    REGISTER_VPCI_CAPABILITY(PCI_EXT_CAP_ID_##name, name, finit, fclean, true)
+
+int __must_check vpci_init_header(struct pci_dev *pdev);
 
 /* Assign vPCI to device by adding handlers. */
 int __must_check vpci_assign_device(struct pci_dev *pdev);
@@ -206,9 +219,6 @@ struct vpci_vcpu {
 #ifdef __XEN__
 void vpci_dump_msi(void);
 
-/* Make sure there's a hole in the p2m for the MSIX mmio areas. */
-int vpci_make_msix_hole(const struct pci_dev *pdev);
-
 /* Arch-specific vPCI MSI helpers. */
 void vpci_msi_arch_mask(struct vpci_msi *msi, const struct pci_dev *pdev,
                         unsigned int entry, bool mask);
diff --git a/xen/include/xen/xen.lds.h b/xen/include/xen/xen.lds.h
index 793d0e11450c..eb86305c11c7 100644
--- a/xen/include/xen/xen.lds.h
+++ b/xen/include/xen/xen.lds.h
@@ -188,7 +188,7 @@
 #define VPCI_ARRAY               \
        . = ALIGN(POINTER_ALIGN); \
        __start_vpci_array = .;   \
-       *(SORT(.data.vpci.*))     \
+       *(.data.rel.ro.vpci)           \
        __end_vpci_array = .;
 #else
 #define VPCI_ARRAY
-- 
2.34.1


Re: [PATCH v7 2/8] vpci: Refactor REGISTER_VPCI_INIT
Posted by Roger Pau Monné 6 months, 3 weeks ago
On Fri, Jul 04, 2025 at 03:07:57PM +0800, Jiqian Chen wrote:
> Refactor REGISTER_VPCI_INIT to contain more capability specific
> information, this will benefit further follow-on changes to hide
> capability when initialization fails.
> 
> What's more, change the definition of init_header() since it is
> not a capability and it is needed for all devices' PCI config space.
> 
> After refactor, the "priority" of initializing capabilities isn't
> needed anymore, so delete its related codes.
> 
> Note:
> Call vpci_make_msix_hole() in the end of init_msix() since the change
> of sequence of init_header() and init_msix(). And delete the call of
> vpci_make_msix_hole() in modify_decoding() since it is not needed.
> 
> The cleanup hook is also added in this change, even if it's still
> unused. Further changes will make use of it.
> 
> Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
> ---
> There is a byte alignment problem in the array __start_vpci_array, which can be solved after
> "[PATCH] x86: don't have gcc over-align data" is merged.
> ---
> cc: "Roger Pau Monné" <roger.pau@citrix.com>
> cc: Andrew Cooper <andrew.cooper3@citrix.com>
> cc: Anthony PERARD <anthony.perard@vates.tech>
> cc: Michal Orzel <michal.orzel@amd.com>
> cc: Jan Beulich <jbeulich@suse.com>
> cc: Julien Grall <julien@xen.org>
> cc: Stefano Stabellini <sstabellini@kernel.org>
> ---
> v6->v7 changes:
> * Change the pointer parameter of cleanup hook of vpci_capability_t to be const.
>   If change parameter of init hook to be const will affect init_msix, and it assigns pdev
>   to struct vpci_msix, so keep no const to expanding the impact.
> * Delete the vpci_make_msix_hole() call in modify_decoding().
> * Change __start_vpci_array from vpci_capability_t* array to vpci_capability_t array.
> * Change the name "finit##_t" to be "name##_entry" and add a "name" parameter to macro
>   REGISTER_VPCI_CAPABILITY.
> 
> v5->v6 changes:
> * Rename REGISTER_PCI_CAPABILITY to REGISTER_VPCI_CAPABILITY.
> * Move vpci_capability_t entry from ".data.vpci" to ".data.rel.ro.vpci" and
>   move the instances of VPCI_ARRAY in the linker scripts before *(.data.rel.ro).
> * Change _start/end_vpci_array[] to be const pointer array.
> 
> v4->v5 changes:
> * Rename REGISTER_VPCI_CAP to REGISTER_PCI_CAPABILITY, rename REGISTER_VPCI_LEGACY_CAP to
>   REGISTER_VPCI_CAP, rename REGISTER_VPCI_EXTENDED_CAP to REGISTER_VPCI_EXTCAP.
> * Change cleanup hook of vpci_capability_t from void to int.
> 
> v3->v4 changes
> * Delete the useless trailing dot of section ".data.vpci".
> * Add description about priority since this patch removes the initializing priority of
>   capabilities and priority is not needed anymore.
> * Change the hook name from fini to cleanup.
> * Change the name x and y to be finit and fclean.
> * Remove the unnecessary check "!capability->init"
> 
> v2->v3 changes:
> * This is separated from patch "vpci: Hide capability when it fails to initialize" of v2.
> * Delete __maybe_unused attribute of "out" in function vpci_assign_devic().
> * Rename REGISTER_VPCI_EXTEND_CAP to REGISTER_VPCI_EXTENDED_CAP.
> 
> v1->v2 changes:
> * Removed the "priorities" of initializing capabilities since it isn't used anymore.
> * Added new function vpci_capability_mask() and vpci_ext_capability_mask() to remove
>   failed capability from list.
> * Called vpci_make_msix_hole() in the end of init_msix().
> 
> Best regards,
> Jiqian Chen.
> ---
>  xen/arch/arm/xen.lds.S    |  3 +--
>  xen/arch/ppc/xen.lds.S    |  3 +--
>  xen/arch/riscv/xen.lds.S  |  3 +--
>  xen/arch/x86/xen.lds.S    |  2 +-
>  xen/drivers/vpci/header.c | 16 +-------------
>  xen/drivers/vpci/msi.c    |  2 +-
>  xen/drivers/vpci/msix.c   | 11 +++++++---
>  xen/drivers/vpci/rebar.c  |  2 +-
>  xen/drivers/vpci/vpci.c   | 44 ++++++++++++++++++++++++++++++---------
>  xen/include/xen/vpci.h    | 32 ++++++++++++++++++----------
>  xen/include/xen/xen.lds.h |  2 +-
>  11 files changed, 71 insertions(+), 49 deletions(-)
> 
> diff --git a/xen/arch/arm/xen.lds.S b/xen/arch/arm/xen.lds.S
> index 5bfbe1e92c1e..9f30c3a13ed1 100644
> --- a/xen/arch/arm/xen.lds.S
> +++ b/xen/arch/arm/xen.lds.S
> @@ -57,6 +57,7 @@ SECTIONS
>  
>         *(.rodata)
>         *(.rodata.*)
> +       VPCI_ARRAY
>         *(.data.rel.ro)
>         *(.data.rel.ro.*)
>  
> @@ -64,8 +65,6 @@ SECTIONS
>         __proc_info_start = .;
>         *(.proc.info)
>         __proc_info_end = .;
> -
> -       VPCI_ARRAY
>    } :text
>  
>  #if defined(BUILD_ID)
> diff --git a/xen/arch/ppc/xen.lds.S b/xen/arch/ppc/xen.lds.S
> index 1366e2819eed..1de0b77fc6b9 100644
> --- a/xen/arch/ppc/xen.lds.S
> +++ b/xen/arch/ppc/xen.lds.S
> @@ -51,11 +51,10 @@ SECTIONS
>  
>          *(.rodata)
>          *(.rodata.*)
> +        VPCI_ARRAY
>          *(.data.rel.ro)
>          *(.data.rel.ro.*)
>  
> -        VPCI_ARRAY
> -
>          . = ALIGN(POINTER_ALIGN);
>      } :text
>  
> diff --git a/xen/arch/riscv/xen.lds.S b/xen/arch/riscv/xen.lds.S
> index 8c3c06de01f6..edcadff90bfe 100644
> --- a/xen/arch/riscv/xen.lds.S
> +++ b/xen/arch/riscv/xen.lds.S
> @@ -46,11 +46,10 @@ SECTIONS
>  
>          *(.rodata)
>          *(.rodata.*)
> +        VPCI_ARRAY
>          *(.data.rel.ro)
>          *(.data.rel.ro.*)
>  
> -        VPCI_ARRAY
> -
>          . = ALIGN(POINTER_ALIGN);
>      } :text
>  
> diff --git a/xen/arch/x86/xen.lds.S b/xen/arch/x86/xen.lds.S
> index 636c7768aa3c..8e9cac75b09e 100644
> --- a/xen/arch/x86/xen.lds.S
> +++ b/xen/arch/x86/xen.lds.S
> @@ -135,6 +135,7 @@ SECTIONS
>  
>         *(.rodata)
>         *(.rodata.*)
> +       VPCI_ARRAY
>         *(.data.rel.ro)
>         *(.data.rel.ro.*)
>  
> @@ -148,7 +149,6 @@ SECTIONS
>         *(.note.gnu.build-id)
>         __note_gnu_build_id_end = .;
>  #endif
> -       VPCI_ARRAY
>    } PHDR(text)
>  
>  #if defined(CONFIG_PVH_GUEST) && !defined(EFI)
> diff --git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c
> index 8ee8052cd4a3..069253b5f721 100644
> --- a/xen/drivers/vpci/header.c
> +++ b/xen/drivers/vpci/header.c
> @@ -122,19 +122,6 @@ static void modify_decoding(const struct pci_dev *pdev, uint16_t cmd,
>      bool map = cmd & PCI_COMMAND_MEMORY;
>      unsigned int i;
>  
> -    /*
> -     * Make sure there are no mappings in the MSIX MMIO areas, so that accesses
> -     * can be trapped (and emulated) by Xen when the memory decoding bit is
> -     * enabled.
> -     *
> -     * FIXME: punching holes after the p2m has been set up might be racy for
> -     * DomU usage, needs to be revisited.
> -     */
> -#ifdef CONFIG_HAS_PCI_MSI
> -    if ( map && !rom_only && vpci_make_msix_hole(pdev) )
> -        return;
> -#endif

I think you need to keep this.  What about BARs being repositioned by
dom0 over reserved region(s), and thus needing the MSI-X hole to be
craved out there?  It's not a common use-case, but we should support
dom0 moving BARs around.

I think you need both the added chunk in init_msix(), plus the code
above to not regress the current functionality.

>      for ( i = 0; i < ARRAY_SIZE(header->bars); i++ )
>      {
>          struct vpci_bar *bar = &header->bars[i];
> @@ -869,7 +856,7 @@ static int vpci_init_ext_capability_list(const struct pci_dev *pdev)
>      return 0;
>  }
>  
> -static int cf_check init_header(struct pci_dev *pdev)
> +int vpci_init_header(struct pci_dev *pdev)
>  {
>      uint16_t cmd;
>      uint64_t addr, size;
> @@ -1065,7 +1052,6 @@ static int cf_check init_header(struct pci_dev *pdev)
>      pci_conf_write16(pdev->sbdf, PCI_COMMAND, cmd);
>      return rc;
>  }
> -REGISTER_VPCI_INIT(init_header, VPCI_PRIORITY_MIDDLE);
>  
>  /*
>   * Local variables:
> diff --git a/xen/drivers/vpci/msi.c b/xen/drivers/vpci/msi.c
> index 66e5a8a116be..c3eba4e14870 100644
> --- a/xen/drivers/vpci/msi.c
> +++ b/xen/drivers/vpci/msi.c
> @@ -270,7 +270,7 @@ static int cf_check init_msi(struct pci_dev *pdev)
>  
>      return 0;
>  }
> -REGISTER_VPCI_INIT(init_msi, VPCI_PRIORITY_LOW);
> +REGISTER_VPCI_CAP(MSI, init_msi, NULL);
>  
>  void vpci_dump_msi(void)
>  {
> diff --git a/xen/drivers/vpci/msix.c b/xen/drivers/vpci/msix.c
> index 74211301ba10..a1692b9d9f6a 100644
> --- a/xen/drivers/vpci/msix.c
> +++ b/xen/drivers/vpci/msix.c
> @@ -589,7 +589,8 @@ static const struct hvm_mmio_ops vpci_msix_table_ops = {
>      .write = msix_write,
>  };
>  
> -int vpci_make_msix_hole(const struct pci_dev *pdev)
> +/* Make sure there's a hole in the p2m for the MSIX mmio areas. */
> +static int vpci_make_msix_hole(const struct pci_dev *pdev)
>  {
>      struct domain *d = pdev->domain;
>      unsigned int i;
> @@ -703,9 +704,13 @@ static int cf_check init_msix(struct pci_dev *pdev)
>      pdev->vpci->msix = msix;
>      list_add(&msix->next, &d->arch.hvm.msix_tables);
>  
> -    return 0;
> +    spin_lock(&pdev->vpci->lock);
> +    rc = vpci_make_msix_hole(pdev);
> +    spin_unlock(&pdev->vpci->lock);
> +
> +    return rc;
>  }
> -REGISTER_VPCI_INIT(init_msix, VPCI_PRIORITY_HIGH);
> +REGISTER_VPCI_CAP(MSIX, init_msix, NULL);
>  
>  /*
>   * Local variables:
> diff --git a/xen/drivers/vpci/rebar.c b/xen/drivers/vpci/rebar.c
> index 793937449af7..3c18792d9bcd 100644
> --- a/xen/drivers/vpci/rebar.c
> +++ b/xen/drivers/vpci/rebar.c
> @@ -118,7 +118,7 @@ static int cf_check init_rebar(struct pci_dev *pdev)
>  
>      return 0;
>  }
> -REGISTER_VPCI_INIT(init_rebar, VPCI_PRIORITY_LOW);
> +REGISTER_VPCI_EXTCAP(REBAR, init_rebar, NULL);
>  
>  /*
>   * Local variables:
> diff --git a/xen/drivers/vpci/vpci.c b/xen/drivers/vpci/vpci.c
> index 8474c0e3b995..e7e5b64f1be4 100644
> --- a/xen/drivers/vpci/vpci.c
> +++ b/xen/drivers/vpci/vpci.c
> @@ -36,8 +36,8 @@ struct vpci_register {
>  };
>  
>  #ifdef __XEN__
> -extern vpci_register_init_t *const __start_vpci_array[];
> -extern vpci_register_init_t *const __end_vpci_array[];
> +extern const vpci_capability_t __start_vpci_array[];
> +extern const vpci_capability_t __end_vpci_array[];
>  #define NUM_VPCI_INIT (__end_vpci_array - __start_vpci_array)
>  
>  #ifdef CONFIG_HAS_VPCI_GUEST_SUPPORT
> @@ -83,6 +83,32 @@ static int assign_virtual_sbdf(struct pci_dev *pdev)
>  
>  #endif /* CONFIG_HAS_VPCI_GUEST_SUPPORT */
>  
> +static int vpci_init_capabilities(struct pci_dev *pdev)
> +{
> +    for ( unsigned int i = 0; i < NUM_VPCI_INIT; i++ )
> +    {
> +        const vpci_capability_t *capability = &__start_vpci_array[i];
> +        const unsigned int cap = capability->id;
> +        const bool is_ext = capability->is_ext;
> +        unsigned int pos = 0;
> +        int rc;
> +
> +        if ( !is_ext )
> +            pos = pci_find_cap_offset(pdev->sbdf, cap);
> +        else if ( is_hardware_domain(pdev->domain) )
> +            pos = pci_find_ext_capability(pdev->sbdf, cap);
> +
> +        if ( !pos )
> +            continue;
> +
> +        rc = capability->init(pdev);
> +        if ( rc )
> +            return rc;
> +    }
> +
> +    return 0;
> +}
> +
>  void vpci_deassign_device(struct pci_dev *pdev)
>  {
>      unsigned int i;
> @@ -128,7 +154,6 @@ void vpci_deassign_device(struct pci_dev *pdev)
>  
>  int vpci_assign_device(struct pci_dev *pdev)
>  {
> -    unsigned int i;
>      const unsigned long *ro_map;
>      int rc = 0;
>  
> @@ -159,14 +184,13 @@ int vpci_assign_device(struct pci_dev *pdev)
>          goto out;
>  #endif
>  
> -    for ( i = 0; i < NUM_VPCI_INIT; i++ )
> -    {
> -        rc = __start_vpci_array[i](pdev);
> -        if ( rc )
> -            break;
> -    }
> +    rc = vpci_init_header(pdev);
> +    if ( rc )
> +        goto out;
> +
> +    rc = vpci_init_capabilities(pdev);
>  
> - out: __maybe_unused;
> + out:
>      if ( rc )
>          vpci_deassign_device(pdev);
>  
> diff --git a/xen/include/xen/vpci.h b/xen/include/xen/vpci.h
> index 61d16cc8b897..61287e5d2e12 100644
> --- a/xen/include/xen/vpci.h
> +++ b/xen/include/xen/vpci.h
> @@ -13,11 +13,12 @@ typedef uint32_t vpci_read_t(const struct pci_dev *pdev, unsigned int reg,
>  typedef void vpci_write_t(const struct pci_dev *pdev, unsigned int reg,
>                            uint32_t val, void *data);
>  
> -typedef int vpci_register_init_t(struct pci_dev *dev);
> -
> -#define VPCI_PRIORITY_HIGH      "1"
> -#define VPCI_PRIORITY_MIDDLE    "5"
> -#define VPCI_PRIORITY_LOW       "9"
> +typedef struct {
> +    unsigned int id;
> +    bool is_ext;
> +    int (* init)(struct pci_dev *pdev);
> +    int (* cleanup)(const struct pci_dev *pdev);
> +} vpci_capability_t;
>  
>  #define VPCI_ECAM_BDF(addr)     (((addr) & 0x0ffff000) >> 12)
>  
> @@ -29,9 +30,21 @@ typedef int vpci_register_init_t(struct pci_dev *dev);
>   */
>  #define VPCI_MAX_VIRT_DEV       (PCI_SLOT(~0) + 1)
>  
> -#define REGISTER_VPCI_INIT(x, p)                \
> -  static vpci_register_init_t *const x##_entry  \
> -               __used_section(".data.vpci." p) = (x)
> +#define REGISTER_VPCI_CAPABILITY(cap, name, finit, fclean, ext) \
> +    static const vpci_capability_t name##_entry \
> +        __used_section(".data.rel.ro.vpci") = { \
> +        .id = (cap), \
> +        .init = (finit), \
> +        .cleanup = (fclean), \
> +        .is_ext = (ext), \
> +    }
> +
> +#define REGISTER_VPCI_CAP(name, finit, fclean) \
> +    REGISTER_VPCI_CAPABILITY(PCI_CAP_ID_##name, name, finit, fclean, false)
> +#define REGISTER_VPCI_EXTCAP(name, finit, fclean) \
> +    REGISTER_VPCI_CAPABILITY(PCI_EXT_CAP_ID_##name, name, finit, fclean, true)
> +
> +int __must_check vpci_init_header(struct pci_dev *pdev);
>  
>  /* Assign vPCI to device by adding handlers. */
>  int __must_check vpci_assign_device(struct pci_dev *pdev);
> @@ -206,9 +219,6 @@ struct vpci_vcpu {
>  #ifdef __XEN__
>  void vpci_dump_msi(void);
>  
> -/* Make sure there's a hole in the p2m for the MSIX mmio areas. */
> -int vpci_make_msix_hole(const struct pci_dev *pdev);
> -
>  /* Arch-specific vPCI MSI helpers. */
>  void vpci_msi_arch_mask(struct vpci_msi *msi, const struct pci_dev *pdev,
>                          unsigned int entry, bool mask);
> diff --git a/xen/include/xen/xen.lds.h b/xen/include/xen/xen.lds.h
> index 793d0e11450c..eb86305c11c7 100644
> --- a/xen/include/xen/xen.lds.h
> +++ b/xen/include/xen/xen.lds.h
> @@ -188,7 +188,7 @@
>  #define VPCI_ARRAY               \
>         . = ALIGN(POINTER_ALIGN); \
>         __start_vpci_array = .;   \
> -       *(SORT(.data.vpci.*))     \
> +       *(.data.rel.ro.vpci)           \

Indentation of the trailing '\' seems to be off?

Thanks, Roger.

Re: [PATCH v7 2/8] vpci: Refactor REGISTER_VPCI_INIT
Posted by Chen, Jiqian 6 months, 3 weeks ago
On 2025/7/21 22:37, Roger Pau Monné wrote:
> On Fri, Jul 04, 2025 at 03:07:57PM +0800, Jiqian Chen wrote:
>> Refactor REGISTER_VPCI_INIT to contain more capability specific
>> information, this will benefit further follow-on changes to hide
>> capability when initialization fails.
>>
>> What's more, change the definition of init_header() since it is
>> not a capability and it is needed for all devices' PCI config space.
>>
>> After refactor, the "priority" of initializing capabilities isn't
>> needed anymore, so delete its related codes.
>>
>> Note:
>> Call vpci_make_msix_hole() in the end of init_msix() since the change
>> of sequence of init_header() and init_msix(). And delete the call of
>> vpci_make_msix_hole() in modify_decoding() since it is not needed.
>>
>> The cleanup hook is also added in this change, even if it's still
>> unused. Further changes will make use of it.
>>
>> Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
>> ---
>> There is a byte alignment problem in the array __start_vpci_array, which can be solved after
>> "[PATCH] x86: don't have gcc over-align data" is merged.
>> ---
>> cc: "Roger Pau Monné" <roger.pau@citrix.com>
>> cc: Andrew Cooper <andrew.cooper3@citrix.com>
>> cc: Anthony PERARD <anthony.perard@vates.tech>
>> cc: Michal Orzel <michal.orzel@amd.com>
>> cc: Jan Beulich <jbeulich@suse.com>
>> cc: Julien Grall <julien@xen.org>
>> cc: Stefano Stabellini <sstabellini@kernel.org>
>> ---
>> v6->v7 changes:
>> * Change the pointer parameter of cleanup hook of vpci_capability_t to be const.
>>   If change parameter of init hook to be const will affect init_msix, and it assigns pdev
>>   to struct vpci_msix, so keep no const to expanding the impact.
>> * Delete the vpci_make_msix_hole() call in modify_decoding().
>> * Change __start_vpci_array from vpci_capability_t* array to vpci_capability_t array.
>> * Change the name "finit##_t" to be "name##_entry" and add a "name" parameter to macro
>>   REGISTER_VPCI_CAPABILITY.
>>
>> v5->v6 changes:
>> * Rename REGISTER_PCI_CAPABILITY to REGISTER_VPCI_CAPABILITY.
>> * Move vpci_capability_t entry from ".data.vpci" to ".data.rel.ro.vpci" and
>>   move the instances of VPCI_ARRAY in the linker scripts before *(.data.rel.ro).
>> * Change _start/end_vpci_array[] to be const pointer array.
>>
>> v4->v5 changes:
>> * Rename REGISTER_VPCI_CAP to REGISTER_PCI_CAPABILITY, rename REGISTER_VPCI_LEGACY_CAP to
>>   REGISTER_VPCI_CAP, rename REGISTER_VPCI_EXTENDED_CAP to REGISTER_VPCI_EXTCAP.
>> * Change cleanup hook of vpci_capability_t from void to int.
>>
>> v3->v4 changes
>> * Delete the useless trailing dot of section ".data.vpci".
>> * Add description about priority since this patch removes the initializing priority of
>>   capabilities and priority is not needed anymore.
>> * Change the hook name from fini to cleanup.
>> * Change the name x and y to be finit and fclean.
>> * Remove the unnecessary check "!capability->init"
>>
>> v2->v3 changes:
>> * This is separated from patch "vpci: Hide capability when it fails to initialize" of v2.
>> * Delete __maybe_unused attribute of "out" in function vpci_assign_devic().
>> * Rename REGISTER_VPCI_EXTEND_CAP to REGISTER_VPCI_EXTENDED_CAP.
>>
>> v1->v2 changes:
>> * Removed the "priorities" of initializing capabilities since it isn't used anymore.
>> * Added new function vpci_capability_mask() and vpci_ext_capability_mask() to remove
>>   failed capability from list.
>> * Called vpci_make_msix_hole() in the end of init_msix().
>>
>> Best regards,
>> Jiqian Chen.
>> ---
>>  xen/arch/arm/xen.lds.S    |  3 +--
>>  xen/arch/ppc/xen.lds.S    |  3 +--
>>  xen/arch/riscv/xen.lds.S  |  3 +--
>>  xen/arch/x86/xen.lds.S    |  2 +-
>>  xen/drivers/vpci/header.c | 16 +-------------
>>  xen/drivers/vpci/msi.c    |  2 +-
>>  xen/drivers/vpci/msix.c   | 11 +++++++---
>>  xen/drivers/vpci/rebar.c  |  2 +-
>>  xen/drivers/vpci/vpci.c   | 44 ++++++++++++++++++++++++++++++---------
>>  xen/include/xen/vpci.h    | 32 ++++++++++++++++++----------
>>  xen/include/xen/xen.lds.h |  2 +-
>>  11 files changed, 71 insertions(+), 49 deletions(-)
>>
>> diff --git a/xen/arch/arm/xen.lds.S b/xen/arch/arm/xen.lds.S
>> index 5bfbe1e92c1e..9f30c3a13ed1 100644
>> --- a/xen/arch/arm/xen.lds.S
>> +++ b/xen/arch/arm/xen.lds.S
>> @@ -57,6 +57,7 @@ SECTIONS
>>  
>>         *(.rodata)
>>         *(.rodata.*)
>> +       VPCI_ARRAY
>>         *(.data.rel.ro)
>>         *(.data.rel.ro.*)
>>  
>> @@ -64,8 +65,6 @@ SECTIONS
>>         __proc_info_start = .;
>>         *(.proc.info)
>>         __proc_info_end = .;
>> -
>> -       VPCI_ARRAY
>>    } :text
>>  
>>  #if defined(BUILD_ID)
>> diff --git a/xen/arch/ppc/xen.lds.S b/xen/arch/ppc/xen.lds.S
>> index 1366e2819eed..1de0b77fc6b9 100644
>> --- a/xen/arch/ppc/xen.lds.S
>> +++ b/xen/arch/ppc/xen.lds.S
>> @@ -51,11 +51,10 @@ SECTIONS
>>  
>>          *(.rodata)
>>          *(.rodata.*)
>> +        VPCI_ARRAY
>>          *(.data.rel.ro)
>>          *(.data.rel.ro.*)
>>  
>> -        VPCI_ARRAY
>> -
>>          . = ALIGN(POINTER_ALIGN);
>>      } :text
>>  
>> diff --git a/xen/arch/riscv/xen.lds.S b/xen/arch/riscv/xen.lds.S
>> index 8c3c06de01f6..edcadff90bfe 100644
>> --- a/xen/arch/riscv/xen.lds.S
>> +++ b/xen/arch/riscv/xen.lds.S
>> @@ -46,11 +46,10 @@ SECTIONS
>>  
>>          *(.rodata)
>>          *(.rodata.*)
>> +        VPCI_ARRAY
>>          *(.data.rel.ro)
>>          *(.data.rel.ro.*)
>>  
>> -        VPCI_ARRAY
>> -
>>          . = ALIGN(POINTER_ALIGN);
>>      } :text
>>  
>> diff --git a/xen/arch/x86/xen.lds.S b/xen/arch/x86/xen.lds.S
>> index 636c7768aa3c..8e9cac75b09e 100644
>> --- a/xen/arch/x86/xen.lds.S
>> +++ b/xen/arch/x86/xen.lds.S
>> @@ -135,6 +135,7 @@ SECTIONS
>>  
>>         *(.rodata)
>>         *(.rodata.*)
>> +       VPCI_ARRAY
>>         *(.data.rel.ro)
>>         *(.data.rel.ro.*)
>>  
>> @@ -148,7 +149,6 @@ SECTIONS
>>         *(.note.gnu.build-id)
>>         __note_gnu_build_id_end = .;
>>  #endif
>> -       VPCI_ARRAY
>>    } PHDR(text)
>>  
>>  #if defined(CONFIG_PVH_GUEST) && !defined(EFI)
>> diff --git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c
>> index 8ee8052cd4a3..069253b5f721 100644
>> --- a/xen/drivers/vpci/header.c
>> +++ b/xen/drivers/vpci/header.c
>> @@ -122,19 +122,6 @@ static void modify_decoding(const struct pci_dev *pdev, uint16_t cmd,
>>      bool map = cmd & PCI_COMMAND_MEMORY;
>>      unsigned int i;
>>  
>> -    /*
>> -     * Make sure there are no mappings in the MSIX MMIO areas, so that accesses
>> -     * can be trapped (and emulated) by Xen when the memory decoding bit is
>> -     * enabled.
>> -     *
>> -     * FIXME: punching holes after the p2m has been set up might be racy for
>> -     * DomU usage, needs to be revisited.
>> -     */
>> -#ifdef CONFIG_HAS_PCI_MSI
>> -    if ( map && !rom_only && vpci_make_msix_hole(pdev) )
>> -        return;
>> -#endif
> 
> I think you need to keep this.  What about BARs being repositioned by
> dom0 over reserved region(s), and thus needing the MSI-X hole to be
> craved out there?  It's not a common use-case, but we should support
> dom0 moving BARs around.
> 
> I think you need both the added chunk in init_msix(), plus the code
> above to not regress the current functionality.
OK, will do.
As Jan required me to add some comment to describe the chunk in init_msix() if not to delete here.
Do you think below is appropriate?

    /*
     * To make sure there's a hole for the MSIX table/PBA in the p2m since
     * init_msix is called after init_header. Here and the calling in another
     * place are not redundant, another is to support dom0 moving BARs.
     */
    spin_lock(&pdev->vpci->lock);
    rc = vpci_make_msix_hole(pdev);
    spin_unlock(&pdev->vpci->lock);

    return rc;
}
REGISTER_VPCI_CAP(MSIX, init_msix, cleanup_msix);

> 
>>      for ( i = 0; i < ARRAY_SIZE(header->bars); i++ )
>>      {
>>          struct vpci_bar *bar = &header->bars[i];
>> @@ -869,7 +856,7 @@ static int vpci_init_ext_capability_list(const struct pci_dev *pdev)
>>      return 0;
>>  }
>>  
>> -static int cf_check init_header(struct pci_dev *pdev)
>> +int vpci_init_header(struct pci_dev *pdev)
>>  {
>>      uint16_t cmd;
>>      uint64_t addr, size;
>> @@ -1065,7 +1052,6 @@ static int cf_check init_header(struct pci_dev *pdev)
>>      pci_conf_write16(pdev->sbdf, PCI_COMMAND, cmd);
>>      return rc;
>>  }
>> -REGISTER_VPCI_INIT(init_header, VPCI_PRIORITY_MIDDLE);
>>  
>>  /*
>>   * Local variables:
>> diff --git a/xen/drivers/vpci/msi.c b/xen/drivers/vpci/msi.c
>> index 66e5a8a116be..c3eba4e14870 100644
>> --- a/xen/drivers/vpci/msi.c
>> +++ b/xen/drivers/vpci/msi.c
>> @@ -270,7 +270,7 @@ static int cf_check init_msi(struct pci_dev *pdev)
>>  
>>      return 0;
>>  }
>> -REGISTER_VPCI_INIT(init_msi, VPCI_PRIORITY_LOW);
>> +REGISTER_VPCI_CAP(MSI, init_msi, NULL);
>>  
>>  void vpci_dump_msi(void)
>>  {
>> diff --git a/xen/drivers/vpci/msix.c b/xen/drivers/vpci/msix.c
>> index 74211301ba10..a1692b9d9f6a 100644
>> --- a/xen/drivers/vpci/msix.c
>> +++ b/xen/drivers/vpci/msix.c
>> @@ -589,7 +589,8 @@ static const struct hvm_mmio_ops vpci_msix_table_ops = {
>>      .write = msix_write,
>>  };
>>  
>> -int vpci_make_msix_hole(const struct pci_dev *pdev)
>> +/* Make sure there's a hole in the p2m for the MSIX mmio areas. */
>> +static int vpci_make_msix_hole(const struct pci_dev *pdev)
>>  {
>>      struct domain *d = pdev->domain;
>>      unsigned int i;
>> @@ -703,9 +704,13 @@ static int cf_check init_msix(struct pci_dev *pdev)
>>      pdev->vpci->msix = msix;
>>      list_add(&msix->next, &d->arch.hvm.msix_tables);
>>  
>> -    return 0;
>> +    spin_lock(&pdev->vpci->lock);
>> +    rc = vpci_make_msix_hole(pdev);
>> +    spin_unlock(&pdev->vpci->lock);
>> +
>> +    return rc;
>>  }
>> -REGISTER_VPCI_INIT(init_msix, VPCI_PRIORITY_HIGH);
>> +REGISTER_VPCI_CAP(MSIX, init_msix, NULL);
>>  
>>  /*
>>   * Local variables:
>> diff --git a/xen/drivers/vpci/rebar.c b/xen/drivers/vpci/rebar.c
>> index 793937449af7..3c18792d9bcd 100644
>> --- a/xen/drivers/vpci/rebar.c
>> +++ b/xen/drivers/vpci/rebar.c
>> @@ -118,7 +118,7 @@ static int cf_check init_rebar(struct pci_dev *pdev)
>>  
>>      return 0;
>>  }
>> -REGISTER_VPCI_INIT(init_rebar, VPCI_PRIORITY_LOW);
>> +REGISTER_VPCI_EXTCAP(REBAR, init_rebar, NULL);
>>  
>>  /*
>>   * Local variables:
>> diff --git a/xen/drivers/vpci/vpci.c b/xen/drivers/vpci/vpci.c
>> index 8474c0e3b995..e7e5b64f1be4 100644
>> --- a/xen/drivers/vpci/vpci.c
>> +++ b/xen/drivers/vpci/vpci.c
>> @@ -36,8 +36,8 @@ struct vpci_register {
>>  };
>>  
>>  #ifdef __XEN__
>> -extern vpci_register_init_t *const __start_vpci_array[];
>> -extern vpci_register_init_t *const __end_vpci_array[];
>> +extern const vpci_capability_t __start_vpci_array[];
>> +extern const vpci_capability_t __end_vpci_array[];
>>  #define NUM_VPCI_INIT (__end_vpci_array - __start_vpci_array)
>>  
>>  #ifdef CONFIG_HAS_VPCI_GUEST_SUPPORT
>> @@ -83,6 +83,32 @@ static int assign_virtual_sbdf(struct pci_dev *pdev)
>>  
>>  #endif /* CONFIG_HAS_VPCI_GUEST_SUPPORT */
>>  
>> +static int vpci_init_capabilities(struct pci_dev *pdev)
>> +{
>> +    for ( unsigned int i = 0; i < NUM_VPCI_INIT; i++ )
>> +    {
>> +        const vpci_capability_t *capability = &__start_vpci_array[i];
>> +        const unsigned int cap = capability->id;
>> +        const bool is_ext = capability->is_ext;
>> +        unsigned int pos = 0;
>> +        int rc;
>> +
>> +        if ( !is_ext )
>> +            pos = pci_find_cap_offset(pdev->sbdf, cap);
>> +        else if ( is_hardware_domain(pdev->domain) )
>> +            pos = pci_find_ext_capability(pdev->sbdf, cap);
>> +
>> +        if ( !pos )
>> +            continue;
>> +
>> +        rc = capability->init(pdev);
>> +        if ( rc )
>> +            return rc;
>> +    }
>> +
>> +    return 0;
>> +}
>> +
>>  void vpci_deassign_device(struct pci_dev *pdev)
>>  {
>>      unsigned int i;
>> @@ -128,7 +154,6 @@ void vpci_deassign_device(struct pci_dev *pdev)
>>  
>>  int vpci_assign_device(struct pci_dev *pdev)
>>  {
>> -    unsigned int i;
>>      const unsigned long *ro_map;
>>      int rc = 0;
>>  
>> @@ -159,14 +184,13 @@ int vpci_assign_device(struct pci_dev *pdev)
>>          goto out;
>>  #endif
>>  
>> -    for ( i = 0; i < NUM_VPCI_INIT; i++ )
>> -    {
>> -        rc = __start_vpci_array[i](pdev);
>> -        if ( rc )
>> -            break;
>> -    }
>> +    rc = vpci_init_header(pdev);
>> +    if ( rc )
>> +        goto out;
>> +
>> +    rc = vpci_init_capabilities(pdev);
>>  
>> - out: __maybe_unused;
>> + out:
>>      if ( rc )
>>          vpci_deassign_device(pdev);
>>  
>> diff --git a/xen/include/xen/vpci.h b/xen/include/xen/vpci.h
>> index 61d16cc8b897..61287e5d2e12 100644
>> --- a/xen/include/xen/vpci.h
>> +++ b/xen/include/xen/vpci.h
>> @@ -13,11 +13,12 @@ typedef uint32_t vpci_read_t(const struct pci_dev *pdev, unsigned int reg,
>>  typedef void vpci_write_t(const struct pci_dev *pdev, unsigned int reg,
>>                            uint32_t val, void *data);
>>  
>> -typedef int vpci_register_init_t(struct pci_dev *dev);
>> -
>> -#define VPCI_PRIORITY_HIGH      "1"
>> -#define VPCI_PRIORITY_MIDDLE    "5"
>> -#define VPCI_PRIORITY_LOW       "9"
>> +typedef struct {
>> +    unsigned int id;
>> +    bool is_ext;
>> +    int (* init)(struct pci_dev *pdev);
>> +    int (* cleanup)(const struct pci_dev *pdev);
>> +} vpci_capability_t;
>>  
>>  #define VPCI_ECAM_BDF(addr)     (((addr) & 0x0ffff000) >> 12)
>>  
>> @@ -29,9 +30,21 @@ typedef int vpci_register_init_t(struct pci_dev *dev);
>>   */
>>  #define VPCI_MAX_VIRT_DEV       (PCI_SLOT(~0) + 1)
>>  
>> -#define REGISTER_VPCI_INIT(x, p)                \
>> -  static vpci_register_init_t *const x##_entry  \
>> -               __used_section(".data.vpci." p) = (x)
>> +#define REGISTER_VPCI_CAPABILITY(cap, name, finit, fclean, ext) \
>> +    static const vpci_capability_t name##_entry \
>> +        __used_section(".data.rel.ro.vpci") = { \
>> +        .id = (cap), \
>> +        .init = (finit), \
>> +        .cleanup = (fclean), \
>> +        .is_ext = (ext), \
>> +    }
>> +
>> +#define REGISTER_VPCI_CAP(name, finit, fclean) \
>> +    REGISTER_VPCI_CAPABILITY(PCI_CAP_ID_##name, name, finit, fclean, false)
>> +#define REGISTER_VPCI_EXTCAP(name, finit, fclean) \
>> +    REGISTER_VPCI_CAPABILITY(PCI_EXT_CAP_ID_##name, name, finit, fclean, true)
>> +
>> +int __must_check vpci_init_header(struct pci_dev *pdev);
>>  
>>  /* Assign vPCI to device by adding handlers. */
>>  int __must_check vpci_assign_device(struct pci_dev *pdev);
>> @@ -206,9 +219,6 @@ struct vpci_vcpu {
>>  #ifdef __XEN__
>>  void vpci_dump_msi(void);
>>  
>> -/* Make sure there's a hole in the p2m for the MSIX mmio areas. */
>> -int vpci_make_msix_hole(const struct pci_dev *pdev);
>> -
>>  /* Arch-specific vPCI MSI helpers. */
>>  void vpci_msi_arch_mask(struct vpci_msi *msi, const struct pci_dev *pdev,
>>                          unsigned int entry, bool mask);
>> diff --git a/xen/include/xen/xen.lds.h b/xen/include/xen/xen.lds.h
>> index 793d0e11450c..eb86305c11c7 100644
>> --- a/xen/include/xen/xen.lds.h
>> +++ b/xen/include/xen/xen.lds.h
>> @@ -188,7 +188,7 @@
>>  #define VPCI_ARRAY               \
>>         . = ALIGN(POINTER_ALIGN); \
>>         __start_vpci_array = .;   \
>> -       *(SORT(.data.vpci.*))     \
>> +       *(.data.rel.ro.vpci)           \
> 
> Indentation of the trailing '\' seems to be off?
Will change.
> 
> Thanks, Roger.

-- 
Best regards,
Jiqian Chen.
Re: [PATCH v7 2/8] vpci: Refactor REGISTER_VPCI_INIT
Posted by Roger Pau Monné 6 months, 3 weeks ago
On Wed, Jul 23, 2025 at 07:20:27AM +0000, Chen, Jiqian wrote:
> On 2025/7/21 22:37, Roger Pau Monné wrote:
> > On Fri, Jul 04, 2025 at 03:07:57PM +0800, Jiqian Chen wrote:
> >> Refactor REGISTER_VPCI_INIT to contain more capability specific
> >> information, this will benefit further follow-on changes to hide
> >> capability when initialization fails.
> >>
> >> What's more, change the definition of init_header() since it is
> >> not a capability and it is needed for all devices' PCI config space.
> >>
> >> After refactor, the "priority" of initializing capabilities isn't
> >> needed anymore, so delete its related codes.
> >>
> >> Note:
> >> Call vpci_make_msix_hole() in the end of init_msix() since the change
> >> of sequence of init_header() and init_msix(). And delete the call of
> >> vpci_make_msix_hole() in modify_decoding() since it is not needed.
> >>
> >> The cleanup hook is also added in this change, even if it's still
> >> unused. Further changes will make use of it.
> >>
> >> Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
> >> ---
> >> There is a byte alignment problem in the array __start_vpci_array, which can be solved after
> >> "[PATCH] x86: don't have gcc over-align data" is merged.
> >> ---
> >> cc: "Roger Pau Monné" <roger.pau@citrix.com>
> >> cc: Andrew Cooper <andrew.cooper3@citrix.com>
> >> cc: Anthony PERARD <anthony.perard@vates.tech>
> >> cc: Michal Orzel <michal.orzel@amd.com>
> >> cc: Jan Beulich <jbeulich@suse.com>
> >> cc: Julien Grall <julien@xen.org>
> >> cc: Stefano Stabellini <sstabellini@kernel.org>
> >> ---
> >> v6->v7 changes:
> >> * Change the pointer parameter of cleanup hook of vpci_capability_t to be const.
> >>   If change parameter of init hook to be const will affect init_msix, and it assigns pdev
> >>   to struct vpci_msix, so keep no const to expanding the impact.
> >> * Delete the vpci_make_msix_hole() call in modify_decoding().
> >> * Change __start_vpci_array from vpci_capability_t* array to vpci_capability_t array.
> >> * Change the name "finit##_t" to be "name##_entry" and add a "name" parameter to macro
> >>   REGISTER_VPCI_CAPABILITY.
> >>
> >> v5->v6 changes:
> >> * Rename REGISTER_PCI_CAPABILITY to REGISTER_VPCI_CAPABILITY.
> >> * Move vpci_capability_t entry from ".data.vpci" to ".data.rel.ro.vpci" and
> >>   move the instances of VPCI_ARRAY in the linker scripts before *(.data.rel.ro).
> >> * Change _start/end_vpci_array[] to be const pointer array.
> >>
> >> v4->v5 changes:
> >> * Rename REGISTER_VPCI_CAP to REGISTER_PCI_CAPABILITY, rename REGISTER_VPCI_LEGACY_CAP to
> >>   REGISTER_VPCI_CAP, rename REGISTER_VPCI_EXTENDED_CAP to REGISTER_VPCI_EXTCAP.
> >> * Change cleanup hook of vpci_capability_t from void to int.
> >>
> >> v3->v4 changes
> >> * Delete the useless trailing dot of section ".data.vpci".
> >> * Add description about priority since this patch removes the initializing priority of
> >>   capabilities and priority is not needed anymore.
> >> * Change the hook name from fini to cleanup.
> >> * Change the name x and y to be finit and fclean.
> >> * Remove the unnecessary check "!capability->init"
> >>
> >> v2->v3 changes:
> >> * This is separated from patch "vpci: Hide capability when it fails to initialize" of v2.
> >> * Delete __maybe_unused attribute of "out" in function vpci_assign_devic().
> >> * Rename REGISTER_VPCI_EXTEND_CAP to REGISTER_VPCI_EXTENDED_CAP.
> >>
> >> v1->v2 changes:
> >> * Removed the "priorities" of initializing capabilities since it isn't used anymore.
> >> * Added new function vpci_capability_mask() and vpci_ext_capability_mask() to remove
> >>   failed capability from list.
> >> * Called vpci_make_msix_hole() in the end of init_msix().
> >>
> >> Best regards,
> >> Jiqian Chen.
> >> ---
> >>  xen/arch/arm/xen.lds.S    |  3 +--
> >>  xen/arch/ppc/xen.lds.S    |  3 +--
> >>  xen/arch/riscv/xen.lds.S  |  3 +--
> >>  xen/arch/x86/xen.lds.S    |  2 +-
> >>  xen/drivers/vpci/header.c | 16 +-------------
> >>  xen/drivers/vpci/msi.c    |  2 +-
> >>  xen/drivers/vpci/msix.c   | 11 +++++++---
> >>  xen/drivers/vpci/rebar.c  |  2 +-
> >>  xen/drivers/vpci/vpci.c   | 44 ++++++++++++++++++++++++++++++---------
> >>  xen/include/xen/vpci.h    | 32 ++++++++++++++++++----------
> >>  xen/include/xen/xen.lds.h |  2 +-
> >>  11 files changed, 71 insertions(+), 49 deletions(-)
> >>
> >> diff --git a/xen/arch/arm/xen.lds.S b/xen/arch/arm/xen.lds.S
> >> index 5bfbe1e92c1e..9f30c3a13ed1 100644
> >> --- a/xen/arch/arm/xen.lds.S
> >> +++ b/xen/arch/arm/xen.lds.S
> >> @@ -57,6 +57,7 @@ SECTIONS
> >>  
> >>         *(.rodata)
> >>         *(.rodata.*)
> >> +       VPCI_ARRAY
> >>         *(.data.rel.ro)
> >>         *(.data.rel.ro.*)
> >>  
> >> @@ -64,8 +65,6 @@ SECTIONS
> >>         __proc_info_start = .;
> >>         *(.proc.info)
> >>         __proc_info_end = .;
> >> -
> >> -       VPCI_ARRAY
> >>    } :text
> >>  
> >>  #if defined(BUILD_ID)
> >> diff --git a/xen/arch/ppc/xen.lds.S b/xen/arch/ppc/xen.lds.S
> >> index 1366e2819eed..1de0b77fc6b9 100644
> >> --- a/xen/arch/ppc/xen.lds.S
> >> +++ b/xen/arch/ppc/xen.lds.S
> >> @@ -51,11 +51,10 @@ SECTIONS
> >>  
> >>          *(.rodata)
> >>          *(.rodata.*)
> >> +        VPCI_ARRAY
> >>          *(.data.rel.ro)
> >>          *(.data.rel.ro.*)
> >>  
> >> -        VPCI_ARRAY
> >> -
> >>          . = ALIGN(POINTER_ALIGN);
> >>      } :text
> >>  
> >> diff --git a/xen/arch/riscv/xen.lds.S b/xen/arch/riscv/xen.lds.S
> >> index 8c3c06de01f6..edcadff90bfe 100644
> >> --- a/xen/arch/riscv/xen.lds.S
> >> +++ b/xen/arch/riscv/xen.lds.S
> >> @@ -46,11 +46,10 @@ SECTIONS
> >>  
> >>          *(.rodata)
> >>          *(.rodata.*)
> >> +        VPCI_ARRAY
> >>          *(.data.rel.ro)
> >>          *(.data.rel.ro.*)
> >>  
> >> -        VPCI_ARRAY
> >> -
> >>          . = ALIGN(POINTER_ALIGN);
> >>      } :text
> >>  
> >> diff --git a/xen/arch/x86/xen.lds.S b/xen/arch/x86/xen.lds.S
> >> index 636c7768aa3c..8e9cac75b09e 100644
> >> --- a/xen/arch/x86/xen.lds.S
> >> +++ b/xen/arch/x86/xen.lds.S
> >> @@ -135,6 +135,7 @@ SECTIONS
> >>  
> >>         *(.rodata)
> >>         *(.rodata.*)
> >> +       VPCI_ARRAY
> >>         *(.data.rel.ro)
> >>         *(.data.rel.ro.*)
> >>  
> >> @@ -148,7 +149,6 @@ SECTIONS
> >>         *(.note.gnu.build-id)
> >>         __note_gnu_build_id_end = .;
> >>  #endif
> >> -       VPCI_ARRAY
> >>    } PHDR(text)
> >>  
> >>  #if defined(CONFIG_PVH_GUEST) && !defined(EFI)
> >> diff --git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c
> >> index 8ee8052cd4a3..069253b5f721 100644
> >> --- a/xen/drivers/vpci/header.c
> >> +++ b/xen/drivers/vpci/header.c
> >> @@ -122,19 +122,6 @@ static void modify_decoding(const struct pci_dev *pdev, uint16_t cmd,
> >>      bool map = cmd & PCI_COMMAND_MEMORY;
> >>      unsigned int i;
> >>  
> >> -    /*
> >> -     * Make sure there are no mappings in the MSIX MMIO areas, so that accesses
> >> -     * can be trapped (and emulated) by Xen when the memory decoding bit is
> >> -     * enabled.
> >> -     *
> >> -     * FIXME: punching holes after the p2m has been set up might be racy for
> >> -     * DomU usage, needs to be revisited.
> >> -     */
> >> -#ifdef CONFIG_HAS_PCI_MSI
> >> -    if ( map && !rom_only && vpci_make_msix_hole(pdev) )
> >> -        return;
> >> -#endif
> > 
> > I think you need to keep this.  What about BARs being repositioned by
> > dom0 over reserved region(s), and thus needing the MSI-X hole to be
> > craved out there?  It's not a common use-case, but we should support
> > dom0 moving BARs around.
> > 
> > I think you need both the added chunk in init_msix(), plus the code
> > above to not regress the current functionality.
> OK, will do.
> As Jan required me to add some comment to describe the chunk in init_msix() if not to delete here.
> Do you think below is appropriate?
> 
>     /*
>      * To make sure there's a hole for the MSIX table/PBA in the p2m since
>      * init_msix is called after init_header. Here and the calling in another
>      * place are not redundant, another is to support dom0 moving BARs.
>      */
>     spin_lock(&pdev->vpci->lock);
>     rc = vpci_make_msix_hole(pdev);
>     spin_unlock(&pdev->vpci->lock);

I would use:

/*
 * vPCI header initialization will have mapped the whole BAR into the
 * p2m, as MSI-X capability was not yet initialized.  Crave a hole for
 * the MSI-X table here, so that Xen can trap accesses.
 */

I think referencing "another is to support dom0..." is not helpful,
and likely to get out of sync if we ever change that code.  If
anything, the comment in modify_decoding() needs updating, but not via
a cross reference from a different context.

Thanks, Roger.

Re: [PATCH v7 2/8] vpci: Refactor REGISTER_VPCI_INIT
Posted by Jan Beulich 7 months ago
On 04.07.2025 09:07, Jiqian Chen wrote:
> Refactor REGISTER_VPCI_INIT to contain more capability specific
> information, this will benefit further follow-on changes to hide
> capability when initialization fails.
> 
> What's more, change the definition of init_header() since it is
> not a capability and it is needed for all devices' PCI config space.
> 
> After refactor, the "priority" of initializing capabilities isn't
> needed anymore, so delete its related codes.
> 
> Note:
> Call vpci_make_msix_hole() in the end of init_msix() since the change
> of sequence of init_header() and init_msix(). And delete the call of
> vpci_make_msix_hole() in modify_decoding() since it is not needed.
> 
> The cleanup hook is also added in this change, even if it's still
> unused. Further changes will make use of it.
> 
> Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
> ---
> There is a byte alignment problem in the array __start_vpci_array, which can be solved after
> "[PATCH] x86: don't have gcc over-align data" is merged.

I was meaning to suggest a way to actually detect the issue at build time,
but sadly what's wanted to do so will first need adding to gas. Which now
will likely be only after 2.45 went out.

Jan