[PATCH v4] vpci: Add resizable bar support

Jiqian Chen posted 1 patch 1 month, 2 weeks ago
git fetch https://gitlab.com/xen-project/patchew/xen tags/patchew/20241219052143.3161332-1-Jiqian.Chen@amd.com
There is a newer version of this series
xen/drivers/vpci/Makefile  |   2 +-
xen/drivers/vpci/rebar.c   | 131 +++++++++++++++++++++++++++++++++++++
xen/drivers/vpci/vpci.c    |   6 ++
xen/include/xen/pci_regs.h |  14 ++++
xen/include/xen/vpci.h     |   3 +
5 files changed, 155 insertions(+), 1 deletion(-)
create mode 100644 xen/drivers/vpci/rebar.c
[PATCH v4] vpci: Add resizable bar support
Posted by Jiqian Chen 1 month, 2 weeks ago
Some devices, such as AMD discrete GPUs, support the Resizable
BAR capability, but Xen's vPCI does not handle it, so the
devices fail to resize their BARs and probing then fails.

According to the PCIe spec, each BAR that supports resizing has
two registers, PCI_REBAR_CAP and PCI_REBAR_CTRL. Add handlers
for them so that BARs can be resized.

Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
---
Hi all,
v3->v4 changes:
* Removed PCI_REBAR_CAP_SIZES since it was not needed, and added
  PCI_REBAR_CAP_SHIFT and PCI_REBAR_CTRL_SIZES.
* Added field resizable_sizes to struct vpci_bar to cache the supported resizable sizes, and
  added the logic to fill it in init_rebar().
* Changed PCI_REBAR_CAP to PCI_REBAR_CAP(n) (4+8*(n)), changed PCI_REBAR_CTRL to
  PCI_REBAR_CTRL(n) (8+8*(n)).
* Added the domain of the pci_dev to the log messages in init_rebar().

Best regards,
Jiqian Chen.

v2->v3 changes:
* Used "bar->enabled" to replace "pci_conf_read16(pdev->sbdf, PCI_COMMAND) & PCI_COMMAND_MEMORY",
  and added comments why it needs this check.
* Added "!is_hardware_domain(pdev->domain)" check in init_rebar() to return EOPNOTSUPP for domUs.
* Moved BAR type and index check into init_rebar(), then only need to check once.
* Added 'U' suffix for macro PCI_REBAR_CAP_SIZES.
* Added macro PCI_REBAR_SIZE_BIAS to represent 20.
TODO: need to hide ReBar capability from hardware domain when init_rebar() fails.

v1->v2 changes:
* Added a check in rebar_ctrl_write() for whether memory decoding is
  enabled, and some checks for the type of the BAR.
* Added vpci_hw_write32 to handle writes to PCI_REBAR_CAP, since there is
  no write restriction for dom0.
* Made many other minor modifications as well.
---
 xen/drivers/vpci/Makefile  |   2 +-
 xen/drivers/vpci/rebar.c   | 131 +++++++++++++++++++++++++++++++++++++
 xen/drivers/vpci/vpci.c    |   6 ++
 xen/include/xen/pci_regs.h |  14 ++++
 xen/include/xen/vpci.h     |   3 +
 5 files changed, 155 insertions(+), 1 deletion(-)
 create mode 100644 xen/drivers/vpci/rebar.c

diff --git a/xen/drivers/vpci/Makefile b/xen/drivers/vpci/Makefile
index 1a1413b93e76..a7c8a30a8956 100644
--- a/xen/drivers/vpci/Makefile
+++ b/xen/drivers/vpci/Makefile
@@ -1,2 +1,2 @@
-obj-y += vpci.o header.o
+obj-y += vpci.o header.o rebar.o
 obj-$(CONFIG_HAS_PCI_MSI) += msi.o msix.o
diff --git a/xen/drivers/vpci/rebar.c b/xen/drivers/vpci/rebar.c
new file mode 100644
index 000000000000..bfc0e6eb0668
--- /dev/null
+++ b/xen/drivers/vpci/rebar.c
@@ -0,0 +1,131 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (C) 2024 Advanced Micro Devices, Inc. All Rights Reserved.
+ *
+ * Author: Jiqian Chen <Jiqian.Chen@amd.com>
+ */
+
+#include <xen/sched.h>
+#include <xen/vpci.h>
+
+static void cf_check rebar_ctrl_write(const struct pci_dev *pdev,
+                                      unsigned int reg,
+                                      uint32_t val,
+                                      void *data)
+{
+    struct vpci_bar *bar = data;
+    uint64_t size = PCI_REBAR_CTRL_SIZE(val);
+
+    if ( bar->enabled )
+    {
+        /*
+         * Refuse to resize a BAR while memory decoding is enabled, as
+         * otherwise the size of the mapped region in the p2m would become
+         * stale with the newly set BAR size, and the position of the BAR
+         * would be reset to undefined.  Note the PCIe specification also
+         * forbids resizing a BAR with memory decoding enabled.
+         */
+        if ( size != bar->size )
+            gprintk(XENLOG_ERR,
+                    "%pp: refuse to resize BAR with memory decoding enabled\n",
+                    &pdev->sbdf);
+        return;
+    }
+
+    if ( !((size >> PCI_REBAR_SIZE_BIAS) & bar->resizable_sizes) )
+        gprintk(XENLOG_WARNING,
+                "%pp: new size %#lx is not supported by hardware\n",
+                &pdev->sbdf, size);
+
+    bar->size = size;
+    bar->addr = 0;
+    bar->guest_addr = 0;
+    pci_conf_write32(pdev->sbdf, reg, val);
+}
+
+static int cf_check init_rebar(struct pci_dev *pdev)
+{
+    uint32_t ctrl;
+    unsigned int nbars;
+    unsigned int rebar_offset = pci_find_ext_capability(pdev->sbdf,
+                                                        PCI_EXT_CAP_ID_REBAR);
+
+    if ( !rebar_offset )
+        return 0;
+
+    if ( !is_hardware_domain(pdev->domain) )
+    {
+        printk(XENLOG_ERR "%pp: resizable BARs unsupported for unpriv %pd\n",
+               &pdev->sbdf, pdev->domain);
+        return -EOPNOTSUPP;
+    }
+
+    ctrl = pci_conf_read32(pdev->sbdf, rebar_offset + PCI_REBAR_CTRL(0));
+    nbars = MASK_EXTR(ctrl, PCI_REBAR_CTRL_NBAR_MASK);
+
+    for ( unsigned int i = 0; i < nbars; i++ )
+    {
+        int rc;
+        struct vpci_bar *bar;
+        unsigned int index;
+
+        ctrl = pci_conf_read32(pdev->sbdf, rebar_offset + PCI_REBAR_CTRL(i));
+        index = ctrl & PCI_REBAR_CTRL_BAR_IDX;;
+        if ( index >= PCI_HEADER_NORMAL_NR_BARS )
+        {
+            /*
+             * TODO: for failed pathes, need to hide ReBar capability
+             * from hardware domain instead of returning an error.
+             */
+            printk(XENLOG_ERR "%pd %pp: too big BAR number %u in REBAR_CTRL\n",
+                   pdev->domain, &pdev->sbdf, index);
+            return -EINVAL;
+        }
+
+        bar = &pdev->vpci->header.bars[index];
+        if ( bar->type != VPCI_BAR_MEM64_LO && bar->type != VPCI_BAR_MEM32 )
+        {
+            printk(XENLOG_ERR "%pd %pp: BAR%u is not in memory space\n",
+                   pdev->domain, &pdev->sbdf, index);
+            return -EINVAL;
+        }
+
+        rc = vpci_add_register(pdev->vpci, vpci_hw_read32, vpci_hw_write32,
+                               rebar_offset + PCI_REBAR_CAP(i), 4, NULL);
+        if ( rc )
+        {
+            printk(XENLOG_ERR "%pd %pp: fail to add reg of REBAR_CAP rc=%d\n",
+                   pdev->domain, &pdev->sbdf, rc);
+            return rc;
+        }
+
+        rc = vpci_add_register(pdev->vpci, vpci_hw_read32, rebar_ctrl_write,
+                               rebar_offset + PCI_REBAR_CTRL(i), 4, bar);
+        if ( rc )
+        {
+            printk(XENLOG_ERR "%pd %pp: fail to add reg of REBAR_CTRL rc=%d\n",
+                   pdev->domain, &pdev->sbdf, rc);
+            return rc;
+        }
+
+        bar->resizable_sizes |=
+            (pci_conf_read32(pdev->sbdf, rebar_offset + PCI_REBAR_CAP(i)) >>
+             PCI_REBAR_CAP_SHIFT);
+        bar->resizable_sizes |=
+            ((uint64_t)MASK_EXTR(ctrl, PCI_REBAR_CTRL_SIZES) <<
+             (32 - PCI_REBAR_CAP_SHIFT));
+    }
+
+    return 0;
+}
+REGISTER_VPCI_INIT(init_rebar, VPCI_PRIORITY_LOW);
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/drivers/vpci/vpci.c b/xen/drivers/vpci/vpci.c
index 1e6aa5d799b9..3349b98389b8 100644
--- a/xen/drivers/vpci/vpci.c
+++ b/xen/drivers/vpci/vpci.c
@@ -232,6 +232,12 @@ void cf_check vpci_hw_write16(
     pci_conf_write16(pdev->sbdf, reg, val);
 }
 
+void cf_check vpci_hw_write32(
+    const struct pci_dev *pdev, unsigned int reg, uint32_t val, void *data)
+{
+    pci_conf_write32(pdev->sbdf, reg, val);
+}
+
 int vpci_add_register_mask(struct vpci *vpci, vpci_read_t *read_handler,
                            vpci_write_t *write_handler, unsigned int offset,
                            unsigned int size, void *data, uint32_t ro_mask,
diff --git a/xen/include/xen/pci_regs.h b/xen/include/xen/pci_regs.h
index 250ba106dbd3..51fdab69fa74 100644
--- a/xen/include/xen/pci_regs.h
+++ b/xen/include/xen/pci_regs.h
@@ -459,6 +459,7 @@
 #define PCI_EXT_CAP_ID_ARI	14
 #define PCI_EXT_CAP_ID_ATS	15
 #define PCI_EXT_CAP_ID_SRIOV	16
+#define PCI_EXT_CAP_ID_REBAR	21	/* Resizable BAR */
 
 /* Advanced Error Reporting */
 #define PCI_ERR_UNCOR_STATUS	4	/* Uncorrectable Error Status */
@@ -541,6 +542,19 @@
 #define  PCI_VNDR_HEADER_REV(x)	(((x) >> 16) & 0xf)
 #define  PCI_VNDR_HEADER_LEN(x)	(((x) >> 20) & 0xfff)
 
+/* Resizable BARs */
+#define PCI_REBAR_SIZE_BIAS	20
+#define PCI_REBAR_CAP(n)    	(4 + 8 * (n))	/* capability register */
+#define  PCI_REBAR_CAP_SHIFT		4		/* shift for supported BAR sizes */
+#define PCI_REBAR_CTRL(n)   	(8 + 8 * (n))	/* control register */
+#define  PCI_REBAR_CTRL_BAR_IDX	0x00000007	/* BAR index */
+#define  PCI_REBAR_CTRL_NBAR_MASK	0x000000E0	/* # of resizable BARs */
+#define  PCI_REBAR_CTRL_BAR_SIZE	0x00001F00	/* BAR size */
+#define  PCI_REBAR_CTRL_SIZE(v) \
+            (1UL << (MASK_EXTR(v, PCI_REBAR_CTRL_BAR_SIZE) \
+                     + PCI_REBAR_SIZE_BIAS))
+#define  PCI_REBAR_CTRL_SIZES		0xFFFF0000U	/* supported BAR sizes */
+
 /*
  * Hypertransport sub capability types
  *
diff --git a/xen/include/xen/vpci.h b/xen/include/xen/vpci.h
index 41e7c3bc2791..9d47b8c1a50e 100644
--- a/xen/include/xen/vpci.h
+++ b/xen/include/xen/vpci.h
@@ -78,6 +78,8 @@ uint32_t cf_check vpci_hw_read32(
     const struct pci_dev *pdev, unsigned int reg, void *data);
 void cf_check vpci_hw_write16(
     const struct pci_dev *pdev, unsigned int reg, uint32_t val, void *data);
+void cf_check vpci_hw_write32(
+    const struct pci_dev *pdev, unsigned int reg, uint32_t val, void *data);
 
 /*
  * Check for pending vPCI operations on this vcpu. Returns true if the vcpu
@@ -100,6 +102,7 @@ struct vpci {
             /* Guest address. */
             uint64_t guest_addr;
             uint64_t size;
+            uint64_t resizable_sizes;
             struct rangeset *mem;
             enum {
                 VPCI_BAR_EMPTY,
-- 
2.34.1
Re: [PATCH v4] vpci: Add resizable bar support
Posted by Jan Beulich 4 weeks, 1 day ago
On 19.12.2024 06:21, Jiqian Chen wrote:
> --- /dev/null
> +++ b/xen/drivers/vpci/rebar.c
> @@ -0,0 +1,131 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Copyright (C) 2024 Advanced Micro Devices, Inc. All Rights Reserved.
> + *
> + * Author: Jiqian Chen <Jiqian.Chen@amd.com>
> + */
> +
> +#include <xen/sched.h>
> +#include <xen/vpci.h>
> +
> +static void cf_check rebar_ctrl_write(const struct pci_dev *pdev,
> +                                      unsigned int reg,
> +                                      uint32_t val,
> +                                      void *data)
> +{
> +    struct vpci_bar *bar = data;
> +    uint64_t size = PCI_REBAR_CTRL_SIZE(val);
> +
> +    if ( bar->enabled )
> +    {
> +        /*
> +         * Refuse to resize a BAR while memory decoding is enabled, as
> +         * otherwise the size of the mapped region in the p2m would become
> +         * stale with the newly set BAR size, and the position of the BAR
> +         * would be reset to undefined.  Note the PCIe specification also
> +         * forbids resizing a BAR with memory decoding enabled.
> +         */
> +        if ( size != bar->size )
> +            gprintk(XENLOG_ERR,
> +                    "%pp: refuse to resize BAR with memory decoding enabled\n",
> +                    &pdev->sbdf);
> +        return;
> +    }
> +
> +    if ( !((size >> PCI_REBAR_SIZE_BIAS) & bar->resizable_sizes) )
> +        gprintk(XENLOG_WARNING,
> +                "%pp: new size %#lx is not supported by hardware\n",
> +                &pdev->sbdf, size);
> +
> +    bar->size = size;

Shouldn't at least this be in an "else" to the if() above?
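
A minimal sketch of the structure being suggested here, written as
standalone C rather than Xen code: the cached size is only updated when
the requested size is advertised in the supported-sizes bitmap. The names
demo_bar and demo_resize are hypothetical, and whether real hardware
ignores an unsupported size is a question this thread leaves open.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_REBAR_SIZE_BIAS 20 /* bit 0 of the bitmap means 1 MiB (2^20) */

/* Hypothetical stand-in for the relevant fields of struct vpci_bar. */
struct demo_bar {
    uint64_t size;
    uint64_t resizable_sizes; /* bit n set => size 2^(n + 20) is supported */
};

/* Returns true and updates bar->size only if new_size is advertised. */
static bool demo_resize(struct demo_bar *bar, uint64_t new_size)
{
    if ( !((new_size >> DEMO_REBAR_SIZE_BIAS) & bar->resizable_sizes) )
        return false; /* unsupported: leave the cached size untouched */

    bar->size = new_size;
    return true;
}
```

With this shape the warning path and the update path are mutually
exclusive, which is what an "else" on the if() would achieve in the patch.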

> +    bar->addr = 0;

For maximum compatibility with the behavior on bare metal, would we
perhaps better ...

> +    bar->guest_addr = 0;
> +    pci_conf_write32(pdev->sbdf, reg, val);

... re-read the BAR from hardware after this write?

Similar consideration may apply to ->guest_addr: Driver writers knowing
how their hardware behaves may expect that merely some of the bits of
the address get cleared (if the size increases).
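
To illustrate the behaviour described above, the sketch below assumes
(this is an assumption about typical hardware, not something confirmed in
this thread) that growing a BAR only clears the address bits that fall
below the new size. demo_addr_after_resize is a hypothetical helper.

```c
#include <stdint.h>

/*
 * Assumed behaviour: address bits below the (power-of-two) BAR size are
 * hardwired to zero, so a larger size clears more low bits of the address
 * while the upper bits survive.
 */
static uint64_t demo_addr_after_resize(uint64_t old_addr, uint64_t new_size)
{
    return old_addr & ~(new_size - 1);
}
```

Under that assumption a BAR at 0xfe300000 resized to 16 MiB would read
back as 0xfe000000 rather than 0, which is why re-reading the register is
safer than unconditionally zeroing the cached address.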

> +static int cf_check init_rebar(struct pci_dev *pdev)
> +{
> +    uint32_t ctrl;
> +    unsigned int nbars;
> +    unsigned int rebar_offset = pci_find_ext_capability(pdev->sbdf,
> +                                                        PCI_EXT_CAP_ID_REBAR);
> +
> +    if ( !rebar_offset )
> +        return 0;
> +
> +    if ( !is_hardware_domain(pdev->domain) )
> +    {
> +        printk(XENLOG_ERR "%pp: resizable BARs unsupported for unpriv %pd\n",
> +               &pdev->sbdf, pdev->domain);
> +        return -EOPNOTSUPP;
> +    }
> +
> +    ctrl = pci_conf_read32(pdev->sbdf, rebar_offset + PCI_REBAR_CTRL(0));
> +    nbars = MASK_EXTR(ctrl, PCI_REBAR_CTRL_NBAR_MASK);
> +
> +    for ( unsigned int i = 0; i < nbars; i++ )
> +    {
> +        int rc;
> +        struct vpci_bar *bar;
> +        unsigned int index;
> +
> +        ctrl = pci_conf_read32(pdev->sbdf, rebar_offset + PCI_REBAR_CTRL(i));
> +        index = ctrl & PCI_REBAR_CTRL_BAR_IDX;;

Nit: No double semicolons please.

> +        if ( index >= PCI_HEADER_NORMAL_NR_BARS )
> +        {
> +            /*
> +             * TODO: for failed pathes, need to hide ReBar capability
> +             * from hardware domain instead of returning an error.
> +             */
> +            printk(XENLOG_ERR "%pd %pp: too big BAR number %u in REBAR_CTRL\n",
> +                   pdev->domain, &pdev->sbdf, index);
> +            return -EINVAL;

With the TODO unaddressed, is it actually appropriate to return an error
here? Shouldn't we continue in a best effort manner? (Question also to
Roger as the maintainer.)

> +        }
> +
> +        bar = &pdev->vpci->header.bars[index];
> +        if ( bar->type != VPCI_BAR_MEM64_LO && bar->type != VPCI_BAR_MEM32 )
> +        {
> +            printk(XENLOG_ERR "%pd %pp: BAR%u is not in memory space\n",
> +                   pdev->domain, &pdev->sbdf, index);
> +            return -EINVAL;

Same question here then.

> +        }
> +
> +        rc = vpci_add_register(pdev->vpci, vpci_hw_read32, vpci_hw_write32,
> +                               rebar_offset + PCI_REBAR_CAP(i), 4, NULL);
> +        if ( rc )
> +        {
> +            printk(XENLOG_ERR "%pd %pp: fail to add reg of REBAR_CAP rc=%d\n",
> +                   pdev->domain, &pdev->sbdf, rc);
> +            return rc;
> +        }
> +
> +        rc = vpci_add_register(pdev->vpci, vpci_hw_read32, rebar_ctrl_write,
> +                               rebar_offset + PCI_REBAR_CTRL(i), 4, bar);
> +        if ( rc )
> +        {
> +            printk(XENLOG_ERR "%pd %pp: fail to add reg of REBAR_CTRL rc=%d\n",
> +                   pdev->domain, &pdev->sbdf, rc);
> +            return rc;
> +        }
> +
> +        bar->resizable_sizes |=
> +            (pci_conf_read32(pdev->sbdf, rebar_offset + PCI_REBAR_CAP(i)) >>
> +             PCI_REBAR_CAP_SHIFT);

Imo this would better use = in place of |= and (see also below) would also
better use MASK_EXTR() just like ...

> +        bar->resizable_sizes |=
> +            ((uint64_t)MASK_EXTR(ctrl, PCI_REBAR_CTRL_SIZES) <<
> +             (32 - PCI_REBAR_CAP_SHIFT));

... this one does.

Further I think you want to truncate the value for 32-bit BARs, such that
rebar_ctrl_write() would properly reject attempts to set sizes of 4G and
above for them.
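
One possible way to do the truncation, as a standalone sketch: bit n of
the bitmap stands for a size of 2^(n + 20) bytes, so 4 GiB corresponds to
bit 12 and a 32-bit BAR can only use bits 0 to 11. The helper name
demo_truncate_sizes and the is_32bit flag are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_REBAR_SIZE_BIAS 20

/*
 * Drop every size of 4 GiB (2^32) and above from the bitmap when the BAR
 * is a 32-bit one; 64-bit BARs keep the full bitmap.
 */
static uint64_t demo_truncate_sizes(uint64_t sizes, bool is_32bit)
{
    if ( is_32bit )
        sizes &= (UINT64_C(1) << (32 - DEMO_REBAR_SIZE_BIAS)) - 1;

    return sizes;
}
```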

> --- a/xen/drivers/vpci/vpci.c
> +++ b/xen/drivers/vpci/vpci.c
> @@ -232,6 +232,12 @@ void cf_check vpci_hw_write16(
>      pci_conf_write16(pdev->sbdf, reg, val);
>  }
>  
> +void cf_check vpci_hw_write32(
> +    const struct pci_dev *pdev, unsigned int reg, uint32_t val, void *data)
> +{
> +    pci_conf_write32(pdev->sbdf, reg, val);
> +}

This function is being added just to handle writing of a r/o register.
Can't you better re-use vpci_ignored_write()?

> --- a/xen/include/xen/pci_regs.h
> +++ b/xen/include/xen/pci_regs.h
> @@ -459,6 +459,7 @@
>  #define PCI_EXT_CAP_ID_ARI	14
>  #define PCI_EXT_CAP_ID_ATS	15
>  #define PCI_EXT_CAP_ID_SRIOV	16
> +#define PCI_EXT_CAP_ID_REBAR	21	/* Resizable BAR */
>  
>  /* Advanced Error Reporting */
>  #define PCI_ERR_UNCOR_STATUS	4	/* Uncorrectable Error Status */
> @@ -541,6 +542,19 @@
>  #define  PCI_VNDR_HEADER_REV(x)	(((x) >> 16) & 0xf)
>  #define  PCI_VNDR_HEADER_LEN(x)	(((x) >> 20) & 0xfff)
>  
> +/* Resizable BARs */
> +#define PCI_REBAR_SIZE_BIAS	20

I think it would be best if all register definitions came first, and
auxiliary ones followed afterwards (maybe even separated by a brief
comment for clarity).

> +#define PCI_REBAR_CAP(n)    	(4 + 8 * (n))	/* capability register */
> +#define  PCI_REBAR_CAP_SHIFT		4		/* shift for supported BAR sizes */
> +#define PCI_REBAR_CTRL(n)   	(8 + 8 * (n))	/* control register */

Something's odd with the padding here. Please be consistent with the use
of whitespace (ought to be only hard tabs here afaict).

> +#define  PCI_REBAR_CTRL_BAR_IDX	0x00000007	/* BAR index */
> +#define  PCI_REBAR_CTRL_NBAR_MASK	0x000000E0	/* # of resizable BARs */
> +#define  PCI_REBAR_CTRL_BAR_SIZE	0x00001F00	/* BAR size */

This field is 6 bits wide in the spec I'm looking at. Or else BAR sizes
2^^52 and up can't be encoded.
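
The effect of the narrower mask can be shown with a small standalone
sketch. It assumes the 6-bit field occupies bits 13:8 of the control
register, as the comment above implies; the DEMO_ names are hypothetical
and only the 0x00001F00 value comes from the patch.

```c
#include <stdint.h>

#define DEMO_CTRL_BAR_SIZE_5BIT 0x00001F00U /* mask used in the patch */
#define DEMO_CTRL_BAR_SIZE_6BIT 0x00003F00U /* assumed 6-bit field, bits 13:8 */
#define DEMO_REBAR_SIZE_BIAS    20

/* Decode the BAR size in bytes from a control register value. */
static uint64_t demo_ctrl_size(uint32_t ctrl, uint32_t mask)
{
    unsigned int field = (ctrl & mask) >> 8;

    /* Guard against shift counts that do not fit in 64 bits. */
    if ( field + DEMO_REBAR_SIZE_BIAS >= 64 )
        return 0;

    return UINT64_C(1) << (field + DEMO_REBAR_SIZE_BIAS);
}
```

An encoding of 32 (2^52 bytes) loses its top bit under the 5-bit mask and
decodes as encoding 0, i.e. 1 MiB, while the 6-bit mask decodes it
correctly.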

> +#define  PCI_REBAR_CTRL_SIZE(v) \
> +            (1UL << (MASK_EXTR(v, PCI_REBAR_CTRL_BAR_SIZE) \
> +                     + PCI_REBAR_SIZE_BIAS))
> +#define  PCI_REBAR_CTRL_SIZES		0xFFFF0000U	/* supported BAR sizes */

PCI_REBAR_CAP_SHIFT and PCI_REBAR_CTRL_SIZES don't fit together very well.
Imo both want representing as masks.

Jan
Re: [PATCH v4] vpci: Add resizable bar support
Posted by Chen, Jiqian 3 weeks, 5 days ago
On 2025/1/7 18:06, Jan Beulich wrote:
> On 19.12.2024 06:21, Jiqian Chen wrote:
>> --- /dev/null
>> +++ b/xen/drivers/vpci/rebar.c
>> @@ -0,0 +1,131 @@
>> +/* SPDX-License-Identifier: GPL-2.0-only */
>> +/*
>> + * Copyright (C) 2024 Advanced Micro Devices, Inc. All Rights Reserved.
>> + *
>> + * Author: Jiqian Chen <Jiqian.Chen@amd.com>
>> + */
>> +
>> +#include <xen/sched.h>
>> +#include <xen/vpci.h>
>> +
>> +static void cf_check rebar_ctrl_write(const struct pci_dev *pdev,
>> +                                      unsigned int reg,
>> +                                      uint32_t val,
>> +                                      void *data)
>> +{
>> +    struct vpci_bar *bar = data;
>> +    uint64_t size = PCI_REBAR_CTRL_SIZE(val);
>> +
>> +    if ( bar->enabled )
>> +    {
>> +        /*
>> +         * Refuse to resize a BAR while memory decoding is enabled, as
>> +         * otherwise the size of the mapped region in the p2m would become
>> +         * stale with the newly set BAR size, and the position of the BAR
>> +         * would be reset to undefined.  Note the PCIe specification also
>> +         * forbids resizing a BAR with memory decoding enabled.
>> +         */
>> +        if ( size != bar->size )
>> +            gprintk(XENLOG_ERR,
>> +                    "%pp: refuse to resize BAR with memory decoding enabled\n",
>> +                    &pdev->sbdf);
>> +        return;
>> +    }
>> +
>> +    if ( !((size >> PCI_REBAR_SIZE_BIAS) & bar->resizable_sizes) )
>> +        gprintk(XENLOG_WARNING,
>> +                "%pp: new size %#lx is not supported by hardware\n",
>> +                &pdev->sbdf, size);
>> +
>> +    bar->size = size;
> 
> Shouldn't at least this be in an "else" to the if() above?
After reading your discussion with Roger, here ...

> 
>> +    bar->addr = 0;
> 
> For maximum compatibility with the behavior on bare metal, would we
> perhaps better ...
> 
>> +    bar->guest_addr = 0;
>> +    pci_conf_write32(pdev->sbdf, reg, val);
> 
> ... re-read the BAR from hardware after this write?
> 
> Similar consideration may apply to ->guest_addr: Driver writers knowing
> how their hardware behaves may expect that merely some of the bits of
> the address get cleared (if the size increases).
... and here, I need to use pci_size_mem_bar() to re-read addr and size, and then set guest_addr to addr:
    pci_size_mem_bar(pdev->sbdf, reg, &bar->addr, &bar->size, );
    bar->guest_addr = bar->addr;


> 
>> +static int cf_check init_rebar(struct pci_dev *pdev)
>> +{
>> +    uint32_t ctrl;
>> +    unsigned int nbars;
>> +    unsigned int rebar_offset = pci_find_ext_capability(pdev->sbdf,
>> +                                                        PCI_EXT_CAP_ID_REBAR);
>> +
>> +    if ( !rebar_offset )
>> +        return 0;
>> +
>> +    if ( !is_hardware_domain(pdev->domain) )
>> +    {
>> +        printk(XENLOG_ERR "%pp: resizable BARs unsupported for unpriv %pd\n",
>> +               &pdev->sbdf, pdev->domain);
>> +        return -EOPNOTSUPP;
>> +    }
>> +
>> +    ctrl = pci_conf_read32(pdev->sbdf, rebar_offset + PCI_REBAR_CTRL(0));
>> +    nbars = MASK_EXTR(ctrl, PCI_REBAR_CTRL_NBAR_MASK);
>> +
>> +    for ( unsigned int i = 0; i < nbars; i++ )
>> +    {
>> +        int rc;
>> +        struct vpci_bar *bar;
>> +        unsigned int index;
>> +
>> +        ctrl = pci_conf_read32(pdev->sbdf, rebar_offset + PCI_REBAR_CTRL(i));
>> +        index = ctrl & PCI_REBAR_CTRL_BAR_IDX;;
> 
> Nit: No double semicolons please.
> 
>> +        if ( index >= PCI_HEADER_NORMAL_NR_BARS )
>> +        {
>> +            /*
>> +             * TODO: for failed pathes, need to hide ReBar capability
>> +             * from hardware domain instead of returning an error.
>> +             */
>> +            printk(XENLOG_ERR "%pd %pp: too big BAR number %u in REBAR_CTRL\n",
>> +                   pdev->domain, &pdev->sbdf, index);
>> +            return -EINVAL;
> 
> With the TODO unaddressed, is it actually appropriate to return an error
> here? Shouldn't we continue in a best effort manner? (Question also to
> Roger as the maintainer.)
> 
>> +        }
>> +
>> +        bar = &pdev->vpci->header.bars[index];
>> +        if ( bar->type != VPCI_BAR_MEM64_LO && bar->type != VPCI_BAR_MEM32 )
>> +        {
>> +            printk(XENLOG_ERR "%pd %pp: BAR%u is not in memory space\n",
>> +                   pdev->domain, &pdev->sbdf, index);
>> +            return -EINVAL;
> 
> Same question here then.
After reading your discussion with Roger, I will change this to "continue" here and above.

> 
>> +        }
>> +
>> +        rc = vpci_add_register(pdev->vpci, vpci_hw_read32, vpci_hw_write32,
>> +                               rebar_offset + PCI_REBAR_CAP(i), 4, NULL);
>> +        if ( rc )
>> +        {
>> +            printk(XENLOG_ERR "%pd %pp: fail to add reg of REBAR_CAP rc=%d\n",
>> +                   pdev->domain, &pdev->sbdf, rc);
>> +            return rc;
>> +        }
>> +
>> +        rc = vpci_add_register(pdev->vpci, vpci_hw_read32, rebar_ctrl_write,
>> +                               rebar_offset + PCI_REBAR_CTRL(i), 4, bar);
>> +        if ( rc )
>> +        {
>> +            printk(XENLOG_ERR "%pd %pp: fail to add reg of REBAR_CTRL rc=%d\n",
>> +                   pdev->domain, &pdev->sbdf, rc);
>> +            return rc;
>> +        }
>> +
>> +        bar->resizable_sizes |=
>> +            (pci_conf_read32(pdev->sbdf, rebar_offset + PCI_REBAR_CAP(i)) >>
>> +             PCI_REBAR_CAP_SHIFT);
> 
> Imo this would better use = in place of |= and (see also below) would also
> better use MASK_EXTR() just like ...
> 
>> +        bar->resizable_sizes |=
>> +            ((uint64_t)MASK_EXTR(ctrl, PCI_REBAR_CTRL_SIZES) <<
>> +             (32 - PCI_REBAR_CAP_SHIFT));
> 
> ... this one does.
Combined with your comments below about the macros "PCI_REBAR_CAP_SHIFT" and "PCI_REBAR_CTRL_SIZES",
I will change "PCI_REBAR_CAP_SHIFT 4" to "PCI_REBAR_CAP_SIZES_MASK 0xFFFFFFF0U" and
"PCI_REBAR_CTRL_SIZES 0xFFFF0000U" to "PCI_REBAR_CTRL_SIZES_MASK 0xFFFF0000U".
The code here will then be:
        bar->resizable_sizes =
            MASK_EXTR(pci_conf_read32(pdev->sbdf,
                                      rebar_offset + PCI_REBAR_CAP(i)),
                      PCI_REBAR_CAP_SIZES_MASK);
        bar->resizable_sizes |=
            (((uint64_t)MASK_EXTR(ctrl, PCI_REBAR_CTRL_SIZES_MASK) << 32) /
             ISOLATE_LSB(PCI_REBAR_CAP_SIZES_MASK));
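
The same combination can be written out as a standalone sketch without
the Xen MASK_EXTR() and ISOLATE_LSB() helpers, to make the bit layout
explicit. The DEMO_ names and demo_supported_sizes are hypothetical; the
mask values are the ones proposed above.

```c
#include <stdint.h>

#define DEMO_CAP_SIZES_MASK  0xFFFFFFF0U /* capability register bits 31:4 */
#define DEMO_CTRL_SIZES_MASK 0xFFFF0000U /* control register bits 31:16 */

/*
 * Build one 64-bit bitmap of supported sizes: the 28 bits from the
 * capability register form bits 0..27, and the 16 bits from the control
 * register continue at bits 28..43.
 */
static uint64_t demo_supported_sizes(uint32_t cap, uint32_t ctrl)
{
    uint64_t low  = (cap & DEMO_CAP_SIZES_MASK) >> 4;
    uint64_t high = (ctrl & DEMO_CTRL_SIZES_MASK) >> 16;

    return low | (high << 28);
}
```

Shifting the control bits left by 28 is equivalent to shifting by 32 and
dividing by the lowest set bit of the capability mask (16), as in the
snippet above.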

> 
> Further I think you want to truncate the value for 32-bit BARs, such that
> rebar_ctrl_write() would properly reject attempts to set sizes of 4G and
> above for them.
After reading your discussion with Roger: since I will change the code to re-read the size from hardware, I think nothing further needs to be done for this comment.

> 
>> --- a/xen/drivers/vpci/vpci.c
>> +++ b/xen/drivers/vpci/vpci.c
>> @@ -232,6 +232,12 @@ void cf_check vpci_hw_write16(
>>      pci_conf_write16(pdev->sbdf, reg, val);
>>  }
>>  
>> +void cf_check vpci_hw_write32(
>> +    const struct pci_dev *pdev, unsigned int reg, uint32_t val, void *data)
>> +{
>> +    pci_conf_write32(pdev->sbdf, reg, val);
>> +}
> 
> This function is being added just to handle writing of a r/o register.
> Can't you better re-use vpci_ignored_write()?
> 
>> --- a/xen/include/xen/pci_regs.h
>> +++ b/xen/include/xen/pci_regs.h
>> @@ -459,6 +459,7 @@
>>  #define PCI_EXT_CAP_ID_ARI	14
>>  #define PCI_EXT_CAP_ID_ATS	15
>>  #define PCI_EXT_CAP_ID_SRIOV	16
>> +#define PCI_EXT_CAP_ID_REBAR	21	/* Resizable BAR */
>>  
>>  /* Advanced Error Reporting */
>>  #define PCI_ERR_UNCOR_STATUS	4	/* Uncorrectable Error Status */
>> @@ -541,6 +542,19 @@
>>  #define  PCI_VNDR_HEADER_REV(x)	(((x) >> 16) & 0xf)
>>  #define  PCI_VNDR_HEADER_LEN(x)	(((x) >> 20) & 0xfff)
>>  
>> +/* Resizable BARs */
>> +#define PCI_REBAR_SIZE_BIAS	20
> 
> I think it would be best if all register definitions came first, and
> auxiliary ones followed afterwards (maybe even separated by a brief
> comment for clarity).
> 
>> +#define PCI_REBAR_CAP(n)    	(4 + 8 * (n))	/* capability register */
>> +#define  PCI_REBAR_CAP_SHIFT		4		/* shift for supported BAR sizes */
>> +#define PCI_REBAR_CTRL(n)   	(8 + 8 * (n))	/* control register */
> 
> Something's odd with the padding here. Please be consistent with the use
> of whitespace (ought to be only hard tabs here afaict).
Sorry, I don't understand how to modify it specifically.

> 
>> +#define  PCI_REBAR_CTRL_BAR_IDX	0x00000007	/* BAR index */
>> +#define  PCI_REBAR_CTRL_NBAR_MASK	0x000000E0	/* # of resizable BARs */
>> +#define  PCI_REBAR_CTRL_BAR_SIZE	0x00001F00	/* BAR size */
> 
> This field is 6 bits wide in the spec I'm looking at. Or else BAR sizes
> 2^^52 and up can't be encoded.
> 
>> +#define  PCI_REBAR_CTRL_SIZE(v) \
>> +            (1UL << (MASK_EXTR(v, PCI_REBAR_CTRL_BAR_SIZE) \
>> +                     + PCI_REBAR_SIZE_BIAS))
>> +#define  PCI_REBAR_CTRL_SIZES		0xFFFF0000U	/* supported BAR sizes */
> 
> PCI_REBAR_CAP_SHIFT and PCI_REBAR_CTRL_SIZES don't fit together very well.
> Imo both want representing as masks.
> 
> Jan

-- 
Best regards,
Jiqian Chen.
Re: [PATCH v4] vpci: Add resizable bar support
Posted by Jan Beulich 3 weeks, 5 days ago
On 10.01.2025 08:10, Chen, Jiqian wrote:
> On 2025/1/7 18:06, Jan Beulich wrote:
>> On 19.12.2024 06:21, Jiqian Chen wrote:
>>> +#define PCI_REBAR_CAP(n)    	(4 + 8 * (n))	/* capability register */
>>> +#define  PCI_REBAR_CAP_SHIFT		4		/* shift for supported BAR sizes */
>>> +#define PCI_REBAR_CTRL(n)   	(8 + 8 * (n))	/* control register */
>>
>> Something's odd with the padding here. Please be consistent with the use
>> of whitespace (ought to be only hard tabs here afaict).
> Sorry, I don't understand how to modify it specifically.

You surely have noticed that in two of the three quoted lines there are
blanks immediately followed by tabs in the padding. This can hardly ever
be correct. (The overall goal wants to be that "same level" definitions
are column-wise properly aligned with one another. While nested ones,
like you have it for PCI_REBAR_CAP_SHIFT, are properly identified as
being nested. You want to check with other parts of the file if in doubt.)

Jan
Re: [PATCH v4] vpci: Add resizable bar support
Posted by Roger Pau Monné 4 weeks ago
On Tue, Jan 07, 2025 at 11:06:33AM +0100, Jan Beulich wrote:
> On 19.12.2024 06:21, Jiqian Chen wrote:
> > --- /dev/null
> > +++ b/xen/drivers/vpci/rebar.c
> > @@ -0,0 +1,131 @@
> > +/* SPDX-License-Identifier: GPL-2.0-only */
> > +/*
> > + * Copyright (C) 2024 Advanced Micro Devices, Inc. All Rights Reserved.
> > + *
> > + * Author: Jiqian Chen <Jiqian.Chen@amd.com>
> > + */
> > +
> > +#include <xen/sched.h>
> > +#include <xen/vpci.h>
> > +
> > +static void cf_check rebar_ctrl_write(const struct pci_dev *pdev,
> > +                                      unsigned int reg,
> > +                                      uint32_t val,
> > +                                      void *data)
> > +{
> > +    struct vpci_bar *bar = data;
> > +    uint64_t size = PCI_REBAR_CTRL_SIZE(val);
> > +
> > +    if ( bar->enabled )
> > +    {
> > +        /*
> > +         * Refuse to resize a BAR while memory decoding is enabled, as
> > +         * otherwise the size of the mapped region in the p2m would become
> > +         * stale with the newly set BAR size, and the position of the BAR
> > +         * would be reset to undefined.  Note the PCIe specification also
> > +         * forbids resizing a BAR with memory decoding enabled.
> > +         */
> > +        if ( size != bar->size )
> > +            gprintk(XENLOG_ERR,
> > +                    "%pp: refuse to resize BAR with memory decoding enabled\n",
> > +                    &pdev->sbdf);
> > +        return;
> > +    }
> > +
> > +    if ( !((size >> PCI_REBAR_SIZE_BIAS) & bar->resizable_sizes) )
> > +        gprintk(XENLOG_WARNING,
> > +                "%pp: new size %#lx is not supported by hardware\n",
> > +                &pdev->sbdf, size);
> > +
> > +    bar->size = size;
> 
> Shouldn't at least this be in an "else" to the if() above?

I think this was already raised in a previous version - would be good
to know how real hardware behaves when an invalid size is set.  Is the
BAR register still reset?

> > +    bar->addr = 0;
> 
> For maximum compatibility with the behavior on bare metal, would we
> perhaps better ...
> 
> > +    bar->guest_addr = 0;
> > +    pci_conf_write32(pdev->sbdf, reg, val);
> 
> ... re-read the BAR from hardware after this write?
> 
> Similar consideration may apply to ->guest_addr: Driver writers knowing
> how their hardware behaves may expect that merely some of the bits of
> the address get cleared (if the size increases).

Since we only plan to enable the capability for the hardware domain,
and in that case addr == guest_addr always, it's fine to just read
from the BAR register and update the fields.  If we do this we might
as well check that the newly reported BAR size matches what Xen
expects on debug builds at least.
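
A rough standalone sketch of such a consistency check, using the standard
PCI BAR sizing rule rather than any Xen helper. It assumes the caller has
already written all-ones to the BAR and passes the read-back value as one
64-bit quantity (with the upper half filled with ones for a 32-bit BAR);
both function names are hypothetical.

```c
#include <stdint.h>

/*
 * Standard PCI memory BAR sizing: in the value read back after writing
 * all-ones, the bits below the BAR size are hardwired to zero.  Masking
 * off the low 4 flag bits, inverting and adding one yields the size.
 */
static uint64_t demo_mem_bar_size(uint64_t readback)
{
    return ~(readback & ~UINT64_C(0xF)) + 1;
}

/* Returns nonzero when the probed size matches the size Xen expects. */
static int demo_size_matches(uint64_t readback, uint64_t expected)
{
    return demo_mem_bar_size(readback) == expected;
}
```

In a debug build the result of such a comparison could feed an assertion
or a log message; how that is wired into vPCI is left open here.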

> > +static int cf_check init_rebar(struct pci_dev *pdev)
> > +{
> > +    uint32_t ctrl;
> > +    unsigned int nbars;
> > +    unsigned int rebar_offset = pci_find_ext_capability(pdev->sbdf,
> > +                                                        PCI_EXT_CAP_ID_REBAR);
> > +
> > +    if ( !rebar_offset )
> > +        return 0;
> > +
> > +    if ( !is_hardware_domain(pdev->domain) )
> > +    {
> > +        printk(XENLOG_ERR "%pp: resizable BARs unsupported for unpriv %pd\n",
> > +               &pdev->sbdf, pdev->domain);
> > +        return -EOPNOTSUPP;
> > +    }
> > +
> > +    ctrl = pci_conf_read32(pdev->sbdf, rebar_offset + PCI_REBAR_CTRL(0));
> > +    nbars = MASK_EXTR(ctrl, PCI_REBAR_CTRL_NBAR_MASK);
> > +
> > +    for ( unsigned int i = 0; i < nbars; i++ )
> > +    {
> > +        int rc;
> > +        struct vpci_bar *bar;
> > +        unsigned int index;
> > +
> > +        ctrl = pci_conf_read32(pdev->sbdf, rebar_offset + PCI_REBAR_CTRL(i));
> > +        index = ctrl & PCI_REBAR_CTRL_BAR_IDX;;
> 
> Nit: No double semicolons please.
> 
> > +        if ( index >= PCI_HEADER_NORMAL_NR_BARS )
> > +        {
> > +            /*
> > +             * TODO: for failed pathes, need to hide ReBar capability
> > +             * from hardware domain instead of returning an error.
> > +             */
> > +            printk(XENLOG_ERR "%pd %pp: too big BAR number %u in REBAR_CTRL\n",
> > +                   pdev->domain, &pdev->sbdf, index);
> > +            return -EINVAL;
> 
> With the TODO unaddressed, is it actually appropriate to return an error
> here? Shouldn't we continue in a best effort manner? (Question also to
> Roger as the maintainer.)

It would indeed be better to swallow the error and return 0, however
the handlers added in this loop would need removing if no error is
returned.

> > +        }
> > +
> > +        bar = &pdev->vpci->header.bars[index];
> > +        if ( bar->type != VPCI_BAR_MEM64_LO && bar->type != VPCI_BAR_MEM32 )
> > +        {
> > +            printk(XENLOG_ERR "%pd %pp: BAR%u is not in memory space\n",
> > +                   pdev->domain, &pdev->sbdf, index);
> > +            return -EINVAL;
> 
> Same question here then.
> 
> > +        }
> > +
> > +        rc = vpci_add_register(pdev->vpci, vpci_hw_read32, vpci_hw_write32,
> > +                               rebar_offset + PCI_REBAR_CAP(i), 4, NULL);
> > +        if ( rc )
> > +        {
> > +            printk(XENLOG_ERR "%pd %pp: fail to add reg of REBAR_CAP rc=%d\n",
> > +                   pdev->domain, &pdev->sbdf, rc);
> > +            return rc;
> > +        }
> > +
> > +        rc = vpci_add_register(pdev->vpci, vpci_hw_read32, rebar_ctrl_write,
> > +                               rebar_offset + PCI_REBAR_CTRL(i), 4, bar);
> > +        if ( rc )
> > +        {
> > +            printk(XENLOG_ERR "%pd %pp: fail to add reg of REBAR_CTRL rc=%d\n",
> > +                   pdev->domain, &pdev->sbdf, rc);
> > +            return rc;
> > +        }
> > +
> > +        bar->resizable_sizes |=
> > +            (pci_conf_read32(pdev->sbdf, rebar_offset + PCI_REBAR_CAP(i)) >>
> > +             PCI_REBAR_CAP_SHIFT);
> 
> Imo this would better use = in place of |= and (see also below) would also
> better use MASK_EXTR() just like ...
> 
> > +        bar->resizable_sizes |=
> > +            ((uint64_t)MASK_EXTR(ctrl, PCI_REBAR_CTRL_SIZES) <<
> > +             (32 - PCI_REBAR_CAP_SHIFT));
> 
> ... this one does.
> 
> Further I think you want to truncate the value for 32-bit BARs, such that
> rebar_ctrl_write() would properly reject attempts to set sizes of 4G and
> above for them.

For the hardware domain at least we shouldn't add such restriction -
Xen in general allows dom0 to do things it would otherwise consider
invalid, in case it has to deal with hardware quirks.

Rather than rejecting, Xen should just print a warning that the sizes
supported by the device are likely invalid.
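
To make the bitmap discussion concrete, the following is a small sketch
under stated assumptions rather than the patch's implementation. The
sketch_* names are hypothetical, and the field positions (supported sizes
in bits 31:4 of the capability register and bits 31:16 of the control
register) are assumed to match what PCI_REBAR_CAP_SHIFT and
PCI_REBAR_CTRL_SIZES describe. The first helper builds the bitmap with a
plain assignment followed by a single OR. The second returns the advertised
sizes of 4G and above, which a caller could use to decide whether to print
a warning for a 32-bit BAR.

```c
#include <stdint.h>

#define SKETCH_CAP_SHIFT   4            /* assumed: sizes start at CAP bit 4 */
#define SKETCH_CTRL_SIZES  0xffff0000u  /* assumed: larger sizes in CTRL 31:16 */
#define SKETCH_CTRL_SHIFT  16
#define SKETCH_SIZE_BIAS   20           /* bitmap bit n means 2^(n + 20) bytes */

/* Combine both registers into one bitmap: bit n set means 2^(n + 20) bytes. */
static uint64_t sketch_supported_sizes(uint32_t cap, uint32_t ctrl)
{
    uint64_t sizes = cap >> SKETCH_CAP_SHIFT;

    sizes |= (uint64_t)((ctrl & SKETCH_CTRL_SIZES) >> SKETCH_CTRL_SHIFT) <<
             (32 - SKETCH_CAP_SHIFT);

    return sizes;
}

/* Subset of the bitmap describing sizes of 4G (2^32 bytes) and above. */
static uint64_t sketch_sizes_4g_and_above(uint64_t sizes)
{
    uint64_t below_4g = ((uint64_t)1 << (32 - SKETCH_SIZE_BIAS)) - 1;

    return sizes & ~below_4g;
}
```

A non-zero result from the second helper for a 32-bit BAR would be the
case where a warning, rather than a rejection, is being proposed.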

> > --- a/xen/drivers/vpci/vpci.c
> > +++ b/xen/drivers/vpci/vpci.c
> > @@ -232,6 +232,12 @@ void cf_check vpci_hw_write16(
> >      pci_conf_write16(pdev->sbdf, reg, val);
> >  }
> >  
> > +void cf_check vpci_hw_write32(
> > +    const struct pci_dev *pdev, unsigned int reg, uint32_t val, void *data)
> > +{
> > +    pci_conf_write32(pdev->sbdf, reg, val);
> > +}
> 
> This function is being added just to handle writing of a r/o register.
> Can't you better re-use vpci_ignored_write()?

But vpci_ignored_write() ignores the write, OTOH here the write is
propagated to the hardware.

Thanks, Roger.
Re: [PATCH v4] vpci: Add resizable bar support
Posted by Jan Beulich 4 weeks ago
On 07.01.2025 15:38, Roger Pau Monné wrote:
> On Tue, Jan 07, 2025 at 11:06:33AM +0100, Jan Beulich wrote:
>> On 19.12.2024 06:21, Jiqian Chen wrote:
>>> --- /dev/null
>>> +++ b/xen/drivers/vpci/rebar.c
>>> @@ -0,0 +1,131 @@
>>> +/* SPDX-License-Identifier: GPL-2.0-only */
>>> +/*
>>> + * Copyright (C) 2024 Advanced Micro Devices, Inc. All Rights Reserved.
>>> + *
>>> + * Author: Jiqian Chen <Jiqian.Chen@amd.com>
>>> + */
>>> +
>>> +#include <xen/sched.h>
>>> +#include <xen/vpci.h>
>>> +
>>> +static void cf_check rebar_ctrl_write(const struct pci_dev *pdev,
>>> +                                      unsigned int reg,
>>> +                                      uint32_t val,
>>> +                                      void *data)
>>> +{
>>> +    struct vpci_bar *bar = data;
>>> +    uint64_t size = PCI_REBAR_CTRL_SIZE(val);
>>> +
>>> +    if ( bar->enabled )
>>> +    {
>>> +        /*
>>> +         * Refuse to resize a BAR while memory decoding is enabled, as
>>> +         * otherwise the size of the mapped region in the p2m would become
>>> +         * stale with the newly set BAR size, and the position of the BAR
>>> +         * would be reset to undefined.  Note the PCIe specification also
>>> +         * forbids resizing a BAR with memory decoding enabled.
>>> +         */
>>> +        if ( size != bar->size )
>>> +            gprintk(XENLOG_ERR,
>>> +                    "%pp: refuse to resize BAR with memory decoding enabled\n",
>>> +                    &pdev->sbdf);
>>> +        return;
>>> +    }
>>> +
>>> +    if ( !((size >> PCI_REBAR_SIZE_BIAS) & bar->resizable_sizes) )
>>> +        gprintk(XENLOG_WARNING,
>>> +                "%pp: new size %#lx is not supported by hardware\n",
>>> +                &pdev->sbdf, size);
>>> +
>>> +    bar->size = size;
>>
>> Shouldn't at least this be in an "else" to the if() above?
> 
> I think this was already raised in a previous version - would be good
> to know how real hardware behaves when an invalid size is set.  Is the
> BAR register still reset?

I'm pretty sure what happens is undefined. I'd expect though that the
BAR size then doesn't change. Which would require the above assignment
to not be unconditional.
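
A minimal sketch of what a conditional assignment could look like, assuming
the cached bitmap uses bit n for a size of 2^(n + 20) bytes as in the
quoted check. The structure and function names are hypothetical and only
model the two fields involved.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_SIZE_BIAS 20

/* Hypothetical stand-in holding only the fields this check needs. */
struct sketch_bar {
    uint64_t size;
    uint64_t resizable_sizes; /* bit n set: 2^(n + 20) bytes supported */
};

/*
 * Record the new size only if the device advertises it.  Returns false
 * and leaves bar->size untouched otherwise.  The size is expected to be a
 * power of two, as decoded from the control register.
 */
static bool sketch_set_size(struct sketch_bar *bar, uint64_t size)
{
    if ( !((size >> SKETCH_SIZE_BIAS) & bar->resizable_sizes) )
        return false;

    bar->size = size;

    return true;
}
```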

>>> +static int cf_check init_rebar(struct pci_dev *pdev)
>>> +{
>>> +    uint32_t ctrl;
>>> +    unsigned int nbars;
>>> +    unsigned int rebar_offset = pci_find_ext_capability(pdev->sbdf,
>>> +                                                        PCI_EXT_CAP_ID_REBAR);
>>> +
>>> +    if ( !rebar_offset )
>>> +        return 0;
>>> +
>>> +    if ( !is_hardware_domain(pdev->domain) )
>>> +    {
>>> +        printk(XENLOG_ERR "%pp: resizable BARs unsupported for unpriv %pd\n",
>>> +               &pdev->sbdf, pdev->domain);
>>> +        return -EOPNOTSUPP;
>>> +    }
>>> +
>>> +    ctrl = pci_conf_read32(pdev->sbdf, rebar_offset + PCI_REBAR_CTRL(0));
>>> +    nbars = MASK_EXTR(ctrl, PCI_REBAR_CTRL_NBAR_MASK);
>>> +
>>> +    for ( unsigned int i = 0; i < nbars; i++ )
>>> +    {
>>> +        int rc;
>>> +        struct vpci_bar *bar;
>>> +        unsigned int index;
>>> +
>>> +        ctrl = pci_conf_read32(pdev->sbdf, rebar_offset + PCI_REBAR_CTRL(i));
>>> +        index = ctrl & PCI_REBAR_CTRL_BAR_IDX;;
>>
>> Nit: No double semicolons please.
>>
>>> +        if ( index >= PCI_HEADER_NORMAL_NR_BARS )
>>> +        {
>>> +            /*
>>> +             * TODO: for failed pathes, need to hide ReBar capability
>>> +             * from hardware domain instead of returning an error.
>>> +             */
>>> +            printk(XENLOG_ERR "%pd %pp: too big BAR number %u in REBAR_CTRL\n",
>>> +                   pdev->domain, &pdev->sbdf, index);
>>> +            return -EINVAL;
>>
>> With the TODO unaddressed, is it actually appropriate to return an error
>> here? Shouldn't we continue in a best effort manner? (Question also to
>> Roger as the maintainer.)
> 
> It would indeed be better to swallow the error and return 0, however
> the handlers added in this loop would need removing if no error is
> returned.

Would they? For those BARs where things worked fine I would think they
could be left in place.

>>> +        }
>>> +
>>> +        bar = &pdev->vpci->header.bars[index];
>>> +        if ( bar->type != VPCI_BAR_MEM64_LO && bar->type != VPCI_BAR_MEM32 )
>>> +        {
>>> +            printk(XENLOG_ERR "%pd %pp: BAR%u is not in memory space\n",
>>> +                   pdev->domain, &pdev->sbdf, index);
>>> +            return -EINVAL;
>>
>> Same question here then.
>>
>>> +        }
>>> +
>>> +        rc = vpci_add_register(pdev->vpci, vpci_hw_read32, vpci_hw_write32,
>>> +                               rebar_offset + PCI_REBAR_CAP(i), 4, NULL);
>>> +        if ( rc )
>>> +        {
>>> +            printk(XENLOG_ERR "%pd %pp: fail to add reg of REBAR_CAP rc=%d\n",
>>> +                   pdev->domain, &pdev->sbdf, rc);
>>> +            return rc;
>>> +        }
>>> +
>>> +        rc = vpci_add_register(pdev->vpci, vpci_hw_read32, rebar_ctrl_write,
>>> +                               rebar_offset + PCI_REBAR_CTRL(i), 4, bar);
>>> +        if ( rc )
>>> +        {
>>> +            printk(XENLOG_ERR "%pd %pp: fail to add reg of REBAR_CTRL rc=%d\n",
>>> +                   pdev->domain, &pdev->sbdf, rc);
>>> +            return rc;
>>> +        }
>>> +
>>> +        bar->resizable_sizes |=
>>> +            (pci_conf_read32(pdev->sbdf, rebar_offset + PCI_REBAR_CAP(i)) >>
>>> +             PCI_REBAR_CAP_SHIFT);
>>
>> Imo this would better use = in place of |= and (see also below) would also
>> better use MASK_EXTR() just like ...
>>
>>> +        bar->resizable_sizes |=
>>> +            ((uint64_t)MASK_EXTR(ctrl, PCI_REBAR_CTRL_SIZES) <<
>>> +             (32 - PCI_REBAR_CAP_SHIFT));
>>
>> ... this one does.
>>
>> Further I think you want to truncate the value for 32-bit BARs, such that
>> rebar_ctrl_write() would properly reject attempts to set sizes of 4G and
>> above for them.
> 
> For the hardware domain at least we shouldn't add such restriction -
> Xen in general allows dom0 to do things it would otherwise consider
> invalid, in case it has to deal with hardware quirks.
> 
> Rather than reject Xen should just print a warning that the sizes
> supported by the device are likely invalid.

And do what when memory decode is re-enabled on the device? What size
of P2M update should it do then?

>>> --- a/xen/drivers/vpci/vpci.c
>>> +++ b/xen/drivers/vpci/vpci.c
>>> @@ -232,6 +232,12 @@ void cf_check vpci_hw_write16(
>>>      pci_conf_write16(pdev->sbdf, reg, val);
>>>  }
>>>  
>>> +void cf_check vpci_hw_write32(
>>> +    const struct pci_dev *pdev, unsigned int reg, uint32_t val, void *data)
>>> +{
>>> +    pci_conf_write32(pdev->sbdf, reg, val);
>>> +}
>>
>> This function is being added just to handle writing of a r/o register.
>> Can't you better re-use vpci_ignored_write()?
> 
> But vpci_ignored_write() ignores the write, OTOH here the write is
> propagated to the hardware.

Right, just for the hardware to drop it. I wouldn't have commented if a
function doing this already existed. Adding yet another cf_check
function just for this is what prompted the remark.

Jan

Re: [PATCH v4] vpci: Add resizable bar support
Posted by Roger Pau Monné 4 weeks ago
On Tue, Jan 07, 2025 at 04:58:07PM +0100, Jan Beulich wrote:
> On 07.01.2025 15:38, Roger Pau Monné wrote:
> > On Tue, Jan 07, 2025 at 11:06:33AM +0100, Jan Beulich wrote:
> >> On 19.12.2024 06:21, Jiqian Chen wrote:
> >>> --- /dev/null
> >>> +++ b/xen/drivers/vpci/rebar.c
> >>> @@ -0,0 +1,131 @@
> >>> +/* SPDX-License-Identifier: GPL-2.0-only */
> >>> +/*
> >>> + * Copyright (C) 2024 Advanced Micro Devices, Inc. All Rights Reserved.
> >>> + *
> >>> + * Author: Jiqian Chen <Jiqian.Chen@amd.com>
> >>> + */
> >>> +
> >>> +#include <xen/sched.h>
> >>> +#include <xen/vpci.h>
> >>> +
> >>> +static void cf_check rebar_ctrl_write(const struct pci_dev *pdev,
> >>> +                                      unsigned int reg,
> >>> +                                      uint32_t val,
> >>> +                                      void *data)
> >>> +{
> >>> +    struct vpci_bar *bar = data;
> >>> +    uint64_t size = PCI_REBAR_CTRL_SIZE(val);
> >>> +
> >>> +    if ( bar->enabled )
> >>> +    {
> >>> +        /*
> >>> +         * Refuse to resize a BAR while memory decoding is enabled, as
> >>> +         * otherwise the size of the mapped region in the p2m would become
> >>> +         * stale with the newly set BAR size, and the position of the BAR
> >>> +         * would be reset to undefined.  Note the PCIe specification also
> >>> +         * forbids resizing a BAR with memory decoding enabled.
> >>> +         */
> >>> +        if ( size != bar->size )
> >>> +            gprintk(XENLOG_ERR,
> >>> +                    "%pp: refuse to resize BAR with memory decoding enabled\n",
> >>> +                    &pdev->sbdf);
> >>> +        return;
> >>> +    }
> >>> +
> >>> +    if ( !((size >> PCI_REBAR_SIZE_BIAS) & bar->resizable_sizes) )
> >>> +        gprintk(XENLOG_WARNING,
> >>> +                "%pp: new size %#lx is not supported by hardware\n",
> >>> +                &pdev->sbdf, size);
> >>> +
> >>> +    bar->size = size;
> >>
> >> Shouldn't at least this be in an "else" to the if() above?
> > 
> > I think this was already raised in a previous version - would be good
> > to know how real hardware behaves when an invalid size is set.  Is the
> > BAR register still reset?
> 
> I'm pretty sure what happens is undefined. I'd expect though that the
> BAR size then doesn't change. Which would require the above assignment
> to not be unconditional.

Might be better to just re-size the BAR, like you suggested to fetch
the BAR position from the register, instead of assuming 0.

> >>> +        if ( index >= PCI_HEADER_NORMAL_NR_BARS )
> >>> +        {
> >>> +            /*
> >>> +             * TODO: for failed pathes, need to hide ReBar capability
> >>> +             * from hardware domain instead of returning an error.
> >>> +             */
> >>> +            printk(XENLOG_ERR "%pd %pp: too big BAR number %u in REBAR_CTRL\n",
> >>> +                   pdev->domain, &pdev->sbdf, index);
> >>> +            return -EINVAL;
> >>
> >> With the TODO unaddressed, is it actually appropriate to return an error
> >> here? Shouldn't we continue in a best effort manner? (Question also to
> >> Roger as the maintainer.)
> > 
> > It would indeed be better to swallow the error and return 0, however
> > the handlers added in this loop would need removing if no error is
> > returned.
> 
> Would they? For those BARs where things worked fine I would think they
> could be left in place.

Hm, it's kind of partial support, but yes, that could likely be fine.
Then the return here should be a continue instead.
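
The best-effort loop being discussed could be sketched as below. This is an
illustration only: the control register values are passed in as an array
instead of being read from config space, handler registration is reduced to
setting a flag per BAR, and the sketch_* names and the BAR Index mask
(bits 2:0) are assumptions made for the example.

```c
#include <stdint.h>

#define SKETCH_NR_BARS      6    /* BARs in a normal (type 0) header */
#define SKETCH_CTRL_BAR_IDX 0x7u /* assumed: BAR Index in bits 2:0 */

/*
 * Walk the control register values and mark the BARs that got handlers.
 * handled[] must have SKETCH_NR_BARS entries.  Returns the number of
 * entries that were set up.
 */
static unsigned int sketch_init_rebar(const uint32_t *ctrl,
                                      unsigned int nbars,
                                      uint8_t *handled)
{
    unsigned int count = 0;

    for ( unsigned int i = 0; i < nbars; i++ )
    {
        unsigned int index = ctrl[i] & SKETCH_CTRL_BAR_IDX;

        if ( index >= SKETCH_NR_BARS )
            continue; /* skip the bad entry, keep the earlier handlers */

        handled[index] = 1;
        count++;
    }

    return count;
}
```

With a continue, an out-of-range entry no longer undoes or blocks the
entries that were fine, which is the partial support mentioned above.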

> >>> +        }
> >>> +
> >>> +        bar = &pdev->vpci->header.bars[index];
> >>> +        if ( bar->type != VPCI_BAR_MEM64_LO && bar->type != VPCI_BAR_MEM32 )
> >>> +        {
> >>> +            printk(XENLOG_ERR "%pd %pp: BAR%u is not in memory space\n",
> >>> +                   pdev->domain, &pdev->sbdf, index);
> >>> +            return -EINVAL;
> >>
> >> Same question here then.
> >>
> >>> +        }
> >>> +
> >>> +        rc = vpci_add_register(pdev->vpci, vpci_hw_read32, vpci_hw_write32,
> >>> +                               rebar_offset + PCI_REBAR_CAP(i), 4, NULL);
> >>> +        if ( rc )
> >>> +        {
> >>> +            printk(XENLOG_ERR "%pd %pp: fail to add reg of REBAR_CAP rc=%d\n",
> >>> +                   pdev->domain, &pdev->sbdf, rc);
> >>> +            return rc;
> >>> +        }
> >>> +
> >>> +        rc = vpci_add_register(pdev->vpci, vpci_hw_read32, rebar_ctrl_write,
> >>> +                               rebar_offset + PCI_REBAR_CTRL(i), 4, bar);
> >>> +        if ( rc )
> >>> +        {
> >>> +            printk(XENLOG_ERR "%pd %pp: fail to add reg of REBAR_CTRL rc=%d\n",
> >>> +                   pdev->domain, &pdev->sbdf, rc);
> >>> +            return rc;
> >>> +        }
> >>> +
> >>> +        bar->resizable_sizes |=
> >>> +            (pci_conf_read32(pdev->sbdf, rebar_offset + PCI_REBAR_CAP(i)) >>
> >>> +             PCI_REBAR_CAP_SHIFT);
> >>
> >> Imo this would better use = in place of |= and (see also below) would also
> >> better use MASK_EXTR() just like ...
> >>
> >>> +        bar->resizable_sizes |=
> >>> +            ((uint64_t)MASK_EXTR(ctrl, PCI_REBAR_CTRL_SIZES) <<
> >>> +             (32 - PCI_REBAR_CAP_SHIFT));
> >>
> >> ... this one does.
> >>
> >> Further I think you want to truncate the value for 32-bit BARs, such that
> >> rebar_ctrl_write() would properly reject attempts to set sizes of 4G and
> >> above for them.
> > 
> > For the hardware domain at least we shouldn't add such restriction -
> > Xen in general allows dom0 to do things it would otherwise consider
> > invalid, in case it has to deal with hardware quirks.
> > 
> > Rather than reject Xen should just print a warning that the sizes
> > supported by the device are likely invalid.
> 
> And do what when memory decode is re-enabled on the device? What size a
> P2M update should it do then?

You did suggest re-reading the BAR positions after a ctrl write; we
might as well read the BAR size and use that to be on the safe side.

> >>> --- a/xen/drivers/vpci/vpci.c
> >>> +++ b/xen/drivers/vpci/vpci.c
> >>> @@ -232,6 +232,12 @@ void cf_check vpci_hw_write16(
> >>>      pci_conf_write16(pdev->sbdf, reg, val);
> >>>  }
> >>>  
> >>> +void cf_check vpci_hw_write32(
> >>> +    const struct pci_dev *pdev, unsigned int reg, uint32_t val, void *data)
> >>> +{
> >>> +    pci_conf_write32(pdev->sbdf, reg, val);
> >>> +}
> >>
> >> This function is being added just to handle writing of a r/o register.
> >> Can't you better re-use vpci_ignored_write()?
> > 
> > But vpci_ignored_write() ignores the write, OTOH here the write is
> > propagated to the hardware.
> 
> Right, just for the hardware to drop it. I wouldn't have commented if
> the function needed to do things like this already existed. Adding yet
> another cf_check function just for this is what made me give the remark.

According to the spec yes, they will be ignored.  Yet for the hardware
domain we try to avoid changing behavior from native as much as
possible, hence propagating the write seems more appropriate.

Thanks, Roger.

Re: [PATCH v4] vpci: Add resizable bar support
Posted by Jan Beulich 4 weeks ago
On 07.01.2025 19:19, Roger Pau Monné wrote:
> On Tue, Jan 07, 2025 at 04:58:07PM +0100, Jan Beulich wrote:
>> On 07.01.2025 15:38, Roger Pau Monné wrote:
>>> On Tue, Jan 07, 2025 at 11:06:33AM +0100, Jan Beulich wrote:
>>>> On 19.12.2024 06:21, Jiqian Chen wrote:
>>>>> --- /dev/null
>>>>> +++ b/xen/drivers/vpci/rebar.c
>>>>> @@ -0,0 +1,131 @@
>>>>> +/* SPDX-License-Identifier: GPL-2.0-only */
>>>>> +/*
>>>>> + * Copyright (C) 2024 Advanced Micro Devices, Inc. All Rights Reserved.
>>>>> + *
>>>>> + * Author: Jiqian Chen <Jiqian.Chen@amd.com>
>>>>> + */
>>>>> +
>>>>> +#include <xen/sched.h>
>>>>> +#include <xen/vpci.h>
>>>>> +
>>>>> +static void cf_check rebar_ctrl_write(const struct pci_dev *pdev,
>>>>> +                                      unsigned int reg,
>>>>> +                                      uint32_t val,
>>>>> +                                      void *data)
>>>>> +{
>>>>> +    struct vpci_bar *bar = data;
>>>>> +    uint64_t size = PCI_REBAR_CTRL_SIZE(val);
>>>>> +
>>>>> +    if ( bar->enabled )
>>>>> +    {
>>>>> +        /*
>>>>> +         * Refuse to resize a BAR while memory decoding is enabled, as
>>>>> +         * otherwise the size of the mapped region in the p2m would become
>>>>> +         * stale with the newly set BAR size, and the position of the BAR
>>>>> +         * would be reset to undefined.  Note the PCIe specification also
>>>>> +         * forbids resizing a BAR with memory decoding enabled.
>>>>> +         */
>>>>> +        if ( size != bar->size )
>>>>> +            gprintk(XENLOG_ERR,
>>>>> +                    "%pp: refuse to resize BAR with memory decoding enabled\n",
>>>>> +                    &pdev->sbdf);
>>>>> +        return;
>>>>> +    }
>>>>> +
>>>>> +    if ( !((size >> PCI_REBAR_SIZE_BIAS) & bar->resizable_sizes) )
>>>>> +        gprintk(XENLOG_WARNING,
>>>>> +                "%pp: new size %#lx is not supported by hardware\n",
>>>>> +                &pdev->sbdf, size);
>>>>> +
>>>>> +    bar->size = size;
>>>>
>>>> Shouldn't at least this be in an "else" to the if() above?
>>>
>>> I think this was already raised in a previous version - would be good
>>> to know how real hardware behaves when an invalid size is set.  Is the
>>> BAR register still reset?
>>
>> I'm pretty sure what happens is undefined. I'd expect though that the
>> BAR size then doesn't change. Which would require the above assignment
>> to not be unconditional.
> 
> Might be better to just re-size the BAR, like you suggested to fetch
> the BAR position from the register, instead of assuming 0.

For the avoidance of doubt: by "re-size" you mean re-obtain its size (seeing we're talking of
re-sizable BARs here)? As kind of confirmed ...

>>>>> +        }
>>>>> +
>>>>> +        bar = &pdev->vpci->header.bars[index];
>>>>> +        if ( bar->type != VPCI_BAR_MEM64_LO && bar->type != VPCI_BAR_MEM32 )
>>>>> +        {
>>>>> +            printk(XENLOG_ERR "%pd %pp: BAR%u is not in memory space\n",
>>>>> +                   pdev->domain, &pdev->sbdf, index);
>>>>> +            return -EINVAL;
>>>>
>>>> Same question here then.
>>>>
>>>>> +        }
>>>>> +
>>>>> +        rc = vpci_add_register(pdev->vpci, vpci_hw_read32, vpci_hw_write32,
>>>>> +                               rebar_offset + PCI_REBAR_CAP(i), 4, NULL);
>>>>> +        if ( rc )
>>>>> +        {
>>>>> +            printk(XENLOG_ERR "%pd %pp: fail to add reg of REBAR_CAP rc=%d\n",
>>>>> +                   pdev->domain, &pdev->sbdf, rc);
>>>>> +            return rc;
>>>>> +        }
>>>>> +
>>>>> +        rc = vpci_add_register(pdev->vpci, vpci_hw_read32, rebar_ctrl_write,
>>>>> +                               rebar_offset + PCI_REBAR_CTRL(i), 4, bar);
>>>>> +        if ( rc )
>>>>> +        {
>>>>> +            printk(XENLOG_ERR "%pd %pp: fail to add reg of REBAR_CTRL rc=%d\n",
>>>>> +                   pdev->domain, &pdev->sbdf, rc);
>>>>> +            return rc;
>>>>> +        }
>>>>> +
>>>>> +        bar->resizable_sizes |=
>>>>> +            (pci_conf_read32(pdev->sbdf, rebar_offset + PCI_REBAR_CAP(i)) >>
>>>>> +             PCI_REBAR_CAP_SHIFT);
>>>>
>>>> Imo this would better use = in place of |= and (see also below) would also
>>>> better use MASK_EXTR() just like ...
>>>>
>>>>> +        bar->resizable_sizes |=
>>>>> +            ((uint64_t)MASK_EXTR(ctrl, PCI_REBAR_CTRL_SIZES) <<
>>>>> +             (32 - PCI_REBAR_CAP_SHIFT));
>>>>
>>>> ... this one does.
>>>>
>>>> Further I think you want to truncate the value for 32-bit BARs, such that
>>>> rebar_ctrl_write() would properly reject attempts to set sizes of 4G and
>>>> above for them.
>>>
>>> For the hardware domain at least we shouldn't add such restriction -
>>> Xen in general allows dom0 to do things it would otherwise consider
>>> invalid, in case it has to deal with hardware quirks.
>>>
>>> Rather than reject Xen should just print a warning that the sizes
>>> supported by the device are likely invalid.
>>
>> And do what when memory decode is re-enabled on the device? What size a
>> P2M update should it do then?
> 
> You did suggest to re-read the BARs positions after a ctrl write, we
> might as well read the BAR size and use that to be on the safe side.

... here.

>>>>> --- a/xen/drivers/vpci/vpci.c
>>>>> +++ b/xen/drivers/vpci/vpci.c
>>>>> @@ -232,6 +232,12 @@ void cf_check vpci_hw_write16(
>>>>>      pci_conf_write16(pdev->sbdf, reg, val);
>>>>>  }
>>>>>  
>>>>> +void cf_check vpci_hw_write32(
>>>>> +    const struct pci_dev *pdev, unsigned int reg, uint32_t val, void *data)
>>>>> +{
>>>>> +    pci_conf_write32(pdev->sbdf, reg, val);
>>>>> +}
>>>>
>>>> This function is being added just to handle writing of a r/o register.
>>>> Can't you better re-use vpci_ignored_write()?
>>>
>>> But vpci_ignored_write() ignores the write, OTOH here the write is
>>> propagated to the hardware.
>>
>> Right, just for the hardware to drop it. I wouldn't have commented if
>> the function needed to do things like this already existed. Adding yet
>> another cf_check function just for this is what made me give the remark.
> 
> According to the spec yes, they will be ignored.  Yet for the hardware
> domain we try to avoid changing behavior from native as much as
> possible, hence propagating the write seems more appropriate.

Okay; you're the maintainer of this code anyway.

Jan

Re: [PATCH v4] vpci: Add resizable bar support
Posted by Roger Pau Monné 4 weeks ago
On Wed, Jan 08, 2025 at 08:19:55AM +0100, Jan Beulich wrote:
> On 07.01.2025 19:19, Roger Pau Monné wrote:
> > On Tue, Jan 07, 2025 at 04:58:07PM +0100, Jan Beulich wrote:
> >> On 07.01.2025 15:38, Roger Pau Monné wrote:
> >>> On Tue, Jan 07, 2025 at 11:06:33AM +0100, Jan Beulich wrote:
> >>>> On 19.12.2024 06:21, Jiqian Chen wrote:
> >>>>> --- /dev/null
> >>>>> +++ b/xen/drivers/vpci/rebar.c
> >>>>> @@ -0,0 +1,131 @@
> >>>>> +/* SPDX-License-Identifier: GPL-2.0-only */
> >>>>> +/*
> >>>>> + * Copyright (C) 2024 Advanced Micro Devices, Inc. All Rights Reserved.
> >>>>> + *
> >>>>> + * Author: Jiqian Chen <Jiqian.Chen@amd.com>
> >>>>> + */
> >>>>> +
> >>>>> +#include <xen/sched.h>
> >>>>> +#include <xen/vpci.h>
> >>>>> +
> >>>>> +static void cf_check rebar_ctrl_write(const struct pci_dev *pdev,
> >>>>> +                                      unsigned int reg,
> >>>>> +                                      uint32_t val,
> >>>>> +                                      void *data)
> >>>>> +{
> >>>>> +    struct vpci_bar *bar = data;
> >>>>> +    uint64_t size = PCI_REBAR_CTRL_SIZE(val);
> >>>>> +
> >>>>> +    if ( bar->enabled )
> >>>>> +    {
> >>>>> +        /*
> >>>>> +         * Refuse to resize a BAR while memory decoding is enabled, as
> >>>>> +         * otherwise the size of the mapped region in the p2m would become
> >>>>> +         * stale with the newly set BAR size, and the position of the BAR
> >>>>> +         * would be reset to undefined.  Note the PCIe specification also
> >>>>> +         * forbids resizing a BAR with memory decoding enabled.
> >>>>> +         */
> >>>>> +        if ( size != bar->size )
> >>>>> +            gprintk(XENLOG_ERR,
> >>>>> +                    "%pp: refuse to resize BAR with memory decoding enabled\n",
> >>>>> +                    &pdev->sbdf);
> >>>>> +        return;
> >>>>> +    }
> >>>>> +
> >>>>> +    if ( !((size >> PCI_REBAR_SIZE_BIAS) & bar->resizable_sizes) )
> >>>>> +        gprintk(XENLOG_WARNING,
> >>>>> +                "%pp: new size %#lx is not supported by hardware\n",
> >>>>> +                &pdev->sbdf, size);
> >>>>> +
> >>>>> +    bar->size = size;
> >>>>
> >>>> Shouldn't at least this be in an "else" to the if() above?
> >>>
> >>> I think this was already raised in a previous version - would be good
> >>> to know how real hardware behaves when an invalid size is set.  Is the
> >>> BAR register still reset?
> >>
> >> I'm pretty sure what happens is undefined. I'd expect though that the
> >> BAR size then doesn't change. Which would require the above assignment
> >> to not be unconditional.
> > 
> > Might be better to just re-size the BAR, like you suggested to fetch
> > the BAR position from the register, instead of assuming 0.
> 
> FTAOD by "re-size" you mean re-obtain its size (seeing we're talking of
> re-sizable BARs here)? As kind of confirmed ...

Indeed, I meant to re-obtain the size (I can see that being
confusing in this context, sorry).
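
For readers unfamiliar with how a BAR's size is re-obtained, the usual
probe is to write all ones to the BAR, read it back, and restore the
original value. The sketch below runs that procedure against a simulated
32-bit memory BAR. The device model and all sketch_* names are hypothetical
and exist only to make the example self-contained. Real code would go
through the config space accessors and do this with memory decoding
disabled.

```c
#include <stdint.h>

#define SKETCH_BAR_FLAGS 0xfu /* low bits of a memory BAR: type and prefetch */

/* Simulated 32-bit memory BAR; size must be a power of two, at least 16. */
struct sketch_dev {
    uint32_t bar;  /* current register contents */
    uint32_t size; /* size the device currently decodes */
};

static uint32_t sketch_read_bar(const struct sketch_dev *dev)
{
    return dev->bar;
}

static void sketch_write_bar(struct sketch_dev *dev, uint32_t val)
{
    /* Address bits below the size are hardwired to 0; flags are read-only. */
    dev->bar = (val & ~(dev->size - 1)) | (dev->bar & SKETCH_BAR_FLAGS);
}

/* Classic sizing probe: write all ones, read back, restore, derive size. */
static uint64_t sketch_probe_size(struct sketch_dev *dev)
{
    uint32_t orig = sketch_read_bar(dev);
    uint32_t mask;

    sketch_write_bar(dev, ~0u);
    mask = sketch_read_bar(dev) & ~SKETCH_BAR_FLAGS;
    sketch_write_bar(dev, orig);

    return (uint32_t)(~mask + 1);
}
```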

> >>>>> --- a/xen/drivers/vpci/vpci.c
> >>>>> +++ b/xen/drivers/vpci/vpci.c
> >>>>> @@ -232,6 +232,12 @@ void cf_check vpci_hw_write16(
> >>>>>      pci_conf_write16(pdev->sbdf, reg, val);
> >>>>>  }
> >>>>>  
> >>>>> +void cf_check vpci_hw_write32(
> >>>>> +    const struct pci_dev *pdev, unsigned int reg, uint32_t val, void *data)
> >>>>> +{
> >>>>> +    pci_conf_write32(pdev->sbdf, reg, val);
> >>>>> +}
> >>>>
> >>>> This function is being added just to handle writing of a r/o register.
> >>>> Can't you better re-use vpci_ignored_write()?
> >>>
> >>> But vpci_ignored_write() ignores the write, OTOH here the write is
> >>> propagated to the hardware.
> >>
> >> Right, just for the hardware to drop it. I wouldn't have commented if
> >> the function needed to do things like this already existed. Adding yet
> >> another cf_check function just for this is what made me give the remark.
> > 
> > According to the spec yes, they will be ignored.  Yet for the hardware
> > domain we try to avoid changing behavior from native as much as
> > possible, hence propagating the write seems more appropriate.
> 
> Okay; you're the maintainer of this code anyway.

Thanks for all your input Jan, you might not be the maintainer but
have certainly reviewed all vPCI code.

Roger.