When vpci fails to initialize an extended capability of device, it
just returns an error and vPCI gets disabled for the whole device.
So, add function to hide extended capability when initialization
fails. And remove the failed extended capability handler from vpci
extended capability list.
Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
---
cc: "Roger Pau Monné" <roger.pau@citrix.com>
cc: Andrew Cooper <andrew.cooper3@citrix.com>
cc: Anthony PERARD <anthony.perard@vates.tech>
cc: Michal Orzel <michal.orzel@amd.com>
cc: Jan Beulich <jbeulich@suse.com>
cc: Julien Grall <julien@xen.org>
cc: Stefano Stabellini <sstabellini@kernel.org>
---
v8->v9 changes:
No
v7->v8 changes:
* s/and force/hopefully forcing/ in vpci_ext_capability_hide().
* Add Roger's Reviewed-by.
v6->v7 changes:
* Change the pointer parameter of vpci_get_previous_ext_cap_register()
and vpci_ext_capability_hide() to be const.
v5->v6 changes:
* Change to use for loop to compact code of vpci_get_previous_ext_cap_register().
* Rename parameter rm to r in vpci_ext_capability_hide().
* Change comment to describ the case that hide capability of position
0x100U.
v4->v5 changes:
* Modify the hex digits of PCI_EXT_CAP_NEXT_MASK and PCI_EXT_CAP_NEXT to be low case.
* Rename vpci_ext_capability_mask to vpci_ext_capability_hide.
v3->v4 changes:
* Change definition of PCI_EXT_CAP_NEXT to be "#define PCI_EXT_CAP_NEXT(header)
(MASK_EXTR(header, PCI_EXT_CAP_NEXT_MASK) & 0xFFCU)" to avoid redundancy.
* Modify the commit message.
* Change vpci_ext_capability_mask() to return error instead of using ASSERT.
* Set the capability ID part to be zero when we need to hide the capability of position 0x100U.
* Add check "if ( !offset )" in vpci_ext_capability_mask().
v2->v3 changes:
* Separated from the last version patch "vpci: Hide capability when it fails to initialize".
* Whole implementation changed because last version is wrong.
This version gets target handler and previous handler from vpci->handlers, then remove the target.
* Note: a case in function vpci_ext_capability_mask() needs to be discussed,
because it may change the offset of next capability when the offset of target
capability is 0x100U(the first extended capability), my implementation is just to
ignore and let hardware to handle the target capability.
v1->v2 changes:
* Removed the "priorities" of initializing capabilities since it isn't used anymore.
* Added new function vpci_capability_mask() and vpci_ext_capability_mask() to
remove failed capability from list.
* Called vpci_make_msix_hole() in the end of init_msix().
Best regards,
Jiqian Chen.
---
xen/drivers/vpci/vpci.c | 88 ++++++++++++++++++++++++++++++++++++++
xen/include/xen/pci_regs.h | 5 ++-
2 files changed, 92 insertions(+), 1 deletion(-)
diff --git a/xen/drivers/vpci/vpci.c b/xen/drivers/vpci/vpci.c
index da67226e4f4d..01f5746b64df 100644
--- a/xen/drivers/vpci/vpci.c
+++ b/xen/drivers/vpci/vpci.c
@@ -165,6 +165,92 @@ static int vpci_capability_hide(const struct pci_dev *pdev, unsigned int cap)
return 0;
}
+static struct vpci_register *vpci_get_previous_ext_cap_register(
+ const struct vpci *vpci, unsigned int offset)
+{
+ unsigned int pos = PCI_CFG_SPACE_SIZE;
+ struct vpci_register *r;
+
+ if ( offset <= PCI_CFG_SPACE_SIZE )
+ {
+ ASSERT_UNREACHABLE();
+ return NULL;
+ }
+
+ for ( r = vpci_get_register(vpci, pos, 4); r;
+ r = pos > PCI_CFG_SPACE_SIZE ? vpci_get_register(vpci, pos, 4)
+ : NULL )
+ {
+ uint32_t header = (uint32_t)(uintptr_t)r->private;
+
+ ASSERT(header == (uintptr_t)r->private);
+
+ pos = PCI_EXT_CAP_NEXT(header);
+ if ( pos == offset )
+ break;
+ }
+
+ return r;
+}
+
+static int vpci_ext_capability_hide(
+ const struct pci_dev *pdev, unsigned int cap)
+{
+ const unsigned int offset = pci_find_ext_capability(pdev->sbdf, cap);
+ struct vpci_register *r, *prev_r;
+ struct vpci *vpci = pdev->vpci;
+ uint32_t header, pre_header;
+
+ if ( offset < PCI_CFG_SPACE_SIZE )
+ {
+ ASSERT_UNREACHABLE();
+ return 0;
+ }
+
+ spin_lock(&vpci->lock);
+ r = vpci_get_register(vpci, offset, 4);
+ if ( !r )
+ {
+ spin_unlock(&vpci->lock);
+ return -ENODEV;
+ }
+
+ header = (uint32_t)(uintptr_t)r->private;
+ if ( offset == PCI_CFG_SPACE_SIZE )
+ {
+ if ( PCI_EXT_CAP_NEXT(header) <= PCI_CFG_SPACE_SIZE )
+ r->private = (void *)(uintptr_t)0;
+ else
+ /*
+ * The first extended capability (0x100) can not be removed from
+ * the linked list, so instead mask its capability ID to return 0
+ * hopefully forcing OSes to skip it.
+ */
+ r->private = (void *)(uintptr_t)(header & ~PCI_EXT_CAP_ID(header));
+
+ spin_unlock(&vpci->lock);
+ return 0;
+ }
+
+ prev_r = vpci_get_previous_ext_cap_register(vpci, offset);
+ if ( !prev_r )
+ {
+ spin_unlock(&vpci->lock);
+ return -ENODEV;
+ }
+
+ pre_header = (uint32_t)(uintptr_t)prev_r->private;
+ pre_header &= ~PCI_EXT_CAP_NEXT_MASK;
+ pre_header |= header & PCI_EXT_CAP_NEXT_MASK;
+ prev_r->private = (void *)(uintptr_t)pre_header;
+
+ list_del(&r->node);
+ spin_unlock(&vpci->lock);
+ xfree(r);
+
+ return 0;
+}
+
static int vpci_init_capabilities(struct pci_dev *pdev)
{
for ( unsigned int i = 0; i < NUM_VPCI_INIT; i++ )
@@ -206,6 +292,8 @@ static int vpci_init_capabilities(struct pci_dev *pdev)
if ( !is_ext )
rc = vpci_capability_hide(pdev, cap);
+ else
+ rc = vpci_ext_capability_hide(pdev, cap);
if ( rc )
{
printk(XENLOG_ERR "%pd %pp: hide %s cap %u fail rc=%d\n",
diff --git a/xen/include/xen/pci_regs.h b/xen/include/xen/pci_regs.h
index 27b4f44eedf3..3b6963133dbd 100644
--- a/xen/include/xen/pci_regs.h
+++ b/xen/include/xen/pci_regs.h
@@ -448,7 +448,10 @@
/* Extended Capabilities (PCI-X 2.0 and Express) */
#define PCI_EXT_CAP_ID(header) ((header) & 0x0000ffff)
#define PCI_EXT_CAP_VER(header) (((header) >> 16) & 0xf)
-#define PCI_EXT_CAP_NEXT(header) (((header) >> 20) & 0xffc)
+#define PCI_EXT_CAP_NEXT_MASK 0xfff00000U
+/* Bottom two bits of next capability position are reserved. */
+#define PCI_EXT_CAP_NEXT(header) \
+ (MASK_EXTR(header, PCI_EXT_CAP_NEXT_MASK) & 0xffcU)
#define PCI_EXT_CAP_ID_ERR 1
#define PCI_EXT_CAP_ID_VC 2
--
2.34.1
On 28.07.2025 07:03, Jiqian Chen wrote:
> +static int vpci_ext_capability_hide(
> + const struct pci_dev *pdev, unsigned int cap)
> +{
> + const unsigned int offset = pci_find_ext_capability(pdev->sbdf, cap);
> + struct vpci_register *r, *prev_r;
> + struct vpci *vpci = pdev->vpci;
> + uint32_t header, pre_header;
> +
> + if ( offset < PCI_CFG_SPACE_SIZE )
> + {
> + ASSERT_UNREACHABLE();
> + return 0;
> + }
> +
> + spin_lock(&vpci->lock);
> + r = vpci_get_register(vpci, offset, 4);
> + if ( !r )
> + {
> + spin_unlock(&vpci->lock);
> + return -ENODEV;
> + }
> +
> + header = (uint32_t)(uintptr_t)r->private;
> + if ( offset == PCI_CFG_SPACE_SIZE )
> + {
> + if ( PCI_EXT_CAP_NEXT(header) <= PCI_CFG_SPACE_SIZE )
> + r->private = (void *)(uintptr_t)0;
Eclair regards this a Misra rule 11.9 violation. Elsewhere we use (void *)0,
which I then would conclude is "fine". But I can't say why that is. Cc-ing
Bugseng for a possible explanation.
Jan
On 2025-07-30 11:50, Jan Beulich wrote:
> On 28.07.2025 07:03, Jiqian Chen wrote:
>> +static int vpci_ext_capability_hide(
>> + const struct pci_dev *pdev, unsigned int cap)
>> +{
>> + const unsigned int offset = pci_find_ext_capability(pdev->sbdf,
>> cap);
>> + struct vpci_register *r, *prev_r;
>> + struct vpci *vpci = pdev->vpci;
>> + uint32_t header, pre_header;
>> +
>> + if ( offset < PCI_CFG_SPACE_SIZE )
>> + {
>> + ASSERT_UNREACHABLE();
>> + return 0;
>> + }
>> +
>> + spin_lock(&vpci->lock);
>> + r = vpci_get_register(vpci, offset, 4);
>> + if ( !r )
>> + {
>> + spin_unlock(&vpci->lock);
>> + return -ENODEV;
>> + }
>> +
>> + header = (uint32_t)(uintptr_t)r->private;
>> + if ( offset == PCI_CFG_SPACE_SIZE )
>> + {
>> + if ( PCI_EXT_CAP_NEXT(header) <= PCI_CFG_SPACE_SIZE )
>> + r->private = (void *)(uintptr_t)0;
>
> Eclair regards this a Misra rule 11.9 violation. Elsewhere we use (void
> *)0,
> which I then would conclude is "fine". But I can't say why that is.
> Cc-ing
> Bugseng for a possible explanation.
>
Hi Jan,
I only see
0|$ git grep "(void\*)0"
xen/include/xen/types.h:#define NULL ((void*)0)
which is fine for R11.9 of course. As Andrew noted, I don't see the need
for the use of uintptr_t either.
--
Nicola Vetrini, B.Sc.
Software Engineer
BUGSENG (https://bugseng.com)
LinkedIn: https://www.linkedin.com/in/nicola-vetrini-a42471253
On 2025-07-30 12:42, Nicola Vetrini wrote:
> On 2025-07-30 11:50, Jan Beulich wrote:
>> On 28.07.2025 07:03, Jiqian Chen wrote:
>>> +static int vpci_ext_capability_hide(
>>> + const struct pci_dev *pdev, unsigned int cap)
>>> +{
>>> + const unsigned int offset = pci_find_ext_capability(pdev->sbdf,
>>> cap);
>>> + struct vpci_register *r, *prev_r;
>>> + struct vpci *vpci = pdev->vpci;
>>> + uint32_t header, pre_header;
>>> +
>>> + if ( offset < PCI_CFG_SPACE_SIZE )
>>> + {
>>> + ASSERT_UNREACHABLE();
>>> + return 0;
>>> + }
>>> +
>>> + spin_lock(&vpci->lock);
>>> + r = vpci_get_register(vpci, offset, 4);
>>> + if ( !r )
>>> + {
>>> + spin_unlock(&vpci->lock);
>>> + return -ENODEV;
>>> + }
>>> +
>>> + header = (uint32_t)(uintptr_t)r->private;
>>> + if ( offset == PCI_CFG_SPACE_SIZE )
>>> + {
>>> + if ( PCI_EXT_CAP_NEXT(header) <= PCI_CFG_SPACE_SIZE )
>>> + r->private = (void *)(uintptr_t)0;
>>
>> Eclair regards this a Misra rule 11.9 violation. Elsewhere we use
>> (void *)0,
>> which I then would conclude is "fine". But I can't say why that is.
>> Cc-ing
>> Bugseng for a possible explanation.
>>
>
> Hi Jan,
>
> I only see
>
> 0|$ git grep "(void\*)0"
> xen/include/xen/types.h:#define NULL ((void*)0)
>
> which is fine for R11.9 of course. As Andrew noted, I don't see the
> need for the use of uintptr_t either.
Oh, I missed forms using a space before the pointer. In any case, from
the rule's Amplification: "Note: a null pointer constant of the form
(void *)0 is permitted, whether or not it was expanded from NULL."
--
Nicola Vetrini, B.Sc.
Software Engineer
BUGSENG (https://bugseng.com)
LinkedIn: https://www.linkedin.com/in/nicola-vetrini-a42471253
On 2025/7/30 18:46, Nicola Vetrini wrote:
> On 2025-07-30 12:42, Nicola Vetrini wrote:
>> On 2025-07-30 11:50, Jan Beulich wrote:
>>> On 28.07.2025 07:03, Jiqian Chen wrote:
>>>> +static int vpci_ext_capability_hide(
>>>> + const struct pci_dev *pdev, unsigned int cap)
>>>> +{
>>>> + const unsigned int offset = pci_find_ext_capability(pdev->sbdf, cap);
>>>> + struct vpci_register *r, *prev_r;
>>>> + struct vpci *vpci = pdev->vpci;
>>>> + uint32_t header, pre_header;
>>>> +
>>>> + if ( offset < PCI_CFG_SPACE_SIZE )
>>>> + {
>>>> + ASSERT_UNREACHABLE();
>>>> + return 0;
>>>> + }
>>>> +
>>>> + spin_lock(&vpci->lock);
>>>> + r = vpci_get_register(vpci, offset, 4);
>>>> + if ( !r )
>>>> + {
>>>> + spin_unlock(&vpci->lock);
>>>> + return -ENODEV;
>>>> + }
>>>> +
>>>> + header = (uint32_t)(uintptr_t)r->private;
>>>> + if ( offset == PCI_CFG_SPACE_SIZE )
>>>> + {
>>>> + if ( PCI_EXT_CAP_NEXT(header) <= PCI_CFG_SPACE_SIZE )
>>>> + r->private = (void *)(uintptr_t)0;
>>>
>>> Eclair regards this a Misra rule 11.9 violation. Elsewhere we use (void *)0,
>>> which I then would conclude is "fine". But I can't say why that is. Cc-ing
>>> Bugseng for a possible explanation.
>>>
>>
>> Hi Jan,
>>
>> I only see
>>
>> 0|$ git grep "(void\*)0"
>> xen/include/xen/types.h:#define NULL ((void*)0)
>>
>> which is fine for R11.9 of course. As Andrew noted, I don't see the need for the use of uintptr_t either.
>
> Oh, I missed forms using a space before the pointer. In any case, from the rule's Amplification: "Note: a null pointer constant of the form (void *)0 is permitted, whether or not it was expanded from NULL."
>
Thank you for helping to solve this problem.
Thank you both very much!
--
Best regards,
Jiqian Chen.
On 30/07/2025 10:50 am, Jan Beulich wrote:
> On 28.07.2025 07:03, Jiqian Chen wrote:
>> +static int vpci_ext_capability_hide(
>> + const struct pci_dev *pdev, unsigned int cap)
>> +{
>> + const unsigned int offset = pci_find_ext_capability(pdev->sbdf, cap);
>> + struct vpci_register *r, *prev_r;
>> + struct vpci *vpci = pdev->vpci;
>> + uint32_t header, pre_header;
>> +
>> + if ( offset < PCI_CFG_SPACE_SIZE )
>> + {
>> + ASSERT_UNREACHABLE();
>> + return 0;
>> + }
>> +
>> + spin_lock(&vpci->lock);
>> + r = vpci_get_register(vpci, offset, 4);
>> + if ( !r )
>> + {
>> + spin_unlock(&vpci->lock);
>> + return -ENODEV;
>> + }
>> +
>> + header = (uint32_t)(uintptr_t)r->private;
>> + if ( offset == PCI_CFG_SPACE_SIZE )
>> + {
>> + if ( PCI_EXT_CAP_NEXT(header) <= PCI_CFG_SPACE_SIZE )
>> + r->private = (void *)(uintptr_t)0;
> Eclair regards this a Misra rule 11.9 violation. Elsewhere we use (void *)0,
> which I then would conclude is "fine". But I can't say why that is. Cc-ing
> Bugseng for a possible explanation.
Eclair is complaining that this isn't written r->private = NULL.
Given that private is a pointer, I don't understand why NULL isn't used
either.
~Andrew
On 30.07.2025 12:19, Andrew Cooper wrote:
> On 30/07/2025 10:50 am, Jan Beulich wrote:
>> On 28.07.2025 07:03, Jiqian Chen wrote:
>>> +static int vpci_ext_capability_hide(
>>> + const struct pci_dev *pdev, unsigned int cap)
>>> +{
>>> + const unsigned int offset = pci_find_ext_capability(pdev->sbdf, cap);
>>> + struct vpci_register *r, *prev_r;
>>> + struct vpci *vpci = pdev->vpci;
>>> + uint32_t header, pre_header;
>>> +
>>> + if ( offset < PCI_CFG_SPACE_SIZE )
>>> + {
>>> + ASSERT_UNREACHABLE();
>>> + return 0;
>>> + }
>>> +
>>> + spin_lock(&vpci->lock);
>>> + r = vpci_get_register(vpci, offset, 4);
>>> + if ( !r )
>>> + {
>>> + spin_unlock(&vpci->lock);
>>> + return -ENODEV;
>>> + }
>>> +
>>> + header = (uint32_t)(uintptr_t)r->private;
>>> + if ( offset == PCI_CFG_SPACE_SIZE )
>>> + {
>>> + if ( PCI_EXT_CAP_NEXT(header) <= PCI_CFG_SPACE_SIZE )
>>> + r->private = (void *)(uintptr_t)0;
>> Eclair regards this a Misra rule 11.9 violation. Elsewhere we use (void *)0,
>> which I then would conclude is "fine". But I can't say why that is. Cc-ing
>> Bugseng for a possible explanation.
>
> Eclair is complaining that this isn't written r->private = NULL.
>
> Given that private is a pointer, I don't understand why NULL isn't used
> either.
As with the various uses in calls to vpci_add_register(), the goal is to
indicate we want a value of 0 (could in principle be non-0 values as well,
but happens to be 0 in a number of cases), disguised as a pointer. Which
NULL doesn't quite express. And NULL there would also be inconsistent with
some (void *)0x25 that may need using elsewhere.
Jan
© 2016 - 2025 Red Hat, Inc.