[PATCH v8 8/8] vpci/msix: Free MSIX resources when init_msix() fails

Jiqian Chen posted 8 patches 3 months, 1 week ago
There is a newer version of this series
[PATCH v8 8/8] vpci/msix: Free MSIX resources when init_msix() fails
Posted by Jiqian Chen 3 months, 1 week ago
When init_msix() fails, current logic return fail and free MSIX-related
resources in vpci_deassign_device(). But the previous new changes will
hide MSIX capability and return success, it can't reach
vpci_deassign_device() to remove resources if hiding success, so those
resources must be removed in cleanup function of MSIX.

To do that, implement cleanup function for MSIX.

Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
---
cc: "Roger Pau Monné" <roger.pau@citrix.com>
---
v7->v8 changes:
* Given the code in vpci_remove_registers() an error in the removal of
  registers would likely imply memory corruption, at which point it's
  best to fully disable the device. So, Rollback the last two modifications of v7.

v6->v7 changes:
* Change the pointer parameter of cleanup_msix() to be const.
* When vpci_remove_registers() in cleanup_msix() fails, not to return
  directly, instead try to free msix and re-add ctrl handler.
* Pass pdev->vpci into vpci_add_register() instead of pdev->vpci->msix in
  init_msix() since we need that every handler realize that msix is NULL
  when msix is freed but handlers are still in there.

v5->v6 changes:
* Change the logic to add dummy handler when !vpci->msix in cleanup_msix().

v4->v5 changes:
* Change definition "static void cleanup_msix" to "static int cf_check cleanup_msix"
  since cleanup hook is changed to be int.
* Add a read-only register for MSIX Control Register in the end of cleanup_msix().

v3->v4 changes:
* Change function name from fini_msix() to cleanup_msix().
* Change to use XFREE to free vpci->msix.
* In cleanup function, change the sequence of check and remove action according to
  init_msix().

v2->v3 changes:
* Remove unnecessary clean operations in fini_msix().

v1->v2 changes:
new patch.

Best regards,
Jiqian Chen.
---
 xen/drivers/vpci/msix.c | 44 ++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 43 insertions(+), 1 deletion(-)

diff --git a/xen/drivers/vpci/msix.c b/xen/drivers/vpci/msix.c
index eb3e7dcd1068..8ab159969da6 100644
--- a/xen/drivers/vpci/msix.c
+++ b/xen/drivers/vpci/msix.c
@@ -655,6 +655,48 @@ int vpci_make_msix_hole(const struct pci_dev *pdev)
     return 0;
 }
 
+static int cf_check cleanup_msix(const struct pci_dev *pdev)
+{
+    int rc;
+    struct vpci *vpci = pdev->vpci;
+    const unsigned int msix_pos = pdev->msix_pos;
+
+    if ( !msix_pos )
+        return 0;
+
+    rc = vpci_remove_registers(vpci, msix_control_reg(msix_pos), 2);
+    if ( rc )
+    {
+        printk(XENLOG_ERR "%pd %pp: fail to remove MSIX handlers rc=%d\n",
+               pdev->domain, &pdev->sbdf, rc);
+        ASSERT_UNREACHABLE();
+        return rc;
+    }
+
+    if ( vpci->msix )
+    {
+        for ( unsigned int i = 0; i < ARRAY_SIZE(vpci->msix->table); i++ )
+            if ( vpci->msix->table[i] )
+                iounmap(vpci->msix->table[i]);
+
+        list_del(&vpci->msix->next);
+        XFREE(vpci->msix);
+    }
+
+    /*
+     * The driver may not traverse the capability list and think device
+     * supports MSIX by default. So here let the control register of MSIX
+     * be Read-Only is to ensure MSIX disabled.
+     */
+    rc = vpci_add_register(vpci, vpci_hw_read16, NULL,
+                           msix_control_reg(msix_pos), 2, NULL);
+    if ( rc )
+        printk(XENLOG_ERR "%pd %pp: fail to add MSIX ctrl handler rc=%d\n",
+               pdev->domain, &pdev->sbdf, rc);
+
+    return rc;
+}
+
 static int cf_check init_msix(struct pci_dev *pdev)
 {
     struct domain *d = pdev->domain;
@@ -714,7 +756,7 @@ static int cf_check init_msix(struct pci_dev *pdev)
 
     return rc;
 }
-REGISTER_VPCI_CAP(MSIX, init_msix, NULL);
+REGISTER_VPCI_CAP(MSIX, init_msix, cleanup_msix);
 
 /*
  * Local variables:
-- 
2.34.1


Re: [PATCH v8 8/8] vpci/msix: Free MSIX resources when init_msix() fails
Posted by Roger Pau Monné 3 months, 1 week ago
On Thu, Jul 24, 2025 at 01:50:06PM +0800, Jiqian Chen wrote:
> When init_msix() fails, current logic return fail and free MSIX-related
> resources in vpci_deassign_device(). But the previous new changes will
> hide MSIX capability and return success, it can't reach
> vpci_deassign_device() to remove resources if hiding success, so those
> resources must be removed in cleanup function of MSIX.

The text here is a bit convoluted IMO.  It would be clearer as:

When MSI-X initialization fails vPCI will hide the capability, but
remove of handlers and data won't be performed until the device is
deassigned.  Introduce a MSI-X cleanup hook that will be called when
initialization fails to cleanup MSI-X related hooks and free it's
associated data.

As all supported capabilities have been switched to use the cleanup
hooks call those from vpci_deassign_device() instead of open-code the
capability specific cleanup in there.

(see below for the reasoning behind the last paragrpah).

> To do that, implement cleanup function for MSIX.
> 
> Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
> ---
> cc: "Roger Pau Monné" <roger.pau@citrix.com>
> ---
> v7->v8 changes:
> * Given the code in vpci_remove_registers() an error in the removal of
>   registers would likely imply memory corruption, at which point it's
>   best to fully disable the device. So, Rollback the last two modifications of v7.
> 
> v6->v7 changes:
> * Change the pointer parameter of cleanup_msix() to be const.
> * When vpci_remove_registers() in cleanup_msix() fails, not to return
>   directly, instead try to free msix and re-add ctrl handler.
> * Pass pdev->vpci into vpci_add_register() instead of pdev->vpci->msix in
>   init_msix() since we need that every handler realize that msix is NULL
>   when msix is freed but handlers are still in there.
> 
> v5->v6 changes:
> * Change the logic to add dummy handler when !vpci->msix in cleanup_msix().
> 
> v4->v5 changes:
> * Change definition "static void cleanup_msix" to "static int cf_check cleanup_msix"
>   since cleanup hook is changed to be int.
> * Add a read-only register for MSIX Control Register in the end of cleanup_msix().
> 
> v3->v4 changes:
> * Change function name from fini_msix() to cleanup_msix().
> * Change to use XFREE to free vpci->msix.
> * In cleanup function, change the sequence of check and remove action according to
>   init_msix().
> 
> v2->v3 changes:
> * Remove unnecessary clean operations in fini_msix().
> 
> v1->v2 changes:
> new patch.
> 
> Best regards,
> Jiqian Chen.
> ---
>  xen/drivers/vpci/msix.c | 44 ++++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 43 insertions(+), 1 deletion(-)
> 
> diff --git a/xen/drivers/vpci/msix.c b/xen/drivers/vpci/msix.c
> index eb3e7dcd1068..8ab159969da6 100644
> --- a/xen/drivers/vpci/msix.c
> +++ b/xen/drivers/vpci/msix.c
> @@ -655,6 +655,48 @@ int vpci_make_msix_hole(const struct pci_dev *pdev)
>      return 0;
>  }
>  
> +static int cf_check cleanup_msix(const struct pci_dev *pdev)
> +{
> +    int rc;
> +    struct vpci *vpci = pdev->vpci;
> +    const unsigned int msix_pos = pdev->msix_pos;
> +
> +    if ( !msix_pos )
> +        return 0;
> +
> +    rc = vpci_remove_registers(vpci, msix_control_reg(msix_pos), 2);
> +    if ( rc )
> +    {
> +        printk(XENLOG_ERR "%pd %pp: fail to remove MSIX handlers rc=%d\n",
> +               pdev->domain, &pdev->sbdf, rc);
> +        ASSERT_UNREACHABLE();
> +        return rc;
> +    }
> +
> +    if ( vpci->msix )
> +    {
> +        for ( unsigned int i = 0; i < ARRAY_SIZE(vpci->msix->table); i++ )
> +            if ( vpci->msix->table[i] )
> +                iounmap(vpci->msix->table[i]);
> +
> +        list_del(&vpci->msix->next);
> +        XFREE(vpci->msix);
> +    }
> +
> +    /*
> +     * The driver may not traverse the capability list and think device
> +     * supports MSIX by default. So here let the control register of MSIX
> +     * be Read-Only is to ensure MSIX disabled.
> +     */
> +    rc = vpci_add_register(vpci, vpci_hw_read16, NULL,
> +                           msix_control_reg(msix_pos), 2, NULL);
> +    if ( rc )
> +        printk(XENLOG_ERR "%pd %pp: fail to add MSIX ctrl handler rc=%d\n",
> +               pdev->domain, &pdev->sbdf, rc);
> +
> +    return rc;
> +}
> +
>  static int cf_check init_msix(struct pci_dev *pdev)
>  {
>      struct domain *d = pdev->domain;
> @@ -714,7 +756,7 @@ static int cf_check init_msix(struct pci_dev *pdev)
>  
>      return rc;
>  }
> -REGISTER_VPCI_CAP(MSIX, init_msix, NULL);
> +REGISTER_VPCI_CAP(MSIX, init_msix, cleanup_msix);

Don't you need to also call the cleanup hooks in
vpci_deassign_device() and remove the open-coded cleaning of MSI-X
done there?

Otherwise the code here is duplicated with the code in
vpci_deassign_device() for MSI-X cleanup (apart from it being a bit of
a layering violation to open-code the MSI-X cleanup in there).

Thanks, Roger.

Re: [PATCH v8 8/8] vpci/msix: Free MSIX resources when init_msix() fails
Posted by Chen, Jiqian 3 months, 1 week ago
On 2025/7/24 23:59, Roger Pau Monné wrote:
> On Thu, Jul 24, 2025 at 01:50:06PM +0800, Jiqian Chen wrote:
>> When init_msix() fails, current logic return fail and free MSIX-related
>> resources in vpci_deassign_device(). But the previous new changes will
>> hide MSIX capability and return success, it can't reach
>> vpci_deassign_device() to remove resources if hiding success, so those
>> resources must be removed in cleanup function of MSIX.
> 
> The text here is a bit convoluted IMO.  It would be clearer as:
> 
> When MSI-X initialization fails vPCI will hide the capability, but
> remove of handlers and data won't be performed until the device is
> deassigned.  Introduce a MSI-X cleanup hook that will be called when
> initialization fails to cleanup MSI-X related hooks and free it's
> associated data.
> 
> As all supported capabilities have been switched to use the cleanup
> hooks call those from vpci_deassign_device() instead of open-code the
> capability specific cleanup in there.
Thanks, will change.

> 
> (see below for the reasoning behind the last paragrpah).
> 
>> To do that, implement cleanup function for MSIX.
>>
>> Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
>> ---
>> cc: "Roger Pau Monné" <roger.pau@citrix.com>
>> ---
>> v7->v8 changes:
>> * Given the code in vpci_remove_registers() an error in the removal of
>>   registers would likely imply memory corruption, at which point it's
>>   best to fully disable the device. So, Rollback the last two modifications of v7.
>>
>> v6->v7 changes:
>> * Change the pointer parameter of cleanup_msix() to be const.
>> * When vpci_remove_registers() in cleanup_msix() fails, not to return
>>   directly, instead try to free msix and re-add ctrl handler.
>> * Pass pdev->vpci into vpci_add_register() instead of pdev->vpci->msix in
>>   init_msix() since we need that every handler realize that msix is NULL
>>   when msix is freed but handlers are still in there.
>>
>> v5->v6 changes:
>> * Change the logic to add dummy handler when !vpci->msix in cleanup_msix().
>>
>> v4->v5 changes:
>> * Change definition "static void cleanup_msix" to "static int cf_check cleanup_msix"
>>   since cleanup hook is changed to be int.
>> * Add a read-only register for MSIX Control Register in the end of cleanup_msix().
>>
>> v3->v4 changes:
>> * Change function name from fini_msix() to cleanup_msix().
>> * Change to use XFREE to free vpci->msix.
>> * In cleanup function, change the sequence of check and remove action according to
>>   init_msix().
>>
>> v2->v3 changes:
>> * Remove unnecessary clean operations in fini_msix().
>>
>> v1->v2 changes:
>> new patch.
>>
>> Best regards,
>> Jiqian Chen.
>> ---
>>  xen/drivers/vpci/msix.c | 44 ++++++++++++++++++++++++++++++++++++++++-
>>  1 file changed, 43 insertions(+), 1 deletion(-)
>>
>> diff --git a/xen/drivers/vpci/msix.c b/xen/drivers/vpci/msix.c
>> index eb3e7dcd1068..8ab159969da6 100644
>> --- a/xen/drivers/vpci/msix.c
>> +++ b/xen/drivers/vpci/msix.c
>> @@ -655,6 +655,48 @@ int vpci_make_msix_hole(const struct pci_dev *pdev)
>>      return 0;
>>  }
>>  
>> +static int cf_check cleanup_msix(const struct pci_dev *pdev)
>> +{
>> +    int rc;
>> +    struct vpci *vpci = pdev->vpci;
>> +    const unsigned int msix_pos = pdev->msix_pos;
>> +
>> +    if ( !msix_pos )
>> +        return 0;
>> +
>> +    rc = vpci_remove_registers(vpci, msix_control_reg(msix_pos), 2);
>> +    if ( rc )
>> +    {
>> +        printk(XENLOG_ERR "%pd %pp: fail to remove MSIX handlers rc=%d\n",
>> +               pdev->domain, &pdev->sbdf, rc);
>> +        ASSERT_UNREACHABLE();
>> +        return rc;
>> +    }
>> +
>> +    if ( vpci->msix )
>> +    {
>> +        for ( unsigned int i = 0; i < ARRAY_SIZE(vpci->msix->table); i++ )
>> +            if ( vpci->msix->table[i] )
>> +                iounmap(vpci->msix->table[i]);
>> +
>> +        list_del(&vpci->msix->next);
Should I need to move this line above " for ( unsigned int i = 0; i < ARRAY_SIZE(vpci->msix->table); i++ )" ?
Because I noticed that is what it be in vpci_deassign_device.

>> +        XFREE(vpci->msix);
>> +    }
>> +
>> +    /*
>> +     * The driver may not traverse the capability list and think device
>> +     * supports MSIX by default. So here let the control register of MSIX
>> +     * be Read-Only is to ensure MSIX disabled.
>> +     */
>> +    rc = vpci_add_register(vpci, vpci_hw_read16, NULL,
>> +                           msix_control_reg(msix_pos), 2, NULL);
>> +    if ( rc )
>> +        printk(XENLOG_ERR "%pd %pp: fail to add MSIX ctrl handler rc=%d\n",
>> +               pdev->domain, &pdev->sbdf, rc);
>> +
>> +    return rc;
>> +}
>> +
>>  static int cf_check init_msix(struct pci_dev *pdev)
>>  {
>>      struct domain *d = pdev->domain;
>> @@ -714,7 +756,7 @@ static int cf_check init_msix(struct pci_dev *pdev)
>>  
>>      return rc;
>>  }
>> -REGISTER_VPCI_CAP(MSIX, init_msix, NULL);
>> +REGISTER_VPCI_CAP(MSIX, init_msix, cleanup_msix);
> 
> Don't you need to also call the cleanup hooks in
> vpci_deassign_device() and remove the open-coded cleaning of MSI-X
> done there?
Oh, will do.
How do I process the return value of cleanup_msix in vpci_deassign_device?
Just print an error when it fails and continue to do other deassign actions?

> 
> Otherwise the code here is duplicated with the code in
> vpci_deassign_device() for MSI-X cleanup (apart from it being a bit of
> a layering violation to open-code the MSI-X cleanup in there).
> 
> Thanks, Roger.

-- 
Best regards,
Jiqian Chen.
Re: [PATCH v8 8/8] vpci/msix: Free MSIX resources when init_msix() fails
Posted by Roger Pau Monné 3 months, 1 week ago
On Fri, Jul 25, 2025 at 02:50:36AM +0000, Chen, Jiqian wrote:
> On 2025/7/24 23:59, Roger Pau Monné wrote:
> > On Thu, Jul 24, 2025 at 01:50:06PM +0800, Jiqian Chen wrote:
> >> When init_msix() fails, current logic return fail and free MSIX-related
> >> resources in vpci_deassign_device(). But the previous new changes will
> >> hide MSIX capability and return success, it can't reach
> >> vpci_deassign_device() to remove resources if hiding success, so those
> >> resources must be removed in cleanup function of MSIX.
> > 
> > The text here is a bit convoluted IMO.  It would be clearer as:
> > 
> > When MSI-X initialization fails vPCI will hide the capability, but
> > remove of handlers and data won't be performed until the device is
> > deassigned.  Introduce a MSI-X cleanup hook that will be called when
> > initialization fails to cleanup MSI-X related hooks and free it's
> > associated data.
> > 
> > As all supported capabilities have been switched to use the cleanup
> > hooks call those from vpci_deassign_device() instead of open-code the
> > capability specific cleanup in there.
> Thanks, will change.
> 
> > 
> > (see below for the reasoning behind the last paragrpah).
> > 
> >> To do that, implement cleanup function for MSIX.
> >>
> >> Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
> >> ---
> >> cc: "Roger Pau Monné" <roger.pau@citrix.com>
> >> ---
> >> v7->v8 changes:
> >> * Given the code in vpci_remove_registers() an error in the removal of
> >>   registers would likely imply memory corruption, at which point it's
> >>   best to fully disable the device. So, Rollback the last two modifications of v7.
> >>
> >> v6->v7 changes:
> >> * Change the pointer parameter of cleanup_msix() to be const.
> >> * When vpci_remove_registers() in cleanup_msix() fails, not to return
> >>   directly, instead try to free msix and re-add ctrl handler.
> >> * Pass pdev->vpci into vpci_add_register() instead of pdev->vpci->msix in
> >>   init_msix() since we need that every handler realize that msix is NULL
> >>   when msix is freed but handlers are still in there.
> >>
> >> v5->v6 changes:
> >> * Change the logic to add dummy handler when !vpci->msix in cleanup_msix().
> >>
> >> v4->v5 changes:
> >> * Change definition "static void cleanup_msix" to "static int cf_check cleanup_msix"
> >>   since cleanup hook is changed to be int.
> >> * Add a read-only register for MSIX Control Register in the end of cleanup_msix().
> >>
> >> v3->v4 changes:
> >> * Change function name from fini_msix() to cleanup_msix().
> >> * Change to use XFREE to free vpci->msix.
> >> * In cleanup function, change the sequence of check and remove action according to
> >>   init_msix().
> >>
> >> v2->v3 changes:
> >> * Remove unnecessary clean operations in fini_msix().
> >>
> >> v1->v2 changes:
> >> new patch.
> >>
> >> Best regards,
> >> Jiqian Chen.
> >> ---
> >>  xen/drivers/vpci/msix.c | 44 ++++++++++++++++++++++++++++++++++++++++-
> >>  1 file changed, 43 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/xen/drivers/vpci/msix.c b/xen/drivers/vpci/msix.c
> >> index eb3e7dcd1068..8ab159969da6 100644
> >> --- a/xen/drivers/vpci/msix.c
> >> +++ b/xen/drivers/vpci/msix.c
> >> @@ -655,6 +655,48 @@ int vpci_make_msix_hole(const struct pci_dev *pdev)
> >>      return 0;
> >>  }
> >>  
> >> +static int cf_check cleanup_msix(const struct pci_dev *pdev)
> >> +{
> >> +    int rc;
> >> +    struct vpci *vpci = pdev->vpci;
> >> +    const unsigned int msix_pos = pdev->msix_pos;
> >> +
> >> +    if ( !msix_pos )
> >> +        return 0;
> >> +
> >> +    rc = vpci_remove_registers(vpci, msix_control_reg(msix_pos), 2);
> >> +    if ( rc )
> >> +    {
> >> +        printk(XENLOG_ERR "%pd %pp: fail to remove MSIX handlers rc=%d\n",
> >> +               pdev->domain, &pdev->sbdf, rc);
> >> +        ASSERT_UNREACHABLE();
> >> +        return rc;
> >> +    }
> >> +
> >> +    if ( vpci->msix )
> >> +    {
> >> +        for ( unsigned int i = 0; i < ARRAY_SIZE(vpci->msix->table); i++ )
> >> +            if ( vpci->msix->table[i] )
> >> +                iounmap(vpci->msix->table[i]);
> >> +
> >> +        list_del(&vpci->msix->next);
> Should I need to move this line above " for ( unsigned int i = 0; i < ARRAY_SIZE(vpci->msix->table); i++ )" ?
> Because I noticed that is what it be in vpci_deassign_device.

Yes, indeed, that would be preferable.

> >> +        XFREE(vpci->msix);
> >> +    }
> >> +
> >> +    /*
> >> +     * The driver may not traverse the capability list and think device
> >> +     * supports MSIX by default. So here let the control register of MSIX
> >> +     * be Read-Only is to ensure MSIX disabled.
> >> +     */
> >> +    rc = vpci_add_register(vpci, vpci_hw_read16, NULL,
> >> +                           msix_control_reg(msix_pos), 2, NULL);
> >> +    if ( rc )
> >> +        printk(XENLOG_ERR "%pd %pp: fail to add MSIX ctrl handler rc=%d\n",
> >> +               pdev->domain, &pdev->sbdf, rc);
> >> +
> >> +    return rc;
> >> +}
> >> +
> >>  static int cf_check init_msix(struct pci_dev *pdev)
> >>  {
> >>      struct domain *d = pdev->domain;
> >> @@ -714,7 +756,7 @@ static int cf_check init_msix(struct pci_dev *pdev)
> >>  
> >>      return rc;
> >>  }
> >> -REGISTER_VPCI_CAP(MSIX, init_msix, NULL);
> >> +REGISTER_VPCI_CAP(MSIX, init_msix, cleanup_msix);
> > 
> > Don't you need to also call the cleanup hooks in
> > vpci_deassign_device() and remove the open-coded cleaning of MSI-X
> > done there?
> Oh, will do.
> How do I process the return value of cleanup_msix in vpci_deassign_device?
> Just print an error when it fails and continue to do other deassign actions?

Yeah, I don't think there's much else that can be done.  Printing an
error and continuing should be fine.

Thanks, Roger.