[PATCH] pci: fix locking around vPCI removal in pci_remove_device()

Roger Pau Monne posted 1 patch 2 months ago
Patches applied successfully (tree, apply log)
git fetch https://gitlab.com/xen-project/patchew/xen tags/patchew/20240229131525.19099-1-roger.pau@citrix.com
xen/drivers/passthrough/pci.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
[PATCH] pci: fix locking around vPCI removal in pci_remove_device()
Posted by Roger Pau Monne 2 months ago
Currently vpci_deassign_device() is called without holding the per-domain
pci_lock in pci_remove_device(), which leads to:

Assertion 'rw_is_write_locked(&pdev->domain->pci_lock)' failed at ../drivers/vpci/vpci.c:47
[...]
Xen call trace:
   [<ffff82d040260eac>] R vpci_deassign_device+0x10d/0x1b9
   [<ffff82d04027932f>] S pci_remove_device+0x2b1/0x380
   [<ffff82d040260bd0>] F pci_physdev_op+0x197/0x19e
   [<ffff82d04032272d>] F do_physdev_op+0x342/0x12aa
   [<ffff82d0402f067a>] F pv_hypercall+0x58e/0x62b
   [<ffff82d0402012ba>] F lstar_enter+0x13a/0x140

Move the existing block that removes the device from the domain pdev_list ahead
and also issue the call to vpci_deassign_device() there.  It's fine to remove
the device from the domain list of assigned devices, as further functions only
care that the pdev domain field is correctly set to the owner of the device
about to be removed.

Moving the vpci_deassign_device() past the pci_cleanup_msi() call can be
dangerous, as doing the MSI cleanup ahead of having removed the vPCI handlers
could lead to stale data in vPCI MSI(-X) internal structures.

Fixes: 4f78438b45e2 ('vpci: use per-domain PCI lock to protect vpci structure')
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
 xen/drivers/passthrough/pci.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/xen/drivers/passthrough/pci.c b/xen/drivers/passthrough/pci.c
index 4c0a836486ec..194701c9137d 100644
--- a/xen/drivers/passthrough/pci.c
+++ b/xen/drivers/passthrough/pci.c
@@ -817,15 +817,15 @@ int pci_remove_device(u16 seg, u8 bus, u8 devfn)
     list_for_each_entry ( pdev, &pseg->alldevs_list, alldevs_list )
         if ( pdev->bus == bus && pdev->devfn == devfn )
         {
-            vpci_deassign_device(pdev);
-            pci_cleanup_msi(pdev);
-            ret = iommu_remove_device(pdev);
             if ( pdev->domain )
             {
                 write_lock(&pdev->domain->pci_lock);
+                vpci_deassign_device(pdev);
                 list_del(&pdev->domain_list);
                 write_unlock(&pdev->domain->pci_lock);
             }
+            pci_cleanup_msi(pdev);
+            ret = iommu_remove_device(pdev);
             printk(XENLOG_DEBUG "PCI remove device %pp\n", &pdev->sbdf);
             free_pdev(pseg, pdev);
             break;
-- 
2.44.0


Re: [PATCH] pci: fix locking around vPCI removal in pci_remove_device()
Posted by Jan Beulich 2 months ago
On 29.02.2024 14:15, Roger Pau Monne wrote:
> Currently vpci_deassign_device() is called without holding the per-domain
> pci_lock in pci_remove_device(), which leads to:
> 
> Assertion 'rw_is_write_locked(&pdev->domain->pci_lock)' failed at ../drivers/vpci/vpci.c:47
> [...]
> Xen call trace:
>    [<ffff82d040260eac>] R vpci_deassign_device+0x10d/0x1b9
>    [<ffff82d04027932f>] S pci_remove_device+0x2b1/0x380
>    [<ffff82d040260bd0>] F pci_physdev_op+0x197/0x19e
>    [<ffff82d04032272d>] F do_physdev_op+0x342/0x12aa
>    [<ffff82d0402f067a>] F pv_hypercall+0x58e/0x62b
>    [<ffff82d0402012ba>] F lstar_enter+0x13a/0x140
> 
> Move the existing block that removes the device from the domain pdev_list ahead
> and also issue the call to vpci_deassign_device() there.  It's fine to remove
> the device from the domain list of assigned devices, as further functions only
> care that the pdev domain field is correctly set to the owner of the device
> about to be removed.
> 
> Moving the vpci_deassign_device() past the pci_cleanup_msi() call can be
> dangerous, as doing the MSI cleanup ahead of having removed the vPCI handlers
> could lead to stale data in vPCI MSI(-X) internal structures.
> 
> Fixes: 4f78438b45e2 ('vpci: use per-domain PCI lock to protect vpci structure')
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com>

> --- a/xen/drivers/passthrough/pci.c
> +++ b/xen/drivers/passthrough/pci.c
> @@ -817,15 +817,15 @@ int pci_remove_device(u16 seg, u8 bus, u8 devfn)
>      list_for_each_entry ( pdev, &pseg->alldevs_list, alldevs_list )
>          if ( pdev->bus == bus && pdev->devfn == devfn )
>          {
> -            vpci_deassign_device(pdev);
> -            pci_cleanup_msi(pdev);
> -            ret = iommu_remove_device(pdev);
>              if ( pdev->domain )
>              {
>                  write_lock(&pdev->domain->pci_lock);
> +                vpci_deassign_device(pdev);
>                  list_del(&pdev->domain_list);
>                  write_unlock(&pdev->domain->pci_lock);

This then also is properly symmetric with what pci_add_device() has.

Jan

>              }
> +            pci_cleanup_msi(pdev);
> +            ret = iommu_remove_device(pdev);
>              printk(XENLOG_DEBUG "PCI remove device %pp\n", &pdev->sbdf);
>              free_pdev(pseg, pdev);
>              break;


Re: [PATCH] pci: fix locking around vPCI removal in pci_remove_device()
Posted by Stewart Hildebrand 2 months ago
On 2/29/24 08:23, Jan Beulich wrote:
> On 29.02.2024 14:15, Roger Pau Monne wrote:
>> Currently vpci_deassign_device() is called without holding the per-domain
>> pci_lock in pci_remove_device(), which leads to:
>>
>> Assertion 'rw_is_write_locked(&pdev->domain->pci_lock)' failed at ../drivers/vpci/vpci.c:47
>> [...]
>> Xen call trace:
>>    [<ffff82d040260eac>] R vpci_deassign_device+0x10d/0x1b9
>>    [<ffff82d04027932f>] S pci_remove_device+0x2b1/0x380
>>    [<ffff82d040260bd0>] F pci_physdev_op+0x197/0x19e
>>    [<ffff82d04032272d>] F do_physdev_op+0x342/0x12aa
>>    [<ffff82d0402f067a>] F pv_hypercall+0x58e/0x62b
>>    [<ffff82d0402012ba>] F lstar_enter+0x13a/0x140
>>
>> Move the existing block that removes the device from the domain pdev_list ahead
>> and also issue the call to vpci_deassign_device() there.  It's fine to remove
>> the device from the domain list of assigned devices, as further functions only
>> care that the pdev domain field is correctly set to the owner of the device
>> about to be removed.
>>
>> Moving the vpci_deassign_device() past the pci_cleanup_msi() call can be
>> dangerous, as doing the MSI cleanup ahead of having removed the vPCI handlers
>> could lead to stale data in vPCI MSI(-X) internal structures.
>>
>> Fixes: 4f78438b45e2 ('vpci: use per-domain PCI lock to protect vpci structure')
>> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> 
> Reviewed-by: Jan Beulich <jbeulich@suse.com>

Reviewed-by: Stewart Hildebrand <stewart.hildebrand@amd.com>

Thanks