[PATCH v2 2/2] PCI/IOV: Fix race between SR-IOV enable/disable and hotplug

Niklas Schnelle posted 2 patches 2 months, 3 weeks ago
There is a newer version of this series
[PATCH v2 2/2] PCI/IOV: Fix race between SR-IOV enable/disable and hotplug
Posted by Niklas Schnelle 2 months, 3 weeks ago
Commit 05703271c3cd ("PCI/IOV: Add PCI rescan-remove locking when
enabling/disabling SR-IOV") tried to fix a race between the VF removal
inside sriov_del_vfs() and concurrent hot unplug by taking the PCI
rescan/remove lock in sriov_del_vfs(). Similarly the PCI rescan/remove
lock was also taken in sriov_add_vfs() to protect addition of VFs.

This approach however causes deadlock on trying to remove PFs with
SR-IOV enabled because PFs disable SR-IOV during removal and this
removal happens under the PCI rescan/remove lock. So the original fix
had to be reverted.

Instead of taking the PCI rescan/remove lock in sriov_add_vfs() and
sriov_del_vfs(), fix the race that occurs with SR-IOV enable and disable
vs hotplug higher up in the callchain by taking the lock in
sriov_numvfs_store() before calling into the driver's sriov_configure()
callback.

Cc: stable@vger.kernel.org
Fixes: 05703271c3cd ("PCI/IOV: Add PCI rescan-remove locking when enabling/disabling SR-IOV")
Reported-by: Benjamin Block <bblock@linux.ibm.com>
Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
---
 drivers/pci/iov.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
index ac4375954c9479b5f4a0e666b5215094fdaeefc2..c6dc1b44bf602a0b1785b684f768fcd563f5b2aa 100644
--- a/drivers/pci/iov.c
+++ b/drivers/pci/iov.c
@@ -495,7 +495,9 @@ static ssize_t sriov_numvfs_store(struct device *dev,
 
 	if (num_vfs == 0) {
 		/* disable VFs */
+		pci_lock_rescan_remove();
 		ret = pdev->driver->sriov_configure(pdev, 0);
+		pci_unlock_rescan_remove();
 		goto exit;
 	}
 
@@ -507,7 +509,9 @@ static ssize_t sriov_numvfs_store(struct device *dev,
 		goto exit;
 	}
 
+	pci_lock_rescan_remove();
 	ret = pdev->driver->sriov_configure(pdev, num_vfs);
+	pci_unlock_rescan_remove();
 	if (ret < 0)
 		goto exit;
 

-- 
2.48.1
Re: [PATCH v2 2/2] PCI/IOV: Fix race between SR-IOV enable/disable and hotplug
Posted by Gerd Bayer 2 months ago
On Wed, 2025-11-19 at 13:34 +0100, Niklas Schnelle wrote:
> Commit 05703271c3cd ("PCI/IOV: Add PCI rescan-remove locking when
> enabling/disabling SR-IOV") tried to fix a race between the VF removal
> inside sriov_del_vfs() and concurrent hot unplug by taking the PCI
> rescan/remove lock in sriov_del_vfs(). Similarly the PCI rescan/remove
> lock was also taken in sriov_add_vfs() to protect addition of VFs.
> 
> This approach however causes deadlock on trying to remove PFs with
> SR-IOV enabled because PFs disable SR-IOV during removal and this
> removal happens under the PCI rescan/remove lock. So the original fix
> had to be reverted.
> 
> Instead of taking the PCI rescan/remove lock in sriov_add_vfs() and
> sriov_del_vfs(), fix the race that occurs with SR-IOV enable and disable
> vs hotplug higher up in the callchain by taking the lock in
> sriov_numvfs_store() before calling into the driver's sriov_configure()
> callback.

I agree, adding the lock to sriov_numvfs_store() is adding the missing
serialization against other threads that create or remove PCI devices
without getting into the middle of implicit PCI VF device removal in
the course of a PF removal that grabbed pci_lock_rescan_remove()
already.

> 
> Cc: stable@vger.kernel.org
> Fixes: 05703271c3cd ("PCI/IOV: Add PCI rescan-remove locking when enabling/disabling SR-IOV")
> Reported-by: Benjamin Block <bblock@linux.ibm.com>
> Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
> Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
> ---
>  drivers/pci/iov.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
> index ac4375954c9479b5f4a0e666b5215094fdaeefc2..c6dc1b44bf602a0b1785b684f768fcd563f5b2aa 100644
> --- a/drivers/pci/iov.c
> +++ b/drivers/pci/iov.c
> @@ -495,7 +495,9 @@ static ssize_t sriov_numvfs_store(struct device *dev,
>  
>  	if (num_vfs == 0) {
>  		/* disable VFs */
> +		pci_lock_rescan_remove();
>  		ret = pdev->driver->sriov_configure(pdev, 0);
> +		pci_unlock_rescan_remove();
>  		goto exit;
>  	}
>  
> @@ -507,7 +509,9 @@ static ssize_t sriov_numvfs_store(struct device *dev,
>  		goto exit;
>  	}
>  
> +	pci_lock_rescan_remove();
>  	ret = pdev->driver->sriov_configure(pdev, num_vfs);
> +	pci_unlock_rescan_remove();
>  	if (ret < 0)
>  		goto exit;
>  

This LGTM, feel free to add
Reviewed-by: Gerd Bayer <gbayer@linux.ibm.com>

Thank you!