[PATCH 2/2] PCI/IOV: Fix race between SR-IOV enable/disable and hotplug

Niklas Schnelle posted 2 patches 3 months, 1 week ago
There is a newer version of this series
[PATCH 2/2] PCI/IOV: Fix race between SR-IOV enable/disable and hotplug
Posted by Niklas Schnelle 3 months, 1 week ago
Commit 05703271c3cd ("PCI/IOV: Add PCI rescan-remove locking when
enabling/disabling SR-IOV") tried to fix a race between the VF removal
inside sriov_del_vfs() and concurrent hot unplug by taking the PCI
rescan/remove lock in sriov_del_vfs(). Similarly the PCI rescan/remove
lock was also taken in sriov_add_vfs() to protect addition of VFs.

This approach however causes deadlock on trying to remove PFs with
SR-IOV enabled because PFs disable SR-IOV during removal and this
removal happens under the PCI rescan/remove lock. So the original fix
had to be reverted.

Instead of taking the PCI rescan/remove lock in sriov_add_vfs() and
sriov_del_vfs(), fix the race that occurs with SR-IOV enable and disable
vs hotplug higher up in the callchain by taking the lock in
sriov_numvfs_store() before calling into the driver's sriov_configure()
callback.

Cc: stable@vger.kernel.org
Fixes: 05703271c3cd ("PCI/IOV: Add PCI rescan-remove locking when enabling/disabling SR-IOV")
Reported-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
---
 drivers/pci/iov.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
index ac4375954c9479b5f4a0e666b5215094fdaeefc2..c6dc1b44bf602a0b1785b684f768fcd563f5b2aa 100644
--- a/drivers/pci/iov.c
+++ b/drivers/pci/iov.c
@@ -495,7 +495,9 @@ static ssize_t sriov_numvfs_store(struct device *dev,
 
 	if (num_vfs == 0) {
 		/* disable VFs */
+		pci_lock_rescan_remove();
 		ret = pdev->driver->sriov_configure(pdev, 0);
+		pci_unlock_rescan_remove();
 		goto exit;
 	}
 
@@ -507,7 +509,9 @@ static ssize_t sriov_numvfs_store(struct device *dev,
 		goto exit;
 	}
 
+	pci_lock_rescan_remove();
 	ret = pdev->driver->sriov_configure(pdev, num_vfs);
+	pci_unlock_rescan_remove();
 	if (ret < 0)
 		goto exit;
 

-- 
2.48.1
Re: [PATCH 2/2] PCI/IOV: Fix race between SR-IOV enable/disable and hotplug
Posted by Benjamin Block 3 months ago
Hey Niklas,

On Thu, Oct 30, 2025 at 11:26:02AM +0100, Niklas Schnelle wrote:
> Commit 05703271c3cd ("PCI/IOV: Add PCI rescan-remove locking when
> enabling/disabling SR-IOV") tried to fix a race between the VF removal
> inside sriov_del_vfs() and concurrent hot unplug by taking the PCI
> rescan/remove lock in sriov_del_vfs(). Similarly the PCI rescan/remove
> lock was also taken in sriov_add_vfs() to protect addition of VFs.
> 
> This approach however causes deadlock on trying to remove PFs with
> SR-IOV enabled because PFs disable SR-IOV during removal and this
> removal happens under the PCI rescan/remove lock. So the original fix
> had to be reverted.
> 
> Instead of taking the PCI rescan/remove lock in sriov_add_vfs() and
> sriov_del_vfs(), fix the race that occurs with SR-IOV enable and disable
> vs hotplug higher up in the callchain by taking the lock in
> sriov_numvfs_store() before calling into the driver's sriov_configure()
> callback.
> 
> Cc: stable@vger.kernel.org
> Fixes: 05703271c3cd ("PCI/IOV: Add PCI rescan-remove locking when enabling/disabling SR-IOV")
> Reported-by: Benjamin Block <bblock@linux.ibm.com>
> Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
> ---
>  drivers/pci/iov.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 

It fixes the original bug (kernel panic) AFAICT, insofar:

Reviewed-by: Benjamin Block <bblock@linux.ibm.com>


But tbh., I find it a bit "scary" to take one global lock at such a high
level, when it is also used at much lower levels throughout the stack.

IIRC my original (internal) report boiled down to that pci_destroy_dev() is
called essentially unlocked, and without taking any of the attached reference
counts into account.. which seems fishy IMHO. The actual freeing of the memory
is hidden behind a reference, but stuff like device_del() is just.. done..
and that seems strange to me.

Having a bit more granularity here might not hurt; but I can't say I have a
good idea on top of my mind in how to do that, so... rather fix the panic.

-- 
Best Regards, Benjamin Block        /        Linux on IBM Z Kernel Development
IBM Deutschland Research & Development GmbH    /   https://www.ibm.com/privacy
Vors. Aufs.-R.: Wolfgang Wendt         /        Geschäftsführung: David Faller
Sitz der Ges.: Böblingen     /    Registergericht: AmtsG Stuttgart, HRB 243294