[PATCH] PCI: Prevent workqueue code nesting in pci_call_probe()

Waiman Long posted 1 patch 1 day, 23 hours ago
drivers/pci/pci-driver.c | 13 +++++++++++--
1 file changed, 11 insertions(+), 2 deletions(-)
Posted by Waiman Long 1 day, 23 hours ago
pci_call_probe() can be called recursively. If the recursive calls are
made indirectly via a workqueue kworker, lockdep can report a recursive
locking warning.

Older commits have tried to prevent that. One example is
commit 12c3156f10c5 ("PCI: Avoid unnecessary CPU switch when calling
driver .probe() method"), which prevents work_on_cpu() recursion when
the current device is a virtual function and the physical device has
been probed. However, there are still other cases where workqueue code
nesting is possible, leading to a lockdep recursive locking warning such
as the following stack trace from a 4-socket Skylake server.

  <TASK>
    :
  work_on_cpu_key()
  pci_call_probe()
  pci_device_probe()
  really_probe()
  __driver_probe_device()
  driver_probe_device()
  __device_attach_driver()
  bus_for_each_drv()
  __device_attach()
  pci_bus_add_device()
  pci_bus_add_devices()
  vmd_enable_domain()
  vmd_probe()
  local_pci_probe()
  work_for_cpu_fn()
  process_one_work()
  worker_thread()
    :
  </TASK>

Fix that by adding a new wq_kworker() helper to check if the current
task is likely a workqueue kworker. If so, call local_pci_probe()
directly instead of calling into workqueue code recursively.

Signed-off-by: Waiman Long <longman@redhat.com>
---
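Note: the decision made in pci_call_probe() after this patch can be
sketched in plain userspace C as below. This is only a model under
stated assumptions, not kernel code: struct task_model and every
model_* name are hypothetical, and the two flag constants are assumed
to mirror PF_KTHREAD and PF_WQ_WORKER from include/linux/sched.h. The
flag-based variant is shown purely for comparison as one possible
alternative to the name match; it is not what this patch does.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for the two task_struct fields the helper reads. */
struct task_model {
	unsigned int flags;
	char comm[16];
};

/* Assumed to mirror the kernel's PF_KTHREAD and PF_WQ_WORKER values. */
#define MODEL_PF_KTHREAD   0x00200000u
#define MODEL_PF_WQ_WORKER 0x00000020u

/* Same test as wq_kworker() in the patch: kernel thread + name match. */
static bool model_wq_kworker_by_name(const struct task_model *t)
{
	return (t->flags & MODEL_PF_KTHREAD) &&
	       strstr(t->comm, "kworker") != NULL;
}

/* Possible alternative (an assumption, not in the patch): test the flag. */
static bool model_wq_kworker_by_flag(const struct task_model *t)
{
	return (t->flags & MODEL_PF_WQ_WORKER) != 0;
}

/*
 * Models the condition in pci_call_probe(): true means "probe on the
 * current CPU", false means "hand the probe to work_on_cpu()".
 */
static bool model_probe_locally(int node, int max_nodes, bool node_is_online,
				bool physfn_probed, bool in_kworker)
{
	return node < 0 || node >= max_nodes || !node_is_online ||
	       physfn_probed || in_kworker;
}
```

In this model, a probe that is itself running inside work_for_cpu_fn()
(as in the vmd_probe() trace above) has in_kworker set, so the nested
probe runs locally instead of queueing a second work item.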
 drivers/pci/pci-driver.c | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
index e3f59001785a..3c63098f6fde 100644
--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -370,6 +370,14 @@ static bool pci_physfn_is_probed(struct pci_dev *dev)
 #endif
 }
 
+/*
+ * Return true if the current task is likely a workqueue kworker
+ */
+static bool wq_kworker(void)
+{
+	return (current->flags & PF_KTHREAD) && strstr(current->comm, "kworker");
+}
+
 static int pci_call_probe(struct pci_driver *drv, struct pci_dev *dev,
 			  const struct pci_device_id *id)
 {
@@ -387,10 +395,11 @@ static int pci_call_probe(struct pci_driver *drv, struct pci_dev *dev,
 	cpu_hotplug_disable();
 	/*
 	 * Prevent nesting work_on_cpu() for the case where a Virtual Function
-	 * device is probed from work_on_cpu() of the Physical device.
+	 * device is probed from work_on_cpu() of the Physical device or when
+	 * the current task is a workqueue kworker.
 	 */
 	if (node < 0 || node >= MAX_NUMNODES || !node_online(node) ||
-	    pci_physfn_is_probed(dev)) {
+	    pci_physfn_is_probed(dev) || wq_kworker()) {
 		error = local_pci_probe(&ddi);
 	} else {
 		struct pci_probe_arg arg = { .ddi = &ddi };
-- 
2.54.0