We recently found a couple of deadlocks in the s390 architecture PCI
implementation that are triggered by hotplug events on our platform. Niklas
already mentioned them in his recent comments on the discussions about
`pci_rescan_remove_lock`, here
https://lore.kernel.org/linux-pci/286d0488aa72b1741f93f900fd5db5c4334a6f50.camel@linux.ibm.com/
and here
https://lore.kernel.org/linux-pci/2b6a844619892ecaa11031705808667e0886d8b2.camel@linux.ibm.com/

These deadlocks have not been observed before because on s390 it was unusual
to have both a PF and its attached VFs in the same Linux instance. PCI devices
have largely been available either as a PF without SR-IOV, or as a VF without
the PF being visible in the same instance. This left us with some blind spots
w.r.t. the locking issues here.
This is now changing, and with that we started running into these
deadlocks.

Please note: this patchset strictly depends on Ionut Nechita's patch that makes
`pci_lock_rescan_remove()` reentrant:
https://lore.kernel.org/linux-pci/20260306082108.17322-2-ionut.nechita@windriver.com/
Since the discussion so far sounded positive towards that change, I
decided to base some of the changes in this patchset on the
assumption that his patch gets merged before mine. Without it,
this patchset introduces recursive deadlocks.
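
To illustrate why the reentrancy matters, here is a minimal userspace sketch.
It is not the kernel code and not Ionut's implementation; it only assumes what
is stated above, namely that a path already holding the lock may call into
another path that takes the same lock again. All `demo_*` names are
hypothetical, and a POSIX recursive mutex stands in for the reentrant
`pci_lock_rescan_remove()`.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700	/* for PTHREAD_MUTEX_RECURSIVE */
#endif
#include <pthread.h>

/* Hypothetical stand-in for the rescan/remove lock (userspace only). */
static pthread_mutex_t demo_lock;
static int demo_depth;		/* current nesting depth of the holder */
static int demo_max_depth;	/* deepest nesting observed so far */

/* Initialise the lock as recursive, so the owner may take it again. */
int demo_lock_init(void)
{
	pthread_mutexattr_t attr;
	int ret;

	ret = pthread_mutexattr_init(&attr);
	if (ret)
		return ret;
	ret = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);
	if (!ret)
		ret = pthread_mutex_init(&demo_lock, &attr);
	pthread_mutexattr_destroy(&attr);
	return ret;
}

static void demo_lock_acquire(void)
{
	pthread_mutex_lock(&demo_lock);
	if (++demo_depth > demo_max_depth)
		demo_max_depth = demo_depth;
}

static void demo_lock_release(void)
{
	demo_depth--;
	pthread_mutex_unlock(&demo_lock);
}

/* Inner path: takes the lock itself, as it can also be called directly. */
void demo_remove_device(void)
{
	demo_lock_acquire();
	/* ... tear down a single device ... */
	demo_lock_release();
}

/* Outer path: already holds the lock when it calls the inner path. */
int demo_release_bus(void)
{
	demo_lock_acquire();
	demo_remove_device();	/* would self-deadlock on a plain mutex */
	demo_lock_release();
	return demo_max_depth;
}
```

With a non-recursive mutex the nested acquisition in `demo_release_bus()`
would block forever, which is the kind of recursive deadlock referred to
above.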

Patch 01 helps us insofar as it enables us to use lockdep annotations in the
architecture code.
Patch 02 describes in detail which deadlocks exist and how they are fixed.
Patches 03 and 04 make it possible to use lock guards for
`pci_rescan_remove_lock` and make use of them in the s390 architecture PCI
implementation.
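
For readers unfamiliar with lock guards: the idea is that the lock is released
automatically when the guard variable goes out of scope, so early returns
cannot leak it. The sketch below shows that pattern in userspace C using the
GCC/Clang `cleanup` variable attribute, which is also what the kernel's
scope-based helpers in `<linux/cleanup.h>` build on. It is only an
illustration: the `DEMO_GUARD` macro and all `demo_*` names are hypothetical
and are not the interface added by patch 03.

```c
#include <pthread.h>

/* Hypothetical userspace stand-in for pci_rescan_remove_lock. */
static pthread_mutex_t demo_rescan_remove_lock = PTHREAD_MUTEX_INITIALIZER;

/* Called automatically when the guard variable leaves its scope. */
static inline void demo_guard_release(pthread_mutex_t **lockp)
{
	pthread_mutex_unlock(*lockp);
}

/* Take the lock now, drop it at the end of the enclosing scope. */
#define DEMO_GUARD(lock)						\
	pthread_mutex_t *demo_guard_					\
		__attribute__((cleanup(demo_guard_release), unused)) =	\
		((void)pthread_mutex_lock(lock), (lock))

/* Every return path below releases the lock without an explicit unlock. */
int demo_configure(int fail_early)
{
	DEMO_GUARD(&demo_rescan_remove_lock);

	if (fail_early)
		return -1;	/* lock dropped here */

	/* ... do the actual work under the lock ... */
	return 0;		/* and here */
}

/* Helper for checking the sketch: returns 1 if the lock is not held. */
int demo_lock_is_free(void)
{
	if (pthread_mutex_trylock(&demo_rescan_remove_lock) != 0)
		return 0;
	pthread_mutex_unlock(&demo_rescan_remove_lock);
	return 1;
}
```

The benefit in error-heavy paths is that no `goto out_unlock` style label is
needed; the compiler inserts the unlock on every exit from the scope.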
I've run a /lot/ of tests with affected PCI adapters:
* enable/disable SR-IOV on the PF;
* run FLR reset on PF and VF;
* run Bus reset on PF and VF;
* run s390's recover SysFS attribute on PF and VF;
* disable/enable power with the hotplug SysFS attribute on PF and VF;
* run `zpcictl` with `--reset`/`--reset-fw` on PF and VF;
* run Configure Off and Configure On on both the PF and VF from a Service
Element.

With this patchset applied I have not seen any more deadlocks, nor any other
lockdep warnings.

Benjamin Block (4):
  PCI: Move declaration of pci_rescan_remove_lock into public pci.h
  s390/pci: Fix circular/recursive deadlocks in PCI-bus and -device
    release
  PCI: Provide lock guard for pci_rescan_remove_lock
  s390/pci: Use lock guard for pci_rescan_remove_lock

 arch/s390/pci/pci.c       | 11 ++++++++---
 arch/s390/pci/pci_bus.c   | 15 ++++++++-------
 arch/s390/pci/pci_event.c | 28 +++++++++++++++++++---------
 arch/s390/pci/pci_iov.c   |  3 +--
 arch/s390/pci/pci_sysfs.c |  9 +++------
 drivers/pci/pci.h         |  2 --
 drivers/pci/probe.c       |  1 +
 include/linux/pci.h       |  5 +++++
 8 files changed, 45 insertions(+), 29 deletions(-)

base-commit: 5ee8dbf54602dc340d6235b1d6aa17c0f283f48c
--
2.53.0