Hi,
This Linux kernel patch series introduces support for error recovery for
passthrough PCI devices on System Z (s390x).
Background
----------
For PCI devices on s390x an operating system receives platform specific
error events from firmware rather than through AER. Today, for
passthrough/userspace devices, we don't attempt any error recovery and
ignore any error events for the devices. The passthrough/userspace devices
are managed by the vfio-pci driver. The driver does register error handling
callbacks (error_detected), and on an error triggers an eventfd to
userspace. But we need a mechanism to notify userspace
(QEMU/guest/userspace drivers) about the error event.
Proposal
--------
We can expose this error information (currently only the PCI Error Code)
via a device feature. Userspace can then obtain the error information
via VFIO_DEVICE_FEATURE ioctl and take appropriate actions such as driving
a device reset.
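As a rough illustration of the userspace side, the sketch below builds a
VFIO_DEVICE_FEATURE GET request and issues it. struct vfio_device_feature
and VFIO_DEVICE_FEATURE_GET come from the existing uAPI header. The feature
index ZPCI_ERR_FEATURE_EXAMPLE and the payload layout struct
zpci_err_example are placeholders made up for this example; the real
definitions are the ones added in patch 7 and may look different.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/vfio.h>

/* Placeholder feature index (assumption); the real value comes from patch 7. */
#define ZPCI_ERR_FEATURE_EXAMPLE 0x7fffU

/* Placeholder payload (assumption); the real struct may differ. */
struct zpci_err_example {
	uint16_t pec;		/* PCI Error Code */
	uint16_t reserved[3];
};

/* Allocate a zeroed GET request with room for the payload. Caller frees. */
static struct vfio_device_feature *zpci_err_request_alloc(void)
{
	size_t argsz = sizeof(struct vfio_device_feature) +
		       sizeof(struct zpci_err_example);
	struct vfio_device_feature *feat = calloc(1, argsz);

	if (!feat)
		return NULL;
	feat->argsz = (uint32_t)argsz;
	feat->flags = VFIO_DEVICE_FEATURE_GET | ZPCI_ERR_FEATURE_EXAMPLE;
	return feat;
}

/* Query the device. Returns 0 and fills *pec on success, -errno on failure. */
static int zpci_err_query(int device_fd, uint16_t *pec)
{
	struct vfio_device_feature *feat;
	struct zpci_err_example err;
	int ret;

	assert(pec != NULL);
	feat = zpci_err_request_alloc();
	if (!feat)
		return -ENOMEM;

	ret = ioctl(device_fd, VFIO_DEVICE_FEATURE, feat) ? -errno : 0;
	if (ret == 0) {
		/* The payload follows the header in the same buffer. */
		memcpy(&err, feat->data, sizeof(err));
		*pec = err.pec;
	}
	free(feat);
	return ret;
}
```

A VMM would call something like zpci_err_query() after being woken by the
error eventfd, and use the returned code to decide what to inject into the
guest.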
This is how a typical flow for passthrough devices to a VM would work:
For passthrough devices to a VM, the driver bound to the device on the host
is vfio-pci. The vfio-pci driver supports the error_detected() callback
(vfio_pci_core_aer_err_detected()), and on a PCI error the s390x recovery
code on the host will call the vfio-pci error_detected() callback. The
vfio-pci error_detected() callback will notify userspace/QEMU via an
eventfd, and return PCI_ERS_RESULT_CAN_RECOVER. At this point the s390x
error recovery on the host will skip any further action (see patch 6) and
let userspace drive the error recovery.
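For reference, the eventfd notification mentioned above is wired up from
userspace through the existing VFIO_DEVICE_SET_IRQS ioctl on
VFIO_PCI_ERR_IRQ_INDEX. The following is a minimal sketch of that
registration and of waiting for the signal; the helper names
err_irq_attach() and err_event_wait() are invented for this example and
are not part of the series.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/eventfd.h>
#include <sys/ioctl.h>
#include <linux/vfio.h>

/* Attach event_fd as the trigger for the error IRQ index. Returns 0 or -errno. */
static int err_irq_attach(int device_fd, int event_fd)
{
	size_t argsz = sizeof(struct vfio_irq_set) + sizeof(int32_t);
	struct vfio_irq_set *set;
	int32_t fd = event_fd;
	int ret;

	assert(event_fd >= 0);
	set = calloc(1, argsz);
	if (!set)
		return -ENOMEM;

	set->argsz = (uint32_t)argsz;
	set->flags = VFIO_IRQ_SET_DATA_EVENTFD | VFIO_IRQ_SET_ACTION_TRIGGER;
	set->index = VFIO_PCI_ERR_IRQ_INDEX;
	set->start = 0;
	set->count = 1;
	/* The eventfd is passed as a 32-bit fd in the trailing data area. */
	memcpy(set->data, &fd, sizeof(fd));

	ret = ioctl(device_fd, VFIO_DEVICE_SET_IRQS, set) ? -errno : 0;
	free(set);
	return ret;
}

/*
 * Block until the eventfd fires. Returns the number of signals collected
 * since the last read, or 0 if the read failed.
 */
static uint64_t err_event_wait(int event_fd)
{
	uint64_t count = 0;

	if (read(event_fd, &count, sizeof(count)) != (ssize_t)sizeof(count))
		return 0;
	return count;
}
```

A real VMM would normally add the eventfd to its main loop instead of
blocking in read(); the blocking variant just keeps the example short.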
Once userspace/QEMU is notified, it then injects this error into the VM
so device drivers in the VM can take recovery actions. For example, for a
passthrough NVMe device, the VM's OS NVMe driver will access the device.
At this point the VM's NVMe driver's error_detected() will drive the
recovery by returning PCI_ERS_RESULT_NEED_RESET, and the s390x error
recovery in the VM's OS will try to do a reset. Resets are privileged
operations and so the VM will need intervention from QEMU to perform the
reset. QEMU will invoke the VFIO_DEVICE_RESET ioctl to now notify the
host that the VM is requesting a reset of the device. The vfio-pci driver
on the host will then perform the reset on the device to recover it.
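The last step of this flow, seen from userspace, is a plain
VFIO_DEVICE_RESET call on the device fd. A minimal sketch, assuming the
VMM already holds an open vfio device fd; the wrapper name
guest_requested_reset() is made up for illustration:

```c
#include <assert.h>
#include <errno.h>
#include <sys/ioctl.h>
#include <linux/vfio.h>

/*
 * Ask the host vfio-pci driver to reset the device on behalf of the guest.
 * VFIO_DEVICE_RESET takes no argument. Returns 0 or -errno.
 */
static int guest_requested_reset(int device_fd)
{
	int ret = ioctl(device_fd, VFIO_DEVICE_RESET) ? -errno : 0;

	assert(ret <= 0);	/* either success or a negative errno */
	return ret;
}
```

On failure the VMM would have to report the reset as unsuccessful to the
guest, so that the guest's recovery code can give up on the device.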
Thanks
Farhan
ChangeLog
---------
v5 series https://lore.kernel.org/all/20251113183502.2388-1-alifm@linux.ibm.com/
v5 -> v6
- Rebase on 6.18 + Lukas's PCI: Universal error recoverability of
devices series (https://lore.kernel.org/all/cover.1763483367.git.lukas@wunner.de/)
- Re-work config space accessibility check to pci_dev_save_and_disable() (patch 3).
This avoids saving the config space, in the reset path, if the device's config space is
corrupted or inaccessible.
v4 series https://lore.kernel.org/all/20250924171628.826-1-alifm@linux.ibm.com/
v4 -> v5
- Rebase on 6.18-rc5
- Move bug fixes to the beginning of the series (patches 1 and 2). These patches
were posted as a separate fixes series
https://lore.kernel.org/all/a14936ac-47d6-461b-816f-0fd66f869b0f@linux.ibm.com/
- Add matching pci_put_dev() for pci_get_slot() (patch 6).
v3 series https://lore.kernel.org/all/20250911183307.1910-1-alifm@linux.ibm.com/
v3 -> v4
- Remove warn messages for each PCI capability not restored (patch 1)
- Check PCI_COMMAND and PCI_STATUS register for error value instead of device id
(patch 1)
- Fix kernel crash in patch 3
- Add Reviewed-by tags
- Address comments from Niklas (patches 4, 5, 7)
- Fix compilation error on non-s390x systems (patch 8)
- Explicitly align struct vfio_device_feature_zpci_err (patch 8)
v2 series https://lore.kernel.org/all/20250825171226.1602-1-alifm@linux.ibm.com/
v2 -> v3
- Patch 1 avoids saving any config space state if the device is in error
(suggested by Alex)
- Patch 2 adds an additional check, only for FLR reset, to try other function
reset methods (suggested by Alex).
- Patch 3 fixes a bug in s390 for resetting PCI devices with multiple
functions. Creates a new flag pci_slot to allow per function slot.
- Patch 4 fixes a bug in s390 for resource to bus address translation.
- Rebase on 6.17-rc5
v1 series https://lore.kernel.org/all/20250813170821.1115-1-alifm@linux.ibm.com/
v1 -> v2
- Patches 1 and 2 add some additional checks for FLR/PM reset to
try other function reset methods (suggested by Alex).
- Patch 3 fixes a bug in s390 for resetting PCI devices with multiple
functions.
- Patch 7 adds a new device feature for zPCI devices for the VFIO_DEVICE_FEATURE
ioctl. The ioctl is used by userspace to retrieve any PCI error
information for the device (suggested by Alex).
- Patch 8 adds a reset_done() callback for the vfio-pci driver, to
restore the state of the device after a reset.
- Patch 9 removes the pcie check for triggering VFIO_PCI_ERR_IRQ_INDEX.
Farhan Ali (9):
PCI: Allow per function PCI slots
s390/pci: Add architecture specific resource/bus address translation
PCI: Avoid saving config space state if inaccessible
PCI: Add additional checks for flr reset
s390/pci: Update the logic for detecting passthrough device
s390/pci: Store PCI error information for passthrough devices
vfio-pci/zdev: Add a device feature for error information
vfio: Add a reset_done callback for vfio-pci driver
vfio: Remove the pcie check for VFIO_PCI_ERR_IRQ_INDEX
arch/s390/include/asm/pci.h | 29 ++++++++
arch/s390/pci/pci.c | 75 +++++++++++++++++++++
arch/s390/pci/pci_event.c | 107 +++++++++++++++++-------------
drivers/pci/host-bridge.c | 4 +-
drivers/pci/pci.c | 19 +++++-
drivers/pci/slot.c | 25 ++++++-
drivers/vfio/pci/vfio_pci_core.c | 20 ++++--
drivers/vfio/pci/vfio_pci_intrs.c | 3 +-
drivers/vfio/pci/vfio_pci_priv.h | 9 +++
drivers/vfio/pci/vfio_pci_zdev.c | 45 ++++++++++++-
include/linux/pci.h | 1 +
include/uapi/linux/vfio.h | 15 +++++
12 files changed, 291 insertions(+), 61 deletions(-)
--
2.43.0
Hi Bjorn,
Polite ping, to see if there are any concerns with this patch series (more
specifically patches 1-4).
Thanks
Farhan
On 12/1/2025 2:08 PM, Farhan Ali wrote:
> Hi,
>
> This Linux kernel patch series introduces support for error recovery for
> passthrough PCI devices on System Z (s390x).
> [...]