Hi,
This Linux kernel patch series introduces support for error recovery for
passthrough PCI devices on System Z (s390x).
Background
----------
For PCI devices on s390x an operating system receives platform specific
error events from firmware rather than through AER. Today for
passthrough/userspace devices, we don't attempt any error recovery and
ignore any error events for the devices. The passthrough/userspace devices
are managed by the vfio-pci driver. The driver does register error handling
callbacks (error_detected), and on an error triggers an eventfd to
userspace. But we need a mechanism to notify userspace
(QEMU/guest/userspace drivers) about the error event.
Proposal
--------
We can expose this error information (currently only the PCI Error Code)
via a device feature. Userspace can then obtain the error information
via VFIO_DEVICE_FEATURE ioctl and take appropriate actions such as driving
a device reset.
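As a rough illustration of the userspace side, the sketch below drains
pending errors until the kernel reports that none are left. It is not the
uAPI from patch 5: struct zpci_err_sketch, the choice of ENOMSG as the
"nothing pending" errno, and the get_err_fn callback (a stand-in for the
VFIO_DEVICE_FEATURE ioctl call) are all assumptions made so the loop can be
shown without a device.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical payload layout, for illustration only. */
struct zpci_err_sketch {
	uint32_t flags;        /* would describe any use of the reserved space */
	uint16_t pec;          /* PCI Error Code */
	uint8_t  reserved[26];
};

/* Assumption: errno used to signal "no more pending errors". */
#define NO_PENDING_ERRNO ENOMSG

/* Stand-in for the VFIO_DEVICE_FEATURE GET call: 0 on success, -1 + errno. */
typedef int (*get_err_fn)(void *ctx, struct zpci_err_sketch *err);
typedef void (*handle_err_fn)(const struct zpci_err_sketch *err);

/*
 * Fetch and handle errors one at a time until the backend reports that
 * nothing is pending. Returns the number handled, or -1 on any other error.
 */
static int drain_pending_errors(get_err_fn get, void *ctx, handle_err_fn handle)
{
	int handled = 0;

	assert(get != NULL && handle != NULL);
	for (;;) {
		struct zpci_err_sketch err = { 0 };

		if (get(ctx, &err) == 0) {
			handle(&err);
			handled++;
			continue;
		}
		return errno == NO_PENDING_ERRNO ? handled : -1;
	}
}

/* Fake backend so the loop can be exercised without a device. */
struct fake_queue {
	uint16_t pecs[4];
	size_t count;
	size_t next;
};

static int fake_get(void *ctx, struct zpci_err_sketch *err)
{
	struct fake_queue *q = ctx;

	if (q->next >= q->count) {
		errno = NO_PENDING_ERRNO;
		return -1;
	}
	err->pec = q->pecs[q->next++];
	return 0;
}

static uint16_t last_pec;

static void record_pec(const struct zpci_err_sketch *err)
{
	last_pec = err->pec;
}
```

In a real consumer the callback would wrap the ioctl on the VFIO device fd,
and the handler would inject the error into the guest rather than record it.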
This is how a typical flow for passthrough devices to a VM would work:
For passthrough devices to a VM, the driver bound to the device on the host
is vfio-pci. The vfio-pci driver supports the error_detected() callback
(vfio_pci_core_aer_err_detected()), and on a PCI error the s390x recovery
code on the host will call the vfio-pci error_detected() callback. The
vfio-pci error_detected() callback will notify userspace/QEMU via an
eventfd, and return PCI_ERS_RESULT_CAN_RECOVER. At this point the s390x
error recovery on the host will skip any further action (see patch 4) and
let userspace drive the error recovery.
Once userspace/QEMU is notified, it then injects this error into the VM
so device drivers in the VM can take recovery actions. For example, for a
passthrough NVMe device, the VM's OS NVMe driver will access the device.
At this point the VM's NVMe driver's error_detected() will drive the
recovery by returning PCI_ERS_RESULT_NEED_RESET, and the s390x error
recovery in the VM's OS will try to do a reset. Resets are privileged
operations and so the VM will need intervention from QEMU to perform the
reset. QEMU will invoke the VFIO_DEVICE_RESET ioctl to now notify the
host that the VM is requesting a reset of the device. The vfio-pci driver
on the host will then perform the reset on the device to recover it.
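The host-side handoff described above can be pictured as a small decision
function. This is only a sketch: the enum names are local stand-ins rather
than the kernel's pci_ers_result_t values, host_next_step() is a
hypothetical helper, and checking for a disconnect result first is an
assumption that the cover letter does not spell out.

```c
#include <assert.h>
#include <stdbool.h>

/* Local stand-ins for the result a driver's error_detected() returns. */
enum ers_result {
	ERS_CAN_RECOVER,
	ERS_NEED_RESET,
	ERS_DISCONNECT,
};

/* What the host recovery code does next (hypothetical names). */
enum host_action {
	HOST_CONTINUE_RECOVERY,   /* host drives recovery itself */
	HOST_DEFER_TO_USERSPACE,  /* skip further action, userspace drives */
	HOST_GIVE_UP,             /* device is considered lost */
};

/*
 * For a device under mediated recovery (vfio-pci passthrough), the host
 * stops after notifying the driver and leaves recovery to userspace.
 */
static enum host_action host_next_step(bool mediated_recovery,
				       enum ers_result result)
{
	if (result == ERS_DISCONNECT)
		return HOST_GIVE_UP;
	if (mediated_recovery)
		return HOST_DEFER_TO_USERSPACE;
	return HOST_CONTINUE_RECOVERY;
}
```

With vfio-pci returning the equivalent of ERS_CAN_RECOVER, a passthrough
device takes the HOST_DEFER_TO_USERSPACE branch, which matches the point
where QEMU takes over in the flow above.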
Thanks
Farhan
ChangeLog
---------
v13 series https://lore.kernel.org/all/20260413210608.2912-1-alifm@linux.ibm.com/
v13 -> v14
- Remove version from vfio uAPI struct. Instead reserve additional space
and add a flags field. The flags will be used to indicate any usage of
the reserved space (patch 5).
- Remove pending_errors from vfio uAPI struct and instead return an
error to indicate no more pending error for userspace to handle (patch 5).
- Rebase on recent linux master
v12 series https://lore.kernel.org/all/20260330174011.1161-1-alifm@linux.ibm.com/
v12 -> v13
- Add the mediated_recovery flag as part of struct zpci_ccdf_pending
and protect the struct with pending_errs_lock (patch 4).
- Move dequeuing pending error logic to a helper function (patch 5).
- Update device feature number for VFIO_DEVICE_FEATURE_ZPCI_ERROR (patch 5).
- Rebase on linux-next with tag next-20260410
v11 series https://lore.kernel.org/all/20260316191544.2279-1-alifm@linux.ibm.com/
v11 -> v12
- Address Bjorn's comments from v11 (patches 1-3).
- Create a common function to check config space accessibility
(patch 2).
- Address Alex's comments from v11 (patches 4, 5, 7).
- Protect the mediated_recovery flag with the pending_errs_lock.
With that change it made sense to squash patches 5 and 6 from v11
(current patch 4). Even though the code didn't change significantly
I have dropped the R-b tags for it. Would appreciate another look at the
patch (current patch 4).
- Dropped arch specific pcibios_resource_to_bus and
pcibios_bus_to_resource as they are not needed for this series. Will address
the issue as a standalone patch separate from this series.
- Rebased on pci/next, with head at f8a1c947ccc6 ("Merge branch 'pci/misc'")
v10 series https://lore.kernel.org/all/20260302203325.3826-1-alifm@linux.ibm.com/
v10 -> v11
- Rebase on pci/next to handle merge conflicts with patch 1.
- Typo fixup in commit message (patch 4) and use guard() for mutex
(patch 6).
v9 series https://lore.kernel.org/all/20260217182257.1582-1-alifm@linux.ibm.com/
v9 -> v10
- Change pci_slot number to u16 (patch 1).
- Avoid saving invalid config space state if config space is
inaccessible in the device reset path. It uses the same patch as in v8
with R-b from Niklas.
- Rebase on 7.0.0-rc2
v8 series https://lore.kernel.org/all/20260122194437.1903-1-alifm@linux.ibm.com/
v8 -> v9
- Avoid saving PCI config space state in reset path (patch 3) (suggested by Bjorn)
- Add explicit version to struct vfio_device_feature_zpci_err (patch 7).
- Rebase on 6.19
v7 series https://lore.kernel.org/all/20260107183217.1365-1-alifm@linux.ibm.com/
v7 -> v8
- Rebase on 6.19-rc4
- Address feedback from Niklas and Julien.
v6 series https://lore.kernel.org/all/2c609e61-1861-4bf3-b019-a11c137d26a5@linux.ibm.com/
v6 -> v7
- Rebase on 6.19-rc4
- Update commit message based on Niklas's suggestion (patch 3).
v5 series https://lore.kernel.org/all/20251113183502.2388-1-alifm@linux.ibm.com/
v5 -> v6
- Rebase on 6.18 + Lukas's PCI: Universal error recoverability of
devices series (https://lore.kernel.org/all/cover.1763483367.git.lukas@wunner.de/)
- Re-work config space accessibility check to pci_dev_save_and_disable() (patch 3).
This avoids saving the config space, in the reset path, if the device's config space is
corrupted or inaccessible.
v4 series https://lore.kernel.org/all/20250924171628.826-1-alifm@linux.ibm.com/
v4 -> v5
- Rebase on 6.18-rc5
- Move bug fixes to the beginning of the series (patches 1 and 2). These patches
were posted as a separate fixes series
https://lore.kernel.org/all/a14936ac-47d6-461b-816f-0fd66f869b0f@linux.ibm.com/
- Add matching pci_dev_put() for pci_get_slot() (patch 6).
v3 series https://lore.kernel.org/all/20250911183307.1910-1-alifm@linux.ibm.com/
v3 -> v4
- Remove warn messages for each PCI capability not restored (patch 1)
- Check PCI_COMMAND and PCI_STATUS register for error value instead of device id
(patch 1)
- Fix kernel crash in patch 3
- Add Reviewed-by tags
- Address comments from Niklas (patches 4, 5, 7)
- Fix compilation error on non-s390x systems (patch 8)
- Explicitly align struct vfio_device_feature_zpci_err (patch 8)
v2 series https://lore.kernel.org/all/20250825171226.1602-1-alifm@linux.ibm.com/
v2 -> v3
- Patch 1 avoids saving any config space state if the device is in error
(suggested by Alex)
- Patch 2 adds additional check only for FLR reset to try other function
reset method (suggested by Alex).
- Patch 3 fixes a bug in s390 for resetting PCI devices with multiple
functions. Creates a new flag pci_slot to allow per function slot.
- Patch 4 fixes a bug in s390 for resource to bus address translation.
- Rebase on 6.17-rc5
v1 series https://lore.kernel.org/all/20250813170821.1115-1-alifm@linux.ibm.com/
v1 -> v2
- Patches 1 and 2 add some additional checks for FLR/PM reset to
try other function reset methods (suggested by Alex).
- Patch 3 fixes a bug in s390 for resetting PCI devices with multiple
functions.
- Patch 7 adds a new device feature for zPCI devices for the VFIO_DEVICE_FEATURE
ioctl. The ioctl is used by userspace to retrieve any PCI error
information for the device (suggested by Alex).
- Patch 8 adds a reset_done() callback for the vfio-pci driver, to
restore the state of the device after a reset.
- Patch 9 removes the pcie check for triggering VFIO_PCI_ERR_IRQ_INDEX.
Farhan Ali (7):
PCI: Allow per function PCI slots to fix slot reset on s390
PCI: Avoid saving config space state if inaccessible
PCI: Fail FLR when config space is inaccessible
s390/pci: Store PCI error information for passthrough devices
vfio-pci/zdev: Add a device feature for error information
vfio/pci: Add a reset_done callback for vfio-pci driver
vfio/pci: Remove the pcie check for VFIO_PCI_ERR_IRQ_INDEX
arch/s390/include/asm/pci.h | 32 +++++++
arch/s390/pci/pci.c | 1 +
arch/s390/pci/pci_event.c | 136 +++++++++++++++++++-----------
drivers/pci/hotplug/rpaphp_slot.c | 2 +-
drivers/pci/pci.c | 32 ++++++-
drivers/pci/slot.c | 33 ++++++--
drivers/vfio/pci/vfio_pci_core.c | 22 +++--
drivers/vfio/pci/vfio_pci_intrs.c | 3 +-
drivers/vfio/pci/vfio_pci_priv.h | 9 ++
drivers/vfio/pci/vfio_pci_zdev.c | 48 ++++++++++-
include/linux/pci.h | 8 +-
include/uapi/linux/vfio.h | 29 +++++++
12 files changed, 282 insertions(+), 73 deletions(-)
--
2.43.0
On Tue, Apr 21, 2026 at 09:30:24AM -0700, Farhan Ali wrote:
> Hi,
>
> This Linux kernel patch series introduces support for error recovery for
> passthrough PCI devices on System Z (s390x).

Can you take a look through
https://sashiko.dev/#/patchset/20260421163031.704-1-alifm%40linux.ibm.com
and see if there's anything worth changing?
On 4/28/2026 3:01 PM, Bjorn Helgaas wrote:
> On Tue, Apr 21, 2026 at 09:30:24AM -0700, Farhan Ali wrote:
>> Hi,
>>
>> This Linux kernel patch series introduces support for error recovery for
>> passthrough PCI devices on System Z (s390x).
> Can you take a look through
> https://sashiko.dev/#/patchset/20260421163031.704-1-alifm%40linux.ibm.com
> and see if there's anything worth changing?

Hi Bjorn,

AFAICT Sashiko correctly identified one error that needs to be fixed. I
think there were some other suggestions that would require some changes
based on discussion with Niklas. I will wait a bit more to see if there is
any other feedback before sending a new revision.

Thanks
Farhan
Gentle ping on this series.
Thanks
Farhan
On 4/21/2026 9:30 AM, Farhan Ali wrote:
> Hi,
>
> This Linux kernel patch series introduces support for error recovery for
> passthrough PCI devices on System Z (s390x).