[PATCH v2 0/2] PCI: Guard Resizable BAR restore against unreachable devices

Marco Nenciarini posted 2 patches 1 month, 4 weeks ago
drivers/pci/iov.c   | 6 ++++++
drivers/pci/rebar.c | 6 ++++++
2 files changed, 12 insertions(+)
Posted by Marco Nenciarini 1 month, 4 weeks ago
This series addresses Bjorn's review feedback on v1 [1].

v1 bounds-checked bar_idx before indexing dev->sriov->barsz[] in
sriov_restore_vf_rebar_state(). Bjorn pointed out that the non-SRIOV
sibling pci_restore_rebar_state() has the same issue, and that a
PCI_POSSIBLE_ERROR(ctrl) check on the config read makes the intent
of the guard more obvious than a post-hoc range check on the
extracted field.

v2 therefore adopts PCI_POSSIBLE_ERROR(ctrl) after each Resizable
BAR Control read in both functions, bailing out when config reads
return the all-ones pattern. Patch 1 covers pci_restore_rebar_state().
Patch 2 covers sriov_restore_vf_rebar_state(), with the NVIDIA GC6
UBSAN splat as motivation.
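
For readers without the patches at hand, here is a rough userspace
sketch of the guard, not the patch itself. The macros are local
stand-ins assumed to mirror the kernel's definitions, and
demo_ctrl_bar_idx() is a hypothetical helper that exists only for this
illustration. It shows why an all-ones read is a problem: the 3-bit
BAR Index field decodes to 7, which is past the end of a 6-entry
array such as barsz[].

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-ins; values assumed to mirror the kernel headers. */
#define PCI_ERROR_RESPONSE      (~0ULL)
#define PCI_POSSIBLE_ERROR(val) ((val) == ((__typeof__(val)) PCI_ERROR_RESPONSE))
#define PCI_REBAR_CTRL_BAR_IDX  0x00000007u
#define PCI_SRIOV_NUM_BARS      6

/*
 * Hypothetical helper: extract the BAR index from a Resizable BAR
 * Control value, refusing values that look like a failed config read.
 * Returns false (and leaves *idx untouched) on the all-ones pattern.
 */
static bool demo_ctrl_bar_idx(uint32_t ctrl, unsigned int *idx)
{
	assert(idx != NULL);

	/* All-ones means the read most likely never reached the device. */
	if (PCI_POSSIBLE_ERROR(ctrl))
		return false;

	*idx = ctrl & PCI_REBAR_CTRL_BAR_IDX;
	return true;
}
```

Without the check, ctrl == 0xFFFFFFFF would yield an index of 7, which
is >= PCI_SRIOV_NUM_BARS.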

Note that this changes behavior versus v1: on a bad read we abort
the loop instead of skipping just the current BAR. This matches
the structure Bjorn suggested in review and is safe because the
all-ones pattern means the device is unreachable, so restoring the
remaining BARs is moot.
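
The abort-versus-skip difference can be sketched in userspace as
follows. This is only an illustration under stated assumptions:
demo_restore_rebar() and the DEMO_* constants are hypothetical, the
config reads are modeled as a plain array, and the real functions do
more work per BAR than recording an index.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_ERROR_RESPONSE 0xFFFFFFFFu /* all-ones config read */
#define DEMO_BAR_IDX_MASK   0x00000007u /* assumed BAR Index field */

/*
 * Hypothetical model of the restore loop. ctrl_reads[i] stands in for
 * the i-th Resizable BAR Control read. For each good read, the decoded
 * BAR index is stored in idx_out. Returns the number of BARs handled.
 */
static size_t demo_restore_rebar(const uint32_t *ctrl_reads, size_t nbars,
				 unsigned int *idx_out)
{
	size_t done = 0;

	assert(ctrl_reads != NULL && idx_out != NULL);

	for (size_t i = 0; i < nbars; i++) {
		uint32_t ctrl = ctrl_reads[i];

		/* v2 behavior: stop entirely, the device is unreachable. */
		if (ctrl == DEMO_ERROR_RESPONSE)
			return done;

		idx_out[done++] = ctrl & DEMO_BAR_IDX_MASK;
	}

	return done;
}
```

A v1-style loop would use "continue" where this sketch returns, and so
would keep issuing config accesses to a device that has already
stopped responding.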

Compile-tested on pci/next (full drivers/pci/ build). The error
path cannot be exercised without reproducing the GC6 failure that
killed the GPU in the original report.

The broader v1 discussion on the pci_restore_config_dword() retry
loop and on save/restore behavior when the device has fallen off
the bus is out of scope for this fix. Happy to tackle that
separately if there is consensus.

[1] https://lore.kernel.org/all/20260408163922.1740497-1-mnencia@kcore.it/

Marco Nenciarini (2):
  PCI: Skip Resizable BAR restore on read error
  PCI/IOV: Skip VF Resizable BAR restore on read error

 drivers/pci/iov.c   | 6 ++++++
 drivers/pci/rebar.c | 6 ++++++
 2 files changed, 12 insertions(+)


base-commit: 40286d6379aacfcc053253ef78dc78b09addffda
-- 
2.47.3
Re: [PATCH v2 0/2] PCI: Guard Resizable BAR restore against unreachable devices
Posted by Bjorn Helgaas 1 month, 3 weeks ago
On Fri, Apr 17, 2026 at 03:24:35PM +0200, Marco Nenciarini wrote:
> This series addresses Bjorn's review feedback on v1 [1].
> 
> v1 bounds-checked bar_idx before indexing dev->sriov->barsz[] in
> sriov_restore_vf_rebar_state(). Bjorn pointed out that the non-SRIOV
> sibling pci_restore_rebar_state() has the same issue, and that a
> PCI_POSSIBLE_ERROR(ctrl) check on the config read makes the intent
> of the guard more obvious than a post-hoc range check on the
> extracted field.
> 
> v2 therefore adopts PCI_POSSIBLE_ERROR(ctrl) after each Resizable
> BAR Control read in both functions, bailing out when config reads
> return the all-ones pattern. Patch 1 covers pci_restore_rebar_state().
> Patch 2 covers sriov_restore_vf_rebar_state(), with the NVIDIA GC6
> UBSAN splat as motivation.
> 
> Note that this changes behavior versus v1: on a bad read we abort
> the loop instead of skipping just the current BAR. This matches
> the structure Bjorn suggested in review and is safe because the
> all-ones pattern means the device is unreachable, so restoring the
> remaining BARs is moot.
> 
> Compile-tested on pci/next (full drivers/pci/ build). The error
> path cannot be exercised without reproducing the GC6 failure that
> killed the GPU in the original report.
> 
> The broader v1 discussion on the pci_restore_config_dword() retry
> loop and on save/restore behavior when the device has fallen off
> the bus is out of scope for this fix. Happy to tackle that
> separately if there is consensus.
> 
> [1] https://lore.kernel.org/all/20260408163922.1740497-1-mnencia@kcore.it/
> 
> Marco Nenciarini (2):
>   PCI: Skip Resizable BAR restore on read error
>   PCI/IOV: Skip VF Resizable BAR restore on read error
> 
>  drivers/pci/iov.c   | 6 ++++++
>  drivers/pci/rebar.c | 6 ++++++
>  2 files changed, 12 insertions(+)

Applied to pci/pm for v7.2, thanks!  Will be rebased after v7.1-rc1.

> base-commit: 40286d6379aacfcc053253ef78dc78b09addffda
> -- 
> 2.47.3
>