[PATCH v4 0/7] PCI: endpoint/NTB: Harden vNTB resource management

Koichiro Den posted 7 patches 2 months, 1 week ago
Posted by Koichiro Den 2 months, 1 week ago
The vNTB endpoint function (pci-epf-vntb) can be configured and reconfigured
through configfs (link/unlink functions, start/stop the controller, update
parameters). In practice, there are several pitfalls: double-unmapping when
two windows share a BAR, wrong parameter order in .drop_link leading to wrong
object lookups, duplicate EPC teardown that leads to oopses, a work item
running after resources were torn down, and an inability to re-link/restart,
fundamentally because ntb_dev was embedded and the vPCI bus teardown was
incomplete.

This series addresses those issues and hardens resource management across NTB
EPF and PCI EP core:

- Avoid double iounmap when PEER_SPAD and CONFIG share the same BAR.
- Fix configfs .drop_link parameter order so the correct groups are used during
  unlink.
- Remove duplicate EPC resource teardown in both pci-epf-vntb and pci-epf-ntb,
  avoiding crashes on .allow_link failures and during .drop_link.
- Stop the delayed cmd_handler work before clearing BARs/doorbells.
- Manage ntb_dev as a devm-managed allocation and implement .remove() in the
  vNTB PCI driver. Switch to pci_scan_root_bus().

With these changes, the controller can now be stopped, a function unlinked,
configfs settings updated, and the controller re-linked and restarted
without rebooting the endpoint, as long as the underlying pci_epc_ops
.stop() is non-destructive and .start() restores normal operation.
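
One ordering constraint behind a clean stop/restart cycle is that the
cmd_handler work has to be stopped before the BARs and doorbells it touches
are cleared. The following userspace sketch models only that ordering. All
names (model_ntb, model_cmd_handler, model_run_pending,
model_cancel_work_sync, model_cleanup) are hypothetical, the workqueue is
reduced to a single flag, and the handler is assumed to re-arm itself the way
a polling handler typically does. In the kernel, the cancel step would
normally be a synchronous cancel such as cancel_delayed_work_sync(); whether
the series uses exactly that call is not stated here.

```c
#include <stdbool.h>

/* Userspace stand-in for the vNTB state relevant to teardown (hypothetical). */
struct model_ntb {
	bool bars_configured; /* stands in for BAR/doorbell state */
	bool work_queued;     /* stands in for the delayed cmd_handler work */
	int stale_accesses;   /* times the handler ran without resources */
};

/* Handler body: touches BAR-backed state, then re-arms itself (assumed). */
static void model_cmd_handler(struct model_ntb *ntb)
{
	if (!ntb->bars_configured)
		ntb->stale_accesses++;
	ntb->work_queued = true;
}

/* Run the work item if it is queued, as a workqueue would. */
static void model_run_pending(struct model_ntb *ntb)
{
	if (!ntb->work_queued)
		return;
	ntb->work_queued = false;
	model_cmd_handler(ntb);
}

/* Models a synchronous cancel: the handler will not run after this. */
static void model_cancel_work_sync(struct model_ntb *ntb)
{
	ntb->work_queued = false;
}

/* Teardown in the safe order: stop the work first, then clear resources. */
static void model_cleanup(struct model_ntb *ntb)
{
	model_cancel_work_sync(ntb);
	ntb->bars_configured = false;
}
```

Clearing bars_configured without the cancel step leaves the queued work free
to run against cleared state, which is the failure mode the fifth patch
targets.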

Patches 1-5 carry Fixes tags and are candidates for stable.
Patch 6 is a preparatory one for Patch 7.
Patch 7 is a behavioral improvement that completes lifetime management for
relink/restart scenarios.


v3->v4 changes:
  - Added Reviewed-by tag for [PATCH v3 6/6].
  - Corrected patch split by moving the blank-line cleanup,
    based on the feedback from Frank.
  (No code changes overall.)
v2->v3 changes:
  - Added Reviewed-by tag for [PATCH v2 4/6].
  - Split [PATCH v2 6/6] into two, based on the feedback from Frank.
  (No code changes overall.)
v1->v2 changes:
  - Incorporated feedback from Frank.
  - Added Reviewed-by tags (except for patches #4 and #6).
  - Fixed a typo in patch #5 title.
  (No code changes overall.)

v3: https://lore.kernel.org/all/20251130151100.2591822-1-den@valinux.co.jp/
v2: https://lore.kernel.org/all/20251029080321.807943-1-den@valinux.co.jp/
v1: https://lore.kernel.org/all/20251023071757.901181-1-den@valinux.co.jp/


Koichiro Den (7):
  NTB: epf: Avoid pci_iounmap() with offset when PEER_SPAD and CONFIG
    share BAR
  PCI: endpoint: Fix parameter order for .drop_link
  PCI: endpoint: pci-epf-vntb: Remove duplicate resource teardown
  PCI: endpoint: pci-epf-ntb: Remove duplicate resource teardown
  NTB: epf: vntb: Stop cmd_handler work in epf_ntb_epc_cleanup
  PCI: endpoint: pci-epf-vntb: Switch vpci_scan_bus() to use
    pci_scan_root_bus()
  PCI: endpoint: pci-epf-vntb: manage ntb_dev lifetime and fix vpci bus
    teardown

 drivers/ntb/hw/epf/ntb_hw_epf.c               |  3 +-
 drivers/pci/endpoint/functions/pci-epf-ntb.c  | 56 +-----------
 drivers/pci/endpoint/functions/pci-epf-vntb.c | 86 ++++++++++++-------
 drivers/pci/endpoint/pci-ep-cfs.c             |  8 +-
 4 files changed, 62 insertions(+), 91 deletions(-)

-- 
2.48.1
Re: [PATCH v4 0/7] PCI: endpoint/NTB: Harden vNTB resource management
Posted by Koichiro Den 1 month ago
On Tue, Dec 02, 2025 at 04:23:41PM +0900, Koichiro Den wrote:
> The vNTB endpoint function (pci-epf-vntb) can be configured and reconfigured
> through configfs (link/unlink functions, start/stop the controller, update
> parameters). In practice, there are several pitfalls: double-unmapping when
> two windows share a BAR, wrong parameter order in .drop_link leading to wrong
> object lookups, duplicate EPC teardown that leads to oopses, a work item
> running after resources were torn down, and an inability to re-link/restart,
> fundamentally because ntb_dev was embedded and the vPCI bus teardown was
> incomplete.
> 
> This series addresses those issues and hardens resource management across NTB
> EPF and PCI EP core:
> 
> - Avoid double iounmap when PEER_SPAD and CONFIG share the same BAR.
> - Fix configfs .drop_link parameter order so the correct groups are used during
>   unlink.
> - Remove duplicate EPC resource teardown in both pci-epf-vntb and pci-epf-ntb,
>   avoiding crashes on .allow_link failures and during .drop_link.
> - Stop the delayed cmd_handler work before clearing BARs/doorbells.
> - Manage ntb_dev as a devm-managed allocation and implement .remove() in the
>   vNTB PCI driver. Switch to pci_scan_root_bus().
> 
> With these changes, the controller can now be stopped, a function unlinked,
> configfs settings updated, and the controller re-linked and restarted
> without rebooting the endpoint, as long as the underlying pci_epc_ops
> .stop() is non-destructive and .start() restores normal operation.
> 
> Patches 1-5 carry Fixes tags and are candidates for stable.
> Patch 6 is a preparatory one for Patch 7.
> Patch 7 is a behavioral improvement that completes lifetime management for
> relink/restart scenarios.
> 
> 
> v3->v4 changes:
>   - Added Reviewed-by tag for [PATCH v3 6/6].
>   - Corrected patch split by moving the blank-line cleanup,
>     based on the feedback from Frank.
>   (No code changes overall.)
> v2->v3 changes:
>   - Added Reviewed-by tag for [PATCH v2 4/6].
>   - Split [PATCH v2 6/6] into two, based on the feedback from Frank.
>   (No code changes overall.)
> v1->v2 changes:
>   - Incorporated feedback from Frank.
>   - Added Reviewed-by tags (except for patches #4 and #6).
>   - Fixed a typo in patch #5 title.
>   (No code changes overall.)
> 
> v3: https://lore.kernel.org/all/20251130151100.2591822-1-den@valinux.co.jp/
> v2: https://lore.kernel.org/all/20251029080321.807943-1-den@valinux.co.jp/
> v1: https://lore.kernel.org/all/20251023071757.901181-1-den@valinux.co.jp/
> 
> 
> Koichiro Den (7):
>   NTB: epf: Avoid pci_iounmap() with offset when PEER_SPAD and CONFIG
>     share BAR
>   PCI: endpoint: Fix parameter order for .drop_link
>   PCI: endpoint: pci-epf-vntb: Remove duplicate resource teardown
>   PCI: endpoint: pci-epf-ntb: Remove duplicate resource teardown
>   NTB: epf: vntb: Stop cmd_handler work in epf_ntb_epc_cleanup
>   PCI: endpoint: pci-epf-vntb: Switch vpci_scan_bus() to use
>     pci_scan_root_bus()
>   PCI: endpoint: pci-epf-vntb: manage ntb_dev lifetime and fix vpci bus
>     teardown
> 
>  drivers/ntb/hw/epf/ntb_hw_epf.c               |  3 +-
>  drivers/pci/endpoint/functions/pci-epf-ntb.c  | 56 +-----------
>  drivers/pci/endpoint/functions/pci-epf-vntb.c | 86 ++++++++++++-------
>  drivers/pci/endpoint/pci-ep-cfs.c             |  8 +-
>  4 files changed, 62 insertions(+), 91 deletions(-)

Dear NTB and PCI endpoint maintainers,

I suspect this series may have been confusing because it mixed patches
targeting both NTB and PCI endpoint subsystems.

Should I re-submit [PATCH v4 2/7], which touches
drivers/pci/endpoint/pci-ep-cfs.c, separately to the linux-pci mailing
list, and re-submit the rest of the patches to the NTB mailing list?

Any guidance would be appreciated.

Thanks,
Koichiro

> 
> -- 
> 2.48.1
>
Re: [PATCH v4 0/7] PCI: endpoint/NTB: Harden vNTB resource management
Posted by Koichiro Den 1 week, 3 days ago
On Thu, Jan 08, 2026 at 03:57:30PM +0900, Koichiro Den wrote:
> On Tue, Dec 02, 2025 at 04:23:41PM +0900, Koichiro Den wrote:
> > The vNTB endpoint function (pci-epf-vntb) can be configured and reconfigured
> > through configfs (link/unlink functions, start/stop the controller, update
> > parameters). In practice, there are several pitfalls: double-unmapping when
> > two windows share a BAR, wrong parameter order in .drop_link leading to wrong
> > object lookups, duplicate EPC teardown that leads to oopses, a work item
> > running after resources were torn down, and an inability to re-link/restart,
> > fundamentally because ntb_dev was embedded and the vPCI bus teardown was
> > incomplete.
> > 
> > This series addresses those issues and hardens resource management across NTB
> > EPF and PCI EP core:
> > 
> > - Avoid double iounmap when PEER_SPAD and CONFIG share the same BAR.
> > - Fix configfs .drop_link parameter order so the correct groups are used during
> >   unlink.
> > - Remove duplicate EPC resource teardown in both pci-epf-vntb and pci-epf-ntb,
> >   avoiding crashes on .allow_link failures and during .drop_link.
> > - Stop the delayed cmd_handler work before clearing BARs/doorbells.
> > - Manage ntb_dev as a devm-managed allocation and implement .remove() in the
> >   vNTB PCI driver. Switch to pci_scan_root_bus().
> > 
> > With these changes, the controller can now be stopped, a function unlinked,
> > configfs settings updated, and the controller re-linked and restarted
> > without rebooting the endpoint, as long as the underlying pci_epc_ops
> > .stop() is non-destructive and .start() restores normal operation.
> > 
> > Patches 1-5 carry Fixes tags and are candidates for stable.
> > Patch 6 is a preparatory one for Patch 7.
> > Patch 7 is a behavioral improvement that completes lifetime management for
> > relink/restart scenarios.
> > 
> > 
> > v3->v4 changes:
> >   - Added Reviewed-by tag for [PATCH v3 6/6].
> >   - Corrected patch split by moving the blank-line cleanup,
> >     based on the feedback from Frank.
> >   (No code changes overall.)
> > v2->v3 changes:
> >   - Added Reviewed-by tag for [PATCH v2 4/6].
> >   - Split [PATCH v2 6/6] into two, based on the feedback from Frank.
> >   (No code changes overall.)
> > v1->v2 changes:
> >   - Incorporated feedback from Frank.
> >   - Added Reviewed-by tags (except for patches #4 and #6).
> >   - Fixed a typo in patch #5 title.
> >   (No code changes overall.)
> > 
> > v3: https://lore.kernel.org/all/20251130151100.2591822-1-den@valinux.co.jp/
> > v2: https://lore.kernel.org/all/20251029080321.807943-1-den@valinux.co.jp/
> > v1: https://lore.kernel.org/all/20251023071757.901181-1-den@valinux.co.jp/
> > 
> > 
> > Koichiro Den (7):
> >   NTB: epf: Avoid pci_iounmap() with offset when PEER_SPAD and CONFIG
> >     share BAR
> >   PCI: endpoint: Fix parameter order for .drop_link
> >   PCI: endpoint: pci-epf-vntb: Remove duplicate resource teardown
> >   PCI: endpoint: pci-epf-ntb: Remove duplicate resource teardown
> >   NTB: epf: vntb: Stop cmd_handler work in epf_ntb_epc_cleanup
> >   PCI: endpoint: pci-epf-vntb: Switch vpci_scan_bus() to use
> >     pci_scan_root_bus()
> >   PCI: endpoint: pci-epf-vntb: manage ntb_dev lifetime and fix vpci bus
> >     teardown
> > 
> >  drivers/ntb/hw/epf/ntb_hw_epf.c               |  3 +-
> >  drivers/pci/endpoint/functions/pci-epf-ntb.c  | 56 +-----------
> >  drivers/pci/endpoint/functions/pci-epf-vntb.c | 86 ++++++++++++-------
> >  drivers/pci/endpoint/pci-ep-cfs.c             |  8 +-
> >  4 files changed, 62 insertions(+), 91 deletions(-)
> 
> Dear NTB and PCI endpoint maintainers,
> 
> I suspect this series may have been confusing because it mixed patches
> targeting both NTB and PCI endpoint subsystems.
> 
> Should I re-submit [PATCH v4 2/7], which touches
> drivers/pci/endpoint/pci-ep-cfs.c, separately to the linux-pci mailing
> list, and re-submit the rest of the patches to the NTB mailing list?
> 
> Any guidance would be appreciated.

Hi Jon, Dave, Allen,

Sorry for the ping.

Regarding the earlier question about splitting the series, [PATCH v4 2/7]
is no longer needed, as an identical fix was merged recently:
https://lore.kernel.org/linux-pci/20260108062747.1870669-1-mmaddireddy@nvidia.com/

That leaves the remaining patches as NTB-focused changes.

Could you please take a look at the rest of the series when you have a
chance? Any feedback would be much appreciated.

Kind regards,
Koichiro

> 
> Thanks,
> Koichiro
> 
> > 
> > -- 
> > 2.48.1
> >