[PATCH v5 0/5] PCI: endpoint: pci-epf-*ntb: Harden vNTB resource management

Posted by Koichiro Den 1 month, 2 weeks ago
The vNTB endpoint function (pci-epf-vntb) can be configured and
reconfigured through configfs (link/unlink functions, start/stop the
controller, update parameters). In practice, this exposes several pitfalls:
duplicate EPC teardown that leads to oopses, a work item running after
resources have been torn down, and a fundamental inability to
re-link/restart because ntb_dev was embedded and the vPCI bus teardown was
incomplete.

This series addresses those issues and hardens resource management of
pci-epf-vntb:

- Remove duplicate EPC resource teardown in both pci-epf-vntb and
  pci-epf-ntb, avoiding crashes on .allow_link failures and during
  .drop_link.
- Stop the delayed cmd_handler work before clearing BARs/doorbells.
- Manage ntb_dev as a devm-managed allocation and implement .remove() in
  the vNTB PCI driver. Switch to pci_scan_root_bus().
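
To illustrate why the ordering in the second item matters, here is a minimal
userspace sketch. It is not the driver code: struct model_vntb and every
model_* function are hypothetical names invented for this example, and the
"work" is a plain function call rather than a real delayed work item. In the
kernel, stopping such work would typically be done with something like
cancel_delayed_work_sync(), but the actual change is in the patch itself.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
struct model_vntb {
	bool work_armed;     /* stands in for the delayed cmd_handler work */
	bool bars_set;       /* stands in for configured BARs/doorbells */
	int stale_accesses;  /* work runs that happened after BARs were cleared */
};

static void model_init(struct model_vntb *m)
{
	m->work_armed = true;
	m->bars_set = true;
	m->stale_accesses = 0;
}

/* One run of the periodic work; it expects the BARs to still be set. */
static void model_work_tick(struct model_vntb *m)
{
	if (!m->work_armed)
		return;
	if (!m->bars_set)
		m->stale_accesses++;
}

static void model_stop_work(struct model_vntb *m)
{
	m->work_armed = false;
}

static void model_clear_bars(struct model_vntb *m)
{
	m->bars_set = false;
}

/* BARs are cleared while the work is still armed. */
static int model_cleanup_unsafe(void)
{
	struct model_vntb m;

	model_init(&m);
	model_clear_bars(&m);
	model_work_tick(&m);	/* the work fires after teardown */
	return m.stale_accesses;
}

/* The work is stopped first, then the BARs are cleared. */
static int model_cleanup_safe(void)
{
	struct model_vntb m;

	model_init(&m);
	model_stop_work(&m);
	model_clear_bars(&m);
	model_work_tick(&m);	/* no-op: the work is already stopped */
	return m.stale_accesses;
}
```

In this model, model_cleanup_unsafe() records one stale access while
model_cleanup_safe() records none, which is the property the reordering is
meant to guarantee.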

With these changes, the controller can now be stopped, a function
unlinked, configfs settings updated, and the controller re-linked and
restarted without rebooting the endpoint, as long as the underlying
pci_epc_ops .stop() is non-destructive and .start() restores normal
operation.
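
The reconfiguration sequence described above can be sketched as a small state
machine. This is only a userspace model under stated assumptions: the states,
operations, and the model_* helpers are hypothetical names, and the rule that
settings are updated only while the function is unlinked is an assumption
taken from the order given in the paragraph above, not a statement about what
configfs enforces.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical lifecycle states and operations; not kernel definitions. */
enum model_state { MODEL_UNLINKED, MODEL_LINKED, MODEL_RUNNING };
enum model_op {
	MODEL_OP_LINK,
	MODEL_OP_START,
	MODEL_OP_STOP,
	MODEL_OP_UNLINK,
	MODEL_OP_UPDATE,
};

/* Returns true and advances *s if op is allowed in the current state. */
static bool model_apply(enum model_state *s, enum model_op op)
{
	switch (op) {
	case MODEL_OP_LINK:
		if (*s != MODEL_UNLINKED)
			return false;
		*s = MODEL_LINKED;
		return true;
	case MODEL_OP_START:
		if (*s != MODEL_LINKED)
			return false;
		*s = MODEL_RUNNING;
		return true;
	case MODEL_OP_STOP:
		if (*s != MODEL_RUNNING)
			return false;
		*s = MODEL_LINKED;
		return true;
	case MODEL_OP_UNLINK:
		if (*s != MODEL_LINKED)
			return false;
		*s = MODEL_UNLINKED;
		return true;
	case MODEL_OP_UPDATE:
		/* Assumption: settings change only while unlinked. */
		return *s == MODEL_UNLINKED;
	}
	return false;
}

/* Replays ops in order; returns how many were applied before a rejection. */
static int model_replay(enum model_state *s, const enum model_op *ops, int n)
{
	int i;

	for (i = 0; i < n; i++) {
		if (!model_apply(s, ops[i]))
			break;
	}
	return i;
}
```

Replaying link, start, stop, unlink, update, link, start from the unlinked
state is accepted in full by this model and ends in the running state.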

Patches 1-3 carry Fixes tags and are candidates for stable.
Patch 4 is a preparatory one for Patch 5.
Patch 5 is a behavioral improvement that completes lifetime management for
relink/restart scenarios.

---
v4->v5 changes:
  - Rebased onto the latest pci/endpoint (2026-02-26).
  - Dropped [PATCH v4 1/7]; will be reposted separately via the NTB tree.
  - Dropped [PATCH v4 2/7], which has been applied in a different form.
  - Corrected the subject prefix of [PATCH v4 5/7]:
    s/NTB: epf: vntb:/PCI: endpoint: pci-epf-vntb:/.
  - Picked up a Reviewed-by tag for [PATCH v4 7/7].
  - Resolved a conflict in [PATCH v4 7/7] due to commit
    dc693d606644 ("PCI: endpoint: pci-epf-vntb: Add MSI doorbell support").
v3->v4 changes:
  - Added Reviewed-by tag for [PATCH v3 6/6].
  - Corrected patch split by moving the blank-line cleanup,
    based on the feedback from Frank.
  (No code changes overall.)
v2->v3 changes:
  - Added Reviewed-by tag for [PATCH v2 4/6].
  - Split [PATCH v2 6/6] into two, based on the feedback from Frank.
  (No code changes overall.)
v1->v2 changes:
  - Incorporated feedback from Frank.
  - Added Reviewed-by tags (except for patches #4 and #6).
  - Fixed a typo in patch #5 title.
  (No code changes overall.)

v4: https://lore.kernel.org/linux-pci/20251202072348.2752371-1-den@valinux.co.jp/
v3: https://lore.kernel.org/all/20251130151100.2591822-1-den@valinux.co.jp/
v2: https://lore.kernel.org/all/20251029080321.807943-1-den@valinux.co.jp/
v1: https://lore.kernel.org/all/20251023071757.901181-1-den@valinux.co.jp/


Koichiro Den (5):
  PCI: endpoint: pci-epf-vntb: Remove duplicate resource teardown
  PCI: endpoint: pci-epf-ntb: Remove duplicate resource teardown
  PCI: endpoint: pci-epf-vntb: Stop cmd_handler work in
    epf_ntb_epc_cleanup
  PCI: endpoint: pci-epf-vntb: Switch vpci_scan_bus() to use
    pci_scan_root_bus()
  PCI: endpoint: pci-epf-vntb: manage ntb_dev lifetime and fix vpci bus
    teardown

 drivers/pci/endpoint/functions/pci-epf-ntb.c  | 56 +-----------
 drivers/pci/endpoint/functions/pci-epf-vntb.c | 88 ++++++++++++-------
 2 files changed, 57 insertions(+), 87 deletions(-)

-- 
2.51.0
Re: [PATCH v5 0/5] PCI: endpoint: pci-epf-*ntb: Harden vNTB resource management
Posted by Koichiro Den 1 month, 1 week ago
On Thu, Feb 26, 2026 at 05:41:37PM +0900, Koichiro Den wrote:
> The vNTB endpoint function (pci-epf-vntb) can be configured and
> reconfigured through configfs (link/unlink functions, start/stop the
> controller, update parameters). In practice, this exposes several pitfalls:
> duplicate EPC teardown that leads to oopses, a work item running after
> resources have been torn down, and a fundamental inability to
> re-link/restart because ntb_dev was embedded and the vPCI bus teardown was
> incomplete.
> 
> This series addresses those issues and hardens resource management of
> pci-epf-vntb:
> 
> - Remove duplicate EPC resource teardown in both pci-epf-vntb and
>   pci-epf-ntb, avoiding crashes on .allow_link failures and during
>   .drop_link.
> - Stop the delayed cmd_handler work before clearing BARs/doorbells.
> - Manage ntb_dev as a devm-managed allocation and implement .remove() in
>   the vNTB PCI driver. Switch to pci_scan_root_bus().
> 
> With these changes, the controller can now be stopped, a function
> unlinked, configfs settings updated, and the controller re-linked and
> restarted without rebooting the endpoint, as long as the underlying
> pci_epc_ops .stop() is non-destructive and .start() restores normal
> operation.
> 
> Patches 1-3 carry Fixes tags and are candidates for stable.
> Patch 4 is a preparatory one for Patch 5.
> Patch 5 is a behavioral improvement that completes lifetime management for
> relink/restart scenarios.

While updating Patches 4 and 5 to address feedback from Mani, as well as the
concern I mentioned at [1], I noticed that if [2] gets merged before this
series, another issue may arise. With [2], the DB IRQ may become a shared IRQ,
in which case the unbind/remove race would require additional care.


Mani, if it's OK with you, could you take Patches 1-3?

- If so, I'll spin the rest (Patches 4-5) into a separate patch series
  starting from v6, with some additional commits.

  Patches 4-5 turned out to be a bigger change than I initially thought. Even
  though Patches 1-3 were originally written as preparatory fixes, they can be
  applied independently at any time.

  The code in Patches 1-3 has also been unchanged since v1 (submitted last
  October).

[1] https://lore.kernel.org/linux-pci/mipdls67csyyrugf4rjx3qqtbxes4sjjtluy3psecnadcgcs7k@rn42d3m6ggsf/
[2] [PATCH v10 0/7] PCI: endpoint: pci-ep-msi: Add embedded doorbell fallback
    https://lore.kernel.org/linux-pci/20260302071427.534158-1-den@valinux.co.jp/


Best regards,
Koichiro

Re: [PATCH v5 0/5] PCI: endpoint: pci-epf-*ntb: Harden vNTB resource management
Posted by Manivannan Sadhasivam 1 month, 1 week ago
On Wed, Mar 04, 2026 at 12:10:23PM +0900, Koichiro Den wrote:
> On Thu, Feb 26, 2026 at 05:41:37PM +0900, Koichiro Den wrote:
> > The vNTB endpoint function (pci-epf-vntb) can be configured and
> > reconfigured through configfs (link/unlink functions, start/stop the
> > controller, update parameters). In practice, this exposes several
> > pitfalls: duplicate EPC teardown that leads to oopses, a work item running
> > after resources have been torn down, and a fundamental inability to
> > re-link/restart because ntb_dev was embedded and the vPCI bus teardown was
> > incomplete.
> > 
> > This series addresses those issues and hardens resource management of
> > pci-epf-vntb:
> > 
> > - Remove duplicate EPC resource teardown in both pci-epf-vntb and
> >   pci-epf-ntb, avoiding crashes on .allow_link failures and during
> >   .drop_link.
> > - Stop the delayed cmd_handler work before clearing BARs/doorbells.
> > - Manage ntb_dev as a devm-managed allocation and implement .remove() in
> >   the vNTB PCI driver. Switch to pci_scan_root_bus().
> > 
> > With these changes, the controller can now be stopped, a function
> > unlinked, configfs settings updated, and the controller re-linked and
> > restarted without rebooting the endpoint, as long as the underlying
> > pci_epc_ops .stop() is non-destructive and .start() restores normal
> > operation.
> > 
> > Patches 1-3 carry Fixes tags and are candidates for stable.
> > Patch 4 is a preparatory one for Patch 5.
> > Patch 5 is a behavioral improvement that completes lifetime management for
> > relink/restart scenarios.
> 
> While updating Patches 4 and 5 to address feedback from Mani, as well as the
> concern I mentioned at [1], I noticed that if [2] gets merged before this
> series, another issue may arise. With [2], the DB IRQ may become a shared
> IRQ, in which case the unbind/remove race would require additional care.
> 
> 
> Mani, if it's OK with you, could you take Patches 1-3?
> 
> - If so, I'll spin the rest (Patches 4-5) into a separate patch series
>   starting from v6, with some additional commits.
> 

Sounds OK to me.

- Mani


-- 
மணிவண்ணன் சதாசிவம்
Re: (subset) [PATCH v5 0/5] PCI: endpoint: pci-epf-*ntb: Harden vNTB resource management
Posted by Manivannan Sadhasivam 1 month, 1 week ago
On Thu, 26 Feb 2026 17:41:37 +0900, Koichiro Den wrote:
> The vNTB endpoint function (pci-epf-vntb) can be configured and
> reconfigured through configfs (link/unlink functions, start/stop the
> controller, update parameters). In practice, this exposes several pitfalls:
> duplicate EPC teardown that leads to oopses, a work item running after
> resources have been torn down, and a fundamental inability to
> re-link/restart because ntb_dev was embedded and the vPCI bus teardown was
> incomplete.
> 
> [...]

Applied, thanks!

[1/5] PCI: endpoint: pci-epf-vntb: Remove duplicate resource teardown
      commit: 0da63230d3ec1ec5fcc443a2314233e95bfece54
[2/5] PCI: endpoint: pci-epf-ntb: Remove duplicate resource teardown
      commit: 3446beddba450c8d6f9aca2f028712ac527fead3
[3/5] PCI: endpoint: pci-epf-vntb: Stop cmd_handler work in epf_ntb_epc_cleanup
      commit: d799984233a50abd2667a7d17a9a710a3f10ebe2

Best regards,
-- 
Manivannan Sadhasivam <mani@kernel.org>
Re: (subset) [PATCH v5 0/5] PCI: endpoint: pci-epf-*ntb: Harden vNTB resource management
Posted by Koichiro Den 1 month, 1 week ago
On Wed, Mar 04, 2026 at 12:11:11PM +0530, Manivannan Sadhasivam wrote:
> 
> On Thu, 26 Feb 2026 17:41:37 +0900, Koichiro Den wrote:
> > The vNTB endpoint function (pci-epf-vntb) can be configured and
> > reconfigured through configfs (link/unlink functions, start/stop the
> > controller, update parameters). In practice, this exposes several
> > pitfalls: duplicate EPC teardown that leads to oopses, a work item running
> > after resources have been torn down, and a fundamental inability to
> > re-link/restart because ntb_dev was embedded and the vPCI bus teardown was
> > incomplete.
> > 
> > [...]
> 
> Applied, thanks!
> 
> [1/5] PCI: endpoint: pci-epf-vntb: Remove duplicate resource teardown
>       commit: 0da63230d3ec1ec5fcc443a2314233e95bfece54
> [2/5] PCI: endpoint: pci-epf-ntb: Remove duplicate resource teardown
>       commit: 3446beddba450c8d6f9aca2f028712ac527fead3
> [3/5] PCI: endpoint: pci-epf-vntb: Stop cmd_handler work in epf_ntb_epc_cleanup
>       commit: d799984233a50abd2667a7d17a9a710a3f10ebe2

Thanks for taking the subset, Mani.

I'll prepare the remaining patches as a separate series after taking another
careful look.

Best regards,
Koichiro

> 
> Best regards,
> -- 
> Manivannan Sadhasivam <mani@kernel.org>
>