This series fixes doorbell bit/vector handling for the EPF-based NTB
pair (ntb_hw_epf <-> pci-epf-*ntb). Its primary goal is to enable safe
per-db-vector handling in the NTB core and clients (e.g. ntb_transport),
without changing the on-the-wire doorbell mapping.
Background / problem
====================
ntb_hw_epf historically applies an extra offset when ringing peer
doorbells: the link event uses the first interrupt slot, and doorbells
start from the third slot (i.e. a second slot is effectively unused).
pci-epf-vntb carries the matching offset on the EP side as well.
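As a rough illustration of the slot layout described above, the mapping
can be sketched in plain C. The enum and helper names here are
hypothetical and exist only for this example; they are not taken from
the patches, and slots are assumed to be numbered from 0.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; slot numbering assumed to start at 0. */
enum epf_irq_slot {
	EPF_IRQ_LINK = 0,	/* link event uses the first slot */
	EPF_IRQ_RESERVED = 1,	/* second slot is effectively unused */
	EPF_IRQ_DB_START = 2,	/* doorbells start from the third slot */
};

/* The legacy layout leaves exactly one unused slot before the doorbells. */
static_assert(EPF_IRQ_DB_START == EPF_IRQ_RESERVED + 1,
	      "doorbells follow the reserved slot");

/* Interrupt slot to ring for a 0-based doorbell bit (legacy offset kept). */
static inline unsigned int db_bit_to_slot(unsigned int db_bit)
{
	return db_bit + EPF_IRQ_DB_START;
}

/* 0-based db_vector for a slot, or -1 for the link/reserved slots. */
static inline int slot_to_db_vector(unsigned int slot)
{
	if (slot < EPF_IRQ_DB_START)
		return -1;
	return (int)(slot - EPF_IRQ_DB_START);
}
```

With this layout, doorbell bit 0 rings slot 2, and a receiver that wants
0-based vectors has to subtract the same offset before reporting.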
As long as db_vector_count()/db_vector_mask() are not implemented, this
mismatch is mostly masked. Doorbell events are effectively treated as
"can hit any QP" and the off-by-one vector numbering does not surface
clearly.
However, once per-vector handling is enabled, the current state becomes
problematic:
- db_valid_mask exposes bits that do not correspond to real doorbells
  (link/unused slots leak into the mask).
- ntb_db_event() is fed with 1-based/shifted vectors, while NTB core
  expects a 0-based db_vector for doorbells.
- On pci-epf-vntb, .peer_db_set() may be called in atomic context, but
  it directly calls pci_epc_raise_irq(), which can sleep.
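The atomic-context problem in the last item is commonly handled by
recording the request and raising the interrupt later from a context
that may sleep. The following is a minimal userspace sketch of that
pattern using C11 atomics, not the code from the patches: all names are
hypothetical, raise_irq_may_sleep() merely stands in for a sleeping call
such as pci_epc_raise_irq(), and db_work_fn() stands in for a work item.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

static _Atomic uint64_t pending_db;	/* doorbell bits waiting to be raised */
static unsigned int raised_count[64];	/* records raises, for illustration */

/* Stand-in for a call that may sleep; only called from db_work_fn(). */
static void raise_irq_may_sleep(unsigned int db_bit)
{
	assert(db_bit < 64);
	raised_count[db_bit]++;
}

/* Atomic-context side: only record the request, never sleep. */
static void peer_db_set_atomic(uint64_t db_bits)
{
	atomic_fetch_or(&pending_db, db_bits);
	/* A real driver would schedule its work item here. */
}

/* Process-context side: take all pending bits at once and raise them. */
static void db_work_fn(void)
{
	uint64_t bits = atomic_exchange(&pending_db, 0);

	for (unsigned int bit = 0; bit < 64; bit++)
		if (bits & (UINT64_C(1) << bit))
			raise_irq_may_sleep(bit);
}
```

One property of this particular sketch is that repeated rings of the
same bit before the worker runs are coalesced into a single raise.
Whether that is acceptable depends on the client and is an assumption
here.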
Why NOT fix the root offset?
============================
The natural "root" fix would be to remove the historical extra offset in
the peer_db_set() doorbell paths for ntb_hw_epf and pci-epf-vntb.
Unfortunately, this would break interoperability between old and new
kernel versions (old/new peers): a new side would ring a different
interrupt slot than an old peer expects, leading to missed or misrouted
doorbells once db_vector_count()/db_vector_mask() are implemented.
Therefore this series intentionally keeps the legacy offset, and instead
fixes the surrounding pieces so the mapping is documented and handled
consistently in masks, vector numbering, and per-vector reporting.
What this series does
=====================
- pci-epf-vntb:

  - Document the legacy offset.
  - Defer MSI doorbell raises to process context to avoid sleeping in
    atomic context. This becomes relevant once multiple doorbells are
    raised concurrently at a high rate.
  - Report doorbell vectors as 0-based to ntb_db_event().
  - Fix db_valid_mask and implement db_vector_count()/db_vector_mask().

- ntb_hw_epf:

  - Document the legacy offset in ntb_epf_peer_db_set().
  - Fix db_valid_mask to cover only real doorbell bits.
  - Report 0-based db_vector to ntb_db_event() (accounting for the
    unused slot).
  - Keep db_val as a bitmask and fix db_read/db_clear semantics
    accordingly.
  - Implement db_vector_count()/db_vector_mask().
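To make the mask and vector items above more concrete, here is a small
standalone sketch of how they could fit together. It assumes one
doorbell bit per vector and a hypothetical db_count parameter. The
helper names mirror the NTB callbacks but use simplified signatures,
and struct db_state is invented for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Only real doorbell bits: bits 0..db_count-1, no link/reserved slots. */
static inline uint64_t db_valid_mask(unsigned int db_count)
{
	assert(db_count <= 64);
	return db_count == 64 ? UINT64_MAX : (UINT64_C(1) << db_count) - 1;
}

/* Assumption: one interrupt vector per doorbell bit. */
static inline int db_vector_count(unsigned int db_count)
{
	return (int)db_count;
}

/* Bits serviced by a 0-based vector; out-of-range vectors map to none. */
static inline uint64_t db_vector_mask(unsigned int db_count, int vector)
{
	if (vector < 0 || (unsigned int)vector >= db_count)
		return 0;
	return UINT64_C(1) << vector;
}

/* Pending doorbells kept as a bitmask rather than a single vector number. */
struct db_state {
	uint64_t db_val;
};

/* Interrupt side: accumulate the bit that belongs to this vector. */
static inline void db_event(struct db_state *s, unsigned int db_count,
			    int vector)
{
	s->db_val |= db_vector_mask(db_count, vector);
}

static inline uint64_t db_read(const struct db_state *s)
{
	return s->db_val;
}

/* Clear only the requested bits; other pending doorbells survive. */
static inline void db_clear(struct db_state *s, uint64_t db_bits)
{
	s->db_val &= ~db_bits;
}
```

Keeping db_val as a bitmask means two doorbells arriving back to back
are both visible to db_read(), and clearing one does not lose the other.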
Compatibility
=============
By keeping the legacy offset intact, this series aims to remain
compatible across mixed kernel versions. The observable changes are
limited to correct mask/vector reporting and safer execution context
handling.
Patches 1-5 (PCI Endpoint) and 6-10 (NTB) are independent and can be
applied separately through the respective trees. They are sent together
in this v3 for convenience.
Once the remaining acks from NTB maintainers are collected, the plan is
to take the whole series through the PCI EP tree. See:
https://lore.kernel.org/linux-pci/rnzsnp5de4qf5w7smebkmqekpuaqckltx73rj6ha3q2nrby5yp@7hsgvdzvjkp6/
---
Changelog
=========
Changes since v2:
  - No functional changes.
  - Rebased onto current pci/endpoint
    e022f0c72c7f ("selftests: pci_endpoint: Skip reserved BARs").
    * Patch 2 needed a trivial context-only adjustment while rebasing, due
      to commit d799984233a5 ("PCI: endpoint: pci-epf-vntb: Stop
      cmd_handler work in epf_ntb_epc_cleanup").
  - Picked up additional Reviewed-by tags from Frank.
  - Fixed the incorrect v2 series title.

Changes since v1:
  - Addressed feedback from Dave (added a source code comment, introduced
    an enum to eliminate magic numbers).
  - Updated source code comment in Patch 2.
  - No functional changes, so retained Reviewed-by tags from Frank and Dave.
    Thank you both for the review.

v2: https://lore.kernel.org/linux-pci/20260227084955.3184017-1-den@valinux.co.jp/
v1: https://lore.kernel.org/linux-pci/20260224133459.1741537-1-den@valinux.co.jp/
Best regards,
Koichiro
Koichiro Den (10):
  PCI: endpoint: pci-epf-vntb: Document legacy MSI doorbell offset
  PCI: endpoint: pci-epf-vntb: Defer pci_epc_raise_irq() out of atomic
    context
  PCI: endpoint: pci-epf-vntb: Report 0-based doorbell vector via
    ntb_db_event()
  PCI: endpoint: pci-epf-vntb: Exclude reserved slots from db_valid_mask
  PCI: endpoint: pci-epf-vntb: Implement db_vector_count/mask for
    doorbells
  NTB: epf: Document legacy doorbell slot offset in
    ntb_epf_peer_db_set()
  NTB: epf: Make db_valid_mask cover only real doorbell bits
  NTB: epf: Report 0-based doorbell vector via ntb_db_event()
  NTB: epf: Fix doorbell bitmask handling in db_read/db_clear
  NTB: epf: Implement db_vector_count/mask for doorbells

 drivers/ntb/hw/epf/ntb_hw_epf.c               |  89 ++++++++++-
 drivers/pci/endpoint/functions/pci-epf-vntb.c | 147 +++++++++++++++---
 2 files changed, 210 insertions(+), 26 deletions(-)
--
2.51.0
On Mon, Mar 23, 2026 at 12:15:34PM +0900, Koichiro Den wrote:
> This series fixes doorbell bit/vector handling for the EPF-based NTB
> pair (ntb_hw_epf <-> pci-epf-*ntb). Its primary goal is to enable safe
> per-db-vector handling in the NTB core and clients (e.g. ntb_transport),
> without changing the on-the-wire doorbell mapping.
>
>
> Background / problem
> ====================
>
> ntb_hw_epf historically applies an extra offset when ringing peer
> doorbells: the link event uses the first interrupt slot, and doorbells
> start from the third slot (i.e. a second slot is effectively unused).
> pci-epf-vntb carries the matching offset on the EP side as well.
>
> As long as db_vector_count()/db_vector_mask() are not implemented, this
> mismatch is mostly masked. Doorbell events are effectively treated as
> "can hit any QP" and the off-by-one vector numbering does not surface
> clearly.
>
> However, once per-vector handling is enabled, the current state becomes
> problematic:
>
> - db_valid_mask exposes bits that do not correspond to real doorbells
> (link/unused slots leak into the mask).
> - ntb_db_event() is fed with 1-based/shifted vectors, while NTB core
> expects a 0-based db_vector for doorbells.
> - On pci-epf-vntb, .peer_db_set() may be called in atomic context, but
> it directly calls pci_epc_raise_irq(), which can sleep.
>
>
> Why NOT fix the root offset?
> ============================
>
> The natural "root" fix would be to remove the historical extra offset in
> the peer_db_set() doorbell paths for ntb_hw_epf and pci-epf-vntb.
> Unfortunately this would lead to interoperability issues when mixing old
> and new kernel versions (old/new peers). A new side would ring a
> different interrupt slot than what an old peer expects, leading to
> missed or misrouted doorbells, once db_vector_count()/db_vector_mask()
> are implemented.
>
> Therefore this series intentionally keeps the legacy offset, and instead
> fixes the surrounding pieces so the mapping is documented and handled
> consistently in masks, vector numbering, and per-vector reporting.
>
>
> What this series does
> =====================
>
> - pci-epf-vntb:
>
> - Document the legacy offset.
> - Defer MSI doorbell raises to process context to avoid sleeping in
> atomic context. This becomes relevant once multiple doorbells are
> raised concurrently at a high rate.
> - Report doorbell vectors as 0-based to ntb_db_event().
> - Fix db_valid_mask and implement db_vector_count()/db_vector_mask().
>
> - ntb_hw_epf:
>
> - Document the legacy offset in ntb_epf_peer_db_set().
> - Fix db_valid_mask to cover only real doorbell bits.
> - Report 0-based db_vector to ntb_db_event() (accounting for the
> unused slot).
> - Keep db_val as a bitmask and fix db_read/db_clear semantics
> accordingly.
> - Implement db_vector_count()/db_vector_mask().
>
>
> Compatibility
> =============
>
> By keeping the legacy offset intact, this series aims to remain
> compatible across mixed kernel versions. The observable changes are
> limited to correct mask/vector reporting and safer execution context
> handling.
>
> Patches 1-5 (PCI Endpoint) and 6-10 (NTB) are independent and can be
> applied separately through the respective trees. They are sent together
> in this v3 for convenience.
>
> Once the remaining acks from NTB maintainers are collected, the plan is
Hi Dave,
When you have a chance, I'd appreciate another look at patches 6-10 (which are
unchanged since v2). If you do not see any blockers, an Acked-by would be
greatly appreciated.
P.S. Regarding Sashiko's feedback [1], my understanding is that there are no
blockers, but there are a few points that would be better addressed separately
as orthogonal follow-ups:
- configfs knobs mutability and bounds checking, including (but not limited to)
db_count.
In my opinion, allowing updates after .bind() looks questionable, and
returning -EBUSY once bound seems more appropriate. I'm leaning toward
handling this as a separate hardening series.
- ntb_hw_epf IRQ unwind concern.
This is what I was trying to address in [2], which I hope will land soon.
- Other lifecycle concerns.
These are largely tied to the current vNTB implementation and were part of
[3], for which I still plan to post a follow-up series that adds a .remove()
implementation to vntb_pci_driver.
[1] https://sashiko.dev/#/patchset/20260323031544.2598111-1-den%40valinux.co.jp
[2] https://lore.kernel.org/ntb/20260304083028.1391068-1-den@valinux.co.jp/
[3] https://lore.kernel.org/all/20260226084142.2226875-1-den@valinux.co.jp/
Best regards,
Koichiro
> to take the whole series through the PCI EP tree. See:
> https://lore.kernel.org/linux-pci/rnzsnp5de4qf5w7smebkmqekpuaqckltx73rj6ha3q2nrby5yp@7hsgvdzvjkp6/
>
> ---
> Changelog
> =========
>
> Changes since v2:
> - No functional changes.
> - Rebased onto current pci/endpoint
> e022f0c72c7f ("selftests: pci_endpoint: Skip reserved BARs").
> * Patch 2 needed a trivial context-only adjustment while rebasing, due
> to commit d799984233a5 ("PCI: endpoint: pci-epf-vntb: Stop
> cmd_handler work in epf_ntb_epc_cleanup").
> - Picked up additional Reviewed-by tags from Frank.
> - Fixed the incorrect v2 series title.
>
> Changes since v1:
> - Addressed feedback from Dave (add a source code comment, introduce
> enum to eliminate magic numbers)
> - Updated source code comment in Patch 2.
> - No functional changes, so retained Reviewed-by tags by Frank and Dave.
> Thank you both for the review.
>
> v2: https://lore.kernel.org/linux-pci/20260227084955.3184017-1-den@valinux.co.jp/
> v1: https://lore.kernel.org/linux-pci/20260224133459.1741537-1-den@valinux.co.jp/
>
>
> Best regards,
> Koichiro
>
>
Hello Koichiro,
On Tue, Mar 24, 2026 at 12:43:53AM +0900, Koichiro Den wrote:
>
> - configfs knobs mutability and bounds checking, including (but not limited to)
> db_count.
> In my opinion, allowing updates after .bind() looks questionable, and
> returning -EBUSY once bound seems more appropriate. I'm leaning toward
> handling this as a separate hardening series.
This is in line with what we did for pci-epf-test, see commit:
ffcc4850a161 ("PCI: endpoint: pci-epf-test: Allow overriding default BAR sizes")
but we return EOPNOTSUPP instead of EBUSY.
Kind regards,
Niklas
On Wed, Mar 25, 2026 at 07:23:37AM +0100, Niklas Cassel wrote:
> Hello Koichiro,
>
> On Tue, Mar 24, 2026 at 12:43:53AM +0900, Koichiro Den wrote:
> >
> > - configfs knobs mutability and bounds checking, including (but not limited to)
> > db_count.
> > In my opinion, allowing updates after .bind() looks questionable, and
> > returning -EBUSY once bound seems more appropriate. I'm leaning toward
> > handling this as a separate hardening series.
>
> This is in line with what we did for pci-epf-test, see commit:
> ffcc4850a161 ("PCI: endpoint: pci-epf-test: Allow overriding default BAR sizes")
>
> but we return EOPNOTSUPP instead of EBUSY.
Yes, I remember you were discussing this with Mani in another thread. For
consistency, we should use -EOPNOTSUPP here as well. Thanks for the reminder.
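For reference, the behaviour discussed here (reject configfs updates
once the function is bound, plus bounds checking) could look roughly
like the following standalone sketch. The struct, the function name and
the VNTB_MAX_DB_COUNT limit are all hypothetical; a real configfs store
callback has a different signature and parses its value from a string.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define VNTB_MAX_DB_COUNT 32u	/* hypothetical upper bound for this sketch */

/* Hypothetical, simplified view of the configurable state. */
struct vntb_cfg {
	bool bound;		/* set once .bind() has completed */
	uint32_t db_count;
};

/* Simplified store handler: returns 0 or a negative errno value. */
static inline int vntb_db_count_store(struct vntb_cfg *cfg, uint32_t val)
{
	assert(cfg);

	/* Changing the value after bind is refused, as in pci-epf-test. */
	if (cfg->bound)
		return -EOPNOTSUPP;

	/* Bounds check before accepting the new value. */
	if (val == 0 || val > VNTB_MAX_DB_COUNT)
		return -EINVAL;

	cfg->db_count = val;
	return 0;
}
```

The bound check comes first so that a bound function reports
-EOPNOTSUPP regardless of the value written.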
Best regards,
Koichiro