This series fixes doorbell bit/vector handling for the EPF-based NTB
pair (ntb_hw_epf <-> pci-epf-*ntb). Its primary goal is to enable safe
per-db-vector handling in the NTB core and clients (e.g. ntb_transport),
without changing the on-the-wire doorbell mapping.
Background / problem
====================
ntb_hw_epf historically applies an extra offset when ringing peer
doorbells: the link event uses the first interrupt slot, and doorbells
start from the third slot (i.e. a second slot is effectively unused).
pci-epf-vntb carries the matching offset on the EP side as well.
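The slot layout described above can be sketched as a pair of helpers. This is
only an illustration in plain C, not code from the series: the enum and
function names are hypothetical, and the layout (link = slot 0, slot 1 unused,
doorbells from slot 2) is taken from the description in this letter.

```c
#include <assert.h>

/* Hypothetical names; the slot layout follows the description above. */
enum epf_irq_slot {
	EPF_SLOT_LINK = 0,	/* link event */
	EPF_SLOT_UNUSED = 1,	/* historically skipped */
	EPF_SLOT_DB_BASE = 2,	/* first real doorbell */
};

/* Interrupt slot rung for a 0-based doorbell bit (legacy offset kept). */
static int db_bit_to_slot(int db_bit)
{
	assert(db_bit >= 0);
	return db_bit + EPF_SLOT_DB_BASE;
}

/* 0-based doorbell vector for a received slot, or -1 if not a doorbell. */
static int slot_to_db_vector(int slot)
{
	if (slot < EPF_SLOT_DB_BASE)
		return -1;
	return slot - EPF_SLOT_DB_BASE;
}
```

With these helpers, doorbell bit 0 maps to slot 2 on the sending side, and the
receiving side turns slot 2 back into vector 0, so the two directions stay
inverse to each other while the on-the-wire slot numbers are unchanged.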
As long as db_vector_count()/db_vector_mask() are not implemented, this
mismatch is mostly masked. Doorbell events are effectively treated as
"can hit any QP" and the off-by-one vector numbering does not surface
clearly.
However, once per-vector handling is enabled, the current state becomes
problematic:
- db_valid_mask exposes bits that do not correspond to real doorbells
(link/unused slots leak into the mask).
- ntb_db_event() is fed with 1-based/shifted vectors, while NTB core
expects a 0-based db_vector for doorbells.
- On pci-epf-vntb, .peer_db_set() may be called in atomic context, but
it directly calls pci_epc_raise_irq(), which can sleep.
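For the last item, the general pattern is to record the request in the atomic
path and perform the possibly-sleeping raise later from process context. The
following is a minimal userspace sketch of that pattern using C11 atomics. All
names are hypothetical, raise_irq_may_sleep() merely stands in for a raise
operation that can sleep, and the coalescing of repeated requests is a
property of this sketch, not a statement about the actual patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define MAX_SLOTS 32

/* Hypothetical state: one pending bit per interrupt slot. */
struct db_deferral {
	atomic_uint_fast32_t pending;
	int raised[MAX_SLOTS];	/* slots raised so far, in order */
	int nr_raised;
};

/* Stand-in for a raise operation that may sleep. */
static void raise_irq_may_sleep(struct db_deferral *d, int slot)
{
	assert(d->nr_raised < MAX_SLOTS);
	d->raised[d->nr_raised++] = slot;
}

/* Safe from atomic context: only records the request. */
static void peer_db_set_atomic(struct db_deferral *d, int slot)
{
	assert(slot >= 0 && slot < MAX_SLOTS);
	atomic_fetch_or(&d->pending, (uint_fast32_t)1 << slot);
	/* A real driver would schedule its worker here. */
}

/* Runs in process context: drain everything requested so far. */
static void db_worker(struct db_deferral *d)
{
	uint_fast32_t bits = atomic_exchange(&d->pending, 0);

	for (int slot = 0; slot < MAX_SLOTS; slot++)
		if (bits & ((uint_fast32_t)1 << slot))
			raise_irq_may_sleep(d, slot);
}
```

In this sketch, two requests for the same slot that arrive before the worker
runs result in a single raise, which matches doorbell semantics where a bit is
either pending or not.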
Why NOT fix the root offset?
============================
The natural "root" fix would be to remove the historical extra offset in
the peer_db_set() doorbell paths for ntb_hw_epf and pci-epf-vntb.
Unfortunately, this would break interoperability when mixing old and new
kernel versions (old/new peers): a new side would ring a different
interrupt slot than an old peer expects, so doorbells would be missed or
misrouted once db_vector_count()/db_vector_mask() are implemented.
Therefore this series intentionally keeps the legacy offset, and instead
fixes the surrounding pieces so the mapping is documented and handled
consistently in masks, vector numbering, and per-vector reporting.
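To make the mixed-version concern concrete, here is a small sketch of what an
old receiver would conclude from a sender that dropped the unused slot. It is
purely illustrative: compact_tx_slot() is a hypothetical "root fix" sender
that this series does NOT implement, and treating slot 1 as ignored on the old
receiver is an assumption based on the "effectively unused" wording above.

```c
#include <assert.h>

#define LEGACY_DB_SLOT_BASE 2	/* link = slot 0, slot 1 unused */

/* What an old (legacy) receiver concludes from an incoming slot. */
enum rx_kind { RX_LINK_EVENT, RX_IGNORED, RX_DOORBELL };

static enum rx_kind legacy_rx_classify(int slot, int *db_bit)
{
	assert(slot >= 0);
	if (slot == 0)
		return RX_LINK_EVENT;
	if (slot < LEGACY_DB_SLOT_BASE)
		return RX_IGNORED;	/* assumption: unused slot is dropped */
	*db_bit = slot - LEGACY_DB_SLOT_BASE;
	return RX_DOORBELL;
}

/* Sender that keeps the legacy offset (what this series does). */
static int legacy_tx_slot(int db_bit)
{
	return db_bit + LEGACY_DB_SLOT_BASE;
}

/* Hypothetical sender with the unused slot removed (not in this series). */
static int compact_tx_slot(int db_bit)
{
	return db_bit + 1;
}
```

Under these assumptions, doorbell bit 0 from the compact sender lands in the
unused slot and is lost, and bit 3 is seen by the old receiver as bit 2,
whereas the legacy sender round-trips every bit unchanged.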
What this series does
=====================
- pci-epf-vntb:
- Document the legacy offset.
- Defer MSI doorbell raises to process context to avoid sleeping in
atomic context. This becomes relevant once multiple doorbells are
raised concurrently at a high rate.
- Report doorbell vectors as 0-based to ntb_db_event().
- Fix db_valid_mask and implement db_vector_count()/db_vector_mask().
- ntb_hw_epf:
- Document the legacy offset in ntb_epf_peer_db_set().
- Fix db_valid_mask to cover only real doorbell bits.
- Report 0-based db_vector to ntb_db_event() (accounting for the
unused slot).
- Keep db_val as a bitmask and fix db_read/db_clear semantics
accordingly.
- Implement db_vector_count()/db_vector_mask().
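The mask and vector items above can be illustrated with a short sketch. It
assumes that nr_slots counts every interrupt slot, including the link slot and
the unused slot, and that each doorbell vector maps to exactly one doorbell
bit. The function names mirror the NTB callbacks for readability only; the
signatures here are simplified and hypothetical, not the driver's.

```c
#include <assert.h>
#include <stdint.h>

#define DB_SLOT_BASE 2	/* slots 0 (link) and 1 (unused) are not doorbells */

/* Number of real doorbell vectors among nr_slots interrupt slots. */
static unsigned int db_vector_count(unsigned int nr_slots)
{
	return nr_slots > DB_SLOT_BASE ? nr_slots - DB_SLOT_BASE : 0;
}

/* Mask covering only real doorbell bits, starting at bit 0. */
static uint64_t db_valid_mask(unsigned int nr_slots)
{
	assert(nr_slots <= 64);
	return (UINT64_C(1) << db_vector_count(nr_slots)) - 1;
}

/* One doorbell bit per vector; 0 for an out-of-range vector. */
static uint64_t db_vector_mask(unsigned int nr_slots, unsigned int vector)
{
	if (vector >= db_vector_count(nr_slots))
		return 0;
	return UINT64_C(1) << vector;
}

/* db_val kept as a bitmask: clearing removes only the requested bits. */
static uint64_t db_clear(uint64_t db_val, uint64_t bits)
{
	return db_val & ~bits;
}
```

For example, with six interrupt slots in total there are four doorbell
vectors, the valid mask is 0xF, and clearing bit 1 from a pending value of 0xB
leaves 0x9 rather than wiping the whole value.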
Compatibility
=============
By keeping the legacy offset intact, this series aims to remain
compatible across mixed kernel versions. The observable changes are
limited to correct mask/vector reporting and safer execution context
handling.
Patches 1-5 (PCI Endpoint) and 6-10 (NTB) are independent and can be
applied separately for each tree. I am sending them together in this
series to provide the full context and to make the cross-subsystem
compatibility constraints explicit. Ideally the whole series would be
applied in a single tree, but each subset is safe to merge on its own.
- Patches 1-5 apply cleanly onto the latest pci/endpoint:
f6797680fe31 ("PCI: epf-mhi: Return 0 on success instead of positive
jiffies from pci_epf_mhi_edma_{read/write}")
- Patches 6-10 apply cleanly onto the latest ntb-next:
7b3302c687ca ("ntb_hw_amd: Fix incorrect debug message in link disable
path")
Note: I don't have suitable hardware to test the ntb_hw_epf + pci-epf-ntb
(not vNTB) bridge scenario, but I believe no changes are needed in
pci-epf-ntb.c.
Changelog
=========
Changes since v1:
- Addressed feedback from Dave (add a source code comment, introduce
enum to eliminate magic numbers)
- Updated source code comment in Patch 2.
- No functional changes, so retained the Reviewed-by tags from Frank and Dave.
Thank you both for the review.
Best regards,
Koichiro Den (10):
PCI: endpoint: pci-epf-vntb: Document legacy MSI doorbell offset
PCI: endpoint: pci-epf-vntb: Defer pci_epc_raise_irq() out of atomic
context
PCI: endpoint: pci-epf-vntb: Report 0-based doorbell vector via
ntb_db_event()
PCI: endpoint: pci-epf-vntb: Exclude reserved slots from db_valid_mask
PCI: endpoint: pci-epf-vntb: Implement db_vector_count/mask for
doorbells
NTB: epf: Document legacy doorbell slot offset in
ntb_epf_peer_db_set()
NTB: epf: Make db_valid_mask cover only real doorbell bits
NTB: epf: Report 0-based doorbell vector via ntb_db_event()
NTB: epf: Fix doorbell bitmask handling in db_read/db_clear
NTB: epf: Implement db_vector_count/mask for doorbells
drivers/ntb/hw/epf/ntb_hw_epf.c | 89 ++++++++++-
drivers/pci/endpoint/functions/pci-epf-vntb.c | 147 +++++++++++++++---
2 files changed, 210 insertions(+), 26 deletions(-)
--
2.51.0
On Fri, Feb 27, 2026 at 05:49:45PM +0900, Koichiro Den wrote:
Sorry, I accidentally used an incorrect series title.
The correct subject should be:
[PATCH v2 00/10] NTB: epf: Enable per-doorbell bit handling while keeping legacy offset
For reference, v1 is:
https://lore.kernel.org/linux-pci/20260224133459.1741537-1-den@valinux.co.jp/
Best regards,
Koichiro
On Fri, Feb 27, 2026 at 05:57:40PM +0900, Koichiro Den wrote:
Hi Mani (cc: Jon, Dave),
This series has been sitting for a while, so I'd like to check how to proceed.
I'm thinking of the following approach:
- get the remaining acks from the NTB side
(Dave already gave Reviewed-by for Patch 6/10)
- then route the whole series via the PCI EP tree
Does that sound reasonable?
If so, I can prepare a v3 rebased onto the latest pci/endpoint.
Best regards,
Koichiro
On Fri, Mar 20, 2026 at 11:55:23PM +0900, Koichiro Den wrote:
> Hi Mani (cc: Jon, Dave),
>
> This series has been sitting for a while, so I'd like to check how to proceed.
>
> I'm thinking of the following approach:
>
> - get the remaining acks from the NTB side
> (Dave already gave Reviewed-by for Patch 6/10)
> - then route the whole series via the PCI EP tree
>
> Does that sound reasonable?
>
> If so, I can prepare a v3 rebased onto the latest pci/endpoint.
>
Yes please. We have queued a lot of endpoint patches for v7.1, so it makes sense
to route this series as a whole through the PCI tree.
Also as you noted, all the NTB patches need an Ack from Dave.
- Mani
--
மணிவண்ணன் சதாசிவம்
On Sat, Mar 21, 2026 at 07:08:36PM +0530, Manivannan Sadhasivam wrote:
> Yes please. We have queued a lot of endpoint patches for v7.1, so it makes sense
> to route this series as a whole through PCI tree.
Sounds good, I'll send the next revision (v3) with that in mind.
>
> Also as you noted, all the NTB patches need an Ack from Dave.
I'll reach out to Dave once I send v3.
Best regards,
Koichiro
On Mon, Mar 23, 2026 at 10:29:00AM +0900, Koichiro Den wrote:
> On Sat, Mar 21, 2026 at 07:08:36PM +0530, Manivannan Sadhasivam wrote:
> > On Fri, Mar 20, 2026 at 11:55:23PM +0900, Koichiro Den wrote:
> > > On Fri, Feb 27, 2026 at 05:57:40PM +0900, Koichiro Den wrote:
> > > > On Fri, Feb 27, 2026 at 05:49:45PM +0900, Koichiro Den wrote:
> > > > > This series fixes doorbell bit/vector handling for the EPF-based NTB
> > > > > pair (ntb_hw_epf <-> pci-epf-*ntb). Its primary goal is to enable safe
> > > > > per-db-vector handling in the NTB core and clients (e.g. ntb_transport),
> > > > > without changing the on-the-wire doorbell mapping.
> > > > >
> > > > >
> > > > > Background / problem
> > > > > ====================
> > > > >
> > > > > ntb_hw_epf historically applies an extra offset when ringing peer
> > > > > doorbells: the link event uses the first interrupt slot, and doorbells
> > > > > start from the third slot (i.e. a second slot is effectively unused).
> > > > > pci-epf-vntb carries the matching offset on the EP side as well.
> > > > >
> > > > > As long as db_vector_count()/db_vector_mask() are not implemented, this
> > > > > mismatch is mostly masked. Doorbell events are effectively treated as
> > > > > "can hit any QP" and the off-by-one vector numbering does not surface
> > > > > clearly.
> > > > >
> > > > > However, once per-vector handling is enabled, the current state becomes
> > > > > problematic:
> > > > >
> > > > > - db_valid_mask exposes bits that do not correspond to real doorbells
> > > > > (link/unused slots leak into the mask).
> > > > > - ntb_db_event() is fed with 1-based/shifted vectors, while NTB core
> > > > > expects a 0-based db_vector for doorbells.
> > > > > - On pci-epf-vntb, .peer_db_set() may be called in atomic context, but
> > > > > it directly calls pci_epc_raise_irq(), which can sleep.
> > > > >
> > > > >
> > > > > Why NOT fix the root offset?
> > > > > ============================
> > > > >
> > > > > The natural "root" fix would be to remove the historical extra offset in
> > > > > the peer_db_set() doorbell paths for ntb_hw_epf and pci-epf-vntb.
> > > > > Unfortunately this would lead to interoperability issues when mixing old
> > > > > and new kernel versions (old/new peers). A new side would ring a
> > > > > different interrupt slot than what an old peer expects, leading to
> > > > > missed or misrouted doorbells, once db_vector_count()/db_vector_mask()
> > > > > are implemented.
> > > > >
> > > > > Therefore this series intentionally keeps the legacy offset, and instead
> > > > > fixes the surrounding pieces so the mapping is documented and handled
> > > > > consistently in masks, vector numbering, and per-vector reporting.
> > > > >
> > > > >
> > > > > What this series does
> > > > > =====================
> > > > >
> > > > > - pci-epf-vntb:
> > > > >
> > > > > - Document the legacy offset.
> > > > > - Defer MSI doorbell raises to process context to avoid sleeping in
> > > > > atomic context. This becomes relevant once multiple doorbells are
> > > > > raised concurrently at a high rate.
> > > > > - Report doorbell vectors as 0-based to ntb_db_event().
> > > > > - Fix db_valid_mask and implement db_vector_count()/db_vector_mask().
> > > > >
> > > > > - ntb_hw_epf:
> > > > >
> > > > > - Document the legacy offset in ntb_epf_peer_db_set().
> > > > > - Fix db_valid_mask to cover only real doorbell bits.
> > > > > - Report 0-based db_vector to ntb_db_event() (accounting for the
> > > > > unused slot).
> > > > > - Keep db_val as a bitmask and fix db_read/db_clear semantics
> > > > > accordingly.
> > > > > - Implement db_vector_count()/db_vector_mask().
> > > > >
> > > > >
> > > > > Compatibility
> > > > > =============
> > > > >
> > > > > By keeping the legacy offset intact, this series aims to remain
> > > > > compatible across mixed kernel versions. The observable changes are
> > > > > limited to correct mask/vector reporting and safer execution context
> > > > > handling.
> > > > >
> > > > > Patches 1-5 (PCI Endpoint) and 6-10 (NTB) are independent and can be
> > > > > applied separately for each tree. I am sending them together in this
> > > > > series to provide the full context and to make the cross-subsystem
> > > > > compatibility constraints explicit. Ideally the whole series would be
> > > > > applied in a single tree, but each subset is safe to merge on its own.
> > > > >
> > > > > - Patch 1-5 can apply cleanly onto pci/endpoint latest:
> > > > > f6797680fe31 ("PCI: epf-mhi: Return 0 on success instead of positive
> > > > > jiffies from pci_epf_mhi_edma_{read/write}")
> > > > >
> > > > > - Patch 6-10 can apply cleanly onto ntb-next latest:
> > > > > 7b3302c687ca ("ntb_hw_amd: Fix incorrect debug message in link disable
> > > > > path")
> > > > >
> > > > > Note: I don't have suitable hardware to test the ntb_hw_epf +
> > > > > pci-epf-ntb (not vNTB) bridge scenario, but I believe no changes are
> > > > > needed in pci-epf-ntb.c.
> > > > >
> > > > >
> > > > > Changelog
> > > > > =========
> > > > >
> > > > > Changes since v1:
> > > > > - Addressed feedback from Dave (add a source code comment, introduce
> > > > > enum to eliminate magic numbers)
> > > > > - Updated source code comment in Patch 2.
> > > > > - No functional changes, so retained Reviewed-by tags by Frank and Dave.
> > > > > Thank you both for the review.
> > > >
> > > > Sorry, I accidentally used an incorrect series title.
> > > > The correct subject should be:
> > > >
> > > > [PATCH v2 00/10] NTB: epf: Enable per-doorbell bit handling while keeping legacy offset
> > > >
> > > > For reference, v1 is:
> > > > https://lore.kernel.org/linux-pci/20260224133459.1741537-1-den@valinux.co.jp/
> > > >
> > > > Best regards,
> > > > Koichiro
> > >
> > > Hi Mani (cc: Jon, Dave),
> > >
> > > This series has been sitting for a while, so I'd like to check how to proceed.
> > >
> > > I'm thinking of the following approach:
> > >
> > > - get the remaining acks from the NTB side
> > > (Dave already gave Reviewed-by for Patch 6/10)
> > > - then route the whole series via the PCI EP tree
> > >
> > > Does that sound reasonable?
> > >
> > > If so, I can prepare a v3 rebased onto the latest pci/endpoint.
> > >
> >
> > Yes please. We have queued a lot of endpoint patches for v7.1, so it makes sense
> > to route this series as a whole through PCI tree.
>
> Sounds good, I'll send the next revision (v11) with that in mind.
Sorry, typo (v11 -> v3).
Koichiro
>
> >
> > Also as you noted, all the NTB patches need an Ack from Dave.
>
> I'll reach out to Dave once I send v11.
>
> Best regards,
> Koichiro
>
> >
> > - Mani
> >
> > --
> > மணிவண்ணன் சதாசிவம்
On Fri, Mar 20, 2026 at 11:55:24PM +0900, Koichiro Den wrote:
> On Fri, Feb 27, 2026 at 05:57:40PM +0900, Koichiro Den wrote:
> > On Fri, Feb 27, 2026 at 05:49:45PM +0900, Koichiro Den wrote:
> > > This series fixes doorbell bit/vector handling for the EPF-based NTB
> > > pair (ntb_hw_epf <-> pci-epf-*ntb). Its primary goal is to enable safe
> > > per-db-vector handling in the NTB core and clients (e.g. ntb_transport),
> > > without changing the on-the-wire doorbell mapping.
> > >
> > >
> > > Background / problem
> > > ====================
> > >
> > > ntb_hw_epf historically applies an extra offset when ringing peer
> > > doorbells: the link event uses the first interrupt slot, and doorbells
> > > start from the third slot (i.e. a second slot is effectively unused).
> > > pci-epf-vntb carries the matching offset on the EP side as well.
> > >
> > > As long as db_vector_count()/db_vector_mask() are not implemented, this
> > > mismatch is mostly masked. Doorbell events are effectively treated as
> > > "can hit any QP" and the off-by-one vector numbering does not surface
> > > clearly.
> > >
> > > However, once per-vector handling is enabled, the current state becomes
> > > problematic:
> > >
> > > - db_valid_mask exposes bits that do not correspond to real doorbells
> > > (link/unused slots leak into the mask).
> > > - ntb_db_event() is fed with 1-based/shifted vectors, while NTB core
> > > expects a 0-based db_vector for doorbells.
> > > - On pci-epf-vntb, .peer_db_set() may be called in atomic context, but
> > > it directly calls pci_epc_raise_irq(), which can sleep.
> > >
> > >
> > > Why NOT fix the root offset?
> > > ============================
> > >
> > > The natural "root" fix would be to remove the historical extra offset in
> > > the peer_db_set() doorbell paths for ntb_hw_epf and pci-epf-vntb.
> > > Unfortunately this would lead to interoperability issues when mixing old
> > > and new kernel versions (old/new peers). A new side would ring a
> > > different interrupt slot than what an old peer expects, leading to
> > > missed or misrouted doorbells, once db_vector_count()/db_vector_mask()
> > > are implemented.
> > >
> > > Therefore this series intentionally keeps the legacy offset, and instead
> > > fixes the surrounding pieces so the mapping is documented and handled
> > > consistently in masks, vector numbering, and per-vector reporting.
> > >
> > >
> > > What this series does
> > > =====================
> > >
> > > - pci-epf-vntb:
> > >
> > > - Document the legacy offset.
> > > - Defer MSI doorbell raises to process context to avoid sleeping in
> > > atomic context. This becomes relevant once multiple doorbells are
> > > raised concurrently at a high rate.
> > > - Report doorbell vectors as 0-based to ntb_db_event().
> > > - Fix db_valid_mask and implement db_vector_count()/db_vector_mask().
> > >
> > > - ntb_hw_epf:
> > >
> > > - Document the legacy offset in ntb_epf_peer_db_set().
> > > - Fix db_valid_mask to cover only real doorbell bits.
> > > - Report 0-based db_vector to ntb_db_event() (accounting for the
> > > unused slot).
> > > - Keep db_val as a bitmask and fix db_read/db_clear semantics
> > > accordingly.
> > > - Implement db_vector_count()/db_vector_mask().
> > >
> > >
> > > Compatibility
> > > =============
> > >
> > > By keeping the legacy offset intact, this series aims to remain
> > > compatible across mixed kernel versions. The observable changes are
> > > limited to correct mask/vector reporting and safer execution context
> > > handling.
> > >
> > > Patches 1-5 (PCI Endpoint) and 6-10 (NTB) are independent and can be
> > > applied separately for each tree. I am sending them together in this
> > > series to provide the full context and to make the cross-subsystem
> > > compatibility constraints explicit. Ideally the whole series would be
> > > applied in a single tree, but each subset is safe to merge on its own.
> > >
> > > - Patch 1-5 can apply cleanly onto pci/endpoint latest:
> > > f6797680fe31 ("PCI: epf-mhi: Return 0 on success instead of positive
> > > jiffies from pci_epf_mhi_edma_{read/write}")
> > >
> > > - Patch 6-10 can apply cleanly onto ntb-next latest:
> > > 7b3302c687ca ("ntb_hw_amd: Fix incorrect debug message in link disable
> > > path")
> > >
> > > > Note: I don't have suitable hardware to test the ntb_hw_epf +
> > > > pci-epf-ntb (not vNTB) bridge scenario, but I believe no changes are
> > > > needed in pci-epf-ntb.c.
> > >
> > >
> > > Changelog
> > > =========
> > >
> > > Changes since v1:
> > > - Addressed feedback from Dave (add a source code comment, introduce
> > > enum to eliminate magic numbers)
> > > - Updated source code comment in Patch 2.
> > > - No functional changes, so retained Reviewed-by tags by Frank and Dave.
> > > Thank you both for the review.
> >
> > Sorry, I accidentally used an incorrect series title.
> > The correct subject should be:
> >
> > [PATCH v2 00/10] NTB: epf: Enable per-doorbell bit handling while keeping legacy offset
> >
> > For reference, v1 is:
> > https://lore.kernel.org/linux-pci/20260224133459.1741537-1-den@valinux.co.jp/
> >
> > Best regards,
> > Koichiro
>
> Hi Mani (cc: Jon, Dave),
>
> This series has been sitting for a while, so I'd like to check how to proceed.
>
> I'm thinking of the following approach:
>
> - get the remaining acks from the NTB side
> (Dave already gave Reviewed-by for Patch 6/10)
> - then route the whole series via the PCI EP tree
>
> Does that sound reasonable?
>
> If so, I can prepare a v3 rebased onto the latest pci/endpoint.
Just let me add one more point: in addition to fixing the issues, this series
also enables higher performance for ntb_transport.
Some results and context are described here:
https://lore.kernel.org/all/20260305155639.1885517-1-den@valinux.co.jp/
(Note: the ntb_netdev series has already landed in net-next)
Best regards,
Koichiro
>
> Best regards,
> Koichiro
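
For readers following the thread, the slot layout described in the quoted
cover letter (link event in the first interrupt slot, an unused second slot,
doorbells from the third slot onward) can be illustrated with a small
userspace sketch. This is not code from the series: every name below
(sketch_slot_to_db_vector() and friends) is hypothetical, slots are assumed
to be numbered from 0, and the 64-bit mask width is an assumption made only
for the example. It shows roughly how a 0-based db_vector and a db_valid_mask
that covers only real doorbells might be derived while the legacy offset
stays on the wire.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical slot layout, following the cover letter's description.
 * Slot numbers are assumed to start at 0.
 */
enum sketch_slot {
	SKETCH_SLOT_LINK    = 0, /* first slot: link event */
	SKETCH_SLOT_UNUSED  = 1, /* second slot: effectively unused */
	SKETCH_SLOT_DB_BASE = 2, /* third slot: first real doorbell */
};

/* Receive side: map an interrupt slot to a 0-based doorbell vector.
 * Returns -1 for the link slot, the unused slot, or an out-of-range slot. */
static int sketch_slot_to_db_vector(unsigned int slot, unsigned int db_count)
{
	if (slot < SKETCH_SLOT_DB_BASE || slot - SKETCH_SLOT_DB_BASE >= db_count)
		return -1;
	return (int)(slot - SKETCH_SLOT_DB_BASE);
}

/* Send side: map a 0-based doorbell bit to the slot that is rung.
 * The legacy offset is kept, so old and new peers agree on the slot. */
static unsigned int sketch_db_bit_to_slot(unsigned int db_bit)
{
	assert(db_bit < 64); /* assumed mask width for this sketch */
	return db_bit + SKETCH_SLOT_DB_BASE;
}

/* Mask with one bit per real doorbell; link/unused slots never appear. */
static uint64_t sketch_db_valid_mask(unsigned int db_count)
{
	if (db_count >= 64)
		return UINT64_MAX;
	return (UINT64_C(1) << db_count) - 1;
}

/* Per-vector mask: exactly one doorbell bit per vector, 0 if invalid. */
static uint64_t sketch_db_vector_mask(int db_vector, unsigned int db_count)
{
	if (db_vector < 0 || db_vector >= 64 ||
	    (unsigned int)db_vector >= db_count)
		return 0;
	return UINT64_C(1) << db_vector;
}
```

With four doorbells in this sketch, slots 0 and 1 map to no vector, slot 2
maps to db_vector 0, and the valid mask is 0xf. The point is only that the
offset is applied at the slot boundary in both directions, so everything the
NTB core and clients see stays 0-based.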