[PATCH v2 00/10] PCI: endpoint: pci-epf-vntb: Document legacy MSI doorbell offset

Koichiro Den posted 10 patches 1 month, 1 week ago
drivers/ntb/hw/epf/ntb_hw_epf.c               |  89 ++++++++++-
drivers/pci/endpoint/functions/pci-epf-vntb.c | 147 +++++++++++++++---
2 files changed, 210 insertions(+), 26 deletions(-)
Posted by Koichiro Den 1 month, 1 week ago
This series fixes doorbell bit/vector handling for the EPF-based NTB
pair (ntb_hw_epf <-> pci-epf-*ntb). Its primary goal is to enable safe
per-db-vector handling in the NTB core and clients (e.g. ntb_transport),
without changing the on-the-wire doorbell mapping.


Background / problem
====================

ntb_hw_epf historically applies an extra offset when ringing peer
doorbells: the link event uses the first interrupt slot, and doorbells
start from the third slot (i.e. the second slot is effectively unused).
pci-epf-vntb carries the matching offset on the EP side as well.
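
The slot layout described above can be sketched in plain C as follows. This
is only an illustration: the macro and function names are hypothetical and
are not taken from the drivers; the slot numbering (link event first, one
unused slot, doorbells from the third slot) follows the description above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; slot numbers are 0-based. */
#define EPF_IRQ_LINK     0u /* first slot: link event */
#define EPF_IRQ_RESERVED 1u /* second slot: effectively unused */
#define EPF_IRQ_DB_BASE  2u /* third slot: first real doorbell */

/* Map a 0-based doorbell bit to the interrupt slot rung on the peer. */
static inline unsigned int db_bit_to_slot(unsigned int db_bit)
{
	return EPF_IRQ_DB_BASE + db_bit;
}

/* Return the slot for the lowest doorbell bit set in db_bits. */
static inline unsigned int lowest_db_slot(uint64_t db_bits)
{
	unsigned int bit = 0;

	assert(db_bits != 0); /* caller must request at least one doorbell */
	while (!(db_bits & 1)) {
		db_bits >>= 1;
		bit++;
	}
	return db_bit_to_slot(bit);
}
```

With this layout, doorbell bit 0 lands on slot 2, so a receiver that reports
the raw slot number (or the slot number minus one) ends up with a shifted
vector.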

As long as db_vector_count()/db_vector_mask() are not implemented, this
mismatch is mostly hidden: doorbell events are effectively treated as
"can hit any QP", so the off-by-one vector numbering does not surface
clearly.

However, once per-vector handling is enabled, the current state becomes
problematic:

  - db_valid_mask exposes bits that do not correspond to real doorbells
    (link/unused slots leak into the mask).
  - ntb_db_event() is fed with 1-based/shifted vectors, while NTB core
    expects a 0-based db_vector for doorbells.
  - On pci-epf-vntb, .peer_db_set() may be called in atomic context, but
    it directly calls pci_epc_raise_irq(), which can sleep.
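
As a rough sketch of the first two points, a valid mask that covers only
real doorbells and a slot-to-vector conversion could look like this. The
helper names and the nr_slots parameter are assumptions made for
illustration; they are not the functions added by the patches.

```c
#include <assert.h>
#include <stdint.h>

#define EPF_IRQ_DB_BASE  2u /* third slot: first real doorbell */

/*
 * Build a mask with one bit per real doorbell, given the total number of
 * interrupt slots (link + unused + doorbells). The two non-doorbell slots
 * contribute no bits.
 */
static inline uint64_t real_db_valid_mask(unsigned int nr_slots)
{
	unsigned int nr_db;

	if (nr_slots <= EPF_IRQ_DB_BASE)
		return 0;
	nr_db = nr_slots - EPF_IRQ_DB_BASE;
	assert(nr_db <= 64); /* a 64-bit mask cannot describe more */
	return nr_db == 64 ? UINT64_MAX : (UINT64_C(1) << nr_db) - 1;
}

/*
 * Convert an interrupt slot to the 0-based doorbell vector that would be
 * reported upwards, or -1 if the slot is not a doorbell.
 */
static inline int slot_to_db_vector(unsigned int slot)
{
	if (slot < EPF_IRQ_DB_BASE)
		return -1;
	return (int)(slot - EPF_IRQ_DB_BASE);
}
```

For example, with six slots in total the mask is 0xF, and slot 2 maps to
vector 0.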


Why NOT fix the root offset?
============================

The natural "root" fix would be to remove the historical extra offset in
the peer_db_set() doorbell paths for ntb_hw_epf and pci-epf-vntb.
Unfortunately, this would break interoperability when mixing old and new
kernel versions (old/new peers): a new side would ring a different
interrupt slot than an old peer expects, leading to missed or misrouted
doorbells once db_vector_count()/db_vector_mask() are implemented.

Therefore this series intentionally keeps the legacy offset, and instead
fixes the surrounding pieces so the mapping is documented and handled
consistently in masks, vector numbering, and per-vector reporting.
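
The interoperability concern can be illustrated with a small model. It
assumes that removing the offset would move the first doorbell to the
second slot; that layout, and all names below, are hypothetical and only
serve to show what happens when the two sides disagree.

```c
#include <assert.h>
#include <stdbool.h>

#define LEGACY_DB_BASE    2u /* legacy: doorbells start at the third slot */
#define NO_OFFSET_DB_BASE 1u /* hypothetical layout with the offset removed */

/* Slot a sender rings for a 0-based doorbell bit, given its base slot. */
static inline unsigned int tx_slot(unsigned int db_base, unsigned int db_bit)
{
	assert(db_bit < 32); /* keep the sketch within a small doorbell space */
	return db_base + db_bit;
}

/* Doorbell bit a receiver derives from a slot, or -1 if it is no doorbell. */
static inline int rx_db_bit(unsigned int db_base, unsigned int slot)
{
	return slot < db_base ? -1 : (int)(slot - db_base);
}

/* True when the receiver sees the same doorbell bit the sender meant. */
static inline bool db_roundtrip_ok(unsigned int tx_base, unsigned int rx_base,
				   unsigned int db_bit)
{
	return rx_db_bit(rx_base, tx_slot(tx_base, db_bit)) == (int)db_bit;
}
```

In this model, two legacy peers agree on every bit, while a sender without
the offset paired with a legacy receiver either loses bit 0 (it lands on the
unused slot) or delivers every other bit one position too low.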


What this series does
=====================

- pci-epf-vntb:

  - Document the legacy offset.
  - Defer MSI doorbell raises to process context to avoid sleeping in
    atomic context. This becomes relevant once multiple doorbells are
    raised concurrently at a high rate.
  - Report doorbell vectors as 0-based to ntb_db_event().
  - Fix db_valid_mask and implement db_vector_count()/db_vector_mask().

- ntb_hw_epf:

  - Document the legacy offset in ntb_epf_peer_db_set().
  - Fix db_valid_mask to cover only real doorbell bits.
  - Report 0-based db_vector to ntb_db_event() (accounting for the
    unused slot).
  - Keep db_val as a bitmask and fix db_read/db_clear semantics
    accordingly.
  - Implement db_vector_count()/db_vector_mask().
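
The "defer out of atomic context" item can be sketched in user-space C as a
pending bitmask that is filled by the caller and drained later. This is a
generic pattern under stated assumptions, not the code from the patch: the
struct, the function names and the fake_raise() stand-in for the sleeping
raise call are all hypothetical, and the step that would schedule the worker
is left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical state: doorbell bits waiting to be raised. */
struct db_deferral {
	_Atomic uint64_t pending;
};

/* Safe in atomic context: only records the request and never sleeps. */
static inline void db_request(struct db_deferral *d, uint64_t db_bits)
{
	atomic_fetch_or(&d->pending, db_bits);
	/* A real driver would schedule its worker here; omitted. */
}

/*
 * Meant to run in process context: takes all pending bits at once and calls
 * raise() once per set bit. raise() may sleep. Returns the number of raises.
 */
static inline unsigned int db_drain(struct db_deferral *d,
				    void (*raise)(unsigned int db_bit, void *ctx),
				    void *ctx)
{
	uint64_t bits = atomic_exchange(&d->pending, 0);
	unsigned int bit, raised = 0;

	assert(raise);
	for (bit = 0; bits; bit++, bits >>= 1) {
		if (bits & 1) {
			raise(bit, ctx);
			raised++;
		}
	}
	return raised;
}

/* Stand-in for the sleeping raise call: records which bits were raised. */
static inline void fake_raise(unsigned int db_bit, void *ctx)
{
	*(uint64_t *)ctx |= UINT64_C(1) << db_bit;
}
```

One property of this particular sketch is that repeated requests for the
same bit before a drain collapse into a single raise. Whether that is
acceptable depends on the client, so treat it as a design choice of the
sketch rather than a statement about the series.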


Compatibility
=============

By keeping the legacy offset intact, this series aims to remain
compatible across mixed kernel versions. The observable changes are
limited to correct mask/vector reporting and safer execution context
handling.

Patches 1-5 (PCI Endpoint) and 6-10 (NTB) are independent and can be
applied separately for each tree. I am sending them together in this
series to provide the full context and to make the cross-subsystem
compatibility constraints explicit. Ideally the whole series would be
applied in a single tree, but each subset is safe to merge on its own.

- Patches 1-5 apply cleanly onto the latest pci/endpoint:
  f6797680fe31 ("PCI: epf-mhi: Return 0 on success instead of positive
                 jiffies from pci_epf_mhi_edma_{read/write}")

- Patches 6-10 apply cleanly onto the latest ntb-next:
  7b3302c687ca ("ntb_hw_amd: Fix incorrect debug message in link disable
                 path")

Note: I don't have suitable hardware to test the ntb_hw_epf + pci-epf-ntb
(not vNTB) bridge scenario, but I believe no changes are needed in
pci-epf-ntb.c.


Changelog
=========

Changes since v1:
  - Addressed feedback from Dave (add a source code comment, introduce
    enum to eliminate magic numbers)
  - Updated source code comment in Patch 2.
  - No functional changes, so the Reviewed-by tags from Frank and Dave
    are retained. Thank you both for the review.


Best regards,


Koichiro Den (10):
  PCI: endpoint: pci-epf-vntb: Document legacy MSI doorbell offset
  PCI: endpoint: pci-epf-vntb: Defer pci_epc_raise_irq() out of atomic
    context
  PCI: endpoint: pci-epf-vntb: Report 0-based doorbell vector via
    ntb_db_event()
  PCI: endpoint: pci-epf-vntb: Exclude reserved slots from db_valid_mask
  PCI: endpoint: pci-epf-vntb: Implement db_vector_count/mask for
    doorbells
  NTB: epf: Document legacy doorbell slot offset in
    ntb_epf_peer_db_set()
  NTB: epf: Make db_valid_mask cover only real doorbell bits
  NTB: epf: Report 0-based doorbell vector via ntb_db_event()
  NTB: epf: Fix doorbell bitmask handling in db_read/db_clear
  NTB: epf: Implement db_vector_count/mask for doorbells

 drivers/ntb/hw/epf/ntb_hw_epf.c               |  89 ++++++++++-
 drivers/pci/endpoint/functions/pci-epf-vntb.c | 147 +++++++++++++++---
 2 files changed, 210 insertions(+), 26 deletions(-)

-- 
2.51.0
Re: [PATCH v2 00/10] PCI: endpoint: pci-epf-vntb: Document legacy MSI doorbell offset
Posted by Koichiro Den 1 month, 1 week ago
On Fri, Feb 27, 2026 at 05:49:45PM +0900, Koichiro Den wrote:

Sorry, I accidentally used an incorrect series title.
The correct subject should be:

  [PATCH v2 00/10] NTB: epf: Enable per-doorbell bit handling while keeping legacy offset

For reference, v1 is:
https://lore.kernel.org/linux-pci/20260224133459.1741537-1-den@valinux.co.jp/

Best regards,
Koichiro

Re: [PATCH v2 00/10] PCI: endpoint: pci-epf-vntb: Document legacy MSI doorbell offset
Posted by Koichiro Den 2 weeks, 4 days ago
On Fri, Feb 27, 2026 at 05:57:40PM +0900, Koichiro Den wrote:
> Sorry, I accidentally used an incorrect series title.
> The correct subject should be:
> 
>   [PATCH v2 00/10] NTB: epf: Enable per-doorbell bit handling while keeping legacy offset
> 
> For reference, v1 is:
> https://lore.kernel.org/linux-pci/20260224133459.1741537-1-den@valinux.co.jp/
> 
> Best regards,
> Koichiro

Hi Mani (cc: Jon, Dave),

This series has been sitting for a while, so I'd like to check how to proceed.

I'm thinking of the following approach:

  - get the remaining acks from the NTB side
    (Dave already gave Reviewed-by for Patch 6/10)
  - then route the whole series via the PCI EP tree

Does that sound reasonable?

If so, I can prepare a v3 rebased onto the latest pci/endpoint.

Best regards,
Koichiro
Re: [PATCH v2 00/10] PCI: endpoint: pci-epf-vntb: Document legacy MSI doorbell offset
Posted by Manivannan Sadhasivam 2 weeks, 3 days ago
On Fri, Mar 20, 2026 at 11:55:23PM +0900, Koichiro Den wrote:
> Hi Mani (cc: Jon, Dave),
> 
> This series has been sitting for a while, so I'd like to check how to proceed.
> 
> I'm thinking of the following approach:
> 
>   - get the remaining acks from the NTB side
>     (Dave already gave Reviewed-by for Patch 6/10)
>   - then route the whole series via the PCI EP tree
> 
> Does that sound reasonable?
> 
> If so, I can prepare a v3 rebased onto the latest pci/endpoint.
> 

Yes please. We have queued a lot of endpoint patches for v7.1, so it makes sense
to route this series as a whole through PCI tree.

Also as you noted, all the NTB patches need an Ack from Dave.

- Mani

-- 
மணிவண்ணன் சதாசிவம்
Re: [PATCH v2 00/10] PCI: endpoint: pci-epf-vntb: Document legacy MSI doorbell offset
Posted by Koichiro Den 2 weeks, 1 day ago
On Sat, Mar 21, 2026 at 07:08:36PM +0530, Manivannan Sadhasivam wrote:
> On Fri, Mar 20, 2026 at 11:55:23PM +0900, Koichiro Den wrote:
> > Hi Mani (cc: Jon, Dave),
> > 
> > This series has been sitting for a while, so I'd like to check how to proceed.
> > 
> > I'm thinking of the following approach:
> > 
> >   - get the remaining acks from the NTB side
> >     (Dave already gave Reviewed-by for Patch 6/10)
> >   - then route the whole series via the PCI EP tree
> > 
> > Does that sound reasonable?
> > 
> > If so, I can prepare a v3 rebased onto the latest pci/endpoint.
> > 
> 
> Yes please. We have queued a lot of endpoint patches for v7.1, so it makes sense
> to route this series as a whole through PCI tree.

Sounds good, I'll send the next revision (v3) with that in mind.

> 
> Also as you noted, all the NTB patches need an Ack from Dave.

I'll reach out to Dave once I send v3.

Best regards,
Koichiro

> 
> - Mani
> 
> -- 
> மணிவண்ணன் சதாசிவம்
Re: [PATCH v2 00/10] PCI: endpoint: pci-epf-vntb: Document legacy MSI doorbell offset
Posted by Koichiro Den 2 weeks, 1 day ago
On Mon, Mar 23, 2026 at 10:29:00AM +0900, Koichiro Den wrote:
> On Sat, Mar 21, 2026 at 07:08:36PM +0530, Manivannan Sadhasivam wrote:
> > On Fri, Mar 20, 2026 at 11:55:23PM +0900, Koichiro Den wrote:
> > > On Fri, Feb 27, 2026 at 05:57:40PM +0900, Koichiro Den wrote:
> > > > On Fri, Feb 27, 2026 at 05:49:45PM +0900, Koichiro Den wrote:
> > > > > This series fixes doorbell bit/vector handling for the EPF-based NTB
> > > > > pair (ntb_hw_epf <-> pci-epf-*ntb). Its primary goal is to enable safe
> > > > > per-db-vector handling in the NTB core and clients (e.g. ntb_transport),
> > > > > without changing the on-the-wire doorbell mapping.
> > > > > 
> > > > > 
> > > > > Background / problem
> > > > > ====================
> > > > > 
> > > > > ntb_hw_epf historically applies an extra offset when ringing peer
> > > > > doorbells: the link event uses the first interrupt slot, and doorbells
> > > > > start from the third slot (i.e. a second slot is effectively unused).
> > > > > pci-epf-vntb carries the matching offset on the EP side as well.
> > > > > 
> > > > > As long as db_vector_count()/db_vector_mask() are not implemented, this
> > > > > mismatch is mostly masked. Doorbell events are effectively treated as
> > > > > "can hit any QP" and the off-by-one vector numbering does not surface
> > > > > clearly.
> > > > > 
> > > > > However, once per-vector handling is enabled, the current state becomes
> > > > > problematic:
> > > > > 
> > > > >   - db_valid_mask exposes bits that do not correspond to real doorbells
> > > > >     (link/unused slots leak into the mask).
> > > > >   - ntb_db_event() is fed with 1-based/shifted vectors, while NTB core
> > > > >     expects a 0-based db_vector for doorbells.
> > > > >   - On pci-epf-vntb, .peer_db_set() may be called in atomic context, but
> > > > >     it directly calls pci_epc_raise_irq(), which can sleep.
> > > > > 
> > > > > 
> > > > > Why NOT fix the root offset?
> > > > > ============================
> > > > > 
> > > > > The natural "root" fix would be to remove the historical extra offset in
> > > > > the peer_db_set() doorbell paths for ntb_hw_epf and pci-epf-vntb.
> > > > > Unfortunately this would lead to interoperability issues when mixing old
> > > > > and new kernel versions (old/new peers). A new side would ring a
> > > > > different interrupt slot than what an old peer expects, leading to
> > > > > missed or misrouted doorbells, once db_vector_count()/db_vector_mask()
> > > > > are implemented.
> > > > > 
> > > > > Therefore this series intentionally keeps the legacy offset, and instead
> > > > > fixes the surrounding pieces so the mapping is documented and handled
> > > > > consistently in masks, vector numbering, and per-vector reporting.
> > > > > 
> > > > > 
> > > > > What this series does
> > > > > =====================
> > > > > 
> > > > > - pci-epf-vntb:
> > > > > 
> > > > >   - Document the legacy offset.
> > > > >   - Defer MSI doorbell raises to process context to avoid sleeping in
> > > > >     atomic context. This becomes relevant once multiple doorbells are
> > > > >     raised concurrently at a high rate.
> > > > >   - Report doorbell vectors as 0-based to ntb_db_event().
> > > > >   - Fix db_valid_mask and implement db_vector_count()/db_vector_mask().
> > > > > 
> > > > > - ntb_hw_epf:
> > > > > 
> > > > >   - Document the legacy offset in ntb_epf_peer_db_set().
> > > > >   - Fix db_valid_mask to cover only real doorbell bits.
> > > > >   - Report 0-based db_vector to ntb_db_event() (accounting for the
> > > > >     unused slot).
> > > > >   - Keep db_val as a bitmask and fix db_read/db_clear semantics
> > > > >     accordingly.
> > > > >   - Implement db_vector_count()/db_vector_mask().
> > > > > 
> > > > > 
> > > > > Compatibility
> > > > > =============
> > > > > 
> > > > > By keeping the legacy offset intact, this series aims to remain
> > > > > compatible across mixed kernel versions. The observable changes are
> > > > > limited to correct mask/vector reporting and safer execution context
> > > > > handling.
> > > > > 
> > > > > Patches 1-5 (PCI Endpoint) and 6-10 (NTB) are independent and can be
> > > > > applied separately for each tree. I am sending them together in this
> > > > > series to provide the full context and to make the cross-subsystem
> > > > > compatibility constraints explicit. Ideally the whole series would be
> > > > > applied in a single tree, but each subset is safe to merge on its own.
> > > > > 
> > > > > - Patches 1-5 apply cleanly onto the latest pci/endpoint:
> > > > >   f6797680fe31 ("PCI: epf-mhi: Return 0 on success instead of positive
> > > > >                  jiffies from pci_epf_mhi_edma_{read/write}")
> > > > > 
> > > > > - Patches 6-10 apply cleanly onto the latest ntb-next:
> > > > >   7b3302c687ca ("ntb_hw_amd: Fix incorrect debug message in link disable
> > > > >                  path")
> > > > > 
> > > > > Note: I don't have suitable hardware to test the ntb_hw_epf + pci-epf-ntb
> > > > > (not vNTB) bridge scenario, but I believe no changes are needed in
> > > > > pci-epf-ntb.c.
> > > > > 
> > > > > 
> > > > > Changelog
> > > > > =========
> > > > > 
> > > > > Changes since v1:
> > > > >   - Addressed feedback from Dave (add a source code comment, introduce
> > > > >     enum to eliminate magic numbers)
> > > > >   - Updated source code comment in Patch 2.
> > > > >   - No functional changes, so retained Reviewed-by tags by Frank and Dave.
> > > > >     Thank you both for the review.
> > > > 
> > > > Sorry, I accidentally used an incorrect series title.
> > > > The correct subject should be:
> > > > 
> > > >   [PATCH v2 00/10] NTB: epf: Enable per-doorbell bit handling while keeping legacy offset
> > > > 
> > > > For reference, v1 is:
> > > > https://lore.kernel.org/linux-pci/20260224133459.1741537-1-den@valinux.co.jp/
> > > > 
> > > > Best regards,
> > > > Koichiro
> > > 
> > > Hi Mani (cc: Jon, Dave),
> > > 
> > > This series has been sitting for a while, so I'd like to check how to proceed.
> > > 
> > > I'm thinking of the following approach:
> > > 
> > >   - get the remaining acks from the NTB side
> > >     (Dave already gave Reviewed-by for Patch 6/10)
> > >   - then route the whole series via the PCI EP tree
> > > 
> > > Does that sound reasonable?
> > > 
> > > If so, I can prepare a v3 rebased onto the latest pci/endpoint.
> > > 
> > 
> > Yes please. We have queued a lot of endpoint patches for v7.1, so it makes sense
> > to route this series as a whole through the PCI tree.
> 
> Sounds good, I'll send the next revision (v11) with that in mind.

Sorry, typo (v11 -> v3).

Koichiro

> 
> > 
> > Also as you noted, all the NTB patches need an Ack from Dave.
> 
> I'll reach out to Dave once I send v11.
> 
> Best regards,
> Koichiro
> 
> > 
> > - Mani
> > 
> > -- 
> > மணிவண்ணன் சதாசிவம்
Re: [PATCH v2 00/10] PCI: endpoint: pci-epf-vntb: Document legacy MSI doorbell offset
Posted by Koichiro Den 2 weeks, 4 days ago
On Fri, Mar 20, 2026 at 11:55:24PM +0900, Koichiro Den wrote:
> Hi Mani (cc: Jon, Dave),
> 
> This series has been sitting for a while, so I'd like to check how to proceed.
> 
> I'm thinking of the following approach:
> 
>   - get the remaining acks from the NTB side
>     (Dave already gave Reviewed-by for Patch 6/10)
>   - then route the whole series via the PCI EP tree
> 
> Does that sound reasonable?
> 
> If so, I can prepare a v3 rebased onto the latest pci/endpoint.

Just let me add one more point: in addition to fixing the issues, this series
also enables higher performance for ntb_transport.

Some results and context are described here:
https://lore.kernel.org/all/20260305155639.1885517-1-den@valinux.co.jp/
(Note: the ntb_netdev series has already landed in net-next)

Best regards,
Koichiro

> 
> Best regards,
> Koichiro
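
For readers following the thread, the slot layout described in the cover
letter (link event in the first slot, an unused second slot, doorbells from
the third slot onward) can be sketched in plain C as below. This is only an
illustration under stated assumptions: slots are taken to be 0-indexed, one
vector is assumed to serve exactly one doorbell bit, and every identifier
(epf_irq_slot, epf_db_to_slot, and so on) is hypothetical and is not taken
from ntb_hw_epf.c or pci-epf-vntb.c.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical slot layout, following the cover letter's description.
 * Assumption: slots are numbered from 0.
 */
enum epf_irq_slot {
	EPF_SLOT_LINK = 0,	/* link event */
	EPF_SLOT_UNUSED = 1,	/* legacy gap, kept for compatibility */
	EPF_SLOT_DB_BASE = 2,	/* first real doorbell */
};

/* Slot to ring for a 0-based doorbell bit (legacy offset preserved). */
static inline unsigned int epf_db_to_slot(unsigned int db_bit)
{
	return db_bit + EPF_SLOT_DB_BASE;
}

/* Convert a received slot to a 0-based db_vector, or -1 if not a doorbell. */
static inline int epf_slot_to_db_vector(unsigned int slot)
{
	if (slot < EPF_SLOT_DB_BASE)
		return -1;
	return (int)(slot - EPF_SLOT_DB_BASE);
}

/* Mask covering only the db_count real doorbell bits (no link/unused bits). */
static inline uint64_t epf_db_valid_mask(unsigned int db_count)
{
	assert(db_count <= 64);
	return db_count == 64 ? UINT64_MAX : (UINT64_C(1) << db_count) - 1;
}

/* Per-vector mask: the single doorbell bit assumed to be served by db_vector. */
static inline uint64_t epf_db_vector_mask(unsigned int db_count, int db_vector)
{
	if (db_vector < 0 || (unsigned int)db_vector >= db_count)
		return 0;
	return UINT64_C(1) << db_vector;
}
```

In this sketch the on-the-wire slot for doorbell bit 0 stays at slot 2, so an
old peer would still see the same interrupt, while the mask and the vector
number reported upward are both 0-based.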