*** GPU Direct RDMA (P2P DMA) for Device Private Pages ***

[PATCH v2 4/5] RDMA/mlx5: Enable P2P DMA with fallback mechanism

Posted by Yonatan Maman 6 months, 3 weeks ago

From: Yonatan Maman <Ymaman@Nvidia.com>

Add support for P2P for MLX5 NIC devices with automatic fallback to
standard DMA when P2P mapping fails.

The change introduces P2P DMA requests by default using the
HMM_PFN_ALLOW_P2P flag. If P2P mapping fails with -EFAULT error, the
operation is retried without the P2P flag, ensuring a fallback to
standard DMA flow (using host memory).

Signed-off-by: Yonatan Maman <Ymaman@Nvidia.com>
Signed-off-by: Gal Shalom <GalShalom@Nvidia.com>
---
 drivers/infiniband/hw/mlx5/odp.c | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/drivers/infiniband/hw/mlx5/odp.c b/drivers/infiniband/hw/mlx5/odp.c
index f6abd64f07f7..6a0171117f48 100644
--- a/drivers/infiniband/hw/mlx5/odp.c
+++ b/drivers/infiniband/hw/mlx5/odp.c
@@ -715,6 +715,10 @@ static int pagefault_real_mr(struct mlx5_ib_mr *mr, struct ib_umem_odp *odp,
 	if (odp->umem.writable && !downgrade)
 		access_mask |= HMM_PFN_WRITE;
 
+	/*
+	 * try fault with HMM_PFN_ALLOW_P2P flag
+	 */
+	access_mask |= HMM_PFN_ALLOW_P2P;
 	np = ib_umem_odp_map_dma_and_lock(odp, user_va, bcnt, access_mask, fault);
 	if (np < 0)
 		return np;
@@ -724,6 +728,18 @@ static int pagefault_real_mr(struct mlx5_ib_mr *mr, struct ib_umem_odp *odp,
 	 * ib_umem_odp_map_dma_and_lock already checks this.
 	 */
 	ret = mlx5r_umr_update_xlt(mr, start_idx, np, page_shift, xlt_flags);
+	if (ret == -EFAULT) {
+		/*
+		 * Indicate P2P Mapping Error, retry with no HMM_PFN_ALLOW_P2P
+		 */
+		mutex_unlock(&odp->umem_mutex);
+		access_mask &= ~(HMM_PFN_ALLOW_P2P);
+		np = ib_umem_odp_map_dma_and_lock(odp, user_va, bcnt, access_mask, fault);
+		if (np < 0)
+			return np;
+		ret = mlx5r_umr_update_xlt(mr, start_idx, np, page_shift, xlt_flags);
+	}
+
 	mutex_unlock(&odp->umem_mutex);
 
 	if (ret < 0) {
-- 
2.34.1
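
The retry flow described in the commit message can be sketched in plain userspace C. This is only an illustration of the control flow, not the driver code: every name starting with sketch_ or SKETCH_ is hypothetical, and sketch_map() collapses ib_umem_odp_map_dma_and_lock() and mlx5r_umr_update_xlt() into one step on the assumption that a P2P mapping failure surfaces as -EFAULT, as the patch description states.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the HMM access flags; not kernel symbols. */
#define SKETCH_PFN_WRITE     (1u << 0)
#define SKETCH_PFN_ALLOW_P2P (1u << 1)

struct sketch_dev {
	bool p2p_reachable;	/* assumption: can the NIC reach the peer's memory? */
	int map_calls;		/* number of mapping attempts made so far */
	bool used_p2p;		/* whether the successful mapping used P2P */
};

/*
 * Models "map the pages, then program the translation table" as one step.
 * Fails with -EFAULT when P2P was requested but the peer is unreachable.
 */
static int sketch_map(struct sketch_dev *dev, unsigned int access_mask)
{
	dev->map_calls++;
	if ((access_mask & SKETCH_PFN_ALLOW_P2P) && !dev->p2p_reachable)
		return -EFAULT;
	dev->used_p2p = (access_mask & SKETCH_PFN_ALLOW_P2P) != 0;
	return 0;
}

/* Try with P2P allowed first; on -EFAULT retry once without the flag. */
static int sketch_fault(struct sketch_dev *dev, bool writable)
{
	unsigned int access_mask = writable ? SKETCH_PFN_WRITE : 0;
	int ret;

	access_mask |= SKETCH_PFN_ALLOW_P2P;
	ret = sketch_map(dev, access_mask);
	if (ret == -EFAULT) {
		/* Fall back to the standard (host memory) DMA flow. */
		access_mask &= ~SKETCH_PFN_ALLOW_P2P;
		ret = sketch_map(dev, access_mask);
	}
	return ret;
}
```

Calling sketch_fault() on a device with p2p_reachable set makes a single mapping attempt that uses P2P. With p2p_reachable clear it makes two attempts, and the second one succeeds without P2P.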

Re: [PATCH v2 4/5] RDMA/mlx5: Enable P2P DMA with fallback mechanism

Posted by Christoph Hellwig 6 months, 3 weeks ago

On Fri, Jul 18, 2025 at 02:51:11PM +0300, Yonatan Maman wrote:
> From: Yonatan Maman <Ymaman@Nvidia.com>
> 
> Add support for P2P for MLX5 NIC devices with automatic fallback to
> standard DMA when P2P mapping fails.

That's not how the P2P API works.  You need to check the P2P availability
higher up.

Re: [PATCH v2 4/5] RDMA/mlx5: Enable P2P DMA with fallback mechanism

Posted by Jason Gunthorpe 6 months, 2 weeks ago

On Mon, Jul 21, 2025 at 12:03:41AM -0700, Christoph Hellwig wrote:
> On Fri, Jul 18, 2025 at 02:51:11PM +0300, Yonatan Maman wrote:
> > From: Yonatan Maman <Ymaman@Nvidia.com>
> > 
> > Add support for P2P for MLX5 NIC devices with automatic fallback to
> > standard DMA when P2P mapping fails.
> 
> That's not how the P2P API works.  You need to check the P2P availability
> higher up.

How do you mean?

This looks OKish to me, for ODP and HMM it has to check the P2P
availability on a page by page basis because every single page can be
a different origin device.

There isn't really a higher up here...

Jason

Re: [PATCH v2 4/5] RDMA/mlx5: Enable P2P DMA with fallback mechanism

Posted by Christoph Hellwig 6 months, 2 weeks ago

On Wed, Jul 23, 2025 at 12:55:22AM -0300, Jason Gunthorpe wrote:
> On Mon, Jul 21, 2025 at 12:03:41AM -0700, Christoph Hellwig wrote:
> > On Fri, Jul 18, 2025 at 02:51:11PM +0300, Yonatan Maman wrote:
> > > From: Yonatan Maman <Ymaman@Nvidia.com>
> > > 
> > > Add support for P2P for MLX5 NIC devices with automatic fallback to
> > > standard DMA when P2P mapping fails.
> > 
> > That's not how the P2P API works.  You need to check the P2P availability
> > higher up.
> 
> How do you mean?
> 
> This looks OKish to me, for ODP and HMM it has to check the P2P
> availability on a page by page basis because every single page can be
> a different origin device.
> 
> There isn't really a higher up here...

The DMA API expects the caller to already check for connectability,
why can't HMM do that like everyone else?

Re: [PATCH v2 4/5] RDMA/mlx5: Enable P2P DMA with fallback mechanism

Posted by Jason Gunthorpe 6 months, 1 week ago

On Thu, Jul 24, 2025 at 12:30:34AM -0700, Christoph Hellwig wrote:
> On Wed, Jul 23, 2025 at 12:55:22AM -0300, Jason Gunthorpe wrote:
> > On Mon, Jul 21, 2025 at 12:03:41AM -0700, Christoph Hellwig wrote:
> > > On Fri, Jul 18, 2025 at 02:51:11PM +0300, Yonatan Maman wrote:
> > > > From: Yonatan Maman <Ymaman@Nvidia.com>
> > > > 
> > > > Add support for P2P for MLX5 NIC devices with automatic fallback to
> > > > standard DMA when P2P mapping fails.
> > > 
> > > That's not how the P2P API works.  You need to check the P2P availability
> > > higher up.
> > 
> > How do you mean?
> > 
> > This looks OKish to me, for ODP and HMM it has to check the P2P
> > availability on a page by page basis because every single page can be
> > a different origin device.
> > 
> > There isn't really a higher up here...
> 
> The DMA API expects the caller to already check for connectability,
> why can't HMM do that like everyone else?

It does, this doesn't change anything about how the DMA API works.

All this series does, and you stated it perfectly, is to allow HMM to
return the single PCI P2P alias of the device private page.

HMM already blindly returns normal P2P pages in a VMA, it should also
blindly return the P2P alias pages too.

Once the P2P is returned the existing code in hmm_dma_map_pfn() calls
pci_p2pdma_state() to find out if it is compatible or not.

Lifting the pci_p2pdma_state() from hmm_dma_map_pfn() and into
hmm_range_fault() is perhaps possible and may be reasonable, but not
really related to this series.

Jason
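
The page-by-page check discussed above can be illustrated with a small userspace model. It is a sketch under stated assumptions, not the kernel implementation: the p2p_sketch_ names, the provider-id scheme and the reachable[] table are all hypothetical, and the state function only loosely plays the role that pci_p2pdma_state() plays in hmm_dma_map_pfn(). The -EFAULT return is chosen to match the error named in the patch description.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical per-page outcome, loosely modelled on a P2P state lookup. */
enum p2p_sketch_state {
	P2P_SKETCH_NONE,		/* ordinary host memory, no P2P involved */
	P2P_SKETCH_NOT_SUPPORTED,	/* P2P page the importer cannot reach */
	P2P_SKETCH_BUS_ADDR,		/* P2P page the importer can reach */
};

struct p2p_sketch_page {
	int provider_id;	/* -1 for host memory, else the owning device */
};

/* reachable[] is indexed by provider id; nonzero means "connectable". */
static enum p2p_sketch_state
p2p_sketch_page_state(const struct p2p_sketch_page *page,
		      const int *reachable, size_t nproviders)
{
	if (page->provider_id < 0)
		return P2P_SKETCH_NONE;
	if ((size_t)page->provider_id >= nproviders ||
	    !reachable[page->provider_id])
		return P2P_SKETCH_NOT_SUPPORTED;
	return P2P_SKETCH_BUS_ADDR;
}

/*
 * Every page is checked on its own because each one may come from a
 * different origin device. Returns the page count, or -EFAULT as soon
 * as one page turns out to be unreachable.
 */
static int p2p_sketch_map_range(const struct p2p_sketch_page *pages,
				size_t npages, const int *reachable,
				size_t nproviders)
{
	size_t i;

	for (i = 0; i < npages; i++) {
		if (p2p_sketch_page_state(&pages[i], reachable, nproviders) ==
		    P2P_SKETCH_NOT_SUPPORTED)
			return -EFAULT;
	}
	return (int)npages;
}
```

A range mixing host pages with pages from two providers maps fully only when both providers are marked reachable. A single unreachable page fails the whole range, which is the case the fallback in patch 4/5 is meant to handle.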

[PATCH v2 1/5] mm/hmm: HMM API to enable P2P DMA for device private pages
[PATCH v2 2/5] nouveau/dmem: HMM P2P DMA for private dev pages
[PATCH v2 3/5] IB/core: P2P DMA for device private pages
[PATCH v2 4/5] RDMA/mlx5: Enable P2P DMA with fallback mechanism
[PATCH v2 5/5] RDMA/mlx5: Enabling ATS for ODP memory