[v2] devmem TCP fixes

[PATCH net-next v2 4/5] page_pool: disable sync for cpu for dmabuf memory provider

Posted by Mina Almasry 2 weeks, 2 days ago

dmabuf dma-addresses should not be dma_sync'd for CPU/device. Typically
its the driver responsibility to dma_sync for CPU, but the driver should
not dma_sync for CPU if the netmem is actually coming from a dmabuf
memory provider.

The page_pool already exposes a helper for dma_sync_for_cpu:
page_pool_dma_sync_for_cpu. Upgrade this existing helper to handle
netmem, and have it skip dma_sync if the memory is from a dmabuf memory
provider. Drivers should migrate to using this helper when adding
support for netmem.

Cc: Jason Gunthorpe <jgg@ziepe.ca>
Signed-off-by: Mina Almasry <almasrymina@google.com>

---
 include/net/page_pool/helpers.h | 25 +++++++++++++++++++------
 1 file changed, 19 insertions(+), 6 deletions(-)

diff --git a/include/net/page_pool/helpers.h b/include/net/page_pool/helpers.h
index 8e548ff3044c..ad4fed4a791c 100644
--- a/include/net/page_pool/helpers.h
+++ b/include/net/page_pool/helpers.h
@@ -429,9 +429,10 @@ static inline dma_addr_t page_pool_get_dma_addr(const struct page *page)
 }
 
 /**
- * page_pool_dma_sync_for_cpu - sync Rx page for CPU after it's written by HW
+ * page_pool_dma_sync_netmem_for_cpu - sync Rx page for CPU after it's written
+ *				       by HW
  * @pool: &page_pool the @page belongs to
- * @page: page to sync
+ * @netmem: netmem to sync
  * @offset: offset from page start to "hard" start if using PP frags
  * @dma_sync_size: size of the data written to the page
  *
@@ -440,16 +441,28 @@ static inline dma_addr_t page_pool_get_dma_addr(const struct page *page)
  * Note that this version performs DMA sync unconditionally, even if the
  * associated PP doesn't perform sync-for-device.
  */
-static inline void page_pool_dma_sync_for_cpu(const struct page_pool *pool,
-					      const struct page *page,
-					      u32 offset, u32 dma_sync_size)
+static inline void
+page_pool_dma_sync_netmem_for_cpu(const struct page_pool *pool,
+				  const netmem_ref netmem, u32 offset,
+				  u32 dma_sync_size)
 {
+	if (pool->mp_priv)
+		return;
+
 	dma_sync_single_range_for_cpu(pool->p.dev,
-				      page_pool_get_dma_addr(page),
+				      page_pool_get_dma_addr_netmem(netmem),
 				      offset + pool->p.offset, dma_sync_size,
 				      page_pool_get_dma_dir(pool));
 }
 
+static inline void page_pool_dma_sync_for_cpu(const struct page_pool *pool,
+					      struct page *page, u32 offset,
+					      u32 dma_sync_size)
+{
+	page_pool_dma_sync_netmem_for_cpu(pool, page_to_netmem(page), offset,
+					  dma_sync_size);
+}
+
 static inline bool page_pool_put(struct page_pool *pool)
 {
 	return refcount_dec_and_test(&pool->user_cnt);
-- 
2.47.0.277.g8800431eea-goog

Re: [PATCH net-next v2 4/5] page_pool: disable sync for cpu for dmabuf memory provider

Posted by Jason Gunthorpe 2 weeks, 1 day ago

On Thu, Nov 07, 2024 at 09:23:08PM +0000, Mina Almasry wrote:
> dmabuf dma-addresses should not be dma_sync'd for CPU/device. Typically
> its the driver responsibility to dma_sync for CPU, but the driver should
> not dma_sync for CPU if the netmem is actually coming from a dmabuf
> memory provider.

This is not completely true, it is not *all* dmabuf, just the parts of
the dmabuf that are actually MMIO.

If you do this you may want to block accepting dmabufs that have CPU
pages inside them.

Jason

Re: [PATCH net-next v2 4/5] page_pool: disable sync for cpu for dmabuf memory provider

Posted by Mina Almasry 2 weeks, 1 day ago

On Fri, Nov 8, 2024 at 6:18 AM Jason Gunthorpe <jgg@ziepe.ca> wrote:
>
> On Thu, Nov 07, 2024 at 09:23:08PM +0000, Mina Almasry wrote:
> > dmabuf dma-addresses should not be dma_sync'd for CPU/device. Typically
> > its the driver responsibility to dma_sync for CPU, but the driver should
> > not dma_sync for CPU if the netmem is actually coming from a dmabuf
> > memory provider.
>
> This is not completely true, it is not *all* dmabuf, just the parts of
> the dmabuf that are actually MMIO.
>

Thanks Jason, I can clarify the commit message when/if I send another iteration.

> If you do this you may want to block accepting dmabufs that have CPU
> pages inside them.
>

How do I check if the dmabuf has CPU pages inside of it? The only way
I can think to do that is to sg_page a scatterlist entry, then
!is_zone_device_page() the page. Or something like that, but I thought
calling sg_page() on the dmabuf scatterlist was banned now.

But as others mentioned, we've taken a dependency on using udmabuf for
testing, so we'd like that to still work, we probably need to find
another way than just blocking them entirely.

I'm thinking, we already use the dmabuf sync APIs when we want to read
the udmabuf from userspace. In ncdevmem.c, we do:

sync.flags = DMA_BUF_SYNC_READ | DMA_BUF_SYNC_START;
ioctl(buf, DMA_BUF_IOCTL_SYNC, &sync);

Before we read the udmabuf data. Doesn't that do the same job as
dma_sync_single_range_for_cpu? I'm thinking dmabufs with user pages
would work as long as the user calls the dmabuf sync APIs, so we don't
need to block them entirely from devmem tcp.

-- 
Thanks,
Mina

Re: [PATCH net-next v2 4/5] page_pool: disable sync for cpu for dmabuf memory provider

Posted by Jason Gunthorpe 1 week, 2 days ago

On Fri, Nov 08, 2024 at 11:01:21AM -0800, Mina Almasry wrote:

> > If you do this you may want to block accepting dmabufs that have CPU
> > pages inside them.
> >
> 
> How do I check if the dmabuf has CPU pages inside of it? The only way
> I can think to do that is to sg_page a scatterlist entry, then
> !is_zone_device_page() the page. Or something like that, but I thought
> calling sg_page() on the dmabuf scatterlist was banned now.

I don't know. Many dmabuf scenarios abuse scatter list and the CPU
list is invalid, so you can't reference sg_page().

I think you'd need to discuss with the dmabuf maintainers.

Jason

Re: [PATCH net-next v2 4/5] page_pool: disable sync for cpu for dmabuf memory provider

Posted by Samiullah Khawaja 1 day, 4 hours ago

On Thu, Nov 14, 2024 at 5:59 PM Jason Gunthorpe <jgg@ziepe.ca> wrote:
>
> On Fri, Nov 08, 2024 at 11:01:21AM -0800, Mina Almasry wrote:
>
> > > If you do this you may want to block accepting dmabufs that have CPU
> > > pages inside them.

I believe we should be following the dmabuf API for this purpose, if
we really want to sync for CPU in. page_pool, and should not assume
the behaviour of the backing memory.

The dmabuf exporters are supposed to implement the begin_cpu_access
(optional) and the sync for cpu is done based on the exporter
implementation.

For example udmabuf implementation calls `dma_sync_sg_for_cpu` on the
underlying pages.

I think following dmabuf APIs can be used to ensure the backing memory
is synced:

Before cpu access to sync for cpu
`dma_buf_begin_cpu_access`

After cpu access to be synced for device
`dma_buf_end_cpu_access`

Sami
> > >
> >
> > How do I check if the dmabuf has CPU pages inside of it? The only way
> > I can think to do that is to sg_page a scatterlist entry, then
> > !is_zone_device_page() the page. Or something like that, but I thought
> > calling sg_page() on the dmabuf scatterlist was banned now.
>
> I don't know. Many dmabuf scenarios abuse scatter list and the CPU
> list is invalid, so you can't reference sg_page().
>
> I think you'd need to discuss with the dmabuf maintainers.
>
> Jason

Re: [PATCH net-next v2 4/5] page_pool: disable sync for cpu for dmabuf memory provider

Posted by Jakub Kicinski 20 minutes ago

On Fri, 22 Nov 2024 14:10:28 -0800 Samiullah Khawaja wrote:
> > > > If you do this you may want to block accepting dmabufs that have CPU
> > > > pages inside them.  
> 
> I believe we should be following the dmabuf API for this purpose, if
> we really want to sync for CPU in. page_pool, and should not assume
> the behaviour of the backing memory.
> 
> The dmabuf exporters are supposed to implement the begin_cpu_access
> (optional) and the sync for cpu is done based on the exporter
> implementation.

DMABUF is a bit of a distraction here.

What's key is that we need to fill in the gap in the driver facing
page_pool API, because DMABUF will not be the only provider we can
plug in.

Re: [PATCH net-next v2 4/5] page_pool: disable sync for cpu for dmabuf memory provider

Posted by Stanislav Fomichev 2 weeks, 1 day ago

On 11/08, Jason Gunthorpe wrote:
> On Thu, Nov 07, 2024 at 09:23:08PM +0000, Mina Almasry wrote:
> > dmabuf dma-addresses should not be dma_sync'd for CPU/device. Typically
> > its the driver responsibility to dma_sync for CPU, but the driver should
> > not dma_sync for CPU if the netmem is actually coming from a dmabuf
> > memory provider.
> 
> This is not completely true, it is not *all* dmabuf, just the parts of
> the dmabuf that are actually MMIO.
> 
> If you do this you may want to block accepting dmabufs that have CPU
> pages inside them.

We still want udmabufs to work, so probably need some new helper to test
whether a particular netmem is backed by the cpu memory?

Re: [PATCH net-next v2 4/5] page_pool: disable sync for cpu for dmabuf memory provider

Posted by Pavel Begunkov 2 weeks, 1 day ago

On 11/8/24 15:58, Stanislav Fomichev wrote:
> On 11/08, Jason Gunthorpe wrote:
>> On Thu, Nov 07, 2024 at 09:23:08PM +0000, Mina Almasry wrote:
>>> dmabuf dma-addresses should not be dma_sync'd for CPU/device. Typically
>>> its the driver responsibility to dma_sync for CPU, but the driver should
>>> not dma_sync for CPU if the netmem is actually coming from a dmabuf
>>> memory provider.
>>
>> This is not completely true, it is not *all* dmabuf, just the parts of
>> the dmabuf that are actually MMIO.
>>
>> If you do this you may want to block accepting dmabufs that have CPU
>> pages inside them.
> 
> We still want udmabufs to work, so probably need some new helper to test
> whether a particular netmem is backed by the cpu memory?

Agree. I guess it's fair to assume that page pool is backed either
by one or another, so could be a page pool flag that devmem.c can set
on init.

-- 
Pavel Begunkov