[PATCH net-next v1] net: page_pool: add page_pool_put_page_nosync()

Guowei Dang posted 1 patch 12 months ago
Documentation/networking/page_pool.rst |  5 ++++-
include/net/page_pool/helpers.h        | 17 +++++++++++++++++
2 files changed, 21 insertions(+), 1 deletion(-)
[PATCH net-next v1] net: page_pool: add page_pool_put_page_nosync()
Posted by Guowei Dang 12 months ago
Add page_pool_put_page_nosync() to respond to dma_sync_size being 0.

The purpose of this is to make the semantics more obvious and may
enable removing some checkings in the future.

And in the long term, treating the nosync scenario separately provides
more flexibility for the user and enable removing of the
PP_FLAG_DMA_SYNC_DEV in the future.

Since we do have a page_pool_put_full_page(), adding a variant for
the nosync seems reasonable.

Suggested-by: Yunsheng Lin <linyunsheng@huawei.com>
Acked-by: Furong Xu <0x1207@gmail.com>
Signed-off-by: Guowei Dang <guowei.dang@foxmail.com>
---
 Documentation/networking/page_pool.rst |  5 ++++-
 include/net/page_pool/helpers.h        | 17 +++++++++++++++++
 2 files changed, 21 insertions(+), 1 deletion(-)

diff --git a/Documentation/networking/page_pool.rst b/Documentation/networking/page_pool.rst
index 9d958128a57c..a83f7c071132 100644
--- a/Documentation/networking/page_pool.rst
+++ b/Documentation/networking/page_pool.rst
@@ -62,7 +62,8 @@ a page will cause no race conditions is enough.
    :identifiers: struct page_pool_params
 
 .. kernel-doc:: include/net/page_pool/helpers.h
-   :identifiers: page_pool_put_page page_pool_put_full_page
+   :identifiers: page_pool_put_page
+		 page_pool_put_page_nosync page_pool_put_full_page
 		 page_pool_recycle_direct page_pool_free_va
 		 page_pool_dev_alloc_pages page_pool_dev_alloc_frag
 		 page_pool_dev_alloc page_pool_dev_alloc_va
@@ -93,6 +94,8 @@ much of the page needs to be synced (starting at ``offset``).
 When directly freeing pages in the driver (page_pool_put_page())
 the ``dma_sync_size`` argument specifies how much of the buffer needs
 to be synced.
+If the ``dma_sync_size`` argument is 0, page_pool_put_page_nosync() should be
+used instead of page_pool_put_page().
 
 If in doubt set ``offset`` to 0, ``max_len`` to ``PAGE_SIZE`` and
 pass -1 as ``dma_sync_size``. That combination of arguments is always
diff --git a/include/net/page_pool/helpers.h b/include/net/page_pool/helpers.h
index e555921e5233..5cc68d48624a 100644
--- a/include/net/page_pool/helpers.h
+++ b/include/net/page_pool/helpers.h
@@ -340,12 +340,14 @@ static inline void page_pool_put_netmem(struct page_pool *pool,
  * the allocator owns the page and will try to recycle it in one of the pool
  * caches. If PP_FLAG_DMA_SYNC_DEV is set, the page will be synced for_device
  * using dma_sync_single_range_for_device().
+ * page_pool_put_page_nosync() should be used if dma_sync_size is 0.
  */
 static inline void page_pool_put_page(struct page_pool *pool,
 				      struct page *page,
 				      unsigned int dma_sync_size,
 				      bool allow_direct)
 {
+	DEBUG_NET_WARN_ON_ONCE(!dma_sync_size);
 	page_pool_put_netmem(pool, page_to_netmem(page), dma_sync_size,
 			     allow_direct);
 }
@@ -372,6 +374,21 @@ static inline void page_pool_put_full_page(struct page_pool *pool,
 	page_pool_put_netmem(pool, page_to_netmem(page), -1, allow_direct);
 }
 
+/**
+ * page_pool_put_page_nosync() - release a reference on a page pool page
+ * @pool:	pool from which page was allocated
+ * @page:	page to release a reference on
+ * @allow_direct: released by the consumer, allow lockless caching
+ *
+ * Similar to page_pool_put_page(), but will not DMA sync the memory area.
+ */
+static inline void page_pool_put_page_nosync(struct page_pool *pool,
+					     struct page *page,
+					     bool allow_direct)
+{
+	page_pool_put_netmem(pool, page_to_netmem(page), 0, allow_direct);
+}
+
 /**
  * page_pool_recycle_direct() - release a reference on a page pool page
  * @pool:	pool from which page was allocated
-- 
2.34.1
Re: [PATCH net-next v1] net: page_pool: add page_pool_put_page_nosync()
Posted by Jakub Kicinski 12 months ago
On Thu, 19 Dec 2024 11:11:38 +0800 Guowei Dang wrote:
> Add page_pool_put_page_nosync() to respond to dma_sync_size being 0.
> 
> The purpose of this is to make the semantics more obvious and may
> enable removing some checkings in the future.
> 
> And in the long term, treating the nosync scenario separately provides
> more flexibility for the user and enable removing of the
> PP_FLAG_DMA_SYNC_DEV in the future.
> 
> Since we do have a page_pool_put_full_page(), adding a variant for
> the nosync seems reasonable.

You should provide an upstream user with the API.
But IMHO this just complicates the already very large API, 
for little benefit. 
I'm going to leave this in patchwork for a day in case page
pool maintainers disagree, but I vote "no".
Re: [PATCH net-next v1] net: page_pool: add page_pool_put_page_nosync()
Posted by Ilias Apalodimas 12 months ago
Hi Jakub

On Thu, 19 Dec 2024 at 16:24, Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Thu, 19 Dec 2024 11:11:38 +0800 Guowei Dang wrote:
> > Add page_pool_put_page_nosync() to respond to dma_sync_size being 0.
> >
> > The purpose of this is to make the semantics more obvious and may
> > enable removing some checkings in the future.
> >
> > And in the long term, treating the nosync scenario separately provides
> > more flexibility for the user and enable removing of the
> > PP_FLAG_DMA_SYNC_DEV in the future.
> >
> > Since we do have a page_pool_put_full_page(), adding a variant for
> > the nosync seems reasonable.
>
> You should provide an upstream user with the API.
> But IMHO this just complicates the already very large API,
> for little benefit.

+1000, I think the API has grown more than it has to and we now have
way too many abstractions.

I'll try to find some time and see if I can come up with a cleanup
that makes sense

Thanks
/Ilias
> I'm going to leave this in patchwork for a day in case page
> pool maintainers disagree, but I vote "no".
Re: [PATCH net-next v1] net: page_pool: add page_pool_put_page_nosync()
Posted by Alexander Lobakin 12 months ago
From: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Date: Fri, 20 Dec 2024 14:27:16 +0200

> Hi Jakub
> 
> On Thu, 19 Dec 2024 at 16:24, Jakub Kicinski <kuba@kernel.org> wrote:
>>
>> On Thu, 19 Dec 2024 11:11:38 +0800 Guowei Dang wrote:
>>> Add page_pool_put_page_nosync() to respond to dma_sync_size being 0.
>>>
>>> The purpose of this is to make the semantics more obvious and may
>>> enable removing some checkings in the future.
>>>
>>> And in the long term, treating the nosync scenario separately provides
>>> more flexibility for the user and enable removing of the
>>> PP_FLAG_DMA_SYNC_DEV in the future.
>>>
>>> Since we do have a page_pool_put_full_page(), adding a variant for
>>> the nosync seems reasonable.
>>
>> You should provide an upstream user with the API.
>> But IMHO this just complicates the already very large API,
>> for little benefit.
> 
> +1000, I think the API has grown more than it has to and we now have
> way too many abstractions.
> 
> I'll try to find some time and see if I can come up with a cleanup
> that makes sense

I'd remove:

* explicit 1-page-per-buffer API (leaving only the hybrid mode);
* order > 0 support (barely used if at all?);
* usage without DMA map and sync for device;

+ converting the users to netmem would allow to remove some wrappers.

Thanks,
Olek
Re: [PATCH net-next v1] net: page_pool: add page_pool_put_page_nosync()
Posted by Alexander Lobakin 12 months ago
From: Jakub Kicinski <kuba@kernel.org>
Date: Thu, 19 Dec 2024 06:24:38 -0800

(to the author of the patch)

> On Thu, 19 Dec 2024 11:11:38 +0800 Guowei Dang wrote:
>> Add page_pool_put_page_nosync() to respond to dma_sync_size being 0.

If PP_FLAG_DMA_SYNC_DEV is set, dma_sync_size == 0 can happen only when
the HW didn't write anything *and* the driver uses only one page per
frame, no frags. Very unlikely case I'd say, adding a separate wrapper
for it makes no sense.

>>
>> The purpose of this is to make the semantics more obvious and may
>> enable removing some checkings in the future.

Which checks do you want to remove?

>>
>> And in the long term, treating the nosync scenario separately provides
>> more flexibility for the user and enable removing of the
>> PP_FLAG_DMA_SYNC_DEV in the future.

Why remove SYNC_DEV?

>>
>> Since we do have a page_pool_put_full_page(), adding a variant for
>> the nosync seems reasonable.

Not really. put_full_page() is for cases when either the HW-written size
is unknown or the driver uses frags, those are common and widely-used.

> 
> You should provide an upstream user with the API.

Would be nice to see a real example as I don't understand the purpose of
this function as well.

> But IMHO this just complicates the already very large API, 
> for little benefit. 
> I'm going to leave this in patchwork for a day in case page
> pool maintainers disagree, but I vote "no".

I don't see a reason for this either.

Thanks,
Olek
Re: [PATCH net-next v1] net: page_pool: add page_pool_put_page_nosync()
Posted by Yunsheng Lin 12 months ago
On 2024/12/19 11:11, Guowei Dang wrote:
> Add page_pool_put_page_nosync() to respond to dma_sync_size being 0.
> 
> The purpose of this is to make the semantics more obvious and may
> enable removing some checkings in the future.

It would be good to describe the actual use case of the above API in
the commit log too.

> 
> And in the long term, treating the nosync scenario separately provides
> more flexibility for the user and enable removing of the
> PP_FLAG_DMA_SYNC_DEV in the future.
> 
> Since we do have a page_pool_put_full_page(), adding a variant for
> the nosync seems reasonable.
> 
> Suggested-by: Yunsheng Lin <linyunsheng@huawei.com>
> Acked-by: Furong Xu <0x1207@gmail.com>
> Signed-off-by: Guowei Dang <guowei.dang@foxmail.com>
> ---
>  Documentation/networking/page_pool.rst |  5 ++++-
>  include/net/page_pool/helpers.h        | 17 +++++++++++++++++
>  2 files changed, 21 insertions(+), 1 deletion(-)
> 
> diff --git a/Documentation/networking/page_pool.rst b/Documentation/networking/page_pool.rst
> index 9d958128a57c..a83f7c071132 100644
> --- a/Documentation/networking/page_pool.rst
> +++ b/Documentation/networking/page_pool.rst
> @@ -62,7 +62,8 @@ a page will cause no race conditions is enough.
>     :identifiers: struct page_pool_params
>  
>  .. kernel-doc:: include/net/page_pool/helpers.h
> -   :identifiers: page_pool_put_page page_pool_put_full_page
> +   :identifiers: page_pool_put_page
> +		 page_pool_put_page_nosync page_pool_put_full_page
>  		 page_pool_recycle_direct page_pool_free_va
>  		 page_pool_dev_alloc_pages page_pool_dev_alloc_frag
>  		 page_pool_dev_alloc page_pool_dev_alloc_va
> @@ -93,6 +94,8 @@ much of the page needs to be synced (starting at ``offset``).
>  When directly freeing pages in the driver (page_pool_put_page())
>  the ``dma_sync_size`` argument specifies how much of the buffer needs
>  to be synced.
> +If the ``dma_sync_size`` argument is 0, page_pool_put_page_nosync() should be
> +used instead of page_pool_put_page().

It would be good to describe when user should call page_pool_put_page_nosync()
and when user should call page_pool_put_page() as the above doesn't really
help user to decide using which API.

As I recall correctly, it seems there is some use case that user is able to
tell that it is ok the skip the dma sync to improve performance by calling
page_pool_put_page_nosync() even when the page_pool is created with
PP_FLAG_DMA_SYNC_DEV flags set.

>  
>  If in doubt set ``offset`` to 0, ``max_len`` to ``PAGE_SIZE`` and
>  pass -1 as ``dma_sync_size``. That combination of arguments is always
> diff --git a/include/net/page_pool/helpers.h b/include/net/page_pool/helpers.h
> index e555921e5233..5cc68d48624a 100644
> --- a/include/net/page_pool/helpers.h
> +++ b/include/net/page_pool/helpers.h
> @@ -340,12 +340,14 @@ static inline void page_pool_put_netmem(struct page_pool *pool,
>   * the allocator owns the page and will try to recycle it in one of the pool
>   * caches. If PP_FLAG_DMA_SYNC_DEV is set, the page will be synced for_device
>   * using dma_sync_single_range_for_device().
> + * page_pool_put_page_nosync() should be used if dma_sync_size is 0.
>   */
>  static inline void page_pool_put_page(struct page_pool *pool,
>  				      struct page *page,
>  				      unsigned int dma_sync_size,
>  				      bool allow_direct)
>  {
> +	DEBUG_NET_WARN_ON_ONCE(!dma_sync_size);
>  	page_pool_put_netmem(pool, page_to_netmem(page), dma_sync_size,
>  			     allow_direct);
>  }
> @@ -372,6 +374,21 @@ static inline void page_pool_put_full_page(struct page_pool *pool,
>  	page_pool_put_netmem(pool, page_to_netmem(page), -1, allow_direct);
>  }
>  
> +/**
> + * page_pool_put_page_nosync() - release a reference on a page pool page
> + * @pool:	pool from which page was allocated
> + * @page:	page to release a reference on
> + * @allow_direct: released by the consumer, allow lockless caching
> + *
> + * Similar to page_pool_put_page(), but will not DMA sync the memory area.
> + */
> +static inline void page_pool_put_page_nosync(struct page_pool *pool,
> +					     struct page *page,
> +					     bool allow_direct)
> +{
> +	page_pool_put_netmem(pool, page_to_netmem(page), 0, allow_direct);
> +}
> +
>  /**
>   * page_pool_recycle_direct() - release a reference on a page pool page
>   * @pool:	pool from which page was allocated