[RFC mm v4 1/2] page_pool: check if nmdesc->pp is !NULL to confirm its usage as pp for net_iov

Byungchul Park posted 2 patches 3 months, 2 weeks ago
There is a newer version of this series
[RFC mm v4 1/2] page_pool: check if nmdesc->pp is !NULL to confirm its usage as pp for net_iov
Posted by Byungchul Park 3 months, 2 weeks ago
->pp_magic field in struct page is current used to identify if a page
belongs to a page pool.  However, ->pp_magic will be removed and page
type bit in struct page e.g. PGTY_netpp should be used for that purpose.

As a preparation, the check for net_iov, that is not page-backed, should
avoid using ->pp_magic since net_iov doens't have to do with page type.
Instead, nmdesc->pp can be used if a net_iov or its nmdesc belongs to a
page pool, by making sure nmdesc->pp is NULL otherwise.

For page-backed netmem, just leave unchanged as is, while for net_iov,
make sure nmdesc->pp is initialized to NULL and use nmdesc->pp for the
check.

Signed-off-by: Byungchul Park <byungchul@sk.com>
---
 net/core/devmem.c      |  1 +
 net/core/netmem_priv.h |  8 ++++++++
 net/core/page_pool.c   | 16 ++++++++++++++--
 3 files changed, 23 insertions(+), 2 deletions(-)

diff --git a/net/core/devmem.c b/net/core/devmem.c
index d9de31a6cc7f..f81b700f1fd1 100644
--- a/net/core/devmem.c
+++ b/net/core/devmem.c
@@ -291,6 +291,7 @@ net_devmem_bind_dmabuf(struct net_device *dev,
 			niov = &owner->area.niovs[i];
 			niov->type = NET_IOV_DMABUF;
 			niov->owner = &owner->area;
+			niov->desc.pp = NULL;
 			page_pool_set_dma_addr_netmem(net_iov_to_netmem(niov),
 						      net_devmem_get_dma_addr(niov));
 			if (direction == DMA_TO_DEVICE)
diff --git a/net/core/netmem_priv.h b/net/core/netmem_priv.h
index 23175cb2bd86..5561fd556bc5 100644
--- a/net/core/netmem_priv.h
+++ b/net/core/netmem_priv.h
@@ -22,6 +22,14 @@ static inline void netmem_clear_pp_magic(netmem_ref netmem)
 
 static inline bool netmem_is_pp(netmem_ref netmem)
 {
+	/* net_iov may be part of a page pool.  For net_iov, ->pp in
+	 * net_iov.desc can be used to determine if the pages belong to
+	 * a page pool.  Ensure that the ->pp either points to its page
+	 * pool or is set to NULL if it does not.
+	 */
+	if (netmem_is_net_iov(netmem))
+		return !!netmem_to_nmdesc(netmem)->pp;
+
 	return (netmem_get_pp_magic(netmem) & PP_MAGIC_MASK) == PP_SIGNATURE;
 }
 
diff --git a/net/core/page_pool.c b/net/core/page_pool.c
index 1a5edec485f1..2756b78754b0 100644
--- a/net/core/page_pool.c
+++ b/net/core/page_pool.c
@@ -699,7 +699,13 @@ s32 page_pool_inflight(const struct page_pool *pool, bool strict)
 void page_pool_set_pp_info(struct page_pool *pool, netmem_ref netmem)
 {
 	netmem_set_pp(netmem, pool);
-	netmem_or_pp_magic(netmem, PP_SIGNATURE);
+
+	/* For page-backed, pp_magic is used to identify if it's pp.
+	 * For net_iov, it's ensured nmdesc->pp is non-NULL if it's pp
+	 * and nmdesc->pp is NULL if it's not.
+	 */
+	if (!netmem_is_net_iov(netmem))
+		netmem_or_pp_magic(netmem, PP_SIGNATURE);
 
 	/* Ensuring all pages have been split into one fragment initially:
 	 * page_pool_set_pp_info() is only called once for every page when it
@@ -714,7 +720,13 @@ void page_pool_set_pp_info(struct page_pool *pool, netmem_ref netmem)
 
 void page_pool_clear_pp_info(netmem_ref netmem)
 {
-	netmem_clear_pp_magic(netmem);
+	/* For page-backed, pp_magic is used to identify if it's pp.
+	 * For net_iov, it's ensured nmdesc->pp is non-NULL if it's pp
+	 * and nmdesc->pp is NULL if it's not.
+	 */
+	if (!netmem_is_net_iov(netmem))
+		netmem_clear_pp_magic(netmem);
+
 	netmem_set_pp(netmem, NULL);
 }
 
-- 
2.17.1
Re: [RFC mm v4 1/2] page_pool: check if nmdesc->pp is !NULL to confirm its usage as pp for net_iov
Posted by Jakub Kicinski 3 months, 1 week ago
On Thu, 23 Oct 2025 16:44:09 +0900 Byungchul Park wrote:
> As a preparation, the check for net_iov, that is not page-backed, should
> avoid using ->pp_magic since net_iov doens't have to do with page type.

doesn't

> Instead, nmdesc->pp can be used if a net_iov or its nmdesc belongs to a
> page pool, by making sure nmdesc->pp is NULL otherwise.

Please explain in the commit message why the new branch in
netmem_is_pp() is necessary. We used to identify the pages based
on PP_SIGNATURE, now we identify them based on page_type.
Re: [RFC mm v4 1/2] page_pool: check if nmdesc->pp is !NULL to confirm its usage as pp for net_iov
Posted by Byungchul Park 3 months, 1 week ago
On Tue, Oct 28, 2025 at 06:33:56PM -0700, Jakub Kicinski wrote:
> On Thu, 23 Oct 2025 16:44:09 +0900 Byungchul Park wrote:
> > As a preparation, the check for net_iov, that is not page-backed, should
> > avoid using ->pp_magic since net_iov doens't have to do with page type.
> 
> doesn't
> 
> > Instead, nmdesc->pp can be used if a net_iov or its nmdesc belongs to a
> > page pool, by making sure nmdesc->pp is NULL otherwise.
> 
> Please explain in the commit message why the new branch in
> netmem_is_pp() is necessary. We used to identify the pages based
> on PP_SIGNATURE, now we identify them based on page_type.

Yes, I will.  It'd be much better.  Thank you very much for the comment.

	Byungchul
Re: [RFC mm v4 1/2] page_pool: check if nmdesc->pp is !NULL to confirm its usage as pp for net_iov
Posted by Mina Almasry 3 months, 1 week ago
On Thu, Oct 23, 2025 at 12:44 AM Byungchul Park <byungchul@sk.com> wrote:
>
> ->pp_magic field in struct page is current used to identify if a page
> belongs to a page pool.  However, ->pp_magic will be removed and page
> type bit in struct page e.g. PGTY_netpp should be used for that purpose.
>
> As a preparation, the check for net_iov, that is not page-backed, should
> avoid using ->pp_magic since net_iov doens't have to do with page type.
> Instead, nmdesc->pp can be used if a net_iov or its nmdesc belongs to a
> page pool, by making sure nmdesc->pp is NULL otherwise.
>
> For page-backed netmem, just leave unchanged as is, while for net_iov,
> make sure nmdesc->pp is initialized to NULL and use nmdesc->pp for the
> check.
>
> Signed-off-by: Byungchul Park <byungchul@sk.com>
> ---
>  net/core/devmem.c      |  1 +
>  net/core/netmem_priv.h |  8 ++++++++
>  net/core/page_pool.c   | 16 ++++++++++++++--
>  3 files changed, 23 insertions(+), 2 deletions(-)
>
> diff --git a/net/core/devmem.c b/net/core/devmem.c
> index d9de31a6cc7f..f81b700f1fd1 100644
> --- a/net/core/devmem.c
> +++ b/net/core/devmem.c
> @@ -291,6 +291,7 @@ net_devmem_bind_dmabuf(struct net_device *dev,
>                         niov = &owner->area.niovs[i];
>                         niov->type = NET_IOV_DMABUF;
>                         niov->owner = &owner->area;
> +                       niov->desc.pp = NULL;

Don't you also need to = NULL the niov allocations in io_uring zcrx,
or is that already done? Maybe mention in commit message.

Other than that, looks correct,

Reviewed-by: Mina Almasry <almasrymina@google.com>
Re: [RFC mm v4 1/2] page_pool: check if nmdesc->pp is !NULL to confirm its usage as pp for net_iov
Posted by Byungchul Park 3 months, 1 week ago
On Mon, Oct 27, 2025 at 06:25:38PM -0700, Mina Almasry wrote:
> On Thu, Oct 23, 2025 at 12:44 AM Byungchul Park <byungchul@sk.com> wrote:
> >
> > ->pp_magic field in struct page is current used to identify if a page
> > belongs to a page pool.  However, ->pp_magic will be removed and page
> > type bit in struct page e.g. PGTY_netpp should be used for that purpose.
> >
> > As a preparation, the check for net_iov, that is not page-backed, should
> > avoid using ->pp_magic since net_iov doens't have to do with page type.
> > Instead, nmdesc->pp can be used if a net_iov or its nmdesc belongs to a
> > page pool, by making sure nmdesc->pp is NULL otherwise.
> >
> > For page-backed netmem, just leave unchanged as is, while for net_iov,
> > make sure nmdesc->pp is initialized to NULL and use nmdesc->pp for the
> > check.
> >
> > Signed-off-by: Byungchul Park <byungchul@sk.com>
> > ---
> >  net/core/devmem.c      |  1 +
> >  net/core/netmem_priv.h |  8 ++++++++
> >  net/core/page_pool.c   | 16 ++++++++++++++--
> >  3 files changed, 23 insertions(+), 2 deletions(-)
> >
> > diff --git a/net/core/devmem.c b/net/core/devmem.c
> > index d9de31a6cc7f..f81b700f1fd1 100644
> > --- a/net/core/devmem.c
> > +++ b/net/core/devmem.c
> > @@ -291,6 +291,7 @@ net_devmem_bind_dmabuf(struct net_device *dev,
> >                         niov = &owner->area.niovs[i];
> >                         niov->type = NET_IOV_DMABUF;
> >                         niov->owner = &owner->area;
> > +                       niov->desc.pp = NULL;
> 
> Don't you also need to = NULL the niov allocations in io_uring zcrx,
> or is that already done? Maybe mention in commit message.

Yes, that's been already done by kvmalloc_array(__GFP_ZERO).  I want to
leave a comment explaining that on io_uring side like:

diff --git a/io_uring/zcrx.c b/io_uring/zcrx.c
index e5ff49f3425e..f771bb3e756d 100644
--- a/io_uring/zcrx.c
+++ b/io_uring/zcrx.c
@@ -444,6 +444,10 @@ static int io_zcrx_create_area(struct io_zcrx_ifq *ifq,
 		area->freelist[i] = i;
 		atomic_set(&area->user_refs[i], 0);
 		niov->type = NET_IOV_IOURING;
+
+		/* niov->pp is already initialized to NULL by
+		 * kvmalloc_array(__GFP_ZERO).
+		 */
 	}
 
 	area->free_count = nr_iovs;

However, I dropped it as Pavel requested:

  https://lore.kernel.org/lkml/8d833a3f-ae18-4ea6-9092-ddaa48290a63@gmail.com/

I will mention it in commit message then.

> Other than that, looks correct,
> 
> Reviewed-by: Mina Almasry <almasrymina@google.com>

Thanks.

	Byungchul