[RFC mm v5 1/2] page_pool: check nmdesc->pp to see its usage as page pool for net_iov not page-backed

Byungchul Park posted 2 patches 1 month, 2 weeks ago
[RFC mm v5 1/2] page_pool: check nmdesc->pp to see its usage as page pool for net_iov not page-backed
Posted by Byungchul Park 1 month, 2 weeks ago
Currently, the condition 'page->pp_magic == PP_SIGNATURE' is used to
determine if a page belongs to a page pool.  However, with the planned
removal of ->pp_magic, we will instead leverage the page_type in struct
page, such as PGTY_netpp, for this purpose.

That works for page-backed network memory.  However, for net_iov not
page-backed, the identification cannot be based on the page_type.
Instead, nmdesc->pp can be used to see if it belongs to a page pool, by
making sure nmdesc->pp is NULL otherwise.

For net_iov not page-backed, initialize it using nmdesc->pp = NULL in
net_devmem_bind_dmabuf() and using kvmalloc_array(__GFP_ZERO) in
io_zcrx_create_area() so that netmem_is_pp() can check if nmdesc->pp is
!NULL to confirm its usage as page pool.

Signed-off-by: Byungchul Park <byungchul@sk.com>
Reviewed-by: Mina Almasry <almasrymina@google.com>
---
 net/core/devmem.c      |  1 +
 net/core/netmem_priv.h |  8 ++++++++
 net/core/page_pool.c   | 16 ++++++++++++++--
 3 files changed, 23 insertions(+), 2 deletions(-)

diff --git a/net/core/devmem.c b/net/core/devmem.c
index d9de31a6cc7f..f81b700f1fd1 100644
--- a/net/core/devmem.c
+++ b/net/core/devmem.c
@@ -291,6 +291,7 @@ net_devmem_bind_dmabuf(struct net_device *dev,
 			niov = &owner->area.niovs[i];
 			niov->type = NET_IOV_DMABUF;
 			niov->owner = &owner->area;
+			niov->desc.pp = NULL;
 			page_pool_set_dma_addr_netmem(net_iov_to_netmem(niov),
 						      net_devmem_get_dma_addr(niov));
 			if (direction == DMA_TO_DEVICE)
diff --git a/net/core/netmem_priv.h b/net/core/netmem_priv.h
index 23175cb2bd86..5561fd556bc5 100644
--- a/net/core/netmem_priv.h
+++ b/net/core/netmem_priv.h
@@ -22,6 +22,14 @@ static inline void netmem_clear_pp_magic(netmem_ref netmem)
 
 static inline bool netmem_is_pp(netmem_ref netmem)
 {
+	/* net_iov may be part of a page pool.  For net_iov, ->pp in
+	 * net_iov.desc can be used to determine if the pages belong to
+	 * a page pool.  Ensure that the ->pp either points to its page
+	 * pool or is set to NULL if it does not.
+	 */
+	if (netmem_is_net_iov(netmem))
+		return !!netmem_to_nmdesc(netmem)->pp;
+
 	return (netmem_get_pp_magic(netmem) & PP_MAGIC_MASK) == PP_SIGNATURE;
 }
 
diff --git a/net/core/page_pool.c b/net/core/page_pool.c
index 1a5edec485f1..2756b78754b0 100644
--- a/net/core/page_pool.c
+++ b/net/core/page_pool.c
@@ -699,7 +699,13 @@ s32 page_pool_inflight(const struct page_pool *pool, bool strict)
 void page_pool_set_pp_info(struct page_pool *pool, netmem_ref netmem)
 {
 	netmem_set_pp(netmem, pool);
-	netmem_or_pp_magic(netmem, PP_SIGNATURE);
+
+	/* For page-backed, pp_magic is used to identify if it's pp.
+	 * For net_iov, it's ensured nmdesc->pp is non-NULL if it's pp
+	 * and nmdesc->pp is NULL if it's not.
+	 */
+	if (!netmem_is_net_iov(netmem))
+		netmem_or_pp_magic(netmem, PP_SIGNATURE);
 
 	/* Ensuring all pages have been split into one fragment initially:
 	 * page_pool_set_pp_info() is only called once for every page when it
@@ -714,7 +720,13 @@ void page_pool_set_pp_info(struct page_pool *pool, netmem_ref netmem)
 
 void page_pool_clear_pp_info(netmem_ref netmem)
 {
-	netmem_clear_pp_magic(netmem);
+	/* For page-backed, pp_magic is used to identify if it's pp.
+	 * For net_iov, it's ensured nmdesc->pp is non-NULL if it's pp
+	 * and nmdesc->pp is NULL if it's not.
+	 */
+	if (!netmem_is_net_iov(netmem))
+		netmem_clear_pp_magic(netmem);
+
 	netmem_set_pp(netmem, NULL);
 }
 
-- 
2.17.1
Re: [RFC mm v5 1/2] page_pool: check nmdesc->pp to see its usage as page pool for net_iov not page-backed
Posted by Jakub Kicinski 1 month, 1 week ago
On Mon,  3 Nov 2025 16:51:07 +0900 Byungchul Park wrote:
> However, for net_iov not
> page-backed, the identification cannot be based on the page_type.
> Instead, nmdesc->pp can be used to see if it belongs to a page pool, by
> making sure nmdesc->pp is NULL otherwise.

Please explain why. Isn't the type just a value in a field?
Which net_iov could also set accordingly.. ?
Re: [RFC mm v5 1/2] page_pool: check nmdesc->pp to see its usage as page pool for net_iov not page-backed
Posted by Byungchul Park 1 month, 1 week ago
On Thu, Nov 06, 2025 at 05:33:20PM -0800, Jakub Kicinski wrote:
> On Mon,  3 Nov 2025 16:51:07 +0900 Byungchul Park wrote:
> > However, for net_iov not
			  ^
		*not* page-backed

> > page-backed, the identification cannot be based on the page_type.
> > Instead, nmdesc->pp can be used to see if it belongs to a page pool, by
> > making sure nmdesc->pp is NULL otherwise.
> 
> Please explain why. Isn't the type just a value in a field?
> Which net_iov could also set accordingly.. ?

page_type field is in 'struct page', so 'struct page' can check the type.

However, the field is not in 'struct net_iov', so 'struct net_iov' that
is not backed by page, cannot use the type checking to see if it's page
pool'ed instance.

I'm afraid I didn't get your questions.  I will try to explain again
properly if you give me more detail and example about your questions or
requirement.

	Byungchul
Re: [RFC mm v5 1/2] page_pool: check nmdesc->pp to see its usage as page pool for net_iov not page-backed
Posted by Jakub Kicinski 1 month, 1 week ago
On Fri, 7 Nov 2025 10:59:02 +0900 Byungchul Park wrote:
> > > page-backed, the identification cannot be based on the page_type.
> > > Instead, nmdesc->pp can be used to see if it belongs to a page pool, by
> > > making sure nmdesc->pp is NULL otherwise.  
> > 
> > Please explain why. Isn't the type just a value in a field?
> > Which net_iov could also set accordingly.. ?  
> 
> page_type field is in 'struct page', so 'struct page' can check the type.
> 
> However, the field is not in 'struct net_iov', so 'struct net_iov' that
> is not backed by page, cannot use the type checking to see if it's page
> pool'ed instance.
> 
> I'm afraid I didn't get your questions.  I will try to explain again
> properly if you give me more detail and example about your questions or
> requirement.

net_iov has members in the same place as page. page_type is just 
a field right now.

static __always_inline int Page##uname(const struct page *page)		\
{									\
	return data_race(page->page_type >> 24) == PGTY_##lname;	\
}									\

The whole thing works right now by overlaying one struct on top of
another, and shared members being in the same places.

Is this clear enough?
Re: [RFC mm v5 1/2] page_pool: check nmdesc->pp to see its usage as page pool for net_iov not page-backed
Posted by Byungchul Park 1 month, 1 week ago
On Thu, Nov 06, 2025 at 06:08:10PM -0800, Jakub Kicinski wrote:
> On Fri, 7 Nov 2025 10:59:02 +0900 Byungchul Park wrote:
> > > > page-backed, the identification cannot be based on the page_type.
> > > > Instead, nmdesc->pp can be used to see if it belongs to a page pool, by
> > > > making sure nmdesc->pp is NULL otherwise.
> > >
> > > Please explain why. Isn't the type just a value in a field?
> > > Which net_iov could also set accordingly.. ?
> >
> > page_type field is in 'struct page', so 'struct page' can check the type.
> >
> > However, the field is not in 'struct net_iov', so 'struct net_iov' that
> > is not backed by page, cannot use the type checking to see if it's page
> > pool'ed instance.
> >
> > I'm afraid I didn't get your questions.  I will try to explain again
> > properly if you give me more detail and example about your questions or
> > requirement.
> 
> net_iov has members in the same place as page. page_type is just
> a field right now.

The current layout is:

  struct page {
	memdesc_flags_t flags;
	union {
		...
		struct {
			unsigned long pp_magic;
			struct page_pool *pp;
			unsigned long _pp_mapping_pad;
			unsigned long dma_addr;
			atomic_long_t pp_ref_count;
		};
		...
	};
	unsigned int page_type;
	...
  };

  struct net_iov {
	union {
		struct netmem_desc desc;
		struct
		{
			unsigned long _flags;
			unsigned long pp_magic;
			struct page_pool *pp;
			unsigned long _pp_mapping_pad;
			unsigned long dma_addr;
			atomic_long_t pp_ref_count;
		};
	};
	struct net_iov_area *owner; // the same offet of page_type
	enum net_iov_type type;
  };

The offset of page_type in struct page cannot be used in struct net_iov
for the same purpose, since the offset in struct net_iov is for storing
(struct net_iov_area *)owner.

Yeah, you can tell 'why don't we add the field, page_type, to struct
net_iov (or struct netmem_desc)' so as to be like:

  struct net_iov {
	union {
		struct netmem_desc desc;
		struct
		{
			unsigned long _flags;
			unsigned long pp_magic;
			struct page_pool *pp;
			unsigned long _pp_mapping_pad;
			unsigned long dma_addr;
			atomic_long_t pp_ref_count;
+			unsigned int page_type; // add this field newly
		};
	};
	struct net_iov_area *owner; // the same offet of page_type
	enum net_iov_type type;
  };

I think we can make it anyway but it makes less sense to add page_type
to struct net_iov, only for PGTY_netpp.

It'd be better to use netmem_desc->pp for that purpose, IMO.

Thoughts?

	Byungchul

> static __always_inline int Page##uname(const struct page *page)         \
> {                                                                       \
>         return data_race(page->page_type >> 24) == PGTY_##lname;        \
> }                                                                       \
> 
> The whole thing works right now by overlaying one struct on top of
> another, and shared members being in the same places.
> 
> Is this clear enough?
Re: [RFC mm v5 1/2] page_pool: check nmdesc->pp to see its usage as page pool for net_iov not page-backed
Posted by Jakub Kicinski 1 month, 1 week ago
On Fri, 7 Nov 2025 13:47:08 +0900 Byungchul Park wrote:
> The offset of page_type in struct page cannot be used in struct net_iov
> for the same purpose, since the offset in struct net_iov is for storing
> (struct net_iov_area *)owner.

owner does not have to be at a fixed offset. Can we not move owner 
to _pp_mapping_pad ? Or reorder it with type, enum net_iov_type 
only has 2 values we can smoosh it with page_type easily.

> Yeah, you can tell 'why don't we add the field, page_type, to struct
> net_iov (or struct netmem_desc)' so as to be like:
> 
>   struct net_iov {
> 	union {
> 		struct netmem_desc desc;
> 		struct
> 		{
> 			unsigned long _flags;
> 			unsigned long pp_magic;
> 			struct page_pool *pp;
> 			unsigned long _pp_mapping_pad;
> 			unsigned long dma_addr;
> 			atomic_long_t pp_ref_count;
> +			unsigned int page_type; // add this field newly
> 		};
> 	};
> 	struct net_iov_area *owner; // the same offet of page_type
> 	enum net_iov_type type;
>   };
> 
> I think we can make it anyway but it makes less sense to add page_type
> to struct net_iov, only for PGTY_netpp.
> 
> It'd be better to use netmem_desc->pp for that purpose, IMO.
Re: [RFC mm v5 1/2] page_pool: check nmdesc->pp to see its usage as page pool for net_iov not page-backed
Posted by Byungchul Park 1 month, 1 week ago
On Fri, Nov 07, 2025 at 05:41:29PM -0800, Jakub Kicinski wrote:
> On Fri, 7 Nov 2025 13:47:08 +0900 Byungchul Park wrote:
> > The offset of page_type in struct page cannot be used in struct net_iov
> > for the same purpose, since the offset in struct net_iov is for storing
> > (struct net_iov_area *)owner.
> 
> owner does not have to be at a fixed offset. Can we not move owner
> to _pp_mapping_pad ? Or reorder it with type, enum net_iov_type
> only has 2 values we can smoosh it with page_type easily.

At the beginning I tried to smoosh it with page_type but it requires the
additional logic or something to save and restore the value of enum
net_iov_type if the netmem is net_iov when updating page_type like:

  if (netmem_is_net_iov(netmem))
	type = get_type_from_niov(netmem_to_niov(netmem));

  /* Unconditionally clear the value of enum net_iov_type. */
  __{Set,Clear}PageNetpp(__netmem_to_page((__force unsigned long)netmem & ~NET_IOV));

  if (netmem_is_net_iov(netmem))
	set_type_to_niov(netmem_to_niov(netmem), type);

Or page_pool_{set,clear}_pp_info() callers should handle it in a similar
way.  I'd like to check if you are okay with this approach.

Or I can make it simpler by adding page_type to struct net_iov like the
following.  Jakub, do these approachs work for you?

	Byungchul

---
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c
index 5d51600935a6..def274f5c1ca 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c
@@ -707,7 +707,7 @@ static void mlx5e_free_xdpsq_desc(struct mlx5e_xdpsq *sq,
 				xdpi = mlx5e_xdpi_fifo_pop(xdpi_fifo);
 				page = xdpi.page.page;
 
-				/* No need to check page_pool_page_is_pp() as we
+				/* No need to check PageNetpp() as we
 				 * know this is a page_pool page.
 				 */
 				page_pool_recycle_direct(pp_page_to_nmdesc(page)->pp,
diff --git a/include/linux/mm.h b/include/linux/mm.h
index a3f97c551ad8..081e365caa1a 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -4252,10 +4252,9 @@ int arch_lock_shadow_stack_status(struct task_struct *t, unsigned long status);
  * DMA mapping IDs for page_pool
  *
  * When DMA-mapping a page, page_pool allocates an ID (from an xarray) and
- * stashes it in the upper bits of page->pp_magic. We always want to be able to
- * unambiguously identify page pool pages (using page_pool_page_is_pp()). Non-PP
- * pages can have arbitrary kernel pointers stored in the same field as pp_magic
- * (since it overlaps with page->lru.next), so we must ensure that we cannot
+ * stashes it in the upper bits of page->pp_magic. Non-PP pages can have
+ * arbitrary kernel pointers stored in the same field as pp_magic (since
+ * it overlaps with page->lru.next), so we must ensure that we cannot
  * mistake a valid kernel pointer with any of the values we write into this
  * field.
  *
@@ -4290,26 +4289,6 @@ int arch_lock_shadow_stack_status(struct task_struct *t, unsigned long status);
 #define PP_DMA_INDEX_MASK GENMASK(PP_DMA_INDEX_BITS + PP_DMA_INDEX_SHIFT - 1, \
 				  PP_DMA_INDEX_SHIFT)
 
-/* Mask used for checking in page_pool_page_is_pp() below. page->pp_magic is
- * OR'ed with PP_SIGNATURE after the allocation in order to preserve bit 0 for
- * the head page of compound page and bit 1 for pfmemalloc page, as well as the
- * bits used for the DMA index. page_is_pfmemalloc() is checked in
- * __page_pool_put_page() to avoid recycling the pfmemalloc page.
- */
-#define PP_MAGIC_MASK ~(PP_DMA_INDEX_MASK | 0x3UL)
-
-#ifdef CONFIG_PAGE_POOL
-static inline bool page_pool_page_is_pp(const struct page *page)
-{
-	return (page->pp_magic & PP_MAGIC_MASK) == PP_SIGNATURE;
-}
-#else
-static inline bool page_pool_page_is_pp(const struct page *page)
-{
-	return false;
-}
-#endif
-
 #define PAGE_SNAPSHOT_FAITHFUL (1 << 0)
 #define PAGE_SNAPSHOT_PG_BUDDY (1 << 1)
 #define PAGE_SNAPSHOT_PG_IDLE  (1 << 2)
diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index 0091ad1986bf..edf5418c91dd 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -934,6 +934,7 @@ enum pagetype {
 	PGTY_zsmalloc		= 0xf6,
 	PGTY_unaccepted		= 0xf7,
 	PGTY_large_kmalloc	= 0xf8,
+	PGTY_netpp		= 0xf9,
 
 	PGTY_mapcount_underflow = 0xff
 };
@@ -1078,6 +1079,11 @@ PAGE_TYPE_OPS(Zsmalloc, zsmalloc, zsmalloc)
 PAGE_TYPE_OPS(Unaccepted, unaccepted, unaccepted)
 FOLIO_TYPE_OPS(large_kmalloc, large_kmalloc)
 
+/*
+ * Marks page_pool allocated pages.
+ */
+PAGE_TYPE_OPS(Netpp, netpp, netpp)
+
 /**
  * PageHuge - Determine if the page belongs to hugetlbfs
  * @page: The page to test.
diff --git a/include/net/netmem.h b/include/net/netmem.h
index 651e2c62d1dd..b42d75ecd411 100644
--- a/include/net/netmem.h
+++ b/include/net/netmem.h
@@ -114,10 +114,21 @@ struct net_iov {
 			atomic_long_t pp_ref_count;
 		};
 	};
+
+	unsigned int page_type;
 	struct net_iov_area *owner;
 	enum net_iov_type type;
 };
 
+/* Make sure 'the offset of page_type in struct page == the offset of
+ * type in struct net_iov'.
+ */
+#define NET_IOV_ASSERT_OFFSET(pg, iov)			\
+	static_assert(offsetof(struct page, pg) ==	\
+		      offsetof(struct net_iov, iov))
+NET_IOV_ASSERT_OFFSET(page_type, page_type);
+#undef NET_IOV_ASSERT_OFFSET
+
 struct net_iov_area {
 	/* Array of net_iovs for this area. */
 	struct net_iov *niovs;
@@ -260,7 +271,7 @@ static inline unsigned long netmem_pfn_trace(netmem_ref netmem)
  */
 #define pp_page_to_nmdesc(p)						\
 ({									\
-	DEBUG_NET_WARN_ON_ONCE(!page_pool_page_is_pp(p));		\
+	DEBUG_NET_WARN_ON_ONCE(!PageNetpp(p));				\
 	__pp_page_to_nmdesc(p);						\
 })
 
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 600d9e981c23..01dd14123065 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1041,7 +1041,6 @@ static inline bool page_expected_state(struct page *page,
 #ifdef CONFIG_MEMCG
 			page->memcg_data |
 #endif
-			page_pool_page_is_pp(page) |
 			(page->flags.f & check_flags)))
 		return false;
 
@@ -1068,8 +1067,6 @@ static const char *page_bad_reason(struct page *page, unsigned long flags)
 	if (unlikely(page->memcg_data))
 		bad_reason = "page still charged to cgroup";
 #endif
-	if (unlikely(page_pool_page_is_pp(page)))
-		bad_reason = "page_pool leak";
 	return bad_reason;
 }
 
@@ -1378,9 +1375,12 @@ __always_inline bool free_pages_prepare(struct page *page,
 		mod_mthp_stat(order, MTHP_STAT_NR_ANON, -1);
 		folio->mapping = NULL;
 	}
-	if (unlikely(page_has_type(page)))
+	if (unlikely(page_has_type(page))) {
+		/* networking expects to clear its page type before releasing */
+		WARN_ON_ONCE(PageNetpp(page));
 		/* Reset the page_type (which overlays _mapcount) */
 		page->page_type = UINT_MAX;
+	}
 
 	if (is_check_pages_enabled()) {
 		if (free_page_is_bad(page))
diff --git a/net/core/netmem_priv.h b/net/core/netmem_priv.h
index 23175cb2bd86..6b7d90b8ebb9 100644
--- a/net/core/netmem_priv.h
+++ b/net/core/netmem_priv.h
@@ -8,21 +8,15 @@ static inline unsigned long netmem_get_pp_magic(netmem_ref netmem)
 	return netmem_to_nmdesc(netmem)->pp_magic & ~PP_DMA_INDEX_MASK;
 }
 
-static inline void netmem_or_pp_magic(netmem_ref netmem, unsigned long pp_magic)
-{
-	netmem_to_nmdesc(netmem)->pp_magic |= pp_magic;
-}
-
-static inline void netmem_clear_pp_magic(netmem_ref netmem)
-{
-	WARN_ON_ONCE(netmem_to_nmdesc(netmem)->pp_magic & PP_DMA_INDEX_MASK);
-
-	netmem_to_nmdesc(netmem)->pp_magic = 0;
-}
-
 static inline bool netmem_is_pp(netmem_ref netmem)
 {
-	return (netmem_get_pp_magic(netmem) & PP_MAGIC_MASK) == PP_SIGNATURE;
+	/* XXX: Now that the offset of page_type is shared between
+	 * struct page and net_iov, just cast the netmem to struct page
+	 * unconditionally by clearing NET_IOV if any, no matter whether
+	 * it comes from struct net_iov or struct page.  This should be
+	 * adjusted once the offset is no longer shared.
+	 */
+	return PageNetpp(__netmem_to_page((__force unsigned long)netmem & ~NET_IOV));
 }
 
 static inline void netmem_set_pp(netmem_ref netmem, struct page_pool *pool)
diff --git a/net/core/page_pool.c b/net/core/page_pool.c
index 1a5edec485f1..27650ca789d1 100644
--- a/net/core/page_pool.c
+++ b/net/core/page_pool.c
@@ -699,7 +699,14 @@ s32 page_pool_inflight(const struct page_pool *pool, bool strict)
 void page_pool_set_pp_info(struct page_pool *pool, netmem_ref netmem)
 {
 	netmem_set_pp(netmem, pool);
-	netmem_or_pp_magic(netmem, PP_SIGNATURE);
+
+	/* XXX: Now that the offset of page_type is shared between
+	 * struct page and net_iov, just cast the netmem to struct page
+	 * unconditionally by clearing NET_IOV if any, no matter whether
+	 * it comes from struct net_iov or struct page.  This should be
+	 * adjusted once the offset is no longer shared.
+	 */
+	__SetPageNetpp(__netmem_to_page((__force unsigned long)netmem & ~NET_IOV));
 
 	/* Ensuring all pages have been split into one fragment initially:
 	 * page_pool_set_pp_info() is only called once for every page when it
@@ -714,7 +721,14 @@ void page_pool_set_pp_info(struct page_pool *pool, netmem_ref netmem)
 
 void page_pool_clear_pp_info(netmem_ref netmem)
 {
-	netmem_clear_pp_magic(netmem);
+	/* XXX: Now that the offset of page_type is shared between
+	 * struct page and net_iov, just cast the netmem to struct page
+	 * unconditionally by clearing NET_IOV if any, no matter whether
+	 * it comes from struct net_iov or struct page.  This should be
+	 * adjusted once the offset is no longer shared.
+	 */
+	__ClearPageNetpp(__netmem_to_page((__force unsigned long)netmem & ~NET_IOV));
+
 	netmem_set_pp(netmem, NULL);
 }
Re: [RFC mm v5 1/2] page_pool: check nmdesc->pp to see its usage as page pool for net_iov not page-backed
Posted by Jakub Kicinski 1 month ago
On Wed, 12 Nov 2025 16:41:18 +0900 Byungchul Park wrote:
> diff --git a/include/net/netmem.h b/include/net/netmem.h
> index 651e2c62d1dd..b42d75ecd411 100644
> --- a/include/net/netmem.h
> +++ b/include/net/netmem.h
> @@ -114,10 +114,21 @@ struct net_iov {
>  			atomic_long_t pp_ref_count;
>  		};
>  	};
> +
> +	unsigned int page_type;
>  	struct net_iov_area *owner;
>  	enum net_iov_type type;

type is 4B already in net-next, so you may want to reorder @type 
with @owner to avoid a hole and increasing the struct size. 
Other than that LGTM!

struct net_iov {
	union {
		struct netmem_desc desc;                 /*     0    48 */
		struct {
			long unsigned int _flags;        /*     0     8 */
			long unsigned int pp_magic;      /*     8     8 */
			struct page_pool * pp;           /*    16     8 */
			long unsigned int _pp_mapping_pad; /*    24     8 */
			long unsigned int dma_addr;      /*    32     8 */
			atomic_long_t pp_ref_count;      /*    40     8 */
		};                                       /*     0    48 */
	};                                               /*     0    48 */
	struct net_iov_area *      owner;                /*    48     8 */
	enum net_iov_type          type;                 /*    56     4 */

	/* size: 64, cachelines: 1, members: 3 */
	/* padding: 4 */
};
Re: [RFC mm v5 1/2] page_pool: check nmdesc->pp to see its usage as page pool for net_iov not page-backed
Posted by Byungchul Park 1 month ago
On Fri, Nov 14, 2025 at 05:23:18PM -0800, Jakub Kicinski wrote:
> On Wed, 12 Nov 2025 16:41:18 +0900 Byungchul Park wrote:
> > diff --git a/include/net/netmem.h b/include/net/netmem.h
> > index 651e2c62d1dd..b42d75ecd411 100644
> > --- a/include/net/netmem.h
> > +++ b/include/net/netmem.h
> > @@ -114,10 +114,21 @@ struct net_iov {
> >                       atomic_long_t pp_ref_count;
> >               };
> >       };
> > +
> > +     unsigned int page_type;
> >       struct net_iov_area *owner;
> >       enum net_iov_type type;
> 
> type is 4B already in net-next, so you may want to reorder @type
> with @owner to avoid a hole and increasing the struct size.

Sure.  Better reorder them.  I will.  Thanks.

	Byungchul

> Other than that LGTM!
> 
> struct net_iov {
>         union {
>                 struct netmem_desc desc;                 /*     0    48 */
>                 struct {
>                         long unsigned int _flags;        /*     0     8 */
>                         long unsigned int pp_magic;      /*     8     8 */
>                         struct page_pool * pp;           /*    16     8 */
>                         long unsigned int _pp_mapping_pad; /*    24     8 */
>                         long unsigned int dma_addr;      /*    32     8 */
>                         atomic_long_t pp_ref_count;      /*    40     8 */
>                 };                                       /*     0    48 */
>         };                                               /*     0    48 */
>         struct net_iov_area *      owner;                /*    48     8 */
>         enum net_iov_type          type;                 /*    56     4 */
> 
>         /* size: 64, cachelines: 1, members: 3 */
>         /* padding: 4 */
> };
Re: [RFC mm v5 1/2] page_pool: check nmdesc->pp to see its usage as page pool for net_iov not page-backed
Posted by Byungchul Park 1 month, 1 week ago
On Fri, Nov 07, 2025 at 05:41:29PM -0800, Jakub Kicinski wrote:
> On Fri, 7 Nov 2025 13:47:08 +0900 Byungchul Park wrote:
> > The offset of page_type in struct page cannot be used in struct net_iov
> > for the same purpose, since the offset in struct net_iov is for storing
> > (struct net_iov_area *)owner.
> 
> owner does not have to be at a fixed offset. Can we not move owner
> to _pp_mapping_pad ? Or reorder it with type, enum net_iov_type
> only has 2 values we can smoosh it with page_type easily.

I'm still confused.  I think you probably understand what this work is
for.  (I've explained several times with related links.)  Or am I
missing something from your questions?

I've answered your question directly since you asked, but the point is
that, struct net_iov will no longer overlay on struct page.

Instead, struct netmem_desc will be responsible for keeping the pp
fields while struct page will lay down the resonsibility, once the pp
fields will be removed from struct page like:

<before> (the current form is:)

   struct page {
 	memdesc_flags_t flags;
 	union {
 		...
 		struct {
 			unsigned long pp_magic;
 			struct page_pool *pp;
 			unsigned long _pp_mapping_pad;
 			unsigned long dma_addr;
 			atomic_long_t pp_ref_count;
 		};
 		...
 	};
 	unsigned int page_type;
 	...
   };
 
   struct net_iov {
 	union {
 		struct netmem_desc desc;
 		struct
 		{
 			unsigned long _flags;
 			unsigned long pp_magic;
 			struct page_pool *pp;
 			unsigned long _pp_mapping_pad;
 			unsigned long dma_addr;
 			atomic_long_t pp_ref_count;
 		};
 	};
 	struct net_iov_area *owner;
 	enum net_iov_type type;
   };

<after> (the final form should be, just around the corner:)

   struct page {
 	memdesc_flags_t flags;
 	union {
 		...
		/* pp fields are gone. */
 		...
 	};
 	unsigned int page_type;
 	...
   };
 
   struct net_iov {
 	struct netmem_desc desc;
 	struct net_iov_area *owner;
 	enum net_iov_type type;
   };

After that, struct page and struct net_iov(or struct netmem_desc) will
not share any fields with each other, probably they will be connected
e.i. through some ways between struct page and netmem_desc tho.

	Byungchul
 
> > Yeah, you can tell 'why don't we add the field, page_type, to struct
> > net_iov (or struct netmem_desc)' so as to be like:
> >
> >   struct net_iov {
> >       union {
> >               struct netmem_desc desc;
> >               struct
> >               {
> >                       unsigned long _flags;
> >                       unsigned long pp_magic;
> >                       struct page_pool *pp;
> >                       unsigned long _pp_mapping_pad;
> >                       unsigned long dma_addr;
> >                       atomic_long_t pp_ref_count;
> > +                     unsigned int page_type; // add this field newly
> >               };
> >       };
> >       struct net_iov_area *owner; // the same offet of page_type
> >       enum net_iov_type type;
> >   };
> >
> > I think we can make it anyway but it makes less sense to add page_type
> > to struct net_iov, only for PGTY_netpp.
> >
> > It'd be better to use netmem_desc->pp for that purpose, IMO.
Re: [RFC mm v5 1/2] page_pool: check nmdesc->pp to see its usage as page pool for net_iov not page-backed
Posted by Jakub Kicinski 1 month, 1 week ago
On Sat, 8 Nov 2025 11:24:58 +0900 Byungchul Park wrote:
> On Fri, Nov 07, 2025 at 05:41:29PM -0800, Jakub Kicinski wrote:
> > On Fri, 7 Nov 2025 13:47:08 +0900 Byungchul Park wrote:  
> > > The offset of page_type in struct page cannot be used in struct net_iov
> > > for the same purpose, since the offset in struct net_iov is for storing
> > > (struct net_iov_area *)owner.  
> > 
> > owner does not have to be at a fixed offset. Can we not move owner
> > to _pp_mapping_pad ? Or reorder it with type, enum net_iov_type
> > only has 2 values we can smoosh it with page_type easily.  
> 
> I'm still confused.  I think you probably understand what this work is
> for.  (I've explained several times with related links.)  Or am I
> missing something from your questions?
> 
> I've answered your question directly since you asked, but the point is
> that, struct net_iov will no longer overlay on struct page.
> 
> Instead, struct netmem_desc will be responsible for keeping the pp
> fields while struct page will lay down the resonsibility, once the pp
> fields will be removed from struct page like:

I understand the end goal. I don't understand why patch 1 is a step 
in that direction, and you seem incapable of explaining it. So please
either follow my suggestion on how to proceed with patch 2 without
patch 1 in current form. Or come back when have the full conversion
ready.
Re: [RFC mm v5 1/2] page_pool: check nmdesc->pp to see its usage as page pool for net_iov not page-backed
Posted by Byungchul Park 1 month, 1 week ago
On Fri, Nov 07, 2025 at 06:37:12PM -0800, Jakub Kicinski wrote:
> On Sat, 8 Nov 2025 11:24:58 +0900 Byungchul Park wrote:
> > On Fri, Nov 07, 2025 at 05:41:29PM -0800, Jakub Kicinski wrote:
> > > On Fri, 7 Nov 2025 13:47:08 +0900 Byungchul Park wrote:
> > > > The offset of page_type in struct page cannot be used in struct net_iov
> > > > for the same purpose, since the offset in struct net_iov is for storing
> > > > (struct net_iov_area *)owner.
> > >
> > > owner does not have to be at a fixed offset. Can we not move owner
> > > to _pp_mapping_pad ? Or reorder it with type, enum net_iov_type
> > > only has 2 values we can smoosh it with page_type easily.
> >
> > I'm still confused.  I think you probably understand what this work is
> > for.  (I've explained several times with related links.)  Or am I
> > missing something from your questions?
> >
> > I've answered your question directly since you asked, but the point is
> > that, struct net_iov will no longer overlay on struct page.
> >
> > Instead, struct netmem_desc will be responsible for keeping the pp
> > fields while struct page will lay down the resonsibility, once the pp
> > fields will be removed from struct page like:
> 
> I understand the end goal. I don't understand why patch 1 is a step
> in that direction, and you seem incapable of explaining it. So please
> either follow my suggestion on how to proceed with patch 2 without

struct page and struct netmem_desc should keep difference information.
Even though they are sharing some fields at the moment, it should
eventually be decoupled, which I'm working on now.

> patch 1 in current form. Or come back when have the full conversion
> ready.

This patch set represents the final phase of the full conversion process,
awaiting the next steps. Once this patch is completed, the entire
conversion will be finished, allowing for the final patch that removes
the pp fields from the struct page to be carried out.

	Byungchul
Re: [RFC mm v5 1/2] page_pool: check nmdesc->pp to see its usage as page pool for net_iov not page-backed
Posted by Byungchul Park 1 month, 1 week ago
On Mon, Nov 10, 2025 at 10:09:26AM +0900, Byungchul Park wrote:
> On Fri, Nov 07, 2025 at 06:37:12PM -0800, Jakub Kicinski wrote:
> > On Sat, 8 Nov 2025 11:24:58 +0900 Byungchul Park wrote:
> > > On Fri, Nov 07, 2025 at 05:41:29PM -0800, Jakub Kicinski wrote:
> > > > On Fri, 7 Nov 2025 13:47:08 +0900 Byungchul Park wrote:
> > > > > The offset of page_type in struct page cannot be used in struct net_iov
> > > > > for the same purpose, since the offset in struct net_iov is for storing
> > > > > (struct net_iov_area *)owner.
> > > >
> > > > owner does not have to be at a fixed offset. Can we not move owner
> > > > to _pp_mapping_pad ? Or reorder it with type, enum net_iov_type
> > > > only has 2 values we can smoosh it with page_type easily.
> > >
> > > I'm still confused.  I think you probably understand what this work is
> > > for.  (I've explained several times with related links.)  Or am I
> > > missing something from your questions?
> > >
> > > I've answered your question directly since you asked, but the point is
> > > that, struct net_iov will no longer overlay on struct page.
> > >
> > > Instead, struct netmem_desc will be responsible for keeping the pp
> > > fields while struct page will lay down the resonsibility, once the pp
> > > fields will be removed from struct page like:
> > 
> > I understand the end goal. I don't understand why patch 1 is a step
> > in that direction, and you seem incapable of explaining it. So please
> > either follow my suggestion on how to proceed with patch 2 without
> 
> struct page and struct netmem_desc should keep difference information.
> Even though they are sharing some fields at the moment, it should
> eventually be decoupled, which I'm working on now.

I'm removing the shared space between struct page and struct net_iov so
as to make struct page look its own way to be shrinked and let struct
net_iov be independent.

Introduing a new shared space for page type is non-sense.  Still not
clear to you?

	Byungchul

> > patch 1 in current form. Or come back when have the full conversion
> > ready.
> 
> This patch set represents the final phase of the full conversion process,
> awaiting the next steps. Once this patch is completed, the entire
> conversion will be finished, allowing for the final patch that removes
> the pp fields from the struct page to be carried out.
> 
> 	Byungchul
Re: [RFC mm v5 1/2] page_pool: check nmdesc->pp to see its usage as page pool for net_iov not page-backed
Posted by Jakub Kicinski 1 month, 1 week ago
On Tue, 11 Nov 2025 10:40:52 +0900 Byungchul Park wrote:
> > > I understand the end goal. I don't understand why patch 1 is a step
> > > in that direction, and you seem incapable of explaining it. So please
> > > either follow my suggestion on how to proceed with patch 2 without  
> > 
> > struct page and struct netmem_desc should keep difference information.
> > Even though they are sharing some fields at the moment, it should
> > eventually be decoupled, which I'm working on now.  
> 
> I'm removing the shared space between struct page and struct net_iov so
> as to make struct page look its own way to be shrinked and let struct
> net_iov be independent.
> 
> Introduing a new shared space for page type is non-sense.  Still not
> clear to you?

I've spent enough time reasoning with out and suggesting alternatives.
If you respin this please carry:

Nacked-by: Jakub Kicinski <kuba@kernel.org>

Until I say otherwise.
Re: [RFC mm v5 1/2] page_pool: check nmdesc->pp to see its usage as page pool for net_iov not page-backed
Posted by Byungchul Park 1 month, 1 week ago
On Mon, Nov 10, 2025 at 05:56:50PM -0800, Jakub Kicinski wrote:
> On Tue, 11 Nov 2025 10:40:52 +0900 Byungchul Park wrote:
> > > > I understand the end goal. I don't understand why patch 1 is a step
> > > > in that direction, and you seem incapable of explaining it. So please
> > > > either follow my suggestion on how to proceed with patch 2 without
> > >
> > > struct page and struct netmem_desc should keep difference information.
> > > Even though they are sharing some fields at the moment, it should
> > > eventually be decoupled, which I'm working on now.
> >
> > I'm removing the shared space between struct page and struct net_iov so
> > as to make struct page look its own way to be shrinked and let struct
> > net_iov be independent.
> >
> > Introduing a new shared space for page type is non-sense.  Still not
> > clear to you?
> 
> I've spent enough time reasoning with out and suggesting alternatives.

I'm not trying to be arguing but trying my best to understand you and
want to adopt your opinion.  However, it's not about objection but I
really don't understand what you meant.  Can anyone explain what he
meant who understood?

	Byungchul

> If you respin this please carry:
> 
> Nacked-by: Jakub Kicinski <kuba@kernel.org>
> 
> Until I say otherwise.
Re: [RFC mm v5 1/2] page_pool: check nmdesc->pp to see its usage as page pool for net_iov not page-backed
Posted by Byungchul Park 1 month, 1 week ago
On Tue, Nov 11, 2025 at 11:17:41AM +0900, Byungchul Park wrote:
> On Mon, Nov 10, 2025 at 05:56:50PM -0800, Jakub Kicinski wrote:
> > On Tue, 11 Nov 2025 10:40:52 +0900 Byungchul Park wrote:
> > > > > I understand the end goal. I don't understand why patch 1 is a step
> > > > > in that direction, and you seem incapable of explaining it. So please
> > > > > either follow my suggestion on how to proceed with patch 2 without
> > > >
> > > > struct page and struct netmem_desc should keep difference information.
> > > > Even though they are sharing some fields at the moment, it should
> > > > eventually be decoupled, which I'm working on now.
> > >
> > > I'm removing the shared space between struct page and struct net_iov so
> > > as to make struct page look its own way to be shrinked and let struct
> > > net_iov be independent.
> > >
> > > Introduing a new shared space for page type is non-sense.  Still not
> > > clear to you?
> > 
> > I've spent enough time reasoning with out and suggesting alternatives.
> 
> I'm not trying to be arguing but trying my best to understand you and
> want to adopt your opinion.  However, it's not about objection but I
> really don't understand what you meant.  Can anyone explain what he
> meant who understood?

If no objection against Jakub's opinion, I will resend with his
alternaltive applied.

	Byungchul

> 	Byungchul
> 
> > If you respin this please carry:
> > 
> > Nacked-by: Jakub Kicinski <kuba@kernel.org>
> > 
> > Until I say otherwise.
Re: [RFC mm v5 1/2] page_pool: check nmdesc->pp to see its usage as page pool for net_iov not page-backed
Posted by Toke Høiland-Jørgensen 1 month, 1 week ago
Byungchul Park <byungchul@sk.com> writes:

> On Tue, Nov 11, 2025 at 11:17:41AM +0900, Byungchul Park wrote:
>> On Mon, Nov 10, 2025 at 05:56:50PM -0800, Jakub Kicinski wrote:
>> > On Tue, 11 Nov 2025 10:40:52 +0900 Byungchul Park wrote:
>> > > > > I understand the end goal. I don't understand why patch 1 is a step
>> > > > > in that direction, and you seem incapable of explaining it. So please
>> > > > > either follow my suggestion on how to proceed with patch 2 without
>> > > >
>> > > > struct page and struct netmem_desc should keep difference information.
>> > > > Even though they are sharing some fields at the moment, it should
>> > > > eventually be decoupled, which I'm working on now.
>> > >
>> > > I'm removing the shared space between struct page and struct net_iov so
>> > > as to make struct page look its own way to be shrinked and let struct
>> > > net_iov be independent.
>> > >
>> > > Introduing a new shared space for page type is non-sense.  Still not
>> > > clear to you?
>> > 
>> > I've spent enough time reasoning with out and suggesting alternatives.
>> 
>> I'm not trying to be arguing but trying my best to understand you and
>> want to adopt your opinion.  However, it's not about objection but I
>> really don't understand what you meant.  Can anyone explain what he
>> meant who understood?
>
> If no objection against Jakub's opinion, I will resend with his
> alternaltive applied.

No objection from me :)

-Toke
Re: [RFC mm v5 1/2] page_pool: check nmdesc->pp to see its usage as page pool for net_iov not page-backed
Posted by Byungchul Park 1 month, 1 week ago
On Sat, Nov 08, 2025 at 11:24:58AM +0900, Byungchul Park wrote:
> On Fri, Nov 07, 2025 at 05:41:29PM -0800, Jakub Kicinski wrote:
> > On Fri, 7 Nov 2025 13:47:08 +0900 Byungchul Park wrote:
> > > The offset of page_type in struct page cannot be used in struct net_iov
> > > for the same purpose, since the offset in struct net_iov is for storing
> > > (struct net_iov_area *)owner.
> > 
> > owner does not have to be at a fixed offset. Can we not move owner
> > to _pp_mapping_pad ? Or reorder it with type, enum net_iov_type
> > only has 2 values we can smoosh it with page_type easily.
> 
> I'm still confused.  I think you probably understand what this work is
> for.  (I've explained several times with related links.)  Or am I
         ^
Please don't mind.  It's not a blame.

	Byungchul

> missing something from your questions?
> 
> I've answered your question directly since you asked, but the point is
> that, struct net_iov will no longer overlay on struct page.
> 
> Instead, struct netmem_desc will be responsible for keeping the pp
> fields while struct page will lay down the resonsibility, once the pp
> fields will be removed from struct page like:
> 
> <before> (the current form is:)
> 
>    struct page {
>  	memdesc_flags_t flags;
>  	union {
>  		...
>  		struct {
>  			unsigned long pp_magic;
>  			struct page_pool *pp;
>  			unsigned long _pp_mapping_pad;
>  			unsigned long dma_addr;
>  			atomic_long_t pp_ref_count;
>  		};
>  		...
>  	};
>  	unsigned int page_type;
>  	...
>    };
>  
>    struct net_iov {
>  	union {
>  		struct netmem_desc desc;
>  		struct
>  		{
>  			unsigned long _flags;
>  			unsigned long pp_magic;
>  			struct page_pool *pp;
>  			unsigned long _pp_mapping_pad;
>  			unsigned long dma_addr;
>  			atomic_long_t pp_ref_count;
>  		};
>  	};
>  	struct net_iov_area *owner;
>  	enum net_iov_type type;
>    };
> 
> <after> (the final form should be, just around the corner:)
> 
>    struct page {
>  	memdesc_flags_t flags;
>  	union {
>  		...
> 		/* pp fields are gone. */
>  		...
>  	};
>  	unsigned int page_type;
>  	...
>    };
>  
>    struct net_iov {
>  	struct netmem_desc desc;
>  	struct net_iov_area *owner;
>  	enum net_iov_type type;
>    };
> 
> After that, struct page and struct net_iov(or struct netmem_desc) will
> not share any fields with each other, probably they will be connected
> e.i. through some ways between struct page and netmem_desc tho.
> 
> 	Byungchul
>  
> > > Yeah, you can tell 'why don't we add the field, page_type, to struct
> > > net_iov (or struct netmem_desc)' so as to be like:
> > >
> > >   struct net_iov {
> > >       union {
> > >               struct netmem_desc desc;
> > >               struct
> > >               {
> > >                       unsigned long _flags;
> > >                       unsigned long pp_magic;
> > >                       struct page_pool *pp;
> > >                       unsigned long _pp_mapping_pad;
> > >                       unsigned long dma_addr;
> > >                       atomic_long_t pp_ref_count;
> > > +                     unsigned int page_type; // add this field newly
> > >               };
> > >       };
> > >       struct net_iov_area *owner; // the same offet of page_type
> > >       enum net_iov_type type;
> > >   };
> > >
> > > I think we can make it anyway but it makes less sense to add page_type
> > > to struct net_iov, only for PGTY_netpp.
> > >
> > > It'd be better to use netmem_desc->pp for that purpose, IMO.
Re: [RFC mm v5 1/2] page_pool: check nmdesc->pp to see its usage as page pool for net_iov not page-backed
Posted by Pavel Begunkov 1 month, 1 week ago
On 11/3/25 07:51, Byungchul Park wrote:
> Currently, the condition 'page->pp_magic == PP_SIGNATURE' is used to
> determine if a page belongs to a page pool.  However, with the planned
> removal of ->pp_magic, we will instead leverage the page_type in struct
> page, such as PGTY_netpp, for this purpose.
> 
> That works for page-backed network memory.  However, for net_iov not
> page-backed, the identification cannot be based on the page_type.
> Instead, nmdesc->pp can be used to see if it belongs to a page pool, by
> making sure nmdesc->pp is NULL otherwise.
> 
> For net_iov not page-backed, initialize it using nmdesc->pp = NULL in
> net_devmem_bind_dmabuf() and using kvmalloc_array(__GFP_ZERO) in
> io_zcrx_create_area() so that netmem_is_pp() can check if nmdesc->pp is
> !NULL to confirm its usage as page pool.

Reviewed-by: Pavel Begunkov <asml.silence@gmail.com>

-- 
Pavel Begunkov
Re: [RFC mm v5 1/2] page_pool: check nmdesc->pp to see its usage as page pool for net_iov not page-backed
Posted by Toke Høiland-Jørgensen 1 month, 2 weeks ago
Byungchul Park <byungchul@sk.com> writes:

> Currently, the condition 'page->pp_magic == PP_SIGNATURE' is used to
> determine if a page belongs to a page pool.  However, with the planned
> removal of ->pp_magic, we will instead leverage the page_type in struct
> page, such as PGTY_netpp, for this purpose.
>
> That works for page-backed network memory.  However, for net_iov not
> page-backed, the identification cannot be based on the page_type.
> Instead, nmdesc->pp can be used to see if it belongs to a page pool, by
> making sure nmdesc->pp is NULL otherwise.
>
> For net_iov not page-backed, initialize it using nmdesc->pp = NULL in
> net_devmem_bind_dmabuf() and using kvmalloc_array(__GFP_ZERO) in
> io_zcrx_create_area() so that netmem_is_pp() can check if nmdesc->pp is
> !NULL to confirm its usage as page pool.
>
> Signed-off-by: Byungchul Park <byungchul@sk.com>
> Reviewed-by: Mina Almasry <almasrymina@google.com>

Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>