There is performance penalty with using page frag support when
user requests a larger frag size and a page only supports one
frag user, see [1].
It seems like user may request different frag size depending
on the mtu and packet size, provide an option to allocate
non-frag page when a whole page is not able to hold two frags,
so that user has a unified interface for the memory allocation
with least memory utilization and performance penalty.
1. https://lore.kernel.org/netdev/ZEU+vospFdm08IeE@localhost.localdomain/
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
CC: Lorenzo Bianconi <lorenzo@kernel.org>
CC: Alexander Duyck <alexander.duyck@gmail.com>
---
net/core/page_pool.c | 47 +++++++++++++++++++++++++++-----------------
1 file changed, 29 insertions(+), 18 deletions(-)
diff --git a/net/core/page_pool.c b/net/core/page_pool.c
index 0868aa8f6323..e84ec6eabefd 100644
--- a/net/core/page_pool.c
+++ b/net/core/page_pool.c
@@ -699,14 +699,27 @@ struct page *page_pool_alloc_frag(struct page_pool *pool,
unsigned int max_size = PAGE_SIZE << pool->p.order;
struct page *page = pool->frag_page;
- if (WARN_ON(!(pool->p.flags & PP_FLAG_PAGE_FRAG) ||
- size > max_size))
+ if (unlikely(size > max_size))
return NULL;
+ if (PAGE_POOL_DMA_USE_PP_FRAG_COUNT) {
+ *offset = 0;
+ return page_pool_alloc_pages(pool, gfp);
+ }
+
size = ALIGN(size, dma_get_cache_alignment());
- *offset = pool->frag_offset;
- if (page && *offset + size > max_size) {
+ if (page) {
+ *offset = pool->frag_offset;
+
+ if (*offset + size <= max_size) {
+ pool->frag_users++;
+ pool->frag_offset = *offset + size;
+ alloc_stat_inc(pool, fast);
+ return page;
+ }
+
+ pool->frag_page = NULL;
page = page_pool_drain_frag(pool, page);
if (page) {
alloc_stat_inc(pool, fast);
@@ -714,26 +727,24 @@ struct page *page_pool_alloc_frag(struct page_pool *pool,
}
}
- if (!page) {
- page = page_pool_alloc_pages(pool, gfp);
- if (unlikely(!page)) {
- pool->frag_page = NULL;
- return NULL;
- }
-
- pool->frag_page = page;
+ page = page_pool_alloc_pages(pool, gfp);
+ if (unlikely(!page))
+ return NULL;
frag_reset:
- pool->frag_users = 1;
+ /* return page as non-frag page if a page is not able to
+ * hold two frags for the current requested size.
+ */
+ if (unlikely(size << 1 > max_size)) {
*offset = 0;
- pool->frag_offset = size;
- page_pool_fragment_page(page, BIAS_MAX);
return page;
}
- pool->frag_users++;
- pool->frag_offset = *offset + size;
- alloc_stat_inc(pool, fast);
+ pool->frag_page = page;
+ pool->frag_users = 1;
+ *offset = 0;
+ pool->frag_offset = size;
+ page_pool_fragment_page(page, BIAS_MAX);
return page;
}
EXPORT_SYMBOL(page_pool_alloc_frag);
--
2.33.0
On Fri, May 26, 2023 at 2:28 AM Yunsheng Lin <linyunsheng@huawei.com> wrote: > > There is performance penalty with using page frag support when > user requests a larger frag size and a page only supports one > frag user, see [1]. > > It seems like user may request different frag size depending > on the mtu and packet size, provide an option to allocate > non-frag page when a whole page is not able to hold two frags, > so that user has a unified interface for the memory allocation > with least memory utilization and performance penalty. > > 1. https://lore.kernel.org/netdev/ZEU+vospFdm08IeE@localhost.localdomain/ > > Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> > CC: Lorenzo Bianconi <lorenzo@kernel.org> > CC: Alexander Duyck <alexander.duyck@gmail.com> The way I see it there are several problems with this approach. First, why not just increase the page order rather than trying to essentially make page_pool_alloc_frag into an analog for page_pool_alloc_pages? I know for the skb allocator we are working with an order 3 page. You could likely do something similar here to achieve the better performance you are looking for. Second, I am not a fan of these changes as they seem to be wasteful for drivers that might make use of a mix of large and small allocations. If we aren't going to use fragments then we should probably just wrap the call to this function in an inline wrapper that checks the size and just automatically pulls the larger sizes off into the non-frag allocation path. Look at something such as __netdev_alloc_skb as an example.
On 2023/5/26 23:16, Alexander Duyck wrote: > On Fri, May 26, 2023 at 2:28 AM Yunsheng Lin <linyunsheng@huawei.com> wrote: >> >> There is performance penalty with using page frag support when >> user requests a larger frag size and a page only supports one >> frag user, see [1]. >> >> It seems like user may request different frag size depending >> on the mtu and packet size, provide an option to allocate >> non-frag page when a whole page is not able to hold two frags, >> so that user has a unified interface for the memory allocation >> with least memory utilization and performance penalty. >> >> 1. https://lore.kernel.org/netdev/ZEU+vospFdm08IeE@localhost.localdomain/ >> >> Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> >> CC: Lorenzo Bianconi <lorenzo@kernel.org> >> CC: Alexander Duyck <alexander.duyck@gmail.com> > > The way I see it there are several problems with this approach. > > First, why not just increase the page order rather than trying to > essentially make page_pool_alloc_frag into an analog for > page_pool_alloc_pages? I know for the skb allocator we are working > with an order 3 page. You could likely do something similar here to > achieve the better performance you are looking for. I suppose the "an order 3 page" refers to __page_frag_cache_refill() trying to allocate an order 3 page, if it fails, then fail back to order 0 page? As page pool alloc and recycle page in order based on pool->alloc and pool->ring, so we can not do the above failback trick for page pool. We could add different pool->alloc/pool->ring for different order page, for example, pool->alloc_order_0/pool->ring_order_0 for order 0 page, and pool->alloc_order_3/pool->ring_order_3 for order 3 page, but then it will be complicated and possibly more memory wasteful. We would also create page pool with pool->p.order being 3, then the optimization in this patch also apply. > > Second, I am not a fan of these changes as they seem to be wasteful > for drivers that might make use of a mix of large and small I suppose 'wasteful' refer to memory usage, right? I am not sure how drivers that might make use a mix of large and small yet, like mlx5 using page_pool_defrag_page() directly. but if the PAGE_SIZE << pool->p.order is integral multiple of the frag, then some waste seems to be unavoidable, does handling the frag count in the driver save more memory than handling the frag count in page pool? If not, handling the frag count in page pool seems more reasonable? > allocations. If we aren't going to use fragments then we should > probably just wrap the call to this function in an inline wrapper that > checks the size and just automatically pulls the larger sizes off into > the non-frag allocation path. Look at something such as > __netdev_alloc_skb as an example. Do you mean adding an new inline wrapper like below? if (len > xxx) return page_pool_alloc_pages(pool, gfp); else return page_pool_alloc_frag(); It seems the above is essentially the same as this patch does, this patch does it by not introducing another interface and doing some optimization. For example, for the above case with 'len > xxx' in this patch, we still allow frag allocate if the remaining size of the page is bigger than 'len', which I think it is a small optimization for 64K page size or order 3 page. Also, do you have some thing in mind above xxx here? As this patch chooses the ((PAGE_SIZE << pool->p.order) / 2) as xxx. > . >
© 2016 - 2023 Red Hat, Inc.