From: David Howells
To: netdev@vger.kernel.org
Cc: David Howells, "David S. Miller", Eric Dumazet, Jakub Kicinski,
 Paolo Abeni, Willem de Bruijn, David Ahern, Matthew Wilcox, Jens Axboe,
 linux-mm@kvack.org, linux-kernel@vger.kernel.org, Jeroen de Borst,
 Catherine Sullivan, Shailend Chand, Felix Fietkau, John Crispin,
 Sean Wang, Mark Lee, Lorenzo Bianconi, Matthias Brugger,
 AngeloGioacchino Del Regno, Keith Busch, Christoph Hellwig,
 Sagi Grimberg, Chaitanya Kulkarni, Andrew Morton,
 linux-arm-kernel@lists.infradead.org,
 linux-mediatek@lists.infradead.org, linux-nvme@lists.infradead.org
Subject: [PATCH net-next 04/12] mm: Make the page_frag_cache allocator use
 multipage folios
Date: Wed, 24 May 2023 16:33:03 +0100
Message-Id: <20230524153311.3625329-5-dhowells@redhat.com>
In-Reply-To: <20230524153311.3625329-1-dhowells@redhat.com>
References: <20230524153311.3625329-1-dhowells@redhat.com>

Change the page_frag_cache allocator to use multipage folios rather than
groups of pages.  This reduces page_frag_free() to just a folio_put() or
put_page().
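For context, a minimal caller-side sketch of the API being converted.
This is not part of the patch: the example_* names and the static cache
variable are invented for illustration; only the page_frag_* calls are
the real interface:

	static struct page_frag_cache example_cache;	/* hypothetical */

	static void *example_alloc(unsigned int len)
	{
		/* Carve a len-byte, cacheline-aligned fragment out of
		 * the folio cached in example_cache.
		 */
		return page_frag_alloc_align(&example_cache, len,
					     GFP_ATOMIC, SMP_CACHE_BYTES);
	}

	static void example_free(void *p)
	{
		/* With this patch, this boils down to a folio_put() on
		 * the folio backing the fragment.
		 */
		page_frag_free(p);
	}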
Miller" cc: Eric Dumazet cc: Jakub Kicinski cc: Paolo Abeni cc: Jens Axboe cc: Jeroen de Borst cc: Catherine Sullivan cc: Shailend Chand cc: Felix Fietkau cc: John Crispin cc: Sean Wang cc: Mark Lee cc: Lorenzo Bianconi cc: Matthias Brugger cc: AngeloGioacchino Del Regno cc: Keith Busch cc: Jens Axboe cc: Christoph Hellwig cc: Sagi Grimberg cc: Chaitanya Kulkarni cc: Andrew Morton cc: Matthew Wilcox cc: netdev@vger.kernel.org cc: linux-arm-kernel@lists.infradead.org cc: linux-mediatek@lists.infradead.org cc: linux-nvme@lists.infradead.org cc: linux-mm@kvack.org --- include/linux/mm_types.h | 13 ++---- mm/page_frag_alloc.c | 99 +++++++++++++++++++--------------------- 2 files changed, 52 insertions(+), 60 deletions(-) diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h index 306a3d1a0fa6..d7c52a5979cc 100644 --- a/include/linux/mm_types.h +++ b/include/linux/mm_types.h @@ -420,18 +420,13 @@ static inline void *folio_get_private(struct folio *f= olio) } =20 struct page_frag_cache { - void * va; -#if (PAGE_SIZE < PAGE_FRAG_CACHE_MAX_SIZE) - __u16 offset; - __u16 size; -#else - __u32 offset; -#endif + struct folio *folio; + unsigned int offset; /* we maintain a pagecount bias, so that we dont dirty cache line * containing page->_refcount every time we allocate a fragment. */ - unsigned int pagecnt_bias; - bool pfmemalloc; + unsigned int pagecnt_bias; + bool pfmemalloc; }; =20 typedef unsigned long vm_flags_t; diff --git a/mm/page_frag_alloc.c b/mm/page_frag_alloc.c index 9d3f6fbd9a07..ffd68bfb677d 100644 --- a/mm/page_frag_alloc.c +++ b/mm/page_frag_alloc.c @@ -16,33 +16,34 @@ #include #include =20 -static struct page *__page_frag_cache_refill(struct page_frag_cache *nc, - gfp_t gfp_mask) +/* + * Allocate a new folio for the frag cache. + */ +static struct folio *page_frag_cache_refill(struct page_frag_cache *nc, + gfp_t gfp_mask) { - struct page *page =3D NULL; + struct folio *folio =3D NULL; gfp_t gfp =3D gfp_mask; =20 #if (PAGE_SIZE < PAGE_FRAG_CACHE_MAX_SIZE) - gfp_mask |=3D __GFP_COMP | __GFP_NOWARN | __GFP_NORETRY | - __GFP_NOMEMALLOC; - page =3D alloc_pages_node(NUMA_NO_NODE, gfp_mask, - PAGE_FRAG_CACHE_MAX_ORDER); - nc->size =3D page ? PAGE_FRAG_CACHE_MAX_SIZE : PAGE_SIZE; + gfp_mask |=3D __GFP_NOWARN | __GFP_NORETRY | __GFP_NOMEMALLOC; + folio =3D folio_alloc(gfp_mask, PAGE_FRAG_CACHE_MAX_ORDER); #endif - if (unlikely(!page)) - page =3D alloc_pages_node(NUMA_NO_NODE, gfp, 0); + if (unlikely(!folio)) + folio =3D folio_alloc(gfp, 0); =20 - nc->va =3D page ? 
-	nc->va = page ? page_address(page) : NULL;
-
-	return page;
+	if (folio)
+		nc->folio = folio;
+	return folio;
 }
 
 void __page_frag_cache_drain(struct page *page, unsigned int count)
 {
-	VM_BUG_ON_PAGE(page_ref_count(page) == 0, page);
+	struct folio *folio = page_folio(page);
+
+	VM_BUG_ON_FOLIO(folio_ref_count(folio) == 0, folio);
 
-	if (page_ref_sub_and_test(page, count - 1))
-		__free_pages(page, compound_order(page));
+	folio_put_refs(folio, count);
 }
 EXPORT_SYMBOL(__page_frag_cache_drain);
 
@@ -54,11 +55,12 @@ EXPORT_SYMBOL(__page_frag_cache_drain);
  */
 void page_frag_cache_clear(struct page_frag_cache *nc)
 {
-	if (nc->va) {
-		struct page *page = virt_to_head_page(nc->va);
+	struct folio *folio = nc->folio;
 
-		__page_frag_cache_drain(page, nc->pagecnt_bias);
-		nc->va = NULL;
+	if (folio) {
+		VM_BUG_ON_FOLIO(folio_ref_count(folio) == 0, folio);
+		folio_put_refs(folio, nc->pagecnt_bias);
+		nc->folio = NULL;
 	}
 }
 EXPORT_SYMBOL(page_frag_cache_clear);
@@ -67,56 +69,51 @@ void *page_frag_alloc_align(struct page_frag_cache *nc,
 			    unsigned int fragsz, gfp_t gfp_mask,
 			    unsigned int align)
 {
-	unsigned int size = PAGE_SIZE;
-	struct page *page;
-	int offset;
+	struct folio *folio = nc->folio;
+	size_t offset;
 
 	WARN_ON_ONCE(!is_power_of_2(align));
 
-	if (unlikely(!nc->va)) {
+	if (unlikely(!folio)) {
 refill:
-		page = __page_frag_cache_refill(nc, gfp_mask);
-		if (!page)
+		folio = page_frag_cache_refill(nc, gfp_mask);
+		if (!folio)
 			return NULL;
 
-#if (PAGE_SIZE < PAGE_FRAG_CACHE_MAX_SIZE)
-		/* if size can vary use size else just use PAGE_SIZE */
-		size = nc->size;
-#endif
 		/* Even if we own the page, we do not use atomic_set().
 		 * This would break get_page_unless_zero() users.
 		 */
-		page_ref_add(page, PAGE_FRAG_CACHE_MAX_SIZE);
+		folio_ref_add(folio, PAGE_FRAG_CACHE_MAX_SIZE);
 
 		/* reset page count bias and offset to start of new frag */
-		nc->pfmemalloc = page_is_pfmemalloc(page);
+		nc->pfmemalloc = folio_is_pfmemalloc(folio);
 		nc->pagecnt_bias = PAGE_FRAG_CACHE_MAX_SIZE + 1;
-		nc->offset = size;
+		nc->offset = folio_size(folio);
 	}
 
-	offset = nc->offset - fragsz;
-	if (unlikely(offset < 0)) {
-		page = virt_to_page(nc->va);
-
-		if (page_ref_count(page) != nc->pagecnt_bias)
+	offset = nc->offset;
+	if (unlikely(fragsz > offset)) {
+		/* Reuse the folio if everyone we gave it to has finished with
+		 * it.
+		 */
+		if (!folio_ref_sub_and_test(folio, nc->pagecnt_bias)) {
+			nc->folio = NULL;
 			goto refill;
+		}
+
 		if (unlikely(nc->pfmemalloc)) {
-			page_ref_sub(page, nc->pagecnt_bias - 1);
-			__free_pages(page, compound_order(page));
+			__folio_put(folio);
+			nc->folio = NULL;
 			goto refill;
 		}
 
-#if (PAGE_SIZE < PAGE_FRAG_CACHE_MAX_SIZE)
-		/* if size can vary use size else just use PAGE_SIZE */
-		size = nc->size;
-#endif
 		/* OK, page count is 0, we can safely set it */
-		set_page_count(page, PAGE_FRAG_CACHE_MAX_SIZE + 1);
+		folio_set_count(folio, PAGE_FRAG_CACHE_MAX_SIZE + 1);
 
 		/* reset page count bias and offset to start of new frag */
 		nc->pagecnt_bias = PAGE_FRAG_CACHE_MAX_SIZE + 1;
-		offset = size - fragsz;
-		if (unlikely(offset < 0)) {
+		offset = folio_size(folio);
+		if (unlikely(fragsz > offset)) {
 			/*
 			 * The caller is trying to allocate a fragment
 			 * with fragsz > PAGE_SIZE but the cache isn't big
@@ -126,15 +123,17 @@ void *page_frag_alloc_align(struct page_frag_cache *nc,
 			 * it could make memory pressure worse
 			 * so we simply return NULL here.
 			 */
+			nc->offset = offset;
 			return NULL;
 		}
 	}
 
 	nc->pagecnt_bias--;
+	offset -= fragsz;
 	offset &= ~(align - 1);
 	nc->offset = offset;
 
-	return nc->va + offset;
+	return folio_address(folio) + offset;
 }
 EXPORT_SYMBOL(page_frag_alloc_align);
 
@@ -143,8 +142,6 @@ EXPORT_SYMBOL(page_frag_alloc_align);
  */
 void page_frag_free(void *addr)
 {
-	struct page *page = virt_to_head_page(addr);
-
-	__free_pages(page, compound_order(page));
+	folio_put(virt_to_folio(addr));
 }
 EXPORT_SYMBOL(page_frag_free);
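For completeness, a matching hypothetical teardown sketch, reusing the
invented example_cache from the sketch above: page_frag_cache_clear()
returns the pagecnt_bias references the cache holds, and the folio is
only freed once the last outstanding fragment is also released:

	static void example_teardown(void)
	{
		/* Drop the cache's bias references; fragments still in
		 * flight keep the folio alive until page_frag_free().
		 */
		page_frag_cache_clear(&example_cache);
	}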