From nobody Wed Feb 11 18:28:02 2026
Return-Path: <linux-kernel-owner@vger.kernel.org>
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
	by smtp.lore.kernel.org (Postfix) with ESMTP id 896D6CA0ECB
	for <linux-kernel@archiver.kernel.org>; Mon, 11 Sep 2023 22:09:10 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1357024AbjIKWEz (ORCPT
        <rfc822;linux-kernel@archiver.kernel.org>);
        Mon, 11 Sep 2023 18:04:55 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59632 "EHLO
        lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S244257AbjIKTuk (ORCPT
        <rfc822;linux-kernel@vger.kernel.org>);
        Mon, 11 Sep 2023 15:50:40 -0400
Received: from mail-ua1-x92d.google.com (mail-ua1-x92d.google.com
 [IPv6:2607:f8b0:4864:20::92d])
        by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A8E531A5
        for <linux-kernel@vger.kernel.org>;
 Mon, 11 Sep 2023 12:50:34 -0700 (PDT)
Received: by mail-ua1-x92d.google.com with SMTP id
 a1e0cc1a2514c-7a52db1e4bbso1808744241.3
        for <linux-kernel@vger.kernel.org>;
 Mon, 11 Sep 2023 12:50:34 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=cmpxchg-org.20230601.gappssmtp.com; s=20230601; t=1694461834;
 x=1695066634; darn=vger.kernel.org;
        h=content-transfer-encoding:mime-version:references:in-reply-to
         :message-id:date:subject:cc:to:from:from:to:cc:subject:date
         :message-id:reply-to;
        bh=7FpxEl1I3pH8RvyVkhbOxLMeix4VtK7whCGHR+Kbh8c=;
        b=dPH+uWsHnkOkQ9ulxAtyOP8krdvm6ra5XJC+bjB6N5mGkxGJOVBfaaJQDHW73Ie9m/
         +LrSHM0wnet2xq+LsWXoUU7SfowLnwHmTG18KPs9Wb5IJkm5Gyrpuk+JLbzXBsPKIYmp
         OtzpP7nGkYDDc0zvmr6iDzbswzN5GXaqafjDc7U2xQxWk/kBNDADkylcsF63s19E3TxZ
         WwUxGFtNtmn9GGzaivgwjeKPqxNc8VJuXwahYlV6ciOtZBygtx97wjwSBGy72bQu7wsv
         Al7OdDKsTq2eO90b6AAJrcVckQ0MybjdOojd9EG+M5/Tl1n04A57lVxzgSDqVzTysCGW
         5S5w==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=1e100.net; s=20230601; t=1694461834; x=1695066634;
        h=content-transfer-encoding:mime-version:references:in-reply-to
         :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc
         :subject:date:message-id:reply-to;
        bh=7FpxEl1I3pH8RvyVkhbOxLMeix4VtK7whCGHR+Kbh8c=;
        b=ffXI0ra8XBvkQgEQ0WP7O5RFPCYy3frLjliuMSZv2klzn+4aQMYOUOCJ+l1PdJ6Of1
         D2Xp9l1M+L1wo9yCFgpOO19ZCctQXTLqUvKXHTcX64ng2egHCpJO0yBqDezFQL3dSdfF
         nf3LkI1cWZ2EpH9BMVL6Qb4afmY1S057WMvxf0fJPLmOAu61YOnRyUa/w2xRJ+4nxN7P
         mMLOtbqoejTKXhX2ZD5U2T8Y+8I+4vDTKQidIqiKRElwlAt2WThGeWkVuXtMONsmIiMQ
         K5GKM598oS860v2Noxnii/iqXa4dQNLPYTWbGrAFEWhRkz3+34GgFFsS25oNmZrGhOb1
         HHQg==
X-Gm-Message-State: AOJu0YyIUcYZhXAES7ot+pY855nbhmbvS+azxMmZu9mlL+IFZrSYJwzB
        cu9heX7PzHJRWbP+mGOSjHshAQ==
X-Google-Smtp-Source: 
 AGHT+IH5gBtYy1/4AAerHO8D5EjmzZL+ejo/P0LZt667+IenuP9gxk1guF7nx21/dGBJi/c5z4jB+Q==
X-Received: by 2002:a1f:e7c4:0:b0:495:db79:ea76 with SMTP id
 e187-20020a1fe7c4000000b00495db79ea76mr4004344vkh.1.1694461833711;
        Mon, 11 Sep 2023 12:50:33 -0700 (PDT)
Received: from localhost
 (2603-7000-0c01-2716-3012-16a2-6bc2-2937.res6.spectrum.com.
 [2603:7000:c01:2716:3012:16a2:6bc2:2937])
        by smtp.gmail.com with ESMTPSA id
 o10-20020a0ccb0a000000b0063d038df3f3sm3149714qvk.52.2023.09.11.12.50.33
        (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
        Mon, 11 Sep 2023 12:50:33 -0700 (PDT)
From: Johannes Weiner <hannes@cmpxchg.org>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Vlastimil Babka <vbabka@suse.cz>,
        Mel Gorman <mgorman@techsingularity.net>,
        Miaohe Lin <linmiaohe@huawei.com>,
        Kefeng Wang <wangkefeng.wang@huawei.com>,
        Zi Yan <ziy@nvidia.com>, linux-mm@kvack.org,
        linux-kernel@vger.kernel.org
Subject: [PATCH 1/6] mm: page_alloc: remove pcppage migratetype caching
Date: Mon, 11 Sep 2023 15:41:42 -0400
Message-ID: <20230911195023.247694-2-hannes@cmpxchg.org>
X-Mailer: git-send-email 2.42.0
In-Reply-To: <20230911195023.247694-1-hannes@cmpxchg.org>
References: <20230911195023.247694-1-hannes@cmpxchg.org>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Precedence: bulk
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Type: text/plain; charset="utf-8"

The idea behind the cache is to save get_pageblock_migratetype()
lookups during bulk freeing. A microbenchmark suggests this isn't
helping, though. The pcp migratetype can get stale, which means that
bulk freeing has an extra branch to check if the pageblock was
isolated while on the pcp.

While the variance overlaps, the cache write and the branch seem to
make this a net negative. The following test allocates and frees
batches of 10,000 pages (~3x the pcp high marks to trigger flushing):

Before:
          8,668.48 msec task-clock                       #   99.735 CPUs ut=
ilized               ( +-  2.90% )
                19      context-switches                 #    4.341 /sec   =
                     ( +-  3.24% )
                 0      cpu-migrations                   #    0.000 /sec
            17,440      page-faults                      #    3.984 K/sec  =
                     ( +-  2.90% )
    41,758,692,473      cycles                           #    9.541 GHz    =
                     ( +-  2.90% )
   126,201,294,231      instructions                     #    5.98  insn pe=
r cycle              ( +-  2.90% )
    25,348,098,335      branches                         #    5.791 G/sec  =
                     ( +-  2.90% )
        33,436,921      branch-misses                    #    0.26% of all =
branches             ( +-  2.90% )

         0.0869148 +- 0.0000302 seconds time elapsed  ( +-  0.03% )

After:
          8,444.81 msec task-clock                       #   99.726 CPUs ut=
ilized               ( +-  2.90% )
                22      context-switches                 #    5.160 /sec   =
                     ( +-  3.23% )
                 0      cpu-migrations                   #    0.000 /sec
            17,443      page-faults                      #    4.091 K/sec  =
                     ( +-  2.90% )
    40,616,738,355      cycles                           #    9.527 GHz    =
                     ( +-  2.90% )
   126,383,351,792      instructions                     #    6.16  insn pe=
r cycle              ( +-  2.90% )
    25,224,985,153      branches                         #    5.917 G/sec  =
                     ( +-  2.90% )
        32,236,793      branch-misses                    #    0.25% of all =
branches             ( +-  2.90% )

         0.0846799 +- 0.0000412 seconds time elapsed  ( +-  0.05% )

A side effect is that this also ensures that pages whose pageblock
gets stolen while on the pcplist end up on the right freelist and we
don't perform potentially type-incompatible buddy merges (or skip
merges when we shouldn't), which is likely beneficial to long-term
fragmentation management, although the effects would be harder to
measure. Settle for simpler and faster code as justification here.
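
The staleness problem can be illustrated with a small user-space
sketch. This is not kernel code: struct toy_page, enum toy_mt and both
helpers are hypothetical stand-ins, and the pageblock table is reduced
to a plain array. It only shows why a type recorded at queueing time
can disagree with the block's type at free time:

```c
#include <assert.h>

/* Hypothetical, simplified migratetypes; the kernel has more. */
enum toy_mt { TOY_UNMOVABLE, TOY_MOVABLE };

/* Hypothetical stand-in for a page sitting on a pcplist. */
struct toy_page {
	unsigned int block;	/* index of the owning pageblock */
	enum toy_mt cached_mt;	/* type recorded when the page was queued */
};

/* Old scheme: trust the value cached when the page was queued. */
static enum toy_mt mt_from_cache(const struct toy_page *page)
{
	assert(page);
	return page->cached_mt;
}

/* New scheme: look the type up from the block at free time. */
static enum toy_mt mt_from_block(const struct toy_page *page,
				 const enum toy_mt *block_mt)
{
	assert(page && block_mt);
	return block_mt[page->block];
}
```

If the block changes type while the page is queued, mt_from_cache()
keeps returning the old type, whereas mt_from_block() follows the
block.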

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Zi Yan <ziy@nvidia.com>
Reported-by: Mike Kravetz <mike.kravetz@oracle.com>
Reported-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Zi Yan <ziy@nvidia.com>
Suggested-by: Johannes Weiner <hannes@cmpxchg.org>
Suggested-by: Vlastimil Babka <vbabka@suse.cz>
---
 mm/page_alloc.c | 61 ++++++++++++-------------------------------------
 1 file changed, 14 insertions(+), 47 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 95546f376302..e3f1c777feed 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -204,24 +204,6 @@ EXPORT_SYMBOL(node_states);
=20
 gfp_t gfp_allowed_mask __read_mostly =3D GFP_BOOT_MASK;
=20
-/*
- * A cached value of the page's pageblock's migratetype, used when the pag=
e is
- * put on a pcplist. Used to avoid the pageblock migratetype lookup when
- * freeing from pcplists in most cases, at the cost of possibly becoming s=
tale.
- * Also the migratetype set in the page does not necessarily match the pcp=
list
- * index, e.g. page might have MIGRATE_CMA set but be on a pcplist with any
- * other index - this ensures that it will be put on the correct CMA freel=
ist.
- */
-static inline int get_pcppage_migratetype(struct page *page)
-{
-	return page->index;
-}
-
-static inline void set_pcppage_migratetype(struct page *page, int migratet=
ype)
-{
-	page->index =3D migratetype;
-}
-
 #ifdef CONFIG_HUGETLB_PAGE_SIZE_VARIABLE
 unsigned int pageblock_order __read_mostly;
 #endif
@@ -1186,7 +1168,6 @@ static void free_pcppages_bulk(struct zone *zone, int=
 count,
 {
 	unsigned long flags;
 	unsigned int order;
-	bool isolated_pageblocks;
 	struct page *page;
=20
 	/*
@@ -1199,7 +1180,6 @@ static void free_pcppages_bulk(struct zone *zone, int=
 count,
 	pindex =3D pindex - 1;
=20
 	spin_lock_irqsave(&zone->lock, flags);
-	isolated_pageblocks =3D has_isolate_pageblock(zone);
=20
 	while (count > 0) {
 		struct list_head *list;
@@ -1215,10 +1195,12 @@ static void free_pcppages_bulk(struct zone *zone, i=
nt count,
 		order =3D pindex_to_order(pindex);
 		nr_pages =3D 1 << order;
 		do {
+			unsigned long pfn;
 			int mt;
=20
 			page =3D list_last_entry(list, struct page, pcp_list);
-			mt =3D get_pcppage_migratetype(page);
+			pfn =3D page_to_pfn(page);
+			mt =3D get_pfnblock_migratetype(page, pfn);
=20
 			/* must delete to avoid corrupting pcp list */
 			list_del(&page->pcp_list);
@@ -1227,11 +1209,8 @@ static void free_pcppages_bulk(struct zone *zone, in=
t count,
=20
 			/* MIGRATE_ISOLATE page should not go to pcplists */
 			VM_BUG_ON_PAGE(is_migrate_isolate(mt), page);
-			/* Pageblock could have been isolated meanwhile */
-			if (unlikely(isolated_pageblocks))
-				mt =3D get_pageblock_migratetype(page);
=20
-			__free_one_page(page, page_to_pfn(page), zone, order, mt, FPI_NONE);
+			__free_one_page(page, pfn, zone, order, mt, FPI_NONE);
 			trace_mm_page_pcpu_drain(page, order, mt);
 		} while (count > 0 && !list_empty(list));
 	}
@@ -1577,7 +1556,6 @@ struct page *__rmqueue_smallest(struct zone *zone, un=
signed int order,
 			continue;
 		del_page_from_free_list(page, zone, current_order);
 		expand(zone, page, order, current_order, migratetype);
-		set_pcppage_migratetype(page, migratetype);
 		trace_mm_page_alloc_zone_locked(page, order, migratetype,
 				pcp_allowed_order(order) &&
 				migratetype < MIGRATE_PCPTYPES);
@@ -2145,7 +2123,7 @@ static int rmqueue_bulk(struct zone *zone, unsigned i=
nt order,
 		 * pages are ordered properly.
 		 */
 		list_add_tail(&page->pcp_list, list);
-		if (is_migrate_cma(get_pcppage_migratetype(page)))
+		if (is_migrate_cma(get_pageblock_migratetype(page)))
 			__mod_zone_page_state(zone, NR_FREE_CMA_PAGES,
 					      -(1 << order));
 	}
@@ -2304,19 +2282,6 @@ void drain_all_pages(struct zone *zone)
 	__drain_all_pages(zone, false);
 }
=20
-static bool free_unref_page_prepare(struct page *page, unsigned long pfn,
-							unsigned int order)
-{
-	int migratetype;
-
-	if (!free_pages_prepare(page, order, FPI_NONE))
-		return false;
-
-	migratetype =3D get_pfnblock_migratetype(page, pfn);
-	set_pcppage_migratetype(page, migratetype);
-	return true;
-}
-
 static int nr_pcp_free(struct per_cpu_pages *pcp, int high, bool free_high)
 {
 	int min_nr_free, max_nr_free;
@@ -2402,7 +2367,7 @@ void free_unref_page(struct page *page, unsigned int =
order)
 	unsigned long pfn =3D page_to_pfn(page);
 	int migratetype, pcpmigratetype;
=20
-	if (!free_unref_page_prepare(page, pfn, order))
+	if (!free_pages_prepare(page, order, FPI_NONE))
 		return;
=20
 	/*
@@ -2412,7 +2377,7 @@ void free_unref_page(struct page *page, unsigned int =
order)
 	 * get those areas back if necessary. Otherwise, we may have to free
 	 * excessively into the page allocator
 	 */
-	migratetype =3D pcpmigratetype =3D get_pcppage_migratetype(page);
+	migratetype =3D pcpmigratetype =3D get_pfnblock_migratetype(page, pfn);
 	if (unlikely(migratetype >=3D MIGRATE_PCPTYPES)) {
 		if (unlikely(is_migrate_isolate(migratetype))) {
 			free_one_page(page_zone(page), page, pfn, order, migratetype, FPI_NONE);
@@ -2448,7 +2413,8 @@ void free_unref_page_list(struct list_head *list)
 	/* Prepare pages for freeing */
 	list_for_each_entry_safe(page, next, list, lru) {
 		unsigned long pfn =3D page_to_pfn(page);
-		if (!free_unref_page_prepare(page, pfn, 0)) {
+
+		if (!free_pages_prepare(page, 0, FPI_NONE)) {
 			list_del(&page->lru);
 			continue;
 		}
@@ -2457,7 +2423,7 @@ void free_unref_page_list(struct list_head *list)
 		 * Free isolated pages directly to the allocator, see
 		 * comment in free_unref_page.
 		 */
-		migratetype =3D get_pcppage_migratetype(page);
+		migratetype =3D get_pfnblock_migratetype(page, pfn);
 		if (unlikely(is_migrate_isolate(migratetype))) {
 			list_del(&page->lru);
 			free_one_page(page_zone(page), page, pfn, 0, migratetype, FPI_NONE);
@@ -2466,10 +2432,11 @@ void free_unref_page_list(struct list_head *list)
 	}
=20
 	list_for_each_entry_safe(page, next, list, lru) {
+		unsigned long pfn =3D page_to_pfn(page);
 		struct zone *zone =3D page_zone(page);
=20
 		list_del(&page->lru);
-		migratetype =3D get_pcppage_migratetype(page);
+		migratetype =3D get_pfnblock_migratetype(page, pfn);
=20
 		/*
 		 * Either different zone requiring a different pcp lock or
@@ -2492,7 +2459,7 @@ void free_unref_page_list(struct list_head *list)
 			pcp =3D pcp_spin_trylock(zone->per_cpu_pageset);
 			if (unlikely(!pcp)) {
 				pcp_trylock_finish(UP_flags);
-				free_one_page(zone, page, page_to_pfn(page),
+				free_one_page(zone, page, pfn,
 					      0, migratetype, FPI_NONE);
 				locked_zone =3D NULL;
 				continue;
@@ -2661,7 +2628,7 @@ struct page *rmqueue_buddy(struct zone *preferred_zon=
e, struct zone *zone,
 			}
 		}
 		__mod_zone_freepage_state(zone, -(1 << order),
-					  get_pcppage_migratetype(page));
+					  get_pageblock_migratetype(page));
 		spin_unlock_irqrestore(&zone->lock, flags);
 	} while (check_new_pages(page, order));
=20
--=20
2.42.0
From nobody Wed Feb 11 18:28:02 2026
Return-Path: <linux-kernel-owner@vger.kernel.org>
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
	by smtp.lore.kernel.org (Postfix) with ESMTP id 80658CA0EC8
	for <linux-kernel@archiver.kernel.org>; Mon, 11 Sep 2023 21:39:50 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1350124AbjIKVfw (ORCPT
        <rfc822;linux-kernel@archiver.kernel.org>);
        Mon, 11 Sep 2023 17:35:52 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59638 "EHLO
        lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S244258AbjIKTuk (ORCPT
        <rfc822;linux-kernel@vger.kernel.org>);
        Mon, 11 Sep 2023 15:50:40 -0400
Received: from mail-oo1-xc2d.google.com (mail-oo1-xc2d.google.com
 [IPv6:2607:f8b0:4864:20::c2d])
        by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 994041A2
        for <linux-kernel@vger.kernel.org>;
 Mon, 11 Sep 2023 12:50:35 -0700 (PDT)
Received: by mail-oo1-xc2d.google.com with SMTP id
 006d021491bc7-5731fe1d2bfso2697373eaf.3
        for <linux-kernel@vger.kernel.org>;
 Mon, 11 Sep 2023 12:50:35 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=cmpxchg-org.20230601.gappssmtp.com; s=20230601; t=1694461835;
 x=1695066635; darn=vger.kernel.org;
        h=content-transfer-encoding:mime-version:references:in-reply-to
         :message-id:date:subject:cc:to:from:from:to:cc:subject:date
         :message-id:reply-to;
        bh=sIntV6SfyJxxBgJqngsAK3yqYWZ9fE7nMi1YvaBUo2Y=;
        b=D37mE1/h64JbCmNboJSxslR+NrmIUYhIjTtW9HWzcx7q0MuNlncxtYfGFZ7iq49Kpm
         wiU/jC6KyUwBVLHtKFS3oIxFGvI3WHWkb4NLq7YQCAe9I3Jl6sNaV5xQ9mXBEWXYeqSq
         Ucdk8je8+Ozno3vsAPhuo5YKN5P1hcyiQjK4LBWvjVmv038AIupxUljlvhweRWBZvZdF
         rvPdFPBU1qUcPNM4agB728jgXj+CjQWL2OErPkeMzBUvUYs2cqq7eJpl1s7cfrRMcQQi
         nkNmqq8ybrEqVY9xQi7U4SVHvV4Zu9TXAvv4Y72Mg6BpaRdEdaD2lnzwIHEUIewiS/AL
         msIQ==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=1e100.net; s=20230601; t=1694461835; x=1695066635;
        h=content-transfer-encoding:mime-version:references:in-reply-to
         :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc
         :subject:date:message-id:reply-to;
        bh=sIntV6SfyJxxBgJqngsAK3yqYWZ9fE7nMi1YvaBUo2Y=;
        b=wjZVfjAceR7cTp3RT/TNUDB9+XdHw6N5icWFWdYosN37Ko0ChTG7qpZ9hn2PHQ2msC
         k0j4flZyVL2J6PHov6GVFNSRJ1CiE90fW7ansrLs8uz1jsAiP1grz8s9vXHcX/h7QtES
         Ael4IhKRjwmgxWs+7DDyOFHPoh/l/d6sXVZNLLn6/QksirBbWkZoHOSjDYmT597Yu3hf
         xot2OAIN1uo8vbydV/3LWBGMMijEFS3XOG6iDIx7AOWmdHYV/FhvJZDU7NQ3ksFo34uk
         r3oMru5O1HX5LZ5Do+0S+3s4SL7Byh0m6RFi44sWbp89f1AojQh3Iu4LJA3I5VqU0bxW
         o6Jw==
X-Gm-Message-State: AOJu0Yx1p0+GGpX4rRO/C5W3HDe1+qWFEpyr7K+23bWFtHJ/Khk4tlfl
        yXPgVM41h2WpjpahN1U5zbqiilLJ5oJT18dpP/0=
X-Google-Smtp-Source: 
 AGHT+IGS06A0iPrm3hhJt1coz/NmHegf9pi5FM+C3kEjocCFV6oSc5waC969xFTQcEM5YgwdIe4bDQ==
X-Received: by 2002:a05:6358:6f92:b0:130:faea:a81f with SMTP id
 s18-20020a0563586f9200b00130faeaa81fmr5835243rwn.28.1694461834812;
        Mon, 11 Sep 2023 12:50:34 -0700 (PDT)
Received: from localhost
 (2603-7000-0c01-2716-3012-16a2-6bc2-2937.res6.spectrum.com.
 [2603:7000:c01:2716:3012:16a2:6bc2:2937])
        by smtp.gmail.com with ESMTPSA id
 j20-20020a0cf514000000b0064b502fdeecsm3129162qvm.68.2023.09.11.12.50.34
        (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
        Mon, 11 Sep 2023 12:50:34 -0700 (PDT)
From: Johannes Weiner <hannes@cmpxchg.org>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Vlastimil Babka <vbabka@suse.cz>,
        Mel Gorman <mgorman@techsingularity.net>,
        Miaohe Lin <linmiaohe@huawei.com>,
        Kefeng Wang <wangkefeng.wang@huawei.com>,
        Zi Yan <ziy@nvidia.com>, linux-mm@kvack.org,
        linux-kernel@vger.kernel.org
Subject: [PATCH 2/6] mm: page_alloc: fix up block types when merging
 compatible blocks
Date: Mon, 11 Sep 2023 15:41:43 -0400
Message-ID: <20230911195023.247694-3-hannes@cmpxchg.org>
X-Mailer: git-send-email 2.42.0
In-Reply-To: <20230911195023.247694-1-hannes@cmpxchg.org>
References: <20230911195023.247694-1-hannes@cmpxchg.org>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Precedence: bulk
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Type: text/plain; charset="utf-8"

The buddy allocator coalesces compatible blocks during freeing, but it
doesn't update the types of the subblocks to match. When an allocation
later breaks the chunk down again, its pieces will be put on freelists
of the wrong type. This encourages incompatible page mixing (ask for
one type, get another), and thus long-term fragmentation.

Update the subblocks when merging a larger chunk, such that a later
expand() will maintain freelist type hygiene.
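
As a rough user-space model of the merge rule (not the kernel
implementation), assume a flat array holding one type per block and a
buddy found by flipping the lowest index bit. enum sketch_mt,
sketch_mt_is_mergeable() and sketch_merge() are hypothetical names
made up for this illustration:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified set of types; only the first three merge. */
enum sketch_mt {
	SK_UNMOVABLE,
	SK_MOVABLE,
	SK_RECLAIMABLE,
	SK_ISOLATE,
};

static bool sketch_mt_is_mergeable(enum sketch_mt mt)
{
	return mt < SK_ISOLATE;
}

/*
 * Try to merge block @idx with its buddy (@idx ^ 1).
 * Returns false if the types differ and either side is not mergeable.
 * On success the buddy takes over the type of the block being freed,
 * so that a later split puts both halves on the same freelist.
 */
static bool sketch_merge(enum sketch_mt *block_mt, unsigned int idx)
{
	unsigned int buddy = idx ^ 1;

	assert(block_mt);
	if (block_mt[idx] != block_mt[buddy]) {
		if (!sketch_mt_is_mergeable(block_mt[idx]) ||
		    !sketch_mt_is_mergeable(block_mt[buddy]))
			return false;
		/* Match buddy type. */
		block_mt[buddy] = block_mt[idx];
	}
	return true;
}
```

Without the assignment, the merged chunk would carry two different
types, and splitting it again would return one half to a freelist
other than the one the request came from.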

v2:
- remove spurious change_pageblock_range() move (Zi Yan)

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Reported-by: Mike Kravetz <mike.kravetz@oracle.com>
Reported-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Zi Yan <ziy@nvidia.com>
Suggested-by: Johannes Weiner <hannes@cmpxchg.org>
---
 mm/page_alloc.c | 15 +++++++++++----
 1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index e3f1c777feed..3db405414174 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -783,10 +783,17 @@ static inline void __free_one_page(struct page *page,
 			 */
 			int buddy_mt =3D get_pfnblock_migratetype(buddy, buddy_pfn);
=20
-			if (migratetype !=3D buddy_mt
-					&& (!migratetype_is_mergeable(migratetype) ||
-						!migratetype_is_mergeable(buddy_mt)))
-				goto done_merging;
+			if (migratetype !=3D buddy_mt) {
+				if (!migratetype_is_mergeable(migratetype) ||
+				    !migratetype_is_mergeable(buddy_mt))
+					goto done_merging;
+				/*
+				 * Match buddy type. This ensures that
+				 * an expand() down the line puts the
+				 * sub-blocks on the right freelists.
+				 */
+				set_pageblock_migratetype(buddy, migratetype);
+			}
 		}
=20
 		/*
--=20
2.42.0
From nobody Wed Feb 11 18:28:02 2026
Return-Path: <linux-kernel-owner@vger.kernel.org>
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
	by smtp.lore.kernel.org (Postfix) with ESMTP id 78D2DCA0ECC
	for <linux-kernel@archiver.kernel.org>; Mon, 11 Sep 2023 21:19:28 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1343758AbjIKVM0 (ORCPT
        <rfc822;linux-kernel@archiver.kernel.org>);
        Mon, 11 Sep 2023 17:12:26 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59652 "EHLO
        lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S244260AbjIKTul (ORCPT
        <rfc822;linux-kernel@vger.kernel.org>);
        Mon, 11 Sep 2023 15:50:41 -0400
Received: from mail-qk1-x72d.google.com (mail-qk1-x72d.google.com
 [IPv6:2607:f8b0:4864:20::72d])
        by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B10CE1B6
        for <linux-kernel@vger.kernel.org>;
 Mon, 11 Sep 2023 12:50:36 -0700 (PDT)
Received: by mail-qk1-x72d.google.com with SMTP id
 af79cd13be357-76dcf1d8957so290912885a.1
        for <linux-kernel@vger.kernel.org>;
 Mon, 11 Sep 2023 12:50:36 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=cmpxchg-org.20230601.gappssmtp.com; s=20230601; t=1694461836;
 x=1695066636; darn=vger.kernel.org;
        h=content-transfer-encoding:mime-version:references:in-reply-to
         :message-id:date:subject:cc:to:from:from:to:cc:subject:date
         :message-id:reply-to;
        bh=DXpNvTPTT9ib/tT2NTgBNzi3cETEqU1uBo715LHCaBY=;
        b=uT4MVQjCu6pBTo7Q8JifJ7gtBsh1Nkc6qOX+z7WR3NeD9H7v/+6G3G6NfEvQFYBK/G
         hvhe98QvObS74O5spsIzhJrrWIhRkPnDMS88PpPN3xH+tLigVSILwSEm0SdoYZxkRPkv
         ogC+ZcUkMeuD+q6iYbDCuFReydVRhWcHVsUXKuhJKDKDUo7+ZYYorruzd9UgTgWxr+C2
         mxbMKUVaLZvi6yiYrmaojHv85+hvXDXgUvpKBxlqf/Q+rI8/qam6YHaF4lRrJNVVOloz
         Mv1rMvgDiqx6DhBTiV4zI1RtLXNR3cEGbjYQpqzcaSNhLLjZj/kU8Dvk9TxHPi7ueSxI
         9Fqw==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=1e100.net; s=20230601; t=1694461836; x=1695066636;
        h=content-transfer-encoding:mime-version:references:in-reply-to
         :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc
         :subject:date:message-id:reply-to;
        bh=DXpNvTPTT9ib/tT2NTgBNzi3cETEqU1uBo715LHCaBY=;
        b=l4FUspZkjsgYeZcnCxAo+eLEldDBTBrbe6IAGA91cUWVYze2CmvKPI++/BgHCnMAQE
         xslp515eZAegZjKOkTw2iVg4wBZdsoX+nSupP/eBpTAX13Ty2kAuF+qkx+JuYdei7gTA
         Fg8czyCFVmFGeSlV5EbLdy0+b2JS/SUYk3XBql3wIFFgwSxw/SGEolkw05J0EVFCZ9pX
         f6A6J+v1DgGt2Udn5kZeLLawCJ3XQUPyHy8IMLGNXXun3piL+0QzPHEO3AhTi4Px3Lwv
         lGWJ6Y+Sa07p/bYALDEifr0Kib9Sn+YJeONLytn/c7MvrANsKvbG8r/yGLTgnjboUl2E
         /OyQ==
X-Gm-Message-State: AOJu0Yw72Xg5vyTG+no0408WV90d2HstE98lStBwZsMAuMwKVUNI2YRb
        C2swV4TzT5YEquKYzIh6CyESQQ==
X-Google-Smtp-Source: 
 AGHT+IFL0UuJsNEAxd5SAPx+lZ3HgMHaH/IGI5QCwmbwAzaYO1UWU+4wttoCCDMIY1y70kB2Wm1/ug==
X-Received: by 2002:a05:620a:991:b0:76e:f279:4c36 with SMTP id
 x17-20020a05620a099100b0076ef2794c36mr9707748qkx.29.1694461835893;
        Mon, 11 Sep 2023 12:50:35 -0700 (PDT)
Received: from localhost
 (2603-7000-0c01-2716-3012-16a2-6bc2-2937.res6.spectrum.com.
 [2603:7000:c01:2716:3012:16a2:6bc2:2937])
        by smtp.gmail.com with ESMTPSA id
 20-20020a05620a071400b00767d6ec578csm2724636qkc.20.2023.09.11.12.50.35
        (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
        Mon, 11 Sep 2023 12:50:35 -0700 (PDT)
From: Johannes Weiner <hannes@cmpxchg.org>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Vlastimil Babka <vbabka@suse.cz>,
        Mel Gorman <mgorman@techsingularity.net>,
        Miaohe Lin <linmiaohe@huawei.com>,
        Kefeng Wang <wangkefeng.wang@huawei.com>,
        Zi Yan <ziy@nvidia.com>, linux-mm@kvack.org,
        linux-kernel@vger.kernel.org
Subject: [PATCH 3/6] mm: page_alloc: move free pages when converting block
 during isolation
Date: Mon, 11 Sep 2023 15:41:44 -0400
Message-ID: <20230911195023.247694-4-hannes@cmpxchg.org>
X-Mailer: git-send-email 2.42.0
In-Reply-To: <20230911195023.247694-1-hannes@cmpxchg.org>
References: <20230911195023.247694-1-hannes@cmpxchg.org>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Precedence: bulk
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Type: text/plain; charset="utf-8"

When claiming a block during compaction isolation, move any remaining
free pages to the correct freelists as well, instead of stranding them
on the wrong list. Otherwise, this encourages incompatible page mixing
down the line, and thus long-term fragmentation.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Reported-by: Mike Kravetz <mike.kravetz@oracle.com>
Reported-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Zi Yan <ziy@nvidia.com>
Suggested-by: Johannes Weiner <hannes@cmpxchg.org>
---
 mm/page_alloc.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 3db405414174..f6f658c3d394 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2548,9 +2548,12 @@ int __isolate_free_page(struct page *page, unsigned =
int order)
 			 * Only change normal pageblocks (i.e., they can merge
 			 * with others)
 			 */
-			if (migratetype_is_mergeable(mt))
+			if (migratetype_is_mergeable(mt)) {
 				set_pageblock_migratetype(page,
 							  MIGRATE_MOVABLE);
+				move_freepages_block(zone, page,
+						     MIGRATE_MOVABLE, NULL);
+			}
 		}
 	}
=20
--=20
2.42.0
From nobody Wed Feb 11 18:28:02 2026
Return-Path: <linux-kernel-owner@vger.kernel.org>
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
	by smtp.lore.kernel.org (Postfix) with ESMTP id 7B8F5CA0EC0
	for <linux-kernel@archiver.kernel.org>; Mon, 11 Sep 2023 21:39:16 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1349366AbjIKVdU (ORCPT
        <rfc822;linux-kernel@archiver.kernel.org>);
        Mon, 11 Sep 2023 17:33:20 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:41952 "EHLO
        lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S244262AbjIKTum (ORCPT
        <rfc822;linux-kernel@vger.kernel.org>);
        Mon, 11 Sep 2023 15:50:42 -0400
Received: from mail-qt1-x836.google.com (mail-qt1-x836.google.com
 [IPv6:2607:f8b0:4864:20::836])
        by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D36541A2
        for <linux-kernel@vger.kernel.org>;
 Mon, 11 Sep 2023 12:50:37 -0700 (PDT)
Received: by mail-qt1-x836.google.com with SMTP id
 d75a77b69052e-4124e1909edso30842971cf.1
        for <linux-kernel@vger.kernel.org>;
 Mon, 11 Sep 2023 12:50:37 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=cmpxchg-org.20230601.gappssmtp.com; s=20230601; t=1694461837;
 x=1695066637; darn=vger.kernel.org;
        h=content-transfer-encoding:mime-version:references:in-reply-to
         :message-id:date:subject:cc:to:from:from:to:cc:subject:date
         :message-id:reply-to;
        bh=XbiqGXh0AoO6pfd7jMLvMSWe/voOxpwIaHwDz3nXxMQ=;
        b=2s6JydhEOT3SkPjRmfQ/DLXTDVl5N94FfLH/DPQde0p4EB2ZlVQJSugLpcd7SvH1/G
         m4IWHh4etAHNqSrI6uMvdl/T++gFv7NZof+pFXoDpGUs2gL6UyCb2xbsir9S/mb7NHgv
         i2AL3o645ySyYeFoDNLpDamCk8mkXLySCkj7RqNhNHSnvZKvDBKSNDsPyhyUk78868fn
         0JgK3i5h3RMAgRcc8SkLfQvzVcLUAYzArZw9+YZA6bBKx0+k8ByvhQrWh1T3Cl7RysPn
         q50+sBgpQXWVhSgEi72hkhO3s7lWx6YaFM76hsydwwKx8PQOdwLCu4qY0s6GLy8PNKGK
         ewQg==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=1e100.net; s=20230601; t=1694461837; x=1695066637;
        h=content-transfer-encoding:mime-version:references:in-reply-to
         :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc
         :subject:date:message-id:reply-to;
        bh=XbiqGXh0AoO6pfd7jMLvMSWe/voOxpwIaHwDz3nXxMQ=;
        b=C8QfHYR/WjnEUUUFPs4Hu8Ujv4U8jJYyRhKUEAtSJINout+vWvvO7NyhvWT1OtG1Nz
         AL/rQG9PL1eTho0mzzuEUWxWEo8vWachRWIooH0pjRJ9DkVf45GH0/kYIAM7kd9Cw/X9
         1DHIEvL9/6qp6ALGDTNp3Qm/lTaDJoVtuU8wUmHf+QE8G4tQWIYbckHEPbPrKdDxlXmh
         Q0yCR7ppz6GelIGEpkfWR1TdaDRzdIU8ipXHc7ppj6sEWVL2lMoRfgLzvJ7UEnUAfftp
         PM9dOz3L4ZDBJMBSMkhsx/RromEZYzKgVuOOzKdq77juys3guo4EvUct9gwrHlxTtIFO
         Yqug==
X-Gm-Message-State: AOJu0YzvqqPr9/zibdH3ViJqnFurCtTUQoFuUEZWgrhJWPIA0zqScfCq
        TPH8h/nA7nh/ROoeZDCTd+GU/g==
X-Google-Smtp-Source: 
 AGHT+IGUAHLS2HPzXYDw7iJOOB1AIzY0p9GHTHFFHLSgWhRs7EOMIgK8ZOejhK0Em8gshplLrGFwBA==
X-Received: by 2002:ac8:5986:0:b0:40d:4c6:bcdb with SMTP id
 e6-20020ac85986000000b0040d04c6bcdbmr13002837qte.5.1694461836984;
        Mon, 11 Sep 2023 12:50:36 -0700 (PDT)
Received: from localhost
 (2603-7000-0c01-2716-3012-16a2-6bc2-2937.res6.spectrum.com.
 [2603:7000:c01:2716:3012:16a2:6bc2:2937])
        by smtp.gmail.com with ESMTPSA id
 z17-20020ac84551000000b004108f6788a6sm2825736qtn.41.2023.09.11.12.50.36
        (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
        Mon, 11 Sep 2023 12:50:36 -0700 (PDT)
From: Johannes Weiner <hannes@cmpxchg.org>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Vlastimil Babka <vbabka@suse.cz>,
        Mel Gorman <mgorman@techsingularity.net>,
        Miaohe Lin <linmiaohe@huawei.com>,
        Kefeng Wang <wangkefeng.wang@huawei.com>,
        Zi Yan <ziy@nvidia.com>, linux-mm@kvack.org,
        linux-kernel@vger.kernel.org
Subject: [PATCH 4/6] mm: page_alloc: fix move_freepages_block() range error
Date: Mon, 11 Sep 2023 15:41:45 -0400
Message-ID: <20230911195023.247694-5-hannes@cmpxchg.org>
X-Mailer: git-send-email 2.42.0
In-Reply-To: <20230911195023.247694-1-hannes@cmpxchg.org>
References: <20230911195023.247694-1-hannes@cmpxchg.org>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Precedence: bulk
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Type: text/plain; charset="utf-8"

When a block is partially outside the zone of the cursor page, the
function cuts the range to the pivot page instead of the zone
start. This can leave large parts of the block behind, which
encourages incompatible page mixing down the line (ask for one type,
get another), and thus long-term fragmentation.

This triggers reliably on the first block in the DMA zone, whose
start_pfn is 1. The block is stolen, but everything before the pivot
page (which was often hundreds of pages) is left on the old list.
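
The difference between the two clamping choices can be reproduced in
a user-space sketch. The names below (struct zone_range,
BLOCK_NR_PAGES, range_spans_pfn(), block_range()) are hypothetical and
the block size of 512 pages is an assumption for the example; only the
range arithmetic mirrors the function:

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed block size for the sketch; must be a power of two. */
#define BLOCK_NR_PAGES 512UL

/* Hypothetical stand-in for the zone's pfn span. */
struct zone_range {
	unsigned long start_pfn;	/* first pfn in the zone */
	unsigned long nr_pages;		/* number of pfns spanned */
};

static bool range_spans_pfn(const struct zone_range *z, unsigned long pfn)
{
	return pfn >= z->start_pfn && pfn < z->start_pfn + z->nr_pages;
}

/*
 * Compute the inclusive pfn range of the block containing @pfn,
 * without crossing the zone boundaries. With @clamp_to_zone_start
 * false, a block start outside the zone is cut to @pfn (old behavior);
 * with true, it is cut to the zone start (new behavior).
 * Returns false if the block's end lies outside the zone.
 */
static bool block_range(const struct zone_range *z, unsigned long pfn,
			bool clamp_to_zone_start,
			unsigned long *start, unsigned long *end)
{
	unsigned long s = pfn & ~(BLOCK_NR_PAGES - 1);
	unsigned long e = s + BLOCK_NR_PAGES - 1;

	assert(range_spans_pfn(z, pfn));
	if (!range_spans_pfn(z, s))
		s = clamp_to_zone_start ? z->start_pfn : pfn;
	if (!range_spans_pfn(z, e))
		return false;
	*start = s;
	*end = e;
	return true;
}
```

For a zone starting at pfn 1 and a pivot page at pfn 300, the old
clamping yields the range 300..511 and skips pfns 1..299, while the
new clamping yields 1..511.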

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Reported-by: Mike Kravetz <mike.kravetz@oracle.com>
Reported-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Zi Yan <ziy@nvidia.com>
Suggested-by: Johannes Weiner <hannes@cmpxchg.org>
---
 mm/page_alloc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index f6f658c3d394..5bbe5f3be5ad 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1652,7 +1652,7 @@ int move_freepages_block(struct zone *zone, struct pa=
ge *page,
=20
 	/* Do not cross zone boundaries */
 	if (!zone_spans_pfn(zone, start_pfn))
-		start_pfn =3D pfn;
+		start_pfn =3D zone->zone_start_pfn;
 	if (!zone_spans_pfn(zone, end_pfn))
 		return 0;
=20
--=20
2.42.0
From nobody Wed Feb 11 18:28:02 2026
Return-Path: <linux-kernel-owner@vger.kernel.org>
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
	by smtp.lore.kernel.org (Postfix) with ESMTP id D6F7ECA0EC7
	for <linux-kernel@archiver.kernel.org>; Mon, 11 Sep 2023 22:39:03 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1377893AbjIKW3H (ORCPT
        <rfc822;linux-kernel@archiver.kernel.org>);
        Mon, 11 Sep 2023 18:29:07 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:41966 "EHLO
        lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S244273AbjIKTuo (ORCPT
        <rfc822;linux-kernel@vger.kernel.org>);
        Mon, 11 Sep 2023 15:50:44 -0400
Received: from mail-qv1-xf33.google.com (mail-qv1-xf33.google.com
 [IPv6:2607:f8b0:4864:20::f33])
        by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 00AA21A5
        for <linux-kernel@vger.kernel.org>;
 Mon, 11 Sep 2023 12:50:38 -0700 (PDT)
Received: by mail-qv1-xf33.google.com with SMTP id
 6a1803df08f44-64cca551ae2so31102166d6.0
        for <linux-kernel@vger.kernel.org>;
 Mon, 11 Sep 2023 12:50:38 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=cmpxchg-org.20230601.gappssmtp.com; s=20230601; t=1694461838;
 x=1695066638; darn=vger.kernel.org;
        h=content-transfer-encoding:mime-version:references:in-reply-to
         :message-id:date:subject:cc:to:from:from:to:cc:subject:date
         :message-id:reply-to;
        bh=YU89p8TG5c7Q3Tc81fgPY+LY6pa0Rdp9Sv7NpbIR5yI=;
        b=mWgMzEHhtqe0g5uR3rQEE88xtioyXtvXsRgXa6ucAEEjOttof+gqkHPJLGlmZhQ91H
         I/jqCjHwcvpZpAwPw7OnFPRV2kbRdr5qD+lDhP5bjYL32CvVWqZF1iQIQai3Fej3b7HG
         3zasNx72Q86yyaK5qPiyRMNY/ano3FVL5e+zWNxBANzBx1sv8pNe9nmO/4BPBe05wqyJ
         fxcGgTr+X72F/xyqcbQzkUnOqJIwUmE3zvK0oIlxWbhoibU1b6N9xNEuzMsNKPXhqSsH
         J8eURqqcG939gM1WDEIscerRcaWIuuOfuSPKEQaYNr4/o5Eaf3Yfzrk2up3bo7j2rxv1
         yoGQ==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=1e100.net; s=20230601; t=1694461838; x=1695066638;
        h=content-transfer-encoding:mime-version:references:in-reply-to
         :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc
         :subject:date:message-id:reply-to;
        bh=YU89p8TG5c7Q3Tc81fgPY+LY6pa0Rdp9Sv7NpbIR5yI=;
        b=NoVNq2VAH6KpbJ4L1rgvx9i/gZy2HJhrLtgbFCvFhPkIG5mE2ustArmABflYepDAjR
         tnIziLi8PuUFtamsNaaN2GvylXbXsgnsH+dnei1OciYGPfwIqCEHYiu1H2s0MbF9B0e8
         P/h9zD0lQJ/x9l4+Gj3xHtKMN7YNDw/xh3ERVetXnv1YaIhWJNir80MUU1zFVy2+7O+8
         8mJ90DJwiLYUi7tgiJ7FdCjcSnmYbezhcXpM8pnunCUKzBDL2vZr5AyOHYiGpcG029PF
         YFNCqnjUFj+OXE9mA1YNY+xWTrkV//wJNW117rY8AmGe1DTOtdo90IU8+iqpCwIlGnlj
         FcWw==
X-Gm-Message-State: AOJu0YzrbtHZ46cyOaCEqJKUkjmNb7axLVpUFrpvhb+m6/nyXsoAlGOg
        VPwFrncR2/krSoUOtE4VOWaCjQ==
X-Google-Smtp-Source: 
 AGHT+IH4M19lI0SDJVY6f78Lb3HcLcJw1JfP6kv4TxQN92+T5Gq5JV7EW0oHeoj0yCZP9Qf0W9+btw==
X-Received: by 2002:a0c:8cc1:0:b0:653:5b5e:7a96 with SMTP id
 q1-20020a0c8cc1000000b006535b5e7a96mr10400951qvb.1.1694461838052;
        Mon, 11 Sep 2023 12:50:38 -0700 (PDT)
Received: from localhost
 (2603-7000-0c01-2716-3012-16a2-6bc2-2937.res6.spectrum.com.
 [2603:7000:c01:2716:3012:16a2:6bc2:2937])
        by smtp.gmail.com with ESMTPSA id
 o3-20020a0ce403000000b00653589babfcsm3146243qvl.132.2023.09.11.12.50.37
        (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
        Mon, 11 Sep 2023 12:50:37 -0700 (PDT)
From: Johannes Weiner <hannes@cmpxchg.org>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Vlastimil Babka <vbabka@suse.cz>,
        Mel Gorman <mgorman@techsingularity.net>,
        Miaohe Lin <linmiaohe@huawei.com>,
        Kefeng Wang <wangkefeng.wang@huawei.com>,
        Zi Yan <ziy@nvidia.com>, linux-mm@kvack.org,
        linux-kernel@vger.kernel.org
Subject: [PATCH 5/6] mm: page_alloc: fix freelist movement during block
 conversion
Date: Mon, 11 Sep 2023 15:41:46 -0400
Message-ID: <20230911195023.247694-6-hannes@cmpxchg.org>
X-Mailer: git-send-email 2.42.0
In-Reply-To: <20230911195023.247694-1-hannes@cmpxchg.org>
References: <20230911195023.247694-1-hannes@cmpxchg.org>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Precedence: bulk
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Type: text/plain; charset="utf-8"

Currently, pageblock type conversion during fallbacks, atomic
reservations and isolation can strand various amounts of free pages on
incorrect freelists.

For example, fallback stealing moves free pages in the block to the
new type's freelists, but then may not actually claim the block for
that type if there aren't enough compatible pages already allocated.

In all cases, free page moving might fail if the block straddles more
than one zone, in which case no free pages are moved at all, but the
block type is changed anyway.

This is detrimental to type hygiene on the freelists. It encourages
incompatible page mixing down the line (ask for one type, get another)
and thus contributes to long-term fragmentation.

Split the process into a proper transaction: first check whether the
conversion will happen, then try to move the free pages, and convert
the block to the new type only if the move succeeded.
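
The ordering can be sketched with a toy model, assuming a block that
records its own type and the type of the freelist holding its free
pages. struct toy_block, convert_old() and convert_new() are
hypothetical names for this illustration, not kernel interfaces.

```c
#include <stdbool.h>

/* Hypothetical toy model of a pageblock; not a kernel structure. */
struct toy_block {
	int block_type;		/* migratetype recorded for the block */
	int freelist_type;	/* freelist its free pages sit on */
	bool straddles_zones;	/* moving its free pages would fail */
};

/* Old ordering: the block is retyped even when the move fails. */
static struct toy_block convert_old(struct toy_block b, int new_type)
{
	if (b.straddles_zones)
		return (struct toy_block){ new_type, b.freelist_type, true };
	return (struct toy_block){ new_type, new_type, false };
}

/* Transactional ordering: check, then move, then retype. */
static struct toy_block convert_new(struct toy_block b, int new_type,
				    bool wanted)
{
	/* 1. Decide up front whether a conversion should happen. */
	if (!wanted)
		return b;
	/* 2. Try to move the free pages; leave the block alone on failure. */
	if (b.straddles_zones)
		return b;
	/* 3. Only after a successful move, retype the block. */
	return (struct toy_block){ new_type, new_type, false };
}
```

For a straddling block, convert_old() returns a block whose type and
freelist disagree, while convert_new() returns it unchanged.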

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reported-by: Mike Kravetz <mike.kravetz@oracle.com>
Reported-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Zi Yan <ziy@nvidia.com>
Suggested-by: Johannes Weiner <hannes@cmpxchg.org>
---
 include/linux/page-isolation.h |   3 +-
 mm/page_alloc.c                | 171 ++++++++++++++++++++-------------
 mm/page_isolation.c            |  22 +++--
 3 files changed, 118 insertions(+), 78 deletions(-)

diff --git a/include/linux/page-isolation.h b/include/linux/page-isolation.h
index 4ac34392823a..8550b3c91480 100644
--- a/include/linux/page-isolation.h
+++ b/include/linux/page-isolation.h
@@ -34,8 +34,7 @@ static inline bool is_migrate_isolate(int migratetype)
 #define REPORT_FAILURE	0x2
=20
 void set_pageblock_migratetype(struct page *page, int migratetype);
-int move_freepages_block(struct zone *zone, struct page *page,
-				int migratetype, int *num_movable);
+int move_freepages_block(struct zone *zone, struct page *page, int migrate=
type);
=20
 int start_isolate_page_range(unsigned long start_pfn, unsigned long end_pf=
n,
 			     int migratetype, int flags, gfp_t gfp_flags);
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 5bbe5f3be5ad..a902593f16dd 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1601,9 +1601,8 @@ static inline struct page *__rmqueue_cma_fallback(str=
uct zone *zone,
  * Note that start_page and end_pages are not aligned on a pageblock
  * boundary. If alignment is required, use move_freepages_block()
  */
-static int move_freepages(struct zone *zone,
-			  unsigned long start_pfn, unsigned long end_pfn,
-			  int migratetype, int *num_movable)
+static int move_freepages(struct zone *zone, unsigned long start_pfn,
+			  unsigned long end_pfn, int migratetype)
 {
 	struct page *page;
 	unsigned long pfn;
@@ -1613,14 +1612,6 @@ static int move_freepages(struct zone *zone,
 	for (pfn =3D start_pfn; pfn <=3D end_pfn;) {
 		page =3D pfn_to_page(pfn);
 		if (!PageBuddy(page)) {
-			/*
-			 * We assume that pages that could be isolated for
-			 * migration are movable. But we don't actually try
-			 * isolating, as that would be expensive.
-			 */
-			if (num_movable &&
-					(PageLRU(page) || __PageMovable(page)))
-				(*num_movable)++;
 			pfn++;
 			continue;
 		}
@@ -1638,26 +1629,62 @@ static int move_freepages(struct zone *zone,
 	return pages_moved;
 }
=20
-int move_freepages_block(struct zone *zone, struct page *page,
-				int migratetype, int *num_movable)
+static bool prep_move_freepages_block(struct zone *zone, struct page *page,
+				      unsigned long *start_pfn,
+				      unsigned long *end_pfn,
+				      int *num_free, int *num_movable)
 {
-	unsigned long start_pfn, end_pfn, pfn;
-
-	if (num_movable)
-		*num_movable =3D 0;
+	unsigned long pfn, start, end;
=20
 	pfn =3D page_to_pfn(page);
-	start_pfn =3D pageblock_start_pfn(pfn);
-	end_pfn =3D pageblock_end_pfn(pfn) - 1;
+	start =3D pageblock_start_pfn(pfn);
+	end =3D pageblock_end_pfn(pfn) - 1;
=20
 	/* Do not cross zone boundaries */
-	if (!zone_spans_pfn(zone, start_pfn))
-		start_pfn =3D zone->zone_start_pfn;
-	if (!zone_spans_pfn(zone, end_pfn))
-		return 0;
+	if (!zone_spans_pfn(zone, start))
+		start =3D zone->zone_start_pfn;
+	if (!zone_spans_pfn(zone, end))
+		return false;
+
+	*start_pfn =3D start;
+	*end_pfn =3D end;
+
+	if (num_free) {
+		*num_free =3D 0;
+		*num_movable =3D 0;
+		for (pfn =3D start; pfn <=3D end;) {
+			page =3D pfn_to_page(pfn);
+			if (PageBuddy(page)) {
+				int nr =3D 1 << buddy_order(page);
+
+				*num_free +=3D nr;
+				pfn +=3D nr;
+				continue;
+			}
+			/*
+			 * We assume that pages that could be isolated for
+			 * migration are movable. But we don't actually try
+			 * isolating, as that would be expensive.
+			 */
+			if (PageLRU(page) || __PageMovable(page))
+				(*num_movable)++;
+			pfn++;
+		}
+	}
=20
-	return move_freepages(zone, start_pfn, end_pfn, migratetype,
-								num_movable);
+	return true;
+}
+
+int move_freepages_block(struct zone *zone, struct page *page,
+			 int migratetype)
+{
+	unsigned long start_pfn, end_pfn;
+
+	if (!prep_move_freepages_block(zone, page, &start_pfn, &end_pfn,
+				       NULL, NULL))
+		return -1;
+
+	return move_freepages(zone, start_pfn, end_pfn, migratetype);
 }
=20
 static void change_pageblock_range(struct page *pageblock_page,
@@ -1742,33 +1769,36 @@ static inline bool boost_watermark(struct zone *zon=
e)
 }
=20
 /*
- * This function implements actual steal behaviour. If order is large enou=
gh,
- * we can steal whole pageblock. If not, we first move freepages in this
- * pageblock to our migratetype and determine how many already-allocated p=
ages
- * are there in the pageblock with a compatible migratetype. If at least h=
alf
- * of pages are free or compatible, we can change migratetype of the pageb=
lock
- * itself, so pages freed in the future will be put on the correct free li=
st.
+ * This function implements actual steal behaviour. If order is large enou=
gh, we
+ * can claim the whole pageblock for the requested migratetype. If not, we=
 check
+ * the pageblock for constituent pages; if at least half of the pages are =
free
+ * or compatible, we can still claim the whole block, so pages freed in the
+ * future will be put on the correct free list. Otherwise, we isolate exac=
tly
+ * the order we need from the fallback block and leave its migratetype alo=
ne.
  */
 static void steal_suitable_fallback(struct zone *zone, struct page *page,
-		unsigned int alloc_flags, int start_type, bool whole_block)
+				    int current_order, int order, int start_type,
+				    unsigned int alloc_flags, bool whole_block)
 {
-	unsigned int current_order =3D buddy_order(page);
 	int free_pages, movable_pages, alike_pages;
-	int old_block_type;
+	unsigned long start_pfn, end_pfn;
+	int block_type;
=20
-	old_block_type =3D get_pageblock_migratetype(page);
+	block_type =3D get_pageblock_migratetype(page);
=20
 	/*
 	 * This can happen due to races and we want to prevent broken
 	 * highatomic accounting.
 	 */
-	if (is_migrate_highatomic(old_block_type))
+	if (is_migrate_highatomic(block_type))
 		goto single_page;
=20
 	/* Take ownership for orders >=3D pageblock_order */
 	if (current_order >=3D pageblock_order) {
+		del_page_from_free_list(page, zone, current_order);
 		change_pageblock_range(page, current_order, start_type);
-		goto single_page;
+		expand(zone, page, order, current_order, start_type);
+		return;
 	}
=20
 	/*
@@ -1783,10 +1813,9 @@ static void steal_suitable_fallback(struct zone *zon=
e, struct page *page,
 	if (!whole_block)
 		goto single_page;
=20
-	free_pages =3D move_freepages_block(zone, page, start_type,
-						&movable_pages);
 	/* moving whole block can fail due to zone boundary conditions */
-	if (!free_pages)
+	if (!prep_move_freepages_block(zone, page, &start_pfn, &end_pfn,
+				       &free_pages, &movable_pages))
 		goto single_page;
=20
 	/*
@@ -1804,7 +1833,7 @@ static void steal_suitable_fallback(struct zone *zone=
, struct page *page,
 		 * vice versa, be conservative since we can't distinguish the
 		 * exact migratetype of non-movable pages.
 		 */
-		if (old_block_type =3D=3D MIGRATE_MOVABLE)
+		if (block_type =3D=3D MIGRATE_MOVABLE)
 			alike_pages =3D pageblock_nr_pages
 						- (free_pages + movable_pages);
 		else
@@ -1815,13 +1844,15 @@ static void steal_suitable_fallback(struct zone *zo=
ne, struct page *page,
 	 * compatible migratability as our allocation, claim the whole block.
 	 */
 	if (free_pages + alike_pages >=3D (1 << (pageblock_order-1)) ||
-			page_group_by_mobility_disabled)
+			page_group_by_mobility_disabled) {
+		move_freepages(zone, start_pfn, end_pfn, start_type);
 		set_pageblock_migratetype(page, start_type);
-
-	return;
+		block_type =3D start_type;
+	}
=20
 single_page:
-	move_to_free_list(page, zone, current_order, start_type);
+	del_page_from_free_list(page, zone, current_order);
+	expand(zone, page, order, current_order, block_type);
 }
=20
 /*
@@ -1885,9 +1916,10 @@ static void reserve_highatomic_pageblock(struct page=
 *page, struct zone *zone)
 	mt =3D get_pageblock_migratetype(page);
 	/* Only reserve normal pageblocks (i.e., they can merge with others) */
 	if (migratetype_is_mergeable(mt)) {
-		zone->nr_reserved_highatomic +=3D pageblock_nr_pages;
-		set_pageblock_migratetype(page, MIGRATE_HIGHATOMIC);
-		move_freepages_block(zone, page, MIGRATE_HIGHATOMIC, NULL);
+		if (move_freepages_block(zone, page, MIGRATE_HIGHATOMIC) !=3D -1) {
+			set_pageblock_migratetype(page, MIGRATE_HIGHATOMIC);
+			zone->nr_reserved_highatomic +=3D pageblock_nr_pages;
+		}
 	}
=20
 out_unlock:
@@ -1912,7 +1944,7 @@ static bool unreserve_highatomic_pageblock(const stru=
ct alloc_context *ac,
 	struct zone *zone;
 	struct page *page;
 	int order;
-	bool ret;
+	int ret;
=20
 	for_each_zone_zonelist_nodemask(zone, z, zonelist, ac->highest_zoneidx,
 								ac->nodemask) {
@@ -1961,10 +1993,14 @@ static bool unreserve_highatomic_pageblock(const st=
ruct alloc_context *ac,
 			 * of pageblocks that cannot be completely freed
 			 * may increase.
 			 */
+			ret =3D move_freepages_block(zone, page, ac->migratetype);
+			/*
+			 * Reserving this block already succeeded, so this should
+			 * not fail on zone boundaries.
+			 */
+			WARN_ON_ONCE(ret =3D=3D -1);
 			set_pageblock_migratetype(page, ac->migratetype);
-			ret =3D move_freepages_block(zone, page, ac->migratetype,
-									NULL);
-			if (ret) {
+			if (ret > 0) {
 				spin_unlock_irqrestore(&zone->lock, flags);
 				return ret;
 			}
@@ -1985,7 +2021,7 @@ static bool unreserve_highatomic_pageblock(const stru=
ct alloc_context *ac,
  * deviation from the rest of this file, to make the for loop
  * condition simpler.
  */
-static __always_inline bool
+static __always_inline struct page *
 __rmqueue_fallback(struct zone *zone, int order, int start_migratetype,
 						unsigned int alloc_flags)
 {
@@ -2032,7 +2068,7 @@ __rmqueue_fallback(struct zone *zone, int order, int =
start_migratetype,
 		goto do_steal;
 	}
=20
-	return false;
+	return NULL;
=20
 find_smallest:
 	for (current_order =3D order; current_order <=3D MAX_ORDER;
@@ -2053,14 +2089,14 @@ __rmqueue_fallback(struct zone *zone, int order, in=
t start_migratetype,
 do_steal:
 	page =3D get_page_from_free_area(area, fallback_mt);
=20
-	steal_suitable_fallback(zone, page, alloc_flags, start_migratetype,
-								can_steal);
+	/* take off list, maybe claim block, expand remainder */
+	steal_suitable_fallback(zone, page, current_order, order,
+				start_migratetype, alloc_flags, can_steal);
=20
 	trace_mm_page_alloc_extfrag(page, order, current_order,
 		start_migratetype, fallback_mt);
=20
-	return true;
-
+	return page;
 }
=20
 /*
@@ -2087,15 +2123,14 @@ __rmqueue(struct zone *zone, unsigned int order, in=
t migratetype,
 				return page;
 		}
 	}
-retry:
+
 	page =3D __rmqueue_smallest(zone, order, migratetype);
 	if (unlikely(!page)) {
 		if (alloc_flags & ALLOC_CMA)
 			page =3D __rmqueue_cma_fallback(zone, order);
-
-		if (!page && __rmqueue_fallback(zone, order, migratetype,
-								alloc_flags))
-			goto retry;
+		else
+			page =3D __rmqueue_fallback(zone, order, migratetype,
+						  alloc_flags);
 	}
 	return page;
 }
@@ -2548,12 +2583,10 @@ int __isolate_free_page(struct page *page, unsigned=
 int order)
 			 * Only change normal pageblocks (i.e., they can merge
 			 * with others)
 			 */
-			if (migratetype_is_mergeable(mt)) {
-				set_pageblock_migratetype(page,
-							  MIGRATE_MOVABLE);
-				move_freepages_block(zone, page,
-						     MIGRATE_MOVABLE, NULL);
-			}
+			if (migratetype_is_mergeable(mt) &&
+			    move_freepages_block(zone, page,
+						 MIGRATE_MOVABLE) !=3D -1)
+				set_pageblock_migratetype(page, MIGRATE_MOVABLE);
 		}
 	}
=20
diff --git a/mm/page_isolation.c b/mm/page_isolation.c
index bcf99ba747a0..cc48a3a52f00 100644
--- a/mm/page_isolation.c
+++ b/mm/page_isolation.c
@@ -178,15 +178,18 @@ static int set_migratetype_isolate(struct page *page,=
 int migratetype, int isol_
 	unmovable =3D has_unmovable_pages(check_unmovable_start, check_unmovable_=
end,
 			migratetype, isol_flags);
 	if (!unmovable) {
-		unsigned long nr_pages;
+		int nr_pages;
 		int mt =3D get_pageblock_migratetype(page);
=20
+		nr_pages =3D move_freepages_block(zone, page, MIGRATE_ISOLATE);
+		/* Block spans zone boundaries? */
+		if (nr_pages =3D=3D -1) {
+			spin_unlock_irqrestore(&zone->lock, flags);
+			return -EBUSY;
+		}
+		__mod_zone_freepage_state(zone, -nr_pages, mt);
 		set_pageblock_migratetype(page, MIGRATE_ISOLATE);
 		zone->nr_isolate_pageblock++;
-		nr_pages =3D move_freepages_block(zone, page, MIGRATE_ISOLATE,
-									NULL);
-
-		__mod_zone_freepage_state(zone, -nr_pages, mt);
 		spin_unlock_irqrestore(&zone->lock, flags);
 		return 0;
 	}
@@ -206,7 +209,7 @@ static int set_migratetype_isolate(struct page *page, i=
nt migratetype, int isol_
 static void unset_migratetype_isolate(struct page *page, int migratetype)
 {
 	struct zone *zone;
-	unsigned long flags, nr_pages;
+	unsigned long flags;
 	bool isolated_page =3D false;
 	unsigned int order;
 	struct page *buddy;
@@ -252,7 +255,12 @@ static void unset_migratetype_isolate(struct page *pag=
e, int migratetype)
 	 * allocation.
 	 */
 	if (!isolated_page) {
-		nr_pages =3D move_freepages_block(zone, page, migratetype, NULL);
+		int nr_pages =3D move_freepages_block(zone, page, migratetype);
+		/*
+		 * Isolating this block already succeeded, so this
+		 * should not fail on zone boundaries.
+		 */
+		WARN_ON_ONCE(nr_pages =3D=3D -1);
 		__mod_zone_freepage_state(zone, nr_pages, migratetype);
 	}
 	set_pageblock_migratetype(page, migratetype);
--=20
2.42.0
From nobody Wed Feb 11 18:28:02 2026
Return-Path: <linux-kernel-owner@vger.kernel.org>
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
	by smtp.lore.kernel.org (Postfix) with ESMTP id 7AE22CA0EC6
	for <linux-kernel@archiver.kernel.org>; Mon, 11 Sep 2023 21:30:51 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1347712AbjIKVZK (ORCPT
        <rfc822;linux-kernel@archiver.kernel.org>);
        Mon, 11 Sep 2023 17:25:10 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:41968 "EHLO
        lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S244274AbjIKTup (ORCPT
        <rfc822;linux-kernel@vger.kernel.org>);
        Mon, 11 Sep 2023 15:50:45 -0400
Received: from mail-qv1-xf31.google.com (mail-qv1-xf31.google.com
 [IPv6:2607:f8b0:4864:20::f31])
        by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 230C81A2
        for <linux-kernel@vger.kernel.org>;
 Mon, 11 Sep 2023 12:50:40 -0700 (PDT)
Received: by mail-qv1-xf31.google.com with SMTP id
 6a1803df08f44-649921ec030so27190646d6.1
        for <linux-kernel@vger.kernel.org>;
 Mon, 11 Sep 2023 12:50:40 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=cmpxchg-org.20230601.gappssmtp.com; s=20230601; t=1694461839;
 x=1695066639; darn=vger.kernel.org;
        h=content-transfer-encoding:mime-version:references:in-reply-to
         :message-id:date:subject:cc:to:from:from:to:cc:subject:date
         :message-id:reply-to;
        bh=cFKNlfdAcUF/YBe6AFZW8MCMiyqnrgJqmqYsuJYCIEg=;
        b=Bc7ojUBrdeGYbjy0O8rL58H+5uxORmdvS6QeviKiGCjkM9sVTTADVK2f91v1ElBif2
         bIXpxkUn+i1tvBEm3oCRKEvzP3RnaqVoUGp6BvY0bJTuRrkzPmh8P+jXlKcpkUU5gS5w
         QtFSliA88zxV5wRwLS5a+OQ+jBsX6TTYO4hllzQOhjyz5pslBpfnYaOwC5aSx8Yito6f
         W61tdXS6xyp1JOGEEbvrvmrwyAhWKK3ETxbRpN7KtyONXlMJBmuhkfRZ/NOZLo9zHxzX
         nRSaiaFMK0jbqu0qwwle+g/Dixkp+BM1DeOO8HJEkdzkjUueK6pKnjO0RWSuH5sVOuN8
         qvIw==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=1e100.net; s=20230601; t=1694461839; x=1695066639;
        h=content-transfer-encoding:mime-version:references:in-reply-to
         :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc
         :subject:date:message-id:reply-to;
        bh=cFKNlfdAcUF/YBe6AFZW8MCMiyqnrgJqmqYsuJYCIEg=;
        b=OwA/EwHq3wy7tluBAw1OPxksn9jNzdpOW+tpIoFBLhOFG69sWQ+Jg59tRLMJPAIPIN
         cPzrV2ELTqN3HD+lndIbb0jXaewXX89q5YcCt4kenpvy+8fEpAIvK7xYGqgPbBXVTggN
         l0xzwoqjPIqwvzo2MW/3iOsxVfqd3/vihjNuMoGETTh1aYVAIjsLKd35eK3Qg/6c8ns7
         cKbz7Os7vp8YrAZxUJ+JgWRgNbYbrZ7uTTdXmS51YWIF/VmmkBK5Q47Sn60u44eusEy+
         Amb860gQwC+edyEh9KcasYwIe6iph0S5joUt71L+UHBNRqzzaieO1gsQUrbUa9WJiB33
         FR7w==
X-Gm-Message-State: AOJu0Yw8xYR8J7PmXKKNSWSQosStydRgYaAQ33DdrT8plmp6kUnZLjmj
        1kQv/pw51WP+gd2VWrrre1p4pA==
X-Google-Smtp-Source: 
 AGHT+IG6zxV3JjIHBg9m8nfMZE9qw+xm/r+Tycsfv0sTed00Bp/idpnNFj9CfKwZpEhUEgfXYmoHtA==
X-Received: by 2002:a0c:b447:0:b0:64f:92d2:44f8 with SMTP id
 e7-20020a0cb447000000b0064f92d244f8mr8876365qvf.59.1694461839179;
        Mon, 11 Sep 2023 12:50:39 -0700 (PDT)
Received: from localhost
 (2603-7000-0c01-2716-3012-16a2-6bc2-2937.res6.spectrum.com.
 [2603:7000:c01:2716:3012:16a2:6bc2:2937])
        by smtp.gmail.com with ESMTPSA id
 f3-20020a0cf3c3000000b0064743dd0633sm3085730qvm.106.2023.09.11.12.50.38
        (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
        Mon, 11 Sep 2023 12:50:38 -0700 (PDT)
From: Johannes Weiner <hannes@cmpxchg.org>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Vlastimil Babka <vbabka@suse.cz>,
        Mel Gorman <mgorman@techsingularity.net>,
        Miaohe Lin <linmiaohe@huawei.com>,
        Kefeng Wang <wangkefeng.wang@huawei.com>,
        Zi Yan <ziy@nvidia.com>, linux-mm@kvack.org,
        linux-kernel@vger.kernel.org
Subject: [PATCH 6/6] mm: page_alloc: consolidate free page accounting
Date: Mon, 11 Sep 2023 15:41:47 -0400
Message-ID: <20230911195023.247694-7-hannes@cmpxchg.org>
X-Mailer: git-send-email 2.42.0
In-Reply-To: <20230911195023.247694-1-hannes@cmpxchg.org>
References: <20230911195023.247694-1-hannes@cmpxchg.org>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Precedence: bulk
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Type: text/plain; charset="utf-8"

Free page accounting currently happens a bit too high up the call
stack, where it has to deal with guard pages, compaction capturing,
block stealing and even page isolation. This is subtle and fragile,
and makes it difficult to hack on the code.

Now that type violations on the freelists have been fixed, push the
accounting down to where pages enter and leave the freelist.
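
As a minimal sketch of the idea, assuming a toy pair of counters in
place of the zone statistics: enum toy_mt, struct toy_counts,
toy_account() and toy_move() are hypothetical names that exist only
here. Every list operation applies its own delta, and isolated pages
are never counted as free.

```c
/* Hypothetical toy types for this sketch; not the kernel's definitions. */
enum toy_mt { TOY_MOVABLE, TOY_CMA, TOY_ISOLATE };

struct toy_counts {
	long free_pages;	/* models NR_FREE_PAGES */
	long free_cma_pages;	/* models NR_FREE_CMA_PAGES */
};

/* Apply a delta of nr pages entering (nr > 0) or leaving (nr < 0). */
static struct toy_counts toy_account(struct toy_counts c, long nr,
				     enum toy_mt mt)
{
	switch (mt) {
	case TOY_ISOLATE:
		/* Isolated pages are not counted as free. */
		return c;
	case TOY_CMA:
		return (struct toy_counts){ c.free_pages + nr,
					    c.free_cma_pages + nr };
	default:
		return (struct toy_counts){ c.free_pages + nr,
					    c.free_cma_pages };
	}
}

/* A move is a removal under old_mt plus an addition under new_mt. */
static struct toy_counts toy_move(struct toy_counts c, long nr,
				  enum toy_mt old_mt, enum toy_mt new_mt)
{
	return toy_account(toy_account(c, -nr, old_mt), nr, new_mt);
}
```

Moving 8 pages from TOY_MOVABLE to TOY_ISOLATE lowers free_pages by 8
without any bookkeeping in the caller; moving them back restores it.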

v3:
- fix CONFIG_UNACCEPTED_MEMORY build (lkp)
v2:
- fix CONFIG_DEBUG_PAGEALLOC build (Mel)

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reported-by: Mike Kravetz <mike.kravetz@oracle.com>
Reported-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Zi Yan <ziy@nvidia.com>
Suggested-by: Johannes Weiner <hannes@cmpxchg.org>
---
 include/linux/mm.h             |  18 ++---
 include/linux/page-isolation.h |   3 +-
 include/linux/vmstat.h         |   8 --
 mm/debug_page_alloc.c          |  12 +--
 mm/internal.h                  |   5 --
 mm/page_alloc.c                | 135 ++++++++++++++++++---------------
 mm/page_isolation.c            |   7 +-
 7 files changed, 90 insertions(+), 98 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index bf5d0b1b16f4..d8698248f280 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -3680,24 +3680,22 @@ static inline bool page_is_guard(struct page *page)
 	return PageGuard(page);
 }
=20
-bool __set_page_guard(struct zone *zone, struct page *page, unsigned int o=
rder,
-		      int migratetype);
+bool __set_page_guard(struct zone *zone, struct page *page, unsigned int o=
rder);
 static inline bool set_page_guard(struct zone *zone, struct page *page,
-				  unsigned int order, int migratetype)
+				  unsigned int order)
 {
 	if (!debug_guardpage_enabled())
 		return false;
-	return __set_page_guard(zone, page, order, migratetype);
+	return __set_page_guard(zone, page, order);
 }
=20
-void __clear_page_guard(struct zone *zone, struct page *page, unsigned int=
 order,
-			int migratetype);
+void __clear_page_guard(struct zone *zone, struct page *page, unsigned int=
 order);
 static inline void clear_page_guard(struct zone *zone, struct page *page,
-				    unsigned int order, int migratetype)
+				    unsigned int order)
 {
 	if (!debug_guardpage_enabled())
 		return;
-	__clear_page_guard(zone, page, order, migratetype);
+	__clear_page_guard(zone, page, order);
 }
=20
 #else	/* CONFIG_DEBUG_PAGEALLOC */
@@ -3707,9 +3705,9 @@ static inline unsigned int debug_guardpage_minorder(v=
oid) { return 0; }
 static inline bool debug_guardpage_enabled(void) { return false; }
 static inline bool page_is_guard(struct page *page) { return false; }
 static inline bool set_page_guard(struct zone *zone, struct page *page,
-			unsigned int order, int migratetype) { return false; }
+			unsigned int order) { return false; }
 static inline void clear_page_guard(struct zone *zone, struct page *page,
-				unsigned int order, int migratetype) {}
+				unsigned int order) {}
 #endif	/* CONFIG_DEBUG_PAGEALLOC */
=20
 #ifdef __HAVE_ARCH_GATE_AREA
diff --git a/include/linux/page-isolation.h b/include/linux/page-isolation.h
index 8550b3c91480..901915747960 100644
--- a/include/linux/page-isolation.h
+++ b/include/linux/page-isolation.h
@@ -34,7 +34,8 @@ static inline bool is_migrate_isolate(int migratetype)
 #define REPORT_FAILURE	0x2
=20
 void set_pageblock_migratetype(struct page *page, int migratetype);
-int move_freepages_block(struct zone *zone, struct page *page, int migrate=
type);
+int move_freepages_block(struct zone *zone, struct page *page,
+			 int old_mt, int new_mt);
=20
 int start_isolate_page_range(unsigned long start_pfn, unsigned long end_pf=
n,
 			     int migratetype, int flags, gfp_t gfp_flags);
diff --git a/include/linux/vmstat.h b/include/linux/vmstat.h
index fed855bae6d8..a4eae03f6094 100644
--- a/include/linux/vmstat.h
+++ b/include/linux/vmstat.h
@@ -487,14 +487,6 @@ static inline void node_stat_sub_folio(struct folio *f=
olio,
 	mod_node_page_state(folio_pgdat(folio), item, -folio_nr_pages(folio));
 }
=20
-static inline void __mod_zone_freepage_state(struct zone *zone, int nr_pag=
es,
-					     int migratetype)
-{
-	__mod_zone_page_state(zone, NR_FREE_PAGES, nr_pages);
-	if (is_migrate_cma(migratetype))
-		__mod_zone_page_state(zone, NR_FREE_CMA_PAGES, nr_pages);
-}
-
 extern const char * const vmstat_text[];
=20
 static inline const char *zone_stat_name(enum zone_stat_item item)
diff --git a/mm/debug_page_alloc.c b/mm/debug_page_alloc.c
index f9d145730fd1..03a810927d0a 100644
--- a/mm/debug_page_alloc.c
+++ b/mm/debug_page_alloc.c
@@ -32,8 +32,7 @@ static int __init debug_guardpage_minorder_setup(char *bu=
f)
 }
 early_param("debug_guardpage_minorder", debug_guardpage_minorder_setup);
=20
-bool __set_page_guard(struct zone *zone, struct page *page, unsigned int o=
rder,
-		      int migratetype)
+bool __set_page_guard(struct zone *zone, struct page *page, unsigned int o=
rder)
 {
 	if (order >=3D debug_guardpage_minorder())
 		return false;
@@ -41,19 +40,12 @@ bool __set_page_guard(struct zone *zone, struct page *p=
age, unsigned int order,
 	__SetPageGuard(page);
 	INIT_LIST_HEAD(&page->buddy_list);
 	set_page_private(page, order);
-	/* Guard pages are not available for any usage */
-	if (!is_migrate_isolate(migratetype))
-		__mod_zone_freepage_state(zone, -(1 << order), migratetype);
=20
 	return true;
 }
=20
-void __clear_page_guard(struct zone *zone, struct page *page, unsigned int=
 order,
-		      int migratetype)
+void __clear_page_guard(struct zone *zone, struct page *page, unsigned int=
 order)
 {
 	__ClearPageGuard(page);
-
 	set_page_private(page, 0);
-	if (!is_migrate_isolate(migratetype))
-		__mod_zone_freepage_state(zone, (1 << order), migratetype);
 }
diff --git a/mm/internal.h b/mm/internal.h
index 30cf724ddbce..d53b70e9cc3a 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -883,11 +883,6 @@ static inline bool is_migrate_highatomic(enum migratet=
ype migratetype)
 	return migratetype =3D=3D MIGRATE_HIGHATOMIC;
 }
=20
-static inline bool is_migrate_highatomic_page(struct page *page)
-{
-	return get_pageblock_migratetype(page) =3D=3D MIGRATE_HIGHATOMIC;
-}
-
 void setup_zone_pageset(struct zone *zone);
=20
 struct migration_target_control {
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index a902593f16dd..bfede72251d9 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -640,24 +640,36 @@ compaction_capture(struct capture_control *capc, stru=
ct page *page,
 }
 #endif /* CONFIG_COMPACTION */
=20
-/* Used for pages not on another list */
-static inline void add_to_free_list(struct page *page, struct zone *zone,
-				    unsigned int order, int migratetype)
+static inline void account_freepages(struct page *page, struct zone *zone,
+				     int nr_pages, int migratetype)
 {
-	struct free_area *area =3D &zone->free_area[order];
+	if (is_migrate_isolate(migratetype))
+		return;
=20
-	list_add(&page->buddy_list, &area->free_list[migratetype]);
-	area->nr_free++;
+	__mod_zone_page_state(zone, NR_FREE_PAGES, nr_pages);
+
+	if (is_migrate_cma(migratetype))
+		__mod_zone_page_state(zone, NR_FREE_CMA_PAGES, nr_pages);
 }
=20
 /* Used for pages not on another list */
-static inline void add_to_free_list_tail(struct page *page, struct zone *z=
one,
-					 unsigned int order, int migratetype)
+static inline void add_to_free_list(struct page *page, struct zone *zone,
+				    unsigned int order, int migratetype,
+				    bool tail)
 {
 	struct free_area *area =3D &zone->free_area[order];
=20
-	list_add_tail(&page->buddy_list, &area->free_list[migratetype]);
+	VM_WARN_ONCE(get_pageblock_migratetype(page) !=3D migratetype,
+		     "page type is %lu, passed migratetype is %d (nr=3D%d)\n",
+		     get_pageblock_migratetype(page), migratetype, 1 << order);
+
+	if (tail)
+		list_add_tail(&page->buddy_list, &area->free_list[migratetype]);
+	else
+		list_add(&page->buddy_list, &area->free_list[migratetype]);
 	area->nr_free++;
+
+	account_freepages(page, zone, 1 << order, migratetype);
 }
=20
 /*
@@ -666,16 +678,28 @@ static inline void add_to_free_list_tail(struct page =
*page, struct zone *zone,
  * allocation again (e.g., optimization for memory onlining).
  */
 static inline void move_to_free_list(struct page *page, struct zone *zone,
-				     unsigned int order, int migratetype)
+				     unsigned int order, int old_mt, int new_mt)
 {
 	struct free_area *area =3D &zone->free_area[order];
=20
-	list_move_tail(&page->buddy_list, &area->free_list[migratetype]);
+	/* Free page moving can fail, so it happens before the type update */
+	VM_WARN_ONCE(get_pageblock_migratetype(page) !=3D old_mt,
+		     "page type is %lu, passed migratetype is %d (nr=3D%d)\n",
+		     get_pageblock_migratetype(page), old_mt, 1 << order);
+
+	list_move_tail(&page->buddy_list, &area->free_list[new_mt]);
+
+	account_freepages(page, zone, -(1 << order), old_mt);
+	account_freepages(page, zone, 1 << order, new_mt);
 }
=20
 static inline void del_page_from_free_list(struct page *page, struct zone =
*zone,
-					   unsigned int order)
+					   unsigned int order, int migratetype)
 {
+	VM_WARN_ONCE(get_pageblock_migratetype(page) !=3D migratetype,
+		     "page type is %lu, passed migratetype is %d (nr=3D%d)\n",
+		     get_pageblock_migratetype(page), migratetype, 1 << order);
+
 	/* clear reported state and update reported page count */
 	if (page_reported(page))
 		__ClearPageReported(page);
@@ -684,6 +708,8 @@ static inline void del_page_from_free_list(struct page =
*page, struct zone *zone,
 	__ClearPageBuddy(page);
 	set_page_private(page, 0);
 	zone->free_area[order].nr_free--;
+
+	account_freepages(page, zone, -(1 << order), migratetype);
 }
=20
 static inline struct page *get_page_from_free_area(struct free_area *area,
@@ -757,23 +783,21 @@ static inline void __free_one_page(struct page *page,
 	VM_BUG_ON_PAGE(page->flags & PAGE_FLAGS_CHECK_AT_PREP, page);
=20
 	VM_BUG_ON(migratetype =3D=3D -1);
-	if (likely(!is_migrate_isolate(migratetype)))
-		__mod_zone_freepage_state(zone, 1 << order, migratetype);
-
 	VM_BUG_ON_PAGE(pfn & ((1 << order) - 1), page);
 	VM_BUG_ON_PAGE(bad_range(zone, page), page);
=20
 	while (order < MAX_ORDER) {
-		if (compaction_capture(capc, page, order, migratetype)) {
-			__mod_zone_freepage_state(zone, -(1 << order),
-								migratetype);
+		int buddy_mt;
+
+		if (compaction_capture(capc, page, order, migratetype))
 			return;
-		}
=20
 		buddy =3D find_buddy_page_pfn(page, pfn, order, &buddy_pfn);
 		if (!buddy)
 			goto done_merging;
=20
+		buddy_mt =3D get_pfnblock_migratetype(buddy, buddy_pfn);
+
 		if (unlikely(order >=3D pageblock_order)) {
 			/*
 			 * We want to prevent merge between freepages on pageblock
@@ -801,9 +825,9 @@ static inline void __free_one_page(struct page *page,
 		 * merge with it and move up one order.
 		 */
 		if (page_is_guard(buddy))
-			clear_page_guard(zone, buddy, order, migratetype);
+			clear_page_guard(zone, buddy, order);
 		else
-			del_page_from_free_list(buddy, zone, order);
+			del_page_from_free_list(buddy, zone, order, buddy_mt);
 		combined_pfn =3D buddy_pfn & pfn;
 		page =3D page + (combined_pfn - pfn);
 		pfn =3D combined_pfn;
@@ -820,10 +844,7 @@ static inline void __free_one_page(struct page *page,
 	else
 		to_tail =3D buddy_merge_likely(pfn, buddy_pfn, page, order);
=20
-	if (to_tail)
-		add_to_free_list_tail(page, zone, order, migratetype);
-	else
-		add_to_free_list(page, zone, order, migratetype);
+	add_to_free_list(page, zone, order, migratetype, to_tail);
=20
 	/* Notify page reporting subsystem of freed page */
 	if (!(fpi_flags & FPI_SKIP_REPORT_NOTIFY))
@@ -865,10 +886,8 @@ int split_free_page(struct page *free_page,
 	}
=20
 	mt =3D get_pfnblock_migratetype(free_page, free_page_pfn);
-	if (likely(!is_migrate_isolate(mt)))
-		__mod_zone_freepage_state(zone, -(1UL << order), mt);
+	del_page_from_free_list(free_page, zone, order, mt);
=20
-	del_page_from_free_list(free_page, zone, order);
 	for (pfn =3D free_page_pfn;
 	     pfn < free_page_pfn + (1UL << order);) {
 		int mt =3D get_pfnblock_migratetype(pfn_to_page(pfn), pfn);
@@ -1388,10 +1407,10 @@ static inline void expand(struct zone *zone, struct=
 page *page,
 		 * Corresponding page table entries will not be touched,
 		 * pages will stay not present in virtual address space
 		 */
-		if (set_page_guard(zone, &page[size], high, migratetype))
+		if (set_page_guard(zone, &page[size], high))
 			continue;
=20
-		add_to_free_list(&page[size], zone, high, migratetype);
+		add_to_free_list(&page[size], zone, high, migratetype, false);
 		set_buddy_order(&page[size], high);
 	}
 }
@@ -1561,7 +1580,7 @@ struct page *__rmqueue_smallest(struct zone *zone, un=
signed int order,
 		page =3D get_page_from_free_area(area, migratetype);
 		if (!page)
 			continue;
-		del_page_from_free_list(page, zone, current_order);
+		del_page_from_free_list(page, zone, current_order, migratetype);
 		expand(zone, page, order, current_order, migratetype);
 		trace_mm_page_alloc_zone_locked(page, order, migratetype,
 				pcp_allowed_order(order) &&
@@ -1602,7 +1621,7 @@ static inline struct page *__rmqueue_cma_fallback(str=
uct zone *zone,
  * boundary. If alignment is required, use move_freepages_block()
  */
 static int move_freepages(struct zone *zone, unsigned long start_pfn,
-			  unsigned long end_pfn, int migratetype)
+			  unsigned long end_pfn, int old_mt, int new_mt)
 {
 	struct page *page;
 	unsigned long pfn;
@@ -1621,7 +1640,7 @@ static int move_freepages(struct zone *zone, unsigned=
 long start_pfn,
 		VM_BUG_ON_PAGE(page_zone(page) !=3D zone, page);
=20
 		order =3D buddy_order(page);
-		move_to_free_list(page, zone, order, migratetype);
+		move_to_free_list(page, zone, order, old_mt, new_mt);
 		pfn +=3D 1 << order;
 		pages_moved +=3D 1 << order;
 	}
@@ -1676,7 +1695,7 @@ static bool prep_move_freepages_block(struct zone *zo=
ne, struct page *page,
 }
=20
 int move_freepages_block(struct zone *zone, struct page *page,
-			 int migratetype)
+			 int old_mt, int new_mt)
 {
 	unsigned long start_pfn, end_pfn;
=20
@@ -1684,7 +1703,7 @@ int move_freepages_block(struct zone *zone, struct pa=
ge *page,
 				       NULL, NULL))
 		return -1;
=20
-	return move_freepages(zone, start_pfn, end_pfn, migratetype);
+	return move_freepages(zone, start_pfn, end_pfn, old_mt, new_mt);
 }
=20
 static void change_pageblock_range(struct page *pageblock_page,
@@ -1795,7 +1814,7 @@ static void steal_suitable_fallback(struct zone *zone=
, struct page *page,
=20
 	/* Take ownership for orders >=3D pageblock_order */
 	if (current_order >=3D pageblock_order) {
-		del_page_from_free_list(page, zone, current_order);
+		del_page_from_free_list(page, zone, current_order, block_type);
 		change_pageblock_range(page, current_order, start_type);
 		expand(zone, page, order, current_order, start_type);
 		return;
@@ -1845,13 +1864,13 @@ static void steal_suitable_fallback(struct zone *zo=
ne, struct page *page,
 	 */
 	if (free_pages + alike_pages >=3D (1 << (pageblock_order-1)) ||
 			page_group_by_mobility_disabled) {
-		move_freepages(zone, start_pfn, end_pfn, start_type);
+		move_freepages(zone, start_pfn, end_pfn, block_type, start_type);
 		set_pageblock_migratetype(page, start_type);
 		block_type =3D start_type;
 	}
=20
 single_page:
-	del_page_from_free_list(page, zone, current_order);
+	del_page_from_free_list(page, zone, current_order, block_type);
 	expand(zone, page, order, current_order, block_type);
 }
=20
@@ -1916,7 +1935,8 @@ static void reserve_highatomic_pageblock(struct page =
*page, struct zone *zone)
 	mt =3D get_pageblock_migratetype(page);
 	/* Only reserve normal pageblocks (i.e., they can merge with others) */
 	if (migratetype_is_mergeable(mt)) {
-		if (move_freepages_block(zone, page, MIGRATE_HIGHATOMIC) !=3D -1) {
+		if (move_freepages_block(zone, page,
+					 mt, MIGRATE_HIGHATOMIC) !=3D -1) {
 			set_pageblock_migratetype(page, MIGRATE_HIGHATOMIC);
 			zone->nr_reserved_highatomic +=3D pageblock_nr_pages;
 		}
@@ -1959,11 +1979,13 @@ static bool unreserve_highatomic_pageblock(const st=
ruct alloc_context *ac,
 		spin_lock_irqsave(&zone->lock, flags);
 		for (order =3D 0; order <=3D MAX_ORDER; order++) {
 			struct free_area *area =3D &(zone->free_area[order]);
+			int mt;
=20
 			page =3D get_page_from_free_area(area, MIGRATE_HIGHATOMIC);
 			if (!page)
 				continue;
=20
+			mt =3D get_pageblock_migratetype(page);
 			/*
 			 * In page freeing path, migratetype change is racy so
 			 * we can counter several free pages in a pageblock
@@ -1971,7 +1993,7 @@ static bool unreserve_highatomic_pageblock(const stru=
ct alloc_context *ac,
 			 * from highatomic to ac->migratetype. So we should
 			 * adjust the count once.
 			 */
-			if (is_migrate_highatomic_page(page)) {
+			if (is_migrate_highatomic(mt)) {
 				/*
 				 * It should never happen but changes to
 				 * locking could inadvertently allow a per-cpu
@@ -1993,7 +2015,8 @@ static bool unreserve_highatomic_pageblock(const stru=
ct alloc_context *ac,
 			 * of pageblocks that cannot be completely freed
 			 * may increase.
 			 */
-			ret =3D move_freepages_block(zone, page, ac->migratetype);
+			ret =3D move_freepages_block(zone, page, mt,
+						   ac->migratetype);
 			/*
 			 * Reserving this block already succeeded, so this should
 			 * not fail on zone boundaries.
@@ -2165,12 +2188,7 @@ static int rmqueue_bulk(struct zone *zone, unsigned =
int order,
 		 * pages are ordered properly.
 		 */
 		list_add_tail(&page->pcp_list, list);
-		if (is_migrate_cma(get_pageblock_migratetype(page)))
-			__mod_zone_page_state(zone, NR_FREE_CMA_PAGES,
-					      -(1 << order));
 	}
-
-	__mod_zone_page_state(zone, NR_FREE_PAGES, -(i << order));
 	spin_unlock_irqrestore(&zone->lock, flags);
=20
 	return i;
@@ -2565,11 +2583,9 @@ int __isolate_free_page(struct page *page, unsigned =
int order)
 		watermark =3D zone->_watermark[WMARK_MIN] + (1UL << order);
 		if (!zone_watermark_ok(zone, 0, watermark, 0, ALLOC_CMA))
 			return 0;
-
-		__mod_zone_freepage_state(zone, -(1UL << order), mt);
 	}
=20
-	del_page_from_free_list(page, zone, order);
+	del_page_from_free_list(page, zone, order, mt);
=20
 	/*
 	 * Set the pageblock if the isolated page is at least half of a
@@ -2584,7 +2600,7 @@ int __isolate_free_page(struct page *page, unsigned i=
nt order)
 			 * with others)
 			 */
 			if (migratetype_is_mergeable(mt) &&
-			    move_freepages_block(zone, page,
+			    move_freepages_block(zone, page, mt,
 						 MIGRATE_MOVABLE) !=3D -1)
 				set_pageblock_migratetype(page, MIGRATE_MOVABLE);
 		}
@@ -2670,8 +2686,6 @@ struct page *rmqueue_buddy(struct zone *preferred_zon=
e, struct zone *zone,
 				return NULL;
 			}
 		}
-		__mod_zone_freepage_state(zone, -(1 << order),
-					  get_pageblock_migratetype(page));
 		spin_unlock_irqrestore(&zone->lock, flags);
 	} while (check_new_pages(page, order));
=20
@@ -6434,8 +6448,9 @@ void __offline_isolated_pages(unsigned long start_pfn=
, unsigned long end_pfn)
=20
 		BUG_ON(page_count(page));
 		BUG_ON(!PageBuddy(page));
+		VM_WARN_ON(get_pageblock_migratetype(page) !=3D MIGRATE_ISOLATE);
 		order =3D buddy_order(page);
-		del_page_from_free_list(page, zone, order);
+		del_page_from_free_list(page, zone, order, MIGRATE_ISOLATE);
 		pfn +=3D (1 << order);
 	}
 	spin_unlock_irqrestore(&zone->lock, flags);
@@ -6486,11 +6501,12 @@ static void break_down_buddy_pages(struct zone *zon=
e, struct page *page,
 			current_buddy =3D page + size;
 		}
=20
-		if (set_page_guard(zone, current_buddy, high, migratetype))
+		if (set_page_guard(zone, current_buddy, high))
 			continue;
=20
 		if (current_buddy !=3D target) {
-			add_to_free_list(current_buddy, zone, high, migratetype);
+			add_to_free_list(current_buddy, zone, high,
+					 migratetype, false);
 			set_buddy_order(current_buddy, high);
 			page =3D next_page;
 		}
@@ -6518,12 +6534,11 @@ bool take_page_off_buddy(struct page *page)
 			int migratetype =3D get_pfnblock_migratetype(page_head,
 								   pfn_head);
=20
-			del_page_from_free_list(page_head, zone, page_order);
+			del_page_from_free_list(page_head, zone, page_order,
+						migratetype);
 			break_down_buddy_pages(zone, page_head, page, 0,
 						page_order, migratetype);
 			SetPageHWPoisonTakenOff(page);
-			if (!is_migrate_isolate(migratetype))
-				__mod_zone_freepage_state(zone, -1, migratetype);
 			ret =3D true;
 			break;
 		}
@@ -6630,7 +6645,7 @@ static bool try_to_accept_memory_one(struct zone *zon=
e)
 	list_del(&page->lru);
 	last =3D list_empty(&zone->unaccepted_pages);
=20
-	__mod_zone_freepage_state(zone, -MAX_ORDER_NR_PAGES, MIGRATE_MOVABLE);
+	account_freepages(page, zone, -MAX_ORDER_NR_PAGES, MIGRATE_MOVABLE);
 	__mod_zone_page_state(zone, NR_UNACCEPTED, -MAX_ORDER_NR_PAGES);
 	spin_unlock_irqrestore(&zone->lock, flags);
=20
@@ -6682,7 +6697,7 @@ static bool __free_unaccepted(struct page *page)
 	spin_lock_irqsave(&zone->lock, flags);
 	first =3D list_empty(&zone->unaccepted_pages);
 	list_add_tail(&page->lru, &zone->unaccepted_pages);
-	__mod_zone_freepage_state(zone, MAX_ORDER_NR_PAGES, MIGRATE_MOVABLE);
+	account_freepages(page, zone, MAX_ORDER_NR_PAGES, MIGRATE_MOVABLE);
 	__mod_zone_page_state(zone, NR_UNACCEPTED, MAX_ORDER_NR_PAGES);
 	spin_unlock_irqrestore(&zone->lock, flags);
=20
diff --git a/mm/page_isolation.c b/mm/page_isolation.c
index cc48a3a52f00..b5c7a9d21257 100644
--- a/mm/page_isolation.c
+++ b/mm/page_isolation.c
@@ -181,13 +181,12 @@ static int set_migratetype_isolate(struct page *page,=
 int migratetype, int isol_
 		int nr_pages;
 		int mt =3D get_pageblock_migratetype(page);
=20
-		nr_pages =3D move_freepages_block(zone, page, MIGRATE_ISOLATE);
+		nr_pages =3D move_freepages_block(zone, page, mt, MIGRATE_ISOLATE);
 		/* Block spans zone boundaries? */
 		if (nr_pages =3D=3D -1) {
 			spin_unlock_irqrestore(&zone->lock, flags);
 			return -EBUSY;
 		}
-		__mod_zone_freepage_state(zone, -nr_pages, mt);
 		set_pageblock_migratetype(page, MIGRATE_ISOLATE);
 		zone->nr_isolate_pageblock++;
 		spin_unlock_irqrestore(&zone->lock, flags);
@@ -255,13 +254,13 @@ static void unset_migratetype_isolate(struct page *pa=
ge, int migratetype)
 	 * allocation.
 	 */
 	if (!isolated_page) {
-		int nr_pages =3D move_freepages_block(zone, page, migratetype);
+		int nr_pages =3D move_freepages_block(zone, page, MIGRATE_ISOLATE,
+						    migratetype);
 		/*
 		 * Isolating this block already succeeded, so this
 		 * should not fail on zone boundaries.
 		 */
 		WARN_ON_ONCE(nr_pages =3D=3D -1);
-		__mod_zone_freepage_state(zone, nr_pages, migratetype);
 	}
 	set_pageblock_migratetype(page, migratetype);
 	if (isolated_page)
--=20
2.42.0