From nobody Thu Jun 18 08:01:44 2026 Received: from mail-pj1-f50.google.com (mail-pj1-f50.google.com [209.85.216.50]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id DC79934C826 for ; Wed, 15 Apr 2026 11:14:26 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.216.50 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1776251668; cv=none; b=B54m3dT3tx793FZ53Jh9Vo/SRrtV9NtV26FIMPI2QFiWOKgcqlkU2EE3Z1muIZfD5SbB6lvuQJXu/gb9Ytg1bO4MSZe9U/XaPqHnO6i4RWu1sktmHUVnr8zLIe+eKgOMvgrETGVksBi4+hB1vURbo/3fxkzHkVKB+RY31rZKAMo= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1776251668; c=relaxed/simple; bh=C/EhSahQ1RWLRKsO1SIZ1e5Tp3HJP9eth020COCsDUk=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=EKOKQO7cZjMWYpyDLLJ6TESoQIuUi1rVSq8yIRGMFoeug6mhpGyhcl3FJ0kfGQX0ZZuzwua/n5IlDiZBcU0fw+P+6M+jSJcMUO4G6EBmoJP5vtqFjEpeNknSN6SJ+W27ME5me0noY3w5Bom7ENk8o4OIik/c8sVHYQMf5FEcXbg= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=bytedance.com; spf=pass smtp.mailfrom=bytedance.com; dkim=pass (2048-bit key) header.d=bytedance.com header.i=@bytedance.com header.b=bPlguRbH; arc=none smtp.client-ip=209.85.216.50 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=bytedance.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=bytedance.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=bytedance.com header.i=@bytedance.com header.b="bPlguRbH" Received: by mail-pj1-f50.google.com with SMTP id 98e67ed59e1d1-35d9f68d011so4284097a91.2 for ; Wed, 15 Apr 2026 04:14:26 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1776251666; x=1776856466; darn=vger.kernel.org; 
h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=t2IwsSjULj2cdQGLjgGLVnmP99rI03lQ3ndkxR63O5c=; b=bPlguRbHRdKKB/EPXGXPNluUTi/xaYsVQrZfL66GwQtKmgOGwbwJjxT/VvJsLvJt+w A7TBVHlRXCIUXdYTefFfMtmdLerbsCeJBLsLVEL9YPXrkCafHHnBqJ+2YS8cAuQIXocC 80Yh1jOVc8lDcYTZCUNLUsm+LZgHAO0X57dJvmMHhoC4kR5aW925DexXpEUImerIPok3 i7rnilDqCsGfUxrRuRsgray9vQzOXz7IshXkjJ2CoFC/1DRiujUuoNxgdMx/6liFCna2 ZTgXplKJnCvBqornkHtHEntvhGf8NCNtxxFc1AYW+mFgxFU80JU9j0cGBvh6KhdVoQi5 W7jw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1776251666; x=1776856466; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=t2IwsSjULj2cdQGLjgGLVnmP99rI03lQ3ndkxR63O5c=; b=EF1yvw1TQZOvGJpGiJTI4pd1ApiWjGaROhrR7OB/sakGroTaQShwDFaQ6+ms/QkIQY B1ZZMLX50Mg/GApB22uR4nGGECZq/66hBYGscUKScOfrbhpg2TndbMQnVKILPul0p9y0 c70HuXOZEDwYPp5pZX/9WANU+p8MXqsd0m+Zuob08QU2ZiRqBpGDwt0aF5sHpB9gInUf 5NLaUtDbrZTgZ08YiyY5iMJ1+KWnv47PiMn0lUqIFWaau5Ds1W83uovfwumKDrnCw/tP ZJuZbaOrzk2MiFlAWlqixadMomoWlXm9UdDitrexc34q83U6ZfTj9krC3jE5nt3ck1x2 swTA== X-Forwarded-Encrypted: i=1; AFNElJ/HcBMLs/Bi7FSISZJBxtCQ4CGCGaJaDULI2n3fGPU3It5rdB0pQlAPwRB9WkZj/6A/ud0mueOgpt2YqU0=@vger.kernel.org X-Gm-Message-State: AOJu0YyO6iMvNP3ihILI2vVJp4T6Dw26MhTej8azd2SilEyotimIaruZ /Sc+BvGmKa3cVbMXQra/A6gY1vLwJRloHd3BM9r635HlUpPaIuR0hGIsScwjQBZsmZA= X-Gm-Gg: AeBDies3r5YSIlmx6kc26MUEmo1xMnWoLj2/1Uqd/1E45jGlOqLyoacl1C76Ar9mdeq thybOT3fTLQC/DjHlawCunRXMU5vvc2dDwYaH9a2oeFCsl0J8j54d5ysAnE44FY8/nlGUiOw1QX ueMqaFcU9bIuedhpl1adfmdjyf75+C1g+RH2FgiAT9mI8nLveikSf5o0ZsF7gXZi3RHnS33mB/j WmgTewJZ+dq46SlmTrgadvXcAgl6zqmDsSrlzAw+BXOMo2/iWAhP6Y5gy6uWZJj+1hPNe9ds7C7 bDFMk8t3/LuP5OfRYE3EnAMjdzrvRCqjvuZ3l+XFWzl6Fa3o5XDNIuRlHSmhMOJVmKVFrAIBIRB RtLACHtjVd9Wbx/HNzVte17MbSjlJUyIzdxT5p7IKoOJFUsj0aA0LtE88sbvHRcrpOn9CD4AlOD 
xZNQXkQsQIZuj9YSSHL3l82OHlCby0hTEe34J9C0PoTUg=
X-Received: by 2002:a17:90b:17cd:b0:35f:c1e1:a263 with SMTP id 98e67ed59e1d1-35fc1f0e974mr7643499a91.19.1776251666069; Wed, 15 Apr 2026 04:14:26 -0700 (PDT)
Received: from n232-176-004.byted.org ([36.110.163.96]) by smtp.gmail.com with ESMTPSA id d9443c01a7336-2b477fd3724sm19509485ad.0.2026.04.15.04.14.20 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 15 Apr 2026 04:14:25 -0700 (PDT)
From: Muchun Song
To: Andrew Morton, David Hildenbrand, Muchun Song, Oscar Salvador, Michael Ellerman, Madhavan Srinivasan
Cc: Lorenzo Stoakes, "Liam R . Howlett", Vlastimil Babka, Mike Rapoport, Suren Baghdasaryan, Michal Hocko, Nicholas Piggin, Christophe Leroy, aneesh.kumar@linux.ibm.com, joao.m.martins@oracle.com, linux-mm@kvack.org, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, Muchun Song
Subject: [PATCH v2 1/6] mm/sparse-vmemmap: Fix vmemmap accounting underflow
Date: Wed, 15 Apr 2026 19:14:07 +0800
Message-Id: <20260415111412.1003526-2-songmuchun@bytedance.com>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20260415111412.1003526-1-songmuchun@bytedance.com>
References: <20260415111412.1003526-1-songmuchun@bytedance.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

In section_activate(), if populate_section_memmap() fails, the error
handling path calls section_deactivate() to roll back the state. This
causes a vmemmap accounting imbalance.

Since commit c3576889d87b ("mm: fix accounting of memmap pages"), memmap
pages are accounted for only after populate_section_memmap() succeeds.
However, the failure path unconditionally calls section_deactivate(),
which decreases the vmemmap count. Consequently, a failure in
populate_section_memmap() leads to an accounting underflow, incorrectly
reducing the system's tracked vmemmap usage.
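The imbalance described above can be sketched in user-space C. This is only an illustrative model, not kernel code: the global counter, activate_old(), activate_new() and counter_after_failure() are hypothetical stand-ins, and the "new" variant assumes the scheme from this patch, where the populate helper accounts unconditionally and the depopulate helper unaccounts.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's memmap page counter. */
static long memmap_pages;

static void memmap_pages_add(long delta)
{
    memmap_pages += delta;
}

/*
 * Old scheme (assumed model): the caller accounts only after a
 * successful populate, but the rollback path always unaccounts.
 */
static bool activate_old(long pages, bool populate_ok)
{
    if (!populate_ok) {
        memmap_pages_add(-pages); /* unaccounts pages never added */
        return false;
    }
    memmap_pages_add(pages);
    return true;
}

/*
 * New scheme (assumed model): accounting lives next to the populate
 * and depopulate steps, so every add is paired with one subtract.
 */
static bool activate_new(long pages, bool populate_ok)
{
    memmap_pages_add(pages);      /* populate step accounts */
    if (!populate_ok) {
        memmap_pages_add(-pages); /* depopulate step unaccounts */
        return false;
    }
    return true;
}

typedef bool (*activate_fn)(long pages, bool populate_ok);

/* Returns the counter value left behind by one failed activation. */
static long counter_after_failure(activate_fn activate, long pages)
{
    bool ok;

    memmap_pages = 0;
    ok = activate(pages, false);
    assert(!ok);
    (void)ok;
    return memmap_pages;
}
```

In this model, counter_after_failure(activate_old, 512) leaves the counter at -512, whereas counter_after_failure(activate_new, 512) leaves it at 0.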
Fix this more thoroughly by moving all accounting calls into the lower level functions that actually perform the vmemmap allocation and freeing: - populate_section_memmap() accounts for newly allocated vmemmap pages - depopulate_section_memmap() unaccounts when vmemmap is freed - free_map_bootmem() handles early bootmem section accounting This ensures proper accounting in all code paths, including error handling and early section cases. Fixes: c3576889d87b ("mm: fix accounting of memmap pages") Signed-off-by: Muchun Song Acked-by: Mike Rapoport (Microsoft) --- mm/sparse-vmemmap.c | 20 ++++++++++++-------- 1 file changed, 12 insertions(+), 8 deletions(-) diff --git a/mm/sparse-vmemmap.c b/mm/sparse-vmemmap.c index 6eadb9d116e4..a7b11248b989 100644 --- a/mm/sparse-vmemmap.c +++ b/mm/sparse-vmemmap.c @@ -656,7 +656,12 @@ static struct page * __meminit populate_section_memmap= (unsigned long pfn, unsigned long nr_pages, int nid, struct vmem_altmap *altmap, struct dev_pagemap *pgmap) { - return __populate_section_memmap(pfn, nr_pages, nid, altmap, pgmap); + struct page *page =3D __populate_section_memmap(pfn, nr_pages, nid, altma= p, + pgmap); + + memmap_pages_add(DIV_ROUND_UP(nr_pages * sizeof(struct page), PAGE_SIZE)); + + return page; } =20 static void depopulate_section_memmap(unsigned long pfn, unsigned long nr_= pages, @@ -665,13 +670,17 @@ static void depopulate_section_memmap(unsigned long p= fn, unsigned long nr_pages, unsigned long start =3D (unsigned long) pfn_to_page(pfn); unsigned long end =3D start + nr_pages * sizeof(struct page); =20 + memmap_pages_add(-1L * (DIV_ROUND_UP(nr_pages * sizeof(struct page), PAGE= _SIZE))); vmemmap_free(start, end, altmap); } + static void free_map_bootmem(struct page *memmap) { unsigned long start =3D (unsigned long)memmap; unsigned long end =3D (unsigned long)(memmap + PAGES_PER_SECTION); =20 + memmap_boot_pages_add(-1L * (DIV_ROUND_UP(PAGES_PER_SECTION * sizeof(stru= ct page), + PAGE_SIZE))); vmemmap_free(start, end, 
NULL); } =20 @@ -774,14 +783,10 @@ static void section_deactivate(unsigned long pfn, uns= igned long nr_pages, * The memmap of early sections is always fully populated. See * section_activate() and pfn_valid() . */ - if (!section_is_early) { - memmap_pages_add(-1L * (DIV_ROUND_UP(nr_pages * sizeof(struct page), PAG= E_SIZE))); + if (!section_is_early) depopulate_section_memmap(pfn, nr_pages, altmap); - } else if (memmap) { - memmap_boot_pages_add(-1L * (DIV_ROUND_UP(nr_pages * sizeof(struct page), - PAGE_SIZE))); + else if (memmap) free_map_bootmem(memmap); - } =20 if (empty) ms->section_mem_map =3D (unsigned long)NULL; @@ -826,7 +831,6 @@ static struct page * __meminit section_activate(int nid= , unsigned long pfn, section_deactivate(pfn, nr_pages, altmap); return ERR_PTR(-ENOMEM); } - memmap_pages_add(DIV_ROUND_UP(nr_pages * sizeof(struct page), PAGE_SIZE)); =20 return memmap; } --=20 2.20.1 From nobody Thu Jun 18 08:01:44 2026 Received: from mail-pl1-f179.google.com (mail-pl1-f179.google.com [209.85.214.179]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 6DF46313E07 for ; Wed, 15 Apr 2026 11:14:33 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.214.179 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1776251675; cv=none; b=sB/z0gc4wNZGnW3DEQKepdP1yOaN6Xni1KS8Rp4lwVIDCfFcfOOrGblGJjIlHJUAHLpO93ILmjOoE72CokV9BH1++MJhc3N8pS4ltD/Lmldy8ka7SfMX7nJvEJ99US28bdykN7N4sh/KRhLCH6pE4t/MCyAfQrBpCrfIq+JSfCA= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1776251675; c=relaxed/simple; bh=RYU420FNmoVSEifIso4m/FVIJcdSe+M6JvN+27inVr8=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; 
b=C8kwhNkNx8+19iXdMoq80BfbFy6iqejMrxIGq+j4c5ZNNPz0SYHqvmhy4YyiWs/78e8736OjyNfHS/k/miENBlqR/+QqfV7Ynbmbr8mHu6JjD8Fv4HTy5NKDoUQGpT9Bx/Qj2U4BMGiARf/i7AxCwDgRXUpo2OWDjTUvNhlrCQQ= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=bytedance.com; spf=pass smtp.mailfrom=bytedance.com; dkim=pass (2048-bit key) header.d=bytedance.com header.i=@bytedance.com header.b=igP5SU29; arc=none smtp.client-ip=209.85.214.179 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=bytedance.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=bytedance.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=bytedance.com header.i=@bytedance.com header.b="igP5SU29" Received: by mail-pl1-f179.google.com with SMTP id d9443c01a7336-2ad4d639db3so34584325ad.0 for ; Wed, 15 Apr 2026 04:14:33 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1776251673; x=1776856473; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=z5+/wr28ovMeZQg9qKyb97SWC0411ZRUTOfCSyd/WG0=; b=igP5SU29gk5sb8zs89Xrsh8iioa+CLi4SjjTsUY3gzo8uQeIFj54tuT8SXq36/KT+m KFKmTW+IuWycIFrG8yfZxItZNZanqzeAD3oT6sG8AqDSNcxOTE4jPf147voutYpq7ipa CBbK0VsJqCzgBUubDhqlDtWMVhE0Oi2bomkOvvMFohC5JZGrOjN0kOPIWvLXkfhI6TBM cxOgtV+DfjVzHjk1AX78nA00vv8XoJcOaXLV+/AXs3P/p377elrtZJZ8I9oIAel3VELs 3kuKa+RP7olh8BQePdbHCBccL1wa2iw/AlecanVConDPkzOHCxqZ65d2uvARVllMklT4 Ncbg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1776251673; x=1776856473; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=z5+/wr28ovMeZQg9qKyb97SWC0411ZRUTOfCSyd/WG0=; 
b=WviA82fvLx69XPPTyw+xl3lRr+GyXEFeqcfRhIXXAyqf2WiTNmr+WUGsNUJHnYGWwZ DqTP0CtQCq3MH2NVq+zYK0KiI1t8G9mYFMvOunuyDRXTGBj14Rchx8fDoAEL2eLLWUic Hc+pCHGE9uFD5LZRLCDegakE07guHtNpz22KtCu4dP3crjNJTOxVa2nV38L0wcfiiPE/ T3R9rvw+0TpCv1Y1/QuBkZI+ewDGUlgGklv6zm1T9ArSgIgltd/OmQKkVMIm5WJRV1vL DfsZMXr87tVpUHZ4o1liTXxEYoOBll9MDNRbjH/35nnDhnkbE7t8ygLB1oWE+l1NJXeo DOHA== X-Forwarded-Encrypted: i=1; AFNElJ8LmNdGCue9+sSjXXWAIw1MMDvPlxjkmOV8g5uZCkdQamEzkLXtxAKSPr0vGzjjhI7JLARolpy3Qu+8eVQ=@vger.kernel.org X-Gm-Message-State: AOJu0YwMLN0WOFFCr2a/GZG+JMIkfI71x3HW4Vh1hSXmXt6rlGVrQ6F5 VSjZoV2AidBi9LOsbjuIdKQXabSrBOK34YAewmznM7Q1JrjfqxXql1oA7BWa3KBTQ38= X-Gm-Gg: AeBDievZK8vqZYEnEL6jL+0t9oHmHCiJPYF1e/hB5VULl1rXXLEHcWZn/q9Omo6xybM eLCWllDDE88aJk1p6F7hlRmvp8EXfxIAKL+N9rZn2BiRqUraK71uOssPLNNENH/iPlwtSu+J7rs IY30uxTyCxm+eDQeszRTu8FLMJBQZy5YjlPpZFqk1yrO2req08SGhh3aZ+sE2cqDMmme8Abd2sC GFUrSkTv4xdO1JJrRCekyO7i2s8xEDE0+pycZaF+y9HN37QB1NOjtLwfh6k5IiDZEDkVOHnbNFN 1Y9N5idoVMEfhLcslgZw4zJ3rvQYrLVzqbXKFT8V3/3Ca6yWipJejoD+fa0KRzWYUe8I6drriem LvFQflHr1tUAvKUrv0Y/xG2CTncX1BysNclhklC8Q6ApblYecXElMVdp/6aX8IFOEeqHpVRDKbg lHNx+H6HsBTtIPCUHRW2jAIsN3kfIASGqFMdAqd3P5VvY= X-Received: by 2002:a17:902:c3c1:b0:2b0:6d33:e7c0 with SMTP id d9443c01a7336-2b2d59766c2mr159851625ad.1.1776251672541; Wed, 15 Apr 2026 04:14:32 -0700 (PDT) Received: from n232-176-004.byted.org ([36.110.163.96]) by smtp.gmail.com with ESMTPSA id d9443c01a7336-2b477fd3724sm19509485ad.0.2026.04.15.04.14.26 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 15 Apr 2026 04:14:31 -0700 (PDT) From: Muchun Song To: Andrew Morton , David Hildenbrand , Muchun Song , Oscar Salvador , Michael Ellerman , Madhavan Srinivasan Cc: Lorenzo Stoakes , "Liam R . 
Howlett", Vlastimil Babka, Mike Rapoport, Suren Baghdasaryan, Michal Hocko, Nicholas Piggin, Christophe Leroy, aneesh.kumar@linux.ibm.com, joao.m.martins@oracle.com, linux-mm@kvack.org, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, Muchun Song
Subject: [PATCH v2 2/6] mm/sparse-vmemmap: Pass @pgmap argument to memory deactivation paths
Date: Wed, 15 Apr 2026 19:14:08 +0800
Message-Id: <20260415111412.1003526-3-songmuchun@bytedance.com>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20260415111412.1003526-1-songmuchun@bytedance.com>
References: <20260415111412.1003526-1-songmuchun@bytedance.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

Currently, the memory hot-remove call chain -- arch_remove_memory(),
__remove_pages(), sparse_remove_section() and section_deactivate() --
does not carry the struct dev_pagemap pointer. This prevents the lower
levels from knowing whether the section was originally populated with
vmemmap optimizations (e.g., DAX with vmemmap optimization enabled).

Without this information, we cannot call vmemmap_can_optimize() to
determine if the vmemmap pages were optimized. As a result, the vmemmap
page accounting during teardown will mistakenly assume a non-optimized
allocation, leading to incorrect memmap statistics.

To lay the groundwork for fixing the vmemmap page accounting, we need to
pass the @pgmap pointer down to the deactivation location. Plumb the
@pgmap argument through the APIs of arch_remove_memory(),
__remove_pages() and sparse_remove_section(), mirroring the
corresponding *_activate() paths.
Signed-off-by: Muchun Song Acked-by: Mike Rapoport (Microsoft) --- arch/arm64/mm/mmu.c | 5 +++-- arch/loongarch/mm/init.c | 5 +++-- arch/powerpc/mm/mem.c | 5 +++-- arch/riscv/mm/init.c | 5 +++-- arch/s390/mm/init.c | 5 +++-- arch/x86/mm/init_64.c | 5 +++-- include/linux/memory_hotplug.h | 8 +++++--- mm/memory_hotplug.c | 12 ++++++------ mm/memremap.c | 4 ++-- mm/sparse-vmemmap.c | 17 +++++++++-------- 10 files changed, 40 insertions(+), 31 deletions(-) diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c index dd85e093ffdb..e5a42b7a0160 100644 --- a/arch/arm64/mm/mmu.c +++ b/arch/arm64/mm/mmu.c @@ -2024,12 +2024,13 @@ int arch_add_memory(int nid, u64 start, u64 size, return ret; } =20 -void arch_remove_memory(u64 start, u64 size, struct vmem_altmap *altmap) +void arch_remove_memory(u64 start, u64 size, struct vmem_altmap *altmap, + struct dev_pagemap *pgmap) { unsigned long start_pfn =3D start >> PAGE_SHIFT; unsigned long nr_pages =3D size >> PAGE_SHIFT; =20 - __remove_pages(start_pfn, nr_pages, altmap); + __remove_pages(start_pfn, nr_pages, altmap, pgmap); __remove_pgd_mapping(swapper_pg_dir, __phys_to_virt(start), size); } =20 diff --git a/arch/loongarch/mm/init.c b/arch/loongarch/mm/init.c index 00f3822b6e47..c9c57f08fa2c 100644 --- a/arch/loongarch/mm/init.c +++ b/arch/loongarch/mm/init.c @@ -86,7 +86,8 @@ int arch_add_memory(int nid, u64 start, u64 size, struct = mhp_params *params) return ret; } =20 -void arch_remove_memory(u64 start, u64 size, struct vmem_altmap *altmap) +void arch_remove_memory(u64 start, u64 size, struct vmem_altmap *altmap, + struct dev_pagemap *pgmap) { unsigned long start_pfn =3D start >> PAGE_SHIFT; unsigned long nr_pages =3D size >> PAGE_SHIFT; @@ -95,7 +96,7 @@ void arch_remove_memory(u64 start, u64 size, struct vmem_= altmap *altmap) /* With altmap the first mapped page is offset from @start */ if (altmap) page +=3D vmem_altmap_offset(altmap); - __remove_pages(start_pfn, nr_pages, altmap); + __remove_pages(start_pfn, nr_pages, 
altmap, pgmap); } #endif =20 diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c index 648d0c5602ec..4c1afab91996 100644 --- a/arch/powerpc/mm/mem.c +++ b/arch/powerpc/mm/mem.c @@ -158,12 +158,13 @@ int __ref arch_add_memory(int nid, u64 start, u64 siz= e, return rc; } =20 -void __ref arch_remove_memory(u64 start, u64 size, struct vmem_altmap *alt= map) +void __ref arch_remove_memory(u64 start, u64 size, struct vmem_altmap *alt= map, + struct dev_pagemap *pgmap) { unsigned long start_pfn =3D start >> PAGE_SHIFT; unsigned long nr_pages =3D size >> PAGE_SHIFT; =20 - __remove_pages(start_pfn, nr_pages, altmap); + __remove_pages(start_pfn, nr_pages, altmap, pgmap); arch_remove_linear_mapping(start, size); } #endif diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c index decd7df40fa4..b0092fb842a3 100644 --- a/arch/riscv/mm/init.c +++ b/arch/riscv/mm/init.c @@ -1717,9 +1717,10 @@ int __ref arch_add_memory(int nid, u64 start, u64 si= ze, struct mhp_params *param return ret; } =20 -void __ref arch_remove_memory(u64 start, u64 size, struct vmem_altmap *alt= map) +void __ref arch_remove_memory(u64 start, u64 size, struct vmem_altmap *alt= map, + struct dev_pagemap *pgmap) { - __remove_pages(start >> PAGE_SHIFT, size >> PAGE_SHIFT, altmap); + __remove_pages(start >> PAGE_SHIFT, size >> PAGE_SHIFT, altmap, pgmap); remove_linear_mapping(start, size); flush_tlb_all(); } diff --git a/arch/s390/mm/init.c b/arch/s390/mm/init.c index 1f72efc2a579..11a689423440 100644 --- a/arch/s390/mm/init.c +++ b/arch/s390/mm/init.c @@ -276,12 +276,13 @@ int arch_add_memory(int nid, u64 start, u64 size, return rc; } =20 -void arch_remove_memory(u64 start, u64 size, struct vmem_altmap *altmap) +void arch_remove_memory(u64 start, u64 size, struct vmem_altmap *altmap, + struct dev_pagemap *pgmap) { unsigned long start_pfn =3D start >> PAGE_SHIFT; unsigned long nr_pages =3D size >> PAGE_SHIFT; =20 - __remove_pages(start_pfn, nr_pages, altmap); + __remove_pages(start_pfn, nr_pages, 
altmap, pgmap); vmem_remove_mapping(start, size); } #endif /* CONFIG_MEMORY_HOTPLUG */ diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c index df2261fa4f98..77b889b71cf3 100644 --- a/arch/x86/mm/init_64.c +++ b/arch/x86/mm/init_64.c @@ -1288,12 +1288,13 @@ kernel_physical_mapping_remove(unsigned long start,= unsigned long end) remove_pagetable(start, end, true, NULL); } =20 -void __ref arch_remove_memory(u64 start, u64 size, struct vmem_altmap *alt= map) +void __ref arch_remove_memory(u64 start, u64 size, struct vmem_altmap *alt= map, + struct dev_pagemap *pgmap) { unsigned long start_pfn =3D start >> PAGE_SHIFT; unsigned long nr_pages =3D size >> PAGE_SHIFT; =20 - __remove_pages(start_pfn, nr_pages, altmap); + __remove_pages(start_pfn, nr_pages, altmap, pgmap); kernel_physical_mapping_remove(start, start + size); } #endif /* CONFIG_MEMORY_HOTPLUG */ diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h index 815e908c4135..7c9d66729c60 100644 --- a/include/linux/memory_hotplug.h +++ b/include/linux/memory_hotplug.h @@ -135,9 +135,10 @@ static inline bool movable_node_is_enabled(void) return movable_node_enabled; } =20 -extern void arch_remove_memory(u64 start, u64 size, struct vmem_altmap *al= tmap); +extern void arch_remove_memory(u64 start, u64 size, struct vmem_altmap *al= tmap, + struct dev_pagemap *pgmap); extern void __remove_pages(unsigned long start_pfn, unsigned long nr_pages, - struct vmem_altmap *altmap); + struct vmem_altmap *altmap, struct dev_pagemap *pgmap); =20 /* reasonably generic interface to expand the physical pages */ extern int __add_pages(int nid, unsigned long start_pfn, unsigned long nr_= pages, @@ -307,7 +308,8 @@ extern int sparse_add_section(int nid, unsigned long pf= n, unsigned long nr_pages, struct vmem_altmap *altmap, struct dev_pagemap *pgmap); extern void sparse_remove_section(unsigned long pfn, unsigned long nr_page= s, - struct vmem_altmap *altmap); + struct vmem_altmap *altmap, + struct 
dev_pagemap *pgmap); extern struct zone *zone_for_pfn_range(enum mmop online_type, int nid, struct memory_group *group, unsigned long start_pfn, unsigned long nr_pages); diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c index 2a943ec57c85..6a9e2dc751d2 100644 --- a/mm/memory_hotplug.c +++ b/mm/memory_hotplug.c @@ -583,7 +583,7 @@ void remove_pfn_range_from_zone(struct zone *zone, * calling offline_pages(). */ void __remove_pages(unsigned long pfn, unsigned long nr_pages, - struct vmem_altmap *altmap) + struct vmem_altmap *altmap, struct dev_pagemap *pgmap) { const unsigned long end_pfn =3D pfn + nr_pages; unsigned long cur_nr_pages; @@ -598,7 +598,7 @@ void __remove_pages(unsigned long pfn, unsigned long nr= _pages, /* Select all remaining pages up to the next section boundary */ cur_nr_pages =3D min(end_pfn - pfn, SECTION_ALIGN_UP(pfn + 1) - pfn); - sparse_remove_section(pfn, cur_nr_pages, altmap); + sparse_remove_section(pfn, cur_nr_pages, altmap, pgmap); } } =20 @@ -1425,7 +1425,7 @@ static void remove_memory_blocks_and_altmaps(u64 star= t, u64 size) =20 remove_memory_block_devices(cur_start, memblock_size); =20 - arch_remove_memory(cur_start, memblock_size, altmap); + arch_remove_memory(cur_start, memblock_size, altmap, NULL); =20 /* Verify that all vmemmap pages have actually been freed. 
*/ WARN(altmap->alloc, "Altmap not fully unmapped"); @@ -1468,7 +1468,7 @@ static int create_altmaps_and_memory_blocks(int nid, = struct memory_group *group, ret =3D create_memory_block_devices(cur_start, memblock_size, nid, params.altmap, group); if (ret) { - arch_remove_memory(cur_start, memblock_size, NULL); + arch_remove_memory(cur_start, memblock_size, NULL, NULL); kfree(params.altmap); goto out; } @@ -1554,7 +1554,7 @@ int add_memory_resource(int nid, struct resource *res= , mhp_t mhp_flags) /* create memory block devices after memory was added */ ret =3D create_memory_block_devices(start, size, nid, NULL, group); if (ret) { - arch_remove_memory(start, size, params.altmap); + arch_remove_memory(start, size, params.altmap, NULL); goto error; } } @@ -2266,7 +2266,7 @@ static int try_remove_memory(u64 start, u64 size) * No altmaps present, do the removal directly */ remove_memory_block_devices(start, size); - arch_remove_memory(start, size, NULL); + arch_remove_memory(start, size, NULL, NULL); } else { /* all memblocks in the range have altmaps */ remove_memory_blocks_and_altmaps(start, size); diff --git a/mm/memremap.c b/mm/memremap.c index ac7be07e3361..c45b90f334ea 100644 --- a/mm/memremap.c +++ b/mm/memremap.c @@ -97,10 +97,10 @@ static void pageunmap_range(struct dev_pagemap *pgmap, = int range_id) PHYS_PFN(range_len(range))); if (pgmap->type =3D=3D MEMORY_DEVICE_PRIVATE) { __remove_pages(PHYS_PFN(range->start), - PHYS_PFN(range_len(range)), NULL); + PHYS_PFN(range_len(range)), NULL, pgmap); } else { arch_remove_memory(range->start, range_len(range), - pgmap_altmap(pgmap)); + pgmap_altmap(pgmap), pgmap); kasan_remove_zero_shadow(__va(range->start), range_len(range)); } mem_hotplug_done(); diff --git a/mm/sparse-vmemmap.c b/mm/sparse-vmemmap.c index a7b11248b989..40290fbc1db4 100644 --- a/mm/sparse-vmemmap.c +++ b/mm/sparse-vmemmap.c @@ -665,7 +665,7 @@ static struct page * __meminit populate_section_memmap(= unsigned long pfn, } =20 static void 
depopulate_section_memmap(unsigned long pfn, unsigned long nr_= pages, - struct vmem_altmap *altmap) + struct vmem_altmap *altmap, struct dev_pagemap *pgmap) { unsigned long start =3D (unsigned long) pfn_to_page(pfn); unsigned long end =3D start + nr_pages * sizeof(struct page); @@ -674,7 +674,8 @@ static void depopulate_section_memmap(unsigned long pfn= , unsigned long nr_pages, vmemmap_free(start, end, altmap); } =20 -static void free_map_bootmem(struct page *memmap) +static void free_map_bootmem(struct page *memmap, struct vmem_altmap *altm= ap, + struct dev_pagemap *pgmap) { unsigned long start =3D (unsigned long)memmap; unsigned long end =3D (unsigned long)(memmap + PAGES_PER_SECTION); @@ -746,7 +747,7 @@ static int fill_subsection_map(unsigned long pfn, unsig= ned long nr_pages) * usage map, but still need to free the vmemmap range. */ static void section_deactivate(unsigned long pfn, unsigned long nr_pages, - struct vmem_altmap *altmap) + struct vmem_altmap *altmap, struct dev_pagemap *pgmap) { struct mem_section *ms =3D __pfn_to_section(pfn); bool section_is_early =3D early_section(ms); @@ -784,9 +785,9 @@ static void section_deactivate(unsigned long pfn, unsig= ned long nr_pages, * section_activate() and pfn_valid() . 
*/ if (!section_is_early) - depopulate_section_memmap(pfn, nr_pages, altmap); + depopulate_section_memmap(pfn, nr_pages, altmap, pgmap); else if (memmap) - free_map_bootmem(memmap); + free_map_bootmem(memmap, altmap, pgmap); =20 if (empty) ms->section_mem_map =3D (unsigned long)NULL; @@ -828,7 +829,7 @@ static struct page * __meminit section_activate(int nid= , unsigned long pfn, =20 memmap =3D populate_section_memmap(pfn, nr_pages, nid, altmap, pgmap); if (!memmap) { - section_deactivate(pfn, nr_pages, altmap); + section_deactivate(pfn, nr_pages, altmap, pgmap); return ERR_PTR(-ENOMEM); } =20 @@ -889,13 +890,13 @@ int __meminit sparse_add_section(int nid, unsigned lo= ng start_pfn, } =20 void sparse_remove_section(unsigned long pfn, unsigned long nr_pages, - struct vmem_altmap *altmap) + struct vmem_altmap *altmap, struct dev_pagemap *pgmap) { struct mem_section *ms =3D __pfn_to_section(pfn); =20 if (WARN_ON_ONCE(!valid_section(ms))) return; =20 - section_deactivate(pfn, nr_pages, altmap); + section_deactivate(pfn, nr_pages, altmap, pgmap); } #endif /* CONFIG_MEMORY_HOTPLUG */ --=20 2.20.1 From nobody Thu Jun 18 08:01:44 2026 Received: from mail-pl1-f169.google.com (mail-pl1-f169.google.com [209.85.214.169]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id C64A336604A for ; Wed, 15 Apr 2026 11:14:39 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.214.169 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1776251681; cv=none; b=NeXA8Wu4nLQacLmdtd4p1VNr5DvlIx2sbJten/ISjqb0/2XNygg3GUfZJVVhm/45e0TViWKH98m7xzHMC6fYPLrKlmJ3vWepBneLN+U0Iu5LVx0XpGX699XHm93IBSs1DF6YQvkd5RtGaW10aseqMzGRRZ4O83mQrAmp/67mVME= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1776251681; c=relaxed/simple; bh=4EoFcxXiqyz1WVaO4esUREYa5gKezgAPn/lvIA/IdgU=; 
h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=hKIm3ESsLKyV+HOGkxRJk4hcMAhx6lk9MOPIdW7+sLu7kxnj3ndkd+9yrlrskAvqgDJhgxljluMMU20UT/TOqjJPIZN5qjMIgrTL1QGoMCTQTxsiX6BiTYyseyJ16JUMTWbzkQyiTP0I7qflmhaigIbW2MFVfyfdPHJtk0tyBEA= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=bytedance.com; spf=pass smtp.mailfrom=bytedance.com; dkim=pass (2048-bit key) header.d=bytedance.com header.i=@bytedance.com header.b=bGscesIW; arc=none smtp.client-ip=209.85.214.169 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=bytedance.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=bytedance.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=bytedance.com header.i=@bytedance.com header.b="bGscesIW" Received: by mail-pl1-f169.google.com with SMTP id d9443c01a7336-2b4583f0a1aso21441015ad.3 for ; Wed, 15 Apr 2026 04:14:39 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1776251679; x=1776856479; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=vC2b62MaZVNGL0bmie++eHoqVNtyAoC7qXucopPxUdw=; b=bGscesIWWodblw+E81BqGt+r/I12Eo7OloSdJsyF8gzS9iC5LrZ0cNHRv2JZ6njHQ3 PNC56+X4Y7/QrXp2yLibl8YnP06zRxKPfh8CzS/JCUXxsVxKog+F0kPWsSWQ72uAQwnQ Ohlsf/VyrpopyZPOqItevVdDAbUIL66tmjFujxEqKrUOKWJ43O7BfYAjJ38bbs/xmOkm pT3CalwFBhA1RiGRAUiXdyUUHfQ2Wmgn9S/M9GJhwMMZAL9ABHWiCPlHzKzhQBhTRNpo YUrmQ4ABw7kfTEQP8Y9KKe732+ZWKJtHeMnMwNSvVxXGPw7URgu+zOJhui1/MaXnW+Hy hfrw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1776251679; x=1776856479; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; 
bh=vC2b62MaZVNGL0bmie++eHoqVNtyAoC7qXucopPxUdw=; b=qacPKs6s3ttRKaBgz4fQvMZD4Ffnm8jiE6ump052tcW3jzjgXDDb+D4z74aUQ5Epes HoY/49vv3PWp+bsZ5gRI64ecu1gqMeH/Amsx50a5VVbDSWUygKXmf/JUFaU5NzTvxhQC eUgGVxvSWF8r2XWhRUA1Ahj4XGKnnsc+k3n2UNO75x6LwrbeMid/bah5t3wm5akW3i6V SqidClkzEgUrFN+Kw98h9sMDcCuUtB5mFKPJND8gt0+ip8T3k6jI1jaf1Bt4vAbIGotI 4g7KU7AJrVumQUQOGCiC4FnaXFxbQuyw78afovCcaeWltiDVRtd4ivySv9byRMpAhEX1 MRcA== X-Forwarded-Encrypted: i=1; AFNElJ/L7RHUpH83NPUF4Yp7JwTUbttlYYTC4J76dw+6fDvgXgwfZnXa9eC7FlWIW23UVMJTPfd6I/xq4vfd4Vo=@vger.kernel.org X-Gm-Message-State: AOJu0Yx/p/d3Hz+nfGFPI7cuLwIF6a4tlKfwC9E+9HutO25r5P0HkkHo Hj/HKScCIKDW8uHR4TFb4NRvNZ5GVvDttN71EkjDNMclAJAylEGAwEabeA0JlQDSBJE= X-Gm-Gg: AeBDies8EqMdm/gG2B+J4Y2jK/WHUi7bjsAwaqjhc0LF7wwzFTpu3BpZoHQuBQOGeBj uxsdE6TYq+HY/5azQ87rXMiZ+cgIT17lgaArwDi6Ip+dZR15fdaEpEx6Z4xSa6vH4tjqKa1TEGC Og/HzcQfET0mc+PS6r/N0g/M6SIm7Gx7hGaXhfamM0a7tHOCTmFDOyzllpL+Z611qoNl28QNWXm Mbdqa1v339cd400peU8/rsb7o5xHDAf0E8SAgdNpvS76McEcKEM5Gqw1Zp/8BDxN8Vhr9G1qgTW UNlB7bZTk7ZKAlALvs5YWgPGdBeWgiPllunEyd5zoSxTzftApS7AR1HTqxckKHENypBHyFqPUup /HLbAGLjVdPP2vmM48Lnm+9g450KE8XTheB6oaQfmkjVpGps1JWUPHTwmJNcAznMRod2dX3nxUb W1UydHZs1Uv5CGdXF13Chk+EIQqDdlw6GZfnwxR+W2cRs= X-Received: by 2002:a17:903:249:b0:2ae:6192:8d78 with SMTP id d9443c01a7336-2b2d59411dbmr209611585ad.1.1776251679043; Wed, 15 Apr 2026 04:14:39 -0700 (PDT) Received: from n232-176-004.byted.org ([36.110.163.96]) by smtp.gmail.com with ESMTPSA id d9443c01a7336-2b477fd3724sm19509485ad.0.2026.04.15.04.14.32 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 15 Apr 2026 04:14:38 -0700 (PDT) From: Muchun Song To: Andrew Morton , David Hildenbrand , Muchun Song , Oscar Salvador , Michael Ellerman , Madhavan Srinivasan Cc: Lorenzo Stoakes , "Liam R . 
Howlett", Vlastimil Babka, Mike Rapoport, Suren Baghdasaryan, Michal Hocko, Nicholas Piggin, Christophe Leroy, aneesh.kumar@linux.ibm.com, joao.m.martins@oracle.com, linux-mm@kvack.org, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, Muchun Song
Subject: [PATCH v2 3/6] mm/sparse-vmemmap: Fix DAX vmemmap accounting with optimization
Date: Wed, 15 Apr 2026 19:14:09 +0800
Message-Id: <20260415111412.1003526-4-songmuchun@bytedance.com>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20260415111412.1003526-1-songmuchun@bytedance.com>
References: <20260415111412.1003526-1-songmuchun@bytedance.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

When vmemmap optimization is enabled for DAX, the nr_memmap_pages
counter in /proc/vmstat is incorrect. The current code always accounts
for the full, non-optimized vmemmap size, but vmemmap optimization
reduces the actual number of vmemmap pages by reusing tail pages. This
causes the system to overcount vmemmap usage, leading to inaccurate page
statistics in /proc/vmstat.

Fix this by introducing section_vmemmap_pages(), which returns the exact
vmemmap page count for a given pfn range based on whether optimization
is in effect.
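The counting rule can be sketched in user-space C, following the branches of section_vmemmap_pages() in the diff below. This is an illustrative model only. The constants are assumed values (4 KiB pages, a 64-byte struct page, 128 MiB sections, two reserved vmemmap pages per compound page), STRUCT_PAGE_SIZE and the _sketch suffix are hypothetical names, and the can_optimize flag stands in for the result of vmemmap_can_optimize(altmap, pgmap) while order stands in for pgmap->vmemmap_shift.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed values for illustration; real kernels derive these per arch. */
#define PAGE_SIZE          4096UL
#define STRUCT_PAGE_SIZE   64UL   /* stand-in for sizeof(struct page) */
#define PFN_SECTION_SHIFT  15     /* 2^15 pages of 4 KiB = 128 MiB */
#define PAGES_PER_SECTION  (1UL << PFN_SECTION_SHIFT)
#define VMEMMAP_RESERVE_NR 2UL

/* Number of vmemmap pages backing [pfn, pfn + nr_pages) in one section. */
static unsigned long section_vmemmap_pages_sketch(unsigned long pfn,
                                                  unsigned long nr_pages,
                                                  unsigned int order,
                                                  bool can_optimize)
{
    unsigned long pages_per_compound = 1UL << order;
    unsigned long align = pages_per_compound < PAGES_PER_SECTION ?
                          pages_per_compound : PAGES_PER_SECTION;

    /* The range must be aligned to min(compound size, section size). */
    assert(((pfn | nr_pages) & (align - 1)) == 0);

    /* No optimization: one struct page per pfn, rounded up to pages. */
    if (!can_optimize)
        return (nr_pages * STRUCT_PAGE_SIZE + PAGE_SIZE - 1) / PAGE_SIZE;

    /* Compound page smaller than a section: reserved pages per compound. */
    if (order < PFN_SECTION_SHIFT)
        return VMEMMAP_RESERVE_NR * nr_pages / pages_per_compound;

    /* Compound page spans sections: only its first section holds pages. */
    if ((pfn & (pages_per_compound - 1)) == 0)
        return VMEMMAP_RESERVE_NR;

    return 0;
}
```

Under these assumed values, a full section without optimization accounts 512 vmemmap pages, while the same section with order-9 (2 MiB) compound pages accounts 128. With order-18 (1 GiB) compound pages, the first section of a compound page accounts 2 pages and each following section accounts 0.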
Fixes: 15995a352474 ("mm: report per-page metadata information") Signed-off-by: Muchun Song Acked-by: Mike Rapoport (Microsoft) --- mm/sparse-vmemmap.c | 32 ++++++++++++++++++++++++++++---- 1 file changed, 28 insertions(+), 4 deletions(-) diff --git a/mm/sparse-vmemmap.c b/mm/sparse-vmemmap.c index 40290fbc1db4..05e3e2b94e32 100644 --- a/mm/sparse-vmemmap.c +++ b/mm/sparse-vmemmap.c @@ -652,6 +652,29 @@ void offline_mem_sections(unsigned long start_pfn, uns= igned long end_pfn) } } =20 +static int __meminit section_vmemmap_pages(unsigned long pfn, unsigned lon= g nr_pages, + struct vmem_altmap *altmap, + struct dev_pagemap *pgmap) +{ + unsigned int order =3D pgmap ? pgmap->vmemmap_shift : 0; + unsigned long pages_per_compound =3D 1L << order; + + VM_WARN_ON_ONCE(!IS_ALIGNED(pfn | nr_pages, min(pages_per_compound, + PAGES_PER_SECTION))); + VM_WARN_ON_ONCE(pfn_to_section_nr(pfn) !=3D pfn_to_section_nr(pfn + nr_pa= ges - 1)); + + if (!vmemmap_can_optimize(altmap, pgmap)) + return DIV_ROUND_UP(nr_pages * sizeof(struct page), PAGE_SIZE); + + if (order < PFN_SECTION_SHIFT) + return VMEMMAP_RESERVE_NR * nr_pages / pages_per_compound; + + if (IS_ALIGNED(pfn, pages_per_compound)) + return VMEMMAP_RESERVE_NR; + + return 0; +} + static struct page * __meminit populate_section_memmap(unsigned long pfn, unsigned long nr_pages, int nid, struct vmem_altmap *altmap, struct dev_pagemap *pgmap) @@ -659,7 +682,7 @@ static struct page * __meminit populate_section_memmap(= unsigned long pfn, struct page *page =3D __populate_section_memmap(pfn, nr_pages, nid, altma= p, pgmap); =20 - memmap_pages_add(DIV_ROUND_UP(nr_pages * sizeof(struct page), PAGE_SIZE)); + memmap_pages_add(section_vmemmap_pages(pfn, nr_pages, altmap, pgmap)); =20 return page; } @@ -670,7 +693,7 @@ static void depopulate_section_memmap(unsigned long pfn= , unsigned long nr_pages, unsigned long start =3D (unsigned long) pfn_to_page(pfn); unsigned long end =3D start + nr_pages * sizeof(struct page); =20 - 
memmap_pages_add(-1L * (DIV_ROUND_UP(nr_pages * sizeof(struct page), PAGE= _SIZE))); + memmap_pages_add(-section_vmemmap_pages(pfn, nr_pages, altmap, pgmap)); vmemmap_free(start, end, altmap); } =20 @@ -679,9 +702,10 @@ static void free_map_bootmem(struct page *memmap, stru= ct vmem_altmap *altmap, { unsigned long start =3D (unsigned long)memmap; unsigned long end =3D (unsigned long)(memmap + PAGES_PER_SECTION); + unsigned long pfn =3D page_to_pfn(memmap); =20 - memmap_boot_pages_add(-1L * (DIV_ROUND_UP(PAGES_PER_SECTION * sizeof(stru= ct page), - PAGE_SIZE))); + memmap_boot_pages_add(-section_vmemmap_pages(pfn, PAGES_PER_SECTION, + altmap, pgmap)); vmemmap_free(start, end, NULL); } =20 --=20 2.20.1 From nobody Thu Jun 18 08:01:44 2026 Received: from mail-pg1-f171.google.com (mail-pg1-f171.google.com [209.85.215.171]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 254B134575A for ; Wed, 15 Apr 2026 11:14:48 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.215.171 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1776251690; cv=none; b=oRMcdWWOMtC5d69Z866KYQ2hbfnvqIhGRKGeM0uxBexGVnYk43FkFVkGCUh8iAZ5ARGW8M+dopLZ3ks1HVxPgQct83f+OLZIXAgg2XV148zWAwILzdJ2ufMEsAB96yh2D6K4BrbsiYgWLQ6j1FV78sduUPWrLdSLzveIMPRmJBk= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1776251690; c=relaxed/simple; bh=TqZiT49jcPQVws+ozUQD1Ld24Ol4B8k2cQoDwc2em78=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=WO+2rePiX8I8GJEPIgHutYgRXEk6NxgE641mMdbdmd4lrmtZ+s1PelFDw9OH+eVcqlyjMBDCzqyx0ITUQ1cEpv7cg0Hx5Z2zvkBF6c/33hkpYzT253jsSWx/M8/W1ScoQuRVIcLmEG4WaDCOjPEqpltTrqC4LrC6wIrRbciwOuk= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=bytedance.com; spf=pass smtp.mailfrom=bytedance.com; dkim=pass (2048-bit key) 
header.d=bytedance.com header.i=@bytedance.com header.b=ajCdlyAM; arc=none smtp.client-ip=209.85.215.171 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=bytedance.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=bytedance.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=bytedance.com header.i=@bytedance.com header.b="ajCdlyAM" Received: by mail-pg1-f171.google.com with SMTP id 41be03b00d2f7-c76b994f7a8so2418925a12.3 for ; Wed, 15 Apr 2026 04:14:48 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1776251687; x=1776856487; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=3Df1XnTLx+FlR5iv38FY01xmyEBUedrertPqH9E0HLA=; b=ajCdlyAMUVd01pYzPtKUwr0mC49h0+q2OqyyH2LgpVBJF2yQamzM6fZbki5vumqe13 wxDbHj2OOjGWiKsReaIdltRB2NxcNOb+FbOnn5YavRaFCwiKDUYLVV+y+WwYAXpwIG1k BwG6oXcBOExvvN8tGhItoxUgUiGs5/p3fjKoGr4JmjK8x7qFY7nRzA4q94k2sAYsDtrr W+bPCrldtX+FgpGpVcEDEMhgM28GOoU4M7AqaqBpXKfPJkWlyHe2paoT/YFMhrEUH6e7 VGwLKY6dRiVWDQOECr/UZIZHNJr0KfrJ9jW2MhVcFYySV1hAbDD/4sRpug7gUHi+Ko3s gSFw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1776251687; x=1776856487; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=3Df1XnTLx+FlR5iv38FY01xmyEBUedrertPqH9E0HLA=; b=JkfjdB2tUlM/egvHe4YVBkf3T+ZifyyxTnU3kEtBfTRs9pTwUojOsdx7BzCZF4/V8m fmd3ZDwjNEoWkeNgno67pgEeLBPO+D11M2318h9wEZbkDklEp7AcrBv8aB8gvkDaswiC i0D+gSHi38ObZtZ5NYSyAGNzfVah32GfYvWCI7ZQPMQxClJBEGxmHQQ+oNCiYuVfmtzr NI78ZPDl/tVjz4FxODZ2Ok5gYf6ieOLu2uaD5mqV3MPJUSpXaUbBtxcx4YDXGISP4N8I wc6FMgR1S6+MI12Xj8bmI/yXJKhDyQSIuQ7XVkeMlqHFkBYKsuGy1m2j6ciXcS+d/YXM P0QQ== X-Forwarded-Encrypted: i=1; 
AFNElJ8C6Qr8YoWGWzDcPRysgTH6uBoFv19ge41o1m/MBzMbS1gyDrxqsphjzonuJII+f+sBLo5cu+XDvnQWN18=@vger.kernel.org X-Gm-Message-State: AOJu0Yw4tfQDXcK0GEQjGQ/amwa31KFyxoSmAoeLKE6e4ZUoqGqPtOes MKVIpDjDVE2HxCqqKaNp/BuQQ4NLNXq+44pk/Q0HliwUfPRLv6oX7TEnDzfivckS6AM= X-Gm-Gg: AeBDievmVlkuYJO7b/LmyaIGvyV4fs4iGSMYI5+6YZ12UVf16Dk8N4P+3Df99rAiUGf LsdgXNOP6y0K9G6Cp2f/ulVtrjOeFi4SHBJvtZdBWttAIGpir3wxNBjZmnUVEMncLr9WJQK0AMS t+RESK+8wwA/7APlspOcwXSgsAhANnOQR+x2+fV5hLvRc2OAx8J3tDMgBzSBWJ9k5txRj9QGJS5 wwbAHSvMzaaZI6gP2MsT1n/tTrwVR4+WjgF16/iAtWf4GGIIwYhbC4CoKuKLlVfkjDA8NkR5HfB MDEF9w5kJPRHQL0BW+XYQ8YB+MUKnYy8OJuPgverdMjheB/Bfzn0aV/qdU2qJ/VsWfu2KKr8nRe /p+zJY5netffrvUTep0ghnrL7MUJ7sFGERMMD6o4dCmT8WOX93HPuLImt4TsZfugbWPtHehHgsa mnpQ+LdscTfSF0fA2jxjs8RLL1SN63YsOphjUIFsMIz5k= X-Received: by 2002:a17:902:db0f:b0:2b2:67ca:5ff1 with SMTP id d9443c01a7336-2b2d5a6a36cmr226277565ad.31.1776251685742; Wed, 15 Apr 2026 04:14:45 -0700 (PDT) Received: from n232-176-004.byted.org ([36.110.163.96]) by smtp.gmail.com with ESMTPSA id d9443c01a7336-2b477fd3724sm19509485ad.0.2026.04.15.04.14.39 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 15 Apr 2026 04:14:45 -0700 (PDT) From: Muchun Song To: Andrew Morton , David Hildenbrand , Muchun Song , Oscar Salvador , Michael Ellerman , Madhavan Srinivasan Cc: Lorenzo Stoakes , "Liam R . 
Howlett" , Vlastimil Babka , Mike Rapoport , Suren Baghdasaryan , Michal Hocko , Nicholas Piggin , Christophe Leroy , aneesh.kumar@linux.ibm.com, joao.m.martins@oracle.com, linux-mm@kvack.org, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, Muchun Song Subject: [PATCH v2 4/6] mm/sparse-vmemmap: Pass @pgmap argument to arch vmemmap_populate() Date: Wed, 15 Apr 2026 19:14:10 +0800 Message-Id: <20260415111412.1003526-5-songmuchun@bytedance.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20260415111412.1003526-1-songmuchun@bytedance.com> References: <20260415111412.1003526-1-songmuchun@bytedance.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Add the struct dev_pagemap pointer as a parameter to the architecture-specific vmemmap_populate(), vmemmap_populate_hugepages() and vmemmap_populate_basepages() functions. Currently, the vmemmap optimization for DAX is handled mostly in an architecture-agnostic way via vmemmap_populate_compound_pages(). However, this approach skips crucial architecture-specific initialization steps. For example, the x86 path must call sync_global_pgds() after populating the vmemmap, which is currently being bypassed. To lay the groundwork for fixing the vmemmap optimization at the arch level, we need to pass the @pgmap pointer down to the arch-specific vmemmap_populate() implementations. Plumb the @pgmap argument through the APIs of vmemmap_populate(), vmemmap_populate_hugepages() and vmemmap_populate_basepages().
Signed-off-by: Muchun Song --- arch/arm64/mm/mmu.c | 6 +++--- arch/loongarch/mm/init.c | 7 ++++--- arch/powerpc/include/asm/book3s/64/radix.h | 3 ++- arch/powerpc/mm/book3s64/radix_pgtable.c | 2 +- arch/powerpc/mm/init_64.c | 4 ++-- arch/riscv/mm/init.c | 4 ++-- arch/s390/mm/vmem.c | 2 +- arch/sparc/mm/init_64.c | 5 +++-- arch/x86/mm/init_64.c | 8 ++++---- include/linux/mm.h | 8 +++++--- mm/hugetlb_vmemmap.c | 4 ++-- mm/sparse-vmemmap.c | 10 ++++++---- 12 files changed, 35 insertions(+), 28 deletions(-) diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c index e5a42b7a0160..11227e104c48 100644 --- a/arch/arm64/mm/mmu.c +++ b/arch/arm64/mm/mmu.c @@ -1790,7 +1790,7 @@ int __meminit vmemmap_check_pmd(pmd_t *pmdp, int node, } =20 int __meminit vmemmap_populate(unsigned long start, unsigned long end, int= node, - struct vmem_altmap *altmap) + struct vmem_altmap *altmap, struct dev_pagemap *pgmap) { WARN_ON((start < VMEMMAP_START) || (end > VMEMMAP_END)); /* [start, end] should be within one section */ @@ -1798,9 +1798,9 @@ int __meminit vmemmap_populate(unsigned long start, u= nsigned long end, int node, =20 if (!IS_ENABLED(CONFIG_ARM64_4K_PAGES) || (end - start < PAGES_PER_SECTION * sizeof(struct page))) - return vmemmap_populate_basepages(start, end, node, altmap); + return vmemmap_populate_basepages(start, end, node, altmap, pgmap); else - return vmemmap_populate_hugepages(start, end, node, altmap); + return vmemmap_populate_hugepages(start, end, node, altmap, pgmap); } =20 #ifdef CONFIG_MEMORY_HOTPLUG diff --git a/arch/loongarch/mm/init.c b/arch/loongarch/mm/init.c index c9c57f08fa2c..d61c2e09caae 100644 --- a/arch/loongarch/mm/init.c +++ b/arch/loongarch/mm/init.c @@ -123,12 +123,13 @@ int __meminit vmemmap_check_pmd(pmd_t *pmd, int node, } =20 int __meminit vmemmap_populate(unsigned long start, unsigned long end, - int node, struct vmem_altmap *altmap) + int node, struct vmem_altmap *altmap, + struct dev_pagemap *pgmap) { #if CONFIG_PGTABLE_LEVELS =3D=3D 2 - 
return vmemmap_populate_basepages(start, end, node, NULL); + return vmemmap_populate_basepages(start, end, node, NULL, pgmap); #else - return vmemmap_populate_hugepages(start, end, node, NULL); + return vmemmap_populate_hugepages(start, end, node, NULL, pgmap); #endif } =20 diff --git a/arch/powerpc/include/asm/book3s/64/radix.h b/arch/powerpc/incl= ude/asm/book3s/64/radix.h index da954e779744..bde07c6f900f 100644 --- a/arch/powerpc/include/asm/book3s/64/radix.h +++ b/arch/powerpc/include/asm/book3s/64/radix.h @@ -321,7 +321,8 @@ extern int __meminit radix__vmemmap_create_mapping(unsi= gned long start, unsigned long page_size, unsigned long phys); int __meminit radix__vmemmap_populate(unsigned long start, unsigned long e= nd, - int node, struct vmem_altmap *altmap); + int node, struct vmem_altmap *altmap, + struct dev_pagemap *pgmap); void __ref radix__vmemmap_free(unsigned long start, unsigned long end, struct vmem_altmap *altmap); extern void radix__vmemmap_remove_mapping(unsigned long start, diff --git a/arch/powerpc/mm/book3s64/radix_pgtable.c b/arch/powerpc/mm/boo= k3s64/radix_pgtable.c index 10aced261cff..568500343e5f 100644 --- a/arch/powerpc/mm/book3s64/radix_pgtable.c +++ b/arch/powerpc/mm/book3s64/radix_pgtable.c @@ -1112,7 +1112,7 @@ static inline pte_t *vmemmap_pte_alloc(pmd_t *pmdp, i= nt node, =20 =20 int __meminit radix__vmemmap_populate(unsigned long start, unsigned long e= nd, int node, - struct vmem_altmap *altmap) + struct vmem_altmap *altmap, struct dev_pagemap *pgmap) { unsigned long addr; unsigned long next; diff --git a/arch/powerpc/mm/init_64.c b/arch/powerpc/mm/init_64.c index b6f3ae03ca9e..8f4aa5b32186 100644 --- a/arch/powerpc/mm/init_64.c +++ b/arch/powerpc/mm/init_64.c @@ -275,12 +275,12 @@ static int __meminit __vmemmap_populate(unsigned long= start, unsigned long end, } =20 int __meminit vmemmap_populate(unsigned long start, unsigned long end, int= node, - struct vmem_altmap *altmap) + struct vmem_altmap *altmap, struct dev_pagemap 
*pgmap) { =20 #ifdef CONFIG_PPC_BOOK3S_64 if (radix_enabled()) - return radix__vmemmap_populate(start, end, node, altmap); + return radix__vmemmap_populate(start, end, node, altmap, pgmap); #endif =20 return __vmemmap_populate(start, end, node, altmap); diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c index b0092fb842a3..a04ae9727cbe 100644 --- a/arch/riscv/mm/init.c +++ b/arch/riscv/mm/init.c @@ -1348,7 +1348,7 @@ int __meminit vmemmap_check_pmd(pmd_t *pmdp, int node, } =20 int __meminit vmemmap_populate(unsigned long start, unsigned long end, int= node, - struct vmem_altmap *altmap) + struct vmem_altmap *altmap, struct dev_pagemap *pgmap) { WARN_ON((start < VMEMMAP_START) || (end > VMEMMAP_END)); =20 @@ -1358,7 +1358,7 @@ int __meminit vmemmap_populate(unsigned long start, u= nsigned long end, int node, * memory hotplug, we are not able to update all the page tables with * the new PMDs. */ - return vmemmap_populate_hugepages(start, end, node, altmap); + return vmemmap_populate_hugepages(start, end, node, altmap, pgmap); } #endif =20 diff --git a/arch/s390/mm/vmem.c b/arch/s390/mm/vmem.c index eeadff45e0e1..a7bf8d3d5601 100644 --- a/arch/s390/mm/vmem.c +++ b/arch/s390/mm/vmem.c @@ -506,7 +506,7 @@ static void vmem_remove_range(unsigned long start, unsi= gned long size) * Add a backed mem_map array to the virtual mem_map array. 
*/ int __meminit vmemmap_populate(unsigned long start, unsigned long end, int= node, - struct vmem_altmap *altmap) + struct vmem_altmap *altmap, struct dev_pagemap *pgmap) { int ret; =20 diff --git a/arch/sparc/mm/init_64.c b/arch/sparc/mm/init_64.c index 367c269305e5..f870ca330f9e 100644 --- a/arch/sparc/mm/init_64.c +++ b/arch/sparc/mm/init_64.c @@ -2591,9 +2591,10 @@ int __meminit vmemmap_check_pmd(pmd_t *pmdp, int nod= e, } =20 int __meminit vmemmap_populate(unsigned long vstart, unsigned long vend, - int node, struct vmem_altmap *altmap) + int node, struct vmem_altmap *altmap, + struct dev_pagemap *pgmap) { - return vmemmap_populate_hugepages(vstart, vend, node, NULL); + return vmemmap_populate_hugepages(vstart, vend, node, NULL, pgmap); } #endif /* CONFIG_SPARSEMEM_VMEMMAP */ =20 diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c index 77b889b71cf3..e18cc81a30b4 100644 --- a/arch/x86/mm/init_64.c +++ b/arch/x86/mm/init_64.c @@ -1557,7 +1557,7 @@ int __meminit vmemmap_check_pmd(pmd_t *pmd, int node, } =20 int __meminit vmemmap_populate(unsigned long start, unsigned long end, int= node, - struct vmem_altmap *altmap) + struct vmem_altmap *altmap, struct dev_pagemap *pgmap) { int err; =20 @@ -1565,15 +1565,15 @@ int __meminit vmemmap_populate(unsigned long start,= unsigned long end, int node, VM_BUG_ON(!PAGE_ALIGNED(end)); =20 if (end - start < PAGES_PER_SECTION * sizeof(struct page)) - err =3D vmemmap_populate_basepages(start, end, node, NULL); + err =3D vmemmap_populate_basepages(start, end, node, NULL, pgmap); else if (boot_cpu_has(X86_FEATURE_PSE)) - err =3D vmemmap_populate_hugepages(start, end, node, altmap); + err =3D vmemmap_populate_hugepages(start, end, node, altmap, pgmap); else if (altmap) { pr_err_once("%s: no cpu support for altmap allocations\n", __func__); err =3D -ENOMEM; } else - err =3D vmemmap_populate_basepages(start, end, node, NULL); + err =3D vmemmap_populate_basepages(start, end, node, NULL, pgmap); if (!err) 
sync_global_pgds(start, end - 1); return err; diff --git a/include/linux/mm.h b/include/linux/mm.h index 0b776907152e..bebc5f892f81 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -4877,11 +4877,13 @@ void vmemmap_set_pmd(pmd_t *pmd, void *p, int node, int vmemmap_check_pmd(pmd_t *pmd, int node, unsigned long addr, unsigned long next); int vmemmap_populate_basepages(unsigned long start, unsigned long end, - int node, struct vmem_altmap *altmap); + int node, struct vmem_altmap *altmap, + struct dev_pagemap *pgmap); int vmemmap_populate_hugepages(unsigned long start, unsigned long end, - int node, struct vmem_altmap *altmap); + int node, struct vmem_altmap *altmap, + struct dev_pagemap *pgmap); int vmemmap_populate(unsigned long start, unsigned long end, int node, - struct vmem_altmap *altmap); + struct vmem_altmap *altmap, struct dev_pagemap *pgmap); int vmemmap_populate_hvo(unsigned long start, unsigned long end, unsigned int order, struct zone *zone, unsigned long headsize); diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c index 4a077d231d3a..50b7123f3bdd 100644 --- a/mm/hugetlb_vmemmap.c +++ b/mm/hugetlb_vmemmap.c @@ -829,7 +829,7 @@ void __init hugetlb_vmemmap_init_late(int nid) */ list_del(&m->list); =20 - vmemmap_populate(start, end, nid, NULL); + vmemmap_populate(start, end, nid, NULL, NULL); nr_mmap =3D end - start; memmap_boot_pages_add(DIV_ROUND_UP(nr_mmap, PAGE_SIZE)); =20 @@ -845,7 +845,7 @@ void __init hugetlb_vmemmap_init_late(int nid) if (vmemmap_populate_hvo(start, end, huge_page_order(h), zone, HUGETLB_VMEMMAP_RESERVE_SIZE) < 0) { /* Fallback if HVO population fails */ - vmemmap_populate(start, end, nid, NULL); + vmemmap_populate(start, end, nid, NULL, NULL); nr_mmap =3D end - start; } else { m->flags |=3D HUGE_BOOTMEM_ZONES_VALID; diff --git a/mm/sparse-vmemmap.c b/mm/sparse-vmemmap.c index 05e3e2b94e32..f5245647afee 100644 --- a/mm/sparse-vmemmap.c +++ b/mm/sparse-vmemmap.c @@ -297,7 +297,8 @@ static int __meminit 
vmemmap_populate_range(unsigned lo= ng start, } =20 int __meminit vmemmap_populate_basepages(unsigned long start, unsigned lon= g end, - int node, struct vmem_altmap *altmap) + int node, struct vmem_altmap *altmap, + struct dev_pagemap *pgmap) { return vmemmap_populate_range(start, end, node, altmap, -1, 0); } @@ -400,7 +401,8 @@ int __weak __meminit vmemmap_check_pmd(pmd_t *pmd, int = node, } =20 int __meminit vmemmap_populate_hugepages(unsigned long start, unsigned lon= g end, - int node, struct vmem_altmap *altmap) + int node, struct vmem_altmap *altmap, + struct dev_pagemap *pgmap) { unsigned long addr; unsigned long next; @@ -445,7 +447,7 @@ int __meminit vmemmap_populate_hugepages(unsigned long = start, unsigned long end, } } else if (vmemmap_check_pmd(pmd, node, addr, next)) continue; - if (vmemmap_populate_basepages(addr, next, node, altmap)) + if (vmemmap_populate_basepages(addr, next, node, altmap, pgmap)) return -ENOMEM; } return 0; @@ -559,7 +561,7 @@ struct page * __meminit __populate_section_memmap(unsig= ned long pfn, if (vmemmap_can_optimize(altmap, pgmap)) r =3D vmemmap_populate_compound_pages(pfn, start, end, nid, pgmap); else - r =3D vmemmap_populate(start, end, nid, altmap); + r =3D vmemmap_populate(start, end, nid, altmap, pgmap); =20 if (r < 0) return NULL; --=20 2.20.1 From nobody Thu Jun 18 08:01:44 2026 Received: from mail-pl1-f177.google.com (mail-pl1-f177.google.com [209.85.214.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 67BB4367F42 for ; Wed, 15 Apr 2026 11:14:53 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.214.177 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1776251694; cv=none; 
b=EwUGXFjRAac/ij4ItgykAU565VysBTxG5lRz92FrydJSfmmt716MTj3urpIi3eNDaOQYGJcf8ciE8q9lEFj3tIUNt3I7V7vcioeHCW6Rb+iqS+JRXZzhepeRTYxgGQEl2H31M9e4nScZeEz1RQrRp3PmJKWNrgOzvDdxwI9WJt4= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1776251694; c=relaxed/simple; bh=56/m/uTeNN9wB78un6OO3se3/EIjy7omNkiIDPY+1+o=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=e4T29C2HXmgMDYJR/s2Sl97n9Htg7TINOG5OkPxMuKw6rY/w+cZ1kzkSg3S0grb7AgL7N25u0acmdwj3KiiMfcUrlRcHD1XO7itezkm5vYlj3+IP6MUka1lP90fSUpnfSjUsG3cLyN3EpYa5kyzT+uNYiB7vueNqoKSwGzPio4c= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=bytedance.com; spf=pass smtp.mailfrom=bytedance.com; dkim=pass (2048-bit key) header.d=bytedance.com header.i=@bytedance.com header.b=KVazzK0V; arc=none smtp.client-ip=209.85.214.177 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=bytedance.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=bytedance.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=bytedance.com header.i=@bytedance.com header.b="KVazzK0V" Received: by mail-pl1-f177.google.com with SMTP id d9443c01a7336-2b25cf1b5f0so44350875ad.3 for ; Wed, 15 Apr 2026 04:14:53 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1776251693; x=1776856493; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=UUwLFe4sJwTmvK33kSzO0AkxCRTRozJKSvcgK9DIV10=; b=KVazzK0VZfBqNnwCyx9s4RW+G4ihrsBj400ANV/6sAdq1NsAhiYBBuRqEcCkLCSbTi Za+KJcfgsFDm3S6eFYuouHSpjSIxB9cZxKuISFrjuIc7oWJk9w9M7l0a7hShGEqQZTXH gWDM3J1/k3ugFxESZf2cfjN4ajtHhil2emaOem6zHEYsqUukB+ArEjgLSGUbrdKsXJx0 Fn0vB+meUjfKzqLtMAemphfu2/HX+bqonCbUXCsZrSDh5FiFeTkm4MjYyD58a1b3YSZt 
6LKbq7qJJqAENjlKC9tFMZWAJi6hGaBIj3kBraqwpGjjkwUpVhux63gImIvHInyd2WOG FPmg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1776251693; x=1776856493; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=UUwLFe4sJwTmvK33kSzO0AkxCRTRozJKSvcgK9DIV10=; b=SUKIp72l6wbCia0S/H+g6vAmn9f2FCUkvxheCGCpqrnbdhV5/j0eYx29GVQz6/ubxK x97JDZ8Z/ruZvV5A1u2z06b4EWmgLwt2aoNhT5PGJUyz2+Zbee5nu9kyR+iTgBTnITZJ TCpHSSb1hKt781W247IKasXizVW5wCUelPYe6/9j6zmi6UF4IS/Ab/xexf8ll2wCseww IfnEwf2Rk81qLKZs0Aa4OcKqHlqzpYfngdzlwge8F3eBZAnePOmnDN2MXl6us7tE+uUh CnxibvNYE9LmesHTw0Sm6lVA72pJ3gCmG3kH0VwkFgQmQgF6N8sic8p7HRZ+NJiqZ/Z0 gbVQ== X-Forwarded-Encrypted: i=1; AFNElJ/uKFbIBB2nLM2Cjxg9RHXNSsudTs0za+H/pW6oOHDR81COGZA8EdpRydROZaBl9cIPXSBL9JDS6YY/aZY=@vger.kernel.org X-Gm-Message-State: AOJu0YwrkJwECCvYw0NWyTLZ+5MDHPsBMPm2Ilh044VSdzHL0yBL3ydr id0rdxOQQef5yJ1vdebSV4UcWlbZCMcAek4dbXvl+M2OuEiW3nTZuSmeZVpWpdNZUA8= X-Gm-Gg: AeBDievHameZ0NdcOBd2PMD9G9MIQYOr68UxbRMoY9Igb+TndWMq/HsZPUMiGsGU8sV +4nqO+5ldlmNm/9Zxhv3XzEdugVgfRCAv+D/KSDC9XCuy6P4bFZCMofsxZTtVD+KaLcPiQYPiQ7 tUe1myLUl+pZXWgIJYs5K0/xY3iwi7MZKum31NLcos3wg179DMTR5asVmfi3DOMklklNR+opwzS +CVYhD53OJ2dhD3O1mTB3FkdMGmGJyLDgzahJbzYCq3Uy8mI0wSqf8WAKq4D6h0LvzFPRNyZGuK rCnHlcRcqmaR10TQ/eIz/sRkeRwEIrn4GQKaDbE683h8ur7k6ogy9pbHXBLVCgTUy+/GMyY3jla jz5LJghA3NfMn/WZ/jyYWoKqnv1lyaoiOZnhfAdg2uG7ZUqLTOdKLccUtuUHhOQHEfVSNTYckdh 8gYs5Psn+VLjNh0kFk2PI0nUeCzOuJ0nab9ZXrYSJSB6k= X-Received: by 2002:a17:903:37c6:b0:2b2:49a7:a5bc with SMTP id d9443c01a7336-2b2d5a60603mr217754215ad.39.1776251692524; Wed, 15 Apr 2026 04:14:52 -0700 (PDT) Received: from n232-176-004.byted.org ([36.110.163.96]) by smtp.gmail.com with ESMTPSA id d9443c01a7336-2b477fd3724sm19509485ad.0.2026.04.15.04.14.47 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 15 Apr 2026 04:14:52 -0700 (PDT) From: Muchun Song To: Andrew Morton , David 
Hildenbrand , Muchun Song , Oscar Salvador , Michael Ellerman , Madhavan Srinivasan Cc: Lorenzo Stoakes , "Liam R . Howlett" , Vlastimil Babka , Mike Rapoport , Suren Baghdasaryan , Michal Hocko , Nicholas Piggin , Christophe Leroy , aneesh.kumar@linux.ibm.com, joao.m.martins@oracle.com, linux-mm@kvack.org, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, Muchun Song Subject: [PATCH v2 5/6] mm/sparse-vmemmap: Fix missing architecture-specific page table sync Date: Wed, 15 Apr 2026 19:14:11 +0800 Message-Id: <20260415111412.1003526-6-songmuchun@bytedance.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20260415111412.1003526-1-songmuchun@bytedance.com> References: <20260415111412.1003526-1-songmuchun@bytedance.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" On x86-64, vmemmap_populate() normally calls sync_global_pgds() to keep the page tables in sync. However, when vmemmap optimization for compound devmaps is enabled, vmemmap_populate_compound_pages() is called directly from __populate_section_memmap(), bypassing the architecture-specific vmemmap_populate() entirely. This skips the sync on x86-64 and can later trigger vmemmap-access faults. Fix this by moving the vmemmap_can_optimize() dispatch from __populate_section_memmap() into the generic helpers -- vmemmap_populate_basepages() and vmemmap_populate_hugepages(). This way, the architecture vmemmap_populate() is always invoked first, ensuring any arch-specific post-population steps (e.g. sync_global_pgds()) are executed before returning. Architectures that override vmemmap_populate() (e.g. powerpc) handle the optimization dispatch in their own implementation instead.
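As a rough illustration of the control-flow change described above, the old and new call paths can be modelled in userspace as follows. This is only a toy sketch: all toy_ names are hypothetical, toy_can_optimize() is a simplified stand-in for vmemmap_can_optimize(), and the counter merely models whether the arch post-population step (sync_global_pgds() on x86-64) would have run.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct dev_pagemap. */
struct toy_pgmap {
	unsigned int vmemmap_shift;
};

static int toy_sync_calls;	/* models the arch sync step being reached */
static int toy_compound_calls;	/* models the optimized population path */

/* Simplified stand-in for vmemmap_can_optimize(). */
static bool toy_can_optimize(const struct toy_pgmap *pgmap)
{
	return pgmap && pgmap->vmemmap_shift > 0;
}

static int toy_populate_compound(const struct toy_pgmap *pgmap)
{
	(void)pgmap;
	toy_compound_calls++;
	return 0;
}

static int toy_populate_range(void)
{
	return 0;
}

/* Old flow: the arch hook populates and syncs, but never sees the optimized case. */
static int toy_arch_populate_old(void)
{
	int err = toy_populate_range();

	if (!err)
		toy_sync_calls++;
	return err;
}

static int toy_section_populate_old(const struct toy_pgmap *pgmap)
{
	if (toy_can_optimize(pgmap))
		return toy_populate_compound(pgmap);	/* arch hook bypassed */
	return toy_arch_populate_old();
}

/* New flow: the generic helper owns the dispatch. */
static int toy_populate_basepages(const struct toy_pgmap *pgmap)
{
	if (toy_can_optimize(pgmap))
		return toy_populate_compound(pgmap);
	return toy_populate_range();
}

/* New flow: the arch hook is always entered, so the sync step always runs. */
static int toy_arch_populate_new(const struct toy_pgmap *pgmap)
{
	int err = toy_populate_basepages(pgmap);

	if (!err)
		toy_sync_calls++;
	return err;
}

static int toy_section_populate_new(const struct toy_pgmap *pgmap)
{
	return toy_arch_populate_new(pgmap);
}
```

In this model, populating an optimizable pgmap through the old path leaves toy_sync_calls untouched, while the new path increments it for both the optimized and the non-optimized case.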
Fixes: 4917f55b4ef9 ("mm/sparse-vmemmap: improve memory savings for compoun= d devmaps") Signed-off-by: Muchun Song --- arch/powerpc/include/asm/book3s/64/radix.h | 6 ------ arch/powerpc/mm/book3s64/radix_pgtable.c | 16 ++++++++++----- mm/sparse-vmemmap.c | 24 +++++++++++----------- 3 files changed, 23 insertions(+), 23 deletions(-) diff --git a/arch/powerpc/include/asm/book3s/64/radix.h b/arch/powerpc/incl= ude/asm/book3s/64/radix.h index bde07c6f900f..2600defa2dc2 100644 --- a/arch/powerpc/include/asm/book3s/64/radix.h +++ b/arch/powerpc/include/asm/book3s/64/radix.h @@ -357,11 +357,5 @@ int radix__remove_section_mapping(unsigned long start,= unsigned long end); #define vmemmap_can_optimize vmemmap_can_optimize bool vmemmap_can_optimize(struct vmem_altmap *altmap, struct dev_pagemap *= pgmap); #endif - -#define vmemmap_populate_compound_pages vmemmap_populate_compound_pages -int __meminit vmemmap_populate_compound_pages(unsigned long start_pfn, - unsigned long start, - unsigned long end, int node, - struct dev_pagemap *pgmap); #endif /* __ASSEMBLER__ */ #endif diff --git a/arch/powerpc/mm/book3s64/radix_pgtable.c b/arch/powerpc/mm/boo= k3s64/radix_pgtable.c index 568500343e5f..21fece355fbb 100644 --- a/arch/powerpc/mm/book3s64/radix_pgtable.c +++ b/arch/powerpc/mm/book3s64/radix_pgtable.c @@ -1109,7 +1109,10 @@ static inline pte_t *vmemmap_pte_alloc(pmd_t *pmdp, = int node, return pte_offset_kernel(pmdp, address); } =20 - +static int __meminit vmemmap_populate_compound_pages(unsigned long start_p= fn, + unsigned long start, + unsigned long end, int node, + struct dev_pagemap *pgmap); =20 int __meminit radix__vmemmap_populate(unsigned long start, unsigned long e= nd, int node, struct vmem_altmap *altmap, struct dev_pagemap *pgmap) @@ -1122,6 +1125,9 @@ int __meminit radix__vmemmap_populate(unsigned long s= tart, unsigned long end, in pmd_t *pmd; pte_t *pte; =20 + if (vmemmap_can_optimize(altmap, pgmap)) + return vmemmap_populate_compound_pages(page_to_pfn((struct 
page *)start), + start, end, node, pgmap); /* * If altmap is present, Make sure we align the start vmemmap addr * to PAGE_SIZE so that we calculate the correct start_pfn in @@ -1303,10 +1309,10 @@ static pte_t * __meminit vmemmap_compound_tail_page= (unsigned long addr, return pte; } =20 -int __meminit vmemmap_populate_compound_pages(unsigned long start_pfn, - unsigned long start, - unsigned long end, int node, - struct dev_pagemap *pgmap) +static int __meminit vmemmap_populate_compound_pages(unsigned long start_p= fn, + unsigned long start, + unsigned long end, int node, + struct dev_pagemap *pgmap) { /* * we want to map things as base page size mapping so that diff --git a/mm/sparse-vmemmap.c b/mm/sparse-vmemmap.c index f5245647afee..7f684ed3479e 100644 --- a/mm/sparse-vmemmap.c +++ b/mm/sparse-vmemmap.c @@ -296,10 +296,16 @@ static int __meminit vmemmap_populate_range(unsigned = long start, return 0; } =20 +static int __meminit vmemmap_populate_compound_pages(unsigned long start, + unsigned long end, int node, + struct dev_pagemap *pgmap); + int __meminit vmemmap_populate_basepages(unsigned long start, unsigned lon= g end, int node, struct vmem_altmap *altmap, struct dev_pagemap *pgmap) { + if (vmemmap_can_optimize(altmap, pgmap)) + return vmemmap_populate_compound_pages(start, end, node, pgmap); return vmemmap_populate_range(start, end, node, altmap, -1, 0); } =20 @@ -411,6 +417,9 @@ int __meminit vmemmap_populate_hugepages(unsigned long = start, unsigned long end, pud_t *pud; pmd_t *pmd; =20 + if (vmemmap_can_optimize(altmap, pgmap)) + return vmemmap_populate_compound_pages(start, end, node, pgmap); + for (addr =3D start; addr < end; addr =3D next) { next =3D pmd_addr_end(addr, end); =20 @@ -453,7 +462,6 @@ int __meminit vmemmap_populate_hugepages(unsigned long = start, unsigned long end, return 0; } =20 -#ifndef vmemmap_populate_compound_pages /* * For compound pages bigger than section size (e.g. 
x86 1G compound
  * pages with 2M subsection size) fill the rest of sections as tail
@@ -491,14 +499,14 @@ static pte_t * __meminit compound_section_tail_page(unsigned long addr)
 	return pte;
 }
 
-static int __meminit vmemmap_populate_compound_pages(unsigned long start_pfn,
-						     unsigned long start,
+static int __meminit vmemmap_populate_compound_pages(unsigned long start,
 						     unsigned long end, int node,
 						     struct dev_pagemap *pgmap)
 {
 	unsigned long size, addr;
 	pte_t *pte;
 	int rc;
+	unsigned long start_pfn = page_to_pfn((struct page *)start);
 
 	if (reuse_compound_section(start_pfn, pgmap)) {
 		pte = compound_section_tail_page(start);
@@ -544,26 +552,18 @@ static int __meminit vmemmap_populate_compound_pages(unsigned long start_pfn,
 	return 0;
 }
 
-#endif
-
 struct page * __meminit __populate_section_memmap(unsigned long pfn,
 		unsigned long nr_pages, int nid, struct vmem_altmap *altmap,
 		struct dev_pagemap *pgmap)
 {
 	unsigned long start = (unsigned long) pfn_to_page(pfn);
 	unsigned long end = start + nr_pages * sizeof(struct page);
-	int r;
 
 	if (WARN_ON_ONCE(!IS_ALIGNED(pfn, PAGES_PER_SUBSECTION) ||
 		!IS_ALIGNED(nr_pages, PAGES_PER_SUBSECTION)))
 		return NULL;
 
-	if (vmemmap_can_optimize(altmap, pgmap))
-		r = vmemmap_populate_compound_pages(pfn, start, end, nid, pgmap);
-	else
-		r = vmemmap_populate(start, end, nid, altmap, pgmap);
-
-	if (r < 0)
+	if (vmemmap_populate(start, end, nid, altmap, pgmap))
 		return NULL;
 
 	return pfn_to_page(pfn);
-- 
2.20.1

From nobody Thu Jun 18 08:01:44 2026
From: Muchun Song
To: Andrew Morton, David Hildenbrand, Muchun Song, Oscar Salvador,
	Michael Ellerman, Madhavan Srinivasan
Cc: Lorenzo Stoakes, "Liam R . Howlett", Vlastimil Babka, Mike Rapoport,
	Suren Baghdasaryan, Michal Hocko, Nicholas Piggin, Christophe Leroy,
	aneesh.kumar@linux.ibm.com, joao.m.martins@oracle.com,
	linux-mm@kvack.org, linuxppc-dev@lists.ozlabs.org,
	linux-kernel@vger.kernel.org, Muchun Song
Subject: [PATCH v2 6/6] mm/mm_init: Fix pageblock migratetype for ZONE_DEVICE compound pages
Date: Wed, 15 Apr 2026 19:14:12 +0800
Message-Id: <20260415111412.1003526-7-songmuchun@bytedance.com>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20260415111412.1003526-1-songmuchun@bytedance.com>
References: <20260415111412.1003526-1-songmuchun@bytedance.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

The memmap_init_zone_device() function only initializes the migratetype
of the first pageblock of a compound page. If the compound page size
exceeds pageblock_nr_pages (e.g., 1GB hugepages with 2MB pageblocks),
subsequent pageblocks in the compound page remain uninitialized.

Move the migratetype initialization out of __init_zone_device_page() and
into a separate pageblock_migratetype_init_range() function. This
iterates over the entire PFN range of the memory, ensuring that all
pageblocks are correctly initialized.

Fixes: c4386bd8ee3a ("mm/memremap: add ZONE_DEVICE support for compound pages")
Signed-off-by: Muchun Song
Reviewed-by: Mike Rapoport (Microsoft)
---
 mm/mm_init.c | 42 +++++++++++++++++++++++++++---------------
 1 file changed, 27 insertions(+), 15 deletions(-)

diff --git a/mm/mm_init.c b/mm/mm_init.c
index f9f8e1af921c..30528c4206c1 100644
--- a/mm/mm_init.c
+++ b/mm/mm_init.c
@@ -674,6 +674,19 @@ static inline void fixup_hashdist(void)
 static inline void fixup_hashdist(void) {}
 #endif /* CONFIG_NUMA */
 
+static __meminit void pageblock_migratetype_init_range(unsigned long pfn,
+						       unsigned long nr_pages,
+						       int migratetype)
+{
+	unsigned long end = pfn + nr_pages;
+
+	for (pfn = pageblock_align(pfn); pfn < end; pfn += pageblock_nr_pages) {
+		init_pageblock_migratetype(pfn_to_page(pfn), migratetype, false);
+		if (IS_ALIGNED(pfn, PAGES_PER_SECTION))
+			cond_resched();
+	}
+}
+
 /*
  * Initialize a reserved page unconditionally, finding its zone first.
  */
@@ -1011,21 +1024,6 @@ static void __ref __init_zone_device_page(struct page *page, unsigned long pfn,
 	page_folio(page)->pgmap = pgmap;
 	page->zone_device_data = NULL;
 
-	/*
-	 * Mark the block movable so that blocks are reserved for
-	 * movable at startup. This will force kernel allocations
-	 * to reserve their blocks rather than leaking throughout
-	 * the address space during boot when many long-lived
-	 * kernel allocations are made.
-	 *
-	 * Please note that MEMINIT_HOTPLUG path doesn't clear memmap
-	 * because this is done early in section_activate()
-	 */
-	if (pageblock_aligned(pfn)) {
-		init_pageblock_migratetype(page, MIGRATE_MOVABLE, false);
-		cond_resched();
-	}
-
 	/*
 	 * ZONE_DEVICE pages other than MEMORY_TYPE_GENERIC are released
 	 * directly to the driver page allocator which will set the page count
@@ -1122,6 +1120,8 @@ void __ref memmap_init_zone_device(struct zone *zone,
 
 		__init_zone_device_page(page, pfn, zone_idx, nid, pgmap);
 
+		cond_resched();
+
 		if (pfns_per_compound == 1)
 			continue;
 
@@ -1129,6 +1129,18 @@ void __ref memmap_init_zone_device(struct zone *zone,
 				     compound_nr_pages(altmap, pgmap));
 	}
 
+	/*
+	 * Mark the block movable so that blocks are reserved for
+	 * movable at startup. This will force kernel allocations
+	 * to reserve their blocks rather than leaking throughout
+	 * the address space during boot when many long-lived
+	 * kernel allocations are made.
+	 *
+	 * Please note that MEMINIT_HOTPLUG path doesn't clear memmap
+	 * because this is done early in section_activate()
+	 */
+	pageblock_migratetype_init_range(start_pfn, nr_pages, MIGRATE_MOVABLE);
+
 	pr_debug("%s initialised %lu pages in %ums\n", __func__,
 		nr_pages, jiffies_to_msecs(jiffies - start));
 }
-- 
2.20.1