From nobody Tue Jun 16 20:39:47 2026 Received: from mail-pf1-f176.google.com (mail-pf1-f176.google.com [209.85.210.176]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id BF9D228150F for ; Tue, 21 Apr 2026 02:21:14 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.210.176 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1776738076; cv=none; b=KxrxJlFB5owtrhhy1pDAnH1WmSZ56LnnLZtxR5Aqa35ruNraeVSxMsLS3mcVTCa8p0+7bEDd4j1tnuIm22o4UgouWWBRTptTcf9+GobkP/9mbhdi7t3GxcNTuWDtHoFQfi5fx2SJ8KPF+7c2CnDbbESN0XTxKfGdzzwqEA3OqpE= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1776738076; c=relaxed/simple; bh=cyDmSkGHrAylpjw8b5q9QK2Yaj3wrd2mV0LszOVFa4o=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=KFf6GZ20EXGJ8/hLraQivCwu3K/WWWsvv6U4tqZ90zXzGvMzgjsyQn3Qtqw2w/q6bi3pG5fZklfHBvMohZ9WtXBhclvdC9lmj8wpI8XyZoZZh2buB+D73HYANJUerOFWXUs4ff/2UHDMJoTZ81zWffP4Z5xDutvvm8yRKKV821s= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=bytedance.com; spf=pass smtp.mailfrom=bytedance.com; dkim=pass (2048-bit key) header.d=bytedance.com header.i=@bytedance.com header.b=ZAF4osxL; arc=none smtp.client-ip=209.85.210.176 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=bytedance.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=bytedance.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=bytedance.com header.i=@bytedance.com header.b="ZAF4osxL" Received: by mail-pf1-f176.google.com with SMTP id d2e1a72fcca58-82fb2d0c5d1so1090603b3a.0 for ; Mon, 20 Apr 2026 19:21:14 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1776738074; x=1777342874; darn=vger.kernel.org; 
h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=mFMUe4eU96PwNZ4hmEg9N6ajvO6DYLBIUThz3MnM+0w=; b=ZAF4osxL4+yAEQ1RepNHJHK4K/zcrY6akLvOq1cMMZ/OUk0VHx8+Bc0d7iU/D+f1Nv T3Jfs5x9+Ne6OoQru7seWQQOcU9hwSSCCY9fwRG0Ow9xqNgwQp9sv5CZUfDIgwlvD3EC 5MMJ+W+empt9Z29nWE1DR81EkhRbKpI/NYgj6ajG+eoR2ylBkBic1pARyFdXOWWaaRyM i2gPXIh5iBP08SYXLkhRnbXVtyzLUeK61XcqATuY4qhIQOnLwiMIt6iCTxFDpeB0gvvh +TDrarKJSQZs6Kt3RdxSwxiVL83GgQjScaFct+o9mQl8in0DxX7ZtATGwRySkjL00gg0 kLXA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1776738074; x=1777342874; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=mFMUe4eU96PwNZ4hmEg9N6ajvO6DYLBIUThz3MnM+0w=; b=gn3IrHzZqJ+E0RKFaOAlvqq7lLPBdxsg/tL2ONV3adC/Ps8m8QZqGgR030pRPA/pPP 9cYS23MMoo9hK5HPoe/6XhsqjqUQf9494E5OPUMCnF/kVjqEtkelSDYdoJBQvFxEyjly 1laKNyWP+86fub1mB5KBuBQvBLBuo5Lh2NqVZxcYkZ4bLaouOMTuRDjbz2/6v0wkdPxz pNATEO2T34FouJNSt046ReqVF+ROKCIMWN8shwbCln24fJOLtsCvGzqBPFUvlRyrYwo/ PkwzApeADl2cPAnT2cf2yldtM6OMvOi2x4OZwfiKuCA/CArawi48ae/5JZhwh/qRbb0s GUJw== X-Forwarded-Encrypted: i=1; AFNElJ9Gqf2Z7dK9hf/ipJPEJ5t+K4fh4yN5tbNXwLMMSnvyGTE9cPw1mFuQq1WoOft0CmJ16nQLGom1ZJtzLnM=@vger.kernel.org X-Gm-Message-State: AOJu0YzpCZ7Q+BOKbRFYObDLSRyE6BRfSELvtWgox9Ye2gZkzPLOuzc2 HkTtgRl9dPCQRpb2yqc/rVmL7A4WGTcdvGqH9U4hYNEU2slt/oLkfXrOov2JYCClBcU= X-Gm-Gg: AeBDievAouXDKgRfJw3JAArKMw/PkEoKdPHJCPrbiOYeRRFcP2FTn+tz8/HXNOCniUh t9pPttGKrWpXFHz5nys1ufSvBlv3cQgdkOricCdosg6SDp0zth8FWB7DDQMkREX/6ZmkA45A8NO C84i4XA3cTPKHOJcqYcXGQSeJgLD8UFzthSXv7pk6W5ThA9UWhepLKAxMzZ9RtrElfAz+KSVW/e C4YmVkmd/6h0zqMvsuGWx2gusPpPiyDOkgHLxfal3sDo5svqXuWB0Vws9GWANUcTIO/dbMWjj69 gWfCqwqM/ipcZBx9Y6MYdwlCXkF0jCX0bM8DViCzLXKOTRFDk2DzvHUzMvVm4GUX31ksgl7uJK+ QpA1Gt5hXKg9wu4VHSLRr5zzDlKliJXwBqvwv9s7HgbrzimcDJRSrogYUmzlOqearS9IDXBj6WJ 
 4IDVIayP2r+VJ7QcZZAf2h1icXolCK
X-Received: by 2002:a05:6a00:992:b0:82f:2aaa:c14c with SMTP id d2e1a72fcca58-82f8b5060f2mr13376259b3a.16.1776738073875; Mon, 20 Apr 2026 19:21:13 -0700 (PDT)
Received: from n232-176-004.byted.org ([240e:83:200::340]) by smtp.gmail.com with ESMTPSA id d2e1a72fcca58-82f932dabd4sm11538780b3a.51.2026.04.20.19.21.08 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 20 Apr 2026 19:21:13 -0700 (PDT)
From: Muchun Song
To: Andrew Morton , David Hildenbrand , Muchun Song , Oscar Salvador , Michael Ellerman , Madhavan Srinivasan
Cc: Muchun Song , Mike Rapoport , Lorenzo Stoakes , "Liam R . Howlett" , Vlastimil Babka , Suren Baghdasaryan , Michal Hocko , Nicholas Piggin , Christophe Leroy , aneesh.kumar@linux.ibm.com, joao.m.martins@oracle.com, linux-mm@kvack.org, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org
Subject: [PATCH v3 1/4] mm/sparse-vmemmap: Fix vmemmap accounting underflow
Date: Tue, 21 Apr 2026 10:20:41 +0800
Message-Id: <20260421022044.1217503-2-songmuchun@bytedance.com>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20260421022044.1217503-1-songmuchun@bytedance.com>
References: <20260421022044.1217503-1-songmuchun@bytedance.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

In section_activate(), if populate_section_memmap() fails, the error
handling path calls section_deactivate() to roll back the state. This
causes a vmemmap accounting imbalance.

Since commit c3576889d87b ("mm: fix accounting of memmap pages"),
memmap pages are accounted for only after populate_section_memmap()
succeeds. However, the failure path unconditionally calls
section_deactivate(), which decreases the vmemmap count. Consequently,
a failure in populate_section_memmap() leads to an accounting underflow,
incorrectly reducing the system's tracked vmemmap usage.

Fix this more thoroughly by moving all accounting calls into the lower
level functions that actually perform the vmemmap allocation and freeing:

  - populate_section_memmap() accounts for newly allocated vmemmap pages
  - depopulate_section_memmap() unaccounts when vmemmap is freed

This ensures proper accounting in all code paths, including error
handling and early section cases.

Fixes: c3576889d87b ("mm: fix accounting of memmap pages")
Signed-off-by: Muchun Song
Acked-by: Mike Rapoport (Microsoft)
Acked-by: Oscar Salvador
---
 mm/sparse-vmemmap.c | 20 ++++++++++++--------
 1 file changed, 12 insertions(+), 8 deletions(-)

diff --git a/mm/sparse-vmemmap.c b/mm/sparse-vmemmap.c
index 6eadb9d116e4..a7b11248b989 100644
--- a/mm/sparse-vmemmap.c
+++ b/mm/sparse-vmemmap.c
@@ -656,7 +656,12 @@ static struct page * __meminit populate_section_memmap(unsigned long pfn,
 		unsigned long nr_pages, int nid, struct vmem_altmap *altmap,
 		struct dev_pagemap *pgmap)
 {
-	return __populate_section_memmap(pfn, nr_pages, nid, altmap, pgmap);
+	struct page *page = __populate_section_memmap(pfn, nr_pages, nid, altmap,
+						      pgmap);
+
+	memmap_pages_add(DIV_ROUND_UP(nr_pages * sizeof(struct page), PAGE_SIZE));
+
+	return page;
 }
 
 static void depopulate_section_memmap(unsigned long pfn, unsigned long nr_pages,
@@ -665,13 +670,17 @@ static void depopulate_section_memmap(unsigned long pfn, unsigned long nr_pages,
 	unsigned long start = (unsigned long) pfn_to_page(pfn);
 	unsigned long end = start + nr_pages * sizeof(struct page);
 
+	memmap_pages_add(-1L * (DIV_ROUND_UP(nr_pages * sizeof(struct page), PAGE_SIZE)));
 	vmemmap_free(start, end, altmap);
 }
+
 static void free_map_bootmem(struct page *memmap)
 {
 	unsigned long start = (unsigned long)memmap;
 	unsigned long end = (unsigned long)(memmap + PAGES_PER_SECTION);
 
+	memmap_boot_pages_add(-1L * (DIV_ROUND_UP(PAGES_PER_SECTION * sizeof(struct page),
+				  PAGE_SIZE)));
 	vmemmap_free(start, end, NULL);
 }
 
@@ -774,14 +783,10 @@ static void section_deactivate(unsigned long pfn, unsigned long nr_pages,
 	 * The memmap of early sections is always fully populated. See
 	 * section_activate() and pfn_valid() .
 	 */
-	if (!section_is_early) {
-		memmap_pages_add(-1L * (DIV_ROUND_UP(nr_pages * sizeof(struct page), PAGE_SIZE)));
+	if (!section_is_early)
 		depopulate_section_memmap(pfn, nr_pages, altmap);
-	} else if (memmap) {
-		memmap_boot_pages_add(-1L * (DIV_ROUND_UP(nr_pages * sizeof(struct page),
-				  PAGE_SIZE)));
+	else if (memmap)
 		free_map_bootmem(memmap);
-	}
 
 	if (empty)
 		ms->section_mem_map = (unsigned long)NULL;
@@ -826,7 +831,6 @@ static struct page * __meminit section_activate(int nid, unsigned long pfn,
 		section_deactivate(pfn, nr_pages, altmap);
 		return ERR_PTR(-ENOMEM);
 	}
-	memmap_pages_add(DIV_ROUND_UP(nr_pages * sizeof(struct page), PAGE_SIZE));
 
 	return memmap;
 }
-- 
2.20.1

From nobody Tue Jun 16 20:39:47 2026
Received: from mail-pf1-f173.google.com (mail-pf1-f173.google.com [209.85.210.173]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id AB870283FE5 for ; Tue, 21 Apr 2026 02:21:19 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.210.173
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1776738081; cv=none; b=cxAekELCSaliwxTMymAMjvHKR+47UHZamnkDhFnptHjaHYOYRnBh2iv5BWiaerLaLhzlMMrUEnJBNGfE5QFc0S1NGxXHFwId0i+2r7/qJDH+cU+x4mbfEw4d9QsnY9gQTvYsr4Hbk3dCtAgTEoJ3SKc6S8YeEeilR5JvfKwVFT8=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1776738081; c=relaxed/simple; bh=cisEORVCjat5lwV5tkQDCX1Tqc45f9xCBoE4PqUqHq0=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=H4OuqG6n5IZK74hZfm3zGwcAsrPrloRb5QEbQMk8H3BNcnV5IxL+997DPB3FKju14ATouVPryX60Ig0kXISIimHQxs3M/E14ypdS5p0wlLw/sxKhBoloz7Ew4WXoTqY7bUbT9EcvTzGSubwuWUZEKmEnajc6il/cCXKtAqXGbGg=
ARC-Authentication-Results: i=1;
smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=bytedance.com; spf=pass smtp.mailfrom=bytedance.com; dkim=pass (2048-bit key) header.d=bytedance.com header.i=@bytedance.com header.b=ayxDdBSh; arc=none smtp.client-ip=209.85.210.173 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=bytedance.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=bytedance.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=bytedance.com header.i=@bytedance.com header.b="ayxDdBSh" Received: by mail-pf1-f173.google.com with SMTP id d2e1a72fcca58-8296dabef74so3321040b3a.1 for ; Mon, 20 Apr 2026 19:21:19 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1776738079; x=1777342879; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=ll+rBBsFEUHYy/yotCSUgO8Da9y5VqQ+MYajiuV9kqo=; b=ayxDdBShijs9p1CWmf7O2EcjZ04TD5yjEhk7N9wQ9CBjH6voiuP7cHH+BgjZ8pNJHX R8K2ZDxwXFmGJdKrGtxw5nhNeT8txn3hEfUTBDqtZ5/RuydNOmB3WN24UFUGPE+A838L ah8LKqbT4obr/35uqBCkrCTFdQFN1ak+RzGUnR71C9Edi8wS+n9wYzKqGrKuefXzmNXq 8qqDDA1EyyCV9N1+5n65Cz5ZjS/OUuuRPm/vt10E+Mg3xkkZJ/QUhaZQLC859stkdzFH xRpOIbaeoxCxQ0QsCBp6Cn+S0CB7XfUlzlcKuYkKdzpTznY1YvRxtkOXiJhA6XP5YSXm v0Mg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1776738079; x=1777342879; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=ll+rBBsFEUHYy/yotCSUgO8Da9y5VqQ+MYajiuV9kqo=; b=NGjKq+zqHCwDe2Cw33r7qRUoE1RadaVgl1Pz6FSJO2OsFlTVB87HO/4t75H9WrF+hs PDxfcZN3/2oFGvgqWY4RfzpaHdMMqUpBQEzhx+8/YvKJDhOCcsxPiY2g1M4GWJOtDL89 YilYpQWyJWewJj0ov7d6HTYxEFvB60TlpEf96x5jVMwxz/6jVwsRDvscvNke0z9MHEY2 
qfV7hDRIX2Ox+GIVc+5NQ79Fm6h5w5cA/t2AuzyNvNqvHFn8kJ+OZGHFocj74gyWVJvS e6feRk5n+EQoYKUzMgHhQySBXoi+9/ygeYORJ0KgElu1iK4gkwCM832+UbpqY01hXf4v iCzg== X-Forwarded-Encrypted: i=1; AFNElJ/E0KF4VXM0grD6P8D1stxCGiZfTy4//FwO/0kVlMEnThQCh48bBU+rWZ9csN8oe+124HgNIN2VuU9EJl4=@vger.kernel.org X-Gm-Message-State: AOJu0YwRCSs/wibk55470qcfBLM4X/3BAXMT++Vp/G6hpjSgxJvD/nLN VGuku9eej8v4ql+7oK/3BkuFEidRWzZekBHQMUUsJ7WEKJ5yRAN9eLY0DQxhqFRutZQ= X-Gm-Gg: AeBDietXFmi2PQEYZhLFqEiu/CVIFKhcBo5NBuUuI8QaRUK8fg8G/wJo3QKlPbw4+xI NHk9FR/7YELJGaP1CZgVJ9Kw2SA/89hbXo0cXaWvPEtvnvvV4K4kR//tezxnj0aX21ao/gCrr4t 2qLaVyQnyaQV5WD3ufAjO9d/G8LCdRvbipF5Z4YSxitbiXq0MzASRtoRITf1Xb+CpVGBuhV35cv Cnjda/RbDBxhPzThT16QEF2UmkoDlaR+wVIaF6AV+JDv9jERKBSdzrqH4rct1T5sm3q9AfxMMo7 zBGadOq38LyDs3VCd5Q4DIurjzGJt38XLTAZRAUANjhAx5ldhgCpMBH2NM7ZbayHEsKUZCXj1jQ mf2ygMJ0MHuMpIk8rjeiZpsV1+js8m5ayh+3uenR0T34PA1ZH1BBEcGXOpWzJCByxpI7mmjjpQI dfE9q5BOlYFjikKzz6pv9NZRXO2Mvg X-Received: by 2002:a05:6a00:845:b0:82f:5125:a327 with SMTP id d2e1a72fcca58-82f8c902183mr17293231b3a.27.1776738078868; Mon, 20 Apr 2026 19:21:18 -0700 (PDT) Received: from n232-176-004.byted.org ([240e:83:200::340]) by smtp.gmail.com with ESMTPSA id d2e1a72fcca58-82f932dabd4sm11538780b3a.51.2026.04.20.19.21.14 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 20 Apr 2026 19:21:18 -0700 (PDT) From: Muchun Song To: Andrew Morton , David Hildenbrand , Muchun Song , Oscar Salvador , Michael Ellerman , Madhavan Srinivasan Cc: Muchun Song , Mike Rapoport , Lorenzo Stoakes , "Liam R . 
 Howlett" , Vlastimil Babka , Suren Baghdasaryan , Michal Hocko , Nicholas Piggin , Christophe Leroy , aneesh.kumar@linux.ibm.com, joao.m.martins@oracle.com, linux-mm@kvack.org, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org
Subject: [PATCH v3 2/4] mm/sparse-vmemmap: Pass @pgmap argument to memory deactivation paths
Date: Tue, 21 Apr 2026 10:20:42 +0800
Message-Id: <20260421022044.1217503-3-songmuchun@bytedance.com>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20260421022044.1217503-1-songmuchun@bytedance.com>
References: <20260421022044.1217503-1-songmuchun@bytedance.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

Currently, the memory hot-remove call chain -- arch_remove_memory(),
__remove_pages(), sparse_remove_section() and section_deactivate() --
does not carry the struct dev_pagemap pointer. This prevents the lower
levels from knowing whether the section was originally populated with
vmemmap optimizations (e.g., DAX with vmemmap optimization enabled).

Without this information, we cannot call vmemmap_can_optimize() to
determine if the vmemmap pages were optimized. As a result, the vmemmap
page accounting during teardown will mistakenly assume a non-optimized
allocation, leading to incorrect memmap statistics.

To lay the groundwork for fixing the vmemmap page accounting, we need
to pass the @pgmap pointer down to the deactivation location. Plumb the
@pgmap argument through the APIs of arch_remove_memory(),
__remove_pages() and sparse_remove_section(), mirroring the
corresponding *_activate() paths.

Signed-off-by: Muchun Song
Acked-by: Mike Rapoport (Microsoft)
Reviewed-by: Oscar Salvador
---
 arch/arm64/mm/mmu.c            |  5 +++--
 arch/loongarch/mm/init.c       |  5 +++--
 arch/powerpc/mm/mem.c          |  5 +++--
 arch/riscv/mm/init.c           |  5 +++--
 arch/s390/mm/init.c            |  5 +++--
 arch/x86/mm/init_64.c          |  5 +++--
 include/linux/memory_hotplug.h |  8 +++++---
 mm/memory_hotplug.c            | 12 ++++++------
 mm/memremap.c                  |  4 ++--
 mm/sparse-vmemmap.c            | 17 +++++++++--------
 10 files changed, 40 insertions(+), 31 deletions(-)

diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index dd85e093ffdb..e5a42b7a0160 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -2024,12 +2024,13 @@ int arch_add_memory(int nid, u64 start, u64 size,
 	return ret;
 }
 
-void arch_remove_memory(u64 start, u64 size, struct vmem_altmap *altmap)
+void arch_remove_memory(u64 start, u64 size, struct vmem_altmap *altmap,
+			struct dev_pagemap *pgmap)
 {
 	unsigned long start_pfn = start >> PAGE_SHIFT;
 	unsigned long nr_pages = size >> PAGE_SHIFT;
 
-	__remove_pages(start_pfn, nr_pages, altmap);
+	__remove_pages(start_pfn, nr_pages, altmap, pgmap);
 	__remove_pgd_mapping(swapper_pg_dir, __phys_to_virt(start), size);
 }
 
diff --git a/arch/loongarch/mm/init.c b/arch/loongarch/mm/init.c
index 00f3822b6e47..c9c57f08fa2c 100644
--- a/arch/loongarch/mm/init.c
+++ b/arch/loongarch/mm/init.c
@@ -86,7 +86,8 @@ int arch_add_memory(int nid, u64 start, u64 size, struct mhp_params *params)
 	return ret;
 }
 
-void arch_remove_memory(u64 start, u64 size, struct vmem_altmap *altmap)
+void arch_remove_memory(u64 start, u64 size, struct vmem_altmap *altmap,
+			struct dev_pagemap *pgmap)
 {
 	unsigned long start_pfn = start >> PAGE_SHIFT;
 	unsigned long nr_pages = size >> PAGE_SHIFT;
@@ -95,7 +96,7 @@ void arch_remove_memory(u64 start, u64 size, struct vmem_altmap *altmap)
 	/* With altmap the first mapped page is offset from @start */
 	if (altmap)
 		page += vmem_altmap_offset(altmap);
-	__remove_pages(start_pfn, nr_pages, altmap);
+	__remove_pages(start_pfn, nr_pages, altmap, pgmap);
 }
 #endif
 
diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c
index 648d0c5602ec..4c1afab91996 100644
--- a/arch/powerpc/mm/mem.c
+++ b/arch/powerpc/mm/mem.c
@@ -158,12 +158,13 @@ int __ref arch_add_memory(int nid, u64 start, u64 size,
 	return rc;
 }
 
-void __ref arch_remove_memory(u64 start, u64 size, struct vmem_altmap *altmap)
+void __ref arch_remove_memory(u64 start, u64 size, struct vmem_altmap *altmap,
+			      struct dev_pagemap *pgmap)
 {
 	unsigned long start_pfn = start >> PAGE_SHIFT;
 	unsigned long nr_pages = size >> PAGE_SHIFT;
 
-	__remove_pages(start_pfn, nr_pages, altmap);
+	__remove_pages(start_pfn, nr_pages, altmap, pgmap);
 	arch_remove_linear_mapping(start, size);
 }
 #endif
diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
index decd7df40fa4..b0092fb842a3 100644
--- a/arch/riscv/mm/init.c
+++ b/arch/riscv/mm/init.c
@@ -1717,9 +1717,10 @@ int __ref arch_add_memory(int nid, u64 start, u64 size, struct mhp_params *param
 	return ret;
 }
 
-void __ref arch_remove_memory(u64 start, u64 size, struct vmem_altmap *altmap)
+void __ref arch_remove_memory(u64 start, u64 size, struct vmem_altmap *altmap,
+			      struct dev_pagemap *pgmap)
 {
-	__remove_pages(start >> PAGE_SHIFT, size >> PAGE_SHIFT, altmap);
+	__remove_pages(start >> PAGE_SHIFT, size >> PAGE_SHIFT, altmap, pgmap);
 	remove_linear_mapping(start, size);
 	flush_tlb_all();
 }
diff --git a/arch/s390/mm/init.c b/arch/s390/mm/init.c
index 1f72efc2a579..11a689423440 100644
--- a/arch/s390/mm/init.c
+++ b/arch/s390/mm/init.c
@@ -276,12 +276,13 @@ int arch_add_memory(int nid, u64 start, u64 size,
 	return rc;
 }
 
-void arch_remove_memory(u64 start, u64 size, struct vmem_altmap *altmap)
+void arch_remove_memory(u64 start, u64 size, struct vmem_altmap *altmap,
+			struct dev_pagemap *pgmap)
 {
 	unsigned long start_pfn = start >> PAGE_SHIFT;
 	unsigned long nr_pages = size >> PAGE_SHIFT;
 
-	__remove_pages(start_pfn, nr_pages, altmap);
+	__remove_pages(start_pfn, nr_pages, altmap, pgmap);
 	vmem_remove_mapping(start, size);
 }
 #endif /* CONFIG_MEMORY_HOTPLUG */
diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index df2261fa4f98..77b889b71cf3 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -1288,12 +1288,13 @@ kernel_physical_mapping_remove(unsigned long start, unsigned long end)
 	remove_pagetable(start, end, true, NULL);
 }
 
-void __ref arch_remove_memory(u64 start, u64 size, struct vmem_altmap *altmap)
+void __ref arch_remove_memory(u64 start, u64 size, struct vmem_altmap *altmap,
+			      struct dev_pagemap *pgmap)
 {
 	unsigned long start_pfn = start >> PAGE_SHIFT;
 	unsigned long nr_pages = size >> PAGE_SHIFT;
 
-	__remove_pages(start_pfn, nr_pages, altmap);
+	__remove_pages(start_pfn, nr_pages, altmap, pgmap);
 	kernel_physical_mapping_remove(start, start + size);
 }
 #endif /* CONFIG_MEMORY_HOTPLUG */
diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
index 815e908c4135..7c9d66729c60 100644
--- a/include/linux/memory_hotplug.h
+++ b/include/linux/memory_hotplug.h
@@ -135,9 +135,10 @@ static inline bool movable_node_is_enabled(void)
 	return movable_node_enabled;
 }
 
-extern void arch_remove_memory(u64 start, u64 size, struct vmem_altmap *altmap);
+extern void arch_remove_memory(u64 start, u64 size, struct vmem_altmap *altmap,
+			       struct dev_pagemap *pgmap);
 extern void __remove_pages(unsigned long start_pfn, unsigned long nr_pages,
-			   struct vmem_altmap *altmap);
+			   struct vmem_altmap *altmap, struct dev_pagemap *pgmap);
 
 /* reasonably generic interface to expand the physical pages */
 extern int __add_pages(int nid, unsigned long start_pfn, unsigned long nr_pages,
@@ -307,7 +308,8 @@ extern int sparse_add_section(int nid, unsigned long pfn,
 		unsigned long nr_pages, struct vmem_altmap *altmap,
 		struct dev_pagemap *pgmap);
 extern void sparse_remove_section(unsigned long pfn, unsigned long nr_pages,
-				  struct vmem_altmap *altmap);
+				  struct vmem_altmap *altmap,
+				  struct dev_pagemap *pgmap);
 extern struct zone *zone_for_pfn_range(enum mmop online_type,
 		int nid, struct memory_group *group, unsigned long start_pfn,
 		unsigned long nr_pages);
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 2a943ec57c85..6a9e2dc751d2 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -583,7 +583,7 @@ void remove_pfn_range_from_zone(struct zone *zone,
  * calling offline_pages().
  */
 void __remove_pages(unsigned long pfn, unsigned long nr_pages,
-		    struct vmem_altmap *altmap)
+		    struct vmem_altmap *altmap, struct dev_pagemap *pgmap)
 {
 	const unsigned long end_pfn = pfn + nr_pages;
 	unsigned long cur_nr_pages;
@@ -598,7 +598,7 @@ void __remove_pages(unsigned long pfn, unsigned long nr_pages,
 		/* Select all remaining pages up to the next section boundary */
 		cur_nr_pages = min(end_pfn - pfn,
 				   SECTION_ALIGN_UP(pfn + 1) - pfn);
-		sparse_remove_section(pfn, cur_nr_pages, altmap);
+		sparse_remove_section(pfn, cur_nr_pages, altmap, pgmap);
 	}
 }
 
@@ -1425,7 +1425,7 @@ static void remove_memory_blocks_and_altmaps(u64 start, u64 size)
 
 		remove_memory_block_devices(cur_start, memblock_size);
 
-		arch_remove_memory(cur_start, memblock_size, altmap);
+		arch_remove_memory(cur_start, memblock_size, altmap, NULL);
 
 		/* Verify that all vmemmap pages have actually been freed. */
 		WARN(altmap->alloc, "Altmap not fully unmapped");
@@ -1468,7 +1468,7 @@ static int create_altmaps_and_memory_blocks(int nid, struct memory_group *group,
 		ret = create_memory_block_devices(cur_start, memblock_size, nid,
 						  params.altmap, group);
 		if (ret) {
-			arch_remove_memory(cur_start, memblock_size, NULL);
+			arch_remove_memory(cur_start, memblock_size, NULL, NULL);
 			kfree(params.altmap);
 			goto out;
 		}
@@ -1554,7 +1554,7 @@ int add_memory_resource(int nid, struct resource *res, mhp_t mhp_flags)
 		/* create memory block devices after memory was added */
 		ret = create_memory_block_devices(start, size, nid, NULL, group);
 		if (ret) {
-			arch_remove_memory(start, size, params.altmap);
+			arch_remove_memory(start, size, params.altmap, NULL);
 			goto error;
 		}
 	}
@@ -2266,7 +2266,7 @@ static int try_remove_memory(u64 start, u64 size)
 		 * No altmaps present, do the removal directly
 		 */
 		remove_memory_block_devices(start, size);
-		arch_remove_memory(start, size, NULL);
+		arch_remove_memory(start, size, NULL, NULL);
 	} else {
 		/* all memblocks in the range have altmaps */
 		remove_memory_blocks_and_altmaps(start, size);
diff --git a/mm/memremap.c b/mm/memremap.c
index 053842d45cb1..81766d822400 100644
--- a/mm/memremap.c
+++ b/mm/memremap.c
@@ -97,10 +97,10 @@ static void pageunmap_range(struct dev_pagemap *pgmap, int range_id)
 				   PHYS_PFN(range_len(range)));
 	if (pgmap->type == MEMORY_DEVICE_PRIVATE) {
 		__remove_pages(PHYS_PFN(range->start),
-			       PHYS_PFN(range_len(range)), NULL);
+			       PHYS_PFN(range_len(range)), NULL, pgmap);
 	} else {
 		arch_remove_memory(range->start, range_len(range),
-				pgmap_altmap(pgmap));
+				pgmap_altmap(pgmap), pgmap);
 		kasan_remove_zero_shadow(__va(range->start), range_len(range));
 	}
 	mem_hotplug_done();
diff --git a/mm/sparse-vmemmap.c b/mm/sparse-vmemmap.c
index a7b11248b989..40290fbc1db4 100644
--- a/mm/sparse-vmemmap.c
+++ b/mm/sparse-vmemmap.c
@@ -665,7 +665,7 @@ static struct page * __meminit populate_section_memmap(unsigned long pfn,
 }
 
 static void depopulate_section_memmap(unsigned long pfn, unsigned long nr_pages,
-		struct vmem_altmap *altmap)
+		struct vmem_altmap *altmap, struct dev_pagemap *pgmap)
 {
 	unsigned long start = (unsigned long) pfn_to_page(pfn);
 	unsigned long end = start + nr_pages * sizeof(struct page);
@@ -674,7 +674,8 @@ static void depopulate_section_memmap(unsigned long pfn, unsigned long nr_pages,
 	vmemmap_free(start, end, altmap);
 }
 
-static void free_map_bootmem(struct page *memmap)
+static void free_map_bootmem(struct page *memmap, struct vmem_altmap *altmap,
+		struct dev_pagemap *pgmap)
 {
 	unsigned long start = (unsigned long)memmap;
 	unsigned long end = (unsigned long)(memmap + PAGES_PER_SECTION);
@@ -746,7 +747,7 @@ static int fill_subsection_map(unsigned long pfn, unsigned long nr_pages)
  * usage map, but still need to free the vmemmap range.
  */
 static void section_deactivate(unsigned long pfn, unsigned long nr_pages,
-		struct vmem_altmap *altmap)
+		struct vmem_altmap *altmap, struct dev_pagemap *pgmap)
 {
 	struct mem_section *ms = __pfn_to_section(pfn);
 	bool section_is_early = early_section(ms);
@@ -784,9 +785,9 @@ static void section_deactivate(unsigned long pfn, unsigned long nr_pages,
 	 * section_activate() and pfn_valid() .
 	 */
 	if (!section_is_early)
-		depopulate_section_memmap(pfn, nr_pages, altmap);
+		depopulate_section_memmap(pfn, nr_pages, altmap, pgmap);
 	else if (memmap)
-		free_map_bootmem(memmap);
+		free_map_bootmem(memmap, altmap, pgmap);
 
 	if (empty)
 		ms->section_mem_map = (unsigned long)NULL;
@@ -828,7 +829,7 @@ static struct page * __meminit section_activate(int nid, unsigned long pfn,
 
 	memmap = populate_section_memmap(pfn, nr_pages, nid, altmap, pgmap);
 	if (!memmap) {
-		section_deactivate(pfn, nr_pages, altmap);
+		section_deactivate(pfn, nr_pages, altmap, pgmap);
 		return ERR_PTR(-ENOMEM);
 	}
 
@@ -889,13 +890,13 @@ int __meminit sparse_add_section(int nid, unsigned long start_pfn,
 }
 
 void sparse_remove_section(unsigned long pfn, unsigned long nr_pages,
-			   struct vmem_altmap *altmap)
+			   struct vmem_altmap *altmap, struct dev_pagemap *pgmap)
 {
 	struct mem_section *ms = __pfn_to_section(pfn);
 
 	if (WARN_ON_ONCE(!valid_section(ms)))
 		return;
 
-	section_deactivate(pfn, nr_pages, altmap);
+	section_deactivate(pfn, nr_pages, altmap, pgmap);
 }
 #endif /* CONFIG_MEMORY_HOTPLUG */
-- 
2.20.1

From nobody Tue Jun 16 20:39:47 2026
Received: from mail-pf1-f169.google.com (mail-pf1-f169.google.com [209.85.210.169]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id DD3AE28851C for ; Tue, 21 Apr 2026 02:21:24 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.210.169
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1776738086; cv=none; b=k0qykTwUcPDnwcDGaZKAb6hrG4Kr++cq/1FV0vY4II/yaiKDSK1q8MW9BJHdm2t8hRAL68Pifcz9pkfsEBqxIoOJMu+IBpPVbOM7OkPl34a8n7zYACEGDculzsccZXLElL+OaLLdWJJXFBRU+sAf/OcYahjHE9b9RoHwosP6A5I=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1776738086; c=relaxed/simple; bh=KbaVktxHOdEiSDzWbcrTOO4Pzv9QguIQ8vWwv5Weejk=;
h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=HyQitWz0QqGi0f8ij2PqG1/iVxxAK4d91Tu4EsESsiDrz7xmwIRAX+VPDJs1MDxSQ2p1h1Fzufn0HFW44xQD5QrQ0/62a14TUMyO81PTFOtqwVW28gjYbR/ZtH0fEmsCXgpHwldFFOzDiNHGYxzL9VVYplgSlWpomxaVIiPYI9M= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=bytedance.com; spf=pass smtp.mailfrom=bytedance.com; dkim=pass (2048-bit key) header.d=bytedance.com header.i=@bytedance.com header.b=eGnMwOcl; arc=none smtp.client-ip=209.85.210.169 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=bytedance.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=bytedance.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=bytedance.com header.i=@bytedance.com header.b="eGnMwOcl" Received: by mail-pf1-f169.google.com with SMTP id d2e1a72fcca58-82748257f5fso2517513b3a.1 for ; Mon, 20 Apr 2026 19:21:24 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1776738084; x=1777342884; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=yQT9sgIl1x5UPRFaiEeld1v2vJgPbOqztRRBh4ZUxsY=; b=eGnMwOcl54p/M0I3FPxXPfuC9sLqC5pAT9yCbNXow09RrIK//Hf3o2HsbCmYvf9D5P vs2IYl4Pd41rvxrQwtcJUhSNAxV+aZVwzCwg01USPrfzEXc3rKPokj9EvocHcZUz2F9P 6YB8DZ5KMaOGc+vtmo0O4f+bceanrNiSpZJ5h1BhZtJAK/E2iGk2ku7dHMDNr7Ec5hpm G59zSfnwpoKbgpPrG05e45qtIdDkWc2EvG3gg45A2xXs1gJ5N+feL2y3EQuyHrgRFb98 yRRZbYWRWvYcS9n7E7Hg9xcVjOzRc4WbBP8E514H3PPcqbMereiAHH6kzx0fd5J/7T32 9Z7A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1776738084; x=1777342884; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; 
bh=yQT9sgIl1x5UPRFaiEeld1v2vJgPbOqztRRBh4ZUxsY=; b=KfAJ6CSEndmFIgKmfTrmvmbsvleAlvtYkS4QiCb47gVKwPmOp32Ix2ZPNHcTAO7KU6 1Zn5u88UjMMdDoEfxpmoR+/UaLaRPUEBHX2wUWaMjrXKvMqAwpbOfYV734mvectG920j uDuSsiP5Wc9VFSlK7sGu6Z8ke0vTSzhN+W8n4wuMTWkETVwGVSx6TOPgSNuBykWcL0PW zgsTLOA4I/VVyq5r4Z8n+yko8JpVSR3B9K0EZKCelMvR1oPVhB2XnLvIGSdFxdK86CGI kjYqWScKHt3WBsU5DtXYgZULogky4fIsNGWUQHe3fVRWOp+yNcyMyPyWliMSOIc9dc7j WGlw== X-Forwarded-Encrypted: i=1; AFNElJ9apvqCfyBwBPRXuW0vpxG1/uJJwKaipmyQ+qMKZEZUUbfNS31hnLVsAwi9i8oDkLaXwC6paFYPIr/Fheo=@vger.kernel.org X-Gm-Message-State: AOJu0YxIabL18GuPut0w/oW29qtRY6aOFibtb8I/ZhRPrEhQNuqOrN7K NS+hjv6l2CcqXJ6uXnYi8EBNTi45XVCfu4eKjCqf9G1z/dsFHVKoHfWRuNtUdFuTsQo= X-Gm-Gg: AeBDietpc9PxQWBy7RVITrGC1q8veNRa1xGCNiZRMfzmc8lC3bsxtRjW8MZjK8y6cW0 wkzwOlHu4wXU0f2GSK2d0XK/o114SH8WNRUEHpyJeDQfdZxL7upxIXByhYtBtN1SwV/NocueqFF 8HP4vAaqxkzaFnEDoUNOgo8SfIlkPBPzDcxj3gkXeC7rkp6fgURK4pcsbqrPJr8pdtpIAlYBtRK 5i0irfqW+Ywzi0E2rDy0kIzWLIfZcFGuEJ+2vNO4Bpd9kL+Ty3q2NYAqaybNpesMLNBOcYQgqsA leF5osof964RpHIS7y+hapRzSLdkupFKb22hVz1HWXmGOJJFu0ekCoM0jIMWPW9dJSzUcBDKxk0 sJ193E3pEobt7z+ReNIrRbcXUFcfOYXMcEpSvoLK+kZTkN4B7ID02bvg8dgp1HCHqM2KVP3Wy9Z p/bapBoiGi+Id2jck1lxAaBFk0qNc6 X-Received: by 2002:a05:6a00:be8:b0:82f:9a88:9092 with SMTP id d2e1a72fcca58-82f9a889558mr7527975b3a.33.1776738084191; Mon, 20 Apr 2026 19:21:24 -0700 (PDT) Received: from n232-176-004.byted.org ([240e:83:200::340]) by smtp.gmail.com with ESMTPSA id d2e1a72fcca58-82f932dabd4sm11538780b3a.51.2026.04.20.19.21.19 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 20 Apr 2026 19:21:23 -0700 (PDT) From: Muchun Song To: Andrew Morton , David Hildenbrand , Muchun Song , Oscar Salvador , Michael Ellerman , Madhavan Srinivasan Cc: Muchun Song , Mike Rapoport , Lorenzo Stoakes , "Liam R . 
 Howlett" , Vlastimil Babka , Suren Baghdasaryan , Michal Hocko , Nicholas Piggin , Christophe Leroy , aneesh.kumar@linux.ibm.com, joao.m.martins@oracle.com, linux-mm@kvack.org, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org
Subject: [PATCH v3 3/4] mm/sparse-vmemmap: Fix DAX vmemmap accounting with optimization
Date: Tue, 21 Apr 2026 10:20:43 +0800
Message-Id: <20260421022044.1217503-4-songmuchun@bytedance.com>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20260421022044.1217503-1-songmuchun@bytedance.com>
References: <20260421022044.1217503-1-songmuchun@bytedance.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

When vmemmap optimization is enabled for DAX, the nr_memmap_pages
counter in /proc/vmstat is incorrect. The current code always accounts
for the full, non-optimized vmemmap size, but vmemmap optimization
reduces the actual number of vmemmap pages by reusing tail pages. This
causes the system to overcount vmemmap usage, leading to inaccurate
page statistics in /proc/vmstat.

Fix this by introducing section_vmemmap_pages(), which returns the exact
vmemmap page count for a given pfn range based on whether optimization
is in effect.


Fixes: 15995a352474 ("mm: report per-page metadata information")
Signed-off-by: Muchun Song
Acked-by: Mike Rapoport (Microsoft)
Acked-by: Oscar Salvador
---
 mm/sparse-vmemmap.c | 32 ++++++++++++++++++++++++++++----
 1 file changed, 28 insertions(+), 4 deletions(-)

diff --git a/mm/sparse-vmemmap.c b/mm/sparse-vmemmap.c
index 40290fbc1db4..05e3e2b94e32 100644
--- a/mm/sparse-vmemmap.c
+++ b/mm/sparse-vmemmap.c
@@ -652,6 +652,29 @@ void offline_mem_sections(unsigned long start_pfn, unsigned long end_pfn)
 	}
 }
 
+static int __meminit section_vmemmap_pages(unsigned long pfn, unsigned long nr_pages,
+					   struct vmem_altmap *altmap,
+					   struct dev_pagemap *pgmap)
+{
+	unsigned int order = pgmap ? pgmap->vmemmap_shift : 0;
+	unsigned long pages_per_compound = 1L << order;
+
+	VM_WARN_ON_ONCE(!IS_ALIGNED(pfn | nr_pages, min(pages_per_compound,
+							PAGES_PER_SECTION)));
+	VM_WARN_ON_ONCE(pfn_to_section_nr(pfn) != pfn_to_section_nr(pfn + nr_pages - 1));
+
+	if (!vmemmap_can_optimize(altmap, pgmap))
+		return DIV_ROUND_UP(nr_pages * sizeof(struct page), PAGE_SIZE);
+
+	if (order < PFN_SECTION_SHIFT)
+		return VMEMMAP_RESERVE_NR * nr_pages / pages_per_compound;
+
+	if (IS_ALIGNED(pfn, pages_per_compound))
+		return VMEMMAP_RESERVE_NR;
+
+	return 0;
+}
+
 static struct page * __meminit populate_section_memmap(unsigned long pfn,
 		unsigned long nr_pages, int nid, struct vmem_altmap *altmap,
 		struct dev_pagemap *pgmap)
@@ -659,7 +682,7 @@ static struct page * __meminit populate_section_memmap(unsigned long pfn,
 	struct page *page = __populate_section_memmap(pfn, nr_pages, nid, altmap,
 						      pgmap);
 
-	memmap_pages_add(DIV_ROUND_UP(nr_pages * sizeof(struct page), PAGE_SIZE));
+	memmap_pages_add(section_vmemmap_pages(pfn, nr_pages, altmap, pgmap));
 
 	return page;
 }
@@ -670,7 +693,7 @@ static void depopulate_section_memmap(unsigned long pfn, unsigned long nr_pages,
 	unsigned long start = (unsigned long) pfn_to_page(pfn);
 	unsigned long end = start + nr_pages * sizeof(struct page);
 
-	memmap_pages_add(-1L * (DIV_ROUND_UP(nr_pages * sizeof(struct page), PAGE_SIZE)));
+	memmap_pages_add(-section_vmemmap_pages(pfn, nr_pages, altmap, pgmap));
 	vmemmap_free(start, end, altmap);
 }
 
@@ -679,9 +702,10 @@ static void free_map_bootmem(struct page *memmap, struct vmem_altmap *altmap,
 {
 	unsigned long start = (unsigned long)memmap;
 	unsigned long end = (unsigned long)(memmap + PAGES_PER_SECTION);
+	unsigned long pfn = page_to_pfn(memmap);
 
-	memmap_boot_pages_add(-1L * (DIV_ROUND_UP(PAGES_PER_SECTION * sizeof(struct page),
-						   PAGE_SIZE)));
+	memmap_boot_pages_add(-section_vmemmap_pages(pfn, PAGES_PER_SECTION,
+						      altmap, pgmap));
 	vmemmap_free(start, end, NULL);
 }
 
-- 
2.20.1

From nobody Tue Jun 16 20:39:47 2026
From: Muchun Song
To: Andrew Morton, David Hildenbrand, Muchun Song, Oscar Salvador,
    Michael Ellerman, Madhavan Srinivasan
Cc: Muchun Song, Mike Rapoport, Lorenzo Stoakes, "Liam R. Howlett",
    Vlastimil Babka, Suren Baghdasaryan, Michal Hocko, Nicholas Piggin,
    Christophe Leroy, aneesh.kumar@linux.ibm.com,
    joao.m.martins@oracle.com, linux-mm@kvack.org,
    linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org
Subject: [PATCH v3 4/4] mm/mm_init: Fix pageblock migratetype for ZONE_DEVICE compound pages
Date: Tue, 21 Apr 2026 10:20:44 +0800
Message-Id: <20260421022044.1217503-5-songmuchun@bytedance.com>
In-Reply-To: <20260421022044.1217503-1-songmuchun@bytedance.com>
References: <20260421022044.1217503-1-songmuchun@bytedance.com>

The memmap_init_zone_device() function only initializes the migratetype
of the first pageblock of a compound page. If the compound page size
exceeds pageblock_nr_pages (e.g., 1GB hugepages with 2MB pageblocks),
subsequent pageblocks in the compound page remain uninitialized.

Move the migratetype initialization out of __init_zone_device_page() and
into a separate pageblock_migratetype_init_range() function. This
iterates over the entire PFN range of the memory, ensuring that all
pageblocks are correctly initialized.
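
The range walk can be illustrated with a small userspace model. This is
a sketch under assumptions, not the kernel code: MODEL_PAGEBLOCK_NR_PAGES
assumes 2MB pageblocks with 4KB pages, the model_pageblock_type[] array
is a hypothetical stand-in for the per-pageblock migratetype bits, and
cond_resched() is left out.

```c
#include <assert.h>

#define MODEL_PAGEBLOCK_NR_PAGES 512UL	/* assumed: 2MB pageblocks, 4KB pages */
#define MODEL_MAX_PAGEBLOCKS     1024UL	/* size of the toy state array */

enum model_migratetype { MODEL_UNSET = 0, MODEL_MOVABLE = 1 };

/* Hypothetical stand-in for the per-pageblock migratetype state. */
static int model_pageblock_type[MODEL_MAX_PAGEBLOCKS];

/* Round pfn up to the next pageblock boundary. */
static unsigned long model_pageblock_align(unsigned long pfn)
{
	return (pfn + MODEL_PAGEBLOCK_NR_PAGES - 1) &
	       ~(MODEL_PAGEBLOCK_NR_PAGES - 1);
}

/*
 * Set the migratetype of every pageblock that starts inside
 * [pfn, pfn + nr_pages) and return how many pageblocks were visited.
 */
static unsigned long model_pageblock_migratetype_init_range(unsigned long pfn,
							     unsigned long nr_pages,
							     int migratetype)
{
	unsigned long end = pfn + nr_pages;
	unsigned long visited = 0;

	for (pfn = model_pageblock_align(pfn); pfn < end;
	     pfn += MODEL_PAGEBLOCK_NR_PAGES) {
		unsigned long block = pfn / MODEL_PAGEBLOCK_NR_PAGES;

		assert(block < MODEL_MAX_PAGEBLOCKS);
		model_pageblock_type[block] = migratetype;
		visited++;
	}

	return visited;
}
```

With these assumed sizes, a single 1GB compound page (262144 pages)
covers 512 pageblocks, and the walk marks all of them rather than only
the one containing the head page.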

Fixes: c4386bd8ee3a ("mm/memremap: add ZONE_DEVICE support for compound pages")
Signed-off-by: Muchun Song
Reviewed-by: Mike Rapoport (Microsoft)
Reviewed-by: Oscar Salvador
---
 mm/mm_init.c | 43 ++++++++++++++++++++++++++++---------------
 1 file changed, 28 insertions(+), 15 deletions(-)

diff --git a/mm/mm_init.c b/mm/mm_init.c
index f9f8e1af921c..e2d8eae23aa3 100644
--- a/mm/mm_init.c
+++ b/mm/mm_init.c
@@ -674,6 +674,19 @@ static inline void fixup_hashdist(void)
 static inline void fixup_hashdist(void) {}
 #endif /* CONFIG_NUMA */
 
+static __meminit void pageblock_migratetype_init_range(unsigned long pfn,
+							unsigned long nr_pages,
+							int migratetype)
+{
+	unsigned long end = pfn + nr_pages;
+
+	for (pfn = pageblock_align(pfn); pfn < end; pfn += pageblock_nr_pages) {
+		init_pageblock_migratetype(pfn_to_page(pfn), migratetype, false);
+		if (IS_ALIGNED(pfn, PAGES_PER_SECTION))
+			cond_resched();
+	}
+}
+
 /*
  * Initialize a reserved page unconditionally, finding its zone first.
  */
@@ -1011,21 +1024,6 @@ static void __ref __init_zone_device_page(struct page *page, unsigned long pfn,
 	page_folio(page)->pgmap = pgmap;
 	page->zone_device_data = NULL;
 
-	/*
-	 * Mark the block movable so that blocks are reserved for
-	 * movable at startup. This will force kernel allocations
-	 * to reserve their blocks rather than leaking throughout
-	 * the address space during boot when many long-lived
-	 * kernel allocations are made.
-	 *
-	 * Please note that MEMINIT_HOTPLUG path doesn't clear memmap
-	 * because this is done early in section_activate()
-	 */
-	if (pageblock_aligned(pfn)) {
-		init_pageblock_migratetype(page, MIGRATE_MOVABLE, false);
-		cond_resched();
-	}
-
 	/*
 	 * ZONE_DEVICE pages other than MEMORY_TYPE_GENERIC are released
 	 * directly to the driver page allocator which will set the page count
@@ -1122,6 +1120,9 @@ void __ref memmap_init_zone_device(struct zone *zone,
 
 		__init_zone_device_page(page, pfn, zone_idx, nid, pgmap);
 
+		if (IS_ALIGNED(pfn, PAGES_PER_SECTION))
+			cond_resched();
+
 		if (pfns_per_compound == 1)
 			continue;
 
@@ -1129,6 +1130,18 @@ void __ref memmap_init_zone_device(struct zone *zone,
 				     compound_nr_pages(altmap, pgmap));
 	}
 
+	/*
+	 * Mark the block movable so that blocks are reserved for
+	 * movable at startup. This will force kernel allocations
+	 * to reserve their blocks rather than leaking throughout
+	 * the address space during boot when many long-lived
+	 * kernel allocations are made.
+	 *
+	 * Please note that MEMINIT_HOTPLUG path doesn't clear memmap
+	 * because this is done early in section_activate()
+	 */
+	pageblock_migratetype_init_range(start_pfn, nr_pages, MIGRATE_MOVABLE);
+
 	pr_debug("%s initialised %lu pages in %ums\n", __func__,
 		nr_pages, jiffies_to_msecs(jiffies - start));
 }
-- 
2.20.1