From nobody Wed Jun 17 02:53:04 2026 Received: from mail-pf1-f174.google.com (mail-pf1-f174.google.com [209.85.210.174]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id C56EA3A9D8A for ; Wed, 22 Apr 2026 08:14:51 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.210.174 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1776845694; cv=none; b=jFGQWb6SGL5Bee1H/pN3c0Bt1BszWtudTRgA0QbQAPhKw0Am+ExYW1CLjP3VSBjh6Oh+UrFe57iEzMBxuvXRYnvF1Xuy70UC3FDvrV8iXugcxjchuXngCl8t9hOKGL8IMsuRHerjtp0RKAQmtCjS+SrxjzCI6fQeemwu55ZeZPc= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1776845694; c=relaxed/simple; bh=VgrOSv5lepekAANBfEikqeyqPyGWOg423/S5/ovnugo=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=KostAyr2pTZyWgrmF0R0ppJN4AAQO7ZL7S0F+Raokrd+s91bXwp5e0x+NZTuZYkIGNO1oGlJo1zyouDXjth7BcvMA2w6j84u76D61IIuoB+YOJMyXS4xVWfBOvQlP5rwQuiEW1S8WZOfzsz7jVBsawIcjK7/1i7oGoyOrmVIVZo= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=bytedance.com; spf=pass smtp.mailfrom=bytedance.com; dkim=pass (2048-bit key) header.d=bytedance.com header.i=@bytedance.com header.b=Crw8RLOE; arc=none smtp.client-ip=209.85.210.174 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=bytedance.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=bytedance.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=bytedance.com header.i=@bytedance.com header.b="Crw8RLOE" Received: by mail-pf1-f174.google.com with SMTP id d2e1a72fcca58-82f8b60e485so2158487b3a.0 for ; Wed, 22 Apr 2026 01:14:51 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1776845690; x=1777450490; darn=vger.kernel.org; 
h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=6TG6JcdOEtIWQgfEa9nERXqC1ZnDVOapveQPimu2FdY=; b=Crw8RLOEAfUBMLIiJwRcTvOXdkpEuzsbLoQa1JUfHc+VCYdrrF7gJr8j4c4OaU17qi 86xGCuq9Qm9SGtSMGBqlvb8nE7p0+keRHfufVgSyHTxXdxiCToPA33qqHXj+sSqO/YJd GKmYB+zpXKxiHbKM0DYXpfTgYke2uoZwmkmTEUPf1ycGEvW6Nwkm+/POUfj1c+ZUF7r+ xw0dPv/E6pbb4vs0Y+6IrcVxSEknJkPoWHKRYUH1tB9twC16N196sObarTej95Tc9BDb /g3Hvuw5Oo/UZ5HYdxN6PkRIy0x5hP47KAKdWklkgd9G2kl8w3yhPrOG+wVP9AFbdhCO ZTkw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1776845690; x=1777450490; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=6TG6JcdOEtIWQgfEa9nERXqC1ZnDVOapveQPimu2FdY=; b=Id3+OKDDP1j6ddtLSsfUjZqzUE2gNFUrQXAJs/JxvVGteJobU4Op9zCa7ye97owrwK NPUviC1APFNAEchk81pzQYb0gbgzv+1nxniroOohM/8jBukJOkek2AnCqT/+PRj4eDfW /xwyEs4HJAXEZDnS/blfxQmgXXwiNWp2j7Ccks7XfnStZEbk4qVTGC3kybCHAY3sch0f Dgb15WAekmJ0+1FmfXUuJYs3BmbYnJg0+k6NK6VZqeqnMWMSgB4OpMpGMWfkYqXHhaKB nVvj33DHb4Pj0y1IM0138dUTmP12VkTXx0cEM624jnVpySUifjzwbRpl8cmtxVEL69sN w1lQ== X-Forwarded-Encrypted: i=1; AFNElJ+ABPCm3IkycuUf0yQ8jnx93FZxQJMOE7VTL1j2o47vTHUNIu1GHrxsKHOYjMRavTE0YL0mQWBXX4f6nfY=@vger.kernel.org X-Gm-Message-State: AOJu0YxsjbIl7gUGHjg2/GRPHSOIdWhuDGdCs4ud1aQAgZOVLe/B+dCr jLvum6Q0b6+T6jJHazxrmkCpdperHvJwJhI+CYMlwAS2WrkBZqxRXMBQYxctqxaS+6g= X-Gm-Gg: AeBDieut9dnTYOKeMpHFyURuN3b+YuAtreVXQXA8OpLu0PntQfnrgpTd1czyxsHLVz/ 9sJ3xs5JymorwvAmbwxnOOBi1O8NkhU6P91BRKg5POq7WaIaqskkCQaM3c6ONo3TT8/4VE16H6s qDTVnVqYkhJvHWl8FN0+Dvb7uO0942/IS7gp1eVJ+Lw7vDMGILREeqBxExCfwclR5HvCtEwKC6u H6Nzk2pApVuzB6NLq4fnqU57GuZQxOX2LyFlvG0BVJndpkWKUpanMWAOlvJiLq83z2Y1VxhIu72 yNpbW2sKSdq2jFqmmTRgYujFu2Ge7vYoSI5JeWfiBoZgjdeyzS/S3c73OrXLCgq3f8XlbgchqAJ 6WvWSLw56Et0//C5ZS53GQ2rw2AwynmEDi4IVkFCdSA2ddar+3AcgQLRqymSNYDdmiwOt2FKQMV 
HMxfW9al8TT489atMRWGZmIh1jvuQl X-Received: by 2002:a05:6a00:1826:b0:82f:2384:212a with SMTP id d2e1a72fcca58-82f8c961ef5mr22759743b3a.26.1776845690162; Wed, 22 Apr 2026 01:14:50 -0700 (PDT) Received: from n232-176-004.byted.org ([240e:83:200::34f]) by smtp.gmail.com with ESMTPSA id d2e1a72fcca58-82f8ec0307esm16522874b3a.53.2026.04.22.01.14.44 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 22 Apr 2026 01:14:49 -0700 (PDT) From: Muchun Song To: Andrew Morton , David Hildenbrand , Muchun Song , Oscar Salvador , Michael Ellerman , Madhavan Srinivasan Cc: Muchun Song , Lorenzo Stoakes , "Liam R . Howlett" , Vlastimil Babka , Mike Rapoport , Suren Baghdasaryan , Michal Hocko , Nicholas Piggin , Christophe Leroy , aneesh.kumar@linux.ibm.com, joao.m.martins@oracle.com, linux-mm@kvack.org, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org Subject: [PATCH v4 1/5] mm/sparse-vmemmap: Fix vmemmap accounting underflow Date: Wed, 22 Apr 2026 16:14:16 +0800 Message-Id: <20260422081420.4009847-2-songmuchun@bytedance.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20260422081420.4009847-1-songmuchun@bytedance.com> References: <20260422081420.4009847-1-songmuchun@bytedance.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" In section_activate(), if populate_section_memmap() fails, the error handling path calls section_deactivate() to roll back the state. This causes a vmemmap accounting imbalance. Since commit c3576889d87b ("mm: fix accounting of memmap pages"), memmap pages are accounted for only after populate_section_memmap() succeeds. However, the failure path unconditionally calls section_deactivate(), which decreases the vmemmap count. Consequently, a failure in populate_section_memmap() leads to an accounting underflow, incorrectly reducing the system's tracked vmemmap usage. 
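The imbalance described above can be reproduced with a small userspace model. This is only a sketch under stated assumptions: every model_* name is hypothetical, the counter is a plain long standing in for the kernel's memmap page counter, and the "new" flow mirrors the pairing that the fix described below introduces (account in populate, unaccount in depopulate).

```c
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's memmap page counter. */
static long model_memmap_pages;

/*
 * Old flow: accounting happens only after a successful populate, but the
 * rollback path unaccounts unconditionally.
 */
static bool model_activate_old(long nr, bool populate_fails)
{
	if (populate_fails) {
		/* Rollback subtracts pages that were never added. */
		model_memmap_pages -= nr;
		return false;
	}
	model_memmap_pages += nr;
	return true;
}

/* New flow: the add and the subtract live next to allocation and freeing. */
static void model_populate(long nr)
{
	model_memmap_pages += nr;
}

static void model_depopulate(long nr)
{
	model_memmap_pages -= nr;
}

static bool model_activate_new(long nr, bool populate_fails)
{
	model_populate(nr);
	if (populate_fails) {
		/* Rollback undoes exactly what populate accounted. */
		model_depopulate(nr);
		return false;
	}
	return true;
}
```

Starting from a counter of zero, a failed model_activate_old(512, true) leaves the counter at -512, while a failed model_activate_new(512, true) leaves it at 0.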
Fix this more thoroughly by moving all accounting calls into the lower level functions that actually perform the vmemmap allocation and freeing: - populate_section_memmap() accounts for newly allocated vmemmap pages - depopulate_section_memmap() unaccounts when vmemmap is freed This ensures proper accounting in all code paths, including error handling and early section cases. Fixes: c3576889d87b ("mm: fix accounting of memmap pages") Signed-off-by: Muchun Song Acked-by: Mike Rapoport (Microsoft) Acked-by: Oscar Salvador Acked-by: David Hildenbrand (Arm) --- mm/sparse-vmemmap.c | 20 ++++++++++++-------- 1 file changed, 12 insertions(+), 8 deletions(-) diff --git a/mm/sparse-vmemmap.c b/mm/sparse-vmemmap.c index 6eadb9d116e4..a7b11248b989 100644 --- a/mm/sparse-vmemmap.c +++ b/mm/sparse-vmemmap.c @@ -656,7 +656,12 @@ static struct page * __meminit populate_section_memmap= (unsigned long pfn, unsigned long nr_pages, int nid, struct vmem_altmap *altmap, struct dev_pagemap *pgmap) { - return __populate_section_memmap(pfn, nr_pages, nid, altmap, pgmap); + struct page *page =3D __populate_section_memmap(pfn, nr_pages, nid, altma= p, + pgmap); + + memmap_pages_add(DIV_ROUND_UP(nr_pages * sizeof(struct page), PAGE_SIZE)); + + return page; } =20 static void depopulate_section_memmap(unsigned long pfn, unsigned long nr_= pages, @@ -665,13 +670,17 @@ static void depopulate_section_memmap(unsigned long p= fn, unsigned long nr_pages, unsigned long start =3D (unsigned long) pfn_to_page(pfn); unsigned long end =3D start + nr_pages * sizeof(struct page); =20 + memmap_pages_add(-1L * (DIV_ROUND_UP(nr_pages * sizeof(struct page), PAGE= _SIZE))); vmemmap_free(start, end, altmap); } + static void free_map_bootmem(struct page *memmap) { unsigned long start =3D (unsigned long)memmap; unsigned long end =3D (unsigned long)(memmap + PAGES_PER_SECTION); =20 + memmap_boot_pages_add(-1L * (DIV_ROUND_UP(PAGES_PER_SECTION * sizeof(stru= ct page), + PAGE_SIZE))); vmemmap_free(start, end, NULL); } 
=20 @@ -774,14 +783,10 @@ static void section_deactivate(unsigned long pfn, uns= igned long nr_pages, * The memmap of early sections is always fully populated. See * section_activate() and pfn_valid() . */ - if (!section_is_early) { - memmap_pages_add(-1L * (DIV_ROUND_UP(nr_pages * sizeof(struct page), PAG= E_SIZE))); + if (!section_is_early) depopulate_section_memmap(pfn, nr_pages, altmap); - } else if (memmap) { - memmap_boot_pages_add(-1L * (DIV_ROUND_UP(nr_pages * sizeof(struct page), - PAGE_SIZE))); + else if (memmap) free_map_bootmem(memmap); - } =20 if (empty) ms->section_mem_map =3D (unsigned long)NULL; @@ -826,7 +831,6 @@ static struct page * __meminit section_activate(int nid= , unsigned long pfn, section_deactivate(pfn, nr_pages, altmap); return ERR_PTR(-ENOMEM); } - memmap_pages_add(DIV_ROUND_UP(nr_pages * sizeof(struct page), PAGE_SIZE)); =20 return memmap; } --=20 2.20.1 From nobody Wed Jun 17 02:53:04 2026 Received: from mail-pf1-f180.google.com (mail-pf1-f180.google.com [209.85.210.180]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 312983A963A for ; Wed, 22 Apr 2026 08:14:56 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.210.180 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1776845700; cv=none; b=l51hhOTvlvxA4CKxfmCu7IU/V/9BdMCGWI2HoE3VOMMWvSePwsWitkeu1jQ8TpwTT92MM7CfhyauGDWGbdRbqJpTcqsNgpyS7CQtDAsJ1kuHUcOdHU4zN+YPdJjxrSWVyYOAsDyLeBP6JGczPp/KbU3Rmn9sQd6wSV3/AahZCOg= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1776845700; c=relaxed/simple; bh=NLMUHE7iTif0SupRNqrXbfTYuVcDrfz7lFIGZor2tiA=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=bV//wqYQhHa9iOARKTLGuKg7/J0iEZD0wWHcKRda1Tx8J4L/Qofzd6CSTu/5aT6L8g0QY/OtwTF7M0r62J56kBS3mo+JHgXAT+MtNNDSMgiuZmHCfphFIsPDbq271iuaTEV11xKwhK3VwTfBZsFI+z9O2M3Ny2jLDFHcIaIz454= 
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=bytedance.com; spf=pass smtp.mailfrom=bytedance.com; dkim=pass (2048-bit key) header.d=bytedance.com header.i=@bytedance.com header.b=e3iPF43/; arc=none smtp.client-ip=209.85.210.180 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=bytedance.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=bytedance.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=bytedance.com header.i=@bytedance.com header.b="e3iPF43/" Received: by mail-pf1-f180.google.com with SMTP id d2e1a72fcca58-82fa8d6425bso1608527b3a.0 for ; Wed, 22 Apr 2026 01:14:56 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1776845696; x=1777450496; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=N9jQcBqjOBMyMvaKc/0bmbd5llByfHuxAmeKteQfaMc=; b=e3iPF43/hpueGmGCdjC1UN3trEz9N3ysQJdQ9mroXJq7WOTnqjpGoUjI3xT/tCvTdx MqF180Fsik+grS/P6beyaXKuixQbcywzkXbL+hxzezhPkl+y7JzR+nqeR2FMFBq1bn3K +zJzNs2tjVdtpH8xw/pOAVBM1xZyt5W+IKzIWmUpSiDdLs4CdEQ2x7sjrMJoIES86er1 bxYZkQsqdLBFg7asUrPzilWjP/DrP8IbUtareu7nbQZ05u1YxJAc+3fJrfpxjHn2mt8c 2RwJk5KD1fQab9ku3HJ/61ThlHAWdjrq3GD1ltQ+CSFB3zYL4EBA2XpeVsqqtbBsWEeL 3r7g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1776845696; x=1777450496; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=N9jQcBqjOBMyMvaKc/0bmbd5llByfHuxAmeKteQfaMc=; b=E9Udub6/DNgo9J4IopGfKyJA1SqOkkj8nMpxVLH8xrCAKexl95r7rTAx7n8mW8X06n VGbUYdGgZkGzj6phRouSo8N41toALXj1dGrkMLMp7t88WvKOsTwieqI8ePmGPDuxoTMM uk2gNU2qQ4WGSgqTqYPwn7Xy6NRWahH+qL7oNfkfc/zwsbXI1h7kjGoTx5tpex3CrDLb 
Y6xRIN2bRgSVvvkQq/h85KjR/uzE7dCaq7zBaGmu2VNWiQMobqLdZ/tN4WAIdAOKLyzW WVeDsTLAIj098pSigGCkt19vN4IjgeJEM7p8hPH5mZX6f0kKGFZqtXGYm7+eTe8lCIw5 7d7A== X-Forwarded-Encrypted: i=1; AFNElJ/qwKT6jt4iti17zW+fmwDBLqpKXRTn3GazSgv3gHcr7QcLYAjbX756+JLYpt/95kWbfGfIsqPQApZfei8=@vger.kernel.org X-Gm-Message-State: AOJu0YwTalFeRzm7XAnUUB2D6qUcn+35RFSXCzCSAXEf0svhzTnBUGgq BVlYd1ADW5sKbrTykiAUKM6eczICfgYKQyI+sFJBoAkPEMW5opT/UF2vAYbQXfpLrbw= X-Gm-Gg: AeBDies6nisjFfPBLSQo6Pvb7X8GhpWI52SDz5FM23YLoGsVuW7Zmhb3naMGZGO7Yft 26sshYywlVhu57mMUrzwJvlmBRyFcnz3Zq7kYJggRb4+g5L5MCJ95405Zo5fz7woPaRTc2Ee+Xl BmrvxbgTFtX329F02jdKjd3FYgNanvBiXNkbH2zlrrRkcHZhw/gqnBuaHC+fkpzz62RdHE1kfGv zkkp6MNrNGTNg0voHST0N2wb2AGHkyVqcMdNt+7cg3Az1Lu+Vbxg7xKkR4Pd4uGxFXdphNMYjRf 3wG9B7S0Qn24hT8Z6p+tPJjcZL00lnYJYmYEgWavzrl2F8PaO44KL7G+wKoJ2wKFPaGfVzjnnVo pdtqYnXmzThKElgzHn1D1kNtkKyOBTu48AjC8IW2Up5PJD8LAHYqwte+0XB/zqngLaKPQksBbA2 243SRi8kUFaKsnYO2VKbcZi49lqY0D3cD2fwpYVbo= X-Received: by 2002:a05:6a00:2d1f:b0:82f:bd8:70eb with SMTP id d2e1a72fcca58-82f8c876febmr21354185b3a.21.1776845695323; Wed, 22 Apr 2026 01:14:55 -0700 (PDT) Received: from n232-176-004.byted.org ([240e:83:200::34f]) by smtp.gmail.com with ESMTPSA id d2e1a72fcca58-82f8ec0307esm16522874b3a.53.2026.04.22.01.14.50 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 22 Apr 2026 01:14:54 -0700 (PDT) From: Muchun Song To: Andrew Morton , David Hildenbrand , Muchun Song , Oscar Salvador , Michael Ellerman , Madhavan Srinivasan Cc: Muchun Song , Lorenzo Stoakes , "Liam R . 
Howlett" , Vlastimil Babka , Mike Rapoport , Suren Baghdasaryan , Michal Hocko , Nicholas Piggin , Christophe Leroy , aneesh.kumar@linux.ibm.com, joao.m.martins@oracle.com, linux-mm@kvack.org, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org Subject: [PATCH v4 2/5] mm/sparse-vmemmap: Pass @pgmap argument to memory deactivation paths Date: Wed, 22 Apr 2026 16:14:17 +0800 Message-Id: <20260422081420.4009847-3-songmuchun@bytedance.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20260422081420.4009847-1-songmuchun@bytedance.com> References: <20260422081420.4009847-1-songmuchun@bytedance.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Currently, the memory hot-remove call chain -- arch_remove_memory(), __remove_pages(), sparse_remove_section() and section_deactivate() -- does not carry the struct dev_pagemap pointer. This prevents the lower levels from knowing whether the section was originally populated with vmemmap optimizations (e.g., DAX with vmemmap optimization enabled). Without this information, we cannot call vmemmap_can_optimize() to determine if the vmemmap pages were optimized. As a result, the vmemmap page accounting during teardown will mistakenly assume a non-optimized allocation, leading to incorrect memmap statistics. To lay the groundwork for fixing the vmemmap page accounting, we need to pass the @pgmap pointer down to the deactivation location. Plumb the @pgmap argument through the APIs of arch_remove_memory(), __remove_pages() and sparse_remove_section(), mirroring the corresponding *_activate() paths. 
Signed-off-by: Muchun Song Acked-by: Mike Rapoport (Microsoft) Reviewed-by: Oscar Salvador Acked-by: David Hildenbrand (Arm) --- arch/arm64/mm/mmu.c | 5 +++-- arch/loongarch/mm/init.c | 5 +++-- arch/powerpc/mm/mem.c | 5 +++-- arch/riscv/mm/init.c | 5 +++-- arch/s390/mm/init.c | 5 +++-- arch/x86/mm/init_64.c | 5 +++-- include/linux/memory_hotplug.h | 8 +++++--- mm/memory_hotplug.c | 13 +++++++------ mm/memremap.c | 4 ++-- mm/sparse-vmemmap.c | 12 ++++++------ 10 files changed, 38 insertions(+), 29 deletions(-) diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c index dd85e093ffdb..e5a42b7a0160 100644 --- a/arch/arm64/mm/mmu.c +++ b/arch/arm64/mm/mmu.c @@ -2024,12 +2024,13 @@ int arch_add_memory(int nid, u64 start, u64 size, return ret; } =20 -void arch_remove_memory(u64 start, u64 size, struct vmem_altmap *altmap) +void arch_remove_memory(u64 start, u64 size, struct vmem_altmap *altmap, + struct dev_pagemap *pgmap) { unsigned long start_pfn =3D start >> PAGE_SHIFT; unsigned long nr_pages =3D size >> PAGE_SHIFT; =20 - __remove_pages(start_pfn, nr_pages, altmap); + __remove_pages(start_pfn, nr_pages, altmap, pgmap); __remove_pgd_mapping(swapper_pg_dir, __phys_to_virt(start), size); } =20 diff --git a/arch/loongarch/mm/init.c b/arch/loongarch/mm/init.c index 00f3822b6e47..c9c57f08fa2c 100644 --- a/arch/loongarch/mm/init.c +++ b/arch/loongarch/mm/init.c @@ -86,7 +86,8 @@ int arch_add_memory(int nid, u64 start, u64 size, struct = mhp_params *params) return ret; } =20 -void arch_remove_memory(u64 start, u64 size, struct vmem_altmap *altmap) +void arch_remove_memory(u64 start, u64 size, struct vmem_altmap *altmap, + struct dev_pagemap *pgmap) { unsigned long start_pfn =3D start >> PAGE_SHIFT; unsigned long nr_pages =3D size >> PAGE_SHIFT; @@ -95,7 +96,7 @@ void arch_remove_memory(u64 start, u64 size, struct vmem_= altmap *altmap) /* With altmap the first mapped page is offset from @start */ if (altmap) page +=3D vmem_altmap_offset(altmap); - __remove_pages(start_pfn, 
nr_pages, altmap); + __remove_pages(start_pfn, nr_pages, altmap, pgmap); } #endif =20 diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c index 648d0c5602ec..4c1afab91996 100644 --- a/arch/powerpc/mm/mem.c +++ b/arch/powerpc/mm/mem.c @@ -158,12 +158,13 @@ int __ref arch_add_memory(int nid, u64 start, u64 siz= e, return rc; } =20 -void __ref arch_remove_memory(u64 start, u64 size, struct vmem_altmap *alt= map) +void __ref arch_remove_memory(u64 start, u64 size, struct vmem_altmap *alt= map, + struct dev_pagemap *pgmap) { unsigned long start_pfn =3D start >> PAGE_SHIFT; unsigned long nr_pages =3D size >> PAGE_SHIFT; =20 - __remove_pages(start_pfn, nr_pages, altmap); + __remove_pages(start_pfn, nr_pages, altmap, pgmap); arch_remove_linear_mapping(start, size); } #endif diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c index decd7df40fa4..b0092fb842a3 100644 --- a/arch/riscv/mm/init.c +++ b/arch/riscv/mm/init.c @@ -1717,9 +1717,10 @@ int __ref arch_add_memory(int nid, u64 start, u64 si= ze, struct mhp_params *param return ret; } =20 -void __ref arch_remove_memory(u64 start, u64 size, struct vmem_altmap *alt= map) +void __ref arch_remove_memory(u64 start, u64 size, struct vmem_altmap *alt= map, + struct dev_pagemap *pgmap) { - __remove_pages(start >> PAGE_SHIFT, size >> PAGE_SHIFT, altmap); + __remove_pages(start >> PAGE_SHIFT, size >> PAGE_SHIFT, altmap, pgmap); remove_linear_mapping(start, size); flush_tlb_all(); } diff --git a/arch/s390/mm/init.c b/arch/s390/mm/init.c index 1f72efc2a579..11a689423440 100644 --- a/arch/s390/mm/init.c +++ b/arch/s390/mm/init.c @@ -276,12 +276,13 @@ int arch_add_memory(int nid, u64 start, u64 size, return rc; } =20 -void arch_remove_memory(u64 start, u64 size, struct vmem_altmap *altmap) +void arch_remove_memory(u64 start, u64 size, struct vmem_altmap *altmap, + struct dev_pagemap *pgmap) { unsigned long start_pfn =3D start >> PAGE_SHIFT; unsigned long nr_pages =3D size >> PAGE_SHIFT; =20 - __remove_pages(start_pfn, 
nr_pages, altmap); + __remove_pages(start_pfn, nr_pages, altmap, pgmap); vmem_remove_mapping(start, size); } #endif /* CONFIG_MEMORY_HOTPLUG */ diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c index df2261fa4f98..77b889b71cf3 100644 --- a/arch/x86/mm/init_64.c +++ b/arch/x86/mm/init_64.c @@ -1288,12 +1288,13 @@ kernel_physical_mapping_remove(unsigned long start,= unsigned long end) remove_pagetable(start, end, true, NULL); } =20 -void __ref arch_remove_memory(u64 start, u64 size, struct vmem_altmap *alt= map) +void __ref arch_remove_memory(u64 start, u64 size, struct vmem_altmap *alt= map, + struct dev_pagemap *pgmap) { unsigned long start_pfn =3D start >> PAGE_SHIFT; unsigned long nr_pages =3D size >> PAGE_SHIFT; =20 - __remove_pages(start_pfn, nr_pages, altmap); + __remove_pages(start_pfn, nr_pages, altmap, pgmap); kernel_physical_mapping_remove(start, start + size); } #endif /* CONFIG_MEMORY_HOTPLUG */ diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h index 815e908c4135..7c9d66729c60 100644 --- a/include/linux/memory_hotplug.h +++ b/include/linux/memory_hotplug.h @@ -135,9 +135,10 @@ static inline bool movable_node_is_enabled(void) return movable_node_enabled; } =20 -extern void arch_remove_memory(u64 start, u64 size, struct vmem_altmap *al= tmap); +extern void arch_remove_memory(u64 start, u64 size, struct vmem_altmap *al= tmap, + struct dev_pagemap *pgmap); extern void __remove_pages(unsigned long start_pfn, unsigned long nr_pages, - struct vmem_altmap *altmap); + struct vmem_altmap *altmap, struct dev_pagemap *pgmap); =20 /* reasonably generic interface to expand the physical pages */ extern int __add_pages(int nid, unsigned long start_pfn, unsigned long nr_= pages, @@ -307,7 +308,8 @@ extern int sparse_add_section(int nid, unsigned long pf= n, unsigned long nr_pages, struct vmem_altmap *altmap, struct dev_pagemap *pgmap); extern void sparse_remove_section(unsigned long pfn, unsigned long nr_page= s, - struct vmem_altmap 
*altmap); + struct vmem_altmap *altmap, + struct dev_pagemap *pgmap); extern struct zone *zone_for_pfn_range(enum mmop online_type, int nid, struct memory_group *group, unsigned long start_pfn, unsigned long nr_pages); diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c index 2a943ec57c85..ec1a86b12477 100644 --- a/mm/memory_hotplug.c +++ b/mm/memory_hotplug.c @@ -576,6 +576,7 @@ void remove_pfn_range_from_zone(struct zone *zone, * @pfn: starting pageframe (must be aligned to start of a section) * @nr_pages: number of pages to remove (must be multiple of section size) * @altmap: alternative device page map or %NULL if default memmap is used + * @pgmap: device page map or %NULL if not ZONE_DEVICE * * Generic helper function to remove section mappings and sysfs entries * for the section of the memory we are removing. Caller needs to make @@ -583,7 +584,7 @@ void remove_pfn_range_from_zone(struct zone *zone, * calling offline_pages(). */ void __remove_pages(unsigned long pfn, unsigned long nr_pages, - struct vmem_altmap *altmap) + struct vmem_altmap *altmap, struct dev_pagemap *pgmap) { const unsigned long end_pfn =3D pfn + nr_pages; unsigned long cur_nr_pages; @@ -598,7 +599,7 @@ void __remove_pages(unsigned long pfn, unsigned long nr= _pages, /* Select all remaining pages up to the next section boundary */ cur_nr_pages =3D min(end_pfn - pfn, SECTION_ALIGN_UP(pfn + 1) - pfn); - sparse_remove_section(pfn, cur_nr_pages, altmap); + sparse_remove_section(pfn, cur_nr_pages, altmap, pgmap); } } =20 @@ -1425,7 +1426,7 @@ static void remove_memory_blocks_and_altmaps(u64 star= t, u64 size) =20 remove_memory_block_devices(cur_start, memblock_size); =20 - arch_remove_memory(cur_start, memblock_size, altmap); + arch_remove_memory(cur_start, memblock_size, altmap, NULL); =20 /* Verify that all vmemmap pages have actually been freed. 
*/ WARN(altmap->alloc, "Altmap not fully unmapped"); @@ -1468,7 +1469,7 @@ static int create_altmaps_and_memory_blocks(int nid, = struct memory_group *group, ret =3D create_memory_block_devices(cur_start, memblock_size, nid, params.altmap, group); if (ret) { - arch_remove_memory(cur_start, memblock_size, NULL); + arch_remove_memory(cur_start, memblock_size, NULL, NULL); kfree(params.altmap); goto out; } @@ -1554,7 +1555,7 @@ int add_memory_resource(int nid, struct resource *res= , mhp_t mhp_flags) /* create memory block devices after memory was added */ ret =3D create_memory_block_devices(start, size, nid, NULL, group); if (ret) { - arch_remove_memory(start, size, params.altmap); + arch_remove_memory(start, size, params.altmap, NULL); goto error; } } @@ -2266,7 +2267,7 @@ static int try_remove_memory(u64 start, u64 size) * No altmaps present, do the removal directly */ remove_memory_block_devices(start, size); - arch_remove_memory(start, size, NULL); + arch_remove_memory(start, size, NULL, NULL); } else { /* all memblocks in the range have altmaps */ remove_memory_blocks_and_altmaps(start, size); diff --git a/mm/memremap.c b/mm/memremap.c index 053842d45cb1..81766d822400 100644 --- a/mm/memremap.c +++ b/mm/memremap.c @@ -97,10 +97,10 @@ static void pageunmap_range(struct dev_pagemap *pgmap, = int range_id) PHYS_PFN(range_len(range))); if (pgmap->type =3D=3D MEMORY_DEVICE_PRIVATE) { __remove_pages(PHYS_PFN(range->start), - PHYS_PFN(range_len(range)), NULL); + PHYS_PFN(range_len(range)), NULL, pgmap); } else { arch_remove_memory(range->start, range_len(range), - pgmap_altmap(pgmap)); + pgmap_altmap(pgmap), pgmap); kasan_remove_zero_shadow(__va(range->start), range_len(range)); } mem_hotplug_done(); diff --git a/mm/sparse-vmemmap.c b/mm/sparse-vmemmap.c index a7b11248b989..c208187a4b00 100644 --- a/mm/sparse-vmemmap.c +++ b/mm/sparse-vmemmap.c @@ -665,7 +665,7 @@ static struct page * __meminit populate_section_memmap(= unsigned long pfn, } =20 static void 
depopulate_section_memmap(unsigned long pfn, unsigned long nr_= pages, - struct vmem_altmap *altmap) + struct vmem_altmap *altmap, struct dev_pagemap *pgmap) { unsigned long start =3D (unsigned long) pfn_to_page(pfn); unsigned long end =3D start + nr_pages * sizeof(struct page); @@ -746,7 +746,7 @@ static int fill_subsection_map(unsigned long pfn, unsig= ned long nr_pages) * usage map, but still need to free the vmemmap range. */ static void section_deactivate(unsigned long pfn, unsigned long nr_pages, - struct vmem_altmap *altmap) + struct vmem_altmap *altmap, struct dev_pagemap *pgmap) { struct mem_section *ms =3D __pfn_to_section(pfn); bool section_is_early =3D early_section(ms); @@ -784,7 +784,7 @@ static void section_deactivate(unsigned long pfn, unsig= ned long nr_pages, * section_activate() and pfn_valid() . */ if (!section_is_early) - depopulate_section_memmap(pfn, nr_pages, altmap); + depopulate_section_memmap(pfn, nr_pages, altmap, pgmap); else if (memmap) free_map_bootmem(memmap); =20 @@ -828,7 +828,7 @@ static struct page * __meminit section_activate(int nid= , unsigned long pfn, =20 memmap =3D populate_section_memmap(pfn, nr_pages, nid, altmap, pgmap); if (!memmap) { - section_deactivate(pfn, nr_pages, altmap); + section_deactivate(pfn, nr_pages, altmap, pgmap); return ERR_PTR(-ENOMEM); } =20 @@ -889,13 +889,13 @@ int __meminit sparse_add_section(int nid, unsigned lo= ng start_pfn, } =20 void sparse_remove_section(unsigned long pfn, unsigned long nr_pages, - struct vmem_altmap *altmap) + struct vmem_altmap *altmap, struct dev_pagemap *pgmap) { struct mem_section *ms =3D __pfn_to_section(pfn); =20 if (WARN_ON_ONCE(!valid_section(ms))) return; =20 - section_deactivate(pfn, nr_pages, altmap); + section_deactivate(pfn, nr_pages, altmap, pgmap); } #endif /* CONFIG_MEMORY_HOTPLUG */ --=20 2.20.1 From nobody Wed Jun 17 02:53:04 2026 Received: from mail-pf1-f175.google.com (mail-pf1-f175.google.com [209.85.210.175]) (using TLSv1.2 with cipher 
ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id C7D553AC0CD for ; Wed, 22 Apr 2026 08:15:01 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.210.175 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1776845705; cv=none; b=Cct09rpby90qDG874N3hy3g+6JPG0F+vjxZcwjiQq9s5T5N5PQCJNCIzK2gsnpUPljv1Y2jwt1TaTzu9e/DrvyWoVDuU0IVDm4Y43iig9bePlSrjQrwxjxbktbf0q+pWqp1peZlAJifvMTJOzIp4vnv/nbbL4PIxcp8PcGoQOto= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1776845705; c=relaxed/simple; bh=feyEf9apG7rn+SlAG9vJ81wSJr7VCvouu8IzG6Ch/DA=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=LzyeDd8ie5TXkjL+MdAHHxJQ+/1OeoiGdM2Q+8450BSsKHqu4JtfHxyAijl22IRHHIwUQ7H7LnftkDTg0Sv4N3XPDjxb3WhavwTA0FITQVvANpHbWya88//wf5rI4b2XLenY+UnJDjblZCiVlMNvGIRgVvwMsupO6qaOitEeSCk= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=bytedance.com; spf=pass smtp.mailfrom=bytedance.com; dkim=pass (2048-bit key) header.d=bytedance.com header.i=@bytedance.com header.b=cAc5czAx; arc=none smtp.client-ip=209.85.210.175 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=bytedance.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=bytedance.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=bytedance.com header.i=@bytedance.com header.b="cAc5czAx" Received: by mail-pf1-f175.google.com with SMTP id d2e1a72fcca58-82cf636dac8so2193650b3a.3 for ; Wed, 22 Apr 2026 01:15:01 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1776845701; x=1777450501; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; 
bh=guNzfHlKCo9CNY+Bza9mDTIoReNzZ+M+FnR7lRKuPVg=; b=cAc5czAxHuI+MF3YYQ/p1X+4p2RSzf9Ro5fZidn/bSPgjoFqT8u0XOsvrfichXOeb0 uH2Ks060/GNfx98TfgWkU2NfUhNUUMcGZ3yMsOFDrRcSwZGenX2I3aKO2/r2vn0n8Fmm 62h/90yaoIhRLpPGg6GrJI5GNnAvQIojX8/WPtihhr7yXEq4uDofb4R304nzRXNf7++a lUvSCK+wkBZYlRYUVBC0GveJk4FgQBUno8qH0aDLUqG2/4Xj6LSUUReC7Inyd/G0uV1j 9dllaDMHmen6LucbvGE6YorXUvZmGQ5d2sjDfrT/dcwxNaGQaBiUNJSg6Ien5Fpd/KxJ Y6lg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1776845701; x=1777450501; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=guNzfHlKCo9CNY+Bza9mDTIoReNzZ+M+FnR7lRKuPVg=; b=Zpo1YB8IL4cUuNX4kYz5kCWQmBcLf4Y0VkBbNMEqPkbh0+rcuWjAsxxBjlIj2AGQY2 Nh8vFOMRLSEa2USb9q9wHiaUGkqIcCXzZiw8zqstZ5Mnq/FluEcBZR6p+yTqrQagpeIl MOjbaCnYXegC99gnFeNbs0oBbgkoLAEwRblrYq1eLfdKpAc4ZonYrsk1bGWmirXpcDrm NHmx3Uz7eiQQIZkaWilrKWPP/lkilUafj9BhtDPZGvOuQayAwc45DXOI4FyFEwyLkrB2 tl/DNzNwMjE/RUCvc3e1YoumWxvsGpgDAYu6ufFkmZ0nRvmzv49mObO5M5IiHONJqg0z nmmg== X-Forwarded-Encrypted: i=1; AFNElJ82ZAVLN6ePX+vhprE56ULrYQRZ4EXjkk/GGeb50T4YMxC6CSyM6jzJ4qieiiNhOAeDcUr6OsScQ+krqVk=@vger.kernel.org X-Gm-Message-State: AOJu0YzRwnCWuj4SaWGgz36wfzrtUTPzIiI+2fyDhD4blXxp6c7C+S8L 6GAgwp/0UuOGWAstForOwnaxU/RstOqcsGn3KFBUHJHwT1TVLc70l7nwKnxop8suPYM= X-Gm-Gg: AeBDietJxB7yPwsZzDbspccWuJ0wN10UEGTy825iu4JWQr6CwyDt/iItiNZDU3xjVxs JeI3z7f4+bvqs4tRX20H9ucEAnSZ9LkyirBCKXBFSOq7lfuxZH+Og00WJBe9MRxEFLzjF1sTOf/ G08/V0wftCjC2l3vqQ4/WqX810kth3ldbmn3nqhTbjds3GR369G/X57AccjBv1BgI622xzkuhBa lQliB+CUd39J7MlvVXsiegcXeU5t55pvaSQZTM4DhrTXk52OnOwXnIagAecq+Il3TiaKmHRGwak SZRXSkllVmBjURcokTulrc+F/hpJ6ZU3afnFbxXQAKMGxAT9luTjlDaSWAay+x5c8CG5CaVtOnw gpjvULD8nj3vujAmKRAuU7Y+B/IeL9comIoZagYKmmSId1QW89ts8hsn9tjql7/wpMkVrdmI/pq IqkVIg5syzQujpiTXN5KCam27DC404 X-Received: by 2002:a05:6a00:3397:b0:82f:24e:6a50 with SMTP id d2e1a72fcca58-82f8c830359mr23157259b3a.10.1776845700663; Wed, 22 Apr 2026 
01:15:00 -0700 (PDT) Received: from n232-176-004.byted.org ([240e:83:200::34f]) by smtp.gmail.com with ESMTPSA id d2e1a72fcca58-82f8ec0307esm16522874b3a.53.2026.04.22.01.14.55 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 22 Apr 2026 01:15:00 -0700 (PDT) From: Muchun Song To: Andrew Morton , David Hildenbrand , Muchun Song , Oscar Salvador , Michael Ellerman , Madhavan Srinivasan Cc: Muchun Song , Lorenzo Stoakes , "Liam R . Howlett" , Vlastimil Babka , Mike Rapoport , Suren Baghdasaryan , Michal Hocko , Nicholas Piggin , Christophe Leroy , aneesh.kumar@linux.ibm.com, joao.m.martins@oracle.com, linux-mm@kvack.org, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org Subject: [PATCH v4 3/5] mm/sparse-vmemmap: Fix DAX vmemmap accounting with optimization Date: Wed, 22 Apr 2026 16:14:18 +0800 Message-Id: <20260422081420.4009847-4-songmuchun@bytedance.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20260422081420.4009847-1-songmuchun@bytedance.com> References: <20260422081420.4009847-1-songmuchun@bytedance.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" When vmemmap optimization is enabled for DAX, the nr_memmap_pages counter in /proc/vmstat is incorrect. The current code always accounts for the full, non-optimized vmemmap size, but vmemmap optimization reduces the actual number of vmemmap pages by reusing tail pages. This causes the system to overcount vmemmap usage, leading to inaccurate page statistics in /proc/vmstat. Fix this by introducing section_vmemmap_pages(), which returns the exact vmemmap page count for a given pfn range based on whether optimization is in effect. 
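As a rough illustration of the counts involved, the calculation can be modelled in userspace. This is a sketch, not the kernel code: the MODEL_* constants are assumptions (4 KiB pages, a 64-byte struct page, 32768 pages per section, two reserved vmemmap pages per compound page), the can_optimize flag stands in for the result of vmemmap_can_optimize(), and order stands in for pgmap->vmemmap_shift.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed values for illustration only; they are not taken from the patch. */
#define MODEL_PAGE_SIZE          4096UL
#define MODEL_STRUCT_PAGE_SIZE   64UL
#define MODEL_PFN_SECTION_SHIFT  15
#define MODEL_PAGES_PER_SECTION  (1UL << MODEL_PFN_SECTION_SHIFT)
#define MODEL_VMEMMAP_RESERVE_NR 2UL

#define MODEL_DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/*
 * Number of vmemmap pages backing [pfn, pfn + nr_pages), which must not
 * cross a section boundary.
 */
static long model_section_vmemmap_pages(unsigned long pfn,
					unsigned long nr_pages,
					unsigned int order,
					bool can_optimize)
{
	unsigned long pages_per_compound = 1UL << order;

	assert(nr_pages > 0);
	assert(pfn / MODEL_PAGES_PER_SECTION ==
	       (pfn + nr_pages - 1) / MODEL_PAGES_PER_SECTION);

	/* No optimization: one struct page per pfn, rounded up to pages. */
	if (!can_optimize)
		return MODEL_DIV_ROUND_UP(nr_pages * MODEL_STRUCT_PAGE_SIZE,
					  MODEL_PAGE_SIZE);

	/* Compound pages smaller than a section: reserved pages for each. */
	if (order < MODEL_PFN_SECTION_SHIFT)
		return MODEL_VMEMMAP_RESERVE_NR * nr_pages / pages_per_compound;

	/* Larger compound pages: only the section holding the head pays. */
	if (pfn % pages_per_compound == 0)
		return MODEL_VMEMMAP_RESERVE_NR;

	return 0;
}
```

Under these assumptions a full section costs 512 vmemmap pages without optimization and 128 with order-9 compound pages. With order-18 compound pages, the section at pfn 0 costs 2 and the following section at pfn 32768 costs 0.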
Fixes: 15995a352474 ("mm: report per-page metadata information")
Signed-off-by: Muchun Song
Acked-by: Mike Rapoport (Microsoft)
Acked-by: Oscar Salvador
---
 mm/sparse-vmemmap.c | 32 ++++++++++++++++++++++++++++----
 1 file changed, 28 insertions(+), 4 deletions(-)

diff --git a/mm/sparse-vmemmap.c b/mm/sparse-vmemmap.c
index c208187a4b00..fcc5e0eda9e7 100644
--- a/mm/sparse-vmemmap.c
+++ b/mm/sparse-vmemmap.c
@@ -652,6 +652,29 @@ void offline_mem_sections(unsigned long start_pfn, unsigned long end_pfn)
 	}
 }
 
+static int __meminit section_vmemmap_pages(unsigned long pfn, unsigned long nr_pages,
+					   struct vmem_altmap *altmap,
+					   struct dev_pagemap *pgmap)
+{
+	unsigned int order = pgmap ? pgmap->vmemmap_shift : 0;
+	unsigned long pages_per_compound = 1L << order;
+
+	VM_WARN_ON_ONCE(!IS_ALIGNED(pfn | nr_pages, min(pages_per_compound,
+							PAGES_PER_SECTION)));
+	VM_WARN_ON_ONCE(pfn_to_section_nr(pfn) != pfn_to_section_nr(pfn + nr_pages - 1));
+
+	if (!vmemmap_can_optimize(altmap, pgmap))
+		return DIV_ROUND_UP(nr_pages * sizeof(struct page), PAGE_SIZE);
+
+	if (order < PFN_SECTION_SHIFT)
+		return VMEMMAP_RESERVE_NR * nr_pages / pages_per_compound;
+
+	if (IS_ALIGNED(pfn, pages_per_compound))
+		return VMEMMAP_RESERVE_NR;
+
+	return 0;
+}
+
 static struct page * __meminit populate_section_memmap(unsigned long pfn,
 		unsigned long nr_pages, int nid, struct vmem_altmap *altmap,
 		struct dev_pagemap *pgmap)
@@ -659,7 +682,7 @@ static struct page * __meminit populate_section_memmap(unsigned long pfn,
 	struct page *page = __populate_section_memmap(pfn, nr_pages, nid, altmap,
 						       pgmap);
 
-	memmap_pages_add(DIV_ROUND_UP(nr_pages * sizeof(struct page), PAGE_SIZE));
+	memmap_pages_add(section_vmemmap_pages(pfn, nr_pages, altmap, pgmap));
 
 	return page;
 }
@@ -670,7 +693,7 @@ static void depopulate_section_memmap(unsigned long pfn, unsigned long nr_pages,
 	unsigned long start = (unsigned long) pfn_to_page(pfn);
 	unsigned long end = start + nr_pages * sizeof(struct page);
 
-	memmap_pages_add(-1L * (DIV_ROUND_UP(nr_pages * sizeof(struct page), PAGE_SIZE)));
+	memmap_pages_add(-section_vmemmap_pages(pfn, nr_pages, altmap, pgmap));
 	vmemmap_free(start, end, altmap);
 }
 
@@ -678,9 +701,10 @@ static void free_map_bootmem(struct page *memmap)
 {
 	unsigned long start = (unsigned long)memmap;
 	unsigned long end = (unsigned long)(memmap + PAGES_PER_SECTION);
+	unsigned long pfn = page_to_pfn(memmap);
 
-	memmap_boot_pages_add(-1L * (DIV_ROUND_UP(PAGES_PER_SECTION * sizeof(struct page),
-						  PAGE_SIZE)));
+	memmap_boot_pages_add(-section_vmemmap_pages(pfn, PAGES_PER_SECTION,
+						     NULL, NULL));
 	vmemmap_free(start, end, NULL);
 }
 
-- 
2.20.1

From nobody Wed Jun 17 02:53:04 2026
From: Muchun Song
To: Andrew Morton, David Hildenbrand, Muchun Song, Oscar Salvador,
	Michael Ellerman, Madhavan Srinivasan
Cc: Muchun Song, Lorenzo Stoakes, "Liam R . Howlett", Vlastimil Babka,
	Mike Rapoport, Suren Baghdasaryan, Michal Hocko, Nicholas Piggin,
	Christophe Leroy, aneesh.kumar@linux.ibm.com,
	joao.m.martins@oracle.com, linux-mm@kvack.org,
	linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org
Subject: [PATCH v4 4/5] mm/mm_init: Fix pageblock migratetype for ZONE_DEVICE compound pages
Date: Wed, 22 Apr 2026 16:14:19 +0800
Message-Id: <20260422081420.4009847-5-songmuchun@bytedance.com>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20260422081420.4009847-1-songmuchun@bytedance.com>
References: <20260422081420.4009847-1-songmuchun@bytedance.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

The memmap_init_zone_device() function only initializes the migratetype
of the first pageblock of a compound page. If the compound page size
exceeds pageblock_nr_pages (e.g., 1GB hugepages with 2MB pageblocks),
subsequent pageblocks in the compound page remain uninitialized.

Move the migratetype initialization out of __init_zone_device_page()
and into a separate pageblock_migratetype_init_range() function. This
iterates over the entire PFN range of the memory, ensuring that all
pageblocks are correctly initialized.

Fixes: c4386bd8ee3a ("mm/memremap: add ZONE_DEVICE support for compound pages")
Signed-off-by: Muchun Song
Reviewed-by: Mike Rapoport (Microsoft)
Reviewed-by: Oscar Salvador
---
 mm/mm_init.c | 45 ++++++++++++++++++++++++++++++---------------
 1 file changed, 30 insertions(+), 15 deletions(-)

diff --git a/mm/mm_init.c b/mm/mm_init.c
index f9f8e1af921c..9d0fe79a94de 100644
--- a/mm/mm_init.c
+++ b/mm/mm_init.c
@@ -674,6 +674,21 @@ static inline void fixup_hashdist(void)
 static inline void fixup_hashdist(void) {}
 #endif /* CONFIG_NUMA */
 
+#ifdef CONFIG_ZONE_DEVICE
+static __meminit void pageblock_migratetype_init_range(unsigned long pfn,
+							unsigned long nr_pages,
+							int migratetype)
+{
+	unsigned long end = pfn + nr_pages;
+
+	for (pfn = pageblock_align(pfn); pfn < end; pfn += pageblock_nr_pages) {
+		init_pageblock_migratetype(pfn_to_page(pfn), migratetype, false);
+		if (IS_ALIGNED(pfn, PAGES_PER_SECTION))
+			cond_resched();
+	}
+}
+#endif
+
 /*
  * Initialize a reserved page unconditionally, finding its zone first.
  */
@@ -1011,21 +1026,6 @@ static void __ref __init_zone_device_page(struct page *page, unsigned long pfn,
 	page_folio(page)->pgmap = pgmap;
 	page->zone_device_data = NULL;
 
-	/*
-	 * Mark the block movable so that blocks are reserved for
-	 * movable at startup. This will force kernel allocations
-	 * to reserve their blocks rather than leaking throughout
-	 * the address space during boot when many long-lived
-	 * kernel allocations are made.
-	 *
-	 * Please note that MEMINIT_HOTPLUG path doesn't clear memmap
-	 * because this is done early in section_activate()
-	 */
-	if (pageblock_aligned(pfn)) {
-		init_pageblock_migratetype(page, MIGRATE_MOVABLE, false);
-		cond_resched();
-	}
-
 	/*
 	 * ZONE_DEVICE pages other than MEMORY_TYPE_GENERIC are released
 	 * directly to the driver page allocator which will set the page count
@@ -1122,6 +1122,9 @@ void __ref memmap_init_zone_device(struct zone *zone,
 
 		__init_zone_device_page(page, pfn, zone_idx, nid, pgmap);
 
+		if (IS_ALIGNED(pfn, PAGES_PER_SECTION))
+			cond_resched();
+
 		if (pfns_per_compound == 1)
 			continue;
 
@@ -1129,6 +1132,18 @@ void __ref memmap_init_zone_device(struct zone *zone,
 				     compound_nr_pages(altmap, pgmap));
 	}
 
+	/*
+	 * Mark the block movable so that blocks are reserved for
+	 * movable at startup. This will force kernel allocations
+	 * to reserve their blocks rather than leaking throughout
+	 * the address space during boot when many long-lived
+	 * kernel allocations are made.
+	 *
+	 * Please note that MEMINIT_HOTPLUG path doesn't clear memmap
+	 * because this is done early in section_activate()
+	 */
+	pageblock_migratetype_init_range(start_pfn, nr_pages, MIGRATE_MOVABLE);
+
 	pr_debug("%s initialised %lu pages in %ums\n", __func__,
 		nr_pages, jiffies_to_msecs(jiffies - start));
 }
-- 
2.20.1

From nobody Wed Jun 17 02:53:04 2026
From: Muchun Song
To: Andrew Morton, David Hildenbrand, Muchun Song, Oscar Salvador,
	Michael Ellerman, Madhavan Srinivasan
Cc: Muchun Song, Lorenzo Stoakes, "Liam R . Howlett", Vlastimil Babka,
	Mike Rapoport, Suren Baghdasaryan, Michal Hocko, Nicholas Piggin,
	Christophe Leroy, aneesh.kumar@linux.ibm.com,
	joao.m.martins@oracle.com, linux-mm@kvack.org,
	linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org
Subject: [PATCH v4 5/5] mm/mm_init: Fix uninitialized struct pages for ZONE_DEVICE
Date: Wed, 22 Apr 2026 16:14:20 +0800
Message-Id: <20260422081420.4009847-6-songmuchun@bytedance.com>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20260422081420.4009847-1-songmuchun@bytedance.com>
References: <20260422081420.4009847-1-songmuchun@bytedance.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

If DAX memory is hotplugged into an unoccupied subsection of an early
section, section_activate() reuses the unoptimized boot memmap.
However, compound_nr_pages() still assumes that vmemmap optimization is
in effect and initializes only the reduced number of struct pages. As a
result, the remaining tail struct pages are left uninitialized, which
can later lead to unexpected behavior or crashes.

Fix this by treating early sections as unoptimized when calculating how
many struct pages to initialize.

Fixes: 6fd3620b3428 ("mm/page_alloc: reuse tail struct pages for compound devmaps")
Signed-off-by: Muchun Song
Acked-by: David Hildenbrand (Arm)
---
 mm/mm_init.c | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/mm/mm_init.c b/mm/mm_init.c
index 9d0fe79a94de..3d5af40d0943 100644
--- a/mm/mm_init.c
+++ b/mm/mm_init.c
@@ -1056,10 +1056,17 @@ static void __ref __init_zone_device_page(struct page *page, unsigned long pfn,
  * of how the sparse_vmemmap internals handle compound pages in the lack
  * of an altmap. See vmemmap_populate_compound_pages().
  */
-static inline unsigned long compound_nr_pages(struct vmem_altmap *altmap,
+static inline unsigned long compound_nr_pages(unsigned long pfn,
+					      struct vmem_altmap *altmap,
 					      struct dev_pagemap *pgmap)
 {
-	if (!vmemmap_can_optimize(altmap, pgmap))
+	/*
+	 * If DAX memory is hot-plugged into an unoccupied subsection
+	 * of an early section, the unoptimized boot memmap is reused.
+	 * See section_activate().
+	 */
+	if (early_section(__pfn_to_section(pfn)) ||
+	    !vmemmap_can_optimize(altmap, pgmap))
 		return pgmap_vmemmap_nr(pgmap);
 
 	return VMEMMAP_RESERVE_NR * (PAGE_SIZE / sizeof(struct page));
@@ -1129,7 +1136,7 @@ void __ref memmap_init_zone_device(struct zone *zone,
 			continue;
 
 		memmap_init_compound(page, pfn, zone_idx, nid, pgmap,
-				     compound_nr_pages(altmap, pgmap));
+				     compound_nr_pages(pfn, altmap, pgmap));
 	}
 
 	/*
-- 
2.20.1
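
For readers following the arithmetic in patch 5/5, here is a rough
userspace sketch of the count that compound_nr_pages() returns once
early sections are treated as unoptimized. It is not kernel code. The
sketch_ names are hypothetical, and the constants (4 KiB pages, a
64-byte struct page, VMEMMAP_RESERVE_NR of 2) are assumptions for
illustration; real values depend on the architecture and config.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed values for illustration only; real ones depend on arch/config. */
#define SKETCH_PAGE_SIZE		4096UL
#define SKETCH_STRUCT_PAGE_SIZE		64UL	/* stand-in for sizeof(struct page) */
#define SKETCH_VMEMMAP_RESERVE_NR	2UL

/*
 * Hypothetical model of compound_nr_pages() after the patch: the number
 * of struct pages to initialize for one compound page. The two booleans
 * stand in for early_section() and vmemmap_can_optimize().
 */
static unsigned long sketch_compound_nr_pages(bool early_section,
					      bool can_optimize,
					      unsigned int vmemmap_shift)
{
	/* Guard the shift below against undefined behavior. */
	assert(vmemmap_shift < 8 * sizeof(unsigned long));

	/* Early sections reuse the unoptimized boot memmap. */
	if (early_section || !can_optimize)
		return 1UL << vmemmap_shift;	/* models pgmap_vmemmap_nr() */

	/* Optimized: only the struct pages in the reserved vmemmap pages. */
	return SKETCH_VMEMMAP_RESERVE_NR *
	       (SKETCH_PAGE_SIZE / SKETCH_STRUCT_PAGE_SIZE);
}
```

Under these assumptions, a 2 MiB compound page (vmemmap_shift of 9)
gives 128 struct pages on the optimized path and 512 otherwise, so
treating an early section as optimized would skip 384 tail struct
pages per compound page.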