From nobody Sun Jun 14 19:01:43 2026 Received: from mail-pj1-f52.google.com (mail-pj1-f52.google.com [209.85.216.52]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id D85A982899 for ; Sun, 5 Apr 2026 12:53:28 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.216.52 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1775393610; cv=none; b=U0iBIctJDGj8bCWDk7m5z4eWou/e+xx+/32OFvyTXS79a/gLvvQ20upOcnuxP9nUp3vcan14v3quRGDUIaoKQ4wZ42WLSV3dqSXfvXvTF7l/1smxJOt2jNqjNAFcEfvuKdvDlEP1kEzTjjHV1ACK4taNfCWwO02SIYDNXou362M= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1775393610; c=relaxed/simple; bh=htMVPQQBtrtxFxnD0//lX5HGQvAiU090DXKaAq62AGg=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=Vyu/nLuQ3WdqLuI2Jhyjc/mvCoLq+932c1AJcN5BeuE1S3A0pTn0NysHfsOtlMAPyYUoSp1MoZleZwE28nRgdY8XdKZMmEfuUAs//2Tk0sZW95eSfBmX4o3MddeezhMfHDTWWqUjUwhcSvpCvktW1Ja4d8eb7mrh2vRxPHZvm5o= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=bytedance.com; spf=pass smtp.mailfrom=bytedance.com; dkim=pass (2048-bit key) header.d=bytedance.com header.i=@bytedance.com header.b=NAVnRRWc; arc=none smtp.client-ip=209.85.216.52 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=bytedance.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=bytedance.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=bytedance.com header.i=@bytedance.com header.b="NAVnRRWc" Received: by mail-pj1-f52.google.com with SMTP id 98e67ed59e1d1-35c124d2613so1749896a91.2 for ; Sun, 05 Apr 2026 05:53:28 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1775393608; x=1775998408; darn=vger.kernel.org; 
h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=iSQZfRjRjpxozRWOnZuEHxRe4aXVCopNiKXhl9pDTQQ=; b=NAVnRRWcQex1DfmzgM9rEPT4kALlczygt0fXqO5jYzLQd3lpTKuxtJcegG6PE+8WDN 7oC5WZjWXdd1RwUDBriqFG3Fh1ODdbhh16ibA/Th4eyZ8sf0R0XeGUEoI3Mujz3gGuRR B4cRI63xkVJKKtUCCug/EgwMSSRrSBf6XpZ7p4P9MdurxfY6TB4yGkoESbCsmh6MhgqY 1Syjyu+7FL0Pl7yR+Hikbp6I7GH6l974WhQ8wPKsi0Xg7TJhE6FxqaD0wIQURPYXI+2T uA62tNF/6EU1nXvjry++BUlwUsh9h1XDu8qGGmnj2N1QV/x+ro3PZMfI3Qrl3dovjOSj EhCg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1775393608; x=1775998408; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=iSQZfRjRjpxozRWOnZuEHxRe4aXVCopNiKXhl9pDTQQ=; b=kuEZk3erx1jybPnXcOvCn0+YMzCuOPT6M6ocyi3Zd19el1Z/rKEgSeYQNLZePrVR9q yJpJrCacY5T+J5mrHYpRKvdZxefE1KRsGk1v9v67PASm4wWsivM6mAxrWg9aSrZNRLCg iO04wIKKL0xK8C2kNcc4frKkp186e+a7uLQdCtJ8Zms1ULjGMvY7Pk6/ajr28aJB0kKg Un14wIBVa0QNojxeEeTBHWlm8ItklrJt4Kcmp29H3M4HbB/50XkDM4TQ9Vg2b3n/hjeE Eu4SeFi0acU34+OYRrl2zjhOfH0mlFMuVeySbPb9jeqXz31EnAPIr+f/U9AZFHUS/ewG V14g== X-Forwarded-Encrypted: i=1; AJvYcCWuR3epwQdoj6MPSFD4SHsYfcH+uo0LVht1ls+BUDzgaSehggVtn6fysqNfwm1K5gEB4itjQPqqYF1+8yU=@vger.kernel.org X-Gm-Message-State: AOJu0YycSmEu4OLvpozkfM51uOI0ilKYa6LBHY08oVgHInIfpVeAlVzk UTYhLYrarJ1UYDrY+gcPDL1CziyjFS1LPNKQ0Bbb/iL1YWusOr4ygRI/S6Vb/30WD4w= X-Gm-Gg: AeBDiev1Jreb+P3/DMm6fvP46+EDNd7lm7fC0PkMEjGClM23X78L7nxWmiA7ljlAep/ tskeiPUHthsrYwsvulsnNawT3iDCczBivfwBdPhomELswTkpjHGtcoy3aLeZ90QWj3nhQ8vtbIK AYxyv61iIcoSDDCyAJj27fyMTBfiM5SdBWRUuufEbKYpLuk/V43dobcQjPo2/Ttu8eAF/BgARQO OkS1F5zsrfS7AhOHG94dpewwwtsJNb9QLY9tfrHoKI0SMqIm1DA8Ynj1aiMA+NIHIPExRJXBjNe UTQ8et2drCy/TT1irN08CYUJgG4Rwii1M8SEBBe6E999CYVnCzR5xPo3hwMWb2Cfr/NiOCe7imk b09syHigMv7QJuv0LQGXqJ43iy5bvdmX8RplNvvKANJ55ZRNjIsUKabIEcyuJXvFbEiXlwOS4tM 
hVKwiPAwYphbxQP7Zi6/o/iFLhsyoo2QNT1OHmUcxzQ18= X-Received: by 2002:a17:90b:4c4b:b0:35d:93ff:2854 with SMTP id 98e67ed59e1d1-35de680e77fmr9867916a91.8.1775393608080; Sun, 05 Apr 2026 05:53:28 -0700 (PDT) Received: from n232-176-004.byted.org ([36.110.163.97]) by smtp.gmail.com with ESMTPSA id 98e67ed59e1d1-35de66b4808sm3748505a91.2.2026.04.05.05.53.22 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 05 Apr 2026 05:53:27 -0700 (PDT) From: Muchun Song To: Andrew Morton , David Hildenbrand , Muchun Song , Oscar Salvador , Michael Ellerman , Madhavan Srinivasan Cc: Lorenzo Stoakes , "Liam R . Howlett" , Vlastimil Babka , Mike Rapoport , Suren Baghdasaryan , Michal Hocko , Nicholas Piggin , Christophe Leroy , aneesh.kumar@linux.ibm.com, joao.m.martins@oracle.com, linux-mm@kvack.org, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, Muchun Song Subject: [PATCH 01/49] mm/sparse: fix vmemmap accounting imbalance on memory hotplug error Date: Sun, 5 Apr 2026 20:51:52 +0800 Message-Id: <20260405125240.2558577-2-songmuchun@bytedance.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20260405125240.2558577-1-songmuchun@bytedance.com> References: <20260405125240.2558577-1-songmuchun@bytedance.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" In section_activate(), if populate_section_memmap() fails, the error handling path calls section_deactivate() to roll back the state. This approach introduces an accounting imbalance. Since the commit c3576889d87b ("mm: fix accounting of memmap pages"), memmap pages are accounted for only after populate_section_memmap() succeeds. However, section_deactivate() unconditionally decrements the vmemmap account. Consequently, a failure in populate_section_memmap() leads to a negative offset (underflow) in the system's vmemmap tracking. 
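The imbalance described above can be reproduced with a toy counter. This is only a sketch under stated assumptions: memmap_pages, model_deactivate() and model_activate_before_fix() are hypothetical stand-ins for the kernel's counter and for section_deactivate()/section_activate(), and the model keeps nothing but the ordering of the add and the subtract.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for the kernel's memmap page counter (hypothetical name). */
static long memmap_pages;

/* Models section_deactivate(): it always subtracts the section's pages. */
void model_deactivate(long pages)
{
	memmap_pages -= pages;
}

/*
 * Models the pre-fix section_activate(): the counter is incremented only
 * after populate succeeds, yet the error path still calls deactivate.
 * Returns true on success.
 */
bool model_activate_before_fix(long pages, bool populate_ok)
{
	assert(pages > 0);
	if (!populate_ok) {
		model_deactivate(pages);	/* subtracts pages that were never added */
		return false;
	}
	memmap_pages += pages;
	return true;
}

/* Returns the counter after one failed activation, starting from zero. */
long counter_after_failed_activation(long pages)
{
	memmap_pages = 0;
	model_activate_before_fix(pages, false);
	return memmap_pages;
}
```

In this model, counter_after_failed_activation(512) leaves the counter at -512 although no page was ever accounted, which is the underflow in question.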
We can fix this by ensuring that the vmemmap accounting is incremented immediately before checking for the success of populate_section_memmap(). If populate_section_memmap() fails, the subsequent call to section_deactivate() will decrement the accounting, perfectly offsetting the increment and maintaining balance. Fixes: c3576889d87b ("mm: fix accounting of memmap pages") Signed-off-by: Muchun Song --- mm/sparse-vmemmap.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/mm/sparse-vmemmap.c b/mm/sparse-vmemmap.c index 6eadb9d116e4..ee27d0c0efe2 100644 --- a/mm/sparse-vmemmap.c +++ b/mm/sparse-vmemmap.c @@ -822,11 +822,11 @@ static struct page * __meminit section_activate(int n= id, unsigned long pfn, return pfn_to_page(pfn); =20 memmap =3D populate_section_memmap(pfn, nr_pages, nid, altmap, pgmap); + memmap_pages_add(DIV_ROUND_UP(nr_pages * sizeof(struct page), PAGE_SIZE)); if (!memmap) { section_deactivate(pfn, nr_pages, altmap); return ERR_PTR(-ENOMEM); } - memmap_pages_add(DIV_ROUND_UP(nr_pages * sizeof(struct page), PAGE_SIZE)); =20 return memmap; } --=20 2.20.1 From nobody Sun Jun 14 19:01:43 2026 Received: from mail-pj1-f41.google.com (mail-pj1-f41.google.com [209.85.216.41]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 6637D29E117 for ; Sun, 5 Apr 2026 12:53:35 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.216.41 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1775393617; cv=none; b=k7YIZNMgiB8lkklrpY1mFPpFhREwzB9spQ70pAJa8s3Duhg1A8fd8BYd2BriUhspa5cG3egObmdJkZEHUHsRRM0zlc+pnJ/L9Pjl1AepQgK2s6s3jRe5k0b4YJtVSgCj/AEGbp8Cg4JC9LR/wtiTHSFPrh8vvIx9ENNrlJmI1Co= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1775393617; c=relaxed/simple; bh=JMfCi9lfWIvBHyiL5emmcLdXk+vHYhcw7S0oNOJEIBw=; 
h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=TGUcjIlWXcdHbEvQuF4UNk53DChn2MzNifbYsTGhdPPCjkgLLjg1F+jTMpDDuxil2UiBS8l3mnO/j2bGoKMIGMDBZ86lSt2ghGcARfEw7E3Cjn92bMUmlV1WFye0lGltjKC3bRUUeo3YkC/Q0hq3+j6mm7jufk7UZK61AOcqgVM= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=bytedance.com; spf=pass smtp.mailfrom=bytedance.com; dkim=pass (2048-bit key) header.d=bytedance.com header.i=@bytedance.com header.b=b4DrF2GV; arc=none smtp.client-ip=209.85.216.41 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=bytedance.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=bytedance.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=bytedance.com header.i=@bytedance.com header.b="b4DrF2GV" Received: by mail-pj1-f41.google.com with SMTP id 98e67ed59e1d1-354bc7c2c46so1787113a91.0 for ; Sun, 05 Apr 2026 05:53:35 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1775393615; x=1775998415; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=qyPeWfsO418Re5hKKT3eSnZkoPoEx5W3lIDo032LUZM=; b=b4DrF2GVZa9rsUueI2EQ3CAwQbWuOW1IbOiZS9ZPL8IzwGMjfBuBHSJwvsvWCW8L7M pT09q3pJTR4fj9lqQIuO8nMBgzWtqWGN57BD8Hj5KwplCA+7HxEUgVlPD6rzmVLM/4w7 v3nQnh6aNw14hoIi94iMfU7QSn0i7a4EEPB8iYTygcRBhPSsdHKfiXD3ozjDwuesM1GU oXxxY/ZrZ+Q6BsgLMt9nik9gZPk9m+8Frrx/FBO+iIPT4RG1Sh0te6Gs5MHvL9rRDqg8 qBcGo+xnl0F6hjwPEI5LJuYKXclXfHcZxhH3RR0hblNPTqD0eXKi9PuLD5df2WyFvAd0 l3uA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1775393615; x=1775998415; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; 
bh=qyPeWfsO418Re5hKKT3eSnZkoPoEx5W3lIDo032LUZM=; b=LVfprXhaicmMhDQEY5/V+mSi5ScSy+tIsiuCHWjUT57r8+a2/hVFDYWsHsel6w/Yun CCZWp9RN/QKxIccXa7lXdjxBpCa5Yy2ZlHGLoDd8ZfkmHiSAVFUNWBZBaAd1ulR7gM6Q CXEbZIRJ9s83jkxh8l0IgL/MWN7eX/YZdLRVecU/M/qF7riWn3dXmV8/1cZK1rqDX5RY yaK5eJOsB/UX6VILex//pvfU74OTu1uF7I3ELP6rNMFyOHBov7ztIR7SGeA6fBq825d1 zIAvQVAxcKBY94Cz2+QQvah8WdHfJfOqEYlfJ1qANJQ21BRzVqk20ZPZRKIcxK97kuHO +9eg== X-Forwarded-Encrypted: i=1; AJvYcCWVAGUi/CPkEd6dBxn3vVlUsDEKZjzFcvV3X9VvBkM8kUdvPD3slt9lR6tUUyXCoM51QWiEEkmK2P7Z0So=@vger.kernel.org X-Gm-Message-State: AOJu0YzvTf70zyCzUPEERWgHGUvVzKuS0wCK6qN2NMmgaCYxrydMde6j pcPwEi+Rwi7QBQLRhPTn8m8MCimx+V+UBEB2mxAiTjrCs/wDF716wr73eFm7CpJzzuA6Lm9XiMJ 6411V X-Gm-Gg: AeBDietciyRu11Y3LWjIlURb77Hhi7n7LJKslDoHSyRNmS0sctEPLPzpOXgmVoLOZQl pLR4JF3jifW3UH3mh2wT+OHRqCn0NfHAtoE61uWpA3R5N3u3CpRTIKjm82/oenDww1llMp2waaz e6PZS4Y+GLkTfNDcCQdh0lwZaXMvU1+EXWIv+x8cKpzHl/AT1gSKsiN9HvuATUddpAwWmetXkG2 0aKduHvBjui5ssf3cUM8CgdEfhli1vBcgJh/VjUFAiz26rj8NX4gAMFoaHAYVJo4pLfn3skj+Ad xhYuorpWTEoGRqUY4vKPt5b6MjY9QcoDdthgb5Q34RsHGrvDdOW750AsFyk99q/qOzaDB6qg3bS LE+yMgBk4vH79nTGoszCe/NXuEygPtCeWvLxbT66jTqN857RRSTAfxBUbonIl32qJzZvJ+OzB80 nRTS4A53LZEh+uDGCwIv8o8AnIWN0kEjynX6rWitnHpR0= X-Received: by 2002:a17:90b:4ac7:b0:35d:a8d9:3b4 with SMTP id 98e67ed59e1d1-35de678f7d7mr8776631a91.4.1775393614559; Sun, 05 Apr 2026 05:53:34 -0700 (PDT) Received: from n232-176-004.byted.org ([36.110.163.97]) by smtp.gmail.com with ESMTPSA id 98e67ed59e1d1-35de66b4808sm3748505a91.2.2026.04.05.05.53.28 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 05 Apr 2026 05:53:34 -0700 (PDT) From: Muchun Song To: Andrew Morton , David Hildenbrand , Muchun Song , Oscar Salvador , Michael Ellerman , Madhavan Srinivasan Cc: Lorenzo Stoakes , "Liam R . 
Howlett" , Vlastimil Babka , Mike Rapoport , Suren Baghdasaryan , Michal Hocko , Nicholas Piggin , Christophe Leroy , aneesh.kumar@linux.ibm.com, joao.m.martins@oracle.com, linux-mm@kvack.org, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, Muchun Song Subject: [PATCH 02/49] mm/sparse: add a @pgmap argument to memory deactivation paths Date: Sun, 5 Apr 2026 20:51:53 +0800 Message-Id: <20260405125240.2558577-3-songmuchun@bytedance.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20260405125240.2558577-1-songmuchun@bytedance.com> References: <20260405125240.2558577-1-songmuchun@bytedance.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Currently, memory hot-remove paths do not pass the struct dev_pagemap pointer down to section_deactivate(). This prevents the lower levels from knowing whether the section was originally populated with vmemmap optimizations (e.g., DAX with HVO enabled). Without this information, we cannot call vmemmap_can_optimize() to determine if the vmemmap pages were optimized. As a result, the vmemmap page accounting during teardown will mistakenly assume a non-optimized allocation, leading to incorrect page statistics. To lay the groundwork for fixing the vmemmap page accounting, we need to pass the @pgmap pointer down to the deactivation location. Plumb the @pgmap argument through the APIs of arch_remove_memory(), __remove_pages() and sparse_remove_section(), mirroring the corresponding *_activate() paths. 
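To illustrate how far the two accounting assumptions can diverge, here is a small arithmetic sketch. All constants are assumptions chosen for illustration (4 KiB base pages, a 64-byte struct page, and two vmemmap pages retained per compound page), and the sketch_* helpers are hypothetical names, not kernel functions.

```c
#include <assert.h>

/* Illustrative constants; assumptions, not values taken from this patch. */
#define SKETCH_PAGE_SIZE	4096UL	/* 4 KiB base pages */
#define SKETCH_STRUCT_PAGE_SIZE	64UL	/* assumed sizeof(struct page) */
#define SKETCH_RESERVE_NR	2UL	/* assumed vmemmap pages kept per compound page */

/* vmemmap pages needed when every struct page is backed individually. */
unsigned long sketch_plain_vmemmap_pages(unsigned long nr_pages)
{
	unsigned long bytes = nr_pages * SKETCH_STRUCT_PAGE_SIZE;

	/* Round up to whole pages. */
	return (bytes + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;
}

/*
 * vmemmap pages when each compound page of 2^order base pages keeps only
 * the reserved ones. Only meaningful when nr_pages is a whole number of
 * compound pages.
 */
unsigned long sketch_optimized_vmemmap_pages(unsigned long nr_pages,
					     unsigned int order)
{
	unsigned long pages_per_compound = 1UL << order;

	assert(nr_pages % pages_per_compound == 0);
	return SKETCH_RESERVE_NR * nr_pages / pages_per_compound;
}
```

Under these assumptions, a range of 32768 base pages needs 512 vmemmap pages without optimization but only 128 with order-9 compound pages, so a teardown path that cannot see @pgmap would subtract 512 where 128 had been added.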
Signed-off-by: Muchun Song --- arch/arm64/mm/mmu.c | 5 +++-- arch/loongarch/mm/init.c | 5 +++-- arch/powerpc/mm/mem.c | 5 +++-- arch/riscv/mm/init.c | 5 +++-- arch/s390/mm/init.c | 5 +++-- arch/x86/mm/init_64.c | 5 +++-- include/linux/memory_hotplug.h | 8 +++++--- mm/memory_hotplug.c | 12 ++++++------ mm/memremap.c | 4 ++-- mm/sparse-vmemmap.c | 8 ++++---- 10 files changed, 35 insertions(+), 27 deletions(-) diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c index ec1c6971a561..dc8a8281888c 100644 --- a/arch/arm64/mm/mmu.c +++ b/arch/arm64/mm/mmu.c @@ -1994,12 +1994,13 @@ int arch_add_memory(int nid, u64 start, u64 size, return ret; } =20 -void arch_remove_memory(u64 start, u64 size, struct vmem_altmap *altmap) +void arch_remove_memory(u64 start, u64 size, struct vmem_altmap *altmap, + struct dev_pagemap *pgmap) { unsigned long start_pfn =3D start >> PAGE_SHIFT; unsigned long nr_pages =3D size >> PAGE_SHIFT; =20 - __remove_pages(start_pfn, nr_pages, altmap); + __remove_pages(start_pfn, nr_pages, altmap, pgmap); __remove_pgd_mapping(swapper_pg_dir, __phys_to_virt(start), size); } =20 diff --git a/arch/loongarch/mm/init.c b/arch/loongarch/mm/init.c index 00f3822b6e47..c9c57f08fa2c 100644 --- a/arch/loongarch/mm/init.c +++ b/arch/loongarch/mm/init.c @@ -86,7 +86,8 @@ int arch_add_memory(int nid, u64 start, u64 size, struct = mhp_params *params) return ret; } =20 -void arch_remove_memory(u64 start, u64 size, struct vmem_altmap *altmap) +void arch_remove_memory(u64 start, u64 size, struct vmem_altmap *altmap, + struct dev_pagemap *pgmap) { unsigned long start_pfn =3D start >> PAGE_SHIFT; unsigned long nr_pages =3D size >> PAGE_SHIFT; @@ -95,7 +96,7 @@ void arch_remove_memory(u64 start, u64 size, struct vmem_= altmap *altmap) /* With altmap the first mapped page is offset from @start */ if (altmap) page +=3D vmem_altmap_offset(altmap); - __remove_pages(start_pfn, nr_pages, altmap); + __remove_pages(start_pfn, nr_pages, altmap, pgmap); } #endif =20 diff --git 
a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c index 648d0c5602ec..4c1afab91996 100644 --- a/arch/powerpc/mm/mem.c +++ b/arch/powerpc/mm/mem.c @@ -158,12 +158,13 @@ int __ref arch_add_memory(int nid, u64 start, u64 siz= e, return rc; } =20 -void __ref arch_remove_memory(u64 start, u64 size, struct vmem_altmap *alt= map) +void __ref arch_remove_memory(u64 start, u64 size, struct vmem_altmap *alt= map, + struct dev_pagemap *pgmap) { unsigned long start_pfn =3D start >> PAGE_SHIFT; unsigned long nr_pages =3D size >> PAGE_SHIFT; =20 - __remove_pages(start_pfn, nr_pages, altmap); + __remove_pages(start_pfn, nr_pages, altmap, pgmap); arch_remove_linear_mapping(start, size); } #endif diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c index 5142ca80be6f..980f693e6b19 100644 --- a/arch/riscv/mm/init.c +++ b/arch/riscv/mm/init.c @@ -1810,9 +1810,10 @@ int __ref arch_add_memory(int nid, u64 start, u64 si= ze, struct mhp_params *param return ret; } =20 -void __ref arch_remove_memory(u64 start, u64 size, struct vmem_altmap *alt= map) +void __ref arch_remove_memory(u64 start, u64 size, struct vmem_altmap *alt= map, + struct dev_pagemap *pgmap) { - __remove_pages(start >> PAGE_SHIFT, size >> PAGE_SHIFT, altmap); + __remove_pages(start >> PAGE_SHIFT, size >> PAGE_SHIFT, altmap, pgmap); remove_linear_mapping(start, size); flush_tlb_all(); } diff --git a/arch/s390/mm/init.c b/arch/s390/mm/init.c index 1f72efc2a579..11a689423440 100644 --- a/arch/s390/mm/init.c +++ b/arch/s390/mm/init.c @@ -276,12 +276,13 @@ int arch_add_memory(int nid, u64 start, u64 size, return rc; } =20 -void arch_remove_memory(u64 start, u64 size, struct vmem_altmap *altmap) +void arch_remove_memory(u64 start, u64 size, struct vmem_altmap *altmap, + struct dev_pagemap *pgmap) { unsigned long start_pfn =3D start >> PAGE_SHIFT; unsigned long nr_pages =3D size >> PAGE_SHIFT; =20 - __remove_pages(start_pfn, nr_pages, altmap); + __remove_pages(start_pfn, nr_pages, altmap, pgmap); vmem_remove_mapping(start, 
size); } #endif /* CONFIG_MEMORY_HOTPLUG */ diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c index df2261fa4f98..77b889b71cf3 100644 --- a/arch/x86/mm/init_64.c +++ b/arch/x86/mm/init_64.c @@ -1288,12 +1288,13 @@ kernel_physical_mapping_remove(unsigned long start,= unsigned long end) remove_pagetable(start, end, true, NULL); } =20 -void __ref arch_remove_memory(u64 start, u64 size, struct vmem_altmap *alt= map) +void __ref arch_remove_memory(u64 start, u64 size, struct vmem_altmap *alt= map, + struct dev_pagemap *pgmap) { unsigned long start_pfn =3D start >> PAGE_SHIFT; unsigned long nr_pages =3D size >> PAGE_SHIFT; =20 - __remove_pages(start_pfn, nr_pages, altmap); + __remove_pages(start_pfn, nr_pages, altmap, pgmap); kernel_physical_mapping_remove(start, start + size); } #endif /* CONFIG_MEMORY_HOTPLUG */ diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h index 815e908c4135..7c9d66729c60 100644 --- a/include/linux/memory_hotplug.h +++ b/include/linux/memory_hotplug.h @@ -135,9 +135,10 @@ static inline bool movable_node_is_enabled(void) return movable_node_enabled; } =20 -extern void arch_remove_memory(u64 start, u64 size, struct vmem_altmap *al= tmap); +extern void arch_remove_memory(u64 start, u64 size, struct vmem_altmap *al= tmap, + struct dev_pagemap *pgmap); extern void __remove_pages(unsigned long start_pfn, unsigned long nr_pages, - struct vmem_altmap *altmap); + struct vmem_altmap *altmap, struct dev_pagemap *pgmap); =20 /* reasonably generic interface to expand the physical pages */ extern int __add_pages(int nid, unsigned long start_pfn, unsigned long nr_= pages, @@ -307,7 +308,8 @@ extern int sparse_add_section(int nid, unsigned long pf= n, unsigned long nr_pages, struct vmem_altmap *altmap, struct dev_pagemap *pgmap); extern void sparse_remove_section(unsigned long pfn, unsigned long nr_page= s, - struct vmem_altmap *altmap); + struct vmem_altmap *altmap, + struct dev_pagemap *pgmap); extern struct zone 
*zone_for_pfn_range(enum mmop online_type, int nid, struct memory_group *group, unsigned long start_pfn, unsigned long nr_pages); diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c index 8b18ddd1e7d5..05f5df12d843 100644 --- a/mm/memory_hotplug.c +++ b/mm/memory_hotplug.c @@ -583,7 +583,7 @@ void remove_pfn_range_from_zone(struct zone *zone, * calling offline_pages(). */ void __remove_pages(unsigned long pfn, unsigned long nr_pages, - struct vmem_altmap *altmap) + struct vmem_altmap *altmap, struct dev_pagemap *pgmap) { const unsigned long end_pfn =3D pfn + nr_pages; unsigned long cur_nr_pages; @@ -598,7 +598,7 @@ void __remove_pages(unsigned long pfn, unsigned long nr= _pages, /* Select all remaining pages up to the next section boundary */ cur_nr_pages =3D min(end_pfn - pfn, SECTION_ALIGN_UP(pfn + 1) - pfn); - sparse_remove_section(pfn, cur_nr_pages, altmap); + sparse_remove_section(pfn, cur_nr_pages, altmap, pgmap); } } =20 @@ -1418,7 +1418,7 @@ static void remove_memory_blocks_and_altmaps(u64 star= t, u64 size) =20 remove_memory_block_devices(cur_start, memblock_size); =20 - arch_remove_memory(cur_start, memblock_size, altmap); + arch_remove_memory(cur_start, memblock_size, altmap, NULL); =20 /* Verify that all vmemmap pages have actually been freed. 
*/ WARN(altmap->alloc, "Altmap not fully unmapped"); @@ -1461,7 +1461,7 @@ static int create_altmaps_and_memory_blocks(int nid, = struct memory_group *group, ret =3D create_memory_block_devices(cur_start, memblock_size, nid, params.altmap, group); if (ret) { - arch_remove_memory(cur_start, memblock_size, NULL); + arch_remove_memory(cur_start, memblock_size, NULL, NULL); kfree(params.altmap); goto out; } @@ -1547,7 +1547,7 @@ int add_memory_resource(int nid, struct resource *res= , mhp_t mhp_flags) /* create memory block devices after memory was added */ ret =3D create_memory_block_devices(start, size, nid, NULL, group); if (ret) { - arch_remove_memory(start, size, params.altmap); + arch_remove_memory(start, size, params.altmap, NULL); goto error; } } @@ -2246,7 +2246,7 @@ static int try_remove_memory(u64 start, u64 size) * No altmaps present, do the removal directly */ remove_memory_block_devices(start, size); - arch_remove_memory(start, size, NULL); + arch_remove_memory(start, size, NULL, NULL); } else { /* all memblocks in the range have altmaps */ remove_memory_blocks_and_altmaps(start, size); diff --git a/mm/memremap.c b/mm/memremap.c index ac7be07e3361..c45b90f334ea 100644 --- a/mm/memremap.c +++ b/mm/memremap.c @@ -97,10 +97,10 @@ static void pageunmap_range(struct dev_pagemap *pgmap, = int range_id) PHYS_PFN(range_len(range))); if (pgmap->type =3D=3D MEMORY_DEVICE_PRIVATE) { __remove_pages(PHYS_PFN(range->start), - PHYS_PFN(range_len(range)), NULL); + PHYS_PFN(range_len(range)), NULL, pgmap); } else { arch_remove_memory(range->start, range_len(range), - pgmap_altmap(pgmap)); + pgmap_altmap(pgmap), pgmap); kasan_remove_zero_shadow(__va(range->start), range_len(range)); } mem_hotplug_done(); diff --git a/mm/sparse-vmemmap.c b/mm/sparse-vmemmap.c index ee27d0c0efe2..7aa9a97498eb 100644 --- a/mm/sparse-vmemmap.c +++ b/mm/sparse-vmemmap.c @@ -737,7 +737,7 @@ static int fill_subsection_map(unsigned long pfn, unsig= ned long nr_pages) * usage map, but still need to 
free the vmemmap range. */ static void section_deactivate(unsigned long pfn, unsigned long nr_pages, - struct vmem_altmap *altmap) + struct vmem_altmap *altmap, struct dev_pagemap *pgmap) { struct mem_section *ms =3D __pfn_to_section(pfn); bool section_is_early =3D early_section(ms); @@ -824,7 +824,7 @@ static struct page * __meminit section_activate(int nid= , unsigned long pfn, memmap =3D populate_section_memmap(pfn, nr_pages, nid, altmap, pgmap); memmap_pages_add(DIV_ROUND_UP(nr_pages * sizeof(struct page), PAGE_SIZE)); if (!memmap) { - section_deactivate(pfn, nr_pages, altmap); + section_deactivate(pfn, nr_pages, altmap, pgmap); return ERR_PTR(-ENOMEM); } =20 @@ -885,13 +885,13 @@ int __meminit sparse_add_section(int nid, unsigned lo= ng start_pfn, } =20 void sparse_remove_section(unsigned long pfn, unsigned long nr_pages, - struct vmem_altmap *altmap) + struct vmem_altmap *altmap, struct dev_pagemap *pgmap) { struct mem_section *ms =3D __pfn_to_section(pfn); =20 if (WARN_ON_ONCE(!valid_section(ms))) return; =20 - section_deactivate(pfn, nr_pages, altmap); + section_deactivate(pfn, nr_pages, altmap, pgmap); } #endif /* CONFIG_MEMORY_HOTPLUG */ --=20 2.20.1 From nobody Sun Jun 14 19:01:43 2026 Received: from mail-pj1-f43.google.com (mail-pj1-f43.google.com [209.85.216.43]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 42C6582899 for ; Sun, 5 Apr 2026 12:53:41 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.216.43 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1775393622; cv=none; b=tcctw/AByZYOZ6SfGwJSWW2/tFY6AKcRqZUyEietAIkD3BsDXRur0+XYhXFKHdt1P4bXkhoqvlFoaFXvcrDGCoDFZ/ozfaal5mcnAjWLc8LMuUMQZM/X8uvgTAntFm0hR4oKhhJ4TYhnqKR41RrF6XEhqIr5CD8HbejeNYilbmY= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1775393622; c=relaxed/simple; 
bh=0IIU+me2UXTh2hNaoH+8G/vCDS5FfKoFP++uJFAWbpg=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=aXRF1pwr0tIRe3bVH8OzmrNto78+IbcOY/C/nKFo/em4fPjBca1mmttkZ89GofEiFRFVaCRSPI/0WbP9bOVGmCjgMYVTcmMh1VWqzqWTBh5gurcfe1CpNCaARY9qg76MVCxYwCSJg6ZyrRt3+qdyzjsuRzLQJ/rwybpxaBReTSs= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=bytedance.com; spf=pass smtp.mailfrom=bytedance.com; dkim=pass (2048-bit key) header.d=bytedance.com header.i=@bytedance.com header.b=Z0nnI0oS; arc=none smtp.client-ip=209.85.216.43 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=bytedance.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=bytedance.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=bytedance.com header.i=@bytedance.com header.b="Z0nnI0oS" Received: by mail-pj1-f43.google.com with SMTP id 98e67ed59e1d1-3567e2b4159so2008895a91.0 for ; Sun, 05 Apr 2026 05:53:41 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1775393621; x=1775998421; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=svlwS891ot0oMz7BvAOAaxoIhgC5OOWs0OTR+KAyilA=; b=Z0nnI0oSDKd8etgeWruGxqBMLfXe/gca6dSPpiT2BgZuqleSGadnSS/p9bxwMotcRL 6zJ3Do7xInk6PT0A9IPiCWfjqOW0aYbxl7eV5K/8pBL1Ef33NmoTy8/QqaoO5Z+Te+V/ SwYr0HHodZ5wKdvh1fPJJ9iBq2oms1YDI5x7B2rBGzf5plB+qic4DvgXTlqo/eAajMDs CXUvUo+EbXcZb89UDKE8tdrN/G0OWCmi3v9WDaGNAlo4G2zv0mVw6uE5Cnqfo2YsLTyA t+4HXZJSVm64ZQk/H3Uzplk2FPS8tW655dPKeIeB3gRBi4OlXaOhA4F+A37dnETXzp/W U7zQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1775393621; x=1775998421; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from 
:to:cc:subject:date:message-id:reply-to; bh=svlwS891ot0oMz7BvAOAaxoIhgC5OOWs0OTR+KAyilA=; b=meELAWx2X59+ug0DfgN0YRQ3njkweDm4JQkAgg+/fRdrQsdjK3RnnkEogTLoGaLuE7 WR1a/eeSBljbsbGo6j4suCWo1maxRbG2LY9VU5j4CYMFC8OolqM+fIz9HTbXjpzclGR7 DNunv3hNRzi64kMb7wLc8Gk05TaBUc3Wca/0XwTOwQJI+aubCIUYV1Drce033fL84H5p KNKhSk76xzKJ2BRnYMsa0qUeuLklgrZxqlEIXuz4YRaB60zbOEYTuuT8dlfjbtKVkzh9 BQSxJoqLKvb21V90ZRWLCEMqgLkmibVpvSOFzEGpjx1qsraV07IROVyKUey/bB6rjePi QXCg== X-Forwarded-Encrypted: i=1; AJvYcCX/eC9v1p/9SV6jjyP4Tqim/U0lUONozgsZR1//5sg51jIXAHtzIjn4LzJM9dQ+EXESvj0i2aRR7ZjfpRM=@vger.kernel.org X-Gm-Message-State: AOJu0Yx848AmGok0FHvEAHmNkDGFdvfbT1p7X9Q6YpeyLb4T9Cm8ykDh e3aIEL7iEqgugxwoPL6AEgsaDm4z8dVdfxKHvJCsUCC4qBLsaaqw0rqTqekpVuRwP0g= X-Gm-Gg: AeBDieun8Bnx+y4RDTJgND9tD2UoKv1Y8q7bz06VBTgmmx8OdUfVzqKME5X3qtQ3ubW 64OUG8sLm0MT7w1YZ0Z+bLTIRT2W8kQgtBBuDMnCxHwdG3rd2ss6BZxyNeOe3POMvAw7Jf3o4Xw YIWB778YF4w9hxQtZTkii4FqiHFehl0qKqR7plSEq0FFHnaL/UeoGssCwYKyCDZ7zjzMJjXJz+g WXKoSLNBQH/APYKkw4y1Zzv9m7hcB1WwhpvWSdSt0oXsIyHYjgJFUrjPPlH5pNeV95h51Sf1TlI V1sPwtO90RY9K+fYuvaIIayPtwuyoejHLR2xxp3aS8AawuC5oZU4ZLX3NVoCwylDBp7WY7hc10w 7Q/oxpPqRsFEIZjIA2irrhRbuKwDKPqpe3xF12/+6PpMfJuy1Fn2fW+aLvEnOPDpoR+DI7SSWlo 90HpUhB02xWg/OVBFr7T0CV0hEuXVwnwLY1HBVEeVkWQkOIGgpgVdi7g== X-Received: by 2002:a17:90b:3143:b0:35b:9777:8bb1 with SMTP id 98e67ed59e1d1-35de68ec770mr8284643a91.19.1775393620585; Sun, 05 Apr 2026 05:53:40 -0700 (PDT) Received: from n232-176-004.byted.org ([36.110.163.97]) by smtp.gmail.com with ESMTPSA id 98e67ed59e1d1-35de66b4808sm3748505a91.2.2026.04.05.05.53.34 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 05 Apr 2026 05:53:40 -0700 (PDT) From: Muchun Song To: Andrew Morton , David Hildenbrand , Muchun Song , Oscar Salvador , Michael Ellerman , Madhavan Srinivasan Cc: Lorenzo Stoakes , "Liam R . 
Howlett" , Vlastimil Babka , Mike Rapoport , Suren Baghdasaryan , Michal Hocko , Nicholas Piggin , Christophe Leroy , aneesh.kumar@linux.ibm.com, joao.m.martins@oracle.com, linux-mm@kvack.org, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, Muchun Song Subject: [PATCH 03/49] mm/sparse: fix vmemmap page accounting for HVOed DAX Date: Sun, 5 Apr 2026 20:51:54 +0800 Message-Id: <20260405125240.2558577-4-songmuchun@bytedance.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20260405125240.2558577-1-songmuchun@bytedance.com> References: <20260405125240.2558577-1-songmuchun@bytedance.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" When HVO is enabled for DAX, the vmemmap page accounting is wrong since it only accounts for non-HVO case. Fix the accounting by introducing section_vmemmap_pages() that returns the exact number of vmemmap pages needed for the given pfn range. Fixes: 15995a352474 ("mm: report per-page metadata information") Signed-off-by: Muchun Song --- mm/sparse-vmemmap.c | 30 ++++++++++++++++++++++++++---- 1 file changed, 26 insertions(+), 4 deletions(-) diff --git a/mm/sparse-vmemmap.c b/mm/sparse-vmemmap.c index 7aa9a97498eb..0ef96b1afbcc 100644 --- a/mm/sparse-vmemmap.c +++ b/mm/sparse-vmemmap.c @@ -724,6 +724,27 @@ static int fill_subsection_map(unsigned long pfn, unsi= gned long nr_pages) return rc; } =20 +static int __meminit section_vmemmap_pages(unsigned long pfn, unsigned lon= g nr_pages, + struct vmem_altmap *altmap, struct dev_pagemap *pgmap) +{ + unsigned int order =3D pgmap ? 
pgmap->vmemmap_shift : 0; + unsigned long pages_per_compound =3D 1L << order; + + VM_BUG_ON(!IS_ALIGNED(pfn | nr_pages, min(pages_per_compound, PAGES_PER_S= ECTION))); + VM_BUG_ON(pfn_to_section_nr(pfn) !=3D pfn_to_section_nr(pfn + nr_pages - = 1)); + + if (!vmemmap_can_optimize(altmap, pgmap)) + return DIV_ROUND_UP(nr_pages * sizeof(struct page), PAGE_SIZE); + + if (order < PFN_SECTION_SHIFT) + return VMEMMAP_RESERVE_NR * nr_pages / pages_per_compound; + + if (IS_ALIGNED(pfn, pages_per_compound)) + return VMEMMAP_RESERVE_NR; + + return 0; +} + /* * To deactivate a memory region, there are 3 cases to handle: * @@ -775,11 +796,12 @@ static void section_deactivate(unsigned long pfn, uns= igned long nr_pages, * section_activate() and pfn_valid() . */ if (!section_is_early) { - memmap_pages_add(-1L * (DIV_ROUND_UP(nr_pages * sizeof(struct page), PAG= E_SIZE))); + memmap_pages_add(-section_vmemmap_pages(pfn, nr_pages, altmap, + pgmap)); depopulate_section_memmap(pfn, nr_pages, altmap); } else if (memmap) { - memmap_boot_pages_add(-1L * (DIV_ROUND_UP(nr_pages * sizeof(struct page), - PAGE_SIZE))); + memmap_pages_add(-section_vmemmap_pages(pfn, nr_pages, altmap, + pgmap)); free_map_bootmem(memmap); } =20 @@ -822,7 +844,7 @@ static struct page * __meminit section_activate(int nid= , unsigned long pfn, return pfn_to_page(pfn); =20 memmap =3D populate_section_memmap(pfn, nr_pages, nid, altmap, pgmap); - memmap_pages_add(DIV_ROUND_UP(nr_pages * sizeof(struct page), PAGE_SIZE)); + memmap_pages_add(section_vmemmap_pages(pfn, nr_pages, altmap, pgmap)); if (!memmap) { section_deactivate(pfn, nr_pages, altmap, pgmap); return ERR_PTR(-ENOMEM); --=20 2.20.1 From nobody Sun Jun 14 19:01:43 2026 Received: from mail-pj1-f51.google.com (mail-pj1-f51.google.com [209.85.216.51]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 6F6DD82899 for ; Sun, 5 Apr 2026 12:53:48 +0000 (UTC) 
From: Muchun Song
To: Andrew Morton, David Hildenbrand, Muchun Song, Oscar Salvador,
    Michael Ellerman, Madhavan Srinivasan
Cc: Lorenzo Stoakes, "Liam R . Howlett", Vlastimil Babka, Mike Rapoport,
    Suren Baghdasaryan, Michal Hocko, Nicholas Piggin, Christophe Leroy,
    aneesh.kumar@linux.ibm.com, joao.m.martins@oracle.com,
    linux-mm@kvack.org, linuxppc-dev@lists.ozlabs.org,
    linux-kernel@vger.kernel.org, Muchun Song
Subject: [PATCH 04/49] mm/sparse: add a @pgmap parameter to arch vmemmap_populate()
Date: Sun, 5 Apr 2026 20:51:55 +0800
Message-Id: <20260405125240.2558577-5-songmuchun@bytedance.com>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20260405125240.2558577-1-songmuchun@bytedance.com>
References: <20260405125240.2558577-1-songmuchun@bytedance.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

Add the struct dev_pagemap pointer as a parameter to the
architecture-specific vmemmap_populate(), vmemmap_populate_hugepages()
and vmemmap_populate_basepages() functions.

Currently, the vmemmap optimization for DAX is handled mostly in an
architecture-agnostic way via vmemmap_populate_compound_pages(). However,
this approach skips crucial architecture-specific initialization steps.
For example, the x86 path must call sync_global_pgds() after populating
the vmemmap, which is currently being bypassed.

To fix this, we need to push the awareness of device memory optimization
(via the pgmap) down into the architecture-specific vmemmap_populate()
paths. This will allow each architecture to handle the optimization while
ensuring that its specific initialization routines (like page directory
synchronization) are correctly invoked.

This is a preparatory patch only; it changes no behavior.
The actual architecture-specific implementations and fixes will follow.

Signed-off-by: Muchun Song
---
 arch/arm64/mm/mmu.c                        |  6 +++---
 arch/loongarch/mm/init.c                   |  7 ++++---
 arch/powerpc/include/asm/book3s/64/radix.h |  3 ++-
 arch/powerpc/mm/book3s64/radix_pgtable.c   |  2 +-
 arch/powerpc/mm/init_64.c                  |  4 ++--
 arch/riscv/mm/init.c                       |  4 ++--
 arch/s390/mm/vmem.c                        |  2 +-
 arch/sparc/mm/init_64.c                    |  5 +++--
 arch/x86/mm/init_64.c                      |  8 ++++----
 include/linux/mm.h                         |  8 +++++---
 mm/hugetlb_vmemmap.c                       |  4 ++--
 mm/sparse-vmemmap.c                        | 10 ++++++----
 12 files changed, 35 insertions(+), 28 deletions(-)

diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index dc8a8281888c..86162aab5185 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -1760,7 +1760,7 @@ int __meminit vmemmap_check_pmd(pmd_t *pmdp, int node,
 }
 
 int __meminit vmemmap_populate(unsigned long start, unsigned long end, int node,
-		struct vmem_altmap *altmap)
+		struct vmem_altmap *altmap, struct dev_pagemap *pgmap)
 {
 	WARN_ON((start < VMEMMAP_START) || (end > VMEMMAP_END));
 	/* [start, end] should be within one section */
@@ -1768,9 +1768,9 @@ int __meminit vmemmap_populate(unsigned long start, unsigned long end, int node,
 
 	if (!IS_ENABLED(CONFIG_ARM64_4K_PAGES) ||
 	    (end - start < PAGES_PER_SECTION * sizeof(struct page)))
-		return vmemmap_populate_basepages(start, end, node, altmap);
+		return vmemmap_populate_basepages(start, end, node, altmap, pgmap);
 	else
-		return vmemmap_populate_hugepages(start, end, node, altmap);
+		return vmemmap_populate_hugepages(start, end, node, altmap, pgmap);
 }
 
 #ifdef CONFIG_MEMORY_HOTPLUG
diff --git a/arch/loongarch/mm/init.c b/arch/loongarch/mm/init.c
index c9c57f08fa2c..d61c2e09caae 100644
--- a/arch/loongarch/mm/init.c
+++ b/arch/loongarch/mm/init.c
@@ -123,12 +123,13 @@ int __meminit vmemmap_check_pmd(pmd_t *pmd, int node,
 }
 
 int __meminit vmemmap_populate(unsigned long start, unsigned long end,
-			       int node, struct vmem_altmap *altmap)
+			       int node, struct vmem_altmap *altmap,
+			       struct dev_pagemap *pgmap)
 {
 #if CONFIG_PGTABLE_LEVELS == 2
-	return vmemmap_populate_basepages(start, end, node, NULL);
+	return vmemmap_populate_basepages(start, end, node, NULL, pgmap);
 #else
-	return vmemmap_populate_hugepages(start, end, node, NULL);
+	return vmemmap_populate_hugepages(start, end, node, NULL, pgmap);
 #endif
 }
 
diff --git a/arch/powerpc/include/asm/book3s/64/radix.h b/arch/powerpc/include/asm/book3s/64/radix.h
index da954e779744..bde07c6f900f 100644
--- a/arch/powerpc/include/asm/book3s/64/radix.h
+++ b/arch/powerpc/include/asm/book3s/64/radix.h
@@ -321,7 +321,8 @@ extern int __meminit radix__vmemmap_create_mapping(unsigned long start,
 					     unsigned long page_size,
 					     unsigned long phys);
 int __meminit radix__vmemmap_populate(unsigned long start, unsigned long end,
-				      int node, struct vmem_altmap *altmap);
+				      int node, struct vmem_altmap *altmap,
+				      struct dev_pagemap *pgmap);
 void __ref radix__vmemmap_free(unsigned long start, unsigned long end,
 			       struct vmem_altmap *altmap);
 extern void radix__vmemmap_remove_mapping(unsigned long start,
diff --git a/arch/powerpc/mm/book3s64/radix_pgtable.c b/arch/powerpc/mm/book3s64/radix_pgtable.c
index 10aced261cff..568500343e5f 100644
--- a/arch/powerpc/mm/book3s64/radix_pgtable.c
+++ b/arch/powerpc/mm/book3s64/radix_pgtable.c
@@ -1112,7 +1112,7 @@ static inline pte_t *vmemmap_pte_alloc(pmd_t *pmdp, int node,
 
 
 int __meminit radix__vmemmap_populate(unsigned long start, unsigned long end, int node,
-				      struct vmem_altmap *altmap)
+				      struct vmem_altmap *altmap, struct dev_pagemap *pgmap)
 {
 	unsigned long addr;
 	unsigned long next;
diff --git a/arch/powerpc/mm/init_64.c b/arch/powerpc/mm/init_64.c
index b6f3ae03ca9e..8f4aa5b32186 100644
--- a/arch/powerpc/mm/init_64.c
+++ b/arch/powerpc/mm/init_64.c
@@ -275,12 +275,12 @@ static int __meminit __vmemmap_populate(unsigned long start, unsigned long end,
 }
 
 int __meminit vmemmap_populate(unsigned long start, unsigned long end, int node,
-			       struct vmem_altmap *altmap)
+			       struct vmem_altmap *altmap, struct dev_pagemap *pgmap)
 {
 
 #ifdef CONFIG_PPC_BOOK3S_64
 	if (radix_enabled())
-		return radix__vmemmap_populate(start, end, node, altmap);
+		return radix__vmemmap_populate(start, end, node, altmap, pgmap);
 #endif
 
 	return __vmemmap_populate(start, end, node, altmap);
diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
index 980f693e6b19..277c89661dff 100644
--- a/arch/riscv/mm/init.c
+++ b/arch/riscv/mm/init.c
@@ -1443,7 +1443,7 @@ int __meminit vmemmap_check_pmd(pmd_t *pmdp, int node,
 }
 
 int __meminit vmemmap_populate(unsigned long start, unsigned long end, int node,
-			       struct vmem_altmap *altmap)
+			       struct vmem_altmap *altmap, struct dev_pagemap *pgmap)
 {
 	/*
 	 * Note that SPARSEMEM_VMEMMAP is only selected for rv64 and that we
@@ -1451,7 +1451,7 @@ int __meminit vmemmap_populate(unsigned long start, unsigned long end, int node,
 	 * memory hotplug, we are not able to update all the page tables with
 	 * the new PMDs.
 	 */
-	return vmemmap_populate_hugepages(start, end, node, altmap);
+	return vmemmap_populate_hugepages(start, end, node, altmap, pgmap);
 }
 #endif
 
diff --git a/arch/s390/mm/vmem.c b/arch/s390/mm/vmem.c
index eeadff45e0e1..a7bf8d3d5601 100644
--- a/arch/s390/mm/vmem.c
+++ b/arch/s390/mm/vmem.c
@@ -506,7 +506,7 @@ static void vmem_remove_range(unsigned long start, unsigned long size)
  * Add a backed mem_map array to the virtual mem_map array.
  */
 int __meminit vmemmap_populate(unsigned long start, unsigned long end, int node,
-			       struct vmem_altmap *altmap)
+			       struct vmem_altmap *altmap, struct dev_pagemap *pgmap)
 {
 	int ret;
 
diff --git a/arch/sparc/mm/init_64.c b/arch/sparc/mm/init_64.c
index 367c269305e5..f870ca330f9e 100644
--- a/arch/sparc/mm/init_64.c
+++ b/arch/sparc/mm/init_64.c
@@ -2591,9 +2591,10 @@ int __meminit vmemmap_check_pmd(pmd_t *pmdp, int node,
 }
 
 int __meminit vmemmap_populate(unsigned long vstart, unsigned long vend,
-			       int node, struct vmem_altmap *altmap)
+			       int node, struct vmem_altmap *altmap,
+			       struct dev_pagemap *pgmap)
 {
-	return vmemmap_populate_hugepages(vstart, vend, node, NULL);
+	return vmemmap_populate_hugepages(vstart, vend, node, NULL, pgmap);
 }
 #endif /* CONFIG_SPARSEMEM_VMEMMAP */
 
diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index 77b889b71cf3..e18cc81a30b4 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -1557,7 +1557,7 @@ int __meminit vmemmap_check_pmd(pmd_t *pmd, int node,
 }
 
 int __meminit vmemmap_populate(unsigned long start, unsigned long end, int node,
-		struct vmem_altmap *altmap)
+		struct vmem_altmap *altmap, struct dev_pagemap *pgmap)
 {
 	int err;
 
@@ -1565,15 +1565,15 @@ int __meminit vmemmap_populate(unsigned long start, unsigned long end, int node,
 	VM_BUG_ON(!PAGE_ALIGNED(end));
 
 	if (end - start < PAGES_PER_SECTION * sizeof(struct page))
-		err = vmemmap_populate_basepages(start, end, node, NULL);
+		err = vmemmap_populate_basepages(start, end, node, NULL, pgmap);
 	else if (boot_cpu_has(X86_FEATURE_PSE))
-		err = vmemmap_populate_hugepages(start, end, node, altmap);
+		err = vmemmap_populate_hugepages(start, end, node, altmap, pgmap);
 	else if (altmap) {
 		pr_err_once("%s: no cpu support for altmap allocations\n",
 			    __func__);
 		err = -ENOMEM;
 	} else
-		err = vmemmap_populate_basepages(start, end, node, NULL);
+		err = vmemmap_populate_basepages(start, end, node, NULL, pgmap);
 	if (!err)
 		sync_global_pgds(start, end - 1);
 	return err;
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 0b776907152e..bebc5f892f81 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -4877,11 +4877,13 @@ void vmemmap_set_pmd(pmd_t *pmd, void *p, int node,
 int vmemmap_check_pmd(pmd_t *pmd, int node,
 		      unsigned long addr, unsigned long next);
 int vmemmap_populate_basepages(unsigned long start, unsigned long end,
-			       int node, struct vmem_altmap *altmap);
+			       int node, struct vmem_altmap *altmap,
+			       struct dev_pagemap *pgmap);
 int vmemmap_populate_hugepages(unsigned long start, unsigned long end,
-			       int node, struct vmem_altmap *altmap);
+			       int node, struct vmem_altmap *altmap,
+			       struct dev_pagemap *pgmap);
 int vmemmap_populate(unsigned long start, unsigned long end, int node,
-		struct vmem_altmap *altmap);
+		struct vmem_altmap *altmap, struct dev_pagemap *pgmap);
 int vmemmap_populate_hvo(unsigned long start, unsigned long end,
 			 unsigned int order, struct zone *zone,
 			 unsigned long headsize);
diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
index 4a077d231d3a..50b7123f3bdd 100644
--- a/mm/hugetlb_vmemmap.c
+++ b/mm/hugetlb_vmemmap.c
@@ -829,7 +829,7 @@ void __init hugetlb_vmemmap_init_late(int nid)
 			 */
 			list_del(&m->list);
 
-			vmemmap_populate(start, end, nid, NULL);
+			vmemmap_populate(start, end, nid, NULL, NULL);
 			nr_mmap = end - start;
 			memmap_boot_pages_add(DIV_ROUND_UP(nr_mmap, PAGE_SIZE));
 
@@ -845,7 +845,7 @@ void __init hugetlb_vmemmap_init_late(int nid)
 		if (vmemmap_populate_hvo(start, end, huge_page_order(h), zone,
 					 HUGETLB_VMEMMAP_RESERVE_SIZE) < 0) {
 			/* Fallback if HVO population fails */
-			vmemmap_populate(start, end, nid, NULL);
+			vmemmap_populate(start, end, nid, NULL, NULL);
 			nr_mmap = end - start;
 		} else {
 			m->flags |= HUGE_BOOTMEM_ZONES_VALID;
diff --git a/mm/sparse-vmemmap.c b/mm/sparse-vmemmap.c
index 0ef96b1afbcc..387337bba05e 100644
--- a/mm/sparse-vmemmap.c
+++ b/mm/sparse-vmemmap.c
@@ -297,7 +297,8 @@ static int __meminit vmemmap_populate_range(unsigned long start,
 }
 
 int __meminit vmemmap_populate_basepages(unsigned long start, unsigned long end,
-					 int node, struct vmem_altmap *altmap)
+					 int node, struct vmem_altmap *altmap,
+					 struct dev_pagemap *pgmap)
 {
 	return vmemmap_populate_range(start, end, node, altmap, -1, 0);
 }
@@ -400,7 +401,8 @@ int __weak __meminit vmemmap_check_pmd(pmd_t *pmd, int node,
 }
 
 int __meminit vmemmap_populate_hugepages(unsigned long start, unsigned long end,
-					 int node, struct vmem_altmap *altmap)
+					 int node, struct vmem_altmap *altmap,
+					 struct dev_pagemap *pgmap)
 {
 	unsigned long addr;
 	unsigned long next;
@@ -445,7 +447,7 @@ int __meminit vmemmap_populate_hugepages(unsigned long start, unsigned long end,
 			}
 		} else if (vmemmap_check_pmd(pmd, node, addr, next))
 			continue;
-		if (vmemmap_populate_basepages(addr, next, node, altmap))
+		if (vmemmap_populate_basepages(addr, next, node, altmap, pgmap))
 			return -ENOMEM;
 	}
 	return 0;
@@ -559,7 +561,7 @@ struct page * __meminit __populate_section_memmap(unsigned long pfn,
 	if (vmemmap_can_optimize(altmap, pgmap))
 		r = vmemmap_populate_compound_pages(pfn, start, end, nid, pgmap);
 	else
-		r = vmemmap_populate(start, end, nid, altmap);
+		r = vmemmap_populate(start, end, nid, altmap, pgmap);
 
 	if (r < 0)
 		return NULL;
-- 
2.20.1

From nobody Sun Jun 14 19:01:43 2026
From: Muchun Song
To: Andrew Morton, David Hildenbrand, Muchun Song, Oscar Salvador,
    Michael Ellerman, Madhavan Srinivasan
Cc: Lorenzo Stoakes, "Liam R . Howlett", Vlastimil Babka, Mike Rapoport,
    Suren Baghdasaryan, Michal Hocko, Nicholas Piggin, Christophe Leroy,
    aneesh.kumar@linux.ibm.com, joao.m.martins@oracle.com,
    linux-mm@kvack.org, linuxppc-dev@lists.ozlabs.org,
    linux-kernel@vger.kernel.org, Muchun Song
Subject: [PATCH 05/49] mm/sparse: fix missing architecture-specific page table sync for HVO DAX
Date: Sun, 5 Apr 2026 20:51:56 +0800
Message-Id: <20260405125240.2558577-6-songmuchun@bytedance.com>
In-Reply-To: <20260405125240.2558577-1-songmuchun@bytedance.com>
References: <20260405125240.2558577-1-songmuchun@bytedance.com>

On x86-64, vmemmap_populate() normally calls sync_global_pgds() to keep
the page tables in sync. When HVO DAX is enabled, however,
__populate_section_memmap() calls vmemmap_populate_compound_pages()
directly and bypasses this architecture-specific step. Omitting the sync
on x86-64 can later trigger faults on vmemmap accesses.

Fix this by delegating the HVO DAX decision to the architecture:

- Architectures that do not use the generic vmemmap_populate_basepages()
  or vmemmap_populate_hugepages() paths (e.g. powerpc) can implement
  HVO DAX directly in their own vmemmap_populate().
- Architectures that rely on the generic helpers inherit the optimization
  from vmemmap_populate_basepages() and vmemmap_populate_hugepages(),
  which now check vmemmap_can_optimize() themselves, so HVO DAX is
  enabled safely without extra work in the architecture code.

This prevents the x86-64 sync issue.
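For illustration only, the change in control flow can be modelled in a few lines of user-space C. This is a minimal sketch, not kernel code: every toy_* name is hypothetical, toy_can_optimize() is an assumed stand-in for vmemmap_can_optimize(), and the sync_calls counter stands in for the sync_global_pgds() call on x86-64.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct dev_pagemap; only the field used here. */
struct toy_pgmap {
	unsigned int vmemmap_shift;
};

/* Counts how often the modelled arch-specific sync step runs. */
static int sync_calls;

/* Assumed simplification of vmemmap_can_optimize(). */
static bool toy_can_optimize(const struct toy_pgmap *pgmap)
{
	return pgmap != NULL && pgmap->vmemmap_shift > 0;
}

static int toy_populate_compound(void) { return 0; }
static int toy_populate_range(void) { return 0; }

/* Generic helper: it makes the optimization decision itself. */
static int toy_populate_basepages(const struct toy_pgmap *pgmap)
{
	if (toy_can_optimize(pgmap))
		return toy_populate_compound();
	return toy_populate_range();
}

/* Arch entry point: the sync follows whichever path the helper took. */
static int toy_arch_populate(const struct toy_pgmap *pgmap)
{
	int err = toy_populate_basepages(pgmap);

	if (!err)
		sync_calls++;
	return err;
}

/* Old flow: the core code bypasses the arch entry point when optimizing. */
static int toy_old_populate_section(const struct toy_pgmap *pgmap)
{
	if (toy_can_optimize(pgmap))
		return toy_populate_compound();	/* sync is skipped here */
	return toy_arch_populate(pgmap);
}

/* New flow: the core code always goes through the arch entry point. */
static int toy_new_populate_section(const struct toy_pgmap *pgmap)
{
	return toy_arch_populate(pgmap);
}

/* Small demonstration of the difference for an optimizable pgmap. */
static void toy_demo(void)
{
	struct toy_pgmap dax = { .vmemmap_shift = 9 };

	sync_calls = 0;
	toy_old_populate_section(&dax);
	assert(sync_calls == 0);	/* old flow never reached the sync */

	toy_new_populate_section(&dax);
	assert(sync_calls == 1);	/* new flow syncs on the optimized path */
}
```

Calling toy_demo() exercises both flows. The only point of the model is that the sync step sits in the architecture entry point, so it runs only when the core code routes every request through that entry point.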
Fixes: 4917f55b4ef9 ("mm/sparse-vmemmap: improve memory savings for compound devmaps")
Signed-off-by: Muchun Song
---
 arch/powerpc/include/asm/book3s/64/radix.h |  6 ------
 arch/powerpc/mm/book3s64/radix_pgtable.c   | 15 +++++++++-----
 mm/sparse-vmemmap.c                        | 24 +++++++++++-----------
 3 files changed, 22 insertions(+), 23 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/radix.h b/arch/powerpc/include/asm/book3s/64/radix.h
index bde07c6f900f..2600defa2dc2 100644
--- a/arch/powerpc/include/asm/book3s/64/radix.h
+++ b/arch/powerpc/include/asm/book3s/64/radix.h
@@ -357,11 +357,5 @@ int radix__remove_section_mapping(unsigned long start, unsigned long end);
 #define vmemmap_can_optimize vmemmap_can_optimize
 bool vmemmap_can_optimize(struct vmem_altmap *altmap, struct dev_pagemap *pgmap);
 #endif
-
-#define vmemmap_populate_compound_pages vmemmap_populate_compound_pages
-int __meminit vmemmap_populate_compound_pages(unsigned long start_pfn,
-					      unsigned long start,
-					      unsigned long end, int node,
-					      struct dev_pagemap *pgmap);
 #endif /* __ASSEMBLER__ */
 #endif
diff --git a/arch/powerpc/mm/book3s64/radix_pgtable.c b/arch/powerpc/mm/book3s64/radix_pgtable.c
index 568500343e5f..dfa2f7dc7e15 100644
--- a/arch/powerpc/mm/book3s64/radix_pgtable.c
+++ b/arch/powerpc/mm/book3s64/radix_pgtable.c
@@ -1109,7 +1109,10 @@ static inline pte_t *vmemmap_pte_alloc(pmd_t *pmdp, int node,
 	return pte_offset_kernel(pmdp, address);
 }
 
-
+static int __meminit vmemmap_populate_compound_pages(unsigned long start_pfn,
+						     unsigned long start,
+						     unsigned long end, int node,
+						     struct dev_pagemap *pgmap);
 
 int __meminit radix__vmemmap_populate(unsigned long start, unsigned long end, int node,
 				      struct vmem_altmap *altmap, struct dev_pagemap *pgmap)
@@ -1122,6 +1125,8 @@ int __meminit radix__vmemmap_populate(unsigned long start, unsigned long end, in
 	pmd_t *pmd;
 	pte_t *pte;
 
+	if (vmemmap_can_optimize(altmap, pgmap))
+		return vmemmap_populate_compound_pages(page_to_pfn((struct page *)start), start, end, node, pgmap);
 	/*
 	 * If altmap is present, Make sure we align the start vmemmap addr
 	 * to PAGE_SIZE so that we calculate the correct start_pfn in
@@ -1303,10 +1308,10 @@ static pte_t * __meminit vmemmap_compound_tail_page(unsigned long addr,
 	return pte;
 }
 
-int __meminit vmemmap_populate_compound_pages(unsigned long start_pfn,
-					      unsigned long start,
-					      unsigned long end, int node,
-					      struct dev_pagemap *pgmap)
+static int __meminit vmemmap_populate_compound_pages(unsigned long start_pfn,
+						     unsigned long start,
+						     unsigned long end, int node,
+						     struct dev_pagemap *pgmap)
 {
 	/*
 	 * we want to map things as base page size mapping so that
diff --git a/mm/sparse-vmemmap.c b/mm/sparse-vmemmap.c
index 387337bba05e..d3096de04cc6 100644
--- a/mm/sparse-vmemmap.c
+++ b/mm/sparse-vmemmap.c
@@ -296,10 +296,16 @@ static int __meminit vmemmap_populate_range(unsigned long start,
 	return 0;
 }
 
+static int __meminit vmemmap_populate_compound_pages(unsigned long start,
+						     unsigned long end, int node,
+						     struct dev_pagemap *pgmap);
+
 int __meminit vmemmap_populate_basepages(unsigned long start, unsigned long end,
 					 int node, struct vmem_altmap *altmap,
 					 struct dev_pagemap *pgmap)
 {
+	if (vmemmap_can_optimize(altmap, pgmap))
+		return vmemmap_populate_compound_pages(start, end, node, pgmap);
 	return vmemmap_populate_range(start, end, node, altmap, -1, 0);
 }
 
@@ -411,6 +417,9 @@ int __meminit vmemmap_populate_hugepages(unsigned long start, unsigned long end,
 	pud_t *pud;
 	pmd_t *pmd;
 
+	if (vmemmap_can_optimize(altmap, pgmap))
+		return vmemmap_populate_compound_pages(start, end, node, pgmap);
+
 	for (addr = start; addr < end; addr = next) {
 		next = pmd_addr_end(addr, end);
 
@@ -453,7 +462,6 @@ int __meminit vmemmap_populate_hugepages(unsigned long start, unsigned long end,
 	return 0;
 }
 
-#ifndef vmemmap_populate_compound_pages
 /*
  * For compound pages bigger than section size (e.g. x86 1G compound
  * pages with 2M subsection size) fill the rest of sections as tail
@@ -491,14 +499,14 @@ static pte_t * __meminit compound_section_tail_page(unsigned long addr)
 	return pte;
 }
 
-static int __meminit vmemmap_populate_compound_pages(unsigned long start_pfn,
-						     unsigned long start,
+static int __meminit vmemmap_populate_compound_pages(unsigned long start,
 						     unsigned long end, int node,
 						     struct dev_pagemap *pgmap)
 {
 	unsigned long size, addr;
 	pte_t *pte;
 	int rc;
+	unsigned long start_pfn = page_to_pfn((struct page *)start);
 
 	if (reuse_compound_section(start_pfn, pgmap)) {
 		pte = compound_section_tail_page(start);
@@ -544,26 +552,18 @@ static int __meminit vmemmap_populate_compound_pages(unsigned long start_pfn,
 	return 0;
 }
 
-#endif
-
 struct page * __meminit __populate_section_memmap(unsigned long pfn,
 		unsigned long nr_pages, int nid, struct vmem_altmap *altmap,
 		struct dev_pagemap *pgmap)
 {
 	unsigned long start = (unsigned long) pfn_to_page(pfn);
 	unsigned long end = start + nr_pages * sizeof(struct page);
-	int r;
 
 	if (WARN_ON_ONCE(!IS_ALIGNED(pfn, PAGES_PER_SUBSECTION) ||
 		!IS_ALIGNED(nr_pages, PAGES_PER_SUBSECTION)))
 		return NULL;
 
-	if (vmemmap_can_optimize(altmap, pgmap))
-		r = vmemmap_populate_compound_pages(pfn, start, end, nid, pgmap);
-	else
-		r = vmemmap_populate(start, end, nid, altmap, pgmap);
-
-	if (r < 0)
+	if (vmemmap_populate(start, end, nid, altmap, pgmap))
 		return NULL;
 
 	return pfn_to_page(pfn);
-- 
2.20.1

From nobody Sun Jun 14 19:01:43 2026
From: Muchun Song
To: Andrew Morton, David Hildenbrand, Muchun Song, Oscar Salvador,
    Michael Ellerman, Madhavan Srinivasan
Cc: Lorenzo Stoakes, "Liam R . Howlett", Vlastimil Babka, Mike Rapoport,
    Suren Baghdasaryan, Michal Hocko, Nicholas Piggin, Christophe Leroy,
    aneesh.kumar@linux.ibm.com, joao.m.martins@oracle.com,
    linux-mm@kvack.org, linuxppc-dev@lists.ozlabs.org,
    linux-kernel@vger.kernel.org, Muchun Song
Subject: [PATCH 06/49] mm/mm_init: fix uninitialized pageblock migratetype for ZONE_DEVICE compound pages
Date: Sun, 5 Apr 2026 20:51:57 +0800
Message-Id: <20260405125240.2558577-7-songmuchun@bytedance.com>
In-Reply-To: <20260405125240.2558577-1-songmuchun@bytedance.com>
References: <20260405125240.2558577-1-songmuchun@bytedance.com>

Previously, memmap_init_zone_device() only initialized the migratetype of
the first pageblock of a compound page. If the compound page size exceeds
pageblock_nr_pages (e.g., 1GB hugepages with 2MB pageblocks), the
subsequent pageblocks in the compound page remained uninitialized.

Move the migratetype initialization out of __init_zone_device_page() and
into a separate function, pageblock_migratetype_init_range(). This
function iterates over the entire PFN range being initialized, ensuring
that all pageblocks are correctly initialized.
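For illustration only, the gap can be shown with a small user-space model. This is a sketch under stated assumptions: it assumes 4KB base pages and 2MB pageblocks (TOY_PAGEBLOCK_NR_PAGES = 512), all toy_* names are hypothetical, and the "old" function models only the head-page check described above rather than the full kernel logic.

```c
#include <assert.h>

/* Assumed geometry: 4KB base pages, 2MB pageblocks. */
#define TOY_PAGEBLOCK_NR_PAGES	512UL
/* 1GB compound page expressed in 4KB pages. */
#define TOY_PFNS_PER_1G		262144UL

/*
 * Old behaviour as described above: only the head pfn of each compound
 * page is tested for pageblock alignment, so at most one pageblock per
 * compound page gets its migratetype set.
 */
static unsigned long toy_old_init_count(unsigned long start_pfn,
					unsigned long nr_pages,
					unsigned long pfns_per_compound)
{
	unsigned long end = start_pfn + nr_pages;
	unsigned long pfn, count = 0;

	for (pfn = start_pfn; pfn < end; pfn += pfns_per_compound)
		if (pfn % TOY_PAGEBLOCK_NR_PAGES == 0)
			count++;
	return count;
}

/*
 * New behaviour: walk the whole range in pageblock-sized steps, in the
 * same shape as pageblock_migratetype_init_range() in the patch.
 */
static unsigned long toy_new_init_count(unsigned long start_pfn,
					unsigned long nr_pages)
{
	unsigned long end = start_pfn + nr_pages;
	/* Round the start up to the next pageblock boundary. */
	unsigned long pfn = (start_pfn + TOY_PAGEBLOCK_NR_PAGES - 1) /
			    TOY_PAGEBLOCK_NR_PAGES * TOY_PAGEBLOCK_NR_PAGES;
	unsigned long count = 0;

	for (; pfn < end; pfn += TOY_PAGEBLOCK_NR_PAGES)
		count++;
	return count;
}

/* Compare both models on a 2GB range made of two 1GB compound pages. */
static void toy_demo(void)
{
	unsigned long nr_pages = 2 * TOY_PFNS_PER_1G;

	assert(toy_old_init_count(0, nr_pages, TOY_PFNS_PER_1G) == 2);
	assert(toy_new_init_count(0, nr_pages) == 1024);
	/* Without compound pages both models agree. */
	assert(toy_old_init_count(0, nr_pages, 1) == 1024);
}
```

In this model a 2GB range backed by 1GB compound pages gets 2 pageblocks initialized by the head-only check, against 1024 when the range is walked in pageblock steps.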
Fixes: c4386bd8ee3a ("mm/memremap: add ZONE_DEVICE support for compound pag= es") Signed-off-by: Muchun Song --- mm/mm_init.c | 41 ++++++++++++++++++++++++++--------------- 1 file changed, 26 insertions(+), 15 deletions(-) diff --git a/mm/mm_init.c b/mm/mm_init.c index 9a44e8458fed..4936ca78966c 100644 --- a/mm/mm_init.c +++ b/mm/mm_init.c @@ -674,6 +674,18 @@ static inline void fixup_hashdist(void) static inline void fixup_hashdist(void) {} #endif /* CONFIG_NUMA */ =20 +static __meminit void pageblock_migratetype_init_range(unsigned long pfn, + unsigned long nr_pages, + int migratetype) +{ + unsigned long end =3D pfn + nr_pages; + + for (pfn =3D pageblock_align(pfn); pfn < end; pfn +=3D pageblock_nr_pages= ) { + init_pageblock_migratetype(pfn_to_page(pfn), migratetype, false); + cond_resched(); + } +} + /* * Initialize a reserved page unconditionally, finding its zone first. */ @@ -1011,21 +1023,6 @@ static void __ref __init_zone_device_page(struct pag= e *page, unsigned long pfn, page_folio(page)->pgmap =3D pgmap; page->zone_device_data =3D NULL; =20 - /* - * Mark the block movable so that blocks are reserved for - * movable at startup. This will force kernel allocations - * to reserve their blocks rather than leaking throughout - * the address space during boot when many long-lived - * kernel allocations are made. 
- * - * Please note that MEMINIT_HOTPLUG path doesn't clear memmap - * because this is done early in section_activate() - */ - if (pageblock_aligned(pfn)) { - init_pageblock_migratetype(page, MIGRATE_MOVABLE, false); - cond_resched(); - } - /* * ZONE_DEVICE pages other than MEMORY_TYPE_GENERIC are released * directly to the driver page allocator which will set the page count @@ -1122,6 +1119,8 @@ void __ref memmap_init_zone_device(struct zone *zone, =20 __init_zone_device_page(page, pfn, zone_idx, nid, pgmap); =20 + cond_resched(); + if (pfns_per_compound =3D=3D 1) continue; =20 @@ -1129,6 +1128,18 @@ void __ref memmap_init_zone_device(struct zone *zone, compound_nr_pages(altmap, pgmap)); } =20 + /* + * Mark the block movable so that blocks are reserved for + * movable at startup. This will force kernel allocations + * to reserve their blocks rather than leaking throughout + * the address space during boot when many long-lived + * kernel allocations are made. + * + * Please note that MEMINIT_HOTPLUG path doesn't clear memmap + * because this is done early in section_activate() + */ + pageblock_migratetype_init_range(start_pfn, nr_pages, MIGRATE_MOVABLE); + pr_debug("%s initialised %lu pages in %ums\n", __func__, nr_pages, jiffies_to_msecs(jiffies - start)); } --=20 2.20.1 From nobody Sun Jun 14 19:01:43 2026 Received: from mail-pj1-f41.google.com (mail-pj1-f41.google.com [209.85.216.41]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 7754C25785D for ; Sun, 5 Apr 2026 12:54:06 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.216.41 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1775393647; cv=none; b=lLEHAT/C8x5hfFURx9SKJSptf2jGAJ3N6OvF2EwcOuskZ/N5EYCXyB7JClPZGzVJ+H6RejmHnEkOVCCwMYFgv7dGTC2zVv39G24PyFmnq4nXzRjf8sbXD1qSrc617di9gWYjV9ZvuXqZOytPo6HhEU0rZeBY7Yhs8DOcTP6B5hg= 
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1775393647; c=relaxed/simple; bh=j5oZwPlJIuy1hsZM/j6hSlHs/OyXkslyO2d4256L5W8=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=V//D3eglKOl1UcCCCvzmmReND8aHuXHbNXdcMZScrk/9j4cM99kH0xuO7wzT1i71QKcegrqZsuLsYCELOyGCA+dUcpfJfPFHEo74kNQKmjlHfdEV0U3/FWoJcAWr3PIEJcX1t/gK9Z5IP2hB50nw6b1vXNwATZLRgknonG6eeDQ= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=bytedance.com; spf=pass smtp.mailfrom=bytedance.com; dkim=pass (2048-bit key) header.d=bytedance.com header.i=@bytedance.com header.b=BwAHLUh9; arc=none smtp.client-ip=209.85.216.41 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=bytedance.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=bytedance.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=bytedance.com header.i=@bytedance.com header.b="BwAHLUh9" Received: by mail-pj1-f41.google.com with SMTP id 98e67ed59e1d1-35da2d35eccso2001193a91.0 for ; Sun, 05 Apr 2026 05:54:06 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1775393646; x=1775998446; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=k9npI1C6XAEv6UQdoRbep1PBwjjWINNu45ua4UkL2pA=; b=BwAHLUh9cslgNrFbRnSdGpqbbZ0Iaj5OZ1JvSUw4GKiiuu9JOH//KLeijfJ5T1xRWJ PAeI2xIshyYy5gp8iSnPJ1HB52KfjP2GsR6SRrTyz1u2/DVJY/Dlj/3NFmkYiPbPZ3Tv xqsqHHWzv2Ls05d9qKY/jAGMMKahiwbzHVxVDmr3j7MJpWxVweobwgdmQpSAv0GE5njS kueR2Pft1aQPD6r6yNqtQu37ZLJovgov/VifKi3M5DsBdAcAg4/F2jh43VdBv+eqIsi3 qzdGHos9Yvyabak/TJ5YAQk8Yuc4GjEo4qemg+Djh7BfVKHGXLv+B/8z9mxvnogz30M+ B1+Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1775393646; x=1775998446; 
h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=k9npI1C6XAEv6UQdoRbep1PBwjjWINNu45ua4UkL2pA=; b=URU1AHe9B4DQ0kLn2df53Ykg/JD5OKwszdBJM6690/gT/oVwsuMUP5MjjxTm6N5Xro 23Havaf9FYja8WjFppsdNOdNGeUAplC1qmM/pepkgIEYw0InfD74curonPtT4gcuCdMn n+S9kUsmfRa8bKmTd2Gl8aGIf3dNyQNIvCvKxy0vVK3o7ZK0kA0oOFMAm6p+pGDR6hjc svnFFv03xGjYZR68qkmSdh4lMh36LmRIsVETKrzvujwo4z/0H/NZDG8zHl2TpQvjRzmN 9QqiXs3/82JdSnn5kKW33d4maVZMAegUNG9gVuooCCmal9e51JMsQu87RffVpZtqalkh nTzQ== X-Forwarded-Encrypted: i=1; AJvYcCWgydtstxNPV0kElc69bIA5tDOo/jcCZXSovZF5K2XS1jRNCNcIk8ES4fb7PUzPr9UOqirKEqtmSeENFfs=@vger.kernel.org X-Gm-Message-State: AOJu0Yx9k+tU02ZxYj6l9udhgEbLn9nFA8c1RJiKNTYoRcRegE1LKbpa JacOGsPHyFoqA+qwOK//ypWOErDUEhTQCD9yIgUl3zpUIzrsuKYPs5b45+5UdBDX5/o= X-Gm-Gg: AeBDieuqnFao2T7+77SILwsiUneyHCEsbYh2uX1QeqZUxdcHfPPPu+hA4ekMv14n1yU TUdqVRN6Iu508sS/nLf+SAlYYtVlKidpRamr9eCLnn1i8mkHP5Kl1uNjfnIB4bgOGEsstMCkaVU n1EDTBfrFPDlcuXlMvIj+Mb2HljLwSaXB6/hv/osWOXxXJYD+5nBhl7HqfXpPBKGWkhf1GCrzYD MLYDfFXlmWsjqa0Xp2N+/HwyHlvZ7iB0uwvlv1EltQvFJFwON+sTw+vvspvNLunZ0uxUpXW/izn zL5Nk9QLfdn4/B9XyJOzD3320687rg8ZOlH/o2g2wdEvykgDLa6V4czPHEgQ83t/3n1DCZeyO16 U3xRpmLMXgQ4+G8Svs4JIfn/Sx2BSYU44SpVZlmFC9xezt5m9TPboUF6f/vr5OGwQOt+6qHINLS WYlRDGWpczAgbT0BFQErdnBhxMOQP/kHTnrCW/l352ndg= X-Received: by 2002:a17:90b:586d:b0:35b:e4f8:7ac5 with SMTP id 98e67ed59e1d1-35de686380bmr8252102a91.7.1775393645672; Sun, 05 Apr 2026 05:54:05 -0700 (PDT) Received: from n232-176-004.byted.org ([36.110.163.97]) by smtp.gmail.com with ESMTPSA id 98e67ed59e1d1-35de66b4808sm3748505a91.2.2026.04.05.05.54.00 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 05 Apr 2026 05:54:05 -0700 (PDT) From: Muchun Song To: Andrew Morton , David Hildenbrand , Muchun Song , Oscar Salvador , Michael Ellerman , Madhavan Srinivasan Cc: Lorenzo Stoakes , "Liam R . 
Howlett" , Vlastimil Babka , Mike Rapoport , Suren Baghdasaryan , Michal Hocko , Nicholas Piggin , Christophe Leroy , aneesh.kumar@linux.ibm.com, joao.m.martins@oracle.com, linux-mm@kvack.org, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, Muchun Song Subject: [PATCH 07/49] mm/mm_init: use pageblock_migratetype_init_range() in deferred_free_pages() Date: Sun, 5 Apr 2026 20:51:58 +0800 Message-Id: <20260405125240.2558577-8-songmuchun@bytedance.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20260405125240.2558577-1-songmuchun@bytedance.com> References: <20260405125240.2558577-1-songmuchun@bytedance.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Simplify deferred_free_pages() by replacing the duplicate loops that initialize the pageblock migratetype with a single call to pageblock_migratetype_init_range().
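The replacement relies on the per-page loop (check every PFN for pageblock alignment) and the strided loop (round up once, then step by one pageblock) visiting the same pageblock starts. A small user-space model of that equivalence is sketched below. It is not kernel code: PB_PAGES, MAX_PFN and the mark_*() and loops_agree() helpers are hypothetical names chosen for the example, and the model assumes pfn + nr_pages <= MAX_PFN.

```c
#include <stdbool.h>
#include <string.h>

#define PB_PAGES 8UL	/* hypothetical, deliberately tiny pageblock size */
#define MAX_PFN  256UL	/* model limit: pfn + nr_pages must not exceed this */

/* Per-page loop: mark a pageblock whenever the current PFN is aligned. */
static void mark_per_page(bool *marked, unsigned long pfn, unsigned long nr_pages)
{
	unsigned long i;

	for (i = 0; i < nr_pages; i++, pfn++)
		if (pfn % PB_PAGES == 0)
			marked[pfn / PB_PAGES] = true;
}

/* Strided loop: round the start up once, then step one pageblock at a time. */
static void mark_strided(bool *marked, unsigned long pfn, unsigned long nr_pages)
{
	unsigned long end = pfn + nr_pages;

	for (pfn = (pfn + PB_PAGES - 1) / PB_PAGES * PB_PAGES; pfn < end; pfn += PB_PAGES)
		marked[pfn / PB_PAGES] = true;
}

/* Return true when both loops mark exactly the same set of pageblocks. */
static bool loops_agree(unsigned long pfn, unsigned long nr_pages)
{
	bool a[MAX_PFN / PB_PAGES] = { false };
	bool b[MAX_PFN / PB_PAGES] = { false };

	mark_per_page(a, pfn, nr_pages);
	mark_strided(b, pfn, nr_pages);
	return memcmp(a, b, sizeof(a)) == 0;
}
```

In this model the two loops agree for aligned and unaligned starts alike, including ranges that contain no pageblock start at all.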
Signed-off-by: Muchun Song Acked-by: Mike Rapoport (Microsoft) --- mm/mm_init.c | 13 ++++--------- 1 file changed, 4 insertions(+), 9 deletions(-) diff --git a/mm/mm_init.c b/mm/mm_init.c index 4936ca78966c..a92c5053f63d 100644 --- a/mm/mm_init.c +++ b/mm/mm_init.c @@ -1974,13 +1974,12 @@ static void __init deferred_free_pages(unsigned lon= g pfn, if (!nr_pages) return; =20 + pageblock_migratetype_init_range(pfn, nr_pages, MIGRATE_MOVABLE); + page =3D pfn_to_page(pfn); =20 /* Free a large naturally-aligned chunk if possible */ if (nr_pages =3D=3D MAX_ORDER_NR_PAGES && IS_MAX_ORDER_ALIGNED(pfn)) { - for (i =3D 0; i < nr_pages; i +=3D pageblock_nr_pages) - init_pageblock_migratetype(page + i, MIGRATE_MOVABLE, - false); __free_pages_core(page, MAX_PAGE_ORDER, MEMINIT_EARLY); return; } @@ -1988,12 +1987,8 @@ static void __init deferred_free_pages(unsigned long= pfn, /* Accept chunks smaller than MAX_PAGE_ORDER upfront */ accept_memory(PFN_PHYS(pfn), nr_pages * PAGE_SIZE); =20 - for (i =3D 0; i < nr_pages; i++, page++, pfn++) { - if (pageblock_aligned(pfn)) - init_pageblock_migratetype(page, MIGRATE_MOVABLE, - false); - __free_pages_core(page, 0, MEMINIT_EARLY); - } + for (i =3D 0; i < nr_pages; i++) + __free_pages_core(page + i, 0, MEMINIT_EARLY); } =20 /* Completion tracking for deferred_init_memmap() threads */ --=20 2.20.1 From nobody Sun Jun 14 19:01:43 2026 Received: from mail-pj1-f46.google.com (mail-pj1-f46.google.com [209.85.216.46]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 9C5F425785D for ; Sun, 5 Apr 2026 12:54:15 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.216.46 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1775393656; cv=none; 
b=keJkkHG95/u/BM5k64J7JMI6FgqnNOIzQu9WmQS0WCS3p3qgv3leXCsQxFX6idILX9Uolj9UM7I/hrgV3FoSKGkhquEyBbpxH9RYey3MmjZqmuTm2U6iU5qiQHfNcpJHOp9jAP27cH/dLFrevogiSUAtXm6X4N01LilRg/5E66s= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1775393656; c=relaxed/simple; bh=qJzRA6L4X5lr3u0nDVMt/geRN/GJYI07GSlUhBSuQMg=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=opzATKwPGRHz1q2/ixbrYDCgWnSzm6iUGr2Tg/S+lBYk2SKEUo62RHoFyMURXzzVaBpGpio8BkkQW3SOWPbcUON73rb27BYlBx2r1kWRW5kkoCHX5YoiAFW9dKj+tfgH7OIyK2w9FKpBe/aITbcAeQVqneYMMzW4i9JF2YfzHqM= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=bytedance.com; spf=pass smtp.mailfrom=bytedance.com; dkim=pass (2048-bit key) header.d=bytedance.com header.i=@bytedance.com header.b=BOXqCmzz; arc=none smtp.client-ip=209.85.216.46 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=bytedance.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=bytedance.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=bytedance.com header.i=@bytedance.com header.b="BOXqCmzz" Received: by mail-pj1-f46.google.com with SMTP id 98e67ed59e1d1-35d971fbcddso1861106a91.1 for ; Sun, 05 Apr 2026 05:54:15 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1775393655; x=1775998455; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=oG1BlwIWBanSrBSTwRxoxLhn0mTG5YlmPdthc1egVX8=; b=BOXqCmzztNXF+dFbyAQlXXskgiUQBI8Bz/KkwO5VwAGf3rcASL8RFLG50GGCS4jAHN 7BUe4F18NYMLW9oDs/oKlcuxbOFTIzia2ZOAOVLk/1yBDsntnKghqm5bycJyvXatyrpk UrPaJc7VV5Mv6RoSvaWPU65LRz+GjXWwxznMUexAp2gqajUNOZ0s0UU1pkMCL3EGmBZY jjOtCKgdtFzu/r026VQaN1D7V664s0S8PdJNBGhN/chywE7i+pReaNZHpnBUZEVReaB0 
btDtytsWea4BVoF/V5sY7wEXoEvsoAG0rv/fqjdYFySrd0VA8/nFZjjzx2z6cqshEBmY kDRQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1775393655; x=1775998455; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=oG1BlwIWBanSrBSTwRxoxLhn0mTG5YlmPdthc1egVX8=; b=h6RxtFXtEAZd3thYCrlVgFMij3fyD590vH+gHYSDo4nCkCpexHcumWyyXbAwWnqnUQ PEzxKWCHXoORmSrbMoaeLjhc5qo6ULax9TZzsuPn37fmg/JrIDcAD+MyqYnT1v7HdANy MtiBZhdtdzkoRwNLLKPgUDlZSrxC8rJTtC6fAoF68mR29EPeumbkehd2xOhdNCR0VlPw yW/01gLya91qJaimPHqdK4072DKdIxegsjingHKy+yWAz7L2wS8NR3cSIN9bC5QLANkb zXTneGERrwmP3+rNQSlPxopsoZUh2h0SIHFRIXyS5iq2U84kXsOzt8dl0x+Z2IJVP8yI Yr/g== X-Forwarded-Encrypted: i=1; AJvYcCXiu8TWNgTC0MJrtbZzQxdLtVrTmbu5Z6Oh25ghvZbkpC/6f3008qzkswu9wleW2Icwu2O6bdCqyPNeGgc=@vger.kernel.org X-Gm-Message-State: AOJu0YzVH7etjmMrwesE2P4q8+dBVCtyfd79V+aMRh8+MCqXcIkHiiJw GXZ4Ub+F6zNNk5kSYL6Jhh0QR6QHcm/2y7S//seFPf821uNc7qvOiWLlayeR43fVV4s= X-Gm-Gg: AeBDieurtu0yZ3Sb+LeEnzkxMzcdf/nY3+Akj3DpV0c6bHMr6ZSr8AHjTPhznLwDC2N SsE9S0Zo8lmieoUKBgqto4hbxld/CNbuWjQZkzx+4kfTAnNrLeZ0As/2qmLFN4pVOovrA6Ah+Rt j75iDtOfB8ibFzYqvcsZknwPHmDI4WaE3kXPCfEjySCDsO/kViTCHDYU+FUV/6iGr3CfWGLgVD1 oNQwW4zyxx6RRnlQxWYPnBzrxKDSybaRv8zbmCcLBsvYiG1hyakChyZM/ax/88FuPkEK3O3CAO4 6WJ0p3GK6sPqZ3gJ4gHgLkJHUVKuOlfe4Vz2z57+BEl3Te5Kke73eDwlrmqOk/KMprPnbzw2z2k MpwqR9APQnqupt4DjiD4VS0KUH2qlI/v3zjXh3fp+6+KLg0KgNMyTbD5+ClziIQkj4cxOQE8Ps4 NxPl9aO8IqQefQRxtmCufX7eQIxygQOhM1x3gfzjS1WOo= X-Received: by 2002:a17:90b:52c7:b0:35b:a44f:b80 with SMTP id 98e67ed59e1d1-35de591e934mr6077709a91.1.1775393654892; Sun, 05 Apr 2026 05:54:14 -0700 (PDT) Received: from n232-176-004.byted.org ([36.110.163.97]) by smtp.gmail.com with ESMTPSA id 98e67ed59e1d1-35de66b4808sm3748505a91.2.2026.04.05.05.54.06 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 05 Apr 2026 05:54:14 -0700 (PDT) From: Muchun Song To: Andrew Morton , David 
Hildenbrand , Muchun Song , Oscar Salvador , Michael Ellerman , Madhavan Srinivasan Cc: Lorenzo Stoakes , "Liam R . Howlett" , Vlastimil Babka , Mike Rapoport , Suren Baghdasaryan , Michal Hocko , Nicholas Piggin , Christophe Leroy , aneesh.kumar@linux.ibm.com, joao.m.martins@oracle.com, linux-mm@kvack.org, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, Chengkaitao Subject: [PATCH 08/49] mm: Convert vmemmap_p?d_populate() to static functions Date: Sun, 5 Apr 2026 20:51:59 +0800 Message-Id: <20260405125240.2558577-9-songmuchun@bytedance.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20260405125240.2558577-1-songmuchun@bytedance.com> References: <20260405125240.2558577-1-songmuchun@bytedance.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Chengkaitao Since the vmemmap_p?d_populate functions are unused outside the mm subsystem, we can remove their external declarations and convert them to static functions. 
Signed-off-by: Chengkaitao --- include/linux/mm.h | 7 ------- mm/sparse-vmemmap.c | 10 +++++----- 2 files changed, 5 insertions(+), 12 deletions(-) diff --git a/include/linux/mm.h b/include/linux/mm.h index bebc5f892f81..aa8c05de7585 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -4860,13 +4860,6 @@ unsigned long section_map_size(void); struct page * __populate_section_memmap(unsigned long pfn, unsigned long nr_pages, int nid, struct vmem_altmap *altmap, struct dev_pagemap *pgmap); -pgd_t *vmemmap_pgd_populate(unsigned long addr, int node); -p4d_t *vmemmap_p4d_populate(pgd_t *pgd, unsigned long addr, int node); -pud_t *vmemmap_pud_populate(p4d_t *p4d, unsigned long addr, int node); -pmd_t *vmemmap_pmd_populate(pud_t *pud, unsigned long addr, int node); -pte_t *vmemmap_pte_populate(pmd_t *pmd, unsigned long addr, int node, - struct vmem_altmap *altmap, unsigned long ptpfn, - unsigned long flags); void *vmemmap_alloc_block(unsigned long size, int node); struct vmem_altmap; void *vmemmap_alloc_block_buf(unsigned long size, int node, diff --git a/mm/sparse-vmemmap.c b/mm/sparse-vmemmap.c index d3096de04cc6..0ee03db0b22f 100644 --- a/mm/sparse-vmemmap.c +++ b/mm/sparse-vmemmap.c @@ -151,7 +151,7 @@ void __meminit vmemmap_verify(pte_t *pte, int node, start, end - 1); } =20 -pte_t * __meminit vmemmap_pte_populate(pmd_t *pmd, unsigned long addr, int= node, +static pte_t * __meminit vmemmap_pte_populate(pmd_t *pmd, unsigned long ad= dr, int node, struct vmem_altmap *altmap, unsigned long ptpfn, unsigned long flags) { @@ -195,7 +195,7 @@ static void * __meminit vmemmap_alloc_block_zero(unsign= ed long size, int node) return p; } =20 -pmd_t * __meminit vmemmap_pmd_populate(pud_t *pud, unsigned long addr, int= node) +static pmd_t * __meminit vmemmap_pmd_populate(pud_t *pud, unsigned long ad= dr, int node) { pmd_t *pmd =3D pmd_offset(pud, addr); if (pmd_none(*pmd)) { @@ -208,7 +208,7 @@ pmd_t * __meminit vmemmap_pmd_populate(pud_t *pud, unsi= gned long addr, int 
node) return pmd; } =20 -pud_t * __meminit vmemmap_pud_populate(p4d_t *p4d, unsigned long addr, int= node) +static pud_t * __meminit vmemmap_pud_populate(p4d_t *p4d, unsigned long ad= dr, int node) { pud_t *pud =3D pud_offset(p4d, addr); if (pud_none(*pud)) { @@ -221,7 +221,7 @@ pud_t * __meminit vmemmap_pud_populate(p4d_t *p4d, unsi= gned long addr, int node) return pud; } =20 -p4d_t * __meminit vmemmap_p4d_populate(pgd_t *pgd, unsigned long addr, int= node) +static p4d_t * __meminit vmemmap_p4d_populate(pgd_t *pgd, unsigned long ad= dr, int node) { p4d_t *p4d =3D p4d_offset(pgd, addr); if (p4d_none(*p4d)) { @@ -234,7 +234,7 @@ p4d_t * __meminit vmemmap_p4d_populate(pgd_t *pgd, unsi= gned long addr, int node) return p4d; } =20 -pgd_t * __meminit vmemmap_pgd_populate(unsigned long addr, int node) +static pgd_t * __meminit vmemmap_pgd_populate(unsigned long addr, int node) { pgd_t *pgd =3D pgd_offset_k(addr); if (pgd_none(*pgd)) { --=20 2.20.1 From nobody Sun Jun 14 19:01:43 2026 Received: from mail-pj1-f41.google.com (mail-pj1-f41.google.com [209.85.216.41]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 4CFC32BEC2A for ; Sun, 5 Apr 2026 12:54:22 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.216.41 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1775393663; cv=none; b=gkLexs84Uwi6IUbpSMEEtKQyLyPjKIGlGh/WpVG63KyIdboOz3wlQQqZhuEkA8Mu6UUVqftb19b6OajrqD49S9jONYSK91iCVwiux7WcAw5otWClvS2gQC3SggXLCfl3D4fLO091c5EYvTI3Fu4N5Bvrcvqj3jbP30+m+2dW0IU= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1775393663; c=relaxed/simple; bh=DzWaG3sOq/U7APC8LtQsrtZPxib9BUIMxwqoluvAtqc=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; 
b=CEIXmDk5DEb24rKBgdy6BiZNNUDXF8dOpVAwuUL/zEcwiqnX93zArnasZ8NATETBE90HRjc7d1+eg3MjW2pWQE58saw+QUaLElSB8LY/Ltr5kYj4Z/AxNPcBGP/AoYOA2vz4WvI+QcCmKzSb8aIORx2o4b0lHLEjldv/ZnBSVlc= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=bytedance.com; spf=pass smtp.mailfrom=bytedance.com; dkim=pass (2048-bit key) header.d=bytedance.com header.i=@bytedance.com header.b=ly06rJtV; arc=none smtp.client-ip=209.85.216.41 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=bytedance.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=bytedance.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=bytedance.com header.i=@bytedance.com header.b="ly06rJtV" Received: by mail-pj1-f41.google.com with SMTP id 98e67ed59e1d1-35da9c0c007so2932691a91.2 for ; Sun, 05 Apr 2026 05:54:22 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1775393662; x=1775998462; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=32+AkcMilp9VVAfduQufKDvkeOaHLrnbBWuUoleVhXI=; b=ly06rJtVYzJG9JMj9GPqRm3Z3c6Cy07Ck2wAc4wjYpkZEGLXfj45EN36C1bt34xSFq mQoRf+eVjnTDtAUg8yVMbT6QFQFL3qTZ6b1Z2cuhgIayDTAdQ4wkA3X0YCkwTBhQPyDi Xmri6ZfvGUbV9cUwLeYYhM1MMQDYnTmY59XvF2f7DS35lEUQJrXyl1B4oVpSEam4BDjr Eeq+ZxhZwmePXrpoCN5HWVvTGeE6TmciMSXfadhK7eK0lEvnlZgP2mW65VwPvTyihFbq zVwPWJfv0MPw1er6lOs8wl6gta18QaF9YAy0RbyOrMMJGwvPLQUph61rdcxbKitvS7ig SD0A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1775393662; x=1775998462; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=32+AkcMilp9VVAfduQufKDvkeOaHLrnbBWuUoleVhXI=; 
b=ON1ZBeymxo2I/spVYv9+Y9iYN4ZOpevwT5y8XrHkjWELKBysrIbBW7uFlJzgVdElwd zXKxvfYIxjeiLI/dcdJ77Dnnj4Hn2pXkhiC9ha2NotLcWIBa4IxNr8885ihgJ+Pzi7Cp iqa9KbunazfSWtVK/qTai0eriGL9rW4P4NxqniEVlG/pr0iWR/j44+eLbbt0g5jdjK1y 8D/ZIPaehEDpahnf/8kpdBTfLAUFS4WvA7f1udsYFeNZyUT9wd+qAY4Dkca7yFNxYKnH vYj0J8CExguwnp17Src9e24yvCQtCh/pxYVBVpjBiG+9RKmoM5rwcuA/c0lp5qsQxJZ7 HGhA== X-Forwarded-Encrypted: i=1; AJvYcCVolLdGmAq8SZqI0dr520GNpRGiYsFV98aT6X6olUyByB1CFA4XR32oqzHoq3wrVWWA3AYRPjZygezMYrY=@vger.kernel.org X-Gm-Message-State: AOJu0YyxteSbo2FrbmpSMqC/EMkvShcvAJ6jd34zwCWnoymYVU0ahIEh Dx71MpTxF/nY7wDQ+w+dVrGm3HleUOAgYtaVUMDnpU7nBd8GLGANRJbKQwqf4GpM9dQ= X-Gm-Gg: AeBDietNFOGDe7B7x1GHKYBak9tMU38DbZLS03iiiHC9AoxnQFIKkA65nzgsPtAyfb0 mUitkwpU+PEPJUoKZoToyFNsWpo9iV6Xuq4tdy05XdipIAQN6T9PLxfTUtdK4gjJbqi+WeWUn3z IpQQNPkrjlgNBbRCrOJMrRuKkK7o2g6eNBX5g3VzcyYx+fykrPpDPl7vBUMe9D2ndTI1xlp9b61 h8g+7pp89DvHkUejby9IeCtC0cvLQtPOEvLPSYjOGKyFtg2SMhWkrmoKinDnsQ9IiRZliQJsw+A 5vIxeR8F2ifsTinFI1TBzE2ZdAaXyO9Y4JDikUTbpoDbA5LI5JJ3wX+OqrUSv678YAnplDukSij 8vULNJVmWpv/+qAMjzirPfAHD6SErkYjIfxs3yQGnWVnkiQVBkq7IesuWqKrJJ2VYLzde3wjVB6 TQfceDYPYDmbdneOG+BalJRhEzXG4KvK7obcxCzM8tE3c= X-Received: by 2002:a17:90b:5111:b0:35d:93ff:2855 with SMTP id 98e67ed59e1d1-35de67eba6amr8915799a91.8.1775393660266; Sun, 05 Apr 2026 05:54:20 -0700 (PDT) Received: from n232-176-004.byted.org ([36.110.163.97]) by smtp.gmail.com with ESMTPSA id 98e67ed59e1d1-35de66b4808sm3748505a91.2.2026.04.05.05.54.15 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 05 Apr 2026 05:54:19 -0700 (PDT) From: Muchun Song To: Andrew Morton , David Hildenbrand , Muchun Song , Oscar Salvador , Michael Ellerman , Madhavan Srinivasan Cc: Lorenzo Stoakes , "Liam R . 
Howlett" , Vlastimil Babka , Mike Rapoport , Suren Baghdasaryan , Michal Hocko , Nicholas Piggin , Christophe Leroy , aneesh.kumar@linux.ibm.com, joao.m.martins@oracle.com, linux-mm@kvack.org, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, Muchun Song Subject: [PATCH 09/49] mm: panic on memory allocation failure in sparse_init_nid() Date: Sun, 5 Apr 2026 20:52:00 +0800 Message-Id: <20260405125240.2558577-10-songmuchun@bytedance.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20260405125240.2558577-1-songmuchun@bytedance.com> References: <20260405125240.2558577-1-songmuchun@bytedance.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" When vmemmap page allocation or usemap allocation fails, sparse_init_nid() currently only marks the corresponding section as non-present. However, subsequent code such as memmap_init(), which iterates over PFNs, does not check for non-present sections, leading to invalid memory accesses (additionally, subsection_map_init() accesses the unallocated usemap). It is complex to audit and fix all boot-time PFN iterators to handle these partially initialized sections correctly. Since vmemmap and usemap allocation failures are extremely rare during early boot, the more appropriate approach is to expose the problem as early as possible. Therefore, call panic() immediately if allocation fails, instead of attempting a partial recovery that leads to obscure crashes later.
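As a rough user-space analogue of this fail-fast policy, the sketch below stops at the allocation site instead of returning a half-initialized state to later code. It is illustrative only: abort() stands in for the kernel's panic(), and xalloc_or_die() is a hypothetical helper, not an existing kernel or libc function.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical helper: allocate or stop at once. abort() stands in for
 * the kernel's panic(); nothing here is kernel API.
 */
static void *xalloc_or_die(size_t size, const char *what)
{
	void *p = malloc(size);

	if (!p) {
		/* Report at the failure site so the cause is not obscured later. */
		fprintf(stderr, "%s allocation failed\n", what);
		abort();
	}
	return p;
}
```

Callers of such a helper never see a NULL result, so no later iterator has to cope with a partially initialized structure.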
Signed-off-by: Muchun Song Acked-by: Mike Rapoport (Microsoft) --- mm/sparse.c | 37 ++++++++----------------------------- 1 file changed, 8 insertions(+), 29 deletions(-) diff --git a/mm/sparse.c b/mm/sparse.c index effdac6b0ab1..5c12b979a618 100644 --- a/mm/sparse.c +++ b/mm/sparse.c @@ -354,19 +354,15 @@ static void __init sparse_init_nid(int nid, unsigned = long pnum_begin, unsigned long map_count) { unsigned long pnum; - struct page *map; - struct mem_section *ms; - - if (sparse_usage_init(nid, map_count)) { - pr_err("%s: node[%d] usemap allocation failed", __func__, nid); - goto failed; - } =20 + if (sparse_usage_init(nid, map_count)) + panic("The node[%d] usemap allocation failed\n", nid); sparse_buffer_init(map_count * section_map_size(), nid); =20 sparse_vmemmap_init_nid_early(nid); =20 for_each_present_section_nr(pnum_begin, pnum) { + struct mem_section *ms; unsigned long pfn =3D section_nr_to_pfn(pnum); =20 if (pnum >=3D pnum_end) @@ -374,16 +370,12 @@ static void __init sparse_init_nid(int nid, unsigned = long pnum_begin, =20 ms =3D __nr_to_section(pnum); if (!preinited_vmemmap_section(ms)) { + struct page *map; + map =3D __populate_section_memmap(pfn, PAGES_PER_SECTION, - nid, NULL, NULL); - if (!map) { - pr_err("%s: node[%d] memory map backing failed. Some memory will not b= e available.", - __func__, nid); - pnum_begin =3D pnum; - sparse_usage_fini(); - sparse_buffer_fini(); - goto failed; - } + nid, NULL, NULL); + if (!map) + panic("Populate section (%ld) on node[%d] failed\n", pnum, nid); memmap_boot_pages_add(DIV_ROUND_UP(PAGES_PER_SECTION * sizeof(struct pa= ge), PAGE_SIZE)); sparse_init_early_section(nid, map, pnum, 0); @@ -391,19 +383,6 @@ static void __init sparse_init_nid(int nid, unsigned l= ong pnum_begin, } sparse_usage_fini(); sparse_buffer_fini(); - return; -failed: - /* - * We failed to allocate, mark all the following pnums as not present, - * except the ones already initialized earlier. 
- */ - for_each_present_section_nr(pnum_begin, pnum) { - if (pnum >=3D pnum_end) - break; - ms =3D __nr_to_section(pnum); - if (!preinited_vmemmap_section(ms)) - ms->section_mem_map =3D 0; - } } =20 /* --=20 2.20.1 From nobody Sun Jun 14 19:01:43 2026 Received: from mail-pj1-f52.google.com (mail-pj1-f52.google.com [209.85.216.52]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id E8AAA2C11E6 for ; Sun, 5 Apr 2026 12:54:27 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.216.52 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1775393670; cv=none; b=gU6tGvnb+VcuUpSCAfJKuFOJ72/Zr/sTY0CUv9Pu3KusegiJoHMpILGLjjEGTwhn05YK5xZHyCZbD94Or8qsbRxWa1L0YHkszPyi1eG3BbVmmTkmWgPLwkbRYPK2iExcsmpM0Oiz+HfXGJuJEKuouWCA3LY37LVtcftVi1ZPmkM= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1775393670; c=relaxed/simple; bh=8D+FSmXNxBO4Wfde5raQG8y+oFklD0b+m5D3iZS3zGk=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=mlRZ0AAsWSpJgVQKNZcr3GwD0RJFPCTuuPD2H6Rl7BdlGUfJg6GVvkiDAsRnr+zx4rmqqKiWwVq4Itrg7yrbK1/tNBrD55sgyW8G1/CGi/mo9KN8li5P7zk/8No0h5ICi3bJpcfZpzi3PKqO+wbFG6EElEZvbssPi+1SAZXyIn8= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=bytedance.com; spf=pass smtp.mailfrom=bytedance.com; dkim=pass (2048-bit key) header.d=bytedance.com header.i=@bytedance.com header.b=Gr/lwAnC; arc=none smtp.client-ip=209.85.216.52 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=bytedance.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=bytedance.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=bytedance.com header.i=@bytedance.com header.b="Gr/lwAnC" Received: by mail-pj1-f52.google.com with SMTP id 
98e67ed59e1d1-354bc7c2c46so1787298a91.0 for ; Sun, 05 Apr 2026 05:54:27 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1775393667; x=1775998467; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=uQZVpg6xJPTStuz2eFGC89hf7AN4wP07MvFFryn7lXk=; b=Gr/lwAnCww8M+qR1Asu9uqf7ssV65p0KzehP9D4zD/jE3goiUbEiOexN+oYKcOROgF iM/IorKtOxHAgNzAPOF8QLcgkkROqFu3KUi+2H5PT9AsRaFGKhahfw2u/Da0Z2cwy5oa v0NyMB9KOrEQ+D702GQLQvES6erPY83Eg+bE9DCLWO+lnGupjR5OeKQGgTcIpOgIToH2 aA+cbxIZeIJVTkHN3R2+F/PG85/vnsDOS9VviNRUDoFtTsxehkRlLuiaEPPO6B5GRKpr IfYMDExzcvhUqILLFCRFCXC9XLBN0umUDep40UlSjBmyZxnrLBuci/1BY/agUdZ8rfC9 nmWQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1775393667; x=1775998467; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=uQZVpg6xJPTStuz2eFGC89hf7AN4wP07MvFFryn7lXk=; b=EN98ElCeOeJHgi0F0M1ZP+fOKRjdan7pFIUoeIJO1VvfgFbq9x6ZPIgyEQXVftki/R 1wW1AFTERikvcFxGpUfiqMUZzLR3HYSNdixoDBKdrsr9ARzfADWg24kBxOXK460gfjYB UGtp6RXkNEkDk9BEL5ODfpzx4K8H0USLIJQIU6CBmA7jT9+kOdu5PSg7ZGN8tA1iVLd2 pZXF3N7NEpfbh+WWG0SP5Pbzbh3sFNuPVPCOlW2KRI4glTmJNRq5HyLfiPwWMguggJmm X8W6bOi+x4nSskEVGHGRg0kr5KtraOYp/0TU0//9wNMAxllpSXFTc6IijM16o/6UZeW8 f3ig== X-Forwarded-Encrypted: i=1; AJvYcCXVsMj18Cx+jM+VgTY2+WatrZ4thQBoGq4s4DmKDgHSrBwiurs+xKo8SiWlDrsCP2ogDPMC4FLbSjMuj/0=@vger.kernel.org X-Gm-Message-State: AOJu0YxylH64psEGx4qqlmJcObTY6EV6GVWOny+0NhdA1xWt8bfq0jMD NBHL8KnhhMLTG7AHVCb4PUOjDO3JpSipF06oUlYelAfOK1HAceolC+lpRV6fzTsgIPY= X-Gm-Gg: AeBDiet7fj+CCizOpAEamW46ycyENxoTSBJv8qYqdEp6Q1C/mzLInsJlQoBRJe3BiUr ejvJtljrPVVNjKk/MRA8e3Ab/mPICSq75udBc//Nk3/p8yd97preFmGLBbOsouH3cCNIBCDJxEP JJARa5pmUOCrr4E6vv2x4tOoz5ImkJ6OHe1WxTeXN2+Ws0jhSpytIpFxkC8J5pJsSHl6yFIpLDe 
xLul3K6Ku1IR/g+pEMLukeUXPkQ/iHTtmqGGgXk2Pn9tmortfw9l/uCLNExQEHP1AEcxoHfVvES Ke0oRG+OKSuaAfGEfkAsAUPqRGFK7z9IkLulSxyYZmdaC0onbDNZdczEQ6LALKAYyCRz9QFSrkl rz8oUNjnD5kaMwIMDHkL/O6EWCFeHGNhLnMv70iYQn+dqDv2gMk2NgmtaWQgCWY9oiugqd2DDGZ iFM4LaZoP4X5bXgGqP1EbcjqQ8L59aK6pJhXF7N4U9Z3o= X-Received: by 2002:a17:90b:2b50:b0:35d:a6eb:197f with SMTP id 98e67ed59e1d1-35de660a696mr8364134a91.0.1775393667087; Sun, 05 Apr 2026 05:54:27 -0700 (PDT) Received: from n232-176-004.byted.org ([36.110.163.97]) by smtp.gmail.com with ESMTPSA id 98e67ed59e1d1-35de66b4808sm3748505a91.2.2026.04.05.05.54.20 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 05 Apr 2026 05:54:26 -0700 (PDT) From: Muchun Song To: Andrew Morton , David Hildenbrand , Muchun Song , Oscar Salvador , Michael Ellerman , Madhavan Srinivasan Cc: Lorenzo Stoakes , "Liam R . Howlett" , Vlastimil Babka , Mike Rapoport , Suren Baghdasaryan , Michal Hocko , Nicholas Piggin , Christophe Leroy , aneesh.kumar@linux.ibm.com, joao.m.martins@oracle.com, linux-mm@kvack.org, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, Muchun Song Subject: [PATCH 10/49] mm: move subsection_map_init() into sparse_init() Date: Sun, 5 Apr 2026 20:52:01 +0800 Message-Id: <20260405125240.2558577-11-songmuchun@bytedance.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20260405125240.2558577-1-songmuchun@bytedance.com> References: <20260405125240.2558577-1-songmuchun@bytedance.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Move the initialization of the subsection map from free_area_init() into sparse_init(). This encapsulates the logic within the sparse memory initialization code. 
Signed-off-by: Muchun Song Acked-by: Mike Rapoport (Microsoft) --- mm/internal.h | 5 ++--- mm/mm_init.c | 10 ++-------- mm/sparse-vmemmap.c | 11 ++++++++++- mm/sparse.c | 1 + 4 files changed, 15 insertions(+), 12 deletions(-) diff --git a/mm/internal.h b/mm/internal.h index edb1c04d0617..d70075d0e788 100644 --- a/mm/internal.h +++ b/mm/internal.h @@ -1004,10 +1004,9 @@ static inline void sparse_init(void) {} * mm/sparse-vmemmap.c */ #ifdef CONFIG_SPARSEMEM_VMEMMAP -void sparse_init_subsection_map(unsigned long pfn, unsigned long nr_pages); +void sparse_init_subsection_map(void); #else -static inline void sparse_init_subsection_map(unsigned long pfn, - unsigned long nr_pages) +static inline void sparse_init_subsection_map(void) { } #endif /* CONFIG_SPARSEMEM_VMEMMAP */ diff --git a/mm/mm_init.c b/mm/mm_init.c index a92c5053f63d..5ca4503e7622 100644 --- a/mm/mm_init.c +++ b/mm/mm_init.c @@ -1857,18 +1857,12 @@ static void __init free_area_init(void) (u64)zone_movable_pfn[i] << PAGE_SHIFT); } =20 - /* - * Print out the early node map, and initialize the - * subsection-map relative to active online memory ranges to - * enable future "sub-section" extensions of the memory map. - */ + /* Print out the early node map. 
*/ pr_info("Early memory node ranges\n"); - for_each_mem_pfn_range(i, MAX_NUMNODES, &start_pfn, &end_pfn, &nid) { + for_each_mem_pfn_range(i, MAX_NUMNODES, &start_pfn, &end_pfn, &nid) pr_info(" node %3d: [mem %#018Lx-%#018Lx]\n", nid, (u64)start_pfn << PAGE_SHIFT, ((u64)end_pfn << PAGE_SHIFT) - 1); - sparse_init_subsection_map(start_pfn, end_pfn - start_pfn); - } =20 /* Initialise every node */ mminit_verify_pageflags_layout(); diff --git a/mm/sparse-vmemmap.c b/mm/sparse-vmemmap.c index 0ee03db0b22f..b7201c235419 100644 --- a/mm/sparse-vmemmap.c +++ b/mm/sparse-vmemmap.c @@ -603,7 +603,7 @@ static void subsection_mask_set(unsigned long *map, uns= igned long pfn, bitmap_set(map, idx, end - idx + 1); } =20 -void __init sparse_init_subsection_map(unsigned long pfn, unsigned long nr= _pages) +static void __init sparse_init_subsection_map_range(unsigned long pfn, uns= igned long nr_pages) { int end_sec_nr =3D pfn_to_section_nr(pfn + nr_pages - 1); unsigned long nr, start_sec_nr =3D pfn_to_section_nr(pfn); @@ -626,6 +626,15 @@ void __init sparse_init_subsection_map(unsigned long p= fn, unsigned long nr_pages } } =20 +void __init sparse_init_subsection_map(void) +{ + int i, nid; + unsigned long start, end; + + for_each_mem_pfn_range(i, MAX_NUMNODES, &start, &end, &nid) + sparse_init_subsection_map_range(start, end - start); +} + #ifdef CONFIG_MEMORY_HOTPLUG =20 /* Mark all memory sections within the pfn range as online */ diff --git a/mm/sparse.c b/mm/sparse.c index 5c12b979a618..c7f91dc2e5b5 100644 --- a/mm/sparse.c +++ b/mm/sparse.c @@ -424,5 +424,6 @@ void __init sparse_init(void) } /* cover the last node */ sparse_init_nid(nid_begin, pnum_begin, pnum_end, map_count); + sparse_init_subsection_map(); vmemmap_populate_print_last(); } --=20 2.20.1 From nobody Sun Jun 14 19:01:43 2026 Received: from mail-pj1-f44.google.com (mail-pj1-f44.google.com [209.85.216.44]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by 
smtp.subspace.kernel.org (Postfix) with ESMTPS id D2AE42C11E6 for ; Sun, 5 Apr 2026 12:54:35 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.216.44 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1775393677; cv=none; b=ZZKNo33sDtcimRIDTlJmp7eAoZ5MLQvaqauYQ/qQLdXWLeqXZ5Qt4QGoMXte7YOXPz8iFUerYYYWwAnlYT+cmwA90YV3Wj089X4xIGCCmdi0kxnp9HQJzJbsyLsUANjNDyCLuuWswCCC50YR0WUCbBMW/aahQPGExwrgXrfieQk= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1775393677; c=relaxed/simple; bh=1avgy6C1oIkqDhvlKNsM1KEScoKXq82WDeY4mxrOqcA=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=alYeJOXNBB1Yc1b1uKbQDvU2/cqnnje3zMw9FQIvt5ZirGriKdgkMbV8aoytp/VIFXki3W8M5PjdddQwmnzMrlkrb2i3B8OtdW+YLykRO2kuKy0wXqoLQq7DNWvwsKTnpA1pr+7uhdDUZ8j4NiOLkdDrSn8PP5APMnBAzk2rXsA= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=bytedance.com; spf=pass smtp.mailfrom=bytedance.com; dkim=pass (2048-bit key) header.d=bytedance.com header.i=@bytedance.com header.b=HH9CXiyw; arc=none smtp.client-ip=209.85.216.44 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=bytedance.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=bytedance.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=bytedance.com header.i=@bytedance.com header.b="HH9CXiyw" Received: by mail-pj1-f44.google.com with SMTP id 98e67ed59e1d1-35c238f1063so1899013a91.1 for ; Sun, 05 Apr 2026 05:54:35 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1775393675; x=1775998475; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=kuz46CNuL6O0+N77LBKvrgQNIe8WIqaWxa3OjXOrUz0=; 
b=HH9CXiywSnIAEqQVxYEPQLU4zuABn3P79HOOPLzmWc32IBIlwArcoT5T6HQdXGEs7W wwhWxhEcLrcYIDPkPSMy5HXtEuYk/1FsmlCatCAaP3TNlapDYrA6FwCwtkA//zEutzTM CnW69C/yzdy6zNh/z4ypez/gYOkX0Xck+qPw8gn94guADgiSiSzIwupKj84nHQPy4Ivc EY9hpnOfZJDWQ9uw1ugXd4Oq+uZCb0oxW9DgFjgd7iRfwsy5L4vKB4p6gYVbQy6vIORR lkPIKuekkwK9LtiHt7e3NdMxdK5eOPd2niOABBbY1w7d8wGQCB46xVAsqdECr1yamBSw PKgg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1775393675; x=1775998475; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=kuz46CNuL6O0+N77LBKvrgQNIe8WIqaWxa3OjXOrUz0=; b=Va5HqmQmAZtxkAcKZ57BmP78M3zZTBHOoXPvAvwgoa+ar4MeEby+HeLpd1hGVb+fJR 3cce4ccMhR5OOIKuUBMC40QN75ReE1mmLz8TQS8xO7kIw/6gDnKywfWKuxtL/fXYGmX8 XB9uKnLrbFrp18GdJvpssGooETq9zWwMA18dgm+41CjTtBORaAe0lxa0mvE+5N3n3E6a MzKgk0x4vtxYiRNz93i2m6uTsntDKkOGtTjm9vO9d6ho6dQ0Lpjmz0XvdVSUeZhaHSdl faEgOtLyQXuuTy0ycjno3NV22xhbTsMwneHibIS0MpFNHxvIK7Ayd2e6CS12KBESZ0dz WvAQ== X-Forwarded-Encrypted: i=1; AJvYcCWz8TT6fMtWiRNQIsPeQV8xJ9nhREN9qezYjOQxV3/Gh8LgBmup/4uJ20tZKOiL1Sh00HybBncsEFbXdq8=@vger.kernel.org X-Gm-Message-State: AOJu0YwV+6SkOr8Xky8D6iGsg7yJUfnjTJ6HQIYj+Q6k9ZCvFomYcSvP u3hFYkOeixHerjcA7DWpqgjDllkgamIWDFJIvCddXsNMQqbQJstmZwiXALX9NF39XnQ= X-Gm-Gg: AeBDievWJc15iFmnftDr14HsLBkruGyI0sgnxn6S9HynVpI/7CRcPDPKoLz3VGzlsww N5wBNAPCm5jPdKreTwU1H/G+1YaDlU6dGlaND/iy4adG1t12m3kDrF25DZcg26hzArteloJjhLP LWqwKA2h16pu1hq8v47utGDUBG9t1XG8tIQvhmiVhO7Cc9Ez+W7gyJLRAyVar6K8vye3+nibH+y FT8wkFwC9tjVBkA5Z4aRL41vaupqJLb/A7We9uRJtrIRT4dFDbgB68pUHh6k38T/QL1ksdOkGYd 7OiSK2EwMHAT4ZwcM94ImDjIYMJwgkWY3oeMk0Bbpb3vP3Jwt/NzZO6p7ilXhRWQCiLi9OtsiEU PSy80GCiHh1zps7RvOOh5SC5u19/FVo08/u158Nhi1eUoBoRx5D3W0DILCIXbBXhh8+v0BhMiQ0 cT/V8ltF2xW+lVs5fMXm8WUUFtxsjdm+F/J/mFGswlCFM= X-Received: by 2002:a17:90b:48d0:b0:35c:a8f:5c5f with SMTP id 98e67ed59e1d1-35de68655bdmr8575177a91.8.1775393675104; Sun, 05 Apr 2026 05:54:35 -0700 (PDT) Received: from 
n232-176-004.byted.org ([36.110.163.97]) by smtp.gmail.com with ESMTPSA id 98e67ed59e1d1-35de66b4808sm3748505a91.2.2026.04.05.05.54.28 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 05 Apr 2026 05:54:34 -0700 (PDT) From: Muchun Song To: Andrew Morton , David Hildenbrand , Muchun Song , Oscar Salvador , Michael Ellerman , Madhavan Srinivasan Cc: Lorenzo Stoakes , "Liam R . Howlett" , Vlastimil Babka , Mike Rapoport , Suren Baghdasaryan , Michal Hocko , Nicholas Piggin , Christophe Leroy , aneesh.kumar@linux.ibm.com, joao.m.martins@oracle.com, linux-mm@kvack.org, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, Muchun Song Subject: [PATCH 11/49] mm: defer sparse_init() until after zone initialization Date: Sun, 5 Apr 2026 20:52:02 +0800 Message-Id: <20260405125240.2558577-12-songmuchun@bytedance.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20260405125240.2558577-1-songmuchun@bytedance.com> References: <20260405125240.2558577-1-songmuchun@bytedance.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" According to the comment of free_area_init(), its main goal is to initialise all pg_data_t and zone data. However, sparse_init() and memmap_init() are aimed at allocating vmemmap pages and initializing struct page respectively, which differs from the goal of free_area_init(). Therefore, it is reasonable to move them out of free_area_init(). Call sparse_init() after free_area_init() to guarantee that zone data structures are available when sparse_init() executes. This change is a prerequisite for integrating vmemmap initialization steps and allows sparse_init() to safely access zone information if needed (e.g. HVO case). Also, move hugetlb reservation functions (hugetlb_cma_reserve() and hugetlb_bootmem_alloc()) to be after free_area_init(). 
This allows hugetlb reservation to access zone information to ensure that contiguous pages are not allocated across zone boundaries, which simplifies the hugetlb code. So this is a preparation for subsequent changes. Signed-off-by: Muchun Song Reviewed-by: Mike Rapoport (Microsoft) --- mm/mm_init.c | 15 ++++++++------- mm/sparse.c | 3 --- 2 files changed, 8 insertions(+), 10 deletions(-) diff --git a/mm/mm_init.c b/mm/mm_init.c index 5ca4503e7622..72604d02a853 100644 --- a/mm/mm_init.c +++ b/mm/mm_init.c @@ -1807,7 +1807,6 @@ static void __init free_area_init(void) bool descending; =20 arch_zone_limits_init(max_zone_pfn); - sparse_init(); =20 start_pfn =3D PHYS_PFN(memblock_start_of_DRAM()); descending =3D arch_has_descending_max_zone_pfns(); @@ -1896,11 +1895,7 @@ static void __init free_area_init(void) } } =20 - for_each_node_state(nid, N_MEMORY) - sparse_vmemmap_init_nid_late(nid); - calc_nr_kernel_pages(); - memmap_init(); =20 /* disable hash distribution for systems with a single node */ fixup_hashdist(); @@ -2669,10 +2664,16 @@ void __init __weak mem_init(void) =20 void __init mm_core_init_early(void) { - hugetlb_cma_reserve(); - hugetlb_bootmem_alloc(); + int nid; =20 free_area_init(); + /* Zone data structures are available from here. 
*/ + hugetlb_cma_reserve(); + hugetlb_bootmem_alloc(); + sparse_init(); + for_each_node_state(nid, N_MEMORY) + sparse_vmemmap_init_nid_late(nid); + memmap_init(); } =20 /* diff --git a/mm/sparse.c b/mm/sparse.c index c7f91dc2e5b5..5fe0a7e66775 100644 --- a/mm/sparse.c +++ b/mm/sparse.c @@ -406,9 +406,6 @@ void __init sparse_init(void) pnum_begin =3D first_present_section_nr(); nid_begin =3D sparse_early_nid(__nr_to_section(pnum_begin)); =20 - /* Setup pageblock_order for HUGETLB_PAGE_SIZE_VARIABLE */ - set_pageblock_order(); - for_each_present_section_nr(pnum_begin + 1, pnum_end) { int nid =3D sparse_early_nid(__nr_to_section(pnum_end)); =20 --=20 2.20.1 From nobody Sun Jun 14 19:01:43 2026 Received: from mail-pj1-f41.google.com (mail-pj1-f41.google.com [209.85.216.41]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 029A63375C5 for ; Sun, 5 Apr 2026 12:54:42 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.216.41 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1775393684; cv=none; b=ove56nAu7N/k5Pd6pgmlkkmKrCYM0O8bZhJfAcNzBNG4dfkw37+9yAJvk7JrC49vRaK0e7CfjrVBOOUJCSgh7ZFqDVy3qQkjAeLy6KIiH8TCkASIZLjZh5zse6a0xewilGLqiNkuM2k5hEsuhQZfvbM+QSJv+6yMimV+zXbtEhQ= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1775393684; c=relaxed/simple; bh=ZO2G1t6H9XIATHjMLcWbj9lQyaTIXcj59kdGUej8Mvo=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=FhNvL8d8nWYOxYZ+/sL+XY+YaLQUlWtzTsbqXyAxXLdy8PgTfkBA/HQ/0oCwrCIAoA1Ht9HVaD/j4Jvh7AFiZ0rbznIFlQ9+xkGuzOF+ugftPBn32uTH+P9TOMyHwWGsKwL7gd4X4Uny1HnRFEe00aUl22DqjMMxllSqaBHtbq4= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=bytedance.com; spf=pass smtp.mailfrom=bytedance.com; dkim=pass (2048-bit key) header.d=bytedance.com header.i=@bytedance.com 
header.b=eLtFl37L; arc=none smtp.client-ip=209.85.216.41 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=bytedance.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=bytedance.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=bytedance.com header.i=@bytedance.com header.b="eLtFl37L" Received: by mail-pj1-f41.google.com with SMTP id 98e67ed59e1d1-35d965648a2so2429200a91.0 for ; Sun, 05 Apr 2026 05:54:42 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1775393682; x=1775998482; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=CpeyP/gKGQ3cmNpd3odA7JDcSLupYRRYiiUQtcwUonQ=; b=eLtFl37LUhGCMLTujf9yUsrZw+dToP2OfKnNETjiUX+ysGDaCBG+WYlgz5706dfDax 8ZQkdfujepDFWVyM9YRBsifJkvk82hYL3IqqTou5qzOIuMv5N57eFqzv9fA+DTq9f/3u cmf6lonWXP3O4gc2ATOAu3hp73wIwGTO0vwaAk2fmNYR8DdybmuTQhPJf+f3V2pcmQEY /jFF6A1zsyLQi2ri8FEVhnial3knaI/KCV476TTdu2XiWLnB9AGleVw58lgUFA9L58me psY1ld54bF4s78yrssF4XoV0vr7SuTs4Jz4Atbh21z0W3WNlpgdtLH5pR78KOpbp8ddh +fCQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1775393682; x=1775998482; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=CpeyP/gKGQ3cmNpd3odA7JDcSLupYRRYiiUQtcwUonQ=; b=AxeP4r1S0eY7aDXLZQCrnu2O31grBArIh4Mb2ZrNeEl4keREjF+NxN3aEJgTCgrxT+ 6B1XtijTYbkgiUt2QNpbbmekkhKctj9x17wYrO9qp5egK2Qm/FeEIw2RACeZfQ7eEVaV zmoZ1G9eu74HRkt/hdHWa4vX7dIuoYAgkMXthcp6dRY7Hm4FiPUaFtKAvMN+Ns1LysS3 MJGQGVLKskpywDGvpGkHj7jtUmWk5GXFQLxairNp0XtHzVhn3vPjOjFaPxUfjJujiLm9 vtLUKWU0bExBhKCj9wcRxKo9klaE4AMcN9dnvEHHTyYl9rori3Y9AXfSFDg8xdM80Fx2 G1UQ== X-Forwarded-Encrypted: i=1; 
AJvYcCVcNDxoQEIOtphUn9yOAJMRrJx5auBScHvxNbuHRQcMyvSdsyRle/EzgpjLG9mdpRycQlw1NwwBBdd1+Bc=@vger.kernel.org X-Gm-Message-State: AOJu0YzTXKKusyXn1dDKtTWP/yqa2HtESWCQv/J7ijB+KB9nBmxjE0ga cImmc1rfYVZfyE0mdc+/XdhswS/sUyKUk9W82iYKPUthSEZj1Gp7Swex+T/XSfIC9go= X-Gm-Gg: AeBDietsl3Vb7mXqZn8d7tM9VDBb9tNFww/yCTGCDeyrLR2WXLNT/PYftjbb8OW7kv0 tL04kR/A8RXXEpMgicnqU1s8lUffXOVz5OcxDQtnJ2x+Do61cNhZHyyl/WqM5OHmGLo3vILXWrM QPpq48CWj6unwFGdr7vUAvQ3P2s0pwcK7OHhLN7DDE1bSU1Zgvnc0somQMPoF5jJPe5f+a9m+eQ nY0g8PqVKX+zMDjU3s5BYfhfZb5Txw5/3o05jwPrVZmxRutY5E0pe4J28tAKaqUzrRMEqkvzlSJ hFhzjWVkbBMaeaauSohOeGvmROSwuYofOnPfVqWMIq8f9lEkRQrzYMTQ+d2e4s3ULKbFYYiC8rd GfuqWe6u9A9JqtI0OVTouzKuEEV1G2jgcU+AKqlbP2V1NcJROBgvvTv4fNuNVDNXfgaaNhZ4O13 hgJMr+NN+AqXr6XsmVzkBPwyjNzqT5Oldz2LBK8/Rv6xg= X-Received: by 2002:a17:90b:5583:b0:35c:d98:d684 with SMTP id 98e67ed59e1d1-35de67da865mr8258899a91.6.1775393682300; Sun, 05 Apr 2026 05:54:42 -0700 (PDT) Received: from n232-176-004.byted.org ([36.110.163.97]) by smtp.gmail.com with ESMTPSA id 98e67ed59e1d1-35de66b4808sm3748505a91.2.2026.04.05.05.54.35 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 05 Apr 2026 05:54:41 -0700 (PDT) From: Muchun Song To: Andrew Morton , David Hildenbrand , Muchun Song , Oscar Salvador , Michael Ellerman , Madhavan Srinivasan Cc: Lorenzo Stoakes , "Liam R . 
Howlett" , Vlastimil Babka , Mike Rapoport , Suren Baghdasaryan , Michal Hocko , Nicholas Piggin , Christophe Leroy , aneesh.kumar@linux.ibm.com, joao.m.martins@oracle.com, linux-mm@kvack.org, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, Muchun Song Subject: [PATCH 12/49] mm: make set_pageblock_order() static Date: Sun, 5 Apr 2026 20:52:03 +0800 Message-Id: <20260405125240.2558577-13-songmuchun@bytedance.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20260405125240.2558577-1-songmuchun@bytedance.com> References: <20260405125240.2558577-1-songmuchun@bytedance.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Since set_pageblock_order() is only used in mm/mm_init.c now, make it static and remove its declaration from mm/internal.h. Signed-off-by: Muchun Song Reviewed-by: Mike Rapoport (Microsoft) --- mm/internal.h | 1 - mm/mm_init.c | 4 ++-- 2 files changed, 2 insertions(+), 3 deletions(-) diff --git a/mm/internal.h b/mm/internal.h index d70075d0e788..8232084f0c5e 100644 --- a/mm/internal.h +++ b/mm/internal.h @@ -1437,7 +1437,6 @@ extern unsigned long __must_check vm_mmap_pgoff(stru= ct file *, unsigned long, unsigned long, unsigned long, unsigned long, unsigned long); =20 -extern void set_pageblock_order(void); unsigned long reclaim_pages(struct list_head *folio_list); unsigned int reclaim_clean_pages_from_list(struct zone *zone, struct list_head *folio_list); diff --git a/mm/mm_init.c b/mm/mm_init.c index 72604d02a853..64363f35ad0d 100644 --- a/mm/mm_init.c +++ b/mm/mm_init.c @@ -1489,7 +1489,7 @@ static inline void setup_usemap(struct zone *zone) {} #ifdef CONFIG_HUGETLB_PAGE_SIZE_VARIABLE =20 /* Initialise the number of pages represented by NR_PAGEBLOCK_BITS */ -void __init set_pageblock_order(void) +static void __init set_pageblock_order(void) { unsigned int order =3D 
PAGE_BLOCK_MAX_ORDER; =20 @@ -1515,7 +1515,7 @@ void __init set_pageblock_order(void) * include/linux/pageblock-flags.h for the values of pageblock_order based= on * the kernel config */ -void __init set_pageblock_order(void) +static inline void __init set_pageblock_order(void) { } =20 --=20 2.20.1 From nobody Sun Jun 14 19:01:43 2026 Received: from mail-pj1-f44.google.com (mail-pj1-f44.google.com [209.85.216.44]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id F002E3246F8 for ; Sun, 5 Apr 2026 12:54:48 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.216.44 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1775393690; cv=none; b=KmM8csTjKH8nMPPM+otpwdHChslTPDl6cT0gFeXFlOql59g8ZAlzbd6rAEAAwIVAod5I/7EI7QLolZou7RKg4Bs+Nnym8C9c/sXl1FgcIYZJTQKGVGvnap49WPHr6Pe33FzcU0+mZIXin4zT4ks0EVr4I5GBYGH0EslV4kYFo80= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1775393690; c=relaxed/simple; bh=7Y/5Z8bgniAmuM6pFcrChS0/8yHbPcLyWVQnQpD36/0=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=lbXwtJpHI6Emz1eucKqZigPJI1WxJNbKHelaCt7xfvu5EVHMq/a49B27Gx4fvaZyK56rnzQClbNCtGHxRLtwVF+qC+4V/MTPU03wGdpsPqNnADH/LYvyyeOynCxCsX00AsS9eufx6osTxllIZyYDsLbfXhxQt1eeylNqG5zZ1c8= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=bytedance.com; spf=pass smtp.mailfrom=bytedance.com; dkim=pass (2048-bit key) header.d=bytedance.com header.i=@bytedance.com header.b=cjMTB+2D; arc=none smtp.client-ip=209.85.216.44 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=bytedance.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=bytedance.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=bytedance.com 
header.i=@bytedance.com header.b="cjMTB+2D" Received: by mail-pj1-f44.google.com with SMTP id 98e67ed59e1d1-3590042fa8eso2190908a91.1 for ; Sun, 05 Apr 2026 05:54:48 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1775393688; x=1775998488; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=HEw5DKlzSRjrYp6HmZ+C9u13Pw0NEEG+M3MgilG+TeA=; b=cjMTB+2DZ1YJIc8jj45VP97VFzuQgWcxagF4qpDq25SNjK/wdlIuXII+L2tEAdzm25 EgLGlVfnngoWdAbP8LzdeMf9d4/cIIVOPC4SrtHMtfLf4C9CTb/1rHLrGbO8HwXn5VvP jOekn0u7nWzy2oHQ8KH9ucO9DBHaVmEBdkjZ3p7qKKm5IarxtOG6nZ2Adm2GWUDYSe9b Wh3PZKCwzc++pWClXMSZI2R2zXr8dC5N9aZMfzrr0+3hXJ5syMKd17AopIjiKrEM/Dd6 B3vWUmaSw3sa3NZYkK2vXN9jdtngxQ4F3kFlq+5aGJlFpYkdfa3sEklRe8Tc2ClQuTbl 04aA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1775393688; x=1775998488; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=HEw5DKlzSRjrYp6HmZ+C9u13Pw0NEEG+M3MgilG+TeA=; b=jO6/pNI8QDrVW2WqSz+Q25B0g1JCnSZLT3K3oFI2iMUXFwWbRS7qpUZjA64UeWHF4H tqoO2ba+wg0bu9K5i/Sacg9i0b7xc0RSCaAAfPJfo/R9ka0f3cZ3aC8AWlEwCGMGdb2s DGIIJHCdOFaQK2qSncgCxCFCtfPWNHrIRpgL9W+Xe/6FmoXo8uh2yHd9untR2BOqqC9A 2Fp2VDjSDrBPZ0WqaESayYTWo3jAwuWyff3c/uX5LVyejT9KOWgXNavX/EHyHpvl5y9e bjDKgzzlHjp9UxrywFH2oN9HqobvfkQyj09YokKj9HAztw4vuAAtQOFd1ImgHimu66nQ /T1g== X-Forwarded-Encrypted: i=1; AJvYcCUxfAriOFVLMq0fBX4O0TDouEGAi7xNahVjHW45ZYcC5bQLVTWaK4fg9z99y3xyBZcPMtK0QQQGb3qG428=@vger.kernel.org X-Gm-Message-State: AOJu0YzlgxVFwHfBBzwZGGaKLpRiqvOc/6iI21B0FkpPSATkP1R2UOpd Srt9Es7kDh7vEpwfbQ7/GjyhyMO0xDW5WJ//u57oXr1TmgjAFWaLJxqNpR8bWw5pPVY= X-Gm-Gg: AeBDieuuLTKzdypY3fFADi6a3dejjOf5jGNLNyUvNehMslFFO5MigsbVLxu4GoaiwZh iD46o0hs/HQ6H4EZ7ldD2fKujDu1LMxexCO+LuHQT2kYIpNIjKymWhxx+sh5IrnlJ8sedmvJh6U 
Yynzr3rBk0iPAz5yq8/G5kSe+0BB+qIp9N/oEX5hQVA1kYBOy134VRxo4YyKUReMASHd4vG4o6m CArXsOQP1/Qhsc0lzOjsnpQF4tCQPTIh2a1z/4x1Bq4An3HG8qREUhDZAMkokhXAkLcF2TFPJ+4 uCXc/XEk40vNhq7/4OyEQ7os5HMjpIUxso1YLaHVAfClPkrM32+7wAA0Cz7qhO8r+pBt66Tov5m 5kUwWnvhIXitdhB37UM9prKddON9Ft1rodvrf2rc3Gh2P5Fj7RrqXPyaj8Vn+Z6yxzIu7whAzt7 G8dlhHurFR0rh56tWrJzRr4cDXi7LJvKBU9ojBTbWCXBY= X-Received: by 2002:a17:90b:4cca:b0:35c:29ba:bf92 with SMTP id 98e67ed59e1d1-35de6811cedmr8847254a91.5.1775393688240; Sun, 05 Apr 2026 05:54:48 -0700 (PDT) Received: from n232-176-004.byted.org ([36.110.163.97]) by smtp.gmail.com with ESMTPSA id 98e67ed59e1d1-35de66b4808sm3748505a91.2.2026.04.05.05.54.42 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 05 Apr 2026 05:54:47 -0700 (PDT) From: Muchun Song To: Andrew Morton , David Hildenbrand , Muchun Song , Oscar Salvador , Michael Ellerman , Madhavan Srinivasan Cc: Lorenzo Stoakes , "Liam R . Howlett" , Vlastimil Babka , Mike Rapoport , Suren Baghdasaryan , Michal Hocko , Nicholas Piggin , Christophe Leroy , aneesh.kumar@linux.ibm.com, joao.m.martins@oracle.com, linux-mm@kvack.org, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, Muchun Song Subject: [PATCH 13/49] mm: integrate sparse_vmemmap_init_nid_late() into sparse_init_nid() Date: Sun, 5 Apr 2026 20:52:04 +0800 Message-Id: <20260405125240.2558577-14-songmuchun@bytedance.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20260405125240.2558577-1-songmuchun@bytedance.com> References: <20260405125240.2558577-1-songmuchun@bytedance.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Move the call to sparse_vmemmap_init_nid_late() from mm_core_init_early() into sparse_init_nid(). Since sparse_init() has been deferred until after zone initialization, the zone data structures are now available during sparse_init(). 
This satisfies the requirements of sparse_vmemmap_init_nid_late(), allowing it to be moved safely. This change unifies the vmemmap initialization steps by placing both sparse_vmemmap_init_nid_early() and sparse_vmemmap_init_nid_late() within the sparse memory initialization logic, making the code structure clearer. Signed-off-by: Muchun Song Reviewed-by: Mike Rapoport (Microsoft) --- mm/mm_init.c | 4 ---- mm/sparse.c | 2 ++ 2 files changed, 2 insertions(+), 4 deletions(-) diff --git a/mm/mm_init.c b/mm/mm_init.c index 64363f35ad0d..7a710fcbe3c8 100644 --- a/mm/mm_init.c +++ b/mm/mm_init.c @@ -2664,15 +2664,11 @@ void __init __weak mem_init(void) =20 void __init mm_core_init_early(void) { - int nid; - free_area_init(); /* Zone data structures are available from here. */ hugetlb_cma_reserve(); hugetlb_bootmem_alloc(); sparse_init(); - for_each_node_state(nid, N_MEMORY) - sparse_vmemmap_init_nid_late(nid); memmap_init(); } =20 diff --git a/mm/sparse.c b/mm/sparse.c index 5fe0a7e66775..d940b973df66 100644 --- a/mm/sparse.c +++ b/mm/sparse.c @@ -383,6 +383,8 @@ static void __init sparse_init_nid(int nid, unsigned lo= ng pnum_begin, } sparse_usage_fini(); sparse_buffer_fini(); + + sparse_vmemmap_init_nid_late(nid); } =20 /* --=20 2.20.1 From nobody Sun Jun 14 19:01:43 2026 Received: from mail-pj1-f48.google.com (mail-pj1-f48.google.com [209.85.216.48]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 3B9CC25785D for ; Sun, 5 Apr 2026 12:54:54 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.216.48 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1775393695; cv=none; b=pBWeJF6xoAbh4nv3WxE6JCHdzG2xKaaYpS2vLEA3H4km9wO1DtMNTpomuutkPfWfFByLDOzCslfZ2bJwkOA7FgOrFjVa7yaq6mLAkg7Z3s9uDF+8SHXdiPlX2rTd3Z3kDV9HvlAPxxDaDsC0Y58JKfbO1lBOf+eBJohBpFDSoLE= ARC-Message-Signature: i=1; a=rsa-sha256; 
d=subspace.kernel.org; s=arc-20240116; t=1775393695; c=relaxed/simple; bh=msp3iVJO4hzAAFM/AbeHU2nAMVee0Tj+xZef+hwhVsg=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=ix/UqM8xUMOdIWdkwSkxbbfrp3MDM8whE8r2vKeBO3thAC6RJYbVYWeWKbs4OVuvk+JRga7iyJOnQq/HNuezepWyywiGuwdEJ6XSlLAiv80VKeiCCj55L6A5bEEj+0E+u13461okHo5Igc2hVyB09XGuW5bFDrm518Esky1MTfs= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=bytedance.com; spf=pass smtp.mailfrom=bytedance.com; dkim=pass (2048-bit key) header.d=bytedance.com header.i=@bytedance.com header.b=iHtJ1i4j; arc=none smtp.client-ip=209.85.216.48 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=bytedance.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=bytedance.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=bytedance.com header.i=@bytedance.com header.b="iHtJ1i4j" Received: by mail-pj1-f48.google.com with SMTP id 98e67ed59e1d1-3567e2b4159so2009046a91.0 for ; Sun, 05 Apr 2026 05:54:54 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1775393694; x=1775998494; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=TjgvWFWYEgCOLeyYyMsHcBGvGVNPbR5pUUyMN0Stxt4=; b=iHtJ1i4jml/H4DAA+PWVqOLvmUkMt5igUBtWm8nvE+Cly/gloz45mwLDGcuM5oV0D2 37sMKYMwuR1FqBewX9cQwxplfyOvLXb9dLbFP6nc+3BrSf+vq0dFSSVHVx4Pl/D5JJYz NcfPuBD4b44S41eEP+eyQ4jcQBuV2fMlbpR4NMfqP3HoB2fX9uZZjYQh4xfGipbpeVEQ PpsndQfPO5gI1azKCE9yt4bOeI3C3u7ShG6j9vmoD3MEIk5pe7qzScu5Koy+LV0gk/j8 hbFXQYRvT1+qfM87tS1hYIGGDBVrLAKfgfUNFq7l2a+dJxk5Bfqh8t8ZcJrB1pWdpsJi rwjQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1775393694; x=1775998494; h=content-transfer-encoding:mime-version:references:in-reply-to 
:message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=TjgvWFWYEgCOLeyYyMsHcBGvGVNPbR5pUUyMN0Stxt4=; b=Q3Bw56YB3qvUh31JXBvvTEaieyqg9D8TatO3OtU9pY7GGAGMKU0mFAe95s3NZLbefu PG7ZDd8zELiG4vy2L7sBm0wqPzBvdOBgUNDo6WtRJ9G5Z3ff0BsVcPcHw2gMvhumKv+H bfZgqnP8e/GOudvHrCkPMh9p84d8bd7JD+Cr+q0rePZQSD4J2u6azSHWiyNAH+e42c7D JkiJUYzwVPca+UKJgk4NxfskfpuxFZjFh+MYnhbEd1mhjfXds9O/ZVJRw4VXGP0tSVDz byQbFy/aylGmRpgc3AUEI128lhjtJXkd4Br7ubd+oEQBmj3SJXpxRUzWyLLh+FFk1DwI AwNA== X-Forwarded-Encrypted: i=1; AJvYcCX3IjBLjQJCLtquutLV9itlpi24uety6/+pyyXQiDGw9mpy0r1QAmWDjxeT4n8ycYNT/KTuP2x6u0+Agbo=@vger.kernel.org X-Gm-Message-State: AOJu0Yyke/mcrZy9XN5WaaScr/kbmJNdzmnECjZFcHY1/Bm0QFnYLmy2 3tMYs6pVXDM2iidn81KB2sep5IGoHF6B1u9+EvCv+lkN6uKoK5n9U9DfOArS07H/BRg= X-Gm-Gg: AeBDieuML/5xDqPfh6R+7keLUOLUhKTvYmjCVOBkRIcjBTSFLtp1KL3i/Xe5H3iyADE 2z2RJkXxYdMgWxhCtS/nXcgRWZPU6wU4lNR72aQ+TIoTUuU0bXxg/dA8wEs9NxiwJ+REjpOxfR3 cSnlF6KAlU5cUF7GkgGZG7fV5QyoDr4OhZ4WdJSDTN0NyezS1Ec6E5ul0lFidsSvxO9hOLWffXu N1b6BYo8N3PT1AJUOeKVpDifdBFKlFkUrhJ9UKjMOY+ORHMHIAdyKGHphm3w/W7HbnYi1Efz+aC nSxQvpR+E9mKTfK5RfXXvu5Hw0XsvOGiDyvQ8Rhqe1ZUqxvpWgeeSqZtINX4odP7o14PFg9MM+h 11yzADTYmxkfio8GlPwPH1a/97vTVdlEIuaXI3Fmi/P4/KDQDs1jJM1rkSELltmq6SjBmVjtNwU qOsPqeag+XzG/mT4kj1dsU52cHGwUOKKiVsWsGaMbm874= X-Received: by 2002:a17:90b:1344:b0:35b:90e7:c453 with SMTP id 98e67ed59e1d1-35de67d6677mr7866778a91.6.1775393693495; Sun, 05 Apr 2026 05:54:53 -0700 (PDT) Received: from n232-176-004.byted.org ([36.110.163.97]) by smtp.gmail.com with ESMTPSA id 98e67ed59e1d1-35de66b4808sm3748505a91.2.2026.04.05.05.54.48 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 05 Apr 2026 05:54:53 -0700 (PDT) From: Muchun Song To: Andrew Morton , David Hildenbrand , Muchun Song , Oscar Salvador , Michael Ellerman , Madhavan Srinivasan Cc: Lorenzo Stoakes , "Liam R . 
Howlett" , Vlastimil Babka , Mike Rapoport , Suren Baghdasaryan , Michal Hocko , Nicholas Piggin , Christophe Leroy , aneesh.kumar@linux.ibm.com, joao.m.martins@oracle.com, linux-mm@kvack.org, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, Muchun Song Subject: [PATCH 14/49] mm/cma: validate hugetlb CMA range by zone at reserve time Date: Sun, 5 Apr 2026 20:52:05 +0800 Message-Id: <20260405125240.2558577-15-songmuchun@bytedance.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20260405125240.2558577-1-songmuchun@bytedance.com> References: <20260405125240.2558577-1-songmuchun@bytedance.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" During hugetlb_cma_reserve() we already have access to zone information, so we can validate that the reserved CMA range does not span multiple zones. Doing this check up front allows future hugetlb allocations from CMA to assume zone-valid CMA areas, avoiding additional validity checks and potential fallback/rollback paths, greatly simplifying the code. The pfn_valid() check is removed from cma_validate_zones() because mem_section is not initialized at that stage and it can trigger false warnings; keep the sanity check in cma_activate_area() instead. This is preparatory work for the follow-up simplification. Signed-off-by: Muchun Song Acked-by: Mike Rapoport (Microsoft) --- mm/cma.c | 3 ++- mm/hugetlb_cma.c | 3 ++- 2 files changed, 4 insertions(+), 2 deletions(-) diff --git a/mm/cma.c b/mm/cma.c index 15cc0ae76c8e..dd046a23f467 100644 --- a/mm/cma.c +++ b/mm/cma.c @@ -125,7 +125,6 @@ bool cma_validate_zones(struct cma *cma) * to be in the same zone. Simplify by forcing the entire * CMA resv range to be in the same zone.
*/ - WARN_ON_ONCE(!pfn_valid(base_pfn)); if (pfn_range_intersects_zones(cma->nid, base_pfn, cmr->count)) { set_bit(CMA_ZONES_INVALID, &cma->flags); return false; @@ -164,6 +163,8 @@ static void __init cma_activate_area(struct cma *cma) bitmap_set(cmr->bitmap, 0, bitmap_count); } =20 + WARN_ON_ONCE(!pfn_valid(cmr->base_pfn)); + for (pfn =3D early_pfn[r]; pfn < cmr->base_pfn + cmr->count; pfn +=3D pageblock_nr_pages) init_cma_reserved_pageblock(pfn_to_page(pfn)); diff --git a/mm/hugetlb_cma.c b/mm/hugetlb_cma.c index f83ae4998990..b068e9bf6537 100644 --- a/mm/hugetlb_cma.c +++ b/mm/hugetlb_cma.c @@ -233,9 +233,10 @@ void __init hugetlb_cma_reserve(void) res =3D cma_declare_contiguous_multi(size, PAGE_SIZE << order, HUGETLB_PAGE_ORDER, name, &hugetlb_cma[nid], nid); - if (res) { + if (res || !cma_validate_zones(hugetlb_cma[nid])) { pr_warn("hugetlb_cma: reservation failed: err %d, node %d", res, nid); + hugetlb_cma[nid] =3D NULL; continue; } =20 --=20 2.20.1 From nobody Sun Jun 14 19:01:43 2026 Received: from mail-pj1-f41.google.com (mail-pj1-f41.google.com [209.85.216.41]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A05BC33FE26 for ; Sun, 5 Apr 2026 12:55:01 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.216.41 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1775393702; cv=none; b=o9fqS10KTB7T3Ga0pV9jaEZfv11klMyatSDDYEKrZh3jD/wLH+mGJ9mBSoGZRMBev3gBqG8sD2BQHn+qZUl+VWW2sDCgtpMJOYglXVJuRLmxpLyCgQCPTOPS7SeboQDHCBc37zmWl0qKtelEwXWOnl10nWW01Gp3h4SOAPACpiI= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1775393702; c=relaxed/simple; bh=+tqs3vmPjitXuJiMczWEliJ96qns/Wy5BmztDctKHJ0=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; 
b=k7naXoNCsd3IZH7MdLrIPpUJyUNxr+0l4TWG+/OPH/WRovtpLXbGpqV49oWwxgM/cD6ik3Z6PfeifMY84s3vFl08CxKm/0uiEWZ3cznmZR916SiZFV7UcxG31fc9sOBrxF7fbinLcewSx+axF6SC/VkNGi/GbDMhGedCd7AFGgQ= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=bytedance.com; spf=pass smtp.mailfrom=bytedance.com; dkim=pass (2048-bit key) header.d=bytedance.com header.i=@bytedance.com header.b=ZDirT96x; arc=none smtp.client-ip=209.85.216.41 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=bytedance.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=bytedance.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=bytedance.com header.i=@bytedance.com header.b="ZDirT96x" Received: by mail-pj1-f41.google.com with SMTP id 98e67ed59e1d1-354bc7c2c46so1787440a91.0 for ; Sun, 05 Apr 2026 05:55:01 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1775393701; x=1775998501; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=QwFPS6YUW38IEMACgWNvn97a3hFqhS5O4/kjgd2aI1E=; b=ZDirT96xWEPinXJu8dbvU9/SSVC0ewd+g+xMtD5/GCoChQERanBybg5TvZMMxKjgcK xzYMCqbNhK6GqdUCKaANqSkuYcKNK5PAHz8a7hy/K8q8W9G2+e9NuHoIZ/tMsz7WWhRj rlyG17dwHVQJ1MA8CaNtMlE8oaQaOaKceM8uBhKCdoTi+MRvSmr/rpti+xxTK9im4uK7 nYWeaA/g/dbPjDlVInyrVDOOC7qSwpK2MJ0o81yUv69puF3+mfwgIBXzZMYh5EBPozZJ pPIVAks/PqKSkNf7sSkLoAEdafdAeUVUtUYO8ubnPexpZcCT97IrYDmoU48SOXPJ2mnF o4Iw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1775393701; x=1775998501; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=QwFPS6YUW38IEMACgWNvn97a3hFqhS5O4/kjgd2aI1E=; 
b=pQS5IJg/tr1x+0BELobR+7EQ95BnjU/Q9ExbW6PSNYCyLhmaUQI8AXNhQtBiCR+nuV oWIpLexf7K3RnLq6j7t7Au4rrlqxqDUKEorwdKqrkY0wvRF1ogJU5PlPzE/XsVA8gGwy BJ5rP9j3aN2pA3X4DUEjp7Gw2cgQa3pFOzjYXa7sB6gwk0pJvJUy9dv7S7OBZDR4tEaN L4V3euMuiPsqmPlc+MYSFqPHbvwQ6k0/5bnYkyhi2ahSOfFai9+L081ypsScBYvDnlDD Ox3IKUAH4iI7fT2IxrEhT30h6jGRHpgmJK4ze2/N63Omkrm6xog2BwHBiRTeNQp1kuD1 XruQ== X-Forwarded-Encrypted: i=1; AJvYcCXPd5xJQ6X6diXraB8AAleWjZlznJ/WrppHZqllyCyEwDv02gEEmFTD7qn9wGQCzlOLFV+Mv7bAB1bawjU=@vger.kernel.org X-Gm-Message-State: AOJu0YxGPxscx0rrXJit6Puq8tPxkohDjuajNNsfepN0jbSZw8N5F7bJ 04WpJHO7Yn5zcIUmoQhhSk+pldRXA+BLLQf9XHDrY4nV1iGHDO2Ln+K7u/ZRTWulBII= X-Gm-Gg: AeBDiesA/a+AR6amfvshq3UyBKWF9SK8Mkx4rFtg5kO37KPTVtK/ZlDnnmfZ0DAVqow SDvXx/fxXu6f597pv8Dtn9eVYZCx3krFyrtfP+ZiZRjuLPtbfN6/wbbQJk5eRjbI+5+vL32Bo8E 8mvX9ZtCqMskDuWoSwAgl8Ji8in4kkwgHbgVQuKKr9KA++fHKz04VJVqbOTQgFifDWuFpdxfmsQ SE1CvEbijJdm6BV86/9TfDosIDMcXyoqsbeu44gaN6P8WbeCG6DXPv7dFfr5qSFDQX3k/ttLy7V geap4k46ZLJchylgBJMAYbTblaasLy60oK/0eu25moyRNUCgvPdmYs38OsFadJMtk4Ib3cm5qge wfqSXwMQNeImOndYz4hyhVS/ebP9uO6ASYiV33L6azyOLXaqlL2yp2Ole8zDOGuy/0zMZO9l5xa zEJIOs0hUpsuCJlj2AEq3kr8BInwSJGIQuW0cJ4XpkzFU= X-Received: by 2002:a17:90a:d406:b0:35b:945d:752a with SMTP id 98e67ed59e1d1-35de68f82damr8709011a91.17.1775393700794; Sun, 05 Apr 2026 05:55:00 -0700 (PDT) Received: from n232-176-004.byted.org ([36.110.163.97]) by smtp.gmail.com with ESMTPSA id 98e67ed59e1d1-35de66b4808sm3748505a91.2.2026.04.05.05.54.53 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 05 Apr 2026 05:55:00 -0700 (PDT) From: Muchun Song To: Andrew Morton , David Hildenbrand , Muchun Song , Oscar Salvador , Michael Ellerman , Madhavan Srinivasan Cc: Lorenzo Stoakes , "Liam R . 
Howlett" , Vlastimil Babka , Mike Rapoport , Suren Baghdasaryan , Michal Hocko , Nicholas Piggin , Christophe Leroy , aneesh.kumar@linux.ibm.com, joao.m.martins@oracle.com, linux-mm@kvack.org, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, Muchun Song Subject: [PATCH 15/49] mm/hugetlb: free cross-zone bootmem gigantic pages after allocation Date: Sun, 5 Apr 2026 20:52:06 +0800 Message-Id: <20260405125240.2558577-16-songmuchun@bytedance.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20260405125240.2558577-1-songmuchun@bytedance.com> References: <20260405125240.2558577-1-songmuchun@bytedance.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Now that hugetlb reservation happens after free_area_init(), zone information is available during bootmem huge page allocation. This allows us to identify and handle cross-zone gigantic pages more precisely. During alloc_bootmem(), pages that intersect multiple zones are added to the head of the huge_boot_pages[nid] list (without the HUGE_BOOTMEM_ZONES_VALID flag), while pages that lie within a single zone are added to the tail (with the HUGE_BOOTMEM_ZONES_VALID flag). After allocation completes, hugetlb_free_cross_zone_pages() walks the list from the head and frees the cross-zone pages (entries without the HUGE_BOOTMEM_ZONES_VALID flag), stopping at the first entry that has the flag set. The count of freed pages is subtracted from the allocated count so that the final number reflects only valid huge pages. This applies to both the per-node allocation path and the global gigantic allocation path, simplifying the code by avoiding cross-zone checks at later stages.
Signed-off-by: Muchun Song --- mm/hugetlb.c | 53 ++++++++++++++++++++++++++++++++++++++++++++++------ 1 file changed, 47 insertions(+), 6 deletions(-) diff --git a/mm/hugetlb.c b/mm/hugetlb.c index d6ea11113f1d..238495fd04e4 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -3049,6 +3049,11 @@ struct folio *alloc_hugetlb_folio(struct vm_area_str= uct *vma, return ERR_PTR(-ENOSPC); } =20 +static bool __init hugetlb_bootmem_page_earlycma(struct huge_bootmem_page = *m) +{ + return m->flags & HUGE_BOOTMEM_CMA; +} + static __init void *alloc_bootmem(struct hstate *h, int nid, bool node_exa= ct) { struct huge_bootmem_page *m; @@ -3092,7 +3097,14 @@ static __init void *alloc_bootmem(struct hstate *h, = int nid, bool node_exact) * is not up yet. */ INIT_LIST_HEAD(&m->list); - list_add(&m->list, &huge_boot_pages[listnode]); + if (pfn_range_intersects_zones(listnode, PHYS_PFN(virt_to_phys(m)), + pages_per_huge_page(h))) { + VM_BUG_ON(hugetlb_bootmem_page_earlycma(m)); + list_add(&m->list, &huge_boot_pages[listnode]); + } else { + list_add_tail(&m->list, &huge_boot_pages[listnode]); + m->flags |=3D HUGE_BOOTMEM_ZONES_VALID; + } m->hstate =3D h; } =20 @@ -3186,11 +3198,6 @@ static bool __init hugetlb_bootmem_page_prehvo(struc= t huge_bootmem_page *m) return m->flags & HUGE_BOOTMEM_HVO; } =20 -static bool __init hugetlb_bootmem_page_earlycma(struct huge_bootmem_page = *m) -{ - return m->flags & HUGE_BOOTMEM_CMA; -} - /* * memblock-allocated pageblocks might not have the migrate type set * if marked with the 'noinit' flag. 
Set it to the default (MIGRATE_MOVABL= E) @@ -3393,6 +3400,34 @@ static void __init gather_bootmem_prealloc(void) padata_do_multithreaded(&job); } =20 +static unsigned long __init hugetlb_free_cross_zone_pages(struct hstate *h= , int nid) +{ + unsigned long freed =3D 0; + struct huge_bootmem_page *m, *tmp; + + if (!hstate_is_gigantic(h)) + return freed; + + list_for_each_entry_safe(m, tmp, &huge_boot_pages[nid], list) { + if (m->flags & HUGE_BOOTMEM_ZONES_VALID) + break; + + list_del(&m->list); + memblock_free(m, huge_page_size(h)); + freed++; + } + + if (freed) { + char buf[32]; + + string_get_size(huge_page_size(h), 1, STRING_UNITS_2, buf, sizeof(buf)); + pr_warn("HugeTLB: freeing %lu cross-zone hugepage of page size %s failed= node%d.\n", + freed, buf, nid); + } + + return freed; +} + static void __init hugetlb_hstate_alloc_pages_onenode(struct hstate *h, in= t nid) { unsigned long i; @@ -3423,6 +3458,8 @@ static void __init hugetlb_hstate_alloc_pages_onenode= (struct hstate *h, int nid) cond_resched(); } =20 + i -=3D hugetlb_free_cross_zone_pages(h, nid); + if (!list_empty(&folio_list)) prep_and_add_allocated_folios(h, &folio_list); =20 @@ -3496,6 +3533,7 @@ static void __init hugetlb_pages_alloc_boot_node(unsi= gned long start, unsigned l =20 static unsigned long __init hugetlb_gigantic_pages_alloc_boot(struct hstat= e *h) { + int nid; unsigned long i; =20 for (i =3D 0; i < h->max_huge_pages; ++i) { @@ -3504,6 +3542,9 @@ static unsigned long __init hugetlb_gigantic_pages_al= loc_boot(struct hstate *h) cond_resched(); } =20 + for_each_node(nid) + i -=3D hugetlb_free_cross_zone_pages(h, nid); + return i; } =20 --=20 2.20.1 From nobody Sun Jun 14 19:01:43 2026 Received: from mail-pj1-f44.google.com (mail-pj1-f44.google.com [209.85.216.44]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 4A9C782899 for ; Sun, 5 Apr 2026 12:55:07 +0000 (UTC) 
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.216.44 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1775393708; cv=none; b=jszSasmnxjwis24bS/baQSeL36im1CY+Vu6haKmNS1pEFp2fCropWpf3NiDbor84FRcu7bcZ0aqIUlDawb3OqXA+F27YXueXGNWmovxwgc1WfJkP3pXx49lZk81YEUN+XCAp6N7+6gQNQY7iLv2b+VPd8eYmgdkZ0HJhfwdEQpg= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1775393708; c=relaxed/simple; bh=Pz47zmWbhdLPtir+pZVpHOLi1ccM+J1599+EYdoV38Y=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=r/Kfsyx6hEov+NtgowRuwC68gn9Mjf1uCMc10pW9oCNYfWdhItCBIkQwmF9WFzSQOr5Lj4X86yf22T4xZGauohP3oJVBAW4hdO0Ki8ld5ZX2UbPHwsxT+tbQQtXFslOXa8kRug8QvSQreKHq+zi0K4t0NRRt4nSJHRcxl9itfgQ= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=bytedance.com; spf=pass smtp.mailfrom=bytedance.com; dkim=pass (2048-bit key) header.d=bytedance.com header.i=@bytedance.com header.b=guLIt3Pt; arc=none smtp.client-ip=209.85.216.44 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=bytedance.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=bytedance.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=bytedance.com header.i=@bytedance.com header.b="guLIt3Pt" Received: by mail-pj1-f44.google.com with SMTP id 98e67ed59e1d1-35d971fbcddso1861304a91.1 for ; Sun, 05 Apr 2026 05:55:07 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1775393707; x=1775998507; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=62kILe3INi6GkTRPTI+nnexklobUpHFgUm7h2fdMoGM=; b=guLIt3PtO9JUZEIxSq+jnpPxImxld3pzqay04sF8VqUgrQ9ogs3xg7cuzNcat3/8Md PS/WerIF3WBL5UPtUvx4W8QHfxdtQaJlfXlJ3Jv67QY6Y22AYfLpmnkHDlKexa9ehBya 
bL70THf5+gkEWAPwH9JkqegfDjpVlMDEXtNnhBlHQCzIdGr4/4IMx3H6cMhWqkWWwwOJ kvNWGD3fA8lKuM76AVgOF3iyl84Rp6Yq4/rL4PaR9lMJcTG5KdZZ1d8g43XW97BYi5nf wTOa7PrQEBxf6Ur3Dfo/rv/K+izITUy7SNQRU+/LGt/Rxe6acOsTGK57/tNuv4/b9jfD QBKg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1775393707; x=1775998507; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=62kILe3INi6GkTRPTI+nnexklobUpHFgUm7h2fdMoGM=; b=Ivrzb+f1MPZrylDMVYWhycaoYfERcp8j6yqt52pxY+XIFWKzK/aufZkUpPjb3P1aQ8 4OhWbNyimjkuC3MPSMbrRSbV0UPTS0G/mv+vYMIjhnL+WuiFVWkDR/jS7QqzGPOj0/2X SsDJhIKYWxEtO8sAMU108rnL3NFK3gL1oa935R9TIIBAO3wknZA2CYqkNL+6mgW4Eyip Fnyy7l5U2WrtxI0O/RFQU5AL8pnALfeF3MFIy+b1Ab8QiNx06SRKHMXM2oPxi2VIC5Un xfbyO3mJIX8hUjyLZnharOgqZflQOM4YvaBgS76bEfsDR3PM/rRU5v3eO1OnunHF/gX0 MOyA== X-Forwarded-Encrypted: i=1; AJvYcCWT5dP/YDFU36bkxAo8u8dY9SBcgHUeHCKw5sF3cr9ZavD9JweGXmjwrKi7IMEqic3kH3ljwO2fPUvJwGU=@vger.kernel.org X-Gm-Message-State: AOJu0YwRi8doywY0dhlz7gFFM26VNT/OaXBhKwSBBE+OQEUxoXoqp3TY m863C6+rIjpq2juTwyErTap+CSK+w3x+kc4U056JvDcw0Rnanlyoynk3mCKeOSlt6Y4= X-Gm-Gg: AeBDieumNfb4+8WBkZlMe7joqMDNnCeAtUBYORuR1UW61TQ6OFioKHlxzITiDuGoG4l AosBu9REt/dYlPDtRQZEsUS1OcBDKQS/dXSnqCXs5Fdyz61DpkQK2iDWcMNZyEiCKBXjs8qyTRq mOW2+rH8SQWL6K37sARu3UhFjwe6my2DImOiKNFrB3GTp7kyuYJTB/lLNHk50fAWuYu4jCXr5Ar jGWFFj+p610mZo4Gmb2W4WSHGMDe5zCdV1HbAX4It6kp3PKOtrjaMWebR4yyg1NL7o6yq3l9e0A JFQ8V/BjP1kl2nnG0WFtqbLMtcjLe8P5XGi0Okqh21ltoUalmutqsXZfZhhyi+vKqlZ9XVlzyZU vNHla8//pdimYupQYu6jBdg5J81tI/KsD74u+FUlW9c6/F0zeYqCx57WNXGQ/PdAgWbJU8fTerJ hXnXyNo9Sqc+4KLyloRINJf3Dlk7xLE9sSvtoXE1muVk4= X-Received: by 2002:a17:90b:3d4b:b0:35d:9fe9:f830 with SMTP id 98e67ed59e1d1-35de5c42865mr7039662a91.12.1775393706456; Sun, 05 Apr 2026 05:55:06 -0700 (PDT) Received: from n232-176-004.byted.org ([36.110.163.97]) by smtp.gmail.com with ESMTPSA id 98e67ed59e1d1-35de66b4808sm3748505a91.2.2026.04.05.05.55.01 
(version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 05 Apr 2026 05:55:06 -0700 (PDT) From: Muchun Song To: Andrew Morton , David Hildenbrand , Muchun Song , Oscar Salvador , Michael Ellerman , Madhavan Srinivasan Cc: Lorenzo Stoakes , "Liam R . Howlett" , Vlastimil Babka , Mike Rapoport , Suren Baghdasaryan , Michal Hocko , Nicholas Piggin , Christophe Leroy , aneesh.kumar@linux.ibm.com, joao.m.martins@oracle.com, linux-mm@kvack.org, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, Muchun Song Subject: [PATCH 16/49] mm/hugetlb: initialize vmemmap optimization in early stage Date: Sun, 5 Apr 2026 20:52:07 +0800 Message-Id: <20260405125240.2558577-17-songmuchun@bytedance.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20260405125240.2558577-1-songmuchun@bytedance.com> References: <20260405125240.2558577-1-songmuchun@bytedance.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Move pfn_to_zone() to be available for hugetlb_vmemmap_init_early(). Populate vmemmap HVO in hugetlb_vmemmap_init_early() for bootmem allocated huge pages. The zone information is already available in hugetlb_vmemmap_init_early(), so there is no need to wait for hugetlb_vmemmap_init_late() to access it. This prepares for the removal of hugetlb_vmemmap_init_late(). 
Signed-off-by: Muchun Song --- mm/hugetlb_vmemmap.c | 38 ++++++++++++++++++++++++-------------- 1 file changed, 24 insertions(+), 14 deletions(-) diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c index 50b7123f3bdd..e25c70453928 100644 --- a/mm/hugetlb_vmemmap.c +++ b/mm/hugetlb_vmemmap.c @@ -745,6 +745,20 @@ static bool vmemmap_should_optimize_bootmem_page(struc= t huge_bootmem_page *m) return true; } =20 +static struct zone *pfn_to_zone(unsigned nid, unsigned long pfn) +{ + struct zone *zone; + enum zone_type zone_type; + + for (zone_type =3D 0; zone_type < MAX_NR_ZONES; zone_type++) { + zone =3D &NODE_DATA(nid)->node_zones[zone_type]; + if (zone_spans_pfn(zone, pfn)) + return zone; + } + + return NULL; +} + /* * Initialize memmap section for a gigantic page, HVO-style. */ @@ -752,6 +766,7 @@ void __init hugetlb_vmemmap_init_early(int nid) { unsigned long psize, paddr, section_size; unsigned long ns, i, pnum, pfn, nr_pages; + unsigned long start, end; struct huge_bootmem_page *m =3D NULL; void *map; =20 @@ -761,6 +776,8 @@ void __init hugetlb_vmemmap_init_early(int nid) section_size =3D (1UL << PA_SECTION_SHIFT); =20 list_for_each_entry(m, &huge_boot_pages[nid], list) { + struct zone *zone; + if (!vmemmap_should_optimize_bootmem_page(m)) continue; =20 @@ -769,6 +786,13 @@ void __init hugetlb_vmemmap_init_early(int nid) paddr =3D virt_to_phys(m); pfn =3D PHYS_PFN(paddr); map =3D pfn_to_page(pfn); + start =3D (unsigned long)map; + end =3D start + nr_pages * sizeof(struct page); + zone =3D pfn_to_zone(nid, pfn); + + BUG_ON(vmemmap_populate_hvo(start, end, huge_page_order(m->hstate), + zone, HUGETLB_VMEMMAP_RESERVE_SIZE)); + memmap_boot_pages_add(HUGETLB_VMEMMAP_RESERVE_SIZE / PAGE_SIZE); =20 pnum =3D pfn_to_section_nr(pfn); ns =3D psize / section_size; @@ -784,20 +808,6 @@ void __init hugetlb_vmemmap_init_early(int nid) } } =20 -static struct zone *pfn_to_zone(unsigned nid, unsigned long pfn) -{ - struct zone *zone; - enum zone_type zone_type; - - for 
(zone_type =3D 0; zone_type < MAX_NR_ZONES; zone_type++) { - zone =3D &NODE_DATA(nid)->node_zones[zone_type]; - if (zone_spans_pfn(zone, pfn)) - return zone; - } - - return NULL; -} - void __init hugetlb_vmemmap_init_late(int nid) { struct huge_bootmem_page *m, *tm; --=20 2.20.1 From nobody Sun Jun 14 19:01:43 2026 Received: from mail-pj1-f48.google.com (mail-pj1-f48.google.com [209.85.216.48]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 3490682899 for ; Sun, 5 Apr 2026 12:55:13 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.216.48 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1775393714; cv=none; b=aOyUYTI5Am7Hm79bDMLqm+CpioT3DCWJC13+gukRsELetQfuWCVVw/tKhoHS390mPJ7MRRsynNzU1nkYrTnSaSxzdyHWn99Hd8fAMKw/RYSAafUAsFu4eZXAnaE8tRPyIe2iBFTEOrLCdLJVEiOeyYR5YHW4XCnuWLKmqMw/6lc= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1775393714; c=relaxed/simple; bh=4IVX+MhNYfYCp8edraOUZMwJxVl1fADGEXyN1ouqwbc=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=MVCU59ckZ++kgaQsonLrlLyIpPlU50LXIloiHtuQtR8hR26qmKu7NdgPh5lf/fVuhmcUV3+13FwZcljPYriAetEuG69melJjPwikToGFD5y2WVj+cI9EUF8S1ThSFhhnmkkzabjL6WxuIfroMb18G1BQgWIuczvhuPtkeENVpEk= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=bytedance.com; spf=pass smtp.mailfrom=bytedance.com; dkim=pass (2048-bit key) header.d=bytedance.com header.i=@bytedance.com header.b=lIUWUBnc; arc=none smtp.client-ip=209.85.216.48 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=bytedance.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=bytedance.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=bytedance.com header.i=@bytedance.com 
header.b="lIUWUBnc" Received: by mail-pj1-f48.google.com with SMTP id 98e67ed59e1d1-35da9692ec3so2875319a91.1 for ; Sun, 05 Apr 2026 05:55:13 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1775393713; x=1775998513; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=KKli8yHSnxAMZ2GmFUDAbnEpbh1w6hBHm6LRjyjJACk=; b=lIUWUBncQVfXdWxYpxPZ491gvdhth5GK3pEGdOBmL2Dq0sJ+MECKjbI+doLqM1GAHx 03nTKDH/hwpFpwHH47yEnyOrA6u2gei/PmAgNtqvLHUN1YTO9lyOf0ObeF921Oi4v6zs cnTvFs/knVW1K/g2eYRMZHtVxT6XqNClFHkz+NeH+jVcbm+QdcC+grjc6Py9z0tWZIJ/ cQhJsFTCoB88VOEenLu5aIqwBckjK8lRVyGEpxXM3HkL/82YysMi5TKnUwPEcaMhP/9t S3k+A46Cr58ovdjhJ/6DPP7iXxPlcD3jstU1McuG2jPBNzaWnmLKlsElt1rgJ48dUXdm fQQQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1775393713; x=1775998513; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=KKli8yHSnxAMZ2GmFUDAbnEpbh1w6hBHm6LRjyjJACk=; b=pnUKEZV7yisAguiq96SA0bpdEPXSbAKYL0XgX6onTOjcpWxCLMJ/36RgZelqrrF84J BFPdqVNYdoSHqUOJN7kwbI8LdDF4TCJCd2KA3nCQ4fJZM6ZNAaNCrbMFtMhtRE4aEeUj 3hn9SOOZAyE0O7zMGVUtqT/LhfqQSXt+9ELZFx1TxiLo5368uIhf96LZKC6330XI5/cy AdaUGBxKbMqHVeOMatFEvIPKmjb2W2eTx58qcy7sZqTebsiru/MxoW4o6glU8c5TqOMb aA7aHsfhrXvxxMGDigE8n3egg1wUpx2GQo1GqFQUEKBIVpBlMKK7XJCTxfQnSVY5q0NT ztEQ== X-Forwarded-Encrypted: i=1; AJvYcCWFgRjU8HMysJBM4egQuL9MnA0iS0K9qkpIHRecJ6Cz180FdCUGOHc3us0lwXWNNlC+jJ88VnXSsbpqqJw=@vger.kernel.org X-Gm-Message-State: AOJu0YzRwqq2OcfDdmSnUoh9MmD7ikLygRtLVRdLytRj6ScRxklU0dLU MP8OO1u63blp9Tv/vx421XcuVGo3yFFFvynDWoSZlQ9dk7+k1vWAgnySzHc5uVki2JA= X-Gm-Gg: AeBDieu2rCImgo/Prdw2ZYMKeVvrxbe3QfE1FDvygv0WudEcTFbiYcgzv9YoDSFOnR/ ARNk53FFVnElfcGU1aiInlHp4gT20Sof8HVwj2rhp4pE4W0+FiuxAtbIfUGcuB6OgNICpJW4mme 
13KWmW4sDr88fQ+OH8DreNbC1mWLWFvsCzSQGaguA2K5ptSJPYtXQ/4zHsPTVDcM6WZbzhNmsJZ glwMvtxt/VWWPbdlLvegWdgt6P16B+2C0tTcz5Kl/Ozeh9KgbEDQB+nskiE2GDGxjg7C55O2ONX XlGJdddQjMFY/n3KaMYlusvqRh+GXKjoQbI+R3C8ahX3FudX1frGeVWJaULUqWhbNfcr98yIJlI CtU8svTycjSMWXbr8XdidTuVQJgfD7Ghi3v3H/FspAD/Vffs3MGw89BqdLhr1G58O2L8Ly7wroU QFSqc1G9x5CN6vtbTT9rayRfC9lPdru6uPsv1Nj0pF6E+N07lE/c9Hgg== X-Received: by 2002:a17:90b:224e:b0:35d:a90d:580e with SMTP id 98e67ed59e1d1-35de693d12amr8416296a91.23.1775393712638; Sun, 05 Apr 2026 05:55:12 -0700 (PDT) Received: from n232-176-004.byted.org ([36.110.163.97]) by smtp.gmail.com with ESMTPSA id 98e67ed59e1d1-35de66b4808sm3748505a91.2.2026.04.05.05.55.06 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 05 Apr 2026 05:55:12 -0700 (PDT) From: Muchun Song To: Andrew Morton , David Hildenbrand , Muchun Song , Oscar Salvador , Michael Ellerman , Madhavan Srinivasan Cc: Lorenzo Stoakes , "Liam R . Howlett" , Vlastimil Babka , Mike Rapoport , Suren Baghdasaryan , Michal Hocko , Nicholas Piggin , Christophe Leroy , aneesh.kumar@linux.ibm.com, joao.m.martins@oracle.com, linux-mm@kvack.org, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, Muchun Song Subject: [PATCH 17/49] mm: remove sparse_vmemmap_init_nid_late() Date: Sun, 5 Apr 2026 20:52:08 +0800 Message-Id: <20260405125240.2558577-18-songmuchun@bytedance.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20260405125240.2558577-1-songmuchun@bytedance.com> References: <20260405125240.2558577-1-songmuchun@bytedance.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" After deferring hugetlb bootmem allocation until after free_area_init() and checking cross-zone pages during allocation, the hugetlb_vmemmap_init_l= ate() function is no longer needed: 1. 
hugetlb_bootmem_alloc() is now called after free_area_init(), so zone information is available during bootmem huge page allocation. 2. During alloc_bootmem(), cross-zone pages are identified: only pages that lie within a single zone are marked with the HUGE_BOOTMEM_ZONES_VALID flag. 3. After allocation, hugetlb_free_cross_zone_pages() frees those pages that intersect multiple zones. Since cross-zone pages are already handled in the allocation path, the late= -stage validation in hugetlb_vmemmap_init_late() is redundant and can be removed. Also, the sparse_vmemmap_init_nid_late() function is now empty and unused. Remove it to clean up the code. Signed-off-by: Muchun Song --- include/linux/hugetlb.h | 2 -- include/linux/mmzone.h | 7 ----- mm/hugetlb.c | 70 ----------------------------------------- mm/hugetlb_vmemmap.c | 58 ---------------------------------- mm/hugetlb_vmemmap.h | 5 --- mm/sparse-vmemmap.c | 11 ------- mm/sparse.c | 2 -- 7 files changed, 155 deletions(-) diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h index 9c098a02a09e..23d95ed6121f 100644 --- a/include/linux/hugetlb.h +++ b/include/linux/hugetlb.h @@ -699,8 +699,6 @@ struct huge_bootmem_page { #define HUGE_BOOTMEM_ZONES_VALID 0x0002 #define HUGE_BOOTMEM_CMA 0x0004 =20 -bool hugetlb_bootmem_page_zones_valid(int nid, struct huge_bootmem_page *m= ); - int isolate_or_dissolve_huge_folio(struct folio *folio, struct list_head *= list); int replace_free_hugepage_folios(unsigned long start_pfn, unsigned long en= d_pfn); void wait_for_freed_hugetlb_folios(void); diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h index a071f1a0e242..8ee9dc60120a 100644 --- a/include/linux/mmzone.h +++ b/include/linux/mmzone.h @@ -2153,8 +2153,6 @@ static inline int preinited_vmemmap_section(const str= uct mem_section *section) } =20 void sparse_vmemmap_init_nid_early(int nid); -void sparse_vmemmap_init_nid_late(int nid); - #else static inline int preinited_vmemmap_section(const struct mem_section *sect= ion) { @@ -2163,10 +2161,6 @@ static inline
int preinited_vmemmap_section(const st= ruct mem_section *section) static inline void sparse_vmemmap_init_nid_early(int nid) { } - -static inline void sparse_vmemmap_init_nid_late(int nid) -{ -} #endif =20 static inline int online_section_nr(unsigned long nr) @@ -2371,7 +2365,6 @@ static inline unsigned long next_present_section_nr(u= nsigned long section_nr) =20 #else #define sparse_vmemmap_init_nid_early(_nid) do {} while (0) -#define sparse_vmemmap_init_nid_late(_nid) do {} while (0) #define pfn_in_present_section pfn_valid #endif /* CONFIG_SPARSEMEM */ =20 diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 238495fd04e4..a00c9f3672b7 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -58,7 +58,6 @@ struct hstate hstates[HUGE_MAX_HSTATE]; =20 __initdata nodemask_t hugetlb_bootmem_nodes; __initdata struct list_head huge_boot_pages[MAX_NUMNODES]; -static unsigned long hstate_boot_nrinvalid[HUGE_MAX_HSTATE] __initdata; =20 /* * Due to ordering constraints across the init code for various @@ -3254,57 +3253,6 @@ static void __init prep_and_add_bootmem_folios(struc= t hstate *h, } } =20 -bool __init hugetlb_bootmem_page_zones_valid(int nid, - struct huge_bootmem_page *m) -{ - unsigned long start_pfn; - bool valid; - - if (m->flags & HUGE_BOOTMEM_ZONES_VALID) { - /* - * Already validated, skip check. - */ - return true; - } - - if (hugetlb_bootmem_page_earlycma(m)) { - valid =3D cma_validate_zones(m->cma); - goto out; - } - - start_pfn =3D virt_to_phys(m) >> PAGE_SHIFT; - - valid =3D !pfn_range_intersects_zones(nid, start_pfn, - pages_per_huge_page(m->hstate)); -out: - if (!valid) - hstate_boot_nrinvalid[hstate_index(m->hstate)]++; - - return valid; -} - -/* - * Free a bootmem page that was found to be invalid (intersecting with - * multiple zones). - * - * Since it intersects with multiple zones, we can't just do a free - * operation on all pages at once, but instead have to walk all - * pages, freeing them one by one. 
- */ -static void __init hugetlb_bootmem_free_invalid_page(int nid, struct page = *page, - struct hstate *h) -{ - unsigned long npages =3D pages_per_huge_page(h); - unsigned long pfn; - - while (npages--) { - pfn =3D page_to_pfn(page); - __init_page_from_nid(pfn, nid); - free_reserved_page(page); - page++; - } -} - /* * Put bootmem huge pages into the standard lists after mem_map is up. * Note: This only applies to gigantic (order > MAX_PAGE_ORDER) pages. @@ -3320,17 +3268,6 @@ static void __init gather_bootmem_prealloc_node(unsi= gned long nid) struct folio *folio =3D (void *)page; =20 h =3D m->hstate; - if (!hugetlb_bootmem_page_zones_valid(nid, m)) { - /* - * Can't use this page. Initialize the - * page structures if that hasn't already - * been done, and give them to the page - * allocator. - */ - hugetlb_bootmem_free_invalid_page(nid, page, h); - continue; - } - /* * It is possible to have multiple huge page sizes (hstates) * in this list. If so, process each size separately. @@ -3700,20 +3637,13 @@ static void __init hugetlb_init_hstates(void) static void __init report_hugepages(void) { struct hstate *h; - unsigned long nrinvalid; =20 for_each_hstate(h) { char buf[32]; =20 - nrinvalid =3D hstate_boot_nrinvalid[hstate_index(h)]; - h->max_huge_pages -=3D nrinvalid; - string_get_size(huge_page_size(h), 1, STRING_UNITS_2, buf, 32); pr_info("HugeTLB: registered %s page size, pre-allocated %ld pages\n", buf, h->nr_huge_pages); - if (nrinvalid) - pr_info("HugeTLB: %s page size: %lu invalid page%s discarded\n", - buf, nrinvalid, str_plural(nrinvalid)); pr_info("HugeTLB: %d KiB vmemmap can be freed for a %s page\n", hugetlb_vmemmap_optimizable_size(h) / SZ_1K, buf); } diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c index e25c70453928..535f0369a496 100644 --- a/mm/hugetlb_vmemmap.c +++ b/mm/hugetlb_vmemmap.c @@ -807,64 +807,6 @@ void __init hugetlb_vmemmap_init_early(int nid) m->flags |=3D HUGE_BOOTMEM_HVO; } } - -void __init hugetlb_vmemmap_init_late(int nid) 
-{ - struct huge_bootmem_page *m, *tm; - unsigned long phys, nr_pages, start, end; - unsigned long pfn, nr_mmap; - struct zone *zone =3D NULL; - struct hstate *h; - void *map; - - if (!READ_ONCE(vmemmap_optimize_enabled)) - return; - - list_for_each_entry_safe(m, tm, &huge_boot_pages[nid], list) { - if (!(m->flags & HUGE_BOOTMEM_HVO)) - continue; - - phys =3D virt_to_phys(m); - h =3D m->hstate; - pfn =3D PHYS_PFN(phys); - nr_pages =3D pages_per_huge_page(h); - map =3D pfn_to_page(pfn); - start =3D (unsigned long)map; - end =3D start + nr_pages * sizeof(struct page); - - if (!hugetlb_bootmem_page_zones_valid(nid, m)) { - /* - * Oops, the hugetlb page spans multiple zones. - * Remove it from the list, and populate it normally. - */ - list_del(&m->list); - - vmemmap_populate(start, end, nid, NULL, NULL); - nr_mmap =3D end - start; - memmap_boot_pages_add(DIV_ROUND_UP(nr_mmap, PAGE_SIZE)); - - memblock_phys_free(phys, huge_page_size(h)); - continue; - } - - if (!zone || !zone_spans_pfn(zone, pfn)) - zone =3D pfn_to_zone(nid, pfn); - if (WARN_ON_ONCE(!zone)) - continue; - - if (vmemmap_populate_hvo(start, end, huge_page_order(h), zone, - HUGETLB_VMEMMAP_RESERVE_SIZE) < 0) { - /* Fallback if HVO population fails */ - vmemmap_populate(start, end, nid, NULL, NULL); - nr_mmap =3D end - start; - } else { - m->flags |=3D HUGE_BOOTMEM_ZONES_VALID; - nr_mmap =3D HUGETLB_VMEMMAP_RESERVE_SIZE; - } - - memmap_boot_pages_add(DIV_ROUND_UP(nr_mmap, PAGE_SIZE)); - } -} #endif =20 static const struct ctl_table hugetlb_vmemmap_sysctls[] =3D { diff --git a/mm/hugetlb_vmemmap.h b/mm/hugetlb_vmemmap.h index 18b490825215..7ac49c52457d 100644 --- a/mm/hugetlb_vmemmap.h +++ b/mm/hugetlb_vmemmap.h @@ -29,7 +29,6 @@ void hugetlb_vmemmap_optimize_folios(struct hstate *h, st= ruct list_head *folio_l void hugetlb_vmemmap_optimize_bootmem_folios(struct hstate *h, struct list= _head *folio_list); #ifdef CONFIG_SPARSEMEM_VMEMMAP_PREINIT void hugetlb_vmemmap_init_early(int nid); -void 
hugetlb_vmemmap_init_late(int nid); #endif =20 =20 @@ -81,10 +80,6 @@ static inline void hugetlb_vmemmap_init_early(int nid) { } =20 -static inline void hugetlb_vmemmap_init_late(int nid) -{ -} - static inline unsigned int hugetlb_vmemmap_optimizable_size(const struct h= state *h) { return 0; diff --git a/mm/sparse-vmemmap.c b/mm/sparse-vmemmap.c index b7201c235419..26cb55c12a83 100644 --- a/mm/sparse-vmemmap.c +++ b/mm/sparse-vmemmap.c @@ -581,17 +581,6 @@ void __init sparse_vmemmap_init_nid_early(int nid) { hugetlb_vmemmap_init_early(nid); } - -/* - * This is called just before the initialization of page structures - * through memmap_init. Zones are now initialized, so any work that - * needs to be done that needs zone information can be done from - * here. - */ -void __init sparse_vmemmap_init_nid_late(int nid) -{ - hugetlb_vmemmap_init_late(nid); -} #endif =20 static void subsection_mask_set(unsigned long *map, unsigned long pfn, diff --git a/mm/sparse.c b/mm/sparse.c index d940b973df66..5fe0a7e66775 100644 --- a/mm/sparse.c +++ b/mm/sparse.c @@ -383,8 +383,6 @@ static void __init sparse_init_nid(int nid, unsigned lo= ng pnum_begin, } sparse_usage_fini(); sparse_buffer_fini(); - - sparse_vmemmap_init_nid_late(nid); } =20 /* --=20 2.20.1 From nobody Sun Jun 14 19:01:43 2026 Received: from mail-pj1-f48.google.com (mail-pj1-f48.google.com [209.85.216.48]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B1C9582899 for ; Sun, 5 Apr 2026 12:55:19 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.216.48 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1775393721; cv=none; b=FBVGjbfmODCsrkl/XPOesWnk84vuIM+7gthbE4lRApeXXsR2L/WP7gGX2GMs+rg2Fps1NBXvgfykPIw9wXxDqx5u5KSFioxcw0fA7lSHSIHNr0AcymTmLtXzg752BLj0T85hBKXxf9mUVwstZJ8UuSa2Dr4YdJePoOfEqFVrzR0= ARC-Message-Signature: i=1; a=rsa-sha256; 
d=subspace.kernel.org; s=arc-20240116; t=1775393721; c=relaxed/simple; bh=cVRJ949Mny0yDUOvuUsfAh9U4OEoYHeQil4gwt/JMHs=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=e8pMOktYThppEdUzASX7Qe10inSaIJ+a70HvTw1kKu9ws5rga71c2Rq+rC8De8J6ZujtKS3FYZ/bFfVCKFrYVE9qp5Tomt266/mFLQpD9uTvoLfhF5eJe8gD4yOveadsL2QweiIQOkmqZJt+1fQAemMSnumL64wJEepHgIwbmMg= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=bytedance.com; spf=pass smtp.mailfrom=bytedance.com; dkim=pass (2048-bit key) header.d=bytedance.com header.i=@bytedance.com header.b=ATv6pz24; arc=none smtp.client-ip=209.85.216.48 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=bytedance.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=bytedance.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=bytedance.com header.i=@bytedance.com header.b="ATv6pz24" Received: by mail-pj1-f48.google.com with SMTP id 98e67ed59e1d1-35d9c7bf9a1so2804759a91.3 for ; Sun, 05 Apr 2026 05:55:19 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1775393719; x=1775998519; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=VJ1CmGSdxm9qljCoiKFVSLrEpkjzI4pVEVcuHUoMZpY=; b=ATv6pz24ij3p917wBNDrzb7+BwgK/J6Um5d1f63qf/zL8Nfs+Y1YqsPhl70YRLt3rR HZ9T2WWb76aSxIsdvisPkkwSt9SZrN0wB0VqabIgo3AAXT2oeVwFR3LaaX15mU5k4ST0 /2B+yHisJURoF0W2UTAJFg0ORv5tEC9nR4Mi9TP1MYdUAHENTshKgHmti6B9oDo8Gh1u tj7CT77w/w3+TRIxx/hReQ7nm7NA3EhI4lwx89AcVefdr1nNRNwx0fkCVhSyVLYkodif PsBws9nlkef710eOk7zTbCQ+A4JgyHwaboj74QSnEoFp2GhMDGyMVLRYVCsQIipq2PGs lwLA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1775393719; x=1775998519; h=content-transfer-encoding:mime-version:references:in-reply-to 
From: Muchun Song
To: Andrew Morton , David Hildenbrand , Muchun Song , Oscar Salvador , Michael Ellerman , Madhavan Srinivasan
Cc: Lorenzo Stoakes , "Liam R .
Howlett" , Vlastimil Babka , Mike Rapoport , Suren Baghdasaryan , Michal Hocko , Nicholas Piggin , Christophe Leroy , aneesh.kumar@linux.ibm.com, joao.m.martins@oracle.com, linux-mm@kvack.org, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, Muchun Song
Subject: [PATCH 18/49] mm/mm_init: make __init_page_from_nid() static
Date: Sun, 5 Apr 2026 20:52:09 +0800
Message-Id: <20260405125240.2558577-19-songmuchun@bytedance.com>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20260405125240.2558577-1-songmuchun@bytedance.com>
References: <20260405125240.2558577-1-songmuchun@bytedance.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

Since the previous commit removed the only external user of
__init_page_from_nid(), the function is now used only within
mm/mm_init.c, inside the CONFIG_DEFERRED_STRUCT_PAGE_INIT block.

Make __init_page_from_nid() static and move it inside the
CONFIG_DEFERRED_STRUCT_PAGE_INIT block to clean up the code.

Signed-off-by: Muchun Song --- mm/internal.h | 1 - mm/mm_init.c | 4 ++-- 2 files changed, 2 insertions(+), 3 deletions(-) diff --git a/mm/internal.h b/mm/internal.h index 8232084f0c5e..a8acabcd1d93 100644 --- a/mm/internal.h +++ b/mm/internal.h @@ -1755,7 +1755,6 @@ static inline bool pte_needs_soft_dirty_wp(struct vm_= area_struct *vma, pte_t pte =20 void __meminit __init_single_page(struct page *page, unsigned long pfn, unsigned long zone, int nid); -void __meminit __init_page_from_nid(unsigned long pfn, int nid); =20 /* shrinker related functions */ unsigned long shrink_slab(gfp_t gfp_mask, int nid, struct mem_cgroup *memc= g, diff --git a/mm/mm_init.c b/mm/mm_init.c index 7a710fcbe3c8..977a837b7ef6 100644 --- a/mm/mm_init.c +++ b/mm/mm_init.c @@ -686,10 +686,11 @@ static __meminit void pageblock_migratetype_init_rang= e(unsigned long pfn, } } =20 +#ifdef CONFIG_DEFERRED_STRUCT_PAGE_INIT /* * Initialize a reserved page unconditionally, finding its zone first. */ -void __meminit __init_page_from_nid(unsigned long pfn, int nid) +static void __meminit __init_page_from_nid(unsigned long pfn, int nid) { pg_data_t *pgdat; int zid; @@ -709,7 +710,6 @@ void __meminit __init_page_from_nid(unsigned long pfn, = int nid) false); } =20 -#ifdef CONFIG_DEFERRED_STRUCT_PAGE_INIT static inline void pgdat_set_deferred_range(pg_data_t *pgdat) { pgdat->first_deferred_pfn =3D ULONG_MAX; --=20 2.20.1 From nobody Sun Jun 14 19:01:43 2026 Received: from mail-pj1-f41.google.com (mail-pj1-f41.google.com [209.85.216.41]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 987C32C029D for ; Sun, 5 Apr 2026 12:55:28 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.216.41 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1775393730; cv=none; 
From: Muchun Song
To: Andrew Morton , David
Hildenbrand , Muchun Song , Oscar Salvador , Michael Ellerman , Madhavan Srinivasan Cc: Lorenzo Stoakes , "Liam R . Howlett" , Vlastimil Babka , Mike Rapoport , Suren Baghdasaryan , Michal Hocko , Nicholas Piggin , Christophe Leroy , aneesh.kumar@linux.ibm.com, joao.m.martins@oracle.com, linux-mm@kvack.org, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, Muchun Song Subject: [PATCH 19/49] mm/sparse-vmemmap: remove the VMEMMAP_POPULATE_PAGEREF flag Date: Sun, 5 Apr 2026 20:52:10 +0800 Message-Id: <20260405125240.2558577-20-songmuchun@bytedance.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20260405125240.2558577-1-songmuchun@bytedance.com> References: <20260405125240.2558577-1-songmuchun@bytedance.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" The VMEMMAP_POPULATE_PAGEREF flag is only used to ensure that we call get_page() when slab is available, as mentioned in the comment: "and through vmemmap_populate_compound_pages() when slab is available". Since we can check slab_is_available() directly, the flag and the associated argument passing can be removed to simplify the code. Signed-off-by: Muchun Song --- mm/sparse-vmemmap.c | 40 ++++++++++++++-------------------------- 1 file changed, 14 insertions(+), 26 deletions(-) diff --git a/mm/sparse-vmemmap.c b/mm/sparse-vmemmap.c index 26cb55c12a83..3fdb6808e8ab 100644 --- a/mm/sparse-vmemmap.c +++ b/mm/sparse-vmemmap.c @@ -33,13 +33,6 @@ #include =20 #include "hugetlb_vmemmap.h" - -/* - * Flags for vmemmap_populate_range and friends. 
- */ -/* Get a ref on the head page struct page, for ZONE_DEVICE compound pages = */ -#define VMEMMAP_POPULATE_PAGEREF 0x0001 - #include "internal.h" =20 /* @@ -152,8 +145,8 @@ void __meminit vmemmap_verify(pte_t *pte, int node, } =20 static pte_t * __meminit vmemmap_pte_populate(pmd_t *pmd, unsigned long ad= dr, int node, - struct vmem_altmap *altmap, - unsigned long ptpfn, unsigned long flags) + struct vmem_altmap *altmap, + unsigned long ptpfn) { pte_t *pte =3D pte_offset_kernel(pmd, addr); if (pte_none(ptep_get(pte))) { @@ -175,7 +168,7 @@ static pte_t * __meminit vmemmap_pte_populate(pmd_t *pm= d, unsigned long addr, in * and through vmemmap_populate_compound_pages() when * slab is available. */ - if (flags & VMEMMAP_POPULATE_PAGEREF) + if (slab_is_available()) get_page(pfn_to_page(ptpfn)); } entry =3D pfn_pte(ptpfn, PAGE_KERNEL); @@ -248,8 +241,7 @@ static pgd_t * __meminit vmemmap_pgd_populate(unsigned = long addr, int node) =20 static pte_t * __meminit vmemmap_populate_address(unsigned long addr, int = node, struct vmem_altmap *altmap, - unsigned long ptpfn, - unsigned long flags) + unsigned long ptpfn) { pgd_t *pgd; p4d_t *p4d; @@ -269,7 +261,7 @@ static pte_t * __meminit vmemmap_populate_address(unsig= ned long addr, int node, pmd =3D vmemmap_pmd_populate(pud, addr, node); if (!pmd) return NULL; - pte =3D vmemmap_pte_populate(pmd, addr, node, altmap, ptpfn, flags); + pte =3D vmemmap_pte_populate(pmd, addr, node, altmap, ptpfn); if (!pte) return NULL; vmemmap_verify(pte, node, addr, addr + PAGE_SIZE); @@ -280,15 +272,14 @@ static pte_t * __meminit vmemmap_populate_address(uns= igned long addr, int node, static int __meminit vmemmap_populate_range(unsigned long start, unsigned long end, int node, struct vmem_altmap *altmap, - unsigned long ptpfn, - unsigned long flags) + unsigned long ptpfn) { unsigned long addr =3D start; pte_t *pte; =20 for (; addr < end; addr +=3D PAGE_SIZE) { pte =3D vmemmap_populate_address(addr, node, altmap, - ptpfn, flags); + 
ptpfn); if (!pte) return -ENOMEM; } @@ -306,7 +297,7 @@ int __meminit vmemmap_populate_basepages(unsigned long = start, unsigned long end, { if (vmemmap_can_optimize(altmap, pgmap)) return vmemmap_populate_compound_pages(start, end, node, pgmap); - return vmemmap_populate_range(start, end, node, altmap, -1, 0); + return vmemmap_populate_range(start, end, node, altmap, -1); } =20 /* @@ -382,7 +373,7 @@ int __meminit vmemmap_populate_hvo(unsigned long addr, = unsigned long end, return -ENOMEM; =20 for (maddr =3D addr; maddr < addr + headsize; maddr +=3D PAGE_SIZE) { - pte =3D vmemmap_populate_address(maddr, node, NULL, -1, 0); + pte =3D vmemmap_populate_address(maddr, node, NULL, -1); if (!pte) return -ENOMEM; } @@ -390,8 +381,7 @@ int __meminit vmemmap_populate_hvo(unsigned long addr, = unsigned long end, /* * Reuse the last page struct page mapped above for the rest. */ - return vmemmap_populate_range(maddr, end, node, NULL, - page_to_pfn(tail), 0); + return vmemmap_populate_range(maddr, end, node, NULL, page_to_pfn(tail)); } #endif =20 @@ -518,8 +508,7 @@ static int __meminit vmemmap_populate_compound_pages(un= signed long start, * with just tail struct pages. 
*/ return vmemmap_populate_range(start, end, node, NULL, - pte_pfn(ptep_get(pte)), - VMEMMAP_POPULATE_PAGEREF); + pte_pfn(ptep_get(pte))); } =20 size =3D min(end - start, pgmap_vmemmap_nr(pgmap) * sizeof(struct page)); @@ -527,13 +516,13 @@ static int __meminit vmemmap_populate_compound_pages(= unsigned long start, unsigned long next, last =3D addr + size; =20 /* Populate the head page vmemmap page */ - pte =3D vmemmap_populate_address(addr, node, NULL, -1, 0); + pte =3D vmemmap_populate_address(addr, node, NULL, -1); if (!pte) return -ENOMEM; =20 /* Populate the tail pages vmemmap page */ next =3D addr + PAGE_SIZE; - pte =3D vmemmap_populate_address(next, node, NULL, -1, 0); + pte =3D vmemmap_populate_address(next, node, NULL, -1); if (!pte) return -ENOMEM; =20 @@ -543,8 +532,7 @@ static int __meminit vmemmap_populate_compound_pages(un= signed long start, */ next +=3D PAGE_SIZE; rc =3D vmemmap_populate_range(next, last, node, NULL, - pte_pfn(ptep_get(pte)), - VMEMMAP_POPULATE_PAGEREF); + pte_pfn(ptep_get(pte))); if (rc) return -ENOMEM; } --=20 2.20.1 From nobody Sun Jun 14 19:01:43 2026 Received: from mail-pj1-f42.google.com (mail-pj1-f42.google.com [209.85.216.42]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 9DB692BEC2A for ; Sun, 5 Apr 2026 12:55:34 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.216.42 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1775393735; cv=none; b=QhNAk7KtVFWm3VfExhMoHPUuHEwfIg87QYxpytReZUHVvf3NxGiEDG1O04dSy3MCSwqzWcM3h6RqEI1iY7tfYatAJ2A9v8MpPH2KYZy1nBom5Xlf+sW3PnSKmCl+wmv19eoxXptBLJtSsrzbJ0nXSnPzJVdWlH7L+6eP3KzHcqg= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1775393735; c=relaxed/simple; bh=InQi375WF1ik0P8hT6qw43agFsjFHMFkwPs86eDOXJc=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; 
From: Muchun Song
To: Andrew Morton , David Hildenbrand , Muchun Song , Oscar Salvador , Michael Ellerman , Madhavan Srinivasan
Cc: Lorenzo Stoakes , "Liam R .
Howlett" , Vlastimil Babka , Mike Rapoport , Suren Baghdasaryan , Michal Hocko , Nicholas Piggin , Christophe Leroy , aneesh.kumar@linux.ibm.com, joao.m.martins@oracle.com, linux-mm@kvack.org, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, Muchun Song
Subject: [PATCH 20/49] mm: rename vmemmap optimization macros to generic names
Date: Sun, 5 Apr 2026 20:52:11 +0800
Message-Id: <20260405125240.2558577-21-songmuchun@bytedance.com>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20260405125240.2558577-1-songmuchun@bytedance.com>
References: <20260405125240.2558577-1-songmuchun@bytedance.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

In preparation for unifying the vmemmap optimization paths for both DAX
and HugeTLB, rename the existing vmemmap tail page macros to more
generic, semantic-based names.

The original names do not clearly express what they stand for. For
example, VMEMMAP_TAIL_MIN_ORDER is really the minimum order of a folio
that can benefit from the vmemmap optimization. To provide a broader and
clearer abstraction for other users such as DAX, replace them with newly
introduced macros such as OPTIMIZABLE_FOLIO_MIN_ORDER and
NR_OPTIMIZABLE_FOLIO_SIZES.

These new macros, along with OPTIMIZED_FOLIO_VMEMMAP_PAGES,
OPTIMIZED_FOLIO_VMEMMAP_SIZE, and OPTIMIZED_FOLIO_VMEMMAP_PAGE_STRUCTS,
are explicitly bound to the 'folio' concept. This systematic naming
makes it clear that they describe the properties of a vmemmap-optimized
folio rather than just raw pages.

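As a rough illustration of the arithmetic behind the renamed macros, the
userspace sketch below recomputes the minimum order both ways. It is not
kernel code: ilog2_floor() is a stand-in for the kernel's ilog2(), the
helper names are hypothetical, and the sample values (4 KiB pages, a
64-byte struct page, a maximum folio order of 18) are assumptions chosen
only for the example.

```c
#include <assert.h>

/* Floor of log2(n) for n >= 1; a userspace stand-in for the kernel's ilog2(). */
static unsigned int ilog2_floor(unsigned long n)
{
	unsigned int log = 0;

	while (n >>= 1)
		log++;
	return log;
}

/* Old form: VMEMMAP_TAIL_MIN_ORDER = ilog2(2 * PAGE_SIZE / sizeof(struct page)). */
static unsigned int old_min_order(unsigned long page_size,
				  unsigned long struct_page_size)
{
	return ilog2_floor(2 * page_size / struct_page_size);
}

/*
 * New form: OPTIMIZABLE_FOLIO_MIN_ORDER =
 *	ilog2(OPTIMIZED_FOLIO_VMEMMAP_PAGE_STRUCTS) + 1,
 * where the page-struct count is derived from a single vmemmap page.
 */
static unsigned int new_min_order(unsigned long page_size,
				  unsigned long struct_page_size)
{
	unsigned long vmemmap_pages = 1;	/* OPTIMIZED_FOLIO_VMEMMAP_PAGES */
	unsigned long vmemmap_size = vmemmap_pages * page_size;
	unsigned long page_structs = vmemmap_size / struct_page_size;

	return ilog2_floor(page_structs) + 1;
}

/* NR_OPTIMIZABLE_FOLIO_SIZES: number of optimizable orders, clamped at zero. */
static unsigned int nr_optimizable_sizes(unsigned int max_folio_order,
					 unsigned int min_order)
{
	int nr = (int)max_folio_order - (int)min_order + 1;

	return nr > 0 ? (unsigned int)nr : 0;
}

/* Check that both forms agree for the assumed sample configurations. */
static void check_equivalence(void)
{
	assert(old_min_order(4096, 64) == new_min_order(4096, 64));
	assert(old_min_order(65536, 64) == new_min_order(65536, 64));
}
```

Under these assumptions one vmemmap page holds 64 page structs, so both
forms give a minimum order of 7, and a hypothetical maximum folio order
of 18 leaves 12 optimizable sizes. Calling check_equivalence() from a
test program exercises the comparison.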
Signed-off-by: Muchun Song --- include/linux/mmzone.h | 18 ++++++++++-------- mm/hugetlb_vmemmap.c | 6 +++--- mm/sparse-vmemmap.c | 4 ++-- 3 files changed, 15 insertions(+), 13 deletions(-) diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h index 8ee9dc60120a..378feaf4e4ed 100644 --- a/include/linux/mmzone.h +++ b/include/linux/mmzone.h @@ -107,13 +107,15 @@ is_power_of_2(sizeof(struct page)) ? \ MAX_FOLIO_NR_PAGES * sizeof(struct page) : 0) =20 -/* - * vmemmap optimization (like HVO) is only possible for page orders that f= ill - * two or more pages with struct pages. - */ -#define VMEMMAP_TAIL_MIN_ORDER (ilog2(2 * PAGE_SIZE / sizeof(struct page))) -#define __NR_VMEMMAP_TAILS (MAX_FOLIO_ORDER - VMEMMAP_TAIL_MIN_ORDER + 1) -#define NR_VMEMMAP_TAILS (__NR_VMEMMAP_TAILS > 0 ? __NR_VMEMMAP_TAILS : 0) +/* The number of vmemmap pages required by a vmemmap-optimized folio. */ +#define OPTIMIZED_FOLIO_VMEMMAP_PAGES 1 +#define OPTIMIZED_FOLIO_VMEMMAP_SIZE (OPTIMIZED_FOLIO_VMEMMAP_PAGES * PAG= E_SIZE) +#define OPTIMIZED_FOLIO_VMEMMAP_PAGE_STRUCTS (OPTIMIZED_FOLIO_VMEMMAP_SIZE= / sizeof(struct page)) +#define OPTIMIZABLE_FOLIO_MIN_ORDER (ilog2(OPTIMIZED_FOLIO_VMEMMAP_PAGE_S= TRUCTS) + 1) + +#define __NR_OPTIMIZABLE_FOLIO_SIZES (MAX_FOLIO_ORDER - OPTIMIZABLE_FOLIO= _MIN_ORDER + 1) +#define NR_OPTIMIZABLE_FOLIO_SIZES \ + (__NR_OPTIMIZABLE_FOLIO_SIZES > 0 ? 
__NR_OPTIMIZABLE_FOLIO_SIZES : 0) =20 enum migratetype { MIGRATE_UNMOVABLE, @@ -1144,7 +1146,7 @@ struct zone { atomic_long_t vm_stat[NR_VM_ZONE_STAT_ITEMS]; atomic_long_t vm_numa_event[NR_VM_NUMA_EVENT_ITEMS]; #ifdef CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP - struct page *vmemmap_tails[NR_VMEMMAP_TAILS]; + struct page *vmemmap_tails[NR_OPTIMIZABLE_FOLIO_SIZES]; #endif } ____cacheline_internodealigned_in_smp; =20 diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c index 535f0369a496..d6dd47c232e0 100644 --- a/mm/hugetlb_vmemmap.c +++ b/mm/hugetlb_vmemmap.c @@ -495,7 +495,7 @@ static bool vmemmap_should_optimize_folio(const struct = hstate *h, struct folio * =20 static struct page *vmemmap_get_tail(unsigned int order, struct zone *zone) { - const unsigned int idx =3D order - VMEMMAP_TAIL_MIN_ORDER; + const unsigned int idx =3D order - OPTIMIZABLE_FOLIO_MIN_ORDER; struct page *tail, *p; int node =3D zone_to_nid(zone); =20 @@ -828,7 +828,7 @@ static int __init hugetlb_vmemmap_init(void) BUILD_BUG_ON(__NR_USED_SUBPAGE > HUGETLB_VMEMMAP_RESERVE_PAGES); =20 for_each_zone(zone) { - for (int i =3D 0; i < NR_VMEMMAP_TAILS; i++) { + for (int i =3D 0; i < NR_OPTIMIZABLE_FOLIO_SIZES; i++) { struct page *tail, *p; unsigned int order; =20 @@ -836,7 +836,7 @@ static int __init hugetlb_vmemmap_init(void) if (!tail) continue; =20 - order =3D i + VMEMMAP_TAIL_MIN_ORDER; + order =3D i + OPTIMIZABLE_FOLIO_MIN_ORDER; p =3D page_to_virt(tail); for (int j =3D 0; j < PAGE_SIZE / sizeof(struct page); j++) init_compound_tail(p + j, NULL, order, zone); diff --git a/mm/sparse-vmemmap.c b/mm/sparse-vmemmap.c index 3fdb6808e8ab..9f70559df4e8 100644 --- a/mm/sparse-vmemmap.c +++ b/mm/sparse-vmemmap.c @@ -330,12 +330,12 @@ static __meminit struct page *vmemmap_get_tail(unsign= ed int order, struct zone * unsigned int idx; int node =3D zone_to_nid(zone); =20 - if (WARN_ON_ONCE(order < VMEMMAP_TAIL_MIN_ORDER)) + if (WARN_ON_ONCE(order < OPTIMIZABLE_FOLIO_MIN_ORDER)) return NULL; if 
(WARN_ON_ONCE(order > MAX_FOLIO_ORDER)) return NULL; =20 - idx =3D order - VMEMMAP_TAIL_MIN_ORDER; + idx =3D order - OPTIMIZABLE_FOLIO_MIN_ORDER; tail =3D zone->vmemmap_tails[idx]; if (tail) return tail; --=20 2.20.1 From nobody Sun Jun 14 19:01:43 2026 Received: from mail-pj1-f50.google.com (mail-pj1-f50.google.com [209.85.216.50]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 501082BEC2A for ; Sun, 5 Apr 2026 12:55:40 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.216.50 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1775393741; cv=none; b=Pd4AggyPB+gb3R/vwROXB6dZE8b/qi5cdSkpYuXjnFjWLHHrIIXO22rskxvaRVZPbqHg1WyP8UOpNYivH9/mFSEHTHhuw7IYlooJ4hCyPQ/pgGolcfJHkqhAIwuPwGAc01XQjZWyOVXKWK12mBCzrRPxvYNPfIm3HQr1Pq0F97s= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1775393741; c=relaxed/simple; bh=fO+aA2jGCBlliuF/m+iuqQoBVzl2ynCuI+WYgjaRD2c=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=O8ZXJ/vB0sRaRNnmHrPlvS6pqxgtTDym+NTBWUhDnDWfBhIPiiN9eBRgWr3ysoWAcLOQjje+e8Kj9E8wtHbqRQxnKUffuzrfUNuqbwagR9+9/eboV/3gXmURqD19LABSEKuBQLhSs24uRZqVXpr7ySAe5TgsvuFDo0e7fEZ8egM= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=bytedance.com; spf=pass smtp.mailfrom=bytedance.com; dkim=pass (2048-bit key) header.d=bytedance.com header.i=@bytedance.com header.b=ihyz/PUD; arc=none smtp.client-ip=209.85.216.50 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=bytedance.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=bytedance.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=bytedance.com header.i=@bytedance.com header.b="ihyz/PUD" Received: by mail-pj1-f50.google.com with SMTP id 
From: Muchun Song
To: Andrew Morton , David Hildenbrand , Muchun Song , Oscar Salvador , Michael Ellerman , Madhavan Srinivasan
Cc: Lorenzo Stoakes , "Liam R . Howlett" , Vlastimil Babka , Mike Rapoport , Suren Baghdasaryan , Michal Hocko , Nicholas Piggin , Christophe Leroy , aneesh.kumar@linux.ibm.com, joao.m.martins@oracle.com, linux-mm@kvack.org, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, Muchun Song
Subject: [PATCH 21/49] mm/sparse: drop power-of-2 size requirement for struct mem_section
Date: Sun, 5 Apr 2026 20:52:12 +0800
Message-Id: <20260405125240.2558577-22-songmuchun@bytedance.com>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20260405125240.2558577-1-songmuchun@bytedance.com>
References: <20260405125240.2558577-1-songmuchun@bytedance.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

Since sparsemem-extreme was introduced, struct mem_section has been
forced to a power-of-2 size so that the section-to-root lookup could use
a cheap bit-mask instead of an expensive divide:

    section = &mem_section[root][nr & SECTION_ROOT_MASK];

This is enforced at compile time with

    BUILD_BUG_ON(!is_power_of_2(sizeof(struct
mem_section)));

and forces us to add padding that grows and shrinks with every config
combination, wasting memory just to round the structure size up to the
next power of two. With CONFIG_PAGE_EXTENSION enabled, the padding alone
can reach 42 struct mem_section instances per section-root page.

Drop the requirement and switch to a plain modulo:

    section = &mem_section[root][nr % SECTIONS_PER_ROOT];

SECTIONS_PER_ROOT is a compile-time constant, so modern compilers lower
the modulo to a multiply-by-reciprocal sequence instead of a divide
instruction, and the runtime impact is negligible. In return we get:

1. Immediate memory savings when CONFIG_PAGE_EXTENSION is enabled.
2. Freedom to extend struct mem_section in the future without having to
   fiddle with artificial padding or the power-of-2 rule.

Signed-off-by: Muchun Song
---
 include/linux/mmzone.h  | 8 +-------
 mm/sparse.c             | 2 --
 scripts/gdb/linux/mm.py | 6 ++----
 3 files changed, 3 insertions(+), 13 deletions(-)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 378feaf4e4ed..3e3755666846 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -2013,12 +2013,7 @@ struct mem_section {
 	 * section. (see page_ext.h about this.)
 	 */
 	struct page_ext *page_ext;
-	unsigned long pad;
 #endif
-	/*
-	 * WARNING: mem_section must be a power-of-2 in size for the
-	 * calculation and use of SECTION_ROOT_MASK to make sense.
- */ }; =20 #ifdef CONFIG_SPARSEMEM_EXTREME @@ -2029,7 +2024,6 @@ struct mem_section { =20 #define SECTION_NR_TO_ROOT(sec) ((sec) / SECTIONS_PER_ROOT) #define NR_SECTION_ROOTS DIV_ROUND_UP(NR_MEM_SECTIONS, SECTIONS_PER_ROOT) -#define SECTION_ROOT_MASK (SECTIONS_PER_ROOT - 1) =20 #ifdef CONFIG_SPARSEMEM_EXTREME extern struct mem_section **mem_section; @@ -2053,7 +2047,7 @@ static inline struct mem_section *__nr_to_section(uns= igned long nr) if (!mem_section || !mem_section[root]) return NULL; #endif - return &mem_section[root][nr & SECTION_ROOT_MASK]; + return &mem_section[root][nr % SECTIONS_PER_ROOT]; } extern size_t mem_section_usage_size(void); =20 diff --git a/mm/sparse.c b/mm/sparse.c index 5fe0a7e66775..cfe4ffd89baf 100644 --- a/mm/sparse.c +++ b/mm/sparse.c @@ -394,8 +394,6 @@ void __init sparse_init(void) unsigned long pnum_end, pnum_begin, map_count =3D 1; int nid_begin; =20 - /* see include/linux/mmzone.h 'struct mem_section' definition */ - BUILD_BUG_ON(!is_power_of_2(sizeof(struct mem_section))); memblocks_present(); =20 if (compound_info_has_mask()) { diff --git a/scripts/gdb/linux/mm.py b/scripts/gdb/linux/mm.py index d78908f6664d..0c9eeed92064 100644 --- a/scripts/gdb/linux/mm.py +++ b/scripts/gdb/linux/mm.py @@ -70,7 +70,6 @@ class x86_page_ops(): self.SECTIONS_PER_ROOT =3D 1 =20 self.NR_SECTION_ROOTS =3D DIV_ROUND_UP(self.NR_MEM_SECTIONS, self.= SECTIONS_PER_ROOT) - self.SECTION_ROOT_MASK =3D self.SECTIONS_PER_ROOT - 1 =20 try: self.SECTION_HAS_MEM_MAP =3D 1 << int(gdb.parse_and_eval('SECT= ION_HAS_MEM_MAP_BIT')) @@ -100,7 +99,7 @@ class x86_page_ops(): def __nr_to_section(self, nr): root =3D self.SECTION_NR_TO_ROOT(nr) mem_section =3D gdb.parse_and_eval("mem_section") - return mem_section[root][nr & self.SECTION_ROOT_MASK] + return mem_section[root][nr % self.SECTIONS_PER_ROOT] =20 def pfn_to_section_nr(self, pfn): return pfn >> self.PFN_SECTION_SHIFT @@ -249,7 +248,6 @@ class aarch64_page_ops(): self.SECTIONS_PER_ROOT =3D 1 =20 
self.NR_SECTION_ROOTS =3D DIV_ROUND_UP(self.NR_MEM_SECTIONS, self.= SECTIONS_PER_ROOT) - self.SECTION_ROOT_MASK =3D self.SECTIONS_PER_ROOT - 1 self.SUBSECTION_SHIFT =3D 21 self.SEBSECTION_SIZE =3D 1 << self.SUBSECTION_SHIFT self.PFN_SUBSECTION_SHIFT =3D self.SUBSECTION_SHIFT - self.PAGE_SH= IFT @@ -304,7 +302,7 @@ class aarch64_page_ops(): def __nr_to_section(self, nr): root =3D self.SECTION_NR_TO_ROOT(nr) mem_section =3D gdb.parse_and_eval("mem_section") - return mem_section[root][nr & self.SECTION_ROOT_MASK] + return mem_section[root][nr % self.SECTIONS_PER_ROOT] =20 def pfn_to_section_nr(self, pfn): return pfn >> self.PFN_SECTION_SHIFT --=20 2.20.1 From nobody Sun Jun 14 19:01:43 2026 Received: from mail-pj1-f42.google.com (mail-pj1-f42.google.com [209.85.216.42]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 318BD33DEDF for ; Sun, 5 Apr 2026 12:55:47 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.216.42 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1775393748; cv=none; b=l2gkIHt70QN2dcCheqHgNdlN11IW379Bh0zr9u0abwhlnZzINd1tbQ4hE+nZ9r5Cx0nEDTzsu5XB3AVi80OXT6Exve0wVrbdlmoc/zkN++A0xc1RxISLLnP2DqXA0T740R65fLkjkWNxqLZiR9sMu/dUqkF6N3ObvGKLEffgCuE= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1775393748; c=relaxed/simple; bh=d3iO5AC8RNTfMRStJLbyxq/hJF3Y9GghztnMg/4IfjI=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=qLDkuqGkXYv4YKG0Zs16IIt9j8JX0kPsaZgxqwNF/wQjh3L6iBZ/Q/faPSdAo+g82Tem7WMm0fUKp3PJmfsnzJvtI7D8F922dHpIW7HH7Ybt2gnNlnRy49jdQIQc92lHZk3qe+MFhZUpOE/xmSo6I0M4cQexJCtjZzpiONrPSyY= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=bytedance.com; spf=pass smtp.mailfrom=bytedance.com; dkim=pass (2048-bit key) header.d=bytedance.com header.i=@bytedance.com 
header.b=XSL8y+Gr; arc=none smtp.client-ip=209.85.216.42 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=bytedance.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=bytedance.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=bytedance.com header.i=@bytedance.com header.b="XSL8y+Gr" Received: by mail-pj1-f42.google.com with SMTP id 98e67ed59e1d1-3591cc98871so1313259a91.3 for ; Sun, 05 Apr 2026 05:55:47 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1775393746; x=1775998546; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=sb6cTU6E76tVxQHa/I/Ro9Ws0lLMXSEWXuSGv9owj/A=; b=XSL8y+GrkJ7kxcHntw455J9yQsJC4NEhFhIs2WUMPF/iwJIIIIHJ3EMsIYJABFr8+0 LVpG26CSLMKkmOfLyui2c17PK0wYUA/S/AtDGsJ0CwicTULHK4eKSAITaeifziYNuTE6 49bYWjEaxAIUZ55+JLu+KJMTPMuDuEMk1DMO4O89zyuQTTqdXvGY/zmP/AMc2ZP1ObOK RVwdv/7w8X/mY4SQiXp1hA9lKyfSEptqunLEaAw2K/bfjY57499v5PeJexUnqLRnBDr1 bDRQSwVC8Pyc9WXyWIoiPeR6KLYSlxEUOyiYZhxdU1RnT8Xv1m/PqPmW/17gBsQidykl Rp6w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1775393746; x=1775998546; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=sb6cTU6E76tVxQHa/I/Ro9Ws0lLMXSEWXuSGv9owj/A=; b=a4L4rLt53qjeUkmUKZe8tudRk0Bu78L0bzTKI24/cQKVjXf1LmyFVtGO/yjrhIKQgS nuA30/YJyuPdDf3szF4Q9ZuUy8ERN/ejJJeVRdwLq64f6rA2lfAKfB5a1aHnu4on3d9e sMRfp89Lu+Wl2egW9WkwKajS1sCx16UCtUTQgk/lM48sZD8UsiYJBWSxrNuibmOW6U2U acvC2TUAwRuibpQPvgHVnNyLAor781+c569R6ZGVlPpDXMiCyDqADKU/k1EjOpmjhWII iy64ed6kXxQFsPHVBjjkh81PWN1A6/ix989W+RwBoqgJmT2Fqo7QoJLW7A/uG+KNFQjj wWXQ== X-Forwarded-Encrypted: i=1; 
AJvYcCWK/32dDGl3MsIaJweZd78BzIyC3iWu8CzfmxU0l3AlYDJQJNvhsjmI5VnVvKLelfARaL5uvr/BjsXS7aM=@vger.kernel.org X-Gm-Message-State: AOJu0Yxj12qJ5zWap/zMKbdKen9ok0+/LclwXnW+bN/Zl/b1yPpSstYD BNIdMPh+or7zE4cskjECwvY4erXypOz8nw31BoPr+NECKoCq5zPGpFR/2fh/XfQV+wo= X-Gm-Gg: AeBDiettZ+9YQAOW1wB5nkFIr4667BlOHqswRmJINcnnwRW2z5iDZI5dB/w+Nm6I3Cl lIkVjGnmV0/UzWmJiX+VaALQLvYuEc3XpKbgbQN8E02HEt1M0NYd0qRmxwgpjoUslTv6Q4JIROe LUI+ZK7D1OlfvWgeAtvcDpfardKNgX5S3HG5FyNLybAOmbZOam9PYWMknBELkHs93/x5qsjJBAf 4ozPH7ym+bcAb5Dx3PkWqlvJDbs+bTcN+yZnTCUNzUCUsXhjYaboyywbVCkG2fOA5lrVgmKOgDW FNzRMG5gY1yFh3DYT5gynYshm7XXD6T5i68Z2RfFf2DaeHQC1TmX3yl9YePvm613MLesgP3WZOy HWei8kT5alYjD0l3WUtUnVFS58qzlypYSnIanfkvLK+mpSk6E1hOgVRyUKTX5aslQEhNV3GOn2v vsuuSgrW3EeSOTvDTMx0x9LWwWsjgf0VVXT8WPPcSIIhY= X-Received: by 2002:a17:90b:2ccc:b0:35d:a0b7:9608 with SMTP id 98e67ed59e1d1-35de67dc7d4mr8298573a91.7.1775393746346; Sun, 05 Apr 2026 05:55:46 -0700 (PDT) Received: from n232-176-004.byted.org ([36.110.163.97]) by smtp.gmail.com with ESMTPSA id 98e67ed59e1d1-35de66b4808sm3748505a91.2.2026.04.05.05.55.39 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 05 Apr 2026 05:55:46 -0700 (PDT) From: Muchun Song To: Andrew Morton , David Hildenbrand , Muchun Song , Oscar Salvador , Michael Ellerman , Madhavan Srinivasan Cc: Lorenzo Stoakes , "Liam R . 
Howlett" , Vlastimil Babka , Mike Rapoport , Suren Baghdasaryan , Michal Hocko , Nicholas Piggin , Christophe Leroy , aneesh.kumar@linux.ibm.com, joao.m.martins@oracle.com, linux-mm@kvack.org, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, Muchun Song Subject: [PATCH 22/49] mm/sparse: introduce compound page order to mem_section Date: Sun, 5 Apr 2026 20:52:13 +0800 Message-Id: <20260405125240.2558577-23-songmuchun@bytedance.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20260405125240.2558577-1-songmuchun@bytedance.com> References: <20260405125240.2558577-1-songmuchun@bytedance.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" During early system boot and DAX device initialization, the establishment of vmemmap page mappings via __populate_section_memmap() is done on a per-section basis, followed by the initialization of the struct page area. Currently, there are two scenarios utilizing HugeTLB Vmemmap Optimization (HVO): HugeTLB and DAX. For HugeTLB, the SPARSEMEM_VMEMMAP_PREINIT mechanism is used to apply HVO (with Read-Write mappings), and later, vmemmap_wrprotect_hvo() is used to enforce Read-Only mappings. HugeTLB has to manage its own related statistics and metadata updates; the work done by hugetlb_vmemmap_init_early() is somewhat similar to what sparse_init_nid() does. Furthermore, the shared vmemmap tail pages allocated by vmemmap_get_tail() are left uninitialized because they would be overwritten by the subsequent memmap_init(). We are forced to compensate for this in hugetlb_vmemmap_init(). This limitation also forces us to maintain two separate implementations of vmemmap_get_tail() (one in hugetlb_vmemmap.c and another in sparse-vmemmap.c).
For DAX, HVO is already applied via __populate_section_memmap(), but it does not employ Read-Only mappings, which introduces potential security risks. Moreover, the fact that HugeTLB and DAX implement different logic for what is essentially the same purpose increases code complexity and maintenance burden. The root cause of these issues is that a memory section is completely unaware of the concept of compound pages. It cannot properly handle HVO or struct page initialization for them. To solve this, introduce the concept of compound page order to the memory section (`struct mem_section`). Typically, a section holds compound pages of a specific order, and a larger compound page will span multiple sections. In the future, this order information can be utilized to unify and streamline the aforementioned scenarios. Signed-off-by: Muchun Song --- include/linux/mmzone.h | 28 ++++++++++++++++++++++++++++ 1 file changed, 28 insertions(+) diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h index 3e3755666846..620503aa29ba 100644 --- a/include/linux/mmzone.h +++ b/include/linux/mmzone.h @@ -2014,6 +2014,14 @@ struct mem_section { */ struct page_ext *page_ext; #endif +#ifdef CONFIG_SPARSEMEM_VMEMMAP + /* + * The order of compound pages in this section. Typically, the section + * holds compound pages of this order; a larger compound page will span + * multiple sections.
+ */ + unsigned int order; +#endif }; =20 #ifdef CONFIG_SPARSEMEM_EXTREME @@ -2210,6 +2218,17 @@ static inline bool pfn_section_first_valid(struct me= m_section *ms, unsigned long *pfn =3D (*pfn & PAGE_SECTION_MASK) + (bit * PAGES_PER_SUBSECTION); return true; } + +static inline void section_set_order(struct mem_section *section, unsigned= int order) +{ + VM_BUG_ON(section->order && order && section->order !=3D order); + section->order =3D order; +} + +static inline unsigned int section_order(const struct mem_section *section) +{ + return section->order; +} #else static inline int pfn_section_valid(struct mem_section *ms, unsigned long = pfn) { @@ -2220,6 +2239,15 @@ static inline bool pfn_section_first_valid(struct me= m_section *ms, unsigned long { return true; } + +static inline void section_set_order(struct mem_section *section, unsigned= int order) +{ +} + +static inline unsigned int section_order(const struct mem_section *section) +{ + return 0; +} #endif =20 void sparse_init_early_section(int nid, struct page *map, unsigned long pn= um, --=20 2.20.1 From nobody Sun Jun 14 19:01:43 2026 Received: from mail-pj1-f41.google.com (mail-pj1-f41.google.com [209.85.216.41]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id F2F8D3246F8 for ; Sun, 5 Apr 2026 12:55:53 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.216.41 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1775393755; cv=none; b=I6TaXDPycx4ZNWhdbmLz6sZjkT5rCwqdM2uVzIMw7J8ONttG7o4Gh9+KbmLIt5xHd1OAVO7fALpmSzV8BbjJNa+MwSHrVvY5eESZH9xxNifEvS4SUGtMe1vRAOs4QBqd0c6vQIzm1MAI8A+v91mPCFm3MbWgB53spDxgAn/corY= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1775393755; c=relaxed/simple; bh=b4c8O4qgiF0EQu4/JaFeEXRQQfFT3iLeumLNakHLN5k=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; 
b=A1BpNYC/+cxf+bsFlAwQicuCYERPzaNCdl8L+a/5RUMYe1SswB9z39rODH6KsoTNhP4pVGgvAgphZQOuV0fYUrFGktunBz/eKJr38Smdeszlbuh+Ko/GFG4QGnnrXViTxIxKbs8Rt6pzfI6D8OB6KaXdv35S/sAVSHfOh7D/m6g= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=bytedance.com; spf=pass smtp.mailfrom=bytedance.com; dkim=pass (2048-bit key) header.d=bytedance.com header.i=@bytedance.com header.b=IfZj0U86; arc=none smtp.client-ip=209.85.216.41 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=bytedance.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=bytedance.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=bytedance.com header.i=@bytedance.com header.b="IfZj0U86" Received: by mail-pj1-f41.google.com with SMTP id 98e67ed59e1d1-35c206f0481so3039052a91.0 for ; Sun, 05 Apr 2026 05:55:53 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1775393753; x=1775998553; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=bqgbPUicXY/csWEcQb9KuYcgGnVxfQn/0ErN/EA6ZY8=; b=IfZj0U86lCJxkrl52au8fHmeY2vfJNA6VOqkLTRVG9jHpLM5Q7y8ev2ixKhzOjTJCg 9g2v+SHW1aph+VcOcAwQcJE09jTSINVzVXydSnK+CYKMTpljLsmc1o2Sz1E7KlIgfjUu x1c8sgyz/h7UOU9c0qLIyLmef3BGt/WWL33c8zFQjBSBrgLCuiV+OX/k8Nt506T+L/7a jkdN4GXeEIRsQiq4KFfiR9UJBmM/ry/xWs+ok3nFzkGR22dlptza7mkqqDkdbc3LhLt1 LfOyacan4BZpcIHL4cPsqflMbQGmEqwX/hW515kw/qyzv1GWTodw1/p32b71AtQUjiW5 Q3Tw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1775393753; x=1775998553; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=bqgbPUicXY/csWEcQb9KuYcgGnVxfQn/0ErN/EA6ZY8=; 
b=fStJpfDxrYL0Duy2XIQYM4dG7ofWRDjFTrOG2RNv0gvdbtKXgrS27U42Z/Rm2V+YTm oEhrux08kpwyCQROXmksQ2/yIPOUVdbtekwMC4qQ61i/cjowS60bryvVCa18kUwJcyb6 faFYyaHNG7cezUB7qz9Mz4VWuRWy7xgNouj5xJdwqAWBc3MaJ8Bw5JAC/RPHdTSLklKr NDGLBflpPE9CbCr30JCYZqrbU0eCnbMlbVqUZnlSQyk5LS99UsoLLCxO8qQHHvmY3fNj Fz9tY5MLh2KbOxCS2KWaUyCenQW9TPJXJuQdubT2CyFaq+fZ53LNFfpLvr/3RBN5Mgyj n28A== X-Forwarded-Encrypted: i=1; AJvYcCVp26A6OLIEDEALEBYoCr6bmltt3PFVNZR5ZSXMJ09v3whB9GOsC38JjvEbF2gEJC2MQ7uJ11PtyFiVDIo=@vger.kernel.org X-Gm-Message-State: AOJu0Yx9ORc9Vs0RJ/7s/1bFTUVQiOz7nHwIzn9QGjB7V/aSOars4OtJ s1vLF1C5XFdxGahZcLBG4c1V6i9VHnRSrUkZLlGQ/WLIyzJn0dJut/JJEP3zhHOeMPw= X-Gm-Gg: AeBDiev16H8PnoSZI5/NeUV4L6O/ijyhfeqAQc/5Des5lznXeTAjOVz8wEAylrD7IOw CtxvYu8G5rnfadKFsXGJ+kfJ1+i1ZYImzXv3E1nJkFjqZquUUlMZ9ngyyq02I7Z7j7qE3pjXv7W hZrobtzpn5F9bJxe9dKvuoLqi6JZm2OwLIATthMCdlhK2ISV08ngQurfbK21q9l24VfTwWLfx56 FuqKWCSQ/ZdKmUSdcY6xOZMMMQ+1AnT4lAvmqc9TEo6tOZs6uf+jaUz+i31A9S5mZzWumPGgJUy I/LhalIRWD+pXNLPpHcExRLVsvbsoHzeCgz8dMMWEA00qHtyVwaCiww7O0YY6Z2CemVtat8LHFN S24IrgCKh1FdruBg6O031d48AfVFX+oKesNCthDTFcSGu1s6hLcKu8c7BlCiaDntPf5ckjXsNe5 7iicTnu/MCToJt4nsmQakx90Vzsrrf5YW2xvuR3l3hRf4= X-Received: by 2002:a17:90b:3a8a:b0:35c:30a8:319 with SMTP id 98e67ed59e1d1-35de660e4d6mr9096576a91.0.1775393753200; Sun, 05 Apr 2026 05:55:53 -0700 (PDT) Received: from n232-176-004.byted.org ([36.110.163.97]) by smtp.gmail.com with ESMTPSA id 98e67ed59e1d1-35de66b4808sm3748505a91.2.2026.04.05.05.55.47 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 05 Apr 2026 05:55:52 -0700 (PDT) From: Muchun Song To: Andrew Morton , David Hildenbrand , Muchun Song , Oscar Salvador , Michael Ellerman , Madhavan Srinivasan Cc: Lorenzo Stoakes , "Liam R . 
Howlett" , Vlastimil Babka , Mike Rapoport , Suren Baghdasaryan , Michal Hocko , Nicholas Piggin , Christophe Leroy , aneesh.kumar@linux.ibm.com, joao.m.martins@oracle.com, linux-mm@kvack.org, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, Muchun Song Subject: [PATCH 23/49] mm/mm_init: skip initializing shared tail pages for compound pages Date: Sun, 5 Apr 2026 20:52:14 +0800 Message-Id: <20260405125240.2558577-24-songmuchun@bytedance.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20260405125240.2558577-1-songmuchun@bytedance.com> References: <20260405125240.2558577-1-songmuchun@bytedance.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Currently, memmap_init_range() unconditionally initializes all struct pages within a section. However, when HugeTLB Vmemmap Optimization (HVO) is enabled, shared vmemmap tail pages are allocated during the vmemmap population phase (e.g., via vmemmap_get_tail()). These shared tail pages are left intentionally uninitialized at that time because the subsequent memmap_init() would simply overwrite them. If memmap_init_range() continues to initialize these shared tail pages, it will overwrite the carefully constructed HVO mappings and metadata. This forces subsystems like HugeTLB to implement workarounds (like re-initializing or compensating for the overwritten data in their own init routines, as seen in hugetlb_vmemmap_init()). Therefore, the primary motivation of this patch is to prevent memmap_init_range() from incorrectly overwriting the shared vmemmap tail pages. By detecting if a page is an optimizable compound vmemmap page (using the newly introduced section order), we can safely skip its redundant initialization.
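For illustration only, the skip test described above can be modelled in user space roughly as follows. This is a sketch, not the kernel code: KEPT_PAGE_STRUCTS is an assumed stand-in for OPTIMIZED_FOLIO_VMEMMAP_PAGE_STRUCTS (its real value is not shown in this patch, 64 is a guess), page_struct_is_shared() and skip_to() are hypothetical names, and the is_power_of_2(sizeof(struct page)) check is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed stand-in for OPTIMIZED_FOLIO_VMEMMAP_PAGE_STRUCTS; the value is a guess. */
#define KEPT_PAGE_STRUCTS 64UL

/*
 * Hypothetical model of the predicate: true when the struct page for @pfn
 * falls past the first KEPT_PAGE_STRUCTS entries of an order-@order compound
 * page, i.e. in the part that is backed by the shared tail page.
 */
static bool page_struct_is_shared(unsigned long pfn, unsigned int order)
{
	assert(order < 8 * sizeof(unsigned long));

	/* Offset of this pfn inside its naturally aligned compound page. */
	unsigned long offset = pfn & ((1UL << order) - 1);

	return offset >= KEPT_PAGE_STRUCTS;
}

/*
 * Hypothetical model of the skip step: round @pfn up to the next compound
 * page boundary, but never run past @end_pfn.
 */
static unsigned long skip_to(unsigned long pfn, unsigned int order,
			     unsigned long end_pfn)
{
	unsigned long size = 1UL << order;
	unsigned long next = (pfn + size - 1) & ~(size - 1);

	return next < end_pfn ? next : end_pfn;
}
```

Note that with order 0 the mask is zero, so the offset is always 0 and no page is ever skipped, which matches sections that hold no compound pages.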
As a significant side-effect, skipping the initialization of these shared t= ail pages also saves substantial CPU cycles during the early boot stage. Signed-off-by: Muchun Song --- mm/internal.h | 11 +++++++++++ mm/mm_init.c | 19 +++++++++++++++---- 2 files changed, 26 insertions(+), 4 deletions(-) diff --git a/mm/internal.h b/mm/internal.h index a8acabcd1d93..1060d7c07f5b 100644 --- a/mm/internal.h +++ b/mm/internal.h @@ -1011,6 +1011,17 @@ static inline void sparse_init_subsection_map(void) } #endif /* CONFIG_SPARSEMEM_VMEMMAP */ =20 +static inline bool vmemmap_page_optimizable(const struct page *page) +{ + unsigned long pfn =3D page_to_pfn(page); + unsigned int order =3D section_order(__pfn_to_section(pfn)); + + if (!is_power_of_2(sizeof(struct page))) + return false; + + return (pfn & ((1L << order) - 1)) >=3D OPTIMIZED_FOLIO_VMEMMAP_PAGE_STRU= CTS; +} + #if defined CONFIG_COMPACTION || defined CONFIG_CMA =20 /* diff --git a/mm/mm_init.c b/mm/mm_init.c index 977a837b7ef6..7f5b326e9298 100644 --- a/mm/mm_init.c +++ b/mm/mm_init.c @@ -676,12 +676,13 @@ static inline void fixup_hashdist(void) {} =20 static __meminit void pageblock_migratetype_init_range(unsigned long pfn, unsigned long nr_pages, - int migratetype) + int migratetype, + bool isolate) { unsigned long end =3D pfn + nr_pages; =20 for (pfn =3D pageblock_align(pfn); pfn < end; pfn +=3D pageblock_nr_pages= ) { - init_pageblock_migratetype(pfn_to_page(pfn), migratetype, false); + init_pageblock_migratetype(pfn_to_page(pfn), migratetype, isolate); cond_resched(); } } @@ -912,6 +913,16 @@ void __meminit memmap_init_range(unsigned long size, i= nt nid, unsigned long zone } =20 page =3D pfn_to_page(pfn); + if (vmemmap_page_optimizable(page)) { + struct mem_section *ms =3D __pfn_to_section(pfn); + unsigned long start =3D pfn; + + pfn =3D min(ALIGN(start, 1L << section_order(ms)), end_pfn); + pageblock_migratetype_init_range(start, pfn - start, migratetype, + isolate_pageblock); + continue; + } + 
__init_single_page(page, pfn, zone, nid); if (context =3D=3D MEMINIT_HOTPLUG) { #ifdef CONFIG_ZONE_DEVICE @@ -1138,7 +1149,7 @@ void __ref memmap_init_zone_device(struct zone *zone, * Please note that MEMINIT_HOTPLUG path doesn't clear memmap * because this is done early in section_activate() */ - pageblock_migratetype_init_range(start_pfn, nr_pages, MIGRATE_MOVABLE); + pageblock_migratetype_init_range(start_pfn, nr_pages, MIGRATE_MOVABLE, fa= lse); =20 pr_debug("%s initialised %lu pages in %ums\n", __func__, nr_pages, jiffies_to_msecs(jiffies - start)); @@ -1963,7 +1974,7 @@ static void __init deferred_free_pages(unsigned long = pfn, if (!nr_pages) return; =20 - pageblock_migratetype_init_range(pfn, nr_pages, MIGRATE_MOVABLE); + pageblock_migratetype_init_range(pfn, nr_pages, MIGRATE_MOVABLE, false); =20 page =3D pfn_to_page(pfn); =20 --=20 2.20.1 From nobody Sun Jun 14 19:01:43 2026 Received: from mail-pj1-f46.google.com (mail-pj1-f46.google.com [209.85.216.46]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 66E623112AB for ; Sun, 5 Apr 2026 12:56:01 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.216.46 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1775393762; cv=none; b=TRgvpJjNguQ4MKzkvq/joPDdRHcW1FgHFZptquLD6TcEwOPfvG6FvwJtjsOT/91hjB8A34VBrX1dn0l2tHLwHnq78iBeoe0QAsBpJB6HibQBN/9LbTUhc8CUNNBc5klhXr2I8tx8MmxFRCsn5lU4NV6XIyyxNOBkgI1DOVqYydQ= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1775393762; c=relaxed/simple; bh=mKj4nCTSR/c0LrQrG5ysHgSaETEvplTmxXfZ1zEwX1E=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=QSr0nnOsrxA4OMzaw5Cc0sfapcYHAfRG0blR3K4OT2Ts8zJHfVLl1TkFW4Tne+kXK91+hkjAsg1vef9ttmbDiZMTn3KlRBtl+gqnydDT0VgY9u9gEX/HhUeGbKxzHuOUsax9wnQmx3OZdlEmQ5GbbFkTK0ABKj6NlfkDGrfVJ8c= ARC-Authentication-Results: i=1; 
smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=bytedance.com; spf=pass smtp.mailfrom=bytedance.com; dkim=pass (2048-bit key) header.d=bytedance.com header.i=@bytedance.com header.b=GlUmEX3X; arc=none smtp.client-ip=209.85.216.46 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=bytedance.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=bytedance.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=bytedance.com header.i=@bytedance.com header.b="GlUmEX3X" Received: by mail-pj1-f46.google.com with SMTP id 98e67ed59e1d1-35d9827661bso1534772a91.3 for ; Sun, 05 Apr 2026 05:56:01 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1775393761; x=1775998561; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=Mltk+8iYdxXiKYafIwHZmXXZQXNazh8dOc5/PN4fBuo=; b=GlUmEX3XF8Z3XBe7QlhM68xj47BcngHR8YqPRyEnf23v84rhbkO+hOU4B0j8t3FCML hXWDWhp5tpCqA9adoyA4G6JTcwUqDyrgHHsJ0cyMf5pLsBYu/O5y61hqmHpXdIPoRjQD JDYY11Er32rlWR2UmL/Xbv4lcOJ8x9CiNawseudNTPEKx1LMwYQc208e0FJaE1X3ZFd6 OA0AvQRCxd0DXR3T5mpYfVN51tSWspNtWHjJjoESwdtbrhep+P3ZfYJWgTygftXu2oV1 zJl++B28oB2poLdCYhnUH6iewDUs8YFh7s0/S1t7FiG3DAOpRzNxKt+4C9C5wcs6qd28 Cb6A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1775393761; x=1775998561; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=Mltk+8iYdxXiKYafIwHZmXXZQXNazh8dOc5/PN4fBuo=; b=L0se7TxTvhsw2ZWPRVdXkBS/d9XqV0SHkKRKJY7gGIQg9AF4BD49ppXU1JCg+gL453 1iTYuzGb/FcP4eIeYYYQTRY8dgz5vuhcBFpv+3EfVTWPrd7b9OsEHYV5wbh91nhK507P D41EM4doaLAOruoHjiYumWwOwxVKgujNnljfeH1HMTHXaDx7CixbbCjRhqcwlZEIe52T 
4HTnctqP+xDppTbCtgsCn0k9gYMQrif9sxoNz/plWoRVXLPORb3SdcEtaBSriIr6/c8q E32MtMzEh6elc7puRz2KRSn/UnhPlosiqb3zxSqin4Qf1azhllGBy1z/m3b3nTLGn4ME 893Q== X-Forwarded-Encrypted: i=1; AJvYcCWLGgw2iirO3OefB9+BaSYXL3+KvtFBjflkqngkOopjJhzwjcytAzxHl99QZ2Dw/we0FjXlt0N0kRxLwk0=@vger.kernel.org X-Gm-Message-State: AOJu0Yyfvq3e7HlvzHm5e4Ycsn2b6kSq3DaDBbfRIcOBWMUOgiF6kE9Y mHiODw84xEz/SS1kFHYiIHO1V1ChCAaJDS0ShCguk9yXifTbfUAqwpXrRvvYjFZ9UrSzVjdPutD rZmYF X-Gm-Gg: AeBDies1xCppboceDy+Q15aYuC+ELEGJdM1D4GVI43/vSw5k0Qcalf/i5VuEIwBld7s zeHkb3p01TStfCQ+nHdiRmuC5NopymxFjc4CuSDYaMmTaGEHCaVnEHRMLC5XZMcj99rY07o95cp XwdkLxpaqEyfjmfb0zPvAFZAm5KuKBVuJyYZr+IuIJn8mdYfyKIWv5Y+ixAcZX2wVcLhnBPPpjx RJcC3KWQYTYUENhTtZTAri3uPd0l5besqbhThhkOy6TZQFjWHgFHze24Ej2Z5vy7f6aX7P2TRB6 k1cNiM+638WqioYb+Ff24mRB4o8SxIOnoKcX3WlLg4iG9/G+kSC8z7wH7QBlkKGsDWzvWk53abH DJdnGo7RQSrsFb0PMWuBjjGwveKd0cj3EBzD2K0rWl4oC+nCX06psTZRO771TzNeieKvwIFDKFw KoDYqC8mOaRfoT9d8GNyoUlJY/8hW7uxwLhADL+6p9AdHns5ynwzv9pQ== X-Received: by 2002:a17:90b:4986:b0:359:855f:ff96 with SMTP id 98e67ed59e1d1-35de6942cf7mr8878256a91.17.1775393760754; Sun, 05 Apr 2026 05:56:00 -0700 (PDT) Received: from n232-176-004.byted.org ([36.110.163.97]) by smtp.gmail.com with ESMTPSA id 98e67ed59e1d1-35de66b4808sm3748505a91.2.2026.04.05.05.55.53 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 05 Apr 2026 05:56:00 -0700 (PDT) From: Muchun Song To: Andrew Morton , David Hildenbrand , Muchun Song , Oscar Salvador , Michael Ellerman , Madhavan Srinivasan Cc: Lorenzo Stoakes , "Liam R . 
Howlett" , Vlastimil Babka , Mike Rapoport , Suren Baghdasaryan , Michal Hocko , Nicholas Piggin , Christophe Leroy , aneesh.kumar@linux.ibm.com, joao.m.martins@oracle.com, linux-mm@kvack.org, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, Muchun Song Subject: [PATCH 24/49] mm/sparse-vmemmap: initialize shared tail vmemmap page upon allocation Date: Sun, 5 Apr 2026 20:52:15 +0800 Message-Id: <20260405125240.2558577-25-songmuchun@bytedance.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20260405125240.2558577-1-songmuchun@bytedance.com> References: <20260405125240.2558577-1-songmuchun@bytedance.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Previously, the shared vmemmap tail page allocated in vmemmap_get_tail() was intentionally left uninitialized. This was because the subsequent memmap_init_range() would unconditionally overwrite any initialization done here, forcing subsystems (like HugeTLB) to compensate and perform the initialization later in their own specific routines (e.g., hugetlb_vmemmap_init()). Thanks to the previous patch, memmap_init_range() is now aware of the section's compound page order and safely skips the redundant initialization for these optimizable compound vmemmap pages. Because the overwrite issue is resolved, we can now fully initialize the shared tail pages (via init_compound_tail()) immediately upon allocation in vmemmap_get_tail(). This simplifies the initialization flow and removes the need to defer this work to specific subsystems. Note that the initialization logic in hugetlb_vmemmap_init() is not removed yet. It will be completely removed once HugeTLB switches to the new memory section compound page order mechanism.
Signed-off-by: Muchun Song --- mm/sparse-vmemmap.c | 11 ++--------- 1 file changed, 2 insertions(+), 9 deletions(-) diff --git a/mm/sparse-vmemmap.c b/mm/sparse-vmemmap.c index 9f70559df4e8..2a6c3c82f9f5 100644 --- a/mm/sparse-vmemmap.c +++ b/mm/sparse-vmemmap.c @@ -340,18 +340,11 @@ static __meminit struct page *vmemmap_get_tail(unsign= ed int order, struct zone * if (tail) return tail; =20 - /* - * Only allocate the page, but do not initialize it. - * - * Any initialization done here will be overwritten by memmap_init(). - * - * hugetlb_vmemmap_init() will take care of initialization after - * memmap_init(). - */ - p =3D vmemmap_alloc_block_zero(PAGE_SIZE, node); if (!p) return NULL; + for (int i =3D 0; i < PAGE_SIZE / sizeof(struct page); i++) + init_compound_tail(p + i, NULL, order, zone); =20 tail =3D virt_to_page(p); zone->vmemmap_tails[idx] =3D tail; --=20 2.20.1 From nobody Sun Jun 14 19:01:43 2026 Received: from mail-pj1-f47.google.com (mail-pj1-f47.google.com [209.85.216.47]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 8E6C4215075 for ; Sun, 5 Apr 2026 12:56:07 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.216.47 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1775393769; cv=none; b=Xg8V0lEcIGeUuvjRknXUc7Kc7oZYRJHB38s3bumZcztfwq4rWO88+iiEYj8/EOMzw0sNgnvw9plRTPIKl+ELLnSeGPwNiotNjGK17CZt0jlUAr5sPMWUwPWLJMGBUB0UzIWe04ivolO3pTqjl4ZG/3JjEugkBJlrXSCd8GwFSj8= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1775393769; c=relaxed/simple; bh=gBV4zIpE6bpUqaxFSs3Bfbk1XpucxePoeEZMPKpuwUA=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=apLOTPqAdb5l7TelFqHhMCW1sTrrn9IlIYAOjxEEySZd4PFvOBNACdNNncaHuykbDD+LOglnVR2FA/mD/8xaTD+jc0D/WRuIrLAdvrwXDA9bsp3NCltxeT/nA4CLc07qXSDEAX/Qrh6A6dOKUfkQ7nD6WVsVXHhKSKJweK/FBvA= 
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=bytedance.com; spf=pass smtp.mailfrom=bytedance.com; dkim=pass (2048-bit key) header.d=bytedance.com header.i=@bytedance.com header.b=ORf+EdA6; arc=none smtp.client-ip=209.85.216.47 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=bytedance.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=bytedance.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=bytedance.com header.i=@bytedance.com header.b="ORf+EdA6" Received: by mail-pj1-f47.google.com with SMTP id 98e67ed59e1d1-35d8e548a05so3517110a91.1 for ; Sun, 05 Apr 2026 05:56:07 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1775393767; x=1775998567; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=yuildpo+hHfNl+izDYgZmo65TSC8uYjcfc21fVKbu/w=; b=ORf+EdA6vsIFl/zU7tjIv+jfLYjqvgeLzGEHOxPwVRY2Z1u4gkkCs7nSdbGfPC6xDF WVbYQAQSBH9LQ+XHcCHDDT9OnBDeTtroi6TcYqBgCaU98RoMLaQvvkuCzCTAHnjAeMxW xtLH6Nc6CaXYYV8zqSnWbS0b50yUnVkJkAwXgOMlWEqqeez4BtbL/ek6wgSBJefQYo35 fAr7i0J/A+O+gzbMUq6ktEBWULvVKnCLGd64uw/rklzI1sTGJt4/0ztzmbzXsi+5p+53 wkywbXZ9WwcnWW0aF+yYQ81hFhLId6DLWQHngJUrhPyJVJoTT8ysJuJ7Qc3E+9DO/smP 55Fw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1775393767; x=1775998567; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=yuildpo+hHfNl+izDYgZmo65TSC8uYjcfc21fVKbu/w=; b=llA2oS7iGUd/+0C229zjmewbfkvqFEPWoRNNLlQXUeUk+JjPX0/chgRkWQp6dN65LL eSBoFqfHyvUsr7pxMUfsFuXPAVtPhwEMu53kVSlPR4FW5Ga2yS+Ieihh8oB1R7fFz/IB maK1ICktYwlTL775o5PZwsKgcp3IhL37ulcOL/yCzEv9fGpo/LzDt/al9MDohXTSDm58 
HPuNXzxocIVLpHCCNXG+DqQoEFatZEZNKycCsJchUnwWvK545Io3X34Ytmnav21PJ94+ RwA5JxypR6Ea0yEEhXRAKjsFMmkj1Rp3UB9GZgAzkwBSpQztgI04OqNHoGGJWuXvUkDo S96g== X-Forwarded-Encrypted: i=1; AJvYcCVlWta9hcXxGDT7XN/2Qk/VmKa5B3wJq/mmdTwLkoxRJsWjUXc25Sint/98XZB7G8VjxbZP9C1qt9jjHmo=@vger.kernel.org X-Gm-Message-State: AOJu0Yxx6dRgXwOL1EmeaX2wjc32S7uzG0oa0lIjrbPAXBhJ0T2LPVRX T6nQZYO3eMxSdnf7vsGRo436xkg4tsnFMGNB6ZCyV3bTJn7MybkHOiz1YQAxm3FCRhk= X-Gm-Gg: AeBDiesArOYusaRIpVcNjuCUZEito5G88Vic8dAlkwifnOJ1W720ItZymEJX/y6hHYj e+BGJVIoMGNzeK5xaEBl9ZFxuQfZVV96/rwhpBPS6beNbPSWI31geUJERSQVFifk44PnbMK9OwG FHKe5JGu1dFIm+gtYU8q6CBCPxlzrRI5f7z3wJVgkk22PR7Vb00lgUyUVKGQwwv6txC601yxAzZ ia/HCQCEvNstFE3e0KGBftwEyffTsBt8qpLxHQcrxBlf75aIbyJBNQDCE2IsPKrOprmg0eUZXMa afV2lavNtpK5SmzuRuiMHD9GcOnMP/MuTuushP+zI8TWQ98G5pNyQtMxnUzP71Zr7Af0fMdNbpt 4QP3Y0MHcWAq0arZ7eOnLiZ8fjtQboKQWWtUDgnzxS4Euh9Zn9T59+J2z8OMB56rU4NQeu1wfEK r17RnsU5V+sdzqiJ6G2tAxjopXaex26K18RC/H6gD4CZy736f3gKnepQ== X-Received: by 2002:a17:90b:4ac7:b0:35b:9720:98d0 with SMTP id 98e67ed59e1d1-35de679086dmr9103094a91.5.1775393766817; Sun, 05 Apr 2026 05:56:06 -0700 (PDT) Received: from n232-176-004.byted.org ([36.110.163.97]) by smtp.gmail.com with ESMTPSA id 98e67ed59e1d1-35de66b4808sm3748505a91.2.2026.04.05.05.56.01 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 05 Apr 2026 05:56:06 -0700 (PDT) From: Muchun Song To: Andrew Morton , David Hildenbrand , Muchun Song , Oscar Salvador , Michael Ellerman , Madhavan Srinivasan Cc: Lorenzo Stoakes , "Liam R . 
Howlett" , Vlastimil Babka , Mike Rapoport , Suren Baghdasaryan , Michal Hocko , Nicholas Piggin , Christophe Leroy , aneesh.kumar@linux.ibm.com, joao.m.martins@oracle.com, linux-mm@kvack.org, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, Muchun Song Subject: [PATCH 25/49] mm/sparse-vmemmap: support vmemmap-optimizable compound page population Date: Sun, 5 Apr 2026 20:52:16 +0800 Message-Id: <20260405125240.2558577-26-songmuchun@bytedance.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20260405125240.2558577-1-songmuchun@bytedance.com> References: <20260405125240.2558577-1-songmuchun@bytedance.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Previously, vmemmap optimization (HVO) was tightly coupled with HugeTLB and relied on CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP. With the recent introduction of compound page order to struct mem_section, we can now generalize this optimization to be based on sections rather than being HugeTLB-specific. This patch refactors the vmemmap population logic to utilize the new section-level order information by updating vmemmap_pte_populate() to dynamically allocate or reuse the shared tail page if a section contains optimizable compound pages. These changes centralize the HVO logic within the core sparse-vmemmap code, reducing code duplication and paving the way for unifying the vmemmap optimization paths for both HugeTLB and DAX.
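As a rough user-space illustration of the "allocate or reuse" behaviour described above (a sketch under stated assumptions, not the kernel implementation): struct model_zone, MODEL_MAX_ORDER and model_get_tail() are hypothetical names, calloc() stands in for the vmemmap block allocator, 4096 stands in for PAGE_SIZE, and the cache is indexed directly by order for simplicity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MODEL_MAX_ORDER 20	/* hypothetical upper bound on the order */
#define MODEL_PAGE_SIZE 4096	/* assumed page size for the model */

/* Hypothetical stand-in for the per-zone cache of shared tail pages. */
struct model_zone {
	void *tails[MODEL_MAX_ORDER + 1];	/* one cached page per order */
	unsigned int allocations;		/* how many pages were really allocated */
};

/*
 * Return the shared tail page for @order in @zone, allocating it only on
 * first use. Every later caller with the same order gets the same page,
 * which is what lets many tail mappings share one physical page.
 */
static void *model_get_tail(struct model_zone *zone, unsigned int order)
{
	assert(order <= MODEL_MAX_ORDER);

	if (zone->tails[order])
		return zone->tails[order];

	void *p = calloc(1, MODEL_PAGE_SIZE);
	if (!p)
		return NULL;

	zone->tails[order] = p;
	zone->allocations++;
	return p;
}
```

In this model, asking twice for the same order costs one allocation, and a second order costs one more; the real code additionally initializes the struct pages stored in the newly allocated page.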
Signed-off-by: Muchun Song --- include/linux/mmzone.h | 8 ++++- mm/internal.h | 3 ++ mm/sparse-vmemmap.c | 66 +++++++++++++++++++++++++----------------- mm/sparse.c | 30 +++++++++++++++++-- 4 files changed, 78 insertions(+), 29 deletions(-) diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h index 620503aa29ba..e4d37492ca63 100644 --- a/include/linux/mmzone.h +++ b/include/linux/mmzone.h @@ -1145,7 +1145,7 @@ struct zone { /* Zone statistics */ atomic_long_t vm_stat[NR_VM_ZONE_STAT_ITEMS]; atomic_long_t vm_numa_event[NR_VM_NUMA_EVENT_ITEMS]; -#ifdef CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP +#ifdef CONFIG_SPARSEMEM_VMEMMAP struct page *vmemmap_tails[NR_OPTIMIZABLE_FOLIO_SIZES]; #endif } ____cacheline_internodealigned_in_smp; @@ -2250,6 +2250,12 @@ static inline unsigned int section_order(const struc= t mem_section *section) } #endif =20 +static inline bool section_vmemmap_optimizable(const struct mem_section *s= ection) +{ + return is_power_of_2(sizeof(struct page)) && + section_order(section) >=3D OPTIMIZABLE_FOLIO_MIN_ORDER; +} + void sparse_init_early_section(int nid, struct page *map, unsigned long pn= um, unsigned long flags); =20 diff --git a/mm/internal.h b/mm/internal.h index 1060d7c07f5b..c0d0f546864c 100644 --- a/mm/internal.h +++ b/mm/internal.h @@ -996,6 +996,9 @@ static inline void __section_mark_present(struct mem_se= ction *ms, =20 ms->section_mem_map |=3D SECTION_MARKED_PRESENT; } + +int section_vmemmap_pages(unsigned long pfn, unsigned long nr_pages, + struct vmem_altmap *altmap, struct dev_pagemap *pgmap); #else static inline void sparse_init(void) {} #endif /* CONFIG_SPARSEMEM */ diff --git a/mm/sparse-vmemmap.c b/mm/sparse-vmemmap.c index 2a6c3c82f9f5..6522c36aac20 100644 --- a/mm/sparse-vmemmap.c +++ b/mm/sparse-vmemmap.c @@ -144,17 +144,47 @@ void __meminit vmemmap_verify(pte_t *pte, int node, start, end - 1); } =20 +static struct zone __meminit *pfn_to_zone(unsigned long pfn, int nid) +{ + pg_data_t *pgdat =3D NODE_DATA(nid); + + for 
(enum zone_type zone_type =3D 0; zone_type < MAX_NR_ZONES; zone_type+= +) { + struct zone *zone =3D &pgdat->node_zones[zone_type]; + + if (zone_spans_pfn(zone, pfn)) + return zone; + } + + return NULL; +} + +static __meminit struct page *vmemmap_get_tail(unsigned int order, struct = zone *zone); + static pte_t * __meminit vmemmap_pte_populate(pmd_t *pmd, unsigned long ad= dr, int node, struct vmem_altmap *altmap, unsigned long ptpfn) { pte_t *pte =3D pte_offset_kernel(pmd, addr); + if (pte_none(ptep_get(pte))) { pte_t entry; - void *p; + + if (vmemmap_page_optimizable((struct page *)addr) && + ptpfn =3D=3D (unsigned long)-1) { + struct page *page; + unsigned long pfn =3D page_to_pfn((struct page *)addr); + const struct mem_section *ms =3D __pfn_to_section(pfn); + + page =3D vmemmap_get_tail(section_order(ms), + pfn_to_zone(pfn, node)); + if (!page) + return NULL; + ptpfn =3D page_to_pfn(page); + } =20 if (ptpfn =3D=3D (unsigned long)-1) { - p =3D vmemmap_alloc_block_buf(PAGE_SIZE, node, altmap); + void *p =3D vmemmap_alloc_block_buf(PAGE_SIZE, node, altmap); + if (!p) return NULL; ptpfn =3D PHYS_PFN(__pa(p)); @@ -323,7 +353,6 @@ void vmemmap_wrprotect_hvo(unsigned long addr, unsigned= long end, } } =20 -#ifdef CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP static __meminit struct page *vmemmap_get_tail(unsigned int order, struct = zone *zone) { struct page *p, *tail; @@ -352,6 +381,7 @@ static __meminit struct page *vmemmap_get_tail(unsigned= int order, struct zone * return tail; } =20 +#ifdef CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP int __meminit vmemmap_populate_hvo(unsigned long addr, unsigned long end, unsigned int order, struct zone *zone, unsigned long headsize) @@ -404,6 +434,9 @@ int __meminit vmemmap_populate_hugepages(unsigned long = start, unsigned long end, return vmemmap_populate_compound_pages(start, end, node, pgmap); =20 for (addr =3D start; addr < end; addr =3D next) { + unsigned long pfn =3D page_to_pfn((struct page *)addr); + const struct mem_section *ms =3D 
__pfn_to_section(pfn); + next =3D pmd_addr_end(addr, end); =20 pgd =3D vmemmap_pgd_populate(addr, node); @@ -419,7 +452,7 @@ int __meminit vmemmap_populate_hugepages(unsigned long = start, unsigned long end, return -ENOMEM; =20 pmd =3D pmd_offset(pud, addr); - if (pmd_none(pmdp_get(pmd))) { + if (pmd_none(pmdp_get(pmd)) && !section_vmemmap_optimizable(ms)) { void *p; =20 p =3D vmemmap_alloc_block_buf(PMD_SIZE, node, altmap); @@ -437,8 +470,10 @@ int __meminit vmemmap_populate_hugepages(unsigned long= start, unsigned long end, */ return -ENOMEM; } - } else if (vmemmap_check_pmd(pmd, node, addr, next)) + } else if (vmemmap_check_pmd(pmd, node, addr, next)) { + VM_BUG_ON(section_vmemmap_optimizable(ms)); continue; + } if (vmemmap_populate_basepages(addr, next, node, altmap, pgmap)) return -ENOMEM; } @@ -705,27 +740,6 @@ static int fill_subsection_map(unsigned long pfn, unsi= gned long nr_pages) return rc; } =20 -static int __meminit section_vmemmap_pages(unsigned long pfn, unsigned lon= g nr_pages, - struct vmem_altmap *altmap, struct dev_pagemap *pgmap) -{ - unsigned int order =3D pgmap ? 
pgmap->vmemmap_shift : 0; - unsigned long pages_per_compound =3D 1L << order; - - VM_BUG_ON(!IS_ALIGNED(pfn | nr_pages, min(pages_per_compound, PAGES_PER_S= ECTION))); - VM_BUG_ON(pfn_to_section_nr(pfn) !=3D pfn_to_section_nr(pfn + nr_pages - = 1)); - - if (!vmemmap_can_optimize(altmap, pgmap)) - return DIV_ROUND_UP(nr_pages * sizeof(struct page), PAGE_SIZE); - - if (order < PFN_SECTION_SHIFT) - return VMEMMAP_RESERVE_NR * nr_pages / pages_per_compound; - - if (IS_ALIGNED(pfn, pages_per_compound)) - return VMEMMAP_RESERVE_NR; - - return 0; -} - /* * To deactivate a memory region, there are 3 cases to handle: * diff --git a/mm/sparse.c b/mm/sparse.c index cfe4ffd89baf..62659752980e 100644 --- a/mm/sparse.c +++ b/mm/sparse.c @@ -345,6 +345,32 @@ static void __init sparse_usage_fini(void) sparse_usagebuf =3D sparse_usagebuf_end =3D NULL; } =20 +int __meminit section_vmemmap_pages(unsigned long pfn, unsigned long nr_pa= ges, + struct vmem_altmap *altmap, struct dev_pagemap *pgmap) +{ + const struct mem_section *ms =3D __pfn_to_section(pfn); + unsigned int order =3D pgmap ? pgmap->vmemmap_shift : section_order(ms); + unsigned long pages_per_compound =3D 1L << order; + unsigned int vmemmap_pages =3D OPTIMIZED_FOLIO_VMEMMAP_PAGES; + + if (vmemmap_can_optimize(altmap, pgmap)) + vmemmap_pages =3D VMEMMAP_RESERVE_NR; + + VM_BUG_ON(!IS_ALIGNED(pfn | nr_pages, min(pages_per_compound, PAGES_PER_S= ECTION))); + VM_BUG_ON(pfn_to_section_nr(pfn) !=3D pfn_to_section_nr(pfn + nr_pages - = 1)); + + if (!vmemmap_can_optimize(altmap, pgmap) && !section_vmemmap_optimizable(= ms)) + return DIV_ROUND_UP(nr_pages * sizeof(struct page), PAGE_SIZE); + + if (order < PFN_SECTION_SHIFT) + return vmemmap_pages * nr_pages / pages_per_compound; + + if (IS_ALIGNED(pfn, pages_per_compound)) + return vmemmap_pages; + + return 0; +} + /* * Initialize sparse on a specific node. The node spans [pnum_begin, pnum_= end) * And number of present sections in this node is map_count. 
@@ -376,8 +402,8 @@ static void __init sparse_init_nid(int nid, unsigned lo= ng pnum_begin, nid, NULL, NULL); if (!map) panic("Populate section (%ld) on node[%d] failed\n", pnum, nid); - memmap_boot_pages_add(DIV_ROUND_UP(PAGES_PER_SECTION * sizeof(struct pa= ge), - PAGE_SIZE)); + memmap_boot_pages_add(section_vmemmap_pages(pfn, PAGES_PER_SECTION, + NULL, NULL)); sparse_init_early_section(nid, map, pnum, 0); } } --=20 2.20.1 From nobody Sun Jun 14 19:01:43 2026 Received: from mail-pj1-f47.google.com (mail-pj1-f47.google.com [209.85.216.47]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 66828215075 for ; Sun, 5 Apr 2026 12:56:13 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.216.47 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1775393774; cv=none; b=QQU8nZ1Mv+z4eX31poMDtJWd9T1R5r+8yZn2weNYmwIKNxw2diDZbFIlZZQN85woDA8PEILF3P2pFU0MTFos5H1O9DszfYABsFtmWOIPQvw+iZKSpotR/0Pv4nTVkuLzcK5cpHVBdm7bz+Rji91CQB2ew+tTtU2y0CqX4qMf/T8= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1775393774; c=relaxed/simple; bh=B6bWpnXhutPqtzOTApmLSoJ/RfgQzFUNCd6jxZwp5sM=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=qDBFJCVezVcFeVHJpU6BIrfNUIm1ozwUhWh8zwSqW/XXD4YtLCSoAMMpf9EChGLKwFsHeWWdf97L2g52vrBUASqjko48ljTG8WLW455N19A38GB0n+ZMjV5olMshDcXz/+xaqG+VYIB0ev7yObtXqI75qoNUnH7Ls43OyvJ2aGc= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=bytedance.com; spf=pass smtp.mailfrom=bytedance.com; dkim=pass (2048-bit key) header.d=bytedance.com header.i=@bytedance.com header.b=aXth4zrW; arc=none smtp.client-ip=209.85.216.47 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=bytedance.com Authentication-Results: smtp.subspace.kernel.org; spf=pass 
smtp.mailfrom=bytedance.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=bytedance.com header.i=@bytedance.com header.b="aXth4zrW" Received: by mail-pj1-f47.google.com with SMTP id 98e67ed59e1d1-35d99bae2ebso2819090a91.3 for ; Sun, 05 Apr 2026 05:56:13 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1775393772; x=1775998572; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=CEDM6s7dxdsKjdGxQjN87sllzWekchktC1ag1ZZ29CI=; b=aXth4zrWQe57Mm/19nEpK516zsZJPOS1S1CRH7bu72PIbmjWz5XvtoqaG9bQtMGeYV oivCOVpolFzmMmzbGDEXMitx4iJGfe2LKOYsA6n8shczTeiQgmWx04ZUh0VBsI1im5Df VJ4hxNC0L7eYxUJucTPqJ9L2iS+VEAxvUf3rcRkYH0FVEhZV4jtbsmtrf97gc2fiZHwL QkDl79uO+5eFMYleABz5FRO3FhgBf8Gskz/Z/HU9JyQ7YCpL1agOrJkmlbn4nW+g+Jh/ On2mJM6z/N/uutD6cWR9dqJ87mHvsaLXoSnvi9a0lb9iv6GDAAkLNJyzzRRu+fzH6ZzR vHLA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1775393772; x=1775998572; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=CEDM6s7dxdsKjdGxQjN87sllzWekchktC1ag1ZZ29CI=; b=Zr9/x+Pq5MnI5jB8sSRZzTbNKnNvmnOdj2yKkwVRvRUtvb9KBtcL7Ws9RmRPHjnNHm X2BtzpvQ1E3kdS6Flj+ngz2FLHo9BPSViDU0xayTxb9IieIbmg/XXsPAV+loN7H658RA Owd5yhUSZFcTOZPR0xTPqS+twETK60nrpploTJWz2qXfmzsaQOc9vLTNOl+X8+a9lCHJ dk2CChnxR7DasyhnPatoZq+2r8S/2ICtt3ypH3CV1k1fNRDnFuwFRRjTK+nXQcJchniY G8mT4nfPUZXCnxs7GyfR3bp5GEW1+n7rj4Ng9AIFxR2Ia8NT8zZ+tMYZ2hN6HPLyW8Ob aGwA== X-Forwarded-Encrypted: i=1; AJvYcCW8NjM1Tk8Hr2W9KRYCny+BSHymmdngu0NGGzu3J548RW29X6gVdPKaSoSnh1xfepIWGj/3X2xhqoNAz7M=@vger.kernel.org X-Gm-Message-State: AOJu0YzX9AdZbTrHpVWgukAf9Gnqh1BYeDnXNU9nGFubbnSNPXDmacOn iY8nXNQ0uVkKMO6hP7oIIJ3OjcFzEhLvpXP+xgIrouo3Wusg/S4RedYoq95u8ts4ml0= X-Gm-Gg: 
AeBDievdpqVJsOI6Ompo4F2gT4YPJV2JyIegnNZRDqYk/ZPk+y9LeYOY5li+IMHWMgI XIefFmHU2rlPTxcJwYY5CTIBrzL/maOlCGjOzLkuI0J9R9oScApD5xahNRIAM2FCm0jl0Jn2+JU VlOx1lejOZ7VmD8iA8NtF1vLdV5DkF26dfOrEjskL24bq021VBLhrZvLvuYowxLND2MOD+TVFd2 p2/Ouhivl8zG/RWGAyBOWToG/DlUzKbN7HlFgmK4urRGkDHVWr2zfg2fYj2WQ9qYl5JIw1O8koE siDmdrZIxXGmCx3s0HjM+7uDVRgG8qvc4LoKT9WAQxsMXoGPVk2JX/JuJbcl0AyXfFVBKj24eIZ ZiB6ZoGFIyh9twt7da4L8stDjM1tyG87hc66isA4ST/RWUIBbC723s09JqeD8B2W3FsopEJjXaI 961JZcEOVlNTHhJAsCnA8mQmjAVyf5CCmkFLFJfhc1wkM= X-Received: by 2002:a17:90b:224e:b0:35b:e56e:a17e with SMTP id 98e67ed59e1d1-35de68cf52fmr9251356a91.17.1775393772471; Sun, 05 Apr 2026 05:56:12 -0700 (PDT) Received: from n232-176-004.byted.org ([36.110.163.97]) by smtp.gmail.com with ESMTPSA id 98e67ed59e1d1-35de66b4808sm3748505a91.2.2026.04.05.05.56.07 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 05 Apr 2026 05:56:12 -0700 (PDT) From: Muchun Song To: Andrew Morton , David Hildenbrand , Muchun Song , Oscar Salvador , Michael Ellerman , Madhavan Srinivasan Cc: Lorenzo Stoakes , "Liam R . 
Howlett" , Vlastimil Babka , Mike Rapoport , Suren Baghdasaryan , Michal Hocko , Nicholas Piggin , Christophe Leroy , aneesh.kumar@linux.ibm.com, joao.m.martins@oracle.com, linux-mm@kvack.org, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, Muchun Song Subject: [PATCH 26/49] mm/hugetlb: use generic vmemmap optimization macros Date: Sun, 5 Apr 2026 20:52:17 +0800 Message-Id: <20260405125240.2558577-27-songmuchun@bytedance.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20260405125240.2558577-1-songmuchun@bytedance.com> References: <20260405125240.2558577-1-songmuchun@bytedance.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Use the generic macros OPTIMIZED_FOLIO_VMEMMAP_SIZE, OPTIMIZED_FOLIO_VMEMMAP_PAGE_STRUCTS and OPTIMIZED_FOLIO_VMEMMAP_PAGES instead of the hugetlb-specific HUGETLB_VMEMMAP_RESERVE_SIZE and HUGETLB_VMEMMAP_RESERVE_PAGES to describe the vmemmap-optimized folio. Signed-off-by: Muchun Song --- mm/hugetlb.c | 4 ++-- mm/hugetlb_vmemmap.c | 14 +++++++------- mm/hugetlb_vmemmap.h | 9 +-------- 3 files changed, 10 insertions(+), 17 deletions(-) diff --git a/mm/hugetlb.c b/mm/hugetlb.c index a00c9f3672b7..a7e0599802cb 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -3241,7 +3241,7 @@ static void __init prep_and_add_bootmem_folios(struct= hstate *h, * be no contention. 
*/ hugetlb_folio_init_tail_vmemmap(folio, h, - HUGETLB_VMEMMAP_RESERVE_PAGES, + OPTIMIZED_FOLIO_VMEMMAP_PAGE_STRUCTS, pages_per_huge_page(h)); } hugetlb_bootmem_init_migratetype(folio, h); @@ -3280,7 +3280,7 @@ static void __init gather_bootmem_prealloc_node(unsig= ned long nid) WARN_ON(folio_ref_count(folio) !=3D 1); =20 hugetlb_folio_init_vmemmap(folio, h, - HUGETLB_VMEMMAP_RESERVE_PAGES); + OPTIMIZED_FOLIO_VMEMMAP_PAGE_STRUCTS); init_new_hugetlb_folio(folio); =20 if (hugetlb_bootmem_page_prehvo(m)) diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c index d6dd47c232e0..0af528c0e229 100644 --- a/mm/hugetlb_vmemmap.c +++ b/mm/hugetlb_vmemmap.c @@ -407,7 +407,7 @@ static int __hugetlb_vmemmap_restore_folio(const struct= hstate *h, vmemmap_start =3D (unsigned long)&folio->page; vmemmap_end =3D vmemmap_start + hugetlb_vmemmap_size(h); =20 - vmemmap_start +=3D HUGETLB_VMEMMAP_RESERVE_SIZE; + vmemmap_start +=3D OPTIMIZED_FOLIO_VMEMMAP_SIZE; =20 /* * The pages which the vmemmap virtual address range [@vmemmap_start, @@ -637,10 +637,10 @@ static void __hugetlb_vmemmap_optimize_folios(struct = hstate *h, spfn =3D (unsigned long)&folio->page; epfn =3D spfn + pages_per_huge_page(h); vmemmap_wrprotect_hvo(spfn, epfn, folio_nid(folio), - HUGETLB_VMEMMAP_RESERVE_SIZE); + OPTIMIZED_FOLIO_VMEMMAP_SIZE); register_page_bootmem_memmap(pfn_to_section_nr(spfn), &folio->page, - HUGETLB_VMEMMAP_RESERVE_SIZE); + OPTIMIZED_FOLIO_VMEMMAP_SIZE); continue; } =20 @@ -791,8 +791,8 @@ void __init hugetlb_vmemmap_init_early(int nid) zone =3D pfn_to_zone(nid, pfn); =20 BUG_ON(vmemmap_populate_hvo(start, end, huge_page_order(m->hstate), - zone, HUGETLB_VMEMMAP_RESERVE_SIZE)); - memmap_boot_pages_add(HUGETLB_VMEMMAP_RESERVE_SIZE / PAGE_SIZE); + zone, OPTIMIZED_FOLIO_VMEMMAP_SIZE)); + memmap_boot_pages_add(OPTIMIZED_FOLIO_VMEMMAP_PAGES); =20 pnum =3D pfn_to_section_nr(pfn); ns =3D psize / section_size; @@ -824,8 +824,8 @@ static int __init hugetlb_vmemmap_init(void) const struct hstate *h; 
struct zone *zone; =20 - /* HUGETLB_VMEMMAP_RESERVE_SIZE should cover all used struct pages */ - BUILD_BUG_ON(__NR_USED_SUBPAGE > HUGETLB_VMEMMAP_RESERVE_PAGES); + /* OPTIMIZED_FOLIO_VMEMMAP_SIZE should cover all used struct pages */ + BUILD_BUG_ON(__NR_USED_SUBPAGE > OPTIMIZED_FOLIO_VMEMMAP_PAGE_STRUCTS); =20 for_each_zone(zone) { for (int i =3D 0; i < NR_OPTIMIZABLE_FOLIO_SIZES; i++) { diff --git a/mm/hugetlb_vmemmap.h b/mm/hugetlb_vmemmap.h index 7ac49c52457d..66e11893d076 100644 --- a/mm/hugetlb_vmemmap.h +++ b/mm/hugetlb_vmemmap.h @@ -12,13 +12,6 @@ #include #include =20 -/* - * Reserve one vmemmap page, all vmemmap addresses are mapped to it. See - * Documentation/mm/vmemmap_dedup.rst. - */ -#define HUGETLB_VMEMMAP_RESERVE_SIZE PAGE_SIZE -#define HUGETLB_VMEMMAP_RESERVE_PAGES (HUGETLB_VMEMMAP_RESERVE_SIZE / size= of(struct page)) - #ifdef CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP int hugetlb_vmemmap_restore_folio(const struct hstate *h, struct folio *fo= lio); long hugetlb_vmemmap_restore_folios(const struct hstate *h, @@ -43,7 +36,7 @@ static inline unsigned int hugetlb_vmemmap_size(const str= uct hstate *h) */ static inline unsigned int hugetlb_vmemmap_optimizable_size(const struct h= state *h) { - int size =3D hugetlb_vmemmap_size(h) - HUGETLB_VMEMMAP_RESERVE_SIZE; + int size =3D hugetlb_vmemmap_size(h) - OPTIMIZED_FOLIO_VMEMMAP_SIZE; =20 if (!is_power_of_2(sizeof(struct page))) return 0; --=20 2.20.1 From nobody Sun Jun 14 19:01:43 2026 Received: from mail-pj1-f48.google.com (mail-pj1-f48.google.com [209.85.216.48]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 39852215075 for ; Sun, 5 Apr 2026 12:56:21 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.216.48 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1775393782; cv=none; 
b=DGQE/7ZokHV1/uqoz7hb6i/G6HvuhhRSOqzedqJxjWzkmysjORODqSTL0Od4DmNrXU6Tg9xs1dVZBcjhFsEPhBZn9VNcopTGqy7sWCtqa/TFAP4aU6BkV7nsBuHz9zzvHYCVz/rTVqn60AZzpdqHNwE5sSjI6MCAxW+yl/pFrSc= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1775393782; c=relaxed/simple; bh=Ke4gF0pK27h8FO5n3WD2reElse5OO5N+7HZZWahOQhw=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=bcdOXJ15zIoNOnTftkZTiQoPxaPDkzFeFPKQj5CRQqCrdQMuKeC0qRP4kJqH/Nk0Jq4r23SExj4e0x2OaxBU5fL61bGCRmcjRZZcrVPaqp4Iv7wad/PWd4BRxRF/8JhTdy19gdaL6orBhYofqprPvULI3X4tJdLxXQObdUA21/E= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=bytedance.com; spf=pass smtp.mailfrom=bytedance.com; dkim=pass (2048-bit key) header.d=bytedance.com header.i=@bytedance.com header.b=GbtxXfDu; arc=none smtp.client-ip=209.85.216.48 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=bytedance.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=bytedance.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=bytedance.com header.i=@bytedance.com header.b="GbtxXfDu" Received: by mail-pj1-f48.google.com with SMTP id 98e67ed59e1d1-358ed696623so1323428a91.0 for ; Sun, 05 Apr 2026 05:56:21 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1775393781; x=1775998581; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=i+vulRN5m4Yydq260oUv4yelbrwHrOXM8b9fFHhOg90=; b=GbtxXfDuSkS607R5cbxCeLVR8MqWDVWBrPK6zJ8b+LdA5WU62rOJislEgHyuFK7VQ1 Zp2FlLfruZvxm9THqZXhxvwSAPLnY7JLBbcQvHrctDqhA9nHZQYeEFAfByALAvN+Vy/c mJtfmRdXSY0d/C+qyBL5cUiKkxPKWZVbILdoRuDUnOo3G9S4bAjUIuXxZjfbblT8lU5s 2cvbbW1pWADTJHqIN4QrSVmvckWpEYY0Za0/nyf1zRmxAm1j/g4SWkguDXxzm9OwSTnB 
EH1Qeg9inRewOzLthNFKev+yfHQZjCH1OLeQJGHttB8QDKxDXTgzQf6MnM6TjlPGLCeK LvwQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1775393781; x=1775998581; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=i+vulRN5m4Yydq260oUv4yelbrwHrOXM8b9fFHhOg90=; b=SOcn3eLbqjFaQNUkLdWBmK8Trcl8lavOpVBHYAx/T+/oSYADKE/TNcOeK8H2fATQPB oaeYKbYerHdMRGQpbtsHa4CO29YvDvkKmcUw+VV545yZhx52q45tZKKIzK6p3b7wQxZs TRS3TMuDhL2PAxwNV2V9kKZ8i6dH4OmruKGPR+U9LN0GrsBgUauANqAXy6m0zse3k3Cy h8T5f1VfE6283SAbmoCz2EhCIqfguawgZlgGyl7Lbcp//MM2Y1fUN8eEQr7cm9wIb8j+ szFJ2uUUlFBBjqZwNLhAiBlCEKDG4omCGRIJz8v6sErDT52rfQjFKpZuKLzORnkptBlH L2Tw== X-Forwarded-Encrypted: i=1; AJvYcCWjEr/iEqsvwnt7fcRRbC7JcAClI6u4HzzDo8XIbLYWqOg0fRjVqgKWoBDqST3hLnlGgjXL57j6FKk3VHY=@vger.kernel.org X-Gm-Message-State: AOJu0YxDggG/zT1GjXw5vI6hjzl3hpaqM1SuVp7vBL/bDX/5uBirlA+8 B7a1M+8btu4SqiL4PE6sgLUm4OgFowSFhreLN+D5l/ljRgcQjr9PqykygWaVDfVGYrk= X-Gm-Gg: AeBDies8OGMculv9pCDmAw0QFYLmyShroof8OdcjtoWdY8GxH13mKIBhrtyGP8m1A6j sH9wUZIlYxM3rH081jc6Wy3OR44rwfzPZwyTm2r2qW5mBJ7gavcXwV4SMKCCZPnMIRVUk5dSrJL tbrULQrdUfI/SuAeun4Vute/R79BshocWUIqlEsTDsFhJm+Ii/EfEQZqDCbu8RGtnfxBPusWYga 6czrL6Qn/0Iv++dVdV1gdiweJ7p3IwFWHXe//XBKu4kFJjRBTho+PZfYQi2eHvtKV9kBPCm9KO/ f0+7W89KN/O/JOYhzeOLG3SEWXI1rJ+l67z3wVaPT6ovmYtbLTCZIkbnAMpI5+omlCyIWFAcqRH D/OFj3HV6UO2PnXn8M4kUbx1+ccKJaeiWucHH/MPHZYbXGnyQFjL2wKl4qMbme4zVHghzn3Cib2 RakyBIVzi2hPbkCNFCvt70/Ph11VFQwdva3IjrjnOLgIeJeaAfwmN/tQ== X-Received: by 2002:a17:90b:1c10:b0:35d:9f7c:141c with SMTP id 98e67ed59e1d1-35de695a4c6mr8913118a91.27.1775393780469; Sun, 05 Apr 2026 05:56:20 -0700 (PDT) Received: from n232-176-004.byted.org ([36.110.163.97]) by smtp.gmail.com with ESMTPSA id 98e67ed59e1d1-35de66b4808sm3748505a91.2.2026.04.05.05.56.12 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 05 Apr 2026 05:56:20 -0700 (PDT) From: Muchun Song To: Andrew 
Morton , David Hildenbrand , Muchun Song , Oscar Salvador , Michael Ellerman , Madhavan Srinivasan Cc: Lorenzo Stoakes , "Liam R . Howlett" , Vlastimil Babka , Mike Rapoport , Suren Baghdasaryan , Michal Hocko , Nicholas Piggin , Christophe Leroy , aneesh.kumar@linux.ibm.com, joao.m.martins@oracle.com, linux-mm@kvack.org, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, Muchun Song Subject: [PATCH 27/49] mm: call memblocks_present() before HugeTLB initialization Date: Sun, 5 Apr 2026 20:52:18 +0800 Message-Id: <20260405125240.2558577-28-songmuchun@bytedance.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20260405125240.2558577-1-songmuchun@bytedance.com> References: <20260405125240.2558577-1-songmuchun@bytedance.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Extract memblocks_present() from sparse_init() and call it earlier in mm_core_init_early(). This ensures that the struct mem_section array is properly allocated and marked as present before HugeTLB bootmem allocation. This is a necessary preparation for the subsequent patches, which will need to perform early setting of the section order for HugeTLB pages. 
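The ordering constraint can be pictured with a toy userspace model, given here only as a sketch and not as part of the patch. All model_* names are hypothetical, the section size is an assumed value, and the range helper loosely mirrors section_set_order_pfn_range() from a later patch in this series. In the model, a per-section order can only be recorded once the section array has been set up, so the setup step has to run first. The model reports failure where the real kernel would instead touch uninitialized section data.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_PAGES_PER_SECTION 32768UL /* assumed: 128 MiB / 4 KiB */
#define MODEL_NR_SECTIONS       16UL

struct model_section {
	bool present;
	unsigned int order;
};

static struct model_section model_storage[MODEL_NR_SECTIONS];
/* NULL until model_memblocks_present() has run. */
static struct model_section *model_sections;

/* Stand-in for memblocks_present(): set up and mark every section present. */
void model_memblocks_present(void)
{
	for (size_t i = 0; i < MODEL_NR_SECTIONS; i++)
		model_storage[i].present = true;
	model_sections = model_storage;
}

/* Record "order" for every section in the range; true on success. */
bool model_section_set_order_pfn_range(unsigned long pfn,
				       unsigned long nr_pages,
				       unsigned int order)
{
	unsigned long first = pfn / MODEL_PAGES_PER_SECTION;
	unsigned long count = nr_pages / MODEL_PAGES_PER_SECTION;

	if (!model_sections)
		return false; /* called too early: no section array yet */

	/* Only whole, section-aligned ranges carry a per-section order. */
	if ((pfn | nr_pages) & (MODEL_PAGES_PER_SECTION - 1))
		return false;

	if (first + count > MODEL_NR_SECTIONS)
		return false;

	for (unsigned long i = 0; i < count; i++)
		model_sections[first + i].order = order;

	return true;
}
```

Calling the range helper before model_memblocks_present() fails in this model, which is the situation the reordering in mm_core_init_early() is meant to rule out.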
Signed-off-by: Muchun Song --- mm/internal.h | 2 ++ mm/mm_init.c | 1 + mm/sparse.c | 4 +--- 3 files changed, 4 insertions(+), 3 deletions(-) diff --git a/mm/internal.h b/mm/internal.h index c0d0f546864c..27c06250d6b8 100644 --- a/mm/internal.h +++ b/mm/internal.h @@ -963,6 +963,7 @@ void memmap_init_range(unsigned long, int, unsigned lon= g, unsigned long, * mm/sparse.c */ #ifdef CONFIG_SPARSEMEM +void memblocks_present(void); void sparse_init(void); int sparse_index_init(unsigned long section_nr, int nid); =20 @@ -1000,6 +1001,7 @@ static inline void __section_mark_present(struct mem_= section *ms, int section_vmemmap_pages(unsigned long pfn, unsigned long nr_pages, struct vmem_altmap *altmap, struct dev_pagemap *pgmap); #else +static inline void memblocks_present(void) {} static inline void sparse_init(void) {} #endif /* CONFIG_SPARSEMEM */ =20 diff --git a/mm/mm_init.c b/mm/mm_init.c index 7f5b326e9298..b47f65425bc1 100644 --- a/mm/mm_init.c +++ b/mm/mm_init.c @@ -2675,6 +2675,7 @@ void __init __weak mem_init(void) =20 void __init mm_core_init_early(void) { + memblocks_present(); free_area_init(); /* Zone data structures are available from here. */ hugetlb_cma_reserve(); diff --git a/mm/sparse.c b/mm/sparse.c index 62659752980e..7779554c5a0c 100644 --- a/mm/sparse.c +++ b/mm/sparse.c @@ -195,7 +195,7 @@ static void __init memory_present(int nid, unsigned lon= g start, unsigned long en * This is a convenience function that is useful to mark all of the systems * memory as present during initialization. 
*/ -static void __init memblocks_present(void) +void __init memblocks_present(void) { unsigned long start, end; int i, nid; @@ -420,8 +420,6 @@ void __init sparse_init(void) unsigned long pnum_end, pnum_begin, map_count =3D 1; int nid_begin; =20 - memblocks_present(); - if (compound_info_has_mask()) { VM_WARN_ON_ONCE(!IS_ALIGNED((unsigned long) pfn_to_page(0), MAX_FOLIO_VMEMMAP_ALIGN)); --=20 2.20.1 From nobody Sun Jun 14 19:01:43 2026 Received: from mail-pj1-f48.google.com (mail-pj1-f48.google.com [209.85.216.48]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 06C76341AAE for ; Sun, 5 Apr 2026 12:56:26 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.216.48 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1775393788; cv=none; b=M3Y7V7KfGhr0X4XJ1LOEv1N/r7fxkWSWzX+1v6kzYtlbXQHElepoX4K0/5GXrJlUn9Ipe+7kSeB5bg4zoHifLdxRD7FzKmsva5alu6v3jNnbIXw3lgrq7Gz6XsRflL3BqeYFPl3tIdloMvV74Fiww4DOjx7mtNlty7wZjmLG6q8= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1775393788; c=relaxed/simple; bh=rP81vrJhL+5FfMpH9stDaIGdoLmDBKGq1ZvmGJMLwCY=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=J5K2ZanXguUvA/WNBPh0lE9Tc7yu0kUIPjZRqtqdCUk+GvefNp1n3jAGskvBlwWzq5ZN+zOjq0kdo0jxlcQ7JHsD5iU9RLxnNukuztfhcuKPXCqdsXT5dk0aeqRaB4Q+GSlvzwLzM3U0ZlwW3wdsxBlH1ZWF1jJ4Ca2bgCN8V6o= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=bytedance.com; spf=pass smtp.mailfrom=bytedance.com; dkim=pass (2048-bit key) header.d=bytedance.com header.i=@bytedance.com header.b=giT6o2gJ; arc=none smtp.client-ip=209.85.216.48 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=bytedance.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=bytedance.com 
Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=bytedance.com header.i=@bytedance.com header.b="giT6o2gJ" Received: by mail-pj1-f48.google.com with SMTP id 98e67ed59e1d1-35da9692ec3so2875863a91.1 for ; Sun, 05 Apr 2026 05:56:26 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1775393786; x=1775998586; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=awaFXPFMYUbrBte65uesX1HHeaiEoqtP90DsGctc43A=; b=giT6o2gJFzh6TDo9CMcwK5V/Imu1LqW7SxHSkjfyd7pw02Ozl2bcTd4Y5GMckeLF4f lE913gkRvCLOj7TM9Tc4IErOMNVuUk1dIwGV+txlcVdVfWbKG4I2yL6RVUyhEsBA+xDQ mVlqZOnTnMdG9fYLYOdk0lj3wh2cgOjHgMHxCJluUplaB08RNmBgDPWfUBCHfJWvOxXd vzIugKhmZMdIShlohEIe1JQDJaPYHeR5qa0pwUbpxe99R5nDRyWtZx085t9I6Wlkkokb P1zFrBAzQchSAWmblQcQerMMXgHnzsbcHKKc69TwGsQ6NPzZeoIW35NqScv0EVQU/Awb 1T9g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1775393786; x=1775998586; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=awaFXPFMYUbrBte65uesX1HHeaiEoqtP90DsGctc43A=; b=IQaBLvir/85ec5h59mZV+4FxE77OwWQelz5jwMzztquAsCIROOycxWl4LWb0yhNeBu nnAPRjjp57AsOTcmuE+BAbZRWqMLl+2+BCfgkdUR67K/IvEbAPHtAd9a7Ugb8L2Te3Lq A4M2T1t4KYdcSjAyaKFqMzK4OFS3Ua5kXX1w4/MtSof+JGim+JOhVFpl6+pkfWSbbF21 xEXVWSU8dSdUwdCh8X3iB0q8hbseaXeFaZorDloyR/k9xnzOlz9kx5TSQiLNXoyE432k HjZXU7pVcEkmIb/1SfYLSpInY4CC1QAffmPEWBMLcqdbXJXT3XWWmNLhTo9jm22l9J+u ejDg== X-Forwarded-Encrypted: i=1; AJvYcCUXzMmgpjG+4N5+ySMNLNDci3GoH4pnPzdz5Ns6vBs11Nfqgv5+JbrUHpNM94+QZd2YjzipRW8TR89mmzk=@vger.kernel.org X-Gm-Message-State: AOJu0Yxlri5PhhD73uihEqsoYA8IJ+ZnzRkH/5I3rAqNPU9RXCEYkVgO Wnpn6DqWrw0VLNg7flhvOIlNJskzDK9FtP0ny2umSJD2Gt/Kp6MPWmbtvw3iOvD8QAQ= X-Gm-Gg: AeBDiev44OLbe+2KWXTInDrPhihP6EKItfzDIjwdFZG11bL/XKo9Fe/qSaTlh6BYfzK 
zrPaOH008FMiHGQV1St8nHwDXcUaeBWMnDyur+0e6+u/z3xRACN9/7wYzYJqXJHvuM1LjkngEHM VZWAjBUanLXdnKFzUUz14RCr53FubULMuHUAMw0rrdrh/PYeI9Dm0u7zje+rPtqV/3TXx4OX6yG sRp2NklxWSzrlBmdAFxQ+et2jou5WTtu0xfpL2yptXbm4uAA+mcWNZwR1hwvSR7ydzlrJzivCoU wGOFWUSjRNsUcegQLhZvseOU4zwCXtmbte1O0Z+BcazsRi7ODAd2NnatN/rcVDdy7mitNMF5Ryb Mc1ZZgvvIDG1aRpaBDvcHRv4HHocOiPz73v9p67i+SMG+BH5W0ar7TcHYyQOTZkEEyf23ABlbhY m7PxfBpluwvTNrtmIP3/NVb52mi7yLELNvtOU358k95fq34rMxUEMAjA== X-Received: by 2002:a17:90b:3811:b0:359:1130:1047 with SMTP id 98e67ed59e1d1-35de68ebf53mr9391600a91.17.1775393786359; Sun, 05 Apr 2026 05:56:26 -0700 (PDT) Received: from n232-176-004.byted.org ([36.110.163.97]) by smtp.gmail.com with ESMTPSA id 98e67ed59e1d1-35de66b4808sm3748505a91.2.2026.04.05.05.56.20 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 05 Apr 2026 05:56:26 -0700 (PDT) From: Muchun Song To: Andrew Morton , David Hildenbrand , Muchun Song , Oscar Salvador , Michael Ellerman , Madhavan Srinivasan Cc: Lorenzo Stoakes , "Liam R . Howlett" , Vlastimil Babka , Mike Rapoport , Suren Baghdasaryan , Michal Hocko , Nicholas Piggin , Christophe Leroy , aneesh.kumar@linux.ibm.com, joao.m.martins@oracle.com, linux-mm@kvack.org, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, Muchun Song Subject: [PATCH 28/49] mm/hugetlb: switch HugeTLB to use generic vmemmap optimization Date: Sun, 5 Apr 2026 20:52:19 +0800 Message-Id: <20260405125240.2558577-29-songmuchun@bytedance.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20260405125240.2558577-1-songmuchun@bytedance.com> References: <20260405125240.2558577-1-songmuchun@bytedance.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Switch the HugeTLB vmemmap optimization to use the new infrastructure introduced in the previous patches (specifically, the compound page support in sparse-vmemmap). 
Previously, optimizing bootmem HugeTLB pages required dedicated and
complex pre-initialization logic, such as hugetlb_vmemmap_init_early()
and vmemmap_populate_hvo(). This approach manually handled page mapping
and initialization at a very early stage.

This patch removes all those special-cased functions and simply calls
hugetlb_vmemmap_optimize_bootmem_page() directly from alloc_bootmem().
By explicitly setting the compound page order in the mem_section (via
section_set_order), the generic sparse-vmemmap initialization code will
now automatically handle the shared tail page mapping for these bootmem
pages.

This significantly simplifies the code, eliminates duplicate logic, and
seamlessly integrates bootmem vmemmap optimization with the generic
vmemmap optimization flow.

Signed-off-by: Muchun Song
---
 include/linux/mm.h     |   3 -
 include/linux/mmzone.h |  13 +++++
 mm/bootmem_info.c      |   5 +-
 mm/hugetlb.c           |   8 ++-
 mm/hugetlb_vmemmap.c   | 121 +++--------------------------------------
 mm/hugetlb_vmemmap.h   |  11 ++--
 mm/sparse-vmemmap.c    |  29 ----------
 7 files changed, 32 insertions(+), 158 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index aa8c05de7585..93e447468131 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -4877,9 +4877,6 @@ int vmemmap_populate_hugepages(unsigned long start, unsigned long end,
 			       struct dev_pagemap *pgmap);
 int vmemmap_populate(unsigned long start, unsigned long end, int node,
 		struct vmem_altmap *altmap, struct dev_pagemap *pgmap);
-int vmemmap_populate_hvo(unsigned long start, unsigned long end,
-			 unsigned int order, struct zone *zone,
-			 unsigned long headsize);
 void vmemmap_wrprotect_hvo(unsigned long start, unsigned long end, int node,
 			  unsigned long headsize);
 void vmemmap_populate_print_last(void);
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index e4d37492ca63..0bd20efac427 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -2250,6 +2250,19 @@ static inline unsigned int section_order(const struct mem_section *section)
 }
 #endif
 
+static inline void section_set_order_pfn_range(unsigned long pfn,
+					       unsigned long nr_pages,
+					       unsigned int order)
+{
+	unsigned long section_nr = pfn_to_section_nr(pfn);
+
+	if (!IS_ALIGNED(pfn | nr_pages, PAGES_PER_SECTION))
+		return;
+
+	for (int i = 0; i < nr_pages / PAGES_PER_SECTION; i++)
+		section_set_order(__nr_to_section(section_nr + i), order);
+}
+
 static inline bool section_vmemmap_optimizable(const struct mem_section *section)
 {
 	return is_power_of_2(sizeof(struct page)) &&
diff --git a/mm/bootmem_info.c b/mm/bootmem_info.c
index 3d7675a3ae04..24f45d86ffb3 100644
--- a/mm/bootmem_info.c
+++ b/mm/bootmem_info.c
@@ -51,9 +51,8 @@ static void __init register_page_bootmem_info_section(unsigned long start_pfn)
 	section_nr = pfn_to_section_nr(start_pfn);
 	ms = __nr_to_section(section_nr);
 
-	if (!preinited_vmemmap_section(ms))
-		register_page_bootmem_memmap(section_nr, pfn_to_page(start_pfn),
-					     PAGES_PER_SECTION);
+	register_page_bootmem_memmap(section_nr, pfn_to_page(start_pfn),
+				     PAGES_PER_SECTION);
 
 	usage = ms->usage;
 	page = virt_to_page(usage);
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index a7e0599802cb..dff94ab7040a 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -3096,6 +3096,7 @@ static __init void *alloc_bootmem(struct hstate *h, int nid, bool node_exact)
 		 * is not up yet.
 		 */
 		INIT_LIST_HEAD(&m->list);
+		m->hstate = h;
 		if (pfn_range_intersects_zones(listnode, PHYS_PFN(virt_to_phys(m)),
 					       pages_per_huge_page(h))) {
 			VM_BUG_ON(hugetlb_bootmem_page_earlycma(m));
@@ -3103,8 +3104,8 @@ static __init void *alloc_bootmem(struct hstate *h, int nid, bool node_exact)
 		} else {
 			list_add_tail(&m->list, &huge_boot_pages[listnode]);
 			m->flags |= HUGE_BOOTMEM_ZONES_VALID;
+			hugetlb_vmemmap_optimize_bootmem_page(m);
 		}
-		m->hstate = h;
 	}
 
 	return m;
@@ -3283,13 +3284,16 @@ static void __init gather_bootmem_prealloc_node(unsigned long nid)
 					  OPTIMIZED_FOLIO_VMEMMAP_PAGE_STRUCTS);
 		init_new_hugetlb_folio(folio);
 
-		if (hugetlb_bootmem_page_prehvo(m))
+		if (hugetlb_bootmem_page_prehvo(m)) {
 			/*
 			 * If pre-HVO was done, just set the
 			 * flag, the HVO code will then skip
 			 * this folio.
 			 */
 			folio_set_hugetlb_vmemmap_optimized(folio);
+			section_set_order_pfn_range(folio_pfn(folio),
+						    pages_per_huge_page(h), 0);
+		}
 
 		if (hugetlb_bootmem_page_earlycma(m))
 			folio_set_hugetlb_cma(folio);
diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
index 0af528c0e229..8c567b8c67cc 100644
--- a/mm/hugetlb_vmemmap.c
+++ b/mm/hugetlb_vmemmap.c
@@ -638,9 +638,6 @@ static void __hugetlb_vmemmap_optimize_folios(struct hstate *h,
 			epfn = spfn + pages_per_huge_page(h);
 			vmemmap_wrprotect_hvo(spfn, epfn, folio_nid(folio),
 					OPTIMIZED_FOLIO_VMEMMAP_SIZE);
-			register_page_bootmem_memmap(pfn_to_section_nr(spfn),
-					&folio->page,
-					OPTIMIZED_FOLIO_VMEMMAP_SIZE);
 			continue;
 		}
 
@@ -706,108 +703,21 @@ void hugetlb_vmemmap_optimize_bootmem_folios(struct hstate *h, struct list_head
 	__hugetlb_vmemmap_optimize_folios(h, folio_list, true);
 }
 
-#ifdef CONFIG_SPARSEMEM_VMEMMAP_PREINIT
-
-/* Return true of a bootmem allocated HugeTLB page should be pre-HVO-ed */
-static bool vmemmap_should_optimize_bootmem_page(struct huge_bootmem_page *m)
-{
-	unsigned long section_size, psize, pmd_vmemmap_size;
-	phys_addr_t paddr;
-
-	if (!READ_ONCE(vmemmap_optimize_enabled))
-		return false;
-
-	if (!hugetlb_vmemmap_optimizable(m->hstate))
-		return false;
-
-	psize = huge_page_size(m->hstate);
-	paddr = virt_to_phys(m);
-
-	/*
-	 * Pre-HVO only works if the bootmem huge page
-	 * is aligned to the section size.
-	 */
-	section_size = (1UL << PA_SECTION_SHIFT);
-	if (!IS_ALIGNED(paddr, section_size) ||
-	    !IS_ALIGNED(psize, section_size))
-		return false;
-
-	/*
-	 * The pre-HVO code does not deal with splitting PMDS,
-	 * so the bootmem page must be aligned to the number
-	 * of base pages that can be mapped with one vmemmap PMD.
-	 */
-	pmd_vmemmap_size = (PMD_SIZE / (sizeof(struct page))) << PAGE_SHIFT;
-	if (!IS_ALIGNED(paddr, pmd_vmemmap_size) ||
-	    !IS_ALIGNED(psize, pmd_vmemmap_size))
-		return false;
-
-	return true;
-}
-
-static struct zone *pfn_to_zone(unsigned nid, unsigned long pfn)
+void __init hugetlb_vmemmap_optimize_bootmem_page(struct huge_bootmem_page *m)
 {
-	struct zone *zone;
-	enum zone_type zone_type;
-
-	for (zone_type = 0; zone_type < MAX_NR_ZONES; zone_type++) {
-		zone = &NODE_DATA(nid)->node_zones[zone_type];
-		if (zone_spans_pfn(zone, pfn))
-			return zone;
-	}
-
-	return NULL;
-}
-
-/*
- * Initialize memmap section for a gigantic page, HVO-style.
- */
-void __init hugetlb_vmemmap_init_early(int nid)
-{
-	unsigned long psize, paddr, section_size;
-	unsigned long ns, i, pnum, pfn, nr_pages;
-	unsigned long start, end;
-	struct huge_bootmem_page *m = NULL;
-	void *map;
+	struct hstate *h = m->hstate;
+	unsigned long pfn = PHYS_PFN(virt_to_phys(m));
 
 	if (!READ_ONCE(vmemmap_optimize_enabled))
 		return;
 
-	section_size = (1UL << PA_SECTION_SHIFT);
-
-	list_for_each_entry(m, &huge_boot_pages[nid], list) {
-		struct zone *zone;
-
-		if (!vmemmap_should_optimize_bootmem_page(m))
-			continue;
-
-		nr_pages = pages_per_huge_page(m->hstate);
-		psize = nr_pages << PAGE_SHIFT;
-		paddr = virt_to_phys(m);
-		pfn = PHYS_PFN(paddr);
-		map = pfn_to_page(pfn);
-		start = (unsigned long)map;
-		end = start + nr_pages * sizeof(struct page);
-		zone = pfn_to_zone(nid, pfn);
-
-		BUG_ON(vmemmap_populate_hvo(start, end, huge_page_order(m->hstate),
-					    zone, OPTIMIZED_FOLIO_VMEMMAP_SIZE));
-		memmap_boot_pages_add(OPTIMIZED_FOLIO_VMEMMAP_PAGES);
-
-		pnum = pfn_to_section_nr(pfn);
-		ns = psize / section_size;
-
-		for (i = 0; i < ns; i++) {
-			sparse_init_early_section(nid, map, pnum,
-					SECTION_IS_VMEMMAP_PREINIT);
-			map += section_map_size();
-			pnum++;
-		}
+	if (!hugetlb_vmemmap_optimizable(h))
+		return;
 
+	section_set_order_pfn_range(pfn, pages_per_huge_page(h), huge_page_order(h));
+	if (section_vmemmap_optimizable(__pfn_to_section(pfn)))
 		m->flags |= HUGE_BOOTMEM_HVO;
-	}
 }
-#endif
 
 static const struct ctl_table hugetlb_vmemmap_sysctls[] = {
 	{
@@ -822,27 +732,10 @@ static const struct ctl_table hugetlb_vmemmap_sysctls[] = {
 static int __init hugetlb_vmemmap_init(void)
 {
 	const struct hstate *h;
-	struct zone *zone;
 
 	/* OPTIMIZED_FOLIO_VMEMMAP_SIZE should cover all used struct pages */
 	BUILD_BUG_ON(__NR_USED_SUBPAGE > OPTIMIZED_FOLIO_VMEMMAP_PAGE_STRUCTS);
 
-	for_each_zone(zone) {
-		for (int i = 0; i < NR_OPTIMIZABLE_FOLIO_SIZES; i++) {
-			struct page *tail, *p;
-			unsigned int order;
-
-			tail = zone->vmemmap_tails[i];
-			if (!tail)
-				continue;
-
-			order = i + OPTIMIZABLE_FOLIO_MIN_ORDER;
-			p = page_to_virt(tail);
-			for (int j = 0; j < PAGE_SIZE / sizeof(struct page); j++)
-				init_compound_tail(p + j, NULL, order, zone);
-		}
-	}
-
 	for_each_hstate(h) {
 		if (hugetlb_vmemmap_optimizable(h)) {
 			register_sysctl_init("vm", hugetlb_vmemmap_sysctls);
diff --git a/mm/hugetlb_vmemmap.h b/mm/hugetlb_vmemmap.h
index 66e11893d076..ff8e4c6e9833 100644
--- a/mm/hugetlb_vmemmap.h
+++ b/mm/hugetlb_vmemmap.h
@@ -20,10 +20,7 @@ long hugetlb_vmemmap_restore_folios(const struct hstate *h,
 void hugetlb_vmemmap_optimize_folio(const struct hstate *h, struct folio *folio);
 void hugetlb_vmemmap_optimize_folios(struct hstate *h, struct list_head *folio_list);
 void hugetlb_vmemmap_optimize_bootmem_folios(struct hstate *h, struct list_head *folio_list);
-#ifdef CONFIG_SPARSEMEM_VMEMMAP_PREINIT
-void hugetlb_vmemmap_init_early(int nid);
-#endif
-
+void hugetlb_vmemmap_optimize_bootmem_page(struct huge_bootmem_page *m);
 
 static inline unsigned int hugetlb_vmemmap_size(const struct hstate *h)
 {
@@ -69,13 +66,13 @@ static inline void hugetlb_vmemmap_optimize_bootmem_folios(struct hstate *h,
 {
 }
 
-static inline void hugetlb_vmemmap_init_early(int nid)
+static inline unsigned int hugetlb_vmemmap_optimizable_size(const struct hstate *h)
 {
+	return 0;
 }
 
-static inline unsigned int hugetlb_vmemmap_optimizable_size(const struct hstate *h)
+static inline void hugetlb_vmemmap_optimize_bootmem_page(struct huge_bootmem_page *m)
 {
-	return 0;
 }
 #endif /* CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP */
 
diff --git a/mm/sparse-vmemmap.c b/mm/sparse-vmemmap.c
index 6522c36aac20..d266bcf45b5c 100644
--- a/mm/sparse-vmemmap.c
+++ b/mm/sparse-vmemmap.c
@@ -32,7 +32,6 @@
 #include
 #include
 
-#include "hugetlb_vmemmap.h"
 #include "internal.h"
 
 /*
@@ -381,33 +380,6 @@ static __meminit struct page *vmemmap_get_tail(unsigned int order, struct zone *
 	return tail;
 }
 
-#ifdef CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP
-int __meminit vmemmap_populate_hvo(unsigned long addr, unsigned long end,
-				       unsigned int order, struct zone *zone,
-				       unsigned long headsize)
-{
-	unsigned long maddr;
-	struct page *tail;
-	pte_t *pte;
-	int node = zone_to_nid(zone);
-
-	tail = vmemmap_get_tail(order, zone);
-	if (!tail)
-		return -ENOMEM;
-
-	for (maddr = addr; maddr < addr + headsize; maddr += PAGE_SIZE) {
-		pte = vmemmap_populate_address(maddr, node, NULL, -1);
-		if (!pte)
-			return -ENOMEM;
-	}
-
-	/*
-	 * Reuse the last page struct page mapped above for the rest.
-	 */
-	return vmemmap_populate_range(maddr, end, node, NULL, page_to_pfn(tail));
-}
-#endif
-
 void __weak __meminit vmemmap_set_pmd(pmd_t *pmd, void *p, int node,
 				      unsigned long addr, unsigned long next)
 {
@@ -595,7 +567,6 @@ struct page * __meminit __populate_section_memmap(unsigned long pfn,
  */
 void __init sparse_vmemmap_init_nid_early(int nid)
 {
-	hugetlb_vmemmap_init_early(nid);
 }
 #endif
 
-- 
2.20.1

From nobody Sun Jun 14 19:01:43 2026
From: Muchun Song
To: Andrew Morton, David Hildenbrand, Muchun Song, Oscar Salvador,
 Michael Ellerman, Madhavan Srinivasan
Cc: Lorenzo Stoakes, "Liam R .
 Howlett", Vlastimil Babka, Mike Rapoport, Suren Baghdasaryan,
 Michal Hocko, Nicholas Piggin, Christophe Leroy,
 aneesh.kumar@linux.ibm.com, joao.m.martins@oracle.com, linux-mm@kvack.org,
 linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, Muchun Song
Subject: [PATCH 29/49] mm: extract pfn_to_zone() helper
Date: Sun, 5 Apr 2026 20:52:20 +0800
Message-Id: <20260405125240.2558577-30-songmuchun@bytedance.com>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20260405125240.2558577-1-songmuchun@bytedance.com>
References: <20260405125240.2558577-1-songmuchun@bytedance.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Extract pfn_to_zone() from sparse-vmemmap.c to mm_init.c and use it in
__init_page_from_nid(). This removes duplicated code for finding the zone
for a given PFN.

Signed-off-by: Muchun Song
---
 mm/internal.h       |  1 +
 mm/mm_init.c        | 28 ++++++++++++++++------------
 mm/sparse-vmemmap.c | 14 --------------
 3 files changed, 17 insertions(+), 26 deletions(-)

diff --git a/mm/internal.h b/mm/internal.h
index 27c06250d6b8..b569d8309f4d 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -1350,6 +1350,7 @@ static inline bool deferred_pages_enabled(void)
 }
 #endif /* CONFIG_DEFERRED_STRUCT_PAGE_INIT */
 
+struct zone *pfn_to_zone(unsigned long pfn, int nid);
 void init_deferred_page(unsigned long pfn, int nid);
 
 enum mminit_level {
diff --git a/mm/mm_init.c b/mm/mm_init.c
index b47f65425bc1..e47d08b63154 100644
--- a/mm/mm_init.c
+++ b/mm/mm_init.c
@@ -687,24 +687,28 @@ static __meminit void pageblock_migratetype_init_range(unsigned long pfn,
 	}
 }
 
+struct zone __meminit *pfn_to_zone(unsigned long pfn, int nid)
+{
+	pg_data_t *pgdat = NODE_DATA(nid);
+
+	for (enum zone_type zone_type = 0; zone_type < MAX_NR_ZONES; zone_type++) {
+		struct zone *zone = &pgdat->node_zones[zone_type];
+
+		if (zone_spans_pfn(zone, pfn))
+			return zone;
+	}
+
+	return NULL;
+}
+
 #ifdef CONFIG_DEFERRED_STRUCT_PAGE_INIT
 /*
  * Initialize a reserved page unconditionally, finding its zone first.
  */
 static void __meminit __init_page_from_nid(unsigned long pfn, int nid)
 {
-	pg_data_t *pgdat;
-	int zid;
-
-	pgdat = NODE_DATA(nid);
-
-	for (zid = 0; zid < MAX_NR_ZONES; zid++) {
-		struct zone *zone = &pgdat->node_zones[zid];
-
-		if (zone_spans_pfn(zone, pfn))
-			break;
-	}
-	__init_single_page(pfn_to_page(pfn), pfn, zid, nid);
+	__init_single_page(pfn_to_page(pfn), pfn,
+			   zone_idx(pfn_to_zone(pfn, nid)), nid);
 
 	if (pageblock_aligned(pfn))
 		init_pageblock_migratetype(pfn_to_page(pfn), MIGRATE_MOVABLE,
diff --git a/mm/sparse-vmemmap.c b/mm/sparse-vmemmap.c
index d266bcf45b5c..9da49b0d03f0 100644
--- a/mm/sparse-vmemmap.c
+++ b/mm/sparse-vmemmap.c
@@ -143,20 +143,6 @@ void __meminit vmemmap_verify(pte_t *pte, int node,
 			start, end - 1);
 }
 
-static struct zone __meminit *pfn_to_zone(unsigned long pfn, int nid)
-{
-	pg_data_t *pgdat = NODE_DATA(nid);
-
-	for (enum zone_type zone_type = 0; zone_type < MAX_NR_ZONES; zone_type++) {
-		struct zone *zone = &pgdat->node_zones[zone_type];
-
-		if (zone_spans_pfn(zone, pfn))
-			return zone;
-	}
-
-	return NULL;
-}
-
 static __meminit struct page *vmemmap_get_tail(unsigned int order, struct zone *zone);
 
 static pte_t * __meminit vmemmap_pte_populate(pmd_t *pmd, unsigned long addr, int node,
-- 
2.20.1

From nobody Sun Jun 14 19:01:43 2026
From: Muchun Song
To: Andrew
 Morton, David Hildenbrand, Muchun Song, Oscar Salvador,
 Michael Ellerman, Madhavan Srinivasan
Cc: Lorenzo Stoakes, "Liam R . Howlett", Vlastimil Babka, Mike Rapoport,
 Suren Baghdasaryan, Michal Hocko, Nicholas Piggin, Christophe Leroy,
 aneesh.kumar@linux.ibm.com, joao.m.martins@oracle.com, linux-mm@kvack.org,
 linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, Muchun Song
Subject: [PATCH 30/49] mm/sparse-vmemmap: remove unused SPARSEMEM_VMEMMAP_PREINIT feature
Date: Sun, 5 Apr 2026 20:52:21 +0800
Message-Id: <20260405125240.2558577-31-songmuchun@bytedance.com>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20260405125240.2558577-1-songmuchun@bytedance.com>
References: <20260405125240.2558577-1-songmuchun@bytedance.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Since the bootmem vmemmap optimization has been reimplemented to use the
new early compound vmemmap infrastructure, the old
SPARSEMEM_VMEMMAP_PREINIT feature and its related code (e.g.,
sparse_vmemmap_init_nid_early(), preinited_vmemmap_section()) are no
longer used. Remove them to clean up the code.

Signed-off-by: Muchun Song
---
 arch/x86/Kconfig       |  1 -
 fs/Kconfig             |  1 -
 include/linux/mmzone.h | 25 -------------------------
 mm/Kconfig             |  5 -----
 mm/sparse-vmemmap.c    | 13 -------------
 mm/sparse.c            | 23 ++++++++---------------
 6 files changed, 8 insertions(+), 60 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 99bb5217649a..f19625648f0f 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -148,7 +148,6 @@ config X86
 	select ARCH_WANT_LD_ORPHAN_WARN
 	select ARCH_WANT_OPTIMIZE_DAX_VMEMMAP if X86_64
 	select ARCH_WANT_OPTIMIZE_HUGETLB_VMEMMAP if X86_64
-	select ARCH_WANT_HUGETLB_VMEMMAP_PREINIT if X86_64
 	select ARCH_WANTS_THP_SWAP if X86_64
 	select ARCH_HAS_PARANOID_L1D_FLUSH
 	select ARCH_WANT_IRQS_OFF_ACTIVATE_MM
diff --git a/fs/Kconfig b/fs/Kconfig
index 43cb06de297f..e70aa5f0429a 100644
--- a/fs/Kconfig
+++ b/fs/Kconfig
@@ -278,7 +278,6 @@ config HUGETLB_PAGE_OPTIMIZE_VMEMMAP
 	def_bool HUGETLB_PAGE
 	depends on ARCH_WANT_OPTIMIZE_HUGETLB_VMEMMAP
 	depends on SPARSEMEM_VMEMMAP
-	select SPARSEMEM_VMEMMAP_PREINIT if ARCH_WANT_HUGETLB_VMEMMAP_PREINIT
 
 config HUGETLB_PMD_PAGE_TABLE_SHARING
 	def_bool HUGETLB_PAGE
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 0bd20efac427..75425407e0c4 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -2078,9 +2078,6 @@ enum {
 	SECTION_IS_EARLY_BIT,
 #ifdef CONFIG_ZONE_DEVICE
 	SECTION_TAINT_ZONE_DEVICE_BIT,
-#endif
-#ifdef CONFIG_SPARSEMEM_VMEMMAP_PREINIT
-	SECTION_IS_VMEMMAP_PREINIT_BIT,
 #endif
 	SECTION_MAP_LAST_BIT,
 };
@@ -2092,9 +2089,6 @@ enum {
 #ifdef CONFIG_ZONE_DEVICE
 #define SECTION_TAINT_ZONE_DEVICE BIT(SECTION_TAINT_ZONE_DEVICE_BIT)
 #endif
-#ifdef CONFIG_SPARSEMEM_VMEMMAP_PREINIT
-#define SECTION_IS_VMEMMAP_PREINIT BIT(SECTION_IS_VMEMMAP_PREINIT_BIT)
-#endif
 #define SECTION_MAP_MASK (~(BIT(SECTION_MAP_LAST_BIT) - 1))
 #define SECTION_NID_SHIFT SECTION_MAP_LAST_BIT
 
@@ -2149,24 +2143,6 @@ static inline int online_device_section(const struct mem_section *section)
 }
 #endif
 
-#ifdef CONFIG_SPARSEMEM_VMEMMAP_PREINIT
-static inline int preinited_vmemmap_section(const struct mem_section *section)
-{
-	return (section &&
-		(section->section_mem_map & SECTION_IS_VMEMMAP_PREINIT));
-}
-
-void sparse_vmemmap_init_nid_early(int nid);
-#else
-static inline int preinited_vmemmap_section(const struct mem_section *section)
-{
-	return 0;
-}
-static inline void sparse_vmemmap_init_nid_early(int nid)
-{
-}
-#endif
-
 static inline int online_section_nr(unsigned long nr)
 {
 	return online_section(__nr_to_section(nr));
@@ -2407,7 +2383,6 @@ static inline unsigned long next_present_section_nr(unsigned long section_nr)
 #endif
 
 #else
-#define sparse_vmemmap_init_nid_early(_nid) do {} while (0)
 #define pfn_in_present_section pfn_valid
 #endif /* CONFIG_SPARSEMEM */
 
diff --git a/mm/Kconfig b/mm/Kconfig
index e8bf1e9e6ad9..3cce862088f1 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -410,8 +410,6 @@ config SPARSEMEM_VMEMMAP
 	  pfn_to_page and page_to_pfn operations. This is the most
 	  efficient option when sufficient kernel resources are available.
 
-config SPARSEMEM_VMEMMAP_PREINIT
-	bool
 #
 # Select this config option from the architecture Kconfig, if it is preferred
 # to enable the feature of HugeTLB/dev_dax vmemmap optimization.
@@ -422,9 +420,6 @@ config ARCH_WANT_OPTIMIZE_DAX_VMEMMAP
 config ARCH_WANT_OPTIMIZE_HUGETLB_VMEMMAP
 	bool
 
-config ARCH_WANT_HUGETLB_VMEMMAP_PREINIT
-	bool
-
 config HAVE_MEMBLOCK_PHYS_MAP
 	bool
 
diff --git a/mm/sparse-vmemmap.c b/mm/sparse-vmemmap.c
index 9da49b0d03f0..c35d912a1fef 100644
--- a/mm/sparse-vmemmap.c
+++ b/mm/sparse-vmemmap.c
@@ -543,19 +543,6 @@ struct page * __meminit __populate_section_memmap(unsigned long pfn,
 	return pfn_to_page(pfn);
 }
 
-#ifdef CONFIG_SPARSEMEM_VMEMMAP_PREINIT
-/*
- * This is called just before initializing sections for a NUMA node.
- * Any special initialization that needs to be done before the
- * generic initialization can be done from here. Sections that
- * are initialized in hooks called from here will be skipped by
- * the generic initialization.
- */
-void __init sparse_vmemmap_init_nid_early(int nid)
-{
-}
-#endif
-
 static void subsection_mask_set(unsigned long *map, unsigned long pfn,
 		unsigned long nr_pages)
 {
diff --git a/mm/sparse.c b/mm/sparse.c
index 7779554c5a0c..04c641b97325 100644
--- a/mm/sparse.c
+++ b/mm/sparse.c
@@ -385,27 +385,20 @@ static void __init sparse_init_nid(int nid, unsigned long pnum_begin,
 		panic("The node[%d] usemap allocation failed\n", nid);
 	sparse_buffer_init(map_count * section_map_size(), nid);
 
-	sparse_vmemmap_init_nid_early(nid);
-
 	for_each_present_section_nr(pnum_begin, pnum) {
-		struct mem_section *ms;
 		unsigned long pfn = section_nr_to_pfn(pnum);
+		struct page *map;
 
 		if (pnum >= pnum_end)
 			break;
 
-		ms = __nr_to_section(pnum);
-		if (!preinited_vmemmap_section(ms)) {
-			struct page *map;
-
-			map = __populate_section_memmap(pfn, PAGES_PER_SECTION,
-					nid, NULL, NULL);
-			if (!map)
-				panic("Populate section (%ld) on node[%d] failed\n", pnum, nid);
-			memmap_boot_pages_add(section_vmemmap_pages(pfn, PAGES_PER_SECTION,
-								    NULL, NULL));
-			sparse_init_early_section(nid, map, pnum, 0);
-		}
+		map = __populate_section_memmap(pfn, PAGES_PER_SECTION,
+				nid, NULL, NULL);
+		if (!map)
+			panic("Populate section (%ld) on node[%d] failed\n", pnum, nid);
+		memmap_boot_pages_add(section_vmemmap_pages(pfn, PAGES_PER_SECTION,
+							    NULL, NULL));
+		sparse_init_early_section(nid, map, pnum, 0);
 	}
 	sparse_usage_fini();
 	sparse_buffer_fini();
-- 
2.20.1

From nobody Sun Jun 14 19:01:43 2026
2026 05:56:44 -0700 (PDT) From: Muchun Song To: Andrew Morton , David Hildenbrand , Muchun Song , Oscar Salvador , Michael Ellerman , Madhavan Srinivasan Cc: Lorenzo Stoakes , "Liam R . Howlett" , Vlastimil Babka , Mike Rapoport , Suren Baghdasaryan , Michal Hocko , Nicholas Piggin , Christophe Leroy , aneesh.kumar@linux.ibm.com, joao.m.martins@oracle.com, linux-mm@kvack.org, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, Muchun Song Subject: [PATCH 31/49] mm/hugetlb: remove HUGE_BOOTMEM_HVO flag and simplify pre-HVO logic Date: Sun, 5 Apr 2026 20:52:22 +0800 Message-Id: <20260405125240.2558577-32-songmuchun@bytedance.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20260405125240.2558577-1-songmuchun@bytedance.com> References: <20260405125240.2558577-1-songmuchun@bytedance.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" The pre-HVO feature is used to optimize the vmemmap pages of HugeTLB bootmem pages. Previously, the HUGE_BOOTMEM_HVO flag was used to indicate whether a bootmem page has been pre-optimized. However, we can directly determine if a huge page is pre-optimized by checking its section's optimization status using section_vmemmap_optimizable(). The pre-initialization mechanism of vmemmap has been completely removed in previous patches, making the HUGE_BOOTMEM_HVO flag and its related checks redundant. By directly using section_vmemmap_optimizable(), we can safely remove the HUGE_BOOTMEM_HVO flag, clean up the associated state maintenance in struct huge_bootmem_page, and simplify the bootmem page optimization checks in the hugetlb initialization path. 
Signed-off-by: Muchun Song --- include/linux/hugetlb.h | 5 ++--- mm/hugetlb.c | 16 ++-------------- mm/hugetlb_vmemmap.c | 5 ----- 3 files changed, 4 insertions(+), 22 deletions(-) diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h index 23d95ed6121f..6bedeaee9b79 100644 --- a/include/linux/hugetlb.h +++ b/include/linux/hugetlb.h @@ -695,9 +695,8 @@ struct huge_bootmem_page { struct cma *cma; }; =20 -#define HUGE_BOOTMEM_HVO 0x0001 -#define HUGE_BOOTMEM_ZONES_VALID 0x0002 -#define HUGE_BOOTMEM_CMA 0x0004 +#define HUGE_BOOTMEM_ZONES_VALID BIT(0) +#define HUGE_BOOTMEM_CMA BIT(1) =20 int isolate_or_dissolve_huge_folio(struct folio *folio, struct list_head *= list); int replace_free_hugepage_folios(unsigned long start_pfn, unsigned long en= d_pfn); diff --git a/mm/hugetlb.c b/mm/hugetlb.c index dff94ab7040a..59728e942384 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -3193,11 +3193,6 @@ static void __init hugetlb_folio_init_vmemmap(struct= folio *folio, prep_compound_head(&folio->page, huge_page_order(h)); } =20 -static bool __init hugetlb_bootmem_page_prehvo(struct huge_bootmem_page *m) -{ - return m->flags & HUGE_BOOTMEM_HVO; -} - /* * memblock-allocated pageblocks might not have the migrate type set * if marked with the 'noinit' flag. Set it to the default (MIGRATE_MOVABL= E) @@ -3284,16 +3279,9 @@ static void __init gather_bootmem_prealloc_node(unsi= gned long nid) OPTIMIZED_FOLIO_VMEMMAP_PAGE_STRUCTS); init_new_hugetlb_folio(folio); =20 - if (hugetlb_bootmem_page_prehvo(m)) { - /* - * If pre-HVO was done, just set the - * flag, the HVO code will then skip - * this folio. 
- */ + if (section_vmemmap_optimizable(__pfn_to_section(folio_pfn(folio)))) folio_set_hugetlb_vmemmap_optimized(folio); - section_set_order_pfn_range(folio_pfn(folio), - pages_per_huge_page(h), 0); - } + section_set_order_pfn_range(folio_pfn(folio), folio_nr_pages(folio), 0); =20 if (hugetlb_bootmem_page_earlycma(m)) folio_set_hugetlb_cma(folio); diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c index 8c567b8c67cc..a190b9b94346 100644 --- a/mm/hugetlb_vmemmap.c +++ b/mm/hugetlb_vmemmap.c @@ -711,12 +711,7 @@ void __init hugetlb_vmemmap_optimize_bootmem_page(stru= ct huge_bootmem_page *m) if (!READ_ONCE(vmemmap_optimize_enabled)) return; =20 - if (!hugetlb_vmemmap_optimizable(h)) - return; - section_set_order_pfn_range(pfn, pages_per_huge_page(h), huge_page_order(= h)); - if (section_vmemmap_optimizable(__pfn_to_section(pfn))) - m->flags |=3D HUGE_BOOTMEM_HVO; } =20 static const struct ctl_table hugetlb_vmemmap_sysctls[] =3D { --=20 2.20.1 From nobody Sun Jun 14 19:01:43 2026 Received: from mail-pj1-f54.google.com (mail-pj1-f54.google.com [209.85.216.54]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 642493446CA for ; Sun, 5 Apr 2026 12:56:52 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.216.54 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1775393813; cv=none; b=pphvAFEHO0fIWy2VAEQj4numIhkbW5mvEeG2mOuHMXFwYnUIZYKJTlT/WR41AvbtaB+3uj47P48olHC4H5FsMlNXRN4Hvllagmq6AHIIQ47QThA080okDWG2t6Bbrzl35Puhs/ccA5BgCI0OrYcV0qX4wCoTEfAVNGszwXwIBzg= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1775393813; c=relaxed/simple; bh=1KSOiWKQYYLgH4v2lox1UJms5Hn8Tx8NOkdtVW8ui+Y=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; 
b=dfuy8Mv/OFqB3p/HCcJwhMCEmg5BdsbUpDkhrJBjIabN209iFynSt4NRGiS/lvt44vgsRty8X9nQUrjVyESiU7N9ZnD00799eAWSLkmCZI28rvykalojlsxy1mgcB5YqRRTD9aAIg4wz4kiYS6CzmpiJx2CUq3Yr1fNWovzflLA= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=bytedance.com; spf=pass smtp.mailfrom=bytedance.com; dkim=pass (2048-bit key) header.d=bytedance.com header.i=@bytedance.com header.b=UPGgdrQS; arc=none smtp.client-ip=209.85.216.54 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=bytedance.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=bytedance.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=bytedance.com header.i=@bytedance.com header.b="UPGgdrQS" Received: by mail-pj1-f54.google.com with SMTP id 98e67ed59e1d1-35da1af3e10so2885727a91.3 for ; Sun, 05 Apr 2026 05:56:52 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1775393812; x=1775998612; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=A80UjlR159hWaMO6LOhQ/IhxwfIv1wi1/ZVyLQIlI8g=; b=UPGgdrQSRmCFWXOZTRBwZEIEp5qdysEs0UZwdGwRvUjVb9d9AYQWZSlyAkstdK0BBo TO4ovIuWctfGiRT3lUqqDbziDmoB45z5l5E/7SP2LpQHtFCchCiitSOR5FTilyWBvozd EsM8bGMULc+l+wz4IbJJQT19/cR97qIUIumj/4jkuNF5CtZFfBKa44WDqUg3hrD/I3ei LddQFCGBdKS63O+IJ41IeKU+CjhkqUz6cvN/x8IZlS9WaIVd3dugXjxZ8A3afGChgzOx xcN/ZywGVCLLv9ztlOBSIyYsoqdnYToOI3L7GIt8+mG5ku4ydvwkqQ7+8QMNipVaCkeF ZO3w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1775393812; x=1775998612; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=A80UjlR159hWaMO6LOhQ/IhxwfIv1wi1/ZVyLQIlI8g=; 
b=JB6wRu7Bz8v9JdwrGTvAxv8UxhaSmYXlT8wRQx8+46deCJz9XOxZ7LAxuz9XXhRk6L GP7AkvrnRgKy+qi4M6CMpL8EsZKtnTFIkqk5V60VEow4SEIn3mmHXguz72nHGDBjThzf p/K/zpoLTY8sedKgPjkjdJGT36e1Td+Zjpi90PC6ShEk5OsVpjEaeX1vJicCpfNh2VpK 92oEQ0M1JNfoV1T5mQg80WY6caV1DtNezhLGVz7Jm9RaQ/7P0ev3JhPdU8XFyUehCYfs n5GTT49D1dAJ5ntDYQp+yV9B2KBvpe1haLzAO9HkZoYhf4SM/9z6/H47ZKxtSjPX755u DHBQ== X-Forwarded-Encrypted: i=1; AJvYcCWY/M6u9/2nojFJ8LBWAoIAlmKniMclSroPks7mb2uHRk4Q8tt8+MfDW3EyD0wVGuqOHMvRjhrt3O8uKdk=@vger.kernel.org X-Gm-Message-State: AOJu0YzmlecjN2qjdIY+uF2DtWsaEBnGhx2tqRSoVZ8OxrT4nAtG/kwL RBRKG7TTibCTDwrGQi2W/PzG8NEMqBZTIYICi1cRw44nOYxaWf+cEZgABRN5vN1A2nw= X-Gm-Gg: AeBDietcxYOOJuS9j6aciRWwhRjUhuEmprG0fkuwU0Wkadls8b87ZEWrUY+g/1v13vl 1g4ow3xqtzFQhFx94QsLuuCKDluDGRd1r4VKf+T7S9kckFX3r9qDw2vN2TPcCVv5ff6xGCQQHBu cXjFMJ+hhoqVNKDKX7IxbcgWSSD4iYYKaZCjHgLrz3Wuux0bB/q09VROVEcWMnP7nugVI90xCAM PR2tyTjRqgzHqisaL+nS8KYTG7pHk8OWL0i49u0EkH8AnjfjMiDlHwL6eq3rQct0KZzWI/cI5yx wrj3kT9IfHqPHx2Qy93cZ+PlHEgUowS8BXxpxuJc+yfixLU11yVNNB12ZlyGaXal46B4rwYDS2K RDG/kWa2XVp4iKp8G3EZKEbEZvqeSUgBU1dtxtzAqdNC4rABEPrNe/+v3UjgeFpgcSblBly6AjQ apUWIxrlXuANMyojwr+Xe7QSQZ4nvdAp0VhejPI2I8u40= X-Received: by 2002:a17:90b:3889:b0:34c:2db6:578f with SMTP id 98e67ed59e1d1-35de691abfcmr8431895a91.19.1775393811585; Sun, 05 Apr 2026 05:56:51 -0700 (PDT) Received: from n232-176-004.byted.org ([36.110.163.97]) by smtp.gmail.com with ESMTPSA id 98e67ed59e1d1-35de66b4808sm3748505a91.2.2026.04.05.05.56.44 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 05 Apr 2026 05:56:51 -0700 (PDT) From: Muchun Song To: Andrew Morton , David Hildenbrand , Muchun Song , Oscar Salvador , Michael Ellerman , Madhavan Srinivasan Cc: Lorenzo Stoakes , "Liam R . 
Howlett" , Vlastimil Babka , Mike Rapoport , Suren Baghdasaryan , Michal Hocko , Nicholas Piggin , Christophe Leroy , aneesh.kumar@linux.ibm.com, joao.m.martins@oracle.com, linux-mm@kvack.org, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, Muchun Song Subject: [PATCH 32/49] mm/sparse-vmemmap: consolidate shared tail page allocation Date: Sun, 5 Apr 2026 20:52:23 +0800 Message-Id: <20260405125240.2558577-33-songmuchun@bytedance.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20260405125240.2558577-1-songmuchun@bytedance.com> References: <20260405125240.2558577-1-songmuchun@bytedance.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Currently, both HugeTLB and sparse-vmemmap have their own logic to get or allocate the shared tail page for vmemmap optimization. The HugeTLB version handles runtime concurrency using cmpxchg, while the sparse-vmemmap version (used only at boot time) is simpler. This patch unifies them into a single function in mm/sparse-vmemmap.c. A new function, vmemmap_shared_tail_page(), is introduced: it returns the shared page frame used to map the tail vmemmap pages of a compound page. Furthermore, vmemmap_alloc_block_zero() is used as a safe allocation method for both situations: 1. It calls alloc_pages_node() (via vmemmap_alloc_block()) when slab is available. 2. It falls back to bootmem allocation during early boot. This makes the function suitable for use in both early boot (sparse-vmemmap init) and runtime (HugeTLB HVO) contexts. Unifying the two reduces code duplication and ensures consistent behavior.
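As a user-space illustration only (not kernel code and not part of this patch), the "look up, else allocate, then publish with compare-and-exchange" pattern that the unified function relies on can be sketched with C11 atomics. The names slots[], NR_SLOTS and get_shared() are hypothetical stand-ins for zone->vmemmap_tails[] and vmemmap_shared_tail_page(); calloc()/free() stand in for the kernel allocators.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

#define NR_SLOTS 4	/* hypothetical: number of cached shared objects */

/* Hypothetical stand-in for zone->vmemmap_tails[]; starts out all NULL. */
static _Atomic(int *) slots[NR_SLOTS];

/*
 * Return the shared object for slot idx, allocating it on first use.
 * Returns NULL only if the allocation fails.
 */
static int *get_shared(unsigned int idx)
{
	int *obj, *expected = NULL;

	assert(idx < NR_SLOTS);

	/* Fast path: the object has already been published. */
	obj = atomic_load(&slots[idx]);
	if (obj)
		return obj;

	/* Slow path: build a private, fully initialized candidate. */
	obj = calloc(1, sizeof(*obj));
	if (!obj)
		return NULL;
	*obj = (int)idx;

	/* Publish the candidate only if the slot is still empty. */
	if (!atomic_compare_exchange_strong(&slots[idx], &expected, obj)) {
		/* Lost the race: drop our copy and use the winner's. */
		free(obj);
		obj = expected;
	}

	return obj;
}
```

Two calls such as get_shared(1) followed by get_shared(1) return the same pointer, and a loser of a concurrent race frees its own candidate instead of leaking it. The candidate is initialized before it is published, so readers on the fast path never observe a half-built object.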
Signed-off-by: Muchun Song --- include/linux/mm.h | 1 + mm/hugetlb_vmemmap.c | 28 +--------------------------- mm/sparse-vmemmap.c | 42 +++++++++++++++++++++--------------------- 3 files changed, 23 insertions(+), 48 deletions(-) diff --git a/include/linux/mm.h b/include/linux/mm.h index 93e447468131..15841829b7eb 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -4880,6 +4880,7 @@ int vmemmap_populate(unsigned long start, unsigned lo= ng end, int node, void vmemmap_wrprotect_hvo(unsigned long start, unsigned long end, int nod= e, unsigned long headsize); void vmemmap_populate_print_last(void); +struct page *vmemmap_shared_tail_page(unsigned int order, struct zone *zon= e); #ifdef CONFIG_MEMORY_HOTPLUG void vmemmap_free(unsigned long start, unsigned long end, struct vmem_altmap *altmap); diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c index a190b9b94346..a7ea98fcc18e 100644 --- a/mm/hugetlb_vmemmap.c +++ b/mm/hugetlb_vmemmap.c @@ -493,32 +493,6 @@ static bool vmemmap_should_optimize_folio(const struct= hstate *h, struct folio * return true; } =20 -static struct page *vmemmap_get_tail(unsigned int order, struct zone *zone) -{ - const unsigned int idx =3D order - OPTIMIZABLE_FOLIO_MIN_ORDER; - struct page *tail, *p; - int node =3D zone_to_nid(zone); - - tail =3D READ_ONCE(zone->vmemmap_tails[idx]); - if (likely(tail)) - return tail; - - tail =3D alloc_pages_node(node, GFP_KERNEL | __GFP_ZERO, 0); - if (!tail) - return NULL; - - p =3D page_to_virt(tail); - for (int i =3D 0; i < PAGE_SIZE / sizeof(struct page); i++) - init_compound_tail(p + i, NULL, order, zone); - - if (cmpxchg(&zone->vmemmap_tails[idx], NULL, tail)) { - __free_page(tail); - tail =3D READ_ONCE(zone->vmemmap_tails[idx]); - } - - return tail; -} - static int __hugetlb_vmemmap_optimize_folio(const struct hstate *h, struct folio *folio, struct list_head *vmemmap_pages, @@ -535,7 +509,7 @@ static int __hugetlb_vmemmap_optimize_folio(const struc= t hstate *h, return ret; =20 nid =3D 
folio_nid(folio); - vmemmap_tail =3D vmemmap_get_tail(h->order, folio_zone(folio)); + vmemmap_tail =3D vmemmap_shared_tail_page(h->order, folio_zone(folio)); if (!vmemmap_tail) return -ENOMEM; =20 diff --git a/mm/sparse-vmemmap.c b/mm/sparse-vmemmap.c index c35d912a1fef..309d935fb05e 100644 --- a/mm/sparse-vmemmap.c +++ b/mm/sparse-vmemmap.c @@ -143,8 +143,6 @@ void __meminit vmemmap_verify(pte_t *pte, int node, start, end - 1); } =20 -static __meminit struct page *vmemmap_get_tail(unsigned int order, struct = zone *zone); - static pte_t * __meminit vmemmap_pte_populate(pmd_t *pmd, unsigned long ad= dr, int node, struct vmem_altmap *altmap, unsigned long ptpfn) @@ -160,8 +158,8 @@ static pte_t * __meminit vmemmap_pte_populate(pmd_t *pm= d, unsigned long addr, in unsigned long pfn =3D page_to_pfn((struct page *)addr); const struct mem_section *ms =3D __pfn_to_section(pfn); =20 - page =3D vmemmap_get_tail(section_order(ms), - pfn_to_zone(pfn, node)); + page =3D vmemmap_shared_tail_page(section_order(ms), + pfn_to_zone(pfn, node)); if (!page) return NULL; ptpfn =3D page_to_pfn(page); @@ -338,32 +336,34 @@ void vmemmap_wrprotect_hvo(unsigned long addr, unsign= ed long end, } } =20 -static __meminit struct page *vmemmap_get_tail(unsigned int order, struct = zone *zone) +struct page *vmemmap_shared_tail_page(unsigned int order, struct zone *zon= e) { - struct page *p, *tail; - unsigned int idx; - int node =3D zone_to_nid(zone); + void *addr; + struct page *page; + unsigned int idx =3D order - OPTIMIZABLE_FOLIO_MIN_ORDER; =20 - if (WARN_ON_ONCE(order < OPTIMIZABLE_FOLIO_MIN_ORDER)) - return NULL; - if (WARN_ON_ONCE(order > MAX_FOLIO_ORDER)) + if (WARN_ON_ONCE(idx >=3D ARRAY_SIZE(zone->vmemmap_tails))) return NULL; =20 - idx =3D order - OPTIMIZABLE_FOLIO_MIN_ORDER; - tail =3D zone->vmemmap_tails[idx]; - if (tail) - return tail; + page =3D READ_ONCE(zone->vmemmap_tails[idx]); + if (likely(page)) + return page; =20 - p =3D vmemmap_alloc_block_zero(PAGE_SIZE, node); - if (!p) 
+ addr =3D vmemmap_alloc_block_zero(PAGE_SIZE, zone_to_nid(zone)); + if (!addr) return NULL; + for (int i =3D 0; i < PAGE_SIZE / sizeof(struct page); i++) - init_compound_tail(p + i, NULL, order, zone); + init_compound_tail((struct page *)addr + i, NULL, order, zone); =20 - tail =3D virt_to_page(p); - zone->vmemmap_tails[idx] =3D tail; + page =3D virt_to_page(addr); + if (cmpxchg(&zone->vmemmap_tails[idx], NULL, page) !=3D NULL) { + VM_BUG_ON(!slab_is_available()); + __free_page(page); + page =3D READ_ONCE(zone->vmemmap_tails[idx]); + } =20 - return tail; + return page; } =20 void __weak __meminit vmemmap_set_pmd(pmd_t *pmd, void *p, int node, --=20 2.20.1 From nobody Sun Jun 14 19:01:43 2026 Received: from mail-pj1-f51.google.com (mail-pj1-f51.google.com [209.85.216.51]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 2DED81A285 for ; Sun, 5 Apr 2026 12:56:59 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.216.51 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1775393820; cv=none; b=ZqQs45vxA+fYh24S2IhIbRapK2uCcxIMZMz4pSuAsZ8upCY4M0oDMbR4YlT6fM9emO7+b+oKnd1q3HH8JRyBnnUUnGabhq0jrFVWLlTiVfUN6n4Ng+z40cqm+2SS05T/IMmk4GGFYkQ03+k15kUQmJduvuLqJa5gY2RBrC0HwGY= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1775393820; c=relaxed/simple; bh=U1xYJzyIOmWMhzjlIZoqtYUpDfMQH8hO3g4PPKXrTg0=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=aNsKVtVvdh5XGhPJFUq0A8+EA8VWIMzqJITUEuNt7gBLcawcUHGRDNq35Y/PIxLDqS4BuYG8J0G3FC5MtvJhbFfU39xvHouQT1Joc5GPQignUKv9vaCqVdwVkfBLuF5Wi5kcG6ohrPjow7Pl9kkpCBsiTkjl1Mfb2WGvg/SsFW8= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=bytedance.com; spf=pass smtp.mailfrom=bytedance.com; dkim=pass (2048-bit key) header.d=bytedance.com header.i=@bytedance.com 
header.b=gsrUlybU; arc=none smtp.client-ip=209.85.216.51 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=bytedance.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=bytedance.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=bytedance.com header.i=@bytedance.com header.b="gsrUlybU" Received: by mail-pj1-f51.google.com with SMTP id 98e67ed59e1d1-358ed696623so1323541a91.0 for ; Sun, 05 Apr 2026 05:56:59 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1775393818; x=1775998618; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=6htym6+MXjsCESpIggFByDAobBonwD3F5r0quJbwaas=; b=gsrUlybUq5QypyyqWXfEalROsOklRsVq0QvjLnzHqDYqokT5zFwyn2CKhIglmmX02a +YD5lHmIuMvuGpec0/cRMfNwqKR6MBgKMUWrLtUKgsa+Mm4nberF/k3lJEqklmH9n2W/ Akh6n9UmbUtHKsV2KXDqqxyb6JF9zu9OKBmvLChAOyQvrdb6rAqzl+tBuJB7eUg2d5xJ UELHGaeNeRqa2qBhM3h/4ud/kjjc1vQkr9sJje3eg03VB/2cR7jZ7faWAJtUY5LVzu3L syoZMzYJknRaK+pLPa2JwSYnzRb1g6vzapbEYU7ogoHBi44V1RBIj0O4w+wUp9MaIz+f g6MA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1775393818; x=1775998618; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=6htym6+MXjsCESpIggFByDAobBonwD3F5r0quJbwaas=; b=S2QAJVeqlXaHFf+tlmn6vteptuhvauaOcI0eTE1Xt3nvH6d4LuxKJsikEQy56u5wg1 taWqN13esVh/xr/Fv68V53ZuoRwgd8UjCnF+pXPQSwn/5x18HP/tH18CvrT9rVyLKC1i yww492yRPg4G/tKNKtiLhvAB2ApP2B2tSsRJBYS7dvQMWry9LECO4O7VuYW1MjP7rQ6w +YMW45gND51kz5ATYfBLOcHPoQt+S4WMwOQGVhM8zkfNnW/vM5HubJe1Nbt47Gkqlidc ZIJHDuBLlqU5Cspd0VIPy2uqUZRG7usQOem4nzhaROHlk8pY2FcaYBygWixp35rk9ybf sf8w== X-Forwarded-Encrypted: i=1; 
AJvYcCXCm64ihS+vrPpJmbopV3KZZJ/VnwrkC6NN1UW9n3Aox+XykXsksKIBFv9vZhmg6pgYUDm5HUiqbcG6O98=@vger.kernel.org X-Gm-Message-State: AOJu0YygC6FoPwMKr99b+xQhcRjxEbxoJO1G9cD18bn9+nPrRWDdqt2S xxacbm99b9HhOVaE+UpmNiX0RJy92gEyb7GNqZzA/Zfg1PxpvHHKn7RN7PGM04beBqo= X-Gm-Gg: AeBDieuV8iio5b4/4HSjubt0C8TLNsPsJR1UQy1Q5ZHM586q5YIMTmSlw6LNoVwNphG AhRJsMj/e/mOMkiO/oxQCzoL1J9hK1MY86xRmnFTz0B29zuF8ybY7V4sJOK+4VsNAdhzAmk+N38 qaSHKBX1HtFu8ngICEwkDG2Db5IBmPgAl+AaquJLcTE8LlfUb4dY/TkpG35bom7OEOJNP19CEhc BjdMyQC1KIOt02bh8I/xMe8hpcjCBmqa+vAiNvf5k9RIi4XFKXOASfx5sNwPG6o3SDvWHC4cGfN hpNc011OnTHQ4fDLcP5eWKlEQY6YhiDohP0czqtV/ThpxxFVCYGCBELjZ/TWhTMAdLgFlc/MnyJ +VvCV6xCujocplSiEcb/MN5HOeNhSDmqoWsElVJw1qswOJUL+YRJ71qRsusI4c8dw3FHUX1Addx +Q7XyP3/h75E/HwAfTbXC1M1rmKcy3jwt9yKVxuG+Ojhc= X-Received: by 2002:a17:90b:3809:b0:35d:9560:3efc with SMTP id 98e67ed59e1d1-35de68ce84bmr7967133a91.14.1775393818406; Sun, 05 Apr 2026 05:56:58 -0700 (PDT) Received: from n232-176-004.byted.org ([36.110.163.97]) by smtp.gmail.com with ESMTPSA id 98e67ed59e1d1-35de66b4808sm3748505a91.2.2026.04.05.05.56.52 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 05 Apr 2026 05:56:58 -0700 (PDT) From: Muchun Song To: Andrew Morton , David Hildenbrand , Muchun Song , Oscar Salvador , Michael Ellerman , Madhavan Srinivasan Cc: Lorenzo Stoakes , "Liam R . 
Howlett" , Vlastimil Babka , Mike Rapoport , Suren Baghdasaryan , Michal Hocko , Nicholas Piggin , Christophe Leroy , aneesh.kumar@linux.ibm.com, joao.m.martins@oracle.com, linux-mm@kvack.org, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, Muchun Song Subject: [PATCH 33/49] mm: introduce CONFIG_SPARSEMEM_VMEMMAP_OPTIMIZATION Date: Sun, 5 Apr 2026 20:52:24 +0800 Message-Id: <20260405125240.2558577-34-songmuchun@bytedance.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20260405125240.2558577-1-songmuchun@bytedance.com> References: <20260405125240.2558577-1-songmuchun@bytedance.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Previously, the vmemmap optimization logic in mm/sparse-vmemmap.c was closely tied to HugeTLB via CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP. With recent refactoring (e.g., introducing compound page order to struct mem_section), the core vmemmap optimization machinery has become more generic and can be utilized by other subsystems like DAX. To reflect this generalization and decouple the core optimization logic from HugeTLB-specific configurations, this patch introduces a new common Kconfig option: CONFIG_SPARSEMEM_VMEMMAP_OPTIMIZATION. Both HugeTLB and DAX now select this generic option, ensuring that the shared optimization infrastructure is enabled whenever either subsystem requires it. 
Signed-off-by: Muchun Song --- fs/Kconfig | 1 + include/linux/mmzone.h | 33 ++++++++++++++++++--------------- include/linux/page-flags.h | 5 +---- mm/Kconfig | 5 +++++ 4 files changed, 25 insertions(+), 19 deletions(-) diff --git a/fs/Kconfig b/fs/Kconfig index e70aa5f0429a..9b56a90e13db 100644 --- a/fs/Kconfig +++ b/fs/Kconfig @@ -278,6 +278,7 @@ config HUGETLB_PAGE_OPTIMIZE_VMEMMAP def_bool HUGETLB_PAGE depends on ARCH_WANT_OPTIMIZE_HUGETLB_VMEMMAP depends on SPARSEMEM_VMEMMAP + select SPARSEMEM_VMEMMAP_OPTIMIZATION =20 config HUGETLB_PMD_PAGE_TABLE_SHARING def_bool HUGETLB_PAGE diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h index 75425407e0c4..6edcb0cc46c4 100644 --- a/include/linux/mmzone.h +++ b/include/linux/mmzone.h @@ -102,9 +102,9 @@ * * HVO which is only active if the size of struct page is a power of 2. */ -#define MAX_FOLIO_VMEMMAP_ALIGN \ - (IS_ENABLED(CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP) && \ - is_power_of_2(sizeof(struct page)) ? \ +#define MAX_FOLIO_VMEMMAP_ALIGN \ + (IS_ENABLED(CONFIG_SPARSEMEM_VMEMMAP_OPTIMIZATION) && \ + is_power_of_2(sizeof(struct page)) ? \ MAX_FOLIO_NR_PAGES * sizeof(struct page) : 0) =20 /* The number of vmemmap pages required by a vmemmap-optimized folio. */ @@ -115,7 +115,8 @@ =20 #define __NR_OPTIMIZABLE_FOLIO_SIZES (MAX_FOLIO_ORDER - OPTIMIZABLE_FOLIO= _MIN_ORDER + 1) #define NR_OPTIMIZABLE_FOLIO_SIZES \ - (__NR_OPTIMIZABLE_FOLIO_SIZES > 0 ? __NR_OPTIMIZABLE_FOLIO_SIZES : 0) + ((__NR_OPTIMIZABLE_FOLIO_SIZES > 0 && \ + IS_ENABLED(CONFIG_SPARSEMEM_VMEMMAP_OPTIMIZATION)) ? __NR_OPTIMIZABLE_F= OLIO_SIZES : 0) =20 enum migratetype { MIGRATE_UNMOVABLE, @@ -2014,7 +2015,7 @@ struct mem_section { */ struct page_ext *page_ext; #endif -#ifdef CONFIG_SPARSEMEM_VMEMMAP +#ifdef CONFIG_SPARSEMEM_VMEMMAP_OPTIMIZATION /* * The order of compound pages in this section. 
Typically, the section * holds compound pages of this order; a larger compound page will span @@ -2194,7 +2195,19 @@ static inline bool pfn_section_first_valid(struct me= m_section *ms, unsigned long *pfn =3D (*pfn & PAGE_SECTION_MASK) + (bit * PAGES_PER_SUBSECTION); return true; } +#else +static inline int pfn_section_valid(struct mem_section *ms, unsigned long = pfn) +{ + return 1; +} + +static inline bool pfn_section_first_valid(struct mem_section *ms, unsigne= d long *pfn) +{ + return true; +} +#endif =20 +#ifdef CONFIG_SPARSEMEM_VMEMMAP_OPTIMIZATION static inline void section_set_order(struct mem_section *section, unsigned= int order) { VM_BUG_ON(section->order && order && section->order !=3D order); @@ -2206,16 +2219,6 @@ static inline unsigned int section_order(const struc= t mem_section *section) return section->order; } #else -static inline int pfn_section_valid(struct mem_section *ms, unsigned long = pfn) -{ - return 1; -} - -static inline bool pfn_section_first_valid(struct mem_section *ms, unsigne= d long *pfn) -{ - return true; -} - static inline void section_set_order(struct mem_section *section, unsigned= int order) { } diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h index 0e03d816e8b9..12665b34586c 100644 --- a/include/linux/page-flags.h +++ b/include/linux/page-flags.h @@ -208,14 +208,11 @@ enum pageflags { static __always_inline bool compound_info_has_mask(void) { /* - * Limit mask usage to HugeTLB vmemmap optimization (HVO) where it - * makes a difference. - * * The approach with mask would work in the wider set of conditions, * but it requires validating that struct pages are naturally aligned * for all orders up to the MAX_FOLIO_ORDER, which can be tricky. 
*/ - if (!IS_ENABLED(CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP)) + if (!IS_ENABLED(CONFIG_SPARSEMEM_VMEMMAP_OPTIMIZATION)) return false; =20 return is_power_of_2(sizeof(struct page)); diff --git a/mm/Kconfig b/mm/Kconfig index 3cce862088f1..e81aa77182b2 100644 --- a/mm/Kconfig +++ b/mm/Kconfig @@ -410,12 +410,17 @@ config SPARSEMEM_VMEMMAP pfn_to_page and page_to_pfn operations. This is the most efficient option when sufficient kernel resources are available. =20 +config SPARSEMEM_VMEMMAP_OPTIMIZATION + bool + depends on SPARSEMEM_VMEMMAP + # # Select this config option from the architecture Kconfig, if it is prefer= red # to enable the feature of HugeTLB/dev_dax vmemmap optimization. # config ARCH_WANT_OPTIMIZE_DAX_VMEMMAP bool + select SPARSEMEM_VMEMMAP_OPTIMIZATION if SPARSEMEM_VMEMMAP =20 config ARCH_WANT_OPTIMIZE_HUGETLB_VMEMMAP bool --=20 2.20.1 From nobody Sun Jun 14 19:01:43 2026 Received: from mail-pj1-f47.google.com (mail-pj1-f47.google.com [209.85.216.47]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id EE29B29E117 for ; Sun, 5 Apr 2026 12:57:04 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.216.47 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1775393827; cv=none; b=TJnOoHHqOABmSx2eR+xCIs6jy3h2mF3MkCmlnKJ+9wrIO8N5pcUqqf1MeSwKN5yBXevOAqBHX4KLyJo5vxJAdwB6C5ozO8rwW4P+HeaPMlyhxDLabcivgPL5p0tEEo3c4olrQRUgeBYNeGjHTQd49fPiu0bEdl2Sbn30FgqpTjo= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1775393827; c=relaxed/simple; bh=T7whV11KQHBkOJciU1mZig6wrUwXD7eeOH17vv79YLA=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=JDWi6HmT6Qbx52KHts4mRHVEESTimKapFTTPc5JQEOAKeMgV5OiCh0u9xB1Df4x0DrG3svcpR16FhZSJAwIYqffSVgGMgTQb87qWIQz6aBFWCoFIr7T4JmhSq7muo0ucRJ8QfDQtjr5d+TKuJ8a20jLobWyJWWLK8pzDGGcZFp8= ARC-Authentication-Results: i=1; 
smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=bytedance.com; spf=pass smtp.mailfrom=bytedance.com; dkim=pass (2048-bit key) header.d=bytedance.com header.i=@bytedance.com header.b=FkqIv+x1; arc=none smtp.client-ip=209.85.216.47 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=bytedance.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=bytedance.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=bytedance.com header.i=@bytedance.com header.b="FkqIv+x1" Received: by mail-pj1-f47.google.com with SMTP id 98e67ed59e1d1-358d80f60ccso1933992a91.3 for ; Sun, 05 Apr 2026 05:57:04 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1775393824; x=1775998624; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=dvO4goCvgbIDjzyl1a4aazkWRerXOYwkBjCejKe9WH8=; b=FkqIv+x1NT81ZOzQmMwlKVi1jVu8eJHIXxlK5mnfF1KdxZbDQSAxvO7CNuX6SrEylv 5CD7TJla2Xp36jxijLhScT8HMpTTnKcS1u6vAyOt2wQduCJLNYzAIY1v6ASb0vdC56gY +FlTFTunoCLLkTYY0sw20PxkvYrhXsW3m/64/yJBGdqJOO0zQB2bklY92ROzGQn7QW3c OrlWavmYkL1YSW++m4Os7A3zwCSHdpTaKqZS5Y6XOADR/bFR8y9nwacfgocgFscU7jfn /frxxUeRXHYnRNfKry62YVb57atqlRNXlt38moonf1oNPzdi0uaAiHLptSi6Z+uAyPqH Y44Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1775393824; x=1775998624; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=dvO4goCvgbIDjzyl1a4aazkWRerXOYwkBjCejKe9WH8=; b=VQFQx0zUN+BlYi5yx6FBI466U5ycqAJvSGSmFLEgSJtogrwl9p78n/cGLZoBRygYm6 xk8+xYZ4JLqqPePN5XXRWOYlk5KQnH3onyiycOuUEstjR5To7l+fi60fVnRRHiedp/i0 5rZiG+C7VxmJpeXF3WvzxUbG9OktQz1wMgp/dkDFYE+HyRgfN8tUVt8y1Z4lQgnzq6QZ 
iIQlV8iAxiFtx1Zks6pWhvqg6UvvxOdcpnl5BlljtFfH6B58PciCtuxA7Jilg3XR0qU1 e4tp0NeGh62766YhDSDRaUhlWl5dUdmVqqLpjLWwmDnO9ofncO+BAwN9skvfqwQY05Y3 8oEQ== X-Forwarded-Encrypted: i=1; AJvYcCXS9VIybnu7Xt49esoNkQ1oPfd3NM8f+6kUM6JvDJiBwdAJevBlpOQXBl4AtqEr/VpH+MSytP7ZsfK+mMM=@vger.kernel.org X-Gm-Message-State: AOJu0YxDFxwBId9wm98XSAxoX7xvc14IbyYbF642uaCXvTClfQuLi9zP rOWxBtfhDnmpxAdP4LnhgNmTxAr0rEA5GflNKOG9zdUiY9w3rlAwKmjEjWTqW91+b48= X-Gm-Gg: AeBDietjZtUm/cwNm6KDXk9gaVJRnyyPdLGS3XMr/rLb7VEwQoI/OWKgkRR+OC3VDm9 3vpKGTu4KM33RqmcKHFcTvMrORi0dKe/PHeF2QPTjWe1iMN7ix4V0Zox68pTX8g5F2sfkImFU/u JNEbKdqIZ3QqkcN92o70f6kOpC05S/8F046S8ky+WGESyS3abBgL78slnlljzDn/L8ZMBf7Z7st v2qR9iMTR6SnvoxPEPrxGEgOxmPqYHwObMCZKF92B8LbtXrYMF1B6pZLQpcOz3IvDH8q/mZdQPV bvLHUAagJETcAmYMLuZE1Y2T8OYYs03NkxrSK85BOeosGN0Jax/siBZKJvAQlYn8CF8vUc4anbb JdaqKDUTOf3WjhGdhcptzTFtvMoyV5R57G2XHRbyUfXfbvRdei432udfWiZWI7f2hBGffXNvEJh WvqMbGkn3EnS99j08dssJX5jkBL69QajSCmXiDi66LOus= X-Received: by 2002:a17:90a:d2c7:b0:35c:cba:344f with SMTP id 98e67ed59e1d1-35de68414eemr9240766a91.13.1775393824055; Sun, 05 Apr 2026 05:57:04 -0700 (PDT) Received: from n232-176-004.byted.org ([36.110.163.97]) by smtp.gmail.com with ESMTPSA id 98e67ed59e1d1-35de66b4808sm3748505a91.2.2026.04.05.05.56.58 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 05 Apr 2026 05:57:03 -0700 (PDT) From: Muchun Song To: Andrew Morton , David Hildenbrand , Muchun Song , Oscar Salvador , Michael Ellerman , Madhavan Srinivasan Cc: Lorenzo Stoakes , "Liam R . 
Howlett" , Vlastimil Babka , Mike Rapoport , Suren Baghdasaryan , Michal Hocko , Nicholas Piggin , Christophe Leroy , aneesh.kumar@linux.ibm.com, joao.m.martins@oracle.com, linux-mm@kvack.org, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, Muchun Song Subject: [PATCH 34/49] mm/sparse-vmemmap: switch DAX to use generic vmemmap optimization Date: Sun, 5 Apr 2026 20:52:25 +0800 Message-Id: <20260405125240.2558577-35-songmuchun@bytedance.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20260405125240.2558577-1-songmuchun@bytedance.com> References: <20260405125240.2558577-1-songmuchun@bytedance.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Recent refactoring introduced common vmemmap optimization logic via CONFIG_SPARSEMEM_VMEMMAP_OPTIMIZATION. While HugeTLB already uses it, DAX requires slightly different handling because it needs to preserve 2 vmemmap pages, instead of the 1 page HugeTLB preserves. This patch updates DAX vmemmap optimization to manually allocate the second vmemmap page, and integrates DAX memory setup to correctly set the compound order and allocate/reuse the shared vmemmap tail page. Note that manually allocating the vmemmap page is a temporary solution and will be unified with the logic that HugeTLB relies on in the future. 
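For a rough sense of the difference between preserving 1 and 2 vmemmap pages, the arithmetic can be sketched in user space as below. This is an illustration only, not part of this patch: the 4 KiB page size and 64-byte struct page are assumptions (both vary by architecture and configuration), the helper names are hypothetical, and the one shared tail page per zone and order is ignored.

```c
#include <assert.h>

#define SKETCH_PAGE_SIZE	4096UL	/* assumption: 4 KiB base pages */
#define SKETCH_STRUCT_PAGE_SIZE	64UL	/* assumption: sizeof(struct page) */

/* Number of vmemmap pages that back one compound page of the given order. */
static unsigned long sketch_vmemmap_pages(unsigned int order)
{
	unsigned long bytes;

	assert(order < 32);	/* keep the shift well inside unsigned long */

	bytes = (1UL << order) * SKETCH_STRUCT_PAGE_SIZE;
	return (bytes + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;
}

/* vmemmap pages no longer needed per compound page if 'reserved' are kept. */
static unsigned long sketch_vmemmap_pages_saved(unsigned int order,
						unsigned long reserved)
{
	unsigned long total = sketch_vmemmap_pages(order);

	return total > reserved ? total - reserved : 0;
}
```

Under these assumptions an order-9 (2 MiB) compound page needs 8 vmemmap pages, so keeping 1 page (HugeTLB) frees 7 of them while keeping 2 pages (DAX) frees 6.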
Signed-off-by: Muchun Song --- arch/powerpc/mm/book3s64/radix_pgtable.c | 5 +- mm/memory_hotplug.c | 5 +- mm/mm_init.c | 8 ++- mm/sparse-vmemmap.c | 82 ++++++++++++++---------- 4 files changed, 58 insertions(+), 42 deletions(-) diff --git a/arch/powerpc/mm/book3s64/radix_pgtable.c b/arch/powerpc/mm/boo= k3s64/radix_pgtable.c index dfa2f7dc7e15..ad44883b1030 100644 --- a/arch/powerpc/mm/book3s64/radix_pgtable.c +++ b/arch/powerpc/mm/book3s64/radix_pgtable.c @@ -1124,9 +1124,10 @@ int __meminit radix__vmemmap_populate(unsigned long = start, unsigned long end, in pud_t *pud; pmd_t *pmd; pte_t *pte; + unsigned long pfn =3D page_to_pfn((struct page *)start); =20 - if (vmemmap_can_optimize(altmap, pgmap)) - return vmemmap_populate_compound_pages(page_to_pfn((struct page *)start)= , start, end, node, pgmap); + if (vmemmap_can_optimize(altmap, pgmap) && section_vmemmap_optimizable(__= pfn_to_section(pfn))) + return vmemmap_populate_compound_pages(pfn, start, end, node, pgmap); /* * If altmap is present, Make sure we align the start vmemmap addr * to PAGE_SIZE so that we calculate the correct start_pfn in diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c index 05f5df12d843..28306196c0fe 100644 --- a/mm/memory_hotplug.c +++ b/mm/memory_hotplug.c @@ -551,8 +551,9 @@ void remove_pfn_range_from_zone(struct zone *zone, /* Select all remaining pages up to the next section boundary */ cur_nr_pages =3D min(end_pfn - pfn, SECTION_ALIGN_UP(pfn + 1) - pfn); - page_init_poison(pfn_to_page(pfn), - sizeof(struct page) * cur_nr_pages); + if (!section_vmemmap_optimizable(__pfn_to_section(pfn))) + page_init_poison(pfn_to_page(pfn), + sizeof(struct page) * cur_nr_pages); } =20 /* diff --git a/mm/mm_init.c b/mm/mm_init.c index e47d08b63154..636a0f9644f6 100644 --- a/mm/mm_init.c +++ b/mm/mm_init.c @@ -1069,9 +1069,10 @@ static void __ref __init_zone_device_page(struct pag= e *page, unsigned long pfn, * of an altmap. See vmemmap_populate_compound_pages(). 
*/ static inline unsigned long compound_nr_pages(struct vmem_altmap *altmap, - struct dev_pagemap *pgmap) + struct dev_pagemap *pgmap, + const struct mem_section *ms) { - if (!vmemmap_can_optimize(altmap, pgmap)) + if (!section_vmemmap_optimizable(ms)) return pgmap_vmemmap_nr(pgmap); =20 return VMEMMAP_RESERVE_NR * (PAGE_SIZE / sizeof(struct page)); @@ -1140,7 +1141,8 @@ void __ref memmap_init_zone_device(struct zone *zone, continue; =20 memmap_init_compound(page, pfn, zone_idx, nid, pgmap, - compound_nr_pages(altmap, pgmap)); + compound_nr_pages(altmap, pgmap, + __pfn_to_section(pfn))); } =20 /* diff --git a/mm/sparse-vmemmap.c b/mm/sparse-vmemmap.c index 309d935fb05e..6f959a999d5b 100644 --- a/mm/sparse-vmemmap.c +++ b/mm/sparse-vmemmap.c @@ -353,8 +353,12 @@ struct page *vmemmap_shared_tail_page(unsigned int ord= er, struct zone *zone) if (!addr) return NULL; =20 - for (int i =3D 0; i < PAGE_SIZE / sizeof(struct page); i++) - init_compound_tail((struct page *)addr + i, NULL, order, zone); + for (int i =3D 0; i < PAGE_SIZE / sizeof(struct page); i++) { + page =3D (struct page *)addr + i; + if (zone_is_zone_device(zone)) + __SetPageReserved(page); + init_compound_tail(page, NULL, order, zone); + } =20 page =3D virt_to_page(addr); if (cmpxchg(&zone->vmemmap_tails[idx], NULL, page) !=3D NULL) { @@ -458,23 +462,6 @@ static bool __meminit reuse_compound_section(unsigned = long start_pfn, return !IS_ALIGNED(offset, nr_pages) && nr_pages > PAGES_PER_SUBSECTION; } =20 -static pte_t * __meminit compound_section_tail_page(unsigned long addr) -{ - pte_t *pte; - - addr -=3D PAGE_SIZE; - - /* - * Assuming sections are populated sequentially, the previous section's - * page data can be reused. 
- */ - pte =3D pte_offset_kernel(pmd_off_k(addr), addr); - if (!pte) - return NULL; - - return pte; -} - static int __meminit vmemmap_populate_compound_pages(unsigned long start, unsigned long end, int node, struct dev_pagemap *pgmap) @@ -483,42 +470,62 @@ static int __meminit vmemmap_populate_compound_pages(= unsigned long start, pte_t *pte; int rc; unsigned long start_pfn =3D page_to_pfn((struct page *)start); + const struct mem_section *ms =3D __pfn_to_section(start_pfn); + struct page *tail =3D NULL; =20 - if (reuse_compound_section(start_pfn, pgmap)) { - pte =3D compound_section_tail_page(start); - if (!pte) - return -ENOMEM; + /* This may occur in sub-section scenarios. */ + if (!section_vmemmap_optimizable(ms)) + return vmemmap_populate_range(start, end, node, NULL, -1); =20 - /* - * Reuse the page that was populated in the prior iteration - * with just tail struct pages. - */ +#ifdef CONFIG_ZONE_DEVICE + tail =3D vmemmap_shared_tail_page(section_order(ms), + &NODE_DATA(node)->node_zones[ZONE_DEVICE]); +#endif + if (!tail) + return -ENOMEM; + + if (reuse_compound_section(start_pfn, pgmap)) return vmemmap_populate_range(start, end, node, NULL, - pte_pfn(ptep_get(pte))); - } + page_to_pfn(tail)); =20 size =3D min(end - start, pgmap_vmemmap_nr(pgmap) * sizeof(struct page)); for (addr =3D start; addr < end; addr +=3D size) { unsigned long next, last =3D addr + size; + void *p; =20 /* Populate the head page vmemmap page */ pte =3D vmemmap_populate_address(addr, node, NULL, -1); if (!pte) return -ENOMEM; =20 + /* + * Allocate manually since vmemmap_populate_address() will assume DAX + * only needs 1 vmemmap page to be reserved, however DAX now needs 2 + * vmemmap pages. This is a temporary solution and will be unified + * with HugeTLB in the future. 
+ */ + p =3D vmemmap_alloc_block_buf(PAGE_SIZE, node, NULL); + if (!p) + return -ENOMEM; + /* Populate the tail pages vmemmap page */ next =3D addr + PAGE_SIZE; - pte =3D vmemmap_populate_address(next, node, NULL, -1); + pte =3D vmemmap_populate_address(next, node, NULL, PHYS_PFN(__pa(p))); + /* + * get_page() is called above. Since we are not actually + * reusing it, to avoid a memory leak, we call put_page() here. + */ + put_page(virt_to_page(p)); if (!pte) return -ENOMEM; =20 /* - * Reuse the previous page for the rest of tail pages + * Reuse the shared vmemmap page for the rest of tail pages * See layout diagram in Documentation/mm/vmemmap_dedup.rst */ next +=3D PAGE_SIZE; rc =3D vmemmap_populate_range(next, last, node, NULL, - pte_pfn(ptep_get(pte))); + page_to_pfn(tail)); if (rc) return -ENOMEM; } @@ -744,8 +751,10 @@ static void section_deactivate(unsigned long pfn, unsi= gned long nr_pages, free_map_bootmem(memmap); } =20 - if (empty) + if (empty) { ms->section_mem_map =3D (unsigned long)NULL; + section_set_order(ms, 0); + } } =20 static struct page * __meminit section_activate(int nid, unsigned long pfn, @@ -824,6 +833,9 @@ int __meminit sparse_add_section(int nid, unsigned long= start_pfn, if (ret < 0) return ret; =20 + ms =3D __nr_to_section(section_nr); + if (vmemmap_can_optimize(altmap, pgmap) && nr_pages =3D=3D PAGES_PER_SECT= ION) + section_set_order(ms, pgmap->vmemmap_shift); memmap =3D section_activate(nid, start_pfn, nr_pages, altmap, pgmap); if (IS_ERR(memmap)) return PTR_ERR(memmap); @@ -832,9 +844,9 @@ int __meminit sparse_add_section(int nid, unsigned long= start_pfn, * Poison uninitialized struct pages in order to catch invalid flags * combinations. 
*/ - page_init_poison(memmap, sizeof(struct page) * nr_pages); + if (!section_vmemmap_optimizable(ms)) + page_init_poison(memmap, sizeof(struct page) * nr_pages); =20 - ms =3D __nr_to_section(section_nr); __section_mark_present(ms, section_nr); =20 /* Align memmap to section boundary in the subsection case */ --=20 2.20.1 From nobody Sun Jun 14 19:01:43 2026 Received: from mail-pj1-f45.google.com (mail-pj1-f45.google.com [209.85.216.45]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id C9CAD34678C for ; Sun, 5 Apr 2026 12:57:10 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.216.45 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1775393832; cv=none; b=YBweQLdwZot7qfuaopocNoVJTCDGhkaDtISgn2LFkBEeILBrko7je0pERgGHp90iczNZAQj9aYA0lowBt1d/KkXcVPr/roTayfYAxLb+buQzxsG6y/4F02pgeoiw7xOmQuMzCZzOdTwd8bYPqI//vxAJWoNJBXukoc107FiUn+E= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1775393832; c=relaxed/simple; bh=cjWHUxBImsEhbj99gQlgr1oc+fuRoadjQKyVs9btqrQ=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=f+nTaA0SQ0I+baAbOVz+cQPi48uxoRvyFSVnjWZK0JmXU0Lj/ijjSo/pO+uxSK81lbSyEiOgoBwX+wGDmeMxphcqLdDK2TISls2kNGSBGhNsjdoGMAP0dRqqDGpO2sVGRmun22Hhvonj/MSVNHHoBfTyj4jw64JwR03RC5+akrY= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=bytedance.com; spf=pass smtp.mailfrom=bytedance.com; dkim=pass (2048-bit key) header.d=bytedance.com header.i=@bytedance.com header.b=ZtX8ggdj; arc=none smtp.client-ip=209.85.216.45 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=bytedance.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=bytedance.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) 
header.d=bytedance.com header.i=@bytedance.com header.b="ZtX8ggdj" Received: by mail-pj1-f45.google.com with SMTP id 98e67ed59e1d1-35d9749c26dso2358407a91.2 for ; Sun, 05 Apr 2026 05:57:10 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1775393830; x=1775998630; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=wYLFkhjHZeMhg1MIWamwRyHzBSmpgXxo2vlP0IB+Kzg=; b=ZtX8ggdjcOTOdqQ3+FkupM51pYInOTEH+9AG9QpxAmSaMHkugib1SSa6MSI9PghKG4 PW6q4bSrZij/xP+QMMar/IT3IMCD1DLvcZuwLt31we6EtaEav6uz2xEzemhGw+XHCt6z xXuwq/4FZkz2f0v4mXb93xxGoHC0bUtFFLv5UdIUKTASB8tVkrnmAE9mxvZVeIHKyvFb 9eCXG1cxI1gr5llbydL7ubtKgUE79oPjW2o+3JPisq5GTZcO9mVj+2IAGcZpxnPbJFXm HLO5LU0Vp9yfNQ+An58ofYl+Or4MhRzxP4bUEkiNVqrPOT8Wo0Ot4qmt8VhLcs0but3J sbqQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1775393830; x=1775998630; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=wYLFkhjHZeMhg1MIWamwRyHzBSmpgXxo2vlP0IB+Kzg=; b=P6asBTI87XZvu5emvXdJ+3ws8drUZws95lquf+vRtvjGckaUjr+GeGADO30+LU7Dyx GGkMLNLg/7+tVv95DtxIPqwZ+nFD4SupuJ3sRk8j82BKfRIH6ClDrG/IJLHRW+mnCkQ6 gNf+VkfBwhuQKDBU4nIYUVh9rEicL1JX1rg5CqNnlCd2c3X+XpYwgSFXhEcv4s08kY3b c59uSh6lCxh1v3WxWOzXZteyB9uuSk5VW+Tg786CEF4jdaGbf78TPrn73VncSMoW1S1U 4QrEIqyYVzDXclER5+AEa2mY9FLzfQi67G0nakT9Vtnc3hwtAeFGHAwI4Dnp4hqkvUFx GIsw== X-Forwarded-Encrypted: i=1; AJvYcCWcwI+Nx07GXGR+veaThI0uYouqF3tE1p9QAi1Nwt9aS+QgbpebfH8OtcT/LMJtuHxi3GGdsFC3yAgpm/I=@vger.kernel.org X-Gm-Message-State: AOJu0Yw7Ve7L6/KB4AEwS4xYcuiZKmX+Zc1Rz5jig2CtyRuumHlx/yzI Sgq08VcAfrv3c/i//cBm9Q0CycpfIZiYIyXe1WMbdZGhueud/hfryGtF0QyUER5bkvA= X-Gm-Gg: AeBDietx6RifIpsQi1I163rhpfZo/2qSeoM2NayxaTF3Xi5bLy17k9wOffqeByPriLM 
yYRI7kxGN2YGsY6x4dyZs5VnOxyQ3HCAX7Xm505Avb041dGNKuv6Qfo1ZX/D79jMhHYX4iLohbe nzGw5PV1rPWPuIp/hFrznLjimi7436JNzYwRUUBynmeA0uxqeusV38y9ipkQ5rQaVT4HBK/i+A6 R4T7KKp4xGDX5xyUQZ9Dc5Vrkm/6wwFsMvc7Rcoaleddk+9TbTSrF/6ZTFziP/SVUJ2fIbAusa9 XqpVmF6iz9ruhY/LscYCNKGTdf7rzoud9cNX+vWpSC5G5D3AzTgmTw4+kJDZei4dSrnHd1Pi8rn /8G2O5ZMduIC5l/ops12S1zCcjHbYmmI3pZ4pRXt5yCQAOeforPaKw0vWB6Hu2Oa89Ii3OIoKkp OpohtpNsiIJNhhJ7Y1ELey93BOF56rTYpDV8FfSEa9+pE= X-Received: by 2002:a17:90a:d005:b0:35d:9d28:e897 with SMTP id 98e67ed59e1d1-35de699f483mr8495535a91.28.1775393830042; Sun, 05 Apr 2026 05:57:10 -0700 (PDT) Received: from n232-176-004.byted.org ([36.110.163.97]) by smtp.gmail.com with ESMTPSA id 98e67ed59e1d1-35de66b4808sm3748505a91.2.2026.04.05.05.57.04 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 05 Apr 2026 05:57:09 -0700 (PDT) From: Muchun Song To: Andrew Morton , David Hildenbrand , Muchun Song , Oscar Salvador , Michael Ellerman , Madhavan Srinivasan Cc: Lorenzo Stoakes , "Liam R . Howlett" , Vlastimil Babka , Mike Rapoport , Suren Baghdasaryan , Michal Hocko , Nicholas Piggin , Christophe Leroy , aneesh.kumar@linux.ibm.com, joao.m.martins@oracle.com, linux-mm@kvack.org, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, Muchun Song Subject: [PATCH 35/49] mm/sparse-vmemmap: introduce section zone to struct mem_section Date: Sun, 5 Apr 2026 20:52:26 +0800 Message-Id: <20260405125240.2558577-36-songmuchun@bytedance.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20260405125240.2558577-1-songmuchun@bytedance.com> References: <20260405125240.2558577-1-songmuchun@bytedance.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Currently, HugeTLB obtains zone information for vmemmap optimization through early pfn_to_zone(). 
However, ZONE_DEVICE cannot utilize this approach because its zone information is updated after vmemmap population. To pave the way for unifying DAX and HugeTLB vmemmap optimization, this patch introduces the 'zone' member to struct mem_section. This allows both DAX and HugeTLB to reliably obtain zone information directly from the memory section. Signed-off-by: Muchun Song --- include/linux/mmzone.h | 31 +++++++++++++++++++++++++++---- mm/hugetlb.c | 2 +- mm/hugetlb_vmemmap.c | 4 +++- mm/sparse-vmemmap.c | 19 +++++++++++++------ 4 files changed, 44 insertions(+), 12 deletions(-) diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h index 6edcb0cc46c4..846a7ee1334f 100644 --- a/include/linux/mmzone.h +++ b/include/linux/mmzone.h @@ -2022,6 +2022,7 @@ struct mem_section { * multiple sections. */ unsigned int order; + enum zone_type zone; #endif }; =20 @@ -2214,32 +2215,54 @@ static inline void section_set_order(struct mem_sec= tion *section, unsigned int o section->order =3D order; } =20 +static inline void section_set_zone(struct mem_section *section, enum zone= _type zone) +{ + section->zone =3D zone; +} + static inline unsigned int section_order(const struct mem_section *section) { return section->order; } + +static inline enum zone_type section_zone(const struct mem_section *sectio= n) +{ + return section->zone; +} #else static inline void section_set_order(struct mem_section *section, unsigned= int order) { } =20 +static inline void section_set_zone(struct mem_section *section, enum zone= _type zone) +{ +} + static inline unsigned int section_order(const struct mem_section *section) { return 0; } + +static inline enum zone_type section_zone(const struct mem_section *sectio= n) +{ + return 0; +} #endif =20 -static inline void section_set_order_pfn_range(unsigned long pfn, - unsigned long nr_pages, - unsigned int order) +static inline void section_set_compound_range(unsigned long pfn, + unsigned long nr_pages, + unsigned int order, + enum zone_type zone) { 
unsigned long section_nr =3D pfn_to_section_nr(pfn); =20 if (!IS_ALIGNED(pfn | nr_pages, PAGES_PER_SECTION)) return; =20 - for (int i =3D 0; i < nr_pages / PAGES_PER_SECTION; i++) + for (int i =3D 0; i < nr_pages / PAGES_PER_SECTION; i++) { section_set_order(__nr_to_section(section_nr + i), order); + section_set_zone(__nr_to_section(section_nr + i), zone); + } } =20 static inline bool section_vmemmap_optimizable(const struct mem_section *s= ection) diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 59728e942384..ce5a58aab5c3 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -3281,7 +3281,7 @@ static void __init gather_bootmem_prealloc_node(unsig= ned long nid) =20 if (section_vmemmap_optimizable(__pfn_to_section(folio_pfn(folio)))) folio_set_hugetlb_vmemmap_optimized(folio); - section_set_order_pfn_range(folio_pfn(folio), folio_nr_pages(folio), 0); + section_set_compound_range(folio_pfn(folio), folio_nr_pages(folio), 0, 0= ); =20 if (hugetlb_bootmem_page_earlycma(m)) folio_set_hugetlb_cma(folio); diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c index a7ea98fcc18e..92c95ebdbb9a 100644 --- a/mm/hugetlb_vmemmap.c +++ b/mm/hugetlb_vmemmap.c @@ -681,11 +681,13 @@ void __init hugetlb_vmemmap_optimize_bootmem_page(str= uct huge_bootmem_page *m) { struct hstate *h =3D m->hstate; unsigned long pfn =3D PHYS_PFN(virt_to_phys(m)); + int nid =3D early_pfn_to_nid(PHYS_PFN(__pa(m))); =20 if (!READ_ONCE(vmemmap_optimize_enabled)) return; =20 - section_set_order_pfn_range(pfn, pages_per_huge_page(h), huge_page_order(= h)); + section_set_compound_range(pfn, pages_per_huge_page(h), huge_page_order(h= ), + zone_idx(pfn_to_zone(pfn, nid))); } =20 static const struct ctl_table hugetlb_vmemmap_sysctls[] =3D { diff --git a/mm/sparse-vmemmap.c b/mm/sparse-vmemmap.c index 6f959a999d5b..1867b5dcc73c 100644 --- a/mm/sparse-vmemmap.c +++ b/mm/sparse-vmemmap.c @@ -143,6 +143,11 @@ void __meminit vmemmap_verify(pte_t *pte, int node, start, end - 1); } =20 +static inline struct zone 
*section_to_zone(const struct mem_section *ms, i= nt nid) +{ + return &NODE_DATA(nid)->node_zones[section_zone(ms)]; +} + static pte_t * __meminit vmemmap_pte_populate(pmd_t *pmd, unsigned long ad= dr, int node, struct vmem_altmap *altmap, unsigned long ptpfn) @@ -159,7 +164,7 @@ static pte_t * __meminit vmemmap_pte_populate(pmd_t *pm= d, unsigned long addr, in const struct mem_section *ms =3D __pfn_to_section(pfn); =20 page =3D vmemmap_shared_tail_page(section_order(ms), - pfn_to_zone(pfn, node)); + section_to_zone(ms, node)); if (!page) return NULL; ptpfn =3D page_to_pfn(page); @@ -471,16 +476,14 @@ static int __meminit vmemmap_populate_compound_pages(= unsigned long start, int rc; unsigned long start_pfn =3D page_to_pfn((struct page *)start); const struct mem_section *ms =3D __pfn_to_section(start_pfn); - struct page *tail =3D NULL; + struct page *tail; =20 /* This may occur in sub-section scenarios. */ if (!section_vmemmap_optimizable(ms)) return vmemmap_populate_range(start, end, node, NULL, -1); =20 -#ifdef CONFIG_ZONE_DEVICE tail =3D vmemmap_shared_tail_page(section_order(ms), - &NODE_DATA(node)->node_zones[ZONE_DEVICE]); -#endif + section_to_zone(ms, node)); if (!tail) return -ENOMEM; =20 @@ -834,8 +837,12 @@ int __meminit sparse_add_section(int nid, unsigned lon= g start_pfn, return ret; =20 ms =3D __nr_to_section(section_nr); - if (vmemmap_can_optimize(altmap, pgmap) && nr_pages =3D=3D PAGES_PER_SECT= ION) + if (vmemmap_can_optimize(altmap, pgmap) && nr_pages =3D=3D PAGES_PER_SECT= ION) { section_set_order(ms, pgmap->vmemmap_shift); +#ifdef CONFIG_ZONE_DEVICE + section_set_zone(ms, ZONE_DEVICE); +#endif + } memmap =3D section_activate(nid, start_pfn, nr_pages, altmap, pgmap); if (IS_ERR(memmap)) return PTR_ERR(memmap); --=20 2.20.1 From nobody Sun Jun 14 19:01:43 2026 Received: from mail-pj1-f49.google.com (mail-pj1-f49.google.com [209.85.216.49]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by 
smtp.subspace.kernel.org (Postfix) with ESMTPS id 777E71A285 for ; Sun, 5 Apr 2026 12:57:16 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.216.49 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1775393837; cv=none; b=A9yiI66BOkU0ltZNcgbBZTBjdvwS3DDEvrzKbIgr42jWEa9mU9/pxmpRR2v8Q0El+Ay7BH6EYfSxoaSW0kFmmIG/9xKgIPG8Z872HmEazsE5q0kbmFOd6Xsx3Qp0TJLbd+N+ZLWNNO/jDAc/Jd83n/x8q5M68PARfMYsIXbTbF0= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1775393837; c=relaxed/simple; bh=8Xjv26l+ljt3eoAf5NlZ4aSmVzEh0oeVsAFUygfinP4=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=mBI4EjDpMMNhpK9y6QCEdgq1PxZMamy14ykwaL/mjWPO0MuU5/ih2uJHXHMPwOZOCWn97KtQGR/F9gqauvfJTAbHGb1s9YFc3N/M2DbM86gMh7sJwdOqxNg+jbWXyO+5rcIaLb94WEZ9q1hI8wcxVuQ5qG7FIyf2C/S9NRKt6zo= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=bytedance.com; spf=pass smtp.mailfrom=bytedance.com; dkim=pass (2048-bit key) header.d=bytedance.com header.i=@bytedance.com header.b=XpGbd3Np; arc=none smtp.client-ip=209.85.216.49 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=bytedance.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=bytedance.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=bytedance.com header.i=@bytedance.com header.b="XpGbd3Np" Received: by mail-pj1-f49.google.com with SMTP id 98e67ed59e1d1-35d95017a68so1681228a91.3 for ; Sun, 05 Apr 2026 05:57:16 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1775393836; x=1775998636; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=dhCvzWzEYwqtXTRdhJr7F88/ilDc9KP9wceAmsw7dyk=; 
b=XpGbd3NpYu89BPbJwHfd/qN3xkqVT0khR4Qpu+o+x37pJN3Hxd3tzdQrzWqrg9IUhv iLISTqWPj1c74GzHgCjbubuCBYTZN2GX0VF3uNWfm/qw/IEAOd99qg2pyHHYpQEQZlWL VayEQURYuuldBJyQa24wMQmtUotlv7G5AljE61B6nYwBPM3RBQjbNfLhjbjdGdJ0Hpg3 RiP5RP/xznAA+cirSnHAisjmK54Wn3UX4+JFD34hc4w62tZvccAvDCPRx+rF99DRqBR0 CT6qdk9xD3DIjFdSaNuT+TIyjoOACHI1uaBNQol4yAY0DiCo7qmIJs546KmEhQ8UILG4 BBdw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1775393836; x=1775998636; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=dhCvzWzEYwqtXTRdhJr7F88/ilDc9KP9wceAmsw7dyk=; b=Ws53j2jymwDHAGxwfKn5WKhEYJSELxcd5SzgbdUBye7lb9vfxH3znXIGT3V6yiBXOJ CUbCD06L+75pmJ+VJWlxejW2VTfBFor7O0lHN/0gc01LVeIolgTNstptT8D4gdd9MfAH zFp5bHpFHFqRdqkAI3QQKqvE0u3bkm6LXwUioTmmT7u27akR0jAzl3LD7QR6GF7XQw0h kqxR5LJFSH5tl5At+6nFdrYnrYq64cNuVSmKx5+aRJC4/cFKfAJTG7oyQ0ms7Q+idq9m 4h8jc7Uh/dnm24L6Wf+R5pA4eydMu+dqv+YgcVlpr2kg1EUN51b9zesTnuwoQswiypwg h6fw== X-Forwarded-Encrypted: i=1; AJvYcCVf0LobjOK6W8gGfkgLtCtl+7P73SosEVAwKUcdm5JI7nYdtpqFjcCdj9EHRvBPoFjI2dX5ETUDzNybluw=@vger.kernel.org X-Gm-Message-State: AOJu0Yx3CADaGWn+HpZPmMhgCv6Y8shbVclqpfZ0xNzw4VxhdZ8K+hy9 bLih2damvrhxvuQKo9toOGnPElR1SeAY3/Cej101U3/TFb9K3SddNKOXpHwa2tKVgf0= X-Gm-Gg: AeBDiesCTLoahNmVt5jyETuRyx9TBMjnL3w4Jp7SGfx03121KdPXOlGxH/WBeUM9p8G Eva9imY+5HCLShNRmU5j/DzAPtcFEuz33izwj5W0fdmDEe/tK1Kq8/+CTc+t3nTGCDuY3+vRq1S biEkcIWazzsMVJviKeSq5S/8bNQScqk0J2diXj7fmwexxTWwsvY3fwZ3N1H7XJ0XYLMXLYjpszK y3lMRMd7Kn8/VHXmcdnWNTkO7+ErIP1ShTUanEdBwlaZbj2ZZam37C2hiCRvFSzUThaAJYFxFjd 1NDUOFkqgBlfWTBmOO5zSrwy1l39z8piG7i9+90ea06VwJtx6vhJnvSl+2zKFgdTls3X4PAZyZ8 HIk2XKm+3a/HS+QtGt8G4oYu9RwXcLXOCTUxO/ZGAe1khqDx/+dRMrAd395yTELpT7hHM/iYAqx 6cWJ5LXXwplomhamSJ8xhXZ89+NntzSlefAua01W6wJHo= X-Received: by 2002:a17:90b:3f4d:b0:35d:9482:2233 with SMTP id 98e67ed59e1d1-35de69a66e1mr8904624a91.24.1775393835707; Sun, 05 Apr 2026 05:57:15 -0700 (PDT) Received: from 
n232-176-004.byted.org ([36.110.163.97]) by smtp.gmail.com with ESMTPSA id 98e67ed59e1d1-35de66b4808sm3748505a91.2.2026.04.05.05.57.10 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 05 Apr 2026 05:57:15 -0700 (PDT) From: Muchun Song To: Andrew Morton , David Hildenbrand , Muchun Song , Oscar Salvador , Michael Ellerman , Madhavan Srinivasan Cc: Lorenzo Stoakes , "Liam R . Howlett" , Vlastimil Babka , Mike Rapoport , Suren Baghdasaryan , Michal Hocko , Nicholas Piggin , Christophe Leroy , aneesh.kumar@linux.ibm.com, joao.m.martins@oracle.com, linux-mm@kvack.org, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, Muchun Song Subject: [PATCH 36/49] powerpc/mm: use generic vmemmap_shared_tail_page() in compound vmemmap Date: Sun, 5 Apr 2026 20:52:27 +0800 Message-Id: <20260405125240.2558577-37-songmuchun@bytedance.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20260405125240.2558577-1-songmuchun@bytedance.com> References: <20260405125240.2558577-1-songmuchun@bytedance.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" The ultimate goal is to unify the vmemmap optimization logic for both DAX and HugeTLB. To achieve this, all platforms need to align with the standard HugeTLB approach of using vmemmap_shared_tail_page() for tail page mappings. This patch updates PowerPC to utilize vmemmap_shared_tail_page() to retrieve the pre-allocated and initialized shared tail page, instead of dynamically looking up and constructing the tail page mapping via vmemmap_compound_tail_page(). As a byproduct of this alignment, it greatly simplifies the vmemmap compound page mapping logic in radix_pgtable by removing vmemmap_compound_tail_page() entirely. 
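The "pre-allocated and initialized shared tail page" idea can be sketched in userspace as an install-once cache. This is only an analogy for vmemmap_shared_tail_page(), which earlier in the series publishes the page with cmpxchg() on zone->vmemmap_tails[]. Everything named sketch_* is hypothetical, indexing the slots directly by order is an assumption, and so is the handling of a lost race (free the local copy and use the winner's).

```c
#include <stdatomic.h>
#include <stdlib.h>

#define SKETCH_MAX_ORDER 32
#define SKETCH_PAGE_SIZE 4096

/* Hypothetical stand-in for the per-zone vmemmap_tails[] array. */
struct sketch_zone {
	_Atomic(void *) tails[SKETCH_MAX_ORDER];
};

/*
 * Return the shared tail buffer for `order`, creating it on first use.
 * Returns NULL on allocation failure or if `order` is out of range.
 */
static void *sketch_shared_tail(struct sketch_zone *zone, unsigned int order)
{
	void *cur, *fresh, *expected = NULL;

	if (order >= SKETCH_MAX_ORDER)
		return NULL;

	/* Fast path: some earlier caller already installed the buffer. */
	cur = atomic_load(&zone->tails[order]);
	if (cur)
		return cur;

	/* Zeroed buffer standing in for a page of initialized tail struct pages. */
	fresh = calloc(1, SKETCH_PAGE_SIZE);
	if (!fresh)
		return NULL;

	/* Publish only if the slot is still empty; otherwise keep the winner's. */
	if (!atomic_compare_exchange_strong(&zone->tails[order], &expected, fresh)) {
		free(fresh);
		return expected;
	}

	return fresh;
}
```

Every caller asking for the same order in the same zone receives the same buffer, which is what lets the PowerPC code above drop its own lookup-and-construct path and map all remaining tail vmemmap pages to one page.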
Signed-off-by: Muchun Song --- arch/powerpc/mm/book3s64/radix_pgtable.c | 81 +++--------------------- 1 file changed, 9 insertions(+), 72 deletions(-) diff --git a/arch/powerpc/mm/book3s64/radix_pgtable.c b/arch/powerpc/mm/boo= k3s64/radix_pgtable.c index ad44883b1030..5ce3deb464d5 100644 --- a/arch/powerpc/mm/book3s64/radix_pgtable.c +++ b/arch/powerpc/mm/book3s64/radix_pgtable.c @@ -1256,59 +1256,6 @@ static pte_t * __meminit radix__vmemmap_populate_add= ress(unsigned long addr, int return pte; } =20 -static pte_t * __meminit vmemmap_compound_tail_page(unsigned long addr, - unsigned long pfn_offset, int node) -{ - pgd_t *pgd; - p4d_t *p4d; - pud_t *pud; - pmd_t *pmd; - pte_t *pte; - unsigned long map_addr; - - /* the second vmemmap page which we use for duplication */ - map_addr =3D addr - pfn_offset * sizeof(struct page) + PAGE_SIZE; - pgd =3D pgd_offset_k(map_addr); - p4d =3D p4d_offset(pgd, map_addr); - pud =3D vmemmap_pud_alloc(p4d, node, map_addr); - if (!pud) - return NULL; - pmd =3D vmemmap_pmd_alloc(pud, node, map_addr); - if (!pmd) - return NULL; - if (pmd_leaf(*pmd)) - /* - * The second page is mapped as a hugepage due to a nearby request. - * Force our mapping to page size without deduplication - */ - return NULL; - pte =3D vmemmap_pte_alloc(pmd, node, map_addr); - if (!pte) - return NULL; - /* - * Check if there exist a mapping to the left - */ - if (pte_none(*pte)) { - /* - * Populate the head page vmemmap page. 
- * It can fall in different pmd, hence - * vmemmap_populate_address() - */ - pte =3D radix__vmemmap_populate_address(map_addr - PAGE_SIZE, node, NULL= , NULL); - if (!pte) - return NULL; - /* - * Populate the tail pages vmemmap page - */ - pte =3D radix__vmemmap_pte_populate(pmd, map_addr, node, NULL, NULL); - if (!pte) - return NULL; - vmemmap_verify(pte, node, map_addr, map_addr + PAGE_SIZE); - return pte; - } - return pte; -} - static int __meminit vmemmap_populate_compound_pages(unsigned long start_p= fn, unsigned long start, unsigned long end, int node, @@ -1327,6 +1274,13 @@ static int __meminit vmemmap_populate_compound_pages= (unsigned long start_pfn, pud_t *pud; pmd_t *pmd; pte_t *pte; + const struct mem_section *ms =3D __pfn_to_section(start_pfn); + struct page *tail_page; + + tail_page =3D vmemmap_shared_tail_page(section_order(ms), + &NODE_DATA(node)->node_zones[section_zone(ms)]); + if (!tail_page) + return -ENOMEM; =20 for (addr =3D start; addr < end; addr =3D next) { =20 @@ -1358,9 +1312,8 @@ static int __meminit vmemmap_populate_compound_pages(= unsigned long start_pfn, next =3D addr + PAGE_SIZE; continue; } else { - unsigned long nr_pages =3D pgmap_vmemmap_nr(pgmap); + unsigned long nr_pages =3D 1L << section_order(ms); unsigned long pfn_offset =3D addr_pfn - ALIGN_DOWN(addr_pfn, nr_pages); - pte_t *tail_page_pte; =20 /* * if the address is aligned to huge page size it is the @@ -1386,24 +1339,8 @@ static int __meminit vmemmap_populate_compound_pages= (unsigned long start_pfn, next =3D addr + 2 * PAGE_SIZE; continue; } - /* - * get the 2nd mapping details - * Also create it if that doesn't exist - */ - tail_page_pte =3D vmemmap_compound_tail_page(addr, pfn_offset, node); - if (!tail_page_pte) { - - pte =3D radix__vmemmap_pte_populate(pmd, addr, node, NULL, NULL); - if (!pte) - return -ENOMEM; - vmemmap_verify(pte, node, addr, addr + PAGE_SIZE); - - addr_pfn +=3D 1; - next =3D addr + PAGE_SIZE; - continue; - } =20 - pte =3D 
radix__vmemmap_pte_populate(pmd, addr, node, NULL, pte_page(*ta= il_page_pte)); + pte =3D radix__vmemmap_pte_populate(pmd, addr, node, NULL, tail_page); if (!pte) return -ENOMEM; vmemmap_verify(pte, node, addr, addr + PAGE_SIZE); --=20 2.20.1 From nobody Sun Jun 14 19:01:43 2026 Received: from mail-pj1-f46.google.com (mail-pj1-f46.google.com [209.85.216.46]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 0F7E0342CB0 for ; Sun, 5 Apr 2026 12:57:24 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.216.46 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1775393846; cv=none; b=BfB0MZAB6ovVmfMBaSQFEAwjq16XR+IzIUPmDuyJ6Fq/E9lHcjYMuB1X0PlTi7gLgwqT6dyzxbydcQmF7RZp/duvIl7xzoGy01LPuuPQsKt3q+E16jz4SB0S3MR/qAj/Vxiv6//kUGTCEjZt33vMJWGtjbMuAs4xKjP/obZmE6Y= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1775393846; c=relaxed/simple; bh=tBPSTOd19ASXDp2tcHZEZoKmLbPCx8yK1RZCtuOemYo=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=I9lJwr1PZDWKkay2E8syZkrp4tiSIWFkdqMwFDyAaC7L5Vb6rj0DAOALx3F1nBFUxl1V9JFpP9m85qYqmc//flBsJCIVdfOeDZ4bYmfIbtvMvXR3V1Yq6hRNaH1zVwymUuRkQMaCDFy371F1c7erXKD97aC7diuojt53ozKJQNA= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=bytedance.com; spf=pass smtp.mailfrom=bytedance.com; dkim=pass (2048-bit key) header.d=bytedance.com header.i=@bytedance.com header.b=lKfnFMJX; arc=none smtp.client-ip=209.85.216.46 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=bytedance.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=bytedance.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=bytedance.com header.i=@bytedance.com header.b="lKfnFMJX" Received: by 
mail-pj1-f46.google.com with SMTP id 98e67ed59e1d1-35691a231a7so1644466a91.3 for ; Sun, 05 Apr 2026 05:57:24 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1775393844; x=1775998644; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=Un8QJIOHiLQFUGTVr1x/7L06qAEVN23h4FxFsc9SMV0=; b=lKfnFMJXxo2nyYAdyKnUzpZx6d7PdMyD/D94EeSEh9VSfm26j96XuSTfJwv86YEyF6 3h81in+wquTVLVyNn/51fRmW9xiMXH2UkiYdZFtynGTbhCT8mMtgCViHxtKtVoe+gMw7 H2Pm1OjLgVAO/Ae7G8EPR62r01+uZKgsVRM3pIr8g4rb5l/AFVRQowNucgotxKsdvNgZ BNH6/ypqLqp5t7SuoC4an8QPeascKyafvVQ25fsYw4Yiwbwx0Zrllo8ufdHo8yxqABYL 7YMZr/zEMcShwQ3XWrDcwIPKipF2oWMhUaQNxVmhlaQeWhqtALaxRyR+rSwFwwrbC3ZW UyeA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1775393844; x=1775998644; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=Un8QJIOHiLQFUGTVr1x/7L06qAEVN23h4FxFsc9SMV0=; b=ktAOLkcM1xOIEjiMA7KFLuM+221cdb+ShATYTweIcsWSAFPfTr58sjKDOogKnO8djP P8dWdX0eoV6gCN4llSAIBV0aVC9FljtdDBJpjse4v66GZFJZdsW5lY1Q2lIAgRNgBNMD or2PszzsC0NeSgc96vyazoulfVV+s7Zn6hNqzI8mSSHLOUwN70+k6tyDNLpG6TCh/kWa gqiL9scn2MkqtnsR2wxEKtw/Nhq5cL0SsJISoQciRNB/kkCbNU8xOuFMaaTLgUgzyxVX 0+FdWL9AQMq1CHB1Pp8sbeQKkiYWhy96DW8duxeATPvFsVadVxsq3fPsz4V13F5bVim6 /ESQ== X-Forwarded-Encrypted: i=1; AJvYcCWIRhLQDlB4LoiHgAwne+Yrs9lljrzg9/qsN2Clj19Rxyhpv6t/cn2O+ktYgEvYO8wVO6SJtLzYQBAsoCI=@vger.kernel.org X-Gm-Message-State: AOJu0YzutkTHfn0EoVrg6bAZf8F2fW0+gDwycjsChOFIIKsqgvz2LQLT yx2eCZvmW45N4FU5503W7Z7FSufRCUhrWo0ddw44hzzA0g0OiQTmViEUXy6+t+DOorg= X-Gm-Gg: AeBDiesXj8Fdx3ZbnUAmFxC9yQWSYCy+fXqD9o3coogsVjgvyY6VM/UQsnjLw9GSuil xRI3rlMwqxLNlM5JgfKaEkrkIsOWYZVSDe0DoE917qJK5OBPN7PW1VTVBfDvKG4hVU6N0dlvNzL y6j+cQ1HwN4nP3d8EPvuA9moKlv3J/rwG0DDMh9Z3Ya+WZ19frXvrYRrs/s095wee7V0Xcc2Dyj 
uOSO8SEfwEuDvlUEBZtk5YeZ4+WLBxJR2pVrQuAv+L1rSsoE5V84uVrkOhsmGX1/OG5QSvGPpbY qWl5dViuF5gnzduHwiZvQQsCaboBXvM6CbSvg2scgjGZih3U9V06BslyNDbcnssLLWR4u6YsofN uPYdQg1jopT7SelUVsRWkOT5hmvp7W4jrnh478MziNoRfQx2WFrryX9IFz1YpC2lxWzKFqpYBrF dNHYcElk0XVvqAGmrLzsGVS0IYVKvhmG8sVLMrehMRhCg= X-Received: by 2002:a17:90b:4fc4:b0:35d:a374:b385 with SMTP id 98e67ed59e1d1-35de6a1951bmr7999215a91.29.1775393844264; Sun, 05 Apr 2026 05:57:24 -0700 (PDT) Received: from n232-176-004.byted.org ([36.110.163.97]) by smtp.gmail.com with ESMTPSA id 98e67ed59e1d1-35de66b4808sm3748505a91.2.2026.04.05.05.57.16 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 05 Apr 2026 05:57:23 -0700 (PDT) From: Muchun Song To: Andrew Morton , David Hildenbrand , Muchun Song , Oscar Salvador , Michael Ellerman , Madhavan Srinivasan Cc: Lorenzo Stoakes , "Liam R . Howlett" , Vlastimil Babka , Mike Rapoport , Suren Baghdasaryan , Michal Hocko , Nicholas Piggin , Christophe Leroy , aneesh.kumar@linux.ibm.com, joao.m.martins@oracle.com, linux-mm@kvack.org, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, Muchun Song Subject: [PATCH 37/49] mm/sparse-vmemmap: unify DAX and HugeTLB vmemmap optimization Date: Sun, 5 Apr 2026 20:52:28 +0800 Message-Id: <20260405125240.2558577-38-songmuchun@bytedance.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20260405125240.2558577-1-songmuchun@bytedance.com> References: <20260405125240.2558577-1-songmuchun@bytedance.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" The ultimate goal of the recent refactoring series is to unify the vmemmap optimization logic for both DAX and HugeTLB under a common framework (CONFIG_SPARSEMEM_VMEMMAP_OPTIMIZATION). 
A key breakthrough in this unification is that DAX now only requires 1 vmemmap page to be preserved (the head page), aligning its requirements exactly with HugeTLB. Previously, DAX optimization relied on a dedicated upper-level function, vmemmap_populate_compound_pages, which handled the manual allocation of the head page AND the first tail page before reusing the shared tail page for the rest. Because DAX and HugeTLB are now perfectly aligned in their optimization requirements (1 reserved page + reused shared tail pages), this patch eliminates the dedicated compound page mapping loop entirely. Instead, it pushes the optimization decision down to the lowest level in vmemmap_pte_populate. Now, all mapping requests flow through the standard vmemmap_populate_basepages. Signed-off-by: Muchun Song --- arch/powerpc/mm/book3s64/radix_pgtable.c | 13 +- include/linux/mm.h | 2 +- mm/mm_init.c | 2 +- mm/sparse-vmemmap.c | 185 +++++------------------ 4 files changed, 40 insertions(+), 162 deletions(-) diff --git a/arch/powerpc/mm/book3s64/radix_pgtable.c b/arch/powerpc/mm/boo= k3s64/radix_pgtable.c index 5ce3deb464d5..714d5cdc10ec 100644 --- a/arch/powerpc/mm/book3s64/radix_pgtable.c +++ b/arch/powerpc/mm/book3s64/radix_pgtable.c @@ -1326,17 +1326,8 @@ static int __meminit vmemmap_populate_compound_pages= (unsigned long start_pfn, return -ENOMEM; vmemmap_verify(pte, node, addr, addr + PAGE_SIZE); =20 - /* - * Populate the tail pages vmemmap page - * It can fall in different pmd, hence - * vmemmap_populate_address() - */ - pte =3D radix__vmemmap_populate_address(addr + PAGE_SIZE, node, NULL, = NULL); - if (!pte) - return -ENOMEM; - - addr_pfn +=3D 2; - next =3D addr + 2 * PAGE_SIZE; + addr_pfn +=3D 1; + next =3D addr + PAGE_SIZE; continue; } =20 diff --git a/include/linux/mm.h b/include/linux/mm.h index 15841829b7eb..bceef0dc578b 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -4912,7 +4912,7 @@ static inline void vmem_altmap_free(struct vmem_altma= p *altmap, } 
#endif =20 -#define VMEMMAP_RESERVE_NR 2 +#define VMEMMAP_RESERVE_NR OPTIMIZED_FOLIO_VMEMMAP_PAGES #ifdef CONFIG_ARCH_WANT_OPTIMIZE_DAX_VMEMMAP static inline bool __vmemmap_can_optimize(struct vmem_altmap *altmap, struct dev_pagemap *pgmap) diff --git a/mm/mm_init.c b/mm/mm_init.c index 636a0f9644f6..6b23b5f02544 100644 --- a/mm/mm_init.c +++ b/mm/mm_init.c @@ -1066,7 +1066,7 @@ static void __ref __init_zone_device_page(struct page= *page, unsigned long pfn, * initialize is a lot smaller that the total amount of struct pages being * mapped. This is a paired / mild layering violation with explicit knowle= dge * of how the sparse_vmemmap internals handle compound pages in the lack - * of an altmap. See vmemmap_populate_compound_pages(). + * of an altmap. */ static inline unsigned long compound_nr_pages(struct vmem_altmap *altmap, struct dev_pagemap *pgmap, diff --git a/mm/sparse-vmemmap.c b/mm/sparse-vmemmap.c index 1867b5dcc73c..fd7b0e1e5aba 100644 --- a/mm/sparse-vmemmap.c +++ b/mm/sparse-vmemmap.c @@ -152,46 +152,40 @@ static pte_t * __meminit vmemmap_pte_populate(pmd_t *= pmd, unsigned long addr, in struct vmem_altmap *altmap, unsigned long ptpfn) { - pte_t *pte =3D pte_offset_kernel(pmd, addr); - - if (pte_none(ptep_get(pte))) { - pte_t entry; - - if (vmemmap_page_optimizable((struct page *)addr) && - ptpfn =3D=3D (unsigned long)-1) { - struct page *page; - unsigned long pfn =3D page_to_pfn((struct page *)addr); - const struct mem_section *ms =3D __pfn_to_section(pfn); - - page =3D vmemmap_shared_tail_page(section_order(ms), - section_to_zone(ms, node)); - if (!page) - return NULL; - ptpfn =3D page_to_pfn(page); - } + pte_t entry, *pte =3D pte_offset_kernel(pmd, addr); =20 - if (ptpfn =3D=3D (unsigned long)-1) { - void *p =3D vmemmap_alloc_block_buf(PAGE_SIZE, node, altmap); - - if (!p) - return NULL; - ptpfn =3D PHYS_PFN(__pa(p)); - } else { - /* - * When a PTE/PMD entry is freed from the init_mm - * there's a free_pages() call to this page allocated - * above. 
Thus this get_page() is paired with the - * put_page_testzero() on the freeing path. - * This can only called by certain ZONE_DEVICE path, - * and through vmemmap_populate_compound_pages() when - * slab is available. - */ - if (slab_is_available()) - get_page(pfn_to_page(ptpfn)); - } - entry =3D pfn_pte(ptpfn, PAGE_KERNEL); - set_pte_at(&init_mm, addr, pte, entry); + if (!pte_none(ptep_get(pte))) + return pte; + + /* See layout diagram in Documentation/mm/vmemmap_dedup.rst. */ + if (vmemmap_page_optimizable((struct page *)addr)) { + struct page *page; + unsigned long pfn =3D page_to_pfn((struct page *)addr); + const struct mem_section *ms =3D __pfn_to_section(pfn); + + page =3D vmemmap_shared_tail_page(section_order(ms), + section_to_zone(ms, node)); + if (!page) + return NULL; + + /* + * When a PTE entry is freed, a free_pages() call occurs. This + * get_page() pairs with put_page_testzero() on the freeing + * path. This can only occur when slab is available. + */ + if (slab_is_available()) + get_page(page); + ptpfn =3D page_to_pfn(page); + } else { + void *p =3D vmemmap_alloc_block_buf(PAGE_SIZE, node, altmap); + + if (!p) + return NULL; + ptpfn =3D PHYS_PFN(__pa(p)); } + entry =3D pfn_pte(ptpfn, PAGE_KERNEL); + set_pte_at(&init_mm, addr, pte, entry); + return pte; } =20 @@ -287,17 +281,15 @@ static pte_t * __meminit vmemmap_populate_address(uns= igned long addr, int node, return pte; } =20 -static int __meminit vmemmap_populate_range(unsigned long start, - unsigned long end, int node, - struct vmem_altmap *altmap, - unsigned long ptpfn) +int __meminit vmemmap_populate_basepages(unsigned long start, unsigned lon= g end, + int node, struct vmem_altmap *altmap, + struct dev_pagemap *pgmap) { unsigned long addr =3D start; pte_t *pte; =20 for (; addr < end; addr +=3D PAGE_SIZE) { - pte =3D vmemmap_populate_address(addr, node, altmap, - ptpfn); + pte =3D vmemmap_populate_address(addr, node, altmap, -1); if (!pte) return -ENOMEM; } @@ -305,19 +297,6 @@ static int 
__meminit vmemmap_populate_range(unsigned l= ong start, return 0; } =20 -static int __meminit vmemmap_populate_compound_pages(unsigned long start, - unsigned long end, int node, - struct dev_pagemap *pgmap); - -int __meminit vmemmap_populate_basepages(unsigned long start, unsigned lon= g end, - int node, struct vmem_altmap *altmap, - struct dev_pagemap *pgmap) -{ - if (vmemmap_can_optimize(altmap, pgmap)) - return vmemmap_populate_compound_pages(start, end, node, pgmap); - return vmemmap_populate_range(start, end, node, altmap, -1); -} - /* * Write protect the mirrored tail page structs for HVO. This will be * called from the hugetlb code when gathering and initializing the @@ -397,9 +376,6 @@ int __meminit vmemmap_populate_hugepages(unsigned long = start, unsigned long end, pud_t *pud; pmd_t *pmd; =20 - if (vmemmap_can_optimize(altmap, pgmap)) - return vmemmap_populate_compound_pages(start, end, node, pgmap); - for (addr =3D start; addr < end; addr =3D next) { unsigned long pfn =3D page_to_pfn((struct page *)addr); const struct mem_section *ms =3D __pfn_to_section(pfn); @@ -447,95 +423,6 @@ int __meminit vmemmap_populate_hugepages(unsigned long= start, unsigned long end, return 0; } =20 -/* - * For compound pages bigger than section size (e.g. x86 1G compound - * pages with 2M subsection size) fill the rest of sections as tail - * pages. - * - * Note that memremap_pages() resets @nr_range value and will increment - * it after each range successful onlining. Thus the value or @nr_range - * at section memmap populate corresponds to the in-progress range - * being onlined here. 
- */ -static bool __meminit reuse_compound_section(unsigned long start_pfn, - struct dev_pagemap *pgmap) -{ - unsigned long nr_pages =3D pgmap_vmemmap_nr(pgmap); - unsigned long offset =3D start_pfn - - PHYS_PFN(pgmap->ranges[pgmap->nr_range].start); - - return !IS_ALIGNED(offset, nr_pages) && nr_pages > PAGES_PER_SUBSECTION; -} - -static int __meminit vmemmap_populate_compound_pages(unsigned long start, - unsigned long end, int node, - struct dev_pagemap *pgmap) -{ - unsigned long size, addr; - pte_t *pte; - int rc; - unsigned long start_pfn =3D page_to_pfn((struct page *)start); - const struct mem_section *ms =3D __pfn_to_section(start_pfn); - struct page *tail; - - /* This may occur in sub-section scenarios. */ - if (!section_vmemmap_optimizable(ms)) - return vmemmap_populate_range(start, end, node, NULL, -1); - - tail =3D vmemmap_shared_tail_page(section_order(ms), - section_to_zone(ms, node)); - if (!tail) - return -ENOMEM; - - if (reuse_compound_section(start_pfn, pgmap)) - return vmemmap_populate_range(start, end, node, NULL, - page_to_pfn(tail)); - - size =3D min(end - start, pgmap_vmemmap_nr(pgmap) * sizeof(struct page)); - for (addr =3D start; addr < end; addr +=3D size) { - unsigned long next, last =3D addr + size; - void *p; - - /* Populate the head page vmemmap page */ - pte =3D vmemmap_populate_address(addr, node, NULL, -1); - if (!pte) - return -ENOMEM; - - /* - * Allocate manually since vmemmap_populate_address() will assume DAX - * only needs 1 vmemmap page to be reserved, however DAX now needs 2 - * vmemmap pages. This is a temporary solution and will be unified - * with HugeTLB in the future. - */ - p =3D vmemmap_alloc_block_buf(PAGE_SIZE, node, NULL); - if (!p) - return -ENOMEM; - - /* Populate the tail pages vmemmap page */ - next =3D addr + PAGE_SIZE; - pte =3D vmemmap_populate_address(next, node, NULL, PHYS_PFN(__pa(p))); - /* - * get_page() is called above. 
Since we are not actually - * reusing it, to avoid a memory leak, we call put_page() here. - */ - put_page(virt_to_page(p)); - if (!pte) - return -ENOMEM; - - /* - * Reuse the shared vmemmap page for the rest of tail pages - * See layout diagram in Documentation/mm/vmemmap_dedup.rst - */ - next +=3D PAGE_SIZE; - rc =3D vmemmap_populate_range(next, last, node, NULL, - page_to_pfn(tail)); - if (rc) - return -ENOMEM; - } - - return 0; -} - struct page * __meminit __populate_section_memmap(unsigned long pfn, unsigned long nr_pages, int nid, struct vmem_altmap *altmap, struct dev_pagemap *pgmap) --=20 2.20.1
From nobody Sun Jun 14 19:01:43 2026
From: Muchun Song To: Andrew Morton , David Hildenbrand , Muchun Song , Oscar Salvador , Michael Ellerman , Madhavan Srinivasan Cc: Lorenzo Stoakes , "Liam R .
Howlett" , Vlastimil Babka , Mike Rapoport , Suren Baghdasaryan , Michal Hocko , Nicholas Piggin , Christophe Leroy , aneesh.kumar@linux.ibm.com, joao.m.martins@oracle.com, linux-mm@kvack.org, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, Muchun Song Subject: [PATCH 38/49] mm/sparse-vmemmap: remap the shared tail pages as read-only Date: Sun, 5 Apr 2026 20:52:29 +0800 Message-Id: <20260405125240.2558577-39-songmuchun@bytedance.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20260405125240.2558577-1-songmuchun@bytedance.com> References: <20260405125240.2558577-1-songmuchun@bytedance.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" HugeTLB enforces Read-Only mappings for HVO to prevent illegal write operations, whereas DAX currently does not, which introduces potential security risks. Now that we are unifying the HVO logic for HugeTLB and DAX, we can remap the shared tail pages as read-only directly in vmemmap_pte_populate(). This ensures that both HugeTLB and DAX benefit from the read-only protection of vmemmap tail pages right from the point of mapping establishment. Signed-off-by: Muchun Song --- mm/sparse-vmemmap.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/mm/sparse-vmemmap.c b/mm/sparse-vmemmap.c index fd7b0e1e5aba..c70275717054 100644 --- a/mm/sparse-vmemmap.c +++ b/mm/sparse-vmemmap.c @@ -176,14 +176,17 @@ static pte_t * __meminit vmemmap_pte_populate(pmd_t *= pmd, unsigned long addr, in if (slab_is_available()) get_page(page); ptpfn =3D page_to_pfn(page); + + /* Remap shared tail page read-only to catch illegal writes. 
*/ + entry =3D pfn_pte(ptpfn, PAGE_KERNEL_RO); } else { void *p =3D vmemmap_alloc_block_buf(PAGE_SIZE, node, altmap); =20 if (!p) return NULL; ptpfn =3D PHYS_PFN(__pa(p)); + entry =3D pfn_pte(ptpfn, PAGE_KERNEL); } - entry =3D pfn_pte(ptpfn, PAGE_KERNEL); set_pte_at(&init_mm, addr, pte, entry); =20 return pte; --=20 2.20.1
From nobody Sun Jun 14 19:01:43 2026
From: Muchun Song To: Andrew Morton , David Hildenbrand , Muchun Song , Oscar Salvador , Michael Ellerman , Madhavan Srinivasan Cc: Lorenzo Stoakes , "Liam R . Howlett" , Vlastimil Babka , Mike Rapoport , Suren Baghdasaryan , Michal Hocko , Nicholas Piggin , Christophe Leroy , aneesh.kumar@linux.ibm.com, joao.m.martins@oracle.com, linux-mm@kvack.org, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, Muchun Song Subject: [PATCH 39/49] mm/sparse-vmemmap: remove unused ptpfn argument Date: Sun, 5 Apr 2026 20:52:30 +0800 Message-Id: <20260405125240.2558577-40-songmuchun@bytedance.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20260405125240.2558577-1-songmuchun@bytedance.com> References: <20260405125240.2558577-1-songmuchun@bytedance.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" The ptpfn argument is unused as it is assigned unconditionally before being used in vmemmap_pte_populate(). Let's remove it.
Signed-off-by: Muchun Song --- mm/sparse-vmemmap.c | 16 ++++++---------- 1 file changed, 6 insertions(+), 10 deletions(-) diff --git a/mm/sparse-vmemmap.c b/mm/sparse-vmemmap.c index c70275717054..36e5bcb5ba9b 100644 --- a/mm/sparse-vmemmap.c +++ b/mm/sparse-vmemmap.c @@ -149,8 +149,7 @@ static inline struct zone *section_to_zone(const struct= mem_section *ms, int nid } =20 static pte_t * __meminit vmemmap_pte_populate(pmd_t *pmd, unsigned long ad= dr, int node, - struct vmem_altmap *altmap, - unsigned long ptpfn) + struct vmem_altmap *altmap) { pte_t entry, *pte =3D pte_offset_kernel(pmd, addr); =20 @@ -175,17 +174,15 @@ static pte_t * __meminit vmemmap_pte_populate(pmd_t *= pmd, unsigned long addr, in */ if (slab_is_available()) get_page(page); - ptpfn =3D page_to_pfn(page); =20 /* Remap shared tail page read-only to catch illegal writes. */ - entry =3D pfn_pte(ptpfn, PAGE_KERNEL_RO); + entry =3D pfn_pte(page_to_pfn(page), PAGE_KERNEL_RO); } else { void *p =3D vmemmap_alloc_block_buf(PAGE_SIZE, node, altmap); =20 if (!p) return NULL; - ptpfn =3D PHYS_PFN(__pa(p)); - entry =3D pfn_pte(ptpfn, PAGE_KERNEL); + entry =3D pfn_pte(PHYS_PFN(__pa(p)), PAGE_KERNEL); } set_pte_at(&init_mm, addr, pte, entry); =20 @@ -255,8 +252,7 @@ static pgd_t * __meminit vmemmap_pgd_populate(unsigned = long addr, int node) } =20 static pte_t * __meminit vmemmap_populate_address(unsigned long addr, int = node, - struct vmem_altmap *altmap, - unsigned long ptpfn) + struct vmem_altmap *altmap) { pgd_t *pgd; p4d_t *p4d; @@ -276,7 +272,7 @@ static pte_t * __meminit vmemmap_populate_address(unsig= ned long addr, int node, pmd =3D vmemmap_pmd_populate(pud, addr, node); if (!pmd) return NULL; - pte =3D vmemmap_pte_populate(pmd, addr, node, altmap, ptpfn); + pte =3D vmemmap_pte_populate(pmd, addr, node, altmap); if (!pte) return NULL; vmemmap_verify(pte, node, addr, addr + PAGE_SIZE); @@ -292,7 +288,7 @@ int __meminit vmemmap_populate_basepages(unsigned long = start, unsigned long end, pte_t *pte; 
=20 for (; addr < end; addr +=3D PAGE_SIZE) { - pte =3D vmemmap_populate_address(addr, node, altmap, -1); + pte =3D vmemmap_populate_address(addr, node, altmap); if (!pte) return -ENOMEM; } --=20 2.20.1
From nobody Sun Jun 14 19:01:43 2026
From: Muchun Song To: Andrew Morton , David Hildenbrand , Muchun Song , Oscar Salvador , Michael Ellerman , Madhavan Srinivasan Cc: Lorenzo Stoakes , "Liam R . Howlett" , Vlastimil Babka , Mike Rapoport , Suren Baghdasaryan , Michal Hocko , Nicholas Piggin , Christophe Leroy , aneesh.kumar@linux.ibm.com, joao.m.martins@oracle.com, linux-mm@kvack.org, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, Muchun Song Subject: [PATCH 40/49] mm/hugetlb_vmemmap: remove vmemmap_wrprotect_hvo() and related code Date: Sun, 5 Apr 2026 20:52:31 +0800 Message-Id: <20260405125240.2558577-41-songmuchun@bytedance.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20260405125240.2558577-1-songmuchun@bytedance.com> References: <20260405125240.2558577-1-songmuchun@bytedance.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Since we have already remapped the shared tail pages as read-only in vmemmap_pte_populate() right at the point of mapping establishment, the separate pass of read-only mapping enforcement via vmemmap_wrprotect_hvo() for HugeTLB bootmem folios is no longer necessary.
Remove vmemmap_wrprotect_hvo() and the associated wrapper
hugetlb_vmemmap_optimize_bootmem_folios(), simplifying the code by
directly using hugetlb_vmemmap_optimize_folios() for bootmem folios as
well.

Signed-off-by: Muchun Song
---
 include/linux/mm.h   |  2 --
 mm/hugetlb.c         |  2 +-
 mm/hugetlb_vmemmap.c | 31 ++++---------------------------
 mm/hugetlb_vmemmap.h |  6 ------
 mm/sparse-vmemmap.c  | 23 -----------------------
 5 files changed, 5 insertions(+), 59 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index bceef0dc578b..c36001c9d571 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -4877,8 +4877,6 @@ int vmemmap_populate_hugepages(unsigned long start, unsigned long end,
 			       struct dev_pagemap *pgmap);
 int vmemmap_populate(unsigned long start, unsigned long end, int node,
 		struct vmem_altmap *altmap, struct dev_pagemap *pgmap);
-void vmemmap_wrprotect_hvo(unsigned long start, unsigned long end, int node,
-			  unsigned long headsize);
 void vmemmap_populate_print_last(void);
 struct page *vmemmap_shared_tail_page(unsigned int order, struct zone *zone);
 #ifdef CONFIG_MEMORY_HOTPLUG
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index ce5a58aab5c3..84f095a23ef2 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -3226,7 +3226,7 @@ static void __init prep_and_add_bootmem_folios(struct hstate *h,
 	struct folio *folio, *tmp_f;
 
 	/* Send list for bulk vmemmap optimization processing */
-	hugetlb_vmemmap_optimize_bootmem_folios(h, folio_list);
+	hugetlb_vmemmap_optimize_folios(h, folio_list);
 
 	list_for_each_entry_safe(folio, tmp_f, folio_list, lru) {
 		if (!folio_test_hugetlb_vmemmap_optimized(folio)) {
diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
index 92c95ebdbb9a..d595ef759bc2 100644
--- a/mm/hugetlb_vmemmap.c
+++ b/mm/hugetlb_vmemmap.c
@@ -589,31 +589,18 @@ static int hugetlb_vmemmap_split_folio(const struct hstate *h, struct folio *fol
 	return vmemmap_remap_split(vmemmap_start, vmemmap_end);
 }
 
-static void __hugetlb_vmemmap_optimize_folios(struct hstate *h,
-					      struct list_head *folio_list,
-					      bool boot)
+void hugetlb_vmemmap_optimize_folios(struct hstate *h, struct list_head *folio_list)
 {
 	struct folio *folio;
-	int nr_to_optimize;
+	unsigned long nr_to_optimize = 0;
 	LIST_HEAD(vmemmap_pages);
 	unsigned long flags = VMEMMAP_REMAP_NO_TLB_FLUSH;
 
-	nr_to_optimize = 0;
 	list_for_each_entry(folio, folio_list, lru) {
 		int ret;
-		unsigned long spfn, epfn;
-
-		if (boot && folio_test_hugetlb_vmemmap_optimized(folio)) {
-			/*
-			 * Already optimized by pre-HVO, just map the
-			 * mirrored tail page structs RO.
-			 */
-			spfn = (unsigned long)&folio->page;
-			epfn = spfn + pages_per_huge_page(h);
-			vmemmap_wrprotect_hvo(spfn, epfn, folio_nid(folio),
-					OPTIMIZED_FOLIO_VMEMMAP_SIZE);
+
+		if (folio_test_hugetlb_vmemmap_optimized(folio))
 			continue;
-		}
 
 		nr_to_optimize++;
 
@@ -667,16 +654,6 @@ static void __hugetlb_vmemmap_optimize_folios(struct hstate *h,
 	free_vmemmap_page_list(&vmemmap_pages);
 }
 
-void hugetlb_vmemmap_optimize_folios(struct hstate *h, struct list_head *folio_list)
-{
-	__hugetlb_vmemmap_optimize_folios(h, folio_list, false);
-}
-
-void hugetlb_vmemmap_optimize_bootmem_folios(struct hstate *h, struct list_head *folio_list)
-{
-	__hugetlb_vmemmap_optimize_folios(h, folio_list, true);
-}
-
 void __init hugetlb_vmemmap_optimize_bootmem_page(struct huge_bootmem_page *m)
 {
 	struct hstate *h = m->hstate;
diff --git a/mm/hugetlb_vmemmap.h b/mm/hugetlb_vmemmap.h
index ff8e4c6e9833..0022f9c5a101 100644
--- a/mm/hugetlb_vmemmap.h
+++ b/mm/hugetlb_vmemmap.h
@@ -19,7 +19,6 @@ long hugetlb_vmemmap_restore_folios(const struct hstate *h,
 					struct list_head *non_hvo_folios);
 void hugetlb_vmemmap_optimize_folio(const struct hstate *h, struct folio *folio);
 void hugetlb_vmemmap_optimize_folios(struct hstate *h, struct list_head *folio_list);
-void hugetlb_vmemmap_optimize_bootmem_folios(struct hstate *h, struct list_head *folio_list);
 void hugetlb_vmemmap_optimize_bootmem_page(struct huge_bootmem_page *m);
 
 static inline unsigned int hugetlb_vmemmap_size(const struct hstate *h)
@@ -61,11 +60,6 @@ static inline void hugetlb_vmemmap_optimize_folios(struct hstate *h, struct list
 {
 }
 
-static inline void hugetlb_vmemmap_optimize_bootmem_folios(struct hstate *h,
-						struct list_head *folio_list)
-{
-}
-
 static inline unsigned int hugetlb_vmemmap_optimizable_size(const struct hstate *h)
 {
 	return 0;
diff --git a/mm/sparse-vmemmap.c b/mm/sparse-vmemmap.c
index 36e5bcb5ba9b..ba8c0c64f160 100644
--- a/mm/sparse-vmemmap.c
+++ b/mm/sparse-vmemmap.c
@@ -296,29 +296,6 @@ int __meminit vmemmap_populate_basepages(unsigned long start, unsigned long end,
 	return 0;
 }
 
-/*
- * Write protect the mirrored tail page structs for HVO. This will be
- * called from the hugetlb code when gathering and initializing the
- * memblock allocated gigantic pages. The write protect can't be
- * done earlier, since it can't be guaranteed that the reserved
- * page structures will not be written to during initialization,
- * even if CONFIG_DEFERRED_STRUCT_PAGE_INIT is enabled.
- *
- * The PTEs are known to exist, and nothing else should be touching
- * these pages. The caller is responsible for any TLB flushing.
- */
-void vmemmap_wrprotect_hvo(unsigned long addr, unsigned long end,
-				    int node, unsigned long headsize)
-{
-	unsigned long maddr;
-	pte_t *pte;
-
-	for (maddr = addr + headsize; maddr < end; maddr += PAGE_SIZE) {
-		pte = virt_to_kpte(maddr);
-		ptep_set_wrprotect(&init_mm, maddr, pte);
-	}
-}
-
 struct page *vmemmap_shared_tail_page(unsigned int order, struct zone *zone)
 {
 	void *addr;
-- 
2.20.1

From nobody Sun Jun 14 19:01:43 2026
From: Muchun Song
To: Andrew Morton, David Hildenbrand, Muchun Song, Oscar Salvador,
    Michael Ellerman, Madhavan Srinivasan
Cc: Lorenzo Stoakes, "Liam R. Howlett", Vlastimil Babka, Mike Rapoport,
    Suren Baghdasaryan, Michal Hocko, Nicholas Piggin, Christophe Leroy,
    aneesh.kumar@linux.ibm.com, joao.m.martins@oracle.com,
    linux-mm@kvack.org, linuxppc-dev@lists.ozlabs.org,
    linux-kernel@vger.kernel.org, Muchun Song
Subject: [PATCH 41/49] mm/sparse: simplify section_vmemmap_pages()
Date: Sun, 5 Apr 2026 20:52:32 +0800
Message-Id: <20260405125240.2558577-42-songmuchun@bytedance.com>
In-Reply-To: <20260405125240.2558577-1-songmuchun@bytedance.com>
References: <20260405125240.2558577-1-songmuchun@bytedance.com>
X-Mailing-List: linux-kernel@vger.kernel.org

After unifying DAX and HugeTLB vmemmap optimizations, we can now simplify
section_vmemmap_pages().

Previously, section_vmemmap_pages() needed to take altmap and pgmap
arguments to determine if vmemmap optimization was enabled. However,
sparse_add_section() already sets the section order using
section_set_order(ms, pgmap->vmemmap_shift) if vmemmap_can_optimize() is
true and the size is aligned to PAGES_PER_SECTION. As a result,
section_vmemmap_optimizable(ms) is sufficient to determine if the section
can be optimized, and section_order(ms) can directly provide the order,
making the altmap and pgmap arguments redundant.

Remove the unused altmap and pgmap arguments from section_vmemmap_pages().

Signed-off-by: Muchun Song
---
 mm/internal.h       |  3 +--
 mm/sparse-vmemmap.c |  8 +++-----
 mm/sparse.c         | 18 ++++++------------
 3 files changed, 10 insertions(+), 19 deletions(-)

diff --git a/mm/internal.h b/mm/internal.h
index b569d8309f4d..7f0731e5c84f 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -998,8 +998,7 @@ static inline void __section_mark_present(struct mem_section *ms,
 	ms->section_mem_map |= SECTION_MARKED_PRESENT;
 }
 
-int section_vmemmap_pages(unsigned long pfn, unsigned long nr_pages,
-			  struct vmem_altmap *altmap, struct dev_pagemap *pgmap);
+int section_vmemmap_pages(unsigned long pfn, unsigned long nr_pages);
 #else
 static inline void memblocks_present(void) {}
 static inline void sparse_init(void) {}
diff --git a/mm/sparse-vmemmap.c b/mm/sparse-vmemmap.c
index ba8c0c64f160..ac2efba9ef92 100644
--- a/mm/sparse-vmemmap.c
+++ b/mm/sparse-vmemmap.c
@@ -608,12 +608,10 @@ static void section_deactivate(unsigned long pfn, unsigned long nr_pages,
 	 * section_activate() and pfn_valid() .
 	 */
 	if (!section_is_early) {
-		memmap_pages_add(-section_vmemmap_pages(pfn, nr_pages, altmap,
-							 pgmap));
+		memmap_pages_add(-section_vmemmap_pages(pfn, nr_pages));
 		depopulate_section_memmap(pfn, nr_pages, altmap);
 	} else if (memmap) {
-		memmap_pages_add(-section_vmemmap_pages(pfn, nr_pages, altmap,
-							 pgmap));
+		memmap_pages_add(-section_vmemmap_pages(pfn, nr_pages));
 		free_map_bootmem(memmap);
 	}
 
@@ -658,7 +656,7 @@ static struct page * __meminit section_activate(int nid, unsigned long pfn,
 		return pfn_to_page(pfn);
 
 	memmap = populate_section_memmap(pfn, nr_pages, nid, altmap, pgmap);
-	memmap_pages_add(section_vmemmap_pages(pfn, nr_pages, altmap, pgmap));
+	memmap_pages_add(section_vmemmap_pages(pfn, nr_pages));
 	if (!memmap) {
 		section_deactivate(pfn, nr_pages, altmap, pgmap);
 		return ERR_PTR(-ENOMEM);
diff --git a/mm/sparse.c b/mm/sparse.c
index 04c641b97325..163bb17bba96 100644
--- a/mm/sparse.c
+++ b/mm/sparse.c
@@ -345,28 +345,23 @@ static void __init sparse_usage_fini(void)
 	sparse_usagebuf = sparse_usagebuf_end = NULL;
 }
 
-int __meminit section_vmemmap_pages(unsigned long pfn, unsigned long nr_pages,
-		struct vmem_altmap *altmap, struct dev_pagemap *pgmap)
+int __meminit section_vmemmap_pages(unsigned long pfn, unsigned long nr_pages)
 {
 	const struct mem_section *ms = __pfn_to_section(pfn);
-	unsigned int order = pgmap ? pgmap->vmemmap_shift : section_order(ms);
+	unsigned int order = section_order(ms);
 	unsigned long pages_per_compound = 1L << order;
-	unsigned int vmemmap_pages = OPTIMIZED_FOLIO_VMEMMAP_PAGES;
-
-	if (vmemmap_can_optimize(altmap, pgmap))
-		vmemmap_pages = VMEMMAP_RESERVE_NR;
 
 	VM_BUG_ON(!IS_ALIGNED(pfn | nr_pages, min(pages_per_compound, PAGES_PER_SECTION)));
 	VM_BUG_ON(pfn_to_section_nr(pfn) != pfn_to_section_nr(pfn + nr_pages - 1));
 
-	if (!vmemmap_can_optimize(altmap, pgmap) && !section_vmemmap_optimizable(ms))
+	if (!section_vmemmap_optimizable(ms))
 		return DIV_ROUND_UP(nr_pages * sizeof(struct page), PAGE_SIZE);
 
 	if (order < PFN_SECTION_SHIFT)
-		return vmemmap_pages * nr_pages / pages_per_compound;
+		return OPTIMIZED_FOLIO_VMEMMAP_PAGES * nr_pages / pages_per_compound;
 
 	if (IS_ALIGNED(pfn, pages_per_compound))
-		return vmemmap_pages;
+		return OPTIMIZED_FOLIO_VMEMMAP_PAGES;
 
 	return 0;
 }
@@ -396,8 +391,7 @@ static void __init sparse_init_nid(int nid, unsigned long pnum_begin,
 							   nid, NULL, NULL);
 		if (!map)
 			panic("Populate section (%ld) on node[%d] failed\n", pnum, nid);
-		memmap_boot_pages_add(section_vmemmap_pages(pfn, PAGES_PER_SECTION,
-							     NULL, NULL));
+		memmap_boot_pages_add(section_vmemmap_pages(pfn, PAGES_PER_SECTION));
 		sparse_init_early_section(nid, map, pnum, 0);
 	}
 	sparse_usage_fini();
-- 
2.20.1

From nobody Sun Jun 14 19:01:43 2026
From: Muchun Song
To: Andrew Morton, David Hildenbrand, Muchun Song, Oscar Salvador,
    Michael Ellerman, Madhavan Srinivasan
Cc: Lorenzo Stoakes, "Liam R. Howlett", Vlastimil Babka, Mike Rapoport,
    Suren Baghdasaryan, Michal Hocko, Nicholas Piggin, Christophe Leroy,
    aneesh.kumar@linux.ibm.com, joao.m.martins@oracle.com,
    linux-mm@kvack.org, linuxppc-dev@lists.ozlabs.org,
    linux-kernel@vger.kernel.org, Muchun Song
Subject: [PATCH 42/49] mm/sparse-vmemmap: introduce section_vmemmap_page_structs()
Date: Sun, 5 Apr 2026 20:52:33 +0800
Message-Id: <20260405125240.2558577-43-songmuchun@bytedance.com>
In-Reply-To: <20260405125240.2558577-1-songmuchun@bytedance.com>
References: <20260405125240.2558577-1-songmuchun@bytedance.com>
X-Mailing-List: linux-kernel@vger.kernel.org

The function compound_nr_pages() in mm_init.c was introduced to determine
how many unique struct pages need to be initialized when vmemmap
optimization is enabled. However, it exposes sparse_vmemmap internals to
mm_init.c.

Now that DAX and HugeTLB vmemmap optimizations are unified and simplified,
we can expose a cleaner API from sparse.c to calculate the exact number of
struct pages needed.

Introduce section_vmemmap_page_structs(), which returns the number of page
structs that require initialization, rather than the number of physical
vmemmap pages to allocate. This matches what memmap_init_zone_device()
needs.

As a result:
1. compound_nr_pages() is removed entirely.
2. section_vmemmap_pages() becomes a simple inline wrapper in
   mm/internal.h that calculates the number of physical pages based on
   section_vmemmap_page_structs().
3. A restrictive VM_BUG_ON spanning sections is removed, safely allowing
   compound pages (like 1G DAX pages) to cross section boundaries during
   device memory initialization.

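The relationship between the two helpers can be sketched in userspace C.
This is only an illustration, not the kernel code: it assumes 4 KiB pages,
a 64-byte struct page, PFN_SECTION_SHIFT of 15, and that
OPTIMIZED_FOLIO_VMEMMAP_PAGE_STRUCTS equals one page worth of page structs
(its definition is not part of this patch). The SKETCH_* names and the
explicit order/optimizable parameters are hypothetical stand-ins for
section_order(ms) and section_vmemmap_optimizable(ms).

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed values for illustration only; the kernel's depend on arch/config. */
#define SKETCH_PAGE_SIZE		4096UL
#define SKETCH_STRUCT_PAGE_SIZE		64UL
#define SKETCH_PFN_SECTION_SHIFT	15
#define SKETCH_PAGES_PER_SECTION	(1UL << SKETCH_PFN_SECTION_SHIFT)
/* Assumption: one vmemmap page worth of page structs per optimized folio. */
#define SKETCH_OPT_PAGE_STRUCTS		(SKETCH_PAGE_SIZE / SKETCH_STRUCT_PAGE_SIZE)

/* Number of page structs that need initialization for [pfn, pfn + nr_pages). */
static unsigned long page_structs(unsigned long pfn, unsigned long nr_pages,
				  unsigned int order, bool optimizable)
{
	unsigned long pages_per_compound = 1UL << order;

	if (!optimizable)
		return nr_pages;

	/* Several compound pages fit in the range: one share for each. */
	if (order < SKETCH_PFN_SECTION_SHIFT)
		return SKETCH_OPT_PAGE_STRUCTS * nr_pages / pages_per_compound;

	/* Compound page spans sections: only the first section is charged. */
	if (pfn % pages_per_compound == 0)
		return SKETCH_OPT_PAGE_STRUCTS;

	return 0;
}

/* Physical vmemmap pages: round the struct footprint up to whole pages. */
static unsigned long vmemmap_pages(unsigned long pfn, unsigned long nr_pages,
				   unsigned int order, bool optimizable)
{
	unsigned long bytes = page_structs(pfn, nr_pages, order, optimizable) *
			      SKETCH_STRUCT_PAGE_SIZE;

	return (bytes + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;
}
```

Under these assumptions, one unoptimized section needs 32768 page structs
in 512 vmemmap pages, while a section of order-9 (2M) compound pages needs
4096 page structs in 64 vmemmap pages. For an order-18 (1G) compound page,
only the section holding the head pfn is charged one page; the following
sections report zero.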
Signed-off-by: Muchun Song
---
 mm/internal.h |  8 +++++++-
 mm/mm_init.c  | 21 +--------------------
 mm/sparse.c   |  9 ++++-----
 3 files changed, 12 insertions(+), 26 deletions(-)

diff --git a/mm/internal.h b/mm/internal.h
index 7f0731e5c84f..02064f21bfe1 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -998,7 +998,13 @@ static inline void __section_mark_present(struct mem_section *ms,
 	ms->section_mem_map |= SECTION_MARKED_PRESENT;
 }
 
-int section_vmemmap_pages(unsigned long pfn, unsigned long nr_pages);
+int section_vmemmap_page_structs(unsigned long pfn, unsigned long nr_pages);
+
+static inline int section_vmemmap_pages(unsigned long pfn, unsigned long nr_pages)
+{
+	return DIV_ROUND_UP(section_vmemmap_page_structs(pfn, nr_pages) *
+			    sizeof(struct page), PAGE_SIZE);
+}
 #else
 static inline void memblocks_present(void) {}
 static inline void sparse_init(void) {}
diff --git a/mm/mm_init.c b/mm/mm_init.c
index 6b23b5f02544..74ccc556bf6e 100644
--- a/mm/mm_init.c
+++ b/mm/mm_init.c
@@ -1060,24 +1060,6 @@ static void __ref __init_zone_device_page(struct page *page, unsigned long pfn,
 	}
 }
 
-/*
- * With compound page geometry and when struct pages are stored in ram most
- * tail pages are reused. Consequently, the amount of unique struct pages to
- * initialize is a lot smaller that the total amount of struct pages being
- * mapped. This is a paired / mild layering violation with explicit knowledge
- * of how the sparse_vmemmap internals handle compound pages in the lack
- * of an altmap.
- */
-static inline unsigned long compound_nr_pages(struct vmem_altmap *altmap,
-					      struct dev_pagemap *pgmap,
-					      const struct mem_section *ms)
-{
-	if (!section_vmemmap_optimizable(ms))
-		return pgmap_vmemmap_nr(pgmap);
-
-	return VMEMMAP_RESERVE_NR * (PAGE_SIZE / sizeof(struct page));
-}
-
 static void __ref memmap_init_compound(struct page *head,
 				       unsigned long head_pfn,
 				       unsigned long zone_idx, int nid,
@@ -1141,8 +1123,7 @@ void __ref memmap_init_zone_device(struct zone *zone,
 			continue;
 
 		memmap_init_compound(page, pfn, zone_idx, nid, pgmap,
-				     compound_nr_pages(altmap, pgmap,
-						       __pfn_to_section(pfn)));
+				     section_vmemmap_page_structs(pfn, pfns_per_compound));
 	}
 
 	/*
diff --git a/mm/sparse.c b/mm/sparse.c
index 163bb17bba96..400542302ad4 100644
--- a/mm/sparse.c
+++ b/mm/sparse.c
@@ -345,23 +345,22 @@ static void __init sparse_usage_fini(void)
 	sparse_usagebuf = sparse_usagebuf_end = NULL;
 }
 
-int __meminit section_vmemmap_pages(unsigned long pfn, unsigned long nr_pages)
+int __meminit section_vmemmap_page_structs(unsigned long pfn, unsigned long nr_pages)
 {
 	const struct mem_section *ms = __pfn_to_section(pfn);
 	unsigned int order = section_order(ms);
 	unsigned long pages_per_compound = 1L << order;
 
 	VM_BUG_ON(!IS_ALIGNED(pfn | nr_pages, min(pages_per_compound, PAGES_PER_SECTION)));
-	VM_BUG_ON(pfn_to_section_nr(pfn) != pfn_to_section_nr(pfn + nr_pages - 1));
 
 	if (!section_vmemmap_optimizable(ms))
-		return DIV_ROUND_UP(nr_pages * sizeof(struct page), PAGE_SIZE);
+		return nr_pages;
 
 	if (order < PFN_SECTION_SHIFT)
-		return OPTIMIZED_FOLIO_VMEMMAP_PAGES * nr_pages / pages_per_compound;
+		return OPTIMIZED_FOLIO_VMEMMAP_PAGE_STRUCTS * nr_pages / pages_per_compound;
 
 	if (IS_ALIGNED(pfn, pages_per_compound))
-		return OPTIMIZED_FOLIO_VMEMMAP_PAGES;
+		return OPTIMIZED_FOLIO_VMEMMAP_PAGE_STRUCTS;
 
 	return 0;
 }
-- 
2.20.1

From nobody Sun Jun 14 19:01:43 2026
From: Muchun Song
To: Andrew Morton, David Hildenbrand, Muchun Song, Oscar Salvador,
    Michael Ellerman, Madhavan Srinivasan
Cc: Lorenzo Stoakes, "Liam R. Howlett", Vlastimil Babka, Mike Rapoport,
    Suren Baghdasaryan, Michal Hocko, Nicholas Piggin, Christophe Leroy,
    aneesh.kumar@linux.ibm.com, joao.m.martins@oracle.com,
    linux-mm@kvack.org, linuxppc-dev@lists.ozlabs.org,
    linux-kernel@vger.kernel.org, Muchun Song
Subject: [PATCH 43/49] powerpc/mm: rely on generic vmemmap_can_optimize() to simplify code
Date: Sun, 5 Apr 2026 20:52:34 +0800
Message-Id: <20260405125240.2558577-44-songmuchun@bytedance.com>
In-Reply-To: <20260405125240.2558577-1-songmuchun@bytedance.com>
References: <20260405125240.2558577-1-songmuchun@bytedance.com>
X-Mailing-List: linux-kernel@vger.kernel.org

The goal of this patch is to simplify the code by removing unnecessary
architecture-specific overrides.

After unifying DAX and HugeTLB vmemmap optimizations, we can rely on the
generic rule of vmemmap_can_optimize() instead of having
architecture-specific overrides.

In radix__vmemmap_populate(), we can directly depend on
section_vmemmap_optimizable(__pfn_to_section(pfn)) because the upper layer
(sparse_add_section()) has already set the section order correctly if the
optimization condition was met.

In the fallback case for Hash MMU (!radix_enabled) inside
vmemmap_populate(), we reset the section order to 0.
This is necessary because sparse_add_section() may have optimistically set
the section order assuming optimization could be enabled, but Hash MMU
does not support it. This ensures that section_vmemmap_pages() calculates
the unoptimized page count accurately.

Signed-off-by: Muchun Song
---
 arch/powerpc/include/asm/book3s/64/radix.h |  5 -----
 arch/powerpc/mm/book3s64/radix_pgtable.c   | 12 +-----------
 arch/powerpc/mm/init_64.c                  |  1 +
 3 files changed, 2 insertions(+), 16 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/radix.h b/arch/powerpc/include/asm/book3s/64/radix.h
index 2600defa2dc2..18e28deba255 100644
--- a/arch/powerpc/include/asm/book3s/64/radix.h
+++ b/arch/powerpc/include/asm/book3s/64/radix.h
@@ -352,10 +352,5 @@ int radix__create_section_mapping(unsigned long start, unsigned long end,
 				  int nid, pgprot_t prot);
 int radix__remove_section_mapping(unsigned long start, unsigned long end);
 #endif /* CONFIG_MEMORY_HOTPLUG */
-
-#ifdef CONFIG_ARCH_WANT_OPTIMIZE_DAX_VMEMMAP
-#define vmemmap_can_optimize vmemmap_can_optimize
-bool vmemmap_can_optimize(struct vmem_altmap *altmap, struct dev_pagemap *pgmap);
-#endif
 #endif /* __ASSEMBLER__ */
 #endif
diff --git a/arch/powerpc/mm/book3s64/radix_pgtable.c b/arch/powerpc/mm/book3s64/radix_pgtable.c
index 714d5cdc10ec..36a69589fae4 100644
--- a/arch/powerpc/mm/book3s64/radix_pgtable.c
+++ b/arch/powerpc/mm/book3s64/radix_pgtable.c
@@ -977,16 +977,6 @@ int __meminit radix__vmemmap_create_mapping(unsigned long start,
 	return 0;
 }
 
-#ifdef CONFIG_ARCH_WANT_OPTIMIZE_DAX_VMEMMAP
-bool vmemmap_can_optimize(struct vmem_altmap *altmap, struct dev_pagemap *pgmap)
-{
-	if (radix_enabled())
-		return __vmemmap_can_optimize(altmap, pgmap);
-
-	return false;
-}
-#endif
-
 int __meminit vmemmap_check_pmd(pmd_t *pmdp, int node,
 				unsigned long addr, unsigned long next)
 {
@@ -1126,7 +1116,7 @@ int __meminit radix__vmemmap_populate(unsigned long start, unsigned long end, in
 	pte_t *pte;
 	unsigned long pfn = page_to_pfn((struct page *)start);
 
-	if (vmemmap_can_optimize(altmap, pgmap) && section_vmemmap_optimizable(__pfn_to_section(pfn)))
+	if (section_vmemmap_optimizable(__pfn_to_section(pfn)))
 		return vmemmap_populate_compound_pages(pfn, start, end, node, pgmap);
 	/*
 	 * If altmap is present, Make sure we align the start vmemmap addr
diff --git a/arch/powerpc/mm/init_64.c b/arch/powerpc/mm/init_64.c
index 8f4aa5b32186..56cbea89d304 100644
--- a/arch/powerpc/mm/init_64.c
+++ b/arch/powerpc/mm/init_64.c
@@ -283,6 +283,7 @@ int __meminit vmemmap_populate(unsigned long start, unsigned long end, int node,
 		return radix__vmemmap_populate(start, end, node, altmap, pgmap);
 #endif
 
+	section_set_order(__pfn_to_section(page_to_pfn((struct page *)start)), 0);
 	return __vmemmap_populate(start, end, node, altmap);
 }
 
-- 
2.20.1

From nobody Sun Jun 14 19:01:43 2026
From: Muchun Song
To: Andrew Morton, David Hildenbrand, Muchun Song, Oscar Salvador,
    Michael Ellerman, Madhavan Srinivasan
Cc: Lorenzo Stoakes, "Liam R .
Howlett" , Vlastimil Babka , Mike Rapoport , Suren Baghdasaryan , Michal Hocko , Nicholas Piggin , Christophe Leroy , aneesh.kumar@linux.ibm.com, joao.m.martins@oracle.com, linux-mm@kvack.org, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, Muchun Song Subject: [PATCH 44/49] mm/sparse-vmemmap: drop ARCH_WANT_OPTIMIZE_DAX_VMEMMAP and simplify checks Date: Sun, 5 Apr 2026 20:52:35 +0800 Message-Id: <20260405125240.2558577-45-songmuchun@bytedance.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20260405125240.2558577-1-songmuchun@bytedance.com> References: <20260405125240.2558577-1-songmuchun@bytedance.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Historically, when device DAX vmemmap optimization was introduced, it was initially implemented as a generic feature within sparse-vmemmap.c. However, it was later discovered that architectures with specific page table formats (such as PowerPC with hash translation) would crash because the generic vmemmap_populate_compound_pages() was unaware of their specific page table setup (e.g., bolted table entries). To address this, commit 87a7ae75d738 ("mm/vmemmap/devdax: fix kernel crash when probing devdax devices") introduced a restrictive config option, which eventually evolved into ARCH_WANT_OPTIMIZE_DAX_VMEMMAP (via commits 0b376f1e0ff5 and 0b6f15824cc7). This effectively turned a generic optimization into an opt-in architectural feature. However, the architecture landscape has evolved. The decision of whether to apply DAX vmemmap optimization techniques for specific page table formats is now fully delegated to the architecture-specific implementations (e.g., within vmemmap_populate()). 
The upper-level Kconfig restrictions and the rigid generic wrapper
functions are no longer necessary to prevent crashes, as the
architectures themselves handle the viability of the mappings.

If an architecture does not support DAX vmemmap optimization, it can
simply implement fallback logic similar to what PowerPC does in its
vmemmap_populate() routines.

If the architecture supports neither HugeTLB vmemmap optimization nor
DAX vmemmap optimization, but still wants to reduce code size and
disable this feature entirely, it is now possible to turn off
SPARSEMEM_VMEMMAP_OPTIMIZATION. It is no longer a hidden option, but
rather a user-configurable boolean under the SPARSEMEM_VMEMMAP
umbrella.

Therefore, this patch removes the redundant
ARCH_WANT_OPTIMIZE_DAX_VMEMMAP and drops the complicated
vmemmap_can_optimize() helper. Instead, we unify
SPARSEMEM_VMEMMAP_OPTIMIZATION as a fundamental core capability that is
enabled by default whenever SPARSEMEM_VMEMMAP is selected.

The check in sparse_add_section() is safely simplified to:

    if (!altmap && pgmap && nr_pages == PAGES_PER_SECTION)

which succinctly reflects the prerequisites for the optimization
without unnecessary boilerplate.
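
As a rough illustration only (a user-space model, not kernel code and
not part of this patch), the split of responsibilities described above
can be sketched as follows. The stand-in struct layouts, the model_*()
names, the arch_can_optimize flag and the PAGES_PER_SECTION value are
all assumptions made for the sketch; only the condition itself and the
idea of resetting the section order to 0 come from this series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed value for the sketch; the real one depends on the architecture. */
#define PAGES_PER_SECTION 32768UL

/* Minimal stand-ins for the kernel structures (hypothetical layouts). */
struct vmem_altmap { unsigned long base_pfn; };
struct dev_pagemap { unsigned int vmemmap_shift; };
struct mem_section { unsigned int order; };

/*
 * Core side: models the simplified check in sparse_add_section().
 * The section order is only set when there is no altmap, a pgmap is
 * present and a whole section is being added.
 */
static void model_sparse_add_section(struct mem_section *ms,
				     const struct vmem_altmap *altmap,
				     const struct dev_pagemap *pgmap,
				     unsigned long nr_pages)
{
	assert(ms != NULL);
	ms->order = 0;
	if (!altmap && pgmap && nr_pages == PAGES_PER_SECTION)
		ms->order = pgmap->vmemmap_shift;
}

/*
 * Arch side: an architecture (or MMU mode) that cannot use the optimized
 * layout falls back by resetting the order before populating normally.
 */
static void model_arch_vmemmap_populate(struct mem_section *ms,
					bool arch_can_optimize)
{
	assert(ms != NULL);
	if (!arch_can_optimize)
		ms->order = 0;
}
```

In this model the core code may set the order optimistically, and an
architecture that cannot honour it simply clears it again, which is the
same shape as the PowerPC hash MMU fallback earlier in the series.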

Signed-off-by: Muchun Song
---
 arch/powerpc/Kconfig |  1 -
 arch/riscv/Kconfig   |  1 -
 arch/x86/Kconfig     |  1 -
 include/linux/mm.h   | 34 ----------------------------------
 mm/Kconfig           | 14 ++++++++------
 mm/sparse-vmemmap.c  |  2 +-
 6 files changed, 9 insertions(+), 44 deletions(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index da4e2ec2af20..8158d5d0c226 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -184,7 +184,6 @@ config PPC
 	select ARCH_WANT_IPC_PARSE_VERSION
 	select ARCH_WANT_IRQS_OFF_ACTIVATE_MM
 	select ARCH_WANT_LD_ORPHAN_WARN
-	select ARCH_WANT_OPTIMIZE_DAX_VMEMMAP	if PPC_RADIX_MMU
 	select ARCH_WANTS_MODULES_DATA_IN_VMALLOC	if PPC_BOOK3S_32 || PPC_8xx
 	select ARCH_WEAK_RELEASE_ACQUIRE
 	select BINFMT_ELF
diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index 61a9d8d3ea64..a8eccb828e7b 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -85,7 +85,6 @@ config RISCV
 	select ARCH_WANT_GENERAL_HUGETLB if !RISCV_ISA_SVNAPOT
 	select ARCH_WANT_HUGE_PMD_SHARE if 64BIT
 	select ARCH_WANT_LD_ORPHAN_WARN if !XIP_KERNEL
-	select ARCH_WANT_OPTIMIZE_DAX_VMEMMAP
 	select ARCH_WANT_OPTIMIZE_HUGETLB_VMEMMAP
 	select ARCH_WANTS_NO_INSTR
 	select ARCH_WANTS_THP_SWAP if HAVE_ARCH_TRANSPARENT_HUGEPAGE
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index f19625648f0f..83c55e286b40 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -146,7 +146,6 @@ config X86
 	select ARCH_WANT_GENERAL_HUGETLB
 	select ARCH_WANT_HUGE_PMD_SHARE		if X86_64
 	select ARCH_WANT_LD_ORPHAN_WARN
-	select ARCH_WANT_OPTIMIZE_DAX_VMEMMAP	if X86_64
 	select ARCH_WANT_OPTIMIZE_HUGETLB_VMEMMAP	if X86_64
 	select ARCH_WANTS_THP_SWAP		if X86_64
 	select ARCH_HAS_PARANOID_L1D_FLUSH
diff --git a/include/linux/mm.h b/include/linux/mm.h
index c36001c9d571..8baa224444be 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -4910,40 +4910,6 @@ static inline void vmem_altmap_free(struct vmem_altmap *altmap,
 }
 #endif
 
-#define VMEMMAP_RESERVE_NR	OPTIMIZED_FOLIO_VMEMMAP_PAGES
-#ifdef CONFIG_ARCH_WANT_OPTIMIZE_DAX_VMEMMAP
-static inline bool __vmemmap_can_optimize(struct vmem_altmap *altmap,
-					  struct dev_pagemap *pgmap)
-{
-	unsigned long nr_pages;
-	unsigned long nr_vmemmap_pages;
-
-	if (!pgmap || !is_power_of_2(sizeof(struct page)))
-		return false;
-
-	nr_pages = pgmap_vmemmap_nr(pgmap);
-	nr_vmemmap_pages = ((nr_pages * sizeof(struct page)) >> PAGE_SHIFT);
-	/*
-	 * For vmemmap optimization with DAX we need minimum 2 vmemmap
-	 * pages. See layout diagram in Documentation/mm/vmemmap_dedup.rst
-	 */
-	return !altmap && (nr_vmemmap_pages > VMEMMAP_RESERVE_NR);
-}
-/*
- * If we don't have an architecture override, use the generic rule
- */
-#ifndef vmemmap_can_optimize
-#define vmemmap_can_optimize __vmemmap_can_optimize
-#endif
-
-#else
-static inline bool vmemmap_can_optimize(struct vmem_altmap *altmap,
-					   struct dev_pagemap *pgmap)
-{
-	return false;
-}
-#endif
-
 enum mf_flags {
 	MF_COUNT_INCREASED = 1 << 0,
 	MF_ACTION_REQUIRED = 1 << 1,
diff --git a/mm/Kconfig b/mm/Kconfig
index e81aa77182b2..166552d5d69a 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -411,17 +411,19 @@ config SPARSEMEM_VMEMMAP
 	  efficient option when sufficient kernel resources are available.
 
 config SPARSEMEM_VMEMMAP_OPTIMIZATION
-	bool
+	bool "Enable Vmemmap Optimization Infrastructure"
+	default y
 	depends on SPARSEMEM_VMEMMAP
+	help
+	  This allows features like HugeTLB and DAX to map multiple contiguous
+	  vmemmap pages to a single underlying physical page to save memory.
+
+	  If unsure, say Y.
 
 #
 # Select this config option from the architecture Kconfig, if it is preferred
-# to enable the feature of HugeTLB/dev_dax vmemmap optimization.
+# to enable the feature of HugeTLB vmemmap optimization.
 #
-config ARCH_WANT_OPTIMIZE_DAX_VMEMMAP
-	bool
-	select SPARSEMEM_VMEMMAP_OPTIMIZATION if SPARSEMEM_VMEMMAP
-
 config ARCH_WANT_OPTIMIZE_HUGETLB_VMEMMAP
 	bool
 
diff --git a/mm/sparse-vmemmap.c b/mm/sparse-vmemmap.c
index ac2efba9ef92..752a48112504 100644
--- a/mm/sparse-vmemmap.c
+++ b/mm/sparse-vmemmap.c
@@ -698,7 +698,7 @@ int __meminit sparse_add_section(int nid, unsigned long start_pfn,
 		return ret;
 
 	ms = __nr_to_section(section_nr);
-	if (vmemmap_can_optimize(altmap, pgmap) && nr_pages == PAGES_PER_SECTION) {
+	if (!altmap && pgmap && nr_pages == PAGES_PER_SECTION) {
 		section_set_order(ms, pgmap->vmemmap_shift);
 #ifdef CONFIG_ZONE_DEVICE
 		section_set_zone(ms, ZONE_DEVICE);
-- 
2.20.1

From nobody Sun Jun 14 19:01:43 2026
From: Muchun Song
To: Andrew Morton, David Hildenbrand, Muchun Song, Oscar Salvador,
 Michael Ellerman, Madhavan Srinivasan
Cc: Lorenzo Stoakes, "Liam R. Howlett", Vlastimil Babka, Mike Rapoport,
 Suren Baghdasaryan, Michal Hocko, Nicholas Piggin, Christophe Leroy,
 aneesh.kumar@linux.ibm.com, joao.m.martins@oracle.com,
 linux-mm@kvack.org, linuxppc-dev@lists.ozlabs.org,
 linux-kernel@vger.kernel.org, Muchun Song
Subject: [PATCH 45/49] mm/sparse-vmemmap: drop @pgmap parameter from vmemmap populate APIs
Date: Sun, 5 Apr 2026 20:52:36 +0800
Message-Id: <20260405125240.2558577-46-songmuchun@bytedance.com>
In-Reply-To: <20260405125240.2558577-1-songmuchun@bytedance.com>
References: <20260405125240.2558577-1-songmuchun@bytedance.com>

Since architecture-specific choices about vmemmap optimization are now
handled directly inside the vmemmap_populate() implementations, the
@pgmap is no longer needed in the core memory hotplug APIs and most
sparse section routines.

Remove the pgmap parameter entirely from:
- sparse_remove_section()
- __remove_pages()
- arch_remove_memory()
- vmemmap_populate() and related functions

This simplifies the API a little.

Signed-off-by: Muchun Song
---
 arch/arm64/mm/mmu.c                        | 11 ++++----
 arch/loongarch/mm/init.c                   | 12 ++++----
 arch/powerpc/include/asm/book3s/64/radix.h |  4 +--
 arch/powerpc/mm/book3s64/radix_pgtable.c   | 10 +++----
 arch/powerpc/mm/init_64.c                  |  4 +--
 arch/powerpc/mm/mem.c                      |  5 ++--
 arch/riscv/mm/init.c                       |  9 +++---
 arch/s390/mm/init.c                        |  5 ++--
 arch/s390/mm/vmem.c                        |  2 +-
 arch/sparc/mm/init_64.c                    |  5 ++--
 arch/x86/mm/init_64.c                      | 13 ++++-----
 include/linux/memory_hotplug.h             |  8 ++----
 include/linux/mm.h                         | 11 +++-----
 mm/memory_hotplug.c                        | 12 ++++----
 mm/memremap.c                              |  4 +--
 mm/sparse-vmemmap.c                        | 33 +++++++++-------------
 mm/sparse.c                                |  6 ++--
 17 files changed, 65 insertions(+), 89 deletions(-)

diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index 86162aab5185..ec1c6971a561 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -1760,7 +1760,7 @@ int __meminit vmemmap_check_pmd(pmd_t *pmdp, int node,
 }
 
 int __meminit vmemmap_populate(unsigned long start, unsigned long end, int node,
-		struct vmem_altmap *altmap, struct dev_pagemap *pgmap)
+		struct vmem_altmap *altmap)
 {
 	WARN_ON((start < VMEMMAP_START) || (end > VMEMMAP_END));
 	/* [start, end] should be within one section */
@@ -1768,9 +1768,9 @@ int __meminit vmemmap_populate(unsigned long start, unsigned long end, int node,
 
 	if (!IS_ENABLED(CONFIG_ARM64_4K_PAGES) ||
 	    (end - start < PAGES_PER_SECTION * sizeof(struct page)))
-		return vmemmap_populate_basepages(start, end, node, altmap, pgmap);
+		return vmemmap_populate_basepages(start, end, node, altmap);
 	else
-		return vmemmap_populate_hugepages(start, end, node, altmap, pgmap);
+		return vmemmap_populate_hugepages(start, end, node, altmap);
 }
 
 #ifdef CONFIG_MEMORY_HOTPLUG
@@ -1994,13 +1994,12 @@ int arch_add_memory(int nid, u64 start, u64 size,
 	return ret;
 }
 
-void arch_remove_memory(u64 start, u64 size, struct vmem_altmap *altmap,
-			struct dev_pagemap *pgmap)
+void arch_remove_memory(u64 start, u64 size, struct vmem_altmap *altmap)
 {
 	unsigned long start_pfn = start >> PAGE_SHIFT;
 	unsigned long nr_pages = size >> PAGE_SHIFT;
 
-	__remove_pages(start_pfn, nr_pages, altmap, pgmap);
+	__remove_pages(start_pfn, nr_pages, altmap);
 	__remove_pgd_mapping(swapper_pg_dir, __phys_to_virt(start), size);
 }
 
diff --git a/arch/loongarch/mm/init.c b/arch/loongarch/mm/init.c
index d61c2e09caae..00f3822b6e47 100644
--- a/arch/loongarch/mm/init.c
+++ b/arch/loongarch/mm/init.c
@@ -86,8 +86,7 @@ int arch_add_memory(int nid, u64 start, u64 size, struct mhp_params *params)
 	return ret;
 }
 
-void arch_remove_memory(u64 start, u64 size, struct vmem_altmap *altmap,
-			struct dev_pagemap *pgmap)
+void arch_remove_memory(u64 start, u64 size, struct vmem_altmap *altmap)
 {
 	unsigned long start_pfn = start >> PAGE_SHIFT;
 	unsigned long nr_pages = size >> PAGE_SHIFT;
@@ -96,7 +95,7 @@ void arch_remove_memory(u64 start, u64 size, struct vmem_altmap *altmap,
 	/* With altmap the first mapped page is offset from @start */
 	if (altmap)
 		page += vmem_altmap_offset(altmap);
-	__remove_pages(start_pfn, nr_pages, altmap, pgmap);
+	__remove_pages(start_pfn, nr_pages, altmap);
 }
 #endif
 
@@ -123,13 +122,12 @@ int __meminit vmemmap_check_pmd(pmd_t *pmd, int node,
 }
 
 int __meminit vmemmap_populate(unsigned long start, unsigned long end,
-			       int node, struct vmem_altmap *altmap,
-			       struct dev_pagemap *pgmap)
+			       int node, struct vmem_altmap *altmap)
 {
 #if CONFIG_PGTABLE_LEVELS == 2
-	return vmemmap_populate_basepages(start, end, node, NULL, pgmap);
+	return vmemmap_populate_basepages(start, end, node, NULL);
 #else
-	return vmemmap_populate_hugepages(start, end, node, NULL, pgmap);
+	return vmemmap_populate_hugepages(start, end, node, NULL);
 #endif
 }
 
diff --git a/arch/powerpc/include/asm/book3s/64/radix.h b/arch/powerpc/include/asm/book3s/64/radix.h
index 18e28deba255..0c9195dd50c9 100644
--- a/arch/powerpc/include/asm/book3s/64/radix.h
+++ b/arch/powerpc/include/asm/book3s/64/radix.h
@@ -316,13 +316,11 @@ static inline int radix__has_transparent_pud_hugepage(void)
 #endif
 
 struct vmem_altmap;
-struct dev_pagemap;
 extern int __meminit radix__vmemmap_create_mapping(unsigned long start,
 					     unsigned long page_size,
 					     unsigned long phys);
 int __meminit radix__vmemmap_populate(unsigned long start, unsigned long end,
-				      int node, struct vmem_altmap *altmap,
-				      struct dev_pagemap *pgmap);
+				      int node, struct vmem_altmap *altmap);
 void __ref radix__vmemmap_free(unsigned long start, unsigned long end,
 			       struct vmem_altmap *altmap);
 extern void radix__vmemmap_remove_mapping(unsigned long start,
diff --git a/arch/powerpc/mm/book3s64/radix_pgtable.c b/arch/powerpc/mm/book3s64/radix_pgtable.c
index 36a69589fae4..190448a17119 100644
--- a/arch/powerpc/mm/book3s64/radix_pgtable.c
+++ b/arch/powerpc/mm/book3s64/radix_pgtable.c
@@ -1101,11 +1101,10 @@ static inline pte_t *vmemmap_pte_alloc(pmd_t *pmdp, int node,
 
 static int __meminit vmemmap_populate_compound_pages(unsigned long start_pfn,
 						     unsigned long start,
-						     unsigned long end, int node,
-						     struct dev_pagemap *pgmap);
+						     unsigned long end, int node);
 
 int __meminit radix__vmemmap_populate(unsigned long start, unsigned long end, int node,
-				      struct vmem_altmap *altmap, struct dev_pagemap *pgmap)
+				      struct vmem_altmap *altmap)
 {
 	unsigned long addr;
 	unsigned long next;
@@ -1117,7 +1116,7 @@ int __meminit radix__vmemmap_populate(unsigned long start, unsigned long end, in
 	unsigned long pfn = page_to_pfn((struct page *)start);
 
 	if (section_vmemmap_optimizable(__pfn_to_section(pfn)))
-		return vmemmap_populate_compound_pages(pfn, start, end, node, pgmap);
+		return vmemmap_populate_compound_pages(pfn, start, end, node);
 	/*
 	 * If altmap is present, Make sure we align the start vmemmap addr
 	 * to PAGE_SIZE so that we calculate the correct start_pfn in
@@ -1248,8 +1247,7 @@ static pte_t * __meminit radix__vmemmap_populate_address(unsigned long addr, int
 
 static int __meminit vmemmap_populate_compound_pages(unsigned long start_pfn,
 						     unsigned long start,
-						     unsigned long end, int node,
-						     struct dev_pagemap *pgmap)
+						     unsigned long end, int node)
 {
 	/*
 	 * we want to map things as base page size mapping so that
diff --git a/arch/powerpc/mm/init_64.c b/arch/powerpc/mm/init_64.c
index 56cbea89d304..8e18ed427fdd 100644
--- a/arch/powerpc/mm/init_64.c
+++ b/arch/powerpc/mm/init_64.c
@@ -275,12 +275,12 @@ static int __meminit __vmemmap_populate(unsigned long start, unsigned long end,
 }
 
 int __meminit vmemmap_populate(unsigned long start, unsigned long end, int node,
-			       struct vmem_altmap *altmap, struct dev_pagemap *pgmap)
+			       struct vmem_altmap *altmap)
 {
 
 #ifdef CONFIG_PPC_BOOK3S_64
 	if (radix_enabled())
-		return radix__vmemmap_populate(start, end, node, altmap, pgmap);
+		return radix__vmemmap_populate(start, end, node, altmap);
 #endif
 
 	section_set_order(__pfn_to_section(page_to_pfn((struct page *)start)), 0);
diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c
index 4c1afab91996..648d0c5602ec 100644
--- a/arch/powerpc/mm/mem.c
+++ b/arch/powerpc/mm/mem.c
@@ -158,13 +158,12 @@ int __ref arch_add_memory(int nid, u64 start, u64 size,
 	return rc;
 }
 
-void __ref arch_remove_memory(u64 start, u64 size, struct vmem_altmap *altmap,
-			      struct dev_pagemap *pgmap)
+void __ref arch_remove_memory(u64 start, u64 size, struct vmem_altmap *altmap)
 {
 	unsigned long start_pfn = start >> PAGE_SHIFT;
 	unsigned long nr_pages = size >> PAGE_SHIFT;
 
-	__remove_pages(start_pfn, nr_pages, altmap, pgmap);
+	__remove_pages(start_pfn, nr_pages, altmap);
 	arch_remove_linear_mapping(start, size);
 }
 #endif
diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
index 277c89661dff..5142ca80be6f 100644
--- a/arch/riscv/mm/init.c
+++ b/arch/riscv/mm/init.c
@@ -1443,7 +1443,7 @@ int __meminit vmemmap_check_pmd(pmd_t *pmdp, int node,
 }
 
 int __meminit vmemmap_populate(unsigned long start, unsigned long end, int node,
-			       struct vmem_altmap *altmap, struct dev_pagemap *pgmap)
+			       struct vmem_altmap *altmap)
 {
 	/*
 	 * Note that SPARSEMEM_VMEMMAP is only selected for rv64 and that we
@@ -1451,7 +1451,7 @@ int __meminit vmemmap_populate(unsigned long start, unsigned long end, int node,
 	 * memory hotplug, we are not able to update all the page tables with
 	 * the new PMDs.
 	 */
-	return vmemmap_populate_hugepages(start, end, node, altmap, pgmap);
+	return vmemmap_populate_hugepages(start, end, node, altmap);
 }
 #endif
 
@@ -1810,10 +1810,9 @@ int __ref arch_add_memory(int nid, u64 start, u64 size, struct mhp_params *param
 	return ret;
 }
 
-void __ref arch_remove_memory(u64 start, u64 size, struct vmem_altmap *altmap,
-			      struct dev_pagemap *pgmap)
+void __ref arch_remove_memory(u64 start, u64 size, struct vmem_altmap *altmap)
 {
-	__remove_pages(start >> PAGE_SHIFT, size >> PAGE_SHIFT, altmap, pgmap);
+	__remove_pages(start >> PAGE_SHIFT, size >> PAGE_SHIFT, altmap);
 	remove_linear_mapping(start, size);
 	flush_tlb_all();
 }
diff --git a/arch/s390/mm/init.c b/arch/s390/mm/init.c
index 11a689423440..1f72efc2a579 100644
--- a/arch/s390/mm/init.c
+++ b/arch/s390/mm/init.c
@@ -276,13 +276,12 @@ int arch_add_memory(int nid, u64 start, u64 size,
 	return rc;
 }
 
-void arch_remove_memory(u64 start, u64 size, struct vmem_altmap *altmap,
-			struct dev_pagemap *pgmap)
+void arch_remove_memory(u64 start, u64 size, struct vmem_altmap *altmap)
 {
 	unsigned long start_pfn = start >> PAGE_SHIFT;
 	unsigned long nr_pages = size >> PAGE_SHIFT;
 
-	__remove_pages(start_pfn, nr_pages, altmap, pgmap);
+	__remove_pages(start_pfn, nr_pages, altmap);
 	vmem_remove_mapping(start, size);
 }
 #endif /* CONFIG_MEMORY_HOTPLUG */
diff --git a/arch/s390/mm/vmem.c b/arch/s390/mm/vmem.c
index a7bf8d3d5601..eeadff45e0e1 100644
--- a/arch/s390/mm/vmem.c
+++ b/arch/s390/mm/vmem.c
@@ -506,7 +506,7 @@ static void vmem_remove_range(unsigned long start, unsigned long size)
  * Add a backed mem_map array to the virtual mem_map array.
  */
 int __meminit vmemmap_populate(unsigned long start, unsigned long end, int node,
-			       struct vmem_altmap *altmap, struct dev_pagemap *pgmap)
+			       struct vmem_altmap *altmap)
 {
 	int ret;
 
diff --git a/arch/sparc/mm/init_64.c b/arch/sparc/mm/init_64.c
index f870ca330f9e..367c269305e5 100644
--- a/arch/sparc/mm/init_64.c
+++ b/arch/sparc/mm/init_64.c
@@ -2591,10 +2591,9 @@ int __meminit vmemmap_check_pmd(pmd_t *pmdp, int node,
 }
 
 int __meminit vmemmap_populate(unsigned long vstart, unsigned long vend,
-			       int node, struct vmem_altmap *altmap,
-			       struct dev_pagemap *pgmap)
+			       int node, struct vmem_altmap *altmap)
 {
-	return vmemmap_populate_hugepages(vstart, vend, node, NULL, pgmap);
+	return vmemmap_populate_hugepages(vstart, vend, node, NULL);
 }
 #endif /* CONFIG_SPARSEMEM_VMEMMAP */
 
diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index e18cc81a30b4..df2261fa4f98 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -1288,13 +1288,12 @@ kernel_physical_mapping_remove(unsigned long start, unsigned long end)
 	remove_pagetable(start, end, true, NULL);
 }
 
-void __ref arch_remove_memory(u64 start, u64 size, struct vmem_altmap *altmap,
-			      struct dev_pagemap *pgmap)
+void __ref arch_remove_memory(u64 start, u64 size, struct vmem_altmap *altmap)
 {
 	unsigned long start_pfn = start >> PAGE_SHIFT;
 	unsigned long nr_pages = size >> PAGE_SHIFT;
 
-	__remove_pages(start_pfn, nr_pages, altmap, pgmap);
+	__remove_pages(start_pfn, nr_pages, altmap);
 	kernel_physical_mapping_remove(start, start + size);
 }
 #endif /* CONFIG_MEMORY_HOTPLUG */
@@ -1557,7 +1556,7 @@ int __meminit vmemmap_check_pmd(pmd_t *pmd, int node,
 }
 
 int __meminit vmemmap_populate(unsigned long start, unsigned long end, int node,
-		struct vmem_altmap *altmap, struct dev_pagemap *pgmap)
+		struct vmem_altmap *altmap)
 {
 	int err;
 
@@ -1565,15 +1564,15 @@ int __meminit vmemmap_populate(unsigned long start, unsigned long end, int node,
 	VM_BUG_ON(!PAGE_ALIGNED(end));
 
 	if (end - start < PAGES_PER_SECTION * sizeof(struct page))
-		err = vmemmap_populate_basepages(start, end, node, NULL, pgmap);
+		err = vmemmap_populate_basepages(start, end, node, NULL);
 	else if (boot_cpu_has(X86_FEATURE_PSE))
-		err = vmemmap_populate_hugepages(start, end, node, altmap, pgmap);
+		err = vmemmap_populate_hugepages(start, end, node, altmap);
 	else if (altmap) {
 		pr_err_once("%s: no cpu support for altmap allocations\n",
 				__func__);
 		err = -ENOMEM;
 	} else
-		err = vmemmap_populate_basepages(start, end, node, NULL, pgmap);
+		err = vmemmap_populate_basepages(start, end, node, NULL);
 	if (!err)
 		sync_global_pgds(start, end - 1);
 	return err;
diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
index 7c9d66729c60..815e908c4135 100644
--- a/include/linux/memory_hotplug.h
+++ b/include/linux/memory_hotplug.h
@@ -135,10 +135,9 @@ static inline bool movable_node_is_enabled(void)
 	return movable_node_enabled;
 }
 
-extern void arch_remove_memory(u64 start, u64 size, struct vmem_altmap *altmap,
-			       struct dev_pagemap *pgmap);
+extern void arch_remove_memory(u64 start, u64 size, struct vmem_altmap *altmap);
 extern void __remove_pages(unsigned long start_pfn, unsigned long nr_pages,
-			   struct vmem_altmap *altmap, struct dev_pagemap *pgmap);
+			   struct vmem_altmap *altmap);
 
 /* reasonably generic interface to expand the physical pages */
 extern int __add_pages(int nid, unsigned long start_pfn, unsigned long nr_pages,
@@ -308,8 +307,7 @@ extern int sparse_add_section(int nid, unsigned long pfn,
 		unsigned long nr_pages, struct vmem_altmap *altmap,
 		struct dev_pagemap *pgmap);
 extern void sparse_remove_section(unsigned long pfn, unsigned long nr_pages,
-				  struct vmem_altmap *altmap,
-				  struct dev_pagemap *pgmap);
+				  struct vmem_altmap *altmap);
 extern struct zone *zone_for_pfn_range(enum mmop online_type,
 		int nid, struct memory_group *group, unsigned long start_pfn,
 		unsigned long nr_pages);
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 8baa224444be..adca19a4b2c7 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -4858,8 +4858,7 @@ static inline void print_vma_addr(char *prefix, unsigned long rip)
 void *sparse_buffer_alloc(unsigned long size);
 unsigned long section_map_size(void);
 struct page * __populate_section_memmap(unsigned long pfn,
-		unsigned long nr_pages, int nid, struct vmem_altmap *altmap,
-		struct dev_pagemap *pgmap);
+		unsigned long nr_pages, int nid, struct vmem_altmap *altmap);
 void *vmemmap_alloc_block(unsigned long size, int node);
 struct vmem_altmap;
 void *vmemmap_alloc_block_buf(unsigned long size, int node,
@@ -4870,13 +4869,11 @@ void vmemmap_set_pmd(pmd_t *pmd, void *p, int node,
 int vmemmap_check_pmd(pmd_t *pmd, int node,
 		      unsigned long addr, unsigned long next);
 int vmemmap_populate_basepages(unsigned long start, unsigned long end,
-			       int node, struct vmem_altmap *altmap,
-			       struct dev_pagemap *pgmap);
+			       int node, struct vmem_altmap *altmap);
 int vmemmap_populate_hugepages(unsigned long start, unsigned long end,
-			       int node, struct vmem_altmap *altmap,
-			       struct dev_pagemap *pgmap);
+			       int node, struct vmem_altmap *altmap);
 int vmemmap_populate(unsigned long start, unsigned long end, int node,
-		struct vmem_altmap *altmap, struct dev_pagemap *pgmap);
+		struct vmem_altmap *altmap);
 void vmemmap_populate_print_last(void);
 struct page *vmemmap_shared_tail_page(unsigned int order, struct zone *zone);
 #ifdef CONFIG_MEMORY_HOTPLUG
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 28306196c0fe..68dd56dd9f74 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -584,7 +584,7 @@ void remove_pfn_range_from_zone(struct zone *zone,
  * calling offline_pages().
  */
 void __remove_pages(unsigned long pfn, unsigned long nr_pages,
-		    struct vmem_altmap *altmap, struct dev_pagemap *pgmap)
+		    struct vmem_altmap *altmap)
 {
 	const unsigned long end_pfn = pfn + nr_pages;
 	unsigned long cur_nr_pages;
@@ -599,7 +599,7 @@ void __remove_pages(unsigned long pfn, unsigned long nr_pages,
 		/* Select all remaining pages up to the next section boundary */
 		cur_nr_pages = min(end_pfn - pfn,
 				   SECTION_ALIGN_UP(pfn + 1) - pfn);
-		sparse_remove_section(pfn, cur_nr_pages, altmap, pgmap);
+		sparse_remove_section(pfn, cur_nr_pages, altmap);
 	}
 }
 
@@ -1419,7 +1419,7 @@ static void remove_memory_blocks_and_altmaps(u64 start, u64 size)
 
 		remove_memory_block_devices(cur_start, memblock_size);
 
-		arch_remove_memory(cur_start, memblock_size, altmap, NULL);
+		arch_remove_memory(cur_start, memblock_size, altmap);
 
 		/* Verify that all vmemmap pages have actually been freed. */
 		WARN(altmap->alloc, "Altmap not fully unmapped");
@@ -1462,7 +1462,7 @@ static int create_altmaps_and_memory_blocks(int nid, struct memory_group *group,
 		ret = create_memory_block_devices(cur_start, memblock_size, nid,
 						  params.altmap, group);
 		if (ret) {
-			arch_remove_memory(cur_start, memblock_size, NULL, NULL);
+			arch_remove_memory(cur_start, memblock_size, NULL);
 			kfree(params.altmap);
 			goto out;
 		}
@@ -1548,7 +1548,7 @@ int add_memory_resource(int nid, struct resource *res, mhp_t mhp_flags)
 		/* create memory block devices after memory was added */
 		ret = create_memory_block_devices(start, size, nid, NULL, group);
 		if (ret) {
-			arch_remove_memory(start, size, params.altmap, NULL);
+			arch_remove_memory(start, size, params.altmap);
 			goto error;
 		}
 	}
@@ -2247,7 +2247,7 @@ static int try_remove_memory(u64 start, u64 size)
 		 * No altmaps present, do the removal directly
 		 */
 		remove_memory_block_devices(start, size);
-		arch_remove_memory(start, size, NULL, NULL);
+		arch_remove_memory(start, size, NULL);
 	} else {
 		/* all memblocks in the range have altmaps */
 		remove_memory_blocks_and_altmaps(start, size);
diff --git a/mm/memremap.c b/mm/memremap.c
index c45b90f334ea..ac7be07e3361 100644
--- a/mm/memremap.c
+++ b/mm/memremap.c
@@ -97,10 +97,10 @@ static void pageunmap_range(struct dev_pagemap *pgmap, int range_id)
 				   PHYS_PFN(range_len(range)));
 	if (pgmap->type == MEMORY_DEVICE_PRIVATE) {
 		__remove_pages(PHYS_PFN(range->start),
-			       PHYS_PFN(range_len(range)), NULL, pgmap);
+			       PHYS_PFN(range_len(range)), NULL);
 	} else {
 		arch_remove_memory(range->start, range_len(range),
-				pgmap_altmap(pgmap), pgmap);
+				pgmap_altmap(pgmap));
 		kasan_remove_zero_shadow(__va(range->start), range_len(range));
 	}
 	mem_hotplug_done();
diff --git a/mm/sparse-vmemmap.c b/mm/sparse-vmemmap.c
index 752a48112504..68dcc52591d5 100644
--- a/mm/sparse-vmemmap.c
+++ b/mm/sparse-vmemmap.c
@@ -281,8 +281,7 @@ static pte_t * __meminit vmemmap_populate_address(unsigned long addr, int node,
 }
 
 int __meminit vmemmap_populate_basepages(unsigned long start, unsigned long end,
-					 int node, struct vmem_altmap *altmap,
-					 struct dev_pagemap *pgmap)
+					 int node, struct vmem_altmap *altmap)
 {
 	unsigned long addr = start;
 	pte_t *pte;
@@ -342,8 +341,7 @@ int __weak __meminit vmemmap_check_pmd(pmd_t *pmd, int node,
 }
 
 int __meminit vmemmap_populate_hugepages(unsigned long start, unsigned long end,
-					 int node, struct vmem_altmap *altmap,
-					 struct dev_pagemap *pgmap)
+					 int node, struct vmem_altmap *altmap)
 {
 	unsigned long addr;
 	unsigned long next;
@@ -393,15 +391,14 @@ int __meminit vmemmap_populate_hugepages(unsigned long start, unsigned long end,
 				VM_BUG_ON(section_vmemmap_optimizable(ms));
 				continue;
 			}
-		if (vmemmap_populate_basepages(addr, next, node, altmap, pgmap))
+		if (vmemmap_populate_basepages(addr, next, node, altmap))
 			return -ENOMEM;
 	}
 	return 0;
 }
 
 struct page * __meminit __populate_section_memmap(unsigned long pfn,
-		unsigned long nr_pages, int nid, struct vmem_altmap *altmap,
-		struct dev_pagemap *pgmap)
+		unsigned long nr_pages, int nid, struct vmem_altmap *altmap)
 {
 	unsigned long start = (unsigned long) pfn_to_page(pfn);
 	unsigned long end = start + nr_pages * sizeof(struct page);
@@ -410,7 +407,7 @@ struct page * __meminit __populate_section_memmap(unsigned long pfn,
 		!IS_ALIGNED(nr_pages, PAGES_PER_SUBSECTION)))
 		return NULL;
 
-	if (vmemmap_populate(start, end, nid, altmap, pgmap))
+	if (vmemmap_populate(start, end, nid, altmap))
 		return NULL;
 
 	return pfn_to_page(pfn);
@@ -486,10 +483,9 @@ void offline_mem_sections(unsigned long start_pfn, unsigned long end_pfn)
 }
 
 static struct page * __meminit populate_section_memmap(unsigned long pfn,
-		unsigned long nr_pages, int nid, struct vmem_altmap *altmap,
-		struct dev_pagemap *pgmap)
+		unsigned long nr_pages, int nid, struct vmem_altmap *altmap)
 {
-	return __populate_section_memmap(pfn, nr_pages, nid, altmap, pgmap);
+	return __populate_section_memmap(pfn, nr_pages, nid, altmap);
 }
 
 static void depopulate_section_memmap(unsigned long pfn, unsigned long nr_pages,
@@ -570,7 +566,7 @@ static int fill_subsection_map(unsigned long pfn, unsigned long nr_pages)
  * usage map, but still need to free the vmemmap range.
  */
 static void section_deactivate(unsigned long pfn, unsigned long nr_pages,
-		struct vmem_altmap *altmap, struct dev_pagemap *pgmap)
+		struct vmem_altmap *altmap)
 {
 	struct mem_section *ms = __pfn_to_section(pfn);
 	bool section_is_early = early_section(ms);
@@ -622,8 +618,7 @@ static void section_deactivate(unsigned long pfn, unsigned long nr_pages,
 }
 
 static struct page * __meminit section_activate(int nid, unsigned long pfn,
-		unsigned long nr_pages, struct vmem_altmap *altmap,
-		struct dev_pagemap *pgmap)
+		unsigned long nr_pages, struct vmem_altmap *altmap)
 {
 	struct mem_section *ms = __pfn_to_section(pfn);
 	struct mem_section_usage *usage = NULL;
@@ -655,10 +650,10 @@ static struct page * __meminit section_activate(int nid, unsigned long pfn,
 	if (nr_pages < PAGES_PER_SECTION && early_section(ms))
 		return pfn_to_page(pfn);
 
-	memmap = populate_section_memmap(pfn, nr_pages, nid, altmap, pgmap);
+	memmap = populate_section_memmap(pfn, nr_pages, nid, altmap);
 	memmap_pages_add(section_vmemmap_pages(pfn, nr_pages));
 	if (!memmap) {
-		section_deactivate(pfn, nr_pages, altmap, pgmap);
+		section_deactivate(pfn, nr_pages, altmap);
 		return ERR_PTR(-ENOMEM);
 	}
 
@@ -704,7 +699,7 @@ int __meminit sparse_add_section(int nid, unsigned long start_pfn,
 		section_set_zone(ms, ZONE_DEVICE);
 #endif
 	}
-	memmap = section_activate(nid, start_pfn, nr_pages, altmap, pgmap);
+	memmap = section_activate(nid, start_pfn, nr_pages, altmap);
 	if (IS_ERR(memmap))
 		return PTR_ERR(memmap);
 
@@ -726,13 +721,13 @@ int __meminit sparse_add_section(int nid, unsigned long start_pfn,
 }
 
 void sparse_remove_section(unsigned long pfn, unsigned long nr_pages,
-			   struct vmem_altmap *altmap, struct dev_pagemap *pgmap)
+			   struct vmem_altmap *altmap)
 {
 	struct mem_section *ms = __pfn_to_section(pfn);
 
 	if (WARN_ON_ONCE(!valid_section(ms)))
 		return;
 
-	section_deactivate(pfn, nr_pages, altmap, pgmap);
+	section_deactivate(pfn, nr_pages, altmap);
 }
 #endif /* CONFIG_MEMORY_HOTPLUG */
diff --git a/mm/sparse.c b/mm/sparse.c
index 400542302ad4..77bb0113bac5 100644
--- a/mm/sparse.c
+++ b/mm/sparse.c
@@ -237,8 +237,7 @@ unsigned long __init section_map_size(void)
 }
 
 struct page __init *__populate_section_memmap(unsigned long pfn,
-		unsigned long nr_pages, int nid, struct vmem_altmap *altmap,
-		struct dev_pagemap *pgmap)
+		unsigned long nr_pages, int nid, struct vmem_altmap *altmap)
 {
 	unsigned long size = section_map_size();
 	struct page *map = sparse_buffer_alloc(size);
@@ -386,8 +385,7 @@ static void __init sparse_init_nid(int nid, unsigned long pnum_begin,
 		if (pnum >= pnum_end)
 			break;
 
-		map = __populate_section_memmap(pfn, PAGES_PER_SECTION,
-				nid, NULL, NULL);
+		map = __populate_section_memmap(pfn, PAGES_PER_SECTION, nid, NULL);
 		if (!map)
 			panic("Populate section (%ld) on node[%d] failed\n", pnum, nid);
 		memmap_boot_pages_add(section_vmemmap_pages(pfn, PAGES_PER_SECTION));
-- 
2.20.1

From nobody Sun Jun 14 19:01:43 2026
From: Muchun Song
To: Andrew Morton, David Hildenbrand, Muchun Song, Oscar Salvador,
 Michael Ellerman, Madhavan Srinivasan
Cc: Lorenzo Stoakes, "Liam R. Howlett", Vlastimil Babka, Mike Rapoport,
 Suren Baghdasaryan, Michal Hocko, Nicholas Piggin, Christophe Leroy,
 aneesh.kumar@linux.ibm.com, joao.m.martins@oracle.com, linux-mm@kvack.org,
 linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, Muchun Song
Subject: [PATCH 46/49] mm/sparse: replace pgmap with order and zone in sparse_add_section()
Date: Sun, 5 Apr 2026 20:52:37 +0800
Message-Id: <20260405125240.2558577-47-songmuchun@bytedance.com>
In-Reply-To: <20260405125240.2558577-1-songmuchun@bytedance.com>
References: <20260405125240.2558577-1-songmuchun@bytedance.com>

sparse_add_section() uses its struct dev_pagemap argument for only two
things: to extract the vmemmap_shift value, which sets the compound page
order for vmemmap optimization, and to set the zone to ZONE_DEVICE.

Since the full struct dev_pagemap is not needed here and is being removed
from the rest of the vmemmap APIs, replace the pgmap parameter with direct
order and zone parameters in sparse_add_section(). This cleanly decouples
the sparse memory infrastructure from the ZONE_DEVICE struct dev_pagemap.

The main motivation behind this decoupling is to make sparse_add_section()
a more generic memory population interface that can be easily reused for
other non-ZONE_DEVICE population use cases in the future.
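
As a rough illustration of the parameter change described above, the
following user-space sketch mirrors how a caller might derive the order and
zone from an optional pgmap, and how the callee decides whether to record
them. Everything prefixed with sketch_ or SKETCH_ is a hypothetical stand-in
rather than a kernel definition, and the section size of 32768 pages is an
assumed value (a 128MB section with 4KB pages).

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumed value for illustration: 128MB section / 4KB base page. */
#define SKETCH_PAGES_PER_SECTION 32768UL

/* Hypothetical stand-ins for enum zone_type and struct dev_pagemap. */
enum sketch_zone { SKETCH_ZONE_DEFAULT = 0, SKETCH_ZONE_DEVICE = 1 };

struct sketch_dev_pagemap {
	unsigned int vmemmap_shift;	/* compound page order */
};

struct sketch_geometry {
	unsigned int order;
	enum sketch_zone zone;
};

/*
 * Caller side: compute the (order, zone) pair once from the optional
 * pgmap, instead of handing the whole pgmap down the call chain.
 */
struct sketch_geometry
sketch_geometry_from_pgmap(const struct sketch_dev_pagemap *pgmap)
{
	struct sketch_geometry g = { 0, SKETCH_ZONE_DEFAULT };

	if (pgmap) {
		g.order = pgmap->vmemmap_shift;
		g.zone = SKETCH_ZONE_DEVICE;
	}
	return g;
}

/*
 * Callee side: order and zone are only recorded for a full section that
 * is not backed by an altmap and has a non-zero order. With order 0 the
 * zone argument is never consulted.
 */
bool sketch_section_records_order(bool has_altmap, unsigned long nr_pages,
				  unsigned int order)
{
	return !has_altmap && nr_pages == SKETCH_PAGES_PER_SECTION && order;
}
```

With a pgmap whose vmemmap_shift is 9, the helper yields order 9 and the
device zone; with a NULL pgmap it yields order 0, in which case the second
helper returns false no matter which zone was passed.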

Signed-off-by: Muchun Song
---
 include/linux/memory_hotplug.h |  2 +-
 mm/memory_hotplug.c            | 10 ++++++++--
 mm/sparse-vmemmap.c            | 14 +++++++-------
 3 files changed, 16 insertions(+), 10 deletions(-)

diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
index 815e908c4135..089052d64b01 100644
--- a/include/linux/memory_hotplug.h
+++ b/include/linux/memory_hotplug.h
@@ -305,7 +305,7 @@ extern void remove_pfn_range_from_zone(struct zone *zone,
 				       unsigned long nr_pages);
 extern int sparse_add_section(int nid, unsigned long pfn,
 		unsigned long nr_pages, struct vmem_altmap *altmap,
-		struct dev_pagemap *pgmap);
+		unsigned int order, enum zone_type zone);
 extern void sparse_remove_section(unsigned long pfn, unsigned long nr_pages,
 				  struct vmem_altmap *altmap);
 extern struct zone *zone_for_pfn_range(enum mmop online_type,
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 68dd56dd9f74..0f7707f3d4bb 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -385,10 +385,17 @@ int __add_pages(int nid, unsigned long pfn, unsigned long nr_pages,
 	unsigned long cur_nr_pages;
 	int err;
 	struct vmem_altmap *altmap = params->altmap;
+	unsigned int order = params->pgmap ? params->pgmap->vmemmap_shift : 0;
+	enum zone_type zid = 0;
 
 	if (WARN_ON_ONCE(!pgprot_val(params->pgprot)))
 		return -EINVAL;
 
+#ifdef CONFIG_ZONE_DEVICE
+	if (params->pgmap)
+		zid = ZONE_DEVICE;
+#endif
+
 	VM_BUG_ON(!mhp_range_allowed(PFN_PHYS(pfn), nr_pages * PAGE_SIZE, false));
 
 	if (altmap) {
@@ -412,8 +419,7 @@ int __add_pages(int nid, unsigned long pfn, unsigned long nr_pages,
 		/* Select all remaining pages up to the next section boundary */
 		cur_nr_pages = min(end_pfn - pfn,
 				   SECTION_ALIGN_UP(pfn + 1) - pfn);
-		err = sparse_add_section(nid, pfn, cur_nr_pages, altmap,
-					 params->pgmap);
+		err = sparse_add_section(nid, pfn, cur_nr_pages, altmap, order, zid);
 		if (err)
 			break;
 		cond_resched();
diff --git a/mm/sparse-vmemmap.c b/mm/sparse-vmemmap.c
index 68dcc52591d5..894352cb8957 100644
--- a/mm/sparse-vmemmap.c
+++ b/mm/sparse-vmemmap.c
@@ -666,7 +666,8 @@ static struct page * __meminit section_activate(int nid, unsigned long pfn,
  * @start_pfn: start pfn of the memory range
  * @nr_pages: number of pfns to add in the section
  * @altmap: alternate pfns to allocate the memmap backing store
- * @pgmap: alternate compound page geometry for devmap mappings
+ * @order: section order
+ * @zone: section zone. Note that it is ignored when @order is 0.
  *
  * This is only intended for hotplug.
  *
@@ -681,7 +682,7 @@ static struct page * __meminit section_activate(int nid, unsigned long pfn,
  */
 int __meminit sparse_add_section(int nid, unsigned long start_pfn,
 		unsigned long nr_pages, struct vmem_altmap *altmap,
-		struct dev_pagemap *pgmap)
+		unsigned int order, enum zone_type zone)
 {
 	unsigned long section_nr = pfn_to_section_nr(start_pfn);
 	struct mem_section *ms;
@@ -693,11 +694,10 @@ int __meminit sparse_add_section(int nid, unsigned long start_pfn,
 		return ret;
 
 	ms = __nr_to_section(section_nr);
-	if (!altmap && pgmap && nr_pages == PAGES_PER_SECTION) {
-		section_set_order(ms, pgmap->vmemmap_shift);
-#ifdef CONFIG_ZONE_DEVICE
-		section_set_zone(ms, ZONE_DEVICE);
-#endif
+	/* HVO is not supported when memmap pages are backed by an altmap. */
+	if (!altmap && nr_pages == PAGES_PER_SECTION && order) {
+		section_set_order(ms, order);
+		section_set_zone(ms, zone);
 	}
 	memmap = section_activate(nid, start_pfn, nr_pages, altmap);
 	if (IS_ERR(memmap))
-- 
2.20.1

From nobody Sun Jun 14 19:01:43 2026
From: Muchun Song
To: Andrew Morton, David Hildenbrand, Muchun Song, Oscar Salvador,
 Michael Ellerman, Madhavan Srinivasan
Cc: Lorenzo Stoakes, "Liam R. Howlett", Vlastimil Babka, Mike Rapoport,
 Suren Baghdasaryan, Michal Hocko, Nicholas Piggin, Christophe Leroy,
 aneesh.kumar@linux.ibm.com, joao.m.martins@oracle.com, linux-mm@kvack.org,
 linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, Muchun Song
Subject: [PATCH 47/49] mm: redefine HVO as Hugepage Vmemmap Optimization
Date: Sun, 5 Apr 2026 20:52:38 +0800
Message-Id: <20260405125240.2558577-48-songmuchun@bytedance.com>
In-Reply-To: <20260405125240.2558577-1-songmuchun@bytedance.com>
References: <20260405125240.2558577-1-songmuchun@bytedance.com>

The vmemmap optimization is a generic method for reducing struct page
overhead. Currently, HVO stands for "HugeTLB Vmemmap Optimization", which
strictly ties the concept to the HugeTLB subsystem.

To reflect the general applicability of this technique, redefine HVO as
"Hugepage Vmemmap Optimization" in generic contexts, and as "Hugepage
Vmemmap Optimization (HVO) for HugeTLB" in contexts strictly related to the
HugeTLB subsystem.

Update all generic references and comments in the codebase to use the new
terminology "Hugepage Vmemmap Optimization", and modify the
HugeTLB-specific ones to "Hugepage Vmemmap Optimization (HVO) for HugeTLB".

Signed-off-by: Muchun Song
---
 Documentation/admin-guide/kernel-parameters.txt | 2 +-
 Documentation/admin-guide/sysctl/vm.rst         | 2 +-
 Documentation/mm/vmemmap_dedup.rst              | 2 +-
 fs/Kconfig                                      | 4 ++--
 include/linux/mmzone.h                          | 2 +-
 mm/Kconfig                                      | 2 +-
 6 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 1c8a16309270..ae711cd7887d 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -2125,7 +2125,7 @@ Kernel parameters
 	hugetlb_free_vmemmap=
 			[KNL] Requires CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP
 			enabled.
-			Control if HugeTLB Vmemmap Optimization (HVO) is enabled.
+			Control if Hugepage Vmemmap Optimization (HVO) for HugeTLB is enabled.
 			Allows heavy hugetlb users to free up some more
 			memory (7 * PAGE_SIZE for each 2MB hugetlb page).
 			Format: { on | off (default) }
diff --git a/Documentation/admin-guide/sysctl/vm.rst b/Documentation/admin-guide/sysctl/vm.rst
index 97e12359775c..886f5e78686f 100644
--- a/Documentation/admin-guide/sysctl/vm.rst
+++ b/Documentation/admin-guide/sysctl/vm.rst
@@ -665,7 +665,7 @@ This knob is not available when the size of 'struct page' (a structure defined
 in include/linux/mm_types.h) is not power of two (an unusual system config could
 result in this).
 
-Enable (set to 1) or disable (set to 0) HugeTLB Vmemmap Optimization (HVO).
+Enable (set to 1) or disable (set to 0) Hugepage Vmemmap Optimization (HVO) for HugeTLB.
 
 Once enabled, the vmemmap pages of subsequent allocation of HugeTLB pages from
 buddy allocator will be optimized (7 pages per 2MB HugeTLB page and 4095 pages
diff --git a/Documentation/mm/vmemmap_dedup.rst b/Documentation/mm/vmemmap_dedup.rst
index 9fa8642ded48..44e80bd2e398 100644
--- a/Documentation/mm/vmemmap_dedup.rst
+++ b/Documentation/mm/vmemmap_dedup.rst
@@ -8,7 +8,7 @@ A vmemmap diet for HugeTLB and Device DAX
 HugeTLB
 =======
 
-This section is to explain how HugeTLB Vmemmap Optimization (HVO) works.
+This section is to explain how Hugepage Vmemmap Optimization (HVO) for HugeTLB works.
 
 The ``struct page`` structures are used to describe a physical page frame. By
 default, there is a one-to-one mapping from a page frame to its corresponding
diff --git a/fs/Kconfig b/fs/Kconfig
index 9b56a90e13db..0bcd5b5721a8 100644
--- a/fs/Kconfig
+++ b/fs/Kconfig
@@ -261,11 +261,11 @@ menuconfig HUGETLBFS
 
 if HUGETLBFS
 config HUGETLB_PAGE_OPTIMIZE_VMEMMAP_DEFAULT_ON
-	bool "HugeTLB Vmemmap Optimization (HVO) defaults to on"
+	bool "Hugepage Vmemmap Optimization (HVO) for HugeTLB defaults to on"
 	default n
 	depends on HUGETLB_PAGE_OPTIMIZE_VMEMMAP
 	help
-	  The HugeTLB Vmemmap Optimization (HVO) defaults to off. Say Y here to
+	  The Hugepage Vmemmap Optimization (HVO) for HugeTLB defaults to off. Say Y here to
 	  enable HVO by default. It can be disabled via hugetlb_free_vmemmap=off
 	  (boot command line) or hugetlb_optimize_vmemmap (sysctl).
 endif # HUGETLBFS
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 846a7ee1334f..a6900f585f9b 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -97,7 +97,7 @@
 #define MAX_FOLIO_NR_PAGES	(1UL << MAX_FOLIO_ORDER)
 
 /*
- * HugeTLB Vmemmap Optimization (HVO) requires struct pages of the head page to
+ * Hugepage Vmemmap Optimization (HVO) requires struct pages of the head page to
  * be naturally aligned with regard to the folio size.
  *
  * HVO which is only active if the size of struct page is a power of 2.
diff --git a/mm/Kconfig b/mm/Kconfig
index 166552d5d69a..33a36e20db3a 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -411,7 +411,7 @@ config SPARSEMEM_VMEMMAP
 	  efficient option when sufficient kernel resources are available.
 
 config SPARSEMEM_VMEMMAP_OPTIMIZATION
-	bool "Enable Vmemmap Optimization Infrastructure"
+	bool "Enable Hugepage Vmemmap Optimization (HVO) Infrastructure"
 	default y
 	depends on SPARSEMEM_VMEMMAP
 	help
-- 
2.20.1

From nobody Sun Jun 14 19:01:43 2026
From: Muchun Song
To: Andrew Morton, David Hildenbrand, Muchun Song, Oscar Salvador,
 Michael Ellerman, Madhavan Srinivasan
Cc: Lorenzo Stoakes, "Liam R. Howlett", Vlastimil Babka, Mike Rapoport,
 Suren Baghdasaryan, Michal Hocko, Nicholas Piggin, Christophe Leroy,
 aneesh.kumar@linux.ibm.com, joao.m.martins@oracle.com, linux-mm@kvack.org,
 linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, Muchun Song
Subject: [PATCH 48/49] Documentation/mm: restructure vmemmap_dedup.rst to reflect generalized HVO
Date: Sun, 5 Apr 2026 20:52:39 +0800
Message-Id: <20260405125240.2558577-49-songmuchun@bytedance.com>
In-Reply-To: <20260405125240.2558577-1-songmuchun@bytedance.com>
References: <20260405125240.2558577-1-songmuchun@bytedance.com>

Documentation/mm/vmemmap_dedup.rst is slightly outdated and poorly
structured. The current document uses HugeTLB pages as its main context,
which fails to clearly communicate the general principles of Hugepage
Vmemmap Optimization (HVO).

To make it more logical and readable, refine the document. Specifically,
introduce the general HVO concepts, principles, and calculations that are
agnostic to specific subsystems. Remove the outdated and subsystem-specific
contexts (like the explicit HugeTLB and Device DAX sections) to better
reflect the universal applicability of HVO to any large compound page.

This reorganization makes the documentation much easier to read and
understand, and aligns it with the recent renaming and generalization of
the HVO mechanism.
Signed-off-by: Muchun Song --- Documentation/mm/vmemmap_dedup.rst | 218 ++++++----------------------- 1 file changed, 42 insertions(+), 176 deletions(-) diff --git a/Documentation/mm/vmemmap_dedup.rst b/Documentation/mm/vmemmap_= dedup.rst index 44e80bd2e398..a21d84fcbe24 100644 --- a/Documentation/mm/vmemmap_dedup.rst +++ b/Documentation/mm/vmemmap_dedup.rst @@ -1,107 +1,33 @@ - .. SPDX-License-Identifier: GPL-2.0 =20 -=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D -A vmemmap diet for HugeTLB and Device DAX -=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D - -HugeTLB -=3D=3D=3D=3D=3D=3D=3D - -This section is to explain how Hugepage Vmemmap Optimization (HVO) for Hug= eTLB works. +=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D +Fundamentals of Hugepage Vmemmap Optimization (HVO) +=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D =20 -The ``struct page`` structures are used to describe a physical page frame.= By -default, there is a one-to-one mapping from a page frame to its correspond= ing +The ``struct page`` structures are used to describe a physical base page f= rame. +By default, there is a one-to-one mapping from a page frame to its corresp= onding ``struct page``. =20 -HugeTLB pages consist of multiple base page size pages and is supported by= many -architectures. See Documentation/admin-guide/mm/hugetlbpage.rst for more -details. On the x86-64 architecture, HugeTLB pages of size 2MB and 1GB are -currently supported. Since the base page size on x86 is 4KB, a 2MB HugeTLB= page -consists of 512 base pages and a 1GB HugeTLB page consists of 262144 base = pages. 
-For each base page, there is a corresponding ``struct page``. - -Within the HugeTLB subsystem, only the first 4 ``struct page`` are used to -contain unique information about a HugeTLB page. ``__NR_USED_SUBPAGE`` pro= vides -this upper limit. The only 'useful' information in the remaining ``struct = page`` -is the compound_info field, and this field is the same for all tail pages. - -By removing redundant ``struct page`` for HugeTLB pages, memory can be ret= urned -to the buddy allocator for other uses. - -Different architectures support different HugeTLB pages. For example, the -following table is the HugeTLB page size supported by x86 and arm64 -architectures. Because arm64 supports 4k, 16k, and 64k base pages and -supports contiguous entries, so it supports many kinds of sizes of HugeTLB -page. - -+--------------+-----------+----------------------------------------------= -+ -| Architecture | Page Size | HugeTLB Page Size = | -+--------------+-----------+-----------+-----------+-----------+----------= -+ -| x86-64 | 4KB | 2MB | 1GB | | = | -+--------------+-----------+-----------+-----------+-----------+----------= -+ -| | 4KB | 64KB | 2MB | 32MB | 1GB = | -| +-----------+-----------+-----------+-----------+----------= -+ -| arm64 | 16KB | 2MB | 32MB | 1GB | = | -| +-----------+-----------+-----------+-----------+----------= -+ -| | 64KB | 2MB | 512MB | 16GB | = | -+--------------+-----------+-----------+-----------+-----------+----------= -+ - -When the system boot up, every HugeTLB page has more than one ``struct pag= e`` -structs which size is (unit: pages):: - - struct_size =3D HugeTLB_Size / PAGE_SIZE * sizeof(struct page) / PAGE_S= IZE - -Where HugeTLB_Size is the size of the HugeTLB page. We know that the size -of the HugeTLB page is always n times PAGE_SIZE. So we can get the followi= ng -relationship:: - - HugeTLB_Size =3D n * PAGE_SIZE - -Then:: +When huge pages (large compound page) are used, they consist of multiple b= ase +page size pages. 
For each base page, there is a corresponding ``struct pag= e``. +However, only a few ``struct page`` +structures are actually used to contain unique information about the huge = page. +The only 'useful' information in the remaining tail ``struct page`` struct= ures +is the ``->compound_info`` field to get the head page structure, and this = field +is the same for all tail pages. =20 - struct_size =3D n * PAGE_SIZE / PAGE_SIZE * sizeof(struct page) / PAGE_= SIZE - =3D n * sizeof(struct page) / PAGE_SIZE +We can remove redundant ``struct page`` structures for huge pages to save = memory. +This optimization is referred to as Hugepage Vmemmap Optimization (HVO). =20 -We can use huge mapping at the pud/pmd level for the HugeTLB page. +The optimization is only applied when the size of the ``struct page`` is a +power-of-2. In this case, all tail pages of the same order are identical. = See +``compound_head()``. This allows us to remap the tail pages of the vmemmap= to a +shared page. =20 -For the HugeTLB page of the pmd level mapping, then:: +Let=E2=80=99s take a system with a 2 MB huge page and a base page size of = 4 KB as an +example for illustration. Here is how things look before optimization:: =20 - struct_size =3D n * sizeof(struct page) / PAGE_SIZE - =3D PAGE_SIZE / sizeof(pte_t) * sizeof(struct page) / PAGE_= SIZE - =3D sizeof(struct page) / sizeof(pte_t) - =3D 64 / 8 - =3D 8 (pages) - -Where n is how many pte entries which one page can contains. So the value = of -n is (PAGE_SIZE / sizeof(pte_t)). - -This optimization only supports 64-bit system, so the value of sizeof(pte_= t) -is 8. And this optimization also applicable only when the size of ``struct= page`` -is a power of two. In most cases, the size of ``struct page`` is 64 bytes = (e.g. -x86-64 and arm64). So if we use pmd level mapping for a HugeTLB page, the -size of ``struct page`` structs of it is 8 page frames which size depends = on the -size of the base page. 
- -For the HugeTLB page of the pud level mapping, then:: - - struct_size =3D PAGE_SIZE / sizeof(pmd_t) * struct_size(pmd) - =3D PAGE_SIZE / 8 * 8 (pages) - =3D PAGE_SIZE (pages) - -Where the struct_size(pmd) is the size of the ``struct page`` structs of a -HugeTLB page of the pmd level mapping. - -E.g.: A 2MB HugeTLB page on x86_64 consists in 8 page frames while 1GB -HugeTLB page consists in 4096. - -Next, we take the pmd level mapping of the HugeTLB page as an example to -show the internal implementation of this optimization. There are 8 pages -``struct page`` structs associated with a HugeTLB page which is pmd mapped. - -Here is how things look before optimization:: - - HugeTLB struct pages(8 pages) page frame(8 pa= ges) + 2MB Hugepage struct pages (8 pages) page frame (= 8 pages) +-----------+ ---virt_to_page---> +-----------+ mapping to +---------= --+ | | | 0 | -------------> | 0 = | | | +-----------+ +---------= --+ @@ -112,9 +38,9 @@ Here is how things look before optimization:: | | | 3 | -------------> | 3 = | | | +-----------+ +---------= --+ | | | 4 | -------------> | 4 = | - | PMD | +-----------+ +---------= --+ - | level | | 5 | -------------> | 5 = | - | mapping | +-----------+ +---------= --+ + | | +-----------+ +---------= --+ + | | | 5 | -------------> | 5 = | + | | +-----------+ +---------= --+ | | | 6 | -------------> | 6 = | | | +-----------+ +---------= --+ | | | 7 | -------------> | 7 = | @@ -124,34 +50,27 @@ Here is how things look before optimization:: | | +-----------+ =20 -The first page of ``struct page`` (page 0) associated with the HugeTLB page -contains the 4 ``struct page`` necessary to describe the HugeTLB. The rema= ining -pages of ``struct page`` (page 1 to page 7) are tail pages. - -The optimization is only applied when the size of the struct page is a pow= er -of 2. In this case, all tail pages of the same order are identical. See -compound_head(). 
This allows us to remap the tail pages of the vmemmap to a -shared, read-only page. The head page is also remapped to a new page. This -allows the original vmemmap pages to be freed. +We remap the tail pages (page 1 to page 7) of the vmemmap to a shared, rea= d-only +page (per-zone). =20 Here is how things look after remapping:: =20 - HugeTLB struct pages(8 pages) page fr= ame (new) + 2MB Hugepage struct pages(8 pages) page frame= (1 page) +-----------+ ---virt_to_page---> +-----------+ mapping to +---------= -------+ | | | 0 | -------------> | 0 = | | | +-----------+ +---------= -------+ | | | 1 | ------=E2=94=90 | | +-----------+ | - | | | 2 | ------=E2=94=BC +-= ---------------------------+ + | | | 2 | ------=E2=94=BC + | | +-----------+ | + | | | 3 | ------=E2=94=BC +-= ---------------------------+ | | +-----------+ | | A single= , per-zone page | - | | | 3 | ------=E2=94=BC------> | = frame shared among all | + | | | 4 | ------=E2=94=BC------> | = frame shared among all | | | +-----------+ | | hugepage= s of the same size | - | | | 4 | ------=E2=94=BC +-= ---------------------------+ + | | | 5 | ------=E2=94=BC +-= ---------------------------+ + | | +-----------+ | + | | | 6 | ------=E2=94=BC | | +-----------+ | - | | | 5 | ------=E2=94=BC - | PMD | +-----------+ | - | level | | 6 | ------=E2=94=BC - | mapping | +-----------+ | | | | 7 | ------=E2=94=98 | | +-----------+ | | @@ -159,65 +78,12 @@ Here is how things look after remapping:: | | +-----------+ =20 -When a HugeTLB is freed to the buddy system, we should allocate 7 pages for -vmemmap pages and restore the previous mapping relationship. - -For the HugeTLB page of the pud level mapping. It is similar to the former. -We also can use this approach to free (PAGE_SIZE - 1) vmemmap pages. - -Apart from the HugeTLB page of the pmd/pud level mapping, some architectur= es -(e.g. 
aarch64) provides a contiguous bit in the translation table entries -that hints to the MMU to indicate that it is one of a contiguous set of -entries that can be cached in a single TLB entry. - -The contiguous bit is used to increase the mapping size at the pmd and pte -(last) level. So this type of HugeTLB page can be optimized only when its -size of the ``struct page`` structs is greater than **1** page. - -Device DAX -=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D - -The device-dax interface uses the same tail deduplication technique explai= ned -in the previous chapter, except when used with the vmemmap in -the device (altmap). - -The following page sizes are supported in DAX: PAGE_SIZE (4K on x86_64), -PMD_SIZE (2M on x86_64) and PUD_SIZE (1G on x86_64). -For powerpc equivalent details see Documentation/arch/powerpc/vmemmap_dedu= p.rst - -The differences with HugeTLB are relatively minor. - -It only use 3 ``struct page`` for storing all information as opposed -to 4 on HugeTLB pages. - -There's no remapping of vmemmap given that device-dax memory is not part of -System RAM ranges initialized at boot. Thus the tail page deduplication -happens at a later stage when we populate the sections. HugeTLB reuses the -the head vmemmap page representing, whereas device-dax reuses the tail -vmemmap page. This results in only half of the savings compared to HugeTLB. - -Deduplicated tail pages are not mapped read-only. +Therefore, for any hugepage, if the total size of its corresponding ``stru= ct pages`` +is greater than or equal to the size of two base pages, then HVO technolog= y can +be applied to this hugepage to save memory. For example, in this case, the +smallest hugepage that can apply HVO is 512 KB (its order corresponds to +``OPTIMIZABLE_FOLIO_MIN_ORDER``). Therefore, any hugepage with an order gr= eater +than or equal to ``OPTIMIZABLE_FOLIO_MIN_ORDER`` can apply HVO technology. 
=20 -Here's how things look like on device-dax after the sections are populated= :: - - +-----------+ ---virt_to_page---> +-----------+ mapping to +---------= --+ - | | | 0 | -------------> | 0 = | - | | +-----------+ +---------= --+ - | | | 1 | -------------> | 1 = | - | | +-----------+ +---------= --+ - | | | 2 | ----------------^ ^ ^ ^ ^= ^ - | | +-----------+ | | | |= | - | | | 3 | ------------------+ | | |= | - | | +-----------+ | | |= | - | | | 4 | --------------------+ | |= | - | PMD | +-----------+ | |= | - | level | | 5 | ----------------------+ |= | - | mapping | +-----------+ |= | - | | | 6 | ------------------------+= | - | | +-----------+ = | - | | | 7 | -------------------------= -+ - | | +-----------+ - | | - | | - | | - +-----------+ +Meanwhile, each HVOed hugepage still has ``OPTIMIZED_FOLIO_VMEMMAP_PAGE_ST= RUCTS`` +available ``struct page`` structures. --=20 2.20.1 From nobody Sun Jun 14 19:01:43 2026 Received: from mail-pj1-f45.google.com (mail-pj1-f45.google.com [209.85.216.45]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 78C951A285 for ; Sun, 5 Apr 2026 12:58:58 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.216.45 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1775393940; cv=none; b=UbfB4HtmkeW/Mksk58tGznykquUZlyZ70lClezhx84Q8gAXvVf3L2/htsdHp49tcdXF56q1MvPd8Gki6zGE3ewFXlMm1OapGIz2cp8iBNkSaZVNGBvvl/zMODL6xl3HqAQ22NRnFYrMx8jZ32/DfupY6izCevfpbIJ4NNRd55sE= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1775393940; c=relaxed/simple; bh=HMPx/DEHg8LK27VzLd63C2wA82YiIbZQlJlS2l/+Eck=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=XvwHO6Lsl1VGV+esrTYv05ZGoZ5GrB27wns+fe3fqtPH8OrEGK8ow0wWSQLPO1KfcV3KjvbvyTYjuPdeu7jkBj7JMw1pxBiywNPn6KW03N7Jb3gWrCzLG9jFLJlDjUki3OhFS4Lh3u9LtiHpmYnxRxR/4jqZnthlietDOFonS6Y= 
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=bytedance.com; spf=pass smtp.mailfrom=bytedance.com; dkim=pass (2048-bit key) header.d=bytedance.com header.i=@bytedance.com header.b=VDN7xMem; arc=none smtp.client-ip=209.85.216.45 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=bytedance.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=bytedance.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=bytedance.com header.i=@bytedance.com header.b="VDN7xMem" Received: by mail-pj1-f45.google.com with SMTP id 98e67ed59e1d1-35691a231a7so1644633a91.3 for ; Sun, 05 Apr 2026 05:58:58 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1775393938; x=1775998738; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=5OTMoz77Yb7VHAB553cXlMYhsYoA1Jus+1lDf2tDpdk=; b=VDN7xMemv3qQC8hF2TtNBs/m+5wsueGl1B+z0Ed1vPUNLH1jRqtDEhEtbtUy7CiYBI dF9Ok06daAlAoRnlWCT4AxKaN8WsVr5nTgOQf0sA0MgQxhEs8dYnJI2LGDBkbLQnlNLq ILlei/i994ylmOOfK9HdKumNC4Or9sSf+KZJtKkyvmSw6+WrVzsJyMjwDtXvSP21gyGh aTjThw0yg40q8FOiN4uD0OoDLZAGqpQGq6CfB2mDQ5Jvtpg4p5okCk3gM294QxVblIn8 j7qj/t9z+qgQbFWM0POryJgRD9VOe02vT8zJO0QdITAfFhi53ISDsEGXao+KnmGg6Sup pV9w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1775393938; x=1775998738; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=5OTMoz77Yb7VHAB553cXlMYhsYoA1Jus+1lDf2tDpdk=; b=OYpHl+5UgBOyb+x94x2/FaASOyvjzHuz4suNhBukYFrcsin3ickH/QU/OR98r0jBfb 93c7ybGSDyP8h8c2TF2UZb+CmjqSxFg8op0h3ponlbx+vFZsBWrobdhN4bFx78OM8cYc 64XywoBSJsBpqKpynIHxDhN3hyhhqGv4YJXGpa80phhsqrS32M8MrbASIioylp/BwQtY 
8qfJvwebufXzgYxeF5Yy/vXhDADrRiwmKmQqH1E5Pffroc+uQNm/kHY/4nlRQEYMqzGa WGc84LH0GvGdFDJitt9IZp18goUfMHPf/itFcEoCEC9Lmpd+KWus6gB4LnzCoIEXJUlR EMLA== X-Forwarded-Encrypted: i=1; AJvYcCXCqNwGbVQ5TnLCTCsklCPFPli9wHw2/bsq0pBWCKlmhTRr4O170WQtZLifXOUTnqMgCrsZbhH8+h1NH8A=@vger.kernel.org X-Gm-Message-State: AOJu0Yyy5fFnUNNYi6xfVMOptgR6eNCqA3Q7ctR58ye3lS/gfXMd25L2 VpBjOmain38xWW88QMVtX6UYRNKeyOy6c4XroPWpAk3ddyPeXcZF/ZmZ2eNrNiYH49w= X-Gm-Gg: AeBDiesuzB5kJ8hOuArg6Dt7kUo2LSQCha1NtgH6NI9g6AolTot47aFeH+NlIAWmOG3 zWdaMfNuh67BXQGKkdMZatAFsHkoFdKI63Oarx0lHL8eZM86eADAX71uk+dJvZqD3SlH+tHJ0On CK49wTGMgLWlKe7lHwdKaT3sqd+Dy/19tKYH6DcZW+mO49EB1Q0SHNKXTE1Lm63uFPMOhwiCy4N VE0ksz64aU126meybIpLh9Cg9BKOZ0KDzhv3xBiKGtUTXdMj3yiG84oGwynbe6640mtUcFPRBkk U4F8tTTwJ8gRnjTvZRI2cK3KcoDi95KqaUrqXu+5njx3a2h3qKQFT+trGjGArWvmwBp2SgvPBJ1 iKBT/b81Ny1mcjnvvH+RgyR1GagoQ5bHPFfYadNoIIMrsjrBLiUk/uJ0gq5GMjOdpesNO2cVnBD ZYb4AwkCJm+t006rUURYaaQ7PYeXgD/XN0a6r79qsm4Pc= X-Received: by 2002:a17:90b:2b4e:b0:356:35a5:4a64 with SMTP id 98e67ed59e1d1-35de6842dc2mr8500150a91.4.1775393937697; Sun, 05 Apr 2026 05:58:57 -0700 (PDT) Received: from n232-176-004.byted.org ([36.110.163.97]) by smtp.gmail.com with ESMTPSA id 98e67ed59e1d1-35de66b4808sm3748505a91.2.2026.04.05.05.58.51 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 05 Apr 2026 05:58:57 -0700 (PDT) From: Muchun Song To: Andrew Morton , David Hildenbrand , Muchun Song , Oscar Salvador , Michael Ellerman , Madhavan Srinivasan Cc: Lorenzo Stoakes , "Liam R . 
Howlett" , Vlastimil Babka , Mike Rapoport , Suren Baghdasaryan , Michal Hocko , Nicholas Piggin , Christophe Leroy , aneesh.kumar@linux.ibm.com, joao.m.martins@oracle.com, linux-mm@kvack.org, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, Muchun Song Subject: [PATCH 49/49] mm: consolidate struct page power-of-2 size checks for HVO Date: Sun, 5 Apr 2026 20:52:40 +0800 Message-Id: <20260405125240.2558577-50-songmuchun@bytedance.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20260405125240.2558577-1-songmuchun@bytedance.com> References: <20260405125240.2558577-1-songmuchun@bytedance.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" The Hugepage Vmemmap Optimization (HVO) requires that struct page size is a power of two. This size is evaluated by the C compiler and currently cannot be natively evaluated by Kconfig. Therefore, the condition is_power_of_2(sizeof(struct page)) was scattered across several macros and static inline functions. Extract the check into a preprocessor macro STRUCT_PAGE_SIZE_IS_POWER_OF_2 evaluated during the Kbuild process. Define SPARSEMEM_VMEMMAP_OPTIMIZATION_ENABLED as a master toggle that is 1 only if both Kconfig CONFIG_SPARSEMEM_VMEMMAP_OPTIMIZATION and the power of 2 size check are true. This allows us to completely remove all scattered sizeof(struct page) checks, making the code much cleaner and eliminating redundant logic. Additionally, mm/hugetlb_vmemmap.c and its corresponding header are now guarded by SPARSEMEM_VMEMMAP_OPTIMIZATION_ENABLED. This brings an added benefit: when struct page size is not a power of 2, the compiler can entirely optimize away the unused functions in mm/hugetlb_vmemmap.c, reducing kernel image size. 
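
As a rough illustration of the gating described above (a standalone sketch, not the code in this patch): the preprocessor cannot evaluate sizeof(), so the power-of-2 result has to reach it as a plain integer constant. In the kernel that constant would come from the header generated from kernel/bounds.c; here it is hard-coded and cross-checked with a static assertion. All DEMO_* and demo_* names are hypothetical stand-ins, and the 64-byte struct and 4 KB page size are assumptions chosen to match the common x86-64 case.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page: exactly 64 bytes. */
struct demo_page {
	unsigned char bytes[64];
};

/* Assumed base page size of 4 KB. */
#define DEMO_PAGE_SIZE 4096UL

/* Constant-expression power-of-2 test. */
#define DEMO_IS_POWER_OF_2(n) ((n) != 0 && (((n) & ((n) - 1)) == 0))

/*
 * Stand-in for the generated constant. The real value would be emitted
 * at build time; it is hard-coded here so the preprocessor can see it.
 */
#define DEMO_STRUCT_PAGE_SIZE_IS_POWER_OF_2 1

/* Stand-in for the Kconfig option being selected. */
#define DEMO_CONFIG_VMEMMAP_OPTIMIZATION 1

/* Catch a stale hard-coded constant at compile time. */
_Static_assert(DEMO_IS_POWER_OF_2(sizeof(struct demo_page)) ==
	       DEMO_STRUCT_PAGE_SIZE_IS_POWER_OF_2,
	       "hard-coded constant disagrees with sizeof(struct demo_page)");

/* Master toggle: defined only if both conditions hold. */
#if defined(DEMO_CONFIG_VMEMMAP_OPTIMIZATION) && DEMO_STRUCT_PAGE_SIZE_IS_POWER_OF_2
#define DEMO_VMEMMAP_OPTIMIZATION_ENABLED 1
#endif

/* Number of struct demo_page entries that fit in one vmemmap page. */
#define DEMO_PAGE_STRUCTS_PER_VMEMMAP_PAGE \
	(DEMO_PAGE_SIZE / sizeof(struct demo_page))

#ifdef DEMO_VMEMMAP_OPTIMIZATION_ENABLED
/*
 * A folio is worth optimizing only if its struct pages span at least
 * two vmemmap pages. No sizeof() check is needed at this point.
 */
static inline bool demo_order_optimizable(unsigned int order)
{
	return ((size_t)1 << order) >= 2 * DEMO_PAGE_STRUCTS_PER_VMEMMAP_PAGE;
}
#else
/* With the toggle off, the whole feature collapses to a constant. */
static inline bool demo_order_optimizable(unsigned int order)
{
	(void)order;
	return false;
}
#endif
```

With these assumed sizes, 64 structures fit in one vmemmap page, so the smallest optimizable order is 7 (128 base pages, i.e. 512 KB). Callers need no power-of-2 check of their own, since the toggle already accounts for it.
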
Signed-off-by: Muchun Song --- include/linux/mm_types.h | 2 ++ include/linux/mm_types_task.h | 4 ++++ include/linux/mmzone.h | 32 +++++++++++++++----------------- include/linux/page-flags.h | 28 ++++------------------------ kernel/bounds.c | 2 ++ mm/hugetlb_vmemmap.c | 2 ++ mm/hugetlb_vmemmap.h | 4 +--- mm/internal.h | 3 --- mm/sparse.c | 6 ++---- mm/util.c | 2 +- 10 files changed, 33 insertions(+), 52 deletions(-) diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h index a308e2c23b82..6de6c0c20f8b 100644 --- a/include/linux/mm_types.h +++ b/include/linux/mm_types.h @@ -15,7 +15,9 @@ #include #include #include +#ifndef __GENERATING_BOUNDS_H #include +#endif #include #include #include diff --git a/include/linux/mm_types_task.h b/include/linux/mm_types_task.h index 11bf319d78ec..09e5039fff97 100644 --- a/include/linux/mm_types_task.h +++ b/include/linux/mm_types_task.h @@ -17,7 +17,11 @@ #include #endif =20 +#ifndef __GENERATING_BOUNDS_H #define ALLOC_SPLIT_PTLOCKS (SPINLOCK_SIZE > BITS_PER_LONG/8) +#else +#define ALLOC_SPLIT_PTLOCKS 0 +#endif =20 /* * When updating this, please also update struct resident_page_types[] in diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h index a6900f585f9b..3a46cb0bfaaa 100644 --- a/include/linux/mmzone.h +++ b/include/linux/mmzone.h @@ -96,27 +96,26 @@ =20 #define MAX_FOLIO_NR_PAGES (1UL << MAX_FOLIO_ORDER) =20 -/* - * Hugepage Vmemmap Optimization (HVO) requires struct pages of the head p= age to - * be naturally aligned with regard to the folio size. - * - * HVO which is only active if the size of struct page is a power of 2. - */ -#define MAX_FOLIO_VMEMMAP_ALIGN \ - (IS_ENABLED(CONFIG_SPARSEMEM_VMEMMAP_OPTIMIZATION) && \ - is_power_of_2(sizeof(struct page)) ? \ - MAX_FOLIO_NR_PAGES * sizeof(struct page) : 0) - /* The number of vmemmap pages required by a vmemmap-optimized folio. 
*/ #define OPTIMIZED_FOLIO_VMEMMAP_PAGES 1 #define OPTIMIZED_FOLIO_VMEMMAP_SIZE (OPTIMIZED_FOLIO_VMEMMAP_PAGES * PAG= E_SIZE) #define OPTIMIZED_FOLIO_VMEMMAP_PAGE_STRUCTS (OPTIMIZED_FOLIO_VMEMMAP_SIZE= / sizeof(struct page)) #define OPTIMIZABLE_FOLIO_MIN_ORDER (ilog2(OPTIMIZED_FOLIO_VMEMMAP_PAGE_S= TRUCTS) + 1) =20 +#if defined(CONFIG_SPARSEMEM_VMEMMAP_OPTIMIZATION) && STRUCT_PAGE_SIZE_IS_= POWER_OF_2 +#define SPARSEMEM_VMEMMAP_OPTIMIZATION_ENABLED 1 +/* + * Hugepage Vmemmap Optimization (HVO) requires struct pages of the head p= age to + * be naturally aligned with regard to the folio size. + */ +#define MAX_FOLIO_VMEMMAP_ALIGN (MAX_FOLIO_NR_PAGES * sizeof(struct page= )) #define __NR_OPTIMIZABLE_FOLIO_SIZES (MAX_FOLIO_ORDER - OPTIMIZABLE_FOLIO= _MIN_ORDER + 1) #define NR_OPTIMIZABLE_FOLIO_SIZES \ - ((__NR_OPTIMIZABLE_FOLIO_SIZES > 0 && \ - IS_ENABLED(CONFIG_SPARSEMEM_VMEMMAP_OPTIMIZATION)) ? __NR_OPTIMIZABLE_F= OLIO_SIZES : 0) + (__NR_OPTIMIZABLE_FOLIO_SIZES > 0 ? __NR_OPTIMIZABLE_FOLIO_SIZES : 0) +#else +#define MAX_FOLIO_VMEMMAP_ALIGN 0 +#define NR_OPTIMIZABLE_FOLIO_SIZES 0 +#endif =20 enum migratetype { MIGRATE_UNMOVABLE, @@ -2015,7 +2014,7 @@ struct mem_section { */ struct page_ext *page_ext; #endif -#ifdef CONFIG_SPARSEMEM_VMEMMAP_OPTIMIZATION +#ifdef SPARSEMEM_VMEMMAP_OPTIMIZATION_ENABLED /* * The order of compound pages in this section. 
Typically, the section * holds compound pages of this order; a larger compound page will span @@ -2208,7 +2207,7 @@ static inline bool pfn_section_first_valid(struct mem= _section *ms, unsigned long } #endif =20 -#ifdef CONFIG_SPARSEMEM_VMEMMAP_OPTIMIZATION +#ifdef SPARSEMEM_VMEMMAP_OPTIMIZATION_ENABLED static inline void section_set_order(struct mem_section *section, unsigned= int order) { VM_BUG_ON(section->order && order && section->order !=3D order); @@ -2267,8 +2266,7 @@ static inline void section_set_compound_range(unsigne= d long pfn, =20 static inline bool section_vmemmap_optimizable(const struct mem_section *s= ection) { - return is_power_of_2(sizeof(struct page)) && - section_order(section) >=3D OPTIMIZABLE_FOLIO_MIN_ORDER; + return section_order(section) >=3D OPTIMIZABLE_FOLIO_MIN_ORDER; } =20 void sparse_init_early_section(int nid, struct page *map, unsigned long pn= um, diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h index 12665b34586c..bea934d49750 100644 --- a/include/linux/page-flags.h +++ b/include/linux/page-flags.h @@ -198,32 +198,12 @@ enum pageflags { =20 #ifndef __GENERATING_BOUNDS_H =20 -/* - * For tail pages, if the size of struct page is power-of-2 ->compound_info - * encodes the mask that converts the address of the tail page address to - * the head page address. - * - * Otherwise, ->compound_info has direct pointer to head pages. - */ -static __always_inline bool compound_info_has_mask(void) -{ - /* - * The approach with mask would work in the wider set of conditions, - * but it requires validating that struct pages are naturally aligned - * for all orders up to the MAX_FOLIO_ORDER, which can be tricky. 
- */ - if (!IS_ENABLED(CONFIG_SPARSEMEM_VMEMMAP_OPTIMIZATION)) - return false; - - return is_power_of_2(sizeof(struct page)); -} - static __always_inline unsigned long _compound_head(const struct page *pag= e) { unsigned long info =3D READ_ONCE(page->compound_info); unsigned long mask; =20 - if (!compound_info_has_mask()) { + if (!IS_ENABLED(SPARSEMEM_VMEMMAP_OPTIMIZATION_ENABLED)) { /* Bit 0 encodes PageTail() */ if (info & 1) return info - 1; @@ -232,8 +212,8 @@ static __always_inline unsigned long _compound_head(con= st struct page *page) } =20 /* - * If compound_info_has_mask() is true the rest of the info encodes - * the mask that converts the address of the tail page to the head page. + * If HVO is enabled the rest of the info encodes the mask that converts + * the address of the tail page to the head page. * * No need to clear bit 0 in the mask as 'page' always has it clear. * @@ -257,7 +237,7 @@ static __always_inline void set_compound_head(struct pa= ge *tail, unsigned int shift; unsigned long mask; =20 - if (!compound_info_has_mask()) { + if (!IS_ENABLED(SPARSEMEM_VMEMMAP_OPTIMIZATION_ENABLED)) { WRITE_ONCE(tail->compound_info, (unsigned long)head | 1); return; } diff --git a/kernel/bounds.c b/kernel/bounds.c index 02b619eb6106..ff2ec3834d32 100644 --- a/kernel/bounds.c +++ b/kernel/bounds.c @@ -8,6 +8,7 @@ #define __GENERATING_BOUNDS_H #define COMPILE_OFFSETS /* Include headers that define the enum constants of interest */ +#include #include #include #include @@ -30,6 +31,7 @@ int main(void) DEFINE(LRU_GEN_WIDTH, 0); DEFINE(__LRU_REFS_WIDTH, 0); #endif + DEFINE(STRUCT_PAGE_SIZE_IS_POWER_OF_2, is_power_of_2(sizeof(struct page))= ); /* End of constants */ =20 return 0; diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c index d595ef759bc2..0347341be156 100644 --- a/mm/hugetlb_vmemmap.c +++ b/mm/hugetlb_vmemmap.c @@ -21,6 +21,7 @@ #include "hugetlb_vmemmap.h" #include "internal.h" =20 +#ifdef SPARSEMEM_VMEMMAP_OPTIMIZATION_ENABLED /** * struct 
vmemmap_remap_walk - walk vmemmap page table * @@ -693,3 +694,4 @@ static int __init hugetlb_vmemmap_init(void) return 0; } late_initcall(hugetlb_vmemmap_init); +#endif diff --git a/mm/hugetlb_vmemmap.h b/mm/hugetlb_vmemmap.h index 0022f9c5a101..bd576ef41ee7 100644 --- a/mm/hugetlb_vmemmap.h +++ b/mm/hugetlb_vmemmap.h @@ -12,7 +12,7 @@ #include #include =20 -#ifdef CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP +#if defined(CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP) && defined(SPARSEMEM_VME= MMAP_OPTIMIZATION_ENABLED) int hugetlb_vmemmap_restore_folio(const struct hstate *h, struct folio *fo= lio); long hugetlb_vmemmap_restore_folios(const struct hstate *h, struct list_head *folio_list, @@ -34,8 +34,6 @@ static inline unsigned int hugetlb_vmemmap_optimizable_si= ze(const struct hstate { int size =3D hugetlb_vmemmap_size(h) - OPTIMIZED_FOLIO_VMEMMAP_SIZE; =20 - if (!is_power_of_2(sizeof(struct page))) - return 0; return size > 0 ? size : 0; } #else diff --git a/mm/internal.h b/mm/internal.h index 02064f21bfe1..121c9076f09a 100644 --- a/mm/internal.h +++ b/mm/internal.h @@ -1026,9 +1026,6 @@ static inline bool vmemmap_page_optimizable(const str= uct page *page) unsigned long pfn =3D page_to_pfn(page); unsigned int order =3D section_order(__pfn_to_section(pfn)); =20 - if (!is_power_of_2(sizeof(struct page))) - return false; - return (pfn & ((1L << order) - 1)) >=3D OPTIMIZED_FOLIO_VMEMMAP_PAGE_STRU= CTS; } =20 diff --git a/mm/sparse.c b/mm/sparse.c index 77bb0113bac5..7375f66a58d5 100644 --- a/mm/sparse.c +++ b/mm/sparse.c @@ -404,10 +404,8 @@ void __init sparse_init(void) unsigned long pnum_end, pnum_begin, map_count =3D 1; int nid_begin; =20 - if (compound_info_has_mask()) { - VM_WARN_ON_ONCE(!IS_ALIGNED((unsigned long) pfn_to_page(0), - MAX_FOLIO_VMEMMAP_ALIGN)); - } + VM_WARN_ON_ONCE(IS_ENABLED(SPARSEMEM_VMEMMAP_OPTIMIZATION_ENABLED) && + !IS_ALIGNED((unsigned long)pfn_to_page(0), MAX_FOLIO_VMEMMAP_ALIGN)); =20 pnum_begin =3D first_present_section_nr(); nid_begin =3D 
sparse_early_nid(__nr_to_section(pnum_begin)); diff --git a/mm/util.c b/mm/util.c index f063fd4de1e8..783b2081ea74 100644 --- a/mm/util.c +++ b/mm/util.c @@ -1348,7 +1348,7 @@ void snapshot_page(struct page_snapshot *ps, const st= ruct page *page) foliop =3D (struct folio *)page; } else { /* See compound_head() */ - if (compound_info_has_mask()) { + if (IS_ENABLED(SPARSEMEM_VMEMMAP_OPTIMIZATION_ENABLED)) { unsigned long p =3D (unsigned long)page; =20 foliop =3D (struct folio *)(p & info); --=20 2.20.1