From: Kiryl Shutsemau
To: Andrew Morton, Muchun Song
Cc: David Hildenbrand, Oscar Salvador, Mike Rapoport, Vlastimil Babka,
	Lorenzo Stoakes, Matthew Wilcox, Zi Yan,
	Baoquan He, Michal Hocko, Johannes Weiner, Jonathan Corbet, Usama Arif,
	kernel-team@meta.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	linux-doc@vger.kernel.org, Kiryl Shutsemau
Subject: [PATCH 11/11] hugetlb: Update vmemmap_dedup.rst
Date: Fri, 5 Dec 2025 19:43:47 +0000
Message-ID: <20251205194351.1646318-12-kas@kernel.org>
X-Mailer: git-send-email 2.51.2
In-Reply-To: <20251205194351.1646318-1-kas@kernel.org>
References: <20251205194351.1646318-1-kas@kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

Update the documentation regarding vmemmap optimization for hugetlb to
reflect the changes in how the kernel maps the tail pages.

Fake heads no longer exist. Remove their description.

Signed-off-by: Kiryl Shutsemau
---
 Documentation/mm/vmemmap_dedup.rst | 60 +++++++++++++-----------------
 1 file changed, 26 insertions(+), 34 deletions(-)

diff --git a/Documentation/mm/vmemmap_dedup.rst b/Documentation/mm/vmemmap_dedup.rst
index 1863d88d2dcb..a0c4c79d6922 100644
--- a/Documentation/mm/vmemmap_dedup.rst
+++ b/Documentation/mm/vmemmap_dedup.rst
@@ -124,33 +124,35 @@ Here is how things look before optimization::
  |           |
  +-----------+
 
-The value of page->compound_info is the same for all tail pages. The first
-page of ``struct page`` (page 0) associated with the HugeTLB page contains the 4
-``struct page`` necessary to describe the HugeTLB. The only use of the remaining
-pages of ``struct page`` (page 1 to page 7) is to point to page->compound_info.
-Therefore, we can remap pages 1 to 7 to page 0. Only 1 page of ``struct page``
-will be used for each HugeTLB page. This will allow us to free the remaining
-7 pages to the buddy allocator.
+The first page of ``struct page`` (page 0) associated with the HugeTLB page
+contains the 4 ``struct page`` necessary to describe the HugeTLB. The remaining
+pages of ``struct page`` (page 1 to page 7) are tail pages.
+
+The optimization is only applied when the size of the struct page is a power-of-2.
+In this case, all tail pages of the same order are identical. See
+compound_head(). This allows us to remap the tail pages of the vmemmap to a
+shared, read-only page. The head page is also remapped to a new page. This
+allows the original vmemmap pages to be freed.
 
 Here is how things look after remapping::
 
-    HugeTLB                  struct pages(8 pages)         page frame(8 pages)
- +-----------+ ---virt_to_page---> +-----------+   mapping to   +-----------+
- |           |                     |     0     | -------------> |     0     |
- |           |                     +-----------+                +-----------+
- |           |                     |     1     | ---------------^ ^ ^ ^ ^ ^ ^
- |           |                     +-----------+                  | | | | | |
- |           |                     |     2     | -----------------+ | | | | |
- |           |                     +-----------+                    | | | | |
- |           |                     |     3     | -------------------+ | | | |
- |           |                     +-----------+                      | | | |
- |           |                     |     4     | ---------------------+ | | |
- |    PMD    |                     +-----------+                        | | |
- |   level   |                     |     5     | -----------------------+ | |
- |  mapping  |                     +-----------+                          | |
- |           |                     |     6     | -------------------------+ |
- |           |                     +-----------+                            |
- |           |                     |     7     | ---------------------------+
+    HugeTLB                  struct pages(8 pages)                 page frame
+ +-----------+ ---virt_to_page---> +-----------+   mapping to   +----------------+
+ |           |                     |     0     | -------------> |       0        |
+ |           |                     +-----------+                +----------------+
+ |           |                     |     1     | ------┐
+ |           |                     +-----------+       |
+ |           |                     |     2     | ------┼        +----------------+
+ |           |                     +-----------+       |        | vmemmap_tail   |
+ |           |                     |     3     | ------┼------> | shared for the |
+ |           |                     +-----------+       |        | struct hstate  |
+ |           |                     |     4     | ------┼        +----------------+
+ |           |                     +-----------+       |
+ |           |                     |     5     | ------┼
+ |    PMD    |                     +-----------+       |
+ |   level   |                     |     6     | ------┼
+ |  mapping  |                     +-----------+       |
+ |           |                     |     7     | ------┘
  |           |                     +-----------+
  |           |
  |           |
@@ -172,16 +174,6 @@ The contiguous bit is used to increase the mapping size at the pmd and pte
 (last) level. So this type of HugeTLB page can be optimized only when its
 size of the ``struct page`` structs is greater than **1** page.
 
-Notice: The head vmemmap page is not freed to the buddy allocator and all
-tail vmemmap pages are mapped to the head vmemmap page frame. So we can see
-more than one ``struct page`` struct with ``PG_head`` (e.g. 8 per 2 MB HugeTLB
-page) associated with each HugeTLB page. The ``compound_head()`` can handle
-this correctly. There is only **one** head ``struct page``, the tail
-``struct page`` with ``PG_head`` are fake head ``struct page``. We need an
-approach to distinguish between those two different types of ``struct page`` so
-that ``compound_head()`` can return the real head ``struct page`` when the
-parameter is the tail ``struct page`` but with ``PG_head``.
-
 Device DAX
 ==========
 
-- 
2.51.2