From nobody Wed Apr 29 08:09:57 2026
From: Muchun Song
To: mike.kravetz@oracle.com, david@redhat.com, akpm@linux-foundation.org, corbet@lwn.net
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, Muchun Song, Catalin Marinas, Will Deacon, Anshuman Khandual
Subject: [PATCH 1/6] mm: hugetlb_vmemmap: delete hugetlb_optimize_vmemmap_enabled()
Date: Mon, 13 Jun 2022 14:35:07 +0800
Message-Id: <20220613063512.17540-2-songmuchun@bytedance.com>
In-Reply-To: <20220613063512.17540-1-songmuchun@bytedance.com>
References: <20220613063512.17540-1-songmuchun@bytedance.com>
The name hugetlb_optimize_vmemmap_enabled() is a bit confusing, as it tests two conditions (the feature is enabled and optimized pages are in use). Instead of coming up with a more appropriate name, we can simply delete the helper; there is already a discussion about deleting it in thread [1].

The only user of hugetlb_optimize_vmemmap_enabled() outside of hugetlb_vmemmap is flush_dcache_page() in arch/arm64/mm/flush.c, and it does not need the call: HugeTLB pages are always fully mapped, and only the head page is set PG_dcache_clean, so only the head page's flag may need to be cleared (see commit cf5a501d985b). Hence hugetlb_optimize_vmemmap_enabled() is easy to remove. (A small userspace model of this head-page lookup follows the patch.)

Link: https://lore.kernel.org/all/c77c61c8-8a5a-87e8-db89-d04d8aaab4cc@oracle.com/ [1]
Signed-off-by: Muchun Song
Cc: Catalin Marinas
Cc: Will Deacon
Cc: Anshuman Khandual
Reviewed-by: Catalin Marinas
Reviewed-by: Mike Kravetz
Reviewed-by: Oscar Salvador
---
 arch/arm64/mm/flush.c      | 13 +++----------
 include/linux/page-flags.h | 14 ++------------
 2 files changed, 5 insertions(+), 22 deletions(-)

diff --git a/arch/arm64/mm/flush.c b/arch/arm64/mm/flush.c
index fc4f710e9820..5f9379b3c8c8 100644
--- a/arch/arm64/mm/flush.c
+++ b/arch/arm64/mm/flush.c
@@ -76,17 +76,10 @@ EXPORT_SYMBOL_GPL(__sync_icache_dcache);
 void flush_dcache_page(struct page *page)
 {
 	/*
-	 * Only the head page's flags of HugeTLB can be cleared since the tail
-	 * vmemmap pages associated with each HugeTLB page are mapped with
-	 * read-only when CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP is enabled (more
-	 * details can refer to vmemmap_remap_pte()). Although
-	 * __sync_icache_dcache() only set PG_dcache_clean flag on the head
-	 * page struct, there is more than one page struct with PG_dcache_clean
-	 * associated with the HugeTLB page since the head vmemmap page frame
-	 * is reused (more details can refer to the comments above
-	 * page_fixed_fake_head()).
+	 * HugeTLB pages are always fully mapped and only head page will be
+	 * set PG_dcache_clean (see comments in __sync_icache_dcache()).
 	 */
-	if (hugetlb_optimize_vmemmap_enabled() && PageHuge(page))
+	if (PageHuge(page))
 		page = compound_head(page);

 	if (test_bit(PG_dcache_clean, &page->flags))
diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index de80f0c26b2f..b8b992cb201c 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -203,12 +203,6 @@ enum pageflags {
 DECLARE_STATIC_KEY_MAYBE(CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP_DEFAULT_ON,
			 hugetlb_optimize_vmemmap_key);

-static __always_inline bool hugetlb_optimize_vmemmap_enabled(void)
-{
-	return static_branch_maybe(CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP_DEFAULT_ON,
-				   &hugetlb_optimize_vmemmap_key);
-}
-
 /*
  * If the feature of optimizing vmemmap pages associated with each HugeTLB
  * page is enabled, the head vmemmap page frame is reused and all of the tail
@@ -227,7 +221,8 @@ static __always_inline bool hugetlb_optimize_vmemmap_enabled(void)
  */
 static __always_inline const struct page *page_fixed_fake_head(const struct page *page)
 {
-	if (!hugetlb_optimize_vmemmap_enabled())
+	if (!static_branch_maybe(CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP_DEFAULT_ON,
+				 &hugetlb_optimize_vmemmap_key))
 		return page;

 	/*
@@ -255,11 +250,6 @@ static inline const struct page *page_fixed_fake_head(const struct page *page)
 {
	return page;
 }
-
-static inline bool hugetlb_optimize_vmemmap_enabled(void)
-{
-	return false;
-}
 #endif

 static __always_inline int page_is_fake_head(struct page *page)
--
2.11.0
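The simplification above rests on the fact that the PG_dcache_clean query only ever needs the compound head. A minimal userspace model of that lookup follows; the mock_page type and head_of() helper are mocks invented for illustration, not kernel API:

    /*
     * Userspace sketch: the flag lives only on the head page of a
     * compound page, so any query through a tail page must resolve
     * the head first, exactly like compound_head() in the patch.
     */
    #include <stdio.h>

    #define PG_DCACHE_CLEAN (1UL << 0)

    struct mock_page {
    	unsigned long flags;
    	struct mock_page *compound_head;	/* NULL for a head page */
    };

    static struct mock_page *head_of(struct mock_page *page)
    {
    	return page->compound_head ? page->compound_head : page;
    }

    int main(void)
    {
    	struct mock_page huge[8] = { 0 };

    	for (int i = 1; i < 8; i++)
    		huge[i].compound_head = &huge[0];

    	huge[0].flags |= PG_DCACHE_CLEAN;	/* only the head is flagged */

    	/* A query through a tail page still sees the head's flag. */
    	struct mock_page *p = head_of(&huge[5]);
    	printf("dcache clean: %s\n",
    	       (p->flags & PG_DCACHE_CLEAN) ? "yes" : "no");
    	return 0;
    }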
From nobody Wed Apr 29 08:09:57 2026
From: Muchun Song
To: mike.kravetz@oracle.com, david@redhat.com, akpm@linux-foundation.org, corbet@lwn.net
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, Muchun Song
Subject: [PATCH 2/6] mm: hugetlb_vmemmap: optimize vmemmap_optimize_mode handling
Date: Mon, 13 Jun 2022 14:35:08 +0800
Message-Id: <20220613063512.17540-3-songmuchun@bytedance.com>
In-Reply-To: <20220613063512.17540-1-songmuchun@bytedance.com>
References: <20220613063512.17540-1-songmuchun@bytedance.com>

We used to hold an extra reference to hugetlb_optimize_vmemmap_key when switching vmemmap_optimize_mode on, because the static key was used to tell memory_hotplug that memory_hotplug.memmap_on_memory should be overridden. That requirement went away when SECTION_CANNOT_OPTIMIZE_VMEMMAP was introduced. Therefore, we can simplify the vmemmap_optimize_mode handling by no longer taking that extra reference on hugetlb_optimize_vmemmap_key. (A userspace sketch of the boolean parsing this relies on follows the patch.)
Signed-off-by: Muchun Song
Reviewed-by: Mike Kravetz
Reviewed-by: Oscar Salvador
---
 include/linux/page-flags.h |  6 ++---
 mm/hugetlb_vmemmap.c       | 65 +++++-----------------------------------
 2 files changed, 9 insertions(+), 62 deletions(-)

diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index b8b992cb201c..da7ccc3b16ad 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -200,8 +200,7 @@ enum pageflags {
 #ifndef __GENERATING_BOUNDS_H

 #ifdef CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP
-DECLARE_STATIC_KEY_MAYBE(CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP_DEFAULT_ON,
-			 hugetlb_optimize_vmemmap_key);
+DECLARE_STATIC_KEY_FALSE(hugetlb_optimize_vmemmap_key);

 /*
  * If the feature of optimizing vmemmap pages associated with each HugeTLB
@@ -221,8 +220,7 @@ DECLARE_STATIC_KEY_MAYBE(CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP_DEFAULT_ON,
  */
 static __always_inline const struct page *page_fixed_fake_head(const struct page *page)
 {
-	if (!static_branch_maybe(CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP_DEFAULT_ON,
-				 &hugetlb_optimize_vmemmap_key))
+	if (!static_branch_unlikely(&hugetlb_optimize_vmemmap_key))
 		return page;

 	/*
diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
index e20a7082f2f8..132dc83f0130 100644
--- a/mm/hugetlb_vmemmap.c
+++ b/mm/hugetlb_vmemmap.c
@@ -23,42 +23,15 @@
 #define RESERVE_VMEMMAP_NR		1U
 #define RESERVE_VMEMMAP_SIZE		(RESERVE_VMEMMAP_NR << PAGE_SHIFT)

-enum vmemmap_optimize_mode {
-	VMEMMAP_OPTIMIZE_OFF,
-	VMEMMAP_OPTIMIZE_ON,
-};
-
-DEFINE_STATIC_KEY_MAYBE(CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP_DEFAULT_ON,
-			hugetlb_optimize_vmemmap_key);
+DEFINE_STATIC_KEY_FALSE(hugetlb_optimize_vmemmap_key);
 EXPORT_SYMBOL(hugetlb_optimize_vmemmap_key);

-static enum vmemmap_optimize_mode vmemmap_optimize_mode =
+static bool vmemmap_optimize_enabled =
 	IS_ENABLED(CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP_DEFAULT_ON);

-static void vmemmap_optimize_mode_switch(enum vmemmap_optimize_mode to)
-{
-	if (vmemmap_optimize_mode == to)
-		return;
-
-	if (to == VMEMMAP_OPTIMIZE_OFF)
-		static_branch_dec(&hugetlb_optimize_vmemmap_key);
-	else
-		static_branch_inc(&hugetlb_optimize_vmemmap_key);
-	WRITE_ONCE(vmemmap_optimize_mode, to);
-}
-
 static int __init hugetlb_vmemmap_early_param(char *buf)
 {
-	bool enable;
-	enum vmemmap_optimize_mode mode;
-
-	if (kstrtobool(buf, &enable))
-		return -EINVAL;
-
-	mode = enable ? VMEMMAP_OPTIMIZE_ON : VMEMMAP_OPTIMIZE_OFF;
-	vmemmap_optimize_mode_switch(mode);
-
-	return 0;
+	return kstrtobool(buf, &vmemmap_optimize_enabled);
 }
 early_param("hugetlb_free_vmemmap", hugetlb_vmemmap_early_param);

@@ -103,7 +76,7 @@ static unsigned int optimizable_vmemmap_pages(struct hstate *h,
 	unsigned long pfn = page_to_pfn(head);
 	unsigned long end = pfn + pages_per_huge_page(h);

-	if (READ_ONCE(vmemmap_optimize_mode) == VMEMMAP_OPTIMIZE_OFF)
+	if (!READ_ONCE(vmemmap_optimize_enabled))
 		return 0;

 	for (; pfn < end; pfn += PAGES_PER_SECTION) {
@@ -155,7 +128,6 @@ void __init hugetlb_vmemmap_init(struct hstate *h)

 	if (!is_power_of_2(sizeof(struct page))) {
 		pr_warn_once("cannot optimize vmemmap pages because \"struct page\" crosses page boundaries\n");
-		static_branch_disable(&hugetlb_optimize_vmemmap_key);
 		return;
 	}

@@ -176,36 +148,13 @@ void __init hugetlb_vmemmap_init(struct hstate *h)
 }

 #ifdef CONFIG_PROC_SYSCTL
-static int hugetlb_optimize_vmemmap_handler(struct ctl_table *table, int write,
-					    void *buffer, size_t *length,
-					    loff_t *ppos)
-{
-	int ret;
-	enum vmemmap_optimize_mode mode;
-	static DEFINE_MUTEX(sysctl_mutex);
-
-	if (write && !capable(CAP_SYS_ADMIN))
-		return -EPERM;
-
-	mutex_lock(&sysctl_mutex);
-	mode = vmemmap_optimize_mode;
-	table->data = &mode;
-	ret = proc_dointvec_minmax(table, write, buffer, length, ppos);
-	if (write && !ret)
-		vmemmap_optimize_mode_switch(mode);
-	mutex_unlock(&sysctl_mutex);
-
-	return ret;
-}
-
 static struct ctl_table hugetlb_vmemmap_sysctls[] = {
 	{
 		.procname	= "hugetlb_optimize_vmemmap",
-		.maxlen		= sizeof(enum vmemmap_optimize_mode),
+		.data		= &vmemmap_optimize_enabled,
+		.maxlen		= sizeof(int),
 		.mode		= 0644,
-		.proc_handler	= hugetlb_optimize_vmemmap_handler,
-		.extra1		= SYSCTL_ZERO,
-		.extra2		= SYSCTL_ONE,
+		.proc_handler	= proc_dobool,
 	},
 	{ }
 };
--
2.11.0
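The simplified early parameter and the proc_dobool() handler both accept boolean spellings via kstrtobool(). A rough userspace approximation of its acceptance rules follows; this is a sketch for illustration only, and the real parser (in lib/kstrtox.c) may differ in detail:

    #include <stdbool.h>
    #include <stdio.h>

    /* Approximation of kstrtobool(): y/Y/1, n/N/0, and on/off. */
    static int parse_bool(const char *s, bool *res)
    {
    	if (!s || !s[0])
    		return -1;

    	switch (s[0]) {
    	case 'y': case 'Y': case '1':
    		*res = true;
    		return 0;
    	case 'n': case 'N': case '0':
    		*res = false;
    		return 0;
    	case 'o': case 'O':
    		switch (s[1]) {
    		case 'n': case 'N':
    			*res = true;
    			return 0;
    		case 'f': case 'F':
    			*res = false;
    			return 0;
    		}
    		return -1;
    	default:
    		return -1;
    	}
    }

    int main(void)
    {
    	bool vmemmap_optimize_enabled = false;
    	const char *args[] = { "on", "off", "1", "bogus" };

    	for (int i = 0; i < 4; i++) {
    		int err = parse_bool(args[i], &vmemmap_optimize_enabled);
    		printf("%-6s -> %s\n", args[i],
    		       err ? "error" : (vmemmap_optimize_enabled ? "on" : "off"));
    	}
    	return 0;
    }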
From nobody Wed Apr 29 08:09:57 2026
From: Muchun Song
To: mike.kravetz@oracle.com, david@redhat.com, akpm@linux-foundation.org, corbet@lwn.net
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, Muchun Song
Subject: [PATCH 3/6] mm: hugetlb_vmemmap: introduce the name HVO
Date: Mon, 13 Jun 2022 14:35:09 +0800
Message-Id: <20220613063512.17540-4-songmuchun@bytedance.com>
In-Reply-To: <20220613063512.17540-1-songmuchun@bytedance.com>
References: <20220613063512.17540-1-songmuchun@bytedance.com>

It is inconvenient to mention the feature of optimizing vmemmap pages associated with HugeTLB pages when communicating with others, since it has had no specific or abbreviated name since it was first introduced. Let us give it the name HVO (HugeTLB Vmemmap Optimization) from now on.

This commit also updates the documentation of "hugetlb_free_vmemmap" along the way, as discussed in thread [1]. (A worked example of the memory savings quoted below follows the patch.)

Link: https://lore.kernel.org/all/21aae898-d54d-cc4b-a11f-1bb7fddcfffa@redhat.com/ [1]
Signed-off-by: Muchun Song
Reviewed-by: Mike Kravetz
Reviewed-by: Oscar Salvador
---
 Documentation/admin-guide/kernel-parameters.txt |  7 ++++---
 Documentation/admin-guide/mm/hugetlbpage.rst    |  3 +--
 Documentation/admin-guide/sysctl/vm.rst         |  3 +--
 fs/Kconfig                                      | 13 ++++++-------
 mm/hugetlb_vmemmap.c                            |  8 ++++----
 mm/hugetlb_vmemmap.h                            |  4 ++--
 6 files changed, 18 insertions(+), 20 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 391b43fee93e..7539553b3fb0 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -1725,12 +1725,13 @@
 	hugetlb_free_vmemmap=
 			[KNL] Reguires CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP
 			enabled.
+			Control if HugeTLB Vmemmap Optimization (HVO) is enabled.
 			Allows heavy hugetlb users to free up some more
 			memory (7 * PAGE_SIZE for each 2MB hugetlb page).
-			Format: { [oO][Nn]/Y/y/1 | [oO][Ff]/N/n/0 (default) }
+			Format: { on | off (default) }

-			[oO][Nn]/Y/y/1: enable the feature
-			[oO][Ff]/N/n/0: disable the feature
+			on: enable HVO
+			off: disable HVO

 			Built with CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP_DEFAULT_ON=y,
 			the default is on.
diff --git a/Documentation/admin-guide/mm/hugetlbpage.rst b/Documentation/admin-guide/mm/hugetlbpage.rst
index a90330d0a837..64e0d5c512e7 100644
--- a/Documentation/admin-guide/mm/hugetlbpage.rst
+++ b/Documentation/admin-guide/mm/hugetlbpage.rst
@@ -164,8 +164,7 @@ default_hugepagesz
 	will all result in 256 2M huge pages being allocated.  Valid default
 	huge page size is architecture dependent.
 hugetlb_free_vmemmap
-	When CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP is set, this enables optimizing
-	unused vmemmap pages associated with each HugeTLB page.
+	When CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP is set, this enables HVO.

 When multiple huge page sizes are supported, ``/proc/sys/vm/nr_hugepages``
 indicates the current number of pre-allocated huge pages of the default size.
diff --git a/Documentation/admin-guide/sysctl/vm.rst b/Documentation/admin-guide/sysctl/vm.rst
index d7374a1e8ac9..c9f35db973f0 100644
--- a/Documentation/admin-guide/sysctl/vm.rst
+++ b/Documentation/admin-guide/sysctl/vm.rst
@@ -569,8 +569,7 @@
 This knob is not available when the size of 'struct page' (a structure defined
 in include/linux/mm_types.h) is not power of two (an unusual system config
 could result in this).

-Enable (set to 1) or disable (set to 0) the feature of optimizing vmemmap pages
-associated with each HugeTLB page.
+Enable (set to 1) or disable (set to 0) HugeTLB Vmemmap Optimization (HVO).

 Once enabled, the vmemmap pages of subsequent allocation of HugeTLB pages from
 buddy allocator will be optimized (7 pages per 2MB HugeTLB page and 4095 pages
diff --git a/fs/Kconfig b/fs/Kconfig
index 5976eb33535f..2f9fd840cb66 100644
--- a/fs/Kconfig
+++ b/fs/Kconfig
@@ -247,8 +247,7 @@ config HUGETLB_PAGE

 #
 # Select this config option from the architecture Kconfig, if it is preferred
-# to enable the feature of minimizing overhead of struct page associated with
-# each HugeTLB page.
+# to enable the feature of HugeTLB Vmemmap Optimization (HVO).
 #
 config ARCH_WANT_HUGETLB_PAGE_OPTIMIZE_VMEMMAP
 	bool
@@ -259,14 +258,14 @@ config HUGETLB_PAGE_OPTIMIZE_VMEMMAP
 	depends on SPARSEMEM_VMEMMAP

 config HUGETLB_PAGE_OPTIMIZE_VMEMMAP_DEFAULT_ON
-	bool "Default optimizing vmemmap pages of HugeTLB to on"
+	bool "Default HugeTLB Vmemmap Optimization (HVO) to on"
 	default n
 	depends on HUGETLB_PAGE_OPTIMIZE_VMEMMAP
 	help
-	  When using HUGETLB_PAGE_OPTIMIZE_VMEMMAP, the optimizing unused vmemmap
-	  pages associated with each HugeTLB page is default off. Say Y here
-	  to enable optimizing vmemmap pages of HugeTLB by default. It can then
-	  be disabled on the command line via hugetlb_free_vmemmap=off.
+	  When using HUGETLB_PAGE_OPTIMIZE_VMEMMAP, the HugeTLB Vmemmap
+	  Optimization (HVO) is off by default. Say Y here to enable HVO
+	  by default. It can then be disabled on the command line via
+	  hugetlb_free_vmemmap=off or sysctl.

 config MEMFD_CREATE
 	def_bool TMPFS || HUGETLBFS
diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
index 132dc83f0130..c10540993577 100644
--- a/mm/hugetlb_vmemmap.c
+++ b/mm/hugetlb_vmemmap.c
@@ -1,8 +1,8 @@
 // SPDX-License-Identifier: GPL-2.0
 /*
- * Optimize vmemmap pages associated with HugeTLB
+ * HugeTLB Vmemmap Optimization (HVO)
  *
- * Copyright (c) 2020, Bytedance. All rights reserved.
+ * Copyright (c) 2020, ByteDance. All rights reserved.
  *
  * Author: Muchun Song
  *
@@ -120,8 +120,8 @@ void __init hugetlb_vmemmap_init(struct hstate *h)

 	/*
 	 * There are only (RESERVE_VMEMMAP_SIZE / sizeof(struct page)) struct
-	 * page structs that can be used when CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP,
-	 * so add a BUILD_BUG_ON to catch invalid usage of the tail struct page.
+	 * page structs that can be used when HVO is enabled, add a BUILD_BUG_ON
+	 * to catch invalid usage of the tail page structs.
 	 */
 	BUILD_BUG_ON(__NR_USED_SUBPAGE >=
 		     RESERVE_VMEMMAP_SIZE / sizeof(struct page));
diff --git a/mm/hugetlb_vmemmap.h b/mm/hugetlb_vmemmap.h
index 109b0a53b6fe..ba66fadad9fc 100644
--- a/mm/hugetlb_vmemmap.h
+++ b/mm/hugetlb_vmemmap.h
@@ -1,8 +1,8 @@
 // SPDX-License-Identifier: GPL-2.0
 /*
- * Optimize vmemmap pages associated with HugeTLB
+ * HugeTLB Vmemmap Optimization (HVO)
  *
- * Copyright (c) 2020, Bytedance. All rights reserved.
+ * Copyright (c) 2020, ByteDance. All rights reserved.
  *
  * Author: Muchun Song
  */
--
2.11.0
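To make the "7 * PAGE_SIZE per 2MB hugetlb page" and "4095 pages per 1GB" figures above concrete, here is a small worked example. It assumes 4 KiB base pages and a 64-byte struct page, which are the common x86-64 values but are assumptions of this sketch:

    #include <stdio.h>

    #define PAGE_SIZE	4096UL
    #define STRUCT_PAGE	64UL	/* assumed sizeof(struct page) */

    static void report(const char *name, unsigned long huge_size)
    {
    	unsigned long nr_struct_pages = huge_size / PAGE_SIZE;
    	unsigned long vmemmap_bytes = nr_struct_pages * STRUCT_PAGE;
    	unsigned long vmemmap_pages = vmemmap_bytes / PAGE_SIZE;

    	/* HVO keeps one vmemmap page and frees the rest. */
    	printf("%s: %lu vmemmap pages, %lu freed per hugepage\n",
    	       name, vmemmap_pages, vmemmap_pages - 1);
    }

    int main(void)
    {
    	report("2MB", 2UL << 20);	/* 8 vmemmap pages, 7 freed */
    	report("1GB", 1UL << 30);	/* 4096 vmemmap pages, 4095 freed */
    	return 0;
    }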
From nobody Wed Apr 29 08:09:57 2026
From: Muchun Song
To: mike.kravetz@oracle.com, david@redhat.com, akpm@linux-foundation.org, corbet@lwn.net
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, Muchun Song
Subject: [PATCH 4/6] mm: hugetlb_vmemmap: move vmemmap code related to HugeTLB to hugetlb_vmemmap.c
Date: Mon, 13 Jun 2022 14:35:10 +0800
Message-Id: <20220613063512.17540-5-songmuchun@bytedance.com>
In-Reply-To: <20220613063512.17540-1-songmuchun@bytedance.com>
References: <20220613063512.17540-1-songmuchun@bytedance.com>

When I first introduced the vmemmap manipulation functions related to HugeTLB, I thought they might be reused by other modules (e.g. by anyone using a similar approach to optimize vmemmap pages; unfortunately, DAX uses the same approach but does not use these functions). After two years we have not seen any other users, so move the functions to hugetlb_vmemmap.c. (A toy userspace sketch of the remapping idea follows the patch.)

Signed-off-by: Muchun Song
Reviewed-by: Mike Kravetz
Reviewed-by: Oscar Salvador
---
 include/linux/mm.h   |   7 -
 mm/hugetlb_vmemmap.c | 391 ++++++++++++++++++++++++++++++++++++++++++++++++-
 mm/sparse-vmemmap.c  | 391 -------------------------------------------------
 3 files changed, 390 insertions(+), 399 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 623c2ee8330a..152d0eefe5aa 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -3208,13 +3208,6 @@ static inline void print_vma_addr(char *prefix, unsigned long rip)
 }
 #endif

-#ifdef CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP
-int vmemmap_remap_free(unsigned long start, unsigned long end,
-		       unsigned long reuse);
-int vmemmap_remap_alloc(unsigned long start, unsigned long end,
-			unsigned long reuse, gfp_t gfp_mask);
-#endif
-
 void *sparse_buffer_alloc(unsigned long size);
 struct page * __populate_section_memmap(unsigned long pfn,
 		unsigned long nr_pages, int nid, struct vmem_altmap *altmap,
diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
index c10540993577..abdf441215bb 100644
--- a/mm/hugetlb_vmemmap.c
+++ b/mm/hugetlb_vmemmap.c
@@ -10,9 +10,31 @@
  */
 #define pr_fmt(fmt) "HugeTLB: " fmt

-#include
+#include
+#include
+#include
+#include
 #include "hugetlb_vmemmap.h"

+/**
+ * struct vmemmap_remap_walk - walk vmemmap page table
+ *
+ * @remap_pte:		called for each lowest-level entry (PTE).
+ * @nr_walked:		the number of walked pte.
+ * @reuse_page:		the page which is reused for the tail vmemmap pages.
+ * @reuse_addr:		the virtual address of the @reuse_page page.
+ * @vmemmap_pages:	the list head of the vmemmap pages that can be freed
+ *			or is mapped from.
+ */
+struct vmemmap_remap_walk {
+	void (*remap_pte)(pte_t *pte, unsigned long addr,
+			  struct vmemmap_remap_walk *walk);
+	unsigned long nr_walked;
+	struct page *reuse_page;
+	unsigned long reuse_addr;
+	struct list_head *vmemmap_pages;
+};
+
 /*
  * There are a lot of struct page structures associated with each HugeTLB page.
* For tail pages, the value of compound_head is the same. So we can reuse= first @@ -23,6 +45,373 @@ #define RESERVE_VMEMMAP_NR 1U #define RESERVE_VMEMMAP_SIZE (RESERVE_VMEMMAP_NR << PAGE_SHIFT) =20 +static int __split_vmemmap_huge_pmd(pmd_t *pmd, unsigned long start) +{ + pmd_t __pmd; + int i; + unsigned long addr =3D start; + struct page *page =3D pmd_page(*pmd); + pte_t *pgtable =3D pte_alloc_one_kernel(&init_mm); + + if (!pgtable) + return -ENOMEM; + + pmd_populate_kernel(&init_mm, &__pmd, pgtable); + + for (i =3D 0; i < PMD_SIZE / PAGE_SIZE; i++, addr +=3D PAGE_SIZE) { + pte_t entry, *pte; + pgprot_t pgprot =3D PAGE_KERNEL; + + entry =3D mk_pte(page + i, pgprot); + pte =3D pte_offset_kernel(&__pmd, addr); + set_pte_at(&init_mm, addr, pte, entry); + } + + spin_lock(&init_mm.page_table_lock); + if (likely(pmd_leaf(*pmd))) { + /* Make pte visible before pmd. See comment in pmd_install(). */ + smp_wmb(); + pmd_populate_kernel(&init_mm, pmd, pgtable); + flush_tlb_kernel_range(start, start + PMD_SIZE); + } else { + pte_free_kernel(&init_mm, pgtable); + } + spin_unlock(&init_mm.page_table_lock); + + return 0; +} + +static int split_vmemmap_huge_pmd(pmd_t *pmd, unsigned long start) +{ + int leaf; + + spin_lock(&init_mm.page_table_lock); + leaf =3D pmd_leaf(*pmd); + spin_unlock(&init_mm.page_table_lock); + + if (!leaf) + return 0; + + return __split_vmemmap_huge_pmd(pmd, start); +} + +static void vmemmap_pte_range(pmd_t *pmd, unsigned long addr, + unsigned long end, + struct vmemmap_remap_walk *walk) +{ + pte_t *pte =3D pte_offset_kernel(pmd, addr); + + /* + * The reuse_page is found 'first' in table walk before we start + * remapping (which is calling @walk->remap_pte). + */ + if (!walk->reuse_page) { + walk->reuse_page =3D pte_page(*pte); + /* + * Because the reuse address is part of the range that we are + * walking, skip the reuse address range. 
+ */ + addr +=3D PAGE_SIZE; + pte++; + walk->nr_walked++; + } + + for (; addr !=3D end; addr +=3D PAGE_SIZE, pte++) { + walk->remap_pte(pte, addr, walk); + walk->nr_walked++; + } +} + +static int vmemmap_pmd_range(pud_t *pud, unsigned long addr, + unsigned long end, + struct vmemmap_remap_walk *walk) +{ + pmd_t *pmd; + unsigned long next; + + pmd =3D pmd_offset(pud, addr); + do { + int ret; + + ret =3D split_vmemmap_huge_pmd(pmd, addr & PMD_MASK); + if (ret) + return ret; + + next =3D pmd_addr_end(addr, end); + vmemmap_pte_range(pmd, addr, next, walk); + } while (pmd++, addr =3D next, addr !=3D end); + + return 0; +} + +static int vmemmap_pud_range(p4d_t *p4d, unsigned long addr, + unsigned long end, + struct vmemmap_remap_walk *walk) +{ + pud_t *pud; + unsigned long next; + + pud =3D pud_offset(p4d, addr); + do { + int ret; + + next =3D pud_addr_end(addr, end); + ret =3D vmemmap_pmd_range(pud, addr, next, walk); + if (ret) + return ret; + } while (pud++, addr =3D next, addr !=3D end); + + return 0; +} + +static int vmemmap_p4d_range(pgd_t *pgd, unsigned long addr, + unsigned long end, + struct vmemmap_remap_walk *walk) +{ + p4d_t *p4d; + unsigned long next; + + p4d =3D p4d_offset(pgd, addr); + do { + int ret; + + next =3D p4d_addr_end(addr, end); + ret =3D vmemmap_pud_range(p4d, addr, next, walk); + if (ret) + return ret; + } while (p4d++, addr =3D next, addr !=3D end); + + return 0; +} + +static int vmemmap_remap_range(unsigned long start, unsigned long end, + struct vmemmap_remap_walk *walk) +{ + unsigned long addr =3D start; + unsigned long next; + pgd_t *pgd; + + VM_BUG_ON(!PAGE_ALIGNED(start)); + VM_BUG_ON(!PAGE_ALIGNED(end)); + + pgd =3D pgd_offset_k(addr); + do { + int ret; + + next =3D pgd_addr_end(addr, end); + ret =3D vmemmap_p4d_range(pgd, addr, next, walk); + if (ret) + return ret; + } while (pgd++, addr =3D next, addr !=3D end); + + /* + * We only change the mapping of the vmemmap virtual address range + * [@start + PAGE_SIZE, end), so we only need to flush the TLB which + * belongs to the range. + */ + flush_tlb_kernel_range(start + PAGE_SIZE, end); + + return 0; +} + +/* + * Free a vmemmap page. A vmemmap page can be allocated from the memblock + * allocator or buddy allocator. If the PG_reserved flag is set, it means + * that it allocated from the memblock allocator, just free it via the + * free_bootmem_page(). Otherwise, use __free_page(). + */ +static inline void free_vmemmap_page(struct page *page) +{ + if (PageReserved(page)) + free_bootmem_page(page); + else + __free_page(page); +} + +/* Free a list of the vmemmap pages */ +static void free_vmemmap_page_list(struct list_head *list) +{ + struct page *page, *next; + + list_for_each_entry_safe(page, next, list, lru) { + list_del(&page->lru); + free_vmemmap_page(page); + } +} + +static void vmemmap_remap_pte(pte_t *pte, unsigned long addr, + struct vmemmap_remap_walk *walk) +{ + /* + * Remap the tail pages as read-only to catch illegal write operation + * to the tail pages. + */ + pgprot_t pgprot =3D PAGE_KERNEL_RO; + pte_t entry =3D mk_pte(walk->reuse_page, pgprot); + struct page *page =3D pte_page(*pte); + + list_add_tail(&page->lru, walk->vmemmap_pages); + set_pte_at(&init_mm, addr, pte, entry); +} + +/* + * How many struct page structs need to be reset. When we reuse the head + * struct page, the special metadata (e.g. page->flags or page->mapping) + * cannot copy to the tail struct page structs. The invalid value will be + * checked in the free_tail_pages_check(). 
In order to avoid the message + * of "corrupted mapping in tail page". We need to reset at least 3 (one + * head struct page struct and two tail struct page structs) struct page + * structs. + */ +#define NR_RESET_STRUCT_PAGE 3 + +static inline void reset_struct_pages(struct page *start) +{ + int i; + struct page *from =3D start + NR_RESET_STRUCT_PAGE; + + for (i =3D 0; i < NR_RESET_STRUCT_PAGE; i++) + memcpy(start + i, from, sizeof(*from)); +} + +static void vmemmap_restore_pte(pte_t *pte, unsigned long addr, + struct vmemmap_remap_walk *walk) +{ + pgprot_t pgprot =3D PAGE_KERNEL; + struct page *page; + void *to; + + BUG_ON(pte_page(*pte) !=3D walk->reuse_page); + + page =3D list_first_entry(walk->vmemmap_pages, struct page, lru); + list_del(&page->lru); + to =3D page_to_virt(page); + copy_page(to, (void *)walk->reuse_addr); + reset_struct_pages(to); + + set_pte_at(&init_mm, addr, pte, mk_pte(page, pgprot)); +} + +/** + * vmemmap_remap_free - remap the vmemmap virtual address range [@start, @= end) + * to the page which @reuse is mapped to, then free vmemmap + * which the range are mapped to. + * @start: start address of the vmemmap virtual address range that we want + * to remap. + * @end: end address of the vmemmap virtual address range that we want to + * remap. + * @reuse: reuse address. + * + * Return: %0 on success, negative error code otherwise. + */ +static int vmemmap_remap_free(unsigned long start, unsigned long end, + unsigned long reuse) +{ + int ret; + LIST_HEAD(vmemmap_pages); + struct vmemmap_remap_walk walk =3D { + .remap_pte =3D vmemmap_remap_pte, + .reuse_addr =3D reuse, + .vmemmap_pages =3D &vmemmap_pages, + }; + + /* + * In order to make remapping routine most efficient for the huge pages, + * the routine of vmemmap page table walking has the following rules + * (see more details from the vmemmap_pte_range()): + * + * - The range [@start, @end) and the range [@reuse, @reuse + PAGE_SIZE) + * should be continuous. + * - The @reuse address is part of the range [@reuse, @end) that we are + * walking which is passed to vmemmap_remap_range(). + * - The @reuse address is the first in the complete range. + * + * So we need to make sure that @start and @reuse meet the above rules. + */ + BUG_ON(start - reuse !=3D PAGE_SIZE); + + mmap_read_lock(&init_mm); + ret =3D vmemmap_remap_range(reuse, end, &walk); + if (ret && walk.nr_walked) { + end =3D reuse + walk.nr_walked * PAGE_SIZE; + /* + * vmemmap_pages contains pages from the previous + * vmemmap_remap_range call which failed. These + * are pages which were removed from the vmemmap. + * They will be restored in the following call. 
+ */ + walk =3D (struct vmemmap_remap_walk) { + .remap_pte =3D vmemmap_restore_pte, + .reuse_addr =3D reuse, + .vmemmap_pages =3D &vmemmap_pages, + }; + + vmemmap_remap_range(reuse, end, &walk); + } + mmap_read_unlock(&init_mm); + + free_vmemmap_page_list(&vmemmap_pages); + + return ret; +} + +static int alloc_vmemmap_page_list(unsigned long start, unsigned long end, + gfp_t gfp_mask, struct list_head *list) +{ + unsigned long nr_pages =3D (end - start) >> PAGE_SHIFT; + int nid =3D page_to_nid((struct page *)start); + struct page *page, *next; + + while (nr_pages--) { + page =3D alloc_pages_node(nid, gfp_mask, 0); + if (!page) + goto out; + list_add_tail(&page->lru, list); + } + + return 0; +out: + list_for_each_entry_safe(page, next, list, lru) + __free_pages(page, 0); + return -ENOMEM; +} + +/** + * vmemmap_remap_alloc - remap the vmemmap virtual address range [@start, = end) + * to the page which is from the @vmemmap_pages + * respectively. + * @start: start address of the vmemmap virtual address range that we want + * to remap. + * @end: end address of the vmemmap virtual address range that we want to + * remap. + * @reuse: reuse address. + * @gfp_mask: GFP flag for allocating vmemmap pages. + * + * Return: %0 on success, negative error code otherwise. + */ +static int vmemmap_remap_alloc(unsigned long start, unsigned long end, + unsigned long reuse, gfp_t gfp_mask) +{ + LIST_HEAD(vmemmap_pages); + struct vmemmap_remap_walk walk =3D { + .remap_pte =3D vmemmap_restore_pte, + .reuse_addr =3D reuse, + .vmemmap_pages =3D &vmemmap_pages, + }; + + /* See the comment in the vmemmap_remap_free(). */ + BUG_ON(start - reuse !=3D PAGE_SIZE); + + if (alloc_vmemmap_page_list(start, end, gfp_mask, &vmemmap_pages)) + return -ENOMEM; + + mmap_read_lock(&init_mm); + vmemmap_remap_range(reuse, end, &walk); + mmap_read_unlock(&init_mm); + + return 0; +} + DEFINE_STATIC_KEY_FALSE(hugetlb_optimize_vmemmap_key); EXPORT_SYMBOL(hugetlb_optimize_vmemmap_key); =20 diff --git a/mm/sparse-vmemmap.c b/mm/sparse-vmemmap.c index 49cb15cbe590..473effcb2285 100644 --- a/mm/sparse-vmemmap.c +++ b/mm/sparse-vmemmap.c @@ -27,400 +27,9 @@ #include #include #include -#include -#include =20 #include #include -#include - -#ifdef CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP -/** - * struct vmemmap_remap_walk - walk vmemmap page table - * - * @remap_pte: called for each lowest-level entry (PTE). - * @nr_walked: the number of walked pte. - * @reuse_page: the page which is reused for the tail vmemmap pages. - * @reuse_addr: the virtual address of the @reuse_page page. - * @vmemmap_pages: the list head of the vmemmap pages that can be freed - * or is mapped from. 
- */ -struct vmemmap_remap_walk { - void (*remap_pte)(pte_t *pte, unsigned long addr, - struct vmemmap_remap_walk *walk); - unsigned long nr_walked; - struct page *reuse_page; - unsigned long reuse_addr; - struct list_head *vmemmap_pages; -}; - -static int __split_vmemmap_huge_pmd(pmd_t *pmd, unsigned long start) -{ - pmd_t __pmd; - int i; - unsigned long addr =3D start; - struct page *page =3D pmd_page(*pmd); - pte_t *pgtable =3D pte_alloc_one_kernel(&init_mm); - - if (!pgtable) - return -ENOMEM; - - pmd_populate_kernel(&init_mm, &__pmd, pgtable); - - for (i =3D 0; i < PMD_SIZE / PAGE_SIZE; i++, addr +=3D PAGE_SIZE) { - pte_t entry, *pte; - pgprot_t pgprot =3D PAGE_KERNEL; - - entry =3D mk_pte(page + i, pgprot); - pte =3D pte_offset_kernel(&__pmd, addr); - set_pte_at(&init_mm, addr, pte, entry); - } - - spin_lock(&init_mm.page_table_lock); - if (likely(pmd_leaf(*pmd))) { - /* Make pte visible before pmd. See comment in pmd_install(). */ - smp_wmb(); - pmd_populate_kernel(&init_mm, pmd, pgtable); - flush_tlb_kernel_range(start, start + PMD_SIZE); - } else { - pte_free_kernel(&init_mm, pgtable); - } - spin_unlock(&init_mm.page_table_lock); - - return 0; -} - -static int split_vmemmap_huge_pmd(pmd_t *pmd, unsigned long start) -{ - int leaf; - - spin_lock(&init_mm.page_table_lock); - leaf =3D pmd_leaf(*pmd); - spin_unlock(&init_mm.page_table_lock); - - if (!leaf) - return 0; - - return __split_vmemmap_huge_pmd(pmd, start); -} - -static void vmemmap_pte_range(pmd_t *pmd, unsigned long addr, - unsigned long end, - struct vmemmap_remap_walk *walk) -{ - pte_t *pte =3D pte_offset_kernel(pmd, addr); - - /* - * The reuse_page is found 'first' in table walk before we start - * remapping (which is calling @walk->remap_pte). - */ - if (!walk->reuse_page) { - walk->reuse_page =3D pte_page(*pte); - /* - * Because the reuse address is part of the range that we are - * walking, skip the reuse address range. 
- */ - addr +=3D PAGE_SIZE; - pte++; - walk->nr_walked++; - } - - for (; addr !=3D end; addr +=3D PAGE_SIZE, pte++) { - walk->remap_pte(pte, addr, walk); - walk->nr_walked++; - } -} - -static int vmemmap_pmd_range(pud_t *pud, unsigned long addr, - unsigned long end, - struct vmemmap_remap_walk *walk) -{ - pmd_t *pmd; - unsigned long next; - - pmd =3D pmd_offset(pud, addr); - do { - int ret; - - ret =3D split_vmemmap_huge_pmd(pmd, addr & PMD_MASK); - if (ret) - return ret; - - next =3D pmd_addr_end(addr, end); - vmemmap_pte_range(pmd, addr, next, walk); - } while (pmd++, addr =3D next, addr !=3D end); - - return 0; -} - -static int vmemmap_pud_range(p4d_t *p4d, unsigned long addr, - unsigned long end, - struct vmemmap_remap_walk *walk) -{ - pud_t *pud; - unsigned long next; - - pud =3D pud_offset(p4d, addr); - do { - int ret; - - next =3D pud_addr_end(addr, end); - ret =3D vmemmap_pmd_range(pud, addr, next, walk); - if (ret) - return ret; - } while (pud++, addr =3D next, addr !=3D end); - - return 0; -} - -static int vmemmap_p4d_range(pgd_t *pgd, unsigned long addr, - unsigned long end, - struct vmemmap_remap_walk *walk) -{ - p4d_t *p4d; - unsigned long next; - - p4d =3D p4d_offset(pgd, addr); - do { - int ret; - - next =3D p4d_addr_end(addr, end); - ret =3D vmemmap_pud_range(p4d, addr, next, walk); - if (ret) - return ret; - } while (p4d++, addr =3D next, addr !=3D end); - - return 0; -} - -static int vmemmap_remap_range(unsigned long start, unsigned long end, - struct vmemmap_remap_walk *walk) -{ - unsigned long addr =3D start; - unsigned long next; - pgd_t *pgd; - - VM_BUG_ON(!PAGE_ALIGNED(start)); - VM_BUG_ON(!PAGE_ALIGNED(end)); - - pgd =3D pgd_offset_k(addr); - do { - int ret; - - next =3D pgd_addr_end(addr, end); - ret =3D vmemmap_p4d_range(pgd, addr, next, walk); - if (ret) - return ret; - } while (pgd++, addr =3D next, addr !=3D end); - - /* - * We only change the mapping of the vmemmap virtual address range - * [@start + PAGE_SIZE, end), so we only need to flush the TLB which - * belongs to the range. - */ - flush_tlb_kernel_range(start + PAGE_SIZE, end); - - return 0; -} - -/* - * Free a vmemmap page. A vmemmap page can be allocated from the memblock - * allocator or buddy allocator. If the PG_reserved flag is set, it means - * that it allocated from the memblock allocator, just free it via the - * free_bootmem_page(). Otherwise, use __free_page(). - */ -static inline void free_vmemmap_page(struct page *page) -{ - if (PageReserved(page)) - free_bootmem_page(page); - else - __free_page(page); -} - -/* Free a list of the vmemmap pages */ -static void free_vmemmap_page_list(struct list_head *list) -{ - struct page *page, *next; - - list_for_each_entry_safe(page, next, list, lru) { - list_del(&page->lru); - free_vmemmap_page(page); - } -} - -static void vmemmap_remap_pte(pte_t *pte, unsigned long addr, - struct vmemmap_remap_walk *walk) -{ - /* - * Remap the tail pages as read-only to catch illegal write operation - * to the tail pages. - */ - pgprot_t pgprot =3D PAGE_KERNEL_RO; - pte_t entry =3D mk_pte(walk->reuse_page, pgprot); - struct page *page =3D pte_page(*pte); - - list_add_tail(&page->lru, walk->vmemmap_pages); - set_pte_at(&init_mm, addr, pte, entry); -} - -/* - * How many struct page structs need to be reset. When we reuse the head - * struct page, the special metadata (e.g. page->flags or page->mapping) - * cannot copy to the tail struct page structs. The invalid value will be - * checked in the free_tail_pages_check(). 
In order to avoid the message - * of "corrupted mapping in tail page". We need to reset at least 3 (one - * head struct page struct and two tail struct page structs) struct page - * structs. - */ -#define NR_RESET_STRUCT_PAGE 3 - -static inline void reset_struct_pages(struct page *start) -{ - int i; - struct page *from =3D start + NR_RESET_STRUCT_PAGE; - - for (i =3D 0; i < NR_RESET_STRUCT_PAGE; i++) - memcpy(start + i, from, sizeof(*from)); -} - -static void vmemmap_restore_pte(pte_t *pte, unsigned long addr, - struct vmemmap_remap_walk *walk) -{ - pgprot_t pgprot =3D PAGE_KERNEL; - struct page *page; - void *to; - - BUG_ON(pte_page(*pte) !=3D walk->reuse_page); - - page =3D list_first_entry(walk->vmemmap_pages, struct page, lru); - list_del(&page->lru); - to =3D page_to_virt(page); - copy_page(to, (void *)walk->reuse_addr); - reset_struct_pages(to); - - set_pte_at(&init_mm, addr, pte, mk_pte(page, pgprot)); -} - -/** - * vmemmap_remap_free - remap the vmemmap virtual address range [@start, @= end) - * to the page which @reuse is mapped to, then free vmemmap - * which the range are mapped to. - * @start: start address of the vmemmap virtual address range that we want - * to remap. - * @end: end address of the vmemmap virtual address range that we want to - * remap. - * @reuse: reuse address. - * - * Return: %0 on success, negative error code otherwise. - */ -int vmemmap_remap_free(unsigned long start, unsigned long end, - unsigned long reuse) -{ - int ret; - LIST_HEAD(vmemmap_pages); - struct vmemmap_remap_walk walk =3D { - .remap_pte =3D vmemmap_remap_pte, - .reuse_addr =3D reuse, - .vmemmap_pages =3D &vmemmap_pages, - }; - - /* - * In order to make remapping routine most efficient for the huge pages, - * the routine of vmemmap page table walking has the following rules - * (see more details from the vmemmap_pte_range()): - * - * - The range [@start, @end) and the range [@reuse, @reuse + PAGE_SIZE) - * should be continuous. - * - The @reuse address is part of the range [@reuse, @end) that we are - * walking which is passed to vmemmap_remap_range(). - * - The @reuse address is the first in the complete range. - * - * So we need to make sure that @start and @reuse meet the above rules. - */ - BUG_ON(start - reuse !=3D PAGE_SIZE); - - mmap_read_lock(&init_mm); - ret =3D vmemmap_remap_range(reuse, end, &walk); - if (ret && walk.nr_walked) { - end =3D reuse + walk.nr_walked * PAGE_SIZE; - /* - * vmemmap_pages contains pages from the previous - * vmemmap_remap_range call which failed. These - * are pages which were removed from the vmemmap. - * They will be restored in the following call. 
- */
-		walk = (struct vmemmap_remap_walk) {
-			.remap_pte	= vmemmap_restore_pte,
-			.reuse_addr	= reuse,
-			.vmemmap_pages	= &vmemmap_pages,
-		};
-
-		vmemmap_remap_range(reuse, end, &walk);
-	}
-	mmap_read_unlock(&init_mm);
-
-	free_vmemmap_page_list(&vmemmap_pages);
-
-	return ret;
-}
-
-static int alloc_vmemmap_page_list(unsigned long start, unsigned long end,
-				   gfp_t gfp_mask, struct list_head *list)
-{
-	unsigned long nr_pages = (end - start) >> PAGE_SHIFT;
-	int nid = page_to_nid((struct page *)start);
-	struct page *page, *next;
-
-	while (nr_pages--) {
-		page = alloc_pages_node(nid, gfp_mask, 0);
-		if (!page)
-			goto out;
-		list_add_tail(&page->lru, list);
-	}
-
-	return 0;
-out:
-	list_for_each_entry_safe(page, next, list, lru)
-		__free_pages(page, 0);
-	return -ENOMEM;
-}
-
-/**
- * vmemmap_remap_alloc - remap the vmemmap virtual address range [@start, end)
- *			 to the page which is from the @vmemmap_pages
- *			 respectively.
- * @start:	start address of the vmemmap virtual address range that we want
- *		to remap.
- * @end:	end address of the vmemmap virtual address range that we want to
- *		remap.
- * @reuse:	reuse address.
- * @gfp_mask:	GFP flag for allocating vmemmap pages.
- *
- * Return: %0 on success, negative error code otherwise.
- */
-int vmemmap_remap_alloc(unsigned long start, unsigned long end,
-			unsigned long reuse, gfp_t gfp_mask)
-{
-	LIST_HEAD(vmemmap_pages);
-	struct vmemmap_remap_walk walk = {
-		.remap_pte	= vmemmap_restore_pte,
-		.reuse_addr	= reuse,
-		.vmemmap_pages	= &vmemmap_pages,
-	};
-
-	/* See the comment in the vmemmap_remap_free(). */
-	BUG_ON(start - reuse != PAGE_SIZE);
-
-	if (alloc_vmemmap_page_list(start, end, gfp_mask, &vmemmap_pages))
-		return -ENOMEM;
-
-	mmap_read_lock(&init_mm);
-	vmemmap_remap_range(reuse, end, &walk);
-	mmap_read_unlock(&init_mm);
-
-	return 0;
-}
-#endif /* CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP */

 /*
  * Allocate a block of memory to be used to back the virtual memory map
--
2.11.0
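For intuition about what vmemmap_remap_free() and its page-table walk are doing, here is a toy userspace simulation: the first "pte" of the walked range is kept as the reuse frame, every later pte is pointed at it, and the frames they used to map are collected for freeing, mirroring how walk->reuse_page and walk->vmemmap_pages are used above. All names and types here are mocks, not kernel API:

    #include <stdio.h>

    #define NR_VMEMMAP_PTES 8

    int main(void)
    {
    	int pte[NR_VMEMMAP_PTES];	/* pte[i] = pseudo frame number	*/
    	int freed[NR_VMEMMAP_PTES];	/* frames collected for freeing	*/
    	int nr_freed = 0, reuse = -1;

    	for (int i = 0; i < NR_VMEMMAP_PTES; i++)
    		pte[i] = 100 + i;	/* each pte maps its own frame	*/

    	for (int i = 0; i < NR_VMEMMAP_PTES; i++) {
    		if (reuse < 0) {	/* first pte: remember the reuse frame */
    			reuse = pte[i];
    			continue;
    		}
    		freed[nr_freed++] = pte[i];	/* old frame can be freed */
    		pte[i] = reuse;			/* remap to the reuse frame */
    	}

    	printf("reuse frame %d, freed %d of %d frames (freed[0]=%d)\n",
    	       reuse, nr_freed, NR_VMEMMAP_PTES, freed[0]);
    	return 0;
    }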
From nobody Wed Apr 29 08:09:57 2026
From: Muchun Song
To: mike.kravetz@oracle.com, david@redhat.com, akpm@linux-foundation.org, corbet@lwn.net
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, Muchun Song
Subject: [PATCH 5/6] mm: hugetlb_vmemmap: replace early_param() with core_param()
Date: Mon, 13 Jun 2022 14:35:11 +0800
Message-Id: <20220613063512.17540-6-songmuchun@bytedance.com>
In-Reply-To: <20220613063512.17540-1-songmuchun@bytedance.com>
References: <20220613063512.17540-1-songmuchun@bytedance.com>

After commit 78f39084b41d ("mm: hugetlb_vmemmap: add hugetlb_optimize_vmemmap sysctl") and the previous commits in this series, there is no longer an ordering requirement between the "hugetlb_free_vmemmap" and "hugepages" command-line parameters, since the check of whether HVO is enabled has been removed from hugetlb_vmemmap_init(). Therefore we can safely replace early_param() with core_param() to simplify the code. (A sketch of this order independence follows the patch.)
Signed-off-by: Muchun Song
Reviewed-by: Mike Kravetz
---
 mm/hugetlb_vmemmap.c | 10 ++--------
 1 file changed, 2 insertions(+), 8 deletions(-)

diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
index abdf441215bb..9808d32cdb9e 100644
--- a/mm/hugetlb_vmemmap.c
+++ b/mm/hugetlb_vmemmap.c
@@ -415,14 +415,8 @@ static int vmemmap_remap_alloc(unsigned long start, unsigned long end,
 DEFINE_STATIC_KEY_FALSE(hugetlb_optimize_vmemmap_key);
 EXPORT_SYMBOL(hugetlb_optimize_vmemmap_key);

-static bool vmemmap_optimize_enabled =
-	IS_ENABLED(CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP_DEFAULT_ON);
-
-static int __init hugetlb_vmemmap_early_param(char *buf)
-{
-	return kstrtobool(buf, &vmemmap_optimize_enabled);
-}
-early_param("hugetlb_free_vmemmap", hugetlb_vmemmap_early_param);
+static bool vmemmap_optimize_enabled = IS_ENABLED(CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP_DEFAULT_ON);
+core_param(hugetlb_free_vmemmap, vmemmap_optimize_enabled, bool, 0);

 /*
  * Previously discarded vmemmap pages will be allocated and remapping
--
2.11.0
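The order independence claimed above can be modeled in a few lines: parsing merely records the boolean, and nothing consumes it until hugepage setup runs after parsing is complete. The names below are illustrative, not kernel API:

    #include <stdbool.h>
    #include <stdio.h>
    #include <string.h>

    static bool hugetlb_free_vmemmap = true;	/* default-on build */
    static long hugepages;

    static void parse_param(const char *arg)
    {
    	if (!strcmp(arg, "hugetlb_free_vmemmap=off"))
    		hugetlb_free_vmemmap = false;
    	else if (!strncmp(arg, "hugepages=", 10))
    		sscanf(arg + 10, "%ld", &hugepages);
    }

    int main(void)
    {
    	/* Either order gives the same result ... */
    	const char *cmdline[] = { "hugepages=16", "hugetlb_free_vmemmap=off" };

    	for (int i = 0; i < 2; i++)
    		parse_param(cmdline[i]);

    	/* ... because the HVO flag is only read here, after parsing. */
    	printf("allocating %ld hugepages, HVO %s\n",
    	       hugepages, hugetlb_free_vmemmap ? "on" : "off");
    	return 0;
    }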
From: Muchun Song
To: mike.kravetz@oracle.com, david@redhat.com, akpm@linux-foundation.org, corbet@lwn.net
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, Muchun Song
Subject: [PATCH 6/6] mm: hugetlb_vmemmap: improve hugetlb_vmemmap code readability
Date: Mon, 13 Jun 2022 14:35:12 +0800
Message-Id: <20220613063512.17540-7-songmuchun@bytedance.com>
In-Reply-To: <20220613063512.17540-1-songmuchun@bytedance.com>
References: <20220613063512.17540-1-songmuchun@bytedance.com>

There is a discussion about the names of hugetlb_vmemmap_alloc/free in
thread [1]. David suggested renaming "alloc" to "restore" and "free" to
"optimize" to make the functionality clearer to users: "optimize" means
the function will optimize a HugeTLB page's vmemmap pages, while
"restore" means restoring the vmemmap pages discarded earlier. This
commit does that renaming.

Another source of confusion is that RESERVE_VMEMMAP_NR is not used
explicitly for vmemmap_addr but implicitly for vmemmap_end in
hugetlb_vmemmap_alloc/free. David suggested computing at runtime what
hugetlb_vmemmap_init() currently precomputes. The runtime overhead is
not a concern, since the calculation is simple and those functions are
not on a hot path (a sketch of this arithmetic follows the list below).

This commit brings the following improvements:

1) The function names (suffixed "optimize/restore") are more expressive.
2) The logic in hugetlb_vmemmap_optimize/restore() becomes less convoluted.
3) hugetlb_vmemmap_init() no longer needs to be exported.
4) The ->optimize_vmemmap_pages field in struct hstate is removed.
5) is_power_of_2(sizeof(struct page)) is checked in one place instead of two.
6) More comments are added for hugetlb_vmemmap_optimize/restore().
7) For external users, hugetlb_optimize_vmemmap_pages() was the way to
   detect whether a HugeTLB page's vmemmap pages are optimizable. It is
   removed in favor of a new, more expressive helper,
   hugetlb_vmemmap_optimizable().
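For illustration only (this sketch is not part of the patch), the address
arithmetic that hugetlb_vmemmap_restore/optimize() now perform at runtime
can be modeled in user space; the constants below assume x86-64 with 4 KiB
base pages, a 64-byte struct page, and a 2 MiB huge page, and the head
address is an arbitrary example value:

    #include <stdio.h>

    #define PAGE_SIZE            4096UL
    #define STRUCT_PAGE_SIZE     64UL      /* sizeof(struct page) */
    #define PAGES_PER_HUGE_PAGE  512UL     /* 2 MiB / 4 KiB */
    #define RESERVE_VMEMMAP_SIZE PAGE_SIZE /* one page frame is kept and reused */

    int main(void)
    {
    	unsigned long head = 0xffffea0000000000UL; /* example vmemmap address */
    	unsigned long vmemmap_start = head;
    	unsigned long vmemmap_size = PAGES_PER_HUGE_PAGE * STRUCT_PAGE_SIZE;
    	unsigned long vmemmap_end = vmemmap_start + vmemmap_size;
    	unsigned long vmemmap_reuse = vmemmap_start; /* first frame is reused */

    	vmemmap_start += RESERVE_VMEMMAP_SIZE; /* skip the reserved frame */

    	/* 32 KiB of vmemmap, of which 28 KiB is freeable per 2 MiB page. */
    	printf("size=%lu freeable=%lu range=[%#lx, %#lx) reuse=%#lx\n",
    	       vmemmap_size, vmemmap_end - vmemmap_start,
    	       vmemmap_start, vmemmap_end, vmemmap_reuse);
    	return 0;
    }

Note how vmemmap_reuse falls out of vmemmap_start directly instead of being
derived from a cached page count, which is what makes the hstate field
removable.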
Link: https://lore.kernel.org/all/20220404074652.68024-2-songmuchun@bytedance.com/ [1]
Signed-off-by: Muchun Song
---
 include/linux/hugetlb.h |   7 +--
 mm/hugetlb.c            |  11 ++--
 mm/hugetlb_vmemmap.c    | 154 +++++++++++++++++++++++--------------------------
 mm/hugetlb_vmemmap.h    |  39 +++++++-----
 4 files changed, 105 insertions(+), 106 deletions(-)

diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 642a39016f9a..0b475faf9bf4 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -640,9 +640,6 @@ struct hstate {
 	unsigned int nr_huge_pages_node[MAX_NUMNODES];
 	unsigned int free_huge_pages_node[MAX_NUMNODES];
 	unsigned int surplus_huge_pages_node[MAX_NUMNODES];
-#ifdef CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP
-	unsigned int optimize_vmemmap_pages;
-#endif
 #ifdef CONFIG_CGROUP_HUGETLB
 	/* cgroup control files */
 	struct cftype cgroup_files_dfl[8];
@@ -718,7 +715,7 @@ static inline struct hstate *hstate_vma(struct vm_area_struct *vma)
 	return hstate_file(vma->vm_file);
 }
 
-static inline unsigned long huge_page_size(struct hstate *h)
+static inline unsigned long huge_page_size(const struct hstate *h)
 {
 	return (unsigned long)PAGE_SIZE << h->order;
 }
@@ -747,7 +744,7 @@ static inline bool hstate_is_gigantic(struct hstate *h)
 	return huge_page_order(h) >= MAX_ORDER;
 }
 
-static inline unsigned int pages_per_huge_page(struct hstate *h)
+static inline unsigned int pages_per_huge_page(const struct hstate *h)
 {
 	return 1 << h->order;
 }
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 259b9c41892f..26a5af7f0065 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1541,7 +1541,7 @@ static void __update_and_free_page(struct hstate *h, struct page *page)
 	if (hstate_is_gigantic(h) && !gigantic_page_runtime_supported())
 		return;
 
-	if (hugetlb_vmemmap_alloc(h, page)) {
+	if (hugetlb_vmemmap_restore(h, page)) {
 		spin_lock_irq(&hugetlb_lock);
 		/*
 		 * If we cannot allocate vmemmap pages, just refuse to free the
@@ -1627,7 +1627,7 @@ static DECLARE_WORK(free_hpage_work, free_hpage_workfn);
 
 static inline void flush_free_hpage_work(struct hstate *h)
 {
-	if (hugetlb_optimize_vmemmap_pages(h))
+	if (hugetlb_vmemmap_optimizable(h))
 		flush_work(&free_hpage_work);
 }
 
@@ -1749,7 +1749,7 @@ static void __prep_account_new_huge_page(struct hstate *h, int nid)
 
 static void __prep_new_huge_page(struct hstate *h, struct page *page)
 {
-	hugetlb_vmemmap_free(h, page);
+	hugetlb_vmemmap_optimize(h, page);
 	INIT_LIST_HEAD(&page->lru);
 	set_compound_page_dtor(page, HUGETLB_PAGE_DTOR);
 	hugetlb_set_page_subpool(page, NULL);
@@ -2122,7 +2122,7 @@ int dissolve_free_huge_page(struct page *page)
 	 * Attempt to allocate vmemmmap here so that we can take
 	 * appropriate action on failure.
	 */
-	rc = hugetlb_vmemmap_alloc(h, head);
+	rc = hugetlb_vmemmap_restore(h, head);
 	if (!rc) {
 		/*
 		 * Move PageHWPoison flag from head page to the raw
@@ -3434,7 +3434,7 @@ static int demote_free_huge_page(struct hstate *h, struct page *page)
 	remove_hugetlb_page_for_demote(h, page, false);
 	spin_unlock_irq(&hugetlb_lock);
 
-	rc = hugetlb_vmemmap_alloc(h, page);
+	rc = hugetlb_vmemmap_restore(h, page);
 	if (rc) {
 		/* Allocation of vmemmmap failed, we can not demote page */
 		spin_lock_irq(&hugetlb_lock);
@@ -4124,7 +4124,6 @@ void __init hugetlb_add_hstate(unsigned int order)
 	h->next_nid_to_free = first_memory_node;
 	snprintf(h->name, HSTATE_NAME_LEN, "hugepages-%lukB",
 					huge_page_size(h)/1024);
-	hugetlb_vmemmap_init(h);
 
 	parsed_hstate = h;
 }
diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
index 9808d32cdb9e..595b0cee3109 100644
--- a/mm/hugetlb_vmemmap.c
+++ b/mm/hugetlb_vmemmap.c
@@ -35,16 +35,6 @@ struct vmemmap_remap_walk {
 	struct list_head *vmemmap_pages;
 };
 
-/*
- * There are a lot of struct page structures associated with each HugeTLB page.
- * For tail pages, the value of compound_head is the same. So we can reuse first
- * page of head page structures. We map the virtual addresses of all the pages
- * of tail page structures to the head page struct, and then free these page
- * frames. Therefore, we need to reserve one pages as vmemmap areas.
- */
-#define RESERVE_VMEMMAP_NR		1U
-#define RESERVE_VMEMMAP_SIZE		(RESERVE_VMEMMAP_NR << PAGE_SHIFT)
-
 static int __split_vmemmap_huge_pmd(pmd_t *pmd, unsigned long start)
 {
 	pmd_t __pmd;
@@ -418,32 +408,38 @@ EXPORT_SYMBOL(hugetlb_optimize_vmemmap_key);
 static bool vmemmap_optimize_enabled = IS_ENABLED(CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP_DEFAULT_ON);
 core_param(hugetlb_free_vmemmap, vmemmap_optimize_enabled, bool, 0);
 
-/*
- * Previously discarded vmemmap pages will be allocated and remapping
- * after this function returns zero.
+/**
+ * hugetlb_vmemmap_restore - restore previously optimized (by
+ *			     hugetlb_vmemmap_optimize()) vmemmap pages which
+ *			     will be reallocated and remapped.
+ * @h:		struct hstate.
+ * @head:	the head page whose vmemmap pages will be restored.
+ *
+ * Return: %0 if @head's vmemmap pages have been reallocated and remapped,
+ * negative error code otherwise.
  */
-int hugetlb_vmemmap_alloc(struct hstate *h, struct page *head)
+int hugetlb_vmemmap_restore(const struct hstate *h, struct page *head)
 {
 	int ret;
-	unsigned long vmemmap_addr = (unsigned long)head;
-	unsigned long vmemmap_end, vmemmap_reuse, vmemmap_pages;
+	unsigned long vmemmap_start = (unsigned long)head;
+	unsigned long vmemmap_end, vmemmap_reuse, vmemmap_size;
 
 	if (!HPageVmemmapOptimized(head))
 		return 0;
 
-	vmemmap_addr	+= RESERVE_VMEMMAP_SIZE;
-	vmemmap_pages	= hugetlb_optimize_vmemmap_pages(h);
-	vmemmap_end	= vmemmap_addr + (vmemmap_pages << PAGE_SHIFT);
-	vmemmap_reuse	= vmemmap_addr - PAGE_SIZE;
+	vmemmap_size	= hugetlb_vmemmap_size(h);
+	vmemmap_end	= vmemmap_start + vmemmap_size;
+	vmemmap_reuse	= vmemmap_start;
+	vmemmap_start	+= RESERVE_VMEMMAP_SIZE;
 
 	/*
-	 * The pages which the vmemmap virtual address range [@vmemmap_addr,
+	 * The pages which the vmemmap virtual address range [@vmemmap_start,
	 * @vmemmap_end) are mapped to are freed to the buddy allocator, and
	 * the range is mapped to the page which @vmemmap_reuse is mapped to.
	 * When a HugeTLB page is freed to the buddy allocator, previously
	 * discarded vmemmap pages must be allocated and remapping.
	 */
-	ret = vmemmap_remap_alloc(vmemmap_addr, vmemmap_end, vmemmap_reuse,
+	ret = vmemmap_remap_alloc(vmemmap_start, vmemmap_end, vmemmap_reuse,
 				  GFP_KERNEL | __GFP_NORETRY | __GFP_THISNODE);
 	if (!ret) {
 		ClearHPageVmemmapOptimized(head);
@@ -453,84 +449,62 @@ int hugetlb_vmemmap_alloc(struct hstate *h, struct page *head)
 	return ret;
 }
 
-static unsigned int optimizable_vmemmap_pages(struct hstate *h,
-					      struct page *head)
+/* Return true iff a HugeTLB whose vmemmap should and can be optimized. */
+static bool vmemmap_should_optimize(const struct hstate *h, const struct page *head)
 {
 	unsigned long pfn = page_to_pfn(head);
 	unsigned long end = pfn + pages_per_huge_page(h);
 
 	if (!READ_ONCE(vmemmap_optimize_enabled))
-		return 0;
+		return false;
+
+	if (!hugetlb_vmemmap_optimizable(h))
+		return false;
 
 	for (; pfn < end; pfn += PAGES_PER_SECTION) {
 		if (section_cannot_optimize_vmemmap(__pfn_to_section(pfn)))
-			return 0;
+			return false;
 	}
 
-	return hugetlb_optimize_vmemmap_pages(h);
+	return true;
 }
 
-void hugetlb_vmemmap_free(struct hstate *h, struct page *head)
+/**
+ * hugetlb_vmemmap_optimize - optimize @head page's vmemmap pages.
+ * @h:		struct hstate.
+ * @head:	the head page whose vmemmap pages will be optimized.
+ *
+ * This function only tries to optimize @head's vmemmap pages and does not
+ * guarantee that the optimization will succeed after it returns. The caller
+ * can use HPageVmemmapOptimized(@head) to detect if @head's vmemmap pages
+ * have been optimized.
+ */
+void hugetlb_vmemmap_optimize(const struct hstate *h, struct page *head)
 {
-	unsigned long vmemmap_addr = (unsigned long)head;
-	unsigned long vmemmap_end, vmemmap_reuse, vmemmap_pages;
+	unsigned long vmemmap_start = (unsigned long)head;
+	unsigned long vmemmap_end, vmemmap_reuse, vmemmap_size;
 
-	vmemmap_pages = optimizable_vmemmap_pages(h, head);
-	if (!vmemmap_pages)
+	if (!vmemmap_should_optimize(h, head))
 		return;
 
 	static_branch_inc(&hugetlb_optimize_vmemmap_key);
 
-	vmemmap_addr	+= RESERVE_VMEMMAP_SIZE;
-	vmemmap_end	= vmemmap_addr + (vmemmap_pages << PAGE_SHIFT);
-	vmemmap_reuse	= vmemmap_addr - PAGE_SIZE;
+	vmemmap_size	= hugetlb_vmemmap_size(h);
+	vmemmap_end	= vmemmap_start + vmemmap_size;
+	vmemmap_reuse	= vmemmap_start;
+	vmemmap_start	+= RESERVE_VMEMMAP_SIZE;
 
 	/*
-	 * Remap the vmemmap virtual address range [@vmemmap_addr, @vmemmap_end)
+	 * Remap the vmemmap virtual address range [@vmemmap_start, @vmemmap_end)
 	 * to the page which @vmemmap_reuse is mapped to, then free the pages
-	 * which the range [@vmemmap_addr, @vmemmap_end] is mapped to.
+	 * which the range [@vmemmap_start, @vmemmap_end] is mapped to.
 	 */
-	if (vmemmap_remap_free(vmemmap_addr, vmemmap_end, vmemmap_reuse))
+	if (vmemmap_remap_free(vmemmap_start, vmemmap_end, vmemmap_reuse))
 		static_branch_dec(&hugetlb_optimize_vmemmap_key);
 	else
 		SetHPageVmemmapOptimized(head);
 }
 
-void __init hugetlb_vmemmap_init(struct hstate *h)
-{
-	unsigned int nr_pages = pages_per_huge_page(h);
-	unsigned int vmemmap_pages;
-
-	/*
-	 * There are only (RESERVE_VMEMMAP_SIZE / sizeof(struct page)) struct
-	 * page structs that can be used when HVO is enabled, add a BUILD_BUG_ON
-	 * to catch invalid usage of the tail page structs.
-	 */
-	BUILD_BUG_ON(__NR_USED_SUBPAGE >=
-		     RESERVE_VMEMMAP_SIZE / sizeof(struct page));
-
-	if (!is_power_of_2(sizeof(struct page))) {
-		pr_warn_once("cannot optimize vmemmap pages because \"struct page\" crosses page boundaries\n");
-		return;
-	}
-
-	vmemmap_pages = (nr_pages * sizeof(struct page)) >> PAGE_SHIFT;
-	/*
-	 * The head page is not to be freed to buddy allocator, the other tail
-	 * pages will map to the head page, so they can be freed.
-	 *
-	 * Could RESERVE_VMEMMAP_NR be greater than @vmemmap_pages? It is true
-	 * on some architectures (e.g. aarch64). See Documentation/arm64/
-	 * hugetlbpage.rst for more details.
-	 */
-	if (likely(vmemmap_pages > RESERVE_VMEMMAP_NR))
-		h->optimize_vmemmap_pages = vmemmap_pages - RESERVE_VMEMMAP_NR;
-
-	pr_info("can optimize %d vmemmap pages for %s\n",
-		h->optimize_vmemmap_pages, h->name);
-}
-
-#ifdef CONFIG_PROC_SYSCTL
 static struct ctl_table hugetlb_vmemmap_sysctls[] = {
 	{
 		.procname	= "hugetlb_optimize_vmemmap",
@@ -542,16 +516,36 @@ static struct ctl_table hugetlb_vmemmap_sysctls[] = {
 	{ }
 };
 
-static __init int hugetlb_vmemmap_sysctls_init(void)
+static int __init hugetlb_vmemmap_init(void)
 {
+	const struct hstate *h;
+	bool optimizable = false;
+
 	/*
-	 * If "struct page" crosses page boundaries, the vmemmap pages cannot
-	 * be optimized.
+	 * There are only (RESERVE_VMEMMAP_SIZE / sizeof(struct page)) struct
+	 * page structs that can be used when HVO is enabled.
	 */
-	if (is_power_of_2(sizeof(struct page)))
-		register_sysctl_init("vm", hugetlb_vmemmap_sysctls);
+	BUILD_BUG_ON(__NR_USED_SUBPAGE >= RESERVE_VMEMMAP_SIZE / sizeof(struct page));
+
+	for_each_hstate(h) {
+		char buf[16];
+		unsigned int size = 0;
+
+		if (hugetlb_vmemmap_optimizable(h))
+			size = hugetlb_vmemmap_size(h) - RESERVE_VMEMMAP_SIZE;
+		optimizable = size ? true : optimizable;
+		string_get_size(huge_page_size(h), 1, STRING_UNITS_2, buf,
+				sizeof(buf));
+		pr_info("%d KiB vmemmap can be optimized for a %s page\n",
+			size / SZ_1K, buf);
+	}
 
+	if (optimizable) {
+		if (IS_ENABLED(CONFIG_PROC_SYSCTL))
+			register_sysctl_init("vm", hugetlb_vmemmap_sysctls);
+		pr_info("%d huge pages whose vmemmap are optimized at boot\n",
+			static_key_count(&hugetlb_optimize_vmemmap_key.key));
+	}
 	return 0;
 }
-late_initcall(hugetlb_vmemmap_sysctls_init);
-#endif /* CONFIG_PROC_SYSCTL */
+late_initcall(hugetlb_vmemmap_init);
diff --git a/mm/hugetlb_vmemmap.h b/mm/hugetlb_vmemmap.h
index ba66fadad9fc..0af3f08cf63c 100644
--- a/mm/hugetlb_vmemmap.h
+++ b/mm/hugetlb_vmemmap.h
@@ -11,35 +11,44 @@
 #include <linux/hugetlb.h>
 
 #ifdef CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP
-int hugetlb_vmemmap_alloc(struct hstate *h, struct page *head);
-void hugetlb_vmemmap_free(struct hstate *h, struct page *head);
-void hugetlb_vmemmap_init(struct hstate *h);
+int hugetlb_vmemmap_restore(const struct hstate *h, struct page *head);
+void hugetlb_vmemmap_optimize(const struct hstate *h, struct page *head);
 
 /*
- * How many vmemmap pages associated with a HugeTLB page that can be
- * optimized and freed to the buddy allocator.
+ * There are a lot of struct page structures associated with each HugeTLB page.
+ * The only 'useful' information in the tail page structs is the compound_head
+ * field which is the same for all tail page structs. So we can reuse the first
+ * page frame of page structs. The virtual addresses of all the remaining pages
+ * of tail page structs will be mapped to the head page frame, and then these
+ * tail page frames are freed.
+ * Therefore, we need to reserve one page as
+ * vmemmap. See Documentation/vm/vmemmap_dedup.rst.
 */
-static inline unsigned int hugetlb_optimize_vmemmap_pages(struct hstate *h)
+#define RESERVE_VMEMMAP_SIZE	PAGE_SIZE
+
+static inline unsigned int hugetlb_vmemmap_size(const struct hstate *h)
 {
-	return h->optimize_vmemmap_pages;
+	return pages_per_huge_page(h) * sizeof(struct page);
 }
-#else
-static inline int hugetlb_vmemmap_alloc(struct hstate *h, struct page *head)
+
+static inline bool hugetlb_vmemmap_optimizable(const struct hstate *h)
 {
-	return 0;
+	if (!is_power_of_2(sizeof(struct page)))
+		return false;
+	return hugetlb_vmemmap_size(h) > RESERVE_VMEMMAP_SIZE;
 }
-
-static inline void hugetlb_vmemmap_free(struct hstate *h, struct page *head)
+#else
+static inline int hugetlb_vmemmap_restore(const struct hstate *h, struct page *head)
 {
+	return 0;
 }
 
-static inline void hugetlb_vmemmap_init(struct hstate *h)
+static inline void hugetlb_vmemmap_optimize(const struct hstate *h, struct page *head)
 {
 }
 
-static inline unsigned int hugetlb_optimize_vmemmap_pages(struct hstate *h)
+static inline bool hugetlb_vmemmap_optimizable(const struct hstate *h)
 {
-	return 0;
+	return false;
 }
 #endif /* CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP */
 #endif /* _LINUX_HUGETLB_VMEMMAP_H */
-- 
2.11.0
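To close, a user-space model (illustrative only, with hypothetical x86-64
constants, not kernel code) of the hugetlb_vmemmap_size()/
hugetlb_vmemmap_optimizable() pair that replaces
hugetlb_optimize_vmemmap_pages() for external users; it reproduces the two
checks the header now makes and the per-hstate figure the new initcall logs:

    #include <stdbool.h>
    #include <stdio.h>

    #define PAGE_SIZE		4096UL
    #define STRUCT_PAGE_SIZE	64UL	/* sizeof(struct page) on x86-64 */
    #define RESERVE_VMEMMAP_SIZE	PAGE_SIZE

    static unsigned long vmemmap_size(unsigned long pages_per_huge_page)
    {
    	return pages_per_huge_page * STRUCT_PAGE_SIZE;
    }

    static bool vmemmap_optimizable(unsigned long pages_per_huge_page)
    {
    	/* struct page must not straddle page boundaries... */
    	if (STRUCT_PAGE_SIZE & (STRUCT_PAGE_SIZE - 1))
    		return false;
    	/* ...and there must be more vmemmap than the one reserved page. */
    	return vmemmap_size(pages_per_huge_page) > RESERVE_VMEMMAP_SIZE;
    }

    int main(void)
    {
    	unsigned long hstates[] = { 512UL, 262144UL }; /* 2 MiB, 1 GiB */

    	for (int i = 0; i < 2; i++) {
    		unsigned long save = vmemmap_optimizable(hstates[i]) ?
    			vmemmap_size(hstates[i]) - RESERVE_VMEMMAP_SIZE : 0;

    		/* Prints 28 KiB for 2 MiB pages, 16380 KiB for 1 GiB pages. */
    		printf("%lu KiB vmemmap can be optimized\n", save / 1024);
    	}
    	return 0;
    }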