From: Usama Arif
To: linux-mm@kvack.org, muchun.song@linux.dev, mike.kravetz@oracle.com,
 rppt@kernel.org
Cc: linux-kernel@vger.kernel.org, songmuchun@bytedance.com,
 fam.zheng@bytedance.com, liangma@liangbit.com, punit.agrawal@bytedance.com,
 Usama Arif
Subject: [v5 1/4] mm: hugetlb_vmemmap: Use nid of the head page to reallocate it
Date: Wed, 13 Sep 2023 11:53:58 +0100
Message-Id: <20230913105401.519709-2-usama.arif@bytedance.com>
In-Reply-To: <20230913105401.519709-1-usama.arif@bytedance.com>
References: <20230913105401.519709-1-usama.arif@bytedance.com>

If tail page prep and initialization are skipped, then the "start" page
will not contain the correct nid. Use the nid from the first vmemmap
page instead.
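The failure mode can be illustrated outside the kernel. Below is a hypothetical userspace sketch (all names are invented mocks, not the kernel's `struct page` or `page_to_nid()`): when only the head struct page of a gigantic page has been initialized, reading the node id from the start of the to-be-freed vmemmap range (a tail page) yields garbage, while the reuse page, which is the head, carries the correct nid.

```c
#include <assert.h>

/*
 * Hypothetical userspace mock of the situation this patch fixes:
 * only the head struct page was initialized, so its nid is valid,
 * while tail struct pages still hold whatever was in memory.
 */
struct mock_page {
	int initialized;	/* 1 if init ran for this struct page */
	int nid;		/* meaningful only when initialized */
};

/* Mimics page_to_nid(): blindly returns whatever the struct holds. */
static int mock_page_to_nid(const struct mock_page *p)
{
	return p->nid;
}

/*
 * Pick the nid for the replacement allocation the way the fixed code
 * does: from the reuse (head) page, never from the start of the range.
 */
static int nid_for_realloc(const struct mock_page *reuse)
{
	return mock_page_to_nid(reuse);
}
```

With a head page on node 1 and an uninitialized tail holding -1, `nid_for_realloc()` on the head returns 1, while reading the tail directly would have returned the garbage value.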
Signed-off-by: Usama Arif
Reviewed-by: Muchun Song
Reviewed-by: Mike Kravetz
---
 mm/hugetlb_vmemmap.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
index aeb7dd889eee..3cdb38d87a95 100644
--- a/mm/hugetlb_vmemmap.c
+++ b/mm/hugetlb_vmemmap.c
@@ -319,7 +319,7 @@ static int vmemmap_remap_free(unsigned long start, unsigned long end,
 		.reuse_addr	= reuse,
 		.vmemmap_pages	= &vmemmap_pages,
 	};
-	int nid = page_to_nid((struct page *)start);
+	int nid = page_to_nid((struct page *)reuse);
 	gfp_t gfp_mask = GFP_KERNEL | __GFP_THISNODE | __GFP_NORETRY | __GFP_NOWARN;

--
2.25.1


From: Usama Arif
To: linux-mm@kvack.org, muchun.song@linux.dev, mike.kravetz@oracle.com,
 rppt@kernel.org
Cc: linux-kernel@vger.kernel.org, songmuchun@bytedance.com,
 fam.zheng@bytedance.com, liangma@liangbit.com, punit.agrawal@bytedance.com,
 Usama Arif
Subject: [v5 2/4] memblock: pass memblock_type to memblock_setclr_flag
Date: Wed, 13 Sep 2023 11:53:59 +0100
Message-Id: <20230913105401.519709-3-usama.arif@bytedance.com>
In-Reply-To: <20230913105401.519709-1-usama.arif@bytedance.com>
References: <20230913105401.519709-1-usama.arif@bytedance.com>

This allows setting flags on both memblock types and is in preparation
for setting flags (e.g. to avoid initializing struct pages) on reserved
memory regions.

Signed-off-by: Usama Arif
Reviewed-by: Muchun Song
Reviewed-by: Mike Rapoport (IBM)
Acked-by: Mike Kravetz
---
 mm/memblock.c | 15 +++++++--------
 1 file changed, 7 insertions(+), 8 deletions(-)

diff --git a/mm/memblock.c b/mm/memblock.c
index 913b2520a9a0..a49efbaee7e0 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -901,10 +901,9 @@ int __init_memblock memblock_physmem_add(phys_addr_t base, phys_addr_t size)
  *
  * Return: 0 on success, -errno on failure.
  */
-static int __init_memblock memblock_setclr_flag(phys_addr_t base,
-				phys_addr_t size, int set, int flag)
+static int __init_memblock memblock_setclr_flag(struct memblock_type *type,
+		phys_addr_t base, phys_addr_t size, int set, int flag)
 {
-	struct memblock_type *type = &memblock.memory;
 	int i, ret, start_rgn, end_rgn;

 	ret = memblock_isolate_range(type, base, size, &start_rgn, &end_rgn);
@@ -933,7 +932,7 @@ static int __init_memblock memblock_setclr_flag(phys_addr_t base,
  */
 int __init_memblock memblock_mark_hotplug(phys_addr_t base, phys_addr_t size)
 {
-	return memblock_setclr_flag(base, size, 1, MEMBLOCK_HOTPLUG);
+	return memblock_setclr_flag(&memblock.memory, base, size, 1, MEMBLOCK_HOTPLUG);
 }

 /**
@@ -945,7 +944,7 @@ int __init_memblock memblock_mark_hotplug(phys_addr_t base, phys_addr_t size)
  */
 int __init_memblock memblock_clear_hotplug(phys_addr_t base, phys_addr_t size)
 {
-	return memblock_setclr_flag(base, size, 0, MEMBLOCK_HOTPLUG);
+	return memblock_setclr_flag(&memblock.memory, base, size, 0, MEMBLOCK_HOTPLUG);
 }

 /**
@@ -962,7 +961,7 @@ int __init_memblock memblock_mark_mirror(phys_addr_t base, phys_addr_t size)

 	system_has_some_mirror = true;

-	return memblock_setclr_flag(base, size, 1, MEMBLOCK_MIRROR);
+	return memblock_setclr_flag(&memblock.memory, base, size, 1, MEMBLOCK_MIRROR);
 }

 /**
@@ -982,7 +981,7 @@ int __init_memblock memblock_mark_mirror(phys_addr_t base, phys_addr_t size)
  */
 int __init_memblock memblock_mark_nomap(phys_addr_t base, phys_addr_t size)
 {
-	return memblock_setclr_flag(base, size, 1, MEMBLOCK_NOMAP);
+	return memblock_setclr_flag(&memblock.memory, base, size, 1, MEMBLOCK_NOMAP);
 }

 /**
@@ -994,7 +993,7 @@ int __init_memblock memblock_mark_nomap(phys_addr_t base, phys_addr_t size)
  */
 int __init_memblock memblock_clear_nomap(phys_addr_t base, phys_addr_t size)
 {
-	return memblock_setclr_flag(base, size, 0, MEMBLOCK_NOMAP);
+	return memblock_setclr_flag(&memblock.memory, base, size, 0, MEMBLOCK_NOMAP);
 }

 static bool should_skip_region(struct memblock_type *type,
--
2.25.1


From: Usama Arif
To: linux-mm@kvack.org, muchun.song@linux.dev, mike.kravetz@oracle.com,
 rppt@kernel.org
Cc: linux-kernel@vger.kernel.org, songmuchun@bytedance.com,
 fam.zheng@bytedance.com, liangma@liangbit.com, punit.agrawal@bytedance.com,
 Usama Arif
Subject: [v5 3/4] memblock: introduce MEMBLOCK_RSRV_NOINIT flag
Date: Wed, 13 Sep 2023 11:54:00 +0100
Message-Id: <20230913105401.519709-4-usama.arif@bytedance.com>
In-Reply-To: <20230913105401.519709-1-usama.arif@bytedance.com>
References: <20230913105401.519709-1-usama.arif@bytedance.com>

For reserved memory regions marked with this flag,
reserve_bootmem_region is not called during memmap_init_reserved_pages.
This can be used to avoid struct page initialization for regions that
will not need it, e.g. hugetlb pages with HugeTLB Vmemmap Optimization
(HVO) enabled.

Signed-off-by: Usama Arif
Acked-by: Muchun Song
Reviewed-by: Mike Rapoport (IBM)
---
 include/linux/memblock.h |  9 +++++++++
 mm/memblock.c            | 33 ++++++++++++++++++++++++++++-----
 2 files changed, 37 insertions(+), 5 deletions(-)

diff --git a/include/linux/memblock.h b/include/linux/memblock.h
index 1c1072e3ca06..ae3bde302f70 100644
--- a/include/linux/memblock.h
+++ b/include/linux/memblock.h
@@ -40,6 +40,8 @@ extern unsigned long long max_possible_pfn;
  * via a driver, and never indicated in the firmware-provided memory map as
  * system RAM. This corresponds to IORESOURCE_SYSRAM_DRIVER_MANAGED in the
  * kernel resource tree.
+ * @MEMBLOCK_RSRV_NOINIT: memory region for which struct pages are
+ * not initialized (only for reserved regions).
  */
 enum memblock_flags {
 	MEMBLOCK_NONE		= 0x0,	/* No special request */
@@ -47,6 +49,7 @@ enum memblock_flags {
 	MEMBLOCK_MIRROR		= 0x2,	/* mirrored region */
 	MEMBLOCK_NOMAP		= 0x4,	/* don't add to kernel direct mapping */
 	MEMBLOCK_DRIVER_MANAGED = 0x8,	/* always detected via a driver */
+	MEMBLOCK_RSRV_NOINIT	= 0x10,	/* don't initialize struct pages */
 };

 /**
@@ -125,6 +128,7 @@ int memblock_clear_hotplug(phys_addr_t base, phys_addr_t size);
 int memblock_mark_mirror(phys_addr_t base, phys_addr_t size);
 int memblock_mark_nomap(phys_addr_t base, phys_addr_t size);
 int memblock_clear_nomap(phys_addr_t base, phys_addr_t size);
+int memblock_reserved_mark_noinit(phys_addr_t base, phys_addr_t size);

 void memblock_free_all(void);
 void memblock_free(void *ptr, size_t size);
@@ -259,6 +263,11 @@ static inline bool memblock_is_nomap(struct memblock_region *m)
 	return m->flags & MEMBLOCK_NOMAP;
 }

+static inline bool memblock_is_reserved_noinit(struct memblock_region *m)
+{
+	return m->flags & MEMBLOCK_RSRV_NOINIT;
+}
+
 static inline bool memblock_is_driver_managed(struct memblock_region *m)
 {
 	return m->flags & MEMBLOCK_DRIVER_MANAGED;
diff --git a/mm/memblock.c b/mm/memblock.c
index a49efbaee7e0..8f7a0cb668d4 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -996,6 +996,24 @@ int __init_memblock memblock_clear_nomap(phys_addr_t base, phys_addr_t size)
 	return memblock_setclr_flag(&memblock.memory, base, size, 0, MEMBLOCK_NOMAP);
 }

+/**
+ * memblock_reserved_mark_noinit - Mark a reserved memory region with flag
+ * MEMBLOCK_RSRV_NOINIT which results in the struct pages not being initialized
+ * for this region.
+ * @base: the base phys addr of the region
+ * @size: the size of the region
+ *
+ * struct pages will not be initialized for reserved memory regions marked with
+ * %MEMBLOCK_RSRV_NOINIT.
+ *
+ * Return: 0 on success, -errno on failure.
+ */
+int __init_memblock memblock_reserved_mark_noinit(phys_addr_t base, phys_addr_t size)
+{
+	return memblock_setclr_flag(&memblock.reserved, base, size, 1,
+				    MEMBLOCK_RSRV_NOINIT);
+}
+
 static bool should_skip_region(struct memblock_type *type,
 			       struct memblock_region *m,
 			       int nid, int flags)
@@ -2112,13 +2130,18 @@ static void __init memmap_init_reserved_pages(void)
 		memblock_set_node(start, end, &memblock.reserved, nid);
 	}

-	/* initialize struct pages for the reserved regions */
+	/*
+	 * initialize struct pages for reserved regions that don't have
+	 * the MEMBLOCK_RSRV_NOINIT flag set
+	 */
 	for_each_reserved_mem_region(region) {
-		nid = memblock_get_region_node(region);
-		start = region->base;
-		end = start + region->size;
+		if (!memblock_is_reserved_noinit(region)) {
+			nid = memblock_get_region_node(region);
+			start = region->base;
+			end = start + region->size;

-		reserve_bootmem_region(start, end, nid);
+			reserve_bootmem_region(start, end, nid);
+		}
 	}
 }

--
2.25.1
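The new skip logic in memmap_init_reserved_pages can be sketched in isolation. The following is a hypothetical userspace mock (invented names and flag value, not the kernel code): regions carrying a NOINIT-style flag are passed over, and everything else would get reserve_bootmem_region called for it.

```c
#include <assert.h>

/* Invented stand-in for MEMBLOCK_RSRV_NOINIT; value is illustrative. */
#define MOCK_RSRV_NOINIT	0x10

struct mock_region {
	unsigned long flags;
};

/* Mock of memblock_is_reserved_noinit(): test the flag bit. */
static int mock_is_reserved_noinit(const struct mock_region *r)
{
	return !!(r->flags & MOCK_RSRV_NOINIT);
}

/*
 * Returns how many of the given reserved regions would have their
 * struct pages initialized, i.e. how many are not marked noinit.
 */
static int count_initialized(const struct mock_region *r, int n)
{
	int i, count = 0;

	for (i = 0; i < n; i++)
		if (!mock_is_reserved_noinit(&r[i]))
			count++;
	return count;
}
```

With three regions of which one is flagged noinit, only two would reach the initialization call, mirroring the `if (!memblock_is_reserved_noinit(region))` guard added by the patch.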
From: Usama Arif
To: linux-mm@kvack.org, muchun.song@linux.dev, mike.kravetz@oracle.com,
 rppt@kernel.org
Cc: linux-kernel@vger.kernel.org, songmuchun@bytedance.com,
 fam.zheng@bytedance.com, liangma@liangbit.com, punit.agrawal@bytedance.com,
 Usama Arif
Subject: [v5 4/4] mm: hugetlb: Skip initialization of gigantic tail struct pages if freed by HVO
Date: Wed, 13 Sep 2023 11:54:01 +0100
Message-Id: <20230913105401.519709-5-usama.arif@bytedance.com>
In-Reply-To: <20230913105401.519709-1-usama.arif@bytedance.com>
References: <20230913105401.519709-1-usama.arif@bytedance.com>

The new boot flow when it comes to initialization of gigantic pages
is as follows:
- At boot time, for a gigantic page during __alloc_bootmem_hugepage,
  the region after the first struct page is marked as noinit.
- This results in only the first struct page being initialized in
  reserve_bootmem_region. As the tail struct pages are not initialized
  at this point, there can be a significant saving in boot time if HVO
  succeeds later on.
- Later on in the boot, the head page is prepped and the first
  HUGETLB_VMEMMAP_RESERVE_SIZE / sizeof(struct page) - 1 tail struct
  pages are initialized.
- HVO is attempted. If it is not successful, then the rest of the tail
  struct pages are initialized. If it is successful, no more tail
  struct pages need to be initialized, saving significant boot time.

The WARN_ON for an increased ref count in gather_bootmem_prealloc was
changed to a VM_BUG_ON. This is OK as there should be no speculative
references this early in the boot process. The VM_BUG_ONs are there
just in case such code is introduced.

Signed-off-by: Usama Arif
---
 mm/hugetlb.c         | 63 +++++++++++++++++++++++++++++++++++++-------
 mm/hugetlb_vmemmap.c |  2 +-
 mm/hugetlb_vmemmap.h |  9 ++++---
 mm/internal.h        |  3 +++
 mm/mm_init.c         |  2 +-
 5 files changed, 64 insertions(+), 15 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index c32ca241df4b..ed37c6e4e952 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -3169,6 +3169,15 @@ int __alloc_bootmem_huge_page(struct hstate *h, int nid)
 	}

 found:
+
+	/*
+	 * Only initialize the head struct page in memmap_init_reserved_pages,
+	 * rest of the struct pages will be initialized by the HugeTLB
+	 * subsystem itself.
+	 * The head struct page is used to get folio information by the
+	 * HugeTLB subsystem like zone id and node id.
+	 */
+	memblock_reserved_mark_noinit(virt_to_phys((void *)m + PAGE_SIZE),
+		huge_page_size(h) - PAGE_SIZE);
 	/* Put them into a private list first because mem_map is not up yet */
 	INIT_LIST_HEAD(&m->list);
 	list_add(&m->list, &huge_boot_pages);
@@ -3176,6 +3185,42 @@ int __alloc_bootmem_huge_page(struct hstate *h, int nid)
 	return 1;
 }

+/* Initialize [start_page:end_page_number] tail struct pages of a hugepage */
+static void __init hugetlb_folio_init_tail_vmemmap(struct folio *folio,
+					unsigned long start_page_number,
+					unsigned long end_page_number)
+{
+	enum zone_type zone = zone_idx(folio_zone(folio));
+	int nid = folio_nid(folio);
+	unsigned long head_pfn = folio_pfn(folio);
+	unsigned long pfn, end_pfn = head_pfn + end_page_number;
+	int ret;
+
+	for (pfn = head_pfn + start_page_number; pfn < end_pfn; pfn++) {
+		struct page *page = pfn_to_page(pfn);
+
+		__init_single_page(page, pfn, zone, nid);
+		prep_compound_tail((struct page *)folio, pfn - head_pfn);
+		ret = page_ref_freeze(page, 1);
+		VM_BUG_ON(!ret);
+	}
+}
+
+static void __init hugetlb_folio_init_vmemmap(struct folio *folio,
+					struct hstate *h,
+					unsigned long nr_pages)
+{
+	int ret;
+
+	/* Prepare folio head */
+	__folio_clear_reserved(folio);
+	__folio_set_head(folio);
+	ret = page_ref_freeze(&folio->page, 1);
+	VM_BUG_ON(!ret);
+	/* Initialize the necessary tail struct pages */
+	hugetlb_folio_init_tail_vmemmap(folio, 1, nr_pages);
+	prep_compound_head((struct page *)folio, huge_page_order(h));
+}
+
 /*
  * Put bootmem huge pages into the standard lists after mem_map is up.
  * Note: This only applies to gigantic (order > MAX_ORDER) pages.
@@ -3186,19 +3231,19 @@ static void __init gather_bootmem_prealloc(void)

 	list_for_each_entry(m, &huge_boot_pages, list) {
 		struct page *page = virt_to_page(m);
-		struct folio *folio = page_folio(page);
+		struct folio *folio = (void *)page;
 		struct hstate *h = m->hstate;

 		VM_BUG_ON(!hstate_is_gigantic(h));
 		WARN_ON(folio_ref_count(folio) != 1);
-		if (prep_compound_gigantic_folio(folio, huge_page_order(h))) {
-			WARN_ON(folio_test_reserved(folio));
-			prep_new_hugetlb_folio(h, folio, folio_nid(folio));
-			free_huge_folio(folio); /* add to the hugepage allocator */
-		} else {
-			/* VERY unlikely inflated ref count on a tail page */
-			free_gigantic_folio(folio, huge_page_order(h));
-		}
+
+		hugetlb_folio_init_vmemmap(folio, h, HUGETLB_VMEMMAP_RESERVE_PAGES);
+		prep_new_hugetlb_folio(h, folio, folio_nid(folio));
+		/* If HVO fails, initialize all tail struct pages */
+		if (!HPageVmemmapOptimized(&folio->page))
+			hugetlb_folio_init_tail_vmemmap(folio,
+					HUGETLB_VMEMMAP_RESERVE_PAGES,
+					pages_per_huge_page(h));
+		free_huge_folio(folio); /* add to the hugepage allocator */

 		/*
 		 * We need to restore the 'stolen' pages to totalram_pages
diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
index 3cdb38d87a95..772a877918d7 100644
--- a/mm/hugetlb_vmemmap.c
+++ b/mm/hugetlb_vmemmap.c
@@ -589,7 +589,7 @@ static int __init hugetlb_vmemmap_init(void)
 	const struct hstate *h;

 	/* HUGETLB_VMEMMAP_RESERVE_SIZE should cover all used struct pages */
-	BUILD_BUG_ON(__NR_USED_SUBPAGE * sizeof(struct page) > HUGETLB_VMEMMAP_RESERVE_SIZE);
+	BUILD_BUG_ON(__NR_USED_SUBPAGE > HUGETLB_VMEMMAP_RESERVE_PAGES);

 	for_each_hstate(h) {
 		if (hugetlb_vmemmap_optimizable(h)) {
diff --git a/mm/hugetlb_vmemmap.h b/mm/hugetlb_vmemmap.h
index 25bd0e002431..4573899855d7 100644
--- a/mm/hugetlb_vmemmap.h
+++ b/mm/hugetlb_vmemmap.h
@@ -10,15 +10,16 @@
 #define _LINUX_HUGETLB_VMEMMAP_H
 #include

-#ifdef CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP
-int hugetlb_vmemmap_restore(const struct hstate *h, struct page *head);
-void hugetlb_vmemmap_optimize(const struct hstate *h, struct page *head);
-
 /*
  * Reserve one vmemmap page, all vmemmap addresses are mapped to it. See
  * Documentation/vm/vmemmap_dedup.rst.
  */
 #define HUGETLB_VMEMMAP_RESERVE_SIZE	PAGE_SIZE
+#define HUGETLB_VMEMMAP_RESERVE_PAGES	(HUGETLB_VMEMMAP_RESERVE_SIZE / sizeof(struct page))
+
+#ifdef CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP
+int hugetlb_vmemmap_restore(const struct hstate *h, struct page *head);
+void hugetlb_vmemmap_optimize(const struct hstate *h, struct page *head);

 static inline unsigned int hugetlb_vmemmap_size(const struct hstate *h)
 {
diff --git a/mm/internal.h b/mm/internal.h
index d1d4bf4e63c0..d74061aa6de7 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -1154,4 +1154,7 @@ struct vma_prepare {
 	struct vm_area_struct *remove;
 	struct vm_area_struct *remove2;
 };
+
+void __meminit __init_single_page(struct page *page, unsigned long pfn,
+				  unsigned long zone, int nid);
 #endif	/* __MM_INTERNAL_H */
diff --git a/mm/mm_init.c b/mm/mm_init.c
index 50f2f34745af..fed4370b02e1 100644
--- a/mm/mm_init.c
+++ b/mm/mm_init.c
@@ -555,7 +555,7 @@ static void __init find_zone_movable_pfns_for_nodes(void)
 	node_states[N_MEMORY] = saved_node_state;
 }

-static void __meminit __init_single_page(struct page *page, unsigned long pfn,
+void __meminit __init_single_page(struct page *page, unsigned long pfn,
 				unsigned long zone, int nid)
 {
 	mm_zero_struct_page(page);
--
2.25.1
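To put rough numbers on the boot-time saving this series targets, here is a back-of-envelope sketch. It assumes 4 KiB base pages and a 64-byte struct page; both are common but configuration-dependent, and the macro names are invented mirrors of the ones in the patch, so treat the figures as illustrative only.

```c
#include <assert.h>

/* Illustrative assumptions, not guaranteed on every kernel config. */
#define MOCK_PAGE_SIZE		4096UL
#define MOCK_STRUCT_PAGE_SIZE	64UL

/*
 * Mirrors HUGETLB_VMEMMAP_RESERVE_PAGES from the patch: how many
 * struct pages fit in the single vmemmap page that HVO keeps.
 */
static unsigned long vmemmap_reserve_pages(void)
{
	return MOCK_PAGE_SIZE / MOCK_STRUCT_PAGE_SIZE;
}

/*
 * struct pages initialized up front for one gigantic page under the
 * new flow: the head plus the tails covered by the retained vmemmap
 * page.
 */
static unsigned long init_upfront(void)
{
	return vmemmap_reserve_pages();
}

/* Every struct page must be initialized when HVO is off or fails. */
static unsigned long init_without_hvo(unsigned long huge_page_size)
{
	return huge_page_size / MOCK_PAGE_SIZE;
}
```

Under these assumptions a 1 GiB gigantic page has 262144 struct pages, of which only 64 are initialized up front; the remaining ~262k are initialized later only if HVO fails, which is where the boot-time saving comes from.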