From: Usama Arif
To: linux-mm@kvack.org, muchun.song@linux.dev, mike.kravetz@oracle.com, rppt@kernel.org
Cc: linux-kernel@vger.kernel.org, songmuchun@bytedance.com, fam.zheng@bytedance.com, liangma@liangbit.com, punit.agrawal@bytedance.com, Usama Arif
Subject: [v4 1/4] mm: hugetlb_vmemmap: Use nid of the head page to reallocate it
Date: Wed, 6 Sep 2023 12:26:02 +0100
Message-Id: <20230906112605.2286994-2-usama.arif@bytedance.com>
In-Reply-To: <20230906112605.2286994-1-usama.arif@bytedance.com>
References: <20230906112605.2286994-1-usama.arif@bytedance.com>

If tail page prep and initialization is skipped, then the "start" page
will not contain the correct nid. Use the nid from the first vmemmap
page instead.
Signed-off-by: Usama Arif
Reviewed-by: Muchun Song
Reviewed-by: Mike Kravetz
---
 mm/hugetlb_vmemmap.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
index aeb7dd889eee..3cdb38d87a95 100644
--- a/mm/hugetlb_vmemmap.c
+++ b/mm/hugetlb_vmemmap.c
@@ -319,7 +319,7 @@ static int vmemmap_remap_free(unsigned long start, unsigned long end,
 		.reuse_addr	= reuse,
 		.vmemmap_pages	= &vmemmap_pages,
 	};
-	int nid = page_to_nid((struct page *)start);
+	int nid = page_to_nid((struct page *)reuse);
 	gfp_t gfp_mask = GFP_KERNEL | __GFP_THISNODE | __GFP_NORETRY |
 			 __GFP_NOWARN;

-- 
2.25.1
From: Usama Arif
To: linux-mm@kvack.org, muchun.song@linux.dev, mike.kravetz@oracle.com, rppt@kernel.org
Cc: linux-kernel@vger.kernel.org, songmuchun@bytedance.com, fam.zheng@bytedance.com, liangma@liangbit.com, punit.agrawal@bytedance.com, Usama Arif
Subject: [v4 2/4] memblock: pass memblock_type to memblock_setclr_flag
Date: Wed, 6 Sep 2023 12:26:03 +0100
Message-Id: <20230906112605.2286994-3-usama.arif@bytedance.com>
In-Reply-To: <20230906112605.2286994-1-usama.arif@bytedance.com>
References: <20230906112605.2286994-1-usama.arif@bytedance.com>

This allows setting flags on both memblock types and is in preparation
for setting flags (e.g. to not initialize struct pages) on reserved
memory regions.

Signed-off-by: Usama Arif
Reviewed-by: Muchun Song
Reviewed-by: Mike Rapoport (IBM)
Acked-by: Mike Kravetz
---
 mm/memblock.c | 15 +++++++--------
 1 file changed, 7 insertions(+), 8 deletions(-)

diff --git a/mm/memblock.c b/mm/memblock.c
index 913b2520a9a0..a49efbaee7e0 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -901,10 +901,9 @@ int __init_memblock memblock_physmem_add(phys_addr_t base, phys_addr_t size)
  *
  * Return: 0 on success, -errno on failure.
  */
-static int __init_memblock memblock_setclr_flag(phys_addr_t base,
-				phys_addr_t size, int set, int flag)
+static int __init_memblock memblock_setclr_flag(struct memblock_type *type,
+				phys_addr_t base, phys_addr_t size, int set, int flag)
 {
-	struct memblock_type *type = &memblock.memory;
 	int i, ret, start_rgn, end_rgn;

 	ret = memblock_isolate_range(type, base, size, &start_rgn, &end_rgn);
@@ -933,7 +932,7 @@ static int __init_memblock memblock_setclr_flag(phys_addr_t base,
  */
 int __init_memblock memblock_mark_hotplug(phys_addr_t base, phys_addr_t size)
 {
-	return memblock_setclr_flag(base, size, 1, MEMBLOCK_HOTPLUG);
+	return memblock_setclr_flag(&memblock.memory, base, size, 1, MEMBLOCK_HOTPLUG);
 }

 /**
@@ -945,7 +944,7 @@ int __init_memblock memblock_mark_hotplug(phys_addr_t base, phys_addr_t size)
  */
 int __init_memblock memblock_clear_hotplug(phys_addr_t base, phys_addr_t size)
 {
-	return memblock_setclr_flag(base, size, 0, MEMBLOCK_HOTPLUG);
+	return memblock_setclr_flag(&memblock.memory, base, size, 0, MEMBLOCK_HOTPLUG);
 }

 /**
@@ -962,7 +961,7 @@ int __init_memblock memblock_mark_mirror(phys_addr_t base, phys_addr_t size)

 	system_has_some_mirror = true;

-	return memblock_setclr_flag(base, size, 1, MEMBLOCK_MIRROR);
+	return memblock_setclr_flag(&memblock.memory, base, size, 1, MEMBLOCK_MIRROR);
 }

 /**
@@ -982,7 +981,7 @@ int __init_memblock memblock_mark_mirror(phys_addr_t base, phys_addr_t size)
  */
 int __init_memblock memblock_mark_nomap(phys_addr_t base, phys_addr_t size)
 {
-	return memblock_setclr_flag(base, size, 1, MEMBLOCK_NOMAP);
+	return memblock_setclr_flag(&memblock.memory, base, size, 1, MEMBLOCK_NOMAP);
 }

 /**
@@ -994,7 +993,7 @@ int __init_memblock memblock_mark_nomap(phys_addr_t base, phys_addr_t size)
  */
 int __init_memblock memblock_clear_nomap(phys_addr_t base, phys_addr_t size)
 {
-	return memblock_setclr_flag(base, size, 0, MEMBLOCK_NOMAP);
+	return memblock_setclr_flag(&memblock.memory, base, size, 0, MEMBLOCK_NOMAP);
 }

 static bool should_skip_region(struct memblock_type *type,
-- 
2.25.1
From: Usama Arif
To: linux-mm@kvack.org, muchun.song@linux.dev, mike.kravetz@oracle.com, rppt@kernel.org
Cc: linux-kernel@vger.kernel.org, songmuchun@bytedance.com, fam.zheng@bytedance.com, liangma@liangbit.com, punit.agrawal@bytedance.com, Usama Arif
Subject: [v4 3/4] memblock: introduce MEMBLOCK_RSRV_NOINIT flag
Date: Wed, 6 Sep 2023 12:26:04 +0100
Message-Id: <20230906112605.2286994-4-usama.arif@bytedance.com>
In-Reply-To: <20230906112605.2286994-1-usama.arif@bytedance.com>
References: <20230906112605.2286994-1-usama.arif@bytedance.com>

For reserved memory regions marked with this flag,
reserve_bootmem_region is not called during memmap_init_reserved_pages.
This can be used to avoid struct page initialization for regions that
won't need them, e.g. hugepages with HVO enabled.

Signed-off-by: Usama Arif
Acked-by: Muchun Song
Reviewed-by: Mike Rapoport (IBM)
---
 include/linux/memblock.h |  9 +++++++++
 mm/memblock.c            | 33 ++++++++++++++++++++++++++++-----
 2 files changed, 37 insertions(+), 5 deletions(-)

diff --git a/include/linux/memblock.h b/include/linux/memblock.h
index 1c1072e3ca06..ae3bde302f70 100644
--- a/include/linux/memblock.h
+++ b/include/linux/memblock.h
@@ -40,6 +40,8 @@ extern unsigned long long max_possible_pfn;
  * via a driver, and never indicated in the firmware-provided memory map as
  * system RAM. This corresponds to IORESOURCE_SYSRAM_DRIVER_MANAGED in the
  * kernel resource tree.
+ * @MEMBLOCK_RSRV_NOINIT: memory region for which struct pages are
+ * not initialized (only for reserved regions).
  */
 enum memblock_flags {
 	MEMBLOCK_NONE		= 0x0,	/* No special request */
@@ -47,6 +49,7 @@ enum memblock_flags {
 	MEMBLOCK_MIRROR		= 0x2,	/* mirrored region */
 	MEMBLOCK_NOMAP		= 0x4,	/* don't add to kernel direct mapping */
 	MEMBLOCK_DRIVER_MANAGED = 0x8,	/* always detected via a driver */
+	MEMBLOCK_RSRV_NOINIT	= 0x10,	/* don't initialize struct pages */
 };

 /**
@@ -125,6 +128,7 @@ int memblock_clear_hotplug(phys_addr_t base, phys_addr_t size);
 int memblock_mark_mirror(phys_addr_t base, phys_addr_t size);
 int memblock_mark_nomap(phys_addr_t base, phys_addr_t size);
 int memblock_clear_nomap(phys_addr_t base, phys_addr_t size);
+int memblock_reserved_mark_noinit(phys_addr_t base, phys_addr_t size);

 void memblock_free_all(void);
 void memblock_free(void *ptr, size_t size);
@@ -259,6 +263,11 @@ static inline bool memblock_is_nomap(struct memblock_region *m)
 	return m->flags & MEMBLOCK_NOMAP;
 }

+static inline bool memblock_is_reserved_noinit(struct memblock_region *m)
+{
+	return m->flags & MEMBLOCK_RSRV_NOINIT;
+}
+
 static inline bool memblock_is_driver_managed(struct memblock_region *m)
 {
 	return m->flags & MEMBLOCK_DRIVER_MANAGED;
diff --git a/mm/memblock.c b/mm/memblock.c
index a49efbaee7e0..8f7a0cb668d4 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -996,6 +996,24 @@ int __init_memblock memblock_clear_nomap(phys_addr_t base, phys_addr_t size)
 	return memblock_setclr_flag(&memblock.memory, base, size, 0, MEMBLOCK_NOMAP);
 }

+/**
+ * memblock_reserved_mark_noinit - Mark a reserved memory region with flag
+ * MEMBLOCK_RSRV_NOINIT which results in the struct pages not being initialized
+ * for this region.
+ * @base: the base phys addr of the region
+ * @size: the size of the region
+ *
+ * struct pages will not be initialized for reserved memory regions marked with
+ * %MEMBLOCK_RSRV_NOINIT.
+ *
+ * Return: 0 on success, -errno on failure.
+ */
+int __init_memblock memblock_reserved_mark_noinit(phys_addr_t base, phys_addr_t size)
+{
+	return memblock_setclr_flag(&memblock.reserved, base, size, 1,
+				    MEMBLOCK_RSRV_NOINIT);
+}
+
 static bool should_skip_region(struct memblock_type *type,
 			       struct memblock_region *m,
 			       int nid, int flags)
@@ -2112,13 +2130,18 @@ static void __init memmap_init_reserved_pages(void)
 		memblock_set_node(start, end, &memblock.reserved, nid);
 	}

-	/* initialize struct pages for the reserved regions */
+	/*
+	 * initialize struct pages for reserved regions that don't have
+	 * the MEMBLOCK_RSRV_NOINIT flag set
+	 */
 	for_each_reserved_mem_region(region) {
-		nid = memblock_get_region_node(region);
-		start = region->base;
-		end = start + region->size;
+		if (!memblock_is_reserved_noinit(region)) {
+			nid = memblock_get_region_node(region);
+			start = region->base;
+			end = start + region->size;

-		reserve_bootmem_region(start, end, nid);
+			reserve_bootmem_region(start, end, nid);
+		}
 	}
 }
-- 
2.25.1
From: Usama Arif
To: linux-mm@kvack.org, muchun.song@linux.dev, mike.kravetz@oracle.com, rppt@kernel.org
Cc: linux-kernel@vger.kernel.org, songmuchun@bytedance.com, fam.zheng@bytedance.com, liangma@liangbit.com, punit.agrawal@bytedance.com, Usama Arif
Subject: [v4 4/4] mm: hugetlb: Skip initialization of gigantic tail struct pages if freed by HVO
Date: Wed, 6 Sep 2023 12:26:05 +0100
Message-Id: <20230906112605.2286994-5-usama.arif@bytedance.com>
In-Reply-To: <20230906112605.2286994-1-usama.arif@bytedance.com>
References: <20230906112605.2286994-1-usama.arif@bytedance.com>

The new boot flow for initialization of gigantic pages is as follows:
- At boot time, for a gigantic page during __alloc_bootmem_hugepage,
  the region after the first struct page is marked as noinit.
- This results in only the first struct page being initialized in
  reserve_bootmem_region. As the tail struct pages are not initialized
  at this point, there can be a significant saving in boot time if HVO
  succeeds later on.
- Later on in the boot, the head page is prepped and the first
  HUGETLB_VMEMMAP_RESERVE_SIZE / sizeof(struct page) - 1 tail struct
  pages are initialized.
- HVO is attempted. If it is not successful, then the rest of the tail
  struct pages are initialized. If it is successful, no more tail
  struct pages need to be initialized, saving significant boot time.

Signed-off-by: Usama Arif
---
 mm/hugetlb.c         | 61 +++++++++++++++++++++++++++++++++++++-------
 mm/hugetlb_vmemmap.c |  2 +-
 mm/hugetlb_vmemmap.h |  9 ++++---
 mm/internal.h        |  3 +++
 mm/mm_init.c         |  2 +-
 5 files changed, 62 insertions(+), 15 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index c32ca241df4b..540e0386514e 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -3169,6 +3169,15 @@ int __alloc_bootmem_huge_page(struct hstate *h, int nid)
 	}

 found:
+
+	/*
+	 * Only initialize the head struct page in memmap_init_reserved_pages,
+	 * rest of the struct pages will be initialized by the HugeTLB
+	 * subsystem itself.
+	 * The head struct page is used to get folio information by the
+	 * HugeTLB subsystem like zone id and node id.
+	 */
+	memblock_reserved_mark_noinit(virt_to_phys((void *)m + PAGE_SIZE),
+				      huge_page_size(h) - PAGE_SIZE);
 	/* Put them into a private list first because mem_map is not up yet */
 	INIT_LIST_HEAD(&m->list);
 	list_add(&m->list, &huge_boot_pages);
@@ -3176,6 +3185,40 @@ int __alloc_bootmem_huge_page(struct hstate *h, int nid)
 	return 1;
 }

+/* Initialize [start_page:end_page_number] tail struct pages of a hugepage */
+static void __init hugetlb_folio_init_tail_vmemmap(struct folio *folio,
+					unsigned long start_page_number,
+					unsigned long end_page_number)
+{
+	enum zone_type zone = zone_idx(folio_zone(folio));
+	int nid = folio_nid(folio);
+	unsigned long head_pfn = folio_pfn(folio);
+	unsigned long pfn, end_pfn = head_pfn + end_page_number;
+
+	for (pfn = head_pfn + start_page_number; pfn < end_pfn; pfn++) {
+		struct page *page = pfn_to_page(pfn);
+
+		__init_single_page(page, pfn, zone, nid);
+		prep_compound_tail((struct page *)folio, pfn - head_pfn);
+		set_page_count(page, 0);
+	}
+}
+
+static void __init hugetlb_folio_init_vmemmap(struct folio *folio,
+					struct hstate *h,
+					unsigned long nr_pages)
+{
+	int ret;
+
+	/* Prepare folio head */
+	__folio_clear_reserved(folio);
+	__folio_set_head(folio);
+	ret = page_ref_freeze(&folio->page, 1);
+	VM_BUG_ON(!ret);
+	/* Initialize the necessary tail struct pages */
+	hugetlb_folio_init_tail_vmemmap(folio, 1, nr_pages);
+	prep_compound_head((struct page *)folio, huge_page_order(h));
+}
+
 /*
  * Put bootmem huge pages into the standard lists after mem_map is up.
  * Note: This only applies to gigantic (order > MAX_ORDER) pages.
@@ -3186,19 +3229,19 @@ static void __init gather_bootmem_prealloc(void)

 	list_for_each_entry(m, &huge_boot_pages, list) {
 		struct page *page = virt_to_page(m);
-		struct folio *folio = page_folio(page);
+		struct folio *folio = (void *)page;
 		struct hstate *h = m->hstate;

 		VM_BUG_ON(!hstate_is_gigantic(h));
 		WARN_ON(folio_ref_count(folio) != 1);
-		if (prep_compound_gigantic_folio(folio, huge_page_order(h))) {
-			WARN_ON(folio_test_reserved(folio));
-			prep_new_hugetlb_folio(h, folio, folio_nid(folio));
-			free_huge_folio(folio); /* add to the hugepage allocator */
-		} else {
-			/* VERY unlikely inflated ref count on a tail page */
-			free_gigantic_folio(folio, huge_page_order(h));
-		}
+
+		hugetlb_folio_init_vmemmap(folio, h, HUGETLB_VMEMMAP_RESERVE_PAGES);
+		prep_new_hugetlb_folio(h, folio, folio_nid(folio));
+		/* If HVO fails, initialize all tail struct pages */
+		if (!HPageVmemmapOptimized(&folio->page))
+			hugetlb_folio_init_tail_vmemmap(folio,
+					HUGETLB_VMEMMAP_RESERVE_PAGES,
+					pages_per_huge_page(h));
+		free_huge_folio(folio); /* add to the hugepage allocator */

 		/*
 		 * We need to restore the 'stolen' pages to totalram_pages
diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
index 3cdb38d87a95..772a877918d7 100644
--- a/mm/hugetlb_vmemmap.c
+++ b/mm/hugetlb_vmemmap.c
@@ -589,7 +589,7 @@ static int __init hugetlb_vmemmap_init(void)
 	const struct hstate *h;

 	/* HUGETLB_VMEMMAP_RESERVE_SIZE should cover all used struct pages */
-	BUILD_BUG_ON(__NR_USED_SUBPAGE * sizeof(struct page) > HUGETLB_VMEMMAP_RESERVE_SIZE);
+	BUILD_BUG_ON(__NR_USED_SUBPAGE > HUGETLB_VMEMMAP_RESERVE_PAGES);

 	for_each_hstate(h) {
 		if (hugetlb_vmemmap_optimizable(h)) {
diff --git a/mm/hugetlb_vmemmap.h b/mm/hugetlb_vmemmap.h
index 25bd0e002431..4573899855d7 100644
--- a/mm/hugetlb_vmemmap.h
+++ b/mm/hugetlb_vmemmap.h
@@ -10,15 +10,16 @@
 #define _LINUX_HUGETLB_VMEMMAP_H
 #include <linux/hugetlb.h>

-#ifdef CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP
-int hugetlb_vmemmap_restore(const struct hstate *h, struct page *head);
-void hugetlb_vmemmap_optimize(const struct hstate *h, struct page *head);
-
 /*
  * Reserve one vmemmap page, all vmemmap addresses are mapped to it. See
  * Documentation/vm/vmemmap_dedup.rst.
  */
 #define HUGETLB_VMEMMAP_RESERVE_SIZE	PAGE_SIZE
+#define HUGETLB_VMEMMAP_RESERVE_PAGES	(HUGETLB_VMEMMAP_RESERVE_SIZE / sizeof(struct page))
+
+#ifdef CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP
+int hugetlb_vmemmap_restore(const struct hstate *h, struct page *head);
+void hugetlb_vmemmap_optimize(const struct hstate *h, struct page *head);

 static inline unsigned int hugetlb_vmemmap_size(const struct hstate *h)
 {
diff --git a/mm/internal.h b/mm/internal.h
index d1d4bf4e63c0..d74061aa6de7 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -1154,4 +1154,7 @@ struct vma_prepare {
 	struct vm_area_struct *remove;
 	struct vm_area_struct *remove2;
 };
+
+void __meminit __init_single_page(struct page *page, unsigned long pfn,
+				unsigned long zone, int nid);
 #endif	/* __MM_INTERNAL_H */
diff --git a/mm/mm_init.c b/mm/mm_init.c
index 50f2f34745af..fed4370b02e1 100644
--- a/mm/mm_init.c
+++ b/mm/mm_init.c
@@ -555,7 +555,7 @@ static void __init find_zone_movable_pfns_for_nodes(void)
 	node_states[N_MEMORY] = saved_node_state;
 }

-static void __meminit __init_single_page(struct page *page, unsigned long pfn,
+void __meminit __init_single_page(struct page *page, unsigned long pfn,
 					unsigned long zone, int nid)
 {
 	mm_zero_struct_page(page);
-- 
2.25.1