From nobody Wed Dec 17 02:26:05 2025
From: Usama Arif
To: linux-mm@kvack.org, muchun.song@linux.dev, mike.kravetz@oracle.com, rppt@kernel.org
Cc: linux-kernel@vger.kernel.org, songmuchun@bytedance.com, fam.zheng@bytedance.com, liangma@liangbit.com, punit.agrawal@bytedance.com, Usama Arif
Subject: [v3 1/4] mm: hugetlb_vmemmap: Use nid of the head page to reallocate it
Date: Fri, 25 Aug 2023 12:18:33 +0100
Message-Id: <20230825111836.1715308-2-usama.arif@bytedance.com>
In-Reply-To: <20230825111836.1715308-1-usama.arif@bytedance.com>
References: <20230825111836.1715308-1-usama.arif@bytedance.com>
X-Mailing-List: linux-kernel@vger.kernel.org

If tail page prep and initialization are skipped, then the "start"
page will not contain the correct nid. Use the nid from the first
vmemmap page (the reuse address) instead.

Signed-off-by: Usama Arif
Reviewed-by: Mike Kravetz
Reviewed-by: Muchun Song
---
 mm/hugetlb_vmemmap.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
index c2007ef5e9b0..208907f2c5e1 100644
--- a/mm/hugetlb_vmemmap.c
+++ b/mm/hugetlb_vmemmap.c
@@ -324,7 +324,7 @@ static int vmemmap_remap_free(unsigned long start, unsigned long end,
 		.reuse_addr	= reuse,
 		.vmemmap_pages	= &vmemmap_pages,
 	};
-	int nid = page_to_nid((struct page *)start);
+	int nid = page_to_nid((struct page *)reuse);
 	gfp_t gfp_mask = GFP_KERNEL | __GFP_THISNODE | __GFP_NORETRY |
 			__GFP_NOWARN;
 
-- 
2.25.1

From nobody Wed Dec 17 02:26:05 2025
From: Usama Arif
To: linux-mm@kvack.org, muchun.song@linux.dev, mike.kravetz@oracle.com, rppt@kernel.org
Cc: linux-kernel@vger.kernel.org, songmuchun@bytedance.com, fam.zheng@bytedance.com, liangma@liangbit.com, punit.agrawal@bytedance.com, Usama Arif
Subject: [v3 2/4] memblock: pass memblock_type to memblock_setclr_flag
Date: Fri, 25 Aug 2023 12:18:34 +0100
Message-Id: <20230825111836.1715308-3-usama.arif@bytedance.com>
In-Reply-To: <20230825111836.1715308-1-usama.arif@bytedance.com>
References: <20230825111836.1715308-1-usama.arif@bytedance.com>
X-Mailing-List: linux-kernel@vger.kernel.org

This allows setting flags on both memblock types (memory and reserved)
and is in preparation for setting flags (e.g. to skip struct page
initialization) on reserved memory regions.

Signed-off-by: Usama Arif
Acked-by: Mike Kravetz
Reviewed-by: Mike Rapoport (IBM)
Reviewed-by: Muchun Song
---
 mm/memblock.c | 15 +++++++--------
 1 file changed, 7 insertions(+), 8 deletions(-)

diff --git a/mm/memblock.c b/mm/memblock.c
index f9e61e565a53..43cb4404d94c 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -896,10 +896,9 @@ int __init_memblock memblock_physmem_add(phys_addr_t base, phys_addr_t size)
  *
  * Return: 0 on success, -errno on failure.
  */
-static int __init_memblock memblock_setclr_flag(phys_addr_t base,
-				phys_addr_t size, int set, int flag)
+static int __init_memblock memblock_setclr_flag(struct memblock_type *type,
+				phys_addr_t base, phys_addr_t size, int set, int flag)
 {
-	struct memblock_type *type = &memblock.memory;
 	int i, ret, start_rgn, end_rgn;
 
 	ret = memblock_isolate_range(type, base, size, &start_rgn, &end_rgn);
@@ -928,7 +927,7 @@ static int __init_memblock memblock_setclr_flag(phys_addr_t base,
  */
 int __init_memblock memblock_mark_hotplug(phys_addr_t base, phys_addr_t size)
 {
-	return memblock_setclr_flag(base, size, 1, MEMBLOCK_HOTPLUG);
+	return memblock_setclr_flag(&memblock.memory, base, size, 1, MEMBLOCK_HOTPLUG);
 }
 
 /**
@@ -940,7 +939,7 @@ int __init_memblock memblock_mark_hotplug(phys_addr_t base, phys_addr_t size)
  */
 int __init_memblock memblock_clear_hotplug(phys_addr_t base, phys_addr_t size)
 {
-	return memblock_setclr_flag(base, size, 0, MEMBLOCK_HOTPLUG);
+	return memblock_setclr_flag(&memblock.memory, base, size, 0, MEMBLOCK_HOTPLUG);
 }
 
 /**
@@ -957,7 +956,7 @@ int __init_memblock memblock_mark_mirror(phys_addr_t base, phys_addr_t size)
 
 	system_has_some_mirror = true;
 
-	return memblock_setclr_flag(base, size, 1, MEMBLOCK_MIRROR);
+	return memblock_setclr_flag(&memblock.memory, base, size, 1, MEMBLOCK_MIRROR);
 }
 
 /**
@@ -977,7 +976,7 @@ int __init_memblock memblock_mark_mirror(phys_addr_t base, phys_addr_t size)
  */
 int __init_memblock memblock_mark_nomap(phys_addr_t base, phys_addr_t size)
 {
-	return memblock_setclr_flag(base, size, 1, MEMBLOCK_NOMAP);
+	return memblock_setclr_flag(&memblock.memory, base, size, 1, MEMBLOCK_NOMAP);
 }
 
 /**
@@ -989,7 +988,7 @@ int __init_memblock memblock_mark_nomap(phys_addr_t base, phys_addr_t size)
  */
 int __init_memblock memblock_clear_nomap(phys_addr_t base, phys_addr_t size)
 {
-	return memblock_setclr_flag(base, size, 0, MEMBLOCK_NOMAP);
+	return memblock_setclr_flag(&memblock.memory, base, size, 0, MEMBLOCK_NOMAP);
 }
 
 static bool should_skip_region(struct memblock_type *type,
-- 
2.25.1

From nobody Wed Dec 17 02:26:05 2025
From: Usama Arif
To: linux-mm@kvack.org, muchun.song@linux.dev, mike.kravetz@oracle.com, rppt@kernel.org
Cc: linux-kernel@vger.kernel.org, songmuchun@bytedance.com, fam.zheng@bytedance.com, liangma@liangbit.com, punit.agrawal@bytedance.com, Usama Arif
Subject: [v3 3/4] memblock: introduce MEMBLOCK_RSRV_NOINIT_VMEMMAP flag
Date: Fri, 25 Aug 2023 12:18:35 +0100
Message-Id: <20230825111836.1715308-4-usama.arif@bytedance.com>
In-Reply-To: <20230825111836.1715308-1-usama.arif@bytedance.com>
References: <20230825111836.1715308-1-usama.arif@bytedance.com>
X-Mailing-List: linux-kernel@vger.kernel.org

For reserved memory regions marked with this flag, reserve_bootmem_region is
not called during memmap_init_reserved_pages. This can be used to avoid
struct page initialization for regions which won't need them, e.g.
hugepages with HVO enabled.

Signed-off-by: Usama Arif
Reviewed-by: Muchun Song
---
 include/linux/memblock.h | 10 ++++++++++
 mm/memblock.c            | 32 +++++++++++++++++++++++++++-----
 2 files changed, 37 insertions(+), 5 deletions(-)

diff --git a/include/linux/memblock.h b/include/linux/memblock.h
index f71ff9f0ec81..6d681d053880 100644
--- a/include/linux/memblock.h
+++ b/include/linux/memblock.h
@@ -40,6 +40,8 @@ extern unsigned long long max_possible_pfn;
  * via a driver, and never indicated in the firmware-provided memory map as
  * system RAM. This corresponds to IORESOURCE_SYSRAM_DRIVER_MANAGED in the
  * kernel resource tree.
+ * @MEMBLOCK_RSRV_NOINIT_VMEMMAP: memory region for which struct pages are
+ * not initialized (only for reserved regions).
  */
 enum memblock_flags {
 	MEMBLOCK_NONE		= 0x0,	/* No special request */
@@ -47,6 +49,8 @@ enum memblock_flags {
 	MEMBLOCK_MIRROR		= 0x2,	/* mirrored region */
 	MEMBLOCK_NOMAP		= 0x4,	/* don't add to kernel direct mapping */
 	MEMBLOCK_DRIVER_MANAGED = 0x8,	/* always detected via a driver */
+	/* don't initialize struct pages associated with this reserved memory block */
+	MEMBLOCK_RSRV_NOINIT_VMEMMAP = 0x10,
 };
 
 /**
@@ -125,6 +129,7 @@ int memblock_clear_hotplug(phys_addr_t base, phys_addr_t size);
 int memblock_mark_mirror(phys_addr_t base, phys_addr_t size);
 int memblock_mark_nomap(phys_addr_t base, phys_addr_t size);
 int memblock_clear_nomap(phys_addr_t base, phys_addr_t size);
+int memblock_reserved_mark_noinit_vmemmap(phys_addr_t base, phys_addr_t size);
 
 void memblock_free_all(void);
 void memblock_free(void *ptr, size_t size);
@@ -259,6 +264,11 @@ static inline bool memblock_is_nomap(struct memblock_region *m)
 	return m->flags & MEMBLOCK_NOMAP;
 }
 
+static inline bool memblock_is_noinit_vmemmap(struct memblock_region *m)
+{
+	return m->flags & MEMBLOCK_RSRV_NOINIT_VMEMMAP;
+}
+
 static inline bool memblock_is_driver_managed(struct memblock_region *m)
 {
 	return m->flags & MEMBLOCK_DRIVER_MANAGED;
diff --git a/mm/memblock.c b/mm/memblock.c
index 43cb4404d94c..a9782228c840 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -991,6 +991,23 @@ int __init_memblock memblock_clear_nomap(phys_addr_t base, phys_addr_t size)
 	return memblock_setclr_flag(&memblock.memory, base, size, 0, MEMBLOCK_NOMAP);
 }
 
+/**
+ * memblock_reserved_mark_noinit_vmemmap - Mark a reserved memory region with flag
+ * MEMBLOCK_RSRV_NOINIT_VMEMMAP.
+ * @base: the base phys addr of the region
+ * @size: the size of the region
+ *
+ * struct pages will not be initialized for reserved memory regions marked with
+ * %MEMBLOCK_RSRV_NOINIT_VMEMMAP.
+ *
+ * Return: 0 on success, -errno on failure.
+ */
+int __init_memblock memblock_reserved_mark_noinit_vmemmap(phys_addr_t base, phys_addr_t size)
+{
+	return memblock_setclr_flag(&memblock.reserved, base, size, 1,
+				    MEMBLOCK_RSRV_NOINIT_VMEMMAP);
+}
+
 static bool should_skip_region(struct memblock_type *type,
 			       struct memblock_region *m,
 			       int nid, int flags)
@@ -2107,13 +2124,18 @@ static void __init memmap_init_reserved_pages(void)
 		memblock_set_node(start, end, &memblock.reserved, nid);
 	}
 
-	/* initialize struct pages for the reserved regions */
+	/*
+	 * initialize struct pages for reserved regions that don't have
+	 * the MEMBLOCK_RSRV_NOINIT_VMEMMAP flag set
+	 */
 	for_each_reserved_mem_region(region) {
-		nid = memblock_get_region_node(region);
-		start = region->base;
-		end = start + region->size;
+		if (!memblock_is_noinit_vmemmap(region)) {
+			nid = memblock_get_region_node(region);
+			start = region->base;
+			end = start + region->size;
 
-		reserve_bootmem_region(start, end, nid);
+			reserve_bootmem_region(start, end, nid);
+		}
 	}
 }
 
-- 
2.25.1

From nobody Wed Dec 17 02:26:05 2025
From: Usama Arif
To: linux-mm@kvack.org, muchun.song@linux.dev, mike.kravetz@oracle.com, rppt@kernel.org
Cc: linux-kernel@vger.kernel.org, songmuchun@bytedance.com, fam.zheng@bytedance.com, liangma@liangbit.com, punit.agrawal@bytedance.com, Usama Arif
Subject: [v3 4/4] mm: hugetlb: Skip initialization of gigantic tail struct pages if freed by HVO
Date: Fri, 25 Aug 2023 12:18:36 +0100
Message-Id: <20230825111836.1715308-5-usama.arif@bytedance.com>
In-Reply-To: <20230825111836.1715308-1-usama.arif@bytedance.com>
References: <20230825111836.1715308-1-usama.arif@bytedance.com>
X-Mailing-List: linux-kernel@vger.kernel.org

The new boot flow for initializing gigantic pages is as follows:
- At boot time, for a gigantic page during __alloc_bootmem_huge_page,
  the region after the first struct page is marked as noinit.
- As a result, only the first struct page is initialized in
  reserve_bootmem_region. As the tail struct pages are not initialized at
  this point, there can be a significant saving in boot time if HVO
  succeeds later on.
- Later on in the boot, HVO is attempted. If it is successful, only the first
  HUGETLB_VMEMMAP_RESERVE_SIZE / sizeof(struct page) - 1 tail struct pages
  after the head struct page are initialized. If it is not successful,
  then all of the tail struct pages are initialized.

Signed-off-by: Usama Arif
---
 mm/hugetlb.c         | 52 +++++++++++++++++++++++++++++++++++---------
 mm/hugetlb_vmemmap.h |  8 +++----
 mm/internal.h        |  3 +++
 mm/mm_init.c         |  2 +-
 4 files changed, 50 insertions(+), 15 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 6da626bfb52e..964f7a2b693e 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1953,7 +1953,6 @@ static void __prep_account_new_huge_page(struct hstate *h, int nid)
 
 static void __prep_new_hugetlb_folio(struct hstate *h, struct folio *folio)
 {
-	hugetlb_vmemmap_optimize(h, &folio->page);
 	INIT_LIST_HEAD(&folio->lru);
 	folio_set_compound_dtor(folio, HUGETLB_PAGE_DTOR);
 	hugetlb_set_folio_subpool(folio, NULL);
@@ -2225,6 +2224,7 @@ static struct folio *alloc_fresh_hugetlb_folio(struct hstate *h,
 			return NULL;
 		}
 	}
+	hugetlb_vmemmap_optimize(h, &folio->page);
 	prep_new_hugetlb_folio(h, folio, folio_nid(folio));
 
 	return folio;
@@ -2943,6 +2943,7 @@ static int alloc_and_dissolve_hugetlb_folio(struct hstate *h,
 	new_folio = alloc_buddy_hugetlb_folio(h, gfp_mask, nid, NULL, NULL);
 	if (!new_folio)
 		return -ENOMEM;
+	hugetlb_vmemmap_optimize(h, &new_folio->page);
 	__prep_new_hugetlb_folio(h, new_folio);
 
 retry:
@@ -3206,6 +3207,15 @@ int __alloc_bootmem_huge_page(struct hstate *h, int nid)
 	}
 
 found:
+
+	/*
+	 * Only initialize the head struct page in memmap_init_reserved_pages,
+	 * rest of the struct pages will be initialized by the HugeTLB subsystem itself.
+	 * The head struct page is used to get folio information by the HugeTLB
+	 * subsystem like zone id and node id.
+	 */
+	memblock_reserved_mark_noinit_vmemmap(virt_to_phys((void *)m + PAGE_SIZE),
+		huge_page_size(h) - PAGE_SIZE);
 	/* Put them into a private list first because mem_map is not up yet */
 	INIT_LIST_HEAD(&m->list);
 	list_add(&m->list, &huge_boot_pages);
@@ -3213,6 +3223,27 @@ int __alloc_bootmem_huge_page(struct hstate *h, int nid)
 	return 1;
 }
 
+static void __init hugetlb_folio_init_vmemmap(struct hstate *h, struct folio *folio,
+					unsigned long nr_pages)
+{
+	enum zone_type zone = zone_idx(folio_zone(folio));
+	int nid = folio_nid(folio);
+	unsigned long head_pfn = folio_pfn(folio);
+	unsigned long pfn, end_pfn = head_pfn + nr_pages;
+
+	__folio_clear_reserved(folio);
+	__folio_set_head(folio);
+
+	for (pfn = head_pfn + 1; pfn < end_pfn; pfn++) {
+		struct page *page = pfn_to_page(pfn);
+
+		__init_single_page(page, pfn, zone, nid);
+		prep_compound_tail((struct page *)folio, pfn - head_pfn);
+		set_page_count(page, 0);
+	}
+	prep_compound_head((struct page *)folio, huge_page_order(h));
+}
+
 /*
  * Put bootmem huge pages into the standard lists after mem_map is up.
  * Note: This only applies to gigantic (order > MAX_ORDER) pages.
@@ -3223,19 +3254,19 @@ static void __init gather_bootmem_prealloc(void)
 
 	list_for_each_entry(m, &huge_boot_pages, list) {
 		struct page *page = virt_to_page(m);
-		struct folio *folio = page_folio(page);
+		struct folio *folio = (void *)page;
 		struct hstate *h = m->hstate;
+		unsigned long nr_pages = pages_per_huge_page(h);
 
 		VM_BUG_ON(!hstate_is_gigantic(h));
 		WARN_ON(folio_ref_count(folio) != 1);
-		if (prep_compound_gigantic_folio(folio, huge_page_order(h))) {
-			WARN_ON(folio_test_reserved(folio));
-			prep_new_hugetlb_folio(h, folio, folio_nid(folio));
-			free_huge_page(page); /* add to the hugepage allocator */
-		} else {
-			/* VERY unlikely inflated ref count on a tail page */
-			free_gigantic_folio(folio, huge_page_order(h));
-		}
+
+		hugetlb_vmemmap_optimize(h, &folio->page);
+		if (HPageVmemmapOptimized(&folio->page))
+			nr_pages = HUGETLB_VMEMMAP_RESERVE_SIZE / sizeof(struct page);
+		hugetlb_folio_init_vmemmap(h, folio, nr_pages);
+		prep_new_hugetlb_folio(h, folio, folio_nid(folio));
+		free_huge_page(page); /* add to the hugepage allocator */
 
 		/*
 		 * We need to restore the 'stolen' pages to totalram_pages
@@ -3656,6 +3687,7 @@ static int demote_free_hugetlb_folio(struct hstate *h, struct folio *folio)
 		else
 			prep_compound_page(subpage, target_hstate->order);
 		folio_change_private(inner_folio, NULL);
+		hugetlb_vmemmap_optimize(h, &folio->page);
 		prep_new_hugetlb_folio(target_hstate, inner_folio, nid);
 		free_huge_page(subpage);
 	}
diff --git a/mm/hugetlb_vmemmap.h b/mm/hugetlb_vmemmap.h
index 25bd0e002431..d30aff8f3573 100644
--- a/mm/hugetlb_vmemmap.h
+++ b/mm/hugetlb_vmemmap.h
@@ -10,16 +10,16 @@
 #define _LINUX_HUGETLB_VMEMMAP_H
 #include
 
-#ifdef CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP
-int hugetlb_vmemmap_restore(const struct hstate *h, struct page *head);
-void hugetlb_vmemmap_optimize(const struct hstate *h, struct page *head);
-
 /*
  * Reserve one vmemmap page, all vmemmap addresses are mapped to it. See
  * Documentation/vm/vmemmap_dedup.rst.
  */
 #define HUGETLB_VMEMMAP_RESERVE_SIZE	PAGE_SIZE
 
+#ifdef CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP
+int hugetlb_vmemmap_restore(const struct hstate *h, struct page *head);
+void hugetlb_vmemmap_optimize(const struct hstate *h, struct page *head);
+
 static inline unsigned int hugetlb_vmemmap_size(const struct hstate *h)
 {
 	return pages_per_huge_page(h) * sizeof(struct page);
diff --git a/mm/internal.h b/mm/internal.h
index a7d9e980429a..31b3d45f4609 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -1102,4 +1102,7 @@ struct vma_prepare {
 	struct vm_area_struct *remove;
 	struct vm_area_struct *remove2;
 };
+
+void __meminit __init_single_page(struct page *page, unsigned long pfn,
+				unsigned long zone, int nid);
 #endif /* __MM_INTERNAL_H */
diff --git a/mm/mm_init.c b/mm/mm_init.c
index a1963c3322af..3d4ab595ca7d 100644
--- a/mm/mm_init.c
+++ b/mm/mm_init.c
@@ -551,7 +551,7 @@ static void __init find_zone_movable_pfns_for_nodes(void)
 	node_states[N_MEMORY] = saved_node_state;
 }
 
-static void __meminit __init_single_page(struct page *page, unsigned long pfn,
+void __meminit __init_single_page(struct page *page, unsigned long pfn,
 				unsigned long zone, int nid)
 {
 	mm_zero_struct_page(page);
-- 
2.25.1
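
Note on the arithmetic behind the boot flow described in patch 4/4: the
following is a rough userspace model, not kernel code. Every sketch_* name is
hypothetical, and the constants are assumptions chosen for illustration
(4 KiB base pages, a 64-byte struct page, a 1 GiB gigantic page); none of
them is stated in the series itself. The first function mimics the loop in
memmap_init_reserved_pages, which skips regions carrying the noinit flag.
The second mimics how gather_bootmem_prealloc picks nr_pages before calling
hugetlb_folio_init_vmemmap, whose loop starts at head_pfn + 1.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed values, for illustration only. */
#define SKETCH_PAGE_SIZE            4096ULL       /* assumed base page size */
#define SKETCH_STRUCT_PAGE_SIZE     64ULL         /* assumed sizeof(struct page) */
#define SKETCH_HUGE_PAGE_SIZE       (1ULL << 30)  /* assumed 1 GiB gigantic page */
#define SKETCH_VMEMMAP_RESERVE_SIZE SKETCH_PAGE_SIZE /* mirrors HUGETLB_VMEMMAP_RESERVE_SIZE */
#define SKETCH_RSRV_NOINIT          0x10u         /* mirrors MEMBLOCK_RSRV_NOINIT_VMEMMAP */

/* Hypothetical stand-in for struct memblock_region. */
struct sketch_region {
	uint64_t base;
	uint64_t size;
	unsigned int flags;
};

/*
 * Count the struct pages that the boot-time pass would initialize:
 * regions flagged noinit are skipped, as in the patched
 * memmap_init_reserved_pages loop.
 */
uint64_t sketch_boot_init_pages(const struct sketch_region *r, size_t n)
{
	uint64_t pages = 0;

	for (size_t i = 0; i < n; i++) {
		assert(r[i].size % SKETCH_PAGE_SIZE == 0);
		if (r[i].flags & SKETCH_RSRV_NOINIT)
			continue;
		pages += r[i].size / SKETCH_PAGE_SIZE;
	}
	return pages;
}

/*
 * Count the tail struct pages initialized later by the HugeTLB code.
 * nr_pages covers the head as well, and the init loop starts at
 * head_pfn + 1, hence the "- 1".
 */
uint64_t sketch_tail_pages_to_init(bool hvo_ok)
{
	uint64_t nr_pages = SKETCH_HUGE_PAGE_SIZE / SKETCH_PAGE_SIZE;

	if (hvo_ok)
		nr_pages = SKETCH_VMEMMAP_RESERVE_SIZE / SKETCH_STRUCT_PAGE_SIZE;
	return nr_pages - 1;
}
```

Under these assumptions, a gigantic page split into an unflagged head region
of one page and a flagged remainder has a single struct page initialized at
boot. The later pass then initializes 63 tail struct pages if HVO succeeds
and 262143 if it does not, which is where the boot-time saving would come
from.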