From nobody Sat Feb 7 10:16:20 2026
AGHT+IGkx0nGnqUgrMKet1uE7S+7hqwGUl1ojaHvm5FZlh6Ngw6ccwePmfpItDUFlv/Deh8AXgWhV/Td5YI= X-Received: from yuzhao2.bld.corp.google.com ([2a00:79e0:2e28:6:c9c:12b4:a1e3:7f10]) (user=yuzhao job=sendgmr) by 2002:a25:dc8d:0:b0:e0e:4350:d7de with SMTP id 3f1490d57ef6-e0eb9a28207mr13988276.9.1723411295441; Sun, 11 Aug 2024 14:21:35 -0700 (PDT) Date: Sun, 11 Aug 2024 15:21:27 -0600 In-Reply-To: <20240811212129.3074314-1-yuzhao@google.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20240811212129.3074314-1-yuzhao@google.com> X-Mailer: git-send-email 2.46.0.76.ge559c4bf1a-goog Message-ID: <20240811212129.3074314-2-yuzhao@google.com> Subject: [PATCH mm-unstable v1 1/3] mm/contig_alloc: support __GFP_COMP From: Yu Zhao To: Andrew Morton , Muchun Song Cc: "Matthew Wilcox (Oracle)" , Zi Yan , linux-mm@kvack.org, linux-kernel@vger.kernel.org, Yu Zhao Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Support __GFP_COMP in alloc_contig_range(). When the flag is set, upon success the function returns a large folio prepared by prep_new_page(), rather than a range of order-0 pages prepared by split_free_pages() (which is renamed from split_map_pages()). alloc_contig_range() can return folios larger than MAX_PAGE_ORDER, e.g., gigantic hugeTLB folios. As a result, on the free path free_one_page() needs to handle this case by split_large_buddy(), in addition to free_contig_range() properly handling large folios by folio_put(). Signed-off-by: Yu Zhao --- mm/compaction.c | 48 +++------------------ mm/internal.h | 9 ++++ mm/page_alloc.c | 111 ++++++++++++++++++++++++++++++++++-------------- 3 files changed, 94 insertions(+), 74 deletions(-) diff --git a/mm/compaction.c b/mm/compaction.c index eb95e9b435d0..1ebfef98e1d0 100644 --- a/mm/compaction.c +++ b/mm/compaction.c @@ -79,40 +79,6 @@ static inline bool is_via_compact_memory(int order) { re= turn false; } #define COMPACTION_HPAGE_ORDER (PMD_SHIFT - PAGE_SHIFT) #endif =20 -static struct page *mark_allocated_noprof(struct page *page, unsigned int = order, gfp_t gfp_flags) -{ - post_alloc_hook(page, order, __GFP_MOVABLE); - return page; -} -#define mark_allocated(...) alloc_hooks(mark_allocated_noprof(__VA_ARGS__)) - -static void split_map_pages(struct list_head *freepages) -{ - unsigned int i, order; - struct page *page, *next; - LIST_HEAD(tmp_list); - - for (order =3D 0; order < NR_PAGE_ORDERS; order++) { - list_for_each_entry_safe(page, next, &freepages[order], lru) { - unsigned int nr_pages; - - list_del(&page->lru); - - nr_pages =3D 1 << order; - - mark_allocated(page, order, __GFP_MOVABLE); - if (order) - split_page(page, order); - - for (i =3D 0; i < nr_pages; i++) { - list_add(&page->lru, &tmp_list); - page++; - } - } - list_splice_init(&tmp_list, &freepages[0]); - } -} - static unsigned long release_free_list(struct list_head *freepages) { int order; @@ -742,11 +708,11 @@ static unsigned long isolate_freepages_block(struct c= ompact_control *cc, * * Non-free pages, invalid PFNs, or zone boundaries within the * [start_pfn, end_pfn) range are considered errors, cause function to - * undo its actions and return zero. + * undo its actions and return zero. cc->freepages[] are empty. * * Otherwise, function returns one-past-the-last PFN of isolated page * (which may be greater then end_pfn if end fell in a middle of - * a free page). + * a free page). cc->freepages[] contain free pages isolated. 
*/ unsigned long isolate_freepages_range(struct compact_control *cc, @@ -754,10 +720,9 @@ isolate_freepages_range(struct compact_control *cc, { unsigned long isolated, pfn, block_start_pfn, block_end_pfn; int order; - struct list_head tmp_freepages[NR_PAGE_ORDERS]; =20 for (order =3D 0; order < NR_PAGE_ORDERS; order++) - INIT_LIST_HEAD(&tmp_freepages[order]); + INIT_LIST_HEAD(&cc->freepages[order]); =20 pfn =3D start_pfn; block_start_pfn =3D pageblock_start_pfn(pfn); @@ -788,7 +753,7 @@ isolate_freepages_range(struct compact_control *cc, break; =20 isolated =3D isolate_freepages_block(cc, &isolate_start_pfn, - block_end_pfn, tmp_freepages, 0, true); + block_end_pfn, cc->freepages, 0, true); =20 /* * In strict mode, isolate_freepages_block() returns 0 if @@ -807,13 +772,10 @@ isolate_freepages_range(struct compact_control *cc, =20 if (pfn < end_pfn) { /* Loop terminated early, cleanup. */ - release_free_list(tmp_freepages); + release_free_list(cc->freepages); return 0; } =20 - /* __isolate_free_page() does not map the pages */ - split_map_pages(tmp_freepages); - /* We don't use freelists for anything. */ return pfn; } diff --git a/mm/internal.h b/mm/internal.h index acda347620c6..03e795ce755f 100644 --- a/mm/internal.h +++ b/mm/internal.h @@ -679,6 +679,15 @@ extern void prep_compound_page(struct page *page, unsi= gned int order); =20 extern void post_alloc_hook(struct page *page, unsigned int order, gfp_t gfp_flags); + +static inline struct page *post_alloc_hook_noprof(struct page *page, unsig= ned int order, + gfp_t gfp_flags) +{ + post_alloc_hook(page, order, __GFP_MOVABLE); + return page; +} +#define mark_allocated(...) alloc_hooks(post_alloc_hook_noprof(__VA_ARGS__= )) + extern bool free_pages_prepare(struct page *page, unsigned int order); =20 extern int user_min_free_kbytes; diff --git a/mm/page_alloc.c b/mm/page_alloc.c index 84a7154fde93..6c801404a108 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -1196,16 +1196,36 @@ static void free_pcppages_bulk(struct zone *zone, i= nt count, spin_unlock_irqrestore(&zone->lock, flags); } =20 +/* Split a multi-block free page into its individual pageblocks */ +static void split_large_buddy(struct zone *zone, struct page *page, + unsigned long pfn, int order, fpi_t fpi) +{ + unsigned long end =3D pfn + (1 << order); + + VM_WARN_ON_ONCE(!IS_ALIGNED(pfn, 1 << order)); + /* Caller removed page from freelist, buddy info cleared! 
*/ + VM_WARN_ON_ONCE(PageBuddy(page)); + + if (order > pageblock_order) + order =3D pageblock_order; + + while (pfn !=3D end) { + int mt =3D get_pfnblock_migratetype(page, pfn); + + __free_one_page(page, pfn, zone, order, mt, fpi); + pfn +=3D 1 << order; + page =3D pfn_to_page(pfn); + } +} + static void free_one_page(struct zone *zone, struct page *page, unsigned long pfn, unsigned int order, fpi_t fpi_flags) { unsigned long flags; - int migratetype; =20 spin_lock_irqsave(&zone->lock, flags); - migratetype =3D get_pfnblock_migratetype(page, pfn); - __free_one_page(page, pfn, zone, order, migratetype, fpi_flags); + split_large_buddy(zone, page, pfn, order, fpi_flags); spin_unlock_irqrestore(&zone->lock, flags); } =20 @@ -1697,27 +1717,6 @@ static unsigned long find_large_buddy(unsigned long = start_pfn) return start_pfn; } =20 -/* Split a multi-block free page into its individual pageblocks */ -static void split_large_buddy(struct zone *zone, struct page *page, - unsigned long pfn, int order) -{ - unsigned long end_pfn =3D pfn + (1 << order); - - VM_WARN_ON_ONCE(order <=3D pageblock_order); - VM_WARN_ON_ONCE(pfn & (pageblock_nr_pages - 1)); - - /* Caller removed page from freelist, buddy info cleared! */ - VM_WARN_ON_ONCE(PageBuddy(page)); - - while (pfn !=3D end_pfn) { - int mt =3D get_pfnblock_migratetype(page, pfn); - - __free_one_page(page, pfn, zone, pageblock_order, mt, FPI_NONE); - pfn +=3D pageblock_nr_pages; - page =3D pfn_to_page(pfn); - } -} - /** * move_freepages_block_isolate - move free pages in block for page isolat= ion * @zone: the zone @@ -1758,7 +1757,7 @@ bool move_freepages_block_isolate(struct zone *zone, = struct page *page, del_page_from_free_list(buddy, zone, order, get_pfnblock_migratetype(buddy, pfn)); set_pageblock_migratetype(page, migratetype); - split_large_buddy(zone, buddy, pfn, order); + split_large_buddy(zone, buddy, pfn, order, FPI_NONE); return true; } =20 @@ -1769,7 +1768,7 @@ bool move_freepages_block_isolate(struct zone *zone, = struct page *page, del_page_from_free_list(page, zone, order, get_pfnblock_migratetype(page, pfn)); set_pageblock_migratetype(page, migratetype); - split_large_buddy(zone, page, pfn, order); + split_large_buddy(zone, page, pfn, order, FPI_NONE); return true; } move: @@ -6482,6 +6481,31 @@ int __alloc_contig_migrate_range(struct compact_cont= rol *cc, return (ret < 0) ? 
ret : 0;
 }
 
+static void split_free_pages(struct list_head *list)
+{
+	int order;
+
+	for (order = 0; order < NR_PAGE_ORDERS; order++) {
+		struct page *page, *next;
+		int nr_pages = 1 << order;
+
+		list_for_each_entry_safe(page, next, &list[order], lru) {
+			int i;
+
+			mark_allocated(page, order, __GFP_MOVABLE);
+			if (!order)
+				continue;
+
+			split_page(page, order);
+
+			/* add all subpages to the order-0 head, in sequence */
+			list_del(&page->lru);
+			for (i = 0; i < nr_pages; i++)
+				list_add_tail(&page[i].lru, &list[0]);
+		}
+	}
+}
+
 /**
  * alloc_contig_range() -- tries to allocate given range of pages
  * @start:	start PFN to allocate
@@ -6594,12 +6618,25 @@ int alloc_contig_range_noprof(unsigned long start, unsigned long end,
 		goto done;
 	}
 
-	/* Free head and tail (if any) */
-	if (start != outer_start)
-		free_contig_range(outer_start, start - outer_start);
-	if (end != outer_end)
-		free_contig_range(end, outer_end - end);
+	if (!(gfp_mask & __GFP_COMP)) {
+		split_free_pages(cc.freepages);
 
+		/* Free head and tail (if any) */
+		if (start != outer_start)
+			free_contig_range(outer_start, start - outer_start);
+		if (end != outer_end)
+			free_contig_range(end, outer_end - end);
+	} else if (start == outer_start && end == outer_end && is_power_of_2(end - start)) {
+		struct page *head = pfn_to_page(start);
+		int order = ilog2(end - start);
+
+		check_new_pages(head, order);
+		prep_new_page(head, order, gfp_mask, 0);
+	} else {
+		ret = -EINVAL;
+		WARN(true, "PFN range: requested [%lu, %lu), allocated [%lu, %lu)\n",
+		     start, end, outer_start, outer_end);
+	}
 done:
 	undo_isolate_page_range(start, end, migratetype);
 	return ret;
@@ -6708,6 +6745,18 @@ struct page *alloc_contig_pages_noprof(unsigned long nr_pages, gfp_t gfp_mask,
 void free_contig_range(unsigned long pfn, unsigned long nr_pages)
 {
 	unsigned long count = 0;
+	struct folio *folio = pfn_folio(pfn);
+
+	if (folio_test_large(folio)) {
+		int expected = folio_nr_pages(folio);
+
+		if (nr_pages == expected)
+			folio_put(folio);
+		else
+			WARN(true, "PFN %lu: nr_pages %lu != expected %d\n",
+			     pfn, nr_pages, expected);
+		return;
+	}
 
 	for (; nr_pages--; pfn++) {
 		struct page *page = pfn_to_page(pfn);
-- 
2.46.0.76.ge559c4bf1a-goog
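A minimal caller-side sketch of the interface change above -- not part of the posted patch. It assumes CONFIG_CONTIG_ALLOC, and the demo_* helper names are purely illustrative; with __GFP_COMP the contiguous range comes back as a single large folio, and free_contig_range() now accepts it:

	#include <linux/gfp.h>
	#include <linux/mm.h>

	/* Illustrative only: allocate 2^order contiguous pages as one folio. */
	static struct folio *demo_alloc_contig_folio(unsigned int order, int nid)
	{
		struct page *page;

		/* __GFP_COMP asks the contig allocator to return a prepared folio. */
		page = alloc_contig_pages(1 << order, GFP_KERNEL | __GFP_COMP, nid, NULL);

		return page ? page_folio(page) : NULL;
	}

	/* Illustrative only: free_contig_range() recognizes large folios and drops them via folio_put(). */
	static void demo_free_contig_folio(struct folio *folio)
	{
		free_contig_range(folio_pfn(folio), folio_nr_pages(folio));
	}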
From nobody Sat Feb 7 10:16:20 2026
Date: Sun, 11 Aug 2024 15:21:28 -0600
In-Reply-To: <20240811212129.3074314-1-yuzhao@google.com>
Mime-Version: 1.0
References: <20240811212129.3074314-1-yuzhao@google.com>
Message-ID: <20240811212129.3074314-3-yuzhao@google.com>
Subject: [PATCH mm-unstable v1 2/3] mm/cma: add cma_alloc_folio()
From: Yu Zhao <yuzhao@google.com>
To: Andrew Morton, Muchun Song
Cc: "Matthew Wilcox (Oracle)", Zi Yan, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, Yu Zhao
Content-Type: text/plain; charset="utf-8"

With alloc_contig_range() and free_contig_range() now supporting large
folios, CMA can allocate and free large folios as well, via the new
cma_alloc_folio() and the existing cma_release().

Signed-off-by: Yu Zhao <yuzhao@google.com>
---
 include/linux/cma.h |  1 +
 mm/cma.c            | 47 ++++++++++++++++++++++++++++++---------------
 2 files changed, 33 insertions(+), 15 deletions(-)

diff --git a/include/linux/cma.h b/include/linux/cma.h
index 9db877506ea8..086553fbda73 100644
--- a/include/linux/cma.h
+++ b/include/linux/cma.h
@@ -46,6 +46,7 @@ extern int cma_init_reserved_mem(phys_addr_t base, phys_addr_t size,
 					struct cma **res_cma);
 extern struct page *cma_alloc(struct cma *cma, unsigned long count, unsigned int align, bool no_warn);
+extern struct folio *cma_alloc_folio(struct cma *cma, int order, gfp_t gfp);
 extern bool cma_pages_valid(struct cma *cma, const struct page *pages, unsigned long count);
 extern bool cma_release(struct cma *cma, const struct page *pages, unsigned long count);
 
diff --git a/mm/cma.c b/mm/cma.c
index 95d6950e177b..46feb06db8e7 100644
--- a/mm/cma.c
+++ b/mm/cma.c
@@ -403,18 +403,8 @@ static void cma_debug_show_areas(struct cma *cma)
 	spin_unlock_irq(&cma->lock);
 }
 
-/**
- * cma_alloc() - allocate pages from contiguous area
- * @cma:   Contiguous memory region for which the allocation is performed.
- * @count: Requested number of pages.
- * @align: Requested alignment of pages (in PAGE_SIZE order).
- * @no_warn: Avoid printing message about failed allocation
- *
- * This function allocates part of contiguous memory on specific
- * contiguous memory area.
- */
-struct page *cma_alloc(struct cma *cma, unsigned long count,
-		       unsigned int align, bool no_warn)
+static struct page *__cma_alloc(struct cma *cma, unsigned long count,
+				unsigned int align, gfp_t gfp)
 {
 	unsigned long mask, offset;
 	unsigned long pfn = -1;
@@ -463,8 +453,7 @@ struct page *cma_alloc(struct cma *cma, unsigned long count,
 
 		pfn = cma->base_pfn + (bitmap_no << cma->order_per_bit);
 		mutex_lock(&cma_mutex);
-		ret = alloc_contig_range(pfn, pfn + count, MIGRATE_CMA,
-				     GFP_KERNEL | (no_warn ? __GFP_NOWARN : 0));
+		ret = alloc_contig_range(pfn, pfn + count, MIGRATE_CMA, gfp);
 		mutex_unlock(&cma_mutex);
 		if (ret == 0) {
 			page = pfn_to_page(pfn);
@@ -494,7 +483,7 @@ struct page *cma_alloc(struct cma *cma, unsigned long count,
 			page_kasan_tag_reset(nth_page(page, i));
 	}
 
-	if (ret && !no_warn) {
+	if (ret && !(gfp & __GFP_NOWARN)) {
 		pr_err_ratelimited("%s: %s: alloc failed, req-size: %lu pages, ret: %d\n",
 				   __func__, cma->name, count, ret);
 		cma_debug_show_areas(cma);
@@ -513,6 +502,34 @@ struct page *cma_alloc(struct cma *cma, unsigned long count,
 	return page;
 }
 
+/**
+ * cma_alloc() - allocate pages from contiguous area
+ * @cma:   Contiguous memory region for which the allocation is performed.
+ * @count: Requested number of pages.
+ * @align: Requested alignment of pages (in PAGE_SIZE order).
+ * @no_warn: Avoid printing message about failed allocation
+ *
+ * This function allocates part of contiguous memory on specific
+ * contiguous memory area.
+ */
+struct page *cma_alloc(struct cma *cma, unsigned long count,
+		       unsigned int align, bool no_warn)
+{
+	return __cma_alloc(cma, count, align, GFP_KERNEL | (no_warn ? __GFP_NOWARN : 0));
+}
+
+struct folio *cma_alloc_folio(struct cma *cma, int order, gfp_t gfp)
+{
+	struct page *page;
+
+	if (WARN_ON(order && !(gfp & __GFP_COMP)))
+		return NULL;
+
+	page = __cma_alloc(cma, 1 << order, order, gfp);
+
+	return page ? page_folio(page) : NULL;
+}
+
 bool cma_pages_valid(struct cma *cma, const struct page *pages,
 		     unsigned long count)
 {
-- 
2.46.0.76.ge559c4bf1a-goog
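A minimal usage sketch for the helper added above -- not part of the posted patch. The demo_* wrappers are illustrative; per the description, freeing still goes through the existing cma_release(), passing the folio's head page and page count:

	#include <linux/cma.h>
	#include <linux/gfp.h>

	/* Illustrative only: order > 0 requires __GFP_COMP, as cma_alloc_folio() checks. */
	static struct folio *demo_cma_alloc_folio(struct cma *cma, int order)
	{
		return cma_alloc_folio(cma, order, GFP_KERNEL | __GFP_COMP | __GFP_NOWARN);
	}

	/* Illustrative only: release the folio back to its CMA area. */
	static void demo_cma_free_folio(struct cma *cma, struct folio *folio)
	{
		cma_release(cma, &folio->page, folio_nr_pages(folio));
	}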
From nobody Sat Feb 7 10:16:20 2026
Date: Sun, 11 Aug 2024 15:21:29 -0600
In-Reply-To: <20240811212129.3074314-1-yuzhao@google.com>
Mime-Version: 1.0
References: <20240811212129.3074314-1-yuzhao@google.com>
Message-ID: <20240811212129.3074314-4-yuzhao@google.com>
Subject: [PATCH mm-unstable v1 3/3] mm/hugetlb: use __GFP_COMP for gigantic folios
From: Yu Zhao <yuzhao@google.com>
To: Andrew Morton, Muchun Song
Cc: "Matthew Wilcox (Oracle)", Zi Yan, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, Yu Zhao
Content-Type: text/plain; charset="utf-8"

Use __GFP_COMP for gigantic folios to greatly reduce not only the code
but also the allocation and free time.

LOC (approximately): -200, +50

Allocate and free 500 1GB hugeTLB folios without HVO by:

  time echo 500 >/sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages
  time echo 0 >/sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages

         Before  After
  Alloc  ~13s    ~10s
  Free   ~15s    <1s

The above magnitudes generally hold across multiple x86 and arm64 CPU
models.

Signed-off-by: Yu Zhao <yuzhao@google.com>
---
 include/linux/hugetlb.h |   9 +-
 mm/hugetlb.c            | 244 ++++++++--------------------------------
 2 files changed, 50 insertions(+), 203 deletions(-)

diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 3100a52ceb73..98c47c394b89 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -896,10 +896,11 @@ static inline bool hugepage_movable_supported(struct hstate *h)
 /* Movability of hugepages depends on migration support. */
 static inline gfp_t htlb_alloc_mask(struct hstate *h)
 {
-	if (hugepage_movable_supported(h))
-		return GFP_HIGHUSER_MOVABLE;
-	else
-		return GFP_HIGHUSER;
+	gfp_t gfp = __GFP_COMP | __GFP_NOWARN;
+
+	gfp |= hugepage_movable_supported(h) ?
GFP_HIGHUSER_MOVABLE : GFP_HIGHU= SER; + + return gfp; } =20 static inline gfp_t htlb_modify_alloc_mask(struct hstate *h, gfp_t gfp_mas= k) diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 1c13e65ab119..691f63408d50 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -1512,43 +1512,7 @@ static int hstate_next_node_to_free(struct hstate *h= , nodemask_t *nodes_allowed) ((node =3D hstate_next_node_to_free(hs, mask)) || 1); \ nr_nodes--) =20 -/* used to demote non-gigantic_huge pages as well */ -static void __destroy_compound_gigantic_folio(struct folio *folio, - unsigned int order, bool demote) -{ - int i; - int nr_pages =3D 1 << order; - struct page *p; - - atomic_set(&folio->_entire_mapcount, 0); - atomic_set(&folio->_large_mapcount, 0); - atomic_set(&folio->_pincount, 0); - - for (i =3D 1; i < nr_pages; i++) { - p =3D folio_page(folio, i); - p->flags &=3D ~PAGE_FLAGS_CHECK_AT_FREE; - p->mapping =3D NULL; - clear_compound_head(p); - if (!demote) - set_page_refcounted(p); - } - - __folio_clear_head(folio); -} - -static void destroy_compound_hugetlb_folio_for_demote(struct folio *folio, - unsigned int order) -{ - __destroy_compound_gigantic_folio(folio, order, true); -} - #ifdef CONFIG_ARCH_HAS_GIGANTIC_PAGE -static void destroy_compound_gigantic_folio(struct folio *folio, - unsigned int order) -{ - __destroy_compound_gigantic_folio(folio, order, false); -} - static void free_gigantic_folio(struct folio *folio, unsigned int order) { /* @@ -1569,38 +1533,52 @@ static void free_gigantic_folio(struct folio *folio= , unsigned int order) static struct folio *alloc_gigantic_folio(struct hstate *h, gfp_t gfp_mask, int nid, nodemask_t *nodemask) { - struct page *page; - unsigned long nr_pages =3D pages_per_huge_page(h); + struct folio *folio; + int order =3D huge_page_order(h); + bool retry =3D false; + if (nid =3D=3D NUMA_NO_NODE) nid =3D numa_mem_id(); - +retry: + folio =3D NULL; #ifdef CONFIG_CMA { int node; =20 - if (hugetlb_cma[nid]) { - page =3D cma_alloc(hugetlb_cma[nid], nr_pages, - huge_page_order(h), true); - if (page) - return page_folio(page); - } + if (hugetlb_cma[nid]) + folio =3D cma_alloc_folio(hugetlb_cma[nid], order, gfp_mask); =20 - if (!(gfp_mask & __GFP_THISNODE)) { + if (!folio && !(gfp_mask & __GFP_THISNODE)) { for_each_node_mask(node, *nodemask) { if (node =3D=3D nid || !hugetlb_cma[node]) continue; =20 - page =3D cma_alloc(hugetlb_cma[node], nr_pages, - huge_page_order(h), true); - if (page) - return page_folio(page); + folio =3D cma_alloc_folio(hugetlb_cma[node], order, gfp_mask); + if (folio) + break; } } } #endif + if (!folio) { + struct page *page =3D alloc_contig_pages(1 << order, gfp_mask, nid, node= mask); =20 - page =3D alloc_contig_pages(nr_pages, gfp_mask, nid, nodemask); - return page ? 
page_folio(page) : NULL; + if (!page) + return NULL; + + folio =3D page_folio(page); + } + + if (folio_ref_freeze(folio, 1)) + return folio; + + pr_warn("HugeTLB: unexpected refcount on PFN %lu\n", folio_pfn(folio)); + free_gigantic_folio(folio, order); + if (!retry) { + retry =3D true; + goto retry; + } + return NULL; } =20 #else /* !CONFIG_CONTIG_ALLOC */ @@ -1619,8 +1597,6 @@ static struct folio *alloc_gigantic_folio(struct hsta= te *h, gfp_t gfp_mask, } static inline void free_gigantic_folio(struct folio *folio, unsigned int order) { } -static inline void destroy_compound_gigantic_folio(struct folio *folio, - unsigned int order) { } #endif =20 /* @@ -1747,19 +1723,17 @@ static void __update_and_free_hugetlb_folio(struct = hstate *h, folio_clear_hugetlb_hwpoison(folio); =20 folio_ref_unfreeze(folio, 1); + INIT_LIST_HEAD(&folio->_deferred_list); =20 /* * Non-gigantic pages demoted from CMA allocated gigantic pages * need to be given back to CMA in free_gigantic_folio. */ if (hstate_is_gigantic(h) || - hugetlb_cma_folio(folio, huge_page_order(h))) { - destroy_compound_gigantic_folio(folio, huge_page_order(h)); + hugetlb_cma_folio(folio, huge_page_order(h))) free_gigantic_folio(folio, huge_page_order(h)); - } else { - INIT_LIST_HEAD(&folio->_deferred_list); + else folio_put(folio); - } } =20 /* @@ -2032,95 +2006,6 @@ static void prep_new_hugetlb_folio(struct hstate *h,= struct folio *folio, int ni spin_unlock_irq(&hugetlb_lock); } =20 -static bool __prep_compound_gigantic_folio(struct folio *folio, - unsigned int order, bool demote) -{ - int i, j; - int nr_pages =3D 1 << order; - struct page *p; - - __folio_clear_reserved(folio); - for (i =3D 0; i < nr_pages; i++) { - p =3D folio_page(folio, i); - - /* - * For gigantic hugepages allocated through bootmem at - * boot, it's safer to be consistent with the not-gigantic - * hugepages and clear the PG_reserved bit from all tail pages - * too. Otherwise drivers using get_user_pages() to access tail - * pages may get the reference counting wrong if they see - * PG_reserved set on a tail page (despite the head page not - * having PG_reserved set). Enforcing this consistency between - * head and tail pages allows drivers to optimize away a check - * on the head page when they need know if put_page() is needed - * after get_user_pages(). - */ - if (i !=3D 0) /* head page cleared above */ - __ClearPageReserved(p); - /* - * Subtle and very unlikely - * - * Gigantic 'page allocators' such as memblock or cma will - * return a set of pages with each page ref counted. We need - * to turn this set of pages into a compound page with tail - * page ref counts set to zero. Code such as speculative page - * cache adding could take a ref on a 'to be' tail page. - * We need to respect any increased ref count, and only set - * the ref count to zero if count is currently 1. If count - * is not 1, we return an error. An error return indicates - * the set of pages can not be converted to a gigantic page. - * The caller who allocated the pages should then discard the - * pages using the appropriate free interface. - * - * In the case of demote, the ref count will be zero. 
- */ - if (!demote) { - if (!page_ref_freeze(p, 1)) { - pr_warn("HugeTLB page can not be used due to unexpected inflated ref c= ount\n"); - goto out_error; - } - } else { - VM_BUG_ON_PAGE(page_count(p), p); - } - if (i !=3D 0) - set_compound_head(p, &folio->page); - } - __folio_set_head(folio); - /* we rely on prep_new_hugetlb_folio to set the hugetlb flag */ - folio_set_order(folio, order); - atomic_set(&folio->_entire_mapcount, -1); - atomic_set(&folio->_large_mapcount, -1); - atomic_set(&folio->_pincount, 0); - return true; - -out_error: - /* undo page modifications made above */ - for (j =3D 0; j < i; j++) { - p =3D folio_page(folio, j); - if (j !=3D 0) - clear_compound_head(p); - set_page_refcounted(p); - } - /* need to clear PG_reserved on remaining tail pages */ - for (; j < nr_pages; j++) { - p =3D folio_page(folio, j); - __ClearPageReserved(p); - } - return false; -} - -static bool prep_compound_gigantic_folio(struct folio *folio, - unsigned int order) -{ - return __prep_compound_gigantic_folio(folio, order, false); -} - -static bool prep_compound_gigantic_folio_for_demote(struct folio *folio, - unsigned int order) -{ - return __prep_compound_gigantic_folio(folio, order, true); -} - /* * Find and lock address space (mapping) in write mode. * @@ -2159,7 +2044,6 @@ static struct folio *alloc_buddy_hugetlb_folio(struct= hstate *h, */ if (node_alloc_noretry && node_isset(nid, *node_alloc_noretry)) alloc_try_hard =3D false; - gfp_mask |=3D __GFP_COMP|__GFP_NOWARN; if (alloc_try_hard) gfp_mask |=3D __GFP_RETRY_MAYFAIL; if (nid =3D=3D NUMA_NO_NODE) @@ -2206,48 +2090,14 @@ static struct folio *alloc_buddy_hugetlb_folio(stru= ct hstate *h, return folio; } =20 -static struct folio *__alloc_fresh_hugetlb_folio(struct hstate *h, - gfp_t gfp_mask, int nid, nodemask_t *nmask, - nodemask_t *node_alloc_noretry) -{ - struct folio *folio; - bool retry =3D false; - -retry: - if (hstate_is_gigantic(h)) - folio =3D alloc_gigantic_folio(h, gfp_mask, nid, nmask); - else - folio =3D alloc_buddy_hugetlb_folio(h, gfp_mask, - nid, nmask, node_alloc_noretry); - if (!folio) - return NULL; - - if (hstate_is_gigantic(h)) { - if (!prep_compound_gigantic_folio(folio, huge_page_order(h))) { - /* - * Rare failure to convert pages to compound page. - * Free pages and try again - ONCE! - */ - free_gigantic_folio(folio, huge_page_order(h)); - if (!retry) { - retry =3D true; - goto retry; - } - return NULL; - } - } - - return folio; -} - static struct folio *only_alloc_fresh_hugetlb_folio(struct hstate *h, gfp_t gfp_mask, int nid, nodemask_t *nmask, nodemask_t *node_alloc_noretry) { struct folio *folio; =20 - folio =3D __alloc_fresh_hugetlb_folio(h, gfp_mask, nid, nmask, - node_alloc_noretry); + folio =3D hstate_is_gigantic(h) ? alloc_gigantic_folio(h, gfp_mask, nid, = nmask) : + alloc_buddy_hugetlb_folio(h, gfp_mask, nid, nmask, node_alloc_noretry); if (folio) init_new_hugetlb_folio(h, folio); return folio; @@ -2265,7 +2115,8 @@ static struct folio *alloc_fresh_hugetlb_folio(struct= hstate *h, { struct folio *folio; =20 - folio =3D __alloc_fresh_hugetlb_folio(h, gfp_mask, nid, nmask, NULL); + folio =3D hstate_is_gigantic(h) ? 
alloc_gigantic_folio(h, gfp_mask, nid, = nmask) : + alloc_buddy_hugetlb_folio(h, gfp_mask, nid, nmask, NULL); if (!folio) return NULL; =20 @@ -2549,9 +2400,8 @@ struct folio *alloc_buddy_hugetlb_folio_with_mpol(str= uct hstate *h, =20 nid =3D huge_node(vma, addr, gfp_mask, &mpol, &nodemask); if (mpol_is_preferred_many(mpol)) { - gfp_t gfp =3D gfp_mask | __GFP_NOWARN; + gfp_t gfp =3D gfp_mask & ~(__GFP_DIRECT_RECLAIM | __GFP_NOFAIL); =20 - gfp &=3D ~(__GFP_DIRECT_RECLAIM | __GFP_NOFAIL); folio =3D alloc_surplus_hugetlb_folio(h, gfp, nid, nodemask); =20 /* Fallback to all nodes if page=3D=3DNULL */ @@ -3333,6 +3183,7 @@ static void __init hugetlb_folio_init_tail_vmemmap(st= ruct folio *folio, for (pfn =3D head_pfn + start_page_number; pfn < end_pfn; pfn++) { struct page *page =3D pfn_to_page(pfn); =20 + __ClearPageReserved(folio_page(folio, pfn - head_pfn)); __init_single_page(page, pfn, zone, nid); prep_compound_tail((struct page *)folio, pfn - head_pfn); ret =3D page_ref_freeze(page, 1); @@ -3949,21 +3800,16 @@ static long demote_free_hugetlb_folios(struct hstat= e *src, struct hstate *dst, continue; =20 list_del(&folio->lru); - /* - * Use destroy_compound_hugetlb_folio_for_demote for all huge page - * sizes as it will not ref count folios. - */ - destroy_compound_hugetlb_folio_for_demote(folio, huge_page_order(src)); + + split_page_owner(&folio->page, huge_page_order(src), huge_page_order(dst= )); + pgalloc_tag_split(&folio->page, 1 << huge_page_order(src)); =20 for (i =3D 0; i < pages_per_huge_page(src); i +=3D pages_per_huge_page(d= st)) { struct page *page =3D folio_page(folio, i); =20 - if (hstate_is_gigantic(dst)) - prep_compound_gigantic_folio_for_demote(page_folio(page), - dst->order); - else - prep_compound_page(page, dst->order); - set_page_private(page, 0); + page->mapping =3D NULL; + clear_compound_head(page); + prep_compound_page(page, dst->order); =20 init_new_hugetlb_folio(dst, page_folio(page)); list_add(&page->lru, &dst_list); --=20 2.46.0.76.ge559c4bf1a-goog
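For readers skimming the series, a condensed, non-authoritative sketch of the gigantic-folio allocation flow this patch switches to. The CMA node-fallback loop and the retry-once path of the real alloc_gigantic_folio() are omitted, and gfp is assumed to carry __GFP_COMP (which htlb_alloc_mask() now sets); the demo_* name is illustrative:

	static struct folio *demo_alloc_gigantic(struct hstate *h, gfp_t gfp,
						 int nid, nodemask_t *nodemask)
	{
		int order = huge_page_order(h);
		struct folio *folio = NULL;

	#ifdef CONFIG_CMA
		/* Try the per-node CMA area first, as the patch does. */
		if (hugetlb_cma[nid])
			folio = cma_alloc_folio(hugetlb_cma[nid], order, gfp);
	#endif
		if (!folio) {
			/* Fall back to the generic contiguous allocator. */
			struct page *page = alloc_contig_pages(1 << order, gfp, nid, nodemask);

			if (!page)
				return NULL;
			folio = page_folio(page);
		}

		/* hugeTLB expects fresh folios frozen at refcount zero. */
		if (!folio_ref_freeze(folio, 1)) {
			free_gigantic_folio(folio, order);
			return NULL;
		}
		return folio;
	}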