From nobody Thu Apr 2 17:43:09 2026
Date: Wed, 11 Feb 2026 16:37:12 -0800
Mime-Version: 1.0
X-Mailer: git-send-email 2.53.0.310.g728cabbaf7-goog
Subject: [RFC PATCH v1 1/7] mm: hugetlb: Consolidate interpretation of
 gbl_chg within alloc_hugetlb_folio()
From: Ackerley Tng
To: akpm@linux-foundation.org, dan.j.williams@intel.com, david@kernel.org,
 fvdl@google.com, hannes@cmpxchg.org, jgg@nvidia.com, jiaqiyan@google.com,
 jthoughton@google.com, kalyazin@amazon.com, mhocko@kernel.org,
 michael.roth@amd.com, muchun.song@linux.dev, osalvador@suse.de,
 pasha.tatashin@soleen.com, pbonzini@redhat.com, peterx@redhat.com,
 pratyush@kernel.org, rick.p.edgecombe@intel.com, rientjes@google.com,
 roman.gushchin@linux.dev, seanjc@google.com, shakeel.butt@linux.dev,
 shivankg@amd.com, vannapurve@google.com, yan.y.zhao@intel.com
Cc: ackerleytng@google.com, cgroups@vger.kernel.org,
 linux-kernel@vger.kernel.org, linux-mm@kvack.org
Content-Type: text/plain; charset="utf-8"

Previously, gbl_chg was passed from alloc_hugetlb_folio() into
dequeue_hugetlb_folio_vma(), leaking the concept of gbl_chg into
dequeue_hugetlb_folio_vma(). This patch consolidates the interpretation
of gbl_chg within alloc_hugetlb_folio(), so that
dequeue_hugetlb_folio_vma() can focus on just dequeuing a folio.

No functional change intended.
Signed-off-by: Ackerley Tng
Reviewed-by: James Houghton
Reviewed-by: Joshua Hahn
---
 mm/hugetlb.c | 24 +++++++++--------------
 1 file changed, 9 insertions(+), 15 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index a1832da0f6236..fd067bd394ee0 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1380,7 +1380,7 @@ static unsigned long available_huge_pages(struct hstate *h)
 
 static struct folio *dequeue_hugetlb_folio_vma(struct hstate *h,
 				struct vm_area_struct *vma,
-				unsigned long address, long gbl_chg)
+				unsigned long address)
 {
 	struct folio *folio = NULL;
 	struct mempolicy *mpol;
@@ -1388,13 +1388,6 @@ static struct folio *dequeue_hugetlb_folio_vma(struct hstate *h,
 	nodemask_t *nodemask;
 	int nid;
 
-	/*
-	 * gbl_chg==1 means the allocation requires a new page that was not
-	 * reserved before. Making sure there's at least one free page.
-	 */
-	if (gbl_chg && !available_huge_pages(h))
-		goto err;
-
 	gfp_mask = htlb_alloc_mask(h);
 	nid = huge_node(vma, address, gfp_mask, &mpol, &nodemask);
 
@@ -1412,9 +1405,6 @@ static struct folio *dequeue_hugetlb_folio_vma(struct hstate *h,
 
 	mpol_cond_put(mpol);
 	return folio;
-
-err:
-	return NULL;
 }
 
 #ifdef CONFIG_ARCH_HAS_GIGANTIC_PAGE
@@ -2962,12 +2952,16 @@ struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
 		goto out_uncharge_cgroup_reservation;
 
 	spin_lock_irq(&hugetlb_lock);
+
 	/*
-	 * glb_chg is passed to indicate whether or not a page must be taken
-	 * from the global free pool (global change). gbl_chg == 0 indicates
-	 * a reservation exists for the allocation.
+	 * gbl_chg == 0 indicates a reservation exists for the allocation - so
+	 * try dequeuing a page. If there are available_huge_pages(), try using
+	 * them!
 	 */
-	folio = dequeue_hugetlb_folio_vma(h, vma, addr, gbl_chg);
+	folio = NULL;
+	if (!gbl_chg || available_huge_pages(h))
+		folio = dequeue_hugetlb_folio_vma(h, vma, addr);
+
 	if (!folio) {
 		spin_unlock_irq(&hugetlb_lock);
 		folio = alloc_buddy_hugetlb_folio_with_mpol(h, vma, addr);
-- 
2.53.0.310.g728cabbaf7-goog

From nobody Thu Apr 2 17:43:09 2026
Date: Wed, 11 Feb 2026 16:37:13 -0800
Message-ID: <18dbaf2ff9579b285d92f26b9a69e1e302f3bbcc.1770854662.git.ackerleytng@google.com>
Subject: [RFC PATCH v1 2/7] mm: hugetlb: Move mpol interpretation out of
 alloc_buddy_hugetlb_folio_with_mpol()
From: Ackerley Tng
Content-Type: text/plain; charset="utf-8"

Move memory policy interpretation out of
alloc_buddy_hugetlb_folio_with_mpol() and into alloc_hugetlb_folio() to
separate the reading and interpretation of memory policy from the
actual allocation.

This will later allow memory policy to be interpreted outside of the
process of allocating a hugetlb folio entirely, which opens the door to
other callers of the HugeTLB folio allocation function, such as
guest_memfd, where memory may not always be mapped and hence may not
have an associated vma.

No functional change intended.
Signed-off-by: Ackerley Tng
Reviewed-by: James Houghton
---
 mm/hugetlb.c | 20 +++++++++++---------
 1 file changed, 11 insertions(+), 9 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index fd067bd394ee0..aaa23d995b65c 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -2223,15 +2223,11 @@ static struct folio *alloc_migrate_hugetlb_folio(struct hstate *h, gfp_t gfp_mas
  */
 static struct folio *alloc_buddy_hugetlb_folio_with_mpol(struct hstate *h,
-		struct vm_area_struct *vma, unsigned long addr)
+		struct mempolicy *mpol, int nid, nodemask_t *nodemask)
 {
 	struct folio *folio = NULL;
-	struct mempolicy *mpol;
 	gfp_t gfp_mask = htlb_alloc_mask(h);
-	int nid;
-	nodemask_t *nodemask;
 
-	nid = huge_node(vma, addr, gfp_mask, &mpol, &nodemask);
 	if (mpol_is_preferred_many(mpol)) {
 		gfp_t gfp = gfp_mask & ~(__GFP_DIRECT_RECLAIM | __GFP_NOFAIL);
 
@@ -2243,7 +2239,7 @@ struct folio *alloc_buddy_hugetlb_folio_with_mpol(struct hstate *h,
 
 	if (!folio)
 		folio = alloc_surplus_hugetlb_folio(h, gfp_mask, nid, nodemask);
-	mpol_cond_put(mpol);
+
 	return folio;
 }
 
@@ -2892,7 +2888,7 @@ struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
 	map_chg_state map_chg;
 	int ret, idx;
 	struct hugetlb_cgroup *h_cg = NULL;
-	gfp_t gfp = htlb_alloc_mask(h) | __GFP_RETRY_MAYFAIL;
+	gfp_t gfp = htlb_alloc_mask(h);
 
 	idx = hstate_index(h);
 
@@ -2963,8 +2959,14 @@ struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
 		folio = dequeue_hugetlb_folio_vma(h, vma, addr);
 
 	if (!folio) {
+		struct mempolicy *mpol;
+		nodemask_t *nodemask;
+		int nid;
+
 		spin_unlock_irq(&hugetlb_lock);
-		folio = alloc_buddy_hugetlb_folio_with_mpol(h, vma, addr);
+		nid = huge_node(vma, addr, gfp, &mpol, &nodemask);
+		folio = alloc_buddy_hugetlb_folio_with_mpol(h, mpol, nid, nodemask);
+		mpol_cond_put(mpol);
 		if (!folio)
 			goto out_uncharge_cgroup;
 		spin_lock_irq(&hugetlb_lock);
@@ -3023,7 +3025,7 @@ struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
 		}
 	}
 
-	ret = mem_cgroup_charge_hugetlb(folio, gfp);
+	ret = mem_cgroup_charge_hugetlb(folio, gfp | __GFP_RETRY_MAYFAIL);
 	/*
 	 * Unconditionally increment NR_HUGETLB here. If it turns out that
 	 * mem_cgroup_charge_hugetlb failed, then immediately free the page and
-- 
2.53.0.310.g728cabbaf7-goog

From nobody Thu Apr 2 17:43:09 2026
Date: Wed, 11 Feb 2026 16:37:14 -0800
Message-ID: <67a62716952c806c2a512e98bcac1f5224ada324.1770854662.git.ackerleytng@google.com>
Subject: [RFC PATCH v1 3/7] mm: hugetlb: Move mpol interpretation out of
 dequeue_hugetlb_folio_vma()
From: Ackerley Tng
Content-Type: text/plain; charset="utf-8"

Move memory policy interpretation out of dequeue_hugetlb_folio_vma()
and into alloc_hugetlb_folio() to separate the reading and
interpretation of memory policy from the actual allocation. Also rename
dequeue_hugetlb_folio_vma() to dequeue_hugetlb_folio_with_mpol() to
remove the association with a vma and to align with
alloc_buddy_hugetlb_folio_with_mpol().

This will later allow memory policy to be interpreted outside of the
process of allocating a hugetlb folio entirely, which opens the door to
other callers of the HugeTLB folio allocation function, such as
guest_memfd, where memory may not always be mapped and hence may not
have an associated vma.

No functional change intended.
Signed-off-by: Ackerley Tng
Reviewed-by: James Houghton
---
 mm/hugetlb.c | 34 +++++++++++++++-------------------
 1 file changed, 15 insertions(+), 19 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index aaa23d995b65c..74b5136fdeb54 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1378,18 +1378,11 @@ static unsigned long available_huge_pages(struct hstate *h)
 	return h->free_huge_pages - h->resv_huge_pages;
 }
 
-static struct folio *dequeue_hugetlb_folio_vma(struct hstate *h,
-				struct vm_area_struct *vma,
-				unsigned long address)
+static struct folio *dequeue_hugetlb_folio_with_mpol(struct hstate *h,
+		struct mempolicy *mpol, int nid, nodemask_t *nodemask)
 {
 	struct folio *folio = NULL;
-	struct mempolicy *mpol;
-	gfp_t gfp_mask;
-	nodemask_t *nodemask;
-	int nid;
-
-	gfp_mask = htlb_alloc_mask(h);
-	nid = huge_node(vma, address, gfp_mask, &mpol, &nodemask);
+	gfp_t gfp_mask = htlb_alloc_mask(h);
 
 	if (mpol_is_preferred_many(mpol)) {
 		folio = dequeue_hugetlb_folio_nodemask(h, gfp_mask,
@@ -1403,7 +1396,6 @@ static struct folio *dequeue_hugetlb_folio_vma(struct hstate *h,
 		folio = dequeue_hugetlb_folio_nodemask(h, gfp_mask, nid,
 						nodemask);
 
-	mpol_cond_put(mpol);
 	return folio;
 }
 
@@ -2889,6 +2881,9 @@ struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
 	int ret, idx;
 	struct hugetlb_cgroup *h_cg = NULL;
 	gfp_t gfp = htlb_alloc_mask(h);
+	struct mempolicy *mpol;
+	nodemask_t *nodemask;
+	int nid;
 
 	idx = hstate_index(h);
 
@@ -2949,6 +2944,9 @@ struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
 
 	spin_lock_irq(&hugetlb_lock);
 
+	/* Takes reference on mpol. */
+	nid = huge_node(vma, addr, gfp, &mpol, &nodemask);
+
 	/*
 	 * gbl_chg == 0 indicates a reservation exists for the allocation - so
 	 * try dequeuing a page. If there are available_huge_pages(), try using
@@ -2956,25 +2954,23 @@ struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
 	 */
 	folio = NULL;
 	if (!gbl_chg || available_huge_pages(h))
-		folio = dequeue_hugetlb_folio_vma(h, vma, addr);
+		folio = dequeue_hugetlb_folio_with_mpol(h, mpol, nid, nodemask);
 
 	if (!folio) {
-		struct mempolicy *mpol;
-		nodemask_t *nodemask;
-		int nid;
-
 		spin_unlock_irq(&hugetlb_lock);
-		nid = huge_node(vma, addr, gfp, &mpol, &nodemask);
 		folio = alloc_buddy_hugetlb_folio_with_mpol(h, mpol, nid, nodemask);
-		mpol_cond_put(mpol);
-		if (!folio)
+		if (!folio) {
+			mpol_cond_put(mpol);
 			goto out_uncharge_cgroup;
+		}
 		spin_lock_irq(&hugetlb_lock);
 		list_add(&folio->lru, &h->hugepage_activelist);
 		folio_ref_unfreeze(folio, 1);
 		/* Fall through */
 	}
 
+	mpol_cond_put(mpol);
+
 	/*
 	 * Either dequeued or buddy-allocated folio needs to add special
 	 * mark to the folio when it consumes a global reservation.
-- 
2.53.0.310.g728cabbaf7-goog

From nobody Thu Apr 2 17:43:09 2026
Date: Wed, 11 Feb 2026 16:37:15 -0800
Message-ID: <24f2962dcbd369a2e01590aca0a365e5118778fe.1770854662.git.ackerleytng@google.com>
Subject: [RFC PATCH v1 4/7] Revert "memcg/hugetlb: remove memcg hugetlb
 try-commit-cancel protocol"
From: Ackerley Tng
Content-Type: text/plain; charset="utf-8"

This reverts commit 1d8f136a421f26747e58c01281cba5bffae8d289.

Restore the try-commit-cancel protocol for memory charging for HugeTLB,
to be used in later patches.

Signed-off-by: Ackerley Tng
---
 include/linux/memcontrol.h | 22 +++++++++++++
 mm/memcontrol.c            | 65 ++++++++++++++++++++++++++++++++++++--
 2 files changed, 84 insertions(+), 3 deletions(-)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index f29d4969c0c36..59eab4caa01fa 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -639,6 +639,8 @@ static inline bool mem_cgroup_below_min(struct mem_cgroup *target,
 		page_counter_read(&memcg->memory);
 }
 
+void mem_cgroup_commit_charge(struct folio *folio, struct mem_cgroup *memcg);
+
 int __mem_cgroup_charge(struct folio *folio, struct mm_struct *mm, gfp_t gfp);
 
 /**
@@ -663,6 +665,9 @@ static inline int mem_cgroup_charge(struct folio *folio, struct mm_struct *mm,
 	return __mem_cgroup_charge(folio, mm, gfp);
 }
 
+int mem_cgroup_hugetlb_try_charge(struct mem_cgroup *memcg, gfp_t gfp,
+		long nr_pages);
+
 int mem_cgroup_charge_hugetlb(struct folio* folio, gfp_t gfp);
 
 int mem_cgroup_swapin_charge_folio(struct folio *folio, struct mm_struct *mm,
@@ -691,6 +696,7 @@ static inline void mem_cgroup_uncharge_folios(struct folio_batch *folios)
 	__mem_cgroup_uncharge_folios(folios);
 }
 
+void mem_cgroup_cancel_charge(struct mem_cgroup *memcg, unsigned int nr_pages);
 void mem_cgroup_replace_folio(struct folio *old, struct folio *new);
 void mem_cgroup_migrate(struct folio *old, struct folio *new);
 
@@ -1135,12 +1141,23 @@ static inline bool mem_cgroup_below_min(struct mem_cgroup *target,
 	return false;
 }
 
+static inline void mem_cgroup_commit_charge(struct folio *folio,
+		struct mem_cgroup *memcg)
+{
+}
+
 static inline int mem_cgroup_charge(struct folio *folio,
 		struct mm_struct *mm, gfp_t gfp)
 {
 	return 0;
 }
 
+static inline int mem_cgroup_hugetlb_try_charge(struct mem_cgroup *memcg,
+		gfp_t gfp, long nr_pages)
+{
+	return 0;
+}
+
 static inline int mem_cgroup_charge_hugetlb(struct folio* folio, gfp_t gfp)
 {
 	return 0;
@@ -1160,6 +1177,11 @@ static inline void mem_cgroup_uncharge_folios(struct folio_batch *folios)
 {
 }
 
+static inline void mem_cgroup_cancel_charge(struct mem_cgroup *memcg,
+		unsigned int nr_pages)
+{
+}
+
 static inline void mem_cgroup_replace_folio(struct folio *old,
 		struct folio *new)
 {
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 36ab9897b61b2..70d762ba465b1 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2561,6 +2561,21 @@ static inline int try_charge(struct mem_cgroup *memcg, gfp_t gfp_mask,
 	return try_charge_memcg(memcg, gfp_mask, nr_pages);
 }
 
+/**
+ * mem_cgroup_cancel_charge() - cancel an uncommitted try_charge() call.
+ * @memcg: memcg previously charged.
+ * @nr_pages: number of pages previously charged.
+ */
+void mem_cgroup_cancel_charge(struct mem_cgroup *memcg, unsigned int nr_pages)
+{
+	if (mem_cgroup_is_root(memcg))
+		return;
+
+	page_counter_uncharge(&memcg->memory, nr_pages);
+	if (do_memsw_account())
+		page_counter_uncharge(&memcg->memsw, nr_pages);
+}
+
 static void commit_charge(struct folio *folio, struct mem_cgroup *memcg)
 {
 	VM_BUG_ON_FOLIO(folio_memcg_charged(folio), folio);
@@ -2574,6 +2589,18 @@ static void commit_charge(struct folio *folio, struct mem_cgroup *memcg)
 	folio->memcg_data = (unsigned long)memcg;
 }
 
+/**
+ * mem_cgroup_commit_charge - commit a previously successful try_charge().
+ * @folio: folio to commit the charge to.
+ * @memcg: memcg previously charged.
+ */
+void mem_cgroup_commit_charge(struct folio *folio, struct mem_cgroup *memcg)
+{
+	css_get(&memcg->css);
+	commit_charge(folio, memcg);
+	memcg1_commit_charge(folio, memcg);
+}
+
 #ifdef CONFIG_MEMCG_NMI_SAFETY_REQUIRES_ATOMIC
 static inline void account_slab_nmi_safe(struct mem_cgroup *memcg,
 					 struct pglist_data *pgdat,
@@ -4777,9 +4804,7 @@ static int charge_memcg(struct folio *folio, struct mem_cgroup *memcg,
 	if (ret)
 		goto out;
 
-	css_get(&memcg->css);
-	commit_charge(folio, memcg);
-	memcg1_commit_charge(folio, memcg);
+	mem_cgroup_commit_charge(folio, memcg);
 out:
 	return ret;
 }
@@ -4796,6 +4821,40 @@ int __mem_cgroup_charge(struct folio *folio, struct mm_struct *mm, gfp_t gfp)
 	return ret;
 }
 
+/**
+ * mem_cgroup_hugetlb_try_charge - try to charge the memcg for a hugetlb folio
+ * @memcg: memcg to charge.
+ * @gfp: reclaim mode.
+ * @nr_pages: number of pages to charge.
+ *
+ * This function is called when allocating a huge page folio to determine if
+ * the memcg has the capacity for it. It does not commit the charge yet,
+ * as the hugetlb folio itself has not been obtained from the hugetlb pool.
+ *
+ * Once we have obtained the hugetlb folio, we can call
+ * mem_cgroup_commit_charge() to commit the charge. If we fail to obtain the
+ * folio, we should instead call mem_cgroup_cancel_charge() to undo the effect
+ * of try_charge().
+ *
+ * Returns 0 on success. Otherwise, an error code is returned.
+ */
+int mem_cgroup_hugetlb_try_charge(struct mem_cgroup *memcg, gfp_t gfp,
+		long nr_pages)
+{
+	/*
+	 * If hugetlb memcg charging is not enabled, do not fail hugetlb allocation,
+	 * but do not attempt to commit charge later (or cancel on error) either.
+	 */
+	if (mem_cgroup_disabled() || !memcg ||
+	    !cgroup_subsys_on_dfl(memory_cgrp_subsys) || !memcg_accounts_hugetlb())
+		return -EOPNOTSUPP;
+
+	if (try_charge(memcg, gfp, nr_pages))
+		return -ENOMEM;
+
+	return 0;
+}
+
 /**
  * mem_cgroup_charge_hugetlb - charge the memcg for a hugetlb folio
  * @folio: folio being charged
-- 
2.53.0.310.g728cabbaf7-goog
Date: Wed, 11 Feb 2026 16:37:16 -0800
Message-ID: <28d1cd5b7b9a628d6ca6550b8fcafe887190a9e6.1770854662.git.ackerleytng@google.com>
Subject: [RFC PATCH v1 5/7] mm: hugetlb: Adopt memcg try-commit-cancel protocol
From: Ackerley Tng
To: akpm@linux-foundation.org, dan.j.williams@intel.com, david@kernel.org,
    fvdl@google.com, hannes@cmpxchg.org, jgg@nvidia.com, jiaqiyan@google.com,
    jthoughton@google.com, kalyazin@amazon.com, mhocko@kernel.org,
    michael.roth@amd.com, muchun.song@linux.dev, osalvador@suse.de,
    pasha.tatashin@soleen.com, pbonzini@redhat.com, peterx@redhat.com,
    pratyush@kernel.org, rick.p.edgecombe@intel.com, rientjes@google.com,
    roman.gushchin@linux.dev, seanjc@google.com, shakeel.butt@linux.dev,
    shivankg@amd.com, vannapurve@google.com, yan.y.zhao@intel.com
Cc: ackerleytng@google.com, cgroups@vger.kernel.org,
    linux-kernel@vger.kernel.org, linux-mm@kvack.org
Content-Type: text/plain; charset="utf-8"

Refactor alloc_hugetlb_folio() to use the memcg try-commit-cancel
protocol.

Do this to allow the core of allocating a hugetlb folio and the
associated memcg charging to be refactored out in a later patch. In
addition, checking cgroup memory limits before allocating avoids an
unnecessary allocation if the limits have already been hit.

Update error code propagation in the failure paths so that existing
error cases still return -ENOSPC, but if the memory limit is reached,
return -ENOMEM as before.
Signed-off-by: Ackerley Tng
---
 mm/hugetlb.c | 53 +++++++++++++++++++++++++++++++++-------------------
 1 file changed, 34 insertions(+), 19 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 74b5136fdeb54..70e91edc47dc1 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -2881,6 +2881,8 @@ struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
 	int ret, idx;
 	struct hugetlb_cgroup *h_cg = NULL;
 	gfp_t gfp = htlb_alloc_mask(h);
+	bool memory_charged = false;
+	struct mem_cgroup *memcg;
 	struct mempolicy *mpol;
 	nodemask_t *nodemask;
 	int nid;
@@ -2917,8 +2919,10 @@ struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
 	 */
 	if (map_chg) {
 		gbl_chg = hugepage_subpool_get_pages(spool, 1);
-		if (gbl_chg < 0)
+		if (gbl_chg < 0) {
+			ret = -ENOSPC;
 			goto out_end_reservation;
+		}
 	} else {
 		/*
 		 * If we have the vma reservation ready, no need for extra
@@ -2934,13 +2938,25 @@ struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
 	if (map_chg) {
 		ret = hugetlb_cgroup_charge_cgroup_rsvd(
 			idx, pages_per_huge_page(h), &h_cg);
-		if (ret)
+		if (ret) {
+			ret = -ENOSPC;
 			goto out_subpool_put;
+		}
 	}
 
 	ret = hugetlb_cgroup_charge_cgroup(idx, pages_per_huge_page(h), &h_cg);
-	if (ret)
+	if (ret) {
+		ret = -ENOSPC;
 		goto out_uncharge_cgroup_reservation;
+	}
+
+	memcg = get_mem_cgroup_from_current();
+	ret = mem_cgroup_hugetlb_try_charge(memcg, gfp | __GFP_RETRY_MAYFAIL,
+					    pages_per_huge_page(h));
+	if (ret == -ENOMEM)
+		goto out_put_memcg;
+
+	memory_charged = !ret;
 
 	spin_lock_irq(&hugetlb_lock);
 
@@ -2961,7 +2977,8 @@ struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
 		folio = alloc_buddy_hugetlb_folio_with_mpol(h, mpol, nid, nodemask);
 		if (!folio) {
 			mpol_cond_put(mpol);
-			goto out_uncharge_cgroup;
+			ret = -ENOSPC;
+			goto out_uncharge_memory;
 		}
 		spin_lock_irq(&hugetlb_lock);
 		list_add(&folio->lru, &h->hugepage_activelist);
@@ -2991,6 +3008,12 @@ struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
 
 	spin_unlock_irq(&hugetlb_lock);
 
+	lruvec_stat_mod_folio(folio, NR_HUGETLB, pages_per_huge_page(h));
+
+	if (memory_charged)
+		mem_cgroup_commit_charge(folio, memcg);
+	mem_cgroup_put(memcg);
+
 	hugetlb_set_folio_subpool(folio, spool);
 
 	if (map_chg != MAP_CHG_ENFORCED) {
@@ -3021,22 +3044,14 @@ struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
 		}
 	}
 
-	ret = mem_cgroup_charge_hugetlb(folio, gfp | __GFP_RETRY_MAYFAIL);
-	/*
-	 * Unconditionally increment NR_HUGETLB here. If it turns out that
-	 * mem_cgroup_charge_hugetlb failed, then immediately free the page and
-	 * decrement NR_HUGETLB.
-	 */
-	lruvec_stat_mod_folio(folio, NR_HUGETLB, pages_per_huge_page(h));
-
-	if (ret == -ENOMEM) {
-		free_huge_folio(folio);
-		return ERR_PTR(-ENOMEM);
-	}
-
 	return folio;
 
-out_uncharge_cgroup:
+out_uncharge_memory:
+	if (memory_charged)
+		mem_cgroup_cancel_charge(memcg, pages_per_huge_page(h));
+out_put_memcg:
+	mem_cgroup_put(memcg);
+
 	hugetlb_cgroup_uncharge_cgroup(idx, pages_per_huge_page(h), h_cg);
 out_uncharge_cgroup_reservation:
 	if (map_chg)
@@ -3056,7 +3071,7 @@ struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
 out_end_reservation:
 	if (map_chg != MAP_CHG_ENFORCED)
 		vma_end_reservation(h, vma, addr);
-	return ERR_PTR(-ENOSPC);
+	return ERR_PTR(ret);
 }
 
 static __init void *alloc_bootmem(struct hstate *h, int nid, bool node_exact)
-- 
2.53.0.310.g728cabbaf7-goog

From nobody Thu Apr 2 17:43:09 2026
Date: Wed, 11 Feb 2026 16:37:17 -0800
Message-ID: <9b59772c1c06f6628d842d5d30f1f7777c621c90.1770854662.git.ackerleytng@google.com>
Subject: [RFC PATCH v1 6/7] mm: memcontrol: Remove now-unused function mem_cgroup_charge_hugetlb
From: Ackerley Tng
To: akpm@linux-foundation.org, dan.j.williams@intel.com, david@kernel.org,
    fvdl@google.com, hannes@cmpxchg.org, jgg@nvidia.com, jiaqiyan@google.com,
    jthoughton@google.com, kalyazin@amazon.com, mhocko@kernel.org,
    michael.roth@amd.com, muchun.song@linux.dev, osalvador@suse.de,
    pasha.tatashin@soleen.com, pbonzini@redhat.com, peterx@redhat.com,
    pratyush@kernel.org, rick.p.edgecombe@intel.com, rientjes@google.com,
    roman.gushchin@linux.dev, seanjc@google.com, shakeel.butt@linux.dev,
    shivankg@amd.com, vannapurve@google.com, yan.y.zhao@intel.com
Cc: ackerleytng@google.com, cgroups@vger.kernel.org,
    linux-kernel@vger.kernel.org, linux-mm@kvack.org
Content-Type: text/plain; charset="utf-8"

With the (re)introduction of the try-commit-cancel charging protocol
for HugeTLB's use, mem_cgroup_charge_hugetlb() is now redundant.

Remove the function's implementation from mm/memcontrol.c and its
declaration from include/linux/memcontrol.h.

No functional change intended.

Signed-off-by: Ackerley Tng
---
 include/linux/memcontrol.h |  7 -------
 mm/memcontrol.c            | 34 ----------------------------------
 2 files changed, 41 deletions(-)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 59eab4caa01fa..572ad695afa40 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -668,8 +668,6 @@ static inline int mem_cgroup_charge(struct folio *folio, struct mm_struct *mm,
 int mem_cgroup_hugetlb_try_charge(struct mem_cgroup *memcg, gfp_t gfp,
 				  long nr_pages);
 
-int mem_cgroup_charge_hugetlb(struct folio* folio, gfp_t gfp);
-
 int mem_cgroup_swapin_charge_folio(struct folio *folio, struct mm_struct *mm,
 				   gfp_t gfp, swp_entry_t entry);
 
@@ -1158,11 +1156,6 @@ static inline int mem_cgroup_hugetlb_try_charge(struct mem_cgroup *memcg,
 	return 0;
 }
 
-static inline int mem_cgroup_charge_hugetlb(struct folio* folio, gfp_t gfp)
-{
-	return 0;
-}
-
 static inline int mem_cgroup_swapin_charge_folio(struct folio *folio,
 	struct mm_struct *mm, gfp_t gfp, swp_entry_t entry)
 {
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 70d762ba465b1..87d22db5a4bd3 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -4855,40 +4855,6 @@ int mem_cgroup_hugetlb_try_charge(struct mem_cgroup *memcg, gfp_t gfp,
 	return 0;
 }
 
-/**
- * mem_cgroup_charge_hugetlb - charge the memcg for a hugetlb folio
- * @folio: folio being charged
- * @gfp: reclaim mode
- *
- * This function is called when allocating a huge page folio, after the page has
- * already been obtained and charged to the appropriate hugetlb cgroup
- * controller (if it is enabled).
- *
- * Returns ENOMEM if the memcg is already full.
- * Returns 0 if either the charge was successful, or if we skip the charging.
- */
-int mem_cgroup_charge_hugetlb(struct folio *folio, gfp_t gfp)
-{
-	struct mem_cgroup *memcg = get_mem_cgroup_from_current();
-	int ret = 0;
-
-	/*
-	 * Even memcg does not account for hugetlb, we still want to update
-	 * system-level stats via lruvec_stat_mod_folio. Return 0, and skip
-	 * charging the memcg.
-	 */
-	if (mem_cgroup_disabled() || !memcg_accounts_hugetlb() ||
-	    !memcg || !cgroup_subsys_on_dfl(memory_cgrp_subsys))
-		goto out;
-
-	if (charge_memcg(folio, memcg, gfp))
-		ret = -ENOMEM;
-
-out:
-	mem_cgroup_put(memcg);
-	return ret;
-}
-
 /**
  * mem_cgroup_swapin_charge_folio - Charge a newly allocated folio for swapin.
  * @folio: folio to charge.
-- 
2.53.0.310.g728cabbaf7-goog

From nobody Thu Apr 2 17:43:09 2026
Date: Wed, 11 Feb 2026 16:37:18 -0800
Message-ID: <3803e96be57ab3201ab967ba47af22d12024f9e1.1770854662.git.ackerleytng@google.com>
Subject: [RFC PATCH v1 7/7] mm: hugetlb: Refactor out hugetlb_alloc_folio()
From: Ackerley Tng
To: akpm@linux-foundation.org, dan.j.williams@intel.com, david@kernel.org,
    fvdl@google.com, hannes@cmpxchg.org, jgg@nvidia.com, jiaqiyan@google.com,
    jthoughton@google.com, kalyazin@amazon.com, mhocko@kernel.org,
    michael.roth@amd.com, muchun.song@linux.dev, osalvador@suse.de,
    pasha.tatashin@soleen.com, pbonzini@redhat.com, peterx@redhat.com,
    pratyush@kernel.org, rick.p.edgecombe@intel.com, rientjes@google.com,
    roman.gushchin@linux.dev, seanjc@google.com, shakeel.butt@linux.dev,
    shivankg@amd.com, vannapurve@google.com, yan.y.zhao@intel.com
Cc: ackerleytng@google.com, cgroups@vger.kernel.org,
    linux-kernel@vger.kernel.org, linux-mm@kvack.org
Content-Type: text/plain; charset="utf-8"

Refactor out hugetlb_alloc_folio() from alloc_hugetlb_folio().
hugetlb_alloc_folio() handles allocation of a folio and both memory and
HugeTLB charging to cgroups. Other than flags to control charging,
hugetlb_alloc_folio() also takes parameters for memory policy and the
memcg to charge memory to.

This refactoring decouples HugeTLB page allocation from VMAs.
Specifically:

1. Reservations (as in resv_map) are stored in the vma
2. mpol is stored at vma->vm_policy
3. A vma must be used for allocation even if the pages are not meant to
   be used by the host process.

Without this coupling, VMAs are no longer a requirement for allocation.
This opens up the allocation routine for usage without VMAs, which will
allow guest_memfd to use HugeTLB as a more generic allocator of huge
pages, since guest_memfd memory may not have any associated VMAs by
design. In addition, direct allocations from HugeTLB could possibly be
refactored to avoid the use of a pseudo-VMA.
Also, this decouples HugeTLB page allocation from HugeTLBfs, where the
subpool is stored at the fs mount. This is also a requirement for
guest_memfd, where the plan is to have a subpool created per-fd and
stored on the inode.

No functional change intended.

Signed-off-by: Ackerley Tng
---
 include/linux/hugetlb.h |  11 +++
 mm/hugetlb.c            | 201 +++++++++++++++++++++++-----------------
 2 files changed, 126 insertions(+), 86 deletions(-)

diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index e51b8ef0cebd9..e385945c04af0 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -704,6 +704,9 @@ bool hugetlb_bootmem_page_zones_valid(int nid, struct huge_bootmem_page *m);
 int isolate_or_dissolve_huge_folio(struct folio *folio, struct list_head *list);
 int replace_free_hugepage_folios(unsigned long start_pfn, unsigned long end_pfn);
 void wait_for_freed_hugetlb_folios(void);
+struct folio *hugetlb_alloc_folio(struct hstate *h, struct mempolicy *mpol,
+		int nid, nodemask_t *nodemask, struct mem_cgroup *memcg,
+		bool charge_hugetlb_rsvd, bool use_existing_reservation);
 struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
 				  unsigned long addr, bool cow_from_owner);
 struct folio *alloc_hugetlb_folio_nodemask(struct hstate *h, int preferred_nid,
@@ -1115,6 +1118,14 @@ static inline void wait_for_freed_hugetlb_folios(void)
 {
 }
 
+static inline struct folio *hugetlb_alloc_folio(struct hstate *h,
+		struct mempolicy *mpol, int nid, nodemask_t *nodemask,
+		struct mem_cgroup *memcg, bool charge_hugetlb_rsvd,
+		bool use_existing_reservation)
+{
+	return NULL;
+}
+
 static inline struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
 						unsigned long addr,
 						bool cow_from_owner)
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 70e91edc47dc1..c6cfb268a527a 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -2844,6 +2844,105 @@ void wait_for_freed_hugetlb_folios(void)
 	flush_work(&free_hpage_work);
 }
 
+/**
+ * hugetlb_alloc_folio() - Allocates a hugetlb folio.
+ *
+ * @h: struct hstate to allocate from.
+ * @mpol: struct mempolicy to apply for this folio allocation.
+ *        Caller must hold reference to mpol.
+ * @nid: Node id, used together with mpol to determine folio allocation.
+ * @nodemask: Nodemask, used together with mpol to determine folio allocation.
+ * @memcg: Memory cgroup to charge for memory usage.
+ *         Caller must hold reference on memcg.
+ * @charge_hugetlb_rsvd: Set to true to charge hugetlb reservations in cgroup.
+ * @use_existing_reservation: Set to true if this allocation should use an
+ *                            existing hstate reservation.
+ *
+ * This function handles cgroup and global hstate reservations. VMA-related
+ * reservations and subpool debiting must be handled by the caller if necessary.
+ *
+ * Return: folio on success or negated error otherwise.
+ */
+struct folio *hugetlb_alloc_folio(struct hstate *h, struct mempolicy *mpol,
+		int nid, nodemask_t *nodemask, struct mem_cgroup *memcg,
+		bool charge_hugetlb_rsvd, bool use_existing_reservation)
+{
+	size_t nr_pages = pages_per_huge_page(h);
+	struct hugetlb_cgroup *h_cg = NULL;
+	gfp_t gfp = htlb_alloc_mask(h);
+	bool memory_charged = false;
+	int idx = hstate_index(h);
+	struct folio *folio;
+	int ret;
+
+	if (charge_hugetlb_rsvd) {
+		if (hugetlb_cgroup_charge_cgroup_rsvd(idx, nr_pages, &h_cg))
+			return ERR_PTR(-ENOSPC);
+	}
+
+	if (hugetlb_cgroup_charge_cgroup(idx, nr_pages, &h_cg)) {
+		ret = -ENOSPC;
+		goto out_uncharge_hugetlb_page_count;
+	}
+
+	ret = mem_cgroup_hugetlb_try_charge(memcg, gfp | __GFP_RETRY_MAYFAIL,
+					    nr_pages);
+	if (ret == -ENOMEM)
+		goto out_uncharge_memory;
+
+	memory_charged = !ret;
+
+	spin_lock_irq(&hugetlb_lock);
+
+	folio = NULL;
+	if (use_existing_reservation || available_huge_pages(h))
+		folio = dequeue_hugetlb_folio_with_mpol(h, mpol, nid, nodemask);
+
+	if (!folio) {
+		spin_unlock_irq(&hugetlb_lock);
+		folio = alloc_buddy_hugetlb_folio_with_mpol(h, mpol, nid, nodemask);
+		if (!folio) {
+			ret = -ENOSPC;
+			goto out_uncharge_memory;
+		}
+		spin_lock_irq(&hugetlb_lock);
+		list_add(&folio->lru, &h->hugepage_activelist);
+		folio_ref_unfreeze(folio, 1);
+		/* Fall through */
+	}
+
+	if (use_existing_reservation) {
+		folio_set_hugetlb_restore_reserve(folio);
+		h->resv_huge_pages--;
+	}
+
+	hugetlb_cgroup_commit_charge(idx, nr_pages, h_cg, folio);
+
+	if (charge_hugetlb_rsvd)
+		hugetlb_cgroup_commit_charge_rsvd(idx, nr_pages, h_cg, folio);
+
+	spin_unlock_irq(&hugetlb_lock);
+
+	lruvec_stat_mod_folio(folio, NR_HUGETLB, nr_pages);
+
+	if (memory_charged)
+		mem_cgroup_commit_charge(folio, memcg);
+
+	return folio;
+
+out_uncharge_memory:
+	if (memory_charged)
+		mem_cgroup_cancel_charge(memcg, nr_pages);
+
+	hugetlb_cgroup_uncharge_cgroup(idx, nr_pages, h_cg);
+
+out_uncharge_hugetlb_page_count:
+	if (charge_hugetlb_rsvd)
+		hugetlb_cgroup_uncharge_cgroup_rsvd(idx, nr_pages, h_cg);
+
+	return ERR_PTR(ret);
+}
+
 typedef enum {
 	/*
 	 * For either 0/1: we checked the per-vma resv map, and one resv
@@ -2878,17 +2977,14 @@ struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
 	struct folio *folio;
 	long retval, gbl_chg, gbl_reserve;
 	map_chg_state map_chg;
-	int ret, idx;
-	struct hugetlb_cgroup *h_cg = NULL;
 	gfp_t gfp = htlb_alloc_mask(h);
-	bool memory_charged = false;
+	bool charge_hugetlb_rsvd;
+	bool use_existing_reservation;
 	struct mem_cgroup *memcg;
 	struct mempolicy *mpol;
 	nodemask_t *nodemask;
 	int nid;
 
-	idx = hstate_index(h);
-
 	/* Whether we need a separate per-vma reservation? */
 	if (cow_from_owner) {
 		/*
@@ -2920,7 +3016,7 @@ struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
 	if (map_chg) {
 		gbl_chg = hugepage_subpool_get_pages(spool, 1);
 		if (gbl_chg < 0) {
-			ret = -ENOSPC;
+			folio = ERR_PTR(-ENOSPC);
 			goto out_end_reservation;
 		}
 	} else {
@@ -2935,85 +3031,30 @@ struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
 	 * If this allocation is not consuming a per-vma reservation,
 	 * charge the hugetlb cgroup now.
 	 */
-	if (map_chg) {
-		ret = hugetlb_cgroup_charge_cgroup_rsvd(
-			idx, pages_per_huge_page(h), &h_cg);
-		if (ret) {
-			ret = -ENOSPC;
-			goto out_subpool_put;
-		}
-	}
+	charge_hugetlb_rsvd = (bool)map_chg;
 
-	ret = hugetlb_cgroup_charge_cgroup(idx, pages_per_huge_page(h), &h_cg);
-	if (ret) {
-		ret = -ENOSPC;
-		goto out_uncharge_cgroup_reservation;
-	}
+	/*
	 * gbl_chg == 0 indicates a reservation exists for the allocation, so
	 * try to use it.
	 */
+	use_existing_reservation = gbl_chg == 0;
 
 	memcg = get_mem_cgroup_from_current();
-	ret = mem_cgroup_hugetlb_try_charge(memcg, gfp | __GFP_RETRY_MAYFAIL,
-					    pages_per_huge_page(h));
-	if (ret == -ENOMEM)
-		goto out_put_memcg;
-
-	memory_charged = !ret;
-
-	spin_lock_irq(&hugetlb_lock);
 
 	/* Takes reference on mpol. */
 	nid = huge_node(vma, addr, gfp, &mpol, &nodemask);
 
-	/*
-	 * gbl_chg == 0 indicates a reservation exists for the allocation - so
-	 * try dequeuing a page. If there are available_huge_pages(), try using
-	 * them!
-	 */
-	folio = NULL;
-	if (!gbl_chg || available_huge_pages(h))
-		folio = dequeue_hugetlb_folio_with_mpol(h, mpol, nid, nodemask);
-
-	if (!folio) {
-		spin_unlock_irq(&hugetlb_lock);
-		folio = alloc_buddy_hugetlb_folio_with_mpol(h, mpol, nid, nodemask);
-		if (!folio) {
-			mpol_cond_put(mpol);
-			ret = -ENOSPC;
-			goto out_uncharge_memory;
-		}
-		spin_lock_irq(&hugetlb_lock);
-		list_add(&folio->lru, &h->hugepage_activelist);
-		folio_ref_unfreeze(folio, 1);
-		/* Fall through */
-	}
+	folio = hugetlb_alloc_folio(h, mpol, nid, nodemask, memcg,
+				    charge_hugetlb_rsvd,
+				    use_existing_reservation);
 
 	mpol_cond_put(mpol);
 
-	/*
-	 * Either dequeued or buddy-allocated folio needs to add special
-	 * mark to the folio when it consumes a global reservation.
-	 */
-	if (!gbl_chg) {
-		folio_set_hugetlb_restore_reserve(folio);
-		h->resv_huge_pages--;
-	}
-
-	hugetlb_cgroup_commit_charge(idx, pages_per_huge_page(h), h_cg, folio);
-	/* If allocation is not consuming a reservation, also store the
-	 * hugetlb_cgroup pointer on the page.
-	 */
-	if (map_chg) {
-		hugetlb_cgroup_commit_charge_rsvd(idx, pages_per_huge_page(h),
-						  h_cg, folio);
-	}
-
-	spin_unlock_irq(&hugetlb_lock);
-
-	lruvec_stat_mod_folio(folio, NR_HUGETLB, pages_per_huge_page(h));
-
-	if (memory_charged)
-		mem_cgroup_commit_charge(folio, memcg);
 	mem_cgroup_put(memcg);
 
+	if (IS_ERR(folio))
+		goto out_subpool_put;
+
 	hugetlb_set_folio_subpool(folio, spool);
 
 	if (map_chg != MAP_CHG_ENFORCED) {
@@ -3046,17 +3087,6 @@ struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
 
 	return folio;
 
-out_uncharge_memory:
-	if (memory_charged)
-		mem_cgroup_cancel_charge(memcg, pages_per_huge_page(h));
-out_put_memcg:
-	mem_cgroup_put(memcg);
-
-	hugetlb_cgroup_uncharge_cgroup(idx, pages_per_huge_page(h), h_cg);
-out_uncharge_cgroup_reservation:
-	if (map_chg)
-		hugetlb_cgroup_uncharge_cgroup_rsvd(idx, pages_per_huge_page(h),
-						    h_cg);
 out_subpool_put:
 	/*
	 * put page to subpool iff the quota of subpool's rsv_hpages is used
@@ -3067,11 +3097,10 @@ struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
 		hugetlb_acct_memory(h, -gbl_reserve);
 	}
 
-
 out_end_reservation:
 	if (map_chg != MAP_CHG_ENFORCED)
 		vma_end_reservation(h, vma, addr);
-	return ERR_PTR(ret);
+	return folio;
 }
 
 static __init void *alloc_bootmem(struct hstate *h, int nid, bool node_exact)
-- 
2.53.0.310.g728cabbaf7-goog