Date: Fri, 20 Mar 2026 18:23:29 +0000
In-Reply-To: <20260320-page_alloc-unmapped-v2-0-28bf1bd54f41@google.com>
X-Mailing-List: linux-kernel@vger.kernel.org
Mime-Version: 1.0
References: <20260320-page_alloc-unmapped-v2-0-28bf1bd54f41@google.com>
X-Mailer: b4 0.14.3
Message-ID: <20260320-page_alloc-unmapped-v2-5-28bf1bd54f41@google.com>
Subject: [PATCH v2 05/22] mm: Add more flags for __apply_to_page_range()
From: Brendan Jackman
To: Borislav Petkov, Dave Hansen, Peter Zijlstra, Andrew Morton,
 David Hildenbrand, Vlastimil Babka, Wei Xu, Johannes Weiner, Zi Yan,
 Lorenzo Stoakes
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, x86@kernel.org,
 rppt@kernel.org, Sumit Garg, derkling@google.com, reijiw@google.com,
 Will Deacon, rientjes@google.com, "Kalyazin, Nikita", patrick.roy@linux.dev,
 "Itazuri, Takahiro", Andy Lutomirski, David Kaplan, Thomas Gleixner,
 Brendan Jackman, Yosry Ahmed
Content-Type: text/plain; charset="utf-8"

Add two flags to make this API more generic:

1. Separate "create" into two levels - one to allow creating new
   mappings without allocating pagetables, and one for the current
   behaviour that allows both of these.

2. Create a new flag to report that the caller has taken care of
   synchronization and no locks are required.

Both of these will serve to allow calling this API from restricted
contexts where allocation and pagetable locking are not possible.

Signed-off-by: Brendan Jackman
---
 mm/internal.h | 19 ++++++++++++++++++-
 mm/memory.c   | 59 ++++++++++++++++++++++++++++++++++-------------------------
 2 files changed, 52 insertions(+), 26 deletions(-)

diff --git a/mm/internal.h b/mm/internal.h
index 4b389431b1639..f4c59534670e4 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -1872,9 +1872,26 @@ static inline int get_sysctl_max_map_count(void)
 
 /*
  * Create a mapping if it doesn't exist. (Otherwise, skip regions with no
- * existing mapping, and return an error for regions with no leaf pagetable).
+ * existing mapping). Most users will want PGRANGE_ALLOC or 0 instead.
  */
 #define PGRANGE_CREATE (1 << 0)
+/*
+ * Allocate a pagetable if one is missing. (Otherwise, return an error for
+ * regions with no leaf pagetable). Also implies PGRANGE_CREATE.
+ */
+#define PGRANGE_ALLOC (1 << 1)
+/*
+ * Do not take any locks. This means the caller has taken care of
+ * synchronisation. This is incompatible with PGRANGE_ALLOC and also with
+ * mm=&init_mm.
+ */
+#define PGRANGE_NOLOCK (1 << 2)
+
+
+static inline bool pgrange_create(unsigned int flags)
+{
+	return flags & (PGRANGE_CREATE | PGRANGE_ALLOC);
+}
 
 int __apply_to_page_range(struct mm_struct *mm, unsigned long addr,
 			  unsigned long size, pte_fn_t fn,
diff --git a/mm/memory.c b/mm/memory.c
index 7e55014e5560b..9f0ccbbbc4e59 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3211,30 +3211,36 @@ static int apply_to_pte_range(struct mm_struct *mm, pmd_t *pmd,
 				     pte_fn_t fn, void *data, unsigned int flags,
 				     pgtbl_mod_mask *mask)
 {
-	bool create = flags & PGRANGE_CREATE;
 	pte_t *pte, *mapped_pte;
 	int err = 0;
 	spinlock_t *ptl;
 
-	if (create) {
+	if (flags & PGRANGE_ALLOC) {
+		VM_WARN_ON(flags & PGRANGE_NOLOCK);
+
 		mapped_pte = pte = (mm == &init_mm) ?
 			pte_alloc_kernel_track(pmd, addr, mask) :
 			pte_alloc_map_lock(mm, pmd, addr, &ptl);
+
 		if (!pte)
 			return -ENOMEM;
 	} else {
-		mapped_pte = pte = (mm == &init_mm) ?
-			pte_offset_kernel(pmd, addr) :
-			pte_offset_map_lock(mm, pmd, addr, &ptl);
+		if (mm == &init_mm)
+			pte = pte_offset_kernel(pmd, addr);
+		else if (flags & PGRANGE_NOLOCK)
+			pte = pte_offset_map(pmd, addr);
+		else
+			pte = pte_offset_map_lock(mm, pmd, addr, &ptl);
 		if (!pte)
 			return -EINVAL;
+		mapped_pte = pte;
 	}
 
 	lazy_mmu_mode_enable();
 
 	if (fn) {
 		do {
-			if (create || !pte_none(ptep_get(pte))) {
+			if (pgrange_create(flags) || !pte_none(ptep_get(pte))) {
 				err = fn(pte, addr, data);
 				if (err)
 					break;
@@ -3245,8 +3251,12 @@ static int apply_to_pte_range(struct mm_struct *mm, pmd_t *pmd,
 
 	lazy_mmu_mode_disable();
 
-	if (mm != &init_mm)
-		pte_unmap_unlock(mapped_pte, ptl);
+	if (mm != &init_mm) {
+		if (flags & PGRANGE_NOLOCK)
+			pte_unmap(mapped_pte);
+		else
+			pte_unmap_unlock(mapped_pte, ptl);
+	}
 	return err;
 }
 
@@ -3256,13 +3266,12 @@ static int apply_to_pmd_range(struct mm_struct *mm, pud_t *pud,
 				     pgtbl_mod_mask *mask)
 {
 	pmd_t *pmd;
-	bool create = flags & PGRANGE_CREATE;
 	unsigned long next;
 	int err = 0;
 
 	BUG_ON(pud_leaf(*pud));
 
-	if (create) {
+	if (pgrange_create(flags)) {
 		pmd = pmd_alloc_track(mm, pud, addr, mask);
 		if (!pmd)
 			return -ENOMEM;
@@ -3271,12 +3280,12 @@ static int apply_to_pmd_range(struct mm_struct *mm, pud_t *pud,
 	}
 	do {
 		next = pmd_addr_end(addr, end);
-		if (pmd_none(*pmd) && !create)
+		if (pmd_none(*pmd) && !pgrange_create(flags))
 			continue;
 		if (WARN_ON_ONCE(pmd_leaf(*pmd)))
 			return -EINVAL;
 		if (!pmd_none(*pmd) && WARN_ON_ONCE(pmd_bad(*pmd))) {
-			if (!create)
+			if (!pgrange_create(flags))
 				continue;
 			pmd_clear_bad(pmd);
 		}
@@ -3295,11 +3304,10 @@ static int apply_to_pud_range(struct mm_struct *mm, p4d_t *p4d,
 				     pgtbl_mod_mask *mask)
 {
 	pud_t *pud;
-	bool create = flags & PGRANGE_CREATE;
 	unsigned long next;
 	int err = 0;
 
-	if (create) {
+	if (pgrange_create(flags)) {
 		pud = pud_alloc_track(mm, p4d, addr, mask);
 		if (!pud)
 			return -ENOMEM;
@@ -3308,17 +3316,17 @@ static int apply_to_pud_range(struct mm_struct *mm, p4d_t *p4d,
 	}
 	do {
 		next = pud_addr_end(addr, end);
-		if (pud_none(*pud) && !create)
+		if (pud_none(*pud) && !pgrange_create(flags))
 			continue;
 		if (WARN_ON_ONCE(pud_leaf(*pud)))
 			return -EINVAL;
 		if (!pud_none(*pud) && WARN_ON_ONCE(pud_bad(*pud))) {
-			if (!create)
+			if (!pgrange_create(flags))
 				continue;
 			pud_clear_bad(pud);
 		}
 		err = apply_to_pmd_range(mm, pud, addr, next,
-					 fn, data, create, mask);
+					 fn, data, flags, mask);
 		if (err)
 			break;
 	} while (pud++, addr = next, addr != end);
@@ -3332,11 +3340,10 @@ static int apply_to_p4d_range(struct mm_struct *mm, pgd_t *pgd,
 				     pgtbl_mod_mask *mask)
 {
 	p4d_t *p4d;
-	bool create = flags & PGRANGE_CREATE;
 	unsigned long next;
 	int err = 0;
 
-	if (create) {
+	if (pgrange_create(flags)) {
 		p4d = p4d_alloc_track(mm, pgd, addr, mask);
 		if (!p4d)
 			return -ENOMEM;
@@ -3345,12 +3352,12 @@ static int apply_to_p4d_range(struct mm_struct *mm, pgd_t *pgd,
 	}
 	do {
 		next = p4d_addr_end(addr, end);
-		if (p4d_none(*p4d) && !create)
+		if (p4d_none(*p4d) && !pgrange_create(flags))
 			continue;
 		if (WARN_ON_ONCE(p4d_leaf(*p4d)))
 			return -EINVAL;
 		if (!p4d_none(*p4d) && WARN_ON_ONCE(p4d_bad(*p4d))) {
-			if (!create)
+			if (!pgrange_create(flags))
 				continue;
 			p4d_clear_bad(p4d);
 		}
@@ -3368,7 +3375,6 @@ int __apply_to_page_range(struct mm_struct *mm, unsigned long addr,
 				 void *data, unsigned int flags)
 {
 	pgd_t *pgd;
-	bool create = flags & PGRANGE_CREATE;
 	unsigned long start = addr, next;
 	unsigned long end = addr + size;
 	pgtbl_mod_mask mask = 0;
@@ -3376,18 +3382,21 @@ int __apply_to_page_range(struct mm_struct *mm, unsigned long addr,
 
 	if (WARN_ON(addr >= end))
 		return -EINVAL;
+	if (WARN_ON(flags & PGRANGE_NOLOCK &&
+		    (mm == &init_mm || flags & PGRANGE_ALLOC)))
+		return -EINVAL;
 
 	pgd = pgd_offset(mm, addr);
 	do {
 		next = pgd_addr_end(addr, end);
-		if (pgd_none(*pgd) && !create)
+		if (pgd_none(*pgd) && !pgrange_create(flags))
 			continue;
 		if (WARN_ON_ONCE(pgd_leaf(*pgd))) {
 			err = -EINVAL;
 			break;
 		}
 		if (!pgd_none(*pgd) && WARN_ON_ONCE(pgd_bad(*pgd))) {
-			if (!create)
+			if (!pgrange_create(flags))
 				continue;
 			pgd_clear_bad(pgd);
 		}
@@ -3410,7 +3419,7 @@ int __apply_to_page_range(struct mm_struct *mm, unsigned long addr,
 int apply_to_page_range(struct mm_struct *mm, unsigned long addr,
 			unsigned long size, pte_fn_t fn, void *data)
 {
-	return __apply_to_page_range(mm, addr, size, fn, data, PGRANGE_CREATE);
+	return __apply_to_page_range(mm, addr, size, fn, data, PGRANGE_ALLOC);
 }
 EXPORT_SYMBOL_GPL(apply_to_page_range);
 
-- 
2.51.2
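
For readers who want to check the flag combinations without building a kernel, here is a minimal userspace sketch of the semantics this patch describes. The three flag values and pgrange_create() are copied from the patch. pgrange_check_flags() and its is_init_mm parameter are hypothetical, introduced only for illustration: they model the new WARN_ON() check at the top of __apply_to_page_range(), with is_init_mm standing in for (mm == &init_mm). This is not kernel code and leaves out the WARN_ON() side effect.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Flag values copied from the patch (mm/internal.h). */
#define PGRANGE_CREATE (1 << 0)
#define PGRANGE_ALLOC  (1 << 1)
#define PGRANGE_NOLOCK (1 << 2)

/* The flags are independent bits, so they can be OR-ed together. */
static_assert((PGRANGE_CREATE & PGRANGE_ALLOC) == 0 &&
	      (PGRANGE_CREATE & PGRANGE_NOLOCK) == 0 &&
	      (PGRANGE_ALLOC & PGRANGE_NOLOCK) == 0,
	      "PGRANGE_* flags must not overlap");

/* Same logic as the patch: PGRANGE_ALLOC implies PGRANGE_CREATE. */
static inline bool pgrange_create(unsigned int flags)
{
	return flags & (PGRANGE_CREATE | PGRANGE_ALLOC);
}

/*
 * Hypothetical helper, not part of the patch. It models the up-front
 * check in __apply_to_page_range(): PGRANGE_NOLOCK is rejected when
 * combined with PGRANGE_ALLOC or when operating on init_mm.
 * is_init_mm is an assumed stand-in for (mm == &init_mm).
 */
static inline int pgrange_check_flags(unsigned int flags, bool is_init_mm)
{
	if ((flags & PGRANGE_NOLOCK) &&
	    (is_init_mm || (flags & PGRANGE_ALLOC)))
		return -EINVAL;
	return 0;
}
```

Under this model, PGRANGE_CREATE | PGRANGE_NOLOCK on a non-init mm passes the check and still counts as "create", while PGRANGE_ALLOC | PGRANGE_NOLOCK, or PGRANGE_NOLOCK on init_mm, yields -EINVAL.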