From nobody Thu Oct 2 15:34:44 2025
Date: Mon, 15 Sep 2025 09:36:37 -0700
In-Reply-To:
 <20250915163838.631445-1-kaleshsingh@google.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
Mime-Version: 1.0
References: <20250915163838.631445-1-kaleshsingh@google.com>
X-Mailer: git-send-email 2.51.0.384.g4c02a37b29-goog
Message-ID: <20250915163838.631445-7-kaleshsingh@google.com>
Subject: [PATCH v2 6/7] mm: add assertion for VMA count limit
From: Kalesh Singh
To: akpm@linux-foundation.org, minchan@kernel.org,
 lorenzo.stoakes@oracle.com, david@redhat.com, Liam.Howlett@oracle.com,
 rppt@kernel.org, pfalcato@suse.de
Cc: kernel-team@android.com, android-mm@google.com, Kalesh Singh,
 Alexander Viro, Christian Brauner, Jan Kara, Kees Cook, Vlastimil Babka,
 Suren Baghdasaryan, Michal Hocko, Steven Rostedt, Masami Hiramatsu,
 Mathieu Desnoyers, Ingo Molnar, Peter Zijlstra, Juri Lelli,
 Vincent Guittot, Dietmar Eggemann, Ben Segall, Mel Gorman,
 Valentin Schneider, Jann Horn, Shuah Khan,
 linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
 linux-mm@kvack.org, linux-trace-kernel@vger.kernel.org,
 linux-kselftest@vger.kernel.org
Content-Type: text/plain; charset="utf-8"

Building on the vma_count helpers, add a VM_WARN_ON_ONCE() to detect
cases where the VMA count exceeds the sysctl_max_map_count limit.

This check will help catch future bugs or regressions where VMAs are
allocated beyond the limit.

The warning is placed in the main vma_count_*() helpers, while the
internal *_nocheck variants bypass it. The _nocheck helpers ensure that
the assertion does not trigger a false positive in the legitimate case
where a VMA split in munmap() temporarily pushes the VMA count past the
limit.

Cc: Andrew Morton
Cc: David Hildenbrand
Cc: "Liam R. Howlett"
Cc: Lorenzo Stoakes
Cc: Mike Rapoport
Cc: Minchan Kim
Cc: Pedro Falcato
Signed-off-by: Kalesh Singh
---

Changes in v2:
  - Add assertions if exceeding max_vma_count limit, per Pedro

 include/linux/mm.h               | 12 ++++++--
 mm/internal.h                    |  1 -
 mm/vma.c                         | 49 +++++++++++++++++++++++++-------
 tools/testing/vma/vma_internal.h |  7 ++++-
 4 files changed, 55 insertions(+), 14 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 8bad1454984c..3a3749d7015c 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -4219,19 +4219,27 @@ static inline bool snapshot_page_is_faithful(const struct page_snapshot *ps)
 
 void snapshot_page(struct page_snapshot *ps, const struct page *page);
 
+int vma_count_remaining(const struct mm_struct *mm);
+
 static inline void vma_count_init(struct mm_struct *mm)
 {
 	ACCESS_PRIVATE(mm, __vma_count) = 0;
 }
 
-static inline void vma_count_add(struct mm_struct *mm, int nr_vmas)
+static inline void __vma_count_add_nocheck(struct mm_struct *mm, int nr_vmas)
 {
 	ACCESS_PRIVATE(mm, __vma_count) += nr_vmas;
 }
 
+static inline void vma_count_add(struct mm_struct *mm, int nr_vmas)
+{
+	VM_WARN_ON_ONCE(!vma_count_remaining(mm));
+	__vma_count_add_nocheck(mm, nr_vmas);
+}
+
 static inline void vma_count_sub(struct mm_struct *mm, int nr_vmas)
 {
-	vma_count_add(mm, -nr_vmas);
+	__vma_count_add_nocheck(mm, -nr_vmas);
 }
 
 static inline void vma_count_inc(struct mm_struct *mm)
diff --git a/mm/internal.h b/mm/internal.h
index 39f1c9535ae5..e0567a3b64fa 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -1661,6 +1661,5 @@ static inline bool reclaim_pt_is_enabled(unsigned long start, unsigned long end,
 void dup_mm_exe_file(struct mm_struct *mm, struct mm_struct *oldmm);
 int dup_mmap(struct mm_struct *mm, struct mm_struct *oldmm);
 
-int vma_count_remaining(const struct mm_struct *mm);
 
 #endif /* __MM_INTERNAL_H */
diff --git a/mm/vma.c b/mm/vma.c
index 0cd3cb472220..0e4fcaebe209 100644
--- a/mm/vma.c
+++ b/mm/vma.c
@@ -323,15 +323,17 @@ static void vma_prepare(struct vma_prepare *vp)
 }
 
 /*
- * vma_complete- Helper function for handling the unlocking after altering VMAs,
- * or for inserting a VMA.
+ * This is the internal, unsafe version of vma_complete(). Unlike its
+ * wrapper, this function bypasses runtime checks for VMA count limits by
+ * using the _nocheck vma_count* helpers.
  *
- * @vp: The vma_prepare struct
- * @vmi: The vma iterator
- * @mm: The mm_struct
+ * Its use is restricted to __split_vma() where the VMA count can be
+ * temporarily higher than the sysctl_max_map_count limit.
+ *
+ * All other callers must use vma_complete().
  */
-static void vma_complete(struct vma_prepare *vp, struct vma_iterator *vmi,
-			 struct mm_struct *mm)
+static void __vma_complete(struct vma_prepare *vp, struct vma_iterator *vmi,
+			   struct mm_struct *mm)
 {
 	if (vp->file) {
 		if (vp->adj_next)
@@ -352,7 +354,11 @@ static void vma_complete(struct vma_prepare *vp, struct vma_iterator *vmi,
 		 * (it may either follow vma or precede it).
 		 */
 		vma_iter_store_new(vmi, vp->insert);
-		vma_count_inc(mm);
+		/*
+		 * Explicitly allow vma_count to exceed the threshold to avoid
+		 * blocking munmap() from freeing resources.
+		 */
+		__vma_count_add_nocheck(mm, 1);
 	}
 
 	if (vp->anon_vma) {
@@ -403,6 +409,26 @@ static void vma_complete(struct vma_prepare *vp, struct vma_iterator *vmi,
 		uprobe_mmap(vp->insert);
 }
 
+/*
+ * vma_complete- Helper function for handling the unlocking after altering VMAs,
+ * or for inserting a VMA.
+ *
+ * @vp: The vma_prepare struct
+ * @vmi: The vma iterator
+ * @mm: The mm_struct
+ */
+static void vma_complete(struct vma_prepare *vp, struct vma_iterator *vmi,
+			 struct mm_struct *mm)
+{
+	/*
+	 * __vma_complete() explicitly forgoes checking the new
+	 * vma_count against the sysctl_max_map_count limit, so
+	 * do it here.
+	 */
+	VM_WARN_ON_ONCE(!vma_count_remaining(mm));
+	__vma_complete(vp, vmi, mm);
+}
+
 /*
  * init_vma_prep() - Initializer wrapper for vma_prepare struct
  * @vp: The vma_prepare struct
@@ -564,8 +590,11 @@ __split_vma(struct vma_iterator *vmi, struct vm_area_struct *vma,
 		vma->vm_end = addr;
 	}
 
-	/* vma_complete stores the new vma */
-	vma_complete(&vp, vmi, vma->vm_mm);
+	/*
+	 * __vma_complete stores the new vma without checking against the
+	 * sysctl_max_map_count (vma_count) limit.
+	 */
+	__vma_complete(&vp, vmi, vma->vm_mm);
 	validate_mm(vma->vm_mm);
 
 	/* Success. */
diff --git a/tools/testing/vma/vma_internal.h b/tools/testing/vma/vma_internal.h
index 6e724ba1adf4..d084b1eb2a5c 100644
--- a/tools/testing/vma/vma_internal.h
+++ b/tools/testing/vma/vma_internal.h
@@ -1534,11 +1534,16 @@ static inline void vma_count_init(struct mm_struct *mm)
 	mm->__vma_count = 0;
 }
 
-static inline void vma_count_add(struct mm_struct *mm, int nr_vmas)
+static inline void __vma_count_add_nocheck(struct mm_struct *mm, int nr_vmas)
 {
 	mm->__vma_count += nr_vmas;
 }
 
+static inline void vma_count_add(struct mm_struct *mm, int nr_vmas)
+{
+	__vma_count_add_nocheck(mm, nr_vmas);
+}
+
 static inline void vma_count_sub(struct mm_struct *mm, int nr_vmas)
 {
 	vma_count_add(mm, -nr_vmas);
-- 
2.51.0.384.g4c02a37b29-goog