From nobody Thu Oct 2 14:25:58 2025 Received: from mail-pl1-f202.google.com (mail-pl1-f202.google.com [209.85.214.202]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 4234028504F for ; Mon, 15 Sep 2025 16:45:53 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.214.202 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1757954755; cv=none; b=bEL07adY6+GRKAMHrkx2lcoka3Vn+QjZG8ClwqnYF2MTP15RImAM4v3yZyD1qTPuVYVKUjnSs42nU75CSE6XWbPPiHYd5isIhUrAlWuMNA9PyFK0uZpLMAyaPyhCELC55SD6KWMD5WlhloDTa01p97OEe+FDaIW0La4mrXVZP80= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1757954755; c=relaxed/simple; bh=vPqrFhzsoS56yk34gogqjEXAKiAKcL51MnvFt/0vTxk=; h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type; b=FEDg9hZvT5Psjsjtme/9jS7Sy5W1gFrLT8kO+/F0cN93Yf+L8dFh432qsZkE8ltXISiyA40V7xcS7xSOoUVY6Cigua4zvqxBzHsC7CPtbpBEwyI0wKS90Ymacut4SlLHHWZ7/w32RRYgQrk052oKeNvzYqbZ0YMPO7nm7lijd4U= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--kaleshsingh.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=zV/k/JN+; arc=none smtp.client-ip=209.85.214.202 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--kaleshsingh.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="zV/k/JN+" Received: by mail-pl1-f202.google.com with SMTP id d9443c01a7336-261cec07f2dso16120705ad.1 for ; Mon, 15 Sep 2025 09:45:53 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1757954752; 
x=1758559552; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=8Dh1IS3JVHOf86RTcG+zb9g2w/Ys+63vuY/qru+PreQ=; b=zV/k/JN+MEs5mTleaLd/vdGTldfkvMs9MFHiB7jKUhkcINpmK79TDVqdPin9BNv2o9 VmNHWv406Dj7sIufglyt8C8zDiLy6lRB1+z8X869/5iTJjhJzWOfVuWbS6XBK4sgews8 9u8DrFyTcvo6UUdgtRbWBQ26Mv6zghSrdvgX/pP/eGJnD6rObeFKl22e3foOx3Y+9LMI k2kwvqw+ODGe0A4rAoQ80qoiSJFar4+n+1vByHVOeso+429EXL//YmjNqu79XREIRQ9+ bgAq4lqrkPTv8udsm19RkT16QSMiF+UZ7wUDgdnBMrOD+7rjvVUBridRM4Ef9TB4Idya t6lA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1757954752; x=1758559552; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=8Dh1IS3JVHOf86RTcG+zb9g2w/Ys+63vuY/qru+PreQ=; b=KeRBlEA5d3iMeTyU92cWyLBdKe7Fww0SC/Ogm/VkVv9W5S1xrWxKvbKNKTAn0PvwCG bxaeB5p26uyeK6gocfuXeJkI1CmiaCrW4KUhzxMaskrQ/6KMl+oNacYWtGGgVfYTaX7k 7lfB/JMnye3lX1kMC4FuZ6cKyRfAZ4eGvU5jsrwP0CZ1RfXdSWz/SFbs8DrkV+FL7A4J aNfyjSBE0WTjIc1u8ijgiBzwy8g6SJV6pssIAUOBiD/sopCTmojFO2wtQcvv1LTm3qZ0 CCpdPt+aymgZhxsSVFFTvLZPW0zeOhvfYt2KHFTg0pxYv3dM1Zmv6pZWVA45mL94U3KR g+Ug== X-Forwarded-Encrypted: i=1; AJvYcCX8KULZBy6Ql8sNE6qnbr8Xv/mQlNIz7/ZvMCvnX5Ud70ib/I5crP0dEC77rUoflQ/GsCCSDuZbm14FilE=@vger.kernel.org X-Gm-Message-State: AOJu0Yxs01zpNgW+7J2BreAbbIJBSmNUYcKBYHSNo+4J0D4T5RlouSNt xbQ4t7vYMl+aTSVbEs8dLq7Q+ZvoGY2Jo2TsklcBIEnqoNc6VxJX7WYrSTDAU+YO/wo+frtGJtc f8v3MBbaU+zQErPOdjeVuctoLIg== X-Google-Smtp-Source: AGHT+IETeFsnzxOgNLMCiDHUaYpnu2vRB/Fb66R7+b7xgDA1s+PPQqB4+bTMKY9LvQ479LuxEWFmKDI75ZPJsp8Xlg== X-Received: from plkb5.prod.google.com ([2002:a17:903:fa5:b0:264:7b3c:4fe4]) (user=kaleshsingh job=prod-delivery.src-stubby-dispatcher) by 2002:a17:902:dac2:b0:24b:1585:6350 with SMTP id d9443c01a7336-25d2ac3c545mr185756655ad.11.1757954752355; Mon, 15 Sep 2025 09:45:52 -0700 (PDT) Date: Mon, 15 Sep 2025 09:36:32 -0700 In-Reply-To: 
<20250915163838.631445-1-kaleshsingh@google.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20250915163838.631445-1-kaleshsingh@google.com> X-Mailer: git-send-email 2.51.0.384.g4c02a37b29-goog Message-ID: <20250915163838.631445-2-kaleshsingh@google.com> Subject: [PATCH v2 1/7] mm: fix off-by-one error in VMA count limit checks From: Kalesh Singh To: akpm@linux-foundation.org, minchan@kernel.org, lorenzo.stoakes@oracle.com, david@redhat.com, Liam.Howlett@oracle.com, rppt@kernel.org, pfalcato@suse.de Cc: kernel-team@android.com, android-mm@google.com, Kalesh Singh , stable@vger.kernel.org, Alexander Viro , Christian Brauner , Jan Kara , Kees Cook , Vlastimil Babka , Suren Baghdasaryan , Michal Hocko , Steven Rostedt , Masami Hiramatsu , Mathieu Desnoyers , Ingo Molnar , Peter Zijlstra , Juri Lelli , Vincent Guittot , Dietmar Eggemann , Ben Segall , Mel Gorman , Valentin Schneider , Jann Horn , Shuah Khan , linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-trace-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" The VMA count limit check in do_mmap() and do_brk_flags() uses a strict inequality (>), which allows a process's VMA count to exceed the configured sysctl_max_map_count limit by one. A process with mm->map_count =3D=3D sysctl_max_map_count will incorrectly pass this check and then exceed the limit upon allocation of a new VMA when its map_count is incremented. Other VMA allocation paths, such as split_vma(), already use the correct, inclusive (>=3D) comparison. Fix this bug by changing the comparison to be inclusive in do_mmap() and do_brk_flags(), bringing them in line with the correct behavior of other allocation paths. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Cc: Cc: Andrew Morton Cc: David Hildenbrand Cc: "Liam R. 
Howlett" Cc: Lorenzo Stoakes Cc: Mike Rapoport Cc: Minchan Kim Cc: Pedro Falcato Signed-off-by: Kalesh Singh Acked-by: SeongJae Park Reviewed-by: David Hildenbrand Reviewed-by: Lorenzo Stoakes Reviewed-by: Pedro Falcato --- Changes in v2: - Fix mmap check, per Pedro mm/mmap.c | 2 +- mm/vma.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/mm/mmap.c b/mm/mmap.c index 7306253cc3b5..e5370e7fcd8f 100644 --- a/mm/mmap.c +++ b/mm/mmap.c @@ -374,7 +374,7 @@ unsigned long do_mmap(struct file *file, unsigned long = addr, return -EOVERFLOW; =20 /* Too many mappings? */ - if (mm->map_count > sysctl_max_map_count) + if (mm->map_count >=3D sysctl_max_map_count) return -ENOMEM; =20 /* diff --git a/mm/vma.c b/mm/vma.c index 3b12c7579831..033a388bc4b1 100644 --- a/mm/vma.c +++ b/mm/vma.c @@ -2772,7 +2772,7 @@ int do_brk_flags(struct vma_iterator *vmi, struct vm_= area_struct *vma, if (!may_expand_vm(mm, vm_flags, len >> PAGE_SHIFT)) return -ENOMEM; =20 - if (mm->map_count > sysctl_max_map_count) + if (mm->map_count >=3D sysctl_max_map_count) return -ENOMEM; =20 if (security_vm_enough_memory_mm(mm, len >> PAGE_SHIFT)) --=20 2.51.0.384.g4c02a37b29-goog From nobody Thu Oct 2 14:25:58 2025 Received: from mail-pj1-f73.google.com (mail-pj1-f73.google.com [209.85.216.73]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id CF6CC223DF6 for ; Mon, 15 Sep 2025 16:46:08 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.216.73 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1757954772; cv=none; b=DnMiHB+28r/WwL0F+G3GYNvdb98DpgiDDLz+TyDezbzrOI39OltRec+j3mlkdfbO3f8ifnz3x5Ek3v84b6Qs9hNk4akysH237pmKGoU7vsd1jZuSM1uu5K5BhmfdlznPxF+oIZkbd/UXihYJ1sHU+gSTdNtLqgXqW2IBQvhFxn4= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1757954772; c=relaxed/simple; 
bh=riBOge8Wc0Wr4o3sWOGZOQ93vW3XAreUKztk0vzDkEM=; h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type; b=KtlHuPeKZCsDrWV1gXJ5WZ/p2hWQ/4tihUDD4D7XTRJi3Fb+nE7mkVMPxaYThzqURg0YEmPExE51IRnWX81Zz8WTORwlOJdbCTDAfseLRsu7t0Z8ZaDk0SFxJsLlTuMa11CDdhoH8J00aOmTVW6pTgh5Ypl5BltGtHOzcyHdZnQ= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--kaleshsingh.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=H3qggOJI; arc=none smtp.client-ip=209.85.216.73 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--kaleshsingh.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="H3qggOJI" Received: by mail-pj1-f73.google.com with SMTP id 98e67ed59e1d1-32e372d0ef7so2041661a91.1 for ; Mon, 15 Sep 2025 09:46:08 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1757954768; x=1758559568; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=kCSSp2N5c9yEXM9I+1aefN3SldFOzk155OGuTtokYU4=; b=H3qggOJId+WHEgMkE+pQBOXMo+lhzo6ghpSwLu6SDeGucBahBESu/35Nt37VIo3scI 6h7g+z6/UDIvZmvvL2JTj/hIR41tZLGjSAimG9QDvY8dWlGOOsh9MJVF79KbaSkW8f2t z8zK9G/PQ9AArwrKs71XGbDRvMjEnsrKymuuZbgYBOmJ4LMgzvjrBI6Yd5MUxB0BYMq0 28Ga1gq7HROViiMrP3LlBG5KtDml9UwouQeVCjoYSjKYAfgTcucYMHbNRk4llc+VEeVp Bh8QQJj8t93uCqPBSY9iVa+Y7RGlZvv3yN52W6JBTofdrBzMaNZfVOJEekXiovlumLkX YL6Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1757954768; x=1758559568; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; 
bh=kCSSp2N5c9yEXM9I+1aefN3SldFOzk155OGuTtokYU4=; b=r6ShSNFx792gS+//G2MF+4NAJtBeW81vFA/z4vZShgTyDC4mHPqz8shbcT8F4DJfp2 Cy3L0dvUHrWbvJVo6emJvNRpWtmDUB9HWO4AczjjeXpcYgJzPDZMUW1bFHPLwBjweT65 JvByjuLkaSrdoJ5jw7tjChw17WDEnWiwJ1KRjyZkTXGGJaPWX9N54J52mOOGtx572syj 68Fy2UcLi2PuRYNXxd3Vm6yTghguYQ1Xktvl1up6cFLM0UpkTpxJHoJhPq9iPYD2z8aB Xd/X6GNmcMu2oEF6Ovj1+n9z6KS122xZgkcXB7EUYhCq70XPe5Bq+RCtrKfyfmS2DjU6 9jrA== X-Forwarded-Encrypted: i=1; AJvYcCXsQazfNdD13p9UMeDAQbumt1auikffxOSWrmqveUqUrrxGs03AXvmQN2BrcOXzhKYoDg/1GOQmhSUXVMg=@vger.kernel.org X-Gm-Message-State: AOJu0YzHffUHB/3ia47toyoCsw6KUmed1UZ/ONdmdwpigk3lDWV2u378 WHRh8wrIRSlEwuL5VuZkvq17NcWV1dgArUv2oPilFFzlqv+BFZwhTQDUmWSqwBVxrg3jF8ssiT/ D6dzBjvTXDJlCa5ET6ZA4RJJ+UQ== X-Google-Smtp-Source: AGHT+IH629yWc1BxgGvDumfHUKr1KQxr/RK85CqBBR0VFlrjCzLlim4w3J/AZqPO6/2jdfO/Nl5jWNcSFrWm5T8gGA== X-Received: from pjbnc8.prod.google.com ([2002:a17:90b:37c8:b0:32d:a4d4:bb17]) (user=kaleshsingh job=prod-delivery.src-stubby-dispatcher) by 2002:a17:90b:2889:b0:32e:18f2:7a59 with SMTP id 98e67ed59e1d1-32e18f27cb4mr8891271a91.11.1757954768220; Mon, 15 Sep 2025 09:46:08 -0700 (PDT) Date: Mon, 15 Sep 2025 09:36:33 -0700 In-Reply-To: <20250915163838.631445-1-kaleshsingh@google.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20250915163838.631445-1-kaleshsingh@google.com> X-Mailer: git-send-email 2.51.0.384.g4c02a37b29-goog Message-ID: <20250915163838.631445-3-kaleshsingh@google.com> Subject: [PATCH v2 2/7] mm/selftests: add max_vma_count tests From: Kalesh Singh To: akpm@linux-foundation.org, minchan@kernel.org, lorenzo.stoakes@oracle.com, david@redhat.com, Liam.Howlett@oracle.com, rppt@kernel.org, pfalcato@suse.de Cc: kernel-team@android.com, android-mm@google.com, Kalesh Singh , Alexander Viro , Christian Brauner , Jan Kara , Kees Cook , Vlastimil Babka , Suren Baghdasaryan , Michal Hocko , Steven Rostedt , Masami Hiramatsu , Mathieu Desnoyers , Ingo Molnar 
, Peter Zijlstra , Juri Lelli , Vincent Guittot , Dietmar Eggemann , Ben Segall , Mel Gorman , Valentin Schneider , Jann Horn , Shuah Khan , linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-trace-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Add a new selftest to verify that the max VMA count limit is correctly enforced. This test suite checks that various VMA operations (mmap, mprotect, munmap, mremap) succeed or fail as expected when the number of VMAs is close to the sysctl_max_map_count limit. The test works by first creating a large number of VMAs to bring the process close to the limit, and then performing various operations that may or may not create new VMAs. The test then verifies that the operations that would exceed the limit fail, and that the operations that do not exceed the limit succeed. NOTE: munmap is special as it's allowed to temporarily exceed the limit by one for splits as this will decrease back to the limit once the unmap succeeds. Cc: Andrew Morton Cc: David Hildenbrand Cc: "Liam R. Howlett" Cc: Lorenzo Stoakes Cc: Mike Rapoport Cc: Minchan Kim Cc: Pedro Falcato Signed-off-by: Kalesh Singh --- Changes in v2: - Add tests, per Liam (note that the do_brk_flags() path is not easily tested from userspace, so it's not included here). Exceeding the limit t= here should be uncommon. 
tools/testing/selftests/mm/Makefile | 1 + .../selftests/mm/max_vma_count_tests.c | 709 ++++++++++++++++++ tools/testing/selftests/mm/run_vmtests.sh | 5 + 3 files changed, 715 insertions(+) create mode 100644 tools/testing/selftests/mm/max_vma_count_tests.c diff --git a/tools/testing/selftests/mm/Makefile b/tools/testing/selftests/= mm/Makefile index d13b3cef2a2b..00a4b04eab06 100644 --- a/tools/testing/selftests/mm/Makefile +++ b/tools/testing/selftests/mm/Makefile @@ -91,6 +91,7 @@ TEST_GEN_FILES +=3D transhuge-stress TEST_GEN_FILES +=3D uffd-stress TEST_GEN_FILES +=3D uffd-unit-tests TEST_GEN_FILES +=3D uffd-wp-mremap +TEST_GEN_FILES +=3D max_vma_count_tests TEST_GEN_FILES +=3D split_huge_page_test TEST_GEN_FILES +=3D ksm_tests TEST_GEN_FILES +=3D ksm_functional_tests diff --git a/tools/testing/selftests/mm/max_vma_count_tests.c b/tools/testi= ng/selftests/mm/max_vma_count_tests.c new file mode 100644 index 000000000000..c8401c03425c --- /dev/null +++ b/tools/testing/selftests/mm/max_vma_count_tests.c @@ -0,0 +1,709 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright 2025 Google LLC + */ +#define _GNU_SOURCE + +#include +#include +#include +#include +#include +#include +#include +#include /* Definition of PR_* constants */ +#include + +#include "../kselftest.h" + +static int get_max_vma_count(void); +static bool set_max_vma_count(int val); +static int get_current_vma_count(void); +static bool is_current_vma_count(const char *msg, int expected); +static bool is_test_area_mapped(const char *msg); +static void print_surrounding_maps(const char *msg); + +/* Globals initialized in test_suite_setup() */ +static int MAX_VMA_COUNT; +static int ORIGINAL_MAX_VMA_COUNT; +static int PAGE_SIZE; +static int GUARD_SIZE; +static int TEST_AREA_SIZE; +static int EXTRA_MAP_SIZE; + +static int MAX_VMA_COUNT; + +static int NR_EXTRA_MAPS; + +static char *TEST_AREA; +static char *EXTRA_MAPS; + +#define DEFAULT_MAX_MAP_COUNT 65530 +#define TEST_AREA_NR_PAGES 3 +/* 1 before test 
area + 1 after test area + 1 after extra mappings */ +#define NR_GUARDS 3 +#define TEST_AREA_PROT (PROT_NONE) +#define EXTRA_MAP_PROT (PROT_NONE) + +/** + * test_suite_setup - Set up the VMA layout for VMA count testing. + * + * Sets up the following VMA layout: + * + * +----- base_addr + * | + * V + * +--------------+----------------------+--------------+----------------+= --------------+----------------+--------------+-----+----------------+-----= ---------+ + * | Guard Page | | Guard Page | Extra Map 1 |= Unmapped Gap | Extra Map 2 | Unmapped Gap | ... | Extra Map N | Unma= pped Gap | + * | (unmapped) | TEST_AREA | (unmapped) | (mapped page) |= (1 page) | (mapped page) | (1 page) | ... | (mapped page) | (1 = page) | + * | (1 page) | (unmapped, 3 pages) | (1 page) | (1 page) |= | (1 page) | | | (1 page) | = | + * +--------------+----------------------+--------------+----------------+= --------------+----------------+--------------+-----+----------------+-----= ---------+ + * ^ ^ ^ ^ = ^ + * | | | | = | + * +--GUARD_SIZE--+ | +-- EXTRA_MAPS poi= nts here Sufficient EXTRA_MAPS to ---+ + * (PAGE_SIZE) | | = reach MAX_VMA_COUNT + * | | + * +--- TEST_AREA_SIZE ---+ + * | (3 * PAGE_SIZE) | + * ^ + * | + * +-- TEST_AREA starts here + * + * Populates TEST_AREA and other globals required for the tests. + * If successful, the current VMA count will be MAX_VMA_COUNT - 1. + * + * Return: true on success, false on failure. 
+ */ +static bool test_suite_setup(void) +{ + int initial_vma_count; + size_t reservation_size; + void *base_addr =3D NULL; + char *ptr =3D NULL; + + ksft_print_msg("Setting up vma_max_count test suite...\n"); + + /* Initialize globals */ + PAGE_SIZE =3D sysconf(_SC_PAGESIZE); + TEST_AREA_SIZE =3D TEST_AREA_NR_PAGES * PAGE_SIZE; + GUARD_SIZE =3D PAGE_SIZE; + EXTRA_MAP_SIZE =3D PAGE_SIZE; + MAX_VMA_COUNT =3D get_max_vma_count(); + + MAX_VMA_COUNT =3D get_max_vma_count(); + if (MAX_VMA_COUNT < 0) { + ksft_print_msg("Failed to read /proc/sys/vm/max_map_count\n"); + return false; + } + + /* + * If the current limit is higher than the kernel default, + * we attempt to lower it to the default to ensure the test + * can run with a reliably known boundary. + */ + ORIGINAL_MAX_VMA_COUNT =3D 0; + + if (MAX_VMA_COUNT > DEFAULT_MAX_MAP_COUNT) { + ORIGINAL_MAX_VMA_COUNT =3D MAX_VMA_COUNT; + + ksft_print_msg("Max VMA count is %d, lowering to default %d for test...\= n", + MAX_VMA_COUNT, DEFAULT_MAX_MAP_COUNT); + + if (!set_max_vma_count(DEFAULT_MAX_MAP_COUNT)) { + ksft_print_msg("WARNING: Failed to lower max_map_count to %d (requires = root)\n", + DEFAULT_MAX_MAP_COUNT); + ksft_print_msg("Skipping test. Please run as root: limit needs adjustme= nt\n"); + + MAX_VMA_COUNT =3D ORIGINAL_MAX_VMA_COUNT; + + return false; + } + + /* Update MAX_VMA_COUNT for the test run */ + MAX_VMA_COUNT =3D DEFAULT_MAX_MAP_COUNT; + } + + initial_vma_count =3D get_current_vma_count(); + if (initial_vma_count < 0) { + ksft_print_msg("Failed to read /proc/self/maps\n"); + return false; + } + + /* + * Calculate how many extra mappings we need to create to reach + * MAX_VMA_COUNT - 1 (excluding test area). 
+ */ + NR_EXTRA_MAPS =3D MAX_VMA_COUNT - 1 - initial_vma_count; + + if (NR_EXTRA_MAPS < 1) { + ksft_print_msg("Not enough available maps to run test\n"); + ksft_print_msg("max_vma_count=3D%d, current_vma_count=3D%d\n", + MAX_VMA_COUNT, initial_vma_count); + return false; + } + + /* + * Reserve space for: + * - Extra mappings with a 1-page gap after each (NR_EXTRA_MAPS * 2) + * - The test area itself (TEST_AREA_NR_PAGES) + * - The guard pages (NR_GUARDS) + */ + reservation_size =3D ((NR_EXTRA_MAPS * 2) + + TEST_AREA_NR_PAGES + NR_GUARDS) * PAGE_SIZE; + + base_addr =3D mmap(NULL, reservation_size, PROT_NONE, + MAP_PRIVATE|MAP_ANONYMOUS, -1, 0); + if (base_addr =3D=3D MAP_FAILED) { + ksft_print_msg("Failed to mmap initial reservation\n"); + return false; + } + + if (munmap(base_addr, reservation_size) =3D=3D -1) { + ksft_print_msg("Failed to munmap initial reservation\n"); + return false; + } + + /* Get the addr of the test area */ + TEST_AREA =3D (char *)base_addr + GUARD_SIZE; + + /* + * Get the addr of the region for extra mappings: + * test area + 1 guard. 
+ */ + EXTRA_MAPS =3D TEST_AREA + TEST_AREA_SIZE + GUARD_SIZE; + + /* Create single-page mappings separated by unmapped pages */ + ptr =3D EXTRA_MAPS; + for (int i =3D 0; i < NR_EXTRA_MAPS; ++i) { + if (mmap(ptr, PAGE_SIZE, EXTRA_MAP_PROT, + MAP_PRIVATE|MAP_ANONYMOUS|MAP_FIXED_NOREPLACE, + -1, 0) =3D=3D MAP_FAILED) { + perror("mmap in fill loop"); + ksft_print_msg("Failed on mapping #%d of %d\n", i + 1, + NR_EXTRA_MAPS); + return false; + } + + /* Advance pointer by 2 pages to leave a gap */ + ptr +=3D (2 * EXTRA_MAP_SIZE); + } + + if (!is_current_vma_count("test_suite_setup", MAX_VMA_COUNT - 1)) + return false; + + ksft_print_msg("vma_max_count test suite setup done.\n"); + + return true; +} + +static void test_suite_teardown(void) +{ + if (ORIGINAL_MAX_VMA_COUNT && MAX_VMA_COUNT !=3D ORIGINAL_MAX_VMA_COUNT) { + if (!set_max_vma_count(ORIGINAL_MAX_VMA_COUNT)) + ksft_print_msg("Failed to restore max_map_count to %d\n", + ORIGINAL_MAX_VMA_COUNT); + } +} + +/* --- Test Helper Functions --- */ +static bool mmap_anon(void) +{ + void *addr =3D mmap(NULL, PAGE_SIZE, PROT_READ, + MAP_PRIVATE|MAP_ANONYMOUS, -1, 0); + + /* + * Handle cleanup here as the runner doesn't track where this + * mapping is located. 
+ */ + if (addr !=3D MAP_FAILED) + munmap(addr, PAGE_SIZE); + + return addr !=3D MAP_FAILED; +} + +static inline bool __mprotect(char *addr, int size) +{ + int new_prot =3D ~TEST_AREA_PROT & (PROT_READ | PROT_WRITE | PROT_EXEC); + + return mprotect(addr, size, new_prot) =3D=3D 0; +} + +static bool mprotect_nosplit(void) +{ + return __mprotect(TEST_AREA, TEST_AREA_SIZE); +} + +static bool mprotect_2way_split(void) +{ + return __mprotect(TEST_AREA, TEST_AREA_SIZE - PAGE_SIZE); +} + +static bool mprotect_3way_split(void) +{ + return __mprotect(TEST_AREA + PAGE_SIZE, PAGE_SIZE); +} + +static inline bool __munmap(char *addr, int size) +{ + return munmap(addr, size) =3D=3D 0; +} + +static bool munmap_nosplit(void) +{ + return __munmap(TEST_AREA, TEST_AREA_SIZE); +} + +static bool munmap_2way_split(void) +{ + return __munmap(TEST_AREA, TEST_AREA_SIZE - PAGE_SIZE); +} + +static bool munmap_3way_split(void) +{ + return __munmap(TEST_AREA + PAGE_SIZE, PAGE_SIZE); +} + +/* mremap accounts for the worst case to fail early */ +static const int MREMAP_REQUIRED_VMA_SLOTS =3D 6; + +static bool mremap_dontunmap(void) +{ + void *new_addr; + + /* + * Using MREMAP_DONTUNMAP will create a new mapping without + * removing the old one, consuming one VMA slot. + */ + new_addr =3D mremap(TEST_AREA, TEST_AREA_SIZE, TEST_AREA_SIZE, + MREMAP_MAYMOVE | MREMAP_DONTUNMAP, NULL); + + if (new_addr !=3D MAP_FAILED) + munmap(new_addr, TEST_AREA_SIZE); + + return new_addr !=3D MAP_FAILED; +} + +struct test { + const char *name; + bool (*test)(void); + /* How many VMA slots below the limit this test needs to start? 
*/ + int vma_slots_needed; + bool expect_success; +}; + +/* --- Test Cases --- */ +struct test tests[] =3D { + { + .name =3D "mmap_at_1_below_vma_count_limit", + .test =3D mmap_anon, + .vma_slots_needed =3D 1, + .expect_success =3D true, + }, + { + .name =3D "mmap_at_vma_count_limit", + .test =3D mmap_anon, + .vma_slots_needed =3D 0, + .expect_success =3D false, + }, + { + .name =3D "mprotect_nosplit_at_1_below_vma_count_limit", + .test =3D mprotect_nosplit, + .vma_slots_needed =3D 1, + .expect_success =3D true, + }, + { + .name =3D "mprotect_nosplit_at_vma_count_limit", + .test =3D mprotect_nosplit, + .vma_slots_needed =3D 0, + .expect_success =3D true, + }, + { + .name =3D "mprotect_2way_split_at_1_below_vma_count_limit", + .test =3D mprotect_2way_split, + .vma_slots_needed =3D 1, + .expect_success =3D true, + }, + { + .name =3D "mprotect_2way_split_at_vma_count_limit", + .test =3D mprotect_2way_split, + .vma_slots_needed =3D 0, + .expect_success =3D false, + }, + { + .name =3D "mprotect_3way_split_at_2_below_vma_count_limit", + .test =3D mprotect_3way_split, + .vma_slots_needed =3D 2, + .expect_success =3D true, + }, + { + .name =3D "mprotect_3way_split_at_1_below_vma_count_limit", + .test =3D mprotect_3way_split, + .vma_slots_needed =3D 1, + .expect_success =3D false, + }, + { + .name =3D "mprotect_3way_split_at_vma_count_limit", + .test =3D mprotect_3way_split, + .vma_slots_needed =3D 0, + .expect_success =3D false, + }, + { + .name =3D "munmap_nosplit_at_1_below_vma_count_limit", + .test =3D munmap_nosplit, + .vma_slots_needed =3D 1, + .expect_success =3D true, + }, + { + .name =3D "munmap_nosplit_at_vma_count_limit", + .test =3D munmap_nosplit, + .vma_slots_needed =3D 0, + .expect_success =3D true, + }, + { + .name =3D "munmap_2way_split_at_1_below_vma_count_limit", + .test =3D munmap_2way_split, + .vma_slots_needed =3D 1, + .expect_success =3D true, + }, + { + .name =3D "munmap_2way_split_at_vma_count_limit", + .test =3D munmap_2way_split, + 
.vma_slots_needed =3D 0, + .expect_success =3D true, + }, + { + .name =3D "munmap_3way_split_at_2_below_vma_count_limit", + .test =3D munmap_3way_split, + .vma_slots_needed =3D 2, + .expect_success =3D true, + }, + { + .name =3D "munmap_3way_split_at_1_below_vma_count_limit", + .test =3D munmap_3way_split, + .vma_slots_needed =3D 1, + .expect_success =3D true, + }, + { + .name =3D "munmap_3way_split_at_vma_count_limit", + .test =3D munmap_3way_split, + .vma_slots_needed =3D 0, + .expect_success =3D false, + }, + { + .name =3D "mremap_dontunmap_at_required_vma_count_capacity", + .test =3D mremap_dontunmap, + .vma_slots_needed =3D MREMAP_REQUIRED_VMA_SLOTS, + .expect_success =3D true, + }, + { + .name =3D "mremap_dontunmap_at_1_below_required_vma_count_capacity", + .test =3D mremap_dontunmap, + .vma_slots_needed =3D MREMAP_REQUIRED_VMA_SLOTS - 1, + .expect_success =3D false, + }, +}; + +/* --- Test Runner --- */ +int main(int argc, char **argv) +{ + int num_tests =3D ARRAY_SIZE(tests); + int failed_tests =3D 0; + + ksft_set_plan(num_tests); + + if (!test_suite_setup()) { + if (MAX_VMA_COUNT > DEFAULT_MAX_MAP_COUNT) + ksft_exit_skip("max_map_count too high and cannot be lowered\n" + "Please rerun as root.\n"); + else + ksft_exit_fail_msg("Test suite setup failed. Aborting.\n"); + + } + + for (int i =3D 0; i < num_tests; i++) { + int maps_to_unmap =3D tests[i].vma_slots_needed; + const char *name =3D tests[i].name; + bool test_passed; + + errno =3D 0; + + /* 1. 
Setup: TEST_AREA mapping */ + if (mmap(TEST_AREA, TEST_AREA_SIZE, TEST_AREA_PROT, + MAP_PRIVATE|MAP_ANONYMOUS|MAP_FIXED, -1, 0) + =3D=3D MAP_FAILED) { + ksft_test_result_fail( + "%s: Test setup failed to map TEST_AREA\n", + name); + maps_to_unmap =3D 0; + goto fail; + } + + /* Label TEST_AREA to ease debugging */ + if (prctl(PR_SET_VMA, PR_SET_VMA_ANON_NAME, TEST_AREA, + TEST_AREA_SIZE, "TEST_AREA")) { + ksft_print_msg("WARNING: [%s] prctl(PR_SET_VMA) failed\n", + name); + ksft_print_msg( + "Continuing without named TEST_AREA mapping\n"); + } + + /* 2. Setup: Adjust VMA count based on test requirements */ + if (maps_to_unmap > NR_EXTRA_MAPS) { + ksft_test_result_fail( + "%s: Test setup failed: Invalid VMA slots required %d\n", + name, tests[i].vma_slots_needed); + maps_to_unmap =3D 0; + goto fail; + } + + /* Unmap extra mappings, accounting for the 1-page gap */ + for (int j =3D 0; j < maps_to_unmap; j++) + munmap(EXTRA_MAPS + (j * 2 * EXTRA_MAP_SIZE), + EXTRA_MAP_SIZE); + + /* + * 3. Verify the preconditions. + * + * Sometimes there isn't an easy way to determine the cause + * of the test failure. + * e.g. an mprotect ENOMEM may be due to trying to protect + * unmapped area or due to hitting MAX_VMA_COUNT limit. + * + * We verify the preconditions of the test to ensure any + * expected failures are from the expected cause and not + * coincidental. + */ + if (!is_current_vma_count(name, + MAX_VMA_COUNT - tests[i].vma_slots_needed)) + goto fail; + + if (!is_test_area_mapped(name)) + goto fail; + + /* 4. Run the test */ + test_passed =3D (tests[i].test() =3D=3D tests[i].expect_success); + if (test_passed) { + ksft_test_result_pass("%s\n", name); + } else { +fail: + failed_tests++; + ksft_test_result_fail( + "%s: current_vma_count=3D%d,max_vma_count=3D%d: errno: %d (%s)\n", + name, get_current_vma_count(), MAX_VMA_COUNT, + errno, strerror(errno)); + print_surrounding_maps(name); + } + + /* 5. Teardown: Unmap TEST_AREA. */ + munmap(TEST_AREA, TEST_AREA_SIZE); + + /* 6. 
Teardown: Restore extra mappings to test suite baseline */ + for (int j =3D 0; j < maps_to_unmap; j++) { + /* Remap extra mappings, accounting for the gap */ + mmap(EXTRA_MAPS + (j * 2 * EXTRA_MAP_SIZE), + EXTRA_MAP_SIZE, EXTRA_MAP_PROT, + MAP_PRIVATE|MAP_ANONYMOUS|MAP_FIXED_NOREPLACE, + -1, 0); + } + } + + test_suite_teardown(); + + if (failed_tests > 0) + ksft_exit_fail(); + else + ksft_exit_pass(); +} + +/* --- Utilities --- */ + +static int get_max_vma_count(void) +{ + int max_count; + FILE *f; + + f =3D fopen("/proc/sys/vm/max_map_count", "r"); + if (!f) + return -1; + + if (fscanf(f, "%d", &max_count) !=3D 1) + max_count =3D -1; + + + fclose(f); + + return max_count; +} + +static bool set_max_vma_count(int val) +{ + FILE *f; + bool success =3D false; + + f =3D fopen("/proc/sys/vm/max_map_count", "w"); + if (!f) + return false; + + if (fprintf(f, "%d", val) > 0) + success =3D true; + + fclose(f); + return success; +} + +static int get_current_vma_count(void) +{ + char line[1024]; + int count =3D 0; + FILE *f; + + f =3D fopen("/proc/self/maps", "r"); + if (!f) + return -1; + + while (fgets(line, sizeof(line), f)) { + if (!strstr(line, "[vsyscall]")) + count++; + } + + fclose(f); + + return count; +} + +static bool is_current_vma_count(const char *msg, int expected) +{ + int current =3D get_current_vma_count(); + + if (current =3D=3D expected) + return true; + + ksft_print_msg("%s: vma count is %d, expected %d\n", msg, current, + expected); + return false; +} + +static bool is_test_area_mapped(const char *msg) +{ + unsigned long search_start =3D (unsigned long)TEST_AREA; + unsigned long search_end =3D search_start + TEST_AREA_SIZE; + bool found =3D false; + char line[1024]; + FILE *f; + + f =3D fopen("/proc/self/maps", "r"); + if (!f) { + ksft_print_msg("failed to open /proc/self/maps\n"); + return false; + } + + while (fgets(line, sizeof(line), f)) { + unsigned long start, end; + + if (sscanf(line, "%lx-%lx", &start, &end) !=3D 2) + continue; + + /* Check for 
an exact match of the range */ + if (start =3D=3D search_start && end =3D=3D search_end) { + found =3D true; + break; + } else if (start > search_end) { + /* + * Since maps are sorted, if we've passed the end, we + * can stop searching. + */ + break; + } + } + + fclose(f); + + if (found) + return true; + + /* Not found */ + ksft_print_msg( + "%s: TEST_AREA is not mapped as a single contiguous block.\n", + msg); + print_surrounding_maps(msg); + + return false; +} + +static void print_surrounding_maps(const char *msg) +{ + unsigned long search_start =3D (unsigned long)TEST_AREA; + unsigned long search_end =3D search_start + TEST_AREA_SIZE; + unsigned long start; + unsigned long end; + char line[1024] =3D {}; + int line_idx =3D 0; + int first_match_idx =3D -1; + int last_match_idx =3D -1; + FILE *f; + + f =3D fopen("/proc/self/maps", "r"); + if (!f) + return; + + if (msg) + ksft_print_msg("%s\n", msg); + + ksft_print_msg("--- Surrounding VMA entries for TEST_AREA (%p) ---\n", + TEST_AREA); + + /* First pass: Read all lines and find the range of matching entries */ + fseek(f, 0, SEEK_SET); /* Rewind file */ + while (fgets(line, sizeof(line), f)) { + if (sscanf(line, "%lx-%lx", &start, &end) !=3D 2) { + line_idx++; + continue; + } + + /* Check for any overlap */ + if (start < search_end && end > search_start) { + if (first_match_idx =3D=3D -1) + first_match_idx =3D line_idx; + last_match_idx =3D line_idx; + } else if (start > search_end) { + /* + * Since maps are sorted, if we've passed the end, we + * can stop searching. 
+ */ + break; + } + + line_idx++; + } + + if (first_match_idx =3D=3D -1) { + ksft_print_msg("TEST_AREA (%p) is not currently mapped.\n", + TEST_AREA); + } else { + /* Second pass: Print the relevant lines */ + fseek(f, 0, SEEK_SET); /* Rewind file */ + line_idx =3D 0; + while (fgets(line, sizeof(line), f)) { + /* Print 2 lines before the first match */ + if (line_idx >=3D first_match_idx - 2 && + line_idx < first_match_idx) + ksft_print_msg(" %s", line); + + /* Print all matching TEST_AREA entries */ + if (line_idx >=3D first_match_idx && + line_idx <=3D last_match_idx) + ksft_print_msg(">> %s", line); + + /* Print 2 lines after the last match */ + if (line_idx > last_match_idx && + line_idx <=3D last_match_idx + 2) + ksft_print_msg(" %s", line); + + line_idx++; + } + } + + ksft_print_msg("--------------------------------------------------\n"); + + fclose(f); +} diff --git a/tools/testing/selftests/mm/run_vmtests.sh b/tools/testing/self= tests/mm/run_vmtests.sh index 471e539d82b8..3794b50ec280 100755 --- a/tools/testing/selftests/mm/run_vmtests.sh +++ b/tools/testing/selftests/mm/run_vmtests.sh @@ -49,6 +49,8 @@ separated by spaces: test madvise(2) MADV_GUARD_INSTALL and MADV_GUARD_REMOVE options - madv_populate test memadvise(2) MADV_POPULATE_{READ,WRITE} options +- max_vma_count + tests for max vma_count - memfd_secret test memfd_secret(2) - process_mrelease @@ -417,6 +419,9 @@ fi # VADDR64 # vmalloc stability smoke test CATEGORY=3D"vmalloc" run_test bash ./test_vmalloc.sh smoke =20 +# test operations against max vma count limit +CATEGORY=3D"max_vma_count" run_test ./max_vma_count_tests + CATEGORY=3D"mremap" run_test ./mremap_dontunmap =20 CATEGORY=3D"hmm" run_test bash ./test_hmm.sh smoke --=20 2.51.0.384.g4c02a37b29-goog From nobody Thu Oct 2 14:25:58 2025 Received: from mail-pl1-f202.google.com (mail-pl1-f202.google.com [209.85.214.202]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by 
smtp.subspace.kernel.org (Postfix) with ESMTPS id 44FCD286D78 for ; Mon, 15 Sep 2025 16:46:23 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.214.202 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1757954786; cv=none; b=fw1n6CINnKtUaPvikOfXUsQF/iFumqyq14oBkE3mM7r/wMFPGBhSd3iZr7qCGxkje8DdD60Ulrk+m4z08hdNR9lv1xB6aiuVYnyjvBybY5uwllW+X1wDMcQrHKseCr9YxS8ov/rq8Vq0CDw21TKfYJzWZ9FYcd31u9jzxZ67uW8= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1757954786; c=relaxed/simple; bh=lBWlwS7JcG7Bq/AWf4HTbFz+jYysV8dehUYeg6ZHX+w=; h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type; b=cjlwMhYa/3pw74BKHus5KVbQeaSQ7UJJME9vRQf/Vecaeqa3RYZpbUGn53S7GL+cQQcH0y1368Xjkp5+2hNDAiZkfLL5YKC3dZedX/fbuJnQ23NZc9+voFpxK+JirkO6gCet0Gn/hgIM2ArA/pumJj+hWZ0XDmqAeUoWd6orGj4= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--kaleshsingh.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=yU9zVeT3; arc=none smtp.client-ip=209.85.214.202 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--kaleshsingh.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="yU9zVeT3" Received: by mail-pl1-f202.google.com with SMTP id d9443c01a7336-252afdfafe1so46773465ad.0 for ; Mon, 15 Sep 2025 09:46:23 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1757954783; x=1758559583; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=kvhIc5Qwm2MSlXgaqDt0tPsJi6Fn6VtZiCytLA6TTWs=; 
b=yU9zVeT3pmDrQ8UAASYsebznUX/oI10RS8RbKZE2oeH6l1pSMmASgj7Ww4qlEhy0zR fsrCL2CJCx/fqi3ANRRzAGM7c4j2rNfkO+CWnbYX+gAdruP/oEBOfxUlsZGMbOXhzd9P HzRPsAbD4mbw4yXUlPg4n5ofIJ+lhDRbsjMvMDxdmCxkqeWskN/tReNcR3zuC53qLP9c MMStveY5rq8LvSkT2dwD6CYZ1995U8uvBmeHiLc7qTE+N2GYRj+GKLFWfSmky9EeZGHo tx0tJrxnFVA5wiJL3B4xo4c1ZxvnWYuFgiqTelYI7S0xD9OGTXayj757q9IJjVzQh1Yx GxgA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1757954783; x=1758559583; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=kvhIc5Qwm2MSlXgaqDt0tPsJi6Fn6VtZiCytLA6TTWs=; b=VXZ9O3if+RCCtWdGQDwuKhvhTLCjxb3NMr6pKAYl5Y7OzI/x4flkE+HKTVA6odCaC9 egaICyDtQ0gIRIB6TVVoKJuR38rz4uiLmUyi7KvMJW7Xcy8t0FT8kezV2es/F1ys5Eaa +p9t8YabGSFywpvqafsgTP2REfCaLqjfFxMiKqyBgjPUS1/grm/q+vBiXI96WXhLlSPr icOiDOVd6zaTiOG6P+8l/GJPVpDXzA5S5dwsKQk+QxSlRuCXTyL5XEhzwPrEg4H8yLBQ UQXxylnOn9ZtNovBU2KMT0dFDaBpr/ceBhySrLyXgkuPQwk+XRyLxtcsG4ChrZWK3az+ Tfsw== X-Forwarded-Encrypted: i=1; AJvYcCXP4Vb2qvoDi+uCxN/qTYLsKxvG5zfYzyP3HJ97RT6IAmbLjWboQhuQBUjQ2IyvfzyAnpbgm47KIuyEqVg=@vger.kernel.org X-Gm-Message-State: AOJu0Yyx1GcqN45wPu+lBu5v9U3ZOfp5vKIpbUlp5cYcr+7aqWglCNWn aNl0ZRbhI+EiUZ808dUi3fP34GaYdqRAxzNPMZb3Mkti7cPZDVf+5Kld3GDSCFR7ea0uOG3qZUi IwwRn89/HdvENw3oCOMTOuQpAjQ== X-Google-Smtp-Source: AGHT+IETjSAgl4Bq7YGi4NHOQLEhA++l/mjHETWNYuPpSZDm1ovOxLN6UFeAYtoyBJbEGZn5mcnKagruLzvaK22crw== X-Received: from plap12.prod.google.com ([2002:a17:902:f08c:b0:264:3c1f:6385]) (user=kaleshsingh job=prod-delivery.src-stubby-dispatcher) by 2002:a17:903:240c:b0:24e:3cf2:2450 with SMTP id d9443c01a7336-25d23e13c87mr190358195ad.2.1757954783437; Mon, 15 Sep 2025 09:46:23 -0700 (PDT) Date: Mon, 15 Sep 2025 09:36:34 -0700 In-Reply-To: <20250915163838.631445-1-kaleshsingh@google.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: 
<20250915163838.631445-1-kaleshsingh@google.com> X-Mailer: git-send-email 2.51.0.384.g4c02a37b29-goog Message-ID: <20250915163838.631445-4-kaleshsingh@google.com> Subject: [PATCH v2 3/7] mm: introduce vma_count_remaining() From: Kalesh Singh To: akpm@linux-foundation.org, minchan@kernel.org, lorenzo.stoakes@oracle.com, david@redhat.com, Liam.Howlett@oracle.com, rppt@kernel.org, pfalcato@suse.de Cc: kernel-team@android.com, android-mm@google.com, Kalesh Singh , Alexander Viro , Christian Brauner , Jan Kara , Kees Cook , Vlastimil Babka , Suren Baghdasaryan , Michal Hocko , Steven Rostedt , Masami Hiramatsu , Mathieu Desnoyers , Ingo Molnar , Peter Zijlstra , Juri Lelli , Vincent Guittot , Dietmar Eggemann , Ben Segall , Mel Gorman , Valentin Schneider , Jann Horn , Shuah Khan , linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-trace-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8"

The checks against sysctl_max_map_count are open-coded in multiple places. While simple checks are manageable, the logic in places like mremap.c involves arithmetic with magic numbers that can be difficult to reason about, e.g.:

    ... >= sysctl_max_map_count - 3

To improve readability and centralize the logic, introduce a new helper, vma_count_remaining(). This function returns the VMA count headroom available for a given process.

The most common case of checking for a single new VMA reduces to testing the return value for zero:

    if (!vma_count_remaining(mm))

And the complex checks in mremap.c become clearer by expressing the required capacity directly:

    if (vma_count_remaining(mm) < 4)

While a capacity-based function could be misused (e.g., with an incorrect '<' vs '<=' comparison), the improved readability at the call sites makes such errors less likely than with the previous open-coded arithmetic.

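To make the equivalence between the old and new forms concrete, here is a minimal userspace sketch (not part of the patch). The names vma_count_remaining_sketch(), the *_check() helpers and count_mismatches() are hypothetical, and the counts are passed as plain ints instead of being read from mm_struct or the sysctl. It compares each open-coded comparison removed by this patch against its capacity-based replacement:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for vma_count_remaining(): same clamped
 * subtraction as the patch, but with both counts passed in explicitly. */
static int vma_count_remaining_sketch(int map_count, int max_count)
{
	return (max_count > map_count) ? (max_count - map_count) : 0;
}

/* Single new VMA: old form was "map_count >= sysctl_max_map_count". */
static bool old_single_check(int map_count, int max_count)
{
	return map_count >= max_count;
}

static bool new_single_check(int map_count, int max_count)
{
	return !vma_count_remaining_sketch(map_count, max_count);
}

/* prep_move_vma(): old form was "map_count >= sysctl_max_map_count - 3". */
static bool old_move_check(int map_count, int max_count)
{
	return map_count >= max_count - 3;
}

static bool new_move_check(int map_count, int max_count)
{
	return vma_count_remaining_sketch(map_count, max_count) < 4;
}

/* check_mremap_params(): old form was
 * "(map_count + 2) >= sysctl_max_map_count - 3". */
static bool old_params_check(int map_count, int max_count)
{
	return (map_count + 2) >= max_count - 3;
}

static bool new_params_check(int map_count, int max_count)
{
	return vma_count_remaining_sketch(map_count, max_count) < 6;
}

/* Walk map_count from 0 to a little past the limit and count every value
 * for which an old check and its replacement disagree. */
static int count_mismatches(int max_count)
{
	int mismatches = 0;

	for (int map_count = 0; map_count <= max_count + 8; map_count++) {
		if (old_single_check(map_count, max_count) !=
		    new_single_check(map_count, max_count))
			mismatches++;
		if (old_move_check(map_count, max_count) !=
		    new_move_check(map_count, max_count))
			mismatches++;
		if (old_params_check(map_count, max_count) !=
		    new_params_check(map_count, max_count))
			mismatches++;
	}
	return mismatches;
}
```

Calling count_mismatches(65530) (the value of DEFAULT_MAX_MAP_COUNT, USHRT_MAX - 5) returns 0 in this sketch, since "map_count >= max - 3" and "max - map_count < 4" are the same inequality and the clamp to 0 only applies once the limit has already been reached.
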
As part of this change, sysctl_max_map_count is made static to mm/mmap.c to improve encapsulation. Cc: Andrew Morton Cc: David Hildenbrand Cc: "Liam R. Howlett" Cc: Lorenzo Stoakes Cc: Mike Rapoport Cc: Minchan Kim Cc: Pedro Falcato Signed-off-by: Kalesh Singh --- Changes in v2: - Fix documentation comment for vma_count_remaining(), per Mike - Remove extern in header, per Mike and Pedro - Move declaration to mm/internal.h, per Mike - Replace exceeds_max_map_count() with capacity-based vma_count_remaining= (), per Lorenzo. - Fix tools/testing/vma, per Lorenzo. include/linux/mm.h | 2 -- mm/internal.h | 2 ++ mm/mmap.c | 21 ++++++++++++++++++++- mm/mremap.c | 7 ++++--- mm/nommu.c | 2 +- mm/util.c | 1 - mm/vma.c | 10 +++++----- tools/testing/vma/vma_internal.h | 9 +++++++++ 8 files changed, 41 insertions(+), 13 deletions(-) diff --git a/include/linux/mm.h b/include/linux/mm.h index 1ae97a0b8ec7..138bab2988f8 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -192,8 +192,6 @@ static inline void __mm_zero_struct_page(struct page *p= age) #define MAPCOUNT_ELF_CORE_MARGIN (5) #define DEFAULT_MAX_MAP_COUNT (USHRT_MAX - MAPCOUNT_ELF_CORE_MARGIN) =20 -extern int sysctl_max_map_count; - extern unsigned long sysctl_user_reserve_kbytes; extern unsigned long sysctl_admin_reserve_kbytes; =20 diff --git a/mm/internal.h b/mm/internal.h index 45b725c3dc03..39f1c9535ae5 100644 --- a/mm/internal.h +++ b/mm/internal.h @@ -1661,4 +1661,6 @@ static inline bool reclaim_pt_is_enabled(unsigned lon= g start, unsigned long end, void dup_mm_exe_file(struct mm_struct *mm, struct mm_struct *oldmm); int dup_mmap(struct mm_struct *mm, struct mm_struct *oldmm); =20 +int vma_count_remaining(const struct mm_struct *mm); + #endif /* __MM_INTERNAL_H */ diff --git a/mm/mmap.c b/mm/mmap.c index e5370e7fcd8f..af88ce1fbb5f 100644 --- a/mm/mmap.c +++ b/mm/mmap.c @@ -374,7 +374,7 @@ unsigned long do_mmap(struct file *file, unsigned long = addr, return -EOVERFLOW; =20 /* Too many mappings? 
*/ - if (mm->map_count >=3D sysctl_max_map_count) + if (!vma_count_remaining(mm)) return -ENOMEM; =20 /* @@ -1504,6 +1504,25 @@ struct vm_area_struct *_install_special_mapping( int sysctl_legacy_va_layout; #endif =20 +static int sysctl_max_map_count __read_mostly =3D DEFAULT_MAX_MAP_COUNT; + +/** + * vma_count_remaining - Determine available VMA slots + * @mm: The memory descriptor for the process. + * + * Check how many more VMAs can be created for the given @mm + * before hitting the sysctl_max_map_count limit. + * + * Return: The number of new VMAs the process can accommodate. + */ +int vma_count_remaining(const struct mm_struct *mm) +{ + const int map_count =3D mm->map_count; + const int max_count =3D sysctl_max_map_count; + + return (max_count > map_count) ? (max_count - map_count) : 0; +} + static const struct ctl_table mmap_table[] =3D { { .procname =3D "max_map_count", diff --git a/mm/mremap.c b/mm/mremap.c index 35de0a7b910e..14d35d87e89b 100644 --- a/mm/mremap.c +++ b/mm/mremap.c @@ -1040,7 +1040,7 @@ static unsigned long prep_move_vma(struct vma_remap_s= truct *vrm) * We'd prefer to avoid failure later on in do_munmap: * which may split one vma into three before unmapping. */ - if (current->mm->map_count >=3D sysctl_max_map_count - 3) + if (vma_count_remaining(current->mm) < 4) return -ENOMEM; =20 if (vma->vm_ops && vma->vm_ops->may_split) { @@ -1814,9 +1814,10 @@ static unsigned long check_mremap_params(struct vma_= remap_struct *vrm) * split in 3 before unmapping it. * That means 2 more maps (1 for each) to the ones we already hold. * Check whether current map count plus 2 still leads us to 4 maps below - * the threshold, otherwise return -ENOMEM here to be more safe. + * the threshold. In other words, is the current map count + 6 at or + * below the threshold? Otherwise return -ENOMEM here to be more safe. 
*/ - if ((current->mm->map_count + 2) >=3D sysctl_max_map_count - 3) + if (vma_count_remaining(current->mm) < 6) return -ENOMEM; =20 return 0; diff --git a/mm/nommu.c b/mm/nommu.c index 8b819fafd57b..dd75f2334812 100644 --- a/mm/nommu.c +++ b/mm/nommu.c @@ -1316,7 +1316,7 @@ static int split_vma(struct vma_iterator *vmi, struct= vm_area_struct *vma, return -ENOMEM; =20 mm =3D vma->vm_mm; - if (mm->map_count >=3D sysctl_max_map_count) + if (!vma_count_remaining(mm)) return -ENOMEM; =20 region =3D kmem_cache_alloc(vm_region_jar, GFP_KERNEL); diff --git a/mm/util.c b/mm/util.c index f814e6a59ab1..b6e83922cafe 100644 --- a/mm/util.c +++ b/mm/util.c @@ -751,7 +751,6 @@ EXPORT_SYMBOL(folio_mc_copy); int sysctl_overcommit_memory __read_mostly =3D OVERCOMMIT_GUESS; static int sysctl_overcommit_ratio __read_mostly =3D 50; static unsigned long sysctl_overcommit_kbytes __read_mostly; -int sysctl_max_map_count __read_mostly =3D DEFAULT_MAX_MAP_COUNT; unsigned long sysctl_user_reserve_kbytes __read_mostly =3D 1UL << 17; /* 1= 28MB */ unsigned long sysctl_admin_reserve_kbytes __read_mostly =3D 1UL << 13; /* = 8MB */ =20 diff --git a/mm/vma.c b/mm/vma.c index 033a388bc4b1..df0e8409f63d 100644 --- a/mm/vma.c +++ b/mm/vma.c @@ -491,8 +491,8 @@ void unmap_region(struct ma_state *mas, struct vm_area_= struct *vma, } =20 /* - * __split_vma() bypasses sysctl_max_map_count checking. We use this wher= e it - * has already been checked or doesn't make sense to fail. + * __split_vma() bypasses vma_count_remaining() checks. We use this where + * it has already been checked or doesn't make sense to fail. * VMA Iterator will point to the original VMA. 
*/ static __must_check int @@ -592,7 +592,7 @@ __split_vma(struct vma_iterator *vmi, struct vm_area_st= ruct *vma, static int split_vma(struct vma_iterator *vmi, struct vm_area_struct *vma, unsigned long addr, int new_below) { - if (vma->vm_mm->map_count >=3D sysctl_max_map_count) + if (!vma_count_remaining(vma->vm_mm)) return -ENOMEM; =20 return __split_vma(vmi, vma, addr, new_below); @@ -1345,7 +1345,7 @@ static int vms_gather_munmap_vmas(struct vma_munmap_s= truct *vms, * its limit temporarily, to help free resources as expected. */ if (vms->end < vms->vma->vm_end && - vms->vma->vm_mm->map_count >=3D sysctl_max_map_count) { + !vma_count_remaining(vms->vma->vm_mm)) { error =3D -ENOMEM; goto map_count_exceeded; } @@ -2772,7 +2772,7 @@ int do_brk_flags(struct vma_iterator *vmi, struct vm_= area_struct *vma, if (!may_expand_vm(mm, vm_flags, len >> PAGE_SHIFT)) return -ENOMEM; =20 - if (mm->map_count >=3D sysctl_max_map_count) + if (!vma_count_remaining(mm)) return -ENOMEM; =20 if (security_vm_enough_memory_mm(mm, len >> PAGE_SHIFT)) diff --git a/tools/testing/vma/vma_internal.h b/tools/testing/vma/vma_inter= nal.h index 3639aa8dd2b0..52cd7ddc73f4 100644 --- a/tools/testing/vma/vma_internal.h +++ b/tools/testing/vma/vma_internal.h @@ -1517,4 +1517,13 @@ static inline vm_flags_t ksm_vma_flags(const struct = mm_struct *, const struct fi return vm_flags; } =20 +/* Helper to get VMA count capacity */ +static int vma_count_remaining(const struct mm_struct *mm) +{ + const int map_count =3D mm->map_count; + const int max_count =3D sysctl_max_map_count; + + return (max_count > map_count) ? 
(max_count - map_count) : 0; +} + #endif /* __MM_VMA_INTERNAL_H */ --=20 2.51.0.384.g4c02a37b29-goog From nobody Thu Oct 2 14:25:58 2025 Received: from mail-pl1-f202.google.com (mail-pl1-f202.google.com [209.85.214.202]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id DA8A823B618 for ; Mon, 15 Sep 2025 16:46:39 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.214.202 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1757954801; cv=none; b=qfqPVImosQOjvK9BdTur1Es29Q/Q2lRA7hSBJsF2K1o1wCzM8KUk3oQ5JtpCjhxiFWWzQdEtYu1+0aLgJvdX1PqvQC9HGyTX2SoUhgUjc9Jxf8T08B+4YVvgttuAzPZqeJa3rgBDwUfrKokdWVwiaEgsPnnSORIkHGbZ9vYrQ6c= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1757954801; c=relaxed/simple; bh=Wm5kVy+QK3aYdtcEt135qslNE4sqGTNT0KRjz0cm82A=; h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type; b=GcPv5selo1Q2YZ5emv0wAx1pH+YIrQ6fZ+Mi/e7+0VFgLQAlnUFfp/op90ikgSmnFdUTIdr7IqJtdbssd+9vEGB41SzG27TkO0fy8vijvbP3hiK0KT5TXRPw1AXIyIg9gVKNP3AkvpCcBykNIGUu4RZza8AkGd/tI8nZombIHsE= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--kaleshsingh.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=cwPWWf17; arc=none smtp.client-ip=209.85.214.202 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--kaleshsingh.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="cwPWWf17" Received: by mail-pl1-f202.google.com with SMTP id d9443c01a7336-24456ebed7bso58565925ad.0 for ; Mon, 15 Sep 2025 09:46:39 -0700 (PDT) 
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1757954799; x=1758559599; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=9uec6c9hHTVF7EYZ252n3iiXGzXanDu3XqGzsoJcRFg=; b=cwPWWf178tua5mD0wIhxdd+PSC2PkLrO63zLECIDoL9b4rmjfWTzwozfE5hxw6EviN AQmObY1Zv2SzVDz4AUFdiBA4v5+RmQufRoNuGb+k17PUdymD1SJ6SfyDBJ6bxlJie9h7 88OaLrSeATBJ3w162YyB603AGPujPcbM/BrohEFYI2ZDDdlc7VSuI02pvgn7n5jxo0MA VJ3M1GumJyB0ezDoyRbTRrG+hJ6873pJKZU2Tow0/Ht/7XghNjE3vskoSnm0xJ0fJQB+ k9xevCPOEUyhZjI0vyUBzxO4VpqetyIzRhdXQDA06AUzDhxY+htNRJZ0eitLJfx49w5z acJg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1757954799; x=1758559599; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=9uec6c9hHTVF7EYZ252n3iiXGzXanDu3XqGzsoJcRFg=; b=hzWYqVKEIzV7tmBSiCD+I8G2e6ml4mkOh3wSgjyqhadOKPAuQlgXnaeHCMc3NSbxA1 gMn3g2Bsz4LTyaIsDfEd/jkz1eGAihnqYxdjGyJc7/tsQ+4pPXse5mbH1IfJKXnPi5N2 t0qwi0fXh/Hi2sCz5C5dvcn9TYMZ1PAMAyjhZxzamPyVy10CAJrdwQkbbfYdBWOxa5sx 86rE+1R08WSDyEFMK6lb3v384pbvAUQXwg3QErDitRUAknHl5Lo5sJijs6lBp/tQHhyy NNkgFkIdzfpBD7XXNJpPYOu999TQTBjAwEvqutS5qAmwVXmZgIyvxlcmhRfE0UR9ib43 gbRg== X-Forwarded-Encrypted: i=1; AJvYcCWyXETUCh6E5pevwpcHDeN09lIJ59/vVtNmZ6/icD+3SQqQCXsn2zmucjaAJyXNqcgBLY6MdDZjNXqVHNI=@vger.kernel.org X-Gm-Message-State: AOJu0YxSh7PmV/bKNthUBk+Di21gb5nlRR2j7IRQsg5xLU6ziIRcVERy vynsPJSA/Ctme6M29IppomzeK6utwyJHZazwjDpUJtFLBwjOCbOpInCUNexBUSE06jBeJ5pp0tW cV9zxhdEfNw/cSx++pXoej7AXXQ== X-Google-Smtp-Source: AGHT+IE0Au7KtYKMn6v6+XfekUxenGOOYdpGtRed0ASYQgDdnt6D91OKFdMUEycPdM3BioeBGvXFHcpGg9KCP0jNrA== X-Received: from plbmz4.prod.google.com ([2002:a17:903:3504:b0:266:c070:158d]) (user=kaleshsingh job=prod-delivery.src-stubby-dispatcher) by 2002:a17:902:d4ce:b0:25d:510:622c with SMTP id d9443c01a7336-25d2da0f07fmr167476645ad.28.1757954798904; Mon, 15 Sep 2025 
09:46:38 -0700 (PDT) Date: Mon, 15 Sep 2025 09:36:35 -0700 In-Reply-To: <20250915163838.631445-1-kaleshsingh@google.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20250915163838.631445-1-kaleshsingh@google.com> X-Mailer: git-send-email 2.51.0.384.g4c02a37b29-goog Message-ID: <20250915163838.631445-5-kaleshsingh@google.com> Subject: [PATCH v2 4/7] mm: rename mm_struct::map_count to vma_count From: Kalesh Singh To: akpm@linux-foundation.org, minchan@kernel.org, lorenzo.stoakes@oracle.com, david@redhat.com, Liam.Howlett@oracle.com, rppt@kernel.org, pfalcato@suse.de Cc: kernel-team@android.com, android-mm@google.com, Kalesh Singh , Alexander Viro , Christian Brauner , Jan Kara , Kees Cook , Vlastimil Babka , Suren Baghdasaryan , Michal Hocko , Steven Rostedt , Masami Hiramatsu , Mathieu Desnoyers , Ingo Molnar , Peter Zijlstra , Juri Lelli , Vincent Guittot , Dietmar Eggemann , Ben Segall , Mel Gorman , Valentin Schneider , Jann Horn , Shuah Khan , linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-trace-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" A mechanical rename of the mm_struct->map_count field to vma_count; no functional change is intended. The name "map_count" is ambiguous within the memory management subsystem, as it can be confused with the folio/page->_mapcount field, which tracks PTE references. The new name, vma_count, is more precise as this field has always counted the number of vm_area_structs associated with an mm_struct. Cc: Andrew Morton Cc: David Hildenbrand Cc: "Liam R. 
Howlett" Cc: Lorenzo Stoakes Cc: Mike Rapoport Cc: Minchan Kim Cc: Pedro Falcato Signed-off-by: Kalesh Singh Acked-by: David Hildenbrand Reviewed-by: Lorenzo Stoakes Reviewed-by: Pedro Falcato --- Changes in v2: - map_count is easily confused with _mapcount rename to vma_count, per Da= vid fs/binfmt_elf.c | 2 +- fs/coredump.c | 2 +- include/linux/mm_types.h | 2 +- kernel/fork.c | 2 +- mm/debug.c | 2 +- mm/mmap.c | 6 +++--- mm/nommu.c | 6 +++--- mm/vma.c | 24 ++++++++++++------------ tools/testing/vma/vma.c | 32 ++++++++++++++++---------------- tools/testing/vma/vma_internal.h | 6 +++--- 10 files changed, 42 insertions(+), 42 deletions(-) diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c index 264fba0d44bd..52449dec12cb 100644 --- a/fs/binfmt_elf.c +++ b/fs/binfmt_elf.c @@ -1643,7 +1643,7 @@ static int fill_files_note(struct memelfnote *note, s= truct coredump_params *cprm data[0] =3D count; data[1] =3D PAGE_SIZE; /* - * Count usually is less than mm->map_count, + * Count usually is less than mm->vma_count, * we need to move filenames down. */ n =3D cprm->vma_count - count; diff --git a/fs/coredump.c b/fs/coredump.c index 60bc9685e149..8881459c53d9 100644 --- a/fs/coredump.c +++ b/fs/coredump.c @@ -1731,7 +1731,7 @@ static bool dump_vma_snapshot(struct coredump_params = *cprm) cprm->vma_data_size =3D 0; gate_vma =3D get_gate_vma(mm); - cprm->vma_count =3D mm->map_count + (gate_vma ? 1 : 0); + cprm->vma_count =3D mm->vma_count + (gate_vma ? 
1 : 0); cprm->vma_meta =3D kvmalloc_array(cprm->vma_count, sizeof(*cprm->vma_meta= ), GFP_KERNEL); if (!cprm->vma_meta) { diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h index 08bc2442db93..4343be2f9e85 100644 --- a/include/linux/mm_types.h +++ b/include/linux/mm_types.h @@ -1020,7 +1020,7 @@ struct mm_struct { #ifdef CONFIG_MMU atomic_long_t pgtables_bytes; /* size of all page tables */ #endif - int map_count; /* number of VMAs */ + int vma_count; /* number of VMAs */ spinlock_t page_table_lock; /* Protects page tables and some * counters diff --git a/kernel/fork.c b/kernel/fork.c index c4ada32598bd..8fcbbf947579 100644 --- a/kernel/fork.c +++ b/kernel/fork.c @@ -1037,7 +1037,7 @@ static struct mm_struct *mm_init(struct mm_struct *mm= , struct task_struct *p, mmap_init_lock(mm); INIT_LIST_HEAD(&mm->mmlist); mm_pgtables_bytes_init(mm); - mm->map_count =3D 0; + mm->vma_count =3D 0; mm->locked_vm =3D 0; atomic64_set(&mm->pinned_vm, 0); memset(&mm->rss_stat, 0, sizeof(mm->rss_stat)); diff --git a/mm/debug.c b/mm/debug.c index b4388f4dcd4d..40fc9425a84a 100644 --- a/mm/debug.c +++ b/mm/debug.c @@ -204,7 +204,7 @@ void dump_mm(const struct mm_struct *mm) mm->pgd, atomic_read(&mm->mm_users), atomic_read(&mm->mm_count), mm_pgtables_bytes(mm), - mm->map_count, + mm->vma_count, mm->hiwater_rss, mm->hiwater_vm, mm->total_vm, mm->locked_vm, (u64)atomic64_read(&mm->pinned_vm), mm->data_vm, mm->exec_vm, mm->stack_vm, diff --git a/mm/mmap.c b/mm/mmap.c index af88ce1fbb5f..c6769394a174 100644 --- a/mm/mmap.c +++ b/mm/mmap.c @@ -1308,7 +1308,7 @@ void exit_mmap(struct mm_struct *mm) vma =3D vma_next(&vmi); } while (vma && likely(!xa_is_zero(vma))); =20 - BUG_ON(count !=3D mm->map_count); + BUG_ON(count !=3D mm->vma_count); =20 trace_exit_mmap(mm); destroy: @@ -1517,7 +1517,7 @@ static int sysctl_max_map_count __read_mostly =3D DEF= AULT_MAX_MAP_COUNT; */ int vma_count_remaining(const struct mm_struct *mm) { - const int map_count =3D mm->map_count; + const int 
map_count =3D mm->vma_count; const int max_count =3D sysctl_max_map_count; =20 return (max_count > map_count) ? (max_count - map_count) : 0; @@ -1828,7 +1828,7 @@ __latent_entropy int dup_mmap(struct mm_struct *mm, s= truct mm_struct *oldmm) */ vma_iter_bulk_store(&vmi, tmp); =20 - mm->map_count++; + mm->vma_count++; =20 if (tmp->vm_ops && tmp->vm_ops->open) tmp->vm_ops->open(tmp); diff --git a/mm/nommu.c b/mm/nommu.c index dd75f2334812..9ab2e5ca736d 100644 --- a/mm/nommu.c +++ b/mm/nommu.c @@ -576,7 +576,7 @@ static void setup_vma_to_mm(struct vm_area_struct *vma,= struct mm_struct *mm) =20 static void cleanup_vma_from_mm(struct vm_area_struct *vma) { - vma->vm_mm->map_count--; + vma->vm_mm->vma_count--; /* remove the VMA from the mapping */ if (vma->vm_file) { struct address_space *mapping; @@ -1198,7 +1198,7 @@ unsigned long do_mmap(struct file *file, goto error_just_free; =20 setup_vma_to_mm(vma, current->mm); - current->mm->map_count++; + current->mm->vma_count++; /* add the VMA to the tree */ vma_iter_store_new(&vmi, vma); =20 @@ -1366,7 +1366,7 @@ static int split_vma(struct vma_iterator *vmi, struct= vm_area_struct *vma, setup_vma_to_mm(vma, mm); setup_vma_to_mm(new, mm); vma_iter_store_new(vmi, new); - mm->map_count++; + mm->vma_count++; return 0; =20 err_vmi_preallocate: diff --git a/mm/vma.c b/mm/vma.c index df0e8409f63d..64f4e7c867c3 100644 --- a/mm/vma.c +++ b/mm/vma.c @@ -352,7 +352,7 @@ static void vma_complete(struct vma_prepare *vp, struct= vma_iterator *vmi, * (it may either follow vma or precede it). 
*/ vma_iter_store_new(vmi, vp->insert); - mm->map_count++; + mm->vma_count++; } =20 if (vp->anon_vma) { @@ -383,7 +383,7 @@ static void vma_complete(struct vma_prepare *vp, struct= vma_iterator *vmi, } if (vp->remove->anon_vma) anon_vma_merge(vp->vma, vp->remove); - mm->map_count--; + mm->vma_count--; mpol_put(vma_policy(vp->remove)); if (!vp->remove2) WARN_ON_ONCE(vp->vma->vm_end < vp->remove->vm_end); @@ -683,13 +683,13 @@ void validate_mm(struct mm_struct *mm) } #endif /* Check for a infinite loop */ - if (++i > mm->map_count + 10) { + if (++i > mm->vma_count + 10) { i =3D -1; break; } } - if (i !=3D mm->map_count) { - pr_emerg("map_count %d vma iterator %d\n", mm->map_count, i); + if (i !=3D mm->vma_count) { + pr_emerg("vma_count %d vma iterator %d\n", mm->vma_count, i); bug =3D 1; } VM_BUG_ON_MM(bug, mm); @@ -1266,7 +1266,7 @@ static void vms_complete_munmap_vmas(struct vma_munma= p_struct *vms, struct mm_struct *mm; =20 mm =3D current->mm; - mm->map_count -=3D vms->vma_count; + mm->vma_count -=3D vms->vma_count; mm->locked_vm -=3D vms->locked_vm; if (vms->unlock) mmap_write_downgrade(mm); @@ -1340,14 +1340,14 @@ static int vms_gather_munmap_vmas(struct vma_munmap= _struct *vms, if (vms->start > vms->vma->vm_start) { /* - * Make sure that map_count on return from munmap() will + * Make sure that vma_count on return from munmap() will * not exceed its limit; but let map_count go just above * its limit temporarily, to help free resources as expected. 
*/ if (vms->end < vms->vma->vm_end && !vma_count_remaining(vms->vma->vm_mm)) { error =3D -ENOMEM; - goto map_count_exceeded; + goto vma_count_exceeded; } =20 /* Don't bother splitting the VMA if we can't unmap it anyway */ @@ -1461,7 +1461,7 @@ static int vms_gather_munmap_vmas(struct vma_munmap_s= truct *vms, modify_vma_failed: reattach_vmas(mas_detach); start_split_failed: -map_count_exceeded: +vma_count_exceeded: return error; } =20 @@ -1795,7 +1795,7 @@ int vma_link(struct mm_struct *mm, struct vm_area_str= uct *vma) vma_start_write(vma); vma_iter_store_new(&vmi, vma); vma_link_file(vma); - mm->map_count++; + mm->vma_count++; validate_mm(mm); return 0; } @@ -2495,7 +2495,7 @@ static int __mmap_new_vma(struct mmap_state *map, str= uct vm_area_struct **vmap) /* Lock the VMA since it is modified after insertion into VMA tree */ vma_start_write(vma); vma_iter_store_new(vmi, vma); - map->mm->map_count++; + map->mm->vma_count++; vma_link_file(vma); /* @@ -2810,7 +2810,7 @@ int do_brk_flags(struct vma_iterator *vmi, struct vm_= area_struct *vma, if (vma_iter_store_gfp(vmi, vma, GFP_KERNEL)) goto mas_store_fail; - mm->map_count++; + mm->vma_count++; validate_mm(mm); out: perf_event_mmap(vma); diff --git a/tools/testing/vma/vma.c b/tools/testing/vma/vma.c index 656e1c75b711..69fa7d14a6c2 100644 --- a/tools/testing/vma/vma.c +++ b/tools/testing/vma/vma.c @@ -261,7 +261,7 @@ static int cleanup_mm(struct mm_struct *mm, struct vma_= iterator *vmi) } mtree_destroy(&mm->mm_mt); - mm->map_count =3D 0; + mm->vma_count =3D 0; return count; } @@ -500,7 +500,7 @@ static bool test_merge_new(void) INIT_LIST_HEAD(&vma_d->anon_vma_chain); list_add(&dummy_anon_vma_chain_d.same_vma, &vma_d->anon_vma_chain); ASSERT_FALSE(merged); - ASSERT_EQ(mm.map_count, 4); + ASSERT_EQ(mm.vma_count, 4); /* * Merge BOTH sides. 
@@ -519,7 +519,7 @@ static bool test_merge_new(void) ASSERT_EQ(vma->vm_pgoff, 0); ASSERT_EQ(vma->anon_vma, &dummy_anon_vma); ASSERT_TRUE(vma_write_started(vma)); - ASSERT_EQ(mm.map_count, 3); + ASSERT_EQ(mm.vma_count, 3); /* * Merge to PREVIOUS VMA. @@ -536,7 +536,7 @@ static bool test_merge_new(void) ASSERT_EQ(vma->vm_pgoff, 0); ASSERT_EQ(vma->anon_vma, &dummy_anon_vma); ASSERT_TRUE(vma_write_started(vma)); - ASSERT_EQ(mm.map_count, 3); + ASSERT_EQ(mm.vma_count, 3); /* * Merge to NEXT VMA. @@ -555,7 +555,7 @@ static bool test_merge_new(void) ASSERT_EQ(vma->vm_pgoff, 6); ASSERT_EQ(vma->anon_vma, &dummy_anon_vma); ASSERT_TRUE(vma_write_started(vma)); - ASSERT_EQ(mm.map_count, 3); + ASSERT_EQ(mm.vma_count, 3); /* * Merge BOTH sides. @@ -573,7 +573,7 @@ static bool test_merge_new(void) ASSERT_EQ(vma->vm_pgoff, 0); ASSERT_EQ(vma->anon_vma, &dummy_anon_vma); ASSERT_TRUE(vma_write_started(vma)); - ASSERT_EQ(mm.map_count, 2); + ASSERT_EQ(mm.vma_count, 2); /* * Merge to NEXT VMA. @@ -591,7 +591,7 @@ static bool test_merge_new(void) ASSERT_EQ(vma->vm_pgoff, 0xa); ASSERT_EQ(vma->anon_vma, &dummy_anon_vma); ASSERT_TRUE(vma_write_started(vma)); - ASSERT_EQ(mm.map_count, 2); + ASSERT_EQ(mm.vma_count, 2); /* * Merge BOTH sides. @@ -608,7 +608,7 @@ static bool test_merge_new(void) ASSERT_EQ(vma->vm_pgoff, 0); ASSERT_EQ(vma->anon_vma, &dummy_anon_vma); ASSERT_TRUE(vma_write_started(vma)); - ASSERT_EQ(mm.map_count, 1); + ASSERT_EQ(mm.vma_count, 1); /* * Final state. @@ -967,7 +967,7 @@ static bool test_vma_merge_new_with_close(void) ASSERT_EQ(vma->vm_pgoff, 0); ASSERT_EQ(vma->vm_ops, &vm_ops); ASSERT_TRUE(vma_write_started(vma)); - ASSERT_EQ(mm.map_count, 2); + ASSERT_EQ(mm.vma_count, 2); cleanup_mm(&mm, &vmi); return true; @@ -1017,7 +1017,7 @@ static bool test_merge_existing(void) ASSERT_EQ(vma->vm_pgoff, 2); ASSERT_TRUE(vma_write_started(vma)); ASSERT_TRUE(vma_write_started(vma_next)); - ASSERT_EQ(mm.map_count, 2); + ASSERT_EQ(mm.vma_count, 2); /* Clear down and reset. 
*/ ASSERT_EQ(cleanup_mm(&mm, &vmi), 2); @@ -1045,7 +1045,7 @@ static bool test_merge_existing(void) ASSERT_EQ(vma_next->vm_pgoff, 2); ASSERT_EQ(vma_next->anon_vma, &dummy_anon_vma); ASSERT_TRUE(vma_write_started(vma_next)); - ASSERT_EQ(mm.map_count, 1); + ASSERT_EQ(mm.vma_count, 1); /* Clear down and reset. We should have deleted vma. */ ASSERT_EQ(cleanup_mm(&mm, &vmi), 1); @@ -1079,7 +1079,7 @@ static bool test_merge_existing(void) ASSERT_EQ(vma->vm_pgoff, 6); ASSERT_TRUE(vma_write_started(vma_prev)); ASSERT_TRUE(vma_write_started(vma)); - ASSERT_EQ(mm.map_count, 2); + ASSERT_EQ(mm.vma_count, 2); /* Clear down and reset. */ ASSERT_EQ(cleanup_mm(&mm, &vmi), 2); @@ -1108,7 +1108,7 @@ static bool test_merge_existing(void) ASSERT_EQ(vma_prev->vm_pgoff, 0); ASSERT_EQ(vma_prev->anon_vma, &dummy_anon_vma); ASSERT_TRUE(vma_write_started(vma_prev)); - ASSERT_EQ(mm.map_count, 1); + ASSERT_EQ(mm.vma_count, 1); /* Clear down and reset. We should have deleted vma. */ ASSERT_EQ(cleanup_mm(&mm, &vmi), 1); @@ -1138,7 +1138,7 @@ static bool test_merge_existing(void) ASSERT_EQ(vma_prev->vm_pgoff, 0); ASSERT_EQ(vma_prev->anon_vma, &dummy_anon_vma); ASSERT_TRUE(vma_write_started(vma_prev)); - ASSERT_EQ(mm.map_count, 1); + ASSERT_EQ(mm.vma_count, 1); /* Clear down and reset. We should have deleted prev and next. 
*/ ASSERT_EQ(cleanup_mm(&mm, &vmi), 1); @@ -1540,7 +1540,7 @@ static bool test_merge_extend(void) ASSERT_EQ(vma->vm_end, 0x4000); ASSERT_EQ(vma->vm_pgoff, 0); ASSERT_TRUE(vma_write_started(vma)); - ASSERT_EQ(mm.map_count, 1); + ASSERT_EQ(mm.vma_count, 1); cleanup_mm(&mm, &vmi); return true; @@ -1652,7 +1652,7 @@ static bool test_mmap_region_basic(void) 0x24d, NULL); ASSERT_EQ(addr, 0x24d000); - ASSERT_EQ(mm.map_count, 2); + ASSERT_EQ(mm.vma_count, 2); for_each_vma(vmi, vma) { if (vma->vm_start =3D=3D 0x300000) { diff --git a/tools/testing/vma/vma_internal.h b/tools/testing/vma/vma_inter= nal.h index 52cd7ddc73f4..15525b86145d 100644 --- a/tools/testing/vma/vma_internal.h +++ b/tools/testing/vma/vma_internal.h @@ -251,7 +251,7 @@ struct mutex {}; struct mm_struct { struct maple_tree mm_mt; - int map_count; /* number of VMAs */ + int vma_count; /* number of VMAs */ unsigned long total_vm; /* Total pages mapped */ unsigned long locked_vm; /* Pages that have PG_mlocked set */ unsigned long data_vm; /* VM_WRITE & ~VM_SHARED & ~VM_STACK */ @@ -1520,10 +1520,10 @@ static inline vm_flags_t ksm_vma_flags(const struct= mm_struct *, const struct fi /* Helper to get VMA count capacity */ static int vma_count_remaining(const struct mm_struct *mm) { - const int map_count =3D mm->map_count; + const int vma_count =3D mm->vma_count; const int max_count =3D sysctl_max_map_count; - return (max_count > map_count) ? (max_count - map_count) : 0; + return (max_count > vma_count) ? 
(max_count - vma_count) : 0; } #endif /* __MM_VMA_INTERNAL_H */ --=20 2.51.0.384.g4c02a37b29-goog From nobody Thu Oct 2 14:25:58 2025 Received: from mail-pg1-f201.google.com (mail-pg1-f201.google.com [209.85.215.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 285FC29A30D for ; Mon, 15 Sep 2025 16:46:55 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.215.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1757954817; cv=none; b=PyVFvG7fYkB1ZBNsSxTFp9ZjIU9Y8rQ/YTZYvYokTT2YKMsTbO6hdfhZ8pyp7qNqFJDOVcaGIg5H5apqDEUlMXFzVW98+lKat+Ximx1fRKw8B5L3W5VbpdZmYTnwRVRlfUqKUKbHr/d7eJ5Hfrmbl0sl1FyrY0ua8adB7UHCwwU= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1757954817; c=relaxed/simple; bh=mxgSbuM3vTGaiZBQ17HgKAaGLPJx9xqthCVW+1kVVoQ=; h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type; b=nRp7skZwaMUzxk88WJi7YgOmdHEIJLzDSTpDlriKcP8fVp8xX5u4NkfmvOTRTuI87FtG+igBMRzfoKU4kG7q96S2ZipllKkcvb9nXwkpAo3M3zI0GEZjoFpe/jlQRCekVOO+r3INtahuD1A9BSBEEuJhwteEw3oC4hWTfq/NAW0= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--kaleshsingh.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=LUADRP39; arc=none smtp.client-ip=209.85.215.201 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--kaleshsingh.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="LUADRP39" Received: by mail-pg1-f201.google.com with SMTP id 41be03b00d2f7-b4fb25c2e56so2988054a12.3 for ; Mon, 15 Sep 2025 09:46:55 -0700 (PDT) 
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1757954814; x=1758559614; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=SbJK8pWy0CZE0crqWikRoe+l2baB0Rx7jlgQSj60pLY=; b=LUADRP39biqSf8B0SAW3o65iokIzW92ZpxTMwpe6NSdZohY2MnahdT5A3GLorR8Xg1 KLmJ5EFJkLu75AW6Qhmbr3H6iFZ6FrvKVNPAaCbory4+DDebrUf2hLvg750OVAA4uRAo 7Bn6QPs8w9NVJq8J10Nx7fHJHm8XybkL7FHuzSAHosanrAXqc9FtsU862jxhUWR7Ao61 PHN7bU6Ah6uwG8YWcEssZusKB1mXeGdS4Elx9oia0O/NONZl7fyMVpK3YRpL7Lxzwtk4 q8UPl58h1/LpPKKlRC4eSEfWS99GPG9tmok1p3QSfV3bCsbKprI52gSb97GKe1+4r3J8 J8cg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1757954814; x=1758559614; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=SbJK8pWy0CZE0crqWikRoe+l2baB0Rx7jlgQSj60pLY=; b=iGMrC5yzjHM0hQsWC/T6en/rb4uczWKwnhNIShudANnAzTWgDfJEEYCDZg+ATjQ2Wh iJoQgn0CI87jfiHeQCqxKx3eKNanijhSG0rwtzvHNnZxQahutxWk2T/fqe7rbnd/M5H9 3tI23giDLXBpeA7PC4tYpEdwvwMmb4wRNJJvlu+xk2ICK9FBjBINeX4u1DkO1hSz6fMa dTUfR+bANkvClrvPw+HbKo1yTRmB2XFx0geetWmU5BGDfwCRI3qgPdN/f3BaH9iK2y5v Wm5F+Gxhb97cm1ASZkt9/R7TgIxLFuU7IMkxOSnPAXpOgP8J8oz7Tebc9lq70PaUnGvR iOiQ== X-Forwarded-Encrypted: i=1; AJvYcCVoWmDqWPAYro94prwFFfKo1EOKTfoQtt2EMoJ2TR3kYtO6hPO5ulT/15wBzFnjLafwC4OefWKjxuUBKj4=@vger.kernel.org X-Gm-Message-State: AOJu0YyTvqDMOSUDdCJMNdKuRMiw9cIplY2vOphrtjr+uNqj/0QW6rFQ zIZgGL79Lbvbe8bXK+TLltfnN354IWx3G2bhvWgUmeJMHPcV3ZBpTuBjR3du3mxeZrEqThTp9A9 zsVTMT1ovGDW8JNSmlBs4/EWktg== X-Google-Smtp-Source: AGHT+IFj2fssKhkti4Yi38ezbH/JI+2/57SQchOc8iv6dsxeJ23DL0meIXHXVVhUOdfHhPdiopD3vNalm3MQQmXFcA== X-Received: from pleg2.prod.google.com ([2002:a17:902:e382:b0:25d:f53e:e5a4]) (user=kaleshsingh job=prod-delivery.src-stubby-dispatcher) by 2002:a17:903:1b4c:b0:24c:b2a4:7089 with SMTP id d9443c01a7336-25d26077175mr169658175ad.31.1757954814330; Mon, 15 Sep 2025 
09:46:54 -0700 (PDT) Date: Mon, 15 Sep 2025 09:36:36 -0700 In-Reply-To: <20250915163838.631445-1-kaleshsingh@google.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20250915163838.631445-1-kaleshsingh@google.com> X-Mailer: git-send-email 2.51.0.384.g4c02a37b29-goog Message-ID: <20250915163838.631445-6-kaleshsingh@google.com> Subject: [PATCH v2 5/7] mm: harden vma_count against direct modification From: Kalesh Singh To: akpm@linux-foundation.org, minchan@kernel.org, lorenzo.stoakes@oracle.com, david@redhat.com, Liam.Howlett@oracle.com, rppt@kernel.org, pfalcato@suse.de Cc: kernel-team@android.com, android-mm@google.com, Kalesh Singh , Alexander Viro , Christian Brauner , Jan Kara , Kees Cook , Vlastimil Babka , Suren Baghdasaryan , Michal Hocko , Steven Rostedt , Masami Hiramatsu , Mathieu Desnoyers , Ingo Molnar , Peter Zijlstra , Juri Lelli , Vincent Guittot , Dietmar Eggemann , Ben Segall , Mel Gorman , Valentin Schneider , Jann Horn , Shuah Khan , linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-trace-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" To make VMA counting more robust, prevent direct modification of the mm->vma_count field. This is achieved by making the public-facing member const via a union and requiring all modifications to go through a new set of helper functions that operate on a private __vma_count. While there are no other invariants tied to vma_count currently, this structural change improves maintainability, as it creates a single, centralized point for any future logic, such as adding debug checks or updating related statistics (in subsequent patches). Cc: Andrew Morton Cc: David Hildenbrand Cc: "Liam R.
Howlett" Cc: Lorenzo Stoakes Cc: Mike Rapoport Cc: Minchan Kim Cc: Pedro Falcato Signed-off-by: Kalesh Singh --- include/linux/mm.h | 25 +++++++++++++++++++++++++ include/linux/mm_types.h | 5 ++++- kernel/fork.c | 2 +- mm/mmap.c | 2 +- mm/vma.c | 12 ++++++------ tools/testing/vma/vma.c | 2 +- tools/testing/vma/vma_internal.h | 30 +++++++++++++++++++++++++++++- 7 files changed, 67 insertions(+), 11 deletions(-) diff --git a/include/linux/mm.h b/include/linux/mm.h index 138bab2988f8..8bad1454984c 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -4219,4 +4219,29 @@ static inline bool snapshot_page_is_faithful(const s= truct page_snapshot *ps) void snapshot_page(struct page_snapshot *ps, const struct page *page); +static inline void vma_count_init(struct mm_struct *mm) +{ + ACCESS_PRIVATE(mm, __vma_count) =3D 0; +} + +static inline void vma_count_add(struct mm_struct *mm, int nr_vmas) +{ + ACCESS_PRIVATE(mm, __vma_count) +=3D nr_vmas; +} + +static inline void vma_count_sub(struct mm_struct *mm, int nr_vmas) +{ + vma_count_add(mm, -nr_vmas); +} + +static inline void vma_count_inc(struct mm_struct *mm) +{ + vma_count_add(mm, 1); +} + +static inline void vma_count_dec(struct mm_struct *mm) +{ + vma_count_sub(mm, 1); +} + #endif /* _LINUX_MM_H */ diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h index 4343be2f9e85..2ea8fc722aa2 100644 --- a/include/linux/mm_types.h +++ b/include/linux/mm_types.h @@ -1020,7 +1020,10 @@ struct mm_struct { #ifdef CONFIG_MMU atomic_long_t pgtables_bytes; /* size of all page tables */ #endif - int vma_count; /* number of VMAs */ + union { + const int vma_count; /* number of VMAs */ + int __private __vma_count; + }; spinlock_t page_table_lock; /* Protects page tables and some * counters diff --git a/kernel/fork.c b/kernel/fork.c index 8fcbbf947579..ea9eff416e51 100644 --- a/kernel/fork.c +++ b/kernel/fork.c @@ -1037,7 +1037,7 @@ static struct mm_struct *mm_init(struct mm_struct *mm= , struct task_struct *p, 
mmap_init_lock(mm); INIT_LIST_HEAD(&mm->mmlist); mm_pgtables_bytes_init(mm); - mm->vma_count =3D 0; + vma_count_init(mm); mm->locked_vm =3D 0; atomic64_set(&mm->pinned_vm, 0); memset(&mm->rss_stat, 0, sizeof(mm->rss_stat)); diff --git a/mm/mmap.c b/mm/mmap.c index c6769394a174..30ddd550197e 100644 --- a/mm/mmap.c +++ b/mm/mmap.c @@ -1828,7 +1828,7 @@ __latent_entropy int dup_mmap(struct mm_struct *mm, s= truct mm_struct *oldmm) */ vma_iter_bulk_store(&vmi, tmp); =20 - mm->vma_count++; + vma_count_inc(mm); =20 if (tmp->vm_ops && tmp->vm_ops->open) tmp->vm_ops->open(tmp); diff --git a/mm/vma.c b/mm/vma.c index 64f4e7c867c3..0cd3cb472220 100644 --- a/mm/vma.c +++ b/mm/vma.c @@ -352,7 +352,7 @@ static void vma_complete(struct vma_prepare *vp, struct= vma_iterator *vmi, * (it may either follow vma or precede it). */ vma_iter_store_new(vmi, vp->insert); - mm->vma_count++; + vma_count_inc(mm); } =20 if (vp->anon_vma) { @@ -383,7 +383,7 @@ static void vma_complete(struct vma_prepare *vp, struct= vma_iterator *vmi, } if (vp->remove->anon_vma) anon_vma_merge(vp->vma, vp->remove); - mm->vma_count--; + vma_count_dec(mm); mpol_put(vma_policy(vp->remove)); if (!vp->remove2) WARN_ON_ONCE(vp->vma->vm_end < vp->remove->vm_end); @@ -1266,7 +1266,7 @@ static void vms_complete_munmap_vmas(struct vma_munma= p_struct *vms, struct mm_struct *mm; =20 mm =3D current->mm; - mm->vma_count -=3D vms->vma_count; + vma_count_sub(mm, vms->vma_count); mm->locked_vm -=3D vms->locked_vm; if (vms->unlock) mmap_write_downgrade(mm); @@ -1795,7 +1795,7 @@ int vma_link(struct mm_struct *mm, struct vm_area_str= uct *vma) vma_start_write(vma); vma_iter_store_new(&vmi, vma); vma_link_file(vma); - mm->vma_count++; + vma_count_inc(mm); validate_mm(mm); return 0; } @@ -2495,7 +2495,7 @@ static int __mmap_new_vma(struct mmap_state *map, str= uct vm_area_struct **vmap) /* Lock the VMA since it is modified after insertion into VMA tree */ vma_start_write(vma); vma_iter_store_new(vmi, vma); - map->mm->vma_count++; 
+ vma_count_inc(map->mm); vma_link_file(vma); =20 /* @@ -2810,7 +2810,7 @@ int do_brk_flags(struct vma_iterator *vmi, struct vm_= area_struct *vma, if (vma_iter_store_gfp(vmi, vma, GFP_KERNEL)) goto mas_store_fail; =20 - mm->vma_count++; + vma_count_inc(mm); validate_mm(mm); out: perf_event_mmap(vma); diff --git a/tools/testing/vma/vma.c b/tools/testing/vma/vma.c index 69fa7d14a6c2..ee5a1e2365e0 100644 --- a/tools/testing/vma/vma.c +++ b/tools/testing/vma/vma.c @@ -261,7 +261,7 @@ static int cleanup_mm(struct mm_struct *mm, struct vma_= iterator *vmi) } =20 mtree_destroy(&mm->mm_mt); - mm->vma_count =3D 0; + vma_count_init(mm); return count; } =20 diff --git a/tools/testing/vma/vma_internal.h b/tools/testing/vma/vma_inter= nal.h index 15525b86145d..6e724ba1adf4 100644 --- a/tools/testing/vma/vma_internal.h +++ b/tools/testing/vma/vma_internal.h @@ -251,7 +251,10 @@ struct mutex {}; =20 struct mm_struct { struct maple_tree mm_mt; - int vma_count; /* number of VMAs */ + union { + const int vma_count; /* number of VMAs */ + int __vma_count; + }; unsigned long total_vm; /* Total pages mapped */ unsigned long locked_vm; /* Pages that have PG_mlocked set */ unsigned long data_vm; /* VM_WRITE & ~VM_SHARED & ~VM_STACK */ @@ -1526,4 +1529,29 @@ static int vma_count_remaining(const struct mm_struc= t *mm) return (max_count > vma_count) ? 
(max_count - vma_count) : 0; } =20 +static inline void vma_count_init(struct mm_struct *mm) +{ + mm->__vma_count =3D 0; +} + +static inline void vma_count_add(struct mm_struct *mm, int nr_vmas) +{ + mm->__vma_count +=3D nr_vmas; +} + +static inline void vma_count_sub(struct mm_struct *mm, int nr_vmas) +{ + vma_count_add(mm, -nr_vmas); +} + +static inline void vma_count_inc(struct mm_struct *mm) +{ + vma_count_add(mm, 1); +} + +static inline void vma_count_dec(struct mm_struct *mm) +{ + vma_count_sub(mm, 1); +} + #endif /* __MM_VMA_INTERNAL_H */ --=20 2.51.0.384.g4c02a37b29-goog From nobody Thu Oct 2 14:25:58 2025 Received: from mail-pl1-f201.google.com (mail-pl1-f201.google.com [209.85.214.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id DE2E3271476 for ; Mon, 15 Sep 2025 16:47:10 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.214.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1757954832; cv=none; b=U0GSK8Ot3YyuAFPip4IDIADnVU8o7itMgrqCjP8nzZqUny1SEyDg+leQCUVUuADBf8rpn5H44sjTDQmGn/m1IILScN9dJUe4ZPFSvzBltpDzm1aAZzKd764Kiuiqitg3tf5xaYc4wITlOnGLno7XkRMeDULISdMlfSgZgc0OuXI= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1757954832; c=relaxed/simple; bh=pvCfH5VsfNwVFReZuZTk36FRtKutRSyx+F6VGLxE6kM=; h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type; b=XbJiREHJnrSGdAHshOmerCQZYEQkB/v7HFPDh/SXO2JHMJMmBaHyEkXBuC3jyoW8u9Bu27r9F0xUKaEZeoQDL5zP1exbPHk2hfwfSSkrERexssnHR1vuIClYC0piUkicacXhvg906xw5g0CqhSdEpqu6poPSK8kGiGrOULxNvsU= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--kaleshsingh.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=gVZCAX6H; arc=none smtp.client-ip=209.85.214.201 
Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--kaleshsingh.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="gVZCAX6H" Received: by mail-pl1-f201.google.com with SMTP id d9443c01a7336-24458274406so86441945ad.3 for ; Mon, 15 Sep 2025 09:47:10 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1757954830; x=1758559630; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=hKwpABqpohuFqzMZ59NYGHVdaV3v0Z5awKHv0UgL8Ps=; b=gVZCAX6HJOPfCYrrcb8VP55QLCj2yQ1s6oyBe+OSHPAbX1XUOkOkyCYncQ+BeHO6qm H23YxZ9/AY0EpPXOgT3RmEvMMfhqFAuILCEebdYBFRECRLnNO4+KVkmdLYqBktY9RTxc 2kxkNEycjsVQBB4QftNiRhNmGSU3FKvjuInNXx76ApzEKe/jfj2uwrrS8dCoAUaDAcvZ 67P5JmI8cglRupSmoe3FLBdcqm7kKkhO6bbY9y7jf7RBuLFinschXcVgTz19dnHD2aZ2 yhNKpF/eDBgyD0ajOzj4ojTL0Tn6eTEr6AShmYxf7RcLa1UYYFgXcYjjCD2d9ZYSHj3y zI+A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1757954830; x=1758559630; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=hKwpABqpohuFqzMZ59NYGHVdaV3v0Z5awKHv0UgL8Ps=; b=c2bel8RMnCVIOVvk9HDDw93KXfGzBs79VMSwlszAiNKFC9vcEZiTfm7PQTwCmNfGHG ki6x38N6269wtazG0fLlRmCq7GVH7UrhGs0UyqFdWlMD+spJEz2shGxur8NTkR5J2EIl y9g7NUjh6yAWupFP1VN9TrjXRxtuEAR+tj/t87hu/8vULWbVqNAzPAR2PTmSfRr/WbvC Z/A7ql64SaU4ucqtfzEJUFKhnwBg6JcJ1DNbbpRvVXi/PTGETf2xW7puJUlHQAwQsetY gkMtg8woxDkYv/8lqJJfEBcx3woxAVhArDeZZTt3MasMzGRPWDXAnWUHTojqWEYSgiyF /ReQ== X-Forwarded-Encrypted: i=1; AJvYcCXtYG7+MyDE6JYhbtulzRY4ozgbcN57VrdwmFvXHooyiJSpYQneEp18wlqDVs4nDcA33tic13RuNHjnrgI=@vger.kernel.org X-Gm-Message-State: AOJu0YwYaGecMpX0WN80J4n0ywHGw4p7xQNPbNNZXAKJz6dfmFliPx9z 
ZPNdhfkSfPOcJyfkcZmxZJGGBR8CVoQTMNE89mSZPDW0ZoxWeuy83shet+tR+VTYOx4+ftjMzAj XwAH5z550Vgk8EjA0O2h1P4GQ8A== X-Google-Smtp-Source: AGHT+IHxs26EIdlIXz631Uz365/CGN6f9O6Xm7lLrsiaFhIbp7CDzVSSsE3TgFf9kclQMXvcquwmL00++yYPRtwtYA== X-Received: from plbnb15.prod.google.com ([2002:a17:903:15cf:b0:24c:af07:f077]) (user=kaleshsingh job=prod-delivery.src-stubby-dispatcher) by 2002:a17:903:2448:b0:267:a55a:8684 with SMTP id d9443c01a7336-267a55a8724mr34282065ad.2.1757954829980; Mon, 15 Sep 2025 09:47:09 -0700 (PDT) Date: Mon, 15 Sep 2025 09:36:37 -0700 In-Reply-To: <20250915163838.631445-1-kaleshsingh@google.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20250915163838.631445-1-kaleshsingh@google.com> X-Mailer: git-send-email 2.51.0.384.g4c02a37b29-goog Message-ID: <20250915163838.631445-7-kaleshsingh@google.com> Subject: [PATCH v2 6/7] mm: add assertion for VMA count limit From: Kalesh Singh To: akpm@linux-foundation.org, minchan@kernel.org, lorenzo.stoakes@oracle.com, david@redhat.com, Liam.Howlett@oracle.com, rppt@kernel.org, pfalcato@suse.de Cc: kernel-team@android.com, android-mm@google.com, Kalesh Singh , Alexander Viro , Christian Brauner , Jan Kara , Kees Cook , Vlastimil Babka , Suren Baghdasaryan , Michal Hocko , Steven Rostedt , Masami Hiramatsu , Mathieu Desnoyers , Ingo Molnar , Peter Zijlstra , Juri Lelli , Vincent Guittot , Dietmar Eggemann , Ben Segall , Mel Gorman , Valentin Schneider , Jann Horn , Shuah Khan , linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-trace-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Building on the vma_count helpers, add a VM_WARN_ON_ONCE() to detect cases where the VMA count exceeds the sysctl_max_map_count limit. 
This check will help catch future bugs or regressions in which VMAs are allocated beyond the limit. The warning is placed in the main vma_count_*() helpers, while the internal *_nocheck variants bypass it. The _nocheck helpers ensure that the assertion does not trigger a false positive in the legitimate case where a VMA split in munmap() temporarily pushes the VMA count past the limit. Cc: Andrew Morton Cc: David Hildenbrand Cc: "Liam R. Howlett" Cc: Lorenzo Stoakes Cc: Mike Rapoport Cc: Minchan Kim Cc: Pedro Falcato Signed-off-by: Kalesh Singh --- Changes in v2: - Add assertions for exceeding the max_vma_count limit, per Pedro include/linux/mm.h | 12 ++++++-- mm/internal.h | 1 - mm/vma.c | 49 +++++++++++++++++++++++++------- tools/testing/vma/vma_internal.h | 7 ++++- 4 files changed, 55 insertions(+), 14 deletions(-) diff --git a/include/linux/mm.h b/include/linux/mm.h index 8bad1454984c..3a3749d7015c 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -4219,19 +4219,27 @@ static inline bool snapshot_page_is_faithful(const = struct page_snapshot *ps) =20 void snapshot_page(struct page_snapshot *ps, const struct page *page); =20 +int vma_count_remaining(const struct mm_struct *mm); + static inline void vma_count_init(struct mm_struct *mm) { ACCESS_PRIVATE(mm, __vma_count) =3D 0; } =20 -static inline void vma_count_add(struct mm_struct *mm, int nr_vmas) +static inline void __vma_count_add_nocheck(struct mm_struct *mm, int nr_vm= as) { ACCESS_PRIVATE(mm, __vma_count) +=3D nr_vmas; } =20 +static inline void vma_count_add(struct mm_struct *mm, int nr_vmas) +{ + VM_WARN_ON_ONCE(!vma_count_remaining(mm)); + __vma_count_add_nocheck(mm, nr_vmas); +} + static inline void vma_count_sub(struct mm_struct *mm, int nr_vmas) { - vma_count_add(mm, -nr_vmas); + __vma_count_add_nocheck(mm, -nr_vmas); } =20 static inline void vma_count_inc(struct mm_struct *mm) diff --git a/mm/internal.h b/mm/internal.h index 39f1c9535ae5..e0567a3b64fa 100644 --- a/mm/internal.h +++
b/mm/internal.h @@ -1661,6 +1661,5 @@ static inline bool reclaim_pt_is_enabled(unsigned lon= g start, unsigned long end, void dup_mm_exe_file(struct mm_struct *mm, struct mm_struct *oldmm); int dup_mmap(struct mm_struct *mm, struct mm_struct *oldmm); =20 -int vma_count_remaining(const struct mm_struct *mm); =20 #endif /* __MM_INTERNAL_H */ diff --git a/mm/vma.c b/mm/vma.c index 0cd3cb472220..0e4fcaebe209 100644 --- a/mm/vma.c +++ b/mm/vma.c @@ -323,15 +323,17 @@ static void vma_prepare(struct vma_prepare *vp) } =20 /* - * vma_complete- Helper function for handling the unlocking after altering= VMAs, - * or for inserting a VMA. + * This is the internal, unsafe version of vma_complete(). Unlike its + * wrapper, this function bypasses runtime checks for VMA count limits by + * using the _nocheck vma_count* helpers. * - * @vp: The vma_prepare struct - * @vmi: The vma iterator - * @mm: The mm_struct + * Its use is restricted to __split_vma() where the VMA count can be + * temporarily higher than the sysctl_max_map_count limit. + * + * All other callers must use vma_complete(). */ -static void vma_complete(struct vma_prepare *vp, struct vma_iterator *vmi, - struct mm_struct *mm) +static void __vma_complete(struct vma_prepare *vp, struct vma_iterator *vm= i, + struct mm_struct *mm) { if (vp->file) { if (vp->adj_next) @@ -352,7 +354,11 @@ static void vma_complete(struct vma_prepare *vp, struc= t vma_iterator *vmi, * (it may either follow vma or precede it). */ vma_iter_store_new(vmi, vp->insert); - vma_count_inc(mm); + /* + * Explicitly allow vma_count to exceed the threshold to prevent, + * blocking munmap() freeing resources. + */ + __vma_count_add_nocheck(mm, 1); } =20 if (vp->anon_vma) { @@ -403,6 +409,26 @@ static void vma_complete(struct vma_prepare *vp, struc= t vma_iterator *vmi, uprobe_mmap(vp->insert); } =20 +/* + * vma_complete- Helper function for handling the unlocking after altering= VMAs, + * or for inserting a VMA. 
+ * + * @vp: The vma_prepare struct + * @vmi: The vma iterator + * @mm: The mm_struct + */ +static void vma_complete(struct vma_prepare *vp, struct vma_iterator *vmi, + struct mm_struct *mm) +{ + /* + * __vma_complete() explicitly foregoes checking the new + * vma_count against the sysctl_max_map_count limit, so + * do it here. + */ + VM_WARN_ON_ONCE(!vma_count_remaining(mm)); + __vma_complete(vp, vmi, mm); +} + /* * init_vma_prep() - Initializer wrapper for vma_prepare struct * @vp: The vma_prepare struct @@ -564,8 +590,11 @@ __split_vma(struct vma_iterator *vmi, struct vm_area_s= truct *vma, vma->vm_end =3D addr; } =20 - /* vma_complete stores the new vma */ - vma_complete(&vp, vmi, vma->vm_mm); + /* + * __vma_complete stores the new vma without checking against the + * sysctl_max_map_count (vma_count) limit. + */ + __vma_complete(&vp, vmi, vma->vm_mm); validate_mm(vma->vm_mm); /* Success. */ diff --git a/tools/testing/vma/vma_internal.h b/tools/testing/vma/vma_inter= nal.h index 6e724ba1adf4..d084b1eb2a5c 100644 --- a/tools/testing/vma/vma_internal.h +++ b/tools/testing/vma/vma_internal.h @@ -1534,11 +1534,16 @@ static inline void vma_count_init(struct mm_struct = *mm) mm->__vma_count =3D 0; } -static inline void vma_count_add(struct mm_struct *mm, int nr_vmas) +static inline void __vma_count_add_nocheck(struct mm_struct *mm, int nr_vm= as) { mm->__vma_count +=3D nr_vmas; } +static inline void vma_count_add(struct mm_struct *mm, int nr_vmas) +{ + __vma_count_add_nocheck(mm, nr_vmas); +} + static inline void vma_count_sub(struct mm_struct *mm, int nr_vmas) { vma_count_add(mm, -nr_vmas); --=20 2.51.0.384.g4c02a37b29-goog From nobody Thu Oct 2 14:25:58 2025 Received: from mail-pl1-f201.google.com (mail-pl1-f201.google.com [209.85.214.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 1657627A903 for ; Mon, 15 Sep 2025 16:47:25 +0000 (UTC) 
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.214.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1757954847; cv=none; b=OjzxGcTWMDaIuBhjovPB2LCvDoXDPm72KsRWiTb86y18VCJxWSRGo24hbnPs3lt/8V2B4jfvF9oI4gLiyuvR3Be5GF38WXD7UmJZ6I/vpYrJrGclTiWGuKmV6XuhoSltPAmR8myPSszW6Yle3P5+OGoNXVVYWPjdMB8bGeWu3Ds= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1757954847; c=relaxed/simple; bh=XFR6sya9J2xG2hfxSf1a8GNZ3av4lo6QHIK09umVRAY=; h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type; b=WFOmn3hRyndZljGCwNCei//rUcYIs34yvuM4/y+belRh7C9v3a8/BHRCmSaWZ+qV+5VJ5nmjjcJzJ9N6BVquVRwhnk13gSc1QMy/wgiGnhj9GtnMUTZOVpI8XzK4L5/YJ357+a1yFMXuMrv30Rs6U4X5ezzpiX4wh81gdOxibrg= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--kaleshsingh.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=g4QAmWol; arc=none smtp.client-ip=209.85.214.201 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--kaleshsingh.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="g4QAmWol" Received: by mail-pl1-f201.google.com with SMTP id d9443c01a7336-26166420e5dso20211015ad.3 for ; Mon, 15 Sep 2025 09:47:25 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1757954845; x=1758559645; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=59uH0/Rs24hPqNuUZQXolv8jQt8VsNO5VNNB4W0XCmc=; b=g4QAmWolSYP/Ed9ydbSy+KIA8l92LSjJj+8sG3pzGH5w13aHY+AiLC3d+3C6gOLUSi eyOQL+EixJoK8kycW0C2xNJdBqrMbodeultUPbyJhxugZSLs1rcqX0oLcwQ11h42ZZ3I 
HxI115Q/I55b2pos6MXXkNHm7oquDjII6JxW81cnJddBTRrQ9y/vjLtp3WWSRoQNyBjt JLlzNO7Aj7W569UK/AIqBVgF7o4rJo65wmzm6wYhKO/15ypJXRWi37MD+3OII4iv+RRV ojSpkz8CU3zHn9X5Dpu+AzG7asbpSAGBoBPt9a8LB9Uh10wtd1slV7/dCJiTKv7V7I+o BICQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1757954845; x=1758559645; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=59uH0/Rs24hPqNuUZQXolv8jQt8VsNO5VNNB4W0XCmc=; b=NryI1gntni1HxRTi0tEo/vW3FsHcg/XwFHwzwor+vGPFuIHlE7Y439T6dQ9/L1Ai0h /0EwRpj1Pl30CwOok+HgQChw2nqXUQQpJOT3FVmrlZOsrKJkQFQmrqME2jaN7Lw3XAvp E15FkmMlkDJniXf6QFOgTKOEFrYnPdD44y1/VZObuOVa+GYMCiH1eEDi8ZUruPFRpYcG TfnN3MYzRe2gFXK6lMDFsVC3j84I6EIby5OkSy0P9VKAf5KqpWVHcADAbo2wrbU4K5Md mHsMvGtoBBPibjJL4pvAVdBhBSO+N1LsWTkzE6+TgK3Veq21APVmCDhrti2W1L19TKWp aaXA== X-Forwarded-Encrypted: i=1; AJvYcCUpka0E3tf2GIzPsq6mdfnbzPKrFUaUKeAQMbUJk01+EoGFnPzUvdtAYsW8DPHwG3zfKvD6pUf9kcm3z1k=@vger.kernel.org X-Gm-Message-State: AOJu0YzsI3y4qs05HA9MAsjCJZqDiuArqF4c+UrNqhSssFEJsUjsysZ7 eFTvKtHDOSJl1ua4LPYxJ0hHR/0jm6J7GhdWDe8BIGkO6j2HRuwvUFZYblWczcK5NfmXZUIBY8P 5Wvkjv4jLjnryglDT+QBh1M65mg== X-Google-Smtp-Source: AGHT+IHzc/fMWSswS4eA5+I14M9qXHKcg1jQwLLx5RpIwI8nZ9BGgFMV8dLNy7m9xz0Qp5xRzbGmEY4nirx7Uqh/jg== X-Received: from plhs1.prod.google.com ([2002:a17:903:3201:b0:24c:b6df:675e]) (user=kaleshsingh job=prod-delivery.src-stubby-dispatcher) by 2002:a17:902:d552:b0:260:df70:f753 with SMTP id d9443c01a7336-260df70fbdcmr125334005ad.38.1757954845285; Mon, 15 Sep 2025 09:47:25 -0700 (PDT) Date: Mon, 15 Sep 2025 09:36:38 -0700 In-Reply-To: <20250915163838.631445-1-kaleshsingh@google.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20250915163838.631445-1-kaleshsingh@google.com> X-Mailer: git-send-email 2.51.0.384.g4c02a37b29-goog Message-ID: <20250915163838.631445-8-kaleshsingh@google.com> Subject: [PATCH v2 
7/7] mm/tracing: introduce max_vma_count_exceeded trace event From: Kalesh Singh To: akpm@linux-foundation.org, minchan@kernel.org, lorenzo.stoakes@oracle.com, david@redhat.com, Liam.Howlett@oracle.com, rppt@kernel.org, pfalcato@suse.de Cc: kernel-team@android.com, android-mm@google.com, Kalesh Singh , Alexander Viro , Christian Brauner , Jan Kara , Kees Cook , Vlastimil Babka , Suren Baghdasaryan , Michal Hocko , Steven Rostedt , Masami Hiramatsu , Mathieu Desnoyers , Ingo Molnar , Peter Zijlstra , Juri Lelli , Vincent Guittot , Dietmar Eggemann , Ben Segall , Mel Gorman , Valentin Schneider , Jann Horn , Shuah Khan , linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-trace-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Add a trace event that fires when an operation fails because the VMA count limit has been reached. On in-field devices, this observability can be collected with minimal overhead and can be toggled on and off. Event-driven telemetry can be done with tracepoint BPF programs. The process comm is provided for aggregation across devices, and the tgid enables per-process aggregation per device. This allows for observing the distribution of such problems in the field, to deduce whether there are legitimate bugs or whether a bump to the limit is warranted. Cc: Andrew Morton Cc: David Hildenbrand Cc: "Liam R. Howlett" Cc: Lorenzo Stoakes Cc: Mike Rapoport Cc: Minchan Kim Cc: Pedro Falcato Signed-off-by: Kalesh Singh --- Changes in v2: - Add needed observability for operations failing due to the vma count limit, per Minchan (Since there isn't a common point for debug logging due to the checks being external to the capacity-based vma_count_remaining() helper.
I used a trace event for low overhead and to facilitate event driven telemetry for in field devices) include/trace/events/vma.h | 32 ++++++++++++++++++++++++++++++++ mm/mmap.c | 5 ++++- mm/mremap.c | 10 ++++++++-- mm/vma.c | 11 +++++++++-- 4 files changed, 53 insertions(+), 5 deletions(-) create mode 100644 include/trace/events/vma.h diff --git a/include/trace/events/vma.h b/include/trace/events/vma.h new file mode 100644 index 000000000000..2fed63b0d0a6 --- /dev/null +++ b/include/trace/events/vma.h @@ -0,0 +1,32 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#undef TRACE_SYSTEM +#define TRACE_SYSTEM vma + +#if !defined(_TRACE_VMA_H) || defined(TRACE_HEADER_MULTI_READ) +#define _TRACE_VMA_H + +#include + +TRACE_EVENT(max_vma_count_exceeded, + + TP_PROTO(struct task_struct *task), + + TP_ARGS(task), + + TP_STRUCT__entry( + __string(comm, task->comm) + __field(pid_t, tgid) + ), + + TP_fast_assign( + __assign_str(comm); + __entry->tgid =3D task->tgid; + ), + + TP_printk("comm=3D%s tgid=3D%d", __get_str(comm), __entry->tgid) +); + +#endif /* _TRACE_VMA_H */ + +/* This part must be outside protection */ +#include diff --git a/mm/mmap.c b/mm/mmap.c index 30ddd550197e..0bb311bf48f3 100644 --- a/mm/mmap.c +++ b/mm/mmap.c @@ -56,6 +56,7 @@ =20 #define CREATE_TRACE_POINTS #include +#include =20 #include "internal.h" =20 @@ -374,8 +375,10 @@ unsigned long do_mmap(struct file *file, unsigned long= addr, return -EOVERFLOW; =20 /* Too many mappings? */ - if (!vma_count_remaining(mm)) + if (!vma_count_remaining(mm)) { + trace_max_vma_count_exceeded(current); return -ENOMEM; + } =20 /* * addr is returned from get_unmapped_area, diff --git a/mm/mremap.c b/mm/mremap.c index 14d35d87e89b..f42ac05f0069 100644 --- a/mm/mremap.c +++ b/mm/mremap.c @@ -30,6 +30,8 @@ #include #include =20 +#include + #include "internal.h" =20 /* Classify the kind of remap operation being performed. 
*/ @@ -1040,8 +1042,10 @@ static unsigned long prep_move_vma(struct vma_remap_= struct *vrm) * We'd prefer to avoid failure later on in do_munmap: * which may split one vma into three before unmapping. */ - if (vma_count_remaining(current->mm) < 4) + if (vma_count_remaining(current->mm) < 4) { + trace_max_vma_count_exceeded(current); return -ENOMEM; + } =20 if (vma->vm_ops && vma->vm_ops->may_split) { if (vma->vm_start !=3D old_addr) @@ -1817,8 +1821,10 @@ static unsigned long check_mremap_params(struct vma_= remap_struct *vrm) * the threshold. In other words, is the current map count + 6 at or * below the threshold? Otherwise return -ENOMEM here to be more safe. */ - if (vma_count_remaining(current->mm) < 6) + if (vma_count_remaining(current->mm) < 6) { + trace_max_vma_count_exceeded(current); return -ENOMEM; + } =20 return 0; } diff --git a/mm/vma.c b/mm/vma.c index 0e4fcaebe209..692c33c3e84d 100644 --- a/mm/vma.c +++ b/mm/vma.c @@ -7,6 +7,8 @@ #include "vma_internal.h" #include "vma.h" =20 +#include + struct mmap_state { struct mm_struct *mm; struct vma_iterator *vmi; @@ -621,8 +623,10 @@ __split_vma(struct vma_iterator *vmi, struct vm_area_s= truct *vma, static int split_vma(struct vma_iterator *vmi, struct vm_area_struct *vma, unsigned long addr, int new_below) { - if (!vma_count_remaining(vma->vm_mm)) + if (!vma_count_remaining(vma->vm_mm)) { + trace_max_vma_count_exceeded(current); return -ENOMEM; + } =20 return __split_vma(vmi, vma, addr, new_below); } @@ -1375,6 +1379,7 @@ static int vms_gather_munmap_vmas(struct vma_munmap_s= truct *vms, */ if (vms->end < vms->vma->vm_end && !vma_count_remaining(vms->vma->vm_mm)) { + trace_max_vma_count_exceeded(current); error =3D -ENOMEM; goto vma_count_exceeded; } @@ -2801,8 +2806,10 @@ int do_brk_flags(struct vma_iterator *vmi, struct vm= _area_struct *vma, if (!may_expand_vm(mm, vm_flags, len >> PAGE_SHIFT)) return -ENOMEM; =20 - if (!vma_count_remaining(mm)) + if (!vma_count_remaining(mm)) { + 
trace_max_vma_count_exceeded(current); return -ENOMEM; + } =20 if (security_vm_enough_memory_mm(mm, len >> PAGE_SHIFT)) return -ENOMEM; --=20 2.51.0.384.g4c02a37b29-goog
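As a rough userspace illustration of the helper pattern introduced in patches 5/7 and 6/7 above, the sketch below models the const/private union together with the checked and unchecked add helpers. It is not kernel code and makes several assumptions: struct mm_model, model_max_map_count (set to 4 here), model_warnings and the two demo_*() functions are hypothetical names invented for this example, the warning is modelled as a plain counter rather than VM_WARN_ON_ONCE(), and ACCESS_PRIVATE() is replaced by direct member access, as in tools/testing/vma/vma_internal.h. The helper names are kept the same as in the patches so they can be compared side by side.

```c
#include <assert.h>

/* Hypothetical stand-in for sysctl_max_map_count; 4 keeps the demo short. */
static int model_max_map_count = 4;

/* Hypothetical counter standing in for VM_WARN_ON_ONCE() firing. */
static int model_warnings;

/* Minimal model of the mm_struct change: a const view plus a private alias. */
struct mm_model {
	union {
		const int vma_count;	/* read-only view for all other code */
		int __vma_count;	/* written only by the helpers below */
	};
};

/* Same arithmetic as vma_count_remaining() in the series. */
static int vma_count_remaining(const struct mm_model *mm)
{
	const int vma_count = mm->vma_count;
	const int max_count = model_max_map_count;

	return (max_count > vma_count) ? (max_count - vma_count) : 0;
}

/* Unchecked add: may push the count past the limit on purpose. */
static void __vma_count_add_nocheck(struct mm_model *mm, int nr_vmas)
{
	mm->__vma_count += nr_vmas;
}

/* Checked add: records a warning if no capacity was left before the add. */
static void vma_count_add(struct mm_model *mm, int nr_vmas)
{
	if (!vma_count_remaining(mm))
		model_warnings++;
	__vma_count_add_nocheck(mm, nr_vmas);
}

/* Subtraction never needs the capacity check. */
static void vma_count_sub(struct mm_model *mm, int nr_vmas)
{
	__vma_count_add_nocheck(mm, -nr_vmas);
}

/* Fill the model up to the limit using the checked helper. */
static void fill_to_limit(struct mm_model *mm)
{
	for (int i = 0; i < model_max_map_count; i++)
		vma_count_add(mm, 1);
}

/*
 * Models munmap() of the tail of a VMA while at the limit: the split
 * briefly adds one VMA, then the unmapped part is removed again.
 * Returns the number of warnings recorded.
 */
static int demo_munmap_tail_at_limit(void)
{
	struct mm_model mm = { .__vma_count = 0 };

	model_warnings = 0;
	fill_to_limit(&mm);
	__vma_count_add_nocheck(&mm, 1);	/* split: limit + 1, no warning */
	assert(vma_count_remaining(&mm) == 0);
	vma_count_sub(&mm, 1);			/* unmap: back at the limit */
	return model_warnings;
}

/* Models a buggy caller that adds a VMA at the limit via the checked path. */
static int demo_checked_add_at_limit(void)
{
	struct mm_model mm = { .__vma_count = 0 };

	model_warnings = 0;
	fill_to_limit(&mm);
	vma_count_add(&mm, 1);			/* no capacity left: warns */
	return model_warnings;
}
```

In this model a direct write such as mm.vma_count++ is rejected at compile time because the member is const, which is the purpose of the union. The two demo functions can be called from any main(); the split path is expected to record no warnings, while the checked add at the limit records one.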