From nobody Mon Oct 6 15:17:33 2025 Received: from mail-oo1-f73.google.com (mail-oo1-f73.google.com [209.85.161.73]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 77B4A28936C for ; Sat, 19 Jul 2025 18:29:00 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.161.73 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1752949742; cv=none; b=dQHjN5eAwzwQst0/FHiq/w6Z+p6HSbwDFx786ARuxRt63YHSBd1Yubjfn494+f4BBQtMlfOjmeueE4o8OAC2UOi7dbsGuV65utDq/mr8bcfpg9WXInO2bxIDmqTxmQiDliiNa89B73XbsXD9oN1ET4C+4qqsNmORr2vzuGNvsm8= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1752949742; c=relaxed/simple; bh=hzHgQqgsigKswQz7j9NJVsuZrK+uK91DpG7o5k2BisM=; h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type; b=Zuy+Rsux0g936lLjHPGJBAu7GkEp8cve+yffLTsWPt85DzqhLqY1jX7XFT4EBAo066h11x6mUf4xgAAu8o21YEtKmzaOOrfEFJtQkhQRD5Tkrvtln4CWh0fWp3qFQTPzk8AXCcuXDpp5aOnxo01lUhs6iRK6mWNqIFFSbNFA15I= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--surenb.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=y+Q9Jddb; arc=none smtp.client-ip=209.85.161.73 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--surenb.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="y+Q9Jddb" Received: by mail-oo1-f73.google.com with SMTP id 006d021491bc7-61200aa7525so2737807eaf.3 for ; Sat, 19 Jul 2025 11:29:00 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1752949739; x=1753554539; 
darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=EPvgBiLqlpShSpauZlW2pytMfVpDu05qY5wabgs+g1Y=; b=y+Q9JddbTgCFbudYcAmm5wtVyR9tlUrlHNUvCRPZpPLkiSbEvbsuS6hNamLerNMlrA G15XANOlh3x7XjVJy08P8jH1xhGL09quXiR7Ton+AkyLswfo4OtbCx+JQFKDY1Z2IWOb Q3dpfXonHrv/ujX31LTuPhUuifLce89lQyeyvXu4Da/Vs/8RplZk87yEknKqipp32bP2 oY7NT+fEUMPWj7DJgMwyXO0XvkIO+F33n/V8omcydWRlGh+gBsugRzitrEfN2hn6lPxr jGhbs865gaqazXq73LX1seVnJaLhz0Txm5nrNWpbHfySxCiIninoOYH2g8YFHjF8pcUc QQWg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1752949739; x=1753554539; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=EPvgBiLqlpShSpauZlW2pytMfVpDu05qY5wabgs+g1Y=; b=CNW3p5vXN4Pupw/C+t4h6IPOky7JwHuj4zhPctubA8cYG0sUgecTJAxPEm+UAB7+y/ syy3+RnvXSPvAfsLmvJHYEXxOrntP8qJmYU2SIhAj0mXAzUjNbnxOpQ/pCYlzzlAB/ZB B0sEBoU9mQ0meuXMXyIBlYezDjRFM7snTxhnB0xp93rR9QGsljsuELwZG37Pz0LoAVsv oxcGApoIen6IRg0AuZKmLsbaI6wjURTaB5MzCltrAOP2DE7F0LRKf1fvlcqztN9+TlzN SY4FNhnU2JiEXH/79qZdx/7mJFBeXU4tV8liyTICRElY4P2CeiiMiAPoCshKnmPgAVrl L+7w== X-Forwarded-Encrypted: i=1; AJvYcCWLoS+M6cF1lWMGGFmxec3bjwKzhs0mnWuq7ds7oEqW+Xvz6/WR1M4C/r7PlG+OVP414zGQBFVwgm+zCRg=@vger.kernel.org X-Gm-Message-State: AOJu0YydE+JyxVBZP3BjJQkTvORTsDjRPu9ZOi3X0loTp0sqXVTWbjcj +oM028tAJ6WkJ5SQOrz69nU/7T0902o4LD71amzDjgAJV8s4V9jrY7q9My70aUh2/mrVlgiXe9k Kcg+Pzg== X-Google-Smtp-Source: AGHT+IFoGFLafBonuC3rQovXKKJU34G15nWux3XEkMYYSvePU4fdrLMa6Mpo311hHQrdwhevOxqLkpiabCo= X-Received: from oabde14.prod.google.com ([2002:a05:6870:75ce:b0:2d5:491:29fa]) (user=surenb job=prod-delivery.src-stubby-dispatcher) by 2002:a05:6870:8a1e:b0:2ea:7101:7dc1 with SMTP id 586e51a60fabf-2ffb24a3bf3mr11597629fac.33.1752949739515; Sat, 19 Jul 2025 11:28:59 -0700 (PDT) Date: Sat, 19 Jul 2025 11:28:49 -0700 In-Reply-To: <20250719182854.3166724-1-surenb@google.com> Precedence: bulk 
X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20250719182854.3166724-1-surenb@google.com> X-Mailer: git-send-email 2.50.0.727.gbf7dc18ff4-goog Message-ID: <20250719182854.3166724-2-surenb@google.com> Subject: [PATCH v8 1/6] selftests/proc: add /proc/pid/maps tearing from vma split test From: Suren Baghdasaryan To: akpm@linux-foundation.org Cc: Liam.Howlett@oracle.com, lorenzo.stoakes@oracle.com, david@redhat.com, vbabka@suse.cz, peterx@redhat.com, jannh@google.com, hannes@cmpxchg.org, mhocko@kernel.org, paulmck@kernel.org, shuah@kernel.org, adobriyan@gmail.com, brauner@kernel.org, josef@toxicpanda.com, yebin10@huawei.com, linux@weissschuh.net, willy@infradead.org, osalvador@suse.de, andrii@kernel.org, ryan.roberts@arm.com, christophe.leroy@csgroup.eu, tjmercier@google.com, kaleshsingh@google.com, aha310510@gmail.com, linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-kselftest@vger.kernel.org, surenb@google.com Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" The /proc/pid/maps file is generated page by page, with the mmap_lock released between pages. This can lead to inconsistent reads if the underlying vmas are concurrently modified. For instance, if a vma split or merge occurs at a page boundary while /proc/pid/maps is being read, the same vma might be seen twice: once before and once after the change. This duplication is considered acceptable for userspace handling. However, observing a "hole" where a vma should be (e.g., due to a vma being replaced and the space temporarily being empty) is unacceptable. Implement a test that: 1. Forks a child process which continuously modifies its address space, specifically targeting a vma at the boundary between two pages. 2. The parent process repeatedly reads the child's /proc/pid/maps. 3. 
The parent process checks the last vma of the first page and the first vma of the second page for consistency, looking for the effects of vma splits or merges. The test duration is configurable via DURATION environment variable expressed in seconds. The default test duration is 5 seconds. Example Command: DURATION=3D10 ./proc-maps-race Signed-off-by: Suren Baghdasaryan --- tools/testing/selftests/proc/.gitignore | 1 + tools/testing/selftests/proc/Makefile | 1 + tools/testing/selftests/proc/proc-maps-race.c | 447 ++++++++++++++++++ 3 files changed, 449 insertions(+) create mode 100644 tools/testing/selftests/proc/proc-maps-race.c diff --git a/tools/testing/selftests/proc/.gitignore b/tools/testing/selfte= sts/proc/.gitignore index 973968f45bba..19bb333e2485 100644 --- a/tools/testing/selftests/proc/.gitignore +++ b/tools/testing/selftests/proc/.gitignore @@ -5,6 +5,7 @@ /proc-2-is-kthread /proc-fsconfig-hidepid /proc-loadavg-001 +/proc-maps-race /proc-multiple-procfs /proc-empty-vm /proc-pid-vm diff --git a/tools/testing/selftests/proc/Makefile b/tools/testing/selftest= s/proc/Makefile index b12921b9794b..50aba102201a 100644 --- a/tools/testing/selftests/proc/Makefile +++ b/tools/testing/selftests/proc/Makefile @@ -9,6 +9,7 @@ TEST_GEN_PROGS +=3D fd-002-posix-eq TEST_GEN_PROGS +=3D fd-003-kthread TEST_GEN_PROGS +=3D proc-2-is-kthread TEST_GEN_PROGS +=3D proc-loadavg-001 +TEST_GEN_PROGS +=3D proc-maps-race TEST_GEN_PROGS +=3D proc-empty-vm TEST_GEN_PROGS +=3D proc-pid-vm TEST_GEN_PROGS +=3D proc-self-map-files-001 diff --git a/tools/testing/selftests/proc/proc-maps-race.c b/tools/testing/= selftests/proc/proc-maps-race.c new file mode 100644 index 000000000000..5b28dda08b7d --- /dev/null +++ b/tools/testing/selftests/proc/proc-maps-race.c @@ -0,0 +1,447 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * Copyright 2022 Google LLC. 
+ * Author: Suren Baghdasaryan + * + * Permission to use, copy, modify, and distribute this software for any + * purpose with or without fee is hereby granted, provided that the above + * copyright notice and this permission notice appear in all copies. + * + * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES + * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF + * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR + * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES + * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN + * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF + * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. + */ +/* + * Fork a child that concurrently modifies address space while the main + * process is reading /proc/$PID/maps and verifying the results. Address + * space modifications include: + * VMA splitting and merging + * + */ +#define _GNU_SOURCE +#include "../kselftest_harness.h" +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +/* /proc/pid/maps parsing routines */ +struct page_content { + char *data; + ssize_t size; +}; + +#define LINE_MAX_SIZE 256 + +struct line_content { + char text[LINE_MAX_SIZE]; + unsigned long start_addr; + unsigned long end_addr; +}; + +enum test_state { + INIT, + CHILD_READY, + PARENT_READY, + SETUP_READY, + SETUP_MODIFY_MAPS, + SETUP_MAPS_MODIFIED, + SETUP_RESTORE_MAPS, + SETUP_MAPS_RESTORED, + TEST_READY, + TEST_DONE, +}; + +struct vma_modifier_info; + +FIXTURE(proc_maps_race) +{ + struct vma_modifier_info *mod_info; + struct page_content page1; + struct page_content page2; + struct line_content last_line; + struct line_content first_line; + unsigned long duration_sec; + int shared_mem_size; + int page_size; + int vma_count; + int maps_fd; + pid_t pid; +}; + +typedef bool 
(*vma_modifier_op)(FIXTURE_DATA(proc_maps_race) *self); +typedef bool (*vma_mod_result_check_op)(struct line_content *mod_last_line, + struct line_content *mod_first_line, + struct line_content *restored_last_line, + struct line_content *restored_first_line); + +struct vma_modifier_info { + int vma_count; + void *addr; + int prot; + void *next_addr; + vma_modifier_op vma_modify; + vma_modifier_op vma_restore; + vma_mod_result_check_op vma_mod_check; + pthread_mutex_t sync_lock; + pthread_cond_t sync_cond; + enum test_state curr_state; + bool exit; + void *child_mapped_addr[]; +}; + + +static bool read_two_pages(FIXTURE_DATA(proc_maps_race) *self) +{ + ssize_t bytes_read; + + if (lseek(self->maps_fd, 0, SEEK_SET) < 0) + return false; + + bytes_read =3D read(self->maps_fd, self->page1.data, self->page_size); + if (bytes_read <=3D 0) + return false; + + self->page1.size =3D bytes_read; + + bytes_read =3D read(self->maps_fd, self->page2.data, self->page_size); + if (bytes_read <=3D 0) + return false; + + self->page2.size =3D bytes_read; + + return true; +} + +static void copy_first_line(struct page_content *page, char *first_line) +{ + char *pos =3D strchr(page->data, '\n'); + + strncpy(first_line, page->data, pos - page->data); + first_line[pos - page->data] =3D '\0'; +} + +static void copy_last_line(struct page_content *page, char *last_line) +{ + /* Get the last line in the first page */ + const char *end =3D page->data + page->size - 1; + /* skip last newline */ + const char *pos =3D end - 1; + + /* search previous newline */ + while (pos[-1] !=3D '\n') + pos--; + strncpy(last_line, pos, end - pos); + last_line[end - pos] =3D '\0'; +} + +/* Read the last line of the first page and the first line of the second p= age */ +static bool read_boundary_lines(FIXTURE_DATA(proc_maps_race) *self, + struct line_content *last_line, + struct line_content *first_line) +{ + if (!read_two_pages(self)) + return false; + + copy_last_line(&self->page1, last_line->text); + 
copy_first_line(&self->page2, first_line->text); + + return sscanf(last_line->text, "%lx-%lx", &last_line->start_addr, + &last_line->end_addr) =3D=3D 2 && + sscanf(first_line->text, "%lx-%lx", &first_line->start_addr, + &first_line->end_addr) =3D=3D 2; +} + +/* Thread synchronization routines */ +static void wait_for_state(struct vma_modifier_info *mod_info, enum test_s= tate state) +{ + pthread_mutex_lock(&mod_info->sync_lock); + while (mod_info->curr_state !=3D state) + pthread_cond_wait(&mod_info->sync_cond, &mod_info->sync_lock); + pthread_mutex_unlock(&mod_info->sync_lock); +} + +static void signal_state(struct vma_modifier_info *mod_info, enum test_sta= te state) +{ + pthread_mutex_lock(&mod_info->sync_lock); + mod_info->curr_state =3D state; + pthread_cond_signal(&mod_info->sync_cond); + pthread_mutex_unlock(&mod_info->sync_lock); +} + +static void stop_vma_modifier(struct vma_modifier_info *mod_info) +{ + wait_for_state(mod_info, SETUP_READY); + mod_info->exit =3D true; + signal_state(mod_info, SETUP_MODIFY_MAPS); +} + +static bool capture_mod_pattern(FIXTURE_DATA(proc_maps_race) *self, + struct line_content *mod_last_line, + struct line_content *mod_first_line, + struct line_content *restored_last_line, + struct line_content *restored_first_line) +{ + signal_state(self->mod_info, SETUP_MODIFY_MAPS); + wait_for_state(self->mod_info, SETUP_MAPS_MODIFIED); + + /* Copy last line of the first page and first line of the last page */ + if (!read_boundary_lines(self, mod_last_line, mod_first_line)) + return false; + + signal_state(self->mod_info, SETUP_RESTORE_MAPS); + wait_for_state(self->mod_info, SETUP_MAPS_RESTORED); + + /* Copy last line of the first page and first line of the last page */ + if (!read_boundary_lines(self, restored_last_line, restored_first_line)) + return false; + + if (!self->mod_info->vma_mod_check(mod_last_line, mod_first_line, + restored_last_line, restored_first_line)) + return false; + + /* + * The content of these lines after 
modify+restore should be the same + * as the original. + */ + return strcmp(restored_last_line->text, self->last_line.text) =3D=3D 0 && + strcmp(restored_first_line->text, self->first_line.text) =3D=3D 0; +} + +static inline bool split_vma(FIXTURE_DATA(proc_maps_race) *self) +{ + return mmap(self->mod_info->addr, self->page_size, self->mod_info->prot |= PROT_EXEC, + MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0) !=3D MAP_FAILED; +} + +static inline bool merge_vma(FIXTURE_DATA(proc_maps_race) *self) +{ + return mmap(self->mod_info->addr, self->page_size, self->mod_info->prot, + MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0) !=3D MAP_FAILED; +} + +static inline bool check_split_result(struct line_content *mod_last_line, + struct line_content *mod_first_line, + struct line_content *restored_last_line, + struct line_content *restored_first_line) +{ + /* Make sure vmas at the boundaries are changing */ + return strcmp(mod_last_line->text, restored_last_line->text) !=3D 0 && + strcmp(mod_first_line->text, restored_first_line->text) !=3D 0; +} + +FIXTURE_SETUP(proc_maps_race) +{ + const char *duration =3D getenv("DURATION"); + struct vma_modifier_info *mod_info; + pthread_mutexattr_t mutex_attr; + pthread_condattr_t cond_attr; + unsigned long duration_sec; + char fname[32]; + + self->page_size =3D (unsigned long)sysconf(_SC_PAGESIZE); + duration_sec =3D duration ? atol(duration) : 0; + self->duration_sec =3D duration_sec ? duration_sec : 5UL; + + /* + * Have to map enough vmas for /proc/pid/maps to contain more than one + * page worth of vmas. 
Assume at least 32 bytes per line in maps output + */ + self->vma_count =3D self->page_size / 32 + 1; + self->shared_mem_size =3D sizeof(struct vma_modifier_info) + self->vma_co= unt * sizeof(void *); + + /* map shared memory for communication with the child process */ + self->mod_info =3D (struct vma_modifier_info *)mmap(NULL, self->shared_me= m_size, + PROT_READ | PROT_WRITE, MAP_SHARED | MAP_ANONYMOUS, -1, 0); + ASSERT_NE(self->mod_info, MAP_FAILED); + mod_info =3D self->mod_info; + + /* Initialize shared members */ + pthread_mutexattr_init(&mutex_attr); + pthread_mutexattr_setpshared(&mutex_attr, PTHREAD_PROCESS_SHARED); + ASSERT_EQ(pthread_mutex_init(&mod_info->sync_lock, &mutex_attr), 0); + pthread_condattr_init(&cond_attr); + pthread_condattr_setpshared(&cond_attr, PTHREAD_PROCESS_SHARED); + ASSERT_EQ(pthread_cond_init(&mod_info->sync_cond, &cond_attr), 0); + mod_info->vma_count =3D self->vma_count; + mod_info->curr_state =3D INIT; + mod_info->exit =3D false; + + self->pid =3D fork(); + if (!self->pid) { + /* Child process modifying the address space */ + int prot =3D PROT_READ | PROT_WRITE; + int i; + + for (i =3D 0; i < mod_info->vma_count; i++) { + mod_info->child_mapped_addr[i] =3D mmap(NULL, self->page_size * 3, prot, + MAP_PRIVATE | MAP_ANONYMOUS, -1, 0); + ASSERT_NE(mod_info->child_mapped_addr[i], MAP_FAILED); + /* change protection in adjacent maps to prevent merging */ + prot ^=3D PROT_WRITE; + } + signal_state(mod_info, CHILD_READY); + wait_for_state(mod_info, PARENT_READY); + while (true) { + signal_state(mod_info, SETUP_READY); + wait_for_state(mod_info, SETUP_MODIFY_MAPS); + if (mod_info->exit) + break; + + ASSERT_TRUE(mod_info->vma_modify(self)); + signal_state(mod_info, SETUP_MAPS_MODIFIED); + wait_for_state(mod_info, SETUP_RESTORE_MAPS); + ASSERT_TRUE(mod_info->vma_restore(self)); + signal_state(mod_info, SETUP_MAPS_RESTORED); + + wait_for_state(mod_info, TEST_READY); + while (mod_info->curr_state !=3D TEST_DONE) { + 
ASSERT_TRUE(mod_info->vma_modify(self)); + ASSERT_TRUE(mod_info->vma_restore(self)); + } + } + for (i =3D 0; i < mod_info->vma_count; i++) + munmap(mod_info->child_mapped_addr[i], self->page_size * 3); + + exit(0); + } + + sprintf(fname, "/proc/%d/maps", self->pid); + self->maps_fd =3D open(fname, O_RDONLY); + ASSERT_NE(self->maps_fd, -1); + + /* Wait for the child to map the VMAs */ + wait_for_state(mod_info, CHILD_READY); + + /* Read first two pages */ + self->page1.data =3D malloc(self->page_size); + ASSERT_NE(self->page1.data, NULL); + self->page2.data =3D malloc(self->page_size); + ASSERT_NE(self->page2.data, NULL); + + ASSERT_TRUE(read_boundary_lines(self, &self->last_line, &self->first_line= )); + + /* + * Find the addresses corresponding to the last line in the first page + * and the first line in the last page. + */ + mod_info->addr =3D NULL; + mod_info->next_addr =3D NULL; + for (int i =3D 0; i < mod_info->vma_count; i++) { + if (mod_info->child_mapped_addr[i] =3D=3D (void *)self->last_line.start_= addr) { + mod_info->addr =3D mod_info->child_mapped_addr[i]; + mod_info->prot =3D PROT_READ; + /* Even VMAs have write permission */ + if ((i % 2) =3D=3D 0) + mod_info->prot |=3D PROT_WRITE; + } else if (mod_info->child_mapped_addr[i] =3D=3D (void *)self->first_lin= e.start_addr) { + mod_info->next_addr =3D mod_info->child_mapped_addr[i]; + } + + if (mod_info->addr && mod_info->next_addr) + break; + } + ASSERT_TRUE(mod_info->addr && mod_info->next_addr); + + signal_state(mod_info, PARENT_READY); + +} + +FIXTURE_TEARDOWN(proc_maps_race) +{ + int status; + + stop_vma_modifier(self->mod_info); + + free(self->page2.data); + free(self->page1.data); + + for (int i =3D 0; i < self->vma_count; i++) + munmap(self->mod_info->child_mapped_addr[i], self->page_size); + close(self->maps_fd); + waitpid(self->pid, &status, 0); + munmap(self->mod_info, self->shared_mem_size); +} + +TEST_F(proc_maps_race, test_maps_tearing_from_split) +{ + struct vma_modifier_info *mod_info =3D 
self->mod_info; + + struct line_content split_last_line; + struct line_content split_first_line; + struct line_content restored_last_line; + struct line_content restored_first_line; + + wait_for_state(mod_info, SETUP_READY); + + /* re-read the file to avoid using stale data from previous test */ + ASSERT_TRUE(read_boundary_lines(self, &self->last_line, &self->first_line= )); + + mod_info->vma_modify =3D split_vma; + mod_info->vma_restore =3D merge_vma; + mod_info->vma_mod_check =3D check_split_result; + + ASSERT_TRUE(capture_mod_pattern(self, &split_last_line, &split_first_line, + &restored_last_line, &restored_first_line)); + + /* Now start concurrent modifications for self->duration_sec */ + signal_state(mod_info, TEST_READY); + + struct line_content new_last_line; + struct line_content new_first_line; + struct timespec start_ts, end_ts; + + clock_gettime(CLOCK_MONOTONIC_COARSE, &start_ts); + do { + bool last_line_changed; + bool first_line_changed; + + ASSERT_TRUE(read_boundary_lines(self, &new_last_line, &new_first_line)); + + /* Check if we read vmas after split */ + if (!strcmp(new_last_line.text, split_last_line.text)) { + /* + * The vmas should be consistent with split results, + * however if vma was concurrently restored after a + * split, it can be reported twice (first the original + * split one, then the same vma but extended after the + * merge) because we found it as the next vma again. + * In that case new first line will be the same as the + * last restored line. + */ + ASSERT_FALSE(strcmp(new_first_line.text, split_first_line.text) && + strcmp(new_first_line.text, restored_last_line.text)); + } else { + /* The vmas should be consistent with merge results */ + ASSERT_FALSE(strcmp(new_last_line.text, restored_last_line.text)); + ASSERT_FALSE(strcmp(new_first_line.text, restored_first_line.text)); + } + /* + * First and last lines should change in unison. If the last + * line changed then the first line should change as well and + * vice versa. 
+ */ + last_line_changed =3D strcmp(new_last_line.text, self->last_line.text) != =3D 0; + first_line_changed =3D strcmp(new_first_line.text, self->first_line.text= ) !=3D 0; + ASSERT_EQ(last_line_changed, first_line_changed); + + clock_gettime(CLOCK_MONOTONIC_COARSE, &end_ts); + } while (end_ts.tv_sec - start_ts.tv_sec < self->duration_sec); + + /* Signal the modifier thread to stop and wait until it exits */ + signal_state(mod_info, TEST_DONE); +} + +TEST_HARNESS_MAIN --=20 2.50.0.727.gbf7dc18ff4-goog From nobody Mon Oct 6 15:17:33 2025 Received: from mail-pf1-f202.google.com (mail-pf1-f202.google.com [209.85.210.202]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id EC9FF2882BB for ; Sat, 19 Jul 2025 18:29:01 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.210.202 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1752949743; cv=none; b=e9BVMX9u9FbXr3U94J9MMDiH9/7h8ppDFVWiVDYNA++QPjkbHoIlnNYI0FsstMVaNnAHNkqdzVTcg9DpF3xqmdgD1VIgqX4w4TIcGoBfTrnxJij+m82xGrgplqtw5GhWSd7Cd2tT86ozJPhZFAUnGC3u0RZn5uD06+IKtVBOs3s= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1752949743; c=relaxed/simple; bh=EoONMavb9oQd28yIVVpYa/F0BisZ+1FFpf7/OKMIWio=; h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type; b=HYJxHwHr0GmwyUVRWpdtdQSLczzhyvOagHZwp6RfcnSGYw4wUitiICo3Gai2a3vnQ3DZ+AUyKoh5pEA+h9Pt9l+20O8QeD4XHAFgSf6zSmnZ7US2bdb2L6YvrQcAfAsxPnqAQ9qVBG1WapPzoNWNY3PFAz0c9lQSY0EiKJQ1gRw= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--surenb.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=cWOya/0N; arc=none smtp.client-ip=209.85.210.202 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) 
header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--surenb.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="cWOya/0N" Received: by mail-pf1-f202.google.com with SMTP id d2e1a72fcca58-748f30d56d1so1408454b3a.3 for ; Sat, 19 Jul 2025 11:29:01 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1752949741; x=1753554541; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=JZIfIkvrizFRlwLe+eL2JUXe03By99qwQvMdi3R4EP0=; b=cWOya/0N8JJeghD70tjbNmbMG/YkjV6kxcDqhe1FKJ8CRStn7JFuzZ4HH1FQAljMDt wh/Akl6nHPDUcW/pjlIhgcFCTscgPUo4wc/g5FofgyczbfuGADqMqkqewKzfv1dLFLN+ HtgBzK3s9gESuxh3uPgN15pj56yfSSi0pPGquaDEDAIqKY+DV5GzjIkKu4OHaXu0npiZ l8rQCzKtTDshPRA3wWfswSCeILDV9dx3KfCSqKnlSaAkThPEZ5iBmDbZpP7hSy9wtLxW 3J7lae4kEbg0H4WhiAe+X0Px07R3R9yF3tgGPofU19RD333iDRdBNdSCVi8OPQSr+onb /VYQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1752949741; x=1753554541; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=JZIfIkvrizFRlwLe+eL2JUXe03By99qwQvMdi3R4EP0=; b=FJRQGh81O/G9R75N2v75sMNpmSs9Z71WSWSwW/qX0GszpK+KNW5t6A4iGseT19Gwu8 uGtCbIBH+vT0kxQc6Cd7W8Nu+8Pidd9hXKBEwCC5+KdEbhrmwZLx5SbdmVb/QSyFKlpQ dab8/zQZ2Zai4J4a2T9vLiNZFKcq+1dIShEZ5uyUZRh4u6LqvbK3gr4b6IMG5y62lckz gPjbrIwmpSZPTN30wJsEYQDrAr6ViAZ88eH6Lf1qcmli7iXUnoMLhNNJED0bqUyPT+Xf jbVM7aclJjhXytFTxwCxvNGetJzRDnVTn+ffsGdaysox0j05UYl47pa4V/TVr+iPPv3/ m1Ig== X-Forwarded-Encrypted: i=1; AJvYcCVDKO1xEdBpQJy8W8jFkRXiEHc4IW9nyPGvZ42xrKC6+PC61TeR8fQJFukg5xFbGawvUhbJl2vNpaAR8j0=@vger.kernel.org X-Gm-Message-State: AOJu0YyfQ8uXpn0BS8V6piqZu9V3Qv65PbwkyePsKqZlcRmO1itaiB7n l9ERjDuFxsmTTwQ2BOYWrKLUFcbFptrH0vhJfFCt7zu7B2uxw9UEP0Ua/NlO73H1VpTzD1Vu2EI c6/TUmQ== 
X-Google-Smtp-Source: AGHT+IFtvDgzrvc853iwGTNcdFPdZ36TnaM/IGuEwlU0uCtz/OmMBvPrUhcVv4vqThaoN95p4kZpK9V/M5s= X-Received: from pgbn3.prod.google.com ([2002:a63:5903:0:b0:b2b:f469:cf78]) (user=surenb job=prod-delivery.src-stubby-dispatcher) by 2002:a05:6a21:1fc3:b0:235:c9fe:5929 with SMTP id adf61e73a8af0-237d7b649bbmr22450234637.42.1752949741279; Sat, 19 Jul 2025 11:29:01 -0700 (PDT) Date: Sat, 19 Jul 2025 11:28:50 -0700 In-Reply-To: <20250719182854.3166724-1-surenb@google.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20250719182854.3166724-1-surenb@google.com> X-Mailer: git-send-email 2.50.0.727.gbf7dc18ff4-goog Message-ID: <20250719182854.3166724-3-surenb@google.com> Subject: [PATCH v8 2/6] selftests/proc: extend /proc/pid/maps tearing test to include vma resizing From: Suren Baghdasaryan To: akpm@linux-foundation.org Cc: Liam.Howlett@oracle.com, lorenzo.stoakes@oracle.com, david@redhat.com, vbabka@suse.cz, peterx@redhat.com, jannh@google.com, hannes@cmpxchg.org, mhocko@kernel.org, paulmck@kernel.org, shuah@kernel.org, adobriyan@gmail.com, brauner@kernel.org, josef@toxicpanda.com, yebin10@huawei.com, linux@weissschuh.net, willy@infradead.org, osalvador@suse.de, andrii@kernel.org, ryan.roberts@arm.com, christophe.leroy@csgroup.eu, tjmercier@google.com, kaleshsingh@google.com, aha310510@gmail.com, linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-kselftest@vger.kernel.org, surenb@google.com Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Test that /proc/pid/maps does not report unexpected holes in the address space when a vma at the edge of the page is being concurrently remapped. This remapping results in the vma shrinking and expanding from under the reader. We should always see either shrunk or expanded (original) version of the vma. 
Signed-off-by: Suren Baghdasaryan --- tools/testing/selftests/proc/proc-maps-race.c | 79 +++++++++++++++++++ 1 file changed, 79 insertions(+) diff --git a/tools/testing/selftests/proc/proc-maps-race.c b/tools/testing/= selftests/proc/proc-maps-race.c index 5b28dda08b7d..19028bd3b85c 100644 --- a/tools/testing/selftests/proc/proc-maps-race.c +++ b/tools/testing/selftests/proc/proc-maps-race.c @@ -242,6 +242,28 @@ static inline bool check_split_result(struct line_cont= ent *mod_last_line, strcmp(mod_first_line->text, restored_first_line->text) !=3D 0; } =20 +static inline bool shrink_vma(FIXTURE_DATA(proc_maps_race) *self) +{ + return mremap(self->mod_info->addr, self->page_size * 3, + self->page_size, 0) !=3D MAP_FAILED; +} + +static inline bool expand_vma(FIXTURE_DATA(proc_maps_race) *self) +{ + return mremap(self->mod_info->addr, self->page_size, + self->page_size * 3, 0) !=3D MAP_FAILED; +} + +static inline bool check_shrink_result(struct line_content *mod_last_line, + struct line_content *mod_first_line, + struct line_content *restored_last_line, + struct line_content *restored_first_line) +{ + /* Make sure only the last vma of the first page is changing */ + return strcmp(mod_last_line->text, restored_last_line->text) !=3D 0 && + strcmp(mod_first_line->text, restored_first_line->text) =3D=3D 0; +} + FIXTURE_SETUP(proc_maps_race) { const char *duration =3D getenv("DURATION"); @@ -444,4 +466,61 @@ TEST_F(proc_maps_race, test_maps_tearing_from_split) signal_state(mod_info, TEST_DONE); } =20 +TEST_F(proc_maps_race, test_maps_tearing_from_resize) +{ + struct vma_modifier_info *mod_info =3D self->mod_info; + + struct line_content shrunk_last_line; + struct line_content shrunk_first_line; + struct line_content restored_last_line; + struct line_content restored_first_line; + + wait_for_state(mod_info, SETUP_READY); + + /* re-read the file to avoid using stale data from previous test */ + ASSERT_TRUE(read_boundary_lines(self, &self->last_line, &self->first_line= )); + + 
mod_info->vma_modify =3D shrink_vma; + mod_info->vma_restore =3D expand_vma; + mod_info->vma_mod_check =3D check_shrink_result; + + ASSERT_TRUE(capture_mod_pattern(self, &shrunk_last_line, &shrunk_first_li= ne, + &restored_last_line, &restored_first_line)); + + /* Now start concurrent modifications for self->duration_sec */ + signal_state(mod_info, TEST_READY); + + struct line_content new_last_line; + struct line_content new_first_line; + struct timespec start_ts, end_ts; + + clock_gettime(CLOCK_MONOTONIC_COARSE, &start_ts); + do { + ASSERT_TRUE(read_boundary_lines(self, &new_last_line, &new_first_line)); + + /* Check if we read vmas after shrinking it */ + if (!strcmp(new_last_line.text, shrunk_last_line.text)) { + /* + * The vmas should be consistent with shrunk results, + * however if the vma was concurrently restored, it + * can be reported twice (first as shrunk one, then + * as restored one) because we found it as the next vma + * again. In that case new first line will be the same + * as the last restored line. 
+ */ + ASSERT_FALSE(strcmp(new_first_line.text, shrunk_first_line.text) && + strcmp(new_first_line.text, restored_last_line.text)); + } else { + /* The vmas should be consistent with the original/restored state */ + ASSERT_FALSE(strcmp(new_last_line.text, restored_last_line.text)); + ASSERT_FALSE(strcmp(new_first_line.text, restored_first_line.text)); + } + + clock_gettime(CLOCK_MONOTONIC_COARSE, &end_ts); + } while (end_ts.tv_sec - start_ts.tv_sec < self->duration_sec); + + /* Signal the modifier thread to stop and wait until it exits */ + signal_state(mod_info, TEST_DONE); +} + TEST_HARNESS_MAIN --=20 2.50.0.727.gbf7dc18ff4-goog From nobody Mon Oct 6 15:17:33 2025 Received: from mail-pj1-f74.google.com (mail-pj1-f74.google.com [209.85.216.74]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 0DF59289E15 for ; Sat, 19 Jul 2025 18:29:03 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.216.74 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1752949746; cv=none; b=CeCuqY3K4IUVifG9/WUUWXM5tqKYRtw/HLd9/1v7R5lIm6nUqw8KtzGmHOS/8BA09QsDX+J7BFMW81t5k2ZHLUV478JBUJilGET16OSrpBzeh+Mpy4luHLEX9J28CBtEIz5m9gvnJ546wjsq/HodiXOrAg9kagzf70KE099gFc8= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1752949746; c=relaxed/simple; bh=UxAPbMUB0pIGsPbDU2jCpQ4QLRcvRQ4nl32Q3o4bVcA=; h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type; b=e+jyAiCQTg9ID0Qe1bMIQ5CW8cCN2CrGRM2EGpcT2otaCup5iZnDcijlLVSwe8bRx4HEYOPPCuYWSppq1uNm67lcp2o4aJKqbkmgC/CHMiexVEYgaYg3o8WbKS6kQaQmQ4sdGpcZU3bZJMdmqilIIXtiOJoup3QgRu8cAlBqgrw= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--surenb.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com 
header.b=z7jCIRP0; arc=none smtp.client-ip=209.85.216.74 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--surenb.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="z7jCIRP0" Received: by mail-pj1-f74.google.com with SMTP id 98e67ed59e1d1-3132c1942a1so4138180a91.2 for ; Sat, 19 Jul 2025 11:29:03 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1752949743; x=1753554543; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=6cmTk44WrbxZ0X4bw1vDDTZxTb31GhJvBzomPieJ1uw=; b=z7jCIRP06aZ79m68rgGkZdauMyP4bIIfd92fKB8nvHSgEf4GEP8Hqxsw+uLgL0YKYr qGgrk36gocTyXvdq6WjU+qe2Arez8x2IOKnYpYtxZgaKUPpxAA0cdAOAhtRgueyRc7P8 utA9oSVO3f1ZcQD4JWwSInA0nZAlGzv2eSxNBNmy6AqiZJM9TDEKpZVt1Z0bTKZfMr5K eaKRIMA5Ymhks0Epf93ZmbLyQISt6HWgV9w6mtWTevH4Z7cmNmkPtvGPzULUu9kiysXa YPX0wNjGtaspsog+UAfic46By/yMi0RuYmE1xeGhpnlMj4z3eHyGpSf7C1xruqg4PIrj N6WQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1752949743; x=1753554543; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=6cmTk44WrbxZ0X4bw1vDDTZxTb31GhJvBzomPieJ1uw=; b=u2KOtf635mW1tbxhq3O8J0uqU4Ddw3ojKbde7/nRAe+xrM63jbEm5jXSIFLgaSC56C LuTXBad7TPqeP/HQFS0tYcQKR2ZyChxW5h7Xk7g39jPy9MqM6FDoG2LmbSfTHcECcmu8 VgrqHnnPqHmxzbmuxYIHFiVD94VTBLLJWoUoia0+16k7y1r71JDgrC1+Tw13/QF9Xc+P H058RuR1YB19RyvOHVa7w669Zwtnax7cArYP54uRKPEnoDeOufetyUqTuurBGirWTjXs G6m+vEgNActBnGuWzjdtMMFpblUm5Jc2DBGD/bfJwuLcJSPAcyD2+xHzC5m1eOcDWI8q gWbQ== X-Forwarded-Encrypted: i=1; AJvYcCVscAKhvB9m7qW5+PuypGC7dtqkShPCf0tSJxvzJZfZg8hxNQLjZ19Bej3LMhRix1Ozr1rLrsqBzlhlzmg=@vger.kernel.org X-Gm-Message-State: 
AOJu0Yw2+u/og93mXtC4O+4rlwHEIcoYdqdL50Z6CCzWJLr7nR2g2yM/ 6cQO5s/RIesl4Dlf2w8GMgWL//3qL/nMCtlJHx9ftqdLZafnsvY1cXSWpsQcQ6nZer6VLAbicOw GIAgM7g== X-Google-Smtp-Source: AGHT+IFpR+G367NUdXqT4GjCDRW7uEBhYOCVViMZav6Hsk0eDu7mD5ualbj9xAJiOnJAq+Jb3OjyutTaGig= X-Received: from pjbsj2.prod.google.com ([2002:a17:90b:2d82:b0:31c:15c8:4c80]) (user=surenb job=prod-delivery.src-stubby-dispatcher) by 2002:a17:90b:3e47:b0:315:cbe0:13b3 with SMTP id 98e67ed59e1d1-31c9f399399mr19658063a91.7.1752949743417; Sat, 19 Jul 2025 11:29:03 -0700 (PDT) Date: Sat, 19 Jul 2025 11:28:51 -0700 In-Reply-To: <20250719182854.3166724-1-surenb@google.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20250719182854.3166724-1-surenb@google.com> X-Mailer: git-send-email 2.50.0.727.gbf7dc18ff4-goog Message-ID: <20250719182854.3166724-4-surenb@google.com> Subject: [PATCH v8 3/6] selftests/proc: extend /proc/pid/maps tearing test to include vma remapping From: Suren Baghdasaryan To: akpm@linux-foundation.org Cc: Liam.Howlett@oracle.com, lorenzo.stoakes@oracle.com, david@redhat.com, vbabka@suse.cz, peterx@redhat.com, jannh@google.com, hannes@cmpxchg.org, mhocko@kernel.org, paulmck@kernel.org, shuah@kernel.org, adobriyan@gmail.com, brauner@kernel.org, josef@toxicpanda.com, yebin10@huawei.com, linux@weissschuh.net, willy@infradead.org, osalvador@suse.de, andrii@kernel.org, ryan.roberts@arm.com, christophe.leroy@csgroup.eu, tjmercier@google.com, kaleshsingh@google.com, aha310510@gmail.com, linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-kselftest@vger.kernel.org, surenb@google.com Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Test that /proc/pid/maps does not report unexpected holes in the address space when we concurrently remap a part of a vma into the middle of another vma. 
This remapping splits the destination vma into three parts, and the
middle part is then patched back, all done concurrently from under the
reader. We should always see either the original vma or the split one,
with no holes.

Signed-off-by: Suren Baghdasaryan
---
 tools/testing/selftests/proc/proc-maps-race.c | 86 +++++++++++++++++++
 1 file changed, 86 insertions(+)

diff --git a/tools/testing/selftests/proc/proc-maps-race.c b/tools/testing/selftests/proc/proc-maps-race.c
index 19028bd3b85c..bc614a2d944a 100644
--- a/tools/testing/selftests/proc/proc-maps-race.c
+++ b/tools/testing/selftests/proc/proc-maps-race.c
@@ -264,6 +264,35 @@ static inline bool check_shrink_result(struct line_content *mod_last_line,
 	       strcmp(mod_first_line->text, restored_first_line->text) == 0;
 }
 
+static inline bool remap_vma(FIXTURE_DATA(proc_maps_race) *self)
+{
+	/*
+	 * Remap the last page of the next vma into the middle of the vma.
+	 * This splits the current vma and the first and middle parts (the
+	 * parts at lower addresses) become the last vma objserved in the
+	 * first page and the first vma observed in the last page.
+	 */
+	return mremap(self->mod_info->next_addr + self->page_size * 2, self->page_size,
+		      self->page_size, MREMAP_FIXED | MREMAP_MAYMOVE | MREMAP_DONTUNMAP,
+		      self->mod_info->addr + self->page_size) != MAP_FAILED;
+}
+
+static inline bool patch_vma(FIXTURE_DATA(proc_maps_race) *self)
+{
+	return mprotect(self->mod_info->addr + self->page_size, self->page_size,
+			self->mod_info->prot) == 0;
+}
+
+static inline bool check_remap_result(struct line_content *mod_last_line,
+				      struct line_content *mod_first_line,
+				      struct line_content *restored_last_line,
+				      struct line_content *restored_first_line)
+{
+	/* Make sure vmas at the boundaries are changing */
+	return strcmp(mod_last_line->text, restored_last_line->text) != 0 &&
+	       strcmp(mod_first_line->text, restored_first_line->text) != 0;
+}
+
 FIXTURE_SETUP(proc_maps_race)
 {
 	const char *duration = getenv("DURATION");
@@ -523,4 +552,61 @@ TEST_F(proc_maps_race, test_maps_tearing_from_resize)
 	signal_state(mod_info, TEST_DONE);
 }
 
+TEST_F(proc_maps_race, test_maps_tearing_from_remap)
+{
+	struct vma_modifier_info *mod_info = self->mod_info;
+
+	struct line_content remapped_last_line;
+	struct line_content remapped_first_line;
+	struct line_content restored_last_line;
+	struct line_content restored_first_line;
+
+	wait_for_state(mod_info, SETUP_READY);
+
+	/* re-read the file to avoid using stale data from previous test */
+	ASSERT_TRUE(read_boundary_lines(self, &self->last_line, &self->first_line));
+
+	mod_info->vma_modify = remap_vma;
+	mod_info->vma_restore = patch_vma;
+	mod_info->vma_mod_check = check_remap_result;
+
+	ASSERT_TRUE(capture_mod_pattern(self, &remapped_last_line, &remapped_first_line,
+					&restored_last_line, &restored_first_line));
+
+	/* Now start concurrent modifications for self->duration_sec */
+	signal_state(mod_info, TEST_READY);
+
+	struct line_content new_last_line;
+	struct line_content new_first_line;
+	struct timespec start_ts, end_ts;
+
+	clock_gettime(CLOCK_MONOTONIC_COARSE, &start_ts);
+	do {
+		ASSERT_TRUE(read_boundary_lines(self, &new_last_line, &new_first_line));
+
+		/* Check if we read vmas after remapping it */
+		if (!strcmp(new_last_line.text, remapped_last_line.text)) {
+			/*
+			 * The vmas should be consistent with remap results,
+			 * however if the vma was concurrently restored, it
+			 * can be reported twice (first as split one, then
+			 * as restored one) because we found it as the next vma
+			 * again. In that case new first line will be the same
+			 * as the last restored line.
+			 */
+			ASSERT_FALSE(strcmp(new_first_line.text, remapped_first_line.text) &&
+				     strcmp(new_first_line.text, restored_last_line.text));
+		} else {
+			/* The vmas should be consistent with the original/resored state */
+			ASSERT_FALSE(strcmp(new_last_line.text, restored_last_line.text));
+			ASSERT_FALSE(strcmp(new_first_line.text, restored_first_line.text));
+		}
+
+		clock_gettime(CLOCK_MONOTONIC_COARSE, &end_ts);
+	} while (end_ts.tv_sec - start_ts.tv_sec < self->duration_sec);
+
+	/* Signal the modifyer thread to stop and wait until it exits */
+	signal_state(mod_info, TEST_DONE);
+}
+
 TEST_HARNESS_MAIN
-- 
2.50.0.727.gbf7dc18ff4-goog

From nobody Mon Oct 6 15:17:33 2025
Date: Sat, 19 Jul 2025 11:28:52 -0700
In-Reply-To: <20250719182854.3166724-1-surenb@google.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
Mime-Version: 1.0
References: <20250719182854.3166724-1-surenb@google.com>
X-Mailer: git-send-email 2.50.0.727.gbf7dc18ff4-goog
Message-ID: <20250719182854.3166724-5-surenb@google.com>
Subject: [PATCH v8 4/6] selftests/proc: add verbose mode for /proc/pid/maps tearing tests
From: Suren Baghdasaryan
To: akpm@linux-foundation.org
Cc: Liam.Howlett@oracle.com, lorenzo.stoakes@oracle.com, david@redhat.com,
	vbabka@suse.cz, peterx@redhat.com, jannh@google.com, hannes@cmpxchg.org,
	mhocko@kernel.org, paulmck@kernel.org, shuah@kernel.org,
	adobriyan@gmail.com, brauner@kernel.org, josef@toxicpanda.com,
	yebin10@huawei.com, linux@weissschuh.net, willy@infradead.org,
	osalvador@suse.de, andrii@kernel.org, ryan.roberts@arm.com,
	christophe.leroy@csgroup.eu, tjmercier@google.com,
	kaleshsingh@google.com, aha310510@gmail.com,
	linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-mm@kvack.org, linux-kselftest@vger.kernel.org, surenb@google.com
Content-Type: text/plain; charset="utf-8"

Add a verbose mode to the /proc/pid/maps tearing tests to print debugging
information. The VERBOSE environment variable enables it.

Usage example: VERBOSE=1 ./proc-maps-race

Signed-off-by: Suren Baghdasaryan
---
 tools/testing/selftests/proc/proc-maps-race.c | 153 ++++++++++++++++--
 1 file changed, 141 insertions(+), 12 deletions(-)

diff --git a/tools/testing/selftests/proc/proc-maps-race.c b/tools/testing/selftests/proc/proc-maps-race.c
index bc614a2d944a..66773685a047 100644
--- a/tools/testing/selftests/proc/proc-maps-race.c
+++ b/tools/testing/selftests/proc/proc-maps-race.c
@@ -77,6 +77,7 @@ FIXTURE(proc_maps_race)
 	int shared_mem_size;
 	int page_size;
 	int vma_count;
+	bool verbose;
 	int maps_fd;
 	pid_t pid;
 };
@@ -188,12 +189,104 @@ static void stop_vma_modifier(struct vma_modifier_info *mod_info)
 	signal_state(mod_info, SETUP_MODIFY_MAPS);
 }
 
+static void print_first_lines(char *text, int nr)
+{
+	const char *end = text;
+
+	while (nr && (end = strchr(end, '\n')) != NULL) {
+		nr--;
+		end++;
+	}
+
+	if (end) {
+		int offs = end - text;
+
+		text[offs] = '\0';
+		printf(text);
+		text[offs] = '\n';
+		printf("\n");
+	} else {
+		printf(text);
+	}
+}
+
+static void print_last_lines(char *text, int nr)
+{
+	const char *start = text + strlen(text);
+
+	nr++; /* to ignore the last newline */
+	while (nr) {
+		while (start > text && *start != '\n')
+			start--;
+		nr--;
+		start--;
+	}
+	printf(start);
+}
+
+static void print_boundaries(const char *title, FIXTURE_DATA(proc_maps_race) *self)
+{
+	if (!self->verbose)
+		return;
+
+	printf("%s", title);
+	/* Print 3 boundary lines from each page */
+	print_last_lines(self->page1.data, 3);
+	printf("-----------------page boundary-----------------\n");
+	print_first_lines(self->page2.data, 3);
+}
+
+static bool print_boundaries_on(bool condition, const char *title,
+				FIXTURE_DATA(proc_maps_race) *self)
+{
+	if (self->verbose && condition)
+		print_boundaries(title, self);
+
+	return condition;
+}
+
+static void report_test_start(const char *name, bool verbose)
+{
+	if (verbose)
+		printf("==== %s ====\n", name);
+}
+
+static struct timespec print_ts;
+
+static void start_test_loop(struct timespec *ts, bool verbose)
+{
+	if (verbose)
+		print_ts.tv_sec = ts->tv_sec;
+}
+
+static void end_test_iteration(struct timespec *ts, bool verbose)
+{
+	if (!verbose)
+		return;
+
+	/* Update every second */
+	if (print_ts.tv_sec == ts->tv_sec)
+		return;
+
+	printf(".");
+	fflush(stdout);
+	print_ts.tv_sec = ts->tv_sec;
+}
+
+static void end_test_loop(bool verbose)
+{
+	if (verbose)
+		printf("\n");
+}
+
 static bool capture_mod_pattern(FIXTURE_DATA(proc_maps_race) *self,
 				struct line_content *mod_last_line,
 				struct line_content *mod_first_line,
 				struct line_content *restored_last_line,
 				struct line_content *restored_first_line)
 {
+	print_boundaries("Before modification", self);
+
 	signal_state(self->mod_info, SETUP_MODIFY_MAPS);
 	wait_for_state(self->mod_info, SETUP_MAPS_MODIFIED);
 
@@ -201,6 +294,8 @@ static bool capture_mod_pattern(FIXTURE_DATA(proc_maps_race) *self,
 	if (!read_boundary_lines(self, mod_last_line, mod_first_line))
 		return false;
 
+	print_boundaries("After modification", self);
+
 	signal_state(self->mod_info, SETUP_RESTORE_MAPS);
 	wait_for_state(self->mod_info, SETUP_MAPS_RESTORED);
 
@@ -208,6 +303,8 @@ static bool capture_mod_pattern(FIXTURE_DATA(proc_maps_race) *self,
 	if (!read_boundary_lines(self, restored_last_line, restored_first_line))
 		return false;
 
+	print_boundaries("After restore", self);
+
 	if (!self->mod_info->vma_mod_check(mod_last_line, mod_first_line,
 					    restored_last_line, restored_first_line))
 		return false;
@@ -295,6 +392,7 @@ static inline bool check_remap_result(struct line_content *mod_last_line,
 
 FIXTURE_SETUP(proc_maps_race)
 {
+	const char *verbose = getenv("VERBOSE");
 	const char *duration = getenv("DURATION");
 	struct vma_modifier_info *mod_info;
 	pthread_mutexattr_t mutex_attr;
@@ -303,6 +401,7 @@ FIXTURE_SETUP(proc_maps_race)
 	char fname[32];
 
 	self->page_size = (unsigned long)sysconf(_SC_PAGESIZE);
+	self->verbose = verbose && !strncmp(verbose, "1", 1);
 	duration_sec = duration ? atol(duration) : 0;
 	self->duration_sec = duration_sec ? duration_sec : 5UL;
 
@@ -444,6 +543,7 @@ TEST_F(proc_maps_race, test_maps_tearing_from_split)
 	mod_info->vma_restore = merge_vma;
 	mod_info->vma_mod_check = check_split_result;
 
+	report_test_start("Tearing from split", self->verbose);
 	ASSERT_TRUE(capture_mod_pattern(self, &split_last_line, &split_first_line,
 					&restored_last_line, &restored_first_line));
 
@@ -455,6 +555,7 @@ TEST_F(proc_maps_race, test_maps_tearing_from_split)
 	struct timespec start_ts, end_ts;
 
 	clock_gettime(CLOCK_MONOTONIC_COARSE, &start_ts);
+	start_test_loop(&start_ts, self->verbose);
 	do {
 		bool last_line_changed;
 		bool first_line_changed;
@@ -472,12 +573,18 @@ TEST_F(proc_maps_race, test_maps_tearing_from_split)
 			 * In that case new first line will be the same as the
 			 * last restored line.
 			 */
-			ASSERT_FALSE(strcmp(new_first_line.text, split_first_line.text) &&
-				     strcmp(new_first_line.text, restored_last_line.text));
+			ASSERT_FALSE(print_boundaries_on(
+					strcmp(new_first_line.text, split_first_line.text) &&
+					strcmp(new_first_line.text, restored_last_line.text),
+					"Split result invalid", self));
 		} else {
 			/* The vmas should be consistent with merge results */
-			ASSERT_FALSE(strcmp(new_last_line.text, restored_last_line.text));
-			ASSERT_FALSE(strcmp(new_first_line.text, restored_first_line.text));
+			ASSERT_FALSE(print_boundaries_on(
+					strcmp(new_last_line.text, restored_last_line.text),
+					"Merge result invalid", self));
+			ASSERT_FALSE(print_boundaries_on(
+					strcmp(new_first_line.text, restored_first_line.text),
+					"Merge result invalid", self));
 		}
 		/*
 		 * First and last lines should change in unison. If the last
@@ -489,7 +596,9 @@ TEST_F(proc_maps_race, test_maps_tearing_from_split)
 		ASSERT_EQ(last_line_changed, first_line_changed);
 
 		clock_gettime(CLOCK_MONOTONIC_COARSE, &end_ts);
+		end_test_iteration(&end_ts, self->verbose);
 	} while (end_ts.tv_sec - start_ts.tv_sec < self->duration_sec);
+	end_test_loop(self->verbose);
 
 	/* Signal the modifyer thread to stop and wait until it exits */
 	signal_state(mod_info, TEST_DONE);
@@ -513,6 +622,7 @@ TEST_F(proc_maps_race, test_maps_tearing_from_resize)
 	mod_info->vma_restore = expand_vma;
 	mod_info->vma_mod_check = check_shrink_result;
 
+	report_test_start("Tearing from resize", self->verbose);
 	ASSERT_TRUE(capture_mod_pattern(self, &shrunk_last_line, &shrunk_first_line,
 					&restored_last_line, &restored_first_line));
 
@@ -524,6 +634,7 @@ TEST_F(proc_maps_race, test_maps_tearing_from_resize)
 	struct timespec start_ts, end_ts;
 
 	clock_gettime(CLOCK_MONOTONIC_COARSE, &start_ts);
+	start_test_loop(&start_ts, self->verbose);
 	do {
 		ASSERT_TRUE(read_boundary_lines(self, &new_last_line, &new_first_line));
 
@@ -537,16 +648,24 @@ TEST_F(proc_maps_race, test_maps_tearing_from_resize)
 			 * again. In that case new first line will be the same
 			 * as the last restored line.
 			 */
-			ASSERT_FALSE(strcmp(new_first_line.text, shrunk_first_line.text) &&
-				     strcmp(new_first_line.text, restored_last_line.text));
+			ASSERT_FALSE(print_boundaries_on(
+					strcmp(new_first_line.text, shrunk_first_line.text) &&
+					strcmp(new_first_line.text, restored_last_line.text),
+					"Shrink result invalid", self));
 		} else {
 			/* The vmas should be consistent with the original/resored state */
-			ASSERT_FALSE(strcmp(new_last_line.text, restored_last_line.text));
-			ASSERT_FALSE(strcmp(new_first_line.text, restored_first_line.text));
+			ASSERT_FALSE(print_boundaries_on(
+					strcmp(new_last_line.text, restored_last_line.text),
+					"Expand result invalid", self));
+			ASSERT_FALSE(print_boundaries_on(
+					strcmp(new_first_line.text, restored_first_line.text),
+					"Expand result invalid", self));
 		}
 
 		clock_gettime(CLOCK_MONOTONIC_COARSE, &end_ts);
+		end_test_iteration(&end_ts, self->verbose);
 	} while (end_ts.tv_sec - start_ts.tv_sec < self->duration_sec);
+	end_test_loop(self->verbose);
 
 	/* Signal the modifyer thread to stop and wait until it exits */
 	signal_state(mod_info, TEST_DONE);
@@ -570,6 +689,7 @@ TEST_F(proc_maps_race, test_maps_tearing_from_remap)
 	mod_info->vma_restore = patch_vma;
 	mod_info->vma_mod_check = check_remap_result;
 
+	report_test_start("Tearing from remap", self->verbose);
 	ASSERT_TRUE(capture_mod_pattern(self, &remapped_last_line, &remapped_first_line,
 					&restored_last_line, &restored_first_line));
 
@@ -581,6 +701,7 @@ TEST_F(proc_maps_race, test_maps_tearing_from_remap)
 	struct timespec start_ts, end_ts;
 
 	clock_gettime(CLOCK_MONOTONIC_COARSE, &start_ts);
+	start_test_loop(&start_ts, self->verbose);
 	do {
 		ASSERT_TRUE(read_boundary_lines(self, &new_last_line, &new_first_line));
 
@@ -594,16 +715,24 @@ TEST_F(proc_maps_race, test_maps_tearing_from_remap)
 			 * again. In that case new first line will be the same
 			 * as the last restored line.
 			 */
-			ASSERT_FALSE(strcmp(new_first_line.text, remapped_first_line.text) &&
-				     strcmp(new_first_line.text, restored_last_line.text));
+			ASSERT_FALSE(print_boundaries_on(
+					strcmp(new_first_line.text, remapped_first_line.text) &&
+					strcmp(new_first_line.text, restored_last_line.text),
+					"Remap result invalid", self));
 		} else {
 			/* The vmas should be consistent with the original/resored state */
-			ASSERT_FALSE(strcmp(new_last_line.text, restored_last_line.text));
-			ASSERT_FALSE(strcmp(new_first_line.text, restored_first_line.text));
+			ASSERT_FALSE(print_boundaries_on(
+					strcmp(new_last_line.text, restored_last_line.text),
+					"Remap restore result invalid", self));
+			ASSERT_FALSE(print_boundaries_on(
+					strcmp(new_first_line.text, restored_first_line.text),
+					"Remap restore result invalid", self));
 		}
 
 		clock_gettime(CLOCK_MONOTONIC_COARSE, &end_ts);
+		end_test_iteration(&end_ts, self->verbose);
 	} while (end_ts.tv_sec - start_ts.tv_sec < self->duration_sec);
+	end_test_loop(self->verbose);
 
 	/* Signal the modifyer thread to stop and wait until it exits */
 	signal_state(mod_info, TEST_DONE);
-- 
2.50.0.727.gbf7dc18ff4-goog

From nobody Mon Oct 6 15:17:33 2025
Date: Sat, 19 Jul 2025 11:28:53 -0700
In-Reply-To: <20250719182854.3166724-1-surenb@google.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
Mime-Version: 1.0
References: <20250719182854.3166724-1-surenb@google.com>
X-Mailer: git-send-email 2.50.0.727.gbf7dc18ff4-goog
Message-ID: <20250719182854.3166724-6-surenb@google.com>
Subject: [PATCH v8 5/6] fs/proc/task_mmu: remove conversion of seq_file position to unsigned
From: Suren Baghdasaryan
To: akpm@linux-foundation.org
Cc: Liam.Howlett@oracle.com, lorenzo.stoakes@oracle.com, david@redhat.com,
	vbabka@suse.cz, peterx@redhat.com, jannh@google.com, hannes@cmpxchg.org,
	mhocko@kernel.org, paulmck@kernel.org, shuah@kernel.org,
	adobriyan@gmail.com, brauner@kernel.org, josef@toxicpanda.com,
	yebin10@huawei.com, linux@weissschuh.net, willy@infradead.org,
	osalvador@suse.de, andrii@kernel.org, ryan.roberts@arm.com,
	christophe.leroy@csgroup.eu, tjmercier@google.com,
	kaleshsingh@google.com, aha310510@gmail.com,
	linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-mm@kvack.org, linux-kselftest@vger.kernel.org, surenb@google.com
Content-Type: text/plain; charset="utf-8"

Back in the 2.6 era, last_addr used to be stored in the seq_file->version
variable, which was unsigned long. As a result, the sentinels representing
the gate vma and the end of all vmas used unsigned values. More recent
kernels no longer use seq_file->version, so the conversion from loff_t
into an unsigned type is not needed. Similarly, the sentinel values don't
need to be unsigned. Remove the type conversion for the seq_file position
and change the sentinel values to signed.

While at it, replace the hardcoded sentinel values with named definitions
for better documentation.

Signed-off-by: Suren Baghdasaryan
Reviewed-by: Lorenzo Stoakes
Reviewed-by: Vlastimil Babka
Acked-by: David Hildenbrand
---
 fs/proc/task_mmu.c | 17 ++++++++++-------
 1 file changed, 10 insertions(+), 7 deletions(-)

diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 751479eb128f..90237df1ed33 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -29,6 +29,9 @@
 #include
 #include "internal.h"
 
+#define SENTINEL_VMA_END	-1
+#define SENTINEL_VMA_GATE	-2
+
 #define SEQ_PUT_DEC(str, val) \
 		seq_put_decimal_ull_width(m, str, (val) << (PAGE_SHIFT-10), 8)
 void task_mem(struct seq_file *m, struct mm_struct *mm)
@@ -135,7 +138,7 @@ static struct vm_area_struct *proc_get_vma(struct proc_maps_private *priv,
 	if (vma) {
 		*ppos = vma->vm_start;
 	} else {
-		*ppos = -2UL;
+		*ppos = SENTINEL_VMA_GATE;
 		vma = get_gate_vma(priv->mm);
 	}
 
@@ -145,11 +148,11 @@ static struct vm_area_struct *proc_get_vma(struct proc_maps_private *priv,
 static void *m_start(struct seq_file *m, loff_t *ppos)
 {
 	struct proc_maps_private *priv = m->private;
-	unsigned long last_addr = *ppos;
+	loff_t last_addr = *ppos;
 	struct mm_struct *mm;
 
 	/* See m_next(). Zero at the start or after lseek. */
-	if (last_addr == -1UL)
+	if (last_addr == SENTINEL_VMA_END)
 		return NULL;
 
 	priv->task = get_proc_task(priv->inode);
@@ -170,9 +173,9 @@ static void *m_start(struct seq_file *m, loff_t *ppos)
 		return ERR_PTR(-EINTR);
 	}
 
-	vma_iter_init(&priv->iter, mm, last_addr);
+	vma_iter_init(&priv->iter, mm, (unsigned long)last_addr);
 	hold_task_mempolicy(priv);
-	if (last_addr == -2UL)
+	if (last_addr == SENTINEL_VMA_GATE)
 		return get_gate_vma(mm);
 
 	return proc_get_vma(priv, ppos);
@@ -180,8 +183,8 @@ static void *m_start(struct seq_file *m, loff_t *ppos)
 
 static void *m_next(struct seq_file *m, void *v, loff_t *ppos)
 {
-	if (*ppos == -2UL) {
-		*ppos = -1UL;
+	if (*ppos == SENTINEL_VMA_GATE) {
+		*ppos = SENTINEL_VMA_END;
 		return NULL;
 	}
 	return proc_get_vma(m->private, ppos);
-- 
2.50.0.727.gbf7dc18ff4-goog

From nobody Mon Oct 6 15:17:33 2025
Date: Sat, 19 Jul 2025 11:28:54 -0700
In-Reply-To: <20250719182854.3166724-1-surenb@google.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
Mime-Version: 1.0
References: <20250719182854.3166724-1-surenb@google.com>
X-Mailer: git-send-email 2.50.0.727.gbf7dc18ff4-goog
Message-ID: <20250719182854.3166724-7-surenb@google.com>
Subject: [PATCH v8 6/6] fs/proc/task_mmu: read proc/pid/maps under per-vma lock
From: Suren Baghdasaryan
To: akpm@linux-foundation.org
Cc: Liam.Howlett@oracle.com, lorenzo.stoakes@oracle.com, david@redhat.com,
	vbabka@suse.cz, peterx@redhat.com, jannh@google.com, hannes@cmpxchg.org,
	mhocko@kernel.org, paulmck@kernel.org, shuah@kernel.org,
	adobriyan@gmail.com, brauner@kernel.org, josef@toxicpanda.com,
	yebin10@huawei.com, linux@weissschuh.net, willy@infradead.org,
	osalvador@suse.de, andrii@kernel.org, ryan.roberts@arm.com,
	christophe.leroy@csgroup.eu, tjmercier@google.com,
	kaleshsingh@google.com, aha310510@gmail.com,
linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-kselftest@vger.kernel.org, surenb@google.com Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8"

With maple_tree supporting vma tree traversal under RCU and per-vma
locks, /proc/pid/maps can be read while holding individual vma locks
instead of locking the entire address space.

A completely lockless approach (walking the vma tree under RCU) would
be quite complex, with the main issue being that get_vma_name() uses
callbacks which might not work correctly with a stable vma copy and
require the original (unstable) vma; see special_mapping_name() for
example.

When per-vma lock acquisition fails, we take the mmap_lock for reading,
lock the vma, release the mmap_lock and continue. This fallback to the
mmap read lock guarantees that the reader makes forward progress even
during lock contention. This will interfere with the writer, but only
for a very short time while we are acquiring the per-vma lock, and only
when there was contention on the vma the reader is interested in.

We shouldn't see a repeated fallback to mmap read locks in practice, as
this requires a very unlikely series of lock contentions (for instance
due to repeated vma split operations). However, even if this did
somehow happen, we would still make progress.

One case requiring special handling is when a vma changes between the
time it was found and the time it got locked. A problematic case would
be if a vma got shrunk so that its vm_start moved higher in the address
space and a new vma was installed at the beginning:

reader found:               |--------VMA A--------|
VMA is modified:            |-VMA B-|----VMA A----|
reader locks modified VMA A
reader reports VMA A:       |  gap  |----VMA A----|

This would result in reporting a gap in the address space that does not
exist. To prevent this we retry the lookup after locking the vma;
however, we do that only when we identify a gap and detect that the
address space was changed after we found the vma.

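The gap check described above can be sketched in plain userspace C. This
is only a toy model under stated assumptions: toy_vma, toy_mm,
toy_find_next(), toy_split_front() and toy_validate() are hypothetical
names that do not exist in the kernel, a plain counter stands in for the
mmap_lock sequence count, and no real locking is modelled. It shows why
the lookup is repeated only when the reader sees a gap in front of the
vma and the address space has changed since the vma was found.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_VMAS 8

/* Hypothetical stand-in for a vma: [start, end) */
struct toy_vma {
	unsigned long start;
	unsigned long end;	/* exclusive, like vm_end */
};

/* Hypothetical stand-in for an address space */
struct toy_mm {
	struct toy_vma *vmas[TOY_MAX_VMAS];	/* sorted, non-overlapping */
	size_t count;
	unsigned int seq;			/* bumped on every change */
};

/* Return the first vma that ends above addr, or NULL if there is none. */
static struct toy_vma *toy_find_next(const struct toy_mm *mm,
				     unsigned long addr)
{
	for (size_t i = 0; i < mm->count; i++)
		if (mm->vmas[i]->end > addr)
			return mm->vmas[i];
	return NULL;
}

/*
 * Shrink the vma at idx so that it starts at new_start and install
 * new_vma in the range that was cut off ("VMA B" in the diagram).
 */
static void toy_split_front(struct toy_mm *mm, size_t idx,
			    struct toy_vma *new_vma, unsigned long new_start)
{
	struct toy_vma *old;

	assert(idx < mm->count && mm->count < TOY_MAX_VMAS);
	old = mm->vmas[idx];
	assert(new_start > old->start && new_start < old->end);

	new_vma->start = old->start;
	new_vma->end = new_start;
	old->start = new_start;

	/* Shift the tail up by one slot and place the new vma in front. */
	for (size_t i = mm->count; i > idx; i--)
		mm->vmas[i] = mm->vmas[i - 1];
	mm->vmas[idx] = new_vma;
	mm->count++;
	mm->seq++;
}

/*
 * Decide which vma to report. from_addr is the end of the previously
 * reported vma and found_seq is the counter value sampled before the
 * lookup. The lookup is repeated only if there is a gap in front of
 * the vma and the address space changed in the meantime.
 */
static struct toy_vma *toy_validate(const struct toy_mm *mm,
				    struct toy_vma *vma,
				    unsigned long from_addr,
				    unsigned int found_seq)
{
	assert(vma != NULL);
	if (from_addr >= vma->start || mm->seq == found_seq)
		return vma;
	return toy_find_next(mm, from_addr);
}
```

In this model, a reader that found VMA A at position 0x1000 and then
lost a race against toy_split_front() gets VMA B back from
toy_validate() instead of reporting a gap. The model leaves out the
other checks mentioned in the patch, such as the vma belonging to a
different mm or ending at or before the last position.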
This change is designed to reduce mmap_lock contention and prevent a
process reading /proc/pid/maps files (often a low priority task, such
as monitoring/data collection services) from blocking address space
updates.

Note that this change has a userspace-visible disadvantage: it allows
for sub-page data tearing as opposed to the previous mechanism where
data tearing could happen only between pages of generated output data.
Since current userspace considers data tearing between pages to be
acceptable, we assume it will be able to handle sub-page data tearing
as well.

Signed-off-by: Suren Baghdasaryan Reviewed-by: Vlastimil Babka --- fs/proc/internal.h | 5 ++ fs/proc/task_mmu.c | 141 +++++++++++++++++++++++++++++++++++--- include/linux/mmap_lock.h | 11 +++ mm/madvise.c | 3 +- mm/mmap_lock.c | 93 +++++++++++++++++++++++++ 5 files changed, 244 insertions(+), 9 deletions(-) diff --git a/fs/proc/internal.h b/fs/proc/internal.h index 3d48ffe72583..7c235451c5ea 100644 --- a/fs/proc/internal.h +++ b/fs/proc/internal.h @@ -384,6 +384,11 @@ struct proc_maps_private { struct task_struct *task; struct mm_struct *mm; struct vma_iterator iter; + loff_t last_pos; +#ifdef CONFIG_PER_VMA_LOCK + bool mmap_locked; + struct vm_area_struct *locked_vma; +#endif #ifdef CONFIG_NUMA struct mempolicy *task_mempolicy; #endif diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c index 90237df1ed33..3d6d8a9f13fc 100644 --- a/fs/proc/task_mmu.c +++ b/fs/proc/task_mmu.c @@ -130,13 +130,132 @@ static void release_task_mempolicy(struct proc_maps_= private *priv) } #endif =20 -static struct vm_area_struct *proc_get_vma(struct proc_maps_private *priv, - loff_t *ppos) +#ifdef CONFIG_PER_VMA_LOCK + +static void unlock_vma(struct proc_maps_private *priv) +{ + if (priv->locked_vma) { + vma_end_read(priv->locked_vma); + priv->locked_vma =3D NULL; + } +} + +static const struct seq_operations proc_pid_maps_op; + +static inline bool lock_vma_range(struct seq_file *m, + struct proc_maps_private *priv) +{ + /* +
* smaps and numa_maps perform page table walk, therefore require + * mmap_lock but maps can be read with locking just the vma and + * walking the vma tree under rcu read protection. + */ + if (m->op !=3D &proc_pid_maps_op) { + if (mmap_read_lock_killable(priv->mm)) + return false; + + priv->mmap_locked =3D true; + } else { + rcu_read_lock(); + priv->locked_vma =3D NULL; + priv->mmap_locked =3D false; + } + + return true; +} + +static inline void unlock_vma_range(struct proc_maps_private *priv) +{ + if (priv->mmap_locked) { + mmap_read_unlock(priv->mm); + } else { + unlock_vma(priv); + rcu_read_unlock(); + } +} + +static struct vm_area_struct *get_next_vma(struct proc_maps_private *priv, + loff_t last_pos) +{ + struct vm_area_struct *vma; + + if (priv->mmap_locked) + return vma_next(&priv->iter); + + unlock_vma(priv); + vma =3D lock_next_vma(priv->mm, &priv->iter, last_pos); + if (!IS_ERR_OR_NULL(vma)) + priv->locked_vma =3D vma; + + return vma; +} + +static inline bool fallback_to_mmap_lock(struct proc_maps_private *priv, + loff_t pos) { - struct vm_area_struct *vma =3D vma_next(&priv->iter); + if (priv->mmap_locked) + return false; + + rcu_read_unlock(); + mmap_read_lock(priv->mm); + /* Reinitialize the iterator after taking mmap_lock */ + vma_iter_set(&priv->iter, pos); + priv->mmap_locked =3D true; =20 + return true; +} + +#else /* CONFIG_PER_VMA_LOCK */ + +static inline bool lock_vma_range(struct seq_file *m, + struct proc_maps_private *priv) +{ + return mmap_read_lock_killable(priv->mm) =3D=3D 0; +} + +static inline void unlock_vma_range(struct proc_maps_private *priv) +{ + mmap_read_unlock(priv->mm); +} + +static struct vm_area_struct *get_next_vma(struct proc_maps_private *priv, + loff_t last_pos) +{ + return vma_next(&priv->iter); +} + +static inline bool fallback_to_mmap_lock(struct proc_maps_private *priv, + loff_t pos) +{ + return false; +} + +#endif /* CONFIG_PER_VMA_LOCK */ + +static struct vm_area_struct *proc_get_vma(struct seq_file *m, loff_t *ppo= 
s) +{ + struct proc_maps_private *priv =3D m->private; + struct vm_area_struct *vma; + +retry: + vma =3D get_next_vma(priv, *ppos); + /* EINTR or EAGAIN is possible */ + if (IS_ERR(vma)) { + if (PTR_ERR(vma) =3D=3D -EAGAIN && fallback_to_mmap_lock(priv, *ppos)) + goto retry; + + return vma; + } + + /* Store previous position to be able to restart if needed */ + priv->last_pos =3D *ppos; if (vma) { - *ppos =3D vma->vm_start; + /* + * Track the end of the reported vma to ensure position changes + * even if previous vma was merged with the next vma and we + * found the extended vma with the same vm_start. + */ + *ppos =3D vma->vm_end; } else { *ppos =3D SENTINEL_VMA_GATE; vma =3D get_gate_vma(priv->mm); @@ -166,19 +285,25 @@ static void *m_start(struct seq_file *m, loff_t *ppos) return NULL; } =20 - if (mmap_read_lock_killable(mm)) { + if (!lock_vma_range(m, priv)) { mmput(mm); put_task_struct(priv->task); priv->task =3D NULL; return ERR_PTR(-EINTR); } =20 + /* + * Reset current position if last_addr was set before + * and it's not a sentinel.
+ */ + if (last_addr > 0) + *ppos =3D last_addr =3D priv->last_pos; vma_iter_init(&priv->iter, mm, (unsigned long)last_addr); hold_task_mempolicy(priv); if (last_addr =3D=3D SENTINEL_VMA_GATE) return get_gate_vma(mm); =20 - return proc_get_vma(priv, ppos); + return proc_get_vma(m, ppos); } =20 static void *m_next(struct seq_file *m, void *v, loff_t *ppos) @@ -187,7 +312,7 @@ static void *m_next(struct seq_file *m, void *v, loff_t= *ppos) *ppos =3D SENTINEL_VMA_END; return NULL; } - return proc_get_vma(m->private, ppos); + return proc_get_vma(m, ppos); } =20 static void m_stop(struct seq_file *m, void *v) @@ -199,7 +324,7 @@ static void m_stop(struct seq_file *m, void *v) return; =20 release_task_mempolicy(priv); - mmap_read_unlock(mm); + unlock_vma_range(priv); mmput(mm); put_task_struct(priv->task); priv->task =3D NULL; diff --git a/include/linux/mmap_lock.h b/include/linux/mmap_lock.h index 5da384bd0a26..1f4f44951abe 100644 --- a/include/linux/mmap_lock.h +++ b/include/linux/mmap_lock.h @@ -309,6 +309,17 @@ void vma_mark_detached(struct vm_area_struct *vma); struct vm_area_struct *lock_vma_under_rcu(struct mm_struct *mm, unsigned long address); =20 +/* + * Locks next vma pointed by the iterator. Confirms the locked vma has not + * been modified and will retry under mmap_lock protection if modification + * was detected. Should be called from read RCU section. + * Returns either a valid locked VMA, NULL if no more VMAs or -EINTR if the + * process was interrupted. 
+ */ +struct vm_area_struct *lock_next_vma(struct mm_struct *mm, + struct vma_iterator *iter, + unsigned long address); + #else /* CONFIG_PER_VMA_LOCK */ =20 static inline void mm_lock_seqcount_init(struct mm_struct *mm) {} diff --git a/mm/madvise.c b/mm/madvise.c index da6e0e7c00b5..5c32c3b95e51 100644 --- a/mm/madvise.c +++ b/mm/madvise.c @@ -109,7 +109,8 @@ void anon_vma_name_free(struct kref *kref) =20 struct anon_vma_name *anon_vma_name(struct vm_area_struct *vma) { - mmap_assert_locked(vma->vm_mm); + if (!rwsem_is_locked(&vma->vm_mm->mmap_lock)) + vma_assert_locked(vma); =20 return vma->anon_name; } diff --git a/mm/mmap_lock.c b/mm/mmap_lock.c index 5f725cc67334..729fb7d0dd59 100644 --- a/mm/mmap_lock.c +++ b/mm/mmap_lock.c @@ -178,6 +178,99 @@ struct vm_area_struct *lock_vma_under_rcu(struct mm_st= ruct *mm, count_vm_vma_lock_event(VMA_LOCK_ABORT); return NULL; } + +static struct vm_area_struct *lock_next_vma_under_mmap_lock(struct mm_stru= ct *mm, + struct vma_iterator *vmi, + unsigned long from_addr) +{ + struct vm_area_struct *vma; + int ret; + + ret =3D mmap_read_lock_killable(mm); + if (ret) + return ERR_PTR(ret); + + /* Lookup the vma at the last position again under mmap_read_lock */ + vma_iter_set(vmi, from_addr); + vma =3D vma_next(vmi); + if (vma) { + /* Very unlikely vma->vm_refcnt overflow case */ + if (unlikely(!vma_start_read_locked(vma))) + vma =3D ERR_PTR(-EAGAIN); + } + + mmap_read_unlock(mm); + + return vma; +} + +struct vm_area_struct *lock_next_vma(struct mm_struct *mm, + struct vma_iterator *vmi, + unsigned long from_addr) +{ + struct vm_area_struct *vma; + unsigned int mm_wr_seq; + bool mmap_unlocked; + + RCU_LOCKDEP_WARN(!rcu_read_lock_held(), "no rcu read lock held"); +retry: + /* Start mmap_lock speculation in case we need to verify the vma later */ + mmap_unlocked =3D mmap_lock_speculate_try_begin(mm, &mm_wr_seq); + vma =3D vma_next(vmi); + if (!vma) + return NULL; + + vma =3D vma_start_read(mm, vma); + if (IS_ERR_OR_NULL(vma)) { + 
/* + * Retry immediately if the vma gets detached from under us. + * Infinite loop should not happen because the vma we find will + * have to be constantly knocked out from under us. + */ + if (PTR_ERR(vma) =3D=3D -EAGAIN) { + /* reset to search from the last address */ + vma_iter_set(vmi, from_addr); + goto retry; + } + + goto fallback; + } + + /* + * Verify the vma we locked belongs to the same address space and it's + * not behind the last search position. + */ + if (unlikely(vma->vm_mm !=3D mm || from_addr >=3D vma->vm_end)) + goto fallback_unlock; + + /* + * vma can be ahead of the last search position but we need to verify + * it was not shrunk after we found it and another vma has not been + * installed ahead of it. Otherwise we might observe a gap that should + * not be there. + */ + if (from_addr < vma->vm_start) { + /* Verify only if the address space might have changed since vma lookup.= */ + if (!mmap_unlocked || mmap_lock_speculate_retry(mm, mm_wr_seq)) { + vma_iter_set(vmi, from_addr); + if (vma !=3D vma_next(vmi)) + goto fallback_unlock; + } + } + + return vma; + +fallback_unlock: + vma_end_read(vma); +fallback: + rcu_read_unlock(); + vma =3D lock_next_vma_under_mmap_lock(mm, vmi, from_addr); + rcu_read_lock(); + /* Reinitialize the iterator after re-entering rcu read section */ + vma_iter_set(vmi, IS_ERR_OR_NULL(vma) ? from_addr : vma->vm_end); + + return vma; +} #endif /* CONFIG_PER_VMA_LOCK */ =20 #ifdef CONFIG_LOCK_MM_AND_FIND_VMA --=20 2.50.0.727.gbf7dc18ff4-goog
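
As a rough illustration of the position tracking done by proc_get_vma()
and m_next() in this series, the following userspace sketch walks a
sorted array of ranges using the same two sentinel values. Everything
here is an assumption made for illustration: walk_range, walk_step()
and the WALK_POS_* macros are hypothetical names, the array stands in
for the vma tree, and no locking is modelled. The point is that storing
the end of the last reported range, rather than its start, keeps the
position moving forward even if that range was merged with its
neighbour between two steps.

```c
#include <assert.h>
#include <stddef.h>

/* Same numeric values as SENTINEL_VMA_END and SENTINEL_VMA_GATE. */
#define WALK_POS_END	(-1LL)
#define WALK_POS_GATE	(-2LL)

/* Hypothetical stand-in for a vma: [start, end) */
struct walk_range {
	long long start;
	long long end;	/* exclusive */
};

/*
 * One step of the walk. ranges must be sorted and non-overlapping,
 * gate may be NULL if there is no gate range. Returns the next range
 * to report, or NULL when there is nothing more to report.
 */
static const struct walk_range *walk_step(const struct walk_range *ranges,
					  size_t count,
					  const struct walk_range *gate,
					  long long *pos)
{
	assert(pos != NULL);

	/* The walk has already finished. */
	if (*pos == WALK_POS_END)
		return NULL;

	/* The gate range was the last thing reported. */
	if (*pos == WALK_POS_GATE) {
		*pos = WALK_POS_END;
		return NULL;
	}

	/* Report the first range ending above pos and remember its end. */
	for (size_t i = 0; i < count; i++) {
		if (ranges[i].end > *pos) {
			*pos = ranges[i].end;
			return &ranges[i];
		}
	}

	/* No regular range is left, so the gate range comes last. */
	*pos = WALK_POS_GATE;
	return gate;
}
```

If the first of two adjacent ranges is reported and the two are then
merged, the next walk_step() call finds the merged range with the same
start but still advances the position to its new end.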