From nobody Wed Oct 8 20:02:01 2025 Received: from mail-pf1-f201.google.com (mail-pf1-f201.google.com [209.85.210.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 639C92E764A for ; Tue, 24 Jun 2025 19:34:05 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.210.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1750793647; cv=none; b=oeo2oYCkMXodsZMxrw9us6QJgwMzQl2Ql8NRSloWeiR+OX3Fv1bvNE0JSd6FrR98ddIO1o1wVR/nMWF3sn5qPZXMs/2ZNUO6uqFGAD2AkxR+pkek21XmNh2H30x8fje/UavzMsAClCl68YxDYy8d1KC6luIUh9OBzBTMCblsIhQ= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1750793647; c=relaxed/simple; bh=i0PAR0ouH8mexagArKlIB1LJY+XP42mLP7kgSqvPHxE=; h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type; b=oa5bW1CaZh/X6dCgWzdVWUQ4Q7Ixq90TGxkbbNhg7lg1SBoQXzctcC1gz1wnZEYW3B3hq2I7281wfKkuDZxOihxeqS+bFnxnZqiotZkfuPPqVQFznh4Kn5KS9jRP/omWvU8B2o+uDyCDQVyziIIH/2kYbE8WrN9zEkIAgJZN1Tg= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--surenb.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=Q3IYIkCu; arc=none smtp.client-ip=209.85.210.201 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--surenb.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="Q3IYIkCu" Received: by mail-pf1-f201.google.com with SMTP id d2e1a72fcca58-748b4d5c045so4470352b3a.1 for ; Tue, 24 Jun 2025 12:34:05 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1750793645; x=1751398445; 
darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=BeEKEjYjVxEUNIjcbbAJOoiOTSUHH2co8aPuVjJv1wU=; b=Q3IYIkCud0y0hALfIJArVE/MdF0S3NmRNXQU8e92l5WAncTXdUXobEBSfgRcEiWow2 AB8uDPYp1iijZ8EUM/FggTZFwW8KZbicKThT/ZrMafZ8GZAzxAmGINFcT/+3F2eofyl5 YcvAT7EiOqACj3xMc/hZu+rWlgL5mrvId7RhysNw8PXWLGZ6tZJq7+hE53U4kFjXCce1 fuJvzlOLk9fidvmeslHOWuCzatKnpje7Sl7aLMswo7Jafhm1jwFgHQ4PyB9z1Z7Dlybt YLrU3wluxVUGi3ls+AdK55NYe+HePDisV8y+3ePORhRNSOi+E3bOZ458DfMAj87Ubiic /s7g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1750793645; x=1751398445; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=BeEKEjYjVxEUNIjcbbAJOoiOTSUHH2co8aPuVjJv1wU=; b=Kk45tW+FpAftEyRP22xxoVJ8RC+Ti9l2cPL7nWBhyT3uyvIjj5q1hEIeybodPkosIg AQFNnQ3NI4QR+UlJ5IrM2wIuW9Yd6YmQ4vF+zkwCYOg+1DcA0OqFFdUUvo51oYxcVjAf sq1I5Mo6ssZE66e2Lt8P6HoLJ0Nuuw4228+Qes17ZibshSJyI6y8Ymj0zVdsfVuD3Xqn HjLfgSoUj2CY7Qja8wcTZMhXGcPJm09fqnWvSNYL86fuAtFqBcfS5whAxVQJlu8+zwHp 0g6ZwIOFuj4ANJiNvgA7lzShLEFyGfPxtj4O1B/KMsSTTvOS5vb6iztawzvDXb9V2DX4 4OHg== X-Forwarded-Encrypted: i=1; AJvYcCW8Hzl4yP8f8f5Rw6oO3DJ5bUxaHtNoQFy0kRXI5Vn6j1erQnF4KQ2UorwP8dSQia21+XRQf+HY11hniZ8=@vger.kernel.org X-Gm-Message-State: AOJu0Yy6LolIMfIoU9Cs7NjGfXZW4r2W9O+VNDiyJ40+X7QVMyEwGpam Ntx3btiHxM3zhvKUhW7g5DDcclF7/71B5OKN3xtplGsRE88LTI9f4kubRWtVOgo1bH3wsy1Mrqi VBhoqtg== X-Google-Smtp-Source: AGHT+IEgCjAq9juT5ySc6JmV7oHzO4880HbobM8qMc/PLoiGP0YoWDbXfdbACCK9q6rB9NqgbQp1XxC8b50= X-Received: from pfkh17.prod.google.com ([2002:a05:6a00:11:b0:746:2897:67e3]) (user=surenb job=prod-delivery.src-stubby-dispatcher) by 2002:a05:6a21:9993:b0:217:4f95:6a51 with SMTP id adf61e73a8af0-2207f28f862mr303921637.29.1750793644735; Tue, 24 Jun 2025 12:34:04 -0700 (PDT) Date: Tue, 24 Jun 2025 12:33:53 -0700 In-Reply-To: <20250624193359.3865351-1-surenb@google.com> Precedence: bulk 
X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20250624193359.3865351-1-surenb@google.com> X-Mailer: git-send-email 2.50.0.714.g196bf9f422-goog Message-ID: <20250624193359.3865351-2-surenb@google.com> Subject: [PATCH v5 1/7] selftests/proc: add /proc/pid/maps tearing from vma split test From: Suren Baghdasaryan To: akpm@linux-foundation.org Cc: Liam.Howlett@oracle.com, lorenzo.stoakes@oracle.com, david@redhat.com, vbabka@suse.cz, peterx@redhat.com, jannh@google.com, hannes@cmpxchg.org, mhocko@kernel.org, paulmck@kernel.org, shuah@kernel.org, adobriyan@gmail.com, brauner@kernel.org, josef@toxicpanda.com, yebin10@huawei.com, linux@weissschuh.net, willy@infradead.org, osalvador@suse.de, andrii@kernel.org, ryan.roberts@arm.com, christophe.leroy@csgroup.eu, tjmercier@google.com, kaleshsingh@google.com, linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-kselftest@vger.kernel.org, surenb@google.com Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" The /proc/pid/maps file is generated page by page, with the mmap_lock released between pages. This can lead to inconsistent reads if the underlying vmas are concurrently modified. For instance, if a vma split or merge occurs at a page boundary while /proc/pid/maps is being read, the same vma might be seen twice: once before and once after the change. This duplication is considered acceptable for userspace handling. However, observing a "hole" where a vma should be (e.g., due to a vma being replaced and the space temporarily being empty) is unacceptable. Implement a test that: 1. Forks a child process which continuously modifies its address space, specifically targeting a vma at the boundary between two pages. 2. The parent process repeatedly reads the child's /proc/pid/maps. 3. 
The parent process checks the last vma of the first page and the first vma of the second page for consistency, looking for the effects of vma splits or merges. The test duration is configurable via the -d command-line parameter in seconds to increase the likelihood of catching the race condition. The default test duration is 5 seconds. Example Command: proc-pid-vm -d 10 Signed-off-by: Suren Baghdasaryan --- tools/testing/selftests/proc/proc-pid-vm.c | 430 ++++++++++++++++++++- 1 file changed, 429 insertions(+), 1 deletion(-) diff --git a/tools/testing/selftests/proc/proc-pid-vm.c b/tools/testing/sel= ftests/proc/proc-pid-vm.c index d04685771952..6e3f06376a1f 100644 --- a/tools/testing/selftests/proc/proc-pid-vm.c +++ b/tools/testing/selftests/proc/proc-pid-vm.c @@ -27,6 +27,7 @@ #undef NDEBUG #include #include +#include #include #include #include @@ -34,6 +35,7 @@ #include #include #include +#include #include #include #include @@ -70,6 +72,8 @@ static void make_private_tmp(void) } } =20 +static unsigned long test_duration_sec =3D 5UL; +static int page_size; static pid_t pid =3D -1; static void ate(void) { @@ -281,11 +285,431 @@ static void vsyscall(void) } } =20 -int main(void) +/* /proc/pid/maps parsing routines */ +struct page_content { + char *data; + ssize_t size; +}; + +#define LINE_MAX_SIZE 256 + +struct line_content { + char text[LINE_MAX_SIZE]; + unsigned long start_addr; + unsigned long end_addr; +}; + +static void read_two_pages(int maps_fd, struct page_content *page1, + struct page_content *page2) +{ + ssize_t bytes_read; + + assert(lseek(maps_fd, 0, SEEK_SET) >=3D 0); + bytes_read =3D read(maps_fd, page1->data, page_size); + assert(bytes_read > 0 && bytes_read < page_size); + page1->size =3D bytes_read; + + bytes_read =3D read(maps_fd, page2->data, page_size); + assert(bytes_read > 0 && bytes_read < page_size); + page2->size =3D bytes_read; +} + +static void copy_first_line(struct page_content *page, char *first_line) +{ + char *pos =3D 
strchr(page->data, '\n'); + + strncpy(first_line, page->data, pos - page->data); + first_line[pos - page->data] =3D '\0'; +} + +static void copy_last_line(struct page_content *page, char *last_line) +{ + /* Get the last line in the first page */ + const char *end =3D page->data + page->size - 1; + /* skip last newline */ + const char *pos =3D end - 1; + + /* search previous newline */ + while (pos[-1] !=3D '\n') + pos--; + strncpy(last_line, pos, end - pos); + last_line[end - pos] =3D '\0'; +} + +/* Read the last line of the first page and the first line of the second p= age */ +static void read_boundary_lines(int maps_fd, struct page_content *page1, + struct page_content *page2, + struct line_content *last_line, + struct line_content *first_line) +{ + read_two_pages(maps_fd, page1, page2); + + copy_last_line(page1, last_line->text); + copy_first_line(page2, first_line->text); + + assert(sscanf(last_line->text, "%lx-%lx", &last_line->start_addr, + &last_line->end_addr) =3D=3D 2); + assert(sscanf(first_line->text, "%lx-%lx", &first_line->start_addr, + &first_line->end_addr) =3D=3D 2); +} + +/* Thread synchronization routines */ +enum test_state { + INIT, + CHILD_READY, + PARENT_READY, + SETUP_READY, + SETUP_MODIFY_MAPS, + SETUP_MAPS_MODIFIED, + SETUP_RESTORE_MAPS, + SETUP_MAPS_RESTORED, + TEST_READY, + TEST_DONE, +}; + +struct vma_modifier_info; + +typedef void (*vma_modifier_op)(const struct vma_modifier_info *mod_info); +typedef void (*vma_mod_result_check_op)(struct line_content *mod_last_line, + struct line_content *mod_first_line, + struct line_content *restored_last_line, + struct line_content *restored_first_line); + +struct vma_modifier_info { + int vma_count; + void *addr; + int prot; + void *next_addr; + vma_modifier_op vma_modify; + vma_modifier_op vma_restore; + vma_mod_result_check_op vma_mod_check; + pthread_mutex_t sync_lock; + pthread_cond_t sync_cond; + enum test_state curr_state; + bool exit; + void *child_mapped_addr[]; +}; + +static void 
wait_for_state(struct vma_modifier_info *mod_info, enum test_s= tate state) +{ + pthread_mutex_lock(&mod_info->sync_lock); + while (mod_info->curr_state !=3D state) + pthread_cond_wait(&mod_info->sync_cond, &mod_info->sync_lock); + pthread_mutex_unlock(&mod_info->sync_lock); +} + +static void signal_state(struct vma_modifier_info *mod_info, enum test_sta= te state) +{ + pthread_mutex_lock(&mod_info->sync_lock); + mod_info->curr_state =3D state; + pthread_cond_signal(&mod_info->sync_cond); + pthread_mutex_unlock(&mod_info->sync_lock); +} + +/* VMA modification routines */ +static void *child_vma_modifier(struct vma_modifier_info *mod_info) +{ + int prot =3D PROT_READ | PROT_WRITE; + int i; + + for (i =3D 0; i < mod_info->vma_count; i++) { + mod_info->child_mapped_addr[i] =3D mmap(NULL, page_size * 3, prot, + MAP_PRIVATE | MAP_ANONYMOUS, -1, 0); + assert(mod_info->child_mapped_addr[i] !=3D MAP_FAILED); + /* change protection in adjacent maps to prevent merging */ + prot ^=3D PROT_WRITE; + } + signal_state(mod_info, CHILD_READY); + wait_for_state(mod_info, PARENT_READY); + while (true) { + signal_state(mod_info, SETUP_READY); + wait_for_state(mod_info, SETUP_MODIFY_MAPS); + if (mod_info->exit) + break; + + mod_info->vma_modify(mod_info); + signal_state(mod_info, SETUP_MAPS_MODIFIED); + wait_for_state(mod_info, SETUP_RESTORE_MAPS); + mod_info->vma_restore(mod_info); + signal_state(mod_info, SETUP_MAPS_RESTORED); + + wait_for_state(mod_info, TEST_READY); + while (mod_info->curr_state !=3D TEST_DONE) { + mod_info->vma_modify(mod_info); + mod_info->vma_restore(mod_info); + } + } + for (i =3D 0; i < mod_info->vma_count; i++) + munmap(mod_info->child_mapped_addr[i], page_size * 3); + + return NULL; +} + +static void stop_vma_modifier(struct vma_modifier_info *mod_info) +{ + wait_for_state(mod_info, SETUP_READY); + mod_info->exit =3D true; + signal_state(mod_info, SETUP_MODIFY_MAPS); +} + +static void capture_mod_pattern(int maps_fd, + struct vma_modifier_info *mod_info, + 
struct page_content *page1, + struct page_content *page2, + struct line_content *last_line, + struct line_content *first_line, + struct line_content *mod_last_line, + struct line_content *mod_first_line, + struct line_content *restored_last_line, + struct line_content *restored_first_line) +{ + signal_state(mod_info, SETUP_MODIFY_MAPS); + wait_for_state(mod_info, SETUP_MAPS_MODIFIED); + + /* Copy last line of the first page and first line of the second page */ + read_boundary_lines(maps_fd, page1, page2, mod_last_line, mod_first_line); + + signal_state(mod_info, SETUP_RESTORE_MAPS); + wait_for_state(mod_info, SETUP_MAPS_RESTORED); + + /* Copy last line of the first page and first line of the second page */ + read_boundary_lines(maps_fd, page1, page2, restored_last_line, restored_f= irst_line); + + mod_info->vma_mod_check(mod_last_line, mod_first_line, + restored_last_line, restored_first_line); + + /* + * The content of these lines after modify+restore should be the same + * as the original. 
+ */ + assert(strcmp(restored_last_line->text, last_line->text) =3D=3D 0); + assert(strcmp(restored_first_line->text, first_line->text) =3D=3D 0); +} + +static inline void split_vma(const struct vma_modifier_info *mod_info) +{ + assert(mmap(mod_info->addr, page_size, mod_info->prot | PROT_EXEC, + MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, + -1, 0) !=3D MAP_FAILED); +} + +static inline void merge_vma(const struct vma_modifier_info *mod_info) +{ + assert(mmap(mod_info->addr, page_size, mod_info->prot, + MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, + -1, 0) !=3D MAP_FAILED); +} + +static inline void check_split_result(struct line_content *mod_last_line, + struct line_content *mod_first_line, + struct line_content *restored_last_line, + struct line_content *restored_first_line) +{ + /* Make sure vmas at the boundaries are changing */ + assert(strcmp(mod_last_line->text, restored_last_line->text) !=3D 0); + assert(strcmp(mod_first_line->text, restored_first_line->text) !=3D 0); +} + +static void test_maps_tearing_from_split(int maps_fd, + struct vma_modifier_info *mod_info, + struct page_content *page1, + struct page_content *page2, + struct line_content *last_line, + struct line_content *first_line) +{ + struct line_content split_last_line; + struct line_content split_first_line; + struct line_content restored_last_line; + struct line_content restored_first_line; + + wait_for_state(mod_info, SETUP_READY); + + /* re-read the file to avoid using stale data from previous test */ + read_boundary_lines(maps_fd, page1, page2, last_line, first_line); + + mod_info->vma_modify =3D split_vma; + mod_info->vma_restore =3D merge_vma; + mod_info->vma_mod_check =3D check_split_result; + + capture_mod_pattern(maps_fd, mod_info, page1, page2, last_line, first_lin= e, + &split_last_line, &split_first_line, + &restored_last_line, &restored_first_line); + + /* Now start concurrent modifications for test_duration_sec */ + signal_state(mod_info, TEST_READY); + + struct line_content new_last_line; 
+ struct line_content new_first_line; + struct timespec start_ts, end_ts; + + clock_gettime(CLOCK_MONOTONIC_COARSE, &start_ts); + do { + bool last_line_changed; + bool first_line_changed; + + read_boundary_lines(maps_fd, page1, page2, &new_last_line, &new_first_li= ne); + + /* Check if we read vmas after split */ + if (!strcmp(new_last_line.text, split_last_line.text)) { + /* + * The vmas should be consistent with split results, + * however if vma was concurrently restored after a + * split, it can be reported twice (first the original + * split one, then the same vma but extended after the + * merge) because we found it as the next vma again. + * In that case new first line will be the same as the + * last restored line. + */ + assert(!strcmp(new_first_line.text, split_first_line.text) || + !strcmp(new_first_line.text, restored_last_line.text)); + } else { + /* The vmas should be consistent with merge results */ + assert(!strcmp(new_last_line.text, restored_last_line.text) && + !strcmp(new_first_line.text, restored_first_line.text)); + } + /* + * First and last lines should change in unison. If the last + * line changed then the first line should change as well and + * vice versa. + */ + last_line_changed =3D strcmp(new_last_line.text, last_line->text) !=3D 0; + first_line_changed =3D strcmp(new_first_line.text, first_line->text) != =3D 0; + assert(last_line_changed =3D=3D first_line_changed); + + clock_gettime(CLOCK_MONOTONIC_COARSE, &end_ts); + } while (end_ts.tv_sec - start_ts.tv_sec < test_duration_sec); + + /* Signal the modifier thread to stop and wait until it exits */ + signal_state(mod_info, TEST_DONE); +} + +static int test_maps_tearing(void) +{ + struct vma_modifier_info *mod_info; + pthread_mutexattr_t mutex_attr; + pthread_condattr_t cond_attr; + int shared_mem_size; + char fname[32]; + int vma_count; + int maps_fd; + int status; + pid_t pid; + + /* + * Have to map enough vmas for /proc/pid/maps to contain more than one + * page worth of vmas. 
Assume at least 32 bytes per line in maps output + */ + vma_count =3D page_size / 32 + 1; + shared_mem_size =3D sizeof(struct vma_modifier_info) + vma_count * sizeof= (void *); + + /* map shared memory for communication with the child process */ + mod_info =3D (struct vma_modifier_info *)mmap(NULL, shared_mem_size, + PROT_READ | PROT_WRITE, MAP_SHARED | MAP_ANONYMOUS, -1, 0); + + assert(mod_info !=3D MAP_FAILED); + + /* Initialize shared members */ + pthread_mutexattr_init(&mutex_attr); + pthread_mutexattr_setpshared(&mutex_attr, PTHREAD_PROCESS_SHARED); + assert(!pthread_mutex_init(&mod_info->sync_lock, &mutex_attr)); + pthread_condattr_init(&cond_attr); + pthread_condattr_setpshared(&cond_attr, PTHREAD_PROCESS_SHARED); + assert(!pthread_cond_init(&mod_info->sync_cond, &cond_attr)); + mod_info->vma_count =3D vma_count; + mod_info->curr_state =3D INIT; + mod_info->exit =3D false; + + pid =3D fork(); + if (!pid) { + /* Child process */ + child_vma_modifier(mod_info); + return 0; + } + + sprintf(fname, "/proc/%d/maps", pid); + maps_fd =3D open(fname, O_RDONLY); + assert(maps_fd !=3D -1); + + /* Wait for the child to map the VMAs */ + wait_for_state(mod_info, CHILD_READY); + + /* Read first two pages */ + struct page_content page1; + struct page_content page2; + + page1.data =3D malloc(page_size); + assert(page1.data); + page2.data =3D malloc(page_size); + assert(page2.data); + + struct line_content last_line; + struct line_content first_line; + + read_boundary_lines(maps_fd, &page1, &page2, &last_line, &first_line); + + /* + * Find the addresses corresponding to the last line in the first page + * and the first line in the last page. 
+ */ + mod_info->addr =3D NULL; + mod_info->next_addr =3D NULL; + for (int i =3D 0; i < mod_info->vma_count; i++) { + if (mod_info->child_mapped_addr[i] =3D=3D (void *)last_line.start_addr) { + mod_info->addr =3D mod_info->child_mapped_addr[i]; + mod_info->prot =3D PROT_READ; + /* Even VMAs have write permission */ + if ((i % 2) =3D=3D 0) + mod_info->prot |=3D PROT_WRITE; + } else if (mod_info->child_mapped_addr[i] =3D=3D (void *)first_line.star= t_addr) { + mod_info->next_addr =3D mod_info->child_mapped_addr[i]; + } + + if (mod_info->addr && mod_info->next_addr) + break; + } + assert(mod_info->addr && mod_info->next_addr); + + signal_state(mod_info, PARENT_READY); + + test_maps_tearing_from_split(maps_fd, mod_info, &page1, &page2, + &last_line, &first_line); + + stop_vma_modifier(mod_info); + + free(page2.data); + free(page1.data); + + for (int i =3D 0; i < vma_count; i++) + munmap(mod_info->child_mapped_addr[i], page_size); + close(maps_fd); + waitpid(pid, &status, 0); + munmap(mod_info, shared_mem_size); + + return 0; +} + +int usage(void) +{ + fprintf(stderr, "Userland /proc/pid/{s}maps test cases\n"); + fprintf(stderr, " -d: Duration for time-consuming tests\n"); + fprintf(stderr, " -h: Help screen\n"); + exit(-1); +} + +int main(int argc, char **argv) { int pipefd[2]; int exec_fd; + int opt; + + while ((opt =3D getopt(argc, argv, "d:h")) !=3D -1) { + if (opt =3D=3D 'd') + test_duration_sec =3D strtoul(optarg, NULL, 0); + else if (opt =3D=3D 'h') + usage(); + } =20 + page_size =3D sysconf(_SC_PAGESIZE); vsyscall(); switch (g_vsyscall) { case 0: @@ -578,6 +1002,10 @@ int main(void) assert(err =3D=3D -ENOENT); } =20 + /* Test tearing in /proc/$PID/maps */ + if (test_maps_tearing()) + return 1; + return 0; } #else --=20 2.50.0.714.g196bf9f422-goog From nobody Wed Oct 8 20:02:01 2025 Received: from mail-pf1-f201.google.com (mail-pf1-f201.google.com [209.85.210.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate 
requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A19D72E7631 for ; Tue, 24 Jun 2025 19:34:07 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.210.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1750793650; cv=none; b=nlYp6LFD3I9qQceD8gtEFsTbClI5ZwOUlUonh+SAAX9c7aCnQG7WiosXXn/aIHdx+bG3IuNU5WdmUkfdNGBHnVqoJ1pDQdKnj23eA3Y1Kz+cQpn955cusMiJ8zFEGIkF2iUW32GeruOCuZDLgfq9m6/fwMhG9iWhyU6Wc1lgp80= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1750793650; c=relaxed/simple; bh=6N56cO+mEI93aCD929OWXgdVDQmRt6u/f2iRrSkiXBY=; h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type; b=UNS51So29ewngAhrHgBC+GGO3r9cidTWSOS4mCJfVsYOHXw3MKD5GHR8nvIpMN+XRRxmZFjG1OIcuM3u/TrFMHmScfS4dU5dmUJ68ZMo3adHoW5dFk+waZMj21fJUYGoiPKDiu0P4juuD9Vt1gPSx9kwdyT3f8Vomwo+zMZBbaM= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--surenb.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=tk6JcZJL; arc=none smtp.client-ip=209.85.210.201 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--surenb.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="tk6JcZJL" Received: by mail-pf1-f201.google.com with SMTP id d2e1a72fcca58-748f13ef248so4366778b3a.3 for ; Tue, 24 Jun 2025 12:34:07 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1750793647; x=1751398447; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=UC2aFr+2hBpo4c8OQj6ulNMqnFjS9I9wfu23t0p6gWo=; 
b=tk6JcZJLly3FgBTmdsemuIUDIl7dxcOpbtvOW6o0b+rh/CpFrYJXAwCvXoFhntDkVV 74ALfeTpNGYWI1BLqtaD9nRF/9nohrsD89N0AgRpBiShYoFZeDkPuXU+kEC7ynAcNKIz D78BjZ2j53bgsOBVB9gRuj9LAHW1/gUaNkf+Yo2Q1EyRNfYL2Fc5swKiFlnUjazcWGi1 /aBke/UaLrt/KaODq9OnUjEmdt38MiahV403+BJbSqO4cuQbgBXxrakwpaWxiAPToRvK 2VanKTly/kRzdvWjdcsCSER9C07YUac6DJsCpQU1wl1QweDO5MGkqfdPNMfU8g6yX6KL nzyQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1750793647; x=1751398447; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=UC2aFr+2hBpo4c8OQj6ulNMqnFjS9I9wfu23t0p6gWo=; b=qk4suVcV4q3eHYlNh2xgkDmolv2CJh/rB2+7apXK1fWZB8UAzO32XnvB1cyFaV2tF4 bNFDWPWkfsZJJkgfyCQfPLKz+ZbM8yrZkGWstroGy+tBmW6c2tJ9ElufFuv5MXfTqYSY /LO8XVJ2nBJPWZG8SvIpETy/r2/3wPpPbFI4wb8Gxj+Jx8FpK4qW0NxGA1+V09hvVFdP IfeDTgZyjmjGqlGhUfk5tvzWWS4P0+SGmD/AOh/UFuACfNhmgt3hhQ+jIh78Uud/oiaI wi7uZsjdlQnTnuF6LEseks2oHnhgrZ9zc2g+ggAypXheCBDNtPEgoDz60h8QYzQ/+d+5 7bJA== X-Forwarded-Encrypted: i=1; AJvYcCXJfPK1KfxWtZF28BWAHruqY8XH4sBw89CHZ0rSpjZnMyxGRnZmiedoWVZwS+O9SzurWJA56xktsxUa7lk=@vger.kernel.org X-Gm-Message-State: AOJu0YztyTWGvC5CnP7Op6pcBSv6TXfk8+vGvKn1o+EqdxoEZKpSEN7Q BhL092bRVKNldmBbKfHckPPDt/XH989Y5bvEE3yO1fOTm8J7fCBHYw4lTSYxjtLa5I681v8oeMh Z1uttUA== X-Google-Smtp-Source: AGHT+IGXFGZq1eG2J9AB8OK4tRQIFjL7wLANvEFPeXTMgxcyIvcd791KpwM8NoFdIHzomkbPKUWUDTzIDUk= X-Received: from pfwy15.prod.google.com ([2002:a05:6a00:1c8f:b0:739:8cd6:c16c]) (user=surenb job=prod-delivery.src-stubby-dispatcher) by 2002:a05:6a00:124f:b0:746:1e35:3307 with SMTP id d2e1a72fcca58-74ad44fa9e5mr508606b3a.14.1750793646830; Tue, 24 Jun 2025 12:34:06 -0700 (PDT) Date: Tue, 24 Jun 2025 12:33:54 -0700 In-Reply-To: <20250624193359.3865351-1-surenb@google.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20250624193359.3865351-1-surenb@google.com> X-Mailer: git-send-email 
2.50.0.714.g196bf9f422-goog Message-ID: <20250624193359.3865351-3-surenb@google.com> Subject: [PATCH v5 2/7] selftests/proc: extend /proc/pid/maps tearing test to include vma resizing From: Suren Baghdasaryan To: akpm@linux-foundation.org Cc: Liam.Howlett@oracle.com, lorenzo.stoakes@oracle.com, david@redhat.com, vbabka@suse.cz, peterx@redhat.com, jannh@google.com, hannes@cmpxchg.org, mhocko@kernel.org, paulmck@kernel.org, shuah@kernel.org, adobriyan@gmail.com, brauner@kernel.org, josef@toxicpanda.com, yebin10@huawei.com, linux@weissschuh.net, willy@infradead.org, osalvador@suse.de, andrii@kernel.org, ryan.roberts@arm.com, christophe.leroy@csgroup.eu, tjmercier@google.com, kaleshsingh@google.com, linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-kselftest@vger.kernel.org, surenb@google.com Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Test that /proc/pid/maps does not report unexpected holes in the address space when a vma at the edge of the page is being concurrently remapped. This remapping results in the vma shrinking and expanding from under the reader. We should always see either the shrunk or the expanded (original) version of the vma. 
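For illustration only (this is not part of the patch), the acceptance rule described above can be sketched as a standalone helper. The names boundary_view and resize_view_is_consistent are hypothetical; the comparisons mirror the ones the test performs on the last line of the first page and the first line of the second page.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical snapshot of the two boundary lines of a /proc/pid/maps read. */
struct boundary_view {
	const char *last_line;	/* last line of the first page */
	const char *first_line;	/* first line of the second page */
};

/*
 * Return true if the observed pair matches one of the allowed outcomes:
 * the shrunk state, the restored state, or the shrunk vma followed by
 * its restored version (the same vma reported twice).
 */
static bool resize_view_is_consistent(const struct boundary_view *seen,
				      const struct boundary_view *shrunk,
				      const struct boundary_view *restored)
{
	assert(seen && shrunk && restored);

	if (!strcmp(seen->last_line, shrunk->last_line))
		return !strcmp(seen->first_line, shrunk->first_line) ||
		       !strcmp(seen->first_line, restored->last_line);

	return !strcmp(seen->last_line, restored->last_line) &&
	       !strcmp(seen->first_line, restored->first_line);
}
```

A reader loop would call this once per read with the snapshots captured during setup; any other combination of lines would indicate a torn view.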
Signed-off-by: Suren Baghdasaryan --- tools/testing/selftests/proc/proc-pid-vm.c | 83 ++++++++++++++++++++++ 1 file changed, 83 insertions(+) diff --git a/tools/testing/selftests/proc/proc-pid-vm.c b/tools/testing/sel= ftests/proc/proc-pid-vm.c index 6e3f06376a1f..39842e4ec45f 100644 --- a/tools/testing/selftests/proc/proc-pid-vm.c +++ b/tools/testing/selftests/proc/proc-pid-vm.c @@ -583,6 +583,86 @@ static void test_maps_tearing_from_split(int maps_fd, signal_state(mod_info, TEST_DONE); } =20 +static inline void shrink_vma(const struct vma_modifier_info *mod_info) +{ + assert(mremap(mod_info->addr, page_size * 3, page_size, 0) !=3D MAP_FAILE= D); +} + +static inline void expand_vma(const struct vma_modifier_info *mod_info) +{ + assert(mremap(mod_info->addr, page_size, page_size * 3, 0) !=3D MAP_FAILE= D); +} + +static inline void check_shrink_result(struct line_content *mod_last_line, + struct line_content *mod_first_line, + struct line_content *restored_last_line, + struct line_content *restored_first_line) +{ + /* Make sure only the last vma of the first page is changing */ + assert(strcmp(mod_last_line->text, restored_last_line->text) !=3D 0); + assert(strcmp(mod_first_line->text, restored_first_line->text) =3D=3D 0); +} + +static void test_maps_tearing_from_resize(int maps_fd, + struct vma_modifier_info *mod_info, + struct page_content *page1, + struct page_content *page2, + struct line_content *last_line, + struct line_content *first_line) +{ + struct line_content shrunk_last_line; + struct line_content shrunk_first_line; + struct line_content restored_last_line; + struct line_content restored_first_line; + + wait_for_state(mod_info, SETUP_READY); + + /* re-read the file to avoid using stale data from previous test */ + read_boundary_lines(maps_fd, page1, page2, last_line, first_line); + + mod_info->vma_modify =3D shrink_vma; + mod_info->vma_restore =3D expand_vma; + mod_info->vma_mod_check =3D check_shrink_result; + + capture_mod_pattern(maps_fd, mod_info, 
page1, page2, last_line, first_lin= e, + &shrunk_last_line, &shrunk_first_line, + &restored_last_line, &restored_first_line); + + /* Now start concurrent modifications for test_duration_sec */ + signal_state(mod_info, TEST_READY); + + struct line_content new_last_line; + struct line_content new_first_line; + struct timespec start_ts, end_ts; + + clock_gettime(CLOCK_MONOTONIC_COARSE, &start_ts); + do { + read_boundary_lines(maps_fd, page1, page2, &new_last_line, &new_first_li= ne); + + /* Check if we read the vma after it was shrunk */ + if (!strcmp(new_last_line.text, shrunk_last_line.text)) { + /* + * The vmas should be consistent with shrunk results, + * however if the vma was concurrently restored, it + * can be reported twice (first as shrunk one, then + * as restored one) because we found it as the next vma + * again. In that case new first line will be the same + * as the last restored line. + */ + assert(!strcmp(new_first_line.text, shrunk_first_line.text) || + !strcmp(new_first_line.text, restored_last_line.text)); + } else { + /* The vmas should be consistent with the original/restored state */ + assert(!strcmp(new_last_line.text, restored_last_line.text) && + !strcmp(new_first_line.text, restored_first_line.text)); + } + clock_gettime(CLOCK_MONOTONIC_COARSE, &end_ts); + } while (end_ts.tv_sec - start_ts.tv_sec < test_duration_sec); + + /* Signal the modifier thread to stop and wait until it exits */ + signal_state(mod_info, TEST_DONE); +} + static int test_maps_tearing(void) { struct vma_modifier_info *mod_info; @@ -674,6 +754,9 @@ static int test_maps_tearing(void) test_maps_tearing_from_split(maps_fd, mod_info, &page1, &page2, &last_line, &first_line); =20 + test_maps_tearing_from_resize(maps_fd, mod_info, &page1, &page2, + &last_line, &first_line); + stop_vma_modifier(mod_info); =20 free(page2.data); --=20 2.50.0.714.g196bf9f422-goog From nobody Wed Oct 8 20:02:01 2025 Received: from mail-pl1-f202.google.com (mail-pl1-f202.google.com [209.85.214.202]) 
(using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id AC2602E92B7 for ; Tue, 24 Jun 2025 19:34:09 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.214.202 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1750793651; cv=none; b=dxP7nQcWpTWKOoturcCtG2PRU9/jtz/hpiZ3v2bk2KBEUjFWxMlsDy/F6sGrMgac6c3R4M02kLPlE3s8X2d05n/oIDM022nJjuDLQEDw6+kvtnWTcDPr1nODjzFhBhv/A9JSPagASu8SfOVisws9z+FNveOceOIKKLiWhWsfkGA= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1750793651; c=relaxed/simple; bh=CkY0yRlPblz/oHTqqCyzbBSv1yWCABFDXWr+I+GLZGs=; h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type; b=m6KzEKhy824xgmp2bOmgQg+HeFYC9F/M7g4AWvCQJLaqnAPRge08R/6fkoasv5SCZuicxce3ND3uKagZrScw0Klwam1bLF4/pAOjE1A130GrBNBPvuxOT1jT9xFa/8hoDMdYO4F4dQ0iX/A6uaDF0C2MaU8nFYQa3YTvjfXUNgU= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--surenb.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=GHzLRaeQ; arc=none smtp.client-ip=209.85.214.202 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--surenb.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="GHzLRaeQ" Received: by mail-pl1-f202.google.com with SMTP id d9443c01a7336-23507382e64so54067065ad.2 for ; Tue, 24 Jun 2025 12:34:09 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1750793649; x=1751398449; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to 
:date:from:to:cc:subject:date:message-id:reply-to; bh=VTpxqo9rYzmnOlbgrt5I9x/wPv8qMNQnGNX8faPUhlk=; b=GHzLRaeQC0LkJtNgZWa/qpMHTsWpMQINWQ8IhwrlNgMlLXJdlk+JxhFYzKp2ez/zum fWSgsueafYBrAQ2p4mwl8DaSrs5yjqVtjvCvsWt3dyu85K35ofbTRxrQg50mq7E7xf7Y oy6PBLoX2pB7KnBdBbP/cvgcdvDhyxtNw97tEClUz/bfBvtZioa3snqNVawlSkoDy00F M5m8wVZPiKxdzGodIo2mrHjlclVjuwbi3UQgM6vwq4sSeiFoN3WuXgjawxOoALrzpJ6c PPboH49AN1J5Xs9DX/a4j5YY+qdWmJkCrAg0A78CRDeTGftvSktWRO45Fho2w5StJctL sOqg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1750793649; x=1751398449; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=VTpxqo9rYzmnOlbgrt5I9x/wPv8qMNQnGNX8faPUhlk=; b=IE12PYzymyGHs7JxLDf3GECU3+TeyownzY7U/CnKY92iSRV+USQbAd/Gr17bcliENu b0wgQi18XguQLsgswvcOgtJOd+kK0/3pilG1xhPT84fE/7uLeVs7q/BzVwh5loE4omGF IjqCWWHXyUSk7SzJOCvh+RJoXdq5h/1PToHwT2LzX0IBhWEtmczZbBJElIG7Ep1WiONk /oxrawAYG1D4et58xiJPH8bY8C9vqgIdS15S1qH81jprMdGcFi0RCo3E58mAs4p5trGA Wkc94hygB/viBcs0K5KU/VIj/G4zIRjEyEt2+z2m35CcflRxCBA7Dt5MVPitFe03f1Bt ngSA== X-Forwarded-Encrypted: i=1; AJvYcCWwbfb7iYPub+Wy8GVj8xnFnlgPofDqjuhWsmzowfOwiQY+Ca7ERrNJiuXLI2rCqslwl1ChU2MlEBWr21k=@vger.kernel.org X-Gm-Message-State: AOJu0Yzsl+pY9Qsy0bf5OqLo2qR10K0Jdrx5BjD85E5UeuGlVszsibe/ 8m3n12Pn6LTjZA95jEYi0SlUGC6+413Z/TUcYyINrWiK3Blm5wxAZSSNqbyCJa0xuj53+WZ8LW+ vPeWf+A== X-Google-Smtp-Source: AGHT+IEYzVFlaafTK+NJEStVfwqaXhWigt92zShqce3bvq87pwtPR0sp6zBPOudQkJC7xdMwRNzMBfU8jZI= X-Received: from plbkk15.prod.google.com ([2002:a17:903:70f:b0:234:908f:4e22]) (user=surenb job=prod-delivery.src-stubby-dispatcher) by 2002:a17:903:8cd:b0:236:6f43:7047 with SMTP id d9443c01a7336-23823f87ed4mr8275335ad.9.1750793648973; Tue, 24 Jun 2025 12:34:08 -0700 (PDT) Date: Tue, 24 Jun 2025 12:33:55 -0700 In-Reply-To: <20250624193359.3865351-1-surenb@google.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 
1.0 References: <20250624193359.3865351-1-surenb@google.com> X-Mailer: git-send-email 2.50.0.714.g196bf9f422-goog Message-ID: <20250624193359.3865351-4-surenb@google.com> Subject: [PATCH v5 3/7] selftests/proc: extend /proc/pid/maps tearing test to include vma remapping From: Suren Baghdasaryan To: akpm@linux-foundation.org Cc: Liam.Howlett@oracle.com, lorenzo.stoakes@oracle.com, david@redhat.com, vbabka@suse.cz, peterx@redhat.com, jannh@google.com, hannes@cmpxchg.org, mhocko@kernel.org, paulmck@kernel.org, shuah@kernel.org, adobriyan@gmail.com, brauner@kernel.org, josef@toxicpanda.com, yebin10@huawei.com, linux@weissschuh.net, willy@infradead.org, osalvador@suse.de, andrii@kernel.org, ryan.roberts@arm.com, christophe.leroy@csgroup.eu, tjmercier@google.com, kaleshsingh@google.com, linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-kselftest@vger.kernel.org, surenb@google.com Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Test that /proc/pid/maps does not report unexpected holes in the address space when we concurrently remap a part of a vma into the middle of another vma. This remapping splits the destination vma into three parts, after which the middle part is patched back, all done concurrently from under the reader. The reader should always see either the original vma or the split one, with no holes. 
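The invariant stated above can be illustrated with a small standalone check. This is only a sketch loosely modelled on the assertions in the test loop of this patch: struct boundary and boundary_is_consistent() are hypothetical names that do not appear in proc-pid-vm.c, and the sample maps lines used with it are made up.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical pair of lines read around the page boundary. */
struct boundary {
	const char *last;	/* last line of the first page */
	const char *first;	/* first line of the second page */
};

/*
 * Return true if the lines the reader saw match either the remapped
 * (split) state or the restored (original) state.
 */
static bool boundary_is_consistent(const struct boundary *seen,
				   const struct boundary *remapped,
				   const struct boundary *restored)
{
	assert(seen && remapped && restored);

	if (!strcmp(seen->last, remapped->last)) {
		/*
		 * Split state. If the vma was restored while we were
		 * reading, it may be reported a second time as the next
		 * vma, so the first line may equal the restored last line.
		 */
		return !strcmp(seen->first, remapped->first) ||
		       !strcmp(seen->first, restored->last);
	}

	/* Otherwise both lines must match the restored state. */
	return !strcmp(seen->last, restored->last) &&
	       !strcmp(seen->first, restored->first);
}
```

A reader that sees the restored last line together with the remapped first line would be rejected by this check, which is the kind of tearing the test is looking for.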
Signed-off-by: Suren Baghdasaryan --- tools/testing/selftests/proc/proc-pid-vm.c | 92 ++++++++++++++++++++++ 1 file changed, 92 insertions(+) diff --git a/tools/testing/selftests/proc/proc-pid-vm.c b/tools/testing/sel= ftests/proc/proc-pid-vm.c index 39842e4ec45f..1aef2db7e893 100644 --- a/tools/testing/selftests/proc/proc-pid-vm.c +++ b/tools/testing/selftests/proc/proc-pid-vm.c @@ -663,6 +663,95 @@ static void test_maps_tearing_from_resize(int maps_fd, signal_state(mod_info, TEST_DONE); } =20 +static inline void remap_vma(const struct vma_modifier_info *mod_info) +{ + /* + * Remap the last page of the next vma into the middle of the vma. + * This splits the current vma and the first and middle parts (the + * parts at lower addresses) become the last vma objserved in the + * first page and the first vma observed in the last page. + */ + assert(mremap(mod_info->next_addr + page_size * 2, page_size, + page_size, MREMAP_FIXED | MREMAP_MAYMOVE | MREMAP_DONTUNMAP, + mod_info->addr + page_size) !=3D MAP_FAILED); +} + +static inline void patch_vma(const struct vma_modifier_info *mod_info) +{ + assert(!mprotect(mod_info->addr + page_size, page_size, + mod_info->prot)); +} + +static inline void check_remap_result(struct line_content *mod_last_line, + struct line_content *mod_first_line, + struct line_content *restored_last_line, + struct line_content *restored_first_line) +{ + /* Make sure vmas at the boundaries are changing */ + assert(strcmp(mod_last_line->text, restored_last_line->text) !=3D 0); + assert(strcmp(mod_first_line->text, restored_first_line->text) !=3D 0); +} + +static void test_maps_tearing_from_remap(int maps_fd, + struct vma_modifier_info *mod_info, + struct page_content *page1, + struct page_content *page2, + struct line_content *last_line, + struct line_content *first_line) +{ + struct line_content remapped_last_line; + struct line_content remapped_first_line; + struct line_content restored_last_line; + struct line_content restored_first_line; + + 
wait_for_state(mod_info, SETUP_READY); + + /* re-read the file to avoid using stale data from previous test */ + read_boundary_lines(maps_fd, page1, page2, last_line, first_line); + + mod_info->vma_modify =3D remap_vma; + mod_info->vma_restore =3D patch_vma; + mod_info->vma_mod_check =3D check_remap_result; + + capture_mod_pattern(maps_fd, mod_info, page1, page2, last_line, first_lin= e, + &remapped_last_line, &remapped_first_line, + &restored_last_line, &restored_first_line); + + /* Now start concurrent modifications for test_duration_sec */ + signal_state(mod_info, TEST_READY); + + struct line_content new_last_line; + struct line_content new_first_line; + struct timespec start_ts, end_ts; + + clock_gettime(CLOCK_MONOTONIC_COARSE, &start_ts); + do { + read_boundary_lines(maps_fd, page1, page2, &new_last_line, &new_first_li= ne); + + /* Check if we read vmas after remapping it */ + if (!strcmp(new_last_line.text, remapped_last_line.text)) { + /* + * The vmas should be consistent with remap results, + * however if the vma was concurrently restored, it + * can be reported twice (first as split one, then + * as restored one) because we found it as the next vma + * again. In that case new first line will be the same + * as the last restored line. 
+ */ + assert(!strcmp(new_first_line.text, remapped_first_line.text) || + !strcmp(new_first_line.text, restored_last_line.text)); + } else { + /* The vmas should be consistent with the original/resored state */ + assert(!strcmp(new_last_line.text, restored_last_line.text) && + !strcmp(new_first_line.text, restored_first_line.text)); + } + clock_gettime(CLOCK_MONOTONIC_COARSE, &end_ts); + } while (end_ts.tv_sec - start_ts.tv_sec < test_duration_sec); + + /* Signal the modifyer thread to stop and wait until it exits */ + signal_state(mod_info, TEST_DONE); +} + static int test_maps_tearing(void) { struct vma_modifier_info *mod_info; @@ -757,6 +846,9 @@ static int test_maps_tearing(void) test_maps_tearing_from_resize(maps_fd, mod_info, &page1, &page2, &last_line, &first_line); =20 + test_maps_tearing_from_remap(maps_fd, mod_info, &page1, &page2, + &last_line, &first_line); + stop_vma_modifier(mod_info); =20 free(page2.data); --=20 2.50.0.714.g196bf9f422-goog From nobody Wed Oct 8 20:02:01 2025 Received: from mail-pg1-f202.google.com (mail-pg1-f202.google.com [209.85.215.202]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id F0A192EA738 for ; Tue, 24 Jun 2025 19:34:11 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.215.202 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1750793653; cv=none; b=KTzS7ESP42uLS0DvkiTKzSs6I5xrp81Z2wEYvYUF+PeTjifX+e4ENGxqjQU5f9HwXtw0dCWT8drfdkpCtjGv6LWZKkonSS7oUz89f/MDfXp/gYIKGd3Fz2Dztt4fHbX9c5lnC5OTlhqE6/Azy/ZZaEdrbUaUE5HzOcG/Rv0K6XY= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1750793653; c=relaxed/simple; bh=F9xHDiW8Uqm2cMSKOBqvwqb70yMRUXd6tMFuQWLlKP4=; h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type; 
b=nrOchTEoTALUB8Pvrbb6sfeGTt2uPvmCejLOPUa7uNo1BvSE4H8GrvzWEOY5ZOU3XNeAr+8x8h6lvJwphKrvOuxQxAXhh3xkq0lcrzYFTm8vn/pt2iQCgeG8QrPEmZP4eHkHikqRdogz+0RVRORDFXqRIDfeYlO+cOwJ/u8Ene4= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--surenb.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=AbwDHTy4; arc=none smtp.client-ip=209.85.215.202 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--surenb.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="AbwDHTy4" Received: by mail-pg1-f202.google.com with SMTP id 41be03b00d2f7-b2eea1c2e97so571252a12.2 for ; Tue, 24 Jun 2025 12:34:11 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1750793651; x=1751398451; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=yLtEbRv6QyxPjsf9LVAPVV2LoLwOWyeMLnVsJ44IjdU=; b=AbwDHTy4bZnt0JWMsRnTvU9Eok/Y74F983dSPXWbA/8swgF3GX/Uq0SefP1ma6Mu8j QxN+pNUIcwodjVgXYIdUGmtUcR/gqr8JYh7XKW8KNrNYQG0xRrmhAy5rfsDVlnY370qQ OM/gtsIdKdWFqqycBJlSHSOX+1JeisRJ7Gv07WfDfcIKKkQ9qYvrIp0gXVczXkgxqlRy dbUJCkGkaKF5ieYthimCOzCnefv55YecppSbIGjX71AQ9HQ2T68Gv2pvq0HjgH00Kk0B Z+emEUSNiV4uLSCKqT3v954cZOZKLkqbCPnUvbaSFA5kIIy0slXnmhtYzkwHkW+5VzPI URVQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1750793651; x=1751398451; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=yLtEbRv6QyxPjsf9LVAPVV2LoLwOWyeMLnVsJ44IjdU=; b=g/X5nCOUBqHiDbGiqdhT+gnSWa40VXynYU774u/f0EYkfbxRCQcuxUEEkMsQnkEuKM 
QzOtp7iuQgxQfmcaSw11PbzzCXkabm9ZacUgVu1mENgrQfg4Pg4sMXUtN07wV2yq9BaG Q1eG2uEkvcBQVsCoZDwvCy62xycg1Emb/WcWNgyD18y0fRtAmPzdNj6dsihK820RL5MO 0a7ZhPgQOLmJzFScKLzTDxIDwxZFz+IlHXklpb854tlkQWLAoDmZ0/ktoel9Xvlakikv OguB6ttqt0GJ9FKnU6mllKFTj3e1sPA03GO1vWjj76/B5gOycVD0KHUwU+kwH3URH8J5 nI5g== X-Forwarded-Encrypted: i=1; AJvYcCU6NkHudUeBLZ1eN77viOBrK0Uc6/yoQLWsPyNsMCiXfvnVgE2MQbYq4sZY6e83uOhbOzbv6MDfT9+J/XI=@vger.kernel.org X-Gm-Message-State: AOJu0YxiCILcdFr8LJBLj+3cqEuj/RDwMs9mAzqbi2S5f6ZWSVl7b2Tw 8jpQTczu4DS0PtQIQhXEUeMsYGbjrh8RvF5SjdbFsuPvLII817lLJw9JZB1bR/MqTMMpqXLvVrE Zystdkw== X-Google-Smtp-Source: AGHT+IHClWuGmDdB+MsMxoQpT43VMdFBAo96rsx+K87NYCqHXT7J+KynqmIRFatckh9qdl8zl/G773J5x/U= X-Received: from pgww20.prod.google.com ([2002:a05:6a02:2c94:b0:b2c:3d70:9c1]) (user=surenb job=prod-delivery.src-stubby-dispatcher) by 2002:a05:6a20:9d8e:b0:215:e43a:29b9 with SMTP id adf61e73a8af0-2207f2a49ddmr205722637.33.1750793651161; Tue, 24 Jun 2025 12:34:11 -0700 (PDT) Date: Tue, 24 Jun 2025 12:33:56 -0700 In-Reply-To: <20250624193359.3865351-1-surenb@google.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20250624193359.3865351-1-surenb@google.com> X-Mailer: git-send-email 2.50.0.714.g196bf9f422-goog Message-ID: <20250624193359.3865351-5-surenb@google.com> Subject: [PATCH v5 4/7] selftests/proc: test PROCMAP_QUERY ioctl while vma is concurrently modified From: Suren Baghdasaryan To: akpm@linux-foundation.org Cc: Liam.Howlett@oracle.com, lorenzo.stoakes@oracle.com, david@redhat.com, vbabka@suse.cz, peterx@redhat.com, jannh@google.com, hannes@cmpxchg.org, mhocko@kernel.org, paulmck@kernel.org, shuah@kernel.org, adobriyan@gmail.com, brauner@kernel.org, josef@toxicpanda.com, yebin10@huawei.com, linux@weissschuh.net, willy@infradead.org, osalvador@suse.de, andrii@kernel.org, ryan.roberts@arm.com, christophe.leroy@csgroup.eu, tjmercier@google.com, kaleshsingh@google.com, 
linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-kselftest@vger.kernel.org, surenb@google.com Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Extend /proc/pid/maps tearing test to verify PROCMAP_QUERY ioctl operation correctness while the vma is being concurrently modified. Signed-off-by: Suren Baghdasaryan --- tools/testing/selftests/proc/proc-pid-vm.c | 60 ++++++++++++++++++++++ 1 file changed, 60 insertions(+) diff --git a/tools/testing/selftests/proc/proc-pid-vm.c b/tools/testing/sel= ftests/proc/proc-pid-vm.c index 1aef2db7e893..b582f40851fb 100644 --- a/tools/testing/selftests/proc/proc-pid-vm.c +++ b/tools/testing/selftests/proc/proc-pid-vm.c @@ -486,6 +486,21 @@ static void capture_mod_pattern(int maps_fd, assert(strcmp(restored_first_line->text, first_line->text) =3D=3D 0); } =20 +static void query_addr_at(int maps_fd, void *addr, + unsigned long *vma_start, unsigned long *vma_end) +{ + struct procmap_query q; + + memset(&q, 0, sizeof(q)); + q.size =3D sizeof(q); + /* Find the VMA at the split address */ + q.query_addr =3D (unsigned long long)addr; + q.query_flags =3D 0; + assert(!ioctl(maps_fd, PROCMAP_QUERY, &q)); + *vma_start =3D q.vma_start; + *vma_end =3D q.vma_end; +} + static inline void split_vma(const struct vma_modifier_info *mod_info) { assert(mmap(mod_info->addr, page_size, mod_info->prot | PROT_EXEC, @@ -546,6 +561,8 @@ static void test_maps_tearing_from_split(int maps_fd, do { bool last_line_changed; bool first_line_changed; + unsigned long vma_start; + unsigned long vma_end; =20 read_boundary_lines(maps_fd, page1, page2, &new_last_line, &new_first_li= ne); =20 @@ -576,6 +593,19 @@ static void test_maps_tearing_from_split(int maps_fd, first_line_changed =3D strcmp(new_first_line.text, first_line->text) != =3D 0; assert(last_line_changed =3D=3D first_line_changed); =20 + /* Check if PROCMAP_QUERY ioclt() finds the right VMA */ + query_addr_at(maps_fd, mod_info->addr + 
page_size, + &vma_start, &vma_end); + /* + * The vma at the split address can be either the same as + * original one (if read before the split) or the same as the + * first line in the second page (if read after the split). + */ + assert((vma_start =3D=3D last_line->start_addr && + vma_end =3D=3D last_line->end_addr) || + (vma_start =3D=3D split_first_line.start_addr && + vma_end =3D=3D split_first_line.end_addr)); + clock_gettime(CLOCK_MONOTONIC_COARSE, &end_ts); } while (end_ts.tv_sec - start_ts.tv_sec < test_duration_sec); =20 @@ -637,6 +667,9 @@ static void test_maps_tearing_from_resize(int maps_fd, =20 clock_gettime(CLOCK_MONOTONIC_COARSE, &start_ts); do { + unsigned long vma_start; + unsigned long vma_end; + read_boundary_lines(maps_fd, page1, page2, &new_last_line, &new_first_li= ne); =20 /* Check if we read vmas after shrinking it */ @@ -656,6 +689,17 @@ static void test_maps_tearing_from_resize(int maps_fd, assert(!strcmp(new_last_line.text, restored_last_line.text) && !strcmp(new_first_line.text, restored_first_line.text)); } + + /* Check if PROCMAP_QUERY ioclt() finds the right VMA */ + query_addr_at(maps_fd, mod_info->addr, &vma_start, &vma_end); + /* + * The vma should stay at the same address and have either the + * original size of 3 pages or 1 page if read after shrinking. 
+ */ + assert(vma_start =3D=3D last_line->start_addr && + (vma_end - vma_start =3D=3D page_size * 3 || + vma_end - vma_start =3D=3D page_size)); + clock_gettime(CLOCK_MONOTONIC_COARSE, &end_ts); } while (end_ts.tv_sec - start_ts.tv_sec < test_duration_sec); =20 @@ -726,6 +770,9 @@ static void test_maps_tearing_from_remap(int maps_fd, =20 clock_gettime(CLOCK_MONOTONIC_COARSE, &start_ts); do { + unsigned long vma_start; + unsigned long vma_end; + read_boundary_lines(maps_fd, page1, page2, &new_last_line, &new_first_li= ne); =20 /* Check if we read vmas after remapping it */ @@ -745,6 +792,19 @@ static void test_maps_tearing_from_remap(int maps_fd, assert(!strcmp(new_last_line.text, restored_last_line.text) && !strcmp(new_first_line.text, restored_first_line.text)); } + + /* Check if PROCMAP_QUERY ioclt() finds the right VMA */ + query_addr_at(maps_fd, mod_info->addr + page_size, &vma_start, &vma_end); + /* + * The vma should either stay at the same address and have the + * original size of 3 pages or we should find the remapped vma + * at the remap destination address with size of 1 page. 
+ */ + assert((vma_start =3D=3D last_line->start_addr && + vma_end - vma_start =3D=3D page_size * 3) || + (vma_start =3D=3D last_line->start_addr + page_size && + vma_end - vma_start =3D=3D page_size)); + clock_gettime(CLOCK_MONOTONIC_COARSE, &end_ts); } while (end_ts.tv_sec - start_ts.tv_sec < test_duration_sec); =20 --=20 2.50.0.714.g196bf9f422-goog From nobody Wed Oct 8 20:02:01 2025 Received: from mail-pg1-f201.google.com (mail-pg1-f201.google.com [209.85.215.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id EEEBB2EBB97 for ; Tue, 24 Jun 2025 19:34:13 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.215.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1750793656; cv=none; b=P5wFcWJuieIm4a3S+NxA/AUMHBpkR3rWXD0+5oMeygbM6lk97QIxYfzRnd5tlD7yn3MWDAXC+IORuJjX6mh7liES5u2OIqi+h27KNDG0Ksg+oNpcQkbZgckF5JdNAe2Zw0uoZA/ghDz66bkCG9fbVg1o52+jFe7HnMDiQYiHCGY= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1750793656; c=relaxed/simple; bh=GzCQk+svK77W7hPvspovzKVXyJYt48gsWkTARhc2Ylc=; h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type; b=ixTwiKnDf1MOYoHAGLzn4F85Z3Eq4wXnXRD/Aec+dwi/Xnm49m6eLkj2pACjxGUfUYlAWcseq0VACGirOavvpHaQPZhJS3Fnc35bCRr9cR2XfMjebxUPOwXe+YW0kt33uDPCEeyhjYoYfKYF181EFLTRKKvnXNuhTbrloZ6bke4= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--surenb.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=wMlJ5NDK; arc=none smtp.client-ip=209.85.215.201 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--surenb.bounces.google.com Authentication-Results: 
smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="wMlJ5NDK" Received: by mail-pg1-f201.google.com with SMTP id 41be03b00d2f7-b31cc625817so151505a12.0 for ; Tue, 24 Jun 2025 12:34:13 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1750793653; x=1751398453; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=5L5H3FNoajNf/UELmS1eaZLiPr7GIJ0ppwDkwKzX2RM=; b=wMlJ5NDK4tJi5bn+CrMTfQVjz9AaKFywv3IjFU0+EDB4/W2/7+S2NdFnFAqu047zgC CWnlY/IOUzwL2uGbVnr8p0pGruujlewhUM3wUzh6xyVuSHqcl+ndVvnzDiKx9ldziV5C GCPkhu1iVNbaACA61t7fanQZedtwi8jtdTBj35/2j2s9LufJUKgLzVRsU5+yGxyL8t8n RC+YUXcbFCzTjIbo+85uAmEHBfwQg3mC6LYkd2adHBMyJ08xgN9t0nrlbmGJxcrNc1m7 ItsReA3fINv0Q5NdXMtUVQcf84bNcFf6tWPzHH1z3wjbbWRR1zmz6G0ESIKCtE6z4XgF e40w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1750793653; x=1751398453; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=5L5H3FNoajNf/UELmS1eaZLiPr7GIJ0ppwDkwKzX2RM=; b=GP00prp5QdQSOmo/5BMRFsE0+Bw49gjQqC5Mf+9VyRICMJQa2SFw46BtTEiySIhlE+ EA2QYyEYuYMBWFhmftWT+ahHgVX1fIQ/hMPMwC7GWo021EgwsNl5mrVyE7GAQ8U4MEne aHWbjK+3DbSR45lfc/ZoONOMoKTDC7M5nPaM3JtAO5kiIkLCc4mf4L9uIXoq8/1DESxo LY4Avw96qEImidHAebamXYgEnZphnSRAkgyoIqcAS1FdZOoLlKvUTazrGGJcpd7nAgyI xPKb8uE6PprDUTkPWN4yRD8j2yijQFffav2ZsHCn29S97+6uI3yov+C89y3t/TEXO/IB YoLQ== X-Forwarded-Encrypted: i=1; AJvYcCXkRvpKstcg92/oPWdo8DxfTbnn4X4zvEFlEhCxooCrhQZR/PtUdMDFvNplYfIKhqn/QB6h38AFLYC5sqs=@vger.kernel.org X-Gm-Message-State: AOJu0Yz5YE2tP8TKUPdRlLiKtKdr1yEeURSHv7zNDe9kfDb1B+E6pkkR QmD+JijyJyDooLPbWGcBMMldtcSWZqwg9LqbFlGCjKaw+7ZJ4vYiChYk+FFUGIBo3wEqD3w+SCn Y+hU+Tw== X-Google-Smtp-Source: AGHT+IGv6JeXyWyTe30WOyPpkx12eoNk+GrPmkwRBZ04YyBIZv26IDRNFd+s3FgwEXmVVNDLoZsKfuuItlg= X-Received: from pjq6.prod.google.com 
([2002:a17:90b:5606:b0:313:285a:5547]) (user=surenb job=prod-delivery.src-stubby-dispatcher) by 2002:a17:90b:1f8d:b0:30a:9feb:1e15 with SMTP id 98e67ed59e1d1-315ccc4da71mr7233133a91.8.1750793653167; Tue, 24 Jun 2025 12:34:13 -0700 (PDT) Date: Tue, 24 Jun 2025 12:33:57 -0700 In-Reply-To: <20250624193359.3865351-1-surenb@google.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20250624193359.3865351-1-surenb@google.com> X-Mailer: git-send-email 2.50.0.714.g196bf9f422-goog Message-ID: <20250624193359.3865351-6-surenb@google.com> Subject: [PATCH v5 5/7] selftests/proc: add verbose mode for tests to facilitate debugging From: Suren Baghdasaryan To: akpm@linux-foundation.org Cc: Liam.Howlett@oracle.com, lorenzo.stoakes@oracle.com, david@redhat.com, vbabka@suse.cz, peterx@redhat.com, jannh@google.com, hannes@cmpxchg.org, mhocko@kernel.org, paulmck@kernel.org, shuah@kernel.org, adobriyan@gmail.com, brauner@kernel.org, josef@toxicpanda.com, yebin10@huawei.com, linux@weissschuh.net, willy@infradead.org, osalvador@suse.de, andrii@kernel.org, ryan.roberts@arm.com, christophe.leroy@csgroup.eu, tjmercier@google.com, kaleshsingh@google.com, linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-kselftest@vger.kernel.org, surenb@google.com Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Add verbose mode to the proc tests to print debugging information. 
Usage: proc-pid-vm -v Signed-off-by: Suren Baghdasaryan --- tools/testing/selftests/proc/proc-pid-vm.c | 154 +++++++++++++++++++-- 1 file changed, 141 insertions(+), 13 deletions(-) diff --git a/tools/testing/selftests/proc/proc-pid-vm.c b/tools/testing/sel= ftests/proc/proc-pid-vm.c index b582f40851fb..97017f48cd70 100644 --- a/tools/testing/selftests/proc/proc-pid-vm.c +++ b/tools/testing/selftests/proc/proc-pid-vm.c @@ -73,6 +73,7 @@ static void make_private_tmp(void) } =20 static unsigned long test_duration_sec =3D 5UL; +static bool verbose; static int page_size; static pid_t pid =3D -1; static void ate(void) @@ -452,6 +453,99 @@ static void stop_vma_modifier(struct vma_modifier_info= *mod_info) signal_state(mod_info, SETUP_MODIFY_MAPS); } =20 +static void print_first_lines(char *text, int nr) +{ + const char *end =3D text; + + while (nr && (end =3D strchr(end, '\n')) !=3D NULL) { + nr--; + end++; + } + + if (end) { + int offs =3D end - text; + + text[offs] =3D '\0'; + printf(text); + text[offs] =3D '\n'; + printf("\n"); + } else { + printf(text); + } +} + +static void print_last_lines(char *text, int nr) +{ + const char *start =3D text + strlen(text); + + nr++; /* to ignore the last newline */ + while (nr) { + while (start > text && *start !=3D '\n') + start--; + nr--; + start--; + } + printf(start); +} + +static void print_boundaries(const char *title, + struct page_content *page1, + struct page_content *page2) +{ + if (!verbose) + return; + + printf("%s", title); + /* Print 3 boundary lines from each page */ + print_last_lines(page1->data, 3); + printf("-----------------page boundary-----------------\n"); + print_first_lines(page2->data, 3); +} + +static bool print_boundaries_on(bool condition, const char *title, + struct page_content *page1, + struct page_content *page2) +{ + if (verbose && condition) + print_boundaries(title, page1, page2); + + return condition; +} + +static void report_test_start(const char *name) +{ + if (verbose) + printf("=3D=3D=3D=3D 
%s =3D=3D=3D=3D\n", name); +} + +static struct timespec print_ts; + +static void start_test_loop(struct timespec *ts) +{ + if (verbose) + print_ts.tv_sec =3D ts->tv_sec; +} + +static void end_test_iteration(struct timespec *ts) +{ + if (!verbose) + return; + + /* Update every second */ + if (print_ts.tv_sec =3D=3D ts->tv_sec) + return; + + printf("."); + fflush(stdout); + print_ts.tv_sec =3D ts->tv_sec; +} + +static void end_test_loop(void) +{ + if (verbose) + printf("\n"); +} + static void capture_mod_pattern(int maps_fd, struct vma_modifier_info *mod_info, struct page_content *page1, @@ -463,18 +557,24 @@ static void capture_mod_pattern(int maps_fd, struct line_content *restored_last_line, struct line_content *restored_first_line) { + print_boundaries("Before modification", page1, page2); + signal_state(mod_info, SETUP_MODIFY_MAPS); wait_for_state(mod_info, SETUP_MAPS_MODIFIED); =20 /* Copy last line of the first page and first line of the last page */ read_boundary_lines(maps_fd, page1, page2, mod_last_line, mod_first_line); =20 + print_boundaries("After modification", page1, page2); + signal_state(mod_info, SETUP_RESTORE_MAPS); wait_for_state(mod_info, SETUP_MAPS_RESTORED); =20 /* Copy last line of the first page and first line of the last page */ read_boundary_lines(maps_fd, page1, page2, restored_last_line, restored_f= irst_line); =20 + print_boundaries("After restore", page1, page2); + mod_info->vma_mod_check(mod_last_line, mod_first_line, restored_last_line, restored_first_line); =20 @@ -546,6 +646,7 @@ static void test_maps_tearing_from_split(int maps_fd, mod_info->vma_restore =3D merge_vma; mod_info->vma_mod_check =3D check_split_result; =20 + report_test_start("Tearing from split"); capture_mod_pattern(maps_fd, mod_info, page1, page2, last_line, first_lin= e, &split_last_line, &split_first_line, &restored_last_line, &restored_first_line); @@ -558,6 +659,7 @@ static void test_maps_tearing_from_split(int maps_fd, struct timespec start_ts, end_ts; =20 
clock_gettime(CLOCK_MONOTONIC_COARSE, &start_ts); + start_test_loop(&start_ts); do { bool last_line_changed; bool first_line_changed; @@ -577,12 +679,17 @@ static void test_maps_tearing_from_split(int maps_fd, * In that case new first line will be the same as the * last restored line. */ - assert(!strcmp(new_first_line.text, split_first_line.text) || - !strcmp(new_first_line.text, restored_last_line.text)); + assert(!print_boundaries_on( + strcmp(new_first_line.text, split_first_line.text) && + strcmp(new_first_line.text, restored_last_line.text), + "Split result invalid", page1, page2)); + } else { /* The vmas should be consistent with merge results */ - assert(!strcmp(new_last_line.text, restored_last_line.text) && - !strcmp(new_first_line.text, restored_first_line.text)); + assert(!print_boundaries_on( + strcmp(new_last_line.text, restored_last_line.text) || + strcmp(new_first_line.text, restored_first_line.text), + "Merge result invalid", page1, page2)); } /* * First and last lines should change in unison. 
If the last @@ -607,7 +714,9 @@ static void test_maps_tearing_from_split(int maps_fd, vma_end =3D=3D split_first_line.end_addr)); =20 clock_gettime(CLOCK_MONOTONIC_COARSE, &end_ts); + end_test_iteration(&end_ts); } while (end_ts.tv_sec - start_ts.tv_sec < test_duration_sec); + end_test_loop(); =20 /* Signal the modifyer thread to stop and wait until it exits */ signal_state(mod_info, TEST_DONE); @@ -654,6 +763,7 @@ static void test_maps_tearing_from_resize(int maps_fd, mod_info->vma_restore =3D expand_vma; mod_info->vma_mod_check =3D check_shrink_result; =20 + report_test_start("Tearing from resize"); capture_mod_pattern(maps_fd, mod_info, page1, page2, last_line, first_lin= e, &shrunk_last_line, &shrunk_first_line, &restored_last_line, &restored_first_line); @@ -666,6 +776,7 @@ static void test_maps_tearing_from_resize(int maps_fd, struct timespec start_ts, end_ts; =20 clock_gettime(CLOCK_MONOTONIC_COARSE, &start_ts); + start_test_loop(&start_ts); do { unsigned long vma_start; unsigned long vma_end; @@ -682,12 +793,16 @@ static void test_maps_tearing_from_resize(int maps_fd, * again. In that case new first line will be the same * as the last restored line. 
*/ - assert(!strcmp(new_first_line.text, shrunk_first_line.text) || - !strcmp(new_first_line.text, restored_last_line.text)); + assert(!print_boundaries_on( + strcmp(new_first_line.text, shrunk_first_line.text) && + strcmp(new_first_line.text, restored_last_line.text), + "Shrink result invalid", page1, page2)); } else { /* The vmas should be consistent with the original/resored state */ - assert(!strcmp(new_last_line.text, restored_last_line.text) && - !strcmp(new_first_line.text, restored_first_line.text)); + assert(!print_boundaries_on( + strcmp(new_last_line.text, restored_last_line.text) || + strcmp(new_first_line.text, restored_first_line.text), + "Expand result invalid", page1, page2)); } =20 /* Check if PROCMAP_QUERY ioclt() finds the right VMA */ @@ -701,7 +816,9 @@ static void test_maps_tearing_from_resize(int maps_fd, vma_end - vma_start =3D=3D page_size)); =20 clock_gettime(CLOCK_MONOTONIC_COARSE, &end_ts); + end_test_iteration(&end_ts); } while (end_ts.tv_sec - start_ts.tv_sec < test_duration_sec); + end_test_loop(); =20 /* Signal the modifyer thread to stop and wait until it exits */ signal_state(mod_info, TEST_DONE); @@ -757,6 +874,7 @@ static void test_maps_tearing_from_remap(int maps_fd, mod_info->vma_restore =3D patch_vma; mod_info->vma_mod_check =3D check_remap_result; =20 + report_test_start("Tearing from remap"); capture_mod_pattern(maps_fd, mod_info, page1, page2, last_line, first_lin= e, &remapped_last_line, &remapped_first_line, &restored_last_line, &restored_first_line); @@ -769,6 +887,7 @@ static void test_maps_tearing_from_remap(int maps_fd, struct timespec start_ts, end_ts; =20 clock_gettime(CLOCK_MONOTONIC_COARSE, &start_ts); + start_test_loop(&start_ts); do { unsigned long vma_start; unsigned long vma_end; @@ -785,12 +904,16 @@ static void test_maps_tearing_from_remap(int maps_fd, * again. In that case new first line will be the same * as the last restored line. 
*/ - assert(!strcmp(new_first_line.text, remapped_first_line.text) || - !strcmp(new_first_line.text, restored_last_line.text)); + assert(!print_boundaries_on( + strcmp(new_first_line.text, remapped_first_line.text) && + strcmp(new_first_line.text, restored_last_line.text), + "Remap result invalid", page1, page2)); } else { /* The vmas should be consistent with the original/resored state */ - assert(!strcmp(new_last_line.text, restored_last_line.text) && - !strcmp(new_first_line.text, restored_first_line.text)); + assert(!print_boundaries_on( + strcmp(new_last_line.text, restored_last_line.text) || + strcmp(new_first_line.text, restored_first_line.text), + "Remap restore result invalid", page1, page2)); } =20 /* Check if PROCMAP_QUERY ioclt() finds the right VMA */ @@ -806,7 +929,9 @@ static void test_maps_tearing_from_remap(int maps_fd, vma_end - vma_start =3D=3D page_size)); =20 clock_gettime(CLOCK_MONOTONIC_COARSE, &end_ts); + end_test_iteration(&end_ts); } while (end_ts.tv_sec - start_ts.tv_sec < test_duration_sec); + end_test_loop(); =20 /* Signal the modifyer thread to stop and wait until it exits */ signal_state(mod_info, TEST_DONE); @@ -927,6 +1052,7 @@ int usage(void) { fprintf(stderr, "Userland /proc/pid/{s}maps test cases\n"); fprintf(stderr, " -d: Duration for time-consuming tests\n"); + fprintf(stderr, " -v: Verbose mode\n"); fprintf(stderr, " -h: Help screen\n"); exit(-1); } @@ -937,9 +1063,11 @@ int main(int argc, char **argv) int exec_fd; int opt; =20 - while ((opt =3D getopt(argc, argv, "d:h")) !=3D -1) { + while ((opt =3D getopt(argc, argv, "d:vh")) !=3D -1) { if (opt =3D=3D 'd') test_duration_sec =3D strtoul(optarg, NULL, 0); + else if (opt =3D=3D 'v') + verbose =3D true; else if (opt =3D=3D 'h') usage(); } --=20 2.50.0.714.g196bf9f422-goog From nobody Wed Oct 8 20:02:01 2025 Received: from mail-pl1-f201.google.com (mail-pl1-f201.google.com [209.85.214.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client 
certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id DF68F2EBDE6 for ; Tue, 24 Jun 2025 19:34:15 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.214.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1750793658; cv=none; b=EMTnIlK+mVhM5aeG5SWB7q7jBLIgqwVdIkAWHhxk07Tip3nikgnH1YV+8XZxQDA1KwEycXd+Y2qw/Z1kkq888jqyaJp4817ucIeOgB2wByteECGGY34vmZ9wqcaWhv9EOjhEzg4JqMyAIouNNiI/70YhQ0M29Y07eBAt8i57gAQ= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1750793658; c=relaxed/simple; bh=U4CpqRhmkjkU+JSBE1xEgEtaLoyuPdpcrl5XFw5JfRQ=; h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type; b=AyDUG+jeZbXfLSugwiWXemxDi+nmdZ0YX5M65Wyxfbjnwo/jjjgXnIZ7hJS34lQVfVkhvF13aC2n0+ZJxnyoPBwNL2GA1bC0NncgGOzu2Yo3iBveVxYR6vjOyzLZXAnWAQTb67U8iZdO+KuhOiw95bZsl4wFAJQKxFwUcfx0jR4= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--surenb.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=lexT8NlL; arc=none smtp.client-ip=209.85.214.201 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--surenb.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="lexT8NlL" Received: by mail-pl1-f201.google.com with SMTP id d9443c01a7336-2369dd58602so50319535ad.1 for ; Tue, 24 Jun 2025 12:34:15 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1750793655; x=1751398455; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=R8jCWZ/F+ajpqK4ZOmTtaT6tZvVFIoI2XxVqwquCgmw=; 
b=lexT8NlLX3KgGeZX3T/PApvCCB7P4EEKvnwrfpzvbxwXJRdFSiS6PecyRjA5BNYQCL p5ViVEInGKJeLomMCdQH67LSjJpyP7576YxVekJI2INkI33uHsHSVZw5ibTvXSvD+p2X D3ts+Z/Md4oErG5qlULHQ+N+eI3Q4Z4GFgDqVEbk/i9DNjBXosOrP4xz0NJs9Bt4Xr3C ycjc8ikB2aaszRtlsMkuYTzelMkmoAo2rsNXQRu2SNrtquDVyJBzyhkIajuKIp7O85qD 2QUTrQcYS76YCb63DZLSYnnYyfUkLHRTG2oDRHV9LNjllq6gr/tBSi6XnVfXMZxTAsmr 5Veg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1750793655; x=1751398455; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=R8jCWZ/F+ajpqK4ZOmTtaT6tZvVFIoI2XxVqwquCgmw=; b=UaUCNzc2qrhUhkCTx0JT/dMO8WEm4JIDaETbfJntpW6GdJa+9DWbeRoX5H8VD6VoWB OrK0sGzwIgDJLVGCiD+8DnwQ08j5vTPua0s/leGtTzwHhGO0d3cUbTJZtERPfQL/ACGX i8HugCpI5Qhbh9IsA83uLLAZ8k59GLAQhVlm7JJJqF0iulxRjZUmYlpNbMtrBXnJ/qlJ LkLLgFktfuWeyxPkiTP/J8Rt4iHyl85AlPz7EU9XVafZgsxfGogn8XqXIesf7UnrgYiw f6H2g7VaA9MAlDS1iKfEamSy46KHmL+d3Aktu45dNdXraxPN/Uxzri2T0aOhjjT9lEGN 1KTw== X-Forwarded-Encrypted: i=1; AJvYcCXbf22KuOR5tCvZJwBKA7onFzPX1h8Wd5hCF2r+ckk5GbO/MSv/gwHDNDCyI9mdoxBbDeiAA1GgzAf7ALI=@vger.kernel.org X-Gm-Message-State: AOJu0YzcohF1SaA8XqJfHsZM//lbPIXzffObkisAdsTu3wnTHaKVSsWn WF7C0/jLTDaBnDKwpTJ/oXNge+vMoK3VqgqwgXWEFHzQDksD4NQ9pZfzcoAvjH3aXdxFX0/vk9w 61OU1VQ== X-Google-Smtp-Source: AGHT+IF1b8D4iLGP12pVy+17DkqSsks+kkF0hmalJQ6QlCgP6k7A5fSY0k99ddn2kQ7IT6XhG2dE5s8LCCA= X-Received: from plbkw7.prod.google.com ([2002:a17:902:f907:b0:234:4c97:1e84]) (user=surenb job=prod-delivery.src-stubby-dispatcher) by 2002:a17:902:f684:b0:235:60e:3704 with SMTP id d9443c01a7336-23823fc3f99mr9626325ad.12.1750793655228; Tue, 24 Jun 2025 12:34:15 -0700 (PDT) Date: Tue, 24 Jun 2025 12:33:58 -0700 In-Reply-To: <20250624193359.3865351-1-surenb@google.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20250624193359.3865351-1-surenb@google.com> X-Mailer: git-send-email 
2.50.0.714.g196bf9f422-goog Message-ID: <20250624193359.3865351-7-surenb@google.com> Subject: [PATCH v5 6/7] mm/maps: read proc/pid/maps under per-vma lock From: Suren Baghdasaryan To: akpm@linux-foundation.org Cc: Liam.Howlett@oracle.com, lorenzo.stoakes@oracle.com, david@redhat.com, vbabka@suse.cz, peterx@redhat.com, jannh@google.com, hannes@cmpxchg.org, mhocko@kernel.org, paulmck@kernel.org, shuah@kernel.org, adobriyan@gmail.com, brauner@kernel.org, josef@toxicpanda.com, yebin10@huawei.com, linux@weissschuh.net, willy@infradead.org, osalvador@suse.de, andrii@kernel.org, ryan.roberts@arm.com, christophe.leroy@csgroup.eu, tjmercier@google.com, kaleshsingh@google.com, linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-kselftest@vger.kernel.org, surenb@google.com Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8"

With maple_tree supporting vma tree traversal under RCU and per-vma
locks, /proc/pid/maps can be read while holding individual vma locks
instead of locking the entire address space.

A completely lockless approach (walking the vma tree under RCU) would be
quite complex, the main issue being that get_vma_name() uses callbacks
which might not work correctly with a stable vma copy and require the
original (unstable) vma; see special_mapping_name() for an example.

When per-vma lock acquisition fails, we take the mmap_lock for reading,
lock the vma, release the mmap_lock and continue. This fallback to the
mmap read lock guarantees that the reader makes forward progress even
during lock contention. It will interfere with the writer, but only for
the very short time while we are acquiring the per-vma lock, and only
when there was contention on the vma the reader is interested in.

We shouldn't see a repeated fallback to mmap read locks in practice, as
this requires a very unlikely series of lock contentions (for instance
due to repeated vma split operations). However, even if this did somehow
happen, we would still make progress.
One case requiring special handling is when a vma changes between the
time it was found and the time it got locked. A problematic case would
be if the vma got shrunk so that its start moved higher in the address
space and a new vma was installed at the beginning:

reader found:               |--------VMA A--------|
VMA is modified:            |-VMA B-|----VMA A----|
reader locks modified VMA A
reader reports VMA A:       |  gap  |----VMA A----|

This would result in reporting a gap in the address space that does not
exist. To prevent this we retry the lookup after locking the vma;
however, we do that only when we identify a gap and detect that the
address space was changed after we found the vma.

This change is designed to reduce mmap_lock contention and prevent a
process reading /proc/pid/maps files (often a low priority task, such as
monitoring/data collection services) from blocking address space
updates. Note that this change has a userspace visible disadvantage: it
allows for sub-page data tearing as opposed to the previous mechanism
where data tearing could happen only between pages of generated output
data. Since current userspace considers data tearing between pages to be
acceptable, we assume it will be able to handle sub-page data tearing as
well.
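
For illustration only, the gap check described above can be modelled in
userspace roughly as follows. This is a sketch under simplifying
assumptions, not the kernel code: struct range, next_range() and
can_report() are hypothetical names, the address space is a plain sorted
array, and the "changed" flag stands in for whatever mechanism detects
that the address space was modified after the lookup.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a vma: the half-open range [start, end). */
struct range {
	unsigned long start;
	unsigned long end;
};

/*
 * Return the first range whose end lies above addr, or NULL if there is
 * none. The array is assumed to be sorted and non-overlapping.
 */
static const struct range *next_range(const struct range *map, size_t n,
				      unsigned long addr)
{
	for (size_t i = 0; i < n; i++)
		if (map[i].end > addr)
			return &map[i];
	return NULL;
}

/*
 * Decide whether an already locked range may be reported as is.
 * last_pos is the end of the previously reported range; changed says
 * whether the map might have been modified since the lookup.
 * A false result means the caller should redo the lookup.
 */
static bool can_report(const struct range *map, size_t n,
		       const struct range *locked, unsigned long last_pos,
		       bool changed)
{
	assert(locked != NULL);

	/* The locked range must not lie behind the last position. */
	if (last_pos >= locked->end)
		return false;

	/*
	 * A gap precedes the locked range. If the map might have changed,
	 * confirm that a fresh lookup still yields the same range.
	 */
	if (last_pos < locked->start && changed)
		return next_range(map, n, last_pos) == locked;

	return true;
}
```

With the map from the diagram after modification (B followed by the
shrunk A) and a last position at the old start of A, can_report() for A
returns false because the fresh lookup finds B first. When no range
precedes A, the gap is genuine and A is reported.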
Signed-off-by: Suren Baghdasaryan Acked-by: Suren Baghdasaryan --- fs/proc/internal.h | 5 ++ fs/proc/task_mmu.c | 123 ++++++++++++++++++++++++++++++++++---- include/linux/mmap_lock.h | 11 ++++ mm/mmap_lock.c | 88 +++++++++++++++++++++++++++ 4 files changed, 217 insertions(+), 10 deletions(-) diff --git a/fs/proc/internal.h b/fs/proc/internal.h index 3d48ffe72583..7c235451c5ea 100644 --- a/fs/proc/internal.h +++ b/fs/proc/internal.h @@ -384,6 +384,11 @@ struct proc_maps_private { struct task_struct *task; struct mm_struct *mm; struct vma_iterator iter; + loff_t last_pos; +#ifdef CONFIG_PER_VMA_LOCK + bool mmap_locked; + struct vm_area_struct *locked_vma; +#endif #ifdef CONFIG_NUMA struct mempolicy *task_mempolicy; #endif diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c index 751479eb128f..33171afb5364 100644 --- a/fs/proc/task_mmu.c +++ b/fs/proc/task_mmu.c @@ -127,21 +127,118 @@ static void release_task_mempolicy(struct proc_maps_= private *priv) } #endif =20 -static struct vm_area_struct *proc_get_vma(struct proc_maps_private *priv, - loff_t *ppos) +#ifdef CONFIG_PER_VMA_LOCK + +static void unlock_vma(struct proc_maps_private *priv) { - struct vm_area_struct *vma =3D vma_next(&priv->iter); + if (priv->locked_vma) { + vma_end_read(priv->locked_vma); + priv->locked_vma =3D NULL; + } +} + +static const struct seq_operations proc_pid_maps_op; + +static inline bool lock_vma_range(struct seq_file *m, + struct proc_maps_private *priv) +{ + /* + * smaps and numa_maps perform page table walk, therefore require + * mmap_lock but maps can be read with locking just the vma. 
+ */ + if (m->op !=3D &proc_pid_maps_op) { + if (mmap_read_lock_killable(priv->mm)) + return false; =20 + priv->mmap_locked =3D true; + } else { + rcu_read_lock(); + priv->locked_vma =3D NULL; + priv->mmap_locked =3D false; + } + + return true; +} + +static inline void unlock_vma_range(struct proc_maps_private *priv) +{ + if (priv->mmap_locked) { + mmap_read_unlock(priv->mm); + } else { + unlock_vma(priv); + rcu_read_unlock(); + } +} + +static struct vm_area_struct *get_next_vma(struct proc_maps_private *priv, + loff_t last_pos) +{ + struct vm_area_struct *vma; + + if (priv->mmap_locked) + return vma_next(&priv->iter); + + unlock_vma(priv); + vma =3D lock_next_vma(priv->mm, &priv->iter, last_pos); + if (!IS_ERR_OR_NULL(vma)) + priv->locked_vma =3D vma; + + return vma; +} + +#else /* CONFIG_PER_VMA_LOCK */ + +static inline bool lock_vma_range(struct seq_file *m, + struct proc_maps_private *priv) +{ + return mmap_read_lock_killable(priv->mm) =3D=3D 0; +} + +static inline void unlock_vma_range(struct proc_maps_private *priv) +{ + mmap_read_unlock(priv->mm); +} + +static struct vm_area_struct *get_next_vma(struct proc_maps_private *priv, + loff_t last_pos) +{ + return vma_next(&priv->iter); +} + +#endif /* CONFIG_PER_VMA_LOCK */ + +static struct vm_area_struct *proc_get_vma(struct seq_file *m, loff_t *ppo= s) +{ + struct proc_maps_private *priv =3D m->private; + struct vm_area_struct *vma; + + vma =3D get_next_vma(priv, *ppos); + /* EINTR is possible */ + if (IS_ERR(vma)) + return vma; + + /* Store previous position to be able to restart if needed */ + priv->last_pos =3D *ppos; if (vma) { - *ppos =3D vma->vm_start; + /* + * Track the end of the reported vma to ensure position changes + * even if previous vma was merged with the next vma and we + * found the extended vma with the same vm_start. 
+ */ + *ppos =3D vma->vm_end; } else { - *ppos =3D -2UL; + *ppos =3D -2UL; /* -2 indicates gate vma */ vma =3D get_gate_vma(priv->mm); } =20 return vma; } =20 +static inline bool is_sentinel_pos(unsigned long pos) +{ + return pos =3D=3D -1UL || pos =3D=3D -2UL; +} + static void *m_start(struct seq_file *m, loff_t *ppos) { struct proc_maps_private *priv =3D m->private; @@ -163,28 +260,34 @@ static void *m_start(struct seq_file *m, loff_t *ppos) return NULL; } =20 - if (mmap_read_lock_killable(mm)) { + if (!lock_vma_range(m, priv)) { mmput(mm); put_task_struct(priv->task); priv->task =3D NULL; return ERR_PTR(-EINTR); } =20 + /* + * Reset current position if last_addr was set before + * and it's not a sentinel. + */ + if (last_addr > 0 && !is_sentinel_pos(last_addr)) + *ppos =3D last_addr =3D priv->last_pos; vma_iter_init(&priv->iter, mm, last_addr); hold_task_mempolicy(priv); if (last_addr =3D=3D -2UL) return get_gate_vma(mm); =20 - return proc_get_vma(priv, ppos); + return proc_get_vma(m, ppos); } =20 static void *m_next(struct seq_file *m, void *v, loff_t *ppos) { if (*ppos =3D=3D -2UL) { - *ppos =3D -1UL; + *ppos =3D -1UL; /* -1 indicates no more vmas */ return NULL; } - return proc_get_vma(m->private, ppos); + return proc_get_vma(m, ppos); } =20 static void m_stop(struct seq_file *m, void *v) @@ -196,7 +299,7 @@ static void m_stop(struct seq_file *m, void *v) return; =20 release_task_mempolicy(priv); - mmap_read_unlock(mm); + unlock_vma_range(priv); mmput(mm); put_task_struct(priv->task); priv->task =3D NULL; diff --git a/include/linux/mmap_lock.h b/include/linux/mmap_lock.h index 5da384bd0a26..1f4f44951abe 100644 --- a/include/linux/mmap_lock.h +++ b/include/linux/mmap_lock.h @@ -309,6 +309,17 @@ void vma_mark_detached(struct vm_area_struct *vma); struct vm_area_struct *lock_vma_under_rcu(struct mm_struct *mm, unsigned long address); =20 +/* + * Locks next vma pointed by the iterator. 
Confirms the locked vma has not + * been modified and will retry under mmap_lock protection if modification + * was detected. Should be called from read RCU section. + * Returns either a valid locked VMA, NULL if no more VMAs or -EINTR if the + * process was interrupted. + */ +struct vm_area_struct *lock_next_vma(struct mm_struct *mm, + struct vma_iterator *iter, + unsigned long address); + #else /* CONFIG_PER_VMA_LOCK */ =20 static inline void mm_lock_seqcount_init(struct mm_struct *mm) {} diff --git a/mm/mmap_lock.c b/mm/mmap_lock.c index 5f725cc67334..ed0e5e2171cd 100644 --- a/mm/mmap_lock.c +++ b/mm/mmap_lock.c @@ -178,6 +178,94 @@ struct vm_area_struct *lock_vma_under_rcu(struct mm_st= ruct *mm, count_vm_vma_lock_event(VMA_LOCK_ABORT); return NULL; } + +static struct vm_area_struct *lock_vma_under_mmap_lock(struct mm_struct *m= m, + struct vma_iterator *iter, + unsigned long address) +{ + struct vm_area_struct *vma; + int ret; + + ret =3D mmap_read_lock_killable(mm); + if (ret) + return ERR_PTR(ret); + + /* Lookup the vma at the last position again under mmap_read_lock */ + vma_iter_init(iter, mm, address); + vma =3D vma_next(iter); + if (vma) + vma_start_read_locked(vma); + + mmap_read_unlock(mm); + + return vma; +} + +struct vm_area_struct *lock_next_vma(struct mm_struct *mm, + struct vma_iterator *iter, + unsigned long address) +{ + struct vm_area_struct *vma; + unsigned int mm_wr_seq; + bool mmap_unlocked; + + RCU_LOCKDEP_WARN(!rcu_read_lock_held(), "no rcu read lock held"); +retry: + /* Start mmap_lock speculation in case we need to verify the vma later */ + mmap_unlocked =3D mmap_lock_speculate_try_begin(mm, &mm_wr_seq); + vma =3D vma_next(iter); + if (!vma) + return NULL; + + vma =3D vma_start_read(mm, vma); + + if (IS_ERR_OR_NULL(vma)) { + /* + * Retry immediately if the vma gets detached from under us. + * Infinite loop should not happen because the vma we find will + * have to be constantly knocked out from under us. 
+ */ + if (PTR_ERR(vma) =3D=3D -EAGAIN) { + vma_iter_init(iter, mm, address); + goto retry; + } + + goto out; + } + + /* + * Verify the vma we locked belongs to the same address space and it's + * not behind of the last search position. + */ + if (unlikely(vma->vm_mm !=3D mm || address >=3D vma->vm_end)) + goto out_unlock; + + /* + * vma can be ahead of the last search position but we need to verify + * it was not shrunk after we found it and another vma has not been + * installed ahead of it. Otherwise we might observe a gap that should + * not be there. + */ + if (address < vma->vm_start) { + /* Verify only if the address space might have changed since vma lookup.= */ + if (!mmap_unlocked || mmap_lock_speculate_retry(mm, mm_wr_seq)) { + vma_iter_init(iter, mm, address); + if (vma !=3D vma_next(iter)) + goto out_unlock; + } + } + + return vma; + +out_unlock: + vma_end_read(vma); +out: + rcu_read_unlock(); + vma =3D lock_vma_under_mmap_lock(mm, iter, address); + rcu_read_lock(); + + return vma; +} #endif /* CONFIG_PER_VMA_LOCK */ =20 #ifdef CONFIG_LOCK_MM_AND_FIND_VMA --=20 2.50.0.714.g196bf9f422-goog From nobody Wed Oct 8 20:02:01 2025 Received: from mail-pf1-f201.google.com (mail-pf1-f201.google.com [209.85.210.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id E26AA2ED146 for ; Tue, 24 Jun 2025 19:34:17 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.210.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1750793659; cv=none; b=Uq5Z9lnypt0sES/ltsmPWBx4oanpM4ry3DGf+xc3yqSieBrufFR/hoT/tEXICgXSJHhnUhyVgD4eGp32DR8lnOYYKshlOr6usBIw3uEyyoDNtQqq91aAjr/H+Xv3wUj7KAXS/nIgySN9/xMnHfe1bcYQgOg+iGyYlYretxPTcL0= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1750793659; c=relaxed/simple; bh=XrkCL8NpzlaF/n3BVH9B6erYWI4M51jhFfWUSpSZR4I=; 
h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type; b=m5/tnWLP3aq5vamc5A7lz6NBiDdpJbFLH+CdtmflSFMSY6gUsySqlGOVg84H8oOs6HAZvp98CoZ150zTNF2dp8/oIZZsH9xfC1/+sx2H8qnkmGWHjxcm/hns1gej3dyYw8Qa6HcUwT5YmjJr3zqVjEkhh0xhTccIcBvEd+vmjNI= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--surenb.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=bkdiY3j0; arc=none smtp.client-ip=209.85.210.201 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--surenb.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="bkdiY3j0" Received: by mail-pf1-f201.google.com with SMTP id d2e1a72fcca58-740774348f6so424282b3a.1 for ; Tue, 24 Jun 2025 12:34:17 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1750793657; x=1751398457; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=1BERKeqLlcPDyrSUuZIHMWPi0iaWreXsjzz5dAn8mkk=; b=bkdiY3j0wIAYJhRcwLEPLslfcOeGaFBVIfREiNARIzV5FYSH6FpPb+wcz8zxnXa+J3 cGE/tATlBHEhPYlAOlkqtacmgwIxmJAkg1qhoX6OOEo/6cjEWsl3DXqIl68JnK1zpsU5 sWUV+TlmnqXj6FvKVgvPLaeC1FLaBCpF9q3aTV/cbP03D/fCc0KhiJqxGzVJy7SAQTa2 Vv9HLX/cEOXYDXST7Xz1K7WtnPINeVwhSRj74qg1kcMbGS4p+kM0NR1cJYDfwIgusuif Xg94RBzMYwwkTuNt04bEL6Av8VKA0E74ObMtmwkwxsR7JcwhZ9D4ywtJrQbGklPpAEFb tyGQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1750793657; x=1751398457; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=1BERKeqLlcPDyrSUuZIHMWPi0iaWreXsjzz5dAn8mkk=; 
b=iMSGAk5dk1JAfhvPHgzfOmNpkl42lU2jGgh2Pi40k2BMrRx0CLYxe91SPlEhmsq6w3 G7PmlF/TIPmomWnK8ln8qi2/KnWcD25P4Az0/3tNx203qlkF8+EK59PjXPa6ciKVBCRR lzH0/11AThjAnCPqpsL2Wo3sdSiuCrL7O4FCLKTWSdZozPFwqryHieVgbul1FaBQXeyt kiR5nsD2tUsmIcSNLXng6ddZj7LQwcXHT/2HJTHTBX8bZQc+hWc0B94KF8T8G2FUO+4R A3VAwiyg31snbcqBo2yu1fBBvZ1Wj+fg9Mpv8TKss34Apyn+uNboF8YS/6ZxHW05qVhV jrZA== X-Forwarded-Encrypted: i=1; AJvYcCXaoPjqi8376l3+pzNgDJK/6gugJFgTxdJUf2SfP3ffpOjcgHhZznUqdd3LBJ2E7T823t6PE7N/ScroqEE=@vger.kernel.org X-Gm-Message-State: AOJu0YzmakYiFQ4Vdvbs6UR4+CndHd9WpipnJPHTMA6i4UaGthiSbEu7 AlIUd3WXPGJkka8J/98Rxr/EeQX76WlgfmZZhRYM4U1pEUMtse37LtvXjZoBc20r2cc7kOIplma PCz86xQ== X-Google-Smtp-Source: AGHT+IHtLljtrcUhHYC7Wt+7WKI/a5PDJ4OrDvfWNeRaq5sGEKU3TQKgRzLNMXpoHYXyrtGwyT6oPA3hh2M= X-Received: from pfrh7.prod.google.com ([2002:aa7:9f47:0:b0:748:4f7c:c605]) (user=surenb job=prod-delivery.src-stubby-dispatcher) by 2002:a05:6a21:6f8b:b0:220:7b2e:5b3f with SMTP id adf61e73a8af0-2207f27d61amr271750637.19.1750793657223; Tue, 24 Jun 2025 12:34:17 -0700 (PDT) Date: Tue, 24 Jun 2025 12:33:59 -0700 In-Reply-To: <20250624193359.3865351-1-surenb@google.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20250624193359.3865351-1-surenb@google.com> X-Mailer: git-send-email 2.50.0.714.g196bf9f422-goog Message-ID: <20250624193359.3865351-8-surenb@google.com> Subject: [PATCH v5 7/7] mm/maps: execute PROCMAP_QUERY ioctl under per-vma locks From: Suren Baghdasaryan To: akpm@linux-foundation.org Cc: Liam.Howlett@oracle.com, lorenzo.stoakes@oracle.com, david@redhat.com, vbabka@suse.cz, peterx@redhat.com, jannh@google.com, hannes@cmpxchg.org, mhocko@kernel.org, paulmck@kernel.org, shuah@kernel.org, adobriyan@gmail.com, brauner@kernel.org, josef@toxicpanda.com, yebin10@huawei.com, linux@weissschuh.net, willy@infradead.org, osalvador@suse.de, andrii@kernel.org, ryan.roberts@arm.com, christophe.leroy@csgroup.eu, tjmercier@google.com, 
kaleshsingh@google.com, linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-kselftest@vger.kernel.org, surenb@google.com Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Utilize per-vma locks to stabilize vma after lookup without taking mmap_lock during PROCMAP_QUERY ioctl execution. While we might take mmap_lock for reading during contention, we do that momentarily only to lock the vma. This change is designed to reduce mmap_lock contention and prevent PROCMAP_QUERY ioctl calls from blocking address space updates. Signed-off-by: Suren Baghdasaryan Acked-by: Andrii Nakryiko --- fs/proc/task_mmu.c | 56 ++++++++++++++++++++++++++++++++++++---------- 1 file changed, 44 insertions(+), 12 deletions(-) diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c index 33171afb5364..f3659046efb7 100644 --- a/fs/proc/task_mmu.c +++ b/fs/proc/task_mmu.c @@ -492,28 +492,60 @@ static int pid_maps_open(struct inode *inode, struct = file *file) PROCMAP_QUERY_VMA_FLAGS \ ) =20 -static int query_vma_setup(struct mm_struct *mm) +#ifdef CONFIG_PER_VMA_LOCK + +static int query_vma_setup(struct proc_maps_private *priv) { - return mmap_read_lock_killable(mm); + rcu_read_lock(); + priv->locked_vma =3D NULL; + priv->mmap_locked =3D false; + + return 0; } =20 -static void query_vma_teardown(struct mm_struct *mm, struct vm_area_struct= *vma) +static void query_vma_teardown(struct proc_maps_private *priv) { - mmap_read_unlock(mm); + unlock_vma(priv); + rcu_read_unlock(); +} + +static struct vm_area_struct *query_vma_find_by_addr(struct proc_maps_priv= ate *priv, + unsigned long addr) +{ + vma_iter_init(&priv->iter, priv->mm, addr); + return get_next_vma(priv, addr); +} + +#else /* CONFIG_PER_VMA_LOCK */ + +static int query_vma_setup(struct proc_maps_private *priv) +{ + return mmap_read_lock_killable(priv->mm); } =20 -static struct vm_area_struct *query_vma_find_by_addr(struct mm_struct *mm,= unsigned long addr) +static void 
query_vma_teardown(struct proc_maps_private *priv) { - return find_vma(mm, addr); + mmap_read_unlock(priv->mm); } =20 -static struct vm_area_struct *query_matching_vma(struct mm_struct *mm, +static struct vm_area_struct *query_vma_find_by_addr(struct proc_maps_priv= ate *priv, + unsigned long addr) +{ + return find_vma(priv->mm, addr); +} + +#endif /* CONFIG_PER_VMA_LOCK */ + +static struct vm_area_struct *query_matching_vma(struct proc_maps_private = *priv, unsigned long addr, u32 flags) { struct vm_area_struct *vma; =20 next_vma: - vma =3D query_vma_find_by_addr(mm, addr); + vma =3D query_vma_find_by_addr(priv, addr); + if (IS_ERR(vma)) + return vma; + if (!vma) goto no_vma; =20 @@ -589,13 +621,13 @@ static int do_procmap_query(struct proc_maps_private = *priv, void __user *uarg) if (!mm || !mmget_not_zero(mm)) return -ESRCH; =20 - err =3D query_vma_setup(mm); + err =3D query_vma_setup(priv); if (err) { mmput(mm); return err; } =20 - vma =3D query_matching_vma(mm, karg.query_addr, karg.query_flags); + vma =3D query_matching_vma(priv, karg.query_addr, karg.query_flags); if (IS_ERR(vma)) { err =3D PTR_ERR(vma); vma =3D NULL; @@ -680,7 +712,7 @@ static int do_procmap_query(struct proc_maps_private *p= riv, void __user *uarg) } =20 /* unlock vma or mmap_lock, and put mm_struct before copying data to user= */ - query_vma_teardown(mm, vma); + query_vma_teardown(priv); mmput(mm); =20 if (karg.vma_name_size && copy_to_user(u64_to_user_ptr(karg.vma_name_addr= ), @@ -700,7 +732,7 @@ static int do_procmap_query(struct proc_maps_private *p= riv, void __user *uarg) return 0; =20 out: - query_vma_teardown(mm, vma); + query_vma_teardown(priv); mmput(mm); kfree(name_buf); return err; --=20 2.50.0.714.g196bf9f422-goog