From nobody Fri Dec 19 16:08:24 2025 Received: from mail-pg1-f202.google.com (mail-pg1-f202.google.com [209.85.215.202]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 8B53521C9F7 for ; Fri, 18 Apr 2025 17:50:06 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.215.202 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744998608; cv=none; b=V8b4YpYmRKBIEYyngDMLvhjAW8qBgFeKBqfwEuju7S9O76TGoT5WXChSFr91eRu3q3VvYB42tVzVWFKmXlGLcStoa36eRo1HTcHpX0mrxMAKM0FBdvZ/lPkeR5LYc8dm8AoTXtBliMn9cdBaMSzZnT7R4GZi8js6BnQxz2fuXo8= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744998608; c=relaxed/simple; bh=D5KVUKAoJwS0XGSathwFmHbGtyOBmVYW2dpOpzCywz0=; h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type; b=BXfjzm2lJMJvICXwix2DZ+sQP/U1+EXtTn6SVO3v4RvttXzpqyygd8bfHjkIuKfc0Lsuu2v7/GiQGS76nUczWT3T4+CoJE/UVExU6kjfg5EF4eyHQItRXgoxR3EJW6WuYDer9+CsDkER3C/ZPJk4w/z8kOJ0QyVsP8d2HCiHO6s= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--surenb.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=Kqc3qt20; arc=none smtp.client-ip=209.85.215.202 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--surenb.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="Kqc3qt20" Received: by mail-pg1-f202.google.com with SMTP id 41be03b00d2f7-b0dd00e1a01so337254a12.0 for ; Fri, 18 Apr 2025 10:50:06 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1744998606; x=1745603406; 
darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=mgDRBr7EHzoWVxJpYPL/4eCyatCNp0mNaqALeY1G7/4=; b=Kqc3qt20lL/CU4/+HXo/ml5kHVhJF1HegPx0f/2aHqeqxUFoF4S/IfJ9GC1pd2nF4k tTkErM39YAHapTZ9k/kGJ1HMsR7lNKllLHL+hEJIrDSTjBhgC/ElM7zAD3EWJ3YfpBra 97skXNe76tBFeZn/y5qIfqIfPRf9xlcDfOkNKwNFEp5vOVm3XgNpzMn/p9KEHILlL8CC DTHYpzYgQoB6RWThOXxfH+ih98oTPnM4tM5p0iIp9eFYjC3F5NP7WXJs0ieB6hrFfNEO tl4jCs2MEW/copOMksNubOvIBi8wXYvh9UwzfKJl4wLCZ+T36jaVRgwRTivbDNo1h/n9 NBqw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1744998606; x=1745603406; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=mgDRBr7EHzoWVxJpYPL/4eCyatCNp0mNaqALeY1G7/4=; b=twgfWDLeto6Mu70LTn4Ncu4ZvQwMiUMOUAcNmxbL8/+DDm4lB9UAIalzBFQH73hwJT Yl2Qv0PCtPsFQq9EK9z7NMNvtRx6AMhmiARvBqk/zMS/bdDWkKQB5lAEDK97ro4WqQ2b QBfFWTxmI5ZqJ3ZwFKNgmtAO6hrXkjMu5kU8Qi8H3mUf3ZdStumuVTmgy14HfHb+OdFj R7hKcZs93fBEq9iAXJB+5vfU3ccsy4cMrPUY6O8+j4fJuY1J672VSRiFqKEzXgSrejhd uuGhsYvhyW4pdXjR+5C5JKVEm10JHvQaJ14Mzfb7qLxsUtC6/cduYCy2Cwfodn3OtVVo FF/Q== X-Forwarded-Encrypted: i=1; AJvYcCXR7ch5EiLPjzQg221FSlQWqwOUo8rf6jHUABHwWmSzdDq0GqVZ5Ng0W+6yK8eQt3TaqDGFtyA+gQNkbYU=@vger.kernel.org X-Gm-Message-State: AOJu0Yz/Ovcs3FufFyIgIfWyMqs3hhFwOVp2xRbJrENKxCMq0yZ+pGO5 IXWVItqO39gtdz097PzGb15/ROZ9BfpI6im3E9xqVlXWU9hhAjwqZGWJGN+C9tlGvVLuQXhn1Hb cvw== X-Google-Smtp-Source: AGHT+IHcSsp6fTq+P5mdogx2xA//Oy5g0EOLyOHv8vpLXb30G0tARWpAdcWwAIgGOvySdxYbnBMETRGuhIU= X-Received: from pjbnb11.prod.google.com ([2002:a17:90b:35cb:b0:2fc:201d:6026]) (user=surenb job=prod-delivery.src-stubby-dispatcher) by 2002:a17:90b:2b90:b0:2fe:80cb:ac05 with SMTP id 98e67ed59e1d1-3087bb48d7bmr5606678a91.9.1744998605831; Fri, 18 Apr 2025 10:50:05 -0700 (PDT) Date: Fri, 18 Apr 2025 10:49:52 -0700 In-Reply-To: <20250418174959.1431962-1-surenb@google.com> Precedence: bulk 
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
Mime-Version: 1.0
References: <20250418174959.1431962-1-surenb@google.com>
X-Mailer: git-send-email 2.49.0.805.g082f7c87e0-goog
Message-ID: <20250418174959.1431962-2-surenb@google.com>
Subject: [PATCH v3 1/8] selftests/proc: add /proc/pid/maps tearing from vma split test
From: Suren Baghdasaryan
To: akpm@linux-foundation.org
Cc: Liam.Howlett@oracle.com, lorenzo.stoakes@oracle.com, david@redhat.com, vbabka@suse.cz, peterx@redhat.com, jannh@google.com, hannes@cmpxchg.org, mhocko@kernel.org, paulmck@kernel.org, shuah@kernel.org, adobriyan@gmail.com, brauner@kernel.org, josef@toxicpanda.com, yebin10@huawei.com, linux@weissschuh.net, willy@infradead.org, osalvador@suse.de, andrii@kernel.org, ryan.roberts@arm.com, christophe.leroy@csgroup.eu, linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-kselftest@vger.kernel.org, surenb@google.com
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

The content of /proc/pid/maps is generated page by page, with the mmap_lock read lock (or other synchronization mechanism) dropped between pages. This means that a reader can occasionally retrieve inconsistent information if the data used to generate the file is changed concurrently. For /proc/pid/maps, that means it is possible to read inconsistent data if vmas or the vma tree are concurrently modified.

A simple example is when a vma gets split or merged. If this happens while /proc/pid/maps is being read and the vma happens to sit at the boundary between two pages being generated, the reader can see the same vma twice: once before it was modified and a second time after the modification. Seeing the same vma twice is considered acceptable, and userspace can deal with this situation.
What is unacceptable is seeing a hole in the place occupied by a vma, for example as a result of a vma being replaced with another one, leaving the space temporarily empty.

Implement a test that reads /proc/pid/maps of a forked child process and checks data consistency at the boundary between the first two pages. The child process constantly modifies its address space in a way that affects the vma located at the end of the first page when /proc/pid/maps is read by the parent process. The parent checks the last vma of the first page and the first vma of the second page for consistency with the split/merge results.

Since the test is designed to create a race between the file reader and the vma tree modifier, multiple iterations are needed to catch invalid results. To limit how long the test runs, introduce a command line parameter specifying the duration of the test in seconds. For example, the following command lets this concurrency test run for 10 seconds:

proc-pid-vm -d 10

The default test duration is 5 seconds.
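The acceptance rule described above can be sketched as a pure string comparison, modeled on the asserts in test_maps_tearing_from_split() in the diff below. This is only an illustration: struct boundary_pattern and split_read_is_consistent() are hypothetical names that do not appear in the patch, which compares the line_content text fields directly.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical container for the two boundary lines seen in one state. */
struct boundary_pattern {
	const char *last_line;	/* last line of the first page */
	const char *first_line;	/* first line of the second page */
};

/*
 * Return true if an observed (last, first) pair is acceptable, given the
 * patterns captured right after the split and right after the merge.
 */
static bool split_read_is_consistent(const struct boundary_pattern *seen,
				     const struct boundary_pattern *split,
				     const struct boundary_pattern *restored)
{
	if (!strcmp(seen->last_line, split->last_line)) {
		/*
		 * Page 1 ended with the split vma. Page 2 may start with the
		 * next split piece or, if a merge raced in, with the merged
		 * vma reported a second time.
		 */
		return !strcmp(seen->first_line, split->first_line) ||
		       !strcmp(seen->first_line, restored->last_line);
	}
	/* Otherwise both lines must match the merged (restored) state. */
	return !strcmp(seen->last_line, restored->last_line) &&
	       !strcmp(seen->first_line, restored->first_line);
}
```

The duplicate case (split last line followed by the merged vma) is accepted on purpose, since the commit message treats a vma reported twice as tolerable; any other combination counts as a torn read.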
Signed-off-by: Suren Baghdasaryan --- tools/testing/selftests/proc/proc-pid-vm.c | 430 ++++++++++++++++++++- 1 file changed, 429 insertions(+), 1 deletion(-) diff --git a/tools/testing/selftests/proc/proc-pid-vm.c b/tools/testing/sel= ftests/proc/proc-pid-vm.c index d04685771952..6e3f06376a1f 100644 --- a/tools/testing/selftests/proc/proc-pid-vm.c +++ b/tools/testing/selftests/proc/proc-pid-vm.c @@ -27,6 +27,7 @@ #undef NDEBUG #include #include +#include #include #include #include @@ -34,6 +35,7 @@ #include #include #include +#include #include #include #include @@ -70,6 +72,8 @@ static void make_private_tmp(void) } } =20 +static unsigned long test_duration_sec =3D 5UL; +static int page_size; static pid_t pid =3D -1; static void ate(void) { @@ -281,11 +285,431 @@ static void vsyscall(void) } } =20 -int main(void) +/* /proc/pid/maps parsing routines */ +struct page_content { + char *data; + ssize_t size; +}; + +#define LINE_MAX_SIZE 256 + +struct line_content { + char text[LINE_MAX_SIZE]; + unsigned long start_addr; + unsigned long end_addr; +}; + +static void read_two_pages(int maps_fd, struct page_content *page1, + struct page_content *page2) +{ + ssize_t bytes_read; + + assert(lseek(maps_fd, 0, SEEK_SET) >=3D 0); + bytes_read =3D read(maps_fd, page1->data, page_size); + assert(bytes_read > 0 && bytes_read < page_size); + page1->size =3D bytes_read; + + bytes_read =3D read(maps_fd, page2->data, page_size); + assert(bytes_read > 0 && bytes_read < page_size); + page2->size =3D bytes_read; +} + +static void copy_first_line(struct page_content *page, char *first_line) +{ + char *pos =3D strchr(page->data, '\n'); + + strncpy(first_line, page->data, pos - page->data); + first_line[pos - page->data] =3D '\0'; +} + +static void copy_last_line(struct page_content *page, char *last_line) +{ + /* Get the last line in the first page */ + const char *end =3D page->data + page->size - 1; + /* skip last newline */ + const char *pos =3D end - 1; + + /* search previous newline */ + 
while (pos[-1] !=3D '\n') + pos--; + strncpy(last_line, pos, end - pos); + last_line[end - pos] =3D '\0'; +} + +/* Read the last line of the first page and the first line of the second p= age */ +static void read_boundary_lines(int maps_fd, struct page_content *page1, + struct page_content *page2, + struct line_content *last_line, + struct line_content *first_line) +{ + read_two_pages(maps_fd, page1, page2); + + copy_last_line(page1, last_line->text); + copy_first_line(page2, first_line->text); + + assert(sscanf(last_line->text, "%lx-%lx", &last_line->start_addr, + &last_line->end_addr) =3D=3D 2); + assert(sscanf(first_line->text, "%lx-%lx", &first_line->start_addr, + &first_line->end_addr) =3D=3D 2); +} + +/* Thread synchronization routines */ +enum test_state { + INIT, + CHILD_READY, + PARENT_READY, + SETUP_READY, + SETUP_MODIFY_MAPS, + SETUP_MAPS_MODIFIED, + SETUP_RESTORE_MAPS, + SETUP_MAPS_RESTORED, + TEST_READY, + TEST_DONE, +}; + +struct vma_modifier_info; + +typedef void (*vma_modifier_op)(const struct vma_modifier_info *mod_info); +typedef void (*vma_mod_result_check_op)(struct line_content *mod_last_line, + struct line_content *mod_first_line, + struct line_content *restored_last_line, + struct line_content *restored_first_line); + +struct vma_modifier_info { + int vma_count; + void *addr; + int prot; + void *next_addr; + vma_modifier_op vma_modify; + vma_modifier_op vma_restore; + vma_mod_result_check_op vma_mod_check; + pthread_mutex_t sync_lock; + pthread_cond_t sync_cond; + enum test_state curr_state; + bool exit; + void *child_mapped_addr[]; +}; + +static void wait_for_state(struct vma_modifier_info *mod_info, enum test_s= tate state) +{ + pthread_mutex_lock(&mod_info->sync_lock); + while (mod_info->curr_state !=3D state) + pthread_cond_wait(&mod_info->sync_cond, &mod_info->sync_lock); + pthread_mutex_unlock(&mod_info->sync_lock); +} + +static void signal_state(struct vma_modifier_info *mod_info, enum test_sta= te state) +{ + 
pthread_mutex_lock(&mod_info->sync_lock); + mod_info->curr_state =3D state; + pthread_cond_signal(&mod_info->sync_cond); + pthread_mutex_unlock(&mod_info->sync_lock); +} + +/* VMA modification routines */ +static void *child_vma_modifier(struct vma_modifier_info *mod_info) +{ + int prot =3D PROT_READ | PROT_WRITE; + int i; + + for (i =3D 0; i < mod_info->vma_count; i++) { + mod_info->child_mapped_addr[i] =3D mmap(NULL, page_size * 3, prot, + MAP_PRIVATE | MAP_ANONYMOUS, -1, 0); + assert(mod_info->child_mapped_addr[i] !=3D MAP_FAILED); + /* change protection in adjacent maps to prevent merging */ + prot ^=3D PROT_WRITE; + } + signal_state(mod_info, CHILD_READY); + wait_for_state(mod_info, PARENT_READY); + while (true) { + signal_state(mod_info, SETUP_READY); + wait_for_state(mod_info, SETUP_MODIFY_MAPS); + if (mod_info->exit) + break; + + mod_info->vma_modify(mod_info); + signal_state(mod_info, SETUP_MAPS_MODIFIED); + wait_for_state(mod_info, SETUP_RESTORE_MAPS); + mod_info->vma_restore(mod_info); + signal_state(mod_info, SETUP_MAPS_RESTORED); + + wait_for_state(mod_info, TEST_READY); + while (mod_info->curr_state !=3D TEST_DONE) { + mod_info->vma_modify(mod_info); + mod_info->vma_restore(mod_info); + } + } + for (i =3D 0; i < mod_info->vma_count; i++) + munmap(mod_info->child_mapped_addr[i], page_size * 3); + + return NULL; +} + +static void stop_vma_modifier(struct vma_modifier_info *mod_info) +{ + wait_for_state(mod_info, SETUP_READY); + mod_info->exit =3D true; + signal_state(mod_info, SETUP_MODIFY_MAPS); +} + +static void capture_mod_pattern(int maps_fd, + struct vma_modifier_info *mod_info, + struct page_content *page1, + struct page_content *page2, + struct line_content *last_line, + struct line_content *first_line, + struct line_content *mod_last_line, + struct line_content *mod_first_line, + struct line_content *restored_last_line, + struct line_content *restored_first_line) +{ + signal_state(mod_info, SETUP_MODIFY_MAPS); + wait_for_state(mod_info, 
SETUP_MAPS_MODIFIED); + + /* Copy last line of the first page and first line of the last page */ + read_boundary_lines(maps_fd, page1, page2, mod_last_line, mod_first_line); + + signal_state(mod_info, SETUP_RESTORE_MAPS); + wait_for_state(mod_info, SETUP_MAPS_RESTORED); + + /* Copy last line of the first page and first line of the last page */ + read_boundary_lines(maps_fd, page1, page2, restored_last_line, restored_f= irst_line); + + mod_info->vma_mod_check(mod_last_line, mod_first_line, + restored_last_line, restored_first_line); + + /* + * The content of these lines after modify+resore should be the same + * as the original. + */ + assert(strcmp(restored_last_line->text, last_line->text) =3D=3D 0); + assert(strcmp(restored_first_line->text, first_line->text) =3D=3D 0); +} + +static inline void split_vma(const struct vma_modifier_info *mod_info) +{ + assert(mmap(mod_info->addr, page_size, mod_info->prot | PROT_EXEC, + MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, + -1, 0) !=3D MAP_FAILED); +} + +static inline void merge_vma(const struct vma_modifier_info *mod_info) +{ + assert(mmap(mod_info->addr, page_size, mod_info->prot, + MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, + -1, 0) !=3D MAP_FAILED); +} + +static inline void check_split_result(struct line_content *mod_last_line, + struct line_content *mod_first_line, + struct line_content *restored_last_line, + struct line_content *restored_first_line) +{ + /* Make sure vmas at the boundaries are changing */ + assert(strcmp(mod_last_line->text, restored_last_line->text) !=3D 0); + assert(strcmp(mod_first_line->text, restored_first_line->text) !=3D 0); +} + +static void test_maps_tearing_from_split(int maps_fd, + struct vma_modifier_info *mod_info, + struct page_content *page1, + struct page_content *page2, + struct line_content *last_line, + struct line_content *first_line) +{ + struct line_content split_last_line; + struct line_content split_first_line; + struct line_content restored_last_line; + struct line_content 
restored_first_line; + + wait_for_state(mod_info, SETUP_READY); + + /* re-read the file to avoid using stale data from previous test */ + read_boundary_lines(maps_fd, page1, page2, last_line, first_line); + + mod_info->vma_modify =3D split_vma; + mod_info->vma_restore =3D merge_vma; + mod_info->vma_mod_check =3D check_split_result; + + capture_mod_pattern(maps_fd, mod_info, page1, page2, last_line, first_lin= e, + &split_last_line, &split_first_line, + &restored_last_line, &restored_first_line); + + /* Now start concurrent modifications for test_duration_sec */ + signal_state(mod_info, TEST_READY); + + struct line_content new_last_line; + struct line_content new_first_line; + struct timespec start_ts, end_ts; + + clock_gettime(CLOCK_MONOTONIC_COARSE, &start_ts); + do { + bool last_line_changed; + bool first_line_changed; + + read_boundary_lines(maps_fd, page1, page2, &new_last_line, &new_first_li= ne); + + /* Check if we read vmas after split */ + if (!strcmp(new_last_line.text, split_last_line.text)) { + /* + * The vmas should be consistent with split results, + * however if vma was concurrently restored after a + * split, it can be reported twice (first the original + * split one, then the same vma but extended after the + * merge) because we found it as the next vma again. + * In that case new first line will be the same as the + * last restored line. + */ + assert(!strcmp(new_first_line.text, split_first_line.text) || + !strcmp(new_first_line.text, restored_last_line.text)); + } else { + /* The vmas should be consistent with merge results */ + assert(!strcmp(new_last_line.text, restored_last_line.text) && + !strcmp(new_first_line.text, restored_first_line.text)); + } + /* + * First and last lines should change in unison. If the last + * line changed then the first line should change as well and + * vice versa. 
+ */ + last_line_changed =3D strcmp(new_last_line.text, last_line->text) !=3D 0; + first_line_changed =3D strcmp(new_first_line.text, first_line->text) != =3D 0; + assert(last_line_changed =3D=3D first_line_changed); + + clock_gettime(CLOCK_MONOTONIC_COARSE, &end_ts); + } while (end_ts.tv_sec - start_ts.tv_sec < test_duration_sec); + + /* Signal the modifyer thread to stop and wait until it exits */ + signal_state(mod_info, TEST_DONE); +} + +static int test_maps_tearing(void) +{ + struct vma_modifier_info *mod_info; + pthread_mutexattr_t mutex_attr; + pthread_condattr_t cond_attr; + int shared_mem_size; + char fname[32]; + int vma_count; + int maps_fd; + int status; + pid_t pid; + + /* + * Have to map enough vmas for /proc/pid/maps to containt more than one + * page worth of vmas. Assume at least 32 bytes per line in maps output + */ + vma_count =3D page_size / 32 + 1; + shared_mem_size =3D sizeof(struct vma_modifier_info) + vma_count * sizeof= (void *); + + /* map shared memory for communication with the child process */ + mod_info =3D (struct vma_modifier_info *)mmap(NULL, shared_mem_size, + PROT_READ | PROT_WRITE, MAP_SHARED | MAP_ANONYMOUS, -1, 0); + + assert(mod_info !=3D MAP_FAILED); + + /* Initialize shared members */ + pthread_mutexattr_init(&mutex_attr); + pthread_mutexattr_setpshared(&mutex_attr, PTHREAD_PROCESS_SHARED); + assert(!pthread_mutex_init(&mod_info->sync_lock, &mutex_attr)); + pthread_condattr_init(&cond_attr); + pthread_condattr_setpshared(&cond_attr, PTHREAD_PROCESS_SHARED); + assert(!pthread_cond_init(&mod_info->sync_cond, &cond_attr)); + mod_info->vma_count =3D vma_count; + mod_info->curr_state =3D INIT; + mod_info->exit =3D false; + + pid =3D fork(); + if (!pid) { + /* Child process */ + child_vma_modifier(mod_info); + return 0; + } + + sprintf(fname, "/proc/%d/maps", pid); + maps_fd =3D open(fname, O_RDONLY); + assert(maps_fd !=3D -1); + + /* Wait for the child to map the VMAs */ + wait_for_state(mod_info, CHILD_READY); + + /* Read first 
two pages */ + struct page_content page1; + struct page_content page2; + + page1.data =3D malloc(page_size); + assert(page1.data); + page2.data =3D malloc(page_size); + assert(page2.data); + + struct line_content last_line; + struct line_content first_line; + + read_boundary_lines(maps_fd, &page1, &page2, &last_line, &first_line); + + /* + * Find the addresses corresponding to the last line in the first page + * and the first line in the last page. + */ + mod_info->addr =3D NULL; + mod_info->next_addr =3D NULL; + for (int i =3D 0; i < mod_info->vma_count; i++) { + if (mod_info->child_mapped_addr[i] =3D=3D (void *)last_line.start_addr) { + mod_info->addr =3D mod_info->child_mapped_addr[i]; + mod_info->prot =3D PROT_READ; + /* Even VMAs have write permission */ + if ((i % 2) =3D=3D 0) + mod_info->prot |=3D PROT_WRITE; + } else if (mod_info->child_mapped_addr[i] =3D=3D (void *)first_line.star= t_addr) { + mod_info->next_addr =3D mod_info->child_mapped_addr[i]; + } + + if (mod_info->addr && mod_info->next_addr) + break; + } + assert(mod_info->addr && mod_info->next_addr); + + signal_state(mod_info, PARENT_READY); + + test_maps_tearing_from_split(maps_fd, mod_info, &page1, &page2, + &last_line, &first_line); + + stop_vma_modifier(mod_info); + + free(page2.data); + free(page1.data); + + for (int i =3D 0; i < vma_count; i++) + munmap(mod_info->child_mapped_addr[i], page_size); + close(maps_fd); + waitpid(pid, &status, 0); + munmap(mod_info, shared_mem_size); + + return 0; +} + +int usage(void) +{ + fprintf(stderr, "Userland /proc/pid/{s}maps test cases\n"); + fprintf(stderr, " -d: Duration for time-consuming tests\n"); + fprintf(stderr, " -h: Help screen\n"); + exit(-1); +} + +int main(int argc, char **argv) { int pipefd[2]; int exec_fd; + int opt; + + while ((opt =3D getopt(argc, argv, "d:h")) !=3D -1) { + if (opt =3D=3D 'd') + test_duration_sec =3D strtoul(optarg, NULL, 0); + else if (opt =3D=3D 'h') + usage(); + } =20 + page_size =3D sysconf(_SC_PAGESIZE); vsyscall(); 
switch (g_vsyscall) { case 0: @@ -578,6 +1002,10 @@ int main(void) assert(err =3D=3D -ENOENT); } =20 + /* Test tearing in /proc/$PID/maps */ + if (test_maps_tearing()) + return 1; + return 0; } #else --=20 2.49.0.805.g082f7c87e0-goog From nobody Fri Dec 19 16:08:24 2025 Received: from mail-pj1-f73.google.com (mail-pj1-f73.google.com [209.85.216.73]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 9795D21D011 for ; Fri, 18 Apr 2025 17:50:08 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.216.73 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744998610; cv=none; b=iWAcYGqeuaCrpWb84Pq4UYqlNEFBucSDLRXgMIysb5zanO0oLOWRrumcDjKaRKds8dPtsoDASxrbWG00uHVrP+EAR4MznwTgkcrplNqt45xmA73IaCM5dK9nIph0RYRyBM2mPdsq7voBIfMc+vhptuxQLatokR8tQJdLZYGHPn0= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744998610; c=relaxed/simple; bh=xrrLN9Fn238jRuFI24Vz65P8O0LQj4xBEOmBNdG0wOE=; h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type; b=jTVJk9/CREPV6Y0/MUrDrKJC6u/CvpmF5Ybx4lfSpaCi3c7B90o+Z/KDpN2eBxz9qfhiP+4q1iWxuIP7HgPL3ZXlsjIpSiqEYHPDUmDbl0abAcM1L4jAnKPZuht0W5PnqIXjG+k8slfBg8yCqKkUo1/TgChKxV93OePN2Zlfed4= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--surenb.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=yG31AAK9; arc=none smtp.client-ip=209.85.216.73 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--surenb.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="yG31AAK9" Received: by 
mail-pj1-f73.google.com with SMTP id 98e67ed59e1d1-3086107d023so2093426a91.2 for ; Fri, 18 Apr 2025 10:50:08 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1744998608; x=1745603408; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=NX95Q4Cv9wzSD/gXCZRQ7iUBrYCRAWvVBL1zZlzjT0w=; b=yG31AAK9k6Rdbreim3y4gDSAGuYVWVQgjafeYykx6b+1iGTAph8VruRvrYWn+dSshB bVz+iB6kb67kosGpJFnaK/192jCS3f7L6ckDtp6A33vsBSUoppgOR6lKeMEgMd8nvyVV sPSGUMpGRH7DawY3iGkVF5k28/82iv4EPAe/FO9NWTM315uHlF8Aw69KHNX363f4GYnv 5y3m7+LRUUXTqXrhQ531acDFF3PEMiRx6NdWF+crMpsQgIMPlSNXeLrRgsCopZrLhtTN DOPeJn01Gbjf9IzFh2n5+FXj6+O6ayVXMsvNSlliivJHG78u7p04M3UZExFmsTodX8TY d7yQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1744998608; x=1745603408; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=NX95Q4Cv9wzSD/gXCZRQ7iUBrYCRAWvVBL1zZlzjT0w=; b=PhCmvO+PBFwuLWM1+9kNi9FPZSYS/W30oNqs1bDicToS+rMDETXbJwecb/HJVen2iy PrVWd5fGFmfeygoM5TxGGTE8v+T5+qSe/8dFyMjIrJgCFdDf+8zCaHhtDP2fnc3nZ2Zt P5Zwv1K69tQDC6xP25y4qIiFx/3QPh/Mxoq9EzrP2Pnv9uo/znvkkpixgnL1ktq/oyaF 58YD5Xsl9E6pYOT3yuEyLRpXaCzLqm6rzkcjJ1RO+aRb376LcXw1sa1CwlMTIUcpEcTh cQsupCxPVkwgTnKms4qdJB8FQ/yRUzr2/5kTO5T6nvjTBdP7bKEZ/JPU4y7qieKoR0Bz /UZA== X-Forwarded-Encrypted: i=1; AJvYcCXp4VU1eG09GjGd4dnbI3qLIG6jHvpyG8UrZeKfeeI6150Wj79rqwJ/wXf4yaW7C6rCbIVYVQyH7R9WPZY=@vger.kernel.org X-Gm-Message-State: AOJu0YxVyFTvIJuSCTa/k7BSDtsDGIPBZQHQCWhC4apnj4Gk2sCVMADp tXWciwuoGwFcRr95gzlf0JChhzNzxMkPqpZnfQ4Cr9fPUgL4xdNLpSqbmKIHSUkc3qGXIQ7E+WY uOw== X-Google-Smtp-Source: AGHT+IFpXjxxv3q6WygkdxKZ0NcMoXnZ0n+s0fsiF2xjhvVMd0DnZR58HYYN8GWJayn2FJMme4MlbLuDSV4= X-Received: from pjyp13.prod.google.com ([2002:a17:90a:e70d:b0:2fc:2ee0:d38a]) (user=surenb job=prod-delivery.src-stubby-dispatcher) by 2002:a17:90b:5404:b0:305:2d68:8d39 
with SMTP id 98e67ed59e1d1-3087bb571a2mr6196148a91.12.1744998607909; Fri, 18 Apr 2025 10:50:07 -0700 (PDT)
Date: Fri, 18 Apr 2025 10:49:53 -0700
In-Reply-To: <20250418174959.1431962-1-surenb@google.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
Mime-Version: 1.0
References: <20250418174959.1431962-1-surenb@google.com>
X-Mailer: git-send-email 2.49.0.805.g082f7c87e0-goog
Message-ID: <20250418174959.1431962-3-surenb@google.com>
Subject: [PATCH v3 2/8] selftests/proc: extend /proc/pid/maps tearing test to include vma resizing
From: Suren Baghdasaryan
To: akpm@linux-foundation.org
Cc: Liam.Howlett@oracle.com, lorenzo.stoakes@oracle.com, david@redhat.com, vbabka@suse.cz, peterx@redhat.com, jannh@google.com, hannes@cmpxchg.org, mhocko@kernel.org, paulmck@kernel.org, shuah@kernel.org, adobriyan@gmail.com, brauner@kernel.org, josef@toxicpanda.com, yebin10@huawei.com, linux@weissschuh.net, willy@infradead.org, osalvador@suse.de, andrii@kernel.org, ryan.roberts@arm.com, christophe.leroy@csgroup.eu, linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-kselftest@vger.kernel.org, surenb@google.com
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

Test that /proc/pid/maps does not report unexpected holes in the address space when a vma at the edge of the page is being concurrently remapped. This remapping results in the vma shrinking and expanding from under the reader. We should always see either the shrunk or the expanded (original) version of the vma.
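As a rough illustration of "either shrunk or expanded", the check below parses the address range of a maps line and accepts only the two expected lengths. It is a sketch under assumptions: resize_line_is_valid() is a hypothetical helper, not part of the patch, which instead compares whole lines against the patterns captured during setup.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/*
 * Parse the "start-end" prefix of a maps line and report whether the vma
 * starts at the expected address and is either shrunk_len or full_len
 * bytes long. Any other length would indicate a torn read.
 */
static bool resize_line_is_valid(const char *line, unsigned long start,
				 unsigned long shrunk_len,
				 unsigned long full_len)
{
	unsigned long s, e;

	if (sscanf(line, "%lx-%lx", &s, &e) != 2 || e <= s)
		return false;
	if (s != start)
		return false;
	return (e - s) == shrunk_len || (e - s) == full_len;
}
```

With the sizes used by shrink_vma() and expand_vma() in the diff below, shrunk_len would be page_size and full_len would be page_size * 3.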
Signed-off-by: Suren Baghdasaryan --- tools/testing/selftests/proc/proc-pid-vm.c | 83 ++++++++++++++++++++++ 1 file changed, 83 insertions(+) diff --git a/tools/testing/selftests/proc/proc-pid-vm.c b/tools/testing/sel= ftests/proc/proc-pid-vm.c index 6e3f06376a1f..39842e4ec45f 100644 --- a/tools/testing/selftests/proc/proc-pid-vm.c +++ b/tools/testing/selftests/proc/proc-pid-vm.c @@ -583,6 +583,86 @@ static void test_maps_tearing_from_split(int maps_fd, signal_state(mod_info, TEST_DONE); } =20 +static inline void shrink_vma(const struct vma_modifier_info *mod_info) +{ + assert(mremap(mod_info->addr, page_size * 3, page_size, 0) !=3D MAP_FAILE= D); +} + +static inline void expand_vma(const struct vma_modifier_info *mod_info) +{ + assert(mremap(mod_info->addr, page_size, page_size * 3, 0) !=3D MAP_FAILE= D); +} + +static inline void check_shrink_result(struct line_content *mod_last_line, + struct line_content *mod_first_line, + struct line_content *restored_last_line, + struct line_content *restored_first_line) +{ + /* Make sure only the last vma of the first page is changing */ + assert(strcmp(mod_last_line->text, restored_last_line->text) !=3D 0); + assert(strcmp(mod_first_line->text, restored_first_line->text) =3D=3D 0); +} + +static void test_maps_tearing_from_resize(int maps_fd, + struct vma_modifier_info *mod_info, + struct page_content *page1, + struct page_content *page2, + struct line_content *last_line, + struct line_content *first_line) +{ + struct line_content shrunk_last_line; + struct line_content shrunk_first_line; + struct line_content restored_last_line; + struct line_content restored_first_line; + + wait_for_state(mod_info, SETUP_READY); + + /* re-read the file to avoid using stale data from previous test */ + read_boundary_lines(maps_fd, page1, page2, last_line, first_line); + + mod_info->vma_modify =3D shrink_vma; + mod_info->vma_restore =3D expand_vma; + mod_info->vma_mod_check =3D check_shrink_result; + + capture_mod_pattern(maps_fd, mod_info, 
page1, page2, last_line, first_lin= e, + &shrunk_last_line, &shrunk_first_line, + &restored_last_line, &restored_first_line); + + /* Now start concurrent modifications for test_duration_sec */ + signal_state(mod_info, TEST_READY); + + struct line_content new_last_line; + struct line_content new_first_line; + struct timespec start_ts, end_ts; + + clock_gettime(CLOCK_MONOTONIC_COARSE, &start_ts); + do { + read_boundary_lines(maps_fd, page1, page2, &new_last_line, &new_first_li= ne); + + /* Check if we read vmas after shrinking it */ + if (!strcmp(new_last_line.text, shrunk_last_line.text)) { + /* + * The vmas should be consistent with shrunk results, + * however if the vma was concurrently restored, it + * can be reported twice (first as shrunk one, then + * as restored one) because we found it as the next vma + * again. In that case new first line will be the same + * as the last restored line. + */ + assert(!strcmp(new_first_line.text, shrunk_first_line.text) || + !strcmp(new_first_line.text, restored_last_line.text)); + } else { + /* The vmas should be consistent with the original/resored state */ + assert(!strcmp(new_last_line.text, restored_last_line.text) && + !strcmp(new_first_line.text, restored_first_line.text)); + } + clock_gettime(CLOCK_MONOTONIC_COARSE, &end_ts); + } while (end_ts.tv_sec - start_ts.tv_sec < test_duration_sec); + + /* Signal the modifyer thread to stop and wait until it exits */ + signal_state(mod_info, TEST_DONE); +} + static int test_maps_tearing(void) { struct vma_modifier_info *mod_info; @@ -674,6 +754,9 @@ static int test_maps_tearing(void) test_maps_tearing_from_split(maps_fd, mod_info, &page1, &page2, &last_line, &first_line); =20 + test_maps_tearing_from_resize(maps_fd, mod_info, &page1, &page2, + &last_line, &first_line); + stop_vma_modifier(mod_info); =20 free(page2.data); --=20 2.49.0.805.g082f7c87e0-goog From nobody Fri Dec 19 16:08:24 2025 Received: from mail-pj1-f74.google.com (mail-pj1-f74.google.com [209.85.216.74]) (using 
TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B34C021D5B7 for ; Fri, 18 Apr 2025 17:50:10 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.216.74 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744998612; cv=none; b=MB4Tjo6loHc2m4wDvwfHQFrNwX2cwgIdrkckfQzc6B0D/U9duu8CK0Oi2q5C2zZGmEfxKv3N/ZQ2ns1xJ/tQqwlxToElnT3cwEOnUchevC+UG7mIAYB2CjAz87lzKZWALNgk6LCuW0XWmFwfENPo0kab8zIXT4Avwp70V6GIAZA= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744998612; c=relaxed/simple; bh=TUJL4tusOyal/p5nEXPnA97MvfjXVXmJvXC+NGBZezE=; h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type; b=bBbx1rkgDxkHHdZJGqyzWGuR0YduhpUt9Kr+xsN1l3E6l4/lOVCM+9FvGeh+VR9kiiXOUlBi3elhB/pSGfxIgPrzDsUN/oLlqKfyrddUcp8J5R0tfcb0avJN36mV9tQ+H4L+PVTdgyMhFyTA0/MrjvdBqTxt3ZAjSAD2Bgyo2CU= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--surenb.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=nKG0sC0x; arc=none smtp.client-ip=209.85.216.74 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--surenb.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="nKG0sC0x" Received: by mail-pj1-f74.google.com with SMTP id 98e67ed59e1d1-2ff605a7a43so3038388a91.3 for ; Fri, 18 Apr 2025 10:50:10 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1744998610; x=1745603410; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to 
:date:from:to:cc:subject:date:message-id:reply-to; bh=9QDrVpMHVQI58rYUIwLQJvrC+l7tIXNhALhARIzYV3A=; b=nKG0sC0xUFO3qXwEEMHDsqp8eX1jDxaf1Ws679PICGPuqIfOAtuxIxNrGoefT/1Mn9 g7AYKf7FWjUUHU2KO4M5m+ke2RjFO/9HGl2P3xFKiz6DKrrPa4pJ2bU1ErxajH5Au/Fs eJpe4sWpUyPECmb46nOz6mhNkRqoAJFzYI5fsuShrTc3ltwZu5Zv3G3o3Yl9D4Tp9Bix yqXRuobZ14TiLrtnylB55JKL8Gu44HfE7mZyOuYAr2k0z5eT8iwOwH2lrhv0DUBTieQH LizFT1jE8Bk8jVykNcZgO0ps0/6WKdlLlZuMhCX9cr8KaMiPtZgPYXaY2CLEpq9wD3le 6M/g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1744998610; x=1745603410; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=9QDrVpMHVQI58rYUIwLQJvrC+l7tIXNhALhARIzYV3A=; b=wN4KuPalZb2pkfHvxFj355GfI57pZi6nbNe5F6RqSTeC4s5oAENpSYuvcxtHyPTFKI w+oZUjIevcQyOxg6a1xggrR5ezP0Wrw/FR1Bff2C9HunxNFvae5+JYqZxNAGQzeegVtr KAIje1Jb/0OCZDMAdNKev8IWqql5dJX0DFrQGfxGW6pseAI1LDpYoUutDO5OMrwhAZ95 +BcQiJDAQRdu5/i0Cy6RiEnq4D4w0yH3uQZuCy4/sLC+8iJn97U7xx+H4o1PE+HhLbFI 3g3yObqz8/jFtIMhQCAvOsUjZqmropydcWylcBesRI/Zmgo8sV97ysXLlJb+guagJelb 9tGw== X-Forwarded-Encrypted: i=1; AJvYcCWlY7uS3WcX8iJ15R0aqkNlb8bUPPVxTkW6cxMHcLcSiiD8Nyv7nbPSSItqVZUZKvbAajafhDetKvuco9w=@vger.kernel.org X-Gm-Message-State: AOJu0Yz1NJarxIWEdN8WSH5dfGO+/F9Q9rtHgC8Q4bcBlFSNLk0Dga5B ys4wA8Kdnd3MUCq4Ay78sCzOi3ybudACqS6KOZezIiXU+/XXlTU7aZsGH5dCCQNGzgJ0dPVka9l rTQ== X-Google-Smtp-Source: AGHT+IE7iTtCIpLL5pKA5lwf7kKMbjD35Kmet76OppQlPi+RQQiOVcZRsAZLaU2DHrZErhMdfWGzeuof8yo= X-Received: from pjqq6.prod.google.com ([2002:a17:90b:5846:b0:2fc:3022:36b8]) (user=surenb job=prod-delivery.src-stubby-dispatcher) by 2002:a17:90b:4ed0:b0:2fe:8c22:48b0 with SMTP id 98e67ed59e1d1-3087bb6d159mr5887502a91.15.1744998609934; Fri, 18 Apr 2025 10:50:09 -0700 (PDT) Date: Fri, 18 Apr 2025 10:49:54 -0700 In-Reply-To: <20250418174959.1431962-1-surenb@google.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 
1.0 References: <20250418174959.1431962-1-surenb@google.com> X-Mailer: git-send-email 2.49.0.805.g082f7c87e0-goog Message-ID: <20250418174959.1431962-4-surenb@google.com> Subject: [PATCH v3 3/8] selftests/proc: extend /proc/pid/maps tearing test to include vma remapping From: Suren Baghdasaryan To: akpm@linux-foundation.org Cc: Liam.Howlett@oracle.com, lorenzo.stoakes@oracle.com, david@redhat.com, vbabka@suse.cz, peterx@redhat.com, jannh@google.com, hannes@cmpxchg.org, mhocko@kernel.org, paulmck@kernel.org, shuah@kernel.org, adobriyan@gmail.com, brauner@kernel.org, josef@toxicpanda.com, yebin10@huawei.com, linux@weissschuh.net, willy@infradead.org, osalvador@suse.de, andrii@kernel.org, ryan.roberts@arm.com, christophe.leroy@csgroup.eu, linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-kselftest@vger.kernel.org, surenb@google.com Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Test that /proc/pid/maps does not report unexpected holes in the address space when a part of one vma is concurrently remapped into the middle of another vma. The remapping splits the destination vma into three parts, and the middle part is then patched back with mprotect() so that the vma is restored, all while the reader is walking the file. The reader should always see either the original vma or the split one, with no holes.
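A minimal sketch of the "no holes" invariant described above, not part of the patch itself. The selftest compares the boundary lines as strings; the hypothetical helper maps_range_has_no_holes() below instead assumes it is handed the text of /proc/pid/maps (sorted by address, as the kernel emits it) and checks that consecutive entries cover [lo, hi) without a gap:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, for illustration only: walk maps-style text
 * ("start-end perms ..." per line, addresses in hex, sorted by start)
 * and report whether the entries cover [lo, hi) without any gap.
 */
bool maps_range_has_no_holes(const char *maps_text,
			     unsigned long lo, unsigned long hi)
{
	unsigned long covered = lo;	/* everything below this is covered */
	const char *line = maps_text;

	assert(maps_text && lo < hi);

	while (line && *line && covered < hi) {
		unsigned long start, end;

		/* Skip entries that end at or below the covered address */
		if (sscanf(line, "%lx-%lx", &start, &end) == 2 &&
		    end > covered) {
			if (start > covered)
				return false;	/* gap before this entry */
			covered = end;
		}

		/* Advance to the next line, if there is one */
		line = strchr(line, '\n');
		if (line)
			line++;
	}

	return covered >= hi;
}
```

Called with the start of the destination vma as lo and its end as hi, this would return true both for a single 3-page entry and for three adjacent 1-page entries, and false if the middle entry were missing.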
Signed-off-by: Suren Baghdasaryan --- tools/testing/selftests/proc/proc-pid-vm.c | 92 ++++++++++++++++++++++ 1 file changed, 92 insertions(+) diff --git a/tools/testing/selftests/proc/proc-pid-vm.c b/tools/testing/sel= ftests/proc/proc-pid-vm.c index 39842e4ec45f..1aef2db7e893 100644 --- a/tools/testing/selftests/proc/proc-pid-vm.c +++ b/tools/testing/selftests/proc/proc-pid-vm.c @@ -663,6 +663,95 @@ static void test_maps_tearing_from_resize(int maps_fd, signal_state(mod_info, TEST_DONE); } =20 +static inline void remap_vma(const struct vma_modifier_info *mod_info) +{ + /* + * Remap the last page of the next vma into the middle of the vma. + * This splits the current vma and the first and middle parts (the + * parts at lower addresses) become the last vma observed in the + * first page and the first vma observed in the last page. + */ + assert(mremap(mod_info->next_addr + page_size * 2, page_size, + page_size, MREMAP_FIXED | MREMAP_MAYMOVE | MREMAP_DONTUNMAP, + mod_info->addr + page_size) !=3D MAP_FAILED); +} + +static inline void patch_vma(const struct vma_modifier_info *mod_info) +{ + assert(!mprotect(mod_info->addr + page_size, page_size, + mod_info->prot)); +} + +static inline void check_remap_result(struct line_content *mod_last_line, + struct line_content *mod_first_line, + struct line_content *restored_last_line, + struct line_content *restored_first_line) +{ + /* Make sure vmas at the boundaries are changing */ + assert(strcmp(mod_last_line->text, restored_last_line->text) !=3D 0); + assert(strcmp(mod_first_line->text, restored_first_line->text) !=3D 0); +} + +static void test_maps_tearing_from_remap(int maps_fd, + struct vma_modifier_info *mod_info, + struct page_content *page1, + struct page_content *page2, + struct line_content *last_line, + struct line_content *first_line) +{ + struct line_content remapped_last_line; + struct line_content remapped_first_line; + struct line_content restored_last_line; + struct line_content restored_first_line; + +
wait_for_state(mod_info, SETUP_READY); + + /* re-read the file to avoid using stale data from previous test */ + read_boundary_lines(maps_fd, page1, page2, last_line, first_line); + + mod_info->vma_modify =3D remap_vma; + mod_info->vma_restore =3D patch_vma; + mod_info->vma_mod_check =3D check_remap_result; + + capture_mod_pattern(maps_fd, mod_info, page1, page2, last_line, first_lin= e, + &remapped_last_line, &remapped_first_line, + &restored_last_line, &restored_first_line); + + /* Now start concurrent modifications for test_duration_sec */ + signal_state(mod_info, TEST_READY); + + struct line_content new_last_line; + struct line_content new_first_line; + struct timespec start_ts, end_ts; + + clock_gettime(CLOCK_MONOTONIC_COARSE, &start_ts); + do { + read_boundary_lines(maps_fd, page1, page2, &new_last_line, &new_first_li= ne); + + /* Check if we read vmas after remapping it */ + if (!strcmp(new_last_line.text, remapped_last_line.text)) { + /* + * The vmas should be consistent with remap results, + * however if the vma was concurrently restored, it + * can be reported twice (first as split one, then + * as restored one) because we found it as the next vma + * again. In that case new first line will be the same + * as the last restored line. 
+ */ + assert(!strcmp(new_first_line.text, remapped_first_line.text) || + !strcmp(new_first_line.text, restored_last_line.text)); + } else { + /* The vmas should be consistent with the original/restored state */ + assert(!strcmp(new_last_line.text, restored_last_line.text) && + !strcmp(new_first_line.text, restored_first_line.text)); + } + clock_gettime(CLOCK_MONOTONIC_COARSE, &end_ts); + } while (end_ts.tv_sec - start_ts.tv_sec < test_duration_sec); + + /* Signal the modifier thread to stop and wait until it exits */ + signal_state(mod_info, TEST_DONE); +} + static int test_maps_tearing(void) { struct vma_modifier_info *mod_info; @@ -757,6 +846,9 @@ static int test_maps_tearing(void) test_maps_tearing_from_resize(maps_fd, mod_info, &page1, &page2, &last_line, &first_line); =20 + test_maps_tearing_from_remap(maps_fd, mod_info, &page1, &page2, + &last_line, &first_line); + stop_vma_modifier(mod_info); =20 free(page2.data); --=20 2.49.0.805.g082f7c87e0-goog From nobody Fri Dec 19 16:08:24 2025 Received: from mail-pg1-f201.google.com (mail-pg1-f201.google.com [209.85.215.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id D7E7B21C9F3 for ; Fri, 18 Apr 2025 17:50:12 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.215.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744998614; cv=none; b=AMmxIkUusa5tDXnIrQjUBa24lmi42+dLDYMJpsWdCk1YKih9A3cz5aAAgyBq3XsViRxrICvKONu/vrlIhi2q388sKhLEgbxWGH73iq658aAkHL/USMKejvGTFnbLqsp/1g+aPkJ5+7Kq+oJ6YT7cKknhB3OAXrSdFrUeuw0wenA= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744998614; c=relaxed/simple; bh=uhXXixeQxZQZ9ZI8mpiMJ9ZnVW4idzYW13f4X6dKbPc=; h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type;
b=aMc0WI5GU7HlPNPCBm6HTdR1qUaDOrbZc4wN5nMtyP8VAMPDuTifs8SkrX9b7piRsh6/G7FyZmTZwbdfhKMHgGO+4P3gM4DjWNWlPdvSoeLG6+wyfO/hlQV1z5nUUK4c/YRqTUfkxk+OlFd+m3YQsVN659iqnjE+pJUmrRfHMjc= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--surenb.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=gpNzRx0W; arc=none smtp.client-ip=209.85.215.201 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--surenb.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="gpNzRx0W" Received: by mail-pg1-f201.google.com with SMTP id 41be03b00d2f7-b0dd00e1a01so337339a12.0 for ; Fri, 18 Apr 2025 10:50:12 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1744998612; x=1745603412; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=HUB6q+v0EpFzFu3PlHq45yNrya7EC7jKKGmhERTYTZQ=; b=gpNzRx0WjBK+hFs1U4pMiD9FHvYKBsXf0BCfVxReNyWi3/LU9dcZYTabEYsltxxCF5 ohiF6fsfmsOALWRILucVkKmVQJUaFt4lyZkAiugsHpCXjDSqU/BALh8noSuWtEqwhLft XLNgWpLfL/NccQVomW2bPfaLOCJOLJ9lsEceC6U2esyQ5A2ABvZtS5hh8zxYTlYdlzMo C3RnP3lQG9LEiUvJzHZm4+plRoeBBuVqEtDEbdHafXcwP0YhBsuKPCFJhUsTSbOIYXJQ VGaJW1cTHUBm9C6aLMObEbOzpfUTvMNzyT4Gpatbrc1nDsS0CyITqYedO0zXDK1G6t0F mevQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1744998612; x=1745603412; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=HUB6q+v0EpFzFu3PlHq45yNrya7EC7jKKGmhERTYTZQ=; b=kZBsKfrP4ogweFfaokec3yxKHO0ri42izg1s9GrUsVjersgJZettYiAyGFh04Tp3ef 
qDPxIKiZ9cQyFYmVdw+p4poX7wmkYyDUBANIeP2EHCDUWnRQo+ZnKW+LmRHY0LIh1Haq R+bULPkNW0crCxSw1AdRmXDMwo/WywPxK1dDLm00xVvYByzgCW1LOVsxhsNPM4hVBkHK X+K+4hPgT84XIxTJChTwXfMlODicNgYdcgyh70xsxwsaw4acH5NbMZzY/B/Z527+GpcY fp+lU4nSSecXbF6GfwYIPZNeKA62eR7NNiQv7iIInlQoxDUBG/CrjFCguivMGPMp5jVP Vapw== X-Forwarded-Encrypted: i=1; AJvYcCWgv5H123TeKGV9ksqWCe7kFAYkOeEwLYJfksOXwpogLr/SzoLxFlrwXrjJLZcloJJs+Hj3Kaq4ukVe4lc=@vger.kernel.org X-Gm-Message-State: AOJu0YxMM37tWjz4Q1354Ql5TLLC4L9DRc1v7DyUWA3SRtOsh94Kpiag N75Rgfgj25lY2XMiIaC1VZpgfolo4khGgfDzEPuGkUhWe7Rx8tAelW8wsSMj8e8W9UlzAe6MJpJ T3w== X-Google-Smtp-Source: AGHT+IGyaQr+2pZMSmPsh5wX1RoiROvtwBkd0HseqzqLOxOjXALv+e2mSSAlluxAbqqn3sMiDcQhj5J0Y/Q= X-Received: from pjbqx7.prod.google.com ([2002:a17:90b:3e47:b0:301:2679:9aa]) (user=surenb job=prod-delivery.src-stubby-dispatcher) by 2002:a17:90b:2809:b0:301:1c29:a1d9 with SMTP id 98e67ed59e1d1-3087bb66b26mr5412483a91.21.1744998612089; Fri, 18 Apr 2025 10:50:12 -0700 (PDT) Date: Fri, 18 Apr 2025 10:49:55 -0700 In-Reply-To: <20250418174959.1431962-1-surenb@google.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20250418174959.1431962-1-surenb@google.com> X-Mailer: git-send-email 2.49.0.805.g082f7c87e0-goog Message-ID: <20250418174959.1431962-5-surenb@google.com> Subject: [PATCH v3 4/8] selftests/proc: test PROCMAP_QUERY ioctl while vma is concurrently modified From: Suren Baghdasaryan To: akpm@linux-foundation.org Cc: Liam.Howlett@oracle.com, lorenzo.stoakes@oracle.com, david@redhat.com, vbabka@suse.cz, peterx@redhat.com, jannh@google.com, hannes@cmpxchg.org, mhocko@kernel.org, paulmck@kernel.org, shuah@kernel.org, adobriyan@gmail.com, brauner@kernel.org, josef@toxicpanda.com, yebin10@huawei.com, linux@weissschuh.net, willy@infradead.org, osalvador@suse.de, andrii@kernel.org, ryan.roberts@arm.com, christophe.leroy@csgroup.eu, linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, 
linux-mm@kvack.org, linux-kselftest@vger.kernel.org, surenb@google.com Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Extend /proc/pid/maps tearing test to verify PROCMAP_QUERY ioctl operation correctness while the vma is being concurrently modified. Signed-off-by: Suren Baghdasaryan --- tools/testing/selftests/proc/proc-pid-vm.c | 60 ++++++++++++++++++++++ 1 file changed, 60 insertions(+) diff --git a/tools/testing/selftests/proc/proc-pid-vm.c b/tools/testing/sel= ftests/proc/proc-pid-vm.c index 1aef2db7e893..b582f40851fb 100644 --- a/tools/testing/selftests/proc/proc-pid-vm.c +++ b/tools/testing/selftests/proc/proc-pid-vm.c @@ -486,6 +486,21 @@ static void capture_mod_pattern(int maps_fd, assert(strcmp(restored_first_line->text, first_line->text) =3D=3D 0); } =20 +static void query_addr_at(int maps_fd, void *addr, + unsigned long *vma_start, unsigned long *vma_end) +{ + struct procmap_query q; + + memset(&q, 0, sizeof(q)); + q.size =3D sizeof(q); + /* Find the VMA at the split address */ + q.query_addr =3D (unsigned long long)addr; + q.query_flags =3D 0; + assert(!ioctl(maps_fd, PROCMAP_QUERY, &q)); + *vma_start =3D q.vma_start; + *vma_end =3D q.vma_end; +} + static inline void split_vma(const struct vma_modifier_info *mod_info) { assert(mmap(mod_info->addr, page_size, mod_info->prot | PROT_EXEC, @@ -546,6 +561,8 @@ static void test_maps_tearing_from_split(int maps_fd, do { bool last_line_changed; bool first_line_changed; + unsigned long vma_start; + unsigned long vma_end; =20 read_boundary_lines(maps_fd, page1, page2, &new_last_line, &new_first_li= ne); =20 @@ -576,6 +593,19 @@ static void test_maps_tearing_from_split(int maps_fd, first_line_changed =3D strcmp(new_first_line.text, first_line->text) != =3D 0; assert(last_line_changed =3D=3D first_line_changed); =20 + /* Check if PROCMAP_QUERY ioctl() finds the right VMA */ + query_addr_at(maps_fd, mod_info->addr + page_size, + &vma_start, &vma_end); + /* + * The vma at the
split address can be either the same as the + * original one (if read before the split) or the same as the + * first line in the second page (if read after the split). + */ + assert((vma_start =3D=3D last_line->start_addr && + vma_end =3D=3D last_line->end_addr) || + (vma_start =3D=3D split_first_line.start_addr && + vma_end =3D=3D split_first_line.end_addr)); + clock_gettime(CLOCK_MONOTONIC_COARSE, &end_ts); } while (end_ts.tv_sec - start_ts.tv_sec < test_duration_sec); =20 @@ -637,6 +667,9 @@ static void test_maps_tearing_from_resize(int maps_fd, =20 clock_gettime(CLOCK_MONOTONIC_COARSE, &start_ts); do { + unsigned long vma_start; + unsigned long vma_end; + read_boundary_lines(maps_fd, page1, page2, &new_last_line, &new_first_li= ne); =20 /* Check if we read vmas after shrinking it */ @@ -656,6 +689,17 @@ static void test_maps_tearing_from_resize(int maps_fd, assert(!strcmp(new_last_line.text, restored_last_line.text) && !strcmp(new_first_line.text, restored_first_line.text)); } + + /* Check if PROCMAP_QUERY ioctl() finds the right VMA */ + query_addr_at(maps_fd, mod_info->addr, &vma_start, &vma_end); + /* + * The vma should stay at the same address and have either the + * original size of 3 pages or 1 page if read after shrinking.
+ */ + assert(vma_start =3D=3D last_line->start_addr && + (vma_end - vma_start =3D=3D page_size * 3 || + vma_end - vma_start =3D=3D page_size)); + clock_gettime(CLOCK_MONOTONIC_COARSE, &end_ts); } while (end_ts.tv_sec - start_ts.tv_sec < test_duration_sec); =20 @@ -726,6 +770,9 @@ static void test_maps_tearing_from_remap(int maps_fd, =20 clock_gettime(CLOCK_MONOTONIC_COARSE, &start_ts); do { + unsigned long vma_start; + unsigned long vma_end; + read_boundary_lines(maps_fd, page1, page2, &new_last_line, &new_first_li= ne); =20 /* Check if we read vmas after remapping it */ @@ -745,6 +792,19 @@ static void test_maps_tearing_from_remap(int maps_fd, assert(!strcmp(new_last_line.text, restored_last_line.text) && !strcmp(new_first_line.text, restored_first_line.text)); } + + /* Check if PROCMAP_QUERY ioctl() finds the right VMA */ + query_addr_at(maps_fd, mod_info->addr + page_size, &vma_start, &vma_end); + /* + * The vma should either stay at the same address and have the + * original size of 3 pages or we should find the remapped vma + * at the remap destination address with size of 1 page.
+ */ + assert((vma_start =3D=3D last_line->start_addr && + vma_end - vma_start =3D=3D page_size * 3) || + (vma_start =3D=3D last_line->start_addr + page_size && + vma_end - vma_start =3D=3D page_size)); + clock_gettime(CLOCK_MONOTONIC_COARSE, &end_ts); } while (end_ts.tv_sec - start_ts.tv_sec < test_duration_sec); =20 --=20 2.49.0.805.g082f7c87e0-goog From nobody Fri Dec 19 16:08:24 2025 Received: from mail-pj1-f74.google.com (mail-pj1-f74.google.com [209.85.216.74]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B5A0D224252 for ; Fri, 18 Apr 2025 17:50:14 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.216.74 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744998616; cv=none; b=fqtM+dO4MgnAjGmnT2VAkJs7t+sdinMerG3UZVgrWMvT+0zf0XbC8WcOg9TzXz+PRTfEAb/PQ2n5CywgZ/pvKi0uqOCN+cWo/HeIx/bx3G31p9mdpD1wDxuDcDf61oqUp/pHyfQ0B/SuAa5VDtBry9UQQzrv7LDPfgSGwfLuCwI= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744998616; c=relaxed/simple; bh=rl3OkV3B3ApFRI5rofk/H3MDnkkOrJGscLlJwuZAlos=; h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type; b=pbflR0pGwGTmSWn9BDit5FHPJ0hfhaDvTMGloF0UWNUmZd7Krp1UbgfeGiM3CErUMUc5vtNgvrmYY9Ze25PuwU464TIQggukEGP9f8ymqRG/RSeIgbZiiAUYQHtsmI5XoNkENhdmCsG28Hn81RCuYLRUw2NqtWC/BQmNidd7RsQ= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--surenb.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=EiXf3qoh; arc=none smtp.client-ip=209.85.216.74 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--surenb.bounces.google.com Authentication-Results: 
smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="EiXf3qoh" Received: by mail-pj1-f74.google.com with SMTP id 98e67ed59e1d1-3032f4eca83so1887912a91.3 for ; Fri, 18 Apr 2025 10:50:14 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1744998614; x=1745603414; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=4jqUSvj/NGRZpdVDELGH7hX9O/M0JsmqV6H5P9bViYo=; b=EiXf3qohPxuE94qE9LZ7HlQH8KWG70D7QyL+Y+gAu15RUfz0FM4MssXB9XznRxgowJ lIdlX8mQC3RlQzt6J7imu4C9R22LWfZU4oanCk2eheN/42mmq28jSLWEFYw9G31lkIYt QJ4vv537uLkkZ1UbB6g4DmIeTgP/p7KoaNhQXL4Ey6Kv34fKE6g1SQpqwVwcwjAP14h+ 15H1xl3T3SyvD83OblZEktV6/3DtT/utzMGuD+DCbpkaN34r+WOGHmYUk71OdyGzCWem bMNqkRr/WXnYksZhclEp2CDu252nP53UfGCrYfylsWGnsY5lYX+k+4YjLK6drfbdQMxL t1qA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1744998614; x=1745603414; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=4jqUSvj/NGRZpdVDELGH7hX9O/M0JsmqV6H5P9bViYo=; b=VksFWWzpffztpt5ejo7sKyhjTT5BnSZNN18HGXQ+/l1TLDLuaCtGReu6i3mViYKOrh rv2FVqGCFWOcJ2CE0pdqydk/O3PBevxcqvbkKCXKrkXw6MLc3/ppiixsKNbb+/PGiyA2 k9ibC78JJ7KroZ0msUzCjyKn4nifdHbMKcYOdXOqWpLq9GQn6QxAGB7XJxZPwDaG0ccu 84ZqoD0GooIKLb4YAv29hcQXHbyxFe/BdFXRTkbZajlgKQ++S9pXTiEl5Q65R+5QUVT8 7+/NpLIsgA/So8iNP7Rg1REwY6pHFPH8kgfWJjVcg+hcW7kbz7pNoiSbt+2LfsytOEs4 70hA== X-Forwarded-Encrypted: i=1; AJvYcCX69RuGmmPzrwhiLlgHMfJQHvG5oHXwd7MH7uoDuADr1c1WlGH547VxmWDrG6H1CtqIE04DoD4YJpQ+BxM=@vger.kernel.org X-Gm-Message-State: AOJu0Yy9+kL5UbfwNrAB4Y7nfB/eB5GV0SH9zBLDEU2IvZipBYCRqV3s 3td5+RRnnlIgqu/ptuzCy0K/CjQTydaPUP3yzk/jOC+xXR2yc36uhVI9a/jZNaYRdr3HRExhPnr gOg== X-Google-Smtp-Source: AGHT+IERCla4iFsgpWYpwI6HMOKZWs0UZbPEMYoTuXdvkIt8Z5/Vr0hH5bN0CnC5Fg33kNuXlwhByED7Sls= X-Received: from pjbpd10.prod.google.com 
([2002:a17:90b:1dca:b0:2fc:c98:ea47]) (user=surenb job=prod-delivery.src-stubby-dispatcher) by 2002:a17:90b:3d89:b0:2f4:4500:bb4d with SMTP id 98e67ed59e1d1-3087bb6ba88mr5312595a91.20.1744998614144; Fri, 18 Apr 2025 10:50:14 -0700 (PDT) Date: Fri, 18 Apr 2025 10:49:56 -0700 In-Reply-To: <20250418174959.1431962-1-surenb@google.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20250418174959.1431962-1-surenb@google.com> X-Mailer: git-send-email 2.49.0.805.g082f7c87e0-goog Message-ID: <20250418174959.1431962-6-surenb@google.com> Subject: [PATCH v3 5/8] selftests/proc: add verbose mode for tests to facilitate debugging From: Suren Baghdasaryan To: akpm@linux-foundation.org Cc: Liam.Howlett@oracle.com, lorenzo.stoakes@oracle.com, david@redhat.com, vbabka@suse.cz, peterx@redhat.com, jannh@google.com, hannes@cmpxchg.org, mhocko@kernel.org, paulmck@kernel.org, shuah@kernel.org, adobriyan@gmail.com, brauner@kernel.org, josef@toxicpanda.com, yebin10@huawei.com, linux@weissschuh.net, willy@infradead.org, osalvador@suse.de, andrii@kernel.org, ryan.roberts@arm.com, christophe.leroy@csgroup.eu, linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-kselftest@vger.kernel.org, surenb@google.com Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Add verbose mode to the proc tests to print debugging information.
Usage: proc-pid-vm -v Signed-off-by: Suren Baghdasaryan --- tools/testing/selftests/proc/proc-pid-vm.c | 154 +++++++++++++++++++-- 1 file changed, 141 insertions(+), 13 deletions(-) diff --git a/tools/testing/selftests/proc/proc-pid-vm.c b/tools/testing/sel= ftests/proc/proc-pid-vm.c index b582f40851fb..97017f48cd70 100644 --- a/tools/testing/selftests/proc/proc-pid-vm.c +++ b/tools/testing/selftests/proc/proc-pid-vm.c @@ -73,6 +73,7 @@ static void make_private_tmp(void) } =20 static unsigned long test_duration_sec =3D 5UL; +static bool verbose; static int page_size; static pid_t pid =3D -1; static void ate(void) @@ -452,6 +453,99 @@ static void stop_vma_modifier(struct vma_modifier_info= *mod_info) signal_state(mod_info, SETUP_MODIFY_MAPS); } =20 +static void print_first_lines(char *text, int nr) +{ + const char *end =3D text; + + while (nr && (end =3D strchr(end, '\n')) !=3D NULL) { + nr--; + end++; + } + + if (end) { + int offs =3D end - text; + + text[offs] =3D '\0'; + printf("%s", text); + text[offs] =3D '\n'; + printf("\n"); + } else { + printf("%s", text); + } +} + +static void print_last_lines(char *text, int nr) +{ + const char *start =3D text + strlen(text); + + nr++; /* to ignore the last newline */ + while (nr) { + while (start > text && *start !=3D '\n') + start--; + nr--; + start--; + } + printf("%s", start); +} + +static void print_boundaries(const char *title, + struct page_content *page1, + struct page_content *page2) +{ + if (!verbose) + return; + + printf("%s", title); + /* Print 3 boundary lines from each page */ + print_last_lines(page1->data, 3); + printf("-----------------page boundary-----------------\n"); + print_first_lines(page2->data, 3); +} + +static bool print_boundaries_on(bool condition, const char *title, + struct page_content *page1, + struct page_content *page2) +{ + if (verbose && condition) + print_boundaries(title, page1, page2); + + return condition; +} + +static void report_test_start(const char *name) +{ + if (verbose) +
printf("=3D=3D=3D=3D %s =3D=3D=3D=3D\n", name); +} + +static struct timespec print_ts; + +static void start_test_loop(struct timespec *ts) +{ + if (verbose) + print_ts.tv_sec =3D ts->tv_sec; +} + +static void end_test_iteration(struct timespec *ts) +{ + if (!verbose) + return; + + /* Update every second */ + if (print_ts.tv_sec =3D=3D ts->tv_sec) + return; + + printf("."); + fflush(stdout); + print_ts.tv_sec =3D ts->tv_sec; +} + +static void end_test_loop(void) +{ + if (verbose) + printf("\n"); +} + static void capture_mod_pattern(int maps_fd, struct vma_modifier_info *mod_info, struct page_content *page1, @@ -463,18 +557,24 @@ static void capture_mod_pattern(int maps_fd, struct line_content *restored_last_line, struct line_content *restored_first_line) { + print_boundaries("Before modification", page1, page2); + signal_state(mod_info, SETUP_MODIFY_MAPS); wait_for_state(mod_info, SETUP_MAPS_MODIFIED); =20 /* Copy last line of the first page and first line of the last page */ read_boundary_lines(maps_fd, page1, page2, mod_last_line, mod_first_line); =20 + print_boundaries("After modification", page1, page2); + signal_state(mod_info, SETUP_RESTORE_MAPS); wait_for_state(mod_info, SETUP_MAPS_RESTORED); =20 /* Copy last line of the first page and first line of the last page */ read_boundary_lines(maps_fd, page1, page2, restored_last_line, restored_f= irst_line); =20 + print_boundaries("After restore", page1, page2); + mod_info->vma_mod_check(mod_last_line, mod_first_line, restored_last_line, restored_first_line); =20 @@ -546,6 +646,7 @@ static void test_maps_tearing_from_split(int maps_fd, mod_info->vma_restore =3D merge_vma; mod_info->vma_mod_check =3D check_split_result; =20 + report_test_start("Tearing from split"); capture_mod_pattern(maps_fd, mod_info, page1, page2, last_line, first_lin= e, &split_last_line, &split_first_line, &restored_last_line, &restored_first_line); @@ -558,6 +659,7 @@ static void test_maps_tearing_from_split(int maps_fd, struct timespec 
start_ts, end_ts; =20 clock_gettime(CLOCK_MONOTONIC_COARSE, &start_ts); + start_test_loop(&start_ts); do { bool last_line_changed; bool first_line_changed; @@ -577,12 +679,17 @@ static void test_maps_tearing_from_split(int maps_fd, * In that case new first line will be the same as the * last restored line. */ - assert(!strcmp(new_first_line.text, split_first_line.text) || - !strcmp(new_first_line.text, restored_last_line.text)); + assert(!print_boundaries_on( + strcmp(new_first_line.text, split_first_line.text) && + strcmp(new_first_line.text, restored_last_line.text), + "Split result invalid", page1, page2)); + } else { /* The vmas should be consistent with merge results */ - assert(!strcmp(new_last_line.text, restored_last_line.text) && - !strcmp(new_first_line.text, restored_first_line.text)); + assert(!print_boundaries_on( + strcmp(new_last_line.text, restored_last_line.text) || + strcmp(new_first_line.text, restored_first_line.text), + "Merge result invalid", page1, page2)); } /* * First and last lines should change in unison. 
If the last @@ -607,7 +714,9 @@ static void test_maps_tearing_from_split(int maps_fd, vma_end =3D=3D split_first_line.end_addr)); =20 clock_gettime(CLOCK_MONOTONIC_COARSE, &end_ts); + end_test_iteration(&end_ts); } while (end_ts.tv_sec - start_ts.tv_sec < test_duration_sec); + end_test_loop(); =20 /* Signal the modifier thread to stop and wait until it exits */ signal_state(mod_info, TEST_DONE); @@ -654,6 +763,7 @@ static void test_maps_tearing_from_resize(int maps_fd, mod_info->vma_restore =3D expand_vma; mod_info->vma_mod_check =3D check_shrink_result; =20 + report_test_start("Tearing from resize"); capture_mod_pattern(maps_fd, mod_info, page1, page2, last_line, first_lin= e, &shrunk_last_line, &shrunk_first_line, &restored_last_line, &restored_first_line); @@ -666,6 +776,7 @@ static void test_maps_tearing_from_resize(int maps_fd, struct timespec start_ts, end_ts; =20 clock_gettime(CLOCK_MONOTONIC_COARSE, &start_ts); + start_test_loop(&start_ts); do { unsigned long vma_start; unsigned long vma_end; @@ -682,12 +793,16 @@ static void test_maps_tearing_from_resize(int maps_fd, * again. In that case new first line will be the same * as the last restored line.
*/ - assert(!strcmp(new_first_line.text, shrunk_first_line.text) || - !strcmp(new_first_line.text, restored_last_line.text)); + assert(!print_boundaries_on( + strcmp(new_first_line.text, shrunk_first_line.text) && + strcmp(new_first_line.text, restored_last_line.text), + "Shrink result invalid", page1, page2)); } else { /* The vmas should be consistent with the original/restored state */ - assert(!strcmp(new_last_line.text, restored_last_line.text) && - !strcmp(new_first_line.text, restored_first_line.text)); + assert(!print_boundaries_on( + strcmp(new_last_line.text, restored_last_line.text) || + strcmp(new_first_line.text, restored_first_line.text), + "Expand result invalid", page1, page2)); } =20 /* Check if PROCMAP_QUERY ioctl() finds the right VMA */ @@ -701,7 +816,9 @@ static void test_maps_tearing_from_resize(int maps_fd, vma_end - vma_start =3D=3D page_size)); =20 clock_gettime(CLOCK_MONOTONIC_COARSE, &end_ts); + end_test_iteration(&end_ts); } while (end_ts.tv_sec - start_ts.tv_sec < test_duration_sec); + end_test_loop(); =20 /* Signal the modifier thread to stop and wait until it exits */ signal_state(mod_info, TEST_DONE); @@ -757,6 +874,7 @@ static void test_maps_tearing_from_remap(int maps_fd, mod_info->vma_restore =3D patch_vma; mod_info->vma_mod_check =3D check_remap_result; =20 + report_test_start("Tearing from remap"); capture_mod_pattern(maps_fd, mod_info, page1, page2, last_line, first_lin= e, &remapped_last_line, &remapped_first_line, &restored_last_line, &restored_first_line); @@ -769,6 +887,7 @@ static void test_maps_tearing_from_remap(int maps_fd, struct timespec start_ts, end_ts; =20 clock_gettime(CLOCK_MONOTONIC_COARSE, &start_ts); + start_test_loop(&start_ts); do { unsigned long vma_start; unsigned long vma_end; @@ -785,12 +904,16 @@ static void test_maps_tearing_from_remap(int maps_fd, * again. In that case new first line will be the same * as the last restored line.
*/ - assert(!strcmp(new_first_line.text, remapped_first_line.text) || - !strcmp(new_first_line.text, restored_last_line.text)); + assert(!print_boundaries_on( + strcmp(new_first_line.text, remapped_first_line.text) && + strcmp(new_first_line.text, restored_last_line.text), + "Remap result invalid", page1, page2)); } else { /* The vmas should be consistent with the original/restored state */ - assert(!strcmp(new_last_line.text, restored_last_line.text) && - !strcmp(new_first_line.text, restored_first_line.text)); + assert(!print_boundaries_on( + strcmp(new_last_line.text, restored_last_line.text) || + strcmp(new_first_line.text, restored_first_line.text), + "Remap restore result invalid", page1, page2)); } =20 /* Check if PROCMAP_QUERY ioctl() finds the right VMA */ @@ -806,7 +929,9 @@ static void test_maps_tearing_from_remap(int maps_fd, vma_end - vma_start =3D=3D page_size)); =20 clock_gettime(CLOCK_MONOTONIC_COARSE, &end_ts); + end_test_iteration(&end_ts); } while (end_ts.tv_sec - start_ts.tv_sec < test_duration_sec); + end_test_loop(); =20 /* Signal the modifier thread to stop and wait until it exits */ signal_state(mod_info, TEST_DONE); @@ -927,6 +1052,7 @@ int usage(void) { fprintf(stderr, "Userland /proc/pid/{s}maps test cases\n"); fprintf(stderr, " -d: Duration for time-consuming tests\n"); + fprintf(stderr, " -v: Verbose mode\n"); fprintf(stderr, " -h: Help screen\n"); exit(-1); } @@ -937,9 +1063,11 @@ int main(int argc, char **argv) int exec_fd; int opt; =20 - while ((opt =3D getopt(argc, argv, "d:h")) !=3D -1) { + while ((opt =3D getopt(argc, argv, "d:vh")) !=3D -1) { if (opt =3D=3D 'd') test_duration_sec =3D strtoul(optarg, NULL, 0); + else if (opt =3D=3D 'v') + verbose =3D true; else if (opt =3D=3D 'h') usage(); } --=20 2.49.0.805.g082f7c87e0-goog From nobody Fri Dec 19 16:08:24 2025 Received: from mail-pl1-f201.google.com (mail-pl1-f201.google.com [209.85.214.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client
certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 1CE9A22539F for ; Fri, 18 Apr 2025 17:50:16 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.214.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744998627; cv=none; b=NVzF3LZDczR7OoQBs7ivaZQCx4xBttnFCBoOmtdxOhzIaNHCoEOMX6KlrKAXy+Y1BWBnZgp6sFiMo688tVzjgBL8YcAQFxgevSegGM02CYZVILveSaVSSAvTGmIb3MvKBBwLYvutkcMdD/WBwboHk0aJg7Ebk19Q5UDPX7CtLbk= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744998627; c=relaxed/simple; bh=0VmCWLkn+PpW3InwqLr223zLcy4GX+RqloQObCjZ3BI=; h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type; b=aj3OkjvywezQwEFgTt60sBt2UntOMhD8lR7HQgUbk1Nta4Zz6Ferhx9rOzlRvzQ4O4+PK7huxjd5NSe9eNM19L40Qvejmed6wZiCkx4oqZzsfDIzgxEM14Tz1wzSee8LCdefqa2fspxReVXrxaUMk0Dbrg2ZTXE2BahZgvYotq4= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--surenb.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=sA7cSpEr; arc=none smtp.client-ip=209.85.214.201 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--surenb.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="sA7cSpEr" Received: by mail-pl1-f201.google.com with SMTP id d9443c01a7336-225ab228a37so19079845ad.2 for ; Fri, 18 Apr 2025 10:50:16 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1744998616; x=1745603416; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=g/gTsaXuJx1lGF+ktcF8yVCekjdFoDtmHyOs+3ik9i4=; 
b=sA7cSpEr6d/lRDaTrwePeHKs/3e1PjU9ZccRpuy/juBfMaDtMhXeMmg/l/86rdHJXM SCC4DiRJcjoknFeD1nKeMalx3XJd4luigYHWXHusq01aD7BwDJsnH/TJZ/avZhiPKT5Q hOnItm1TVmZDGKkEZ45DpICf1krvwzUoro6b6ASKbWH+J3Avs1DmhSrmw10+KzQzPhwZ E6vd1BWLiH8tX1hcAyCdYRNvwXKtGFsR0rLSa/JvTj5MaIBv0FK4zhY06y+e0uawTotp iWHdTRL/uF8Rx+8QsxUIakLGofTdFeUtAMWS6Kw2F04lQtL+OWeLbQ28pzkRXBCk5ENM v5xA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1744998616; x=1745603416; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=g/gTsaXuJx1lGF+ktcF8yVCekjdFoDtmHyOs+3ik9i4=; b=VBBEDXPNCI85JTOt2xaLIyOukOJmVWLVZxYkwbsIMsR42wkqMC6bbTdsrCjLErfBup idbRNWCw0FS1E7zDDR4o2ODPXKC6WTOZdwjWA+hDE/Cyt5YpdE5DcWfdwlZacRYywW9I KDxv5LjwOldvtHf1FKXGGoLURcwmNVQ72yLQ2FEWEFk75ZZ7nQShiQrSvaTAV3qiAPC/ KHf1jkk72bsQTJenw6bZ4bYdRzDnihP3yhGMsCD9o7pvtNqCjDL04/JOkotXkyU1Rqwj ihB4Iz4Esxr5ZJyOcw1qV1Vy7XUkK45KPgT2ITwZk5faRt6sN9wocmfEyFEZeiswuidi mc9w== X-Forwarded-Encrypted: i=1; AJvYcCVpjRRZr/E/yLXBWIBKmqj7Nc6udoREq9UpFbSsWSpgh/TYaa7L/H0L79N5Bx+jdSBuRPlPYewxvdAP0Ng=@vger.kernel.org X-Gm-Message-State: AOJu0YxrJDAHqa67pTBk4WECIHlT3nwsaqdhI9o1PNWgrY4Va+FSKrIM t39pgnqRCDuAoP6SJWpuj2Jhmtf/O7mWRE3pG281Srzi2JhL88dsH2bTBXfNgdxaWBwgEHNpmMY 7nQ== X-Google-Smtp-Source: AGHT+IGjcICUXwS1wYeFYZR+XrM2Xg014/1EfWbXSEfTLiETYQCBjkJqO1ncD1j5guZx5QFaNlxV4IfN5dY= X-Received: from plld7.prod.google.com ([2002:a17:902:7287:b0:223:8244:94f6]) (user=surenb job=prod-delivery.src-stubby-dispatcher) by 2002:a17:903:1744:b0:227:e980:919d with SMTP id d9443c01a7336-22c536207dfmr51660975ad.47.1744998616385; Fri, 18 Apr 2025 10:50:16 -0700 (PDT) Date: Fri, 18 Apr 2025 10:49:57 -0700 In-Reply-To: <20250418174959.1431962-1-surenb@google.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20250418174959.1431962-1-surenb@google.com> X-Mailer: git-send-email 
2.49.0.805.g082f7c87e0-goog Message-ID: <20250418174959.1431962-7-surenb@google.com> Subject: [PATCH v3 6/8] mm: make vm_area_struct anon_name field RCU-safe From: Suren Baghdasaryan To: akpm@linux-foundation.org Cc: Liam.Howlett@oracle.com, lorenzo.stoakes@oracle.com, david@redhat.com, vbabka@suse.cz, peterx@redhat.com, jannh@google.com, hannes@cmpxchg.org, mhocko@kernel.org, paulmck@kernel.org, shuah@kernel.org, adobriyan@gmail.com, brauner@kernel.org, josef@toxicpanda.com, yebin10@huawei.com, linux@weissschuh.net, willy@infradead.org, osalvador@suse.de, andrii@kernel.org, ryan.roberts@arm.com, christophe.leroy@csgroup.eu, linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-kselftest@vger.kernel.org, surenb@google.com Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" For lockless /proc/pid/maps reading we have to ensure all the fields used when generating the output are RCU-safe. The only pointer fields in vm_area_struct which are used to generate that file's output are vm_file and anon_name. vm_file is RCU-safe but anon_name is not. Make anon_name RCU-safe as well. 
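As an illustration only, the reader-side contract described above (return a pinned name, or NULL if the name is already on its way to being freed) can be sketched in userspace C11. This is not kernel code: struct name_ref and the name_ref_*() helpers are hypothetical stand-ins for struct anon_vma_name, kref_get_unless_zero() and anon_vma_name_get_rcu(). The sketch also leaves out the RCU grace period (kfree_rcu() in the patch), which is what makes it safe in the kernel to look at the refcount of an object that may be freed concurrently.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct anon_vma_name; not kernel code. */
struct name_ref {
	atomic_int refcount;	/* plays the role of the kref */
	const char *name;
};

/*
 * Take a reference only if the count is still non-zero. Once the count
 * has dropped to zero the object is being torn down and must not be
 * revived, which is the idea behind kref_get_unless_zero().
 */
static bool name_ref_get_unless_zero(struct name_ref *ref)
{
	int old = atomic_load(&ref->refcount);

	while (old != 0) {
		/* On failure, old is reloaded with the current value. */
		if (atomic_compare_exchange_weak(&ref->refcount, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference; returns true when the caller dropped the last one. */
static bool name_ref_put(struct name_ref *ref)
{
	int prev = atomic_fetch_sub(&ref->refcount, 1);

	assert(prev > 0);
	return prev == 1;
}

/*
 * Reader-side lookup: load the published pointer once, then try to pin
 * it. Returns the pinned object, or NULL if the slot is empty or the
 * object could no longer be pinned.
 */
static struct name_ref *name_ref_lookup(_Atomic(struct name_ref *) *slot)
{
	struct name_ref *ref = atomic_load(slot);

	if (!ref || !name_ref_get_unless_zero(ref))
		return NULL;
	return ref;
}
```

A caller that gets a non-NULL result owns one reference and has to drop it with name_ref_put() when done. As the comment on anon_vma_name_get_rcu() in the patch notes, the pinned object is stable but is not guaranteed to still be the one attached to the VMA.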
Signed-off-by: Suren Baghdasaryan --- include/linux/mm_inline.h | 10 +++++++++- include/linux/mm_types.h | 3 ++- mm/madvise.c | 30 ++++++++++++++++++++++++++---- 3 files changed, 37 insertions(+), 6 deletions(-) diff --git a/include/linux/mm_inline.h b/include/linux/mm_inline.h index f9157a0c42a5..9ac2d92d7ede 100644 --- a/include/linux/mm_inline.h +++ b/include/linux/mm_inline.h @@ -410,7 +410,7 @@ static inline void dup_anon_vma_name(struct vm_area_str= uct *orig_vma, struct anon_vma_name *anon_name =3D anon_vma_name(orig_vma); =20 if (anon_name) - new_vma->anon_name =3D anon_vma_name_reuse(anon_name); + rcu_assign_pointer(new_vma->anon_name, anon_vma_name_reuse(anon_name)); } =20 static inline void free_anon_vma_name(struct vm_area_struct *vma) @@ -432,6 +432,8 @@ static inline bool anon_vma_name_eq(struct anon_vma_nam= e *anon_name1, !strcmp(anon_name1->name, anon_name2->name); } =20 +struct anon_vma_name *anon_vma_name_get_rcu(struct vm_area_struct *vma); + #else /* CONFIG_ANON_VMA_NAME */ static inline void anon_vma_name_get(struct anon_vma_name *anon_name) {} static inline void anon_vma_name_put(struct anon_vma_name *anon_name) {} @@ -445,6 +447,12 @@ static inline bool anon_vma_name_eq(struct anon_vma_na= me *anon_name1, return true; } =20 +static inline +struct anon_vma_name *anon_vma_name_get_rcu(struct vm_area_struct *vma) +{ + return NULL; +} + #endif /* CONFIG_ANON_VMA_NAME */ =20 static inline void init_tlb_flush_pending(struct mm_struct *mm) diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h index 56d07edd01f9..15ec288d4a21 100644 --- a/include/linux/mm_types.h +++ b/include/linux/mm_types.h @@ -700,6 +700,7 @@ struct vm_userfaultfd_ctx {}; =20 struct anon_vma_name { struct kref kref; + struct rcu_head rcu; /* The name needs to be at the end because it is dynamically sized. */ char name[]; }; @@ -874,7 +875,7 @@ struct vm_area_struct { * terminated string containing the name given to the vma, or NULL if * unnamed. 
Serialized by mmap_lock. Use anon_vma_name to access. */ - struct anon_vma_name *anon_name; + struct anon_vma_name __rcu *anon_name; #endif struct vm_userfaultfd_ctx vm_userfaultfd_ctx; } __randomize_layout; diff --git a/mm/madvise.c b/mm/madvise.c index 8433ac9b27e0..ed03a5a2c140 100644 --- a/mm/madvise.c +++ b/mm/madvise.c @@ -101,14 +101,15 @@ void anon_vma_name_free(struct kref *kref) { struct anon_vma_name *anon_name =3D container_of(kref, struct anon_vma_name, kref); - kfree(anon_name); + kfree_rcu(anon_name, rcu); } =20 struct anon_vma_name *anon_vma_name(struct vm_area_struct *vma) { mmap_assert_locked(vma->vm_mm); =20 - return vma->anon_name; + return rcu_dereference_protected(vma->anon_name, + rwsem_is_locked(&vma->vm_mm->mmap_lock)); } =20 /* mmap_lock should be write-locked */ @@ -118,7 +119,7 @@ static int replace_anon_vma_name(struct vm_area_struct = *vma, struct anon_vma_name *orig_name =3D anon_vma_name(vma); =20 if (!anon_name) { - vma->anon_name =3D NULL; + rcu_assign_pointer(vma->anon_name, NULL); anon_vma_name_put(orig_name); return 0; } @@ -126,11 +127,32 @@ static int replace_anon_vma_name(struct vm_area_struc= t *vma, if (anon_vma_name_eq(orig_name, anon_name)) return 0; =20 - vma->anon_name =3D anon_vma_name_reuse(anon_name); + rcu_assign_pointer(vma->anon_name, anon_vma_name_reuse(anon_name)); anon_vma_name_put(orig_name); =20 return 0; } + +/* + * Returned anon_vma_name is stable due to elevated refcount but not guara= nteed + * to be assigned to the original VMA after the call. 
+ */ +struct anon_vma_name *anon_vma_name_get_rcu(struct vm_area_struct *vma) +{ + struct anon_vma_name __rcu *anon_name; + + WARN_ON_ONCE(!rcu_read_lock_held()); + + anon_name =3D rcu_dereference(vma->anon_name); + if (!anon_name) + return NULL; + + if (unlikely(!kref_get_unless_zero(&anon_name->kref))) + return NULL; + + return anon_name; +} + #else /* CONFIG_ANON_VMA_NAME */ static int replace_anon_vma_name(struct vm_area_struct *vma, struct anon_vma_name *anon_name) --=20 2.49.0.805.g082f7c87e0-goog From nobody Fri Dec 19 16:08:24 2025 Received: from mail-pg1-f201.google.com (mail-pg1-f201.google.com [209.85.215.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 6B22F225415 for ; Fri, 18 Apr 2025 17:50:19 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.215.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744998621; cv=none; b=BXzd5IC+3udOPjBx8K+NpxtbGNcY16u8T+6Y8z4dC15OnerXbXto0SWmfuhZGVKPEHgAKI2YevR9UF4ykzRa8RfU3yn5w33hlJIxYpUaKq+q8oNIwPP4C1gmNpo35VFzrlfBVQmS5FOrxNPeIjVktPWpt0pTnTvSIwXfGpxLmj0= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744998621; c=relaxed/simple; bh=20+DFZgTng6ZYblv4okOjrUyBQ5mxE92qM420ijlexM=; h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type; b=ODPn9l7M0qWflKEIWWjyJbJBjgBShTX4r7/ary50N4Mf6wtO20mucb6kv3AP725p4lDrYWDquzkTQEsFW8QpXNDlz63o0SyddSt8qlT1BuBISG+r0oRVUy4Wjot9XSBiif2uvJvBX29XaGr//O6QNMdfqJECYhMfti/yxi48psU= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--surenb.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=KygVUAlx; arc=none smtp.client-ip=209.85.215.201 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) 
header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--surenb.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="KygVUAlx" Received: by mail-pg1-f201.google.com with SMTP id 41be03b00d2f7-b090c7c2c6aso1306033a12.0 for ; Fri, 18 Apr 2025 10:50:19 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1744998618; x=1745603418; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=/zYtkrY7HUMXj6I83iq2T2dcIR3++4RWvz/bo+0QsbU=; b=KygVUAlxQS1mQjAnX7J0juIs8PRjKB4JOuOMYE3GwrDNuu1Jvqt8xvy+6QQR4lt1KA /CF85Dds4ZVc5+O/TPD5C9dniUOzdO70Tu3Z5o1sKPV7xQBXBEyucKDE1qA24B+sgskq suT9T2Ua1so9sAaRknrOfIz9Cht8EjjWHWA30s9B1FarjSUk3zuxjARV4I6tzM4nnvjJ +jo5r+RWj0fyKeHLMsQw3Fs9XdDRPJpjFfk3ckJh4trZR/D/M77ffiJVEMcoyWcP4WcZ +tYQvlR7ll3W5Oe+7bcyX3OZemHFSYKuHNzU5Q8fhXNLQbyClccKjfv/+WDlDQ7fferR 9WBg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1744998618; x=1745603418; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=/zYtkrY7HUMXj6I83iq2T2dcIR3++4RWvz/bo+0QsbU=; b=w/ngkw+Ht2HxR3DliwtzoQ2+34S1JabnX6dVLPbUK0MA8WE1teOSLYB3EC3xwfNN0t XE55Gds12KxFUHT5lSL6m/MsEHoorZIbe5RjB95fxzDHKvBAslPDFtKMJ9M6h4OzlR5h OaTjH66nAagjcvEEE5UfC4a2BDhhccXU/WYpSlY8E8CI7+RkgoU3bYaNbRyt9N/r5oxj CPryfGk1QaielfHF4KsX1a2L/IJINm2Hv8Mj3FAH9kKcKKUl6dk2Ict4Ee9IUQY2FYRT fj56WI1kta4EI/TYHfZddoGh8ULv3PXZQVhync8P849hmFAeN9f9/gsVUqbHvS9278Rf 07dw== X-Forwarded-Encrypted: i=1; AJvYcCXqhzw7bCvjdj9WXGfO9GOaW9daTLr/BzsXnqfopfMd6+kD8+APTjfq74wS1oC7T5aDJrjtJ5QqKw/vtII=@vger.kernel.org X-Gm-Message-State: AOJu0YxB0ErgGJ3VYWuX8BolIzgFdQmARAv678VLa+HZWlxquEXkxkV8 jQoU7qv3dfyorSe9reJcYqGcpp0eZzgX2wBexPcFeHQQpfJa0z8qU+YUPOTnsrWzvgLNukvsTw2 hHw== 
X-Google-Smtp-Source: AGHT+IFtA4TMsuStPYu7aoBjj//DZ8se4BNHvSXu+W66hYuPWTAds876LFCU1b60VeWLvlCjpqZ1/KyDDUw= X-Received: from pjbpa2.prod.google.com ([2002:a17:90b:2642:b0:2ef:7af4:5e8e]) (user=surenb job=prod-delivery.src-stubby-dispatcher) by 2002:a17:90b:528a:b0:2ff:7ad4:77b1 with SMTP id 98e67ed59e1d1-3087bb3973fmr6045262a91.2.1744998618622; Fri, 18 Apr 2025 10:50:18 -0700 (PDT) Date: Fri, 18 Apr 2025 10:49:58 -0700 In-Reply-To: <20250418174959.1431962-1-surenb@google.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20250418174959.1431962-1-surenb@google.com> X-Mailer: git-send-email 2.49.0.805.g082f7c87e0-goog Message-ID: <20250418174959.1431962-8-surenb@google.com> Subject: [PATCH v3 7/8] mm/maps: read proc/pid/maps under RCU From: Suren Baghdasaryan To: akpm@linux-foundation.org Cc: Liam.Howlett@oracle.com, lorenzo.stoakes@oracle.com, david@redhat.com, vbabka@suse.cz, peterx@redhat.com, jannh@google.com, hannes@cmpxchg.org, mhocko@kernel.org, paulmck@kernel.org, shuah@kernel.org, adobriyan@gmail.com, brauner@kernel.org, josef@toxicpanda.com, yebin10@huawei.com, linux@weissschuh.net, willy@infradead.org, osalvador@suse.de, andrii@kernel.org, ryan.roberts@arm.com, christophe.leroy@csgroup.eu, linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-kselftest@vger.kernel.org, surenb@google.com Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" With maple_tree supporting vma tree traversal under RCU and vma and its important members being RCU-safe, /proc/pid/maps can be read under RCU and without the need to read-lock mmap_lock. However vma content can change from under us, therefore we make a copy of the vma and we pin pointer fields used when generating the output (currently only vm_file and anon_name). Afterwards we check for concurrent address space modifications, wait for them to end and retry. 
While we take the mmap_lock for reading during such contention, we do that momentarily only to record new mm_wr_seq counter. This change is designed to reduce mmap_lock contention and prevent a process reading /proc/pid/maps files (often a low priority task, such as monitoring/data collection services) from blocking address space updates. Note that this change has a userspace-visible disadvantage: it allows for sub-page data tearing as opposed to the previous mechanism where data tearing could happen only between pages of generated output data. Since current userspace considers data tearing between pages to be acceptable, we assume it will be able to handle sub-page data tearing as well. Signed-off-by: Suren Baghdasaryan --- fs/proc/internal.h | 6 ++ fs/proc/task_mmu.c | 170 ++++++++++++++++++++++++++++++++++---- include/linux/mm_inline.h | 18 ++++ 3 files changed, 177 insertions(+), 17 deletions(-) diff --git a/fs/proc/internal.h b/fs/proc/internal.h index 96122e91c645..6e1169c1f4df 100644 --- a/fs/proc/internal.h +++ b/fs/proc/internal.h @@ -379,6 +379,12 @@ struct proc_maps_private { struct task_struct *task; struct mm_struct *mm; struct vma_iterator iter; + bool mmap_locked; + loff_t last_pos; +#ifdef CONFIG_PER_VMA_LOCK + unsigned int mm_wr_seq; + struct vm_area_struct vma_copy; +#endif #ifdef CONFIG_NUMA struct mempolicy *task_mempolicy; #endif diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c index b9e4fbbdf6e6..f9d50a61167c 100644 --- a/fs/proc/task_mmu.c +++ b/fs/proc/task_mmu.c @@ -127,13 +127,130 @@ static void release_task_mempolicy(struct proc_maps_= private *priv) } #endif =20 -static struct vm_area_struct *proc_get_vma(struct proc_maps_private *priv, - loff_t *ppos) +#ifdef CONFIG_PER_VMA_LOCK + +static const struct seq_operations proc_pid_maps_op; + +/* + * Take VMA snapshot and pin vm_file and anon_name as they are used by + * show_map_vma. 
+ */ +static int get_vma_snapshot(struct proc_maps_private *priv, struct vm_area= _struct *vma) +{ + struct vm_area_struct *copy =3D &priv->vma_copy; + int ret =3D -EAGAIN; + + memcpy(copy, vma, sizeof(*vma)); + if (copy->vm_file && !get_file_rcu(&copy->vm_file)) + goto out; + + if (!anon_vma_name_get_if_valid(copy)) + goto put_file; + + if (!mmap_lock_speculate_retry(priv->mm, priv->mm_wr_seq)) + return 0; + + /* Address space got modified, vma might be stale. Re-lock and retry. */ + rcu_read_unlock(); + ret =3D mmap_read_lock_killable(priv->mm); + if (!ret) { + /* mmap_lock_speculate_try_begin() succeeds when holding mmap_read_lock = */ + mmap_lock_speculate_try_begin(priv->mm, &priv->mm_wr_seq); + mmap_read_unlock(priv->mm); + ret =3D -EAGAIN; + } + + rcu_read_lock(); + + anon_vma_name_put_if_valid(copy); +put_file: + if (copy->vm_file) + fput(copy->vm_file); +out: + return ret; +} + +static void put_vma_snapshot(struct proc_maps_private *priv) +{ + struct vm_area_struct *vma =3D &priv->vma_copy; + + anon_vma_name_put_if_valid(vma); + if (vma->vm_file) + fput(vma->vm_file); +} + +static inline bool drop_mmap_lock(struct seq_file *m, struct proc_maps_pri= vate *priv) +{ + /* + * smaps and numa_maps perform page table walk, therefore require + * mmap_lock but maps can be read under RCU. 
+ */ + if (m->op !=3D &proc_pid_maps_op) + return false; + + /* mmap_lock_speculate_try_begin() succeeds when holding mmap_read_lock */ + mmap_lock_speculate_try_begin(priv->mm, &priv->mm_wr_seq); + mmap_read_unlock(priv->mm); + rcu_read_lock(); + memset(&priv->vma_copy, 0, sizeof(priv->vma_copy)); + + return true; +} + +static struct vm_area_struct *get_stable_vma(struct vm_area_struct *vma, + struct proc_maps_private *priv, + loff_t last_pos) +{ + int ret; + + put_vma_snapshot(priv); + while ((ret =3D get_vma_snapshot(priv, vma)) =3D=3D -EAGAIN) { + /* lookup the vma at the last position again */ + vma_iter_init(&priv->iter, priv->mm, last_pos); + vma =3D vma_next(&priv->iter); + } + + return ret ? ERR_PTR(ret) : &priv->vma_copy; +} + +#else /* CONFIG_PER_VMA_LOCK */ + +/* Without per-vma locks VMA access is not RCU-safe */ +static inline bool drop_mmap_lock(struct seq_file *m, + struct proc_maps_private *priv) +{ + return false; +} + +static struct vm_area_struct *get_stable_vma(struct vm_area_struct *vma, + struct proc_maps_private *priv, + loff_t last_pos) +{ + return vma; +} + +#endif /* CONFIG_PER_VMA_LOCK */ + +static struct vm_area_struct *proc_get_vma(struct seq_file *m, loff_t *ppo= s) { + struct proc_maps_private *priv =3D m->private; struct vm_area_struct *vma =3D vma_next(&priv->iter); =20 + if (vma && !priv->mmap_locked) + vma =3D get_stable_vma(vma, priv, *ppos); + + if (IS_ERR(vma)) + return vma; + if (vma) { - *ppos =3D vma->vm_start; + /* Store previous position to be able to restart if needed */ + priv->last_pos =3D *ppos; + /* + * Track the end of the reported vma to ensure position changes + * even if previous vma was merged with the next vma and we + * found the extended vma with the same vm_start. 
+ */ + *ppos =3D vma->vm_end; } else { *ppos =3D -2UL; vma =3D get_gate_vma(priv->mm); @@ -148,6 +265,7 @@ static void *m_start(struct seq_file *m, loff_t *ppos) unsigned long last_addr =3D *ppos; struct mm_struct *mm; =20 + priv->mmap_locked =3D true; /* See m_next(). Zero at the start or after lseek. */ if (last_addr =3D=3D -1UL) return NULL; @@ -170,12 +288,18 @@ static void *m_start(struct seq_file *m, loff_t *ppos) return ERR_PTR(-EINTR); } =20 + /* Drop mmap_lock if possible */ + if (drop_mmap_lock(m, priv)) + priv->mmap_locked =3D false; + + if (last_addr > 0) + *ppos =3D last_addr =3D priv->last_pos; vma_iter_init(&priv->iter, mm, last_addr); hold_task_mempolicy(priv); if (last_addr =3D=3D -2UL) return get_gate_vma(mm); =20 - return proc_get_vma(priv, ppos); + return proc_get_vma(m, ppos); } =20 static void *m_next(struct seq_file *m, void *v, loff_t *ppos) @@ -184,7 +308,7 @@ static void *m_next(struct seq_file *m, void *v, loff_t= *ppos) *ppos =3D -1UL; return NULL; } - return proc_get_vma(m->private, ppos); + return proc_get_vma(m, ppos); } =20 static void m_stop(struct seq_file *m, void *v) @@ -196,7 +320,10 @@ static void m_stop(struct seq_file *m, void *v) return; =20 release_task_mempolicy(priv); - mmap_read_unlock(mm); + if (priv->mmap_locked) + mmap_read_unlock(mm); + else + rcu_read_unlock(); mmput(mm); put_task_struct(priv->task); priv->task =3D NULL; @@ -243,14 +370,20 @@ static int do_maps_open(struct inode *inode, struct f= ile *file, static void get_vma_name(struct vm_area_struct *vma, const struct path **path, const char **name, - const char **name_fmt) + const char **name_fmt, bool mmap_locked) { - struct anon_vma_name *anon_name =3D vma->vm_mm ? anon_vma_name(vma) : NUL= L; + struct anon_vma_name *anon_name; =20 *name =3D NULL; *path =3D NULL; *name_fmt =3D NULL; =20 + if (vma->vm_mm) + anon_name =3D mmap_locked ? 
anon_vma_name(vma) : + anon_vma_name_get_rcu(vma); + else + anon_name =3D NULL; + /* * Print the dentry name for named mappings, and a * special [heap] marker for the heap: @@ -266,39 +399,41 @@ static void get_vma_name(struct vm_area_struct *vma, } else { *path =3D file_user_path(vma->vm_file); } - return; + goto out; } =20 if (vma->vm_ops && vma->vm_ops->name) { *name =3D vma->vm_ops->name(vma); if (*name) - return; + goto out; } =20 *name =3D arch_vma_name(vma); if (*name) - return; + goto out; =20 if (!vma->vm_mm) { *name =3D "[vdso]"; - return; + goto out; } =20 if (vma_is_initial_heap(vma)) { *name =3D "[heap]"; - return; + goto out; } =20 if (vma_is_initial_stack(vma)) { *name =3D "[stack]"; - return; + goto out; } =20 if (anon_name) { *name_fmt =3D "[anon:%s]"; *name =3D anon_name->name; - return; } +out: + if (anon_name && !mmap_locked) + anon_vma_name_put(anon_name); } =20 static void show_vma_header_prefix(struct seq_file *m, @@ -324,6 +459,7 @@ static void show_vma_header_prefix(struct seq_file *m, static void show_map_vma(struct seq_file *m, struct vm_area_struct *vma) { + struct proc_maps_private *priv =3D m->private; const struct path *path; const char *name_fmt, *name; vm_flags_t flags =3D vma->vm_flags; @@ -344,7 +480,7 @@ show_map_vma(struct seq_file *m, struct vm_area_struct = *vma) end =3D vma->vm_end; show_vma_header_prefix(m, start, end, flags, pgoff, dev, ino); =20 - get_vma_name(vma, &path, &name, &name_fmt); + get_vma_name(vma, &path, &name, &name_fmt, priv->mmap_locked); if (path) { seq_pad(m, ' '); seq_path(m, path, "\n"); @@ -549,7 +685,7 @@ static int do_procmap_query(struct proc_maps_private *p= riv, void __user *uarg) const char *name_fmt; size_t name_sz =3D 0; =20 - get_vma_name(vma, &path, &name, &name_fmt); + get_vma_name(vma, &path, &name, &name_fmt, true); =20 if (path || name_fmt || name) { name_buf =3D kmalloc(name_buf_sz, GFP_KERNEL); diff --git a/include/linux/mm_inline.h b/include/linux/mm_inline.h index 
9ac2d92d7ede..436512f1e759 100644 --- a/include/linux/mm_inline.h +++ b/include/linux/mm_inline.h @@ -434,6 +434,21 @@ static inline bool anon_vma_name_eq(struct anon_vma_na= me *anon_name1, =20 struct anon_vma_name *anon_vma_name_get_rcu(struct vm_area_struct *vma); =20 +/* + * Takes a reference if anon_vma is valid and stable (has references). + * Fails only if anon_vma is valid but we failed to get a reference. + */ +static inline bool anon_vma_name_get_if_valid(struct vm_area_struct *vma) +{ + return !vma->anon_name || anon_vma_name_get_rcu(vma); +} + +static inline void anon_vma_name_put_if_valid(struct vm_area_struct *vma) +{ + if (vma->anon_name) + anon_vma_name_put(vma->anon_name); +} + #else /* CONFIG_ANON_VMA_NAME */ static inline void anon_vma_name_get(struct anon_vma_name *anon_name) {} static inline void anon_vma_name_put(struct anon_vma_name *anon_name) {} @@ -453,6 +468,9 @@ struct anon_vma_name *anon_vma_name_get_rcu(struct vm_a= rea_struct *vma) return NULL; } =20 +static inline bool anon_vma_name_get_if_valid(struct vm_area_struct *vma) = { return true; } +static inline void anon_vma_name_put_if_valid(struct vm_area_struct *vma) = {} + #endif /* CONFIG_ANON_VMA_NAME */ =20 static inline void init_tlb_flush_pending(struct mm_struct *mm) --=20 2.49.0.805.g082f7c87e0-goog From nobody Fri Dec 19 16:08:24 2025 Received: from mail-pl1-f202.google.com (mail-pl1-f202.google.com [209.85.214.202]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 6A97822540F for ; Fri, 18 Apr 2025 17:50:21 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.214.202 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744998624; cv=none; b=ozpAPLcdhasdEZNgFZzPfNcZ6u2JtcjNOHjbZbDyZ6Cp6KIUrWw/sXh8WZ7mzz3KwvCx15Qu1mp9+WgWvaWtqBkNc1uHThyNMd19+Q0IljXOtnwzFwvcbLgGs8o2KLOPtOiegbUz2bEfxGSH6eC7wN6/FOOXc460YnQ52qMiC3M= 
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744998624; c=relaxed/simple; bh=sKg7f329Zs3O82Sqp+e8qxPxOZ/t4mW6H6H8eA3C9lA=; h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type; b=DHUg2oNg7LGIbF3FwKrpxD1RfW72+nSdEPR8FzexZC+/4gdOyFvey5Ozqv8j0Jw2GeuNb8SqIxwbdFagYh/JMRrad2uqnklYsuwa6fJmxNsfTILpJ43rNNFFo3KCNqJwuZabz64Kq6k7UIB8bTd7LcZqNRdit/G7fosEnxdwNUs= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--surenb.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=AZqb7kPy; arc=none smtp.client-ip=209.85.214.202 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--surenb.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="AZqb7kPy" Received: by mail-pl1-f202.google.com with SMTP id d9443c01a7336-2242ade807fso33216835ad.2 for ; Fri, 18 Apr 2025 10:50:21 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1744998621; x=1745603421; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=GDNsq3YX7eW5e3apKRRmkWWvDDDuWE0xbjvQjwBgCsE=; b=AZqb7kPy/oRHhVyW00nvGWJKO6OQN2L0ukhaFp7qXyDPXhS+GR+w5FMSEtkQa/zcZZ dw+GmEy1385+mbG1El3VlhBm2rn6nwWEwg8YHBz3fKKQPMecS+cJiz7tHTUDW0Qsqj9t mE4If8izxzVDtWRnhsr+DEHlGTFNgFTEwADD41JxWYc+VnaDBqUST4v0Yw1hdqY0f2gq ny+ySQxHj3mGDwHrK2kzMl83Vm/5IzHPJE3USmbUKHK2TjEwu1VvTdMl3yDgwrVlNqjC Px0Ezpd8E/c730HRLMwGA+zNHLeiZVxDYWjAAzwkMnWb5aVJ57Kr2hbR0jlJ1M/R+U2G qoSw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1744998621; x=1745603421; 
h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=GDNsq3YX7eW5e3apKRRmkWWvDDDuWE0xbjvQjwBgCsE=; b=myHUvnQPK3ffydZs6R4zX2VVoj0CjeboiaZ/bmcNbCu0XFVK2cLJsImxdRA6iUhbp0 mjBPe8VLJh7Ct4OftSlz71a/P8eKjZrXds5OBMOePw+9jsC6NjeKNsEXb6CIio5quNnE djVFM2+J/QTLFF+k37/sAAp5d+Ns3tiCvIJb0KPzWuRGuKPhOic3fX0x7av7lPKNlkTY 9AgRlxY9G7wghqWbmC7eNsgSIoClZhAXXzCm3kTvG3HKdZuOhrZ9ZLYGNnM70gfAkB1x j893ZP67t578sjEiu+tY3hOcA78nI/HT1ha8LgcGehOGr4LMDK1lyMlqYAF/v52pOYtu oEWw== X-Forwarded-Encrypted: i=1; AJvYcCX6A9INieU+r4lW9xMUORKbIN0HtQ2c4Y84O7++l8zkGfIsGTZHarLw0V+wAz+UgPPuEVmMfnHac31obZo=@vger.kernel.org X-Gm-Message-State: AOJu0YzPH22/QSDTBIRt6vBDcKQpTL1jzDNMkvGjQHyhSPuWIS1j10Yq 2UkJh5UiRbTs5TJhEMS4bvLZTGGlVcRRdQZ2UrxNerVBkltv5il99xau5wyyxQTKFJGxjj1tuwv 5BQ== X-Google-Smtp-Source: AGHT+IH4SzFcgriEZeEyINeEOYHsKsBduZvh1a8UF65TGxe4ng4DHNhTgv3meBOCFXGKpW1iswhVW2S93bM= X-Received: from plrd9.prod.google.com ([2002:a17:902:aa89:b0:223:225b:3d83]) (user=surenb job=prod-delivery.src-stubby-dispatcher) by 2002:a17:903:19cc:b0:224:24d5:f20a with SMTP id d9443c01a7336-22c53620caamr57870345ad.48.1744998620802; Fri, 18 Apr 2025 10:50:20 -0700 (PDT) Date: Fri, 18 Apr 2025 10:49:59 -0700 In-Reply-To: <20250418174959.1431962-1-surenb@google.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20250418174959.1431962-1-surenb@google.com> X-Mailer: git-send-email 2.49.0.805.g082f7c87e0-goog Message-ID: <20250418174959.1431962-9-surenb@google.com> Subject: [PATCH v3 8/8] mm/maps: execute PROCMAP_QUERY ioctl under RCU From: Suren Baghdasaryan To: akpm@linux-foundation.org Cc: Liam.Howlett@oracle.com, lorenzo.stoakes@oracle.com, david@redhat.com, vbabka@suse.cz, peterx@redhat.com, jannh@google.com, hannes@cmpxchg.org, mhocko@kernel.org, paulmck@kernel.org, shuah@kernel.org, adobriyan@gmail.com, brauner@kernel.org, josef@toxicpanda.com, 
yebin10@huawei.com, linux@weissschuh.net, willy@infradead.org, osalvador@suse.de, andrii@kernel.org, ryan.roberts@arm.com, christophe.leroy@csgroup.eu, linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-kselftest@vger.kernel.org, surenb@google.com Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Utilize speculative vma lookup to find and snapshot a vma without taking mmap_lock during PROCMAP_QUERY ioctl execution. Concurrent address space modifications are detected and the lookup is retried. While we take the mmap_lock for reading during such contention, we do that momentarily only to record new mm_wr_seq counter. This change is designed to reduce mmap_lock contention and prevent PROCMAP_QUERY ioctl calls (often a low priority task, such as monitoring/data collection services) from blocking address space updates. Signed-off-by: Suren Baghdasaryan --- fs/proc/task_mmu.c | 63 ++++++++++++++++++++++++++++++++++++++++------ 1 file changed, 55 insertions(+), 8 deletions(-) diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c index f9d50a61167c..28b975ddff26 100644 --- a/fs/proc/task_mmu.c +++ b/fs/proc/task_mmu.c @@ -525,9 +525,53 @@ static int pid_maps_open(struct inode *inode, struct f= ile *file) PROCMAP_QUERY_VMA_FLAGS \ ) =20 -static int query_vma_setup(struct mm_struct *mm) +#ifdef CONFIG_PER_VMA_LOCK + +static int query_vma_setup(struct proc_maps_private *priv) { - return mmap_read_lock_killable(mm); + if (!mmap_lock_speculate_try_begin(priv->mm, &priv->mm_wr_seq)) { + int ret =3D mmap_read_lock_killable(priv->mm); + + if (ret) + return ret; + + /* mmap_lock_speculate_try_begin() succeeds when holding mmap_read_lock = */ + mmap_lock_speculate_try_begin(priv->mm, &priv->mm_wr_seq); + mmap_read_unlock(priv->mm); + } + + memset(&priv->vma_copy, 0, sizeof(priv->vma_copy)); + rcu_read_lock(); + + return 0; +} + +static void query_vma_teardown(struct mm_struct *mm, struct vm_area_struct= *vma) +{ 
+ rcu_read_unlock(); +} + +static struct vm_area_struct *query_vma_find_by_addr(struct proc_maps_priv= ate *priv, + unsigned long addr) +{ + struct vm_area_struct *vma; + + vma_iter_init(&priv->iter, priv->mm, addr); + vma =3D vma_next(&priv->iter); + if (!vma) + return NULL; + + vma =3D get_stable_vma(vma, priv, addr); + + /* The only possible error is EINTR, just pretend we found nothing */ + return IS_ERR(vma) ? NULL : vma; +} + +#else /* CONFIG_PER_VMA_LOCK */ + +static int query_vma_setup(struct proc_maps_private *priv) +{ + return mmap_read_lock_killable(priv->mm); } =20 static void query_vma_teardown(struct mm_struct *mm, struct vm_area_struct= *vma) @@ -535,18 +579,21 @@ static void query_vma_teardown(struct mm_struct *mm, = struct vm_area_struct *vma) mmap_read_unlock(mm); } =20 -static struct vm_area_struct *query_vma_find_by_addr(struct mm_struct *mm,= unsigned long addr) +static struct vm_area_struct *query_vma_find_by_addr(struct proc_maps_priv= ate *priv, + unsigned long addr) { - return find_vma(mm, addr); + return find_vma(priv->mm, addr); } =20 -static struct vm_area_struct *query_matching_vma(struct mm_struct *mm, +#endif /* CONFIG_PER_VMA_LOCK */ + +static struct vm_area_struct *query_matching_vma(struct proc_maps_private = *priv, unsigned long addr, u32 flags) { struct vm_area_struct *vma; =20 next_vma: - vma =3D query_vma_find_by_addr(mm, addr); + vma =3D query_vma_find_by_addr(priv, addr); if (!vma) goto no_vma; =20 @@ -622,13 +669,13 @@ static int do_procmap_query(struct proc_maps_private = *priv, void __user *uarg) if (!mm || !mmget_not_zero(mm)) return -ESRCH; =20 - err =3D query_vma_setup(mm); + err =3D query_vma_setup(priv); if (err) { mmput(mm); return err; } =20 - vma =3D query_matching_vma(mm, karg.query_addr, karg.query_flags); + vma =3D query_matching_vma(priv, karg.query_addr, karg.query_flags); if (IS_ERR(vma)) { err =3D PTR_ERR(vma); vma =3D NULL; --=20 2.49.0.805.g082f7c87e0-goog