From nobody Sun Oct 5 09:07:16 2025
Date: Wed, 6 Aug 2025 08:59:02 -0700
In-Reply-To: <20250806155905.824388-1-surenb@google.com>
X-Mailing-List: linux-kernel@vger.kernel.org
Mime-Version: 1.0
References: <20250806155905.824388-1-surenb@google.com>
X-Mailer: git-send-email 2.50.1.565.gc32cd1483b-goog
Message-ID: <20250806155905.824388-2-surenb@google.com>
Subject: [PATCH v3 1/3] selftests/proc: test PROCMAP_QUERY ioctl while vma is concurrently modified
From: Suren Baghdasaryan
To: akpm@linux-foundation.org
Cc: Liam.Howlett@oracle.com, lorenzo.stoakes@oracle.com, david@redhat.com,
 vbabka@suse.cz, peterx@redhat.com, jannh@google.com, hannes@cmpxchg.org,
 mhocko@kernel.org, paulmck@kernel.org, shuah@kernel.org,
 adobriyan@gmail.com, brauner@kernel.org, josef@toxicpanda.com,
 yebin10@huawei.com, linux@weissschuh.net, willy@infradead.org,
 osalvador@suse.de, andrii@kernel.org, ryan.roberts@arm.com,
 christophe.leroy@csgroup.eu, tjmercier@google.com, kaleshsingh@google.com,
 aha310510@gmail.com, linux-kernel@vger.kernel.org,
 linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
 linux-kselftest@vger.kernel.org, surenb@google.com, SeongJae Park
Content-Type: text/plain; charset="utf-8"

Extend /proc/pid/maps tearing tests to verify PROCMAP_QUERY ioctl operation
correctness while the vma is being concurrently modified.

Signed-off-by: Suren Baghdasaryan
Tested-by: SeongJae Park
Acked-by: SeongJae Park
---
 tools/testing/selftests/proc/proc-maps-race.c | 65 +++++++++++++++++++
 1 file changed, 65 insertions(+)

diff --git a/tools/testing/selftests/proc/proc-maps-race.c b/tools/testing/selftests/proc/proc-maps-race.c
index 94bba4553130..a546475db550 100644
--- a/tools/testing/selftests/proc/proc-maps-race.c
+++ b/tools/testing/selftests/proc/proc-maps-race.c
@@ -32,6 +32,8 @@
 #include
 #include
 #include
+#include
+#include
 #include
 #include
 #include
@@ -317,6 +319,25 @@ static bool capture_mod_pattern(FIXTURE_DATA(proc_maps_race) *self,
 	       strcmp(restored_first_line->text, self->first_line.text) == 0;
 }
 
+static bool query_addr_at(int maps_fd, void *addr,
+			  unsigned long *vma_start, unsigned long *vma_end)
+{
+	struct procmap_query q;
+
+	memset(&q, 0, sizeof(q));
+	q.size = sizeof(q);
+	/* Find the VMA at the split address */
+	q.query_addr = (unsigned long long)addr;
+	q.query_flags = 0;
+	if (ioctl(maps_fd, PROCMAP_QUERY, &q))
+		return false;
+
+	*vma_start = q.vma_start;
+	*vma_end = q.vma_end;
+
+	return true;
+}
+
 static inline bool split_vma(FIXTURE_DATA(proc_maps_race) *self)
 {
 	return mmap(self->mod_info->addr, self->page_size, self->mod_info->prot | PROT_EXEC,
@@ -559,6 +580,8 @@ TEST_F(proc_maps_race, test_maps_tearing_from_split)
 	do {
 		bool last_line_changed;
 		bool first_line_changed;
+		unsigned long vma_start;
+		unsigned long vma_end;
 
 		ASSERT_TRUE(read_boundary_lines(self, &new_last_line, &new_first_line));
 
@@ -595,6 +618,19 @@ TEST_F(proc_maps_race, test_maps_tearing_from_split)
 		first_line_changed = strcmp(new_first_line.text, self->first_line.text) != 0;
 		ASSERT_EQ(last_line_changed, first_line_changed);
 
+		/* Check if PROCMAP_QUERY ioctl() finds the right VMA */
+		ASSERT_TRUE(query_addr_at(self->maps_fd, mod_info->addr + self->page_size,
+					  &vma_start, &vma_end));
+		/*
+		 * The vma at the split address can be either the same as
+		 * original one (if read before the split) or the same as the
+		 * first line in the second page (if read after the split).
+		 */
+		ASSERT_TRUE((vma_start == self->last_line.start_addr &&
+			     vma_end == self->last_line.end_addr) ||
+			    (vma_start == split_first_line.start_addr &&
+			     vma_end == split_first_line.end_addr));
+
 		clock_gettime(CLOCK_MONOTONIC_COARSE, &end_ts);
 		end_test_iteration(&end_ts, self->verbose);
 	} while (end_ts.tv_sec - start_ts.tv_sec < self->duration_sec);
@@ -636,6 +672,9 @@ TEST_F(proc_maps_race, test_maps_tearing_from_resize)
 	clock_gettime(CLOCK_MONOTONIC_COARSE, &start_ts);
 	start_test_loop(&start_ts, self->verbose);
 	do {
+		unsigned long vma_start;
+		unsigned long vma_end;
+
 		ASSERT_TRUE(read_boundary_lines(self, &new_last_line, &new_first_line));
 
 		/* Check if we read vmas after shrinking it */
@@ -662,6 +701,16 @@ TEST_F(proc_maps_race, test_maps_tearing_from_resize)
 					"Expand result invalid", self));
 		}
 
+		/* Check if PROCMAP_QUERY ioctl() finds the right VMA */
+		ASSERT_TRUE(query_addr_at(self->maps_fd, mod_info->addr, &vma_start, &vma_end));
+		/*
+		 * The vma should stay at the same address and have either the
+		 * original size of 3 pages or 1 page if read after shrinking.
+		 */
+		ASSERT_TRUE(vma_start == self->last_line.start_addr &&
+			    (vma_end - vma_start == self->page_size * 3 ||
+			     vma_end - vma_start == self->page_size));
+
 		clock_gettime(CLOCK_MONOTONIC_COARSE, &end_ts);
 		end_test_iteration(&end_ts, self->verbose);
 	} while (end_ts.tv_sec - start_ts.tv_sec < self->duration_sec);
@@ -703,6 +752,9 @@ TEST_F(proc_maps_race, test_maps_tearing_from_remap)
 	clock_gettime(CLOCK_MONOTONIC_COARSE, &start_ts);
 	start_test_loop(&start_ts, self->verbose);
 	do {
+		unsigned long vma_start;
+		unsigned long vma_end;
+
 		ASSERT_TRUE(read_boundary_lines(self, &new_last_line, &new_first_line));
 
 		/* Check if we read vmas after remapping it */
@@ -729,6 +781,19 @@ TEST_F(proc_maps_race, test_maps_tearing_from_remap)
 					"Remap restore result invalid", self));
 		}
 
+		/* Check if PROCMAP_QUERY ioctl() finds the right VMA */
+		ASSERT_TRUE(query_addr_at(self->maps_fd, mod_info->addr + self->page_size,
+					  &vma_start, &vma_end));
+		/*
+		 * The vma should either stay at the same address and have the
+		 * original size of 3 pages or we should find the remapped vma
+		 * at the remap destination address with size of 1 page.
+		 */
+		ASSERT_TRUE((vma_start == self->last_line.start_addr &&
+			     vma_end - vma_start == self->page_size * 3) ||
+			    (vma_start == self->last_line.start_addr + self->page_size &&
+			     vma_end - vma_start == self->page_size));
+
 		clock_gettime(CLOCK_MONOTONIC_COARSE, &end_ts);
 		end_test_iteration(&end_ts, self->verbose);
 	} while (end_ts.tv_sec - start_ts.tv_sec < self->duration_sec);
-- 
2.50.1.565.gc32cd1483b-goog

From nobody Sun Oct 5 09:07:16 2025
Date: Wed, 6 Aug 2025 08:59:03 -0700
In-Reply-To: <20250806155905.824388-1-surenb@google.com>
X-Mailing-List: linux-kernel@vger.kernel.org
Mime-Version: 1.0
References: <20250806155905.824388-1-surenb@google.com>
X-Mailer: git-send-email 2.50.1.565.gc32cd1483b-goog
Message-ID: <20250806155905.824388-3-surenb@google.com>
Subject: [PATCH v3 2/3] fs/proc/task_mmu: factor out proc_maps_private fields used by PROCMAP_QUERY
From: Suren Baghdasaryan
To: akpm@linux-foundation.org
Cc: Liam.Howlett@oracle.com, lorenzo.stoakes@oracle.com, david@redhat.com,
 vbabka@suse.cz, peterx@redhat.com, jannh@google.com, hannes@cmpxchg.org,
 mhocko@kernel.org, paulmck@kernel.org, shuah@kernel.org,
 adobriyan@gmail.com, brauner@kernel.org, josef@toxicpanda.com,
 yebin10@huawei.com, linux@weissschuh.net, willy@infradead.org,
 osalvador@suse.de, andrii@kernel.org, ryan.roberts@arm.com,
 christophe.leroy@csgroup.eu, tjmercier@google.com, kaleshsingh@google.com,
 aha310510@gmail.com, linux-kernel@vger.kernel.org,
 linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
 linux-kselftest@vger.kernel.org, surenb@google.com
Content-Type: text/plain; charset="utf-8"

Refactor struct proc_maps_private so that the fields used by PROCMAP_QUERY
ioctl are moved into a separate structure. In the next patch this allows
ioctl to reuse some of the functions used for reading /proc/pid/maps
without using file->private_data. This prevents concurrent modification
of file->private_data members by ioctl and /proc/pid/maps readers.

The change is pure code refactoring and has no functional changes.

Signed-off-by: Suren Baghdasaryan
Reviewed-by: Vlastimil Babka
Acked-by: SeongJae Park
---
 fs/proc/internal.h   | 15 ++++++----
 fs/proc/task_mmu.c   | 70 ++++++++++++++++++++++----------------------
 fs/proc/task_nommu.c | 14 ++++-----
 3 files changed, 52 insertions(+), 47 deletions(-)

diff --git a/fs/proc/internal.h b/fs/proc/internal.h
index e737401d7383..d1598576506c 100644
--- a/fs/proc/internal.h
+++ b/fs/proc/internal.h
@@ -378,16 +378,21 @@ extern void proc_self_init(void);
  * task_[no]mmu.c
  */
 struct mem_size_stats;
-struct proc_maps_private {
-	struct inode *inode;
-	struct task_struct *task;
+
+struct proc_maps_locking_ctx {
 	struct mm_struct *mm;
-	struct vma_iterator iter;
-	loff_t last_pos;
 #ifdef CONFIG_PER_VMA_LOCK
 	bool mmap_locked;
 	struct vm_area_struct *locked_vma;
 #endif
+};
+
+struct proc_maps_private {
+	struct inode *inode;
+	struct task_struct *task;
+	struct vma_iterator iter;
+	loff_t last_pos;
+	struct proc_maps_locking_ctx lock_ctx;
 #ifdef CONFIG_NUMA
 	struct mempolicy *task_mempolicy;
 #endif
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index ee1e4ccd33bd..45134335e086 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -132,11 +132,11 @@ static void release_task_mempolicy(struct proc_maps_private *priv)
 
 #ifdef CONFIG_PER_VMA_LOCK
 
-static void unlock_vma(struct proc_maps_private *priv)
+static void unlock_vma(struct proc_maps_locking_ctx *lock_ctx)
 {
-	if (priv->locked_vma) {
-		vma_end_read(priv->locked_vma);
-		priv->locked_vma = NULL;
+	if (lock_ctx->locked_vma) {
+		vma_end_read(lock_ctx->locked_vma);
+		lock_ctx->locked_vma = NULL;
 	}
 }
 
@@ -151,14 +151,14 @@ static inline bool lock_vma_range(struct seq_file *m,
 	 * walking the vma tree under rcu read protection.
 	 */
 	if (m->op != &proc_pid_maps_op) {
-		if (mmap_read_lock_killable(priv->mm))
+		if (mmap_read_lock_killable(priv->lock_ctx.mm))
 			return false;
 
-		priv->mmap_locked = true;
+		priv->lock_ctx.mmap_locked = true;
 	} else {
 		rcu_read_lock();
-		priv->locked_vma = NULL;
-		priv->mmap_locked = false;
+		priv->lock_ctx.locked_vma = NULL;
+		priv->lock_ctx.mmap_locked = false;
 	}
 
 	return true;
@@ -166,10 +166,10 @@ static inline bool lock_vma_range(struct seq_file *m,
 
 static inline void unlock_vma_range(struct proc_maps_private *priv)
 {
-	if (priv->mmap_locked) {
-		mmap_read_unlock(priv->mm);
+	if (priv->lock_ctx.mmap_locked) {
+		mmap_read_unlock(priv->lock_ctx.mm);
 	} else {
-		unlock_vma(priv);
+		unlock_vma(&priv->lock_ctx);
 		rcu_read_unlock();
 	}
 }
@@ -179,13 +179,13 @@ static struct vm_area_struct *get_next_vma(struct proc_maps_private *priv,
 {
 	struct vm_area_struct *vma;
 
-	if (priv->mmap_locked)
+	if (priv->lock_ctx.mmap_locked)
 		return vma_next(&priv->iter);
 
-	unlock_vma(priv);
-	vma = lock_next_vma(priv->mm, &priv->iter, last_pos);
+	unlock_vma(&priv->lock_ctx);
+	vma = lock_next_vma(priv->lock_ctx.mm, &priv->iter, last_pos);
 	if (!IS_ERR_OR_NULL(vma))
-		priv->locked_vma = vma;
+		priv->lock_ctx.locked_vma = vma;
 
 	return vma;
 }
@@ -193,14 +193,14 @@ static struct vm_area_struct *get_next_vma(struct proc_maps_private *priv,
 static inline bool fallback_to_mmap_lock(struct proc_maps_private *priv,
 					 loff_t pos)
 {
-	if (priv->mmap_locked)
+	if (priv->lock_ctx.mmap_locked)
 		return false;
 
 	rcu_read_unlock();
-	mmap_read_lock(priv->mm);
+	mmap_read_lock(priv->lock_ctx.mm);
 	/* Reinitialize the iterator after taking mmap_lock */
 	vma_iter_set(&priv->iter, pos);
-	priv->mmap_locked = true;
+	priv->lock_ctx.mmap_locked = true;
 
 	return true;
 }
@@ -210,12 +210,12 @@ static inline bool fallback_to_mmap_lock(struct proc_maps_private *priv,
 static inline bool lock_vma_range(struct seq_file *m,
 				  struct proc_maps_private *priv)
 {
-	return mmap_read_lock_killable(priv->mm) == 0;
+	return mmap_read_lock_killable(priv->lock_ctx.mm) == 0;
 }
 
 static inline void unlock_vma_range(struct proc_maps_private *priv)
 {
-	mmap_read_unlock(priv->mm);
+	mmap_read_unlock(priv->lock_ctx.mm);
 }
 
 static struct vm_area_struct *get_next_vma(struct proc_maps_private *priv,
@@ -258,7 +258,7 @@ static struct vm_area_struct *proc_get_vma(struct seq_file *m, loff_t *ppos)
 		*ppos = vma->vm_end;
 	} else {
 		*ppos = SENTINEL_VMA_GATE;
-		vma = get_gate_vma(priv->mm);
+		vma = get_gate_vma(priv->lock_ctx.mm);
 	}
 
 	return vma;
@@ -278,7 +278,7 @@ static void *m_start(struct seq_file *m, loff_t *ppos)
 	if (!priv->task)
 		return ERR_PTR(-ESRCH);
 
-	mm = priv->mm;
+	mm = priv->lock_ctx.mm;
 	if (!mm || !mmget_not_zero(mm)) {
 		put_task_struct(priv->task);
 		priv->task = NULL;
@@ -318,7 +318,7 @@ static void *m_next(struct seq_file *m, void *v, loff_t *ppos)
 static void m_stop(struct seq_file *m, void *v)
 {
 	struct proc_maps_private *priv = m->private;
-	struct mm_struct *mm = priv->mm;
+	struct mm_struct *mm = priv->lock_ctx.mm;
 
 	if (!priv->task)
 		return;
@@ -339,9 +339,9 @@ static int proc_maps_open(struct inode *inode, struct file *file,
 		return -ENOMEM;
 
 	priv->inode = inode;
-	priv->mm = proc_mem_open(inode, PTRACE_MODE_READ);
-	if (IS_ERR_OR_NULL(priv->mm)) {
-		int err = priv->mm ? PTR_ERR(priv->mm) : -ESRCH;
+	priv->lock_ctx.mm = proc_mem_open(inode, PTRACE_MODE_READ);
+	if (IS_ERR_OR_NULL(priv->lock_ctx.mm)) {
+		int err = priv->lock_ctx.mm ? PTR_ERR(priv->lock_ctx.mm) : -ESRCH;
 
 		seq_release_private(inode, file);
 		return err;
@@ -355,8 +355,8 @@ static int proc_map_release(struct inode *inode, struct file *file)
 	struct seq_file *seq = file->private_data;
 	struct proc_maps_private *priv = seq->private;
 
-	if (priv->mm)
-		mmdrop(priv->mm);
+	if (priv->lock_ctx.mm)
+		mmdrop(priv->lock_ctx.mm);
 
 	return seq_release_private(inode, file);
 }
@@ -610,7 +610,7 @@ static int do_procmap_query(struct proc_maps_private *priv, void __user *uarg)
 	if (!!karg.build_id_size != !!karg.build_id_addr)
 		return -EINVAL;
 
-	mm = priv->mm;
+	mm = priv->lock_ctx.mm;
 	if (!mm || !mmget_not_zero(mm))
 		return -ESRCH;
 
@@ -1311,7 +1311,7 @@ static int show_smaps_rollup(struct seq_file *m, void *v)
 {
 	struct proc_maps_private *priv = m->private;
 	struct mem_size_stats mss = {};
-	struct mm_struct *mm = priv->mm;
+	struct mm_struct *mm = priv->lock_ctx.mm;
 	struct vm_area_struct *vma;
 	unsigned long vma_start = 0, last_vma_end = 0;
 	int ret = 0;
@@ -1456,9 +1456,9 @@ static int smaps_rollup_open(struct inode *inode, struct file *file)
 		goto out_free;
 
 	priv->inode = inode;
-	priv->mm = proc_mem_open(inode, PTRACE_MODE_READ);
-	if (IS_ERR_OR_NULL(priv->mm)) {
-		ret = priv->mm ? PTR_ERR(priv->mm) : -ESRCH;
+	priv->lock_ctx.mm = proc_mem_open(inode, PTRACE_MODE_READ);
+	if (IS_ERR_OR_NULL(priv->lock_ctx.mm)) {
+		ret = priv->lock_ctx.mm ? PTR_ERR(priv->lock_ctx.mm) : -ESRCH;
 
 		single_release(inode, file);
 		goto out_free;
@@ -1476,8 +1476,8 @@ static int smaps_rollup_release(struct inode *inode, struct file *file)
 	struct seq_file *seq = file->private_data;
 	struct proc_maps_private *priv = seq->private;
 
-	if (priv->mm)
-		mmdrop(priv->mm);
+	if (priv->lock_ctx.mm)
+		mmdrop(priv->lock_ctx.mm);
 
 	kfree(priv);
 	return single_release(inode, file);
diff --git a/fs/proc/task_nommu.c b/fs/proc/task_nommu.c
index 59bfd61d653a..d362919f4f68 100644
--- a/fs/proc/task_nommu.c
+++ b/fs/proc/task_nommu.c
@@ -204,7 +204,7 @@ static void *m_start(struct seq_file *m, loff_t *ppos)
 	if (!priv->task)
 		return ERR_PTR(-ESRCH);
 
-	mm = priv->mm;
+	mm = priv->lock_ctx.mm;
 	if (!mm || !mmget_not_zero(mm)) {
 		put_task_struct(priv->task);
 		priv->task = NULL;
@@ -226,7 +226,7 @@ static void *m_start(struct seq_file *m, loff_t *ppos)
 static void m_stop(struct seq_file *m, void *v)
 {
 	struct proc_maps_private *priv = m->private;
-	struct mm_struct *mm = priv->mm;
+	struct mm_struct *mm = priv->lock_ctx.mm;
 
 	if (!priv->task)
 		return;
@@ -259,9 +259,9 @@ static int maps_open(struct inode *inode, struct file *file,
 		return -ENOMEM;
 
 	priv->inode = inode;
-	priv->mm = proc_mem_open(inode, PTRACE_MODE_READ);
-	if (IS_ERR_OR_NULL(priv->mm)) {
-		int err = priv->mm ? PTR_ERR(priv->mm) : -ESRCH;
+	priv->lock_ctx.mm = proc_mem_open(inode, PTRACE_MODE_READ);
+	if (IS_ERR_OR_NULL(priv->lock_ctx.mm)) {
+		int err = priv->lock_ctx.mm ? PTR_ERR(priv->lock_ctx.mm) : -ESRCH;
 
 		seq_release_private(inode, file);
 		return err;
@@ -276,8 +276,8 @@ static int map_release(struct inode *inode, struct file *file)
 	struct seq_file *seq = file->private_data;
 	struct proc_maps_private *priv = seq->private;
 
-	if (priv->mm)
-		mmdrop(priv->mm);
+	if (priv->lock_ctx.mm)
+		mmdrop(priv->lock_ctx.mm);
 
 	return seq_release_private(inode, file);
 }
-- 
2.50.1.565.gc32cd1483b-goog

From nobody Sun Oct 5 09:07:16 2025
Date: Wed, 6 Aug 2025 08:59:04 -0700
In-Reply-To: <20250806155905.824388-1-surenb@google.com>
X-Mailing-List: linux-kernel@vger.kernel.org
Mime-Version: 1.0
References: <20250806155905.824388-1-surenb@google.com>
X-Mailer: git-send-email 2.50.1.565.gc32cd1483b-goog
Message-ID: <20250806155905.824388-4-surenb@google.com>
Subject: [PATCH v3 3/3] fs/proc/task_mmu: execute PROCMAP_QUERY ioctl under per-vma locks
From: Suren Baghdasaryan
To: akpm@linux-foundation.org
Cc: Liam.Howlett@oracle.com, lorenzo.stoakes@oracle.com, david@redhat.com,
 vbabka@suse.cz, peterx@redhat.com, jannh@google.com, hannes@cmpxchg.org,
 mhocko@kernel.org, paulmck@kernel.org, shuah@kernel.org,
 adobriyan@gmail.com, brauner@kernel.org, josef@toxicpanda.com,
 yebin10@huawei.com, linux@weissschuh.net, willy@infradead.org,
 osalvador@suse.de, andrii@kernel.org, ryan.roberts@arm.com,
 christophe.leroy@csgroup.eu, tjmercier@google.com, kaleshsingh@google.com,
 aha310510@gmail.com, linux-kernel@vger.kernel.org,
 linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
 linux-kselftest@vger.kernel.org, surenb@google.com
Content-Type: text/plain; charset="utf-8"

Utilize per-vma locks to stabilize vma after lookup without taking
mmap_lock during PROCMAP_QUERY ioctl execution. If vma lock is
contended, we fall back to mmap_lock but take it only momentarily
to lock the vma and release the mmap_lock. In a very unlikely case
of vm_refcnt overflow, this fallback path will fail and ioctl is
done under mmap_lock protection.

This change is designed to reduce mmap_lock contention and prevent
PROCMAP_QUERY ioctl calls from blocking address space updates.

Signed-off-by: Suren Baghdasaryan
Acked-by: SeongJae Park
---
 fs/proc/task_mmu.c | 84 +++++++++++++++++++++++++++++++++++++---------
 1 file changed, 68 insertions(+), 16 deletions(-)

diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 45134335e086..0396315dfaee 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -517,28 +517,81 @@ static int pid_maps_open(struct inode *inode, struct file *file)
 		PROCMAP_QUERY_VMA_FLAGS			\
 )
 
-static int query_vma_setup(struct mm_struct *mm)
+#ifdef CONFIG_PER_VMA_LOCK
+
+static int query_vma_setup(struct proc_maps_locking_ctx *lock_ctx)
 {
-	return mmap_read_lock_killable(mm);
+	lock_ctx->locked_vma = NULL;
+	lock_ctx->mmap_locked = false;
+
+	return 0;
 }
 
-static void query_vma_teardown(struct mm_struct *mm, struct vm_area_struct *vma)
+static void query_vma_teardown(struct proc_maps_locking_ctx *lock_ctx)
 {
-	mmap_read_unlock(mm);
+	if (lock_ctx->mmap_locked)
+		mmap_read_unlock(lock_ctx->mm);
+	else
+		unlock_vma(lock_ctx);
+}
+
+static struct vm_area_struct *query_vma_find_by_addr(struct proc_maps_locking_ctx *lock_ctx,
+						     unsigned long addr)
+{
+	struct vm_area_struct *vma;
+	struct vma_iterator vmi;
+
+	if (lock_ctx->mmap_locked)
+		return find_vma(lock_ctx->mm, addr);
+
+	unlock_vma(lock_ctx);
+	rcu_read_lock();
+	vma_iter_init(&vmi, lock_ctx->mm, addr);
+	vma = lock_next_vma(lock_ctx->mm, &vmi, addr);
+	rcu_read_unlock();
+
+	if (!IS_ERR_OR_NULL(vma)) {
+		lock_ctx->locked_vma = vma;
+	} else if (PTR_ERR(vma) == -EAGAIN) {
+		/* Fallback to mmap_lock on vma->vm_refcnt overflow */
+		mmap_read_lock(lock_ctx->mm);
+		vma = find_vma(lock_ctx->mm, addr);
+		lock_ctx->mmap_locked = true;
+	}
+
+	return vma;
+}
+
+#else /* CONFIG_PER_VMA_LOCK */
+
+static int query_vma_setup(struct proc_maps_locking_ctx *lock_ctx)
+{
+	return mmap_read_lock_killable(lock_ctx->mm);
 }
 
-static struct vm_area_struct *query_vma_find_by_addr(struct mm_struct *mm, unsigned long addr)
+static void query_vma_teardown(struct proc_maps_locking_ctx *lock_ctx)
 {
-	return find_vma(mm, addr);
+	mmap_read_unlock(lock_ctx->mm);
 }
 
-static struct vm_area_struct *query_matching_vma(struct mm_struct *mm,
+static struct vm_area_struct *query_vma_find_by_addr(struct proc_maps_locking_ctx *lock_ctx,
+						     unsigned long addr)
+{
+	return find_vma(lock_ctx->mm, addr);
+}
+
+#endif /* CONFIG_PER_VMA_LOCK */
+
+static struct vm_area_struct *query_matching_vma(struct proc_maps_locking_ctx *lock_ctx,
 						 unsigned long addr, u32 flags)
 {
 	struct vm_area_struct *vma;
 
 next_vma:
-	vma = query_vma_find_by_addr(mm, addr);
+	vma = query_vma_find_by_addr(lock_ctx, addr);
+	if (IS_ERR(vma))
+		return vma;
+
 	if (!vma)
 		goto no_vma;
 
@@ -579,11 +632,11 @@ static struct vm_area_struct *query_matching_vma(struct mm_struct *mm,
 	return ERR_PTR(-ENOENT);
 }
 
-static int do_procmap_query(struct proc_maps_private *priv, void __user *uarg)
+static int do_procmap_query(struct mm_struct *mm, void __user *uarg)
 {
+	struct proc_maps_locking_ctx lock_ctx = { .mm = mm };
 	struct procmap_query karg;
 	struct vm_area_struct *vma;
-	struct mm_struct *mm;
 	const char *name = NULL;
 	char build_id_buf[BUILD_ID_SIZE_MAX], *name_buf = NULL;
 	__u64 usize;
@@ -610,17 +663,16 @@ static int do_procmap_query(struct proc_maps_private *priv, void __user *uarg)
 	if (!!karg.build_id_size != !!karg.build_id_addr)
 		return -EINVAL;
 
-	mm = priv->lock_ctx.mm;
 	if (!mm || !mmget_not_zero(mm))
 		return -ESRCH;
 
-	err = query_vma_setup(mm);
+	err = query_vma_setup(&lock_ctx);
 	if (err) {
 		mmput(mm);
 		return err;
 	}
 
-	vma = query_matching_vma(mm, karg.query_addr, karg.query_flags);
+	vma = query_matching_vma(&lock_ctx, karg.query_addr, karg.query_flags);
 	if (IS_ERR(vma)) {
 		err = PTR_ERR(vma);
 		vma = NULL;
@@ -705,7 +757,7 @@ static int do_procmap_query(struct proc_maps_private *priv, void __user *uarg)
 	}
 
 	/* unlock vma or mmap_lock, and put mm_struct before copying data to user */
-	query_vma_teardown(mm, vma);
+	query_vma_teardown(&lock_ctx);
 	mmput(mm);
 
 	if (karg.vma_name_size && copy_to_user(u64_to_user_ptr(karg.vma_name_addr),
@@ -725,7 +777,7 @@ static int do_procmap_query(struct proc_maps_private *priv, void __user *uarg)
 	return 0;
 
 out:
-	query_vma_teardown(mm, vma);
+	query_vma_teardown(&lock_ctx);
 	mmput(mm);
 	kfree(name_buf);
 	return err;
@@ -738,7 +790,7 @@ static long procfs_procmap_ioctl(struct file *file, unsigned int cmd, unsigned l
 
 	switch (cmd) {
 	case PROCMAP_QUERY:
-		return do_procmap_query(priv->lock_ctx.mm, (void __user *)arg);
+		return do_procmap_query(priv->lock_ctx.mm, (void __user *)arg);
 	default:
 		return -ENOIOCTLCMD;
 	}
-- 
2.50.1.565.gc32cd1483b-goog
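
The locking state kept by PROCMAP_QUERY in patch 3/3 can be followed in a small user-space model. This is only a sketch and not kernel code: `lock_ctx_model`, `lock_attempt` and the `model_*` helpers are hypothetical names made up for illustration, the counters stand in for real locks, and the model assumes that a lookup done under mmap_lock after a refcount overflow finds a VMA. It mirrors the shape of query_vma_setup(), query_vma_find_by_addr() and query_vma_teardown() with CONFIG_PER_VMA_LOCK enabled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical outcomes of the per-VMA lock attempt (not a kernel API). */
enum lock_attempt {
	ATTEMPT_VMA_LOCKED,	 /* per-VMA read lock taken */
	ATTEMPT_REFCNT_OVERFLOW, /* models -EAGAIN on vm_refcnt overflow */
	ATTEMPT_NO_VMA,		 /* nothing mapped at or above the address */
};

/* Simplified stand-in for struct proc_maps_locking_ctx. */
struct lock_ctx_model {
	bool mmap_locked;	/* mmap_lock is held until teardown */
	bool vma_locked;	/* a per-VMA read lock is held */
	int held_mmap_locks;	/* bookkeeping to check lock balance */
	int held_vma_locks;
};

static void model_setup(struct lock_ctx_model *ctx)
{
	ctx->mmap_locked = false;
	ctx->vma_locked = false;
	ctx->held_mmap_locks = 0;
	ctx->held_vma_locks = 0;
}

static void model_unlock_vma(struct lock_ctx_model *ctx)
{
	if (ctx->vma_locked) {
		ctx->held_vma_locks--;
		ctx->vma_locked = false;
	}
}

/* Returns true when a VMA was found and is stabilized by some lock. */
static bool model_find(struct lock_ctx_model *ctx, enum lock_attempt outcome)
{
	/* Once mmap_lock is held, every later lookup stays under it. */
	if (ctx->mmap_locked)
		return outcome != ATTEMPT_NO_VMA;

	/* Drop the previously locked VMA before looking up the next one. */
	model_unlock_vma(ctx);

	switch (outcome) {
	case ATTEMPT_VMA_LOCKED:
		ctx->vma_locked = true;
		ctx->held_vma_locks++;
		return true;
	case ATTEMPT_REFCNT_OVERFLOW:
		/* Rare fallback: keep mmap_lock until teardown. */
		ctx->mmap_locked = true;
		ctx->held_mmap_locks++;
		return true;	/* assumption: the lookup under mmap_lock succeeds */
	default:
		return false;
	}
}

static void model_teardown(struct lock_ctx_model *ctx)
{
	if (ctx->mmap_locked) {
		ctx->held_mmap_locks--;
		ctx->mmap_locked = false; /* reset only so the model can be reused */
	} else {
		model_unlock_vma(ctx);
	}
}
```

In this model at most one of the two locks is held at any time, and model_teardown() releases whichever one that is. That is why the reworked query_vma_teardown() in the patch no longer needs the mm and vma arguments.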