From nobody Sun Oct 5 16:15:47 2025
Date: Thu, 31 Jul 2025 15:00:22 -0700
In-Reply-To: <20250731220024.702621-1-surenb@google.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
Mime-Version: 1.0
References: <20250731220024.702621-1-surenb@google.com>
X-Mailer: git-send-email 2.50.1.565.gc32cd1483b-goog
Message-ID: <20250731220024.702621-2-surenb@google.com>
Subject: [PATCH 1/3] selftests/proc: test PROCMAP_QUERY ioctl while vma is concurrently modified
From: Suren Baghdasaryan
To: akpm@linux-foundation.org
Cc: Liam.Howlett@oracle.com, lorenzo.stoakes@oracle.com, david@redhat.com,
	vbabka@suse.cz, peterx@redhat.com, jannh@google.com, hannes@cmpxchg.org,
	mhocko@kernel.org, paulmck@kernel.org, shuah@kernel.org,
	adobriyan@gmail.com, brauner@kernel.org, josef@toxicpanda.com,
	yebin10@huawei.com, linux@weissschuh.net, willy@infradead.org,
	osalvador@suse.de, andrii@kernel.org, ryan.roberts@arm.com,
	christophe.leroy@csgroup.eu, tjmercier@google.com,
	kaleshsingh@google.com, aha310510@gmail.com,
	linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-mm@kvack.org, linux-kselftest@vger.kernel.org, surenb@google.com
Content-Type: text/plain; charset="utf-8"

Extend /proc/pid/maps tearing tests to verify PROCMAP_QUERY ioctl operation
correctness while the vma is being concurrently modified.
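
The acceptance rule the split test applies to a PROCMAP_QUERY result can be
sketched in plain C as below. This is only an illustration, not code from the
patch: struct vma_range, range_eq() and split_query_ok() are hypothetical
names, and the layout (a 3-page vma whose first page is split off, queried at
base + page) is an assumption made for the example.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper type: the [start, end) range reported for one vma. */
struct vma_range {
	unsigned long start;
	unsigned long end;
};

static bool range_eq(struct vma_range a, struct vma_range b)
{
	return a.start == b.start && a.end == b.end;
}

/*
 * Assumed layout: a 3-page vma at 'base' whose first page is repeatedly
 * split off and merged back, queried at base + page. The result must equal
 * exactly one consistent snapshot: the whole vma (read before the split) or
 * the 2-page remainder (read after the split). Anything else is rejected.
 */
static bool split_query_ok(unsigned long base, unsigned long page,
			   unsigned long seen_start, unsigned long seen_end)
{
	assert(page != 0);

	struct vma_range seen = { seen_start, seen_end };
	struct vma_range before = { base, base + 3 * page };
	struct vma_range after = { base + page, base + 3 * page };

	return range_eq(seen, before) || range_eq(seen, after);
}
```

A result that mixes the two snapshots, for example the start of the original
vma with the end of the first page, matches neither range and fails the check.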

Signed-off-by: Suren Baghdasaryan
Acked-by: SeongJae Park
Tested-by: SeongJae Park
---
 tools/testing/selftests/proc/proc-maps-race.c | 65 +++++++++++++++++++
 1 file changed, 65 insertions(+)

diff --git a/tools/testing/selftests/proc/proc-maps-race.c b/tools/testing/selftests/proc/proc-maps-race.c
index 66773685a047..d40854a07ec1 100644
--- a/tools/testing/selftests/proc/proc-maps-race.c
+++ b/tools/testing/selftests/proc/proc-maps-race.c
@@ -32,6 +32,8 @@
 #include
 #include
 #include
+#include
+#include
 #include
 #include
 #include
@@ -317,6 +319,25 @@ static bool capture_mod_pattern(FIXTURE_DATA(proc_maps_race) *self,
 	       strcmp(restored_first_line->text, self->first_line.text) == 0;
 }
 
+static bool query_addr_at(int maps_fd, void *addr,
+			  unsigned long *vma_start, unsigned long *vma_end)
+{
+	struct procmap_query q;
+
+	memset(&q, 0, sizeof(q));
+	q.size = sizeof(q);
+	/* Find the VMA at the split address */
+	q.query_addr = (unsigned long long)addr;
+	q.query_flags = 0;
+	if (ioctl(maps_fd, PROCMAP_QUERY, &q))
+		return false;
+
+	*vma_start = q.vma_start;
+	*vma_end = q.vma_end;
+
+	return true;
+}
+
 static inline bool split_vma(FIXTURE_DATA(proc_maps_race) *self)
 {
 	return mmap(self->mod_info->addr, self->page_size, self->mod_info->prot | PROT_EXEC,
@@ -559,6 +580,8 @@ TEST_F(proc_maps_race, test_maps_tearing_from_split)
 	do {
 		bool last_line_changed;
 		bool first_line_changed;
+		unsigned long vma_start;
+		unsigned long vma_end;
 
 		ASSERT_TRUE(read_boundary_lines(self, &new_last_line, &new_first_line));
 
@@ -595,6 +618,19 @@ TEST_F(proc_maps_race, test_maps_tearing_from_split)
 		first_line_changed = strcmp(new_first_line.text, self->first_line.text) != 0;
 		ASSERT_EQ(last_line_changed, first_line_changed);
 
+		/* Check if PROCMAP_QUERY ioctl() finds the right VMA */
+		ASSERT_TRUE(query_addr_at(self->maps_fd, mod_info->addr + self->page_size,
+					  &vma_start, &vma_end));
+		/*
+		 * The vma at the split address can be either the same as
+		 * original one (if read before the split) or the same as the
+		 * first line in the second page (if read after the split).
+		 */
+		ASSERT_TRUE((vma_start == self->last_line.start_addr &&
+			     vma_end == self->last_line.end_addr) ||
+			    (vma_start == split_first_line.start_addr &&
+			     vma_end == split_first_line.end_addr));
+
 		clock_gettime(CLOCK_MONOTONIC_COARSE, &end_ts);
 		end_test_iteration(&end_ts, self->verbose);
 	} while (end_ts.tv_sec - start_ts.tv_sec < self->duration_sec);
@@ -636,6 +672,9 @@ TEST_F(proc_maps_race, test_maps_tearing_from_resize)
 	clock_gettime(CLOCK_MONOTONIC_COARSE, &start_ts);
 	start_test_loop(&start_ts, self->verbose);
 	do {
+		unsigned long vma_start;
+		unsigned long vma_end;
+
 		ASSERT_TRUE(read_boundary_lines(self, &new_last_line, &new_first_line));
 
 		/* Check if we read vmas after shrinking it */
@@ -662,6 +701,16 @@ TEST_F(proc_maps_race, test_maps_tearing_from_resize)
 					"Expand result invalid", self));
 		}
 
+		/* Check if PROCMAP_QUERY ioctl() finds the right VMA */
+		ASSERT_TRUE(query_addr_at(self->maps_fd, mod_info->addr, &vma_start, &vma_end));
+		/*
+		 * The vma should stay at the same address and have either the
+		 * original size of 3 pages or 1 page if read after shrinking.
+		 */
+		ASSERT_TRUE(vma_start == self->last_line.start_addr &&
+			    (vma_end - vma_start == self->page_size * 3 ||
+			     vma_end - vma_start == self->page_size));
+
 		clock_gettime(CLOCK_MONOTONIC_COARSE, &end_ts);
 		end_test_iteration(&end_ts, self->verbose);
 	} while (end_ts.tv_sec - start_ts.tv_sec < self->duration_sec);
@@ -703,6 +752,9 @@ TEST_F(proc_maps_race, test_maps_tearing_from_remap)
 	clock_gettime(CLOCK_MONOTONIC_COARSE, &start_ts);
 	start_test_loop(&start_ts, self->verbose);
 	do {
+		unsigned long vma_start;
+		unsigned long vma_end;
+
 		ASSERT_TRUE(read_boundary_lines(self, &new_last_line, &new_first_line));
 
 		/* Check if we read vmas after remapping it */
@@ -729,6 +781,19 @@ TEST_F(proc_maps_race, test_maps_tearing_from_remap)
 					"Remap restore result invalid", self));
 		}
 
+		/* Check if PROCMAP_QUERY ioctl() finds the right VMA */
+		ASSERT_TRUE(query_addr_at(self->maps_fd, mod_info->addr + self->page_size,
+					  &vma_start, &vma_end));
+		/*
+		 * The vma should either stay at the same address and have the
+		 * original size of 3 pages or we should find the remapped vma
+		 * at the remap destination address with size of 1 page.
+		 */
+		ASSERT_TRUE((vma_start == self->last_line.start_addr &&
+			     vma_end - vma_start == self->page_size * 3) ||
+			    (vma_start == self->last_line.start_addr + self->page_size &&
+			     vma_end - vma_start == self->page_size));
+
 		clock_gettime(CLOCK_MONOTONIC_COARSE, &end_ts);
 		end_test_iteration(&end_ts, self->verbose);
 	} while (end_ts.tv_sec - start_ts.tv_sec < self->duration_sec);
-- 
2.50.1.565.gc32cd1483b-goog

From nobody Sun Oct 5 16:15:47 2025
Date: Thu, 31 Jul 2025 15:00:23 -0700
In-Reply-To: <20250731220024.702621-1-surenb@google.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
Mime-Version: 1.0
References: <20250731220024.702621-1-surenb@google.com>
X-Mailer: git-send-email 2.50.1.565.gc32cd1483b-goog
Message-ID: <20250731220024.702621-3-surenb@google.com>
Subject: [PATCH 2/3] fs/proc/task_mmu: factor out proc_maps_private fields used by PROCMAP_QUERY
From: Suren Baghdasaryan
To: akpm@linux-foundation.org
Cc: Liam.Howlett@oracle.com, lorenzo.stoakes@oracle.com, david@redhat.com,
	vbabka@suse.cz, peterx@redhat.com, jannh@google.com, hannes@cmpxchg.org,
	mhocko@kernel.org, paulmck@kernel.org, shuah@kernel.org,
	adobriyan@gmail.com, brauner@kernel.org, josef@toxicpanda.com,
	yebin10@huawei.com, linux@weissschuh.net, willy@infradead.org,
	osalvador@suse.de, andrii@kernel.org, ryan.roberts@arm.com,
	christophe.leroy@csgroup.eu, tjmercier@google.com,
	kaleshsingh@google.com, aha310510@gmail.com,
	linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-mm@kvack.org, linux-kselftest@vger.kernel.org, surenb@google.com
Content-Type: text/plain; charset="utf-8"

Refactor struct proc_maps_private so that the fields used by PROCMAP_QUERY
ioctl are moved into a separate structure. In the next patch this allows
ioctl to reuse some of the functions used for reading /proc/pid/maps
without using file->private_data. This prevents concurrent modification
of file->private_data members by ioctl and /proc/pid/maps readers.
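
The shape of the refactoring can be sketched in userspace C as below. All
names here (query_state, reader_private, ioctl_like_lookup and so on) are
hypothetical stand-ins rather than the kernel definitions; the sketch only
assumes that helpers take a pointer to the factored-out sub-structure, so a
caller can run them on a stack-local copy instead of the shared per-file
state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the factored-out lookup state. */
struct query_state {
	const char *mm;		/* stands in for struct mm_struct * */
	bool locked;		/* stands in for mmap_locked / locked_vma */
};

/* Hypothetical stand-in for the per-file private data. */
struct reader_private {
	long last_pos;			/* state used only by the reader */
	struct query_state query;	/* state the lookup helpers need */
};

/* Helpers take only the sub-structure, so any owner can call them. */
static void query_lock(struct query_state *q)   { q->locked = true; }
static void query_unlock(struct query_state *q) { q->locked = false; }

/*
 * ioctl-like path: copy the mm reference into a stack-local query_state and
 * operate on that copy. The shared structure is only read, never written.
 */
static bool ioctl_like_lookup(const struct reader_private *shared)
{
	assert(shared != NULL);

	struct query_state local = { .mm = shared->query.mm };
	bool held;

	query_lock(&local);
	held = local.locked;
	query_unlock(&local);

	return held;
}

/* Returns 1 if the lookup ran and left the shared state untouched. */
static int demo_shared_state_untouched(void)
{
	struct reader_private priv = { .last_pos = 7, .query = { .mm = "mm" } };
	bool ran = ioctl_like_lookup(&priv);

	return ran && !priv.query.locked && priv.last_pos == 7;
}
```

Passing the shared structure as const makes the intent visible in the
signature: the lookup can read the mm reference but has no way to modify the
reader's lock bookkeeping.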

The change is pure code refactoring and has no functional changes.

Signed-off-by: Suren Baghdasaryan
---
 fs/proc/internal.h | 15 ++++++----
 fs/proc/task_mmu.c | 70 +++++++++++++++++++++++-----------------------
 2 files changed, 45 insertions(+), 40 deletions(-)

diff --git a/fs/proc/internal.h b/fs/proc/internal.h
index 7c235451c5ea..e2447b22592e 100644
--- a/fs/proc/internal.h
+++ b/fs/proc/internal.h
@@ -379,16 +379,21 @@ extern void proc_self_init(void);
  * task_[no]mmu.c
  */
 struct mem_size_stats;
-struct proc_maps_private {
-	struct inode *inode;
-	struct task_struct *task;
+
+struct proc_maps_query_data {
 	struct mm_struct *mm;
-	struct vma_iterator iter;
-	loff_t last_pos;
 #ifdef CONFIG_PER_VMA_LOCK
 	bool mmap_locked;
 	struct vm_area_struct *locked_vma;
 #endif
+};
+
+struct proc_maps_private {
+	struct inode *inode;
+	struct task_struct *task;
+	struct vma_iterator iter;
+	loff_t last_pos;
+	struct proc_maps_query_data query;
 #ifdef CONFIG_NUMA
 	struct mempolicy *task_mempolicy;
 #endif
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 3d6d8a9f13fc..509fa162760a 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -132,11 +132,11 @@ static void release_task_mempolicy(struct proc_maps_private *priv)
 
 #ifdef CONFIG_PER_VMA_LOCK
 
-static void unlock_vma(struct proc_maps_private *priv)
+static void unlock_vma(struct proc_maps_query_data *query)
 {
-	if (priv->locked_vma) {
-		vma_end_read(priv->locked_vma);
-		priv->locked_vma = NULL;
+	if (query->locked_vma) {
+		vma_end_read(query->locked_vma);
+		query->locked_vma = NULL;
 	}
 }
 
@@ -151,14 +151,14 @@ static inline bool lock_vma_range(struct seq_file *m,
 	 * walking the vma tree under rcu read protection.
 	 */
 	if (m->op != &proc_pid_maps_op) {
-		if (mmap_read_lock_killable(priv->mm))
+		if (mmap_read_lock_killable(priv->query.mm))
 			return false;
 
-		priv->mmap_locked = true;
+		priv->query.mmap_locked = true;
 	} else {
 		rcu_read_lock();
-		priv->locked_vma = NULL;
-		priv->mmap_locked = false;
+		priv->query.locked_vma = NULL;
+		priv->query.mmap_locked = false;
 	}
 
 	return true;
@@ -166,10 +166,10 @@ static inline bool lock_vma_range(struct seq_file *m,
 
 static inline void unlock_vma_range(struct proc_maps_private *priv)
 {
-	if (priv->mmap_locked) {
-		mmap_read_unlock(priv->mm);
+	if (priv->query.mmap_locked) {
+		mmap_read_unlock(priv->query.mm);
 	} else {
-		unlock_vma(priv);
+		unlock_vma(&priv->query);
 		rcu_read_unlock();
 	}
 }
@@ -179,13 +179,13 @@ static struct vm_area_struct *get_next_vma(struct proc_maps_private *priv,
 {
 	struct vm_area_struct *vma;
 
-	if (priv->mmap_locked)
+	if (priv->query.mmap_locked)
 		return vma_next(&priv->iter);
 
-	unlock_vma(priv);
-	vma = lock_next_vma(priv->mm, &priv->iter, last_pos);
+	unlock_vma(&priv->query);
+	vma = lock_next_vma(priv->query.mm, &priv->iter, last_pos);
 	if (!IS_ERR_OR_NULL(vma))
-		priv->locked_vma = vma;
+		priv->query.locked_vma = vma;
 
 	return vma;
 }
@@ -193,14 +193,14 @@ static struct vm_area_struct *get_next_vma(struct proc_maps_private *priv,
 static inline bool fallback_to_mmap_lock(struct proc_maps_private *priv,
 					 loff_t pos)
 {
-	if (priv->mmap_locked)
+	if (priv->query.mmap_locked)
 		return false;
 
 	rcu_read_unlock();
-	mmap_read_lock(priv->mm);
+	mmap_read_lock(priv->query.mm);
 	/* Reinitialize the iterator after taking mmap_lock */
 	vma_iter_set(&priv->iter, pos);
-	priv->mmap_locked = true;
+	priv->query.mmap_locked = true;
 
 	return true;
 }
@@ -210,12 +210,12 @@ static inline bool fallback_to_mmap_lock(struct proc_maps_private *priv,
 static inline bool lock_vma_range(struct seq_file *m,
 				  struct proc_maps_private *priv)
 {
-	return mmap_read_lock_killable(priv->mm) == 0;
+	return mmap_read_lock_killable(priv->query.mm) == 0;
 }
 
 static inline void unlock_vma_range(struct proc_maps_private *priv)
 {
-	mmap_read_unlock(priv->mm);
+	mmap_read_unlock(priv->query.mm);
 }
 
 static struct vm_area_struct *get_next_vma(struct proc_maps_private *priv,
@@ -258,7 +258,7 @@ static struct vm_area_struct *proc_get_vma(struct seq_file *m, loff_t *ppos)
 		*ppos = vma->vm_end;
 	} else {
 		*ppos = SENTINEL_VMA_GATE;
-		vma = get_gate_vma(priv->mm);
+		vma = get_gate_vma(priv->query.mm);
 	}
 
 	return vma;
@@ -278,7 +278,7 @@ static void *m_start(struct seq_file *m, loff_t *ppos)
 	if (!priv->task)
 		return ERR_PTR(-ESRCH);
 
-	mm = priv->mm;
+	mm = priv->query.mm;
 	if (!mm || !mmget_not_zero(mm)) {
 		put_task_struct(priv->task);
 		priv->task = NULL;
@@ -318,7 +318,7 @@ static void *m_next(struct seq_file *m, void *v, loff_t *ppos)
 static void m_stop(struct seq_file *m, void *v)
 {
 	struct proc_maps_private *priv = m->private;
-	struct mm_struct *mm = priv->mm;
+	struct mm_struct *mm = priv->query.mm;
 
 	if (!priv->task)
 		return;
@@ -339,9 +339,9 @@ static int proc_maps_open(struct inode *inode, struct file *file,
 		return -ENOMEM;
 
 	priv->inode = inode;
-	priv->mm = proc_mem_open(inode, PTRACE_MODE_READ);
-	if (IS_ERR_OR_NULL(priv->mm)) {
-		int err = priv->mm ? PTR_ERR(priv->mm) : -ESRCH;
+	priv->query.mm = proc_mem_open(inode, PTRACE_MODE_READ);
+	if (IS_ERR_OR_NULL(priv->query.mm)) {
+		int err = priv->query.mm ? PTR_ERR(priv->query.mm) : -ESRCH;
 
 		seq_release_private(inode, file);
 		return err;
@@ -355,8 +355,8 @@ static int proc_map_release(struct inode *inode, struct file *file)
 	struct seq_file *seq = file->private_data;
 	struct proc_maps_private *priv = seq->private;
 
-	if (priv->mm)
-		mmdrop(priv->mm);
+	if (priv->query.mm)
+		mmdrop(priv->query.mm);
 
 	return seq_release_private(inode, file);
 }
@@ -610,7 +610,7 @@ static int do_procmap_query(struct proc_maps_private *priv, void __user *uarg)
 	if (!!karg.build_id_size != !!karg.build_id_addr)
 		return -EINVAL;
 
-	mm = priv->mm;
+	mm = priv->query.mm;
 	if (!mm || !mmget_not_zero(mm))
 		return -ESRCH;
 
@@ -1307,7 +1307,7 @@ static int show_smaps_rollup(struct seq_file *m, void *v)
 {
 	struct proc_maps_private *priv = m->private;
 	struct mem_size_stats mss = {};
-	struct mm_struct *mm = priv->mm;
+	struct mm_struct *mm = priv->query.mm;
 	struct vm_area_struct *vma;
 	unsigned long vma_start = 0, last_vma_end = 0;
 	int ret = 0;
@@ -1452,9 +1452,9 @@ static int smaps_rollup_open(struct inode *inode, struct file *file)
 		goto out_free;
 
 	priv->inode = inode;
-	priv->mm = proc_mem_open(inode, PTRACE_MODE_READ);
-	if (IS_ERR_OR_NULL(priv->mm)) {
-		ret = priv->mm ? PTR_ERR(priv->mm) : -ESRCH;
+	priv->query.mm = proc_mem_open(inode, PTRACE_MODE_READ);
+	if (IS_ERR_OR_NULL(priv->query.mm)) {
+		ret = priv->query.mm ? PTR_ERR(priv->query.mm) : -ESRCH;
 
 		single_release(inode, file);
 		goto out_free;
@@ -1472,8 +1472,8 @@ static int smaps_rollup_release(struct inode *inode, struct file *file)
 	struct seq_file *seq = file->private_data;
 	struct proc_maps_private *priv = seq->private;
 
-	if (priv->mm)
-		mmdrop(priv->mm);
+	if (priv->query.mm)
+		mmdrop(priv->query.mm);
 
 	kfree(priv);
 	return single_release(inode, file);
-- 
2.50.1.565.gc32cd1483b-goog

From nobody Sun Oct 5 16:15:47 2025
Date: Thu, 31 Jul 2025 15:00:24 -0700
In-Reply-To: <20250731220024.702621-1-surenb@google.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
Mime-Version: 1.0
References: <20250731220024.702621-1-surenb@google.com>
X-Mailer: git-send-email 2.50.1.565.gc32cd1483b-goog
Message-ID: <20250731220024.702621-4-surenb@google.com>
Subject: [PATCH 3/3] fs/proc/task_mmu: execute PROCMAP_QUERY ioctl under per-vma locks
From: Suren Baghdasaryan
To: akpm@linux-foundation.org
Cc: Liam.Howlett@oracle.com, lorenzo.stoakes@oracle.com, david@redhat.com,
	vbabka@suse.cz, peterx@redhat.com, jannh@google.com, hannes@cmpxchg.org,
	mhocko@kernel.org, paulmck@kernel.org, shuah@kernel.org,
	adobriyan@gmail.com, brauner@kernel.org, josef@toxicpanda.com,
	yebin10@huawei.com, linux@weissschuh.net, willy@infradead.org,
	osalvador@suse.de, andrii@kernel.org, ryan.roberts@arm.com,
	christophe.leroy@csgroup.eu, tjmercier@google.com,
	kaleshsingh@google.com, aha310510@gmail.com,
	linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-mm@kvack.org, linux-kselftest@vger.kernel.org, surenb@google.com
Content-Type: text/plain; charset="utf-8"

Utilize per-vma locks to stabilize vma after lookup without taking
mmap_lock during PROCMAP_QUERY ioctl execution. If vma lock is
contended, we fall back to mmap_lock but take it only momentarily
to lock the vma and release the mmap_lock. In a very unlikely case
of vm_refcnt overflow, this fall back path will fail and ioctl is
done under mmap_lock protection.
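
The setup / lookup / teardown bookkeeping described above can be pictured with
a simplified userspace analogy. This is not the kernel implementation: it
uses C11 atomic_flag in place of the per-vma lock and mmap_lock, collapses the
fallback to two levels (it skips the momentary mmap_lock step), and every name
in it is hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct object { atomic_flag busy; };		/* stands in for a per-vma lock */
struct space  { atomic_flag big_lock; };	/* stands in for mmap_lock */

/* Per-call lookup state, analogous in spirit to the query structure. */
struct query {
	struct space *sp;
	struct object *locked_obj;
	bool big_locked;
};

static void query_setup(struct query *q, struct space *sp)
{
	q->sp = sp;
	q->locked_obj = NULL;
	q->big_locked = false;
}

/* Try the fine-grained lock first; fall back to the coarse lock on failure. */
static void query_find(struct query *q, struct object *obj)
{
	if (!atomic_flag_test_and_set(&obj->busy)) {
		q->locked_obj = obj;		/* fast path */
		return;
	}
	while (atomic_flag_test_and_set(&q->sp->big_lock))
		;				/* spin until the coarse lock is free */
	q->big_locked = true;
}

/* Release whichever lock the lookup ended up holding. */
static void query_teardown(struct query *q)
{
	if (q->big_locked) {
		atomic_flag_clear(&q->sp->big_lock);
		q->big_locked = false;
	} else if (q->locked_obj) {
		atomic_flag_clear(&q->locked_obj->busy);
		q->locked_obj = NULL;
	}
}

/* Returns 1 for the fast path, 2 for the fallback, -1 if a lock leaked. */
static int demo_fallback(bool contended)
{
	struct space sp = { ATOMIC_FLAG_INIT };
	struct object obj = { ATOMIC_FLAG_INIT };
	struct query q;
	int path;

	if (contended)
		atomic_flag_test_and_set(&obj.busy);	/* someone else holds it */

	query_setup(&q, &sp);
	query_find(&q, &obj);
	path = q.big_locked ? 2 : 1;
	query_teardown(&q);

	/* After teardown the coarse lock must be free again. */
	if (atomic_flag_test_and_set(&sp.big_lock))
		return -1;
	assert(!q.big_locked && q.locked_obj == NULL);
	return path;
}
```

Because the state records which lock was taken, teardown needs no arguments
beyond the query itself, which mirrors how query_vma_teardown() in the diff
takes only the query pointer.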

This change is designed to reduce mmap_lock contention and prevent
PROCMAP_QUERY ioctl calls from blocking address space updates.

Signed-off-by: Suren Baghdasaryan
---
 fs/proc/task_mmu.c | 81 +++++++++++++++++++++++++++++++++++++---------
 1 file changed, 65 insertions(+), 16 deletions(-)

diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 509fa162760a..b504b798e8fe 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -517,28 +517,78 @@ static int pid_maps_open(struct inode *inode, struct file *file)
 		PROCMAP_QUERY_VMA_FLAGS				\
 )
 
-static int query_vma_setup(struct mm_struct *mm)
+#ifdef CONFIG_PER_VMA_LOCK
+
+static int query_vma_setup(struct proc_maps_query_data *query)
 {
-	return mmap_read_lock_killable(mm);
+	query->locked_vma = NULL;
+	query->mmap_locked = false;
+
+	return 0;
 }
 
-static void query_vma_teardown(struct mm_struct *mm, struct vm_area_struct *vma)
+static void query_vma_teardown(struct proc_maps_query_data *query)
 {
-	mmap_read_unlock(mm);
+	if (query->mmap_locked)
+		mmap_read_unlock(query->mm);
+	else
+		unlock_vma(query);
 }
 
-static struct vm_area_struct *query_vma_find_by_addr(struct mm_struct *mm, unsigned long addr)
+static struct vm_area_struct *query_vma_find_by_addr(struct proc_maps_query_data *query,
+						     unsigned long addr)
 {
-	return find_vma(mm, addr);
+	struct vm_area_struct *vma;
+	struct vma_iterator vmi;
+
+	unlock_vma(query);
+	rcu_read_lock();
+	vma_iter_init(&vmi, query->mm, addr);
+	vma = lock_next_vma(query->mm, &vmi, addr);
+	rcu_read_unlock();
+
+	if (!IS_ERR_OR_NULL(vma)) {
+		query->locked_vma = vma;
+	} else if (PTR_ERR(vma) == -EAGAIN) {
+		/* Fallback to mmap_lock on vma->vm_refcnt overflow */
+		mmap_read_lock(query->mm);
+		vma = find_vma(query->mm, addr);
+		query->mmap_locked = true;
+	}
+
+	return vma;
 }
 
-static struct vm_area_struct *query_matching_vma(struct mm_struct *mm,
+#else /* CONFIG_PER_VMA_LOCK */
+
+static int query_vma_setup(struct proc_maps_query_data *query)
+{
+	return mmap_read_lock_killable(query->mm);
+}
+
+static void query_vma_teardown(struct proc_maps_query_data *query)
+{
+	mmap_read_unlock(query->mm);
+}
+
+static struct vm_area_struct *query_vma_find_by_addr(struct proc_maps_query_data *query,
+						     unsigned long addr)
+{
+	return find_vma(query->mm, addr);
+}
+
+#endif /* CONFIG_PER_VMA_LOCK */
+
+static struct vm_area_struct *query_matching_vma(struct proc_maps_query_data *query,
 						 unsigned long addr, u32 flags)
 {
 	struct vm_area_struct *vma;
 
 next_vma:
-	vma = query_vma_find_by_addr(mm, addr);
+	vma = query_vma_find_by_addr(query, addr);
+	if (IS_ERR(vma))
+		return vma;
+
 	if (!vma)
 		goto no_vma;
 
@@ -579,11 +629,11 @@ static struct vm_area_struct *query_matching_vma(struct mm_struct *mm,
 	return ERR_PTR(-ENOENT);
 }
 
-static int do_procmap_query(struct proc_maps_private *priv, void __user *uarg)
+static int do_procmap_query(struct mm_struct *mm, void __user *uarg)
 {
+	struct proc_maps_query_data query = { .mm = mm };
 	struct procmap_query karg;
 	struct vm_area_struct *vma;
-	struct mm_struct *mm;
 	const char *name = NULL;
 	char build_id_buf[BUILD_ID_SIZE_MAX], *name_buf = NULL;
 	__u64 usize;
@@ -610,17 +660,16 @@ static int do_procmap_query(struct proc_maps_private *priv, void __user *uarg)
 	if (!!karg.build_id_size != !!karg.build_id_addr)
 		return -EINVAL;
 
-	mm = priv->query.mm;
 	if (!mm || !mmget_not_zero(mm))
 		return -ESRCH;
 
-	err = query_vma_setup(mm);
+	err = query_vma_setup(&query);
 	if (err) {
 		mmput(mm);
 		return err;
 	}
 
-	vma = query_matching_vma(mm, karg.query_addr, karg.query_flags);
+	vma = query_matching_vma(&query, karg.query_addr, karg.query_flags);
 	if (IS_ERR(vma)) {
 		err = PTR_ERR(vma);
 		vma = NULL;
@@ -705,7 +754,7 @@ static int do_procmap_query(struct proc_maps_private *priv, void __user *uarg)
 	}
 
 	/* unlock vma or mmap_lock, and put mm_struct before copying data to user */
-	query_vma_teardown(mm, vma);
+	query_vma_teardown(&query);
 	mmput(mm);
 
 	if (karg.vma_name_size && copy_to_user(u64_to_user_ptr(karg.vma_name_addr),
@@ -725,7 +774,7 @@ static int do_procmap_query(struct proc_maps_private *priv, void __user *uarg)
 	return 0;
 
 out:
-	query_vma_teardown(mm, vma);
+	query_vma_teardown(&query);
 	mmput(mm);
 	kfree(name_buf);
 	return err;
@@ -738,7 +787,7 @@ static long procfs_procmap_ioctl(struct file *file, unsigned int cmd, unsigned l
 
 	switch (cmd) {
 	case PROCMAP_QUERY:
-		return do_procmap_query(priv, (void __user *)arg);
+		return do_procmap_query(priv->query.mm, (void __user *)arg);
 	default:
 		return -ENOIOCTLCMD;
 	}
-- 
2.50.1.565.gc32cd1483b-goog