Use per-vma locks when reading /proc/pid/smaps and /proc/pid/numa_maps
similar to /proc/pid/maps to reduce contention on central mmap_lock. One
major difference between maps and smaps/numa_maps reading is that the
latter executes page table walk which can't be done under RCU due to a
possibility of sleeping. Therefore we drop RCU read lock before this walk
while keeping the VMA locked. After the walk we retake RCU read lock,
reset VMA iterator and proceed with the next VMA.
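The lock ordering described above can be sketched as a small user-space
model. This is not the kernel code: every toy_* name below is a
hypothetical stand-in (a counter for the RCU read section, a flag for the
per-VMA lock), and the point where the VMA lock is released is an
assumption made only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for kernel state; none of these are real kernel APIs. */
struct toy_vma {
	unsigned long start, end;
	bool locked;	/* models the per-VMA read lock */
	int walked;	/* number of page walks done on this VMA */
};

static int toy_rcu_depth;	/* models RCU read-side nesting */

static void toy_rcu_lock(void)
{
	toy_rcu_depth++;
}

static void toy_rcu_unlock(void)
{
	assert(toy_rcu_depth > 0);
	toy_rcu_depth--;
}

/* The walk may sleep: it must run outside RCU, with the VMA still locked. */
static void toy_page_walk(struct toy_vma *vma)
{
	assert(toy_rcu_depth == 0);
	assert(vma->locked);
	vma->walked++;
}

/* Fresh lookup by address, standing in for a reset VMA iterator. */
static struct toy_vma *toy_find_vma(struct toy_vma *vmas, size_t n,
				    unsigned long addr)
{
	for (size_t i = 0; i < n; i++)
		if (vmas[i].end > addr)
			return &vmas[i];
	return NULL;
}

/* Visits every VMA (sorted by address) and returns how many were walked. */
size_t toy_read_smaps(struct toy_vma *vmas, size_t n)
{
	unsigned long addr = 0;
	size_t reported = 0;
	struct toy_vma *vma;

	toy_rcu_lock();
	while ((vma = toy_find_vma(vmas, n, addr)) != NULL) {
		vma->locked = true;	/* lock the VMA while under RCU */
		toy_rcu_unlock();	/* drop RCU before the sleeping walk */
		toy_page_walk(vma);
		toy_rcu_lock();		/* retake RCU after the walk */
		addr = vma->end;	/* next lookup starts past this VMA */
		vma->locked = false;	/* assumed unlock point */
		reported++;
	}
	toy_rcu_unlock();
	return reported;
}
```

The asserts in toy_page_walk() encode the two invariants from the
description: no RCU read section is held during the walk, and the VMA
stays locked for its whole duration.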
The last two patches extend /proc/pid/maps test to cover /proc/pid/smaps
reading during concurrent address space modification.
Changes since v1[1]
- moved drop_rcu earlier in smap_gather_stats to avoid sleeping under
RCU lock in shmem_swap_usage(), per Sashiko
- skip page walks for gate VMA in show_numa_map(), per Sashiko
- introduced parse_vma_line() and copy_line() helper functions to ensure
input string passed to sscanf() is always NUL-terminated, per Sashiko
- used FIXTURE_VARIANT to run both maps and smaps tests in a single
test run, per Liam R. Howlett
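As an illustration of the NUL-termination item above, here is a minimal
sketch of copying one line into a bounded buffer before handing it to
sscanf(). The sketch_* helpers, their signatures, and the 128-byte
buffer are assumptions made for this example; they are not the
parse_vma_line() and copy_line() helpers from the patch.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: copy one line (up to the first '\n', or src_len
 * bytes if there is none) into dst and NUL-terminate it.
 * Returns 0 on success, -1 if the line does not fit.
 */
static int sketch_copy_line(char *dst, size_t dst_size,
			    const char *src, size_t src_len)
{
	const char *nl = memchr(src, '\n', src_len);
	size_t n = nl ? (size_t)(nl - src) : src_len;

	if (dst_size == 0 || n >= dst_size)
		return -1;
	memcpy(dst, src, n);
	dst[n] = '\0';
	return 0;
}

/*
 * Hypothetical helper: parse the "start-end" range from a maps-style
 * line. buf need not be NUL-terminated; only len bytes are read.
 * Returns 0 on success, -1 on failure.
 */
int sketch_parse_vma_line(const char *buf, size_t len,
			  unsigned long *start, unsigned long *end)
{
	char line[128];

	if (sketch_copy_line(line, sizeof(line), buf, len) != 0)
		return -1;
	/* sscanf() now always sees a NUL-terminated string. */
	return sscanf(line, "%lx-%lx", start, end) == 2 ? 0 : -1;
}
```

Because sscanf() only ever sees the local copy, the parse cannot run past
the end of the page buffer even when the buffer has no terminator.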
Applies over mm-unstable.
[1] https://lore.kernel.org/all/20260424070234.190145-1-surenb@google.com/
Suren Baghdasaryan (3):
fs/proc/task_mmu: read proc/pid/{smaps|numa_maps} under per-vma lock
selftests/proc: ensure the test is performed at the right page
boundary
selftests/proc: add /proc/pid/smaps tearing tests
fs/proc/task_mmu.c | 195 +++++++++---
tools/testing/selftests/proc/proc-maps-race.c | 293 ++++++++++++++----
2 files changed, 387 insertions(+), 101 deletions(-)
base-commit: 761e9fad336afb6fe2cd488c7bd522e2783064fc
--
2.54.0.545.g6539524ca2-goog
On 4/26/26 08:27, Suren Baghdasaryan wrote:
> Use per-vma locks when reading /proc/pid/smaps and /proc/pid/numa_maps
> similar to /proc/pid/maps to reduce contention on central mmap_lock. One
> major difference between maps and smaps/numa_maps reading is that the
> latter executes page table walk which can't be done under RCU due to a
> possibility of sleeping. Therefore we drop RCU read lock before this walk
> while keeping the VMA locked. After the walk we retake RCU read lock,
> reset VMA iterator and proceed with the next VMA.
With many small VMAs, is that overhead noticeable?
--
Cheers,
David
On Tue, May 12, 2026 at 2:28 AM David Hildenbrand (Arm)
<david@kernel.org> wrote:
>
> On 4/26/26 08:27, Suren Baghdasaryan wrote:
> > Use per-vma locks when reading /proc/pid/smaps and /proc/pid/numa_maps
> > similar to /proc/pid/maps to reduce contention on central mmap_lock. One
> > major difference between maps and smaps/numa_maps reading is that the
> > latter executes page table walk which can't be done under RCU due to a
> > possibility of sleeping. Therefore we drop RCU read lock before this walk
> > while keeping the VMA locked. After the walk we retake RCU read lock,
> > reset VMA iterator and proceed with the next VMA.
>
> With many small VMAs, is that overhead noticeable?
It might be but the point of this patchset (and the previous one that
made a similar change for /proc/pid/maps) is to reduce mmap_lock
contention, not to speed up the read operation, which is not a
performance critical part. The original problem that Paul McKenney
described and which kicked off this series of changes is that a
low-priority monitoring process reading /proc/pid/{maps|smaps|...} can
block high-priority updates by holding the mmap_lock. You can see
details about this problem and the numbers Paul obtained with the
previous change here:
https://lore.kernel.org/all/20250719182854.3166724-1-surenb@google.com/
>
> --
> Cheers,
>
> David
On 5/13/26 05:58, Suren Baghdasaryan wrote:
> On Tue, May 12, 2026 at 2:28 AM David Hildenbrand (Arm)
> <david@kernel.org> wrote:
>>
>> On 4/26/26 08:27, Suren Baghdasaryan wrote:
>>> Use per-vma locks when reading /proc/pid/smaps and /proc/pid/numa_maps
>>> similar to /proc/pid/maps to reduce contention on central mmap_lock. One
>>> major difference between maps and smaps/numa_maps reading is that the
>>> latter executes page table walk which can't be done under RCU due to a
>>> possibility of sleeping. Therefore we drop RCU read lock before this walk
>>> while keeping the VMA locked. After the walk we retake RCU read lock,
>>> reset VMA iterator and proceed with the next VMA.
>>
>> With many small VMAs, is that overhead noticeable?
>
> It might be but the point of this patchset (and the previous one that
> made a similar change for /proc/pid/maps) is to reduce mmap_lock
> contention, not to speed up the read operation, which is not a
> performance critical part.
Well, this interface has been around .. forever, so if there is a noticeable
change in performance it should be called out.
> The original problem that Paul McKenney
> described and which kicked off this series of changes is that a
> low-priority monitoring process reading /proc/pid/{maps|smaps|...} can
> block high-priority updates by holding the mmap_lock. You can see
> details about this problem and the numbers Paul obtained with the
> previous change here:
> https://lore.kernel.org/all/20250719182854.3166724-1-surenb@google.com/
Yes, I know. A problem that has been around ... forever as well :)
--
Cheers,
David
On Wed, May 13, 2026 at 12:39 AM David Hildenbrand (Arm)
<david@kernel.org> wrote:
>
> On 5/13/26 05:58, Suren Baghdasaryan wrote:
> > On Tue, May 12, 2026 at 2:28 AM David Hildenbrand (Arm)
> > <david@kernel.org> wrote:
> >>
> >> On 4/26/26 08:27, Suren Baghdasaryan wrote:
> >>> Use per-vma locks when reading /proc/pid/smaps and /proc/pid/numa_maps
> >>> similar to /proc/pid/maps to reduce contention on central mmap_lock. One
> >>> major difference between maps and smaps/numa_maps reading is that the
> >>> latter executes page table walk which can't be done under RCU due to a
> >>> possibility of sleeping. Therefore we drop RCU read lock before this walk
> >>> while keeping the VMA locked. After the walk we retake RCU read lock,
> >>> reset VMA iterator and proceed with the next VMA.
> >>
> >> With many small VMAs, is that overhead noticeable?
> >
> > It might be but the point of this patchset (and the previous one that
> > made a similar change for /proc/pid/maps) is to reduce mmap_lock
> > contention, not to speed up the read operation, which is not a
> > performance critical part.
>
> Well, this interface has been around .. forever, so if there is a noticeable
> change in performance it should be called out.
Sorry, I missed your reply. I'll see if I can adapt Paul's test for
/proc/pid/maps [1] for benchmarking smaps but I would expect similar
results to those reported in [2].
[1] https://github.com/paulmckrcu/proc-mmap_sem-test
[2] https://lore.kernel.org/all/20250719182854.3166724-1-surenb@google.com/
>
> > The original problem that Paul McKenney
> > described and which kicked off this series of changes is that a
> > low-priority monitoring process reading /proc/pid/{maps|smaps|...} can
> > block high-priority updates by holding the mmap_lock. You can see
> > details about this problem and the numbers Paul obtained with the
> > previous change here:
> > https://lore.kernel.org/all/20250719182854.3166724-1-surenb@google.com/
>
> Yes, I know. A problem that has been around ... forever as well :)
>
> --
> Cheers,
>
> David
On Thu, 21 May 2026 08:16:01 -0700 Suren Baghdasaryan <surenb@google.com> wrote:
> > > It might be but the point of this patchset (and the previous one that
> > > made a similar change for /proc/pid/maps) is to reduce mmap_lock
> > > contention, not to speed up the read operation, which is not a
> > > performance critical part.
> >
> > Well, this interface has been around .. forever, so if there is a noticeable
> > change in performance it should be called out.
>
> Sorry, I missed your reply. I'll see if I can adapt Paul's test for
> /proc/pid/maps [1] for benchmarking smaps but I would expect similar
> results to those reported in [2].
How's it coming along ;)
> [1] https://github.com/paulmckrcu/proc-mmap_sem-test
> [2] https://lore.kernel.org/all/20250719182854.3166724-1-surenb@google.com/
I've moved this series to the tail of mm-unstable to permit more time.
On Tue, May 26, 2026 at 06:49:44PM -0700, Andrew Morton wrote:
> On Thu, 21 May 2026 08:16:01 -0700 Suren Baghdasaryan <surenb@google.com> wrote:
>
> > > > It might be but the point of this patchset (and the previous one that
> > > > made a similar change for /proc/pid/maps) is to reduce mmap_lock
> > > > contention, not to speed up the read operation, which is not a
> > > > performance critical part.
> > >
> > > Well, this interface has been around .. forever, so if there is a noticeable
> > > change in performance it should be called out.
> >
> > Sorry, I missed your reply. I'll see if I can adapt Paul's test for
> > /proc/pid/maps [1] for benchmarking smaps but I would expect similar
> > results to those reported in [2].
>
> How's it coming along ;)
>
> > [1] https://github.com/paulmckrcu/proc-mmap_sem-test
> > [2] https://lore.kernel.org/all/20250719182854.3166724-1-surenb@google.com/
>
> I've moved this series to the tail of mm-unstable to permit more time.
Well I'm not sure it's _vital_ to get stats for this, it's pretty much an
extension of existing VMA lock work in /proc/$pid/maps -> smaps, and the
logic is sound.
There's no reason to believe there will be anything other than a reduction
in lock contention here at least under whichever workloads happen to hammer
smaps, but doesn't feel like there's a downside!
Cheers, Lorenzo
On Thu, May 28, 2026 at 8:37 AM Lorenzo Stoakes <ljs@kernel.org> wrote:
>
> On Tue, May 26, 2026 at 06:49:44PM -0700, Andrew Morton wrote:
> > On Thu, 21 May 2026 08:16:01 -0700 Suren Baghdasaryan <surenb@google.com> wrote:
> >
> > > > > It might be but the point of this patchset (and the previous one that
> > > > > made a similar change for /proc/pid/maps) is to reduce mmap_lock
> > > > > contention, not to speed up the read operation, which is not a
> > > > > performance critical part.
> > > >
> > > > Well, this interface has been around .. forever, so if there is a noticeable
> > > > change in performance it should be called out.
> > >
> > > Sorry, I missed your reply. I'll see if I can adapt Paul's test for
> > > /proc/pid/maps [1] for benchmarking smaps but I would expect similar
> > > results to those reported in [2].
> >
> > How's it coming along ;)
> >
> > > [1] https://github.com/paulmckrcu/proc-mmap_sem-test
> > > [2] https://lore.kernel.org/all/20250719182854.3166724-1-surenb@google.com/
> >
> > I've moved this series to the tail of mm-unstable to permit more time.
>
> Well I'm not sure it's _vital_ to get stats for this, it's pretty much an
> extension of existing VMA lock work in /proc/$pid/maps -> smaps, and the
> logic is sound.
>
> There's no reason to believe there will be anything other than a reduction
> in lock contention here at least under whichever workloads happen to hammer
> smaps, but doesn't feel like there's a downside!
Just finished running Paul's test. I ran a smaller number of
iterations (20 instead of 100) because it takes a lot of time, but the
results are quite convincing.
In the baseline, all runs had maximum latencies exceeding 9.7
milliseconds. In contrast, the patched version has maximum latencies
in the range of 1.8-3.2 milliseconds.
The median latency is 1.135 milliseconds for the baseline and 7
microseconds with the patch series.
Baseline
./run-proc-vs-map.sh --nsamples 20 --rawdata -- --busyduration 2
1.135 1.078 9.737
1.135 1.086 9.834
1.135 1.090 9.890
1.135 1.091 9.892
1.135 1.111 9.894
1.135 1.119 9.896
1.135 1.120 9.918
1.135 1.123 9.927
1.135 1.130 9.937
1.135 1.132 9.940
1.135 1.138 9.942
1.135 1.140 9.944
1.135 1.141 9.952
1.135 1.148 9.952
1.135 1.153 9.958
1.135 1.153 9.967
1.135 1.167 9.974
1.135 1.168 9.975
1.135 1.173 9.982
1.135 1.199 10.014
Patched
./run-proc-vs-map.sh --nsamples 20 --rawdata -- --busyduration 2
0.007 0.006 1.862
0.007 0.006 1.937
0.007 0.006 1.947
0.007 0.007 1.956
0.007 0.007 1.956
0.007 0.007 1.998
0.007 0.007 2.020
0.007 0.007 2.027
0.007 0.007 2.085
0.007 0.007 2.100
0.007 0.007 2.115
0.007 0.007 2.233
0.007 0.007 2.304
0.007 0.007 2.325
0.007 0.007 2.327
0.007 0.007 2.427
0.007 0.007 2.489
0.007 0.007 2.500
0.007 0.007 3.123
0.007 0.007 3.151
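For readers who want to check the summary against the raw data, a short
sketch follows. It assumes the third column is the per-run maximum
latency in milliseconds, which matches the ranges quoted in the text;
the function names are made up for this example.

```c
#include <stddef.h>

/* Third column of the raw data above (assumed: per-run maximum, ms). */
const double baseline_max_ms[20] = {
	9.737, 9.834, 9.890, 9.892, 9.894, 9.896, 9.918, 9.927, 9.937, 9.940,
	9.942, 9.944, 9.952, 9.952, 9.958, 9.967, 9.974, 9.975, 9.982, 10.014,
};
const double patched_max_ms[20] = {
	1.862, 1.937, 1.947, 1.956, 1.956, 1.998, 2.020, 2.027, 2.085, 2.100,
	2.115, 2.233, 2.304, 2.325, 2.327, 2.427, 2.489, 2.500, 3.123, 3.151,
};

/* Smallest value in v[0..n-1]; n must be at least 1. */
double min_of(const double *v, size_t n)
{
	double m = v[0];

	for (size_t i = 1; i < n; i++)
		if (v[i] < m)
			m = v[i];
	return m;
}

/* Largest value in v[0..n-1]; n must be at least 1. */
double max_of(const double *v, size_t n)
{
	double m = v[0];

	for (size_t i = 1; i < n; i++)
		if (v[i] > m)
			m = v[i];
	return m;
}

/* Ratio of the quoted medians: 1.135 ms baseline vs 0.007 ms patched. */
double median_ratio(void)
{
	return 1.135 / 0.007;
}
```

With these inputs the quoted medians differ by a factor of roughly 160,
and the largest patched maximum stays below a third of the smallest
baseline maximum.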
>
> Cheers, Lorenzo
On Sat, 25 Apr 2026 23:27:15 -0700 Suren Baghdasaryan <surenb@google.com> wrote:
> Use per-vma locks when reading /proc/pid/smaps and /proc/pid/numa_maps
> similar to /proc/pid/maps to reduce contention on central mmap_lock.
Was the benefit measured/measurable?
Sashiko doesn't like your strchr():
https://sashiko.dev/#/patchset/20260426062718.1238437-1-surenb@google.com
On Sun, Apr 26, 2026 at 12:59 PM Andrew Morton <akpm@linux-foundation.org> wrote:
>
> On Sat, 25 Apr 2026 23:27:15 -0700 Suren Baghdasaryan <surenb@google.com> wrote:
>
> > Use per-vma locks when reading /proc/pid/smaps and /proc/pid/numa_maps
> > similar to /proc/pid/maps to reduce contention on central mmap_lock.
>
> Was the benefit measured/measurable?
>
> Sashiko doesn't like your strchr():
> https://sashiko.dev/#/patchset/20260426062718.1238437-1-surenb@google.com
Sorry, missed your comment earlier. I think strchr() is fine here and
Sashiko did notice that read_page() always makes sure the buffer ends with
a newline character, IOW *end_pos == '\n'. So, once we found this last
newline and moved past it, we terminate the loop. I guess Sashiko is being
overprotective...
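The termination argument can be illustrated with a stand-alone sketch.
It is not the selftest code: sketch_count_lines() is a made-up function,
and it assumes, as stated above, that end_pos points at the final '\n'
of the buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Count the lines in [buf, end_pos], where *end_pos is the last '\n'.
 * Every strchr() call starts at or before end_pos, so it always finds a
 * newline no later than end_pos and never scans past it.
 */
size_t sketch_count_lines(const char *buf, const char *end_pos)
{
	size_t lines = 0;
	const char *pos = buf;

	assert(*end_pos == '\n');	/* the invariant the loop relies on */
	while (pos <= end_pos) {
		const char *nl = strchr(pos, '\n');

		lines++;
		pos = nl + 1;	/* past the last newline, pos > end_pos */
	}
	return lines;
}
```

The loop exits as soon as pos moves past the final newline, which is the
case described above; without the trailing-newline guarantee, strchr()
could return NULL and the sketch would need an explicit check.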