tools/perf/util/maps.c | 113 +++++++++++++++++++++++++++++++++-------- 1 file changed, 92 insertions(+), 21 deletions(-)
Fix latent unlikely bugs in __maps__fixup_overlap_and_insert. Improve __maps__fixup_overlap_and_insert's performance 21x in the case of overlapping mmaps. sesse@google.com reported slowness opening perf.data files from chromium where the files contained a large number of overlapping mappings. Improve this case primarily by avoiding unnecessary sorting. Unscientific timing data processing a perf.data file with overlapping mmap events from chromium: Before: real 0m9.856s user 0m9.637s sys 0m0.204s After: real 0m0.675s user 0m0.454s sys 0m0.196s Tested with address/leak sanitizer, invariant checks and validating the before and after output are identical. Ian Rogers (3): perf maps: Fix use after free in __maps__fixup_overlap_and_insert perf maps: Reduce sorting for overlapping mappings perf maps: Add/use a sorted insert for fixup overlap and insert tools/perf/util/maps.c | 113 +++++++++++++++++++++++++++++++++-------- 1 file changed, 92 insertions(+), 21 deletions(-) -- 2.45.0.rc1.225.g2a3ae87e7f-goog
On Tue, 21 May 2024 09:51:06 -0700, Ian Rogers wrote: > Fix latent unlikely bugs in __maps__fixup_overlap_and_insert. > > Improve __maps__fixup_overlap_and_insert's performance 21x in the case > of overlapping mmaps. sesse@google.com reported slowness opening > perf.data files from chromium where the files contained a large number > of overlapping mappings. Improve this case primarily by avoiding > unnecessary sorting. > > [...] Applied to perf-tools-next, thanks! Best regards, Namhyung
On 21/05/2024 17:51, Ian Rogers wrote: > Fix latent unlikely bugs in __maps__fixup_overlap_and_insert. > > Improve __maps__fixup_overlap_and_insert's performance 21x in the case > of overlapping mmaps. sesse@google.com reported slowness opening > perf.data files from chromium where the files contained a large number > of overlapping mappings. Improve this case primarily by avoiding > unnecessary sorting. > > Unscientific timing data processing a perf.data file with overlapping > mmap events from chromium: > > Before: > real 0m9.856s > user 0m9.637s > sys 0m0.204s > > After: > real 0m0.675s > user 0m0.454s > sys 0m0.196s > > Tested with address/leak sanitizer, invariant checks and validating > the before and after output are identical. > > Ian Rogers (3): > perf maps: Fix use after free in __maps__fixup_overlap_and_insert > perf maps: Reduce sorting for overlapping mappings > perf maps: Add/use a sorted insert for fixup overlap and insert > > tools/perf/util/maps.c | 113 +++++++++++++++++++++++++++++++++-------- > 1 file changed, 92 insertions(+), 21 deletions(-) > Reviewed-by: James Clark <james.clark@arm.com> I'm wondering if there is any point in the non sorted insert any more? Maps could be always either sorted by name or sorted by address and insert() is always a sorted/fixup-overlaps insert depending on the sort style of the list.
On Thu, Jun 6, 2024 at 3:56 AM James Clark <james.clark@arm.com> wrote: > > > > On 21/05/2024 17:51, Ian Rogers wrote: > > Fix latent unlikely bugs in __maps__fixup_overlap_and_insert. > > > > Improve __maps__fixup_overlap_and_insert's performance 21x in the case > > of overlapping mmaps. sesse@google.com reported slowness opening > > perf.data files from chromium where the files contained a large number > > of overlapping mappings. Improve this case primarily by avoiding > > unnecessary sorting. > > > > Unscientific timing data processing a perf.data file with overlapping > > mmap events from chromium: > > > > Before: > > real 0m9.856s > > user 0m9.637s > > sys 0m0.204s > > > > After: > > real 0m0.675s > > user 0m0.454s > > sys 0m0.196s > > > > Tested with address/leak sanitizer, invariant checks and validating > > the before and after output are identical. > > > > Ian Rogers (3): > > perf maps: Fix use after free in __maps__fixup_overlap_and_insert > > perf maps: Reduce sorting for overlapping mappings > > perf maps: Add/use a sorted insert for fixup overlap and insert > > > > tools/perf/util/maps.c | 113 +++++++++++++++++++++++++++++++++-------- > > 1 file changed, 92 insertions(+), 21 deletions(-) > > > > Reviewed-by: James Clark <james.clark@arm.com> > > I'm wondering if there is any point in the non sorted insert any more? > > Maps could be always either sorted by name or sorted by address and > insert() is always a sorted/fixup-overlaps insert depending on the sort > style of the list. One of the motivations for the sorted array, instead of an rbtree, was to simplify reference count checking. We're in much better shape with that these days, I think the worst "memory leak" is the dsos holding onto a dso and its symbols indefinitely, instead of removing older unused dsos. It'd be hard to go back to an rb tree with reference counting. For the rbtree insert it was O(log N), the sorted insert these changes add is O(N) and the regular insert without sorting is O(1). A common case from scanning /proc/pid/maps is that when map entries are inserted they are inserted in order. The O(1) insert checks that and maintains that the maps are sorted. https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/util/maps.c?h=perf-tools-next#n469 For the synthesized maps we should be able to insert N map entries in O(N). The sorted insert would be similar as the memmoves would always copy 0 elements. So I can see an argument for keeping the array always sorted. For the perf.data files we commonly process synthesized mmaps dominate. In the case for this patch, JIT data dominated, with frequent overlapping mapping changes. I guess the biggest hurdle to just always keeping things sorted is the mental hurdle of ignoring the worst insert complexity, which should never apply given how the maps are commonly built. Thanks, Ian
On Tue, May 21, 2024 at 9:51 AM Ian Rogers <irogers@google.com> wrote: > > Fix latent unlikely bugs in __maps__fixup_overlap_and_insert. > > Improve __maps__fixup_overlap_and_insert's performance 21x in the case > of overlapping mmaps. sesse@google.com reported slowness opening > perf.data files from chromium where the files contained a large number > of overlapping mappings. Improve this case primarily by avoiding > unnecessary sorting. > > Unscientific timing data processing a perf.data file with overlapping > mmap events from chromium: > > Before: > real 0m9.856s > user 0m9.637s > sys 0m0.204s > > After: > real 0m0.675s > user 0m0.454s > sys 0m0.196s > > Tested with address/leak sanitizer, invariant checks and validating > the before and after output are identical. > > Ian Rogers (3): > perf maps: Fix use after free in __maps__fixup_overlap_and_insert > perf maps: Reduce sorting for overlapping mappings > perf maps: Add/use a sorted insert for fixup overlap and insert Ping. Thanks, Ian > tools/perf/util/maps.c | 113 +++++++++++++++++++++++++++++++++-------- > 1 file changed, 92 insertions(+), 21 deletions(-) > > -- > 2.45.0.rc1.225.g2a3ae87e7f-goog >
© 2016 - 2026 Red Hat, Inc.