include/linux/maple_tree.h | 4 + lib/maple_tree.c | 89 +++++++++++++--------- tools/testing/radix-tree/maple.c | 125 +++++++++++++++++++++++++++++-- 3 files changed, 176 insertions(+), 42 deletions(-)
================ overview ======================== Currently, the maple tree preallocates the worst case number of nodes for given store type by taking into account the whole height of the tree. This comes from a worst case scenario of every node in the tree being full and having to propagate node allocation upwards until we reach the root of the tree. This can be optimized if there are vacancies in nodes that are at a lower depth than the root node. This series implements tracking the level at which there is a vacant node so we only need to allocate until this level is reached, rather than always using the full height of the tree. The ma_wr_state struct is modified to add a field which keeps track of the vacant height and is updated during walks of the tree. This value is then read in mas_prealloc_calc() when we decide how many nodes to allocate. For rebalancing stores, we also need to track the lowest height at which a node has 1 more entry than the minimum sufficient number of entries. This is because rebalancing can cause a parent node to become insufficient which results in further node allocations. In this case, we need to use the sufficient height as the worst case rather than the vacant height. patch 1-2: preparatory patches patch 3: implement vacant height tracking + update the tests patch 4: support vacant height tracking for rebalacning writes patch 5: implement sufficient height tracking ================ results ========================= Bpftrace was used to profile the allocation path for requesting new maple nodes while running the ./mmap1_processes test from mmtests. The two paths for allocation are requests for a single node and the bulk allocation path. The histogram represents the number of calls to these paths and a shows the distribution of the number of nodes requested for the bulk allocation path. mm-unstable 11/13/24 @bulk_alloc_req: [2, 4) 10 |@@@@@@@@@@@@@ | [4, 8) 38 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@| [8, 16) 19 |@@@@@@@@@@@@@@@@@@@@@@@@@@ | mm-unstable 11/13/24 + this series @bulk_alloc_req: [2, 4) 9 |@@@@@@@@@@ | [4, 8) 43 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@| [8, 16) 15 |@@@@@@@@@@@@@@@@@@ | We can see the worst case bulk allocations of [8,16) nodes are reduced after this series. Sidhartha Kumar (5): maple_tree: convert mas_prealloc_calc() to take in a maple write state maple_tree: use height and depth consistently maple_tree: use vacant nodes to reduce worst case allocations maple_tree: break on convergence in mas_spanning_rebalance() maple_tree: add sufficient height include/linux/maple_tree.h | 4 + lib/maple_tree.c | 89 +++++++++++++--------- tools/testing/radix-tree/maple.c | 125 +++++++++++++++++++++++++++++-- 3 files changed, 176 insertions(+), 42 deletions(-) -- 2.43.0
On 11/14/24 12:05 PM, Sidhartha Kumar wrote: > ================ overview ======================== > Currently, the maple tree preallocates the worst case number of nodes for > given store type by taking into account the whole height of the tree. This > comes from a worst case scenario of every node in the tree being full and > having to propagate node allocation upwards until we reach the root of the > tree. This can be optimized if there are vacancies in nodes that are at a > lower depth than the root node. This series implements tracking the level > at which there is a vacant node so we only need to allocate until this > level is reached, rather than always using the full height of the tree. > The ma_wr_state struct is modified to add a field which keeps track of the > vacant height and is updated during walks of the tree. This value is then > read in mas_prealloc_calc() when we decide how many nodes to allocate. > > For rebalancing stores, we also need to track the lowest height at which > a node has 1 more entry than the minimum sufficient number of entries. > This is because rebalancing can cause a parent node to become insufficient > which results in further node allocations. In this case, we need to use > the sufficient height as the worst case rather than the vacant height. > > patch 1-2: preparatory patches > patch 3: implement vacant height tracking + update the tests > patch 4: support vacant height tracking for rebalacning writes > patch 5: implement sufficient height tracking > > ================ results ========================= > Bpftrace was used to profile the allocation path for requesting new maple > nodes while running the ./mmap1_processes test from mmtests. The two paths > for allocation are requests for a single node and the bulk allocation path. > The histogram represents the number of calls to these paths and a shows the > distribution of the number of nodes requested for the bulk allocation path. > > > mm-unstable 11/13/24 > @bulk_alloc_req: > [2, 4) 10 |@@@@@@@@@@@@@ | > [4, 8) 38 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@| > [8, 16) 19 |@@@@@@@@@@@@@@@@@@@@@@@@@@ | > > > mm-unstable 11/13/24 + this series > @bulk_alloc_req: > [2, 4) 9 |@@@@@@@@@@ | > [4, 8) 43 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@| > [8, 16) 15 |@@@@@@@@@@@@@@@@@@ | > > We can see the worst case bulk allocations of [8,16) nodes are reduced after > this series. From running the ./malloc1_threads test case we eliminate almost all bulk allocation requests that fall between 8 and 16 nodes ./malloc1_threads -t 8 -s 100 mm-unstable + this series @bulk_alloc_req: [2, 4) 2 | | [4, 8) 3381 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@| [8, 16) 2 | | mm-unstable @bulk_alloc_req: [2, 4) 1 | | [4, 8) 1427 |@@@@@@@@@@@@@@@@@@@@@@@@@@ | [8, 16) 2790 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@| > > Sidhartha Kumar (5): > maple_tree: convert mas_prealloc_calc() to take in a maple write state > maple_tree: use height and depth consistently > maple_tree: use vacant nodes to reduce worst case allocations > maple_tree: break on convergence in mas_spanning_rebalance() > maple_tree: add sufficient height > > include/linux/maple_tree.h | 4 + > lib/maple_tree.c | 89 +++++++++++++--------- > tools/testing/radix-tree/maple.c | 125 +++++++++++++++++++++++++++++-- > 3 files changed, 176 insertions(+), 42 deletions(-) >
On Thu, Nov 14, 2024 at 04:39:00PM -0500, Sid Kumar wrote: > >On 11/14/24 12:05 PM, Sidhartha Kumar wrote: [...] >> ================ results ========================= >> Bpftrace was used to profile the allocation path for requesting new maple >> nodes while running the ./mmap1_processes test from mmtests. The two paths >> for allocation are requests for a single node and the bulk allocation path. >> The histogram represents the number of calls to these paths and a shows the >> distribution of the number of nodes requested for the bulk allocation path. >> >> >> mm-unstable 11/13/24 >> @bulk_alloc_req: >> [2, 4) 10 |@@@@@@@@@@@@@ | >> [4, 8) 38 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@| >> [8, 16) 19 |@@@@@@@@@@@@@@@@@@@@@@@@@@ | >> >> >> mm-unstable 11/13/24 + this series >> @bulk_alloc_req: >> [2, 4) 9 |@@@@@@@@@@ | >> [4, 8) 43 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@| >> [8, 16) 15 |@@@@@@@@@@@@@@@@@@ | >> >> We can see the worst case bulk allocations of [8,16) nodes are reduced after >> this series. > >From running the ./malloc1_threads test case we eliminate almost all bulk >allocation requests that > >fall between 8 and 16 nodes > >./malloc1_threads -t 8 -s 100 >mm-unstable + this series >@bulk_alloc_req: >[2, 4) 2 | >| >[4, 8) 3381 >|@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@| >[8, 16) 2 | >| > This is impressive. But I come up one thing not clear. For mmap related code, we usually have the following usage: vma_iter_prealloc(vmi, vma); mas_preallocate(vmi->mas, vma); MA_WR_STATE(wr_mas, ); mas_wr_store_type(&wr_mas); --- (1) vma_iter_store(vmi, vma); Locaton (1) is where we try to get a better estimation of allocations. The estimation is based on we walk down the tree and try to meet a proper node. In mmap related code, we usually have already walked down the tree to leaf, by vma_find() or related iteration operation, and the mas.status is set to ma_active. To me, I don't expect mas_preallocate() would traverse the tree again. But from your result, it seems most cases do traverse the tree again to get a more precise height. Which part do you think I have missed? > >mm-unstable >@bulk_alloc_req: >[2, 4) 1 | >| >[4, 8) 1427 |@@@@@@@@@@@@@@@@@@@@@@@@@@ >| >[8, 16) 2790 >|@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@| > > >> >> Sidhartha Kumar (5): >> maple_tree: convert mas_prealloc_calc() to take in a maple write state >> maple_tree: use height and depth consistently >> maple_tree: use vacant nodes to reduce worst case allocations >> maple_tree: break on convergence in mas_spanning_rebalance() >> maple_tree: add sufficient height >> >> include/linux/maple_tree.h | 4 + >> lib/maple_tree.c | 89 +++++++++++++--------- >> tools/testing/radix-tree/maple.c | 125 +++++++++++++++++++++++++++++-- >> 3 files changed, 176 insertions(+), 42 deletions(-) >> -- Wei Yang Help you, Help me
© 2016 - 2024 Red Hat, Inc.