[RFC PATCH -next 0/7] Introduce heat-level memcg reclaim

Chen Ridong posted 7 patches 2 weeks, 5 days ago
From: Chen Ridong <chenridong@huawei.com>

The memcg LRU was originally introduced to improve scalability during
global reclaim, but it only supports multi-gen LRU (MGLRU) global reclaim
and its implementation has become complex. Moreover, it has caused
performance regressions when dealing with a large number of memory
cgroups [1].

Previous attempts to remove the memcg LRU by switching back to an
iteration-based implementation caused performance regressions [3].

This series introduces a per-memcg heat level mechanism for reclaim,
aiming to unify MGLRU and traditional LRU global reclaim. The core
idea is to track per-node, per-memcg reclaim state, including heat,
last_decay, and last_refault. Three reclaim heat levels are defined:
cold, warm, and hot. Cold memcgs are reclaimed first; warm memcgs
become eligible only if reclaiming from cold memcgs does not free
enough pages. Hot memcgs are reclaimed last.
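
The cold-then-warm-then-hot ordering can be illustrated with a small
userspace sketch. It is not the code from this series: the struct name,
the `reclaimable` field, the thresholds, and both helper functions are
hypothetical, and the sketch does not model how heat is raised or decayed
(last_decay and last_refault are carried only to mirror the state listed
above).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical heat levels, ordered from first-reclaimed to last. */
enum memcg_heat_level { MEMCG_COLD, MEMCG_WARM, MEMCG_HOT, MEMCG_NR_LEVELS };

/* Illustrative per-node, per-memcg reclaim state. */
struct memcg_reclaim_state {
	unsigned int heat;          /* current heat value */
	unsigned long last_decay;   /* not used by this sketch */
	unsigned long last_refault; /* not used by this sketch */
	unsigned long reclaimable;  /* stand-in for pages that can be freed */
};

/* Assumed thresholds; the real series may classify differently. */
#define HEAT_WARM_THRESHOLD 32u
#define HEAT_HOT_THRESHOLD  128u

static enum memcg_heat_level heat_level(const struct memcg_reclaim_state *s)
{
	if (s->heat >= HEAT_HOT_THRESHOLD)
		return MEMCG_HOT;
	if (s->heat >= HEAT_WARM_THRESHOLD)
		return MEMCG_WARM;
	return MEMCG_COLD;
}

/*
 * Make one pass per heat level, coldest first, and stop as soon as
 * nr_to_reclaim pages have been freed. Returns the number reclaimed.
 */
static unsigned long reclaim_by_heat(struct memcg_reclaim_state *memcgs,
				     size_t nr, unsigned long nr_to_reclaim)
{
	unsigned long reclaimed = 0;
	int level;
	size_t i;

	assert(memcgs != NULL || nr == 0);

	for (level = MEMCG_COLD; level < MEMCG_NR_LEVELS; level++) {
		for (i = 0; i < nr && reclaimed < nr_to_reclaim; i++) {
			unsigned long want, got;

			if (heat_level(&memcgs[i]) != (enum memcg_heat_level)level)
				continue;

			want = nr_to_reclaim - reclaimed;
			got = memcgs[i].reclaimable < want ?
			      memcgs[i].reclaimable : want;
			memcgs[i].reclaimable -= got;
			reclaimed += got;
		}
		if (reclaimed >= nr_to_reclaim)
			break;
	}
	return reclaimed;
}
```

With one hot, one cold, and one warm memcg, a request is served from the
cold memcg first, then from the warm one, and the hot memcg is touched
only if the first two passes fall short.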

While the heat level design can be applied to all memcg reclaim scenarios,
this series takes a conservative approach and initially applies it only
to global reclaim. The first few patches introduce the heat level
infrastructure and apply it to traditional LRU global reclaim. The
subsequent patches gradually migrate MGLRU global reclaim to the
heat-level-based approach, combine shrink_many() into
shrink_node_memcgs(), and finally remove the memcg LRU to complete the
transition.

Performance results show throughput improvements of about 35% for the
traditional LRU and about 10% for MGLRU:

Traditional LRU results (2-hour run of test [2]):
Throughput (number of requests)         before     after        Change
Total                                   1,734,169  2,353,717    +35%

MGLRU results (24-hour run of test [2]):
Throughput (number of requests)         before     after        Change
Total                                   22,879,701 25,331,956   +10%

The performance tests were run on the following linux-next commit:
commit ef0d146624b0 ("Add linux-next specific files for 20251219")

This series has been rebased on next-20260119:
commit d08c85ac8894 ("Add linux-next specific files for 20260119")

[1] https://lore.kernel.org/r/20251126171513.GC135004@cmpxchg.org
[2] https://lore.kernel.org/r/20221222041905.2431096-7-yuzhao@google.com
[3] https://lore.kernel.org/lkml/20251224073032.161911-1-chenridong@huaweicloud.com/

Chen Ridong (7):
  vmscan: add memcg heat level for reclaim
  mm/mglru: make calls to flush_reclaim_state() similar for MGLRU and
    non-MGLRU
  mm/mglru: rename should_abort_scan to lru_gen_should_abort_scan
  mm/mglru: extend lru_gen_shrink_lruvec to support root reclaim
  mm/mglru: combine shrink_many into shrink_node_memcgs
  mm/mglru: remove memcg disable handling from lru_gen_shrink_node
  mm/mglru: remove memcg lru

 Documentation/mm/multigen_lru.rst |  30 --
 include/linux/memcontrol.h        |   7 +
 include/linux/mmzone.h            |  89 -----
 mm/memcontrol-v1.c                |   6 -
 mm/memcontrol.c                   |   7 +-
 mm/mm_init.c                      |   1 -
 mm/vmscan.c                       | 547 ++++++++++++------------------
 7 files changed, 231 insertions(+), 456 deletions(-)

-- 
2.34.1
Re: [RFC PATCH -next 0/7] Introduce heat-level memcg reclaim
Posted by Chen Ridong 1 week, 3 days ago

On 2026/1/20 21:42, Chen Ridong wrote:
> [...]

Hi, Johannes and Shakeel,

I would appreciate it if you could share your thoughts on this series.

-- 
Best regards,
Ridong