[v2] mm/damon: Support hot application detections

[RFC PATCH v2 0/4] mm/damon: Support hot application detections

Posted by gutierrez.asier@huawei-partners.com 3 weeks, 1 day ago

From: Asier Gutierrez <gutierrez.asier@huawei-partners.com>

Overview
----------

This patch set introduces a new dynamic mechanism for detecting hot applications
and hot regions in those applications.

Motivation
-----------

Since the TLB is a bottleneck on many systems, one way to reduce TLB misses is
to use huge pages. Unfortunately, setting THP to "always" leads to memory
fragmentation and memory waste. For this reason, most application guides and
system administrators suggest disabling THP.


Solution
-----------

A new Linux kernel module that uses DAMON to detect hot regions and collapse
those regions into huge pages. The user supplies a set of PIDs using a module
parameter, and then, the module launches a new kdamond thread to monitor each
of the tasks.

In each kdamond, we start with a high min_access value. Our goal is to find the
"maximum" min_access value at which point the DAMON action is applied. In each
cycle, if no action is applied, we lower the min_access.
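
The descent described above can be illustrated with a small user-space
simulation. This is only a sketch of the idea, not the module's code: the names
tune_min_access and make_fake_scheme, the fixed step, and the per-region access
counts are all hypothetical and chosen for illustration.

```python
from typing import Callable, List, Optional


def tune_min_access(start: int, floor: int, step: int,
                    apply_scheme: Callable[[int], int]) -> Optional[int]:
    """Lower the threshold from `start` until the scheme applies somewhere.

    `apply_scheme(min_access)` stands in for one DAMON cycle and returns how
    many regions the action was applied to (hypothetical callback).
    Returns the first (highest) threshold that triggered the action, or None
    if nothing was hot enough before reaching `floor`.
    """
    threshold = start
    while threshold >= floor:
        if apply_scheme(threshold) > 0:
            # Stop lowering as soon as something is collapsed, so colder
            # regions are never considered.
            return threshold
        threshold -= step
    return None


def make_fake_scheme(region_accesses: List[int]) -> Callable[[int], int]:
    """Build a fake scheme: a region counts as hot if accesses >= min_access."""
    def apply_scheme(min_access: int) -> int:
        return sum(1 for n in region_accesses if n >= min_access)
    return apply_scheme
```

With a step larger than one, the result is the highest stepped threshold that
triggers the action, which only approximates the true maximum.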

Regarding the action, we introduce a new action: DAMOS_COLLAPSE. This allows us
to collapse synchronously and avoid polluting khugepaged and other parts of the
MM subsystem with DAMON stuff. DAMOS_HUGEPAGE eventually calls hugepage_madvise,
which needs the correct vm_flags_t set.


-----------
Changes in v2:
- Previously there was a mechanism to automatically detect hot applications.
  Based on SeongJae Park's feedback [1], this was removed from the module, leaving
  it entirely to the user space.
- All allocations now use kzalloc_obj.
- Since user space now provides the list of pids to monitor, a commit_input
  parameter is added to allow changing the pids while the module runs.
- Renamed the module from dynamic_hugepages to hugepages

[1]: https://lore.kernel.org/all/20260211150902.70066-1-sj@kernel.org/

Asier Gutierrez (4):
  Damon_modules_new_paddr_ctx_target. This works  only for physical
    contexts. In case of virtual addresses, we should  duplicate the
    code.
  Support for huge pages collapse, which will be used by
    dynamic_hugepages module.
  This new module launches a new kdamond thread for each of them. The
    purpose is to detect hot regions in a given list of tasks and
    collapse them into huge pages.
  DAMON_HOT_HUGEPAGE documentation

 .../admin-guide/mm/damon/hugepage.rst (new)   | 186 ++++++++
 include/linux/damon.h                         |   1 +
 mm/damon/Kconfig                              |   7 +
 mm/damon/Makefile                             |   1 +
 mm/damon/hugepage.c (new)                     | 441 ++++++++++++++++++
 mm/damon/lru_sort.c                           |   5 +-
 mm/damon/modules-common.c                     |   6 +-
 mm/damon/modules-common.h                     |   4 +-
 mm/damon/reclaim.c                            |   5 +-
 mm/damon/vaddr.c                              |   3 +
 10 files changed, 650 insertions(+), 9 deletions(-)
 create mode 100644 Documentation/admin-guide/mm/damon/hugepage.rst
 create mode 100644 mm/damon/hugepage.c

-- 
2.43.0

Re: [RFC PATCH v2 0/4] mm/damon: Support hot application detections

Posted by SeongJae Park 3 weeks, 1 day ago

Hello Asier,

Thank you for continuing this work!

On Tue, 10 Mar 2026 16:24:16 +0000 <gutierrez.asier@huawei-partners.com> wrote:

> From: Asier Gutierrez <gutierrez.asier@huawei-partners.com>
> 
> Overview
> ----------

Let's make the length of the subject and the length of the underline the same.

> 
> This patch set introduces a new dynamic mechanism for detecting hot applications
> and hot regions in those applications.

It seems you now offload the hot application detection to user space.  If I'm
not wrong, you should remove "hot applications and" from the above sentence.

> 
> Motivation
> -----------
> 
> Since TLB is a bottleneck for many systems, a way to optimize TLB misses (or
> hits) is to use huge pages. Unfortunately, using "always" in THP leads to memory
> fragmentation and memory waste. For this reason, most application guides and
> system administrators suggest to disable THP.
> 
> 
> Solution
> -----------
> 
> A new Linux kernel module that uses DAMON to detect hot regions and collapse
> those regions into huge pages. The user supplies a set of PIDs using a module
> parameter,

This sounds reasonable to me.

> and then, the module launches a new kdamond thread to monitor each
> of the tasks.
> 
> In each kdamond, we start with a high min_access value. Our goal is to find the
> "maximum" min_access value at which point the DAMON action is applied. In each
> cycle, if no action is applied, we lower the min_access.

So, this patch series introduces a sort of auto-tuning of the hugepages
collapse hotness threshold, which is implemented in the new module.

We already have a sort of DAMOS auto-tuning feature, namely goal-based DAMOS
quota auto-tuning [1].  Have you considered using that?  Of course, it might
not be able to be used as is.  Some extensions, e.g., introduction of new goal
metric, may be needed.

Yet another approach would be implementing the auto-tuning in the user-space.
Because DAMON parameters can be updated online, updating the min_access from
the user space should be doable?  Given the fact the module anyway requires
user-space control for feeding the list of applications to apply access-aware
huge pages collapsing, I find no problem with user space driven auto-tuning.
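
A rough sketch of what one iteration of such a user-space tuner could look
like, assuming a kdamond with a single context and scheme is already set up
through the DAMON sysfs interface.  The helper names, the step and the floor
are hypothetical; the file names (state, nr_accesses/min, stats/nr_applied)
and the "commit" and "update_schemes_stats" keywords are the ones described in
the DAMON usage document, but please double check them against your kernel.

```python
from pathlib import Path

# Real root of the DAMON sysfs interface; pass another root for dry runs.
DAMON_ADMIN = Path("/sys/kernel/mm/damon/admin")


def scheme_dir(root: Path, kdamond: int = 0, context: int = 0,
               scheme: int = 0) -> Path:
    """Directory of one DAMOS scheme under the given DAMON sysfs root."""
    return (root / "kdamonds" / str(kdamond) / "contexts" / str(context)
            / "schemes" / str(scheme))


def read_nr_applied(root: Path, kdamond: int = 0) -> int:
    """Ask DAMON to refresh the scheme stats, then read nr_applied."""
    (root / "kdamonds" / str(kdamond) / "state").write_text(
        "update_schemes_stats")
    return int((scheme_dir(root, kdamond) / "stats" / "nr_applied").read_text())


def set_min_access(root: Path, value: int, kdamond: int = 0) -> None:
    """Write the new nr_accesses/min and commit it to the running kdamond."""
    path = scheme_dir(root, kdamond) / "access_pattern" / "nr_accesses" / "min"
    path.write_text(str(value))
    (root / "kdamonds" / str(kdamond) / "state").write_text("commit")


def tune_step(root: Path, current: int, step: int, floor: int) -> int:
    """One tuning iteration; returns the threshold to use from now on."""
    if read_nr_applied(root) > 0:
        return current          # the action was applied: stop lowering
    new = max(floor, current - step)
    if new != current:
        set_min_access(root, new)
    return new
```

A caller would run tune_step(DAMON_ADMIN, ...) periodically, e.g., once per a
few aggregation intervals.  Note that nr_applied is cumulative, so this sketch
stops lowering permanently after the first application.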

If the goal-based DAMOS quota auto-tuning or the user-space based auto-tuning
is feasible, all the controls can be done using DAMON sysfs interface.
Introduction of the new kernel module might not really be needed in that case.

We have DAMON modules in addition to DAMON sysfs interface for users who want
to use DAMON for a given specific use case with only minimum or near-zero
user-space control.  In this case, because it is already aimed to ask the
user-space to feed the list of applications to apply DAMOS-based hugepages
collapsing, it seems a new module is not really needed, to me.

But I guess your use case might have some special restrictions that really
require use of the module instead of offloading the auto-tuning to the
user-space or DAMON core.  Is that the case?  If so, can you share more details
about it?

> 
> Regarding the action, we introduce a new action: DAMOS_COLLAPSE. This allows us
> collapse synchronously and avoid polluting khugepaged and other parts of the MM
> subsystem with DAMON stuff. DAMOS_HUGEPAGE eventually calls hugepage_madvise,
> which needs the correct vm_flags_t set.

This makes sense to me.  I expect DAMOS_COLLAPSE to have some advantages over
DAMOS_HUGEPAGE for some use cases, similar to MADV_COLLAPSE vs MADV_HUGEPAGE.

From my perspective, this patch series is introducing three things.
1) hugepage collapsing hotness threshold auto-tuning, 2) the module for running
the auto-tuning, and 3) DAMOS_COLLAPSE.  To me, it is unclear if the first two
changes are really needed.  I will wait for your answer.

Meanwhile, the third change seems reasonable and does not necessarily need to
be blocked by the other two changes.  I think separating the third change from
this patch series and upstreaming it first could also be a path forward.
Because the change is simple and sound, convincing me would be easy.  I'd be
convinced if at least some reasonable test results can be shown.  I'm not
saying we should drop the other two changes.  We can keep discussing those in
parallel.  Rather, upstreaming the third change first could help finding real
benefits of the other two changes, since the testing will be easier.  The
decision is up to Asier, of course.  I'm just sharing my two cents.

> 
> 
> -----------
> Changes in v2:

Let's keep calling this "RFC" here.  When you drop the "RFC" tag, this might
confuse some people.

Also, when you add a changelog of a patch, adding a link to the previous
version [2] can help reviewing.

> - Previously there was a mechanism to automatically detect hot applications.
>   Based on SeongJae Park's feedback [1], this was removed from the module, leaving
>   it entirely to the user space.
> - All allocations now use kzalloc_obj.
> - Since the user space provides now the list of pids to monitor, a commit_input
>   parameter is added to allow changing the pids while the module runs.
> - Renamed the module from dynamic_hugepages to hugepages

Thank you for doing this, Asier.

> 
> [1]: https://lore.kernel.org/all/20260211150902.70066-1-sj@kernel.org/
> 
> Asier Gutierrez (4):
>   Damon_modules_new_paddr_ctx_target. This works  only for physical
>     contexts. In case of virtual addresses, we should  duplicate the
>     code.
>   Support for huge pages collapse, which will be used by
>     dynamic_hugepages module.
>   This new module launches a new kdamond thread for each of them. The
>     purpose is to detect hot regions in a given list of tasks and
>     collapse them into huge pages.
>   DAMON_HOT_HUGEPAGE documentation
> 
>  .../admin-guide/mm/damon/hugepage.rst (new)   | 186 ++++++++
>  include/linux/damon.h                         |   1 +
>  mm/damon/Kconfig                              |   7 +
>  mm/damon/Makefile                             |   1 +
>  mm/damon/hugepage.c (new)                     | 441 ++++++++++++++++++
>  mm/damon/lru_sort.c                           |   5 +-
>  mm/damon/modules-common.c                     |   6 +-
>  mm/damon/modules-common.h                     |   4 +-
>  mm/damon/reclaim.c                            |   5 +-
>  mm/damon/vaddr.c                              |   3 +
>  10 files changed, 650 insertions(+), 9 deletions(-)
>  create mode 100644 Documentation/admin-guide/mm/damon/hugepage.rst
>  create mode 100644 mm/damon/hugepage.c
> 
> -- 
> 2.43.0

[1] https://origin.kernel.org/doc/html/latest/mm/damon/design.html#aim-oriented-feedback-driven-auto-tuning
[2] https://docs.kernel.org/process/submitting-patches.html#commentary

Thanks,
SJ

Re: [RFC PATCH v2 0/4] mm/damon: Support hot application detections

Posted by Gutierrez Asier 3 weeks ago

Hi SeongJae,

On 3/11/2026 8:07 AM, SeongJae Park wrote:
> Hello Asier,
> 
> 
> Thank you for continuing this work!
> 
> On Tue, 10 Mar 2026 16:24:16 +0000 <gutierrez.asier@huawei-partners.com> wrote:
> 
>> From: Asier Gutierrez <gutierrez.asier@huawei-partners.com>
>>
>> Overview
>> ----------
> 
> Let's make the legnth of the subject and the length of the underline same.
> 
>>
>> This patch set introduces a new dynamic mechanism for detecting hot applications
>> and hot regions in those applications.
> 
> Seems now you offload the hot applications detection to the user space.  If I'm
> not wrong, you should remove "hot applications and" on the above sentence.

You're right. I was not sure whether changing the RFC subject was right or not.
I will change it for the next RFC version.

>>
>> Motivation
>> -----------
>>
>> Since TLB is a bottleneck for many systems, a way to optimize TLB misses (or
>> hits) is to use huge pages. Unfortunately, using "always" in THP leads to memory
>> fragmentation and memory waste. For this reason, most application guides and
>> system administrators suggest to disable THP.
>>
>>
>> Solution
>> -----------
>>
>> A new Linux kernel module that uses DAMON to detect hot regions and collapse
>> those regions into huge pages. The user supplies a set of PIDs using a module
>> parameter,
> 
> This sounds reasonable to me.
> 
>> and then, the module launches a new kdamond thread to monitor each
>> of the tasks.
>>
>> In each kdamond, we start with a high min_access value. Our goal is to find the
>> "maximum" min_access value at which point the DAMON action is applied. In each
>> cycle, if no action is applied, we lower the min_access.
> 
> So, this patch series introduces a sort of auto-tuning of the hugepages
> collapse hotness threshold, that implemented in the new module.
> 
> We already have a sort of DAMOS auto-tuning feature, namely goal-based DAMOS
> quota auto-tuning [1].  Have you considered using that?  Of course, it might
> not be able to be used as is.  Some extensions, e.g., introduction of new goal
> metric, may be needed.
> 
> Yet another approach would be implementing the auto-tuning in the user-space.
> Because DAMON parameters can be updated online, updating the min_access from
> the user space should be doable?  Given the fact the module anyway require
> user-space control for feeding the list of applications to apply access-aware
> huge pages collapsing, I find no problem at user space driven auto-tuning.
> 
> If the goal-based DAMOS quota auto-tuning or the user-space based auto-tuning
> are feasible, all the controls can be done using DAMON sysfs interface.
> Introduction of the new kernel module might not really be needed in the case.
> 
> We have DAMON modules in addition to DAMON sysfs interface for users who want
> to use DAMON for a given specific use case with only minimum or near-zero
> user-space control.  In this case, because it is already aimed to ask the
> user-space to feed the list of applications to apply DAMOS-based hugepages
> collapsing, it seems a new module is not really needed, to me.
> 
> But I guess your use case might have some special restrictions that really
> require use of the module instead of offloading the auto-tuning to the
> user-space or DAMON core.  Is that the case?  If so, can you share more details
> about it?

I haven't figured out how I can use goal auto-tuning to change the min_access.
Your suggestion about moving this to the user space sounds good.

The idea was to stop lowering the min_access as soon as collapses occur,
since we don't want to lower so much that we start collapsing regions that
are not very hot.

Maybe you can suggest a better way to do it. Maybe with autotuning.

> 
>>
>> Regarding the action, we introduce a new action: DAMOS_COLLAPSE. This allows us
>> collapse synchronously and avoid polluting khugepaged and other parts of the MM
>> subsystem with DAMON stuff. DAMOS_HUGEPAGE eventually calls hugepage_madvise,
>> which needs the correct vm_flags_t set.
> 
> This makes sense to me.  I expect DAMOS_COLLAPSE to have some advantages over
> DAMOS_HUGEPAGE for some use cases, similar to MADV_COLLAPSE vs MADV_HUGEPAGE.
> 
> From my perspective, this patch series is introducing three things.
> 1) hugepage collapsing hotness threshold auto-tuning, 2) the module for running
> the auto-tuning, and 3) DAMOS_COLLAPSE.  To me, it is unclear if the first two
> changes are really needed.  I will wait your answer.
> 
> Meanwhile, the third change seems reasonable and not necessarily need to be
> blocked for the other two changes.  I think separating the third change from
> this patch series and upstreaming it first could also be a path forward.
> Because the change is simple and sound, convincing me would be easy.  I'd be
> convinced if at least some reasonable test results can be shown.  I'm not
> saying we should drop the other two changes.  We can keep discussing those in
> parallel.  Rather, upstreaming the third change first could help finding real
> benefits of the other two changes, since the testing will be easier.  The
> decision is up to Asier, of course.  I'm just sharing my two cents.
> 
>>
>>
>> -----------
>> Changes in v2:
> 
> Let's keep calling this "RFC" here.  When you drop the "RFC" tag, this might
> confuse some people.
> 
> Also, when you add a changelog of a patch, adding a link to the previous
> version [2] can help reviewing.

Will do it.

> 
>> - Previously there was a mechanism to automatically detect hot applications.
>>   Based on SeongJae Park's feedback [1], this was removed from the module, leaving
>>   it entirely to the user space.
>> - All allocations now use kzalloc_obj.
>> - Since the user space provides now the list of pids to monitor, a commit_input
>>   parameter is added to allow changing the pids while the module runs.
>> - Renamed the module from dynamic_hugepages to hugepages
> 
> Thank you for doing this, Asier.
> 
>>
>> [1]: https://lore.kernel.org/all/20260211150902.70066-1-sj@kernel.org/
>>
>> Asier Gutierrez (4):
>>   Damon_modules_new_paddr_ctx_target. This works  only for physical
>>     contexts. In case of virtual addresses, we should  duplicate the
>>     code.
>>   Support for huge pages collapse, which will be used by
>>     dynamic_hugepages module.
>>   This new module launches a new kdamond thread for each of them. The
>>     purpose is to detect hot regions in a given list of tasks and
>>     collapse them into huge pages.
>>   DAMON_HOT_HUGEPAGE documentation
>>
>>  .../admin-guide/mm/damon/hugepage.rst (new)   | 186 ++++++++
>>  include/linux/damon.h                         |   1 +
>>  mm/damon/Kconfig                              |   7 +
>>  mm/damon/Makefile                             |   1 +
>>  mm/damon/hugepage.c (new)                     | 441 ++++++++++++++++++
>>  mm/damon/lru_sort.c                           |   5 +-
>>  mm/damon/modules-common.c                     |   6 +-
>>  mm/damon/modules-common.h                     |   4 +-
>>  mm/damon/reclaim.c                            |   5 +-
>>  mm/damon/vaddr.c                              |   3 +
>>  10 files changed, 650 insertions(+), 9 deletions(-)
>>  create mode 100644 Documentation/admin-guide/mm/damon/hugepage.rst
>>  create mode 100644 mm/damon/hugepage.c
>>
>> -- 
>> 2.43.0
> 
> [1] https://origin.kernel.org/doc/html/latest/mm/damon/design.html#aim-oriented-feedback-driven-auto-tuning
> [2] https://docs.kernel.org/process/submitting-patches.html#commentary
> 
> 
> Thanks,
> SJ
> 

-- 
Asier Gutierrez
Huawei

Re: [RFC PATCH v2 0/4] mm/damon: Support hot application detections

Posted by SeongJae Park 3 weeks ago

On Wed, 11 Mar 2026 16:08:56 +0300 Gutierrez Asier <gutierrez.asier@huawei-partners.com> wrote:

> Hi SeongJae,
> 
> On 3/11/2026 8:07 AM, SeongJae Park wrote:
> > Hello Asier,
> > 
> > 
> > Thank you for continuing this work!
> > 
> > On Tue, 10 Mar 2026 16:24:16 +0000 <gutierrez.asier@huawei-partners.com> wrote:
> > 
> >> From: Asier Gutierrez <gutierrez.asier@huawei-partners.com>
> >>
> >> Overview
> >> ----------
> > 
> > Let's make the legnth of the subject and the length of the underline same.
> > 
> >>
> >> This patch set introduces a new dynamic mechanism for detecting hot applications
> >> and hot regions in those applications.
> > 
> > Seems now you offload the hot applications detection to the user space.  If I'm
> > not wrong, you should remove "hot applications and" on the above sentence.
> 
> You're right. I was not sure whether changing the RFC subject was right or not.
> I will change it for the next RFC version.

It's fine to change the subject.  Please feel free to do so in the next version
:)

> 
> >>
> >> Motivation
> >> -----------
> >>
> >> Since TLB is a bottleneck for many systems, a way to optimize TLB misses (or
> >> hits) is to use huge pages. Unfortunately, using "always" in THP leads to memory
> >> fragmentation and memory waste. For this reason, most application guides and
> >> system administrators suggest to disable THP.
> >>
> >>
> >> Solution
> >> -----------
> >>
> >> A new Linux kernel module that uses DAMON to detect hot regions and collapse
> >> those regions into huge pages. The user supplies a set of PIDs using a module
> >> parameter,
> > 
> > This sounds reasonable to me.
> > 
> >> and then, the module launches a new kdamond thread to monitor each
> >> of the tasks.
> >>
> >> In each kdamond, we start with a high min_access value. Our goal is to find the
> >> "maximum" min_access value at which point the DAMON action is applied. In each
> >> cycle, if no action is applied, we lower the min_access.
> > 
> > So, this patch series introduces a sort of auto-tuning of the hugepages
> > collapse hotness threshold, that implemented in the new module.
> > 
> > We already have a sort of DAMOS auto-tuning feature, namely goal-based DAMOS
> > quota auto-tuning [1].  Have you considered using that?  Of course, it might
> > not be able to be used as is.  Some extensions, e.g., introduction of new goal
> > metric, may be needed.
> > 
> > Yet another approach would be implementing the auto-tuning in the user-space.
> > Because DAMON parameters can be updated online, updating the min_access from
> > the user space should be doable?  Given the fact the module anyway require
> > user-space control for feeding the list of applications to apply access-aware
> > huge pages collapsing, I find no problem at user space driven auto-tuning.
> > 
> > If the goal-based DAMOS quota auto-tuning or the user-space based auto-tuning
> > are feasible, all the controls can be done using DAMON sysfs interface.
> > Introduction of the new kernel module might not really be needed in the case.
> > 
> > We have DAMON modules in addition to DAMON sysfs interface for users who want
> > to use DAMON for a given specific use case with only minimum or near-zero
> > user-space control.  In this case, because it is already aimed to ask the
> > user-space to feed the list of applications to apply DAMOS-based hugepages
> > collapsing, it seems a new module is not really needed, to me.
> > 
> > But I guess your use case might have some special restrictions that really
> > require use of the module instead of offloading the auto-tuning to the
> > user-space or DAMON core.  Is that the case?  If so, can you share more details
> > about it?
> 
> I haven't figured out how I can use goal autotune to change the min_access.

Indeed, it is not a very straightforward feature.

> Your suggestion about moving this to the user space sound good.

If it works for you, maybe that is best for you :)

> 
> The idea was to stop lowering the min_access as soon as collapses occur,
> since we don't want to lower so much that we start collapsing regions that
> are not very hot.
> 
> Maybe you can suggest a better way to do it. Maybe with autotuning.

I will add a more detailed suggestion soon, by tomorrow or the day after.

> 
> > 
> >>
> >> Regarding the action, we introduce a new action: DAMOS_COLLAPSE. This allows us
> >> collapse synchronously and avoid polluting khugepaged and other parts of the MM
> >> subsystem with DAMON stuff. DAMOS_HUGEPAGE eventually calls hugepage_madvise,
> >> which needs the correct vm_flags_t set.
> > 
> > This makes sense to me.  I expect DAMOS_COLLAPSE to have some advantages over
> > DAMOS_HUGEPAGE for some use cases, similar to MADV_COLLAPSE vs MADV_HUGEPAGE.
> > 
> > From my perspective, this patch series is introducing three things.
> > 1) hugepage collapsing hotness threshold auto-tuning, 2) the module for running
> > the auto-tuning, and 3) DAMOS_COLLAPSE.  To me, it is unclear if the first two
> > changes are really needed.  I will wait your answer.

Please answer the above questions when you get a chance.

> > 
> > Meanwhile, the third change seems reasonable and not necessarily need to be
> > blocked for the other two changes.  I think separating the third change from
> > this patch series and upstreaming it first could also be a path forward.
> > Because the change is simple and sound, convincing me would be easy.  I'd be
> > convinced if at least some reasonable test results can be shown.  I'm not
> > saying we should drop the other two changes.  We can keep discussing those in
> > parallel.  Rather, upstreaming the third change first could help finding real
> > benefits of the other two changes, since the testing will be easier.  The
> > decision is up to Asier, of course.  I'm just sharing my two cents.

I'm also curious what you think about this.


Thanks,
SJ

[...]

Re: [RFC PATCH v2 0/4] mm/damon: Support hot application detections

Posted by Gutierrez Asier 2 weeks, 6 days ago


On 3/11/2026 5:39 PM, SeongJae Park wrote:
> On Wed, 11 Mar 2026 16:08:56 +0300 Gutierrez Asier <gutierrez.asier@huawei-partners.com> wrote:
> 
>> Hi SeongJae,
>>
>> On 3/11/2026 8:07 AM, SeongJae Park wrote:
>>> Hello Asier,
>>>
>>>
>>> Thank you for continuing this work!
>>>
>>> On Tue, 10 Mar 2026 16:24:16 +0000 <gutierrez.asier@huawei-partners.com> wrote:
>>>
>>>> From: Asier Gutierrez <gutierrez.asier@huawei-partners.com>
>>>>
>>>> Overview
>>>> ----------
>>>
>>> Let's make the legnth of the subject and the length of the underline same.
>>>
>>>>
>>>> This patch set introduces a new dynamic mechanism for detecting hot applications
>>>> and hot regions in those applications.
>>>
>>> Seems now you offload the hot applications detection to the user space.  If I'm
>>> not wrong, you should remove "hot applications and" on the above sentence.
>>
>> You're right. I was not sure whether changing the RFC subject was right or not.
>> I will change it for the next RFC version.
> 
> It's fine to change the subject.  Please feel free to do so in the next version
> :)
> 
>>
>>>>
>>>> Motivation
>>>> -----------
>>>>
>>>> Since TLB is a bottleneck for many systems, a way to optimize TLB misses (or
>>>> hits) is to use huge pages. Unfortunately, using "always" in THP leads to memory
>>>> fragmentation and memory waste. For this reason, most application guides and
>>>> system administrators suggest to disable THP.
>>>>
>>>>
>>>> Solution
>>>> -----------
>>>>
>>>> A new Linux kernel module that uses DAMON to detect hot regions and collapse
>>>> those regions into huge pages. The user supplies a set of PIDs using a module
>>>> parameter,
>>>
>>> This sounds reasonable to me.
>>>
>>>> and then, the module launches a new kdamond thread to monitor each
>>>> of the tasks.
>>>>
>>>> In each kdamond, we start with a high min_access value. Our goal is to find the
>>>> "maximum" min_access value at which point the DAMON action is applied. In each
>>>> cycle, if no action is applied, we lower the min_access.
>>>
>>> So, this patch series introduces a sort of auto-tuning of the hugepages
>>> collapse hotness threshold, that implemented in the new module.
>>>
>>> We already have a sort of DAMOS auto-tuning feature, namely goal-based DAMOS
>>> quota auto-tuning [1].  Have you considered using that?  Of course, it might
>>> not be able to be used as is.  Some extensions, e.g., introduction of new goal
>>> metric, may be needed.
>>>
>>> Yet another approach would be implementing the auto-tuning in the user-space.
>>> Because DAMON parameters can be updated online, updating the min_access from
>>> the user space should be doable?  Given the fact the module anyway require
>>> user-space control for feeding the list of applications to apply access-aware
>>> huge pages collapsing, I find no problem at user space driven auto-tuning.
>>>
>>> If the goal-based DAMOS quota auto-tuning or the user-space based auto-tuning
>>> are feasible, all the controls can be done using DAMON sysfs interface.
>>> Introduction of the new kernel module might not really be needed in the case.
>>>
>>> We have DAMON modules in addition to DAMON sysfs interface for users who want
>>> to use DAMON for a given specific use case with only minimum or near-zero
>>> user-space control.  In this case, because it is already aimed to ask the
>>> user-space to feed the list of applications to apply DAMOS-based hugepages
>>> collapsing, it seems a new module is not really needed, to me.
>>>
>>> But I guess your use case might have some special restrictions that really
>>> require use of the module instead of offloading the auto-tuning to the
>>> user-space or DAMON core.  Is that the case?  If so, can you share more details
>>> about it?
>>
>> I haven't figured out how I can use goal autotune to change the min_access.
> 
> Indeed, it is not a very straightforward feature.
> 
>> Your suggestion about moving this to the user space sound good.
> 
> If it works for you, maybe that is best for you :)
> 
>>
>> The idea was to stop lowering the min_access as soon as collapses occur,
>> since we don't want to lower so much that we start collapsing regions that
>> are not very hot.
>>
>> Maybe you can suggest a better way to do it. Maybe with autotuning.
> 
> I will add more detailed suggestion soon, by tomorrow or a day after.
> 
>>
>>>
>>>>
>>>> Regarding the action, we introduce a new action: DAMOS_COLLAPSE. This allows us
>>>> collapse synchronously and avoid polluting khugepaged and other parts of the MM
>>>> subsystem with DAMON stuff. DAMOS_HUGEPAGE eventually calls hugepage_madvise,
>>>> which needs the correct vm_flags_t set.
>>>
>>> This makes sense to me.  I expect DAMOS_COLLAPSE to have some advantages over
>>> DAMOS_HUGEPAGE for some use cases, similar to MADV_COLLAPSE vs MADV_HUGEPAGE.
>>>
>>> From my perspective, this patch series is introducing three things.
>>> 1) hugepage collapsing hotness threshold auto-tuning, 2) the module for running
>>> the auto-tuning, and 3) DAMOS_COLLAPSE.  To me, it is unclear if the first two
>>> changes are really needed.  I will wait your answer.

Yes, I tried to introduce those 3 things. The problem is that I initially found
goal autotuning quite confusing, so I kind of implemented something that behaves
like autotuning, but doesn't use DAMON's core algorithm.

> 
>>>
>>> Meanwhile, the third change seems reasonable and not necessarily need to be
>>> blocked for the other two changes.  I think separating the third change from
>>> this patch series and upstreaming it first could also be a path forward.
>>> Because the change is simple and sound, convincing me would be easy.  I'd be
>>> convinced if at least some reasonable test results can be shown.  I'm not
>>> saying we should drop the other two changes.  We can keep discussing those in
>>> parallel.  Rather, upstreaming the third change first could help finding real
>>> benefits of the other two changes, since the testing will be easier.  The
>>> decision is up to Asier, of course.  I'm just sharing my two cents.
> 
> I'm also curious what you think about this.

Sure, we can upstream the third change. I will prepare a new patch with just
that diff.

> 
> Thanks,
> SJ
> 
> [...]
> 

The use case that I had in mind is pretty simple. Few admins use huge pages in
production, since they lead to memory fragmentation and waste. On the other hand,
the amount of memory grows faster than the number of TLB entries, which means
more TLB misses and more wasted cycles. My goal is to balance this: improve
application performance while keeping the memory wasted due to fragmentation to
a minimum.

Imagine a database server. The sysadmin would like to collapse only hot regions
of the database task, improving CPU utilization but without wasting too much
memory.


Today I sat down and reviewed the DAMON code. Given all your feedback, I think
I didn't use the right approach, or I didn't understand you initially.
My suggestions:

1. Implement a new goal type for autotuning that uses huge pages.
2. Implement a module that uses this new goal type.

Would this make sense to you?

-- 
Asier Gutierrez
Huawei

Re: [RFC PATCH v2 0/4] mm/damon: Support hot application detections

Posted by SeongJae Park 2 weeks, 6 days ago

On Thu, 12 Mar 2026 17:42:35 +0300 Gutierrez Asier <gutierrez.asier@huawei-partners.com> wrote:

> 
> 
> On 3/11/2026 5:39 PM, SeongJae Park wrote:
> > On Wed, 11 Mar 2026 16:08:56 +0300 Gutierrez Asier <gutierrez.asier@huawei-partners.com> wrote:
> > 
> >> Hi SeongJae,
> >>
> >> On 3/11/2026 8:07 AM, SeongJae Park wrote:
> >>> Hello Asier,
> >>>
> >>>
> >>> Thank you for continuing this work!
> >>>
> >>> On Tue, 10 Mar 2026 16:24:16 +0000 <gutierrez.asier@huawei-partners.com> wrote:
> >>>
> >>>> From: Asier Gutierrez <gutierrez.asier@huawei-partners.com>
> >>>>
> >>>> Overview
> >>>> ----------
> >>>
> >>> Let's make the legnth of the subject and the length of the underline same.
> >>>
> >>>>
> >>>> This patch set introduces a new dynamic mechanism for detecting hot applications
> >>>> and hot regions in those applications.
> >>>
> >>> Seems now you offload the hot applications detection to the user space.  If I'm
> >>> not wrong, you should remove "hot applications and" on the above sentence.
> >>
> >> You're right. I was not sure whether changing the RFC subject was right or not.
> >> I will change it for the next RFC version.
> > 
> > It's fine to change the subject.  Please feel free to do so in the next version
> > :)
> > 
> >>
> >>>>
> >>>> Motivation
> >>>> -----------
> >>>>
> >>>> Since TLB is a bottleneck for many systems, a way to optimize TLB misses (or
> >>>> hits) is to use huge pages. Unfortunately, using "always" in THP leads to memory
> >>>> fragmentation and memory waste. For this reason, most application guides and
> >>>> system administrators suggest to disable THP.
> >>>>
> >>>>
> >>>> Solution
> >>>> -----------
> >>>>
> >>>> A new Linux kernel module that uses DAMON to detect hot regions and collapse
> >>>> those regions into huge pages. The user supplies a set of PIDs using a module
> >>>> parameter,
> >>>
> >>> This sounds reasonable to me.
> >>>
> >>>> and then, the module launches a new kdamond thread to monitor each
> >>>> of the tasks.
> >>>>
> >>>> In each kdamond, we start with a high min_access value. Our goal is to find the
> >>>> "maximum" min_access value at which point the DAMON action is applied. In each
> >>>> cycle, if no action is applied, we lower the min_access.
> >>>
> >>> So, this patch series introduces a sort of auto-tuning of the hugepages
> >>> collapse hotness threshold, that implemented in the new module.
> >>>
> >>> We already have a sort of DAMOS auto-tuning feature, namely goal-based DAMOS
> >>> quota auto-tuning [1].  Have you considered using that?  Of course, it might
> >>> not be able to be used as is.  Some extensions, e.g., introduction of new goal
> >>> metric, may be needed.
> >>>
> >>> Yet another approach would be implementing the auto-tuning in the user-space.
> >>> Because DAMON parameters can be updated online, updating the min_access from
> >>> the user space should be doable?  Given the fact the module anyway require
> >>> user-space control for feeding the list of applications to apply access-aware
> >>> huge pages collapsing, I find no problem at user space driven auto-tuning.
> >>>
> >>> If the goal-based DAMOS quota auto-tuning or the user-space based auto-tuning
> >>> are feasible, all the controls can be done using DAMON sysfs interface.
> >>> Introduction of the new kernel module might not really be needed in the case.
> >>>
> >>> We have DAMON modules in addition to DAMON sysfs interface for users who want
> >>> to use DAMON for a given specific use case with only minimum or near-zero
> >>> user-space control.  In this case, because it is already aimed to ask the
> >>> user-space to feed the list of applications to apply DAMOS-based hugepages
> >>> collapsing, it seems a new module is not really needed, to me.
> >>>
> >>> But I guess your use case might have some special restrictions that really
> >>> require use of the module instead of offloading the auto-tuning to the
> >>> user-space or DAMON core.  Is that the case?  If so, can you share more details
> >>> about it?
> >>
> >> I haven't figured out how I can use goal autotune to change the min_access.
> > 
> > Indeed, it is not a very straightforward feature.
> > 
> >> Your suggestion about moving this to the user space sound good.
> > 
> > If it works for you, maybe that is best for you :)
> > 
> >>
> >> The idea was to stop lowering the min_access as soon as collapses occur,
> >> since we don't want to lower so much that we start collapsing regions that
> >> are not very hot.
> >>
> >> Maybe you can suggest a better way to do it. Maybe with autotuning.
> > 
> > I will add more detailed suggestion soon, by tomorrow or a day after.
> > 
> >>
> >>>
> >>>>
> >>>> Regarding the action, we introduce a new action: DAMOS_COLLAPSE. This allows us
> >>>> collapse synchronously and avoid polluting khugepaged and other parts of the MM
> >>>> subsystem with DAMON stuff. DAMOS_HUGEPAGE eventually calls hugepage_madvise,
> >>>> which needs the correct vm_flags_t set.
> >>>
> >>> This makes sense to me.  I expect DAMOS_COLLAPSE to have some advantages over
> >>> DAMOS_HUGEPAGE for some use cases, similar to MADV_COLLAPSE vs MADV_HUGEPAGE.
> >>>
> >>> From my perspective, this patch series is introducing three things.
> >>> 1) hugepage collapsing hotness threshold auto-tuning, 2) the module for running
> >>> the auto-tuning, and 3) DAMOS_COLLAPSE.  To me, it is unclear if the first two
> >>> changes are really needed.  I will wait your answer.
> Yes, I tried to introduce those 3 things. The problem is that I initially found
> goal autotuning quite confusing, so I kind of implemented something that behaves
> like autotuning, but doesn't use DAMON's core algorithm. 

Sorry for the confusing autotuning.  I will try to make it better documented
and simplified.

> > 
> >>>
> >>> Meanwhile, the third change seems reasonable and not necessarily need to be
> >>> blocked for the other two changes.  I think separating the third change from
> >>> this patch series and upstreaming it first could also be a path forward.
> >>> Because the change is simple and sound, convincing me would be easy.  I'd be
> >>> convinced if at least some reasonable test results can be shown.  I'm not
> >>> saying we should drop the other two changes.  We can keep discussing those in
> >>> parallel.  Rather, upstreaming the third change first could help finding real
> >>> benefits of the other two changes, since the testing will be easier.  The
> >>> decision is up to Asier, of course.  I'm just sharing my two cents.
> > 
> > I'm also curious what you think about this.
> 
> Sure, we can upstream the third change. I will prepare a new patch for with just
> that diff.

Thank you!

> 
> > 
> > Thanks,
> > SJ
> > 
> > [...]
> > 
> 
> The use case that I had in mind is pretty simple. Few admins use huge pages in
> production, since it leads to memory fragmentation and waste. On the other hand,
> amount of memory increases faster than entries in the TLB, which means more
> TLB misses and more cycles waste. My goal is to balance this. Improve performance
> in applications while keeping the amount of memory waste due to fragmentation to
> a minimum.
> 
> Imagine a database server. The sysadmin would like to collapse only hot regions
> of the database task, improving CPU utilization but without wasting too much
> memory.

This makes perfect sense to me, thank you for clarifying.

Nonetheless, I think there could be devils in the details.  For example,

1) How do we find the application to collapse hot memory for?
2) What should be the hotness threshold for the collapsing?
3) Should the collapse be done asynchronously, like MADV_HUGEPAGE, or
   synchronously, like MADV_COLLAPSE?

And I understand that we are now aligned on the answers below.

1) User space may know, and the existing DAMON sysfs interface supports that.
2) User space may know, and the existing DAMON sysfs interface supports that.
3) Should be synchronous, but DAMON doesn't support that; DAMOS_COLLAPSE needs
   to be upstreamed.

As I mentioned above, I think this makes sense.  Looking forward to your
DAMOS_COLLAPSE patch!

> 
> 
> Today I sat down and review the damon code. Given all your feedback, I think
> this I didn't use the right approach or I didn't understand you initially.
> My suggestions:
> 
> 1. Implement a new goal type for autotuning that uses huge pages.
> 2. Implement a module that uses this new goal type.
> 
> Would this make sense to you?

Yes, that sounds like a nice step-by-step approach.  We might need to discuss
more about the details for each step, but I think this is a good direction to
go: merge the obviously beneficial part first, then discuss and revisit the
next things one by one.


Thanks,
SJ

[...]

Re: [RFC PATCH v2 0/4] mm/damon: Support hot application detections

Posted by SeongJae Park 3 weeks ago

On Wed, 11 Mar 2026 07:39:12 -0700 SeongJae Park <sj@kernel.org> wrote:

> On Wed, 11 Mar 2026 16:08:56 +0300 Gutierrez Asier <gutierrez.asier@huawei-partners.com> wrote:
> 
> > Hi SeongJae,
> > 
> > On 3/11/2026 8:07 AM, SeongJae Park wrote:
> > > Hello Asier,
> > > 
> > > 
> > > Thank you for continuing this work!
> > > 
> > > On Tue, 10 Mar 2026 16:24:16 +0000 <gutierrez.asier@huawei-partners.com> wrote:
> > > 
> > >> From: Asier Gutierrez <gutierrez.asier@huawei-partners.com>
> > >>
> > >> Overview
> > >> ----------
> > > 
> > > Let's make the legnth of the subject and the length of the underline same.
> > > 
> > >>
> > >> This patch set introduces a new dynamic mechanism for detecting hot applications
> > >> and hot regions in those applications.
> > > 
> > > Seems now you offload the hot applications detection to the user space.  If I'm
> > > not wrong, you should remove "hot applications and" on the above sentence.
> > 
> > You're right. I was not sure whether changing the RFC subject was right or not.
> > I will change it for the next RFC version.
> 
> It's fine to change the subject.  Please feel free to do so in the next version
> :)
> 
> > 
> > >>
> > >> Motivation
> > >> -----------
> > >>
> > >> Since TLB is a bottleneck for many systems, a way to optimize TLB misses (or
> > >> hits) is to use huge pages. Unfortunately, using "always" in THP leads to memory
> > >> fragmentation and memory waste. For this reason, most application guides and
> > >> system administrators suggest to disable THP.
> > >>
> > >>
> > >> Solution
> > >> -----------
> > >>
> > >> A new Linux kernel module that uses DAMON to detect hot regions and collapse
> > >> those regions into huge pages. The user supplies a set of PIDs using a module
> > >> parameter,
> > > 
> > > This sounds reasonable to me.
> > > 
> > >> and then, the module launches a new kdamond thread to monitor each
> > >> of the tasks.
> > >>
> > >> In each kdamond, we start with a high min_access value. Our goal is to find the
> > >> "maximum" min_access value at which point the DAMON action is applied. In each
> > >> cycle, if no action is applied, we lower the min_access.
> > > 
> > > So, this patch series introduces a sort of auto-tuning of the hugepages
> > > collapse hotness threshold, that implemented in the new module.
> > > 
> > > We already have a sort of DAMOS auto-tuning feature, namely goal-based DAMOS
> > > quota auto-tuning [1].  Have you considered using that?  Of course, it might
> > > not be able to be used as is.  Some extensions, e.g., introduction of new goal
> > > metric, may be needed.
> > > 
> > > Yet another approach would be implementing the auto-tuning in the user-space.
> > > Because DAMON parameters can be updated online, updating the min_access from
> > > the user space should be doable?  Given the fact the module anyway require
> > > user-space control for feeding the list of applications to apply access-aware
> > > huge pages collapsing, I find no problem at user space driven auto-tuning.
> > > 
> > > If the goal-based DAMOS quota auto-tuning or the user-space based auto-tuning
> > > are feasible, all the controls can be done using DAMON sysfs interface.
> > > Introduction of the new kernel module might not really be needed in the case.
> > > 
> > > We have DAMON modules in addition to DAMON sysfs interface for users who want
> > > to use DAMON for a given specific use case with only minimum or near-zero
> > > user-space control.  In this case, because it is already aimed to ask the
> > > user-space to feed the list of applications to apply DAMOS-based hugepages
> > > collapsing, it seems a new module is not really needed, to me.
> > > 
> > > But I guess your use case might have some special restrictions that really
> > > require use of the module instead of offloading the auto-tuning to the
> > > user-space or DAMON core.  Is that the case?  If so, can you share more details
> > > about it?
> > 
> > I haven't figured out how I can use goal autotune to change the min_access.
> 
> Indeed, it is not a very straightforward feature.
> 
> > Your suggestion about moving this to the user space sound good.
> 
> If it works for you, maybe that is best for you :)
> 
> > 
> > The idea was to stop lowering the min_access as soon as collapses occur,
> > since we don't want to lower so much that we start collapsing regions that
> > are not very hot.
> > 
> > Maybe you can suggest a better way to do it. Maybe with autotuning.
> 
> I will add more detailed suggestion soon, by tomorrow or a day after.

First of all, as I mentioned in the above reply, if the approach of this patch
series (starting from a high min_access and lowering it until some pages are
really collapsed) works for your use case, and you are ok with doing it in user
space, I think that should be the best for you, and I'd recommend going in that
direction.
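
The search described above (start high, lower until something is collapsed)
could be sketched in user space roughly as follows.  This is only a minimal
sketch, not the patch series' code: find_min_access() and the
collapsed_bytes() callback are hypothetical names.  On a real system the
callback would presumably write the new threshold through the DAMON sysfs
interface, wait for a cycle, and read back how much the scheme applied; here
it is replaced by a stand-in so the loop can be run anywhere.

```python
from typing import Callable, Optional


def find_min_access(start: int, floor: int, step: int,
                    collapsed_bytes: Callable[[int], int]) -> Optional[int]:
    """Return the highest min_access at which something was collapsed.

    collapsed_bytes(threshold) is a hypothetical callback that applies the
    threshold for one cycle and reports how many bytes were collapsed.
    Returns None if nothing was collapsed down to `floor`.
    """
    if step <= 0:
        raise ValueError("step must be positive")
    threshold = start
    while threshold >= floor:
        if collapsed_bytes(threshold) > 0:
            # Stop lowering as soon as collapses occur, so that regions
            # that are not very hot are left alone.
            return threshold
        threshold -= step
    return None


# Stand-in for the real system (assumption for illustration only):
# pretend regions are collapsed once the threshold is 12 or lower.
def fake_collapsed_bytes(min_access: int) -> int:
    return 2 * 1024 * 1024 if min_access <= 12 else 0


print(find_min_access(start=20, floor=1, step=2,
                      collapsed_bytes=fake_collapsed_bytes))  # prints 12
```

The callback keeps the search logic separate from how the threshold is
actually applied, so the same loop works with sysfs writes or with a test
double.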

But if you want to use the aim-oriented DAMOS quota goal auto-tuning, maybe I
can suggest the following options.

Let me start with the basic background.  Using aim-oriented auto-tuning, the
user can set the aimed goal of a given DAMOS scheme.  Then, DAMOS automatically
increases the quota if the goal is under-achieved.  If the goal is
over-achieved, DAMOS decreases the quota.  More details are available in the
documentation [1].
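
To illustrate the direction of the feedback only, a simplified proportional
loop could look like the sketch below.  Note that this is not DAMON's actual
tuning algorithm (see [1] for that); next_quota() and its parameters are
hypothetical names chosen for this example.

```python
def next_quota(last_quota: int, current: float, target: float,
               min_quota: int = 1) -> int:
    """Simplified proportional feedback, for illustration only.

    The quota grows when the goal is under-achieved (current < target),
    shrinks when it is over-achieved, and stays put when it is met.
    """
    if target <= 0:
        raise ValueError("target must be positive")
    # Relative distance to the goal: positive means under-achieved.
    error = (target - current) / target
    proposed = int(last_quota * (1.0 + error))
    # Never let the quota drop below a small floor.
    return max(proposed, min_quota)


print(next_quota(1000, current=50, target=100))   # under-achieved: 1500
print(next_quota(1000, current=150, target=100))  # over-achieved: 500
print(next_quota(1000, current=100, target=100))  # goal met: 1000
```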

So what you need is to find a proper goal.  In the hugepage use case, I expect
the goal is an amount of hugepages for the application.  Say, the goal can be
"making X% of the application memory hugepage".  The DAMOS quota goal feature
supports multiple types of goals.  Unfortunately, there is no goal type for
exactly this example.  There are two options.

First, using the 'user_input' goal type.  It allows users to directly feed the
current goal achievement to DAMON.  You can measure how much of the
application's memory is hugepage and feed that information with the 'user_input'
goal type.  If the user-space approach is ok for you, this should also work.
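
As a rough sketch of the measurement side, the hugepage share of a task could
be derived from the text of /proc/<pid>/smaps_rollup.  The helper name
hugepage_basis_points() and the choice of basis points as the unit are
assumptions made for this example; also note that AnonHugePages covers only
anonymous THP, which may or may not be the right metric for a given workload.
How the resulting number is then fed to the 'user_input' goal via sysfs is
left out here.

```python
def hugepage_basis_points(smaps_rollup_text: str) -> int:
    """Return AnonHugePages / Rss in basis points (10000 == 100%).

    Expects text in the format of /proc/<pid>/smaps_rollup, where the
    lines of interest look like "Rss:   123456 kB".
    """
    fields = {}
    for line in smaps_rollup_text.splitlines():
        parts = line.split()
        if (len(parts) == 3 and parts[0].endswith(":")
                and parts[1].isdigit() and parts[2] == "kB"):
            fields[parts[0][:-1]] = int(parts[1])
    rss = fields.get("Rss", 0)
    if rss == 0:
        # Nothing resident (or unexpected format): report zero.
        return 0
    return fields.get("AnonHugePages", 0) * 10000 // rss


# Made-up sample input, for illustration only.
sample = """\
Rss:                8192 kB
Pss:                8000 kB
AnonHugePages:      2048 kB
"""
print(hugepage_basis_points(sample))  # prints 2500, i.e. 25%
```

On a live system the input would come from reading the smaps_rollup file of
the monitored PID, which requires sufficient privileges.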

Second, implementing yet another goal type for the use case.  No single quota
goal fits all, and therefore we are adding new quota goals whenever they are
needed.

But, again, if the approach of this patch series is proven to work for your use
case, and it is ok to implement it in user space, that should be the best and
fastest option for you.

Also, I don't know your real use case.  Hence my quota goal based approach
suggestion might make no sense at all for your use case.  If you want, I will
be happy to learn more about your use case and suggest another option.
Nonetheless, I'm also still struggling to find the best way to utilize
DAMON for hugepages, and am therefore trying to get some community feedback at
LSF/MM/BPF [2].

[1] https://docs.kernel.org/mm/damon/design.html#aim-oriented-feedback-driven-auto-tuning
[2] https://lore.kernel.org/all/20260211050729.69719-1-sj@kernel.org/


Thanks,
SJ

> 
> > 
> > > 
> > >>
> > >> Regarding the action, we introduce a new action: DAMOS_COLLAPSE. This allows us
> > >> collapse synchronously and avoid polluting khugepaged and other parts of the MM
> > >> subsystem with DAMON stuff. DAMOS_HUGEPAGE eventually calls hugepage_madvise,
> > >> which needs the correct vm_flags_t set.
> > > 
> > > This makes sense to me.  I expect DAMOS_COLLAPSE to have some advantages over
> > > DAMOS_HUGEPAGE for some use cases, similar to MADV_COLLAPSE vs MADV_HUGEPAGE.
> > > 
> > > From my perspective, this patch series is introducing three things.
> > > 1) hugepage collapsing hotness threshold auto-tuning, 2) the module for running
> > > the auto-tuning, and 3) DAMOS_COLLAPSE.  To me, it is unclear if the first two
> > > changes are really needed.  I will wait your answer.
> 
> Please answer the above questions when you get a chance.
> 
> > > 
> > > Meanwhile, the third change seems reasonable and not necessarily need to be
> > > blocked for the other two changes.  I think separating the third change from
> > > this patch series and upstreaming it first could also be a path forward.
> > > Because the change is simple and sound, convincing me would be easy.  I'd be
> > > convinced if at least some reasonable test results can be shown.  I'm not
> > > saying we should drop the other two changes.  We can keep discussing those in
> > > parallel.  Rather, upstreaming the third change first could help finding real
> > > benefits of the other two changes, since the testing will be easier.  The
> > > decision is up to Asier, of course.  I'm just sharing my two cents.
> 
> I'm also curious what you think about this.
> 
> 
> Thanks,
> SJ
> 
> [...]