From: Asier Gutierrez <gutierrez.asier@huawei-partners.com>
Overview
----------
This patch set introduces a new dynamic mechanism for detecting hot applications
and hot regions in those applications.
Motivation
-----------
Currently DAMON requires the system administrator to specify which application
should be monitored and to provide all the monitoring parameters. Ideally this
should be done automatically, with minimal intervention from the system
administrator.
Since the TLB is a bottleneck in many systems, one way to reduce TLB misses is
to use huge pages. Unfortunately, setting THP to "always" leads to memory
fragmentation and memory waste. For this reason, most application guides and
system administrators suggest disabling THP.
We would like to detect (1) which applications in the system are hot and
(2) which memory regions in those applications are hot, in order to collapse
those regions.
Solution
-----------
┌────────────┐ ┌────────────┐
│Damon_module│ │Task_monitor│
└──────┬─────┘ └──────┬─────┘
│ start │
│───────────────────────>│
│ │
│ │────┐
│ │ │ calculate task load
│ │<───┘
│ │
│ │────┐
│ │ │ sort tasks
│ │<───┘
│ │
│ │────┐
│ │ │ start kdamond for top 3 tasks
│ │<───┘
┌──────┴─────┐ ┌──────┴─────┐
│Damon_module│ │Task_monitor│
└────────────┘ └────────────┘
We calculate the task load based on the sum of the utime of all the threads
in a given task. Once we have the total utime, we apply the exponential load
average provided by calc_load(). For tasks that become cold, their kdamond is
stopped.
In each kdamond, we start with a high min_access value. Our goal is to find
the "maximum" min_access value at which the DAMON action is applied. In each
cycle, if no action is applied, we lower min_access.
Regarding the action, we introduce a new one: DAMOS_COLLAPSE. This lets us
collapse synchronously and avoids polluting khugepaged and other parts of the
MM subsystem with DAMON-specific code. DAMOS_HUGEPAGE eventually calls
hugepage_madvise(), which needs the correct vm_flags_t set.
Benchmark
-----------
Asier Gutierrez (4):
mm/damon: Generic context creation for modules
  mm/damon: Support for synchronous huge pages collapse
mm/damon: New module with hot application detection
documentation/mm/damon: Documentation for the dynamic_hugepages
module
.../mm/damon/dynamic_hugepages.rst (new) | 173 ++++++
include/linux/damon.h | 1 +
mm/damon/Kconfig | 7 +
mm/damon/Makefile | 1 +
mm/damon/dynamic_hugepages.c (new) | 579 ++++++++++++++++++
mm/damon/lru_sort.c | 6 +-
mm/damon/modules-common.c | 7 +-
mm/damon/modules-common.h | 5 +-
mm/damon/reclaim.c | 5 +-
mm/damon/vaddr.c | 3 +
10 files changed, 778 insertions(+), 9 deletions(-)
create mode 100644 Documentation/admin-guide/mm/damon/dynamic_hugepages.rst
create mode 100644 mm/damon/dynamic_hugepages.c
--
2.43.0
Hello Asier,


Thank you for sharing this nice RFC patch series!

On Mon, 2 Feb 2026 14:56:45 +0000 <gutierrez.asier@huawei-partners.com> wrote:

> From: Asier Gutierrez <gutierrez.asier@huawei-partners.com>
>
> [...]
>
> We calculate the task load based on the sum of the utime of all the
> threads in a given task. Once we have the total utime, we apply the
> exponential load average provided by calc_load(). For tasks that become
> cold, their kdamond is stopped.

Sounds interesting, and this high level idea makes sense to me. :)

I'd like to further learn a few things. Is there a reason to think the top
3 tasks are a sufficient number of tasks? Also, what if a region was hot
and successfully promoted to use huge pages, but later became cold? Should
we also have a DAMOS scheme for splitting such no-more-hot huge pages?

> In each kdamond, we start with a high min_access value. Our goal is to
> find the "maximum" min_access value at which the DAMON action is applied.
> In each cycle, if no action is applied, we lower min_access.

Sounds like a nice auto-tuning. And we have DAMOS quota goal for that kind
of auto-tuning. Have you considered using that?

> Regarding the action, we introduce a new one: DAMOS_COLLAPSE. [...]
>
> Benchmark
> -----------

Seems you forgot to write this section up. Or, you don't have benchmark
results yet, and only mistakenly wrote the above section header? Either is
fine, as this is just an RFC. Nevertheless, test results and your expected
use case of this patch series will be very helpful.

Thanks,
SJ

[...]
SeongJae,

Thanks a lot for all the useful feedback.

One thing that I was not sure about while working on this patch set is
whether to have an external new module or to add the logic to the DAMON
core. I mean, the hot application detection can be useful for all other
modules and can improve DAMON performance. What do you think?

My implementation was module based because I tried to avoid changes to the
DAMON core for the RFC.

On 2/3/2026 4:10 AM, SeongJae Park wrote:
> Hello Asier,
>
> Thank you for sharing this nice RFC patch series!
>
> [...]

--
Asier Gutierrez
Huawei
On Tue, 3 Feb 2026 17:25:11 +0300 Gutierrez Asier <gutierrez.asier@huawei-partners.com> wrote:

> SeongJae,
>
> Thanks a lot for all the useful feedback.

The pleasure is mine! :)

> One thing that I was not sure about while working on this patch set
> is whether to have an external new module or to add the logic to the
> DAMON core. I mean, the hot application detection can be useful for
> all other modules and can improve DAMON performance.

All existing non-sample DAMON modules work on the physical address space. I
hence find not many opportunities to bring the benefits of hot application
detection to those.

I agree the hot application detection could be useful for general and
creative DAMON use cases for virtual address spaces. Implementing the
feature in the DAMON core layer and exposing it via the DAMON sysfs
interface would help such use cases. But it is not straightforward to
imagine how the sysfs interface could be extended for the feature.

So, I think it would be better to implement it inside the new module at the
moment. Later, if we end up having more modules that use the feature, we
could move it to modules-common or the core. If we further find a good way
to integrate it with the sysfs interface, it could definitely go to the
core.

From this point, however, I realize the feature can also be implemented in
the user space in a pretty straightforward way. Have you considered that?

> What do you think?
> My implementation was module based because I tried to avoid changes
> to the DAMON core for the RFC.

If there is a good reason to implement that not in the user space but in
the kernel space, as I mentioned above, it seems the module is the right
place to me.

Thanks,
SJ

[...]
On 2/4/2026 10:17 AM, SeongJae Park wrote:
> On Tue, 3 Feb 2026 17:25:11 +0300 Gutierrez Asier <gutierrez.asier@huawei-partners.com> wrote:
>
> [...]
>
> From this point, however, I realize the feature can also be implemented
> in the user space in a pretty straightforward way. Have you considered
> that?

I thought about it. However, accessing the task_struct directly and
extracting the utime is much more efficient than getting the required info
from the user space.

> If there is a good reason to implement that not in the user space but in
> the kernel space, as I mentioned above, it seems the module is the right
> place to me.
>
> Thanks,
> SJ
>
> [...]

--
Asier Gutierrez
Huawei
On Wed, 4 Feb 2026 16:07:40 +0300 Gutierrez Asier <gutierrez.asier@huawei-partners.com> wrote:

> On 2/4/2026 10:17 AM, SeongJae Park wrote:
>
> [...]
>
> > From this point, however, I realize the feature can also be implemented
> > in the user space in a pretty straightforward way. Have you considered
> > that?
>
> I thought about it. However, accessing the task_struct directly and
> extracting the utime is much more efficient than getting the required
> info from the user space.
>
> > If there is a good reason to implement that not in the user space but
> > in the kernel space, as I mentioned above, it seems the module is the
> > right place to me.

I agree it would be much more efficient to do that in the kernel space.
But, given the existence of 'top'-like user space programs that do similar
work and are heavily used, I'm not directly feeling how important the
efficiency is in real life.

Meanwhile, doing it in the user space (the DAMON user-space tool or your
own one) would be simpler and more flexible. For example, you could simply
use 'ps --sort=%cpu', and make the sorting algorithm flexible (e.g.,
sorting by RSS or using different sorting algorithms) without changing the
kernel.

So there are pros and cons, in my humble view. Those may depend on the
given use case, and I want to focus on your planned or expected use case.
Could you please clarify your planned or expected use case, and how
important and trivial the pros and cons of keeping the logic in kernel or
user space would be in those scenarios?

Thanks,
SJ

[...]
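[Editorial note: the user-space selection discussed above boils down to sorting tasks by a load metric and taking the top N. A minimal sketch with canned "pid comm %cpu" sample data (a live version would read from ps or /proc instead):]

```shell
# Pick the top 2 of some hypothetical "pid comm %cpu" samples by CPU share,
# the same selection a 'ps -eo pid,comm,%cpu --sort=-%cpu | head' would do
# against live processes.
printf '1234 app_a 87.0\n42 app_b 3.1\n999 app_c 55.2\n' \
    | sort -k3,3 -nr \
    | head -n 2
```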
Hi SeongJae!

On 2/3/2026 4:10 AM, SeongJae Park wrote:
> Hello Asier,
>
> Thank you for sharing this nice RFC patch series!
>
> [...]
>
> I'd like to further learn a few things. Is there a reason to think the
> top 3 tasks are a sufficient number of tasks? Also, what if a region was
> hot and successfully promoted to use huge pages, but later became cold?
> Should we also have a DAMOS scheme for splitting such no-more-hot huge
> pages?

No specific reason. This was just for the RFC. We could move this to a
parameter somehow.

In the case of a region turning cold, I haven't worked on it. If turning
hot means that we collapse the hot region, we should do the opposite
(split) in case the area turns cold. I haven't thought about it, but
that's a good catch. Thanks!

> Seems you forgot to write this section up. Or, you don't have benchmark
> results yet, and only mistakenly wrote the above section header? Either
> is fine, as this is just an RFC. Nevertheless, test results and your
> expected use case of this patch series will be very helpful.

Sure, will add the benchmark results in the next RFC version.

--
Asier Gutierrez
Huawei
On Tue, 3 Feb 2026 16:03:04 +0300 Gutierrez Asier <gutierrez.asier@huawei-partners.com> wrote:

> Hi SeongJae!
>
> [...]
>
> No specific reason. This was just for the RFC. We could move this to a
> parameter somehow.

Makes sense. Depending on the test results with the 3 tasks default value,
I think we could just keep it a hard-coded default. If it turns out it is
not working well for different cases, we could make it a tunable parameter
or internally auto-tuned.

> In the case of a region turning cold, I haven't worked on it. [...] I
> haven't thought about it, but that's a good catch. Thanks!

You're welcome!

> > Sounds like a nice auto-tuning. And we have DAMOS quota goal for that
> > kind of auto-tuning. Have you considered using that?

Maybe you missed the above question?

> Sure, will add the benchmark results in the next RFC version.

Looking forward! I'm particularly interested in your expected or planned
use case, including why you implement the top n processes logic inside the
kernel instead of putting it in the user space. I'm also interested in how
well the test setup represents the realistic use case, and how good the
results are. That will help us decide important things earlier, including
whether this can be merged and whether some corner case handling should be
made before or after merging it.

Thanks,
SJ