From nobody Thu Apr 2 12:36:48 2026
From:
To:
Subject: [RFC PATCH v2 3/4] mm/damon: New module with hot application detection
Date: Tue, 10 Mar 2026 16:24:19 +0000
Message-ID: <20260310162420.4180562-4-gutierrez.asier@huawei-partners.com>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20260310162420.4180562-1-gutierrez.asier@huawei-partners.com>
References: <20260310162420.4180562-1-gutierrez.asier@huawei-partners.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

From: Asier Gutierrez

1. The module first launches a new kthread called damon_dynamic.  This
thread behaves as a supervisor, launching new kdamond threads for all
the processes we want to monitor.  The tasks are sorted by utime delta,
and a new kdamond thread is launched for each of the top N tasks.
Applications which turn cold have their kdamond stopped.

The processes are supplied via the monitored_pids parameter.  When the
module is enabled, it goes through all the monitored_pids, starts the
supervisor, and starts a new kdamond thread for each of the tasks.
The task list can be modified and applied using the commit_inputs
parameter.  In that case, we stop the kdamond thread of every task
that is not going to be monitored anymore, and start a new kdamond
thread for each newly added task.  For tasks that were monitored
before and are still monitored after committing a new monitored_pids
list, the kdamond threads are left intact.

2. Initially we don't know the right min_access for each task.  We
want to find the highest min_access at which collapses start
happening.  For that we start from an initial threshold of 90, which
we lower until a collapse occurs.
Signed-off-by: Asier Gutierrez
Co-developed-by: Anatoly Stepanov
---
 mm/damon/Kconfig          |   7 +
 mm/damon/Makefile         |   1 +
 mm/damon/hugepage.c (new) | 441 ++++++++++++++++++++++++++++++++++++++
 3 files changed, 449 insertions(+)

diff --git a/mm/damon/Kconfig b/mm/damon/Kconfig
index 8c868f7035fc..2355aacb6d12 100644
--- a/mm/damon/Kconfig
+++ b/mm/damon/Kconfig
@@ -110,4 +110,11 @@ config DAMON_STAT_ENABLED_DEFAULT
 	  Whether to enable DAMON_STAT by default.  Users can disable it in
 	  boot or runtime using its 'enabled' parameter.
 
+config DAMON_HOT_HUGEPAGE
+	bool "Build DAMON-based collapse of hot regions (DAMON_HOT_HUGEPAGE)"
+	depends on DAMON_VADDR
+	help
+	  Collapse hot regions into huge pages.  Hot regions are determined
+	  by DAMON-based access sampling.
+
 endmenu
diff --git a/mm/damon/Makefile b/mm/damon/Makefile
index d8d6bf5f8bff..ac3afbc81cc7 100644
--- a/mm/damon/Makefile
+++ b/mm/damon/Makefile
@@ -7,3 +7,4 @@ obj-$(CONFIG_DAMON_SYSFS)	+= sysfs-common.o sysfs-schemes.o sysfs.o
 obj-$(CONFIG_DAMON_RECLAIM)	+= modules-common.o reclaim.o
 obj-$(CONFIG_DAMON_LRU_SORT)	+= modules-common.o lru_sort.o
 obj-$(CONFIG_DAMON_STAT)	+= modules-common.o stat.o
+obj-$(CONFIG_DAMON_HOT_HUGEPAGE)	+= modules-common.o hugepage.o
diff --git a/mm/damon/hugepage.c b/mm/damon/hugepage.c
new file mode 100644
index 000000000000..ccd31c48d391
--- /dev/null
+++ b/mm/damon/hugepage.c
@@ -0,0 +1,441 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2026 HUAWEI, Inc.
+ * https://www.huawei.com
+ *
+ * Author: Asier Gutierrez
+ */
+
+#define pr_fmt(fmt) "damon-hugepage: " fmt
+
+#include <linux/damon.h>
+#include <linux/kthread.h>
+#include <linux/module.h>
+
+#include "modules-common.h"
+
+#ifdef MODULE_PARAM_PREFIX
+#undef MODULE_PARAM_PREFIX
+#endif
+#define MODULE_PARAM_PREFIX "damon_hugepage."
+
+#define MAX_MONITORED_PIDS 100
+#define HIGHEST_MIN_ACCESS 90
+#define HIGH_ACC_THRESHOLD 50
+#define MID_ACC_THRESHOLD 15
+#define LOW_ACC_THRESHOLD 2
+
+static struct task_struct *monitor_thread;
+
+static DEFINE_MUTEX(enable_disable_lock);
+
+/*
+ * Enable or disable DAMON_HOT_HUGEPAGE.
+ *
+ * You can enable DAMON_HOT_HUGEPAGE by setting the value of this parameter
+ * as ``Y``.  Setting it as ``N`` disables DAMON_HOT_HUGEPAGE.  Note that
+ * DAMON_HOT_HUGEPAGE could do no real monitoring and collapsing due to the
+ * watermarks-based activation condition.  Refer to the description of the
+ * watermarks parameter below for this.
+ */
+static bool enabled __read_mostly;
+
+/*
+ * Make DAMON_HOT_HUGEPAGE read the input parameters again, except
+ * ``enabled``.
+ *
+ * Input parameters that are updated while DAMON_HOT_HUGEPAGE is running are
+ * not applied by default.  Once this parameter is set as ``Y``,
+ * DAMON_HOT_HUGEPAGE reads values of parameters except ``enabled`` again.
+ * Once the re-reading is done, this parameter is set as ``N``.  If invalid
+ * parameters are found while re-reading, DAMON_HOT_HUGEPAGE will be
+ * disabled.
+ */
+static bool commit_inputs __read_mostly;
+module_param(commit_inputs, bool, 0600);
+
+/*
+ * DAMON_HOT_HUGEPAGE monitoring period in microseconds (5000000 == 5 s).
+ */
+static unsigned long monitor_period __read_mostly = 5000000;
+module_param(monitor_period, ulong, 0600);
+
+static long monitored_pids[MAX_MONITORED_PIDS];
+static int num_monitored_pids;
+module_param_array(monitored_pids, long, &num_monitored_pids, 0600);
+
+static struct damos_quota damon_hugepage_quota = {
+	/* use up to 10 ms of CPU time per 1 sec, no size quota */
+	.ms = 10,
+	.sz = 0,
+	.reset_interval = 1000,
+	/* within the quota, collapse older regions first */
+	.weight_sz = 0,
+	.weight_nr_accesses = 0,
+	.weight_age = 1
+};
+DEFINE_DAMON_MODULES_DAMOS_TIME_QUOTA(damon_hugepage_quota);
+
+static struct damos_watermarks damon_hugepage_wmarks = {
+	.metric = DAMOS_WMARK_FREE_MEM_RATE,
+	.interval = 5000000,	/* 5 seconds */
+	.high = 900,		/* 90 percent */
+	.mid = 800,		/* 80 percent */
+	.low = 50,		/* 5 percent */
+};
+DEFINE_DAMON_MODULES_WMARKS_PARAMS(damon_hugepage_wmarks);
+
+static struct damon_attrs damon_hugepage_mon_attrs = {
+	.sample_interval = 5000,	/* 5 ms */
+	.aggr_interval = 100000,	/* 100 ms */
+	.ops_update_interval = 0,
+	.min_nr_regions = 10,
+	.max_nr_regions = 1000,
+};
+DEFINE_DAMON_MODULES_MON_ATTRS_PARAMS(damon_hugepage_mon_attrs);
+
+struct hugepage_task {
+	struct damon_ctx *ctx;
+	int pid;
+	struct damon_target *target;
+	struct damon_call_control call_control;
+	struct list_head list;
+};
+
+static struct damos *damon_hugepage_new_scheme(int min_access,
+		enum damos_action action)
+{
+	struct damos_access_pattern pattern = {
+		/* find regions of PMD_SIZE or larger */
+		.min_sz_region = PMD_SIZE,
+		.max_sz_region = ULONG_MAX,
+		/* accessed at least min_access times */
+		.min_nr_accesses = min_access,
+		.max_nr_accesses = 100,
+		/* regardless of age */
+		.min_age_region = 0,
+		.max_age_region = UINT_MAX,
+	};
+
+	return damon_new_scheme(
+			&pattern,
+			/* collapse as soon as found */
+			action, 0,
+			/* under the quota */
+			&damon_hugepage_quota,
+			/* (de)activate this according to the watermarks */
+			&damon_hugepage_wmarks, NUMA_NO_NODE);
+}
+
+static int damon_hugepage_apply_parameters(
+		struct hugepage_task *monitored_task,
+		int min_access,
+		enum damos_action action)
+{
+	struct damos *scheme;
+	struct damon_ctx *param_ctx;
+	struct damon_target *param_target;
+	struct damos_filter *filter;
+	int err;
+	struct pid *spid;
+
+	err = damon_modules_new_ctx_target(&param_ctx, &param_target,
+			DAMON_OPS_VADDR);
+	if (err)
+		return err;
+
+	spid = find_get_pid(monitored_task->pid);
+	if (!spid) {
+		err = -ESRCH;
+		goto out;
+	}
+
+	param_target->pid = spid;
+
+	err = damon_set_attrs(param_ctx, &damon_hugepage_mon_attrs);
+	if (err)
+		goto out;
+
+	err = -ENOMEM;
+	scheme = damon_hugepage_new_scheme(min_access, action);
+	if (!scheme)
+		goto out;
+
+	damon_set_schemes(param_ctx, &scheme, 1);
+
+	filter = damos_new_filter(DAMOS_FILTER_TYPE_ANON, true, false);
+	if (!filter)
+		goto out;
+	damos_add_filter(scheme, filter);
+
+	err = damon_commit_ctx(monitored_task->ctx, param_ctx);
+out:
+	damon_destroy_ctx(param_ctx);
+	return err;
+}
+
+static int damon_hugepage_damon_call_fn(void *arg)
+{
+	struct hugepage_task *monitored_task = arg;
+	struct damon_ctx *ctx = monitored_task->ctx;
+	struct damos *scheme;
+	int err = 0;
+	int min_access;
+	struct damos_stat stat = {};
+
+	if (list_empty(&ctx->schemes))
+		return 0;
+
+	damon_for_each_scheme(scheme, ctx)
+		stat = scheme->stat;
+	scheme = list_first_entry(&ctx->schemes, struct damos, list);
+
+	if (ctx->passed_sample_intervals < scheme->next_apply_sis)
+		return err;
+
+	if (stat.nr_applied)
+		return err;
+
+	min_access = scheme->pattern.min_nr_accesses;
+
+	if (min_access > HIGH_ACC_THRESHOLD) {
+		min_access -= 10;
+		err = damon_hugepage_apply_parameters(
+				monitored_task, min_access, DAMOS_COLLAPSE);
+	} else if (min_access > MID_ACC_THRESHOLD) {
+		min_access -= 5;
+		err = damon_hugepage_apply_parameters(
+				monitored_task, min_access, DAMOS_COLLAPSE);
+	} else if (min_access > LOW_ACC_THRESHOLD) {
+		min_access -= 1;
+		err = damon_hugepage_apply_parameters(
+				monitored_task, min_access, DAMOS_COLLAPSE);
+	}
+	return err;
+}
+
+static int damon_hugepage_init_task(struct hugepage_task *monitored_task)
+{
+	int err;
+	struct damon_ctx *ctx = monitored_task->ctx;
+	struct damon_target *target = monitored_task->target;
+	struct damos *scheme;
+	struct pid *spid;
+
+	if (!ctx || !target) {
+		err = damon_modules_new_ctx_target(&ctx, &target,
+				DAMON_OPS_VADDR);
+		if (err)
+			return err;
+	}
+
+	if (damon_is_running(ctx))
+		return 0;
+
+	spid = find_get_pid(monitored_task->pid);
+	if (!spid)
+		return -ESRCH;
+
+	target->pid = spid;
+
+	monitored_task->call_control.fn = damon_hugepage_damon_call_fn;
+	monitored_task->call_control.repeat = true;
+	monitored_task->call_control.data = monitored_task;
+
+	scheme = damon_hugepage_new_scheme(HIGHEST_MIN_ACCESS,
+			DAMOS_COLLAPSE);
+	if (!scheme)
+		return -ENOMEM;
+
+	damon_set_schemes(ctx, &scheme, 1);
+
+	monitored_task->ctx = ctx;
+	monitored_task->target = target;
+	err = damon_start(&monitored_task->ctx, 1, false);
+	if (err)
+		return err;
+
+	return damon_call(monitored_task->ctx, &monitored_task->call_control);
+}
+
+static int add_monitored_task(int pid, struct list_head *task_monitor)
+{
+	struct hugepage_task *new_hugepage_task;
+	int err;
+
+	new_hugepage_task = kzalloc(sizeof(*new_hugepage_task), GFP_KERNEL);
+	if (!new_hugepage_task)
+		return -ENOMEM;
+
+	new_hugepage_task->pid = pid;
+	INIT_LIST_HEAD(&new_hugepage_task->list);
+	err = damon_hugepage_init_task(new_hugepage_task);
+	if (err) {
+		kfree(new_hugepage_task);
+		return err;
+	}
+	list_add(&new_hugepage_task->list, task_monitor);
+	return 0;
+}
+
+static int damon_hugepage_handle_commit_inputs(
+		struct list_head *monitored_tasks)
+{
+	int i = 0;
+	int err = 0;
+	bool found;
+	struct hugepage_task *monitored_task, *tmp;
+
+	if (!commit_inputs)
+		return 0;
+
+	while (i < MAX_MONITORED_PIDS) {
+		if (!monitored_pids[i])
+			break;
+
+		found = false;
+
+		rcu_read_lock();
+		if (!find_vpid(monitored_pids[i])) {
+			rcu_read_unlock();
+			i++;
+			continue;
+		}
+		rcu_read_unlock();
+
+		list_for_each_entry_safe(monitored_task, tmp,
+				monitored_tasks, list) {
+			if (monitored_task->pid == monitored_pids[i]) {
+				list_move(&monitored_task->list,
+						monitored_tasks);
+				found = true;
+				break;
+			}
+		}
+		if (!found)
+			/* skip tasks that fail to start */
+			add_monitored_task(monitored_pids[i],
+					monitored_tasks);
+		i++;
+	}
+
+	i = 0;
+	list_for_each_entry_safe(monitored_task, tmp, monitored_tasks, list) {
+		i++;
+		if (i <= num_monitored_pids)
+			continue;
+
+		err = damon_stop(&monitored_task->ctx, 1);
+		damon_destroy_ctx(monitored_task->ctx);
+		list_del(&monitored_task->list);
+		kfree(monitored_task);
+	}
+
+	commit_inputs = false;
+	return err;
+}
+
+static int damon_manager_monitor_thread(void *data)
+{
+	int err = 0;
+	int i;
+	struct hugepage_task *entry, *tmp;
+
+	LIST_HEAD(monitored_tasks);
+
+	for (i = 0; i < MAX_MONITORED_PIDS; i++) {
+		if (!monitored_pids[i])
+			break;
+
+		rcu_read_lock();
+		if (!find_vpid(monitored_pids[i])) {
+			rcu_read_unlock();
+			continue;
+		}
+		rcu_read_unlock();
+
+		add_monitored_task(monitored_pids[i], &monitored_tasks);
+	}
+
+	while (!kthread_should_stop()) {
+		schedule_timeout_idle(usecs_to_jiffies(monitor_period));
+		err = damon_hugepage_handle_commit_inputs(&monitored_tasks);
+		if (err)
+			break;
+	}
+
+	list_for_each_entry_safe(entry, tmp, &monitored_tasks, list) {
+		err = damon_stop(&entry->ctx, 1);
+		damon_destroy_ctx(entry->ctx);
+		list_del(&entry->list);
+		kfree(entry);
+	}
+
+	for (i = 0; i < MAX_MONITORED_PIDS; i++)
+		monitored_pids[i] = 0;
+	return err;
+}
+
+static int damon_hugepage_start_monitor_thread(void)
+{
+	num_monitored_pids = 0;
+	monitor_thread = kthread_run(damon_manager_monitor_thread, NULL,
+			"damon_dynamic");
+	if (IS_ERR(monitor_thread)) {
+		int err = PTR_ERR(monitor_thread);
+
+		monitor_thread = NULL;
+		return err;
+	}
+	return 0;
+}
+
+static int damon_hugepage_turn(bool on)
+{
+	int err = 0;
+
+	mutex_lock(&enable_disable_lock);
+	if (!on) {
+		if (monitor_thread) {
+			kthread_stop(monitor_thread);
+			monitor_thread = NULL;
+		}
+		goto out;
+	}
+	err = damon_hugepage_start_monitor_thread();
+out:
+	mutex_unlock(&enable_disable_lock);
+	return err;
+}
+
+static int damon_hugepage_enabled_store(const char *val,
+		const struct kernel_param *kp)
+{
+	bool is_enabled = enabled;
+	bool enable;
+	int err;
+
+	err = kstrtobool(val, &enable);
+	if (err)
+		return err;
+
+	if (is_enabled == enable)
+		return 0;
+
+	err = damon_hugepage_turn(enable);
+	if (err)
+		return err;
+
+	enabled = enable;
+	return err;
+}
+
+static const struct kernel_param_ops enabled_param_ops = {
+	.set = damon_hugepage_enabled_store,
+	.get = param_get_bool,
+};
+
+module_param_cb(enabled, &enabled_param_ops, &enabled, 0600);
+MODULE_PARM_DESC(enabled,
+	"Enable or disable DAMON_HOT_HUGEPAGE (default: disabled)");
+
+static int __init damon_hugepage_init(void)
+{
+	int err = 0;
+
+	/*
+	 * 'enabled' may have been set before this function runs, e.g. via
+	 * the kernel command line.
+	 */
+	if (enabled)
+		err = damon_hugepage_turn(true);
+
+	if (err)
+		enabled = false;
+	return err;
+}
+
+module_init(damon_hugepage_init);
-- 
2.43.0