[RFC PATCH 01/10] mm/damon/core: introduce damon_ctx->paused

SeongJae Park posted 10 patches 3 weeks, 1 day ago
There is a newer version of this series
Posted by SeongJae Park 3 weeks, 1 day ago
DAMON supports only starting and stopping its execution.  When it is
stopped, the internal data it has self-trained is lost.  It would be
useful if the execution could be paused and later resumed with the
previously self-trained data.

Introduce a per-context API parameter, 'pause', for this purpose.  The
parameter can be set and unset while DAMON is running or paused, using
the online parameters commit helper functions (damon_commit_ctx() and
damon_call()).  Once 'pause' is set, the kdamond_fn() main loop does
only limited work, sleeping for the sampling interval between
iterations.  The limited work includes handling online parameter
updates, so that users can unset 'pause' and resume the execution when
they want.  It also keeps checking and handling the DAMON stop
conditions, so that DAMON can be stopped while paused if needed.

Signed-off-by: SeongJae Park <sj@kernel.org>
---
 include/linux/damon.h | 2 ++
 mm/damon/core.c       | 8 ++++++++
 2 files changed, 10 insertions(+)

diff --git a/include/linux/damon.h b/include/linux/damon.h
index 3a441fbca170d..421e51eff3bd2 100644
--- a/include/linux/damon.h
+++ b/include/linux/damon.h
@@ -811,6 +811,8 @@ struct damon_ctx {
 	 * intervals tuning
 	 */
 	unsigned long next_intervals_tune_sis;
+	/* pause kdamond main loop */
+	bool pause;
 	/* for waiting until the execution of the kdamond_fn is started */
 	struct completion kdamond_started;
 	/* for scheme quotas prioritization */
diff --git a/mm/damon/core.c b/mm/damon/core.c
index f9854aedc42d1..1e9f6aa569fd2 100644
--- a/mm/damon/core.c
+++ b/mm/damon/core.c
@@ -1331,6 +1331,7 @@ int damon_commit_ctx(struct damon_ctx *dst, struct damon_ctx *src)
 		if (err)
 			return err;
 	}
+	dst->pause = src->pause;
 	dst->ops = src->ops;
 	dst->addr_unit = src->addr_unit;
 	dst->min_region_sz = src->min_region_sz;
@@ -2978,6 +2979,13 @@ static int kdamond_fn(void *data)
 		 * kdamond_merge_regions() if possible, to reduce overhead
 		 */
 		kdamond_call(ctx, false);
+		while (ctx->pause) {
+			if (kdamond_need_stop(ctx))
+				goto done;
+			kdamond_usleep(ctx->attrs.sample_interval);
+			/* allow caller unset pause via damon_call() */
+			kdamond_call(ctx, false);
+		}
 		if (!list_empty(&ctx->schemes))
 			kdamond_apply_schemes(ctx);
 		else
-- 
2.47.3
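
For readers who want to follow the control flow outside the kernel, below is a
minimal userspace sketch of the pause loop this patch adds.  It is a model
under stated assumptions, not kernel code: `pause_model_ctx`,
`pause_model_need_stop()`, `pause_model_call()` and `pause_model_run()` are
hypothetical names, the sampling interval sleep is omitted, and the work of
kdamond_call() is reduced to "unset pause at a given iteration".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the few damon_ctx fields the loop reads. */
struct pause_model_ctx {
	bool pause;	/* mirrors ctx->pause from the patch */
	int unpause_at;	/* model only: iteration at which a caller unsets pause, -1 for never */
	int stop_at;	/* model only: iteration at which a stop is requested, -1 for never */
};

enum pause_model_result {
	PAUSE_MODEL_RESUMED,		/* loop exited because pause was unset */
	PAUSE_MODEL_STOPPED,		/* loop exited via the stop check */
	PAUSE_MODEL_STILL_PAUSED,	/* iteration budget exhausted while paused */
};

/* Stand-in for kdamond_need_stop(): true once the stop request is due. */
bool pause_model_need_stop(const struct pause_model_ctx *ctx, int iter)
{
	return ctx->stop_at >= 0 && iter >= ctx->stop_at;
}

/* Stand-in for kdamond_call(): lets a caller unset pause once it is due. */
void pause_model_call(struct pause_model_ctx *ctx, int iter)
{
	if (ctx->unpause_at >= 0 && iter >= ctx->unpause_at)
		ctx->pause = false;
}

/* Same shape as the loop in the patch: stop check, (sleep), handle calls. */
enum pause_model_result pause_model_run(struct pause_model_ctx *ctx,
					int max_iters)
{
	int iter;

	for (iter = 0; ctx->pause; iter++) {
		if (iter >= max_iters)
			return PAUSE_MODEL_STILL_PAUSED;
		if (pause_model_need_stop(ctx, iter))
			return PAUSE_MODEL_STOPPED;
		/* kdamond_usleep(ctx->attrs.sample_interval) would go here */
		pause_model_call(ctx, iter);
	}
	return PAUSE_MODEL_RESUMED;
}
```

In this model a context that is never unpaused and never asked to stop stays
in the loop until the iteration budget runs out, which corresponds to the
kdamond staying paused indefinitely in the real loop.
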
Re: [RFC PATCH 01/10] mm/damon/core: introduce damon_ctx->paused
Posted by SeongJae Park 3 weeks ago
On Sun, 15 Mar 2026 14:00:00 -0700 SeongJae Park <sj@kernel.org> wrote:

> DAMON supports only starting and stopping its execution.  When it is
> stopped, the internal data it has self-trained is lost.  It would be
> useful if the execution could be paused and later resumed with the
> previously self-trained data.
> 
> Introduce a per-context API parameter, 'pause', for this purpose.  The
> parameter can be set and unset while DAMON is running or paused, using
> the online parameters commit helper functions (damon_commit_ctx() and
> damon_call()).  Once 'pause' is set, the kdamond_fn() main loop does
> only limited work, sleeping for the sampling interval between
> iterations.  The limited work includes handling online parameter
> updates, so that users can unset 'pause' and resume the execution when
> they want.  It also keeps checking and handling the DAMON stop
> conditions, so that DAMON can be stopped while paused if needed.
> 
> Signed-off-by: SeongJae Park <sj@kernel.org>
> ---
>  include/linux/damon.h | 2 ++
>  mm/damon/core.c       | 8 ++++++++
>  2 files changed, 10 insertions(+)
> 
> diff --git a/include/linux/damon.h b/include/linux/damon.h
> index 3a441fbca170d..421e51eff3bd2 100644
> --- a/include/linux/damon.h
> +++ b/include/linux/damon.h
> @@ -811,6 +811,8 @@ struct damon_ctx {
>  	 * intervals tuning
>  	 */
>  	unsigned long next_intervals_tune_sis;
> +	/* pause kdamond main loop */
> +	bool pause;
>  	/* for waiting until the execution of the kdamond_fn is started */
>  	struct completion kdamond_started;
>  	/* for scheme quotas prioritization */
> diff --git a/mm/damon/core.c b/mm/damon/core.c
> index f9854aedc42d1..1e9f6aa569fd2 100644
> --- a/mm/damon/core.c
> +++ b/mm/damon/core.c
> @@ -1331,6 +1331,7 @@ int damon_commit_ctx(struct damon_ctx *dst, struct damon_ctx *src)
>  		if (err)
>  			return err;
>  	}
> +	dst->pause = src->pause;
>  	dst->ops = src->ops;
>  	dst->addr_unit = src->addr_unit;
>  	dst->min_region_sz = src->min_region_sz;
> @@ -2978,6 +2979,13 @@ static int kdamond_fn(void *data)
>  		 * kdamond_merge_regions() if possible, to reduce overhead
>  		 */
>  		kdamond_call(ctx, false);
> +		while (ctx->pause) {
> +			if (kdamond_need_stop(ctx))
> +				goto done;
> +			kdamond_usleep(ctx->attrs.sample_interval);
> +			/* allow caller unset pause via damon_call() */
> +			kdamond_call(ctx, false);
> +		}

sashiko.dev commented on this [1], as quoted below.  Let me reply inline.

: Could this cause a deadlock if a walk request is initiated while DAMON is
: paused?
: 
: Looking at kdamond_fn(), when ctx->pause is true, the thread stays in the
: while (ctx->pause) loop and skips kdamond_apply_schemes(). If a user
: triggers a sysfs command that initiates a walk (like writing
: update_schemes_tried_regions), the sysfs thread calls damos_walk() and waits
: for walk_control->completion.
: 
: Since the kdamond thread is paused, the completion is never signaled, and the
: blocked sysfs writer will hold the global damon_sysfs_lock permanently. This
: prevents further interactions with the DAMON sysfs interface, making it
: impossible to even unpause the context.

Correct.  I was able to trigger the deadlock on my test setup.

: 
: Should we call damos_walk_cancel(ctx) inside the pause loop to abort pending
: walk requests, similar to what is done in kdamond_wait_activation()?

Good suggestion.  I will add the fixup attached below in the next spin.  I
confirmed the deadlock can no longer be triggered after applying the fixup.

[1] https://sashiko.dev/#/patchset/20260315210012.94846-2-sj@kernel.org


Thanks,
SJ

[...]
=== >8 ===

--- a/mm/damon/core.c
+++ b/mm/damon/core.c
@@ -3405,6 +3405,7 @@ static int kdamond_fn(void *data)
                        kdamond_usleep(ctx->attrs.sample_interval);
                        /* allow caller unset pause via damon_call() */
                        kdamond_call(ctx, false);
+                       damos_walk_cancel(ctx);
                }
                if (!list_empty(&ctx->schemes))
                        kdamond_apply_schemes(ctx);
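
The effect of the fixup above can be illustrated with a small userspace model.
This is only a sketch under assumptions: `walk_model_request`,
`walk_model_ctx`, `walk_model_cancel()` and `walk_model_waiter_released()` are
hypothetical names, a boolean stands in for the completion the sysfs writer
waits on, and the cancel helper is assumed to mark the request as canceled and
release the waiter, which is how the discussion above describes
damos_walk_cancel() being used.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a walk request that a sysfs writer waits on. */
struct walk_model_request {
	bool completed;	/* stands in for walk_control->completion */
	bool canceled;	/* set when the request is aborted rather than served */
};

/* Hypothetical model of the context fields relevant to the deadlock. */
struct walk_model_ctx {
	bool pause;
	struct walk_model_request *walk;	/* pending request, or NULL */
};

/* Assumed behavior of the cancel helper: abort and release the waiter. */
void walk_model_cancel(struct walk_model_ctx *ctx)
{
	struct walk_model_request *req = ctx->walk;

	if (!req)
		return;
	req->canceled = true;
	req->completed = true;
	ctx->walk = NULL;
}

/*
 * Run up to `iters` iterations of the pause loop body with a walk request
 * pending, and report whether the waiter would have been released.  The
 * request is never served while paused because scheme application is
 * skipped, so only the cancel call can complete it.
 */
bool walk_model_waiter_released(bool with_fixup, int iters)
{
	struct walk_model_request req = { false, false };
	struct walk_model_ctx ctx = { .pause = true, .walk = &req };
	int i;

	for (i = 0; i < iters && ctx.pause; i++) {
		if (with_fixup)
			walk_model_cancel(&ctx);
	}
	return req.completed;
}
```

Without the cancel call the modeled waiter is never released for any number of
paused iterations, which matches the reported deadlock; with it, the waiter is
released on the first paused iteration.
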