mm/damon/sysfs.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
damon_call() for repeat_call_control of DAMON_SYSFS could fail if
somehow the kdamond is stopped before the damon_call(). It could
happen, for example, when te damon context was made for monitroing of a
virtual address processes, and the process is terminated immediately,
before the damon_call() invocation. In the case, the dyanmically
allocated repeat_call_control is not deallocated and leaked.
Fix the leak by deallocating the repeat_call_control under the
damon_call() failure.
This issue is discovered by sashiko [1].
[1] https://lore.kernel.org/20260320020630.962-1-sj@kernel.org
Fixes: 04a06b139ec0 ("mm/damon/sysfs: use dynamically allocated repeat mode damon_call_control")
Cc: <stable@vger.kernel.org> # 6.17.x
Signed-off-by: SeongJae Park <sj@kernel.org>
---
Changes from RFC
(https://lore.kernel.org/20260326062347.88569-2-sj@kernel.org)
- Split out from the series.
- Drop RFC tag.
- Add Fixes: and Cc: stable.
mm/damon/sysfs.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/mm/damon/sysfs.c b/mm/damon/sysfs.c
index 6a44a2f3d8fc..eefa959aa30a 100644
--- a/mm/damon/sysfs.c
+++ b/mm/damon/sysfs.c
@@ -1670,7 +1670,8 @@ static int damon_sysfs_turn_damon_on(struct damon_sysfs_kdamond *kdamond)
repeat_call_control->data = kdamond;
repeat_call_control->repeat = true;
repeat_call_control->dealloc_on_cancel = true;
- damon_call(ctx, repeat_call_control);
+ if (damon_call(ctx, repeat_call_control))
+ kfree(repeat_call_control);
return err;
}
base-commit: dd478b2be41492a9f7be5abbdbd4dceddc46818f
--
2.47.3
Forwarding Sashiko review for doing discussions via mails. # review url: https://sashiko.dev/#/patchset/20260327003224.55752-1-sj@kernel.org # start of sashiko.dev inline review commit aba7f94d0accfda5ab9b573157f691de9e9acf00 Author: SeongJae Park <sj@kernel.org> mm/damon/sysfs: dealloc repeat_call_control if damon_call() fails This commit attempts to fix a memory leak where repeat_call_control is not deallocated if damon_call() fails because the kdamond stopped before the invocation. It adds a check to free repeat_call_control when damon_call() returns an error. Link: https://lore.kernel.org/20260320020630.962-1-sj@kernel.org > diff --git a/mm/damon/sysfs.c b/mm/damon/sysfs.c > index 6a44a2f3d8fc..eefa959aa30a 100644 > --- a/mm/damon/sysfs.c > +++ b/mm/damon/sysfs.c > @@ -1670,7 +1670,8 @@ static int damon_sysfs_turn_damon_on(struct damon_sysfs_kdamond *kdamond) > repeat_call_control->data = kdamond; > repeat_call_control->repeat = true; > repeat_call_control->dealloc_on_cancel = true; > - damon_call(ctx, repeat_call_control); > + if (damon_call(ctx, repeat_call_control)) > + kfree(repeat_call_control); Does this fully resolve the leak of repeat_call_control? If a sysfs writer thread calls damon_call() during kdamond shutdown, after the worker thread in kdamond_fn() flushes pending controls but before it sets ctx->kdamond = NULL: kdamond_fn() kdamond_call(ctx, true); ... // Before ctx->kdamond = NULL Would damon_call() add the new request to ctx->call_controls and return 0 since ctx->kdamond is not yet NULL and repeat is true? If so, does this leave repeat_call_control in the queue and leaked since the worker is already exiting and this check is bypassed? Could this same window also cause a deadlock for synchronous commands like DAMON_SYSFS_CMD_UPDATE_SCHEMES_STATS? If a synchronous command with repeat = false is added to the queue during this same window, would the caller hang indefinitely on wait_for_completion() since the worker thread will never process the queue again? > return err; > } # end of sashiko.dev inline review # review url: https://sashiko.dev/#/patchset/20260327003224.55752-1-sj@kernel.org # # hkml [1] generated a draft of this mail. It can be regenerated # using below command: # # hkml patch sashiko_dev --for_forwarding \ # 20260327003224.55752-1-sj@kernel.org # # [1] https://github.com/sjp38/hackermail Sent using hkml (https://github.com/sjp38/hackermail)
TL; DR: nice finding but orthogonal to this patch. On Thu, 26 Mar 2026 18:17:47 -0700 SeongJae Park <sj@kernel.org> wrote: > Forwarding Sashiko review for doing discussions via mails. > > # review url: https://sashiko.dev/#/patchset/20260327003224.55752-1-sj@kernel.org > # start of sashiko.dev inline review > commit aba7f94d0accfda5ab9b573157f691de9e9acf00 > Author: SeongJae Park <sj@kernel.org> > > mm/damon/sysfs: dealloc repeat_call_control if damon_call() fails > > This commit attempts to fix a memory leak where repeat_call_control is > not deallocated if damon_call() fails because the kdamond stopped before > the invocation. It adds a check to free repeat_call_control when > damon_call() returns an error. > > Link: https://lore.kernel.org/20260320020630.962-1-sj@kernel.org > > > diff --git a/mm/damon/sysfs.c b/mm/damon/sysfs.c > > index 6a44a2f3d8fc..eefa959aa30a 100644 > > --- a/mm/damon/sysfs.c > > +++ b/mm/damon/sysfs.c > > @@ -1670,7 +1670,8 @@ static int damon_sysfs_turn_damon_on(struct damon_sysfs_kdamond *kdamond) > > repeat_call_control->data = kdamond; > > repeat_call_control->repeat = true; > > repeat_call_control->dealloc_on_cancel = true; > > - damon_call(ctx, repeat_call_control); > > + if (damon_call(ctx, repeat_call_control)) > > + kfree(repeat_call_control); > > Does this fully resolve the leak of repeat_call_control? > > If a sysfs writer thread calls damon_call() during kdamond shutdown, > after the worker thread in kdamond_fn() flushes pending controls but > before it sets ctx->kdamond = NULL: > > kdamond_fn() > kdamond_call(ctx, true); > ... > // Before ctx->kdamond = NULL > > Would damon_call() add the new request to ctx->call_controls and return 0 > since ctx->kdamond is not yet NULL and repeat is true? > > If so, does this leave repeat_call_control in the queue and leaked since > the worker is already exiting and this check is bypassed? > > Could this same window also cause a deadlock for synchronous commands like > DAMON_SYSFS_CMD_UPDATE_SCHEMES_STATS? > > If a synchronous command with repeat = false is added to the queue during > this same window, would the caller hang indefinitely on > wait_for_completion() since the worker thread will never process the queue > again? Nice catch, but orthogonal to this patch. I'm working on fixing the race with a separate patch: https://lore.kernel.org/20260327004952.58266-1-sj@kernel.org The patch would fix the two issues at once. Nonetheless, the seaprate patch was motivated for only the deadlock issue. Finding of the leak possibility is a good call. Thanks, SJ [...]
© 2016 - 2026 Red Hat, Inc.