From nobody Mon May 25 08:10:52 2026
From: Ravi Jonnalagadda
To: sj@kernel.org, damon@lists.linux.dev, linux-mm@kvack.org, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org
Cc: akpm@linux-foundation.org, corbet@lwn.net, bijan311@gmail.com, ajayjoshi@micron.com, honggyu.kim@sk.com, yunjeong.mun@sk.com, ravis.opensrc@gmail.com, bharata@amd.com
Subject: [RFC PATCH 1/7] mm/damon/core: refcount ops owner module to prevent rmmod UAF
Date: Sat, 16 May 2026 15:34:26 -0700
Message-ID: <20260516223439.4033-2-ravis.opensrc@gmail.com>
In-Reply-To: <20260516223439.4033-1-ravis.opensrc@gmail.com>
References: <20260516223439.4033-1-ravis.opensrc@gmail.com>

damon_select_ops() copies the registered damon_operations struct into
ctx->ops by value. After damon_unregister_ops() is called from a backend
module's exit path, the registry slot is cleared but any surviving ctx
still holds function pointers that resolve into the unloaded module's
text. Restarting kdamond on such a ctx, or invoking any ops callback,
jumps into freed code.

Add a struct module *owner field to damon_operations. In
damon_select_ops(), take a reference to ops->owner via try_module_get()
after locating the registry entry; on failure return -EBUSY without
binding the ctx.
If the ctx already had an ops bound (re-select case), drop the previous
owner's reference before installing the new one to keep the refcount
balanced.

In damon_destroy_ctx(), release the reference via
module_put(ctx->ops.owner).

In damon_commit_ctx(), the live ops field is overwritten by a value copy
from src. Balance the refcount when the owner changes: take a ref on the
new owner (return -EBUSY on failure) and put the ref on the old owner
before the assignment.

Built-in ops sets (vaddr, paddr) leave owner = NULL;
try_module_get(NULL) returns true and module_put(NULL) is a no-op.
Loadable backends set owner = THIS_MODULE in their registration.

Also add damon_unregister_ops() so loadable backends have a clean exit
path.

Signed-off-by: Ravi Jonnalagadda
---
 include/linux/damon.h       |  4 ++++
 mm/damon/core.c             | 46 ++++++++++++++++++++++++++++++++++---
 mm/damon/tests/core-kunit.h |  2 +-
 3 files changed, 48 insertions(+), 4 deletions(-)

diff --git a/include/linux/damon.h b/include/linux/damon.h
index df7910a39b407..8e6e1cd89e551 100644
--- a/include/linux/damon.h
+++ b/include/linux/damon.h
@@ -682,6 +682,8 @@ enum damon_ops_id {
  * struct damon_operations - Monitoring operations for given use cases.
  *
  * @id:	Identifier of this operations set.
+ * @owner:	Module that provides this operations set, or NULL
+ *		for built-in ops.
  * @init:	Initialize operations-related data structures.
  * @update:	Update operations-related data structures.
  * @prepare_access_checks:	Prepare next access check of target regions.
@@ -728,6 +730,7 @@ enum damon_ops_id {
  */
 struct damon_operations {
 	enum damon_ops_id id;
+	struct module *owner;
 	void (*init)(struct damon_ctx *context);
 	void (*update)(struct damon_ctx *context);
 	void (*prepare_access_checks)(struct damon_ctx *context);
@@ -1206,6 +1209,7 @@ int damon_commit_ctx(struct damon_ctx *old_ctx, struct damon_ctx *new_ctx);
 int damon_nr_running_ctxs(void);
 bool damon_is_registered_ops(enum damon_ops_id id);
 int damon_register_ops(struct damon_operations *ops);
+int damon_unregister_ops(enum damon_ops_id id);
 int damon_select_ops(struct damon_ctx *ctx, enum damon_ops_id id);
 
 static inline bool damon_target_has_pid(const struct damon_ctx *ctx)
diff --git a/mm/damon/core.c b/mm/damon/core.c
index e4b9adc0a64dd..b605d36b29b1a 100644
--- a/mm/damon/core.c
+++ b/mm/damon/core.c
@@ -12,6 +12,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -93,6 +94,31 @@ int damon_register_ops(struct damon_operations *ops)
 	mutex_unlock(&damon_ops_lock);
 	return err;
 }
+EXPORT_SYMBOL_GPL(damon_register_ops);
+
+/**
+ * damon_unregister_ops() - Unregister a monitoring operations set.
+ * @id:	ID of the operations set to unregister.
+ *
+ * Return: 0 on success, negative error code otherwise.
+ */
+int damon_unregister_ops(enum damon_ops_id id)
+{
+	if (id >= NR_DAMON_OPS)
+		return -EINVAL;
+
+	/*
+	 * Callers (typically the owning module exit path) hold a
+	 * module ref via try_module_get() in damon_select_ops(); the
+	 * unregister cannot race with active ctxs because module_exit
+	 * runs only at owner refcount 0.
+	 */
+	mutex_lock(&damon_ops_lock);
+	memset(&damon_registered_ops[id], 0, sizeof(damon_registered_ops[id]));
+	mutex_unlock(&damon_ops_lock);
+	return 0;
+}
+EXPORT_SYMBOL_GPL(damon_unregister_ops);
 
 /**
  * damon_select_ops() - Select a monitoring operations to use with the context.
@@ -112,10 +138,18 @@ int damon_select_ops(struct damon_ctx *ctx, enum damon_ops_id id)
 		return -EINVAL;
 
 	mutex_lock(&damon_ops_lock);
-	if (!__damon_is_registered_ops(id))
+	if (!__damon_is_registered_ops(id)) {
 		err = -EINVAL;
-	else
-		ctx->ops = damon_registered_ops[id];
+		goto out;
+	}
+	if (!try_module_get(damon_registered_ops[id].owner)) {
+		err = -EBUSY;
+		goto out;
+	}
+	/* Drop previous owner ref if this ctx had ops selected before. */
+	module_put(ctx->ops.owner);
+	ctx->ops = damon_registered_ops[id];
+out:
 	mutex_unlock(&damon_ops_lock);
 	return err;
 }
@@ -835,6 +869,7 @@ void damon_destroy_ctx(struct damon_ctx *ctx)
 	damon_for_each_sample_filter_safe(f, next_f, &ctx->sample_control)
 		damon_destroy_sample_filter(f, &ctx->sample_control);
 
+	module_put(ctx->ops.owner);
 	kfree(ctx);
 }
 
@@ -1749,6 +1784,11 @@ int damon_commit_ctx(struct damon_ctx *dst, struct damon_ctx *src)
 		return err;
 	}
 	dst->pause = src->pause;
+	if (src->ops.owner != dst->ops.owner) {
+		if (!try_module_get(src->ops.owner))
+			return -EBUSY;
+		module_put(dst->ops.owner);
+	}
 	dst->ops = src->ops;
 	err = damon_commit_probes(dst, src);
 	if (err)
diff --git a/mm/damon/tests/core-kunit.h b/mm/damon/tests/core-kunit.h
index 0369c717b93db..300659b115602 100644
--- a/mm/damon/tests/core-kunit.h
+++ b/mm/damon/tests/core-kunit.h
@@ -342,7 +342,7 @@ static void damon_test_split_regions_of(struct kunit *test)
 static void damon_test_ops_registration(struct kunit *test)
 {
 	struct damon_ctx *c = damon_new_ctx();
-	struct damon_operations ops = {.id = DAMON_OPS_VADDR}, bak;
+	struct damon_operations ops = {.id = DAMON_OPS_VADDR}, bak = {};
 	bool need_cleanup = false;
 
 	if (!c)
-- 
2.43.0

From nobody Mon May 25 08:10:52 2026
From: Ravi Jonnalagadda
To: sj@kernel.org, damon@lists.linux.dev, linux-mm@kvack.org, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org
Cc: akpm@linux-foundation.org, corbet@lwn.net, bijan311@gmail.com, ajayjoshi@micron.com, honggyu.kim@sk.com, yunjeong.mun@sk.com, ravis.opensrc@gmail.com, bharata@amd.com
Subject: [RFC PATCH 2/7] mm/damon/paddr: export damon_pa_* ops for IBS module
Date: Sat, 16 May 2026 15:34:27 -0700
Message-ID: <20260516223439.4033-3-ravis.opensrc@gmail.com>
In-Reply-To: <20260516223439.4033-1-ravis.opensrc@gmail.com>
References: <20260516223439.4033-1-ravis.opensrc@gmail.com>

Remove static qualifier from damon_pa_prepare_access_checks,
damon_pa_check_accesses, damon_pa_apply_probes, damon_pa_apply_scheme,
and damon_pa_scheme_score. Add EXPORT_SYMBOL_GPL for each.

These functions are used as ops callbacks by the IBS backend module
(damon_ibs.ko) which registers paddr_ibs operations.
Signed-off-by: Ravi Jonnalagadda
---
 mm/damon/ops-common.h | 13 +++++++++++++
 mm/damon/paddr.c      | 15 ++++++++++-----
 2 files changed, 23 insertions(+), 5 deletions(-)

diff --git a/mm/damon/ops-common.h b/mm/damon/ops-common.h
index 5efa5b5970def..0ec75276d985a 100644
--- a/mm/damon/ops-common.h
+++ b/mm/damon/ops-common.h
@@ -23,3 +23,16 @@ bool damos_folio_filter_match(struct damos_filter *filter, struct folio *folio);
 unsigned long damon_migrate_pages(struct list_head *folio_list, int target_nid);
 
 bool damos_ops_has_filter(struct damos *s);
+
+/*
+ * paddr ops callbacks, declared here so paddr-family backends
+ * (e.g. paddr_ibs) can reuse the paddr operation implementations.
+ */
+void damon_pa_prepare_access_checks(struct damon_ctx *ctx);
+unsigned int damon_pa_check_accesses(struct damon_ctx *ctx);
+void damon_pa_apply_probes(struct damon_ctx *ctx);
+unsigned long damon_pa_apply_scheme(struct damon_ctx *ctx,
+		struct damon_target *t, struct damon_region *r,
+		struct damos *scheme, unsigned long *sz_filter_passed);
+int damon_pa_scheme_score(struct damon_ctx *context,
+		struct damon_region *r, struct damos *scheme);
diff --git a/mm/damon/paddr.c b/mm/damon/paddr.c
index fc2154b6221fb..5af4ac2a7ed4d 100644
--- a/mm/damon/paddr.c
+++ b/mm/damon/paddr.c
@@ -124,13 +124,14 @@ static void damon_pa_prepare_access_checks_faults(struct damon_ctx *ctx)
 	}
 }
 
-static void damon_pa_prepare_access_checks(struct damon_ctx *ctx)
+void damon_pa_prepare_access_checks(struct damon_ctx *ctx)
 {
 	if (ctx->sample_control.primitives_enabled.page_table)
 		damon_pa_prepare_access_checks_abit(ctx);
 	if (ctx->sample_control.primitives_enabled.page_fault)
 		damon_pa_prepare_access_checks_faults(ctx);
 }
+EXPORT_SYMBOL_GPL(damon_pa_prepare_access_checks);
 
 static bool damon_pa_young(phys_addr_t paddr, unsigned long *folio_sz)
 {
@@ -168,7 +169,7 @@ static void __damon_pa_check_access(struct damon_region *r,
 	last_addr = sampling_addr;
 }
 
-static unsigned int damon_pa_check_accesses(struct damon_ctx *ctx)
+unsigned int damon_pa_check_accesses(struct damon_ctx *ctx)
 {
 	struct damon_target *t;
 	struct damon_region *r;
@@ -184,6 +185,7 @@ static unsigned int damon_pa_check_accesses(struct damon_ctx *ctx)
 
 	return max_nr_accesses;
 }
+EXPORT_SYMBOL_GPL(damon_pa_check_accesses);
 
 static bool damon_pa_filter_match(struct damon_filter *filter,
 		struct folio *folio)
@@ -234,7 +236,7 @@ static bool damon_pa_filter_pass(phys_addr_t pa, struct folio *folio,
 	return pass;
 }
 
-static void damon_pa_apply_probes(struct damon_ctx *ctx)
+void damon_pa_apply_probes(struct damon_ctx *ctx)
 {
 	struct damon_target *t;
 	struct damon_region *r;
@@ -259,6 +261,7 @@ static void damon_pa_apply_probes(struct damon_ctx *ctx)
 		}
 	}
 }
+EXPORT_SYMBOL_GPL(damon_pa_apply_probes);
 
 /*
  * damos_pa_filter_out - Return true if the page should be filtered out.
@@ -542,7 +545,7 @@ static unsigned long damon_pa_alloc_or_free(
 
 #endif
 
-static unsigned long damon_pa_apply_scheme(struct damon_ctx *ctx,
+unsigned long damon_pa_apply_scheme(struct damon_ctx *ctx,
 		struct damon_target *t, struct damon_region *r,
 		struct damos *scheme, unsigned long *sz_filter_passed)
 {
@@ -574,8 +577,9 @@ static unsigned long damon_pa_apply_scheme(struct damon_ctx *ctx,
 	}
 	return 0;
 }
+EXPORT_SYMBOL_GPL(damon_pa_apply_scheme);
 
-static int damon_pa_scheme_score(struct damon_ctx *context,
+int damon_pa_scheme_score(struct damon_ctx *context,
 		struct damon_region *r, struct damos *scheme)
 {
 	switch (scheme->action) {
@@ -595,6 +599,7 @@ static int damon_pa_scheme_score(struct damon_ctx *context,
 
 	return DAMOS_MAX_SCORE;
 }
+EXPORT_SYMBOL_GPL(damon_pa_scheme_score);
 
 static int __init damon_pa_initcall(void)
 {
-- 
2.43.0

From nobody Mon May 25 08:10:52 2026
From: Ravi Jonnalagadda
To: sj@kernel.org, damon@lists.linux.dev, linux-mm@kvack.org, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org
Cc: akpm@linux-foundation.org, corbet@lwn.net, bijan311@gmail.com, ajayjoshi@micron.com, honggyu.kim@sk.com, yunjeong.mun@sk.com, ravis.opensrc@gmail.com, bharata@amd.com
Subject: [RFC PATCH 3/7] mm/damon/core: replace mutex-protected report buffer with per-CPU lockless ring
Date: Sat, 16 May 2026 15:34:28 -0700
Message-ID: <20260516223439.4033-4-ravis.opensrc@gmail.com>
In-Reply-To: <20260516223439.4033-1-ravis.opensrc@gmail.com>
References: <20260516223439.4033-1-ravis.opensrc@gmail.com>

Replace the mutex-protected fixed-size array
(DAMON_ACCESS_REPORTS_CAP=1000) with a per-CPU lockless ring buffer.
This enables damon_report_access() to be called from NMI context.

Ring design:
- Producer is serialized per CPU: only one in-flight producer per CPU
  is allowed. A per-CPU damon_report_ring_busy counter detects
  NMI-on-process nesting and drops the nested attempt, preserving the
  single-writer invariant on the slot.
- head is advanced by the producer with smp_wmb() before publish.
- tail is advanced by the consumer (kdamond) after the entries[] reads.
- Overflow: sample silently dropped. NMI context is allocation-free and
  access reports are best-effort.
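
The ring rules listed above can be sketched in userspace as follows. This
is a minimal model under stated assumptions, not the patch's code: it uses
C11 atomics in place of the kernel primitives, a release store stands in
for smp_wmb() followed by WRITE_ONCE(), the busy counter is a plain int
rather than a per-CPU variable, and RING_SIZE, struct report,
ring_publish() and ring_drain() are hypothetical names chosen for
illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define RING_SIZE 8u			/* hypothetical; the patch uses 256 */
#define RING_MASK (RING_SIZE - 1u)

_Static_assert((RING_SIZE & RING_MASK) == 0, "ring size must be a power of two");

struct report { unsigned long addr; };	/* stand-in for damon_access_report */

struct ring {
	atomic_uint head;		/* advanced only by the producer */
	atomic_uint tail;		/* advanced only by the consumer */
	int busy;			/* models the per-CPU nesting counter */
	struct report entries[RING_SIZE];
};

/* Returns false when the report is dropped (nested producer or full ring). */
static bool ring_publish(struct ring *r, const struct report *rep)
{
	bool ok = false;
	unsigned int head, next;

	if (++r->busy != 1)		/* nested producer: keep the single-writer rule */
		goto out;

	head = atomic_load_explicit(&r->head, memory_order_relaxed);
	next = (head + 1u) & RING_MASK;
	if (next == atomic_load_explicit(&r->tail, memory_order_acquire))
		goto out;		/* full: drop, never block or allocate */

	r->entries[head] = *rep;
	/* The release store orders the entry write before the head advance. */
	atomic_store_explicit(&r->head, next, memory_order_release);
	ok = true;
out:
	r->busy--;
	return ok;
}

/* Drains everything published so far; returns the number of reports read. */
static unsigned int ring_drain(struct ring *r, struct report *out)
{
	unsigned int n = 0;
	/* The acquire load pairs with the producer's release store to head. */
	unsigned int head = atomic_load_explicit(&r->head, memory_order_acquire);
	unsigned int tail = atomic_load_explicit(&r->tail, memory_order_relaxed);

	while (tail != head) {
		assert(n < RING_SIZE);	/* a ring holds at most RING_SIZE - 1 entries */
		out[n++] = r->entries[tail];
		tail = (tail + 1u) & RING_MASK;
	}
	/* Publish the new tail only after the entries[] reads are done. */
	atomic_store_explicit(&r->tail, tail, memory_order_release);
	return n;
}
```

In this model a false return from ring_publish() is simply a dropped
sample, matching the best-effort rule above. One slot is always left
unused so that head == tail can only mean "empty".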

To keep the producer/consumer pattern scalable on systems with many CPUs
and a high NMI rate, the ring layout follows three rules:

- head, tail and entries[] live on separate cache lines via
  ____cacheline_aligned_in_smp, so producer and consumer do not
  invalidate each other's working set on every advance.
- DAMON_REPORT_RING_SIZE is bounded so the per-CPU footprint stays
  small (256 entries x sizeof(struct damon_access_report) plus head and
  tail cache lines), so that draining all rings during one kdamond tick
  does not evict unrelated data on contemporary server parts.
- A cpumask, damon_rings_pending, is set by the producer after
  publishing and cleared by the consumer per ring drained, so the
  consumer iterates only CPUs with pending entries instead of walking
  every online CPU. An smp_mb__before_atomic() between the head publish
  and the cpumask_set_cpu() ensures observers of the pending bit also
  observe the published head; without it, weakly-ordered architectures
  could let the consumer drain stale head and delay the report. The
  consumer pairs this with an smp_mb__after_atomic() between
  cpumask_clear_cpu() and reading head, so a producer that publishes
  between the consumer's clear and head-read is observed via the bit it
  re-sets rather than silently stranded.

Consumer (kdamond_check_reported_accesses) drains the rings of CPUs in
damon_rings_pending, applying reports to targets.
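
The pending-bit handshake described above can be modelled in userspace
roughly as below. This is a sketch under assumptions, not the kernel
implementation: a 64-bit atomic word stands in for the cpumask,
sequentially consistent read-modify-write operations stand in for the
set/clear plus their explicit smp_mb__*_atomic() barriers, and
NR_CPUS_MODEL, publish() and drain_pending() are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define NR_CPUS_MODEL 4u	/* hypothetical CPU count for the model */
#define RING_SIZE 8u
#define RING_MASK (RING_SIZE - 1u)

struct ring {
	atomic_uint head;	/* producer-owned */
	atomic_uint tail;	/* consumer-owned */
	int entries[RING_SIZE];
};

static struct ring rings[NR_CPUS_MODEL];
static atomic_uint_fast64_t rings_pending;	/* models damon_rings_pending */

/* Producer on @cpu: publish the entry first, then advertise the ring. */
static int publish(unsigned int cpu, int value)
{
	struct ring *r = &rings[cpu];
	unsigned int head, next;

	assert(cpu < NR_CPUS_MODEL);
	head = atomic_load_explicit(&r->head, memory_order_relaxed);
	next = (head + 1u) & RING_MASK;
	if (next == atomic_load_explicit(&r->tail, memory_order_acquire))
		return 0;	/* full: drop */
	r->entries[head] = value;
	atomic_store_explicit(&r->head, next, memory_order_release);
	/* seq_cst RMW: whoever sees the bit also sees the new head. */
	atomic_fetch_or(&rings_pending, UINT64_C(1) << cpu);
	return 1;
}

/* Consumer: visit only rings whose bit is set; returns entries consumed. */
static unsigned int drain_pending(long *sum)
{
	unsigned int n = 0;
	uint_fast64_t mask = atomic_load(&rings_pending);

	for (unsigned int cpu = 0; cpu < NR_CPUS_MODEL; cpu++) {
		struct ring *r = &rings[cpu];
		unsigned int head, tail;

		if (!(mask & (UINT64_C(1) << cpu)))
			continue;
		/*
		 * Clear the bit first, read head second. A producer that
		 * publishes after this clear sets the bit again, so its
		 * entry is picked up by the next drain instead of being lost.
		 */
		atomic_fetch_and(&rings_pending, ~(UINT64_C(1) << cpu));
		head = atomic_load_explicit(&r->head, memory_order_acquire);
		tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
		while (tail != head) {
			*sum += r->entries[tail];
			tail = (tail + 1u) & RING_MASK;
			n++;
		}
		atomic_store_explicit(&r->tail, tail, memory_order_release);
	}
	return n;
}
```

The ordering that matters in this model is clear-then-read on the
consumer side against publish-then-set on the producer side. Reversing
either pair would reopen the window in which a published entry sits in a
ring whose bit is clear.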
Signed-off-by: Ravi Jonnalagadda --- mm/damon/core.c | 143 ++++++++++++++++++++++++++++++++++-------------- 1 file changed, 101 insertions(+), 42 deletions(-) diff --git a/mm/damon/core.c b/mm/damon/core.c index b605d36b29b1a..9ed789e932ebd 100644 --- a/mm/damon/core.c +++ b/mm/damon/core.c @@ -25,7 +25,26 @@ #define CREATE_TRACE_POINTS #include =20 -#define DAMON_ACCESS_REPORTS_CAP 1000 +/* Sized so the per-CPU ring set fits in L3 on typical multi-socket boxes.= */ +#define DAMON_REPORT_RING_SIZE 256 +#define DAMON_REPORT_RING_MASK (DAMON_REPORT_RING_SIZE - 1) + +struct damon_report_ring { + unsigned int head; /* written by producer (NMI) */ + unsigned int tail /* written by consumer (kdamond) */ + ____cacheline_aligned_in_smp; + struct damon_access_report entries[DAMON_REPORT_RING_SIZE] + ____cacheline_aligned_in_smp; +}; + +static DEFINE_PER_CPU(struct damon_report_ring, damon_report_rings); +static DEFINE_PER_CPU(int, damon_report_ring_busy); +/* + * Per-CPU bitmap: producer (NMI) sets after publishing a report; + * consumer (kdamond) clears before draining the corresponding ring. + * Hot-write under sampling load - do NOT mark __read_mostly. + */ +static cpumask_t damon_rings_pending; =20 static DEFINE_MUTEX(damon_lock); static int nr_running_ctxs; @@ -36,10 +55,6 @@ static struct damon_operations damon_registered_ops[NR_D= AMON_OPS]; =20 static struct kmem_cache *damon_region_cache __ro_after_init; =20 -static DEFINE_MUTEX(damon_access_reports_lock); -static struct damon_access_report damon_access_reports[ - DAMON_ACCESS_REPORTS_CAP]; -static int damon_access_reports_len; =20 /* Should be called under damon_ops_lock with id smaller than NR_DAMON_OPS= */ static bool __damon_is_registered_ops(enum damon_ops_id id) @@ -2127,33 +2142,56 @@ int damos_walk(struct damon_ctx *ctx, struct damos_= walk_control *control) } =20 /** - * damon_report_access() - Report identified access events to DAMON. - * @report: The reporting access information. 
+ * damon_report_access() - Report a hardware-observed memory access. + * @report: pointer to a filled damon_access_report struct. * - * Report access events to DAMON. - * - * Context: May sleep. - * - * NOTE: we may be able to implement this as a lockless queue, and allow a= ny - * context. As the overhead is unknown, and region-based DAMON logics wou= ld - * guarantee the reports would be not made that frequently, let's start wi= th - * this simple implementation. + * Context: NMI-safe. No sleeping, no allocation, no locks. */ void damon_report_access(struct damon_access_report *report) { - struct damon_access_report *dst; + struct damon_report_ring *ring; + unsigned int head, next; =20 - /* silently fail for races */ - if (!mutex_trylock(&damon_access_reports_lock)) - return; - dst =3D &damon_access_reports[damon_access_reports_len++]; - /* just drop all existing reports in favor of simplicity. */ - if (damon_access_reports_len =3D=3D DAMON_ACCESS_REPORTS_CAP) - damon_access_reports_len =3D 0; - *dst =3D *report; - dst->report_jiffies =3D jiffies; - mutex_unlock(&damon_access_reports_lock); + /* Pin to a CPU so the SPSC invariant holds for preemptible callers. */ + preempt_disable(); + /* + * NMI nesting on the same CPU as a process-context producer would + * stomp the same entries[head] slot. Detect and drop instead. + */ + if (this_cpu_inc_return(damon_report_ring_busy) !=3D 1) { + /* NMI nested on a process-context producer; drop. */ + goto out; + } + + ring =3D this_cpu_ptr(&damon_report_rings); + head =3D ring->head; + next =3D (head + 1) & DAMON_REPORT_RING_MASK; + + if (next =3D=3D READ_ONCE(ring->tail)) { + /* Ring full; consumer is behind, drop the report. 
*/
+		goto out;
+	}
+
+	ring->entries[head] = *report;
+	ring->entries[head].report_jiffies = jiffies;
+	smp_wmb(); /* ensure entry visible before head advance */
+	WRITE_ONCE(ring->head, next);
+	/*
+	 * Order the head advance before publishing the pending bit
+	 * so that the consumer, on observing the bit, is also
+	 * guaranteed to observe the new head. set_bit/cpumask_set_cpu
+	 * are documented as unordered RMW (atomic_bitops.txt), hence
+	 * the explicit barrier; without it, a weakly-ordered arch
+	 * could let the consumer drain stale head, clear the bit, and
+	 * delay the report until the next producer sets the bit again.
+	 */
+	smp_mb__before_atomic();
+	cpumask_set_cpu(smp_processor_id(), &damon_rings_pending);
+out:
+	this_cpu_dec(damon_report_ring_busy);
+	preempt_enable();
 }
+EXPORT_SYMBOL_GPL(damon_report_access);
 
 #ifdef CONFIG_MMU
 void damon_report_page_fault(struct vm_fault *vmf, bool huge_pmd)
@@ -3814,26 +3852,47 @@ static unsigned int kdamond_apply_zero_access_report(struct damon_ctx *ctx)
 
 static unsigned int kdamond_check_reported_accesses(struct damon_ctx *ctx)
 {
-	int i;
-	struct damon_access_report *report;
+	int cpu;
 	struct damon_target *t;
 
-	/* currently damon_access_report supports only physical address */
-	if (damon_target_has_pid(ctx))
-		return 0;
+	for_each_cpu(cpu, &damon_rings_pending) {
+		struct damon_report_ring *ring =
+			per_cpu_ptr(&damon_report_rings, cpu);
+		unsigned int head, tail;
 
-	mutex_lock(&damon_access_reports_lock);
-	for (i = 0; i < damon_access_reports_len; i++) {
-		report = &damon_access_reports[i];
-		if (time_before(report->report_jiffies,
-					jiffies -
-					usecs_to_jiffies(
-						ctx->attrs.sample_interval)))
-			continue;
-		damon_for_each_target(t, ctx)
-			kdamond_apply_access_report(report, t, ctx);
+		cpumask_clear_cpu(cpu, &damon_rings_pending);
+		/*
+		 * Pair with the producer's smp_mb__before_atomic() between
+		 * the head publish and cpumask_set_cpu(): order the bit
+		 * clear before the head read so that a producer publishing
+		 * between our clear and our READ_ONCE(head) is observed via
+		 * the bit it re-sets, not lost as a stale-head drain.
+		 */
+		smp_mb__after_atomic();
+		head = READ_ONCE(ring->head);
+		/*
+		 * Pair with smp_wmb in damon_report_access(): the entry
+		 * data published before the producer advanced head must be
+		 * visible to the entries[] reads inside the loop below.
+		 */
+		smp_rmb();
+		tail = ring->tail;
+
+		while (tail != head) {
+			struct damon_access_report *report =
+				&ring->entries[tail];
+
+			if (!time_before(report->report_jiffies,
+					jiffies - usecs_to_jiffies(
+					ctx->attrs.sample_interval))) {
+				damon_for_each_target(t, ctx)
+					kdamond_apply_access_report(
+							report, t, ctx);
+			}
+			tail = (tail + 1) & DAMON_REPORT_RING_MASK;
+		}
+		WRITE_ONCE(ring->tail, tail);
 	}
-	mutex_unlock(&damon_access_reports_lock);
 	/* For nr_accesses_bp, absence of access should also be reported. */
 	return kdamond_apply_zero_access_report(ctx);
 }
-- 
2.43.0

From nobody Mon May 25 08:10:52 2026
From: Ravi Jonnalagadda
To: sj@kernel.org, damon@lists.linux.dev, linux-mm@kvack.org, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org
Cc: akpm@linux-foundation.org, corbet@lwn.net, bijan311@gmail.com, ajayjoshi@micron.com, honggyu.kim@sk.com, yunjeong.mun@sk.com, ravis.opensrc@gmail.com, bharata@amd.com
Subject: [RFC PATCH 4/7] mm/damon/core: flat-array snapshot + bsearch in ring-drain loop
Date: Sat, 16 May 2026 15:34:29 -0700
Message-ID: <20260516223439.4033-5-ravis.opensrc@gmail.com>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20260516223439.4033-1-ravis.opensrc@gmail.com>
References: <20260516223439.4033-1-ravis.opensrc@gmail.com>

The drain loop is O(reports x regions) when matching each ring entry
back to a region. At sufficiently large CPU x region products the
linear scan exceeds the sample interval, producing unbounded backlog.

At drain start, snapshot each target's regions into a flat array
(struct damon_target_lookup), already sorted by ascending r->ar.start.
Replace the linear lookup with binary search over the snapshot,
reducing the drain to O(reports x log2(regions)).

Reject reports that straddle a region boundary
(addr + report->size > r->ar.end) so partial-region accesses are not
credited to the lower region.

If the snapshot allocation fails under memory pressure, emit a
rate-limited warning and fall through to zero-access reporting; the
next tick retries.

Hoist the per-report damon_sample_filter_out() check out of the
per-target loop so it runs once per ring entry rather than N times.
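
For illustration only (not part of the patch), the lookup and the
straddle rejection described above can be sketched in user-space C
roughly as follows. struct addr_range, range_lookup() and demo_ranges
are hypothetical stand-ins; the kernel code walks an array of
struct damon_region pointers instead, and this sketch assumes
addr + size does not wrap around.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a region's [start, end) address pair. */
struct addr_range {
	unsigned long start;	/* inclusive */
	unsigned long end;	/* exclusive */
};

/*
 * Return the index of the range that fully contains [addr, addr + size),
 * or -1 if no range contains addr or the access straddles the range end.
 * Ranges must be sorted by ascending start and must not overlap.
 */
static int range_lookup(const struct addr_range *ranges, size_t nr,
			unsigned long addr, unsigned long size)
{
	size_t left = 0, right = nr;	/* search window is [left, right) */

	assert(ranges != NULL || nr == 0);

	while (left < right) {
		/* Midpoint form that cannot overflow for large nr. */
		size_t mid = left + (right - left) / 2;

		if (addr < ranges[mid].start)
			right = mid;
		else if (addr >= ranges[mid].end)
			left = mid + 1;
		else if (addr + size > ranges[mid].end)
			return -1;	/* straddles the boundary: reject */
		else
			return (int)mid;
	}
	return -1;
}

/* Sample table used only to exercise the sketch. */
static const struct addr_range demo_ranges[] = {
	{ 0x1000, 0x2000 },
	{ 0x2000, 0x3000 },
	{ 0x5000, 0x6000 },
};
```

With the sample table, an 8-byte access at 0x1ffc is rejected because
it crosses from the first range into the second, while the same access
at 0x2000 resolves to the second range. The half-open [left, right)
window is a choice made for this sketch so that an empty table needs
no special case.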

Signed-off-by: Ravi Jonnalagadda
---
 include/linux/damon.h |   7 +++
 mm/damon/core.c       | 137 ++++++++++++++++++++++++++++++++++++------
 2 files changed, 126 insertions(+), 18 deletions(-)

diff --git a/include/linux/damon.h b/include/linux/damon.h
index 8e6e1cd89e551..35cc3d42fcba8 100644
--- a/include/linux/damon.h
+++ b/include/linux/damon.h
@@ -1045,6 +1045,13 @@ struct damon_ctx {
 
 	/* Per-ctx PRNG state for damon_rand(); kdamond is the sole consumer. */
 	struct rnd_state rnd_state;
+	/* Reusable drain-loop snapshot buffer (avoids per-tick kmalloc) */
+	struct {
+		struct damon_target_lookup *lookups;
+		unsigned int nr_lookups;
+		struct damon_region **region_buf;
+		unsigned int region_buf_cap;
+	} drain_snapshot;
 };
 
 /* Get a random number in [@l, @r) using @ctx's lockless PRNG. */
diff --git a/mm/damon/core.c b/mm/damon/core.c
index 9ed789e932ebd..03f9c671e8bc9 100644
--- a/mm/damon/core.c
+++ b/mm/damon/core.c
@@ -29,6 +29,13 @@
 #define DAMON_REPORT_RING_SIZE 256
 #define DAMON_REPORT_RING_MASK (DAMON_REPORT_RING_SIZE - 1)
 
+/* Per-target region lookup for drain loop */
+struct damon_target_lookup {
+	struct damon_region **regions;
+	unsigned int nr_regions;
+};
+
+
 struct damon_report_ring {
 	unsigned int head; /* written by producer (NMI) */
 	unsigned int tail; /* written by consumer (kdamond) */
@@ -855,6 +862,7 @@ struct damon_ctx *damon_new_ctx(void)
 	INIT_LIST_HEAD(&ctx->schemes);
 
 	prandom_seed_state(&ctx->rnd_state, get_random_u64());
+	/* drain_snapshot zero-initialized by kzalloc — no explicit init */
 
 	return ctx;
 }
@@ -884,6 +892,8 @@ void damon_destroy_ctx(struct damon_ctx *ctx)
 	damon_for_each_sample_filter_safe(f, next_f, &ctx->sample_control)
 		damon_destroy_sample_filter(f, &ctx->sample_control);
 
+	kfree(ctx->drain_snapshot.lookups);
+	kfree(ctx->drain_snapshot.region_buf);
 	module_put(ctx->ops.owner);
 	kfree(ctx);
 }
@@ -3806,27 +3816,44 @@ static bool damon_sample_filter_out(struct damon_access_report *report,
 	return !filter->allow;
 }
 
+
 static void kdamond_apply_access_report(struct damon_access_report *report,
-		struct damon_target *t, struct damon_ctx *ctx)
+		struct damon_region **regions, unsigned int nr_regions,
+		struct damon_ctx *ctx)
 {
 	struct damon_region *r;
 	unsigned long addr;
+	int left, right, mid;
 
-	if (damon_sample_filter_out(report, &ctx->sample_control))
-		return;
 	if (damon_target_has_pid(ctx))
 		addr = report->vaddr;
 	else
 		addr = report->paddr;
 
-	/* todo: make search faster, e.g., binary search? */
-	damon_for_each_region(r, t) {
-		if (addr < r->ar.start)
-			continue;
-		if (r->ar.end < addr + report->size)
-			continue;
-		if (!r->access_reported)
-			damon_update_region_access_rate(r, true, &ctx->attrs);
+	/* Binary search the snapshot for the region containing addr. */
+	left = 0;
+	right = nr_regions - 1;
+	r = NULL;
+	while (left <= right) {
+		/* Avoid (left + right) overflow at large nr_regions. */
+		mid = left + (right - left) / 2;
+		if (addr < regions[mid]->ar.start)
+			right = mid - 1;
+		else if (addr >= regions[mid]->ar.end)
+			left = mid + 1;
+		else {
+			r = regions[mid];
+			break;
+		}
+	}
+
+	if (!r)
+		return;
+	/* Reject reports straddling a region boundary. */
+	if (addr + report->size > r->ar.end)
+		return;
+	if (!r->access_reported) {
+		damon_update_region_access_rate(r, true, &ctx->attrs);
 		r->access_reported = true;
 	}
 }
@@ -3850,10 +3877,79 @@ static unsigned int kdamond_apply_zero_access_report(struct damon_ctx *ctx)
 	return max_nr_accesses;
 }
 
+/*
+ * Build a snapshot of the ctx's targets and their region arrays for
+ * use by the ring drain loop.
+ *
+ * The two-pass walk over adaptive_targets is safe even though
+ * krealloc_array() may sleep: target list mutation is funneled
+ * through damon_call onto the kdamond itself, so no other thread
+ * can mutate the list while kdamond is running this function.
+ */
+static struct damon_target_lookup *damon_build_target_lookup(
+		struct damon_ctx *ctx, unsigned int *nr_targets_out)
+{
+	struct damon_target *t;
+	struct damon_target_lookup *tbl;
+	unsigned int nr_targets = 0, total_regions = 0, ti = 0, ri = 0;
+
+	damon_for_each_target(t, ctx) {
+		nr_targets++;
+		total_regions += damon_nr_regions(t);
+	}
+
+	/* Realloc lookups array if needed */
+	if (nr_targets > ctx->drain_snapshot.nr_lookups) {
+		tbl = krealloc_array(ctx->drain_snapshot.lookups,
+				nr_targets, sizeof(*tbl), GFP_KERNEL);
+		if (!tbl)
+			return NULL;
+		ctx->drain_snapshot.lookups = tbl;
+		ctx->drain_snapshot.nr_lookups = nr_targets;
+	}
+	tbl = ctx->drain_snapshot.lookups;
+
+	/* Realloc contiguous region_buf if needed */
+	if (total_regions > ctx->drain_snapshot.region_buf_cap) {
+		struct damon_region **buf;
+
+		buf = krealloc_array(ctx->drain_snapshot.region_buf,
+				total_regions, sizeof(*buf), GFP_KERNEL);
+		if (!buf)
+			return NULL;
+		ctx->drain_snapshot.region_buf = buf;
+		ctx->drain_snapshot.region_buf_cap = total_regions;
+	}
+
+	/* Fill lookup table, slicing region_buf across targets */
+	ri = 0;
+	damon_for_each_target(t, ctx) {
+		struct damon_region *r;
+
+		tbl[ti].regions = &ctx->drain_snapshot.region_buf[ri];
+		tbl[ti].nr_regions = damon_nr_regions(t);
+		damon_for_each_region(r, t)
+			ctx->drain_snapshot.region_buf[ri++] = r;
+		ti++;
+	}
+
+	*nr_targets_out = nr_targets;
+	return tbl;
+}
+
 static unsigned int kdamond_check_reported_accesses(struct damon_ctx *ctx)
 {
 	int cpu;
-	struct damon_target *t;
+	struct damon_target_lookup *tbl;
+	unsigned int nr_targets = 0;
+	unsigned int i;
+
+	tbl = damon_build_target_lookup(ctx, &nr_targets);
+	if (!tbl) {
+		pr_warn_ratelimited(
+			"damon: target-lookup alloc failed; ring drain skipped this tick\n");
+		return kdamond_apply_zero_access_report(ctx);
+	}
 
 	for_each_cpu(cpu, &damon_rings_pending) {
 		struct damon_report_ring *ring =
@@ -3881,14 +3977,19 @@ static unsigned int kdamond_check_reported_accesses(struct damon_ctx *ctx)
 		while (tail != head) {
 			struct damon_access_report *report =
 				&ring->entries[tail];
-
-			if (!time_before(report->report_jiffies,
+			if (time_before(report->report_jiffies,
 					jiffies - usecs_to_jiffies(
-					ctx->attrs.sample_interval))) {
-				damon_for_each_target(t, ctx)
-					kdamond_apply_access_report(
-							report, t, ctx);
+					ctx->attrs.sample_interval)))
+				goto next;
+			if (damon_sample_filter_out(report,
+					&ctx->sample_control))
+				goto next;
+			for (i = 0; i < nr_targets; i++) {
+				kdamond_apply_access_report(report,
+						tbl[i].regions,
+						tbl[i].nr_regions, ctx);
 			}
+next:
 			tail = (tail + 1) & DAMON_REPORT_RING_MASK;
 		}
 		WRITE_ONCE(ring->tail, tail);
-- 
2.43.0

From nobody Mon May 25 08:10:52 2026
From: Ravi Jonnalagadda
To: sj@kernel.org, damon@lists.linux.dev, linux-mm@kvack.org, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org
Cc: akpm@linux-foundation.org, corbet@lwn.net, bijan311@gmail.com, ajayjoshi@micron.com, honggyu.kim@sk.com, yunjeong.mun@sk.com, ravis.opensrc@gmail.com, bharata@amd.com
Subject: [RFC PATCH 5/7] mm/damon: add sysfs binding and dispatch hookup for paddr_ibs operations
Date: Sat, 16 May 2026 15:34:30 -0700
Message-ID: <20260516223439.4033-6-ravis.opensrc@gmail.com>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20260516223439.4033-1-ravis.opensrc@gmail.com>
References: <20260516223439.4033-1-ravis.opensrc@gmail.com>

Extend the damon_ops_id enum to include DAMON_OPS_PADDR_IBS and add
the corresponding 'paddr_ibs' name to the sysfs ops_names array so
users can select AMD IBS-based PA-mode monitoring via
/sys/kernel/mm/damon/admin/kdamonds//contexts//operations.

Dispatch ops that report accesses via the hardware-sampling ring
(currently only DAMON_OPS_PADDR_IBS) to the existing
kdamond_check_reported_accesses() drain path used for page-fault
reports. A small helper, damon_ops_is_hw_hotness(), centralises the
classification so any future paddr-family backend that also reports
through the ring just adds a case here.

This routing is bound to ops.id rather than to a separate per-context
flag. A flag in damon_sample_control would have to be set by the ops
.init callback after damon_select_ops() and would then need to be
preserved by damon_commit_sample_control() across sysfs commits;
deriving from ops.id avoids both pitfalls.
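
For illustration only (not part of the patch), the dispatch rule
described above can be modelled in user-space C as below. enum ops_id,
enum check_path and pick_check_path() are hypothetical names for this
sketch; the real decision is made inline in kdamond_fn() on
ctx->ops.id and the sample_control state.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical mirror of enum damon_ops_id after this patch. */
enum ops_id { OPS_VADDR, OPS_FVADDR, OPS_PADDR, OPS_PADDR_IBS, NR_OPS };

/* Which access-check path a sampling tick would take. */
enum check_path { PATH_RING_DRAIN, PATH_OPS_CALLBACK, PATH_NONE };

static_assert(NR_OPS == 4, "sketch assumes four ops ids");

/* True when the ops reports samples through the shared ring. */
static bool ops_is_hw_hotness(enum ops_id id)
{
	return id == OPS_PADDR_IBS;
}

/*
 * The ring drain wins if page-fault reporting is enabled or the ops
 * is ring-backed; otherwise fall back to the ops' own callback.
 */
static enum check_path pick_check_path(enum ops_id id,
		bool page_fault_enabled, bool has_check_accesses)
{
	if (page_fault_enabled || ops_is_hw_hotness(id))
		return PATH_RING_DRAIN;
	if (has_check_accesses)
		return PATH_OPS_CALLBACK;
	return PATH_NONE;
}
```

Because the classification depends only on the ops id, there is no
per-context state that a later commit could forget to copy, which is
the property the paragraph above argues for.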

Signed-off-by: Ravi Jonnalagadda
---
 include/linux/damon.h |  2 ++
 mm/damon/core.c       | 13 ++++++++++++-
 mm/damon/sysfs.c      | 12 +++++++++---
 3 files changed, 23 insertions(+), 4 deletions(-)

diff --git a/include/linux/damon.h b/include/linux/damon.h
index 35cc3d42fcba8..16da528845d03 100644
--- a/include/linux/damon.h
+++ b/include/linux/damon.h
@@ -669,12 +669,14 @@ struct damos {
  * @DAMON_OPS_FVADDR:	Monitoring operations for only fixed ranges of virtual
  *			address spaces
  * @DAMON_OPS_PADDR:	Monitoring operations for the physical address space
+ * @DAMON_OPS_PADDR_IBS:	AMD IBS-based PA-mode monitoring
  * @NR_DAMON_OPS:	Number of monitoring operations implementations
  */
 enum damon_ops_id {
 	DAMON_OPS_VADDR,
 	DAMON_OPS_FVADDR,
 	DAMON_OPS_PADDR,
+	DAMON_OPS_PADDR_IBS,
 	NR_DAMON_OPS,
 };
 
diff --git a/mm/damon/core.c b/mm/damon/core.c
index 03f9c671e8bc9..2aa031cbc70b7 100644
--- a/mm/damon/core.c
+++ b/mm/damon/core.c
@@ -73,6 +73,16 @@ static bool __damon_is_registered_ops(enum damon_ops_id id)
 	return true;
 }
 
+/*
+ * Returns true if the given ops id reports access samples through the
+ * hardware-sampling ring-buffer drain path (rather than its own
+ * .check_accesses callback).
+ */
+static bool damon_ops_is_hw_hotness(enum damon_ops_id id)
+{
+	return id == DAMON_OPS_PADDR_IBS;
+}
+
 /**
  * damon_is_registered_ops() - Check if a given damon_operations is registered.
  * @id:	Id of the damon_operations to check if registered.
@@ -4048,7 +4058,8 @@ static int kdamond_fn(void *data)
 		ctx->passed_sample_intervals++;
 
 		/* todo: make these non-exclusive */
-		if (ctx->sample_control.primitives_enabled.page_fault)
+		if (ctx->sample_control.primitives_enabled.page_fault ||
+				damon_ops_is_hw_hotness(ctx->ops.id))
 			max_nr_accesses = kdamond_check_reported_accesses(ctx);
 		else if (ctx->ops.check_accesses)
 			max_nr_accesses = ctx->ops.check_accesses(ctx);
diff --git a/mm/damon/sysfs.c b/mm/damon/sysfs.c
index fc7256e522a69..261ccf0c61846 100644
--- a/mm/damon/sysfs.c
+++ b/mm/damon/sysfs.c
@@ -1388,6 +1388,10 @@ static const struct damon_sysfs_ops_name damon_sysfs_ops_names[] = {
 		.ops_id = DAMON_OPS_PADDR,
 		.name = "paddr",
 	},
+	{
+		.ops_id = DAMON_OPS_PADDR_IBS,
+		.name = "paddr_ibs",
+	},
 };
 
 struct damon_sysfs_context {
@@ -2023,7 +2027,8 @@ static int damon_sysfs_add_targets(struct damon_ctx *ctx,
 	int i, err;
 
 	/* Multiple physical address space monitoring targets makes no sense */
-	if (ctx->ops.id == DAMON_OPS_PADDR && sysfs_targets->nr > 1)
+	if ((ctx->ops.id == DAMON_OPS_PADDR ||
+	     ctx->ops.id == DAMON_OPS_PADDR_IBS) && sysfs_targets->nr > 1)
 		return -EINVAL;
 
 	for (i = 0; i < sysfs_targets->nr; i++) {
@@ -2072,8 +2077,9 @@ static int damon_sysfs_apply_inputs(struct damon_ctx *ctx,
 	if (err)
 		return err;
 	ctx->addr_unit = sys_ctx->addr_unit;
-	/* addr_unit is respected by only DAMON_OPS_PADDR */
-	if (sys_ctx->ops_id == DAMON_OPS_PADDR)
+	/* addr_unit is respected by only paddr-family ops */
+	if (sys_ctx->ops_id == DAMON_OPS_PADDR ||
+	    sys_ctx->ops_id == DAMON_OPS_PADDR_IBS)
 		ctx->min_region_sz = max(
 				DAMON_MIN_REGION_SZ / sys_ctx->addr_unit, 1);
 	ctx->pause = sys_ctx->pause;
-- 
2.43.0

From nobody Mon May 25 08:10:52 2026
From: Ravi Jonnalagadda
To: sj@kernel.org, damon@lists.linux.dev, linux-mm@kvack.org, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org
Cc: akpm@linux-foundation.org, corbet@lwn.net, bijan311@gmail.com, ajayjoshi@micron.com, honggyu.kim@sk.com, yunjeong.mun@sk.com, ravis.opensrc@gmail.com, bharata@amd.com
Subject: [RFC PATCH 6/7] mm/damon/core: accept paddr_ibs in node_eligible_mem_bp ops check
Date: Sat, 16 May 2026 15:34:31 -0700
Message-ID: <20260516223439.4033-7-ravis.opensrc@gmail.com>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20260516223439.4033-1-ravis.opensrc@gmail.com>
References: <20260516223439.4033-1-ravis.opensrc@gmail.com>

damos_get_node_eligible_mem_bp() and the damon_commit_ctx() validation
path reject any ops.id != DAMON_OPS_PADDR, so paddr_ibs always gets 0
from the node-eligible helper. As a result, the quota control loop
runs open-loop (esz doubles every tick) when the paddr_ibs backend is
used with a node_eligible_mem_bp goal.

Introduce damon_ops_id_is_paddr_family() and use it at both sites so
DAMON_OPS_PADDR_IBS is accepted alongside DAMON_OPS_PADDR. The helper
also gives any future paddr-family backend a single line to extend.
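
For illustration only (not part of the patch), the effect of the gate
can be modelled in user-space C as below. enum ops_id and
eligible_mem_bp() are hypothetical, and the basis-point ratio is an
assumed shape for the metric rather than the kernel's exact
computation.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical mirror of enum damon_ops_id after this series. */
enum ops_id { OPS_VADDR, OPS_FVADDR, OPS_PADDR, OPS_PADDR_IBS };

/* True for ops that monitor physical-address regions. */
static bool ops_id_is_paddr_family(enum ops_id id)
{
	return id == OPS_PADDR || id == OPS_PADDR_IBS;
}

/*
 * Assumed shape of the metric: the share of eligible bytes that sit
 * on one node, in basis points (1/10000). Ops outside the paddr
 * family always get 0, which leaves the goal without any feedback.
 */
static unsigned long eligible_mem_bp(enum ops_id id,
		unsigned long long node_eligible,
		unsigned long long total_eligible)
{
	if (!ops_id_is_paddr_family(id))
		return 0;
	if (!total_eligible)
		return 0;
	assert(node_eligible <= total_eligible);
	return (unsigned long)(node_eligible * 10000ULL / total_eligible);
}
```

With the family helper in the gate, a paddr_ibs context with a
quarter of its eligible memory on the node reports 2500 rather than a
constant 0, so a goal tuned against this metric has a real signal to
converge on.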

Signed-off-by: Ravi Jonnalagadda
---
 mm/damon/core.c | 16 +++++++++++++---
 1 file changed, 13 insertions(+), 3 deletions(-)

diff --git a/mm/damon/core.c b/mm/damon/core.c
index 2aa031cbc70b7..1e52161f4c015 100644
--- a/mm/damon/core.c
+++ b/mm/damon/core.c
@@ -83,6 +83,16 @@ static bool damon_ops_is_hw_hotness(enum damon_ops_id id)
 	return id == DAMON_OPS_PADDR_IBS;
 }
 
+/*
+ * Returns true if the ops id treats the monitoring target as a
+ * physical-address region (no per-task PID). Used by paddr-only
+ * gates such as node_eligible_mem_bp.
+ */
+static bool damon_ops_id_is_paddr_family(enum damon_ops_id id)
+{
+	return id == DAMON_OPS_PADDR || id == DAMON_OPS_PADDR_IBS;
+}
+
 /**
  * damon_is_registered_ops() - Check if a given damon_operations is registered.
  * @id:	Id of the damon_operations to check if registered.
@@ -1787,8 +1797,8 @@ int damon_commit_ctx(struct damon_ctx *dst, struct damon_ctx *src)
 	if (!is_power_of_2(src->min_region_sz))
 		return -EINVAL;
 
-	/* node_eligible_mem_bp metric requires PADDR ops */
-	if (src->ops.id != DAMON_OPS_PADDR) {
+	/* node_eligible_mem_bp metric requires PADDR-family ops */
+	if (!damon_ops_id_is_paddr_family(src->ops.id)) {
 		damon_for_each_scheme(scheme, src) {
 			struct damos_quota *quota = &scheme->quota;
 
@@ -3041,7 +3051,7 @@ static unsigned long damos_get_node_eligible_mem_bp(struct damon_ctx *c,
 	phys_addr_t total_eligible = 0;
 	phys_addr_t node_eligible;
 
-	if (c->ops.id != DAMON_OPS_PADDR)
+	if (!damon_ops_id_is_paddr_family(c->ops.id))
 		return 0;
 
 	if (nid < 0 || nid >= MAX_NUMNODES || !node_online(nid))
-- 
2.43.0

From nobody Mon May 25 08:10:52 2026
[23.116.43.216]) by smtp.gmail.com with ESMTPSA id 00721157ae682-7cc9bc0db09sm690837b3.27.2026.05.16.15.34.55 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 16 May 2026 15:34:56 -0700 (PDT) From: Ravi Jonnalagadda To: sj@kernel.org, damon@lists.linux.dev, linux-mm@kvack.org, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org Cc: akpm@linux-foundation.org, corbet@lwn.net, bijan311@gmail.com, ajayjoshi@micron.com, honggyu.kim@sk.com, yunjeong.mun@sk.com, ravis.opensrc@gmail.com, bharata@amd.com Subject: [RFC PATCH 7/7] mm/damon/damon_ibs: add AMD IBS-based access sampling backend Date: Sat, 16 May 2026 15:34:32 -0700 Message-ID: <20260516223439.4033-8-ravis.opensrc@gmail.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20260516223439.4033-1-ravis.opensrc@gmail.com> References: <20260516223439.4033-1-ravis.opensrc@gmail.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Add paddr_ibs operations using AMD IBS Op sampling via perf_event_create_kernel_counter(). IBS delivers physical-address- keyed access reports to DAMON's shared-layer ring buffer (damon_report_access()), without dependency on PTE Accessed-bit scanning or page faults. Per-CPU IBS events are created and torn down via cpuhp notifiers (CPUHP_AP_ONLINE_DYN). Routing of access reports through the ring- drain path is bound to ops.id =3D=3D DAMON_OPS_PADDR_IBS at the dispatch site (see "mm/damon: add sysfs binding and dispatch hookup for paddr_ibs operations"), so .init does not need to flip a per-context flag. Sample-time discipline: - PERF_SAMPLE_PHYS_ADDR is requested in attr.sample_type, but the IBS perf driver only fills data->phys_addr when IBS_OP_DATA3.dc_phy_addr_valid is set. Skip stale-PA samples by inspecting data->sample_flags rather than testing phys_addr for zero (which would also drop legitimate page 0). 
- PERF_SAMPLE_DATA_SRC is requested so the perf driver decodes
  IBS_OP_DATA3.{ld_op,st_op} into data->data_src.mem_op; the backend
  reports load vs store accordingly via damon_access_report.is_write.

Module parameters:
- max_cnt: IBS Op MaxCnt (writable; writes call
  damon_ibs_set_sample_rate() to restart sampling at the new rate)
- samples_total / samples_filtered: per-CPU-aggregated counters
  (read-only)

Source file is mm/damon/damon_ibs.c (renamed from mm/damon/ibs.c) so
the resulting module is loaded as damon_ibs.ko, avoiding the generic
"ibs" namespace.

The IBS sampling approach is derived from Bharata B Rao's pghot RFC v5
series; the attribution header is in damon_ibs.c.

Suggested-by: Bharata B Rao
Link: https://lore.kernel.org/linux-mm/20260129144043.231636-1-bharata@amd.com/
Signed-off-by: Ravi Jonnalagadda
---
 mm/damon/Kconfig     |  10 ++
 mm/damon/Makefile    |   1 +
 mm/damon/damon_ibs.c | 369 +++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 380 insertions(+)
 create mode 100644 mm/damon/damon_ibs.c

diff --git a/mm/damon/Kconfig b/mm/damon/Kconfig
index ad629f0f31d8d..bb698d2717f34 100644
--- a/mm/damon/Kconfig
+++ b/mm/damon/Kconfig
@@ -131,4 +131,14 @@ config DAMON_ACMA
 	  min/max memory for the system and maximum memory pressure stall time
 	  ratio.
 
+config DAMON_IBS
+	tristate "AMD IBS-based access sampling for DAMON"
+	depends on DAMON_PADDR && CPU_SUP_AMD && X86_64 && PERF_EVENTS
+	help
+	  Uses AMD IBS (Instruction-Based Sampling) hardware to deliver
+	  physical-address-keyed access reports to DAMON's shared-layer
+	  ring buffer, without relying on PTE Accessed-bit scanning or
+	  page faults.  Registers as the "paddr_ibs" operations set.
+	  Requires AMD processors with IBS Op support.
+
 endmenu
diff --git a/mm/damon/Makefile b/mm/damon/Makefile
index 22494754f41e8..109d0fb1db97d 100644
--- a/mm/damon/Makefile
+++ b/mm/damon/Makefile
@@ -9,3 +9,4 @@ obj-$(CONFIG_DAMON_RECLAIM)	+= modules-common.o reclaim.o
 obj-$(CONFIG_DAMON_LRU_SORT)	+= modules-common.o lru_sort.o
 obj-$(CONFIG_DAMON_STAT)	+= modules-common.o stat.o
 obj-$(CONFIG_DAMON_ACMA)	+= modules-common.o acma.o
+obj-$(CONFIG_DAMON_IBS)	+= damon_ibs.o
diff --git a/mm/damon/damon_ibs.c b/mm/damon/damon_ibs.c
new file mode 100644
index 0000000000000..1dd99e91c3928
--- /dev/null
+++ b/mm/damon/damon_ibs.c
@@ -0,0 +1,369 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * DAMON IBS (Instruction-Based Sampling) backend for AMD processors.
+ *
+ * Uses AMD IBS Op sampling via the perf kernel counter infrastructure to
+ * deliver PA-keyed access reports to DAMON's shared-layer ring buffer
+ * (see damon_report_access()).  This enables physical-address hot-page
+ * detection without relying on page-table Accessed bits or page faults.
+ *
+ * The IBS sampling approach in this file derives from concepts in
+ * Bharata B Rao's pghot RFC v5 series for hot page tracking.
+ * See: https://lore.kernel.org/linux-mm/20260129144043.231636-1-bharata@amd.com/
+ *
+ * Author: Ravi Jonnalagadda
+ */
+
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+#include
+#include "ops-common.h"
+
+#define DAMON_IBS_DEFAULT_MAX_CNT	262144	/* ~4K samples/sec/core */
+#define IBS_OP_PMU_TYPE_PATH	"/sys/bus/event_source/devices/ibs_op/type"
+
+static unsigned int damon_ibs_max_cnt = DAMON_IBS_DEFAULT_MAX_CNT;
+
+static int damon_ibs_set_sample_rate(unsigned int max_cnt);
+
+static int max_cnt_set(const char *val, const struct kernel_param *kp)
+{
+	unsigned int new_cnt;
+	int ret;
+
+	ret = kstrtouint(val, 0, &new_cnt);
+	if (ret)
+		return ret;
+	if (!new_cnt)
+		return -EINVAL;
+	return damon_ibs_set_sample_rate(new_cnt);
+}
+static const struct kernel_param_ops max_cnt_ops = {
+	.set = max_cnt_set,
+	.get = param_get_uint,
+};
+module_param_cb(max_cnt, &max_cnt_ops, &damon_ibs_max_cnt, 0644);
+MODULE_PARM_DESC(max_cnt,
+	"IBS MaxCnt (ops between samples). Writes restart sampling.");
+
+static DEFINE_MUTEX(damon_ibs_lock);
+static bool damon_ibs_enabled;
+static enum cpuhp_state damon_ibs_cpuhp_state;
+static unsigned int ibs_pmu_type;	/* discovered at init */
+
+static DEFINE_PER_CPU(struct perf_event *, damon_ibs_event);
+
+/*
+ * Diagnostic counters.  Incremented from NMI context, so use per-CPU
+ * counters and sum them on read.
+ */
+static DEFINE_PER_CPU(unsigned long, ibs_samples_total_pcpu);
+static DEFINE_PER_CPU(unsigned long, ibs_samples_filtered_pcpu);
+
+static unsigned long damon_ibs_sum_pcpu(unsigned long __percpu *var)
+{
+	unsigned long sum = 0;
+	int cpu;
+
+	for_each_possible_cpu(cpu)
+		sum += per_cpu(*var, cpu);
+	return sum;
+}
+
+static int samples_total_get(char *buffer, const struct kernel_param *kp)
+{
+	return sysfs_emit(buffer, "%lu\n",
+			  damon_ibs_sum_pcpu(&ibs_samples_total_pcpu));
+}
+
+static int samples_filtered_get(char *buffer, const struct kernel_param *kp)
+{
+	return sysfs_emit(buffer, "%lu\n",
+			  damon_ibs_sum_pcpu(&ibs_samples_filtered_pcpu));
+}
+
+static const struct kernel_param_ops samples_total_ops = {
+	.get = samples_total_get,
+};
+static const struct kernel_param_ops samples_filtered_ops = {
+	.get = samples_filtered_get,
+};
+
+module_param_cb(samples_total, &samples_total_ops, NULL, 0444);
+MODULE_PARM_DESC(samples_total, "Total IBS samples delivered (read-only)");
+module_param_cb(samples_filtered, &samples_filtered_ops, NULL, 0444);
+MODULE_PARM_DESC(samples_filtered, "IBS samples filtered out (read-only)");
+
+/**
+ * damon_ibs_overflow_handler() - IBS overflow callback.
+ *
+ * Called when an IBS Op counter overflows.  The IBS perf driver fills
+ * data->phys_addr from IBSDCPHYSAD when dc_phy_addr_valid is set.
+ *
+ * Context: NMI - no sleeping, no mutex, no kmalloc.
+ */
+static void damon_ibs_overflow_handler(struct perf_event *event,
+				       struct perf_sample_data *data,
+				       struct pt_regs *regs)
+{
+	struct damon_access_report report;
+	unsigned long phys_addr;
+
+	if (!data)
+		return;
+
+	/*
+	 * PERF_SAMPLE_PHYS_ADDR was requested in attr.sample_type, but
+	 * the IBS perf driver only populates data->phys_addr when
+	 * IBS_OP_DATA3.dc_phy_addr_valid is set.  Skip stale-PA samples
+	 * by checking the sample_flags rather than testing phys_addr
+	 * for zero (which would also drop legitimate page 0).
+	 */
+	if (!(data->sample_flags & PERF_SAMPLE_PHYS_ADDR)) {
+		this_cpu_inc(ibs_samples_filtered_pcpu);
+		return;
+	}
+	phys_addr = data->phys_addr;
+
+	report = (struct damon_access_report){
+		.paddr = phys_addr & PAGE_MASK,
+		.size = PAGE_SIZE,
+		.cpu = smp_processor_id(),
+		.is_write = !!(data->data_src.mem_op & PERF_MEM_OP_STORE),
+	};
+	damon_report_access(&report);
+	this_cpu_inc(ibs_samples_total_pcpu);
+}
+
+static int damon_ibs_create_event(int cpu)
+{
+	struct perf_event_attr attr = {
+		.type = ibs_pmu_type,
+		.size = sizeof(attr),
+		/* config=0: IBS perf driver uses sample_period as MaxCnt. */
+		.config = 0,
+		.sample_period = damon_ibs_max_cnt,
+		.sample_type = PERF_SAMPLE_PHYS_ADDR | PERF_SAMPLE_DATA_SRC,
+		.pinned = 1,
+	};
+	struct perf_event *event;
+
+	event = perf_event_create_kernel_counter(&attr, cpu, NULL,
+						 damon_ibs_overflow_handler,
+						 NULL);
+	if (IS_ERR(event))
+		return PTR_ERR(event);
+
+	/*
+	 * perf_event_create_kernel_counter() returns the event already
+	 * enabled; no perf_event_enable() needed here.
+	 */
+	per_cpu(damon_ibs_event, cpu) = event;
+	return 0;
+}
+
+static void damon_ibs_destroy_event(int cpu)
+{
+	struct perf_event *event = per_cpu(damon_ibs_event, cpu);
+
+	if (!event)
+		return;
+
+	perf_event_disable(event);
+	perf_event_release_kernel(event);
+	per_cpu(damon_ibs_event, cpu) = NULL;
+}
+
+static int damon_ibs_cpu_online(unsigned int cpu)
+{
+	int ret = damon_ibs_create_event(cpu);
+
+	if (ret)
+		pr_warn_ratelimited(
+			"damon-ibs: failed to create perf_event on cpu %u (err %d); "
+			"this cpu will not contribute samples\n", cpu, ret);
+	return 0;	/* never block CPU online */
+}
+
+static int damon_ibs_cpu_offline(unsigned int cpu)
+{
+	damon_ibs_destroy_event(cpu);
+	return 0;
+}
+
+/* Caller must hold damon_ibs_lock. */
+static int __damon_ibs_start(void)
+{
+	int ret;
+
+	if (damon_ibs_enabled)
+		return -EBUSY;
+
+	ret = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "damon/ibs:online",
+				damon_ibs_cpu_online, damon_ibs_cpu_offline);
+	if (ret < 0)
+		return ret;
+	damon_ibs_cpuhp_state = ret;
+
+	damon_ibs_enabled = true;
+	pr_info_once("damon-ibs: first start (max_cnt=%u, pmu_type=%u)\n",
+		     damon_ibs_max_cnt, ibs_pmu_type);
+	return 0;
+}
+
+/* Caller must hold damon_ibs_lock. */
+static void __damon_ibs_stop(void)
+{
+	if (!damon_ibs_enabled)
+		return;
+
+	cpuhp_remove_state(damon_ibs_cpuhp_state);
+	damon_ibs_enabled = false;
+}
+
+static int damon_ibs_start(void)
+{
+	int ret;
+
+	mutex_lock(&damon_ibs_lock);
+	ret = __damon_ibs_start();
+	mutex_unlock(&damon_ibs_lock);
+	return ret;
+}
+
+static void damon_ibs_stop(void)
+{
+	mutex_lock(&damon_ibs_lock);
+	__damon_ibs_stop();
+	mutex_unlock(&damon_ibs_lock);
+}
+
+/**
+ * damon_ibs_set_sample_rate() - Set IBS sampling interval.
+ * @max_cnt:	IBS Op MaxCnt value (ops between samples).
+ *		Higher = fewer samples/sec.
+ *
+ * If IBS is already running, restart it with the new rate.
+ *
+ * Return: 0 on success; if a restart was required and failed,
+ * propagate the error so callers (e.g. the max_cnt module-param
+ * .set callback) surface it to userspace instead of silently
+ * leaving sampling stopped.
+ */
+static int damon_ibs_set_sample_rate(unsigned int max_cnt)
+{
+	int ret = 0;
+
+	mutex_lock(&damon_ibs_lock);
+	damon_ibs_max_cnt = max_cnt ? max_cnt : DAMON_IBS_DEFAULT_MAX_CNT;
+
+	if (damon_ibs_enabled) {
+		__damon_ibs_stop();
+		ret = __damon_ibs_start();
+		if (ret)
+			pr_warn("damon-ibs: restart failed: %d\n", ret);
+	}
+	mutex_unlock(&damon_ibs_lock);
+	return ret;
+}
+
+
+static void damon_ibs_init_ctx(struct damon_ctx *ctx)
+{
+	int ret;
+
+	/* IBS is the access-detection source for this ctx. */
+	ctx->sample_control.primitives_enabled.page_table = false;
+
+	ret = damon_ibs_start();
+	if (ret && ret != -EBUSY)
+		pr_warn("damon-ibs: failed to start IBS sampling: %d\n", ret);
+}
+
+/**
+ * damon_ibs_discover_pmu_type() - Discover IBS Op PMU type from sysfs.
+ *
+ * Reads /sys/bus/event_source/devices/ibs_op/type to get the PMU type
+ * identifier needed for perf_event_attr.type.
+ *
+ * TODO: replace sysfs-read with a PMU lookup API when one becomes
+ * available.
+ *
+ * Return: 0 on success, negative error code otherwise.
+ */
+static int damon_ibs_discover_pmu_type(void)
+{
+	struct file *f;
+	char buf[16];
+	loff_t pos = 0;
+	ssize_t len;
+	int ret;
+
+	f = filp_open(IBS_OP_PMU_TYPE_PATH, O_RDONLY, 0);
+	if (IS_ERR(f))
+		return PTR_ERR(f);
+
+	len = kernel_read(f, buf, sizeof(buf) - 1, &pos);
+	filp_close(f, NULL);
+	if (len <= 0)
+		return -EIO;
+
+	buf[len] = '\0';
+	ret = kstrtouint(strim(buf), 10, &ibs_pmu_type);
+	if (ret)
+		return ret;
+
+	pr_info("damon-ibs: discovered ibs_op PMU type=%u\n", ibs_pmu_type);
+	return 0;
+}
+
+static int __init damon_ibs_init(void)
+{
+	struct damon_operations ops = {
+		.id = DAMON_OPS_PADDR_IBS,
+		.owner = THIS_MODULE,
+		.init = damon_ibs_init_ctx,
+		.prepare_access_checks = damon_pa_prepare_access_checks,
+		.check_accesses = damon_pa_check_accesses,
+		.apply_probes = damon_pa_apply_probes,
+		.apply_scheme = damon_pa_apply_scheme,
+		.get_scheme_score = damon_pa_scheme_score,
+	};
+	int err;
+
+	if (!boot_cpu_has(X86_FEATURE_IBS))
+		return -ENODEV;
+
+	err = damon_ibs_discover_pmu_type();
+	if (err) {
+		pr_err("damon-ibs: failed to discover IBS PMU type: %d\n", err);
+		return err;
+	}
+
+	err = damon_register_ops(&ops);
+	if (err)
+		return err;
+
+	pr_info("damon-ibs: AMD IBS backend registered (max_cnt=%u, pmu_type=%u)\n",
+		damon_ibs_max_cnt, ibs_pmu_type);
+	return 0;
+}
+
+static void __exit damon_ibs_exit(void)
+{
+	damon_ibs_stop();
+	damon_unregister_ops(DAMON_OPS_PADDR_IBS);
+}
+
+module_init(damon_ibs_init);
+module_exit(damon_ibs_exit);
+
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Ravi Jonnalagadda ");
+MODULE_DESCRIPTION("AMD IBS-based access sampling backend for DAMON");
-- 
2.43.0
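
The sample-time discipline described in the commit message (trust the "physical address valid" flag, not a non-zero address) can be tried out in user space. The following is only a minimal sketch under stated assumptions, not kernel code and not part of the patch: every `sketch_*` / `SKETCH_*` name, the flag bit values, and the 4 KiB page size are hypothetical stand-ins for `perf_sample_data`, `PERF_SAMPLE_PHYS_ADDR`, `PERF_MEM_OP_STORE`, and `damon_access_report`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel definitions; values are made up. */
#define SKETCH_PAGE_SIZE	UINT64_C(4096)
#define SKETCH_PAGE_MASK	(~(SKETCH_PAGE_SIZE - 1))
#define SKETCH_FLAG_PHYS_ADDR	(UINT64_C(1) << 0)	/* "phys_addr is valid" */
#define SKETCH_OP_STORE		(UINT64_C(1) << 1)	/* sampled op was a store */

struct sketch_sample {		/* models the fields read from the sample */
	uint64_t flags;
	uint64_t phys_addr;
	uint64_t mem_op;
};

struct sketch_report {		/* models the page-granular access report */
	uint64_t paddr;
	uint64_t size;
	bool is_write;
};

struct sketch_counters {	/* models the two diagnostic counters */
	unsigned long total;
	unsigned long filtered;
};

/*
 * Build a report from one sample.  A sample is dropped only when the
 * validity flag is clear; a valid sample whose address falls in page 0
 * is still reported.  Returns true if *out was filled.
 */
bool sketch_handle_sample(const struct sketch_sample *s,
			  struct sketch_report *out,
			  struct sketch_counters *c)
{
	if (!(s->flags & SKETCH_FLAG_PHYS_ADDR)) {
		c->filtered++;
		return false;
	}
	out->paddr = s->phys_addr & SKETCH_PAGE_MASK;
	out->size = SKETCH_PAGE_SIZE;
	out->is_write = (s->mem_op & SKETCH_OP_STORE) != 0;
	c->total++;
	return true;
}

/* Usage: one stale sample is filtered, one valid page-0 store is kept. */
void sketch_selftest(void)
{
	struct sketch_counters c = { 0, 0 };
	struct sketch_report r = { 0, 0, false };
	struct sketch_sample stale = { 0, 0x1234, 0 };
	struct sketch_sample page0 = { SKETCH_FLAG_PHYS_ADDR, 0x10,
				       SKETCH_OP_STORE };

	assert(!sketch_handle_sample(&stale, &r, &c));
	assert(sketch_handle_sample(&page0, &r, &c));
	assert(r.paddr == 0 && r.is_write);
	assert(c.filtered == 1 && c.total == 1);
}
```

In this model, as in the handler in the patch, the "total" counter only advances for samples that pass the filter, so the two counters partition the delivered samples rather than one containing the other.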