From nobody Mon Jun 8 10:56:43 2026
From: Ravi Jonnalagadda
To: sj@kernel.org, akinobu.mita@gmail.com, damon@lists.linux.dev,
 linux-mm@kvack.org, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org
Cc: akpm@linux-foundation.org, corbet@lwn.net, bijan311@gmail.com,
 ajayjoshi@micron.com, honggyu.kim@sk.com, yunjeong.mun@sk.com,
 ravis.opensrc@gmail.com
Subject: [RFC PATCH 1/6] mm/damon: add struct damon_perf_event{,_attr} and per-ctx perf_events list
Date: Fri, 29 May 2026 09:56:35 -0700
Message-ID: <20260529165640.820-2-ravis.opensrc@gmail.com>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20260529165640.820-1-ravis.opensrc@gmail.com>
References: <20260529165640.820-1-ravis.opensrc@gmail.com>
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

Introduce the substrate types for using perf events as DAMON access
check sources.  struct damon_perf_event_attr carries the raw PMU attr
configurable from userspace; struct damon_perf_event is the per-event
entry on a new damon_ctx::perf_events list.

Declare damon_perf_init(), damon_perf_cleanup(), damon_perf_event_arm()
and damon_perf_event_disarm() in mm/damon/ops-common.h.  When
CONFIG_PERF_EVENTS=n, damon_perf_init() folds to a stub returning
-ENOSYS and the other three fold to no-ops.
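
For readers less familiar with perf, the fields of struct
damon_perf_event_attr line up with members of the UAPI struct
perf_event_attr.  The following is only a userspace sketch of one
plausible translation, not the code of this series: the names
demo_perf_event_attr and demo_fill_perf_attr() are hypothetical, and the
real damon_perf_init() may set additional sample_type bits or other
fields.  It needs UAPI headers that define PERF_SAMPLE_WEIGHT_STRUCT.

```c
#include <linux/perf_event.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical userspace mirror of struct damon_perf_event_attr. */
struct demo_perf_event_attr {
	uint32_t type;
	uint64_t config;
	uint64_t config1;
	uint64_t config2;
	bool sample_phys_addr;
	bool sample_weight_struct;
	bool exclude_kernel;
	bool exclude_hv;
	bool freq;
	uint64_t sample_freq;
	uint64_t sample_period;
	uint32_t wakeup_events;
	uint32_t precise_ip;
};

/* Assumed translation; the in-kernel code of the series may differ. */
void demo_fill_perf_attr(const struct demo_perf_event_attr *in,
			 struct perf_event_attr *out)
{
	memset(out, 0, sizeof(*out));
	out->size = sizeof(*out);
	out->type = in->type;
	out->config = in->config;
	out->config1 = in->config1;
	out->config2 = in->config2;

	/* Both sample_type bits are opt-in, as the kernel-doc describes. */
	if (in->sample_phys_addr)
		out->sample_type |= PERF_SAMPLE_PHYS_ADDR;
	if (in->sample_weight_struct)
		out->sample_type |= PERF_SAMPLE_WEIGHT_STRUCT;

	out->exclude_kernel = in->exclude_kernel;
	out->exclude_hv = in->exclude_hv;

	/* sample_freq and sample_period share a union; freq selects one. */
	out->freq = in->freq;
	if (in->freq)
		out->sample_freq = in->sample_freq;
	else
		out->sample_period = in->sample_period;

	out->wakeup_events = in->wakeup_events;
	out->precise_ip = in->precise_ip;
}
```

Keeping both sample_freq and sample_period in the DAMON-side struct,
with @freq as the selector, lets a configuration switch modes without
losing the value of the inactive one; only the selected value reaches
the perf union.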

Suggested-by: Akinobu Mita
Link: https://lore.kernel.org/20260423004211.7037-1-akinobu.mita@gmail.com
Signed-off-by: Ravi Jonnalagadda
---
 include/linux/damon.h   | 80 +++++++++++++++++++++++++++++++++++++++++
 mm/damon/ops-common.h   | 39 ++++++++++++++++++++
 mm/damon/sysfs-common.h |  6 ++++
 3 files changed, 125 insertions(+)

diff --git a/include/linux/damon.h b/include/linux/damon.h
index c0375035a3a7b..11f1c1071b9ba 100644
--- a/include/linux/damon.h
+++ b/include/linux/damon.h
@@ -123,6 +123,7 @@ struct damon_target {
  * @size:		The size of the accessed address range.
  * @cpu:		The id of the CPU that made the access.
  * @tid:		The task id of the task that made the access.
+ * @tgid:		Thread group id of the task that made the access.
  * @is_write:		Whether the access is write.
  *
  * Any DAMON API callers that notified access events can report the information
@@ -135,6 +136,7 @@ struct damon_access_report {
 	unsigned long size;
 	unsigned int cpu;
 	pid_t tid;
+	pid_t tgid;
 	bool is_write;
 /* private: */
 	unsigned long report_jiffies;	/* when this report is made */
@@ -501,6 +503,7 @@ struct damos_filter {
 };
 
 struct damon_ctx;
+struct damon_target_lookup;
 struct damos;
 
 /**
@@ -966,6 +969,67 @@ struct damon_sample_control {
 	struct list_head sample_filters;
 };
 
+/**
+ * struct damon_perf_event_attr - raw PMU event attr for access check.
+ *
+ * @type:		raw PMU event type.
+ * @config:		raw PMU event config.
+ * @config1:		raw PMU event config1.
+ * @config2:		raw PMU event config2.
+ * @sample_phys_addr:	whether to set PERF_SAMPLE_PHYS_ADDR in sample_type.
+ * @sample_weight_struct: whether to set PERF_SAMPLE_WEIGHT_STRUCT in
+ *			sample_type.  PMUs that do not advertise
+ *			weight (e.g. AMD IBS Op) reject events with
+ *			this flag set, so it must be opt-in.
+ * @exclude_kernel:	exclude kernel-mode samples.
+ * @exclude_hv:		exclude hypervisor samples.
+ * @freq:		when true use @sample_freq, otherwise @sample_period.
+ * @sample_freq:	target sample rate when @freq is true.
+ * @sample_period:	period (samples-between-overflows) when @freq is false.
+ * @wakeup_events:	perf_event_attr.wakeup_events.
+ * @precise_ip:		precise sampling skid bound (PEBS-style PMUs).
+ */
+struct damon_perf_event_attr {
+	u32 type;
+	u64 config;
+	u64 config1;
+	u64 config2;
+	bool sample_phys_addr;
+	bool sample_weight_struct;
+	bool exclude_kernel;
+	bool exclude_hv;
+	bool freq;
+	u64 sample_freq;
+	u64 sample_period;
+	u32 wakeup_events;
+	u32 precise_ip;
+};
+
+/**
+ * struct damon_perf_event - perf event for access check.
+ *
+ * @attr:		Per-event PMU attribute (configured via sysfs).
+ * @priv:		Monitoring operations-specific data.
+ * @list:		List head for &damon_ctx->perf_events siblings.
+ * @hlist_node:		Tracks this event among cpuhp multi-instance entries.
+ * @init_complete:	Set after the synchronous online sweep finishes; gates
+ *			@any_cpu_failed writes from late hotplug callbacks.
+ * @any_cpu_failed:	Set by the cpuhp online callback if perf_event creation
+ *			fails on any CPU during the synchronous initial install.
+ * @ctx:		Back-pointer to the owning damon_ctx; the cpu_online callback
+ *			reads ctx->perf_events_active to decide whether to enable a
+ *			late-onlining CPU's event immediately after create.
+ */
+struct damon_perf_event {
+	struct damon_perf_event_attr attr;
+	void *priv;
+	struct list_head list;
+	struct hlist_node hlist_node;
+	bool init_complete;
+	bool any_cpu_failed;
+	struct damon_ctx *ctx;
+};
+
 /**
  * struct damon_ctx - Represents a context for each monitoring.  This is the
  * main interface that allows users to set the attributes and get the results
@@ -991,6 +1055,11 @@ struct damon_sample_control {
  * @addr_unit:	Scale factor for core to ops address conversion.
  * @min_region_sz:	Minimum region size.
  * @pause:	Pause kdamond main loop.
+ * @perf_events:	Head of perf events (&damon_perf_event) list.
+ * @perf_events_active: Set while kdamond_fn has the perf events armed.
+ *	Cleared in the kdamond_fn done path before the events are
+ *	disabled; serves as the gate for damon_commit_perf_events()
+ *	and the kdamond_fn drain dispatch.
  */
 struct damon_ctx {
 	struct damon_attrs attrs;
@@ -1046,6 +1115,9 @@ struct damon_ctx {
 	unsigned long min_region_sz;
 	bool pause;
 
+	struct list_head perf_events;
+	bool perf_events_active;
+
 /* private: */
 	/* Head of monitoring targets (&damon_target) list. */
 	struct list_head adaptive_targets;
@@ -1054,6 +1126,14 @@ struct damon_ctx {
 
 	/* Per-ctx PRNG state for damon_rand(); kdamond is the sole consumer. */
 	struct rnd_state rnd_state;
+
+	/* Reusable drain-loop snapshot buffer (avoids per-tick kmalloc). */
+	struct {
+		struct damon_target_lookup *lookups;
+		unsigned int nr_lookups;
+		struct damon_region **region_buf;
+		unsigned int region_buf_cap;
+	} drain_snapshot;
 };
 
 /* Get a random number in [@l, @r) using @ctx's lockless PRNG. */
diff --git a/mm/damon/ops-common.h b/mm/damon/ops-common.h
index 5efa5b5970def..35da400a67ec1 100644
--- a/mm/damon/ops-common.h
+++ b/mm/damon/ops-common.h
@@ -23,3 +23,42 @@ bool damos_folio_filter_match(struct damos_filter *filter, struct folio *folio);
 unsigned long damon_migrate_pages(struct list_head *folio_list, int target_nid);
 
 bool damos_ops_has_filter(struct damos *s);
+
+#ifdef CONFIG_PERF_EVENTS
+
+/*
+ * Per-event opaque allocated by damon_perf_init().  The NMI overflow
+ * handler does NOT touch this struct; submission goes through the
+ * shared per-CPU SPSC ring via damon_report_access().
+ */
+struct damon_perf {
+	struct perf_event * __percpu *event;
+};
+
+int damon_perf_init(struct damon_ctx *ctx, struct damon_perf_event *event);
+void damon_perf_cleanup(struct damon_ctx *ctx, struct damon_perf_event *event);
+void damon_perf_event_arm(struct damon_perf_event *event);
+void damon_perf_event_disarm(struct damon_perf_event *event);
+
+#else /* !CONFIG_PERF_EVENTS */
+
+static inline int damon_perf_init(struct damon_ctx *ctx,
+		struct damon_perf_event *event)
+{
+	return -ENOSYS;
+}
+
+static inline void damon_perf_cleanup(struct damon_ctx *ctx,
+		struct damon_perf_event *event)
+{
+}
+
+static inline void damon_perf_event_arm(struct damon_perf_event *event)
+{
+}
+
+static inline void damon_perf_event_disarm(struct damon_perf_event *event)
+{
+}
+
+#endif /* CONFIG_PERF_EVENTS */
diff --git a/mm/damon/sysfs-common.h b/mm/damon/sysfs-common.h
index 25a6c28abdea8..67c7545fd57d0 100644
--- a/mm/damon/sysfs-common.h
+++ b/mm/damon/sysfs-common.h
@@ -66,10 +66,13 @@ int damon_sysfs_memcg_path_to_id(char *memcg_path, u64 *id);
  * sample directory
  */
 
+struct damon_sysfs_perf_events;
+
 struct damon_sysfs_sample {
 	struct kobject kobj;
 	struct damon_sysfs_primitives *primitives;
 	struct damon_sysfs_sample_filters *filters;
+	struct damon_sysfs_perf_events *perf_events;
 };
 
 struct damon_sysfs_sample *damon_sysfs_sample_alloc(void);
@@ -82,3 +85,6 @@ extern const struct kobj_type damon_sysfs_sample_ktype;
 int damon_sysfs_set_sample_control(
 		struct damon_sample_control *control,
 		struct damon_sysfs_sample *sysfs_sample);
+
+int damon_sysfs_add_perf_events(struct damon_ctx *ctx,
+		struct damon_sysfs_sample *sysfs_sample);
-- 
2.43.0

From nobody Mon Jun 8 10:56:43 2026
From: Ravi Jonnalagadda
To: sj@kernel.org, akinobu.mita@gmail.com, damon@lists.linux.dev,
 linux-mm@kvack.org, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org
Cc: akpm@linux-foundation.org, corbet@lwn.net, bijan311@gmail.com,
 ajayjoshi@micron.com, honggyu.kim@sk.com, yunjeong.mun@sk.com,
 ravis.opensrc@gmail.com
Subject: [RFC PATCH 2/6] mm/damon/sysfs-sample: expose perf_events configuration via sysfs
Date: Fri, 29 May 2026 09:56:36 -0700
Message-ID: <20260529165640.820-3-ravis.opensrc@gmail.com>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20260529165640.820-1-ravis.opensrc@gmail.com>
References: <20260529165640.820-1-ravis.opensrc@gmail.com>
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

Add a perf_events/ subdirectory under each context's sample/ directory.
Writing N (at most 64) to perf_events/nr_perf_events creates numbered
entries 0..N-1.  Each numbered entry maps to one damon_perf_event and
exposes its raw PMU attr, addressing flags, and period/delivery knobs.

Defaults match Intel PEBS L3-miss sampling (freq mode, precise_ip=2,
wakeup_events=1, exclude_kernel and exclude_hv set); type, config and
the sampling rate start at zero, so userspace sets them and overrides
the defaults for other PMUs.  sample_weight_struct defaults off because
PMUs that do not advertise PERF_SAMPLE_WEIGHT_STRUCT (e.g. AMD IBS Op)
reject events that request it with -EOPNOTSUPP.
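
As a rough illustration of how userspace could drive these files, the
sketch below writes one value to one knob inside a perf_events entry
directory.  It is a minimal example under assumptions: demo_write_knob()
is a hypothetical helper, and the directory it is pointed at (for
instance .../contexts/0/sample/perf_events/0 under the DAMON sysfs root)
is assumed from the description above, not confirmed by this patch.

```c
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: write @value to the file <dir>/<name>.
 * Returns 0 on success and -1 on any error.  The file must already
 * exist, which is the case for sysfs attributes.
 */
int demo_write_knob(const char *dir, const char *name, const char *value)
{
	char path[512];
	size_t len = strlen(value);
	ssize_t written;
	int n, fd;

	n = snprintf(path, sizeof(path), "%s/%s", dir, name);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;

	fd = open(path, O_WRONLY);
	if (fd < 0)
		return -1;

	/* One write per value, as is usual for sysfs store handlers. */
	written = write(fd, value, len);
	close(fd);
	return written == (ssize_t)len ? 0 : -1;
}
```

A caller would first write the entry count to nr_perf_events in the
perf_events/ directory and then set type, config and sample_freq (or
freq=0 plus sample_period) in each numbered entry.  The store handlers
parse numbers with base 0, so hexadecimal strings such as "0x1cd" are
accepted as well as decimal ones.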

Signed-off-by: Ravi Jonnalagadda
---
 mm/damon/sysfs-sample.c | 579 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 579 insertions(+)

diff --git a/mm/damon/sysfs-sample.c b/mm/damon/sysfs-sample.c
index ffc9c85455474..0570d27a47b1c 100644
--- a/mm/damon/sysfs-sample.c
+++ b/mm/damon/sysfs-sample.c
@@ -452,6 +452,520 @@ static const struct kobj_type damon_sysfs_primitives_ktype = {
 	.default_groups = damon_sysfs_primitives_groups,
 };
 
+/*
+ * perf_event_attr directory
+ */
+
+struct damon_sysfs_perf_event_attr {
+	struct kobject kobj;
+	u32 type;
+	u64 config;
+	u64 config1;
+	u64 config2;
+	bool sample_phys_addr;
+	bool sample_weight_struct;
+	bool exclude_kernel;
+	bool exclude_hv;
+	bool freq;
+	u64 sample_freq;
+	u64 sample_period;
+	u32 wakeup_events;
+	u32 precise_ip;
+};
+
+static struct damon_sysfs_perf_event_attr *
+damon_sysfs_perf_event_attr_alloc(void)
+{
+	struct damon_sysfs_perf_event_attr *attr =
+		kzalloc(sizeof(*attr), GFP_KERNEL);
+
+	if (!attr)
+		return NULL;
+	attr->wakeup_events = 1;
+	attr->precise_ip = 2;
+	attr->freq = true;
+	attr->exclude_kernel = true;
+	attr->exclude_hv = true;
+	return attr;
+}
+
+static ssize_t attr_type_show(struct kobject *kobj,
+		struct kobj_attribute *attr, char *buf)
+{
+	struct damon_sysfs_perf_event_attr *perf_event_attr = container_of(kobj,
+			struct damon_sysfs_perf_event_attr, kobj);
+
+	return sysfs_emit(buf, "0x%x\n", perf_event_attr->type);
+}
+
+static ssize_t attr_type_store(struct kobject *kobj,
+		struct kobj_attribute *attr, const char *buf, size_t count)
+{
+	struct damon_sysfs_perf_event_attr *perf_event_attr = container_of(kobj,
+			struct damon_sysfs_perf_event_attr, kobj);
+	int err = kstrtou32(buf, 0, &perf_event_attr->type);
+
+	if (err)
+		return -EINVAL;
+	return count;
+}
+
+static ssize_t config_show(struct kobject *kobj,
+		struct kobj_attribute *attr, char *buf)
+{
+	struct damon_sysfs_perf_event_attr *perf_event_attr = container_of(kobj,
+			struct damon_sysfs_perf_event_attr, kobj);
+
+	return sysfs_emit(buf, "0x%llx\n", perf_event_attr->config);
+}
+
+static ssize_t config_store(struct kobject *kobj,
+		struct kobj_attribute *attr, const char *buf, size_t count)
+{
+	struct damon_sysfs_perf_event_attr *perf_event_attr = container_of(kobj,
+			struct damon_sysfs_perf_event_attr, kobj);
+	int err = kstrtou64(buf, 0, &perf_event_attr->config);
+
+	if (err)
+		return -EINVAL;
+	return count;
+}
+
+static ssize_t config1_show(struct kobject *kobj,
+		struct kobj_attribute *attr, char *buf)
+{
+	struct damon_sysfs_perf_event_attr *perf_event_attr = container_of(kobj,
+			struct damon_sysfs_perf_event_attr, kobj);
+
+	return sysfs_emit(buf, "0x%llx\n", perf_event_attr->config1);
+}
+
+static ssize_t config1_store(struct kobject *kobj,
+		struct kobj_attribute *attr, const char *buf, size_t count)
+{
+	struct damon_sysfs_perf_event_attr *perf_event_attr = container_of(kobj,
+			struct damon_sysfs_perf_event_attr, kobj);
+	int err = kstrtou64(buf, 0, &perf_event_attr->config1);
+
+	if (err)
+		return -EINVAL;
+	return count;
+}
+
+static ssize_t config2_show(struct kobject *kobj,
+		struct kobj_attribute *attr, char *buf)
+{
+	struct damon_sysfs_perf_event_attr *perf_event_attr = container_of(kobj,
+			struct damon_sysfs_perf_event_attr, kobj);
+
+	return sysfs_emit(buf, "0x%llx\n", perf_event_attr->config2);
+}
+
+static ssize_t config2_store(struct kobject *kobj,
+		struct kobj_attribute *attr, const char *buf, size_t count)
+{
+	struct damon_sysfs_perf_event_attr *perf_event_attr = container_of(kobj,
+			struct damon_sysfs_perf_event_attr, kobj);
+	int err = kstrtou64(buf, 0, &perf_event_attr->config2);
+
+	if (err)
+		return -EINVAL;
+	return count;
+}
+
+static ssize_t sample_phys_addr_show(struct kobject *kobj,
+		struct kobj_attribute *attr, char *buf)
+{
+	struct damon_sysfs_perf_event_attr *perf_event_attr = container_of(kobj,
+			struct damon_sysfs_perf_event_attr, kobj);
+
+	return sysfs_emit(buf, "%d\n", perf_event_attr->sample_phys_addr);
+}
+
+static ssize_t sample_phys_addr_store(struct kobject *kobj,
+		struct kobj_attribute *attr, const char *buf, size_t count)
+{
+	struct damon_sysfs_perf_event_attr *perf_event_attr = container_of(kobj,
+			struct damon_sysfs_perf_event_attr, kobj);
+	bool sample_phys_addr;
+	int err = kstrtobool(buf, &sample_phys_addr);
+
+	if (err)
+		return -EINVAL;
+
+	perf_event_attr->sample_phys_addr = sample_phys_addr;
+	return count;
+}
+
+static ssize_t sample_weight_struct_show(struct kobject *kobj,
+		struct kobj_attribute *attr, char *buf)
+{
+	struct damon_sysfs_perf_event_attr *perf_event_attr = container_of(kobj,
+			struct damon_sysfs_perf_event_attr, kobj);
+
+	return sysfs_emit(buf, "%d\n", perf_event_attr->sample_weight_struct);
+}
+
+static ssize_t sample_weight_struct_store(struct kobject *kobj,
+		struct kobj_attribute *attr, const char *buf, size_t count)
+{
+	struct damon_sysfs_perf_event_attr *perf_event_attr = container_of(kobj,
+			struct damon_sysfs_perf_event_attr, kobj);
+	bool sample_weight_struct;
+	int err = kstrtobool(buf, &sample_weight_struct);
+
+	if (err)
+		return -EINVAL;
+
+	perf_event_attr->sample_weight_struct = sample_weight_struct;
+	return count;
+}
+
+static ssize_t sample_freq_show(struct kobject *kobj,
+		struct kobj_attribute *attr, char *buf)
+{
+	struct damon_sysfs_perf_event_attr *perf_event_attr = container_of(kobj,
+			struct damon_sysfs_perf_event_attr, kobj);
+
+	return sysfs_emit(buf, "%llu\n", perf_event_attr->sample_freq);
+}
+
+static ssize_t sample_freq_store(struct kobject *kobj,
+		struct kobj_attribute *attr, const char *buf, size_t count)
+{
+	struct damon_sysfs_perf_event_attr *perf_event_attr = container_of(kobj,
+			struct damon_sysfs_perf_event_attr, kobj);
+	int err = kstrtou64(buf, 0, &perf_event_attr->sample_freq);
+
+	if (err)
+		return -EINVAL;
+	return count;
+}
+
+static ssize_t wakeup_events_show(struct kobject *kobj,
+		struct kobj_attribute *attr, char *buf)
+{
+	struct damon_sysfs_perf_event_attr *perf_event_attr = container_of(kobj,
+			struct damon_sysfs_perf_event_attr, kobj);
+
+	return sysfs_emit(buf, "%u\n", perf_event_attr->wakeup_events);
+}
+
+static ssize_t wakeup_events_store(struct kobject *kobj,
+		struct kobj_attribute *attr, const char *buf, size_t count)
+{
+	struct damon_sysfs_perf_event_attr *perf_event_attr = container_of(kobj,
+			struct damon_sysfs_perf_event_attr, kobj);
+	int err = kstrtou32(buf, 0, &perf_event_attr->wakeup_events);
+
+	if (err)
+		return -EINVAL;
+	return count;
+}
+
+static ssize_t precise_ip_show(struct kobject *kobj,
+		struct kobj_attribute *attr, char *buf)
+{
+	struct damon_sysfs_perf_event_attr *perf_event_attr = container_of(kobj,
+			struct damon_sysfs_perf_event_attr, kobj);
+
+	return sysfs_emit(buf, "%u\n", perf_event_attr->precise_ip);
+}
+
+static ssize_t precise_ip_store(struct kobject *kobj,
+		struct kobj_attribute *attr, const char *buf, size_t count)
+{
+	struct damon_sysfs_perf_event_attr *perf_event_attr = container_of(kobj,
+			struct damon_sysfs_perf_event_attr, kobj);
+	int err = kstrtou32(buf, 0, &perf_event_attr->precise_ip);
+
+	if (err)
+		return -EINVAL;
+	return count;
+}
+
+static ssize_t freq_show(struct kobject *kobj,
+		struct kobj_attribute *attr, char *buf)
+{
+	struct damon_sysfs_perf_event_attr *perf_event_attr = container_of(kobj,
+			struct damon_sysfs_perf_event_attr, kobj);
+
+	return sysfs_emit(buf, "%d\n", perf_event_attr->freq);
+}
+
+static ssize_t freq_store(struct kobject *kobj,
+		struct kobj_attribute *attr, const char *buf, size_t count)
+{
+	struct damon_sysfs_perf_event_attr *perf_event_attr = container_of(kobj,
+			struct damon_sysfs_perf_event_attr, kobj);
+	bool freq;
+	int err = kstrtobool(buf, &freq);
+
+	if (err)
+		return -EINVAL;
+	perf_event_attr->freq = freq;
+	return count;
+}
+
+static ssize_t sample_period_show(struct kobject *kobj,
+		struct kobj_attribute *attr, char *buf)
+{
+	struct damon_sysfs_perf_event_attr *perf_event_attr = container_of(kobj,
+			struct damon_sysfs_perf_event_attr, kobj);
+
+	return sysfs_emit(buf, "%llu\n", perf_event_attr->sample_period);
+}
+
+static ssize_t sample_period_store(struct kobject *kobj,
+		struct kobj_attribute *attr, const char *buf, size_t count)
+{
+	struct damon_sysfs_perf_event_attr *perf_event_attr = container_of(kobj,
+			struct damon_sysfs_perf_event_attr, kobj);
+	int err = kstrtou64(buf, 0, &perf_event_attr->sample_period);
+
+	if (err)
+		return -EINVAL;
+	return count;
+}
+
+static ssize_t exclude_kernel_show(struct kobject *kobj,
+		struct kobj_attribute *attr, char *buf)
+{
+	struct damon_sysfs_perf_event_attr *perf_event_attr = container_of(kobj,
+			struct damon_sysfs_perf_event_attr, kobj);
+
+	return sysfs_emit(buf, "%d\n", perf_event_attr->exclude_kernel);
+}
+
+static ssize_t exclude_kernel_store(struct kobject *kobj,
+		struct kobj_attribute *attr, const char *buf, size_t count)
+{
+	struct damon_sysfs_perf_event_attr *perf_event_attr = container_of(kobj,
+			struct damon_sysfs_perf_event_attr, kobj);
+	bool v;
+	int err = kstrtobool(buf, &v);
+
+	if (err)
+		return -EINVAL;
+	perf_event_attr->exclude_kernel = v;
+	return count;
+}
+
+static ssize_t exclude_hv_show(struct kobject *kobj,
+		struct kobj_attribute *attr, char *buf)
+{
+	struct damon_sysfs_perf_event_attr *perf_event_attr = container_of(kobj,
+			struct damon_sysfs_perf_event_attr, kobj);
+
+	return sysfs_emit(buf, "%d\n", perf_event_attr->exclude_hv);
+}
+
+static ssize_t exclude_hv_store(struct kobject *kobj,
+		struct kobj_attribute *attr, const char *buf, size_t count)
+{
+	struct damon_sysfs_perf_event_attr *perf_event_attr = container_of(kobj,
+			struct damon_sysfs_perf_event_attr, kobj);
+	bool v;
+	int err = kstrtobool(buf, &v);
+
+	if (err)
+		return -EINVAL;
+	perf_event_attr->exclude_hv = v;
+	return count;
+}
+
+static void damon_sysfs_perf_event_attr_release(struct kobject *kobj)
+{
+	kfree(container_of(kobj, struct damon_sysfs_perf_event_attr, kobj));
+}
+
+static struct kobj_attribute damon_sysfs_perf_event_attr_type_attr =
+		__ATTR(type, 0600, attr_type_show, attr_type_store);
+
+static struct kobj_attribute damon_sysfs_perf_event_attr_config_attr =
+		__ATTR_RW_MODE(config, 0600);
+
+static struct kobj_attribute damon_sysfs_perf_event_attr_config1_attr =
+		__ATTR_RW_MODE(config1, 0600);
+
+static struct kobj_attribute damon_sysfs_perf_event_attr_config2_attr =
+		__ATTR_RW_MODE(config2, 0600);
+
+static struct kobj_attribute damon_sysfs_perf_event_attr_sample_phys_addr_attr =
+		__ATTR_RW_MODE(sample_phys_addr, 0600);
+
+static struct kobj_attribute
+		damon_sysfs_perf_event_attr_sample_weight_struct_attr =
+		__ATTR_RW_MODE(sample_weight_struct, 0600);
+
+static struct kobj_attribute damon_sysfs_perf_event_attr_sample_freq_attr =
+		__ATTR_RW_MODE(sample_freq, 0600);
+
+static struct kobj_attribute damon_sysfs_perf_event_attr_wakeup_events_attr =
+		__ATTR_RW_MODE(wakeup_events, 0600);
+
+static struct kobj_attribute damon_sysfs_perf_event_attr_precise_ip_attr =
+		__ATTR_RW_MODE(precise_ip, 0600);
+
+static struct kobj_attribute damon_sysfs_perf_event_attr_freq_attr =
+		__ATTR_RW_MODE(freq, 0600);
+
+static struct kobj_attribute damon_sysfs_perf_event_attr_sample_period_attr =
+		__ATTR_RW_MODE(sample_period, 0600);
+
+static struct kobj_attribute damon_sysfs_perf_event_attr_exclude_kernel_attr =
+		__ATTR_RW_MODE(exclude_kernel, 0600);
+
+static struct kobj_attribute damon_sysfs_perf_event_attr_exclude_hv_attr =
+		__ATTR_RW_MODE(exclude_hv, 0600);
+
+static struct attribute *damon_sysfs_perf_event_attr_attrs[] = {
+	&damon_sysfs_perf_event_attr_type_attr.attr,
+	&damon_sysfs_perf_event_attr_config_attr.attr,
+	&damon_sysfs_perf_event_attr_config1_attr.attr,
+	&damon_sysfs_perf_event_attr_config2_attr.attr,
+	&damon_sysfs_perf_event_attr_sample_phys_addr_attr.attr,
+	&damon_sysfs_perf_event_attr_sample_weight_struct_attr.attr,
+	&damon_sysfs_perf_event_attr_freq_attr.attr,
+	&damon_sysfs_perf_event_attr_sample_freq_attr.attr,
+	&damon_sysfs_perf_event_attr_sample_period_attr.attr,
+	&damon_sysfs_perf_event_attr_wakeup_events_attr.attr,
+	&damon_sysfs_perf_event_attr_precise_ip_attr.attr,
+	&damon_sysfs_perf_event_attr_exclude_kernel_attr.attr,
+	&damon_sysfs_perf_event_attr_exclude_hv_attr.attr,
+	NULL,
+};
+ATTRIBUTE_GROUPS(damon_sysfs_perf_event_attr);
+
+static const struct kobj_type damon_sysfs_perf_event_attr_ktype = {
+	.release = damon_sysfs_perf_event_attr_release,
+	.sysfs_ops = &kobj_sysfs_ops,
+	.default_groups = damon_sysfs_perf_event_attr_groups,
+};
+
+/*
+ * perf_events directory
+ */
+
+/*
+ * Cap on the number of perf events per damon_ctx, to bound the sysfs
+ * kobject footprint and prevent unbounded allocations from a careless
+ * write to nr_perf_events.
+ */
+#define DAMON_SYSFS_PERF_EVENTS_MAX	64
+
+struct damon_sysfs_perf_events {
+	struct kobject kobj;
+	struct damon_sysfs_perf_event_attr **attrs_arr;
+	int nr;
+};
+
+static struct damon_sysfs_perf_events *damon_sysfs_perf_events_alloc(void)
+{
+	return kzalloc(sizeof(struct damon_sysfs_perf_events), GFP_KERNEL);
+}
+
+static void damon_sysfs_perf_events_rm_dirs(
+		struct damon_sysfs_perf_events *events)
+{
+	struct damon_sysfs_perf_event_attr **attrs_arr = events->attrs_arr;
+	int i;
+
+	for (i = 0; i < events->nr; i++)
+		kobject_put(&attrs_arr[i]->kobj);
+	events->nr = 0;
+	kfree(attrs_arr);
+	events->attrs_arr = NULL;
+}
+
+static int damon_sysfs_perf_events_add_dirs(
+		struct damon_sysfs_perf_events *events, int nr_events)
+{
+	struct damon_sysfs_perf_event_attr **attrs_arr, *attr;
+	int err, i;
+
+	damon_sysfs_perf_events_rm_dirs(events);
+	if (!nr_events)
+		return 0;
+
+	attrs_arr = kmalloc_array(nr_events, sizeof(*attrs_arr), GFP_KERNEL);
+	if (!attrs_arr)
+		return -ENOMEM;
+	events->attrs_arr = attrs_arr;
+
+	for (i = 0; i < nr_events; i++) {
+		attr = damon_sysfs_perf_event_attr_alloc();
+		if (!attr) {
+			damon_sysfs_perf_events_rm_dirs(events);
+			return -ENOMEM;
+		}
+
+		err = kobject_init_and_add(&attr->kobj,
+				&damon_sysfs_perf_event_attr_ktype, &events->kobj,
+				"%d", i);
+		if (err) {
+			kobject_put(&attr->kobj);
+			damon_sysfs_perf_events_rm_dirs(events);
+			return err;
+		}
+		attrs_arr[i] = attr;
+		events->nr++;
+	}
+	return 0;
+}
+
+static ssize_t nr_perf_events_show(struct kobject *kobj,
+		struct kobj_attribute *attr, char *buf)
+{
+	struct damon_sysfs_perf_events *events = container_of(kobj,
+			struct damon_sysfs_perf_events, kobj);
+
+	return sysfs_emit(buf, "%d\n", events->nr);
+}
+
+static ssize_t nr_perf_events_store(struct kobject *kobj,
+		struct kobj_attribute *attr, const char *buf, size_t count)
+{
+	struct damon_sysfs_perf_events *events;
+	int nr, err = kstrtoint(buf, 0, &nr);
+
+	if (err)
+		return err;
+	if (nr < 0 || nr > DAMON_SYSFS_PERF_EVENTS_MAX)
+		return -EINVAL;
+
+	events = container_of(kobj, struct damon_sysfs_perf_events, kobj);
+
+	if (!mutex_trylock(&damon_sysfs_lock))
+		return -EBUSY;
+	err = damon_sysfs_perf_events_add_dirs(events, nr);
+	mutex_unlock(&damon_sysfs_lock);
+	if (err)
+		return err;
+
+	return count;
+}
+
+static void damon_sysfs_perf_events_release(struct kobject *kobj)
+{
+	kfree(container_of(kobj, struct damon_sysfs_perf_events, kobj));
+}
+
+static struct kobj_attribute damon_sysfs_perf_events_nr_attr =
+		__ATTR_RW_MODE(nr_perf_events, 0600);
+
+static struct attribute *damon_sysfs_perf_events_attrs[] = {
+	&damon_sysfs_perf_events_nr_attr.attr,
+	NULL,
+};
+ATTRIBUTE_GROUPS(damon_sysfs_perf_events);
+
+static const struct kobj_type damon_sysfs_perf_events_ktype = {
+	.release = damon_sysfs_perf_events_release,
+	.sysfs_ops = &kobj_sysfs_ops,
+	.default_groups = damon_sysfs_perf_events_groups,
+};
+
 /*
  * sample directory
  */
@@ -471,6 +985,7 @@ int damon_sysfs_sample_add_dirs(struct damon_sysfs_sample *sample)
 {
 	struct damon_sysfs_primitives *primitives;
 	struct damon_sysfs_sample_filters *filters;
+	struct damon_sysfs_perf_events *perf_events;
 	int err;
 
 	primitives = damon_sysfs_primitives_alloc(true, false);
@@ -494,7 +1009,23 @@ int damon_sysfs_sample_add_dirs(struct damon_sysfs_sample *sample)
 	if (err)
 		goto put_filters_out;
 	sample->filters = filters;
+
+	perf_events = damon_sysfs_perf_events_alloc();
+	if (!perf_events) {
+		err = -ENOMEM;
+		goto put_filters_out;
+	}
+	err = kobject_init_and_add(&perf_events->kobj,
+			&damon_sysfs_perf_events_ktype, &sample->kobj,
+			"perf_events");
+	if (err)
+		goto put_perf_events_out;
+	sample->perf_events = perf_events;
+
 	return 0;
+put_perf_events_out:
+	kobject_put(&perf_events->kobj);
+	sample->perf_events = NULL;
 put_filters_out:
 	kobject_put(&filters->kobj);
 	sample->filters = NULL;
@@ -512,6 +1043,10 @@ void damon_sysfs_sample_rm_dirs(struct damon_sysfs_sample *sample)
 		damon_sysfs_sample_filters_rm_dirs(sample->filters);
 		kobject_put(&sample->filters->kobj);
 	}
+	if (sample->perf_events) {
+		damon_sysfs_perf_events_rm_dirs(sample->perf_events);
+		kobject_put(&sample->perf_events->kobj);
+	}
 }
 
 void damon_sysfs_sample_release(struct kobject *kobj)
@@ -596,3 +1131,47 @@ int damon_sysfs_set_sample_control(
 	return damon_sysfs_set_sample_filters(control,
 			sysfs_sample->filters);
 }
+
+static int damon_sysfs_add_perf_event(
+		struct damon_sysfs_perf_event_attr *sys_attr,
+		struct damon_ctx *ctx)
+{
+	struct damon_perf_event *event = kzalloc(sizeof(*event), GFP_KERNEL);
+
+	if (!event)
+		return -ENOMEM;
+
+	event->attr.type = sys_attr->type;
+	event->attr.config = sys_attr->config;
+	event->attr.config1 = sys_attr->config1;
+	event->attr.config2 = sys_attr->config2;
+	event->attr.sample_phys_addr = sys_attr->sample_phys_addr;
+	event->attr.sample_weight_struct = sys_attr->sample_weight_struct;
+	event->attr.freq = sys_attr->freq;
+	event->attr.sample_freq = sys_attr->sample_freq;
+	event->attr.sample_period =
sys_attr->sample_period; + event->attr.wakeup_events =3D sys_attr->wakeup_events; + event->attr.precise_ip =3D sys_attr->precise_ip; + event->attr.exclude_kernel =3D sys_attr->exclude_kernel; + event->attr.exclude_hv =3D sys_attr->exclude_hv; + + list_add_tail(&event->list, &ctx->perf_events); + return 0; +} + +int damon_sysfs_add_perf_events(struct damon_ctx *ctx, + struct damon_sysfs_sample *sysfs_sample) +{ + struct damon_sysfs_perf_events *events =3D sysfs_sample->perf_events; + int i, err; + + if (!events) + return 0; + + for (i =3D 0; i < events->nr; i++) { + err =3D damon_sysfs_add_perf_event(events->attrs_arr[i], ctx); + if (err) + return err; + } + return 0; +} --=20 2.43.0
From nobody Mon Jun 8 10:56:43 2026
From: Ravi Jonnalagadda
To: sj@kernel.org, akinobu.mita@gmail.com, damon@lists.linux.dev, linux-mm@kvack.org, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org
Cc: akpm@linux-foundation.org, corbet@lwn.net, bijan311@gmail.com, ajayjoshi@micron.com, honggyu.kim@sk.com, yunjeong.mun@sk.com, ravis.opensrc@gmail.com
Subject: [RFC PATCH 3/6] mm/damon/sysfs: install perf_events on apply
Date: Fri, 29 May 2026 09:56:37 -0700
Message-ID: <20260529165640.820-4-ravis.opensrc@gmail.com>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20260529165640.820-1-ravis.opensrc@gmail.com>
References: <20260529165640.820-1-ravis.opensrc@gmail.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

Call damon_sysfs_add_perf_events() from damon_sysfs_apply_inputs() so
events configured under sample/perf_events/ get attached to the
damon_ctx when the kdamond starts.
Signed-off-by: Ravi Jonnalagadda --- mm/damon/sysfs.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/mm/damon/sysfs.c b/mm/damon/sysfs.c index 9f71871a249d8..bc4a931fe3f0a 100644 --- a/mm/damon/sysfs.c +++ b/mm/damon/sysfs.c @@ -2092,6 +2092,9 @@ static int damon_sysfs_apply_inputs(struct damon_ctx = *ctx, return err; err =3D damon_sysfs_set_sample_control(&ctx->sample_control, sys_ctx->attrs->sample); + if (err) + return err; + err =3D damon_sysfs_add_perf_events(ctx, sys_ctx->attrs->sample); if (err) return err; err =3D damon_sysfs_add_targets(ctx, sys_ctx->targets); --=20 2.43.0
From nobody Mon Jun 8 10:56:43 2026
From: Ravi Jonnalagadda
To: sj@kernel.org, akinobu.mita@gmail.com, damon@lists.linux.dev, linux-mm@kvack.org, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org
Cc: akpm@linux-foundation.org, corbet@lwn.net, bijan311@gmail.com, ajayjoshi@micron.com, honggyu.kim@sk.com, yunjeong.mun@sk.com, ravis.opensrc@gmail.com
Subject: [RFC PATCH 4/6] mm/damon/core: per-CPU SPSC ring drain and damon_perf_event lifecycle
Date: Fri, 29 May 2026 09:56:38 -0700
Message-ID: <20260529165640.820-5-ravis.opensrc@gmail.com>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20260529165640.820-1-ravis.opensrc@gmail.com>
References: <20260529165640.820-1-ravis.opensrc@gmail.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

Replace the mutex-protected
damon_access_reports[] single-buffer with a per-CPU SPSC ring. The
producer (damon_report_access) is called from NMI by perf overflow
handlers; the consumer (kdamond_check_reported_accesses) runs once per
sample tick.

- 256-entry ring per CPU with cache-line-aligned head/tail
- per-CPU damon_report_ring_busy guards against NMI nesting on top of
  a process-context producer on the same CPU
- per-CPU damon_ring_pending bit so the consumer iterates only CPUs
  that produced samples this tick
- smp_mb between flag clear and head read on the consumer side pairs
  with the producer's head-publish ordering

Replace the O(N) per-region scan in kdamond_apply_access_report() with
a binary search over a per-tick per-target snapshot built into a
reusable damon_ctx::drain_snapshot buffer. The pid-based ctx
early-reject is no longer needed: kdamond_apply_access_report() already
discriminates report->vaddr vs report->paddr per ctx.

Wire the damon_perf_event lifecycle: init per attached event when
kdamond starts, teardown when the ctx is destroyed, replayed across
damon_commit_ctx. Add the matching forward decl + drain_snapshot field
on struct damon_ctx.
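
For readers unfamiliar with the head/tail protocol described above, the
following is a minimal userspace sketch of a single-producer,
single-consumer ring. It is not the code in this patch: it assumes C11
atomics as a stand-in for the kernel's READ_ONCE()/WRITE_ONCE() and
smp_wmb()/smp_rmb(), and the names struct report, struct spsc_ring,
ring_push() and ring_drain() are hypothetical. The per-CPU busy guard,
the pending flag and its smp_mb() are deliberately left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define RING_SIZE 256u			/* must be a power of two */
#define RING_MASK (RING_SIZE - 1u)

static_assert((RING_SIZE & RING_MASK) == 0,
	      "RING_SIZE must be a power of two");

/* Hypothetical stand-in for struct damon_access_report. */
struct report {
	unsigned long addr;
};

struct spsc_ring {
	_Atomic unsigned int head;	/* written only by the producer */
	_Atomic unsigned int tail;	/* written only by the consumer */
	struct report entries[RING_SIZE];
};

/* Producer: returns false (sample dropped) when the ring is full. */
static bool ring_push(struct spsc_ring *ring, const struct report *rep)
{
	unsigned int head = atomic_load_explicit(&ring->head,
						 memory_order_relaxed);
	unsigned int next = (head + 1u) & RING_MASK;

	/* One slot stays empty so that head == tail always means "empty". */
	if (next == atomic_load_explicit(&ring->tail, memory_order_acquire))
		return false;

	ring->entries[head] = *rep;
	/* Release: the entry must be visible before the new head is. */
	atomic_store_explicit(&ring->head, next, memory_order_release);
	return true;
}

/* Consumer: copy out the entries published so far; returns the count. */
static unsigned int ring_drain(struct spsc_ring *ring, struct report *out,
			       unsigned int max)
{
	unsigned int head = atomic_load_explicit(&ring->head,
						 memory_order_acquire);
	unsigned int tail = atomic_load_explicit(&ring->tail,
						 memory_order_relaxed);
	unsigned int n = 0;

	while (tail != head && n < max) {
		out[n++] = ring->entries[tail];
		tail = (tail + 1u) & RING_MASK;
	}
	/* Release: slots are handed back only after they have been read. */
	atomic_store_explicit(&ring->tail, tail, memory_order_release);
	return n;
}
```

Because one slot is kept empty to tell a full ring from an empty one, a
ring declared with 256 entries holds at most 255 pending reports in this
sketch; further pushes are dropped until the consumer advances tail.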
Signed-off-by: Ravi Jonnalagadda --- include/trace/events/damon.h | 17 ++ mm/damon/core.c | 383 ++++++++++++++++++++++++++++++----- 2 files changed, 344 insertions(+), 56 deletions(-) diff --git a/include/trace/events/damon.h b/include/trace/events/damon.h index b131bee27cc4a..e97e70579a8c8 100644 --- a/include/trace/events/damon.h +++ b/include/trace/events/damon.h @@ -74,6 +74,23 @@ TRACE_EVENT(damos_esz, __entry->esz) ); =20 +TRACE_EVENT(damon_perf_ring_overflow, + + TP_PROTO(int cpu), + + TP_ARGS(cpu), + + TP_STRUCT__entry( + __field(int, cpu) + ), + + TP_fast_assign( + __entry->cpu =3D cpu; + ), + + TP_printk("cpu=3D%d", __entry->cpu) +); + TRACE_EVENT_CONDITION(damos_before_apply, =20 TP_PROTO(unsigned int context_idx, unsigned int scheme_idx, diff --git a/mm/damon/core.c b/mm/damon/core.c index 23311189b589e..1e6966e45144f 100644 --- a/mm/damon/core.c +++ b/mm/damon/core.c @@ -8,6 +8,7 @@ #define pr_fmt(fmt) "damon: " fmt =20 #include +#include #include #include #include @@ -24,22 +25,43 @@ #define CREATE_TRACE_POINTS #include =20 -#define DAMON_ACCESS_REPORTS_CAP 1000 +#define DAMON_REPORT_RING_SIZE 256 +#define DAMON_REPORT_RING_MASK (DAMON_REPORT_RING_SIZE - 1) + +/* Per-target region lookup snapshot for the drain loop. */ +struct damon_target_lookup { + struct damon_target *t; + struct damon_region **regions; + unsigned int nr_regions; +}; + +struct damon_report_ring { + unsigned int head; /* written by producer (NMI) */ + unsigned int tail /* written by consumer (kdamond) */ + ____cacheline_aligned_in_smp; + struct damon_access_report entries[DAMON_REPORT_RING_SIZE] + ____cacheline_aligned_in_smp; +}; + +static DEFINE_PER_CPU(struct damon_report_ring, damon_report_rings); +static DEFINE_PER_CPU(local_t, damon_report_ring_busy); +/* + * Producer (NMI) sets after publishing a report; consumer (kdamond) clears + * before draining the corresponding ring. Per-CPU to avoid cross-CPU + * cacheline bouncing under sampling load on large systems. 
+ */ +static DEFINE_PER_CPU(unsigned long, damon_ring_pending); =20 static DEFINE_MUTEX(damon_lock); static int nr_running_ctxs; static bool running_exclusive_ctxs; +static struct damon_ctx *damon_perf_owner; =20 static DEFINE_MUTEX(damon_ops_lock); static struct damon_operations damon_registered_ops[NR_DAMON_OPS]; =20 static struct kmem_cache *damon_region_cache __ro_after_init; =20 -static DEFINE_MUTEX(damon_access_reports_lock); -static struct damon_access_report damon_access_reports[ - DAMON_ACCESS_REPORTS_CAP]; -static int damon_access_reports_len; - /* Should be called under damon_ops_lock with id smaller than NR_DAMON_OPS= */ static bool __damon_is_registered_ops(enum damon_ops_id id) { @@ -805,11 +827,24 @@ struct damon_ctx *damon_new_ctx(void) INIT_LIST_HEAD(&ctx->adaptive_targets); INIT_LIST_HEAD(&ctx->schemes); =20 + INIT_LIST_HEAD(&ctx->perf_events); + prandom_seed_state(&ctx->rnd_state, get_random_u64()); =20 return ctx; } =20 +static void damon_perf_destroy(struct damon_ctx *ctx) +{ + struct damon_perf_event *event, *next; + + list_for_each_entry_safe(event, next, &ctx->perf_events, list) { + damon_perf_cleanup(ctx, event); + list_del(&event->list); + kfree(event); + } +} + static void damon_destroy_targets(struct damon_ctx *ctx) { struct damon_target *t, *next_t; @@ -835,6 +870,11 @@ void damon_destroy_ctx(struct damon_ctx *ctx) damon_for_each_sample_filter_safe(f, next_f, &ctx->sample_control) damon_destroy_sample_filter(f, &ctx->sample_control); =20 + damon_perf_destroy(ctx); + + kfree(ctx->drain_snapshot.lookups); + kfree(ctx->drain_snapshot.region_buf); + kfree(ctx); } =20 @@ -1694,6 +1734,45 @@ static int damon_commit_sample_control( return damon_commit_sample_filters(dst, src); } =20 +static int damon_commit_perf_events(struct damon_ctx *dst, + struct damon_ctx *src) +{ + struct damon_perf_event *src_event, *new_event; + int err =3D 0; + + damon_perf_destroy(dst); + + list_for_each_entry(src_event, &src->perf_events, list) { + new_event =3D 
kzalloc(sizeof(*new_event), GFP_KERNEL); + if (!new_event) { + err =3D -ENOMEM; + goto out; + } + + new_event->attr =3D src_event->attr; + + if (damon_is_running(dst)) { + err =3D damon_perf_init(dst, new_event); + if (err) { + kfree(new_event); + goto out; + } + /* + * Events are created with attr.disabled=3D1 and only fire while + * the kdamond runs. Arm now if we are committing into a + * running ctx whose substrate is already armed. + */ + if (dst->perf_events_active) + damon_perf_event_arm(new_event); + } + list_add_tail(&new_event->list, &dst->perf_events); + } + return 0; +out: + damon_perf_destroy(dst); + return err; +} + static int __damon_commit_ctx(struct damon_ctx *dst, struct damon_ctx *src) { int err; @@ -1742,6 +1821,9 @@ static int __damon_commit_ctx(struct damon_ctx *dst, = struct damon_ctx *src) return err; err =3D damon_commit_sample_control(&dst->sample_control, &src->sample_control); + if (err) + return err; + err =3D damon_commit_perf_events(dst, src); if (err) return err; dst->addr_unit =3D src->addr_unit; @@ -1929,12 +2011,40 @@ int damon_start(struct damon_ctx **ctxs, int nr_ctx= s, bool exclusive) return -EBUSY; } =20 + /* + * The per-CPU PMU events backing the perf-event substrate are a single + * shared resource; only one ctx may own them. Reject the start if + * another already-running ctx owns the substrate, or if more than one + * ctx in this batch wants it. 
+ */ + for (i =3D 0; i < nr_ctxs; i++) { + if (!list_empty(&ctxs[i]->perf_events)) { + int j; + + if (damon_perf_owner) { + mutex_unlock(&damon_lock); + return -EBUSY; + } + for (j =3D i + 1; j < nr_ctxs; j++) { + if (!list_empty(&ctxs[j]->perf_events)) { + mutex_unlock(&damon_lock); + return -EBUSY; + } + } + damon_perf_owner =3D ctxs[i]; + break; + } + } + for (i =3D 0; i < nr_ctxs; i++) { err =3D __damon_start(ctxs[i]); if (err) break; nr_running_ctxs++; } + if (err && damon_perf_owner && + !damon_perf_owner->kdamond) + damon_perf_owner =3D NULL; if (exclusive && nr_running_ctxs) running_exclusive_ctxs =3D true; mutex_unlock(&damon_lock); @@ -2113,29 +2223,47 @@ int damos_walk(struct damon_ctx *ctx, struct damos_= walk_control *control) * damon_report_access() - Report identified access events to DAMON. * @report: The reporting access information. * - * Report access events to DAMON. + * Report access events to DAMON via a per-CPU SPSC lockless ring. Produc= er + * is the local CPU (typically NMI from a hardware-sampling backend); + * consumer is the kdamond drain in kdamond_check_reported_accesses(). * - * Context: May sleep. + * Context: any (NMI-safe). An NMI nesting on top of a process-context + * producer on the same CPU would otherwise stomp the same entries[head] + * slot; the busy guard detects and drops in that case. * - * NOTE: we may be able to implement this as a lockless queue, and allow a= ny - * context. As the overhead is unknown, and region-based DAMON logics wou= ld - * guarantee the reports would be not made that frequently, let's start wi= th - * this simple implementation. + * If the ring is full, the sample is dropped and the per-CPU overflow + * counter incremented. 
*/ void damon_report_access(struct damon_access_report *report) { - struct damon_access_report *dst; + struct damon_report_ring *ring; + unsigned int head, next; =20 - /* silently fail for races */ - if (!mutex_trylock(&damon_access_reports_lock)) - return; - dst =3D &damon_access_reports[damon_access_reports_len++]; - /* just drop all existing reports in favor of simplicity. */ - if (damon_access_reports_len =3D=3D DAMON_ACCESS_REPORTS_CAP) - damon_access_reports_len =3D 0; - *dst =3D *report; - dst->report_jiffies =3D jiffies; - mutex_unlock(&damon_access_reports_lock); + /* Pin to a CPU so the SPSC invariant holds for preemptible callers. */ + preempt_disable(); + if (local_inc_return(this_cpu_ptr(&damon_report_ring_busy)) !=3D 1) { + /* NMI nested on a process-context producer; drop. */ + trace_damon_perf_ring_overflow(smp_processor_id()); + goto out; + } + + ring =3D this_cpu_ptr(&damon_report_rings); + head =3D ring->head; + next =3D (head + 1) & DAMON_REPORT_RING_MASK; + + if (next =3D=3D READ_ONCE(ring->tail)) { + trace_damon_perf_ring_overflow(smp_processor_id()); + goto out; + } + + ring->entries[head] =3D *report; + ring->entries[head].report_jiffies =3D jiffies; + smp_wmb(); /* publish entry before head advance */ + WRITE_ONCE(ring->head, next); + WRITE_ONCE(*this_cpu_ptr(&damon_ring_pending), 1); +out: + local_dec(this_cpu_ptr(&damon_report_ring_busy)); + preempt_enable(); } =20 #ifdef CONFIG_MMU @@ -2145,7 +2273,8 @@ void damon_report_page_fault(struct vm_fault *vmf, bo= ol huge_pmd) .vaddr =3D vmf->address, .size =3D 1, /* todo: set appripriately */ .cpu =3D smp_processor_id(), - .tid =3D task_pid_vnr(current), + .tid =3D current->pid, + .tgid =3D task_tgid_nr(current), .is_write =3D vmf->flags & FAULT_FLAG_WRITE, }; =20 @@ -3700,6 +3829,7 @@ static void kdamond_init_ctx(struct damon_ctx *ctx) unsigned long sample_interval =3D ctx->attrs.sample_interval ? 
ctx->attrs.sample_interval : 1; struct damos *scheme; + struct damon_perf_event *event, *next; =20 ctx->passed_sample_intervals =3D 0; ctx->next_aggregation_sis =3D ctx->attrs.aggr_interval / sample_interval; @@ -3713,6 +3843,15 @@ static void kdamond_init_ctx(struct damon_ctx *ctx) damos_set_next_apply_sis(scheme, ctx); damos_set_filters_default_reject(scheme); } + + list_for_each_entry_safe(event, next, &ctx->perf_events, list) { + int err =3D damon_perf_init(ctx, event); + + if (err) { + list_del(&event->list); + kfree(event); + } + } } =20 static bool damon_sample_filter_matching(struct damon_access_report *repor= t, @@ -3759,26 +3898,46 @@ static bool damon_sample_filter_out(struct damon_ac= cess_report *report, } =20 static void kdamond_apply_access_report(struct damon_access_report *report, - struct damon_target *t, struct damon_ctx *ctx) + struct damon_target *t, + struct damon_region **regions, unsigned int nr_regions, + struct damon_ctx *ctx) { struct damon_region *r; unsigned long addr; + int left, right, mid; =20 - if (damon_sample_filter_out(report, &ctx->sample_control)) - return; - if (damon_target_has_pid(ctx)) + if (damon_target_has_pid(ctx)) { + if (pid_nr(t->pid) !=3D report->tgid) + return; addr =3D report->vaddr; - else + } else { addr =3D report->paddr; + } =20 - /* todo: make search faster, e.g., binary search? */ - damon_for_each_region(r, t) { - if (addr < r->ar.start) - continue; - if (r->ar.end < addr + report->size) - continue; - if (!r->access_reported) - damon_update_region_access_rate(r, true, &ctx->attrs); + /* Binary search the snapshot for the region containing addr. */ + left =3D 0; + right =3D nr_regions - 1; + r =3D NULL; + while (left <=3D right) { + /* Avoid (left + right) overflow at large nr_regions. 
*/ + mid =3D left + (right - left) / 2; + if (addr < regions[mid]->ar.start) + right =3D mid - 1; + else if (addr >=3D regions[mid]->ar.end) + left =3D mid + 1; + else { + r =3D regions[mid]; + break; + } + } + + if (!r) + return; + /* Reject reports straddling a region boundary. */ + if (addr + report->size > r->ar.end) + return; + if (!r->access_reported) { + damon_update_region_access_rate(r, true, &ctx->attrs); r->access_reported =3D true; } } @@ -3802,28 +3961,120 @@ static unsigned int kdamond_apply_zero_access_repo= rt(struct damon_ctx *ctx) return max_nr_accesses; } =20 -static unsigned int kdamond_check_reported_accesses(struct damon_ctx *ctx) +/* + * Build a snapshot of the ctx's targets and their region arrays for use + * by the ring drain loop. The snapshot buffer is reused across ticks, + * grown via krealloc only when a new high water mark is reached. + * + * The two-pass walk over adaptive_targets is safe even though + * krealloc_array() may sleep: target list mutation is funneled through + * damon_call onto the kdamond itself, so no other thread can mutate the + * list while kdamond is running this function. 
+ */
+static struct damon_target_lookup *damon_build_target_lookup(
+		struct damon_ctx *ctx, unsigned int *nr_targets_out)
 {
-	int i;
-	struct damon_access_report *report;
 	struct damon_target *t;
+	struct damon_target_lookup *tbl;
+	unsigned int nr_targets = 0, total_regions = 0, ti = 0, ri = 0;
 
-	/* currently damon_access_report supports only physical address */
-	if (damon_target_has_pid(ctx))
-		return 0;
+	damon_for_each_target(t, ctx) {
+		nr_targets++;
+		total_regions += damon_nr_regions(t);
+	}
 
-	mutex_lock(&damon_access_reports_lock);
-	for (i = 0; i < damon_access_reports_len; i++) {
-		report = &damon_access_reports[i];
-		if (time_before(report->report_jiffies,
-					jiffies -
-					usecs_to_jiffies(
-						ctx->attrs.sample_interval)))
+	if (nr_targets > ctx->drain_snapshot.nr_lookups) {
+		tbl = krealloc_array(ctx->drain_snapshot.lookups,
+				nr_targets, sizeof(*tbl), GFP_KERNEL);
+		if (!tbl)
+			return NULL;
+		ctx->drain_snapshot.lookups = tbl;
+		ctx->drain_snapshot.nr_lookups = nr_targets;
+	}
+	tbl = ctx->drain_snapshot.lookups;
+
+	if (total_regions > ctx->drain_snapshot.region_buf_cap) {
+		struct damon_region **buf;
+
+		buf = krealloc_array(ctx->drain_snapshot.region_buf,
+				total_regions, sizeof(*buf), GFP_KERNEL);
+		if (!buf)
+			return NULL;
+		ctx->drain_snapshot.region_buf = buf;
+		ctx->drain_snapshot.region_buf_cap = total_regions;
+	}
+
+	damon_for_each_target(t, ctx) {
+		struct damon_region *r;
+
+		tbl[ti].t = t;
+		tbl[ti].regions = &ctx->drain_snapshot.region_buf[ri];
+		tbl[ti].nr_regions = damon_nr_regions(t);
+		damon_for_each_region(r, t)
+			ctx->drain_snapshot.region_buf[ri++] = r;
+		ti++;
+	}
+
+	*nr_targets_out = nr_targets;
+	return tbl;
+}
+
+static unsigned int kdamond_check_reported_accesses(struct damon_ctx *ctx)
+{
+	int cpu;
+	struct damon_target_lookup *tbl;
+	unsigned int nr_targets = 0;
+	unsigned int i;
+
+	tbl = damon_build_target_lookup(ctx, &nr_targets);
+	if (!tbl) {
+		pr_warn_ratelimited(
+			"damon: target-lookup alloc failed; ring drain skipped this tick\n");
+		return kdamond_apply_zero_access_report(ctx);
+	}
+
+	for_each_online_cpu(cpu) {
+		struct damon_report_ring *ring;
+		unsigned int head, tail;
+
+		if (!READ_ONCE(*per_cpu_ptr(&damon_ring_pending, cpu)))
 			continue;
-		damon_for_each_target(t, ctx)
-			kdamond_apply_access_report(report, t, ctx);
+		ring = per_cpu_ptr(&damon_report_rings, cpu);
+
+		WRITE_ONCE(*per_cpu_ptr(&damon_ring_pending, cpu), 0);
+		/*
+		 * Pair with the producer's smp_wmb between entry and head
+		 * publish: order our flag clear before the head read so that
+		 * a producer publishing between our clear and READ_ONCE(head)
+		 * is observed via the flag it re-sets, not lost as a
+		 * stale-head drain.
+		 */
+		smp_mb();
+		head = READ_ONCE(ring->head);
+		smp_rmb(); /* pair with smp_wmb in producer */
+		tail = ring->tail;
+
+		while (tail != head) {
+			struct damon_access_report *report =
+				&ring->entries[tail];
+
+			if (time_before(report->report_jiffies,
+					jiffies - usecs_to_jiffies(
+						ctx->attrs.sample_interval)))
+				goto next;
+			if (damon_sample_filter_out(report,
+					&ctx->sample_control))
+				goto next;
+			for (i = 0; i < nr_targets; i++)
+				kdamond_apply_access_report(report,
+						tbl[i].t,
+						tbl[i].regions,
+						tbl[i].nr_regions, ctx);
+next:
+			tail = (tail + 1) & DAMON_REPORT_RING_MASK;
+		}
+		WRITE_ONCE(ring->tail, tail);
 	}
-	mutex_unlock(&damon_access_reports_lock);
 	/* For nr_accesses_bp, absence of access should also be reported. */
 	return kdamond_apply_zero_access_report(ctx);
 }
@@ -3848,6 +4099,14 @@ static int kdamond_fn(void *data)
 	complete(&ctx->kdamond_started);
 	kdamond_init_ctx(ctx);
 
+	if (!list_empty(&ctx->perf_events)) {
+		struct damon_perf_event *event;
+
+		WRITE_ONCE(ctx->perf_events_active, true);
+		list_for_each_entry(event, &ctx->perf_events, list)
+			damon_perf_event_arm(event);
+	}
+
 	if (ctx->ops.init)
 		ctx->ops.init(ctx);
 	ctx->regions_score_histogram = kmalloc_array(DAMOS_MAX_SCORE + 1,
@@ -3871,14 +4130,15 @@ static int kdamond_fn(void *data)
 		if (kdamond_wait_activation(ctx))
 			break;
 
-		if (ctx->ops.prepare_access_checks)
+		if (list_empty(&ctx->perf_events) &&
+		    ctx->ops.prepare_access_checks)
 			ctx->ops.prepare_access_checks(ctx);
 
 		kdamond_usleep(sample_interval);
 		ctx->passed_sample_intervals++;
 
-		/* todo: make these non-exclusive */
-		if (ctx->sample_control.primitives_enabled.page_fault)
+		if (!list_empty(&ctx->perf_events) ||
+		    ctx->sample_control.primitives_enabled.page_fault)
 			max_nr_accesses = kdamond_check_reported_accesses(ctx);
 		else if (ctx->ops.check_accesses)
 			max_nr_accesses = ctx->ops.check_accesses(ctx);
@@ -3965,6 +4225,15 @@ static int kdamond_fn(void *data)
 		}
 	}
 done:
+	if (ctx->perf_events_active) {
+		struct damon_perf_event *event;
+
+		WRITE_ONCE(ctx->perf_events_active, false);
+		list_for_each_entry(event, &ctx->perf_events, list)
+			damon_perf_event_disarm(event);
+		/* Drain any in-flight reports queued before disarm took effect. */
+		kdamond_check_reported_accesses(ctx);
+	}
 	damon_destroy_targets(ctx);
 
 	kfree(ctx->regions_score_histogram);
@@ -3986,6 +4255,8 @@ static int kdamond_fn(void *data)
 	nr_running_ctxs--;
 	if (!nr_running_ctxs && running_exclusive_ctxs)
 		running_exclusive_ctxs = false;
+	if (damon_perf_owner == ctx)
+		damon_perf_owner = NULL;
 	mutex_unlock(&damon_lock);
 
 	return 0;
-- 
2.43.0

From nobody Mon Jun 8 10:56:43 2026
From: Ravi Jonnalagadda
To: sj@kernel.org, akinobu.mita@gmail.com, damon@lists.linux.dev,
 linux-mm@kvack.org, linux-kernel@vger.kernel.org,
 linux-doc@vger.kernel.org
Cc: akpm@linux-foundation.org, corbet@lwn.net, bijan311@gmail.com,
 ajayjoshi@micron.com, honggyu.kim@sk.com, yunjeong.mun@sk.com,
 ravis.opensrc@gmail.com
Subject: [RFC PATCH 5/6] mm/damon/vaddr: implement perf-event access check
Date: Fri, 29 May 2026 09:56:39 -0700
Message-ID: <20260529165640.820-6-ravis.opensrc@gmail.com>
In-Reply-To: <20260529165640.820-1-ravis.opensrc@gmail.com>
References: <20260529165640.820-1-ravis.opensrc@gmail.com>

Add the perf-event backend used by the substrate. Two stateless NMI
overflow handlers are picked at perf_event_create_kernel_counter() time
(paddr- vs vaddr-keyed) and called with context = NULL, so the NMI fast
path never dereferences the per-event struct. Each submits a
damon_access_report into the per-CPU ring.
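
As an illustration of the per-CPU ring the handlers feed, the following is
a minimal userspace sketch of a single-producer/single-consumer ring. It
uses C11 release/acquire atomics where the kernel code uses the
smp_wmb()/smp_rmb() pairing. All names here (struct ring, ring_push,
ring_drain, RING_SIZE) are hypothetical stand-ins rather than identifiers
from this series, and the sketch leaves out the pending flag and the full
barrier that the real drain path relies on.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define RING_SIZE 8u			/* hypothetical; must be a power of two */
#define RING_MASK (RING_SIZE - 1u)

struct report {				/* stand-in for struct damon_access_report */
	uint64_t addr;
	bool is_write;
};

struct ring {
	struct report entries[RING_SIZE];
	_Atomic unsigned int head;	/* written only by the producer */
	_Atomic unsigned int tail;	/* written only by the consumer */
};

/* Producer side: returns false (sample dropped) when the ring is full. */
static bool ring_push(struct ring *r, struct report rep)
{
	unsigned int head = atomic_load_explicit(&r->head, memory_order_relaxed);
	unsigned int next = (head + 1u) & RING_MASK;

	if (next == atomic_load_explicit(&r->tail, memory_order_acquire))
		return false;
	r->entries[head] = rep;
	/* Release: the entry becomes visible before the new head does. */
	atomic_store_explicit(&r->head, next, memory_order_release);
	return true;
}

/* Consumer side: copies up to max published entries into out, returns the count. */
static unsigned int ring_drain(struct ring *r, struct report *out,
		unsigned int max)
{
	/* Acquire: pairs with the release store of head in ring_push(). */
	unsigned int head = atomic_load_explicit(&r->head, memory_order_acquire);
	unsigned int tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
	unsigned int n = 0;

	while (tail != head && n < max) {
		out[n++] = r->entries[tail];
		tail = (tail + 1u) & RING_MASK;
	}
	/* Release: the slots are free only after the copies above are done. */
	atomic_store_explicit(&r->tail, tail, memory_order_release);
	return n;
}
```

One slot is kept empty to tell a full ring from an empty one, so a ring of
RING_SIZE entries holds at most RING_SIZE - 1 reports. Dropping on a full
ring is an assumption of the sketch.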

The vaddr handler drops samples with addr == 0 or addr >= TASK_SIZE.
The paddr handler gates on data->sample_flags & PERF_SAMPLE_PHYS_ADDR
rather than testing data->phys_addr for zero (which would also drop
legitimate page 0). AMD IBS Op only populates phys_addr when
IBS_OP_DATA3.dc_phy_addr_valid is set; gating on sample_flags is the
documented way to detect that. is_write is derived from
data->data_src.mem_op.

cpuhp_setup_state_multi() registers one global state at subsys_initcall;
each damon_perf_event is added as an instance in damon_perf_init() so
cpuhp drives per-CPU event creation and offline-time release. Events
are created with disabled=1 and armed by kdamond_fn() when the substrate
is ready; per-CPU init failures are surfaced via init_complete /
any_cpu_failed so damon_perf_init() rolls back the cpuhp instance
instead of leaving a half-armed event behind.

Signed-off-by: Ravi Jonnalagadda
---
 mm/damon/vaddr.c | 267 +++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 267 insertions(+)

diff --git a/mm/damon/vaddr.c b/mm/damon/vaddr.c
index d271476035641..73fcea91afa07 100644
--- a/mm/damon/vaddr.c
+++ b/mm/damon/vaddr.c
@@ -7,11 +7,13 @@
 
 #define pr_fmt(fmt) "damon-va: " fmt
 
+#include
 #include
 #include
 #include
 #include
 #include
+#include
 #include
 #include
 
@@ -957,6 +959,263 @@ static int damon_va_scheme_score(struct damon_ctx *context,
 	return DAMOS_MAX_SCORE;
 }
 
+#ifdef CONFIG_PERF_EVENTS
+
+#define DAMON_PERF_MAX_RECORDS (1UL << 20)
+#define DAMON_PERF_INIT_RECORDS (1UL << 15)
+
+/*
+ * NMI hot-path: avoid every heap dereference. These handlers carry no
+ * pointer back to the per-event struct -- perf_event_create_kernel_counter
+ * is called with context = NULL. Submission flows into the global
+ * per-CPU SPSC ring (damon_report_access -> kdamond_check_reported_accesses
+ * drains).
+ */
+static void damon_perf_overflow_vaddr(struct perf_event *perf_event,
+		struct perf_sample_data *data, struct pt_regs *regs)
+{
+	struct damon_access_report report;
+
+	if (!data || !data->addr)
+		return;
+
+	/* Drop kernel-VA hits -- only user-space VAs land in damon vaddr regions. */
+	if (data->addr >= TASK_SIZE)
+		return;
+
+	report = (struct damon_access_report){
+		.vaddr = data->addr & PAGE_MASK,
+		.size = PAGE_SIZE,
+		.cpu = smp_processor_id(),
+		.tid = current->pid,
+		.tgid = current->tgid,
+		.is_write = !!(data->data_src.mem_op & PERF_MEM_OP_STORE),
+	};
+	damon_report_access(&report);
+}
+
+static void damon_perf_overflow_paddr(struct perf_event *perf_event,
+		struct perf_sample_data *data, struct pt_regs *regs)
+{
+	struct damon_access_report report;
+
+	if (!data)
+		return;
+
+	/*
+	 * AMD IBS Op only populates data->phys_addr when
+	 * IBS_OP_DATA3.dc_phy_addr_valid is set; otherwise the field
+	 * carries a stale value. Gate on sample_flags rather than testing
+	 * phys_addr for zero (which would also drop legitimate page 0).
+	 */
+	if (!(data->sample_flags & PERF_SAMPLE_PHYS_ADDR))
+		return;
+
+	report = (struct damon_access_report){
+		.paddr = data->phys_addr & PAGE_MASK,
+		.size = PAGE_SIZE,
+		.cpu = smp_processor_id(),
+		.is_write = !!(data->data_src.mem_op & PERF_MEM_OP_STORE),
+	};
+	damon_report_access(&report);
+}
+
+static enum cpuhp_state damon_perf_cpuhp_state;
+
+static void damon_perf_event_init_attr(struct damon_perf_event *event,
+		struct perf_event_attr *attr)
+{
+	*attr = (struct perf_event_attr) {
+		.size = sizeof(*attr),
+		.type = event->attr.type,
+		.config = event->attr.config,
+		.config1 = event->attr.config1,
+		.config2 = event->attr.config2,
+		.freq = event->attr.freq,
+		.sample_type = PERF_SAMPLE_TIME | PERF_SAMPLE_ADDR |
+			PERF_SAMPLE_PERIOD | PERF_SAMPLE_DATA_SRC |
+			(event->attr.sample_phys_addr ?
+			 PERF_SAMPLE_PHYS_ADDR : 0) |
+			(event->attr.sample_weight_struct ?
+			 PERF_SAMPLE_WEIGHT_STRUCT : 0),
+		.precise_ip = event->attr.precise_ip,
+		.pinned = 1,
+		.disabled = 1,
+		.wakeup_events = event->attr.wakeup_events,
+		.exclude_kernel = event->attr.exclude_kernel,
+		.exclude_hv = event->attr.exclude_hv,
+	};
+
+	/*
+	 * sample_period and sample_freq share storage in the kernel
+	 * perf_event_attr (union). Select based on the freq toggle so
+	 * frequency-based callers (PEBS) and period-based callers
+	 * (AMD IBS Op MaxCnt) both work correctly.
+	 */
+	if (event->attr.freq)
+		attr->sample_freq = event->attr.sample_freq;
+	else
+		attr->sample_period = event->attr.sample_period;
+}
+
+static int damon_perf_cpu_online(unsigned int cpu, struct hlist_node *node)
+{
+	struct damon_perf_event *event = hlist_entry(node,
+			struct damon_perf_event, hlist_node);
+	struct damon_perf *perf = event->priv;
+	struct perf_event_attr attr;
+	struct perf_event *perf_event;
+	perf_overflow_handler_t handler;
+
+	if (!perf)
+		return 0;
+
+	damon_perf_event_init_attr(event, &attr);
+
+	/*
+	 * Pick a paddr- or vaddr-specific handler at create time so the
+	 * NMI fast path is statically branched. Pass NULL as context --
+	 * handlers are stateless wrt the per-event struct, so the NMI
+	 * fast path performs no per-event heap dereference. Submission
+	 * flows into the global per-CPU SPSC ring via damon_report_access().
+	 */
+	handler = event->attr.sample_phys_addr ?
+		damon_perf_overflow_paddr : damon_perf_overflow_vaddr;
+
+	perf_event = perf_event_create_kernel_counter(&attr, cpu, NULL,
+			handler, NULL);
+	if (IS_ERR(perf_event)) {
+		pr_warn_ratelimited("damon-perf: cpu %u event create failed: %ld\n",
+				cpu, PTR_ERR(perf_event));
+		if (!event->init_complete)
+			event->any_cpu_failed = true;
+		return 0; /* never block CPU online */
+	}
+	*per_cpu_ptr(perf->event, cpu) = perf_event;
+	/*
+	 * Late-online CPU after the substrate is armed: events are created
+	 * with attr.disabled = 1 and would otherwise stay quiescent on this
+	 * CPU until the next arm walk. Enable here so coverage matches the
+	 * already-online CPUs.
+	 */
+	if (event->ctx && READ_ONCE(event->ctx->perf_events_active))
+		perf_event_enable(perf_event);
+	return 0;
+}
+
+static int damon_perf_cpu_offline(unsigned int cpu, struct hlist_node *node)
+{
+	struct damon_perf_event *event = hlist_entry(node,
+			struct damon_perf_event, hlist_node);
+	struct damon_perf *perf = event->priv;
+	struct perf_event *perf_event;
+
+	if (!perf)
+		return 0;
+
+	perf_event = per_cpu(*perf->event, cpu);
+	if (perf_event) {
+		perf_event_disable(perf_event);
+		perf_event_release_kernel(perf_event);
+		*per_cpu_ptr(perf->event, cpu) = NULL;
+	}
+	return 0;
+}
+
+void damon_perf_event_arm(struct damon_perf_event *event)
+{
+	struct damon_perf *perf = event->priv;
+	struct perf_event *perf_event;
+	int cpu;
+
+	if (!perf)
+		return;
+
+	for_each_online_cpu(cpu) {
+		perf_event = *per_cpu_ptr(perf->event, cpu);
+		if (perf_event)
+			perf_event_enable(perf_event);
+	}
+}
+
+void damon_perf_event_disarm(struct damon_perf_event *event)
+{
+	struct damon_perf *perf = event->priv;
+	struct perf_event *perf_event;
+	int cpu;
+
+	if (!perf)
+		return;
+
+	for_each_online_cpu(cpu) {
+		perf_event = *per_cpu_ptr(perf->event, cpu);
+		if (perf_event)
+			perf_event_disable(perf_event);
+	}
+}
+
+int damon_perf_init(struct damon_ctx *ctx, struct damon_perf_event *event)
+{
+	struct damon_perf *perf;
+	int err = -ENOMEM;
+
+	perf = kzalloc(sizeof(*perf), GFP_KERNEL);
+	if (!perf)
+		return -ENOMEM;
+
+	perf->event = alloc_percpu(typeof(*perf->event));
+	if (!perf->event)
+		goto free_perf;
+
+	event->priv = perf;
+	event->ctx = ctx;
+	INIT_HLIST_NODE(&event->hlist_node);
+
+	/*
+	 * cpuhp_state_add_instance() invokes the online callback synchronously
+	 * for every currently-online CPU; late-online CPUs subsequently get
+	 * an event automatically and offline CPUs release theirs cleanly.
+	 */
+	err = cpuhp_state_add_instance(damon_perf_cpuhp_state,
+			&event->hlist_node);
+	if (err)
+		goto free_event;
+
+	event->init_complete = true;
+	if (event->any_cpu_failed) {
+		cpuhp_state_remove_instance(damon_perf_cpuhp_state,
+				&event->hlist_node);
+		err = -ENODEV;
+		goto free_event;
+	}
+
+	return 0;
+
+free_event:
+	free_percpu(perf->event);
+free_perf:
+	kfree(perf);
+	event->priv = NULL;
+	return err;
+}
+
+void damon_perf_cleanup(struct damon_ctx *ctx, struct damon_perf_event *event)
+{
+	struct damon_perf *perf = event->priv;
+
+	if (!perf)
+		return;
+
+	cpuhp_state_remove_instance(damon_perf_cpuhp_state,
+			&event->hlist_node);
+
+	free_percpu(perf->event);
+	kfree(perf);
+	event->priv = NULL;
+}
+
+#endif /* CONFIG_PERF_EVENTS */
+
 static int __init damon_va_initcall(void)
 {
 	struct damon_operations ops = {
@@ -979,6 +1238,14 @@ static int __init damon_va_initcall(void)
 	ops_fvaddr.init = NULL;
 	ops_fvaddr.update = NULL;
 
+#ifdef CONFIG_PERF_EVENTS
+	err = cpuhp_setup_state_multi(CPUHP_AP_ONLINE_DYN, "damon/perf:online",
+			damon_perf_cpu_online, damon_perf_cpu_offline);
+	if (err < 0)
+		return err;
+	damon_perf_cpuhp_state = err;
+#endif
+
 	err = damon_register_ops(&ops);
 	if (err)
 		return err;
-- 
2.43.0

From nobody Mon Jun 8 10:56:43 2026
From: Ravi Jonnalagadda
To: sj@kernel.org, akinobu.mita@gmail.com, damon@lists.linux.dev,
 linux-mm@kvack.org, linux-kernel@vger.kernel.org,
 linux-doc@vger.kernel.org
Cc: akpm@linux-foundation.org, corbet@lwn.net, bijan311@gmail.com,
 ajayjoshi@micron.com, honggyu.kim@sk.com, yunjeong.mun@sk.com,
 ravis.opensrc@gmail.com
Subject: [RFC PATCH 6/6] mm/damon: add damos_node_eligible_mem_bp tracepoint
Date: Fri, 29 May 2026 09:56:40 -0700
Message-ID: <20260529165640.820-7-ravis.opensrc@gmail.com>
In-Reply-To: <20260529165640.820-1-ravis.opensrc@gmail.com>
References: <20260529165640.820-1-ravis.opensrc@gmail.com>

Fire a tracepoint at every DAMOS_QUOTA_NODE_ELIGIBLE_MEM_BP goal
evaluation, exposing (context, scheme, nid, target_value,
current_value). This gives userspace observability into goal-tracking
without polling sysfs. The trace_..._enabled() guard avoids the
damon_for_each_scheme() iteration cost when nothing is listening.

Signed-off-by: Ravi Jonnalagadda
---
 include/trace/events/damon.h | 32 ++++++++++++++++++++++++++++++++
 mm/damon/core.c              | 20 ++++++++++++++++++++
 2 files changed, 52 insertions(+)

diff --git a/include/trace/events/damon.h b/include/trace/events/damon.h
index e97e70579a8c8..877627c9a1a18 100644
--- a/include/trace/events/damon.h
+++ b/include/trace/events/damon.h
@@ -91,6 +91,38 @@ TRACE_EVENT(damon_perf_ring_overflow,
 	TP_printk("cpu=%d", __entry->cpu)
 );
 
+/* Per-tick DAMOS_QUOTA_NODE_ELIGIBLE_MEM_BP goal evaluation. */
+TRACE_EVENT(damos_node_eligible_mem_bp,
+
+	TP_PROTO(unsigned int context_idx, unsigned int scheme_idx,
+		int nid,
+		unsigned long target_value, unsigned long current_value),
+
+	TP_ARGS(context_idx, scheme_idx, nid, target_value, current_value),
+
+	TP_STRUCT__entry(
+		__field(unsigned int, context_idx)
+		__field(unsigned int, scheme_idx)
+		__field(int, nid)
+		__field(unsigned long, target_value)
+		__field(unsigned long, current_value)
+	),
+
+	TP_fast_assign(
+		__entry->context_idx = context_idx;
+		__entry->scheme_idx = scheme_idx;
+		__entry->nid = nid;
+		__entry->target_value = target_value;
+		__entry->current_value = current_value;
+	),
+
+	TP_printk("ctx_idx=%u scheme_idx=%u nid=%d "
+		"target_value=%lu current_value=%lu",
+		__entry->context_idx, __entry->scheme_idx,
+		__entry->nid,
+		__entry->target_value, __entry->current_value)
+);
+
 TRACE_EVENT_CONDITION(damos_before_apply,
 
 	TP_PROTO(unsigned int context_idx, unsigned int scheme_idx,
diff --git a/mm/damon/core.c b/mm/damon/core.c
index 1e6966e45144f..609d627e2b33e 100644
--- a/mm/damon/core.c
+++ b/mm/damon/core.c
@@ -3203,6 +3203,26 @@ static unsigned long damos_quota_score(struct damon_ctx *c, struct damos *s)
 		highest_score = max(highest_score,
 				mult_frac(goal->current_value, 10000,
 					goal->target_value));
+
+		/*
+		 * Per-tick visibility of NODE_ELIGIBLE_MEM_BP goal evaluation
+		 * for userspace convergence-detection.
+		 */
+		if (goal->metric == DAMOS_QUOTA_NODE_ELIGIBLE_MEM_BP &&
+		    trace_damos_node_eligible_mem_bp_enabled()) {
+			unsigned int cidx = 0, sidx = 0;
+			struct damos *siter;
+
+			damon_for_each_scheme(siter, c) {
+				if (siter == s)
+					break;
+				sidx++;
+			}
+			trace_damos_node_eligible_mem_bp(cidx, sidx,
+					goal->nid,
+					goal->target_value,
+					goal->current_value);
+		}
 	}
 
 	return highest_score;
-- 
2.43.0
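
For readers consuming the tracepoint: the hunk context above shows
damos_quota_score() combining the two traced values as
mult_frac(goal->current_value, 10000, goal->target_value), that is, their
ratio in basis points. The following is a minimal userspace sketch of that
arithmetic. goal_score_bp() is a hypothetical helper name, not a kernel
API, and returning 0 for a zero target is an assumption of the sketch
rather than something this series specifies.

```c
/* Userspace sketch; goal_score_bp() is a hypothetical name, not a kernel API. */
static unsigned long goal_score_bp(unsigned long current_value,
		unsigned long target_value)
{
	unsigned long q, r;

	if (!target_value)	/* assumption: a zero target yields no score */
		return 0;
	q = current_value / target_value;
	r = current_value % target_value;
	/*
	 * Same quotient/remainder split as mult_frac(x, 10000, d), which
	 * keeps the intermediate product small: q * n + r * n / d.
	 */
	return q * 10000UL + r * 10000UL / target_value;
}
```

With this arithmetic a result of 10000 means current_value equals
target_value; smaller results mean the goal has not been reached yet.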