From nobody Thu Feb 12 21:45:30 2026 Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by smtp.subspace.kernel.org (Postfix) with ESMTP id 4D4B6197516; Thu, 6 Jun 2024 14:41:08 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=217.140.110.172 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1717684870; cv=none; b=ffRio5mkEFPDDxywBI/2/b5HMfSuYU+gDdK9kF66mLRqPSY+ptCnUNYqrvc8E4lVLxocPgY60l0j+yiXcb9/3zXKYtnsHU3J92jTbhsYLWS1H5DTw8bBRe9BGWg3UDoM7lRoizS9h2TRsT5TkVbm541emg6mTSnsb6EOvRqw5hs= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1717684870; c=relaxed/simple; bh=jgvDYoAmqkn4uTQVbYHxbGX5xcFETuFf2q77vyPdHXU=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=LbpgKVqcuWWkqEtUMVAWWrhO+yTvVtCZrxZzu6//Q/KjSa1NAyE8QeYJRJysEt95fglG90bj0v/WhkfZelYFOp9BaIN1/lKgfmO9BcLzcn/TLC5U2zE8LEJGCVWbF+CfozroD9eDw/GftbyTIf/aGDqCLqoRc9gHEak0mMEosts= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=arm.com; spf=pass smtp.mailfrom=arm.com; arc=none smtp.client-ip=217.140.110.172 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=arm.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=arm.com Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 155B3339; Thu, 6 Jun 2024 07:41:32 -0700 (PDT) Received: from e126817.cambridge.arm.com (e126817.cambridge.arm.com [10.2.3.5]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 01A363F64C; Thu, 6 Jun 2024 07:41:05 -0700 (PDT) From: Ben Gainey To: peterz@infradead.org, mingo@redhat.com, acme@kernel.org, namhyung@kernel.org Cc: james.clark@arm.com, mark.rutland@arm.com, alexander.shishkin@linux.intel.com, jolsa@kernel.org, irogers@google.com, adrian.hunter@intel.com, linux-perf-users@vger.kernel.org, 
linux-kernel@vger.kernel.org, Ben Gainey
Subject: [PATCH v7 1/4] perf: Support PERF_SAMPLE_READ with inherit
Date: Thu, 6 Jun 2024 15:40:56 +0100
Message-ID: <20240606144059.365633-2-ben.gainey@arm.com>
X-Mailer: git-send-email 2.45.2
In-Reply-To: <20240606144059.365633-1-ben.gainey@arm.com>
References: <20240606144059.365633-1-ben.gainey@arm.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

This change allows events to use PERF_SAMPLE_READ with inherit so long
as PERF_SAMPLE_TID is also set.

In this configuration, an event will be inherited into any child
processes / threads, allowing convenient profiling of a multiprocess or
multithreaded application, whilst allowing profiling tools to collect
per-thread samples, in particular of groups of counters.

The read_format field of both PERF_RECORD_READ and PERF_RECORD_SAMPLE
is changed by this new configuration, but calls to `read()` on the same
event file descriptor are unaffected and continue to return the
cumulative total.
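For illustration only, the configuration described above could be requested from userspace roughly as follows. This is a minimal sketch, not part of the patch: the helper names init_inherit_sample_read_attr() and attr_accepted_by_patched_kernel() are hypothetical, the event type and sample period are arbitrary, and the second helper merely mirrors in userspace the check this patch places in perf_event_alloc().

```c
#include <linux/perf_event.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper (not part of the patch): fill in an attr for an
 * inherited, sampled cycles event that reports counter values with
 * each sample.
 */
static void init_inherit_sample_read_attr(struct perf_event_attr *attr)
{
	memset(attr, 0, sizeof(*attr));
	attr->size = sizeof(*attr);
	attr->type = PERF_TYPE_HARDWARE;
	attr->config = PERF_COUNT_HW_CPU_CYCLES;
	attr->sample_period = 100000;	/* arbitrary illustrative period */
	/* PERF_SAMPLE_TID is what makes inherit + PERF_SAMPLE_READ legal */
	attr->sample_type = PERF_SAMPLE_TID | PERF_SAMPLE_READ;
	attr->read_format = PERF_FORMAT_GROUP | PERF_FORMAT_ID;
	attr->inherit = 1;
	attr->disabled = 1;
}

/*
 * Hypothetical userspace mirror of the check in perf_event_alloc():
 * inherit + PERF_SAMPLE_READ is rejected unless PERF_SAMPLE_TID is set.
 */
static bool attr_accepted_by_patched_kernel(const struct perf_event_attr *attr)
{
	bool inherit_read = attr->inherit &&
			    (attr->sample_type & PERF_SAMPLE_READ);

	return !inherit_read || (attr->sample_type & PERF_SAMPLE_TID);
}
```

An attr filled in this way would then be passed to the perf_event_open syscall as usual; on kernels without this change the open is expected to fail, so a caller would need a fallback such as clearing attr.inherit.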
Signed-off-by: Ben Gainey --- include/linux/perf_event.h | 1 + kernel/events/core.c | 78 ++++++++++++++++++++++++++++---------- 2 files changed, 58 insertions(+), 21 deletions(-) diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h index a5304ae8c654..810e0fe85572 100644 --- a/include/linux/perf_event.h +++ b/include/linux/perf_event.h @@ -935,6 +935,7 @@ struct perf_event_context { =20 int nr_task_data; int nr_stat; + int nr_inherit_read; int nr_freq; int rotate_disable; =20 diff --git a/kernel/events/core.c b/kernel/events/core.c index f0128c5ff278..5c4f292bc6ce 100644 --- a/kernel/events/core.c +++ b/kernel/events/core.c @@ -1767,6 +1767,14 @@ perf_event_groups_next(struct perf_event *event, str= uct pmu *pmu) event =3D rb_entry_safe(rb_next(&event->group_node), \ typeof(*event), group_node)) =20 +/* + * Does the event attribute request inherit with PERF_SAMPLE_READ + */ +static inline bool has_inherit_and_sample_read(struct perf_event_attr *att= r) +{ + return attr->inherit && (attr->sample_type & PERF_SAMPLE_READ); +} + /* * Add an event from the lists for its context. * Must be called with ctx->mutex and ctx->lock held. @@ -1797,6 +1805,8 @@ list_add_event(struct perf_event *event, struct perf_= event_context *ctx) ctx->nr_user++; if (event->attr.inherit_stat) ctx->nr_stat++; + if (has_inherit_and_sample_read(&event->attr)) + ctx->nr_inherit_read++; =20 if (event->state > PERF_EVENT_STATE_OFF) perf_cgroup_event_enable(event, ctx); @@ -2021,6 +2031,8 @@ list_del_event(struct perf_event *event, struct perf_= event_context *ctx) ctx->nr_user--; if (event->attr.inherit_stat) ctx->nr_stat--; + if (has_inherit_and_sample_read(&event->attr)) + ctx->nr_inherit_read--; =20 list_del_rcu(&event->event_entry); =20 @@ -3532,11 +3544,18 @@ perf_event_context_sched_out(struct task_struct *ta= sk, struct task_struct *next) perf_ctx_disable(ctx, false); =20 /* PMIs are disabled; ctx->nr_pending is stable. 
*/ - if (local_read(&ctx->nr_pending) || + if (ctx->nr_inherit_read || + next_ctx->nr_inherit_read || + local_read(&ctx->nr_pending) || local_read(&next_ctx->nr_pending)) { /* * Must not swap out ctx when there's pending * events that rely on the ctx->task relation. + * + * Likewise, when a context contains inherit + + * SAMPLE_READ events they should be switched + * out using the slow path so that they are + * treated as if they were distinct contexts. */ raw_spin_unlock(&next_ctx->lock); rcu_read_unlock(); @@ -4552,11 +4571,19 @@ static void __perf_event_read(void *info) raw_spin_unlock(&ctx->lock); } =20 -static inline u64 perf_event_count(struct perf_event *event) +static inline u64 perf_event_count_cumulative(struct perf_event *event) { return local64_read(&event->count) + atomic64_read(&event->child_count); } =20 +static inline u64 perf_event_count(struct perf_event *event, bool self_val= ue_only) +{ + if (self_value_only && has_inherit_and_sample_read(&event->attr)) + return local64_read(&event->count); + + return perf_event_count_cumulative(event); +} + static void calc_timer_values(struct perf_event *event, u64 *now, u64 *enabled, @@ -5473,7 +5500,7 @@ static u64 __perf_event_read_value(struct perf_event = *event, u64 *enabled, u64 * mutex_lock(&event->child_mutex); =20 (void)perf_event_read(event, false); - total +=3D perf_event_count(event); + total +=3D perf_event_count_cumulative(event); =20 *enabled +=3D event->total_time_enabled + atomic64_read(&event->child_total_time_enabled); @@ -5482,7 +5509,7 @@ static u64 __perf_event_read_value(struct perf_event = *event, u64 *enabled, u64 * =20 list_for_each_entry(child, &event->child_list, child_list) { (void)perf_event_read(child, false); - total +=3D perf_event_count(child); + total +=3D perf_event_count_cumulative(child); *enabled +=3D child->total_time_enabled; *running +=3D child->total_time_running; } @@ -5564,14 +5591,14 @@ static int __perf_read_group_add(struct perf_event = *leader, /* * Write 
{count,id} tuples for every sibling. */ - values[n++] +=3D perf_event_count(leader); + values[n++] +=3D perf_event_count_cumulative(leader); if (read_format & PERF_FORMAT_ID) values[n++] =3D primary_event_id(leader); if (read_format & PERF_FORMAT_LOST) values[n++] =3D atomic64_read(&leader->lost_samples); =20 for_each_sibling_event(sub, leader) { - values[n++] +=3D perf_event_count(sub); + values[n++] +=3D perf_event_count_cumulative(sub); if (read_format & PERF_FORMAT_ID) values[n++] =3D primary_event_id(sub); if (read_format & PERF_FORMAT_LOST) @@ -6151,7 +6178,7 @@ void perf_event_update_userpage(struct perf_event *ev= ent) ++userpg->lock; barrier(); userpg->index =3D perf_event_index(event); - userpg->offset =3D perf_event_count(event); + userpg->offset =3D perf_event_count_cumulative(event); if (userpg->index) userpg->offset -=3D local64_read(&event->hw.prev_count); =20 @@ -7205,13 +7232,14 @@ void perf_event__output_id_sample(struct perf_event= *event, =20 static void perf_output_read_one(struct perf_output_handle *handle, struct perf_event *event, - u64 enabled, u64 running) + u64 enabled, u64 running, + bool from_sample) { u64 read_format =3D event->attr.read_format; u64 values[5]; int n =3D 0; =20 - values[n++] =3D perf_event_count(event); + values[n++] =3D perf_event_count(event, from_sample); if (read_format & PERF_FORMAT_TOTAL_TIME_ENABLED) { values[n++] =3D enabled + atomic64_read(&event->child_total_time_enabled); @@ -7229,8 +7257,9 @@ static void perf_output_read_one(struct perf_output_h= andle *handle, } =20 static void perf_output_read_group(struct perf_output_handle *handle, - struct perf_event *event, - u64 enabled, u64 running) + struct perf_event *event, + u64 enabled, u64 running, + bool from_sample) { struct perf_event *leader =3D event->group_leader, *sub; u64 read_format =3D event->attr.read_format; @@ -7256,7 +7285,7 @@ static void perf_output_read_group(struct perf_output= _handle *handle, (leader->state =3D=3D PERF_EVENT_STATE_ACTIVE)) 
leader->pmu->read(leader); =20 - values[n++] =3D perf_event_count(leader); + values[n++] =3D perf_event_count(leader, from_sample); if (read_format & PERF_FORMAT_ID) values[n++] =3D primary_event_id(leader); if (read_format & PERF_FORMAT_LOST) @@ -7271,7 +7300,7 @@ static void perf_output_read_group(struct perf_output= _handle *handle, (sub->state =3D=3D PERF_EVENT_STATE_ACTIVE)) sub->pmu->read(sub); =20 - values[n++] =3D perf_event_count(sub); + values[n++] =3D perf_event_count(sub, from_sample); if (read_format & PERF_FORMAT_ID) values[n++] =3D primary_event_id(sub); if (read_format & PERF_FORMAT_LOST) @@ -7292,9 +7321,14 @@ static void perf_output_read_group(struct perf_outpu= t_handle *handle, * The problem is that its both hard and excessively expensive to iterate = the * child list, not to mention that its impossible to IPI the children runn= ing * on another CPU, from interrupt/NMI context. + * + * Instead the combination of PERF_SAMPLE_READ and inherit will track per-= thread + * counts rather than attempting to accumulate some value across all child= ren on + * all cores. 
*/ static void perf_output_read(struct perf_output_handle *handle, - struct perf_event *event) + struct perf_event *event, + bool from_sample) { u64 enabled =3D 0, running =3D 0, now; u64 read_format =3D event->attr.read_format; @@ -7312,9 +7346,9 @@ static void perf_output_read(struct perf_output_handl= e *handle, calc_timer_values(event, &now, &enabled, &running); =20 if (event->attr.read_format & PERF_FORMAT_GROUP) - perf_output_read_group(handle, event, enabled, running); + perf_output_read_group(handle, event, enabled, running, from_sample); else - perf_output_read_one(handle, event, enabled, running); + perf_output_read_one(handle, event, enabled, running, from_sample); } =20 void perf_output_sample(struct perf_output_handle *handle, @@ -7354,7 +7388,7 @@ void perf_output_sample(struct perf_output_handle *ha= ndle, perf_output_put(handle, data->period); =20 if (sample_type & PERF_SAMPLE_READ) - perf_output_read(handle, event); + perf_output_read(handle, event, true); =20 if (sample_type & PERF_SAMPLE_CALLCHAIN) { int size =3D 1; @@ -7955,7 +7989,7 @@ perf_event_read_event(struct perf_event *event, return; =20 perf_output_put(&handle, read_event); - perf_output_read(&handle, event); + perf_output_read(&handle, event, false); perf_event__output_id_sample(event, &handle, &sample); =20 perf_output_end(&handle); @@ -12021,10 +12055,12 @@ perf_event_alloc(struct perf_event_attr *attr, in= t cpu, local64_set(&hwc->period_left, hwc->sample_period); =20 /* - * We currently do not support PERF_SAMPLE_READ on inherited events. + * We do not support PERF_SAMPLE_READ on inherited events unless + * PERF_SAMPLE_TID is also selected, which allows inherited events to + * collect per-thread samples. * See perf_output_read(). 
*/ - if (attr->inherit && (attr->sample_type & PERF_SAMPLE_READ)) + if (has_inherit_and_sample_read(attr) && !(attr->sample_type & PERF_SAMPL= E_TID)) goto err_ns; =20 if (!has_branch_stack(event)) @@ -13048,7 +13084,7 @@ static void sync_child_event(struct perf_event *chi= ld_event) perf_event_read_event(child_event, task); } =20 - child_val =3D perf_event_count(child_event); + child_val =3D perf_event_count_cumulative(child_event); =20 /* * Add back the child's count to the parent's count: --=20 2.45.2 From nobody Thu Feb 12 21:45:30 2026 Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by smtp.subspace.kernel.org (Postfix) with ESMTP id 3C4EE1B3F25; Thu, 6 Jun 2024 14:41:09 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=217.140.110.172 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1717684871; cv=none; b=ByvpRLKHCPFCMmMab9Dm2f57YZBWlOS3OaWLoejVbLbYIqzHCCCpWnPQz+0i8FmVJ4x6H5hgYd3AZ8kg0djePY9leFtzhaNFaPUqZHC/rxF5QtrHaRhRq2Q/Vc7tlaVsCM6+YutWThrQ6NmKd7whSljpkc1oo/d9XhgeNpJd8Oc= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1717684871; c=relaxed/simple; bh=hd5THI3gAhXc2jN10Sxl+TkEfdWPtV/F/TJXm7qJTkM=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=sgWZGF1x1nhAmmi6TqeuZ8cMfdF7bfYWfQeThLdoPRhiLSs67vBIh8Wt6+qs8C3D/qSDwFdjun0m4FZEugQHXAG3KaXWJh1sMn3thEQXuRrKXN+HKTt+YNfQcXHC0MIQqlEF7QC6ZqJj8amNjbfuWJfq0Qb2fVa3uwA7kDv4gHQ= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=arm.com; spf=pass smtp.mailfrom=arm.com; arc=none smtp.client-ip=217.140.110.172 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=arm.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=arm.com Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 03B5CDA7; Thu, 6 Jun 2024 
07:41:34 -0700 (PDT)
Received: from e126817.cambridge.arm.com (e126817.cambridge.arm.com [10.2.3.5]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id E45883F64C; Thu, 6 Jun 2024 07:41:07 -0700 (PDT)
From: Ben Gainey
To: peterz@infradead.org, mingo@redhat.com, acme@kernel.org, namhyung@kernel.org
Cc: james.clark@arm.com, mark.rutland@arm.com, alexander.shishkin@linux.intel.com, jolsa@kernel.org, irogers@google.com, adrian.hunter@intel.com, linux-perf-users@vger.kernel.org, linux-kernel@vger.kernel.org, Ben Gainey
Subject: [PATCH v7 2/4] tools/perf: Track where perf_sample_ids need per-thread periods
Date: Thu, 6 Jun 2024 15:40:57 +0100
Message-ID: <20240606144059.365633-3-ben.gainey@arm.com>
X-Mailer: git-send-email 2.45.2
In-Reply-To: <20240606144059.365633-1-ben.gainey@arm.com>
References: <20240606144059.365633-1-ben.gainey@arm.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

perf_sample_ids and related evlist/evsel code are modified to track
which events combine inherit+PERF_SAMPLE_READ+PERF_SAMPLE_TID.

Events with this combination of properties must be handled differently
when calculating each sample period. They must use the combination of
(ID + TID) to uniquely identify each distinct sequence of values.
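To illustrate why the ID alone is no longer sufficient, consider two threads whose samples for the same event ID are interleaved. The sketch below is not taken from the patch: struct sample, the two helper functions and the counter values are all made up for the example. It compares a delta keyed on ID only with a delta keyed on (ID, TID).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of one PERF_SAMPLE_READ value. */
struct sample {
	uint64_t id;	/* event ID from the read_format data */
	uint32_t tid;	/* thread that produced the sample */
	uint64_t value;	/* running counter total for that thread */
};

/* Made-up data: thread 100 and thread 200 interleave on event ID 1. */
static const struct sample samples[] = {
	{ 1, 100, 1000 },
	{ 1, 200,   50 },
	{ 1, 100, 1500 },
	{ 1, 200,   80 },
};

/* Period computed against the last value seen for the ID, any thread. */
static uint64_t period_per_id(const struct sample *s, size_t idx)
{
	uint64_t prev = 0;

	for (size_t i = 0; i < idx; i++)
		if (s[i].id == s[idx].id)
			prev = s[i].value;
	return s[idx].value - prev;
}

/* Period computed against the last value seen for the same (ID, TID). */
static uint64_t period_per_id_tid(const struct sample *s, size_t idx)
{
	uint64_t prev = 0;

	for (size_t i = 0; i < idx; i++)
		if (s[i].id == s[idx].id && s[i].tid == s[idx].tid)
			prev = s[i].value;
	return s[idx].value - prev;
}
```

With this data, the third sample yields a period of 1450 when keyed on ID only (1500 - 50, mixing the two threads), but 500 when keyed on (ID, TID), which is the count thread 100 actually accumulated since its previous sample.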
Signed-off-by: Ben Gainey --- tools/lib/perf/evlist.c | 1 + tools/lib/perf/evsel.c | 7 +++++++ tools/lib/perf/include/internal/evsel.h | 9 +++++++++ 3 files changed, 17 insertions(+) diff --git a/tools/lib/perf/evlist.c b/tools/lib/perf/evlist.c index c6d67fc9e57e..d17288eeaee4 100644 --- a/tools/lib/perf/evlist.c +++ b/tools/lib/perf/evlist.c @@ -255,6 +255,7 @@ static void perf_evlist__id_hash(struct perf_evlist *ev= list, =20 sid->id =3D id; sid->evsel =3D evsel; + sid->period_per_thread =3D perf_evsel__attr_has_per_thread_sample_period(= evsel); hash =3D hash_64(sid->id, PERF_EVLIST__HLIST_BITS); hlist_add_head(&sid->node, &evlist->heads[hash]); } diff --git a/tools/lib/perf/evsel.c b/tools/lib/perf/evsel.c index c07160953224..f7abb879f416 100644 --- a/tools/lib/perf/evsel.c +++ b/tools/lib/perf/evsel.c @@ -537,6 +537,13 @@ void perf_evsel__free_id(struct perf_evsel *evsel) evsel->ids =3D 0; } =20 +bool perf_evsel__attr_has_per_thread_sample_period(struct perf_evsel *evse= l) +{ + return (evsel->attr.sample_type & PERF_SAMPLE_READ) + && (evsel->attr.sample_type & PERF_SAMPLE_TID) + && evsel->attr.inherit; +} + void perf_counts_values__scale(struct perf_counts_values *count, bool scale, __s8 *pscaled) { diff --git a/tools/lib/perf/include/internal/evsel.h b/tools/lib/perf/inclu= de/internal/evsel.h index 5cd220a61962..f8de2bf89c76 100644 --- a/tools/lib/perf/include/internal/evsel.h +++ b/tools/lib/perf/include/internal/evsel.h @@ -36,6 +36,13 @@ struct perf_sample_id { =20 /* Holds total ID period value for PERF_SAMPLE_READ processing. */ u64 period; + + /* + * When inherit is combined with PERF_SAMPLE_READ, the period value is + * per (id, thread) tuple, rather than per id, so use the stream_id to + * uniquely identify the period, rather than the id. 
+ */ + bool period_per_thread; }; =20 struct perf_evsel { @@ -88,4 +95,6 @@ int perf_evsel__apply_filter(struct perf_evsel *evsel, co= nst char *filter); int perf_evsel__alloc_id(struct perf_evsel *evsel, int ncpus, int nthreads= ); void perf_evsel__free_id(struct perf_evsel *evsel); =20 +bool perf_evsel__attr_has_per_thread_sample_period(struct perf_evsel *evse= l); + #endif /* __LIBPERF_INTERNAL_EVSEL_H */ --=20 2.45.2 From nobody Thu Feb 12 21:45:30 2026 Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by smtp.subspace.kernel.org (Postfix) with ESMTP id 2ED861B4C39; Thu, 6 Jun 2024 14:41:11 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=217.140.110.172 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1717684873; cv=none; b=WSHkwm59rF+g7l11NsCTm+Y42QWJjTA90cp3Sv519SwtzXSxv8jGbinLspvMqxNIQ49/MNGzhfS+tkPB877Okc8BRB7QMSTjJElzoZhr/b+7WpM1noLDLNmP9U4UBG15Qbxjw5dapbpu3ZjbzGRMvBqqVNUQAf9v4uPecr0eJM4= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1717684873; c=relaxed/simple; bh=YiuwC0oKHjhlIUUagY/XBVaj0Z56MxBbTJJsUKSWu9w=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=duU51qv9U/Vdsv21x9vfS1mOnu7gCTppvKp8+Ch6GgNXuA/wcobJ0WFI5BJaVeF3b5+Eld9DmReJ+weot0eAoDqES1Z56vBSMnkUZZCq7ussx7KYDwYHC6QaXh94KelZXZX9HHWnUeUbaxbw9aKWRxBUre17Cex9Rbo/jKs8dQc= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=arm.com; spf=pass smtp.mailfrom=arm.com; arc=none smtp.client-ip=217.140.110.172 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=arm.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=arm.com Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id E62AD1042; Thu, 6 Jun 2024 07:41:35 -0700 (PDT) Received: from e126817.cambridge.arm.com (e126817.cambridge.arm.com 
[10.2.3.5]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id D2E293F64C; Thu, 6 Jun 2024 07:41:09 -0700 (PDT)
From: Ben Gainey
To: peterz@infradead.org, mingo@redhat.com, acme@kernel.org, namhyung@kernel.org
Cc: james.clark@arm.com, mark.rutland@arm.com, alexander.shishkin@linux.intel.com, jolsa@kernel.org, irogers@google.com, adrian.hunter@intel.com, linux-perf-users@vger.kernel.org, linux-kernel@vger.kernel.org, Ben Gainey
Subject: [PATCH v7 3/4] tools/perf: Correctly calculate sample period for inherited SAMPLE_READ values
Date: Thu, 6 Jun 2024 15:40:58 +0100
Message-ID: <20240606144059.365633-4-ben.gainey@arm.com>
X-Mailer: git-send-email 2.45.2
In-Reply-To: <20240606144059.365633-1-ben.gainey@arm.com>
References: <20240606144059.365633-1-ben.gainey@arm.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

Sample period calculation is updated to take into account the fact that
events with inherit+PERF_SAMPLE_READ+PERF_SAMPLE_TID should use the TID
in combination with the ID field in the read_format data to identify
which value represents the previous accumulated counter total used to
calculate the period delta since the last sample.

perf_sample_id is modified to support tracking per-thread values, along
with the existing global per-id values. In the per-thread case, values
are stored in a hash by TID within the perf_sample_id, and are
dynamically allocated as the number is not known ahead of time.

deliver_sample_value is modified to correctly locate the previous
sample storage based on the attribute, stream id and thread id.
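The per-thread storage scheme described above can be sketched in plain C as follows. This is a simplified stand-in, not the code in the diff below: it uses a singly linked bucket chain and a bit mask instead of the tools' hlist and hash_32() helpers, and every name in it (tid_period, period_table, period_storage, sample_period, period_table_free) is hypothetical.

```c
#include <stdint.h>
#include <stdlib.h>

#define PERIOD_HASH_BITS 4
#define PERIOD_HASH_SIZE (1u << PERIOD_HASH_BITS)

/* One dynamically allocated node per thread seen for a given event ID. */
struct tid_period {
	struct tid_period *next;
	uint32_t tid;
	uint64_t period;	/* last accumulated counter total */
};

struct period_table {
	struct tid_period *buckets[PERIOD_HASH_SIZE];
};

/* Find the storage for @tid, allocating it on first use. */
static uint64_t *period_storage(struct period_table *t, uint32_t tid)
{
	/* Assumption: a plain mask stands in for a real hash function. */
	unsigned int b = tid & (PERIOD_HASH_SIZE - 1);
	struct tid_period *p;

	for (p = t->buckets[b]; p; p = p->next)
		if (p->tid == tid)
			return &p->period;

	p = calloc(1, sizeof(*p));
	if (!p)
		return NULL;

	p->tid = tid;
	p->next = t->buckets[b];
	t->buckets[b] = p;
	return &p->period;
}

/* Compute the delta since this thread's previous sample; -1 on ENOMEM. */
static int sample_period(struct period_table *t, uint32_t tid,
			 uint64_t value, uint64_t *period)
{
	uint64_t *prev = period_storage(t, tid);

	if (!prev)
		return -1;

	*period = value - *prev;
	*prev = value;
	return 0;
}

/* Release every node; the table can be reused afterwards. */
static void period_table_free(struct period_table *t)
{
	for (unsigned int b = 0; b < PERIOD_HASH_SIZE; b++) {
		while (t->buckets[b]) {
			struct tid_period *p = t->buckets[b];

			t->buckets[b] = p->next;
			free(p);
		}
	}
}
```

Nodes are allocated lazily because, as noted above, the number of threads is not known ahead of time; a zero-initialised table is a valid empty table, and a thread's first sample reports its full counter value as the period.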
Signed-off-by: Ben Gainey --- tools/lib/perf/evsel.c | 41 ++++++++++++++++++++++ tools/lib/perf/include/internal/evsel.h | 45 +++++++++++++++++++++++-- tools/perf/util/session.c | 11 ++++-- 3 files changed, 92 insertions(+), 5 deletions(-) diff --git a/tools/lib/perf/evsel.c b/tools/lib/perf/evsel.c index f7abb879f416..26f3d7ba0f26 100644 --- a/tools/lib/perf/evsel.c +++ b/tools/lib/perf/evsel.c @@ -5,6 +5,7 @@ #include #include #include +#include #include #include #include @@ -23,6 +24,7 @@ void perf_evsel__init(struct perf_evsel *evsel, struct pe= rf_event_attr *attr, int idx) { INIT_LIST_HEAD(&evsel->node); + INIT_LIST_HEAD(&evsel->per_stream_periods); evsel->attr =3D *attr; evsel->idx =3D idx; evsel->leader =3D evsel; @@ -531,10 +533,17 @@ int perf_evsel__alloc_id(struct perf_evsel *evsel, in= t ncpus, int nthreads) =20 void perf_evsel__free_id(struct perf_evsel *evsel) { + struct perf_sample_id_period *pos, *n; + xyarray__delete(evsel->sample_id); evsel->sample_id =3D NULL; zfree(&evsel->id); evsel->ids =3D 0; + + perf_evsel_for_each_per_thread_period_safe(evsel, n, pos) { + list_del_init(&pos->node); + free(pos); + } } =20 bool perf_evsel__attr_has_per_thread_sample_period(struct perf_evsel *evse= l) @@ -544,6 +553,38 @@ bool perf_evsel__attr_has_per_thread_sample_period(str= uct perf_evsel *evsel) && evsel->attr.inherit; } =20 +u64 *perf_sample_id__get_period_storage(struct perf_sample_id *sid, u32 ti= d) +{ + struct hlist_head *head; + struct perf_sample_id_period *res; + int hash; + + if (!sid->period_per_thread) + return &sid->period; + + hash =3D hash_32(tid, PERF_SAMPLE_ID__HLIST_BITS); + head =3D &sid->periods[hash]; + + hlist_for_each_entry(res, head, hnode) + if (res->tid =3D=3D tid) + return &res->period; + + if (sid->evsel =3D=3D NULL) + return NULL; + + res =3D zalloc(sizeof(struct perf_sample_id_period)); + if (res =3D=3D NULL) + return NULL; + + INIT_LIST_HEAD(&res->node); + res->tid =3D tid; + + list_add_tail(&res->node, 
&sid->evsel->per_stream_periods); + hlist_add_head(&res->hnode, &sid->periods[hash]); + + return &res->period; +} + void perf_counts_values__scale(struct perf_counts_values *count, bool scale, __s8 *pscaled) { diff --git a/tools/lib/perf/include/internal/evsel.h b/tools/lib/perf/inclu= de/internal/evsel.h index f8de2bf89c76..797dc9d78254 100644 --- a/tools/lib/perf/include/internal/evsel.h +++ b/tools/lib/perf/include/internal/evsel.h @@ -11,6 +11,32 @@ struct perf_thread_map; struct xyarray; =20 +/** + * The per-thread accumulated period storage node. + */ +struct perf_sample_id_period { + struct list_head node; + struct hlist_node hnode; + /* Holds total ID period value for PERF_SAMPLE_READ processing. */ + u64 period; + /* The TID that the values belongs to */ + u32 tid; +}; + +/** + * perf_evsel_for_each_per_thread_period_safe - safely iterate thru all the + * per_stream_periods + * @evlist:perf_evsel instance to iterate + * @item: struct perf_sample_id_period iterator + * @tmp: struct perf_sample_id_period temp iterator + */ +#define perf_evsel_for_each_per_thread_period_safe(evsel, tmp, item) \ + list_for_each_entry_safe(item, tmp, &(evsel)->per_stream_periods, node) + + +#define PERF_SAMPLE_ID__HLIST_BITS 4 +#define PERF_SAMPLE_ID__HLIST_SIZE (1 << PERF_SAMPLE_ID__HLIST_BITS) + /* * Per fd, to map back from PERF_SAMPLE_ID to evsel, only used when there = are * more than one entry in the evlist. @@ -34,8 +60,18 @@ struct perf_sample_id { pid_t machine_pid; struct perf_cpu vcpu; =20 - /* Holds total ID period value for PERF_SAMPLE_READ processing. */ - u64 period; + union { + /* + * Holds total ID period value for PERF_SAMPLE_READ processing + * (when period is not per-thread). + */ + u64 period; + /* + * Holds total ID period value for PERF_SAMPLE_READ processing + * (when period is per-thread). 
+ */ + struct hlist_head periods[PERF_SAMPLE_ID__HLIST_SIZE]; + }; =20 /* * When inherit is combined with PERF_SAMPLE_READ, the period value is @@ -65,6 +101,9 @@ struct perf_evsel { u32 ids; struct perf_evsel *leader; =20 + /* Where period_per_thread is true, stores the per-thread values */ + struct list_head per_stream_periods; + /* parse modifier helper */ int nr_members; /* @@ -97,4 +136,6 @@ void perf_evsel__free_id(struct perf_evsel *evsel); =20 bool perf_evsel__attr_has_per_thread_sample_period(struct perf_evsel *evse= l); =20 +u64 *perf_sample_id__get_period_storage(struct perf_sample_id *sid, u32 ti= d); + #endif /* __LIBPERF_INTERNAL_EVSEL_H */ diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c index a10343b9dcd4..cf5dbe075674 100644 --- a/tools/perf/util/session.c +++ b/tools/perf/util/session.c @@ -1478,14 +1478,19 @@ static int deliver_sample_value(struct evlist *evli= st, { struct perf_sample_id *sid =3D evlist__id2sid(evlist, v->id); struct evsel *evsel; + u64 *storage =3D NULL; =20 if (sid) { + storage =3D perf_sample_id__get_period_storage(sid, sample->tid); + } + + if (storage) { sample->id =3D v->id; - sample->period =3D v->value - sid->period; - sid->period =3D v->value; + sample->period =3D v->value - *storage; + *storage =3D v->value; } =20 - if (!sid || sid->evsel =3D=3D NULL) { + if (!storage || sid->evsel =3D=3D NULL) { ++evlist->stats.nr_unknown_id; return 0; } --=20 2.45.2 From nobody Thu Feb 12 21:45:30 2026 Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by smtp.subspace.kernel.org (Postfix) with ESMTP id 0CFA31B5805; Thu, 6 Jun 2024 14:41:13 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=217.140.110.172 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1717684875; cv=none; b=aIPe/f53LX8BK0DErnK1HblKk9OlOLGah8fbOG8xr5GoNdimQaUeNFqsPv8QVziL+tPkIb43wJy/S+FwWglH7BetcQf+UjMW/brLStRapqwVvQUgGL5wDOUIz8zUUUNJmIcgP7dudyWdo9ojWJlAetunMqMz9a3tmhyZmzX6vcA= 
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1717684875; c=relaxed/simple; bh=gxj+I27UMyVJzJ7ErqdXr+YTlUgOSXY23WDXBCCcEm4=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=pti9GkvDqCZ/Di8+WOhbMqfZfuRTcXp/EVAyEoLY9aDf2HWil4zH8A8RBL126VuP+FKJWXQJNmIKY4aQxlc7raSxLZzeZxSgYV8U4hGhexNJZ9/0bPZGv2W8eMkCMtu+cbFOJTDglvxPl+krgzERlPmBNKVYhT1lOSPBf1CXxGw= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=arm.com; spf=pass smtp.mailfrom=arm.com; arc=none smtp.client-ip=217.140.110.172 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=arm.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=arm.com Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id D4CFF2F4; Thu, 6 Jun 2024 07:41:37 -0700 (PDT) Received: from e126817.cambridge.arm.com (e126817.cambridge.arm.com [10.2.3.5]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id C17593F64C; Thu, 6 Jun 2024 07:41:11 -0700 (PDT) From: Ben Gainey To: peterz@infradead.org, mingo@redhat.com, acme@kernel.org, namhyung@kernel.org Cc: james.clark@arm.com, mark.rutland@arm.com, alexander.shishkin@linux.intel.com, jolsa@kernel.org, irogers@google.com, adrian.hunter@intel.com, linux-perf-users@vger.kernel.org, linux-kernel@vger.kernel.org, Ben Gainey Subject: [PATCH v7 4/4] tools/perf: Allow inherit + PERF_SAMPLE_READ when opening events Date: Thu, 6 Jun 2024 15:40:59 +0100 Message-ID: <20240606144059.365633-5-ben.gainey@arm.com> X-Mailer: git-send-email 2.45.2 In-Reply-To: <20240606144059.365633-1-ben.gainey@arm.com> References: <20240606144059.365633-1-ben.gainey@arm.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" 
The tool will now default to this new mode if the user specifies a sampling group when not in system-wide mode, and when --no-inherit is not specified. This change updates evsel to allow the combination of inherit and PERF_SAMPLE_READ. A fallback is implemented for kernel versions where this feature is not supported. Signed-off-by: Ben Gainey --- tools/perf/tests/attr/README | 2 + .../tests/attr/test-record-group-sampling | 3 +- .../tests/attr/test-record-group-sampling1 | 51 ++++++++++++++++ .../tests/attr/test-record-group-sampling2 | 61 +++++++++++++++++++ tools/perf/tests/attr/test-record-group2 | 1 + ...{test-record-group2 =3D> test-record-group3} | 10 +-- tools/perf/util/evsel.c | 19 +++++- tools/perf/util/evsel.h | 1 + 8 files changed, 141 insertions(+), 7 deletions(-) create mode 100644 tools/perf/tests/attr/test-record-group-sampling1 create mode 100644 tools/perf/tests/attr/test-record-group-sampling2 copy tools/perf/tests/attr/{test-record-group2 =3D> test-record-group3} (8= 1%) diff --git a/tools/perf/tests/attr/README b/tools/perf/tests/attr/README index 4066fec7180a..67c4ca76b85d 100644 --- a/tools/perf/tests/attr/README +++ b/tools/perf/tests/attr/README @@ -51,6 +51,8 @@ Following tests are defined (with perf commands): perf record --call-graph fp kill (test-record-graph-fp-aarc= h64) perf record -e '{cycles,instructions}' kill (test-record-group1) perf record -e '{cycles/period=3D1/,instructions/period=3D2/}:S' kill (t= est-record-group2) + perf record -e '{cycles,cache-misses}:S' kill (test-record-group-samplin= g1) + perf record -c 10000 -e '{cycles,cache-misses}:S' kill (test-record-grou= p-sampling2) perf record -D kill (test-record-no-delay) perf record -i kill (test-record-no-inherit) perf record -n kill (test-record-no-samples) diff --git a/tools/perf/tests/attr/test-record-group-sampling b/tools/perf/= tests/attr/test-record-group-sampling index 97e7e64a38f0..c32ceb156226 100644 --- a/tools/perf/tests/attr/test-record-group-sampling +++ 
b/tools/perf/tests/attr/test-record-group-sampling @@ -2,6 +2,7 @@ command =3D record args =3D --no-bpf-event -e '{cycles,cache-misses}:S' kill >/dev/null 2>= &1 ret =3D 1 +kernel_until =3D 6.10 =20 [event-1:base-record] fd=3D1 @@ -18,7 +19,7 @@ group_fd=3D1 type=3D0 config=3D3 =20 -# default | PERF_SAMPLE_READ +# default | PERF_SAMPLE_READ | PERF_SAMPLE_PERIOD sample_type=3D343 =20 # PERF_FORMAT_ID | PERF_FORMAT_GROUP | PERF_FORMAT_LOST diff --git a/tools/perf/tests/attr/test-record-group-sampling1 b/tools/perf= /tests/attr/test-record-group-sampling1 new file mode 100644 index 000000000000..db6404c1ef78 --- /dev/null +++ b/tools/perf/tests/attr/test-record-group-sampling1 @@ -0,0 +1,51 @@ +[config] +command =3D record +args =3D --no-bpf-event -e '{cycles,cache-misses}:S' kill >/dev/null 2>= &1 +ret =3D 1 +kernel_since =3D 6.10 + +[event-1:base-record] +fd=3D1 +group_fd=3D-1 + +# cycles +type=3D0 +config=3D0 + +# default | PERF_SAMPLE_READ | PERF_SAMPLE_PERIOD +sample_type=3D343 + +# PERF_FORMAT_ID | PERF_FORMAT_GROUP | PERF_FORMAT_LOST | PERF_FORMAT_TOT= AL_TIME_ENABLED | PERF_FORMAT_TOTAL_TIME_RUNNING +read_format=3D28|31 +task=3D1 +mmap=3D1 +comm=3D1 +enable_on_exec=3D1 +disabled=3D1 + +# inherit is enabled for group sampling +inherit=3D1 + +[event-2:base-record] +fd=3D2 +group_fd=3D1 + +# cache-misses +type=3D0 +config=3D3 + +# default | PERF_SAMPLE_READ | PERF_SAMPLE_PERIOD +sample_type=3D343 + +# PERF_FORMAT_ID | PERF_FORMAT_GROUP | PERF_FORMAT_LOST | PERF_FORMAT_TOT= AL_TIME_ENABLED | PERF_FORMAT_TOTAL_TIME_RUNNING +read_format=3D28|31 +task=3D0 +mmap=3D0 +comm=3D0 +enable_on_exec=3D0 +disabled=3D0 +freq=3D0 + +# inherit is enabled for group sampling +inherit=3D1 + diff --git a/tools/perf/tests/attr/test-record-group-sampling2 b/tools/perf= /tests/attr/test-record-group-sampling2 new file mode 100644 index 000000000000..32884df78c95 --- /dev/null +++ b/tools/perf/tests/attr/test-record-group-sampling2 @@ -0,0 +1,61 @@ +[config] +command =3D record +args =3D 
--no-bpf-event -c 10000 -e '{cycles,cache-misses}:S' kill >/dev/null 2>&1
+ret = 1
+kernel_since = 6.10
+
+[event-1:base-record]
+fd=1
+group_fd=-1
+
+# cycles
+type=0
+config=0
+
+# default | PERF_SAMPLE_READ
+sample_type=87
+
+# PERF_FORMAT_ID | PERF_FORMAT_GROUP | PERF_FORMAT_LOST | PERF_FORMAT_TOTAL_TIME_ENABLED | PERF_FORMAT_TOTAL_TIME_RUNNING
+read_format=28|31
+task=1
+mmap=1
+comm=1
+enable_on_exec=1
+disabled=1
+
+# inherit is enabled for group sampling
+inherit=1
+
+# sampling disabled
+sample_freq=0
+sample_period=10000
+freq=0
+write_backward=0
+
+[event-2:base-record]
+fd=2
+group_fd=1
+
+# cache-misses
+type=0
+config=3
+
+# default | PERF_SAMPLE_READ
+sample_type=87
+
+# PERF_FORMAT_ID | PERF_FORMAT_GROUP | PERF_FORMAT_LOST | PERF_FORMAT_TOTAL_TIME_ENABLED | PERF_FORMAT_TOTAL_TIME_RUNNING
+read_format=28|31
+task=0
+mmap=0
+comm=0
+enable_on_exec=0
+disabled=0
+
+# inherit is enabled for group sampling
+inherit=1
+
+# sampling disabled
+sample_freq=0
+sample_period=0
+freq=0
+write_backward=0
diff --git a/tools/perf/tests/attr/test-record-group2 b/tools/perf/tests/attr/test-record-group2
index cebdaa8e64e4..8fe6c679618c 100644
--- a/tools/perf/tests/attr/test-record-group2
+++ b/tools/perf/tests/attr/test-record-group2
@@ -2,6 +2,7 @@
 command = record
 args = --no-bpf-event -e '{cycles/period=1234000/,instructions/period=6789000/}:S' kill >/dev/null 2>&1
 ret = 1
+kernel_until = 6.10
 
 [event-1:base-record]
 fd=1
diff --git a/tools/perf/tests/attr/test-record-group2 b/tools/perf/tests/attr/test-record-group3
similarity index 81%
copy from tools/perf/tests/attr/test-record-group2
copy to tools/perf/tests/attr/test-record-group3
index cebdaa8e64e4..72e6a9ee2f60 100644
--- a/tools/perf/tests/attr/test-record-group2
+++ b/tools/perf/tests/attr/test-record-group3
@@ -2,6 +2,7 @@
 command = record
 args = --no-bpf-event -e
'{cycles/period=1234000/,instructions/period=6789000/}:S' kill >/dev/null 2>&1
 ret = 1
+kernel_since = 6.10
 
 [event-1:base-record]
 fd=1
@@ -9,8 +10,9 @@ group_fd=-1
 config=0|1
 sample_period=1234000
 sample_type=87
-read_format=12|28
-inherit=0
+read_format=28|31
+disabled=1
+inherit=1
 freq=0
 
 [event-2:base-record]
@@ -19,9 +21,9 @@ group_fd=1
 config=0|1
 sample_period=6789000
 sample_type=87
-read_format=12|28
+read_format=28|31
 disabled=0
-inherit=0
+inherit=1
 mmap=0
 comm=0
 freq=0
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 4f818ab6b662..66b782811997 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -1162,7 +1162,15 @@ void evsel__config(struct evsel *evsel, struct record_opts *opts,
 		 */
 		if (leader->core.nr_members > 1) {
 			attr->read_format |= PERF_FORMAT_GROUP;
-			attr->inherit = 0;
+		}
+
+		/*
+		 * Inherit + SAMPLE_READ requires SAMPLE_TID in the read_format
+		 */
+		if (attr->inherit) {
+			evsel__set_sample_bit(evsel, TID);
+			evsel->core.attr.read_format |=
+				PERF_FORMAT_ID;
 		}
 	}
 
@@ -1838,6 +1846,8 @@ static int __evsel__prepare_open(struct evsel *evsel, struct perf_cpu_map *cpus,
 
 static void evsel__disable_missing_features(struct evsel *evsel)
 {
+	if (perf_missing_features.inherit_sample_read)
+		evsel->core.attr.inherit = 0;
 	if (perf_missing_features.branch_counters)
 		evsel->core.attr.branch_sample_type &= ~PERF_SAMPLE_BRANCH_COUNTERS;
 	if (perf_missing_features.read_lost)
@@ -1893,7 +1903,12 @@ bool evsel__detect_missing_features(struct evsel *evsel)
 	 * Must probe features in the order they were added to the
 	 * perf_event_attr interface.
 	 */
-	if (!perf_missing_features.branch_counters &&
+	if (!perf_missing_features.inherit_sample_read &&
+	    evsel->core.attr.inherit && (evsel->core.attr.sample_type & PERF_SAMPLE_READ)) {
+		perf_missing_features.inherit_sample_read = true;
+		pr_debug2("Using PERF_SAMPLE_READ / :S modifier is not compatible with inherit, falling back to no-inherit.\n");
+		return true;
+	} else if (!perf_missing_features.branch_counters &&
 	    (evsel->core.attr.branch_sample_type & PERF_SAMPLE_BRANCH_COUNTERS)) {
 		perf_missing_features.branch_counters = true;
 		pr_debug2("switching off branch counters support\n");
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index 375a38e15cd9..911c2fd42c6d 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -192,6 +192,7 @@ struct perf_missing_features {
 	bool weight_struct;
 	bool read_lost;
 	bool branch_counters;
+	bool inherit_sample_read;
 };
 
 extern struct perf_missing_features perf_missing_features;
-- 
2.45.2