From: weilin.wang@intel.com
To: weilin.wang@intel.com, Namhyung Kim, Ian Rogers,
	Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar,
	Alexander Shishkin, Jiri Olsa, Adrian Hunter, Kan Liang
Cc: linux-perf-users@vger.kernel.org, linux-kernel@vger.kernel.org,
	Perry Taylor, Samantha Alt, Caleb Biggers
Subject: [RFC PATCH v4 2/6] perf stat: Fork and launch perf record when perf stat needs to get retire latency value for a metric.
Date: Tue, 12 Mar 2024 19:49:17 -0400
Message-ID: <20240312234921.812685-3-weilin.wang@intel.com>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20240312234921.812685-1-weilin.wang@intel.com>
References: <20240312234921.812685-1-weilin.wang@intel.com>

From: Weilin Wang

When a retire_latency value is used in a metric formula, perf stat forks a
perf record process with the "-e" and "-W" options.
Perf record collects the required retire_latency values in parallel while
perf stat is collecting counting values.

When perf stat stops counting, it sends a SIGTERM signal to the perf record
process and receives the sampling data back from perf record through a pipe.
Perf stat then processes the received data to get the retire latency data and
calculate the metric result.

Signed-off-by: Weilin Wang
---
 tools/perf/builtin-stat.c     | 165 +++++++++++++++++++++++++++++++++-
 tools/perf/util/data.c        |   4 +
 tools/perf/util/data.h        |   1 +
 tools/perf/util/metricgroup.h |  12 +++
 tools/perf/util/stat.h        |   2 +
 5 files changed, 182 insertions(+), 2 deletions(-)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 6291e1e24535..4e92e73cbeaf 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -94,8 +94,13 @@
 #include
 #include
 
+#include "util/sample.h"
+#include
+#include
+
 #define DEFAULT_SEPARATOR " "
 #define FREEZE_ON_SMI_PATH "devices/cpu/freeze_on_smi"
+#define PERF_DATA "-"
 
 static void print_counters(struct timespec *ts, int argc, const char **argv);
 
@@ -163,6 +168,8 @@ static struct perf_stat_config stat_config = {
 	.ctl_fd_ack = -1,
 	.iostat_run = false,
 	.tpebs_events = LIST_HEAD_INIT(stat_config.tpebs_events),
+	.tpebs_results = LIST_HEAD_INIT(stat_config.tpebs_results),
+	.tpebs_pid = -1,
 };
 
 static bool cpus_map_matched(struct evsel *a, struct evsel *b)
@@ -687,12 +694,155 @@ static enum counter_recovery stat_handle_error(struct evsel *counter)
 	return COUNTER_FATAL;
 }
 
-static int __run_perf_record(void)
+static int __run_perf_record(const char **record_argv)
 {
+	int i = 0;
+	struct tpebs_event *e;
+
 	pr_debug("Prepare perf record for retire_latency\n");
+
+	record_argv[i++] = "perf";
+	record_argv[i++] = "record";
+	record_argv[i++] = "-W";
+	record_argv[i++] = "--synth=no";
+
+	if (stat_config.user_requested_cpu_list) {
+		record_argv[i++] = "-C";
+		record_argv[i++] = stat_config.user_requested_cpu_list;
+	}
+
+	if (stat_config.system_wide)
+		record_argv[i++] = "-a";
+
+	list_for_each_entry(e, &stat_config.tpebs_events, nd) {
+		record_argv[i++] = "-e";
+		record_argv[i++] = e->name;
+	}
+
+	record_argv[i++] = "-o";
+	record_argv[i++] = PERF_DATA;
 	return 0;
 }
 
+static void prepare_run_command(struct child_process *cmd,
+				const char **argv)
+{
+	memset(cmd, 0, sizeof(*cmd));
+	cmd->argv = argv;
+	cmd->out = -1;
+}
+
+static int prepare_perf_record(struct child_process *cmd)
+{
+	const char **record_argv;
+
+	record_argv = calloc(10 + 2 * stat_config.tpebs_event_size, sizeof(char *));
+	if (!record_argv)
+		return -1;
+	__run_perf_record(record_argv);
+
+	prepare_run_command(cmd, record_argv);
+	return start_command(cmd);
+}
+
+struct perf_script {
+	struct perf_tool tool;
+	struct perf_session *session;
+};
+
+static void tpebs_data__delete(void)
+{
+	struct tpebs_retire_lat *r, *rtmp;
+	struct tpebs_event *e, *etmp;
+	list_for_each_entry_safe(r, rtmp, &stat_config.tpebs_results, nd) {
+		list_del_init(&r->nd);
+		free(r);
+	}
+	list_for_each_entry_safe(e, etmp, &stat_config.tpebs_events, nd) {
+		list_del_init(&e->nd);
+		free(e);
+	}
+}
+
+static int process_sample_event(struct perf_tool *tool __maybe_unused,
+				union perf_event *event __maybe_unused,
+				struct perf_sample *sample,
+				struct evsel *evsel,
+				struct machine *machine __maybe_unused)
+{
+	int ret = 0;
+	const char *evname;
+	struct tpebs_retire_lat *t;
+
+	evname = evsel__name(evsel);
+
+	/*
+	 * Need to handle per core results? We are assuming average retire
+	 * latency value will be used. Save the number of samples and the sum of
+	 * retire latency value for each event.
+	 */
+	list_for_each_entry(t, &stat_config.tpebs_results, nd) {
+		if (!strcmp(evname, t->name)) {
+			t->count += 1;
+			t->sum += sample->retire_lat;
+			break;
+		}
+	}
+
+	return ret;
+}
+
+static int process_feature_event(struct perf_session *session,
+				 union perf_event *event)
+{
+	if (event->feat.feat_id < HEADER_LAST_FEATURE)
+		return perf_event__process_feature(session, event);
+	return 0;
+}
+
+static int __cmd_script(struct child_process *cmd __maybe_unused)
+{
+	int err = 0;
+	struct perf_session *session;
+	struct perf_data data = {
+		.mode = PERF_DATA_MODE_READ,
+		.path = PERF_DATA,
+		.fd = cmd->out,
+	};
+	struct perf_script script = {
+		.tool = {
+			.sample = process_sample_event,
+			.ordering_requires_timestamps = true,
+			.feature = process_feature_event,
+			.attr = perf_event__process_attr,
+		},
+	};
+	struct tpebs_event *e;
+
+	list_for_each_entry(e, &stat_config.tpebs_events, nd) {
+		struct tpebs_retire_lat *new = malloc(sizeof(struct tpebs_retire_lat));
+
+		if (!new)
+			return -1;
+		new->name = strdup(e->name);
+		new->tpebs_name = strdup(e->tpebs_name);
+		new->count = 0;
+		new->sum = 0;
+		list_add_tail(&new->nd, &stat_config.tpebs_results);
+	}
+
+	kill(cmd->pid, SIGTERM);
+	session = perf_session__new(&data, &script.tool);
+	if (IS_ERR(session))
+		return PTR_ERR(session);
+	script.session = session;
+	err = perf_session__process_events(session);
+	perf_session__delete(session);
+
+	return err;
+}
+
 static int __run_perf_stat(int argc, const char **argv, int run_idx)
 {
 	int interval = stat_config.interval;
@@ -709,13 +859,15 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
 	struct affinity saved_affinity, *affinity = NULL;
 	int err;
 	bool second_pass = false;
+	struct child_process cmd;
 
 	/* Prepare perf record for sampling event retire_latency before fork and
 	 * prepare workload */
 	if (stat_config.tpebs_event_size > 0) {
 		int ret;
 
-		ret = __run_perf_record();
+		pr_debug("perf stat pid = %d\n", getpid());
+		ret = prepare_perf_record(&cmd);
 		if (ret)
 			return ret;
 	}
@@ -925,6 +1077,13 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx)
 
 	t1 = rdclock();
 
+	if (stat_config.tpebs_event_size > 0) {
+		int ret;
+
+		ret = __cmd_script(&cmd);
+		close(cmd.out);
+	}
+
 	if (stat_config.walltime_run_table)
 		stat_config.walltime_run[run_idx] = t1 - t0;
 
@@ -2972,5 +3131,7 @@ int cmd_stat(int argc, const char **argv)
 	metricgroup__rblist_exit(&stat_config.metric_events);
 	evlist__close_control(stat_config.ctl_fd, stat_config.ctl_fd_ack, &stat_config.ctl_fd_close);
 
+	tpebs_data__delete();
+
 	return status;
 }
diff --git a/tools/perf/util/data.c b/tools/perf/util/data.c
index 08c4bfbd817f..2e2a20fc5c30 100644
--- a/tools/perf/util/data.c
+++ b/tools/perf/util/data.c
@@ -185,6 +185,10 @@ static bool check_pipe(struct perf_data *data)
 	int fd = perf_data__is_read(data) ?
 		 STDIN_FILENO : STDOUT_FILENO;
 
+	if (data->fd > 0) {
+		fd = data->fd;
+	}
+
 	if (!data->path) {
 		if (!fstat(fd, &st) && S_ISFIFO(st.st_mode))
 			is_pipe = true;
diff --git a/tools/perf/util/data.h b/tools/perf/util/data.h
index 110f3ebde30f..720638116ca0 100644
--- a/tools/perf/util/data.h
+++ b/tools/perf/util/data.h
@@ -28,6 +28,7 @@ struct perf_data_file {
 
 struct perf_data {
 	const char *path;
+	int fd;
 	struct perf_data_file file;
 	bool is_pipe;
 	bool is_dir;
diff --git a/tools/perf/util/metricgroup.h b/tools/perf/util/metricgroup.h
index 7c24ed768ff3..3c37d80c4d34 100644
--- a/tools/perf/util/metricgroup.h
+++ b/tools/perf/util/metricgroup.h
@@ -68,10 +68,22 @@ struct metric_expr {
 
 struct tpebs_event {
 	struct list_head nd;
+	/* Event name */
 	const char *name;
+	/* Event name with TPEBS modifier */
 	const char *tpebs_name;
 };
 
+struct tpebs_retire_lat {
+	struct list_head nd;
+	/* Event name */
+	const char *name;
+	/* Event name with TPEBS modifier */
+	const char *tpebs_name;
+	size_t count;
+	int sum;
+};
+
 struct metric_event *metricgroup__lookup(struct rblist *metric_events,
 					 struct evsel *evsel,
 					 bool create);
diff --git a/tools/perf/util/stat.h b/tools/perf/util/stat.h
index b987960df3c5..0726bdc06681 100644
--- a/tools/perf/util/stat.h
+++ b/tools/perf/util/stat.h
@@ -111,6 +111,8 @@ struct perf_stat_config {
 	struct rblist metric_events;
 	struct list_head tpebs_events;
 	size_t tpebs_event_size;
+	struct list_head tpebs_results;
+	pid_t tpebs_pid;
 	int ctl_fd;
 	int ctl_fd_ack;
 	bool ctl_fd_close;
-- 
2.43.0
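
For readers following the flow in the commit message (start a sampling child, let it run alongside the parent, send it SIGTERM, then read its output from a pipe), the pattern can be sketched as a small standalone program. This is not code from the patch and does not use the perf subcmd API. The names sampler_child() and collect_from_sampler() are hypothetical, the child is an in-process stand-in for "perf record -o -", and blocking SIGTERM before fork() is an assumption made here so the demo is race-free.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical stand-in for the "perf record -o -" child: it waits for
 * SIGTERM, then writes its buffered data to the pipe and exits.
 */
static void sampler_child(int out_fd, const sigset_t *term)
{
	static const char payload[] = "sample-data";
	int sig;

	if (sigwait(term, &sig) != 0)
		_exit(1);
	if (write(out_fd, payload, sizeof(payload) - 1) < 0)
		_exit(1);
	close(out_fd);
	_exit(0);
}

/*
 * Fork the sampler, signal it to stop, then drain the pipe until EOF.
 * Returns the number of bytes stored in buf (NUL-terminated), or -1.
 */
static ssize_t collect_from_sampler(char *buf, size_t len)
{
	sigset_t term, old;
	ssize_t n, total = 0;
	int fds[2];
	pid_t pid;

	assert(buf != NULL && len > 0);

	sigemptyset(&term);
	sigaddset(&term, SIGTERM);
	/* Block SIGTERM before fork() so the child cannot miss the signal. */
	if (sigprocmask(SIG_BLOCK, &term, &old) < 0)
		return -1;
	if (pipe(fds) < 0) {
		sigprocmask(SIG_SETMASK, &old, NULL);
		return -1;
	}

	pid = fork();
	if (pid == 0) {
		close(fds[0]);
		sampler_child(fds[1], &term);	/* never returns */
	}
	close(fds[1]);
	sigprocmask(SIG_SETMASK, &old, NULL);
	if (pid < 0) {
		close(fds[0]);
		return -1;
	}

	/* The parent's own counting phase would run here. */

	kill(pid, SIGTERM);
	while ((size_t)total < len - 1) {
		n = read(fds[0], buf + total, len - 1 - (size_t)total);
		if (n <= 0)
			break;
		total += n;
	}
	buf[total] = '\0';
	close(fds[0]);
	waitpid(pid, NULL, 0);
	return total;
}
```

In the patch itself the read end corresponds to cmd->out, which __cmd_script() hands to perf_session__new() through the new perf_data.fd field. The sketch only handles a payload smaller than the caller's buffer; a real consumer would keep reading until EOF before reaping the child.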