From nobody Fri Feb 13 12:31:06 2026
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
 aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
 by smtp.lore.kernel.org (Postfix) with ESMTP id 6CA41CE79A6
 for ; Tue, 26 Sep 2023 04:30:25 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
 id S233624AbjIZEa2 (ORCPT ); Tue, 26 Sep 2023 00:30:28 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53182
 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK)
 by vger.kernel.org with ESMTP id S232956AbjIZEaJ (ORCPT );
 Tue, 26 Sep 2023 00:30:09 -0400
Received: from szxga01-in.huawei.com (szxga01-in.huawei.com [45.249.212.187])
 by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 10F92EC;
 Mon, 25 Sep 2023 21:30:01 -0700 (PDT)
Received: from kwepemd200002.china.huawei.com (unknown [172.30.72.57])
 by szxga01-in.huawei.com (SkyGuard) with ESMTP id 4RvmrL49dgztSPy;
 Tue, 26 Sep 2023 12:25:38 +0800 (CST)
Received: from M910t.huawei.com (10.110.54.157) by
 kwepemd200002.china.huawei.com (7.221.188.186) with Microsoft SMTP Server
 (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256)
 id 15.2.1258.23; Tue, 26 Sep 2023 12:29:58 +0800
From: Changbin Du
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo
CC: Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim, Ian Rogers,
 Adrian Hunter, Changbin Du, kernel test robot
Subject: [PATCH v5 3/5] perf: add new option '--workload-config' to set
 workload sched_policy/prio/cpumask
Date: Tue, 26 Sep 2023 12:29:36 +0800
Message-ID: <20230926042938.509234-4-changbin.du@huawei.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20230926042938.509234-1-changbin.du@huawei.com>
References: <20230926042938.509234-1-changbin.du@huawei.com>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
X-Originating-IP: [10.110.54.157]
X-ClientProxiedBy: dggems703-chm.china.huawei.com (10.3.19.180) To
 kwepemd200002.china.huawei.com (7.221.188.186)
X-CFilter-Loop: Reflected
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Type: text/plain; charset="utf-8"

To get consistent benchmarking results, we sometimes need to set the
sched_policy/priority/cpumask of the workload to reduce system noise.

For example, CPU binding is required on a big.LITTLE system:

  $ perf stat -- taskset -c 0 ls

However, the events of 'taskset' itself are also counted here. To get a
more accurate result, this should be avoided.

To cut out the middleman, this adds a new option '--workload-config' that
does the same job for the stat and record commands.

  --workload-config <[sched_policy=policy][,sched_prio=priority][,cpu-list=list]>
            setup target workload (the ) attributes:

                sched_policy: other|fifo|rr|batch|idle
                sched_prio: scheduling priority for fifo|rr, nice value for other
                cpu-list: CPU affinity. e.g. 1-3:5 is processors #1, #2, #3 and #5

For example,

  $ sudo perf stat --workload-config sched_policy=fifo,sched_prio=40,cpu-list=0-3:7 -- ls

The above command makes 'ls' run on CPUs #0-#3 and #7 under the fifo
scheduler with a realtime priority of 40.

Cc: kernel test robot
Signed-off-by: Changbin Du
---
v2:
  Use cpu list spec instead of cpu mask number.
v3:
  o rename '--workload-attr' as '--workload-config'
  o transform to key-value style option
---
 tools/perf/Documentation/perf-record.txt |   7 ++
 tools/perf/Documentation/perf-stat.txt   |   6 ++
 tools/perf/builtin-record.c              |  27 ++++++
 tools/perf/builtin-stat.c                |  19 ++++
 tools/perf/util/evlist.c                 | 108 +++++++++++++++++++++++
 tools/perf/util/evlist.h                 |   3 +
 tools/perf/util/target.h                 |   9 ++
 7 files changed, 179 insertions(+)

diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt
index d5217be012d7..da4692751e17 100644
--- a/tools/perf/Documentation/perf-record.txt
+++ b/tools/perf/Documentation/perf-record.txt
@@ -821,6 +821,13 @@ filtered through the mask provided by -C option.
 only, as of now. So the applications built without the frame
 pointer might see bogus addresses.
 
+--workload-config <[sched_policy=policy][,sched_prio=priority][,cpu-list=list]>::
+	setup target workload (the ) attributes:
+
+	    sched_policy: other|fifo|rr|batch|idle
+	    sched_prio: scheduling priority for fifo|rr, nice value for other
+	    cpu-list: CPU affinity. e.g. 1-3:5 is processors #1, #2, #3 and #5
+
 include::intel-hybrid.txt[]
 
 SEE ALSO
diff --git a/tools/perf/Documentation/perf-stat.txt b/tools/perf/Documentation/perf-stat.txt
index 8f789fa1242e..b2038f7e236a 100644
--- a/tools/perf/Documentation/perf-stat.txt
+++ b/tools/perf/Documentation/perf-stat.txt
@@ -262,6 +262,12 @@ disable events during measurements:
  wait -n ${perf_pid}
  exit $?
 
+--workload-config <[sched_policy=policy][,sched_prio=priority][,cpu-list=list]>::
+	setup target workload (the ) attributes:
+
+	    sched_policy: other|fifo|rr|batch|idle
+	    sched_prio: scheduling priority for fifo|rr, nice value for other
+	    cpu-list: CPU affinity. e.g. 1-3:5 is processors #1, #2, #3 and #5
 
 --pre::
 --post::
diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 34bb31f08bb5..20799a1e60f6 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -3277,6 +3277,17 @@ static int parse_record_synth_option(const struct option *opt,
 	return 0;
 }
 
+static int record_parse_workload_attr_opt(const struct option *opt,
+					  const char *arg,
+					  int unset __maybe_unused)
+{
+	struct record_opts *opts = opt->value;
+
+	return evlist__parse_workload_config(arg, &opts->target.workload.sched_policy,
+					     &opts->target.workload.sched_priority,
+					     &opts->target.workload.cpu_map);
+}
+
 /*
  * XXX Ideally would be local to cmd_record() and passed to a record__new
  * because we need to have access to it in record__exit, that is called
@@ -3297,6 +3308,8 @@ static struct record record = {
 		.target = {
 			.uses_mmap = true,
 			.default_per_cpu = true,
+			.workload.sched_policy = -1,
+			.workload.sched_priority = 0,
 		},
 		.mmap_flush = MMAP_FLUSH_DEFAULT,
 		.nr_threads_synthesize = 1,
@@ -3321,6 +3334,12 @@ static struct record record = {
 const char record_callchain_help[] = CALLCHAIN_RECORD_HELP
 	"\n\t\t\t\tDefault: fp";
 
+const char record_workload_config_help[] =
+	"setup target workload (the ) attributes:\n\n"
+	HELP_PAD "sched_policy: other|fifo|rr|batch|idle\n"
+	HELP_PAD "sched_prio: scheduling priority for fifo|rr, nice value for other\n"
+	HELP_PAD "cpu-list: CPU affinity. e.g. 1-3:5 is processors #1, #2, #3 and #5";
+
 static bool dry_run;
 
 static struct parse_events_option_args parse_events_option_args = {
@@ -3535,6 +3554,10 @@ static struct option __record_options[] = {
 			    "write collected trace data into several data files using parallel threads",
 			    record__parse_threads),
 	OPT_BOOLEAN(0, "off-cpu", &record.off_cpu, "Enable off-cpu analysis"),
+	OPT_CALLBACK(0, "workload-config", &record.opts,
+		     "[sched_policy=policy][,sched_prio=priority][,cpu-list=list]",
+		     record_workload_config_help,
+		     &record_parse_workload_attr_opt),
 	OPT_END()
 };
 
@@ -4221,6 +4244,10 @@ int cmd_record(int argc, const char **argv)
 	record__free_thread_masks(rec, rec->nr_threads);
 	rec->nr_threads = 0;
 	evlist__close_control(rec->opts.ctl_fd, rec->opts.ctl_fd_ack, &rec->opts.ctl_fd_close);
+	if (rec->opts.target.workload.cpu_map) {
+		perf_cpu_map__put(rec->opts.target.workload.cpu_map);
+		rec->opts.target.workload.cpu_map = NULL;
+	}
 	return err;
 }
 
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 07b48f6df48e..a7a3a788e7d9 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -108,6 +108,8 @@ static bool all_counters_use_bpf = true;
 
 static struct target target = {
 	.uid = UINT_MAX,
+	.workload.sched_policy = -1,
+	.workload.sched_priority = 0,
 };
 
 #define METRIC_ONLY_LEN 20
@@ -1160,6 +1162,14 @@ static int parse_cache_level(const struct option *opt,
 	return 0;
 }
 
+static int parse_workload_attr_opt(const struct option *opt __maybe_unused, const char *arg,
+				   int unset __maybe_unused)
+{
+	return evlist__parse_workload_config(arg, &target.workload.sched_policy,
+					     &target.workload.sched_priority,
+					     &target.workload.cpu_map);
+}
+
 static struct option stat_options[] = {
 	OPT_BOOLEAN('T', "transaction", &transaction_run,
 		    "hardware transaction statistics"),
@@ -1220,6 +1230,10 @@ static struct option stat_options[] = {
 	OPT_BOOLEAN(0, "append", &append_file, "append to the output file"),
 	OPT_INTEGER(0, "log-fd", &output_fd,
 		    "log output to fd, instead of stderr"),
+	OPT_CALLBACK(0, "workload-config", &stat_config,
+		     "[sched_policy=policy][,sched_prio=priority][,cpu-list=list]",
+		     record_workload_config_help,
+		     &parse_workload_attr_opt),
 	OPT_STRING(0, "pre", &pre_cmd, "command",
 		   "command to run prior to the measured command"),
 	OPT_STRING(0, "post", &post_cmd, "command",
@@ -2893,5 +2907,10 @@ int cmd_stat(int argc, const char **argv)
 	metricgroup__rblist_exit(&stat_config.metric_events);
 	evlist__close_control(stat_config.ctl_fd, stat_config.ctl_fd_ack, &stat_config.ctl_fd_close);
 
+	if (target.workload.cpu_map) {
+		perf_cpu_map__put(target.workload.cpu_map);
+		target.workload.cpu_map = NULL;
+	}
+
 	return status;
 }
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 7ef43f72098e..7ad7a4fed282 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -33,6 +33,7 @@
 #include "util/bpf-filter.h"
 #include "util/stat.h"
 #include "util/util.h"
+#include "util/parse-sublevel-options.h"
 #include
 #include
 #include
@@ -46,6 +47,7 @@
 #include
 #include
 #include
+#include
 
 #include
 #include
@@ -1398,6 +1400,109 @@ int evlist__open(struct evlist *evlist)
 	return err;
 }
 
+int evlist__parse_workload_config(const char *str, int *sched_policy, int *sched_priority,
+				  struct perf_cpu_map **cpu_map)
+{
+	char *policy_str = NULL;
+	int priority = -1;
+	char *cpu_list = NULL;
+	int ret;
+	struct sublevel_option workload_conf_opts[] = {
+		{ .name = "sched_policy", .str_ptr = &policy_str},
+		{ .name = "sched_prio", .value_ptr = &priority},
+		{ .name = "cpu-list", .str_ptr = &cpu_list},
+		{ .name = NULL, }
+	};
+
+	ret = perf_parse_sublevel_options(str, workload_conf_opts);
+	if (ret)
+		return ret;
+
+	/* sched policy, default to 'other'. */
+	if (!policy_str || !strncmp(policy_str, "other", sizeof("other")))
+		*sched_policy = SCHED_OTHER;
+	else if (!strncmp(policy_str, "fifo", sizeof("fifo")))
+		*sched_policy = SCHED_FIFO;
+	else if (!strncmp(policy_str, "rr", sizeof("rr")))
+		*sched_policy = SCHED_RR;
+	else if (!strncmp(policy_str, "batch", sizeof("batch")))
+		*sched_policy = SCHED_BATCH;
+	else if (!strncmp(policy_str, "idle", sizeof("idle")))
+		*sched_policy = SCHED_IDLE;
+	else {
+		pr_err("workload_attr: unknown sched policy %s\n", policy_str);
+		ret = -EINVAL;
+		goto out;
+	}
+
+	/* check sched priority and set default value */
+	if (*sched_policy == SCHED_FIFO || *sched_policy == SCHED_RR) {
+		if (priority == -1)
+			priority = 99; /* default to the highest priority */
+		else if (priority < 1 || priority > 99) {
+			pr_err("workload_attr: invalid priority %d for fifo and rr, allowed [1,99]\n",
+			       priority);
+			ret = -EINVAL;
+			goto out;
+		}
+	} else if (*sched_policy == SCHED_OTHER && priority == -1)
+		priority = 0;
+	*sched_priority = priority;
+
+	/* allowed cpu list */
+	*cpu_map = __perf_cpu_map__new(cpu_list, ':');
+	if (!*cpu_map) {
+		pr_err("workload_attr: failed to get cpus map from %s\n", cpu_list);
+		ret = -EINVAL;
+	}
+
+out:
+	free(policy_str);
+	free(cpu_list);
+	return ret;
+}
+
+static int configurate_workload(struct target *target)
+{
+	struct sched_param param;
+	int policy = target->workload.sched_policy;
+	int priority = target->workload.sched_priority;
+
+	if (policy >= 0) {
+		param.sched_priority = (policy == SCHED_FIFO || policy == SCHED_RR) ?
+					priority : 0;
+		if (sched_setscheduler(0, policy, &param) != 0) {
+			pr_err("failed to set the sched policy %d: %s\n", policy, strerror(errno));
+			return -1;
+		}
+
+		if (policy == SCHED_OTHER) {
+			if (setpriority(PRIO_PROCESS, 0, priority) != 0) {
+				pr_err("failed to set the nice value %d: %s\n", priority, strerror(errno));
+				return -1;
+			}
+		}
+	}
+
+	if (target->workload.cpu_map) {
+		size_t cpuset_size = -1;
+		cpu_set_t *cpu_set;
+
+		cpu_set = perf_cpu_map__2_cpuset(target->workload.cpu_map, &cpuset_size);
+		if (!cpu_set)
+			return -1;
+
+		if (sched_setaffinity(0, cpuset_size, cpu_set) != 0) {
+			pr_err("failed to set the sched affinity: %s\n", strerror(errno));
+			CPU_FREE(cpu_set);
+			return -1;
+		}
+		CPU_FREE(cpu_set);
+	}
+
+	return 0;
+}
+
 int evlist__prepare_workload(struct evlist *evlist, struct target *target, const char *argv[],
 			     bool pipe_output, void (*exec_error)(int signo, siginfo_t *info, void *ucontext))
 {
@@ -1464,6 +1569,9 @@ int evlist__prepare_workload(struct evlist *evlist, struct target *target, const
 			exit(ret);
 		}
 
+		if (configurate_workload(target) != 0)
+			exit(-1);
+
 		execvp(argv[0], (char **)argv);
 
 		if (exec_error) {
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index 664c6bf7b3e0..540e17d0d9fe 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -15,6 +15,7 @@
 #include
 #include
 #include
+#include
 
 struct pollfd;
 struct thread_map;
@@ -180,6 +181,8 @@ void evlist__set_id_pos(struct evlist *evlist);
 void evlist__config(struct evlist *evlist, struct record_opts *opts, struct callchain_param *callchain);
 int record_opts__config(struct record_opts *opts);
 
+int evlist__parse_workload_config(const char *str, int *sched_policy, int *sched_priority,
+				  struct perf_cpu_map **cpu_set);
 int evlist__prepare_workload(struct evlist *evlist, struct target *target,
 			     const char *argv[], bool pipe_output,
 			     void (*exec_error)(int signo, siginfo_t *info, void *ucontext));
diff --git a/tools/perf/util/target.h b/tools/perf/util/target.h
index d582cae8e105..78b7e7ab1c7b 100644
--- a/tools/perf/util/target.h
+++ b/tools/perf/util/target.h
@@ -4,6 +4,7 @@
 
 #include
 #include
+#include
 
 struct target {
 	const char *pid;
@@ -19,6 +20,12 @@ struct target {
 	bool use_bpf;
 	int initial_delay;
 	const char *attr_map;
+
+	struct {
+		int sched_policy;
+		int sched_priority;
+		struct perf_cpu_map *cpu_map;
+	} workload;
 };
 
 enum target_errno {
@@ -103,4 +110,6 @@ static inline bool target__uses_dummy_map(struct target *target)
 	return use_dummy;
 }
 
+extern const char record_workload_config_help[];
+
 #endif /* _PERF_TARGET_H */
-- 
2.25.1
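
As an illustration of the cpu-list syntax described in the option help
(1-3:5 meaning processors #1, #2, #3 and #5), here is a minimal standalone
sketch in C. It is not the code in this patch, which hands the string to
__perf_cpu_map__new(cpu_list, ':'). The function name parse_cpu_list and
the 64-bit mask, which limits the sketch to CPUs 0-63, are assumptions made
for the example only.

```c
#include <ctype.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not part of perf): parse a cpu-list such as "1-3:5"
 * into a 64-bit mask where bit N set means CPU N is allowed.
 * Entries are separated by ':'; each entry is a CPU number or a range.
 * Returns 0 on success, -1 on malformed input or a CPU number above 63.
 * On failure the mask contents are unspecified.
 */
int parse_cpu_list(const char *list, uint64_t *mask)
{
	const char *p = list;

	*mask = 0;
	if (!p || !*p)
		return -1;

	while (*p) {
		char *end;
		long first, last;

		/* Every entry must start with a digit (no sign, no spaces). */
		if (!isdigit((unsigned char)*p))
			return -1;
		errno = 0;
		first = strtol(p, &end, 10);
		if (errno)
			return -1;
		last = first;

		if (*end == '-') {		/* a range "first-last" */
			p = end + 1;
			if (!isdigit((unsigned char)*p))
				return -1;
			last = strtol(p, &end, 10);
			if (errno || last < first)
				return -1;
		}
		if (last > 63)			/* limit of this sketch */
			return -1;

		for (long cpu = first; cpu <= last; cpu++)
			*mask |= UINT64_C(1) << cpu;

		if (*end == ':') {		/* another entry must follow */
			if (end[1] == '\0')
				return -1;
			p = end + 1;
		} else if (*end == '\0') {
			p = end;
		} else {
			return -1;
		}
	}
	return 0;
}
```

With this sketch, the "0-3:7" list from the commit message example yields
the mask 0x8f, that is CPUs #0-#3 and #7. A real implementation would size
the set from the number of possible CPUs rather than use a fixed 64-bit
word.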