From: Dirk Gouders <dirk@gouders.net>
To: Namhyung Kim <namhyung@kernel.org>,
	Arnaldo Carvalho de Melo <acme@kernel.org>,
	Ingo Molnar <mingo@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>
Cc: Dirk Gouders <dirk@gouders.net>,
	Ian Rogers <irogers@google.com>,
	Adrian Hunter <adrian.hunter@intel.com>,
	LKML <linux-kernel@vger.kernel.org>,
	linux-perf-users@vger.kernel.org
Subject: [PATCH v2 2/3] perf bench sched pipe: add complete graph simulation
Date: Sat, 5 Apr 2025 14:00:07 +0200
Message-ID: <20250405120039.15953-3-dirk@gouders.net>
In-Reply-To: <20250405120039.15953-1-dirk@gouders.net>
References: <20250402212402.15658-2-dirk@gouders.net>
	<20250405120039.15953-1-dirk@gouders.net>

Currently, we have only one worker function: the simulation of a ring
for token traversal.  Add another worker to simulate a complete graph
(Kn) for token traversal.  A new option -K/--Kn can be used to use the
new worker.

Those different workers could be interesting, because they produce
workload varieties noticeable by perf-report(1), for example:
(booted with mitigations=off, 6 processes)

Ring simulation:

Samples: 92K of event 'cycles:P', Event count (approx.): 18690208287
Overhead  Command     Shared Object      Symbol
  13.16%  sched-pipe  [kernel.kallsyms]  [k] timerqueue_add
   7.10%  sched-pipe  [kernel.kallsyms]  [k] read_hpet
   3.36%  sched-pipe  [kernel.kallsyms]  [k] _copy_from_iter
   3.23%  sched-pipe  [kernel.kallsyms]  [k] _copy_to_iter
   2.64%  sched-pipe  [kernel.kallsyms]  [k] vfs_write
   2.55%  sched-pipe  [kernel.kallsyms]  [k] vfs_read

Kn simulation:

Samples: 163K of event 'cycles:P', Event count (approx.): 100366721164
Overhead  Command     Shared Object      Symbol
   5.11%  sched-pipe  [kernel.kallsyms]  [k] _copy_from_iter
   4.90%  sched-pipe  [kernel.kallsyms]  [k] queued_spin_lock_slowpath
   3.99%  sched-pipe  [kernel.kallsyms]  [k] _copy_to_iter
   3.35%  sched-pipe  [kernel.kallsyms]  [k] timerqueue_add
   2.80%  sched-pipe  [kernel.kallsyms]  [k] check_preemption_disabled
   2.56%  sched-pipe  [kernel.kallsyms]  [k] vfs_write
   2.40%  sched-pipe  [kernel.kallsyms]  [k] vfs_read

Signed-off-by: Dirk Gouders <dirk@gouders.net>
---
 tools/perf/Documentation/perf-bench.txt |  5 +++
 tools/perf/bench/sched-pipe.c           | 60 +++++++++++++++++++++++--
 2 files changed, 62 insertions(+), 3 deletions(-)

diff --git a/tools/perf/Documentation/perf-bench.txt b/tools/perf/Documentation/perf-bench.txt
index 8a651f2fe3aa..6f7df3d47821 100644
--- a/tools/perf/Documentation/perf-bench.txt
+++ b/tools/perf/Documentation/perf-bench.txt
@@ -130,6 +130,11 @@ process).
 
 Options of *pipe*
 ^^^^^^^^^^^^^^^^^
+-K::
+--Kn::
+Simulate a complete graph instead of a ring for sending tokens.
+Each process sends and receives tokens to/from every other process.
+
 -l::
 --loop=::
 Specify number of loops.
diff --git a/tools/perf/bench/sched-pipe.c b/tools/perf/bench/sched-pipe.c
index 28dd7f3a11b2..3c76e8249a9b 100644
--- a/tools/perf/bench/sched-pipe.c
+++ b/tools/perf/bench/sched-pipe.c
@@ -50,6 +50,8 @@ static bool threaded;
 static unsigned int nr_threads = 2;
 
 static bool nonblocking;
+static bool Kn_mode; /* Toggle for ring mode -> complete graph mode */
+
 static char *cgrp_names[2];
 static struct cgroup *cgrps[2];
 
@@ -90,6 +92,7 @@ static const struct option options[] = {
 	OPT_BOOLEAN('n', "nonblocking", &nonblocking, "Use non-blocking operations"),
 	OPT_UINTEGER('p', "nprocs", &nr_threads, "Number of processes"),
 	OPT_UINTEGER('l', "loop", &loops, "Specify number of loops"),
+	OPT_BOOLEAN('K', "Kn", &Kn_mode, "Send tokens in a complete graph instead of a ring."),
 	OPT_BOOLEAN('T', "threaded", &threaded, "Specify threads/process based task setup"),
 	OPT_CALLBACK('G', "cgroups", NULL, "SEND,RECV",
 		     "Put sender and receivers in given cgroups",
@@ -188,11 +191,55 @@ static inline int read_pipe(struct thread_data *td)
 	return ret;
 }
 
+/*
+ * Worker thread for processes forming a complete graph,
+ * sending tokens one to each other.
+ */
+static void *worker_thread_kn(void *__tdata)
+{
+	struct thread_data *this_thread = __tdata;
+	struct thread_data *all_threads = this_thread - this_thread->nr;
+
+	int ret, m = 0;
+	unsigned int i;
+	unsigned int t;
+
+	ret = enter_cgroup(this_thread->nr);
+	if (ret < 0) {
+		this_thread->cgroup_failed = true;
+		return NULL;
+	}
+
+	if (nonblocking) {
+		this_thread->epoll_ev.events = EPOLLIN;
+		this_thread->epoll_fd = epoll_create(1);
+		BUG_ON(this_thread->epoll_fd < 0);
+		BUG_ON(epoll_ctl(this_thread->epoll_fd, EPOLL_CTL_ADD, this_thread->pipe_read, &this_thread->epoll_ev) < 0);
+	}
+
+	for (i = 0; i < loops; i++) {
+		/* First: feed all other workers. */
+		for (t = 0; t < nr_threads; t++)
+			if (t != this_thread->nr) {
+				ret = write(all_threads[t].pipe_write, &m, sizeof(int));
+				BUG_ON(ret != sizeof(int));
+			}
+
+		/* Read a token from all other workers. */
+		for (t = 1; t < nr_threads; t++) {
+			ret = read_pipe(this_thread);
+			BUG_ON(ret != sizeof(int));
+		}
+	}
+
+	return NULL;
+}
+
 /*
  * Worker thread for nodes forming a ring, receiving tokens from the left
  * neighbor and sending them to the right one.
  */
-static void *worker_thread(void *__tdata)
+static void *worker_thread_ring(void *__tdata)
 {
 	struct thread_data *this_thread = __tdata;
 	struct thread_data *first_thread = this_thread - this_thread->nr;
@@ -231,6 +278,9 @@ static void *worker_thread(void *__tdata)
 	return NULL;
 }
 
+/* Ring mode is the default. */
+void * (*worker_thread)(void *) = worker_thread_ring;
+
 static struct thread_data *create_thread_data(void)
 {
 	struct thread_data *threads;
@@ -279,6 +329,9 @@ int bench_sched_pipe(int argc, const char **argv)
 
 	argc = parse_options(argc, argv, options, bench_sched_pipe_usage, 0);
 
+	if (Kn_mode)
+		worker_thread = worker_thread_kn;
+
 	threads = create_thread_data();
 
 	gettimeofday(&start, NULL);
@@ -331,8 +384,9 @@ int bench_sched_pipe(int argc, const char **argv)
 
 	switch (bench_format) {
 	case BENCH_FORMAT_DEFAULT:
-		printf("# Executed %d pipe operations between %u %s\n\n", loops,
-		       nr_threads, threaded ? "threads" : "processes");
+		printf("# Executed %d pipe operations (%s) between %u %s\n\n", loops,
+		       Kn_mode ? "Kn" : "ring", nr_threads,
+		       threaded ? "threads" : "processes");
 
 		result_usec = diff.tv_sec * USEC_PER_SEC;
 		result_usec += diff.tv_usec;
-- 
2.45.3
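
For readers comparing the two workers, the following stand-alone sketch (not part of the patch) counts pipe writes per loop iteration in each mode. The helper names `ring_writes_per_loop` and `kn_simulate_loop` are hypothetical, and the ring figure assumes that each ring worker forwards exactly one token per loop. The Kn figure follows the write loop in worker_thread_kn above, where every worker writes once to every other worker.

```c
#include <stddef.h>

/*
 * Hypothetical helper: pipe writes per loop iteration in ring mode.
 * Assumption: each of the n workers forwards one token to its right
 * neighbor per loop.
 */
unsigned long ring_writes_per_loop(unsigned int n)
{
	return n;
}

/*
 * Hypothetical helper: replay one loop iteration of the Kn pattern.
 * Every worker writes one token to every other worker.  On return,
 * tokens_in[t] holds the number of tokens that arrived at worker t.
 * Returns the total number of writes, which is n * (n - 1).
 */
unsigned long kn_simulate_loop(unsigned int n, unsigned int *tokens_in)
{
	unsigned long writes = 0;
	unsigned int sender, t;

	for (t = 0; t < n; t++)
		tokens_in[t] = 0;

	for (sender = 0; sender < n; sender++) {
		for (t = 0; t < n; t++) {
			if (t == sender)
				continue;	/* a worker never feeds itself */
			tokens_in[t]++;
			writes++;
		}
	}

	return writes;
}
```

With the 6 processes used in the perf-report example, this gives 6 writes per loop for the ring and 30 for Kn. Each Kn worker receives 5 tokens, which matches the nr_threads - 1 reads it performs per loop.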