From: K Prateek Nayak <kprateek.nayak@amd.com>
Currently, there does not exist a straightforward way to extract the
names of the sched domains and match them to the per-cpu domain entry in
/proc/schedstat other than looking at the debugfs files which are only
visible after enabling "verbose" debug after commit 34320745dfc9
("sched/debug: Put sched/domains files under the verbose flag")
Since tools like `perf sched schedstat` require displaying per-domain
information in user friendly manner, display the names of sched domain,
alongside their level in /proc/schedstat if CONFIG_SCHED_DEBUG is enabled.
Domain names also makes the /proc/schedstat data unambiguous when some
of the cpus are offline. For example, on a 128 cpus AMD Zen3 machine
where CPU0 and CPU64 are SMT siblings and CPU64 is offline:
Before:
cpu0 ...
domain0 ...
domain1 ...
cpu1 ...
domain0 ...
domain1 ...
domain2 ...
After:
cpu0 ...
domain0:MC ...
domain1:PKG ...
cpu1 ...
domain0:SMT ...
domain1:MC ...
domain2:PKG ...
schedstat version has not been updated since this change merely adds
additional information to the domain name field and does not add a new
field altogether.
Signed-off-by: K Prateek Nayak <kprateek.nayak@amd.com>
Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
Tested-by: James Clark <james.clark@linaro.org>
---
Documentation/scheduler/sched-stats.rst | 8 ++++++--
kernel/sched/stats.c | 6 +++++-
2 files changed, 11 insertions(+), 3 deletions(-)
diff --git a/Documentation/scheduler/sched-stats.rst b/Documentation/scheduler/sched-stats.rst
index 7c2b16c4729d..b60a3e7bc108 100644
--- a/Documentation/scheduler/sched-stats.rst
+++ b/Documentation/scheduler/sched-stats.rst
@@ -6,6 +6,8 @@ Version 16 of schedstats changed the order of definitions within
'enum cpu_idle_type', which changed the order of [CPU_MAX_IDLE_TYPES]
columns in show_schedstat(). In particular the position of CPU_IDLE
and __CPU_NOT_IDLE changed places. The size of the array is unchanged.
+With CONFIG_SCHED_DEBUG enabled, the domain field can also print the
+name of the corresponding sched domain.
Version 15 of schedstats dropped counters for some sched_yield:
yld_exp_empty, yld_act_empty and yld_both_empty. Otherwise, it is
@@ -71,9 +73,11 @@ Domain statistics
-----------------
One of these is produced per domain for each cpu described. (Note that if
CONFIG_SMP is not defined, *no* domains are utilized and these lines
-will not appear in the output.)
+will not appear in the output. [:<name>] is an optional extension to the domain
+field that prints the name of the corresponding sched domain. It can appear in
+schedstat version 16 and above, and requires CONFIG_SCHED_DEBUG.)
-domain<N> <cpumask> 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36
+domain<N>[:<name>] <cpumask> 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36
The first field is a bit mask indicating what cpus this domain operates over.
diff --git a/kernel/sched/stats.c b/kernel/sched/stats.c
index eb0cdcd4d921..bd4ed737e894 100644
--- a/kernel/sched/stats.c
+++ b/kernel/sched/stats.c
@@ -138,7 +138,11 @@ static int show_schedstat(struct seq_file *seq, void *v)
for_each_domain(cpu, sd) {
enum cpu_idle_type itype;
- seq_printf(seq, "domain%d %*pb", dcount++,
+ seq_printf(seq, "domain%d", dcount++);
+#ifdef CONFIG_SCHED_DEBUG
+ seq_printf(seq, ":%s", sd->name);
+#endif
+ seq_printf(seq, " %*pb",
cpumask_pr_args(sched_domain_span(sd)));
for (itype = 0; itype < CPU_MAX_IDLE_TYPES; itype++) {
seq_printf(seq, " %u %u %u %u %u %u %u %u",
--
2.43.0
On Fri, Nov 22, 2024 at 08:44:47AM +0000, Swapnil Sapkal wrote: > schedstat version has not been updated since this change merely adds > additional information to the domain name field and does not add a new > field altogether. So I don't care much either way, but if an existing tool is trying to parse the domain number, it might now get confused by the extra characters. I know of no such tool, just being pedantic.
Hello Peter, On 11/22/2024 4:42 PM, Peter Zijlstra wrote: > On Fri, Nov 22, 2024 at 08:44:47AM +0000, Swapnil Sapkal wrote: > >> schedstat version has not been updated since this change merely adds >> additional information to the domain name field and does not add a new >> field altogether. > > So I don't care much either way, but if an existing tool is trying to > parse the domain number, it might now get confused by the extra > characters. I know of no such tool, just being pedantic. I will change the schedstat version so that it will not break any existing tools. However the domain name is under CONFIG_SCHED_DEBUG and thus incrementing the schedstats version is not sufficient when this config option is disabled. Is it okay for you to move domain name out of CONFIG_SCHED_DEBUG? Previously, I sent couple of patches to change the schedstat layout for * Correctly accounting the different types of imbalances [1] * Accurately computing the task hot migrations [2] (This was your patch) I would like to include these two patches that better helps justify the version change. Is that okay by you? [1] https://lore.kernel.org/all/20240514044445.1409976-1-swapnil.sapkal@amd.com/ [2] https://lore.kernel.org/all/20230614102224.12555-1-swapnil.sapkal@amd.com/ -- Thanks and Regards, Swapnil
On Mon, Nov 25, 2024 at 06:01:16PM +0530, Sapkal, Swapnil wrote: > Hello Peter, > > On 11/22/2024 4:42 PM, Peter Zijlstra wrote: > > On Fri, Nov 22, 2024 at 08:44:47AM +0000, Swapnil Sapkal wrote: > > > > > schedstat version has not been updated since this change merely adds > > > additional information to the domain name field and does not add a new > > > field altogether. > > > > So I don't care much either way, but if an existing tool is trying to > > parse the domain number, it might now get confused by the extra > > characters. I know of no such tool, just being pedantic. > > I will change the schedstat version so that it will not break any existing > tools. > > However the domain name is under CONFIG_SCHED_DEBUG and thus incrementing > the schedstats version is not sufficient when this config option is > disabled. Is it okay for you to move domain name out of CONFIG_SCHED_DEBUG? I suppose so.. > Previously, I sent couple of patches to change the schedstat layout for > * Correctly accounting the different types of imbalances [1] > * Accurately computing the task hot migrations [2] (This was your patch) > I would like to include these two patches that better helps justify the > version change. Is that okay by you? Yeah, sure.
On 11/22/2024 2:14 PM, Swapnil Sapkal wrote: > From: K Prateek Nayak <kprateek.nayak@amd.com> > > Currently, there does not exist a straightforward way to extract the > names of the sched domains and match them to the per-cpu domain entry in > /proc/schedstat other than looking at the debugfs files which are only > visible after enabling "verbose" debug after commit 34320745dfc9 > ("sched/debug: Put sched/domains files under the verbose flag") > > Since tools like `perf sched schedstat` require displaying per-domain > information in user friendly manner, display the names of sched domain, > alongside their level in /proc/schedstat if CONFIG_SCHED_DEBUG is enabled. > > Domain names also makes the /proc/schedstat data unambiguous when some > of the cpus are offline. For example, on a 128 cpus AMD Zen3 machine > where CPU0 and CPU64 are SMT siblings and CPU64 is offline: > > Before: > cpu0 ... > domain0 ... > domain1 ... > cpu1 ... > domain0 ... > domain1 ... > domain2 ... > > After: > cpu0 ... > domain0:MC ... > domain1:PKG ... > cpu1 ... > domain0:SMT ... > domain1:MC ... > domain2:PKG ... > > schedstat version has not been updated since this change merely adds > additional information to the domain name field and does not add a new > field altogether. > > Signed-off-by: K Prateek Nayak <kprateek.nayak@amd.com> > Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com> > Tested-by: James Clark <james.clark@linaro.org> Signed-off-by: Swapnil Sapkal <swapnil.sapkal@amd.com>
On Fri, Nov 22, 2024 at 02:25:22PM +0530, Sapkal, Swapnil wrote: > On 11/22/2024 2:14 PM, Swapnil Sapkal wrote: > > From: K Prateek Nayak <kprateek.nayak@amd.com> > > > > Currently, there does not exist a straightforward way to extract the > > names of the sched domains and match them to the per-cpu domain entry in > > /proc/schedstat other than looking at the debugfs files which are only > > visible after enabling "verbose" debug after commit 34320745dfc9 > > ("sched/debug: Put sched/domains files under the verbose flag") > > > > Since tools like `perf sched schedstat` require displaying per-domain > > information in user friendly manner, display the names of sched domain, > > alongside their level in /proc/schedstat if CONFIG_SCHED_DEBUG is enabled. > > > > Domain names also makes the /proc/schedstat data unambiguous when some > > of the cpus are offline. For example, on a 128 cpus AMD Zen3 machine > > where CPU0 and CPU64 are SMT siblings and CPU64 is offline: > > > > Before: > > cpu0 ... > > domain0 ... > > domain1 ... > > cpu1 ... > > domain0 ... > > domain1 ... > > domain2 ... > > > > After: > > cpu0 ... > > domain0:MC ... > > domain1:PKG ... > > cpu1 ... > > domain0:SMT ... > > domain1:MC ... > > domain2:PKG ... > > > > schedstat version has not been updated since this change merely adds > > additional information to the domain name field and does not add a new > > field altogether. > > > > Signed-off-by: K Prateek Nayak <kprateek.nayak@amd.com> > > Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com> > > Tested-by: James Clark <james.clark@linaro.org> > > Signed-off-by: Swapnil Sapkal <swapnil.sapkal@amd.com> Surely you mean either acked-by or reviewed-by ? Otherwise I suggest you re-read the documentation on tags.
Hi Peter, On 11/22/2024 4:38 PM, Peter Zijlstra wrote: > On Fri, Nov 22, 2024 at 02:25:22PM +0530, Sapkal, Swapnil wrote: >> On 11/22/2024 2:14 PM, Swapnil Sapkal wrote: >>> From: K Prateek Nayak <kprateek.nayak@amd.com> >>> >>> Currently, there does not exist a straightforward way to extract the >>> names of the sched domains and match them to the per-cpu domain entry in >>> /proc/schedstat other than looking at the debugfs files which are only >>> visible after enabling "verbose" debug after commit 34320745dfc9 >>> ("sched/debug: Put sched/domains files under the verbose flag") >>> >>> Since tools like `perf sched schedstat` require displaying per-domain >>> information in user friendly manner, display the names of sched domain, >>> alongside their level in /proc/schedstat if CONFIG_SCHED_DEBUG is enabled. >>> >>> Domain names also makes the /proc/schedstat data unambiguous when some >>> of the cpus are offline. For example, on a 128 cpus AMD Zen3 machine >>> where CPU0 and CPU64 are SMT siblings and CPU64 is offline: >>> >>> Before: >>> cpu0 ... >>> domain0 ... >>> domain1 ... >>> cpu1 ... >>> domain0 ... >>> domain1 ... >>> domain2 ... >>> >>> After: >>> cpu0 ... >>> domain0:MC ... >>> domain1:PKG ... >>> cpu1 ... >>> domain0:SMT ... >>> domain1:MC ... >>> domain2:PKG ... >>> >>> schedstat version has not been updated since this change merely adds >>> additional information to the domain name field and does not add a new >>> field altogether. >>> >>> Signed-off-by: K Prateek Nayak <kprateek.nayak@amd.com> >>> Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com> >>> Tested-by: James Clark <james.clark@linaro.org> >> >> Signed-off-by: Swapnil Sapkal <swapnil.sapkal@amd.com> > > Surely you mean either acked-by or reviewed-by ? Otherwise I suggest you > re-read the documentation on tags. I took over the series from Ravi but forgot to add my S-o-b. I will make sure to include it from next time onward. -- Thanks and Regards, Swapnil
© 2016 - 2024 Red Hat, Inc.