-- Observed in, but not limited to, Linux 6.1.1
Dear all,
rtla osnoise hist always outputs '0' as average duration value. Example:
# rtla osnoise hist -P F:1 -c 0-1 -r 900000 -d 1M -b 1 -E 5000 -T 1
# RTLA osnoise histogram
# Time unit is microseconds (us)
# Duration: 0 00:01:00
...
count: 5629 1364
min: 1 1
avg: 0 0
max: 2955 56
This is due to sum_sample in osnoise_hist_update_multiple() being
calculated as the sum (duration), not as sum (duration * count).
Rounding, instead of truncating, of the average value would be cool.
The following patch would solve the issue described above:
Sampled duration must be weighted by observed quantity, to arrive at a
correct average duration value.
Fix calculation of total duration by summing (duration * count).
Introduce rounding for calculation of final value.
Signed-off-by: Andreas Ziegler <br015@umbiko.net>
--- a/tools/tracing/rtla/src/osnoise_hist.c
+++ b/tools/tracing/rtla/src/osnoise_hist.c
@@ -121,6 +121,7 @@
{
struct osnoise_hist_params *params = tool->params;
struct osnoise_hist_data *data = tool->data;
+ unsigned long long total_duration;
int entries = data->entries;
int bucket;
int *hist;
@@ -131,10 +132,12 @@
if (data->bucket_size)
bucket = duration / data->bucket_size;
+ total_duration = duration * count;
+
hist = data->hist[cpu].samples;
data->hist[cpu].count += count;
update_min(&data->hist[cpu].min_sample, &duration);
- update_sum(&data->hist[cpu].sum_sample, &duration);
+ update_sum(&data->hist[cpu].sum_sample, &total_duration);
update_max(&data->hist[cpu].max_sample, &duration);
if (bucket < entries)
@@ -333,7 +336,7 @@
if (data->hist[cpu].count)
trace_seq_printf(trace->seq, "%9llu ",
- data->hist[cpu].sum_sample / data->hist[cpu].count);
+ (data->hist[cpu].sum_sample + data->hist[cpu].count / 2) /
data->hist[cpu].count);
else
trace_seq_printf(trace->seq, " - ");
}
Kind regards,
Andreas
Hi Andreas, On 12/24/22 13:39, Andreas Ziegler wrote: > -- Observed in, but not limited to, Linux 6.1.1 Since original commit... The best way to report this is adding a Fixes: tag. For example: Fixes: 829a6c0b5698 ("rtla/osnoise: Add the hist mode") > rtla osnoise hist always outputs '0' as average duration value. Example: > > # rtla osnoise hist -P F:1 -c 0-1 -r 900000 -d 1M -b 1 -E 5000 -T 1 > # RTLA osnoise histogram > # Time unit is microseconds (us) > # Duration: 0 00:01:00 > ... > count: 5629 1364 > min: 1 1 > avg: 0 0 > max: 2955 56 > > This is due to sum_sample in osnoise_hist_update_multiple() being calculated as the sum (duration), not as sum (duration * count). Yeah, that is correct. It works on timerlat hist because timerlat hist collects each trace event. osnoise hist uses in-kernel histograms, so we need to multiply the value with the count. This is a leftover from the development phase, as I started using tracing and then moved to histograms (which is better). > Rounding, instead of truncating, of the average value would be cool. I thought: the values were already rounded up, so it might be rounding too much. But we are in user space. It is just easier to add a two digits precision to the value, no? > The following patch would solve the issue described above: > > > Sampled duration must be weighted by observed quantity, to arrive at a > correct average duration value. > > Fix calculation of total duration by summing (duration * count). > Introduce rounding for calculation of final value. > > Signed-off-by: Andreas Ziegler <br015@umbiko.net> > > --- a/tools/tracing/rtla/src/osnoise_hist.c > +++ b/tools/tracing/rtla/src/osnoise_hist.c > @@ -121,6 +121,7 @@ > { > struct osnoise_hist_params *params = tool->params; > struct osnoise_hist_data *data = tool->data; > + unsigned long long total_duration; > int entries = data->entries; > int bucket; > int *hist; > @@ -131,10 +132,12 @@ > if (data->bucket_size) > bucket = duration / data->bucket_size; > > + total_duration = duration * count; > + > hist = data->hist[cpu].samples; > data->hist[cpu].count += count; > update_min(&data->hist[cpu].min_sample, &duration); > - update_sum(&data->hist[cpu].sum_sample, &duration); > + update_sum(&data->hist[cpu].sum_sample, &total_duration); How about re-seding a patch with the code above, adding the: Fixes: 829a6c0b5698 ("rtla/osnoise: Add the hist mode") and... > update_max(&data->hist[cpu].max_sample, &duration); > > if (bucket < entries) > @@ -333,7 +336,7 @@ > > if (data->hist[cpu].count) > trace_seq_printf(trace->seq, "%9llu ", > - data->hist[cpu].sum_sample / data->hist[cpu].count); > + (data->hist[cpu].sum_sample + data->hist[cpu].count / 2) / data->hist[cpu].count); another patch with this part, adding two digits precision? > else > trace_seq_printf(trace->seq, " - "); > } > Thanks! -- Daniel > Kind regards, > Andreas
Version 2 of the proposed patch, with changes split in two separate commits, as suggested by Daniel Bristot de Oliveira rtla osnoise hist always outputs '0' as average duration value. Example: # rtla osnoise hist -P F:1 -c 0-1 -r 900000 -d 1M -b 1 -E 5000 -T 1 # RTLA osnoise histogram # Time unit is microseconds (us) # Duration: 0 00:01:00 ... count: 5629 1364 min: 1 1 avg: 0 0 max: 2955 56 This is due to sum_sample in osnoise_hist_update_multiple() being calculated as the sum (duration), not as sum (duration * count). Truncating of the average value in final output suggests too optimistic results; display floating point value instead. Andreas Ziegler (2): tools/tracing/rtla: osnoise_hist: use total duration for average calculation tools/tracing/rtla: osnoise_hist: display average with two-digit precision tools/tracing/rtla/src/osnoise_hist.c | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-) -- 2.34.1
Sampled durations must be weighted by observed quantity, to arrive at a correct
average duration value.
Perform calculation of total duration by summing (duration * count).
Fixes: 829a6c0b5698 ("rtla/osnoise: Add the hist mode")
Signed-off-by: Andreas Ziegler <br015@umbiko.net>
---
Changes v1 -> v2:
- add 'Fixes' line (Daniel)
tools/tracing/rtla/src/osnoise_hist.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/tools/tracing/rtla/src/osnoise_hist.c b/tools/tracing/rtla/src/osnoise_hist.c
index 5d7ea479ac89..fe34452fc4ec 100644
--- a/tools/tracing/rtla/src/osnoise_hist.c
+++ b/tools/tracing/rtla/src/osnoise_hist.c
@@ -121,6 +121,7 @@ static void osnoise_hist_update_multiple(struct osnoise_tool *tool, int cpu,
{
struct osnoise_hist_params *params = tool->params;
struct osnoise_hist_data *data = tool->data;
+ unsigned long long total_duration;
int entries = data->entries;
int bucket;
int *hist;
@@ -131,10 +132,12 @@ static void osnoise_hist_update_multiple(struct osnoise_tool *tool, int cpu,
if (data->bucket_size)
bucket = duration / data->bucket_size;
+ total_duration = duration * count;
+
hist = data->hist[cpu].samples;
data->hist[cpu].count += count;
update_min(&data->hist[cpu].min_sample, &duration);
- update_sum(&data->hist[cpu].sum_sample, &duration);
+ update_sum(&data->hist[cpu].sum_sample, &total_duration);
update_max(&data->hist[cpu].max_sample, &duration);
if (bucket < entries)
--
2.34.1
On 1/3/23 11:33, Andreas Ziegler wrote: > Sampled durations must be weighted by observed quantity, to arrive at a correct > average duration value. > > Perform calculation of total duration by summing (duration * count). > > Fixes: 829a6c0b5698 ("rtla/osnoise: Add the hist mode") > Signed-off-by: Andreas Ziegler <br015@umbiko.net> Acked-by: Daniel Bristot de Oliveira <bristot@kernel.org> -- Daniel > --- > Changes v1 -> v2: > - add 'Fixes' line (Daniel) > > tools/tracing/rtla/src/osnoise_hist.c | 5 ++++- > 1 file changed, 4 insertions(+), 1 deletion(-) > > diff --git a/tools/tracing/rtla/src/osnoise_hist.c b/tools/tracing/rtla/src/osnoise_hist.c > index 5d7ea479ac89..fe34452fc4ec 100644 > --- a/tools/tracing/rtla/src/osnoise_hist.c > +++ b/tools/tracing/rtla/src/osnoise_hist.c > @@ -121,6 +121,7 @@ static void osnoise_hist_update_multiple(struct osnoise_tool *tool, int cpu, > { > struct osnoise_hist_params *params = tool->params; > struct osnoise_hist_data *data = tool->data; > + unsigned long long total_duration; > int entries = data->entries; > int bucket; > int *hist; > @@ -131,10 +132,12 @@ static void osnoise_hist_update_multiple(struct osnoise_tool *tool, int cpu, > if (data->bucket_size) > bucket = duration / data->bucket_size; > > + total_duration = duration * count; > + > hist = data->hist[cpu].samples; > data->hist[cpu].count += count; > update_min(&data->hist[cpu].min_sample, &duration); > - update_sum(&data->hist[cpu].sum_sample, &duration); > + update_sum(&data->hist[cpu].sum_sample, &total_duration); > update_max(&data->hist[cpu].max_sample, &duration); > > if (bucket < entries)
Calculate average value in osnoise-hist summary with two-digit
precision to avoid displaying too optimitic results.
Signed-off-by: Andreas Ziegler <br015@umbiko.net>
---
Changes v1 -> v2:
- output with two-digit precision instead of rounding (Daniel)
tools/tracing/rtla/src/osnoise_hist.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/tools/tracing/rtla/src/osnoise_hist.c b/tools/tracing/rtla/src/osnoise_hist.c
index fe34452fc4ec..13e1233690bb 100644
--- a/tools/tracing/rtla/src/osnoise_hist.c
+++ b/tools/tracing/rtla/src/osnoise_hist.c
@@ -335,8 +335,8 @@ osnoise_print_summary(struct osnoise_hist_params *params,
continue;
if (data->hist[cpu].count)
- trace_seq_printf(trace->seq, "%9llu ",
- data->hist[cpu].sum_sample / data->hist[cpu].count);
+ trace_seq_printf(trace->seq, "%9.2f ",
+ ((double) data->hist[cpu].sum_sample) / data->hist[cpu].count);
else
trace_seq_printf(trace->seq, " - ");
}
--
2.34.1
On 1/3/23 11:34, Andreas Ziegler wrote: > Calculate average value in osnoise-hist summary with two-digit > precision to avoid displaying too optimitic results. > > Signed-off-by: Andreas Ziegler <br015@umbiko.net> Acked-by: Daniel Bristot de Oliveira <bristot@kernel.org> -- Daniel
On Sat, Dec 24, 2022 at 7:48 AM Andreas Ziegler <br015@umbiko.net> wrote: > > -- Observed in, but not limited to, Linux 6.1.1 Wait, "but not limited to"? What does that mean? Are there more versions affected? -- Slade
On 2022-12-24 21:17, Slade Watkins wrote: > On Sat, Dec 24, 2022 at 7:48 AM Andreas Ziegler <br015@umbiko.net> > wrote: >> >> -- Observed in, but not limited to, Linux 6.1.1 > > Wait, "but not limited to"? What does that mean? Are there more > versions affected? This was meant to indicate that the bug is not a regression; it can be found in every release since introduction in 5.17. Currently affected are 6.1.y and 6.0.y kernel trees. Regards, Andreas > -- Slade
© 2016 - 2025 Red Hat, Inc.