hrtimer precision issue/question??

Posted by richard clark 10 months, 3 weeks ago

With diff below under the 'cyclictest -a 0 -t 1 -m -p99' trigger from
the arm64-based linux box, the interval is 1000us and the arch_timer
in the system is: arch_timer: cp15 timer(s) running at 31.25MHz
(phys).  1tick = 32ns for the arch timer, I am not sure if those
durations less than 1000us are expected?

diff --git a/kernel/time/hrtimer.c b/kernel/time/hrtimer.c
index 11c704bfd4fb..691d1fe6e944 100644
--- a/kernel/time/hrtimer.c
+++ b/kernel/time/hrtimer.c
@@ -2036,14 +2036,25 @@ int nanosleep_copyout(struct restart_block
*restart, struct timespec64 *ts)
 static int __sched do_nanosleep(struct hrtimer_sleeper *t, enum
hrtimer_mode mode)
 {
        struct restart_block *restart;
+       ktime_t t0, t1;
+       int c0, c1;
+       bool hit = !strncmp(current->comm, "cyclictest", 10);

        do {
+               if (hit) {
+                       c0 = smp_processor_id();
+                       t0 = ktime_get();
+               }
                set_current_state(TASK_INTERRUPTIBLE|TASK_FREEZABLE);
                hrtimer_sleeper_start_expires(t, mode);

                if (likely(t->task))
                        schedule();
-
+               if (hit) {
+                       t1 = ktime_get();
+                       c1 = smp_processor_id();
+                       pr_info("[%2d-%2d]t0=%lld,t1=%lld,d=%lld
us\n", c0, c1, t0, t1, ktime_to_us(ktime_sub(t1, t0)));
+               }
                hrtimer_cancel(&t->timer);
                mode = HRTIMER_MODE_ABS;

Part of the real output:
[  165.555795] [ 0- 0]t0=165550399226,t1=165551394303,d=995 us
[  165.556802] [ 0- 0]t0=165551398751,t1=165552400997,d=1002 us
[  165.557797] [ 0- 0]t0=165552415141,t1=165553396106,d=980 us
[  165.558794] [ 0- 0]t0=165553402090,t1=165554394160,d=992 us
[  165.559794] [ 0- 0]t0=165554398832,t1=165555393941,d=995 us
[  165.559995] [ 6- 5]t0=165545533151,t1=165555594519,d=10061 us
[  165.560802] [ 0- 0]t0=165555398485,t1=165556400795,d=1002 us
[  165.561796] [ 0- 0]t0=165556415259,t1=165557395457,d=980 us
[  165.562794] [ 0- 0]t0=165557401601,t1=165558394086,d=992 us
[  165.563794] [ 0- 0]t0=165558398694,t1=165559393996,d=995 us
[  165.564802] [ 0- 0]t0=165559398188,t1=165560400785,d=1002 us
[  165.565796] [ 0- 0]t0=165560414673,t1=165561395671,d=980 us
[  165.566794] [ 0- 0]t0=165561402327,t1=165562394108,d=991 us
[  165.567794] [ 0- 0]t0=165562399068,t1=165563393890,d=994 us
[  165.568802] [ 0- 0]t0=165563399362,t1=165564401640,d=1002 us
[  165.569796] [ 0- 0]t0=165564413640,t1=165565395885,d=982 us
[  165.570104] [ 5- 6]t0=165555639767,t1=165565703087,d=10063 us
[  165.570795] [ 0- 0]t0=165565401421,t1=165566394739,d=993 us
[  165.571794] [ 0- 0]t0=165566400531,t1=165567393816,d=993 us
[  165.572799] [ 0- 0]t0=165567399800,t1=165568398270,d=998 us
[  165.573798] [ 0- 0]t0=165568412702,t1=165569398115,d=985 us
[  165.574794] [ 0- 0]t0=165569404451,t1=165570394249,d=989 us
[  165.575794] [ 0- 0]t0=165570399145,t1=165571393838,d=994 us
[  165.576801] [ 0- 0]t0=165571398606,t1=165572400276,d=1001 us
[  165.577796] [ 0- 0]t0=165572414260,t1=165573395417,d=981 us
[  165.578794] [ 0- 0]t0=165573401561,t1=165574394015,d=992 us
[  165.579794] [ 0- 0]t0=165574398975,t1=165575393957,d=994 us
[  165.580208] [ 6- 5]t0=165565749263,t1=165575807879,d=10058 us
[  165.580801] [ 0- 0]t0=165575398405,t1=165576399978,d=1001 us

Thanks

Re: hrtimer precision issue/question??

Posted by Thomas Gleixner 10 months, 3 weeks ago

On Sat, Mar 22 2025 at 11:20, richard clark wrote:
> With diff below under the 'cyclictest -a 0 -t 1 -m -p99' trigger from
> the arm64-based linux box, the interval is 1000us and the arch_timer
> in the system is: arch_timer: cp15 timer(s) running at 31.25MHz
> (phys).  1tick = 32ns for the arch timer, I am not sure if those
> durations less than 1000us are expected?

With your method of measurement yes. There is a german saying, which
describes this. It roughly translates to:

   "Who measures a lot, might measure a lot of garbage."

But it accurately describes, what you are measuring here. You do:

    t1 = ktime_get();
    arm_timer(T);
    schedule();
    t2 = ktime_get();

and then look at t2 - t1. That only tells you how long the task actually
slept. But that's ignoring the most important information here:

    arm_timer(T);

cyclictest uses:

           clock_nanosleep(clockid, ABSTIME, &T);

and T is maintained in absolute time on a periodic time line.

    T = starttime + N * interval;

So the only interesting information here is at which time the task
returns from schedule(), i.e. you want to look at:

        t2 - T

Why? Because that gives you the latency of the wakeup. That's what
cyclictest is looking at in user space:

           clock_nanosleep(... &T);
           clock_gettime(..., &T2);
           latency = T2 - T;

Now what you are looking at is the time at which the cyclictest task
comes back into the kernel to sleep, which is obviously

           t1 = T[N] + latency[N-1] + execution_time;

But the timer is armed for T[N], so your t2 is:

           t2 = T[N] + latency[N];

You surely can do the remaining math and map that to the output:

> [  165.555795] [ 0- 0]t0=165550399226,t1=165551394303,d=995 us
> [  165.556802] [ 0- 0]t0=165551398751,t1=165552400997,d=1002 us

Right?

Thanks,

        tglx

Re: hrtimer precision issue/question??

Posted by richard clark 10 months, 3 weeks ago

Hi tglx, Thanks for the explanation! I tried run 'cyclictest -a 0 -t 1
-m -p99 -r' to use the relative timer instead of absolute, now the
output of the same diff as:
[   63.824382] [ 0- 0]t0=63816686720,t1=63817689248,d=1002 us
[   63.825387] [ 0- 0]t0=63817691648,t1=63818694144,d=1002 us
[   63.825691] [ 4- 7]t0=63808939136,t1=63818998336,d=10059 us
[   63.826392] [ 0- 0]t0=63818696544,t1=63819699456,d=1002 us
[   63.827402] [ 0- 0]t0=63819704352,t1=63820709120,d=1004 us
[   63.828407] [ 0- 0]t0=63820712448,t1=63821714976,d=1002 us
[   63.829413] [ 0- 0]t0=63821717536,t1=63822720032,d=1002 us
[   63.830417] [ 0- 0]t0=63822722496,t1=63823724928,d=1002 us
[   63.831423] [ 0- 0]t0=63823727264,t1=63824730912,d=1003 us
[   63.832429] [ 0- 0]t0=63824734176,t1=63825736768,d=1002 us
[   63.833434] [ 0- 0]t0=63825739168,t1=63826741568,d=1002 us
[   63.834439] [ 0- 0]t0=63826743904,t1=63827746432,d=1002 us
[   63.835445] [ 0- 0]t0=63827748832,t1=63828752576,d=1003 us
[   63.836164] [ 7- 4]t0=63819412064,t1=63829471328,d=10059 us
[   63.836452] [ 0- 0]t0=63828755936,t1=63829760032,d=1004 us

Now I am wondering if the cyclictest will have the same result using
the relative timer instead of absolute, suppose all the other
conditions are the same?
Also the '-a 0' set affinity to cpu0 and only one timer thread with
'-t 1', don't know why some cyclictest migrate(?) to the other cpus
than cpu0, any comments about this...
Thanks!

On Sun, Mar 23, 2025 at 5:04 AM Thomas Gleixner <tglx@linutronix.de> wrote:
>
> On Sat, Mar 22 2025 at 11:20, richard clark wrote:
> > With diff below under the 'cyclictest -a 0 -t 1 -m -p99' trigger from
> > the arm64-based linux box, the interval is 1000us and the arch_timer
> > in the system is: arch_timer: cp15 timer(s) running at 31.25MHz
> > (phys).  1tick = 32ns for the arch timer, I am not sure if those
> > durations less than 1000us are expected?
>
> With your method of measurement yes. There is a german saying, which
> describes this. It roughly translates to:
>
>    "Who measures a lot, might measure a lot of garbage."
>
> But it accurately describes, what you are measuring here. You do:
>
>     t1 = ktime_get();
>     arm_timer(T);
>     schedule();
>     t2 = ktime_get();
>
> and then look at t2 - t1. That only tells you how long the task actually
> slept. But that's ignoring the most important information here:
>
>     arm_timer(T);
>
> cyclictest uses:
>
>            clock_nanosleep(clockid, ABSTIME, &T);
>
> and T is maintained in absolute time on a periodic time line.
>
>     T = starttime + N * interval;
>
> So the only interesting information here is at which time the task
> returns from schedule(), i.e. you want to look at:
>
>         t2 - T
>
> Why? Because that gives you the latency of the wakeup. That's what
> cyclictest is looking at in user space:
>
>            clock_nanosleep(... &T);
>            clock_gettime(..., &T2);
>            latency = T2 - T;
>
> Now what you are looking at is the time at which the cyclictest task
> comes back into the kernel to sleep, which is obviously
>
>            t1 = T[N] + latency[N-1] + execution_time;
>
> But the timer is armed for T[N], so your t2 is:
>
>            t2 = T[N] + latency[N];
>
> You surely can do the remaining math and map that to the output:
>
> > [  165.555795] [ 0- 0]t0=165550399226,t1=165551394303,d=995 us
> > [  165.556802] [ 0- 0]t0=165551398751,t1=165552400997,d=1002 us
>
> Right?
>
> Thanks,
>
>         tglx