scripts/simpletrace.py | 11 +++++++++++ 1 file changed, 11 insertions(+)
In some trace log, there're unstable timestamp breaking temporal
ordering of trace records. For example:
kvm_run_exit -0.015 pid=3289596 cpu_index=0x0 reason=0x6
kvm_vm_ioctl -0.020 pid=3289596 type=0xffffffffc008ae67 arg=0x7ffeefb5aa60
kvm_vm_ioctl -0.021 pid=3289596 type=0xffffffffc008ae67 arg=0x7ffeefb5aa60
Negative delta intervals tend to get drowned in the massive trace logs,
and an unstable timestamp can corrupt the calculation of intervals
between two events adjacent to it.
Therefore, mark the outputs with unstable timestamps as WARN like:
WARN: skip unstable timestamp: kvm_run_exit cur(8497404907761146)-pre(8497404907761161) pid=3289596 cpu_index=0x0 reason=0x6
WARN: skip unstable timestamp: kvm_vm_ioctl cur(8497404908603653)-pre(8497404908603673) pid=3289596 type=0xffffffffc008ae67 arg=0x7ffeefb5aa60
WARN: skip unstable timestamp: kvm_vm_ioctl cur(8497404908625787)-pre(8497404908625808) pid=3289596 type=0xffffffffc008ae67 arg=0x7ffeefb5aa60
This would help to identify unusual events.
And skip them without updating Formatter2.last_timestamp_ns to avoid
time back.
Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
---
scripts/simpletrace.py | 11 +++++++++++
1 file changed, 11 insertions(+)
diff --git a/scripts/simpletrace.py b/scripts/simpletrace.py
index cef81b0707f0..23911dddb8a6 100755
--- a/scripts/simpletrace.py
+++ b/scripts/simpletrace.py
@@ -343,6 +343,17 @@ def __init__(self):
def catchall(self, *rec_args, event, timestamp_ns, pid, event_id):
if self.last_timestamp_ns is None:
self.last_timestamp_ns = timestamp_ns
+
+ if timestamp_ns < self.last_timestamp_ns:
+ fields = [
+ f'{name}={r}' if is_string(type) else f'{name}=0x{r:x}'
+ for r, (type, name) in zip(rec_args, event.args)
+ ]
+ print(f'WARN: skip unstable timestamp: {event.name} '
+ f'cur({timestamp_ns})-pre({self.last_timestamp_ns}) {pid=} ' +
+ f' '.join(fields))
+ return
+
delta_ns = timestamp_ns - self.last_timestamp_ns
self.last_timestamp_ns = timestamp_ns
--
2.34.1
On Wed, 8 May 2024 at 00:19, Zhao Liu <zhao1.liu@intel.com> wrote: > > In some trace log, there're unstable timestamp breaking temporal > ordering of trace records. For example: > > kvm_run_exit -0.015 pid=3289596 cpu_index=0x0 reason=0x6 > kvm_vm_ioctl -0.020 pid=3289596 type=0xffffffffc008ae67 arg=0x7ffeefb5aa60 > kvm_vm_ioctl -0.021 pid=3289596 type=0xffffffffc008ae67 arg=0x7ffeefb5aa60 > > Negative delta intervals tend to get drowned in the massive trace logs, > and an unstable timestamp can corrupt the calculation of intervals > between two events adjacent to it. > > Therefore, mark the outputs with unstable timestamps as WARN like: Why are the timestamps non-monotonic? In a situation like that maybe not only the negative timestamps are useless but even some positive timestamps are incorrect. I think it's worth understanding the nature of the instability before merging a fix. Stefan > > WARN: skip unstable timestamp: kvm_run_exit cur(8497404907761146)-pre(8497404907761161) pid=3289596 cpu_index=0x0 reason=0x6 > WARN: skip unstable timestamp: kvm_vm_ioctl cur(8497404908603653)-pre(8497404908603673) pid=3289596 type=0xffffffffc008ae67 arg=0x7ffeefb5aa60 > WARN: skip unstable timestamp: kvm_vm_ioctl cur(8497404908625787)-pre(8497404908625808) pid=3289596 type=0xffffffffc008ae67 arg=0x7ffeefb5aa60 > > This would help to identify unusual events. > > And skip them without updating Formatter2.last_timestamp_ns to avoid > time back. > > Signed-off-by: Zhao Liu <zhao1.liu@intel.com> > --- > scripts/simpletrace.py | 11 +++++++++++ > 1 file changed, 11 insertions(+) > > diff --git a/scripts/simpletrace.py b/scripts/simpletrace.py > index cef81b0707f0..23911dddb8a6 100755 > --- a/scripts/simpletrace.py > +++ b/scripts/simpletrace.py > @@ -343,6 +343,17 @@ def __init__(self): > def catchall(self, *rec_args, event, timestamp_ns, pid, event_id): > if self.last_timestamp_ns is None: > self.last_timestamp_ns = timestamp_ns > + > + if timestamp_ns < self.last_timestamp_ns: > + fields = [ > + f'{name}={r}' if is_string(type) else f'{name}=0x{r:x}' > + for r, (type, name) in zip(rec_args, event.args) > + ] > + print(f'WARN: skip unstable timestamp: {event.name} ' > + f'cur({timestamp_ns})-pre({self.last_timestamp_ns}) {pid=} ' + > + f' '.join(fields)) > + return > + > delta_ns = timestamp_ns - self.last_timestamp_ns > self.last_timestamp_ns = timestamp_ns > > -- > 2.34.1 > >
> Why are the timestamps non-monotonic? > > In a situation like that maybe not only the negative timestamps are > useless but even some positive timestamps are incorrect. I think it's > worth understanding the nature of the instability before merging a > fix. I agree with Stefan on this. We'll need to find the underlying cause first.
On Wed, May 08, 2024 at 02:05:04PM -0400, Stefan Hajnoczi wrote: > Date: Wed, 8 May 2024 14:05:04 -0400 > From: Stefan Hajnoczi <stefanha@gmail.com> > Subject: Re: [PATCH] scripts/simpletrace: Mark output with unstable > timestamp as WARN > > On Wed, 8 May 2024 at 00:19, Zhao Liu <zhao1.liu@intel.com> wrote: > > > > In some trace log, there're unstable timestamp breaking temporal > > ordering of trace records. For example: > > > > kvm_run_exit -0.015 pid=3289596 cpu_index=0x0 reason=0x6 > > kvm_vm_ioctl -0.020 pid=3289596 type=0xffffffffc008ae67 arg=0x7ffeefb5aa60 > > kvm_vm_ioctl -0.021 pid=3289596 type=0xffffffffc008ae67 arg=0x7ffeefb5aa60 > > > > Negative delta intervals tend to get drowned in the massive trace logs, > > and an unstable timestamp can corrupt the calculation of intervals > > between two events adjacent to it. > > > > Therefore, mark the outputs with unstable timestamps as WARN like: > > Why are the timestamps non-monotonic? > > In a situation like that maybe not only the negative timestamps are > useless but even some positive timestamps are incorrect. I think it's > worth understanding the nature of the instability before merging a > fix. I grabbed more traces (by -trace "*" to cover as many events as possible ) and have a couple observations: * It's not an issue with kvm's ioctl, and that qemu_mutex_lock/ object_dynamic_cast_assert accounted for the vast majority of all exception timestamps. * The total exception timestamp occurrence probability was roughly 0.013% (608 / 4,616,938) in a 398M trace file. * And the intervals between the "wrong" timestamp and its pre event are almost all within 50ns, even more concentrated within 20ns (there are even quite a few 1~10ns). Just a guess: Would it be possible that a trace event which is too short of an interval, and happen to meet a delay in signaling to send to writeout_thread? It seems that this short interval like a lack of real-time guarantees in the underlying mechanism... If it's QEMU's own issue, I feel like the intervals should be randomized, not just within 50ns... May I ask what you think? Any suggestions for researching this situation ;-) Thanks, Zhao
On Thu, May 09, 2024 at 11:59:10AM +0800, Zhao Liu wrote: > On Wed, May 08, 2024 at 02:05:04PM -0400, Stefan Hajnoczi wrote: > > Date: Wed, 8 May 2024 14:05:04 -0400 > > From: Stefan Hajnoczi <stefanha@gmail.com> > > Subject: Re: [PATCH] scripts/simpletrace: Mark output with unstable > > timestamp as WARN > > > > On Wed, 8 May 2024 at 00:19, Zhao Liu <zhao1.liu@intel.com> wrote: > > > > > > In some trace log, there're unstable timestamp breaking temporal > > > ordering of trace records. For example: > > > > > > kvm_run_exit -0.015 pid=3289596 cpu_index=0x0 reason=0x6 > > > kvm_vm_ioctl -0.020 pid=3289596 type=0xffffffffc008ae67 arg=0x7ffeefb5aa60 > > > kvm_vm_ioctl -0.021 pid=3289596 type=0xffffffffc008ae67 arg=0x7ffeefb5aa60 > > > > > > Negative delta intervals tend to get drowned in the massive trace logs, > > > and an unstable timestamp can corrupt the calculation of intervals > > > between two events adjacent to it. > > > > > > Therefore, mark the outputs with unstable timestamps as WARN like: > > > > Why are the timestamps non-monotonic? > > > > In a situation like that maybe not only the negative timestamps are > > useless but even some positive timestamps are incorrect. I think it's > > worth understanding the nature of the instability before merging a > > fix. > > I grabbed more traces (by -trace "*" to cover as many events as possible > ) and have a couple observations: > > * It's not an issue with kvm's ioctl, and that qemu_mutex_lock/ > object_dynamic_cast_assert accounted for the vast majority of all > exception timestamps. > * The total exception timestamp occurrence probability was roughly 0.013% > (608 / 4,616,938) in a 398M trace file. > * And the intervals between the "wrong" timestamp and its pre event are > almost all within 50ns, even more concentrated within 20ns (there are > even quite a few 1~10ns). > > Just a guess: > > Would it be possible that a trace event which is too short of an interval, > and happen to meet a delay in signaling to send to writeout_thread? > > It seems that this short interval like a lack of real-time guarantees in > the underlying mechanism... > > If it's QEMU's own issue, I feel like the intervals should be randomized, > not just within 50ns... > > May I ask what you think? Any suggestions for researching this situation > ;-) QEMU uses clock_gettime(CLOCK_MONOTONIC) on Linux hosts. The man page says: All CLOCK_MONOTONIC variants guarantee that the time returned by consecutive calls will not go backwards, but successive calls may—depending on the architecture—return identical (not-in‐ creased) time values. trace_record_start() calls clock_gettime(CLOCK_MONOTONIC) so trace events should have monotonically increasing timestamps. I don't see a scenario where trace record A's timestamp is greater than trace record B's timestamp unless the clock is non-monotonic. Which host CPU architecture and operating system are you running? Please attach to the QEMU process with gdb and print out the value of the use_rt_clock variable or add a printf in init_get_clock(). The value should be 1. Stefan
Hi Stefan, > QEMU uses clock_gettime(CLOCK_MONOTONIC) on Linux hosts. The man page > says: > > All CLOCK_MONOTONIC variants guarantee that the time returned by > consecutive calls will not go backwards, but successive calls > may—depending on the architecture—return identical (not-in‐ > creased) time values. > > trace_record_start() calls clock_gettime(CLOCK_MONOTONIC) so trace events > should have monotonically increasing timestamps. > > I don't see a scenario where trace record A's timestamp is greater than > trace record B's timestamp unless the clock is non-monotonic. > > Which host CPU architecture and operating system are you running? I tested on these 2 machines: * CML (intel 10th) with Ubuntu 22.04 + kernel v6.5.0-28 * MTL (intel 14th) with Ubuntu 22.04.2 + kernel v6.9.0 > Please attach to the QEMU process with gdb and print out the value of > the use_rt_clock variable or add a printf in init_get_clock(). The value > should be 1. Thanks, on both above machines, use_rt_clock is 1 and there're both timestamp reversal issues with the following debug print: diff --git a/include/qemu/timer.h b/include/qemu/timer.h index 9a366e551fb3..7657785c27dc 100644 --- a/include/qemu/timer.h +++ b/include/qemu/timer.h @@ -831,10 +831,17 @@ extern int use_rt_clock; static inline int64_t get_clock(void) { + static int64_t clock = 0; if (use_rt_clock) { struct timespec ts; clock_gettime(CLOCK_MONOTONIC, &ts); - return ts.tv_sec * 1000000000LL + ts.tv_nsec; + int64_t tmp = ts.tv_sec * 1000000000LL + ts.tv_nsec; + if (tmp <= clock) { + printf("get_clock: strange, clock: %ld, tmp: %ld\n", clock, tmp); + } + assert(tmp > clock); + clock = tmp; + return clock; } else { /* XXX: using gettimeofday leads to problems if the date changes, so it should be avoided. */ diff --git a/util/qemu-timer-common.c b/util/qemu-timer-common.c index cc1326f72646..3bf06eb4a4ce 100644 --- a/util/qemu-timer-common.c +++ b/util/qemu-timer-common.c @@ -59,5 +59,6 @@ static void __attribute__((constructor)) init_get_clock(void) use_rt_clock = 1; } clock_start = get_clock(); + printf("init_get_clock: use_rt_clock: %d\n", use_rt_clock); } #endif --- The timestamp interval is very small, for example: get_clock: strange, clock: 3302130503505, tmp: 3302130503503 or get_clock: strange, clock: 2761577819846455, tmp: 2761577819846395 I also tried to use CLOCK_MONOTONIC_RAW, but there's still the reversal issue. Thanks, Zhao
On Tue, May 14, 2024, 03:57 Zhao Liu <zhao1.liu@intel.com> wrote: > Hi Stefan, > > > QEMU uses clock_gettime(CLOCK_MONOTONIC) on Linux hosts. The man page > > says: > > > > All CLOCK_MONOTONIC variants guarantee that the time returned by > > consecutive calls will not go backwards, but successive calls > > may—depending on the architecture—return identical (not-in‐ > > creased) time values. > > > > trace_record_start() calls clock_gettime(CLOCK_MONOTONIC) so trace events > > should have monotonically increasing timestamps. > > > > I don't see a scenario where trace record A's timestamp is greater than > > trace record B's timestamp unless the clock is non-monotonic. > > > > Which host CPU architecture and operating system are you running? > > I tested on these 2 machines: > * CML (intel 10th) with Ubuntu 22.04 + kernel v6.5.0-28 > * MTL (intel 14th) with Ubuntu 22.04.2 + kernel v6.9.0 > > > Please attach to the QEMU process with gdb and print out the value of > > the use_rt_clock variable or add a printf in init_get_clock(). The value > > should be 1. > > Thanks, on both above machines, use_rt_clock is 1 and there're both > timestamp reversal issues with the following debug print: > > diff --git a/include/qemu/timer.h b/include/qemu/timer.h > index 9a366e551fb3..7657785c27dc 100644 > --- a/include/qemu/timer.h > +++ b/include/qemu/timer.h > @@ -831,10 +831,17 @@ extern int use_rt_clock; > > static inline int64_t get_clock(void) > { > + static int64_t clock = 0; > Please try with a thread local variable (__thread) to check whether this happens within a single thread. If it only happens with a global variable then we'd need to look more closely at race conditions in the patch below. I don't think the patch is a reliable way to detect non-monotonic timestamps in a multi-threaded program. if (use_rt_clock) { > struct timespec ts; > clock_gettime(CLOCK_MONOTONIC, &ts); > - return ts.tv_sec * 1000000000LL + ts.tv_nsec; > + int64_t tmp = ts.tv_sec * 1000000000LL + ts.tv_nsec; > + if (tmp <= clock) { > + printf("get_clock: strange, clock: %ld, tmp: %ld\n", clock, > tmp); > + } > + assert(tmp > clock); > + clock = tmp; > + return clock; > } else { > /* XXX: using gettimeofday leads to problems if the date > changes, so it should be avoided. */ > diff --git a/util/qemu-timer-common.c b/util/qemu-timer-common.c > index cc1326f72646..3bf06eb4a4ce 100644 > --- a/util/qemu-timer-common.c > +++ b/util/qemu-timer-common.c > @@ -59,5 +59,6 @@ static void __attribute__((constructor)) > init_get_clock(void) > use_rt_clock = 1; > } > clock_start = get_clock(); > + printf("init_get_clock: use_rt_clock: %d\n", use_rt_clock); > } > #endif > > --- > The timestamp interval is very small, for example: > get_clock: strange, clock: 3302130503505, tmp: 3302130503503 > > or > > get_clock: strange, clock: 2761577819846455, tmp: 2761577819846395 > > I also tried to use CLOCK_MONOTONIC_RAW, but there's still the reversal > issue. > > Thanks, > Zhao > >
On 8/5/24 06:32, Zhao Liu wrote: > In some trace log, there're unstable timestamp breaking temporal > ordering of trace records. For example: > > kvm_run_exit -0.015 pid=3289596 cpu_index=0x0 reason=0x6 > kvm_vm_ioctl -0.020 pid=3289596 type=0xffffffffc008ae67 arg=0x7ffeefb5aa60 > kvm_vm_ioctl -0.021 pid=3289596 type=0xffffffffc008ae67 arg=0x7ffeefb5aa60 > > Negative delta intervals tend to get drowned in the massive trace logs, > and an unstable timestamp can corrupt the calculation of intervals > between two events adjacent to it. > > Therefore, mark the outputs with unstable timestamps as WARN like: > > WARN: skip unstable timestamp: kvm_run_exit cur(8497404907761146)-pre(8497404907761161) pid=3289596 cpu_index=0x0 reason=0x6 > WARN: skip unstable timestamp: kvm_vm_ioctl cur(8497404908603653)-pre(8497404908603673) pid=3289596 type=0xffffffffc008ae67 arg=0x7ffeefb5aa60 > WARN: skip unstable timestamp: kvm_vm_ioctl cur(8497404908625787)-pre(8497404908625808) pid=3289596 type=0xffffffffc008ae67 arg=0x7ffeefb5aa60 > > This would help to identify unusual events. > > And skip them without updating Formatter2.last_timestamp_ns to avoid > time back. Can't we reorder them instead? > Signed-off-by: Zhao Liu <zhao1.liu@intel.com> > --- > scripts/simpletrace.py | 11 +++++++++++ > 1 file changed, 11 insertions(+) > > diff --git a/scripts/simpletrace.py b/scripts/simpletrace.py > index cef81b0707f0..23911dddb8a6 100755 > --- a/scripts/simpletrace.py > +++ b/scripts/simpletrace.py > @@ -343,6 +343,17 @@ def __init__(self): > def catchall(self, *rec_args, event, timestamp_ns, pid, event_id): > if self.last_timestamp_ns is None: > self.last_timestamp_ns = timestamp_ns > + > + if timestamp_ns < self.last_timestamp_ns: > + fields = [ > + f'{name}={r}' if is_string(type) else f'{name}=0x{r:x}' > + for r, (type, name) in zip(rec_args, event.args) > + ] > + print(f'WARN: skip unstable timestamp: {event.name} ' > + f'cur({timestamp_ns})-pre({self.last_timestamp_ns}) {pid=} ' + > + f' '.join(fields)) > + return > + > delta_ns = timestamp_ns - self.last_timestamp_ns > self.last_timestamp_ns = timestamp_ns >
Hi Philippe, On Wed, May 08, 2024 at 03:09:39PM +0200, Philippe Mathieu-Daudé wrote: > Date: Wed, 8 May 2024 15:09:39 +0200 > From: Philippe Mathieu-Daudé <philmd@linaro.org> > Subject: Re: [PATCH] scripts/simpletrace: Mark output with unstable > timestamp as WARN > > On 8/5/24 06:32, Zhao Liu wrote: > > In some trace log, there're unstable timestamp breaking temporal > > ordering of trace records. For example: > > > > kvm_run_exit -0.015 pid=3289596 cpu_index=0x0 reason=0x6 > > kvm_vm_ioctl -0.020 pid=3289596 type=0xffffffffc008ae67 arg=0x7ffeefb5aa60 > > kvm_vm_ioctl -0.021 pid=3289596 type=0xffffffffc008ae67 arg=0x7ffeefb5aa60 > > > > Negative delta intervals tend to get drowned in the massive trace logs, > > and an unstable timestamp can corrupt the calculation of intervals > > between two events adjacent to it. > > > > Therefore, mark the outputs with unstable timestamps as WARN like: > > > > WARN: skip unstable timestamp: kvm_run_exit cur(8497404907761146)-pre(8497404907761161) pid=3289596 cpu_index=0x0 reason=0x6 > > WARN: skip unstable timestamp: kvm_vm_ioctl cur(8497404908603653)-pre(8497404908603673) pid=3289596 type=0xffffffffc008ae67 arg=0x7ffeefb5aa60 > > WARN: skip unstable timestamp: kvm_vm_ioctl cur(8497404908625787)-pre(8497404908625808) pid=3289596 type=0xffffffffc008ae67 arg=0x7ffeefb5aa60 > > > > This would help to identify unusual events. > > > > And skip them without updating Formatter2.last_timestamp_ns to avoid > > time back. > > Can't we reorder them instead? I think so...IIUC, when the current event with "wrong" timestamp is detected, its previous events records have already been output and cannot be reordered. Regards, Zhao
On 8/5/24 15:55, Zhao Liu wrote: > Hi Philippe, > > On Wed, May 08, 2024 at 03:09:39PM +0200, Philippe Mathieu-Daudé wrote: >> Date: Wed, 8 May 2024 15:09:39 +0200 >> From: Philippe Mathieu-Daudé <philmd@linaro.org> >> Subject: Re: [PATCH] scripts/simpletrace: Mark output with unstable >> timestamp as WARN >> >> On 8/5/24 06:32, Zhao Liu wrote: >>> In some trace log, there're unstable timestamp breaking temporal >>> ordering of trace records. For example: >>> >>> kvm_run_exit -0.015 pid=3289596 cpu_index=0x0 reason=0x6 >>> kvm_vm_ioctl -0.020 pid=3289596 type=0xffffffffc008ae67 arg=0x7ffeefb5aa60 >>> kvm_vm_ioctl -0.021 pid=3289596 type=0xffffffffc008ae67 arg=0x7ffeefb5aa60 >>> >>> Negative delta intervals tend to get drowned in the massive trace logs, >>> and an unstable timestamp can corrupt the calculation of intervals >>> between two events adjacent to it. >>> >>> Therefore, mark the outputs with unstable timestamps as WARN like: >>> >>> WARN: skip unstable timestamp: kvm_run_exit cur(8497404907761146)-pre(8497404907761161) pid=3289596 cpu_index=0x0 reason=0x6 >>> WARN: skip unstable timestamp: kvm_vm_ioctl cur(8497404908603653)-pre(8497404908603673) pid=3289596 type=0xffffffffc008ae67 arg=0x7ffeefb5aa60 >>> WARN: skip unstable timestamp: kvm_vm_ioctl cur(8497404908625787)-pre(8497404908625808) pid=3289596 type=0xffffffffc008ae67 arg=0x7ffeefb5aa60 >>> >>> This would help to identify unusual events. >>> >>> And skip them without updating Formatter2.last_timestamp_ns to avoid >>> time back. >> >> Can't we reorder them instead? > > I think so...IIUC, when the current event with "wrong" timestamp is > detected, its previous events records have already been output and > cannot be reordered. Well, this certainly need rework to cleverly handle out of sync events, reordering them to display them correctly. Skipping events really sounds a bad idea to me. Even more out of sync ones. Some simple approach could be to queue up to N events, then sort them and unqueue N/2, while processing all events. Regards, Phil.
On Wed, May 08, 2024 at 04:23:00PM +0200, Philippe Mathieu-Daudé wrote: > Date: Wed, 8 May 2024 16:23:00 +0200 > From: Philippe Mathieu-Daudé <philmd@linaro.org> > Subject: Re: [PATCH] scripts/simpletrace: Mark output with unstable > timestamp as WARN > > On 8/5/24 15:55, Zhao Liu wrote: > > Hi Philippe, > > > > On Wed, May 08, 2024 at 03:09:39PM +0200, Philippe Mathieu-Daudé wrote: > > > Date: Wed, 8 May 2024 15:09:39 +0200 > > > From: Philippe Mathieu-Daudé <philmd@linaro.org> > > > Subject: Re: [PATCH] scripts/simpletrace: Mark output with unstable > > > timestamp as WARN > > > > > > On 8/5/24 06:32, Zhao Liu wrote: > > > > In some trace log, there're unstable timestamp breaking temporal > > > > ordering of trace records. For example: > > > > > > > > kvm_run_exit -0.015 pid=3289596 cpu_index=0x0 reason=0x6 > > > > kvm_vm_ioctl -0.020 pid=3289596 type=0xffffffffc008ae67 arg=0x7ffeefb5aa60 > > > > kvm_vm_ioctl -0.021 pid=3289596 type=0xffffffffc008ae67 arg=0x7ffeefb5aa60 > > > > > > > > Negative delta intervals tend to get drowned in the massive trace logs, > > > > and an unstable timestamp can corrupt the calculation of intervals > > > > between two events adjacent to it. > > > > > > > > Therefore, mark the outputs with unstable timestamps as WARN like: > > > > > > > > WARN: skip unstable timestamp: kvm_run_exit cur(8497404907761146)-pre(8497404907761161) pid=3289596 cpu_index=0x0 reason=0x6 > > > > WARN: skip unstable timestamp: kvm_vm_ioctl cur(8497404908603653)-pre(8497404908603673) pid=3289596 type=0xffffffffc008ae67 arg=0x7ffeefb5aa60 > > > > WARN: skip unstable timestamp: kvm_vm_ioctl cur(8497404908625787)-pre(8497404908625808) pid=3289596 type=0xffffffffc008ae67 arg=0x7ffeefb5aa60 > > > > > > > > This would help to identify unusual events. > > > > > > > > And skip them without updating Formatter2.last_timestamp_ns to avoid > > > > time back. > > > > > > Can't we reorder them instead? > > > > I think so...IIUC, when the current event with "wrong" timestamp is > > detected, its previous events records have already been output and > > cannot be reordered. > > Well, this certainly need rework to cleverly handle out of sync > events, reordering them to display them correctly. > > Skipping events really sounds a bad idea to me. Even more out > of sync ones. > > Some simple approach could be to queue up to N events, then sort them > and unqueue N/2, while processing all events. Reordering might be possible, but as Stefan replied, some psitive value might also be problematic too. Maybe my negative value check is not enough in itself... Thanks, Zhao
© 2016 - 2024 Red Hat, Inc.