[PATCH 01/17] replay: Fix migration use of clock for statistics

Nicholas Piggin posted 17 patches 3 days, 13 hours ago
Posted by Nicholas Piggin 3 days, 13 hours ago
Migration reads CLOCK_HOST when not holding the replay_mutex, which
asserts when recording a trace. These are not guest visible so should
be CLOCK_REALTIME like other statistics in MigrationState, which do
not require the replay_mutex.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 migration/migration.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/migration/migration.c b/migration/migration.c
index 8c5bd0a75c8..2eb9e50a263 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -3433,7 +3433,7 @@ static void *migration_thread(void *opaque)
 {
     MigrationState *s = opaque;
     MigrationThread *thread = NULL;
-    int64_t setup_start = qemu_clock_get_ms(QEMU_CLOCK_HOST);
+    int64_t setup_start = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
     MigThrError thr_error;
     bool urgent = false;
     Error *local_err = NULL;
@@ -3504,7 +3504,7 @@ static void *migration_thread(void *opaque)
         goto out;
     }
 
-    s->setup_time = qemu_clock_get_ms(QEMU_CLOCK_HOST) - setup_start;
+    s->setup_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME) - setup_start;
 
     trace_migration_thread_setup_complete();
 
@@ -3584,7 +3584,7 @@ static void *bg_migration_thread(void *opaque)
 
     migration_rate_set(RATE_LIMIT_DISABLED);
 
-    setup_start = qemu_clock_get_ms(QEMU_CLOCK_HOST);
+    setup_start = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
     /*
      * We want to save vmstate for the moment when migration has been
      * initiated but also we want to save RAM content while VM is running.
@@ -3629,7 +3629,7 @@ static void *bg_migration_thread(void *opaque)
         goto fail_setup;
     }
 
-    s->setup_time = qemu_clock_get_ms(QEMU_CLOCK_HOST) - setup_start;
+    s->setup_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME) - setup_start;
 
     trace_migration_thread_setup_complete();
 
-- 
2.45.2
Re: [PATCH 01/17] replay: Fix migration use of clock for statistics
Posted by Peter Xu 3 days, 7 hours ago
On Fri, Dec 20, 2024 at 08:42:03PM +1000, Nicholas Piggin wrote:
> Migration reads CLOCK_HOST when not holding the replay_mutex, which
> asserts when recording a trace. These are not guest visible so should
> be CLOCK_REALTIME like other statistics in MigrationState, which do
> not require the replay_mutex.

Irrespective of the change, should we document these locking implications
in timer.h?


-- 
Peter Xu
Re: [PATCH 01/17] replay: Fix migration use of clock for statistics
Posted by Nicholas Piggin 2 days, 20 hours ago
On Sat Dec 21, 2024 at 2:31 AM AEST, Peter Xu wrote:
> On Fri, Dec 20, 2024 at 08:42:03PM +1000, Nicholas Piggin wrote:
> > Migration reads CLOCK_HOST when not holding the replay_mutex, which
> > asserts when recording a trace. These are not guest visible so should
> > be CLOCK_REALTIME like other statistics in MigrationState, which do
> > not require the replay_mutex.
>
> Irrespective of the change, should we document these locking implications
> in timer.h?

I guess the intention was to avoid callers caring too much
about replay internals, so I'm not sure whether that will help or
hinder understanding :(

I think the big rule is something like "if it affects guest state,
then you must use HOST or VIRTUAL*, if it does not affect guest state
then you must use REALTIME". record-replay code then takes care of
replay mutex locking.
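
To make that rule of thumb concrete, here is a minimal sketch in plain C.
It is not QEMU code: the enums and sketch_pick_clock() are hypothetical
stand-ins invented for illustration, and the HOST/VIRTUAL split between an
emulated RTC and a device timer is an assumption about typical usage. It
only encodes the rule as stated above, which may itself need refining.

```c
#include <assert.h>

/* Hypothetical stand-ins for QEMU's clock types (not the timer.h enum). */
typedef enum {
    SKETCH_CLOCK_REALTIME,  /* host-side only, never seen by the guest */
    SKETCH_CLOCK_VIRTUAL,   /* guest time, recorded/replayed */
    SKETCH_CLOCK_HOST,      /* host wall clock, recorded/replayed */
} SketchClockType;

/* Hypothetical classification of what a time reading is used for. */
typedef enum {
    SKETCH_USE_STATISTIC,        /* e.g. migration setup_time */
    SKETCH_USE_GUEST_WALLCLOCK,  /* e.g. an emulated RTC (assumption) */
    SKETCH_USE_GUEST_TIMER,      /* e.g. an emulated device timer */
} SketchClockUse;

/*
 * The rule from this mail: readings that affect guest state use HOST or
 * VIRTUAL; readings that do not affect guest state use REALTIME.
 */
static SketchClockType sketch_pick_clock(SketchClockUse use)
{
    switch (use) {
    case SKETCH_USE_GUEST_WALLCLOCK:
        return SKETCH_CLOCK_HOST;
    case SKETCH_USE_GUEST_TIMER:
        return SKETCH_CLOCK_VIRTUAL;
    case SKETCH_USE_STATISTIC:
    default:
        return SKETCH_CLOCK_REALTIME;
    }
}
```

Under this sketch the setup_time measurement in migration_thread() falls
in the SKETCH_USE_STATISTIC bucket, which is why the patch moves it to
REALTIME.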

It does get a little fuzzy around the edges in code that is somewhat
aware of record-replay, though, like migration/snapshots.

(Pavel, please correct me if I've been saying the wrong things)

Thanks,
Nick

Re: [PATCH 01/17] replay: Fix migration use of clock for statistics
Posted by Peter Xu 6 hours ago
On Sat, Dec 21, 2024 at 01:02:01PM +1000, Nicholas Piggin wrote:
> On Sat Dec 21, 2024 at 2:31 AM AEST, Peter Xu wrote:
> > On Fri, Dec 20, 2024 at 08:42:03PM +1000, Nicholas Piggin wrote:
> > > Migration reads CLOCK_HOST when not holding the replay_mutex, which
> > > asserts when recording a trace. These are not guest visible so should
> > > be CLOCK_REALTIME like other statistics in MigrationState, which do
> > > not require the replay_mutex.
> >
> > Irrespective of the change, should we document these locking implications
> > in timer.h?
> 
> I guess the intention was to avoid callers caring too much
> about replay internals, so I'm not sure whether that will help or
> hinder understanding :(

CLOCK_HOST should be the wall clock in QEMU, IIUC.  If any QEMU caller that
reads the host wall clock needs some mutex to be held, then I don't see how
we can avoid mentioning it.  It is indeed odd if we need to take a
feature-specific mutex just to read the wall clock, but maybe I misread the
context somewhere.

> 
> I think the big rule is something like "if it affects guest state,
> then you must use HOST or VIRTUAL*, if it does not affect guest state

Logically, the HOST clock shouldn't be relevant to guest state, should it?

> then you must use REALTIME". record-replay code then takes care of
> replay mutex locking.
> 
> It does get a little fuzzy around the edges in code that is somewhat
> aware of record-replay, though, like migration/snapshots.

That said, I agree with the change itself: a measurement like this should
not involve NTP at all, which HOST / gettimeofday() will, but REALTIME
won't.  However, this patch doesn't seem to be for that purpose, so I'd
like to double check.
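
For illustration, the NTP point can be sketched with plain POSIX calls
outside QEMU. The assumption here, as far as I understand the timer code,
is that QEMU_CLOCK_REALTIME is backed by a monotonic host clock while
QEMU_CLOCK_HOST follows the wall clock; note that POSIX's CLOCK_REALTIME
is the wall clock, so the names are confusingly crossed. The sketch_*
helpers are hypothetical names, not QEMU functions.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <stdint.h>
#include <time.h>

/* Convert a timespec to whole milliseconds. */
static int64_t sketch_timespec_to_ms(const struct timespec *ts)
{
    return (int64_t)ts->tv_sec * 1000 + ts->tv_nsec / 1000000;
}

/* Monotonic time: does not jump when the host wall clock is stepped. */
static int64_t sketch_monotonic_ms(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return sketch_timespec_to_ms(&ts);
}

/* Wall-clock time: can jump forwards or backwards if NTP steps it. */
static int64_t sketch_wallclock_ms(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_REALTIME, &ts);
    return sketch_timespec_to_ms(&ts);
}

/* Elapsed interval, measured the way a setup_time statistic would be. */
static int64_t sketch_elapsed_ms(int64_t start_ms)
{
    return sketch_monotonic_ms() - start_ms;
}
```

An interval taken as the difference of two sketch_wallclock_ms() readings
could come out negative or inflated if the wall clock is stepped in
between; the monotonic variant cannot go backwards.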

Thanks,

-- 
Peter Xu