[PATCH V4 06/11] migration: preserve cpu ticks if suspended

Steve Sistare posted 11 patches 2 years, 5 months ago
Maintainers: Stefan Berger <stefanb@linux.vnet.ibm.com>, "Alex Bennée" <alex.bennee@linaro.org>, "Philippe Mathieu-Daudé" <philmd@linaro.org>, Gerd Hoffmann <kraxel@redhat.com>, Stefano Stabellini <sstabellini@kernel.org>, Anthony Perard <anthony.perard@citrix.com>, Paul Durrant <paul@xen.org>, Paolo Bonzini <pbonzini@redhat.com>, Juan Quintela <quintela@redhat.com>, Peter Xu <peterx@redhat.com>, Leonardo Bras <leobras@redhat.com>, Richard Henderson <richard.henderson@linaro.org>, Thomas Huth <thuth@redhat.com>, Laurent Vivier <lvivier@redhat.com>
There is a newer version of this series
[PATCH V4 06/11] migration: preserve cpu ticks if suspended
Posted by Steve Sistare 2 years, 5 months ago
During RUN_STATE_SUSPENDED, the cpu clock remains enabled, so the
timers_state saved to the migration stream is stale, causing time errors
in the guest when it wakes from suspend.

To fix, maintain a shadow copy of timers_state, and update the shadow in a
pre_save handler that uses the same logic found in cpu_disable_ticks.  Copy
the shadow to timers_state in post_load.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
---
 softmmu/cpu-timers.c | 36 +++++++++++++++++++++++++++++++++++-
 1 file changed, 35 insertions(+), 1 deletion(-)

diff --git a/softmmu/cpu-timers.c b/softmmu/cpu-timers.c
index 117408c..d5af317 100644
--- a/softmmu/cpu-timers.c
+++ b/softmmu/cpu-timers.c
@@ -157,6 +157,36 @@ static bool icount_shift_state_needed(void *opaque)
     return icount_enabled() == 2;
 }
 
+static int cpu_pre_save_ticks(void *opaque)
+{
+    TimersState *t = &timers_state;
+    TimersState *snap = opaque;
+
+    seqlock_write_lock(&t->vm_clock_seqlock, &t->vm_clock_lock);
+
+    if (t->cpu_ticks_enabled) {
+        snap->cpu_ticks_offset = t->cpu_ticks_offset + cpu_get_host_ticks();
+        snap->cpu_clock_offset = cpu_get_clock_locked();
+    } else {
+        snap->cpu_ticks_offset = t->cpu_ticks_offset;
+        snap->cpu_clock_offset = t->cpu_clock_offset;
+    }
+    seqlock_write_unlock(&t->vm_clock_seqlock, &t->vm_clock_lock);
+    return 0;
+}
+
+static int cpu_post_load_ticks(void *opaque, int version_id)
+{
+    TimersState *t = &timers_state;
+    TimersState *snap = opaque;
+
+    seqlock_write_lock(&t->vm_clock_seqlock, &t->vm_clock_lock);
+    t->cpu_ticks_offset = snap->cpu_ticks_offset;
+    t->cpu_clock_offset = snap->cpu_clock_offset;
+    seqlock_write_unlock(&t->vm_clock_seqlock, &t->vm_clock_lock);
+    return 0;
+}
+
 /*
  * Subsection for warp timer migration is optional, because may not be created
  */
@@ -221,6 +251,8 @@ static const VMStateDescription vmstate_timers = {
     .name = "timer",
     .version_id = 2,
     .minimum_version_id = 1,
+    .pre_save = cpu_pre_save_ticks,
+    .post_load = cpu_post_load_ticks,
     .fields = (VMStateField[]) {
         VMSTATE_INT64(cpu_ticks_offset, TimersState),
         VMSTATE_UNUSED(8),
@@ -269,9 +301,11 @@ TimersState timers_state;
 /* initialize timers state and the cpu throttle for convenience */
 void cpu_timers_init(void)
 {
+    static TimersState timers_snapshot;
+
     seqlock_init(&timers_state.vm_clock_seqlock);
     qemu_spin_init(&timers_state.vm_clock_lock);
-    vmstate_register(NULL, 0, &vmstate_timers, &timers_state);
+    vmstate_register(NULL, 0, &vmstate_timers, &timers_snapshot);
 
     cpu_throttle_init();
 }
-- 
1.8.3.1
Re: [PATCH V4 06/11] migration: preserve cpu ticks if suspended
Posted by Peter Xu 2 years, 5 months ago
On Tue, Aug 29, 2023 at 11:18:01AM -0700, Steve Sistare wrote:
> During RUN_STATE_SUSPENDED, the cpu clock remains enabled, so the
> timers_state saved to the migration stream is stale, causing time errors
> in the guest when it wakes from suspend.

Instead of having this, I'm wondering whether we should just let:

        ret = vm_stop_force_state(RUN_STATE_FINISH_MIGRATE);

stop the vm for suspended too - I think we reached a consensus that
SUSPENDED should be treated the same as running here (except the vcpu
beingg running or not).

So the more risky change is we should make runstate_is_running() cover
SUSPENDED, but of course that again can affect many other call sites.. and
I'm not sure whether it's 100% working everywhere.

I think I mentioned the other "easier" way, which is to modify
vm_stop_force_state() to take suspended:

int vm_stop_force_state(RunState state)
 {
-    if (runstate_is_running()) {
+    if (runstate_is_running() || runstate_is_suspended()) {
         return vm_stop(state);

That resides in cpus.c but it really only affects migration, so much less
risky.  Do you think this should be the better (and correct) way to go?

-- 
Peter Xu
Re: [PATCH V4 06/11] migration: preserve cpu ticks if suspended
Posted by Steven Sistare 2 years, 2 months ago
On 8/30/2023 12:47 PM, Peter Xu wrote:
> On Tue, Aug 29, 2023 at 11:18:01AM -0700, Steve Sistare wrote:
>> During RUN_STATE_SUSPENDED, the cpu clock remains enabled, so the
>> timers_state saved to the migration stream is stale, causing time errors
>> in the guest when it wakes from suspend.
> 
> Instead of having this, I'm wondering whether we should just let:
> 
>         ret = vm_stop_force_state(RUN_STATE_FINISH_MIGRATE);
> 
> stop the vm for suspended too - I think we reached a consensus that
> SUSPENDED should be treated the same as running here (except the vcpu
> beingg running or not).
> 
> So the more risky change is we should make runstate_is_running() cover
> SUSPENDED, but of course that again can affect many other call sites.. and
> I'm not sure whether it's 100% working everywhere.
> 
> I think I mentioned the other "easier" way, which is to modify
> vm_stop_force_state() to take suspended:
> 
> int vm_stop_force_state(RunState state)
>  {
> -    if (runstate_is_running()) {
> +    if (runstate_is_running() || runstate_is_suspended()) {
>          return vm_stop(state);
> 
> That resides in cpus.c but it really only affects migration, so much less
> risky.  Do you think this should be the better (and correct) way to go?

Agreed, good idea, done in V5 - steve
Re: [PATCH V4 06/11] migration: preserve cpu ticks if suspended
Posted by Steven Sistare 2 years, 5 months ago
On 8/30/2023 12:47 PM, Peter Xu wrote:
> On Tue, Aug 29, 2023 at 11:18:01AM -0700, Steve Sistare wrote:
>> During RUN_STATE_SUSPENDED, the cpu clock remains enabled, so the
>> timers_state saved to the migration stream is stale, causing time errors
>> in the guest when it wakes from suspend.
> 
> Instead of having this, I'm wondering whether we should just let:
> 
>         ret = vm_stop_force_state(RUN_STATE_FINISH_MIGRATE);
> 
> stop the vm for suspended too - I think we reached a consensus that
> SUSPENDED should be treated the same as running here (except the vcpu
> beingg running or not).
> 
> So the more risky change is we should make runstate_is_running() cover
> SUSPENDED, but of course that again can affect many other call sites.. and
> I'm not sure whether it's 100% working everywhere.
> 
> I think I mentioned the other "easier" way, which is to modify
> vm_stop_force_state() to take suspended:
> 
> int vm_stop_force_state(RunState state)
>  {
> -    if (runstate_is_running()) {
> +    if (runstate_is_running() || runstate_is_suspended()) {
>          return vm_stop(state);
> 
> That resides in cpus.c but it really only affects migration, so much less
> risky.  Do you think this should be the better (and correct) way to go?

Hi Peter, thanks for your careful review.  I have some other work to do, but
I will get back to this soon.

- Steve