[PATCH v4 22/24] target/ppc: fix timebase register reset state

Nicholas Piggin posted 24 patches 1 year, 9 months ago
Maintainers: Richard Henderson <richard.henderson@linaro.org>, Paolo Bonzini <pbonzini@redhat.com>, "Marc-André Lureau" <marcandre.lureau@redhat.com>, Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>, "Michael S. Tsirkin" <mst@redhat.com>, Jason Wang <jasowang@redhat.com>, Nicholas Piggin <npiggin@gmail.com>, Daniel Henrique Barboza <danielhb413@gmail.com>, "Cédric Le Goater" <clg@kaod.org>, David Gibson <david@gibson.dropbear.id.au>, Harsh Prateek Bora <harshpb@linux.ibm.com>, Peter Xu <peterx@redhat.com>, Fabiano Rosas <farosas@suse.de>, John Snow <jsnow@redhat.com>, Cleber Rosa <crosa@redhat.com>, "Philippe Mathieu-Daudé" <philmd@linaro.org>, Wainer dos Santos Moschetta <wainersm@redhat.com>, Beraldo Leal <bleal@redhat.com>
There is a newer version of this series
[PATCH v4 22/24] target/ppc: fix timebase register reset state
Posted by Nicholas Piggin 1 year, 9 months ago
(H)DEC and PURR get reset before icount does, which causes them to be
skewed and not match the init state. This can cause replay to not
match the recorded trace exactly. For DEC and HDEC this is usually not
noticeable since they tend to get programmed before affecting the
target machine. PURR has been observed to cause replay bugs when
running Linux.

Fix this by resetting using a time of 0.

Cc: qemu-ppc@nongnu.org
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 hw/ppc/ppc.c         | 11 ++++++++---
 target/ppc/machine.c |  4 ++++
 2 files changed, 12 insertions(+), 3 deletions(-)
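
The skew described in the commit message can be illustrated with a toy
model. This is only a sketch, not QEMU code: toy_counter, toy_store and
toy_load are hypothetical names, and the model assumes a free-running
register is kept as an offset from the clock value passed in as "now".

```c
#include <stdint.h>

/* Hypothetical toy model of a free-running register such as PURR. */
struct toy_counter {
    int64_t offset; /* register value minus the clock at store time */
};

/* Store 'value' into the register, using 'now' as the current clock. */
static void toy_store(struct toy_counter *c, int64_t now, int64_t value)
{
    c->offset = value - now;
}

/* Read the register back at clock 'now'. */
static int64_t toy_load(const struct toy_counter *c, int64_t now)
{
    return c->offset + now;
}
```

In this model, storing 0 while the clock still reads 1000 and then
reading after the clock has gone back to 0 yields -1000 rather than 0.
Storing with an explicit now of 0 gives a read of 0 at clock 0, which
is the consistent reset state the patch is after.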

diff --git a/hw/ppc/ppc.c b/hw/ppc/ppc.c
index fadb8f5239..f22321779e 100644
--- a/hw/ppc/ppc.c
+++ b/hw/ppc/ppc.c
@@ -1112,16 +1112,21 @@ void cpu_ppc_tb_reset(CPUPPCState *env)
         timer_del(tb_env->hdecr_timer);
         ppc_set_irq(cpu, PPC_INTERRUPT_HDECR, 0);
         tb_env->hdecr_next = 0;
+        _cpu_ppc_store_hdecr(cpu, 0, 0, 0, 64);
     }
 
     /*
      * There is a bug in Linux 2.4 kernels:
      * if a decrementer exception is pending when it enables msr_ee at startup,
      * it's not ready to handle it...
+     *
+     * On machine reset, this is called before icount is reset, so
+     * for icount-mode, setting TB registers using now=qemu_clock_get_ns
+     * results in them being skewed when icount does get reset. Use an
+     * explicit 0 to get a consistent reset state.
      */
-    cpu_ppc_store_decr(env, -1);
-    cpu_ppc_store_hdecr(env, -1);
-    cpu_ppc_store_purr(env, 0x0000000000000000ULL);
+    _cpu_ppc_store_decr(cpu, 0, 0, -1, 64);
+    _cpu_ppc_store_purr(env, 0, 0);
 }
 
 void cpu_ppc_tb_free(CPUPPCState *env)
diff --git a/target/ppc/machine.c b/target/ppc/machine.c
index 203fe28e01..4c4294eafe 100644
--- a/target/ppc/machine.c
+++ b/target/ppc/machine.c
@@ -215,6 +215,8 @@ static int cpu_pre_save(void *opaque)
          * it here.
          */
         env->spr[SPR_DECR] = cpu_ppc_load_decr(env);
+        printf("cpu_ppc_pre_save  TB:0x%016lx\n", cpu_ppc_load_tbl(env));
+        printf("cpu_ppc_pre_save DEC:0x%016lx\n", cpu_ppc_load_decr(env));
     }
 
     return 0;
@@ -333,6 +335,8 @@ static int cpu_post_load(void *opaque, int version_id)
          * triggered types (including HDEC) would need to carry more state.
          */
         cpu_ppc_store_decr(env, env->spr[SPR_DECR]);
+        printf("cpu_ppc_post_ld   TB:0x%016lx\n", cpu_ppc_load_tbl(env));
+        printf("cpu_ppc_post_ld  DEC:0x%016lx\n", cpu_ppc_load_decr(env));
         pmu_mmcr01_updated(env);
     }
 
-- 
2.42.0
Re: [PATCH v4 22/24] target/ppc: fix timebase register reset state
Posted by Alex Bennée 1 year, 9 months ago
Nicholas Piggin <npiggin@gmail.com> writes:

> (H)DEC and PURR get reset before icount does, which causes them to be
> skewed and not match the init state. This can cause replay to not
> match the recorded trace exactly. For DEC and HDEC this is usually not
> noticeable since they tend to get programmed before affecting the
> target machine. PURR has been observed to cause replay bugs when
> running Linux.
>
> Fix this by resetting using a time of 0.
>
> Cc: qemu-ppc@nongnu.org
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
>  hw/ppc/ppc.c         | 11 ++++++++---
>  target/ppc/machine.c |  4 ++++
>  2 files changed, 12 insertions(+), 3 deletions(-)
>
> diff --git a/hw/ppc/ppc.c b/hw/ppc/ppc.c
> index fadb8f5239..f22321779e 100644
> --- a/hw/ppc/ppc.c
> +++ b/hw/ppc/ppc.c
> @@ -1112,16 +1112,21 @@ void cpu_ppc_tb_reset(CPUPPCState *env)
>          timer_del(tb_env->hdecr_timer);
>          ppc_set_irq(cpu, PPC_INTERRUPT_HDECR, 0);
>          tb_env->hdecr_next = 0;
> +        _cpu_ppc_store_hdecr(cpu, 0, 0, 0, 64);
>      }
>  
>      /*
>       * There is a bug in Linux 2.4 kernels:
>       * if a decrementer exception is pending when it enables msr_ee at startup,
>       * it's not ready to handle it...
> +     *
> +     * On machine reset, this is called before icount is reset, so
> +     * for icount-mode, setting TB registers using now=qemu_clock_get_ns
> +     * results in them being skewed when icount does get reset. Use an
> +     * explicit 0 to get a consistent reset state.
>       */
> -    cpu_ppc_store_decr(env, -1);
> -    cpu_ppc_store_hdecr(env, -1);
> -    cpu_ppc_store_purr(env, 0x0000000000000000ULL);
> +    _cpu_ppc_store_decr(cpu, 0, 0, -1, 64);
> +    _cpu_ppc_store_purr(env, 0, 0);
>  }
>  
>  void cpu_ppc_tb_free(CPUPPCState *env)
> diff --git a/target/ppc/machine.c b/target/ppc/machine.c
> index 203fe28e01..4c4294eafe 100644
> --- a/target/ppc/machine.c
> +++ b/target/ppc/machine.c
> @@ -215,6 +215,8 @@ static int cpu_pre_save(void *opaque)
>           * it here.
>           */
>          env->spr[SPR_DECR] = cpu_ppc_load_decr(env);
> +        printf("cpu_ppc_pre_save  TB:0x%016lx\n", cpu_ppc_load_tbl(env));
> +        printf("cpu_ppc_pre_save DEC:0x%016lx\n", cpu_ppc_load_decr(env));

I think this is left over debug, which btw needs proper types:

  ../../target/ppc/machine.c: In function ‘cpu_pre_save’:
  ../../target/ppc/machine.c:219:45: error: format ‘%lx’ expects argument of type ‘long unsigned int’, but argument 2 has type ‘target_ulong’ {aka ‘unsigned int’} [-Werror=format=]
    219 |         printf("cpu_ppc_pre_save DEC:0x%016lx\n", cpu_ppc_load_decr(env));
        |                                        ~~~~~^     ~~~~~~~~~~~~~~~~~~~~~~
        |                                             |     |
        |                                             |     target_ulong {aka unsigned int}
        |                                             long unsigned int
        |                                        %016x
  ../../target/ppc/machine.c: In function ‘cpu_post_load’:
  ../../target/ppc/machine.c:339:45: error: format ‘%lx’ expects argument of type ‘long unsigned int’, but argument 2 has type ‘target_ulong’ {aka ‘unsigned int’} [-Werror=format=]
    339 |         printf("cpu_ppc_post_ld  DEC:0x%016lx\n", cpu_ppc_load_decr(env));
        |                                        ~~~~~^     ~~~~~~~~~~~~~~~~~~~~~~
        |                                             |     |
        |                                             |     target_ulong {aka unsigned int}
        |                                             long unsigned int
        |                                        %016x
  cc1: all warnings being treated as errors
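
For illustration, one portable way to print a value whose width depends
on the target is to widen it to uint64_t and use PRIx64. This is a
minimal sketch under assumptions: toy_target_ulong and format_reg are
hypothetical stand-ins, with toy_target_ulong playing the role of
target_ulong on a 32-bit target.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for target_ulong on a 32-bit target. */
typedef uint32_t toy_target_ulong;

/*
 * Format "name:0x<16 hex digits>" into buf. The value is cast to
 * uint64_t so the PRIx64 conversion matches the argument type on
 * every host and target width.
 */
static int format_reg(char *buf, size_t len, const char *name,
                      toy_target_ulong val)
{
    return snprintf(buf, len, "%s:0x%016" PRIx64, name, (uint64_t)val);
}
```

The cast makes the argument type independent of the target width, so
the same format string compiles cleanly with -Werror=format for both
32-bit and 64-bit targets.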

>      }
>  
>      return 0;
> @@ -333,6 +335,8 @@ static int cpu_post_load(void *opaque, int version_id)
>           * triggered types (including HDEC) would need to carry more state.
>           */
>          cpu_ppc_store_decr(env, env->spr[SPR_DECR]);
> +        printf("cpu_ppc_post_ld   TB:0x%016lx\n", cpu_ppc_load_tbl(env));
> +        printf("cpu_ppc_post_ld  DEC:0x%016lx\n", cpu_ppc_load_decr(env));
>          pmu_mmcr01_updated(env);
>      }

-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro
Re: [PATCH v4 22/24] target/ppc: fix timebase register reset state
Posted by Nicholas Piggin 1 year, 9 months ago
On Tue Mar 12, 2024 at 11:24 PM AEST, Alex Bennée wrote:
> Nicholas Piggin <npiggin@gmail.com> writes:
>
> > (H)DEC and PURR get reset before icount does, which causes them to be
> > skewed and not match the init state. This can cause replay to not
> > match the recorded trace exactly. For DEC and HDEC this is usually not
> > noticeable since they tend to get programmed before affecting the
> > target machine. PURR has been observed to cause replay bugs when
> > running Linux.
> >
> > Fix this by resetting using a time of 0.
> >
> > Cc: qemu-ppc@nongnu.org
> > Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> > ---
> >  hw/ppc/ppc.c         | 11 ++++++++---
> >  target/ppc/machine.c |  4 ++++
> >  2 files changed, 12 insertions(+), 3 deletions(-)
> >
> > diff --git a/hw/ppc/ppc.c b/hw/ppc/ppc.c
> > index fadb8f5239..f22321779e 100644
> > --- a/hw/ppc/ppc.c
> > +++ b/hw/ppc/ppc.c
> > @@ -1112,16 +1112,21 @@ void cpu_ppc_tb_reset(CPUPPCState *env)
> >          timer_del(tb_env->hdecr_timer);
> >          ppc_set_irq(cpu, PPC_INTERRUPT_HDECR, 0);
> >          tb_env->hdecr_next = 0;
> > +        _cpu_ppc_store_hdecr(cpu, 0, 0, 0, 64);
> >      }
> >  
> >      /*
> >       * There is a bug in Linux 2.4 kernels:
> >       * if a decrementer exception is pending when it enables msr_ee at startup,
> >       * it's not ready to handle it...
> > +     *
> > +     * On machine reset, this is called before icount is reset, so
> > +     * for icount-mode, setting TB registers using now=qemu_clock_get_ns
> > +     * results in them being skewed when icount does get reset. Use an
> > +     * explicit 0 to get a consistent reset state.
> >       */
> > -    cpu_ppc_store_decr(env, -1);
> > -    cpu_ppc_store_hdecr(env, -1);
> > -    cpu_ppc_store_purr(env, 0x0000000000000000ULL);
> > +    _cpu_ppc_store_decr(cpu, 0, 0, -1, 64);
> > +    _cpu_ppc_store_purr(env, 0, 0);
> >  }
> >  
> >  void cpu_ppc_tb_free(CPUPPCState *env)
> > diff --git a/target/ppc/machine.c b/target/ppc/machine.c
> > index 203fe28e01..4c4294eafe 100644
> > --- a/target/ppc/machine.c
> > +++ b/target/ppc/machine.c
> > @@ -215,6 +215,8 @@ static int cpu_pre_save(void *opaque)
> >           * it here.
> >           */
> >          env->spr[SPR_DECR] = cpu_ppc_load_decr(env);
> > +        printf("cpu_ppc_pre_save  TB:0x%016lx\n", cpu_ppc_load_tbl(env));
> > +        printf("cpu_ppc_pre_save DEC:0x%016lx\n", cpu_ppc_load_decr(env));
>
> I think this is left over debug, which btw needs proper types:

Yes you're right, sorry that was intended to be removed.

Thanks,
Nick