[Qemu-devel] [PATCH] Correct CPACR reset value for v7 cores

Peter Maydell posted 1 patch 5 years, 11 months ago
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/qemu tags/patchew/20180522173713.26282-1-peter.maydell@linaro.org
Test checkpatch passed
Test docker-mingw@fedora passed
Test docker-quick@centos7 passed
Test s390x passed
target/arm/helper.c | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)
[Qemu-devel] [PATCH] Correct CPACR reset value for v7 cores
Posted by Peter Maydell 5 years, 11 months ago
In commit f0aff255700 we made cpacr_write() enforce that some CPACR
bits are RAZ/WI and some are RAO/WI for ARMv7 cores. Unfortunately
we forgot to also update the register's reset value. The effect
was that (a) a guest that read CPACR on reset would not see ones in
the RAO bits, and (b) if you did a migration before the guest did
a write to the CPACR then the migration would fail because the
destination would enforce the RAO bits and then complain that they
didn't match the zero value from the source.

Implement reset for the CPACR using a custom reset function
that just calls cpacr_write(), to avoid having to duplicate
the logic for which bits are RAO.

This bug would affect migration for TCG CPUs which are ARMv7
with VFP but without one of Neon or VFPv3.

Reported-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
This is sufficient that a save-and-reload while the romulus-bmc
machine is in the bootloader will work. On the other hand if
I do a save-and-reload after the kernel has started booting
then we get the classic "guest hang after reload", so some
state is still not being transferred somewhere (probably in
a device in the machine model?)
---
 target/arm/helper.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index c0f739972e..6023bf6046 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -863,6 +863,14 @@ static void cpacr_write(CPUARMState *env, const ARMCPRegInfo *ri,
     env->cp15.cpacr_el1 = value;
 }
 
+static void cpacr_reset(CPUARMState *env, const ARMCPRegInfo *ri)
+{
+    /* Call cpacr_write() so that we reset with the correct RAO bits set
+     * for our CPU features.
+     */
+    cpacr_write(env, ri, 0);
+}
+
 static CPAccessResult cpacr_access(CPUARMState *env, const ARMCPRegInfo *ri,
                                    bool isread)
 {
@@ -920,7 +928,7 @@ static const ARMCPRegInfo v6_cp_reginfo[] = {
     { .name = "CPACR", .state = ARM_CP_STATE_BOTH, .opc0 = 3,
       .crn = 1, .crm = 0, .opc1 = 0, .opc2 = 2, .accessfn = cpacr_access,
       .access = PL1_RW, .fieldoffset = offsetof(CPUARMState, cp15.cpacr_el1),
-      .resetvalue = 0, .writefn = cpacr_write },
+      .resetfn = cpacr_reset, .writefn = cpacr_write },
     REGINFO_SENTINEL
 };
 
-- 
2.17.0


Re: [Qemu-devel] [PATCH] Correct CPACR reset value for v7 cores
Posted by Cédric Le Goater 5 years, 11 months ago
On 05/22/2018 07:37 PM, Peter Maydell wrote:
> In commit f0aff255700 we made cpacr_write() enforce that some CPACR
> bits are RAZ/WI and some are RAO/WI for ARMv7 cores. Unfortunately
> we forgot to also update the register's reset value. The effect
> was that (a) a guest that read CPACR on reset would not see ones in
> the RAO bits, and (b) if you did a migration before the guest did
> a write to the CPACR then the migration would fail because the
> destination would enforce the RAO bits and then complain that they
> didn't match the zero value from the source.
> 
> Implement reset for the CPACR using a custom reset function
> that just calls cpacr_write(), to avoid having to duplicate
> the logic for which bits are RAO.
> 
> This bug would affect migration for TCG CPUs which are ARMv7
> with VFP but without one of Neon or VFPv3.
> 
> Reported-by: Cédric Le Goater <clg@kaod.org>
> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

Tested-by: Cédric Le Goater <clg@kaod.org>

> ---
> This is sufficient that a save-and-reload while the romulus-bmc
> machine is in the bootloader will work. On the other hand if
> I do a save-and-reload after the kernel has started booting
> then we get the classic "guest hang after reload", so some
> state is still not being transferred somewhere (probably in
> a device in the machine model?)

I see the problem. Bizarre.

Thanks,

C.  

> ---
>  target/arm/helper.c | 10 +++++++++-
>  1 file changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/target/arm/helper.c b/target/arm/helper.c
> index c0f739972e..6023bf6046 100644
> --- a/target/arm/helper.c
> +++ b/target/arm/helper.c
> @@ -863,6 +863,14 @@ static void cpacr_write(CPUARMState *env, const ARMCPRegInfo *ri,
>      env->cp15.cpacr_el1 = value;
>  }
>  
> +static void cpacr_reset(CPUARMState *env, const ARMCPRegInfo *ri)
> +{
> +    /* Call cpacr_write() so that we reset with the correct RAO bits set
> +     * for our CPU features.
> +     */
> +    cpacr_write(env, ri, 0);
> +}
> +
>  static CPAccessResult cpacr_access(CPUARMState *env, const ARMCPRegInfo *ri,
>                                     bool isread)
>  {
> @@ -920,7 +928,7 @@ static const ARMCPRegInfo v6_cp_reginfo[] = {
>      { .name = "CPACR", .state = ARM_CP_STATE_BOTH, .opc0 = 3,
>        .crn = 1, .crm = 0, .opc1 = 0, .opc2 = 2, .accessfn = cpacr_access,
>        .access = PL1_RW, .fieldoffset = offsetof(CPUARMState, cp15.cpacr_el1),
> -      .resetvalue = 0, .writefn = cpacr_write },
> +      .resetfn = cpacr_reset, .writefn = cpacr_write },
>      REGINFO_SENTINEL
>  };
>  
> 


Re: [Qemu-devel] [PATCH] Correct CPACR reset value for v7 cores
Posted by Peter Maydell 5 years, 11 months ago
On 22 May 2018 at 20:06, Cédric Le Goater <clg@kaod.org> wrote:
> On 05/22/2018 07:37 PM, Peter Maydell wrote:
>> This is sufficient that a save-and-reload while the romulus-bmc
>> machine is in the bootloader will work. On the other hand if
>> I do a save-and-reload after the kernel has started booting
>> then we get the classic "guest hang after reload", so some
>> state is still not being transferred somewhere (probably in
>> a device in the machine model?)
>
> I see the problem. Bizarre.

Usually it means that something in the path from timer device
through to the CPU has failed to save-and-reload the state
that it needs to, so that when the timer fires post-reload
it doesn't actually cause a CPU interrupt. You can probably
debug by putting a breakpoint on whatever looks like a likely
timer device's 'set the irq line' function and stepping
through to figure out what's happening.

(You're not using trustzone, are you? I know of a bug with
migration of cpus with TZ enabled which I'm probably going to
look at later this week)

thanks
-- PMM

Re: [Qemu-devel] [PATCH] Correct CPACR reset value for v7 cores
Posted by Cédric Le Goater 5 years, 11 months ago
On 05/22/2018 09:26 PM, Peter Maydell wrote:
> On 22 May 2018 at 20:06, Cédric Le Goater <clg@kaod.org> wrote:
>> On 05/22/2018 07:37 PM, Peter Maydell wrote:
>>> This is sufficient that a save-and-reload while the romulus-bmc
>>> machine is in the bootloader will work. On the other hand if
>>> I do a save-and-reload after the kernel has started booting
>>> then we get the classic "guest hang after reload", so some
>>> state is still not being transferred somewhere (probably in
>>> a device in the machine model?)
>>
>> I see the problem. Bizarre.
> 
> Usually it means that something in the path from timer device
> through to the CPU has failed to save-and-reload the state
> that it needs to, so that when the timer fires post-reload
> it doesn't actually cause a CPU interrupt. You can probably
> debug by putting a breakpoint on whatever looks like a likely
> timer device's 'set the irq line' function and stepping
> through to figure out what's happening.

OK. I will take a look.

What is bizarre is that the romulus-bmc and the palmetto-bmc
machines use the same timer controller model and the same Linux 
driver. So only the CPU type would differ.
 
> (You're not using trustzone, are you? I know of a bug with
> migration of cpus with TZ enabled which I'm probably going to
> look at later this week)

The PSR is :

	PSR=20000093 --C- A S svc32

'S' for secure. Is that also trustzone ?

Thanks,

C. 

Re: [Qemu-devel] [PATCH] Correct CPACR reset value for v7 cores
Posted by Peter Maydell 5 years, 11 months ago
On 23 May 2018 at 10:13, Cédric Le Goater <clg@kaod.org> wrote:
> On 05/22/2018 09:26 PM, Peter Maydell wrote:
>> (You're not using trustzone, are you? I know of a bug with
>> migration of cpus with TZ enabled which I'm probably going to
>> look at later this week)
>
> The PSR is :
>
>         PSR=20000093 --C- A S svc32
>
> 'S' for secure. Is that also trustzone ?

Yes. In that case this very likely is the same bug (we don't
migrate the secure-bank registers properly, so if Linux
is executing in secure state it doesn't resume right).

thanks
-- PMM