hw/arm/aspeed_ast2600.c | 2 ++ target/arm/cpu_tcg.c | 2 +- 2 files changed, 3 insertions(+), 1 deletion(-)
Currently, the CPU features exposed to the AST2600 QEMU machines are :
half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt
vfpd32 lpae evtstrm
But, the features of the Cortex A7 CPU on the Aspeed AST2600 A3 SoC
are :
half thumb fastmult vfp edsp vfpv3 vfpv3d16 tls vfpv4 idiva idivt
lpae evtstrm
The vfpv3d16 feature bit is common to both vfpv3 and vfpv4, and for
this SoC, QEMU should advertise a VFPv4 unit with 16 double-precision
registers, and not 32 registers.
Drop neon support and hack the default mvfr0 register value of the
cortex A7 to advertise 16 registers.
How can that be done cleanly ? Should we :
* introduce a new A7 CPU with its own _initfn routine ?
* introduce a new CPU property to set the number of "Advanced SIMD
and floating-point" registers in arm_cpu_realizefn() ?
This problem was raised by a buildroot rootfs compiled with vfpv4.
Boot went fine under QEMU but on real HW, user space binaries had
issues with output. Compiling buildroot with vfpv4d16 fixed it and
I didn't dig further. Nevertheless, it would be nice to catch such
issues with QEMU.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
---
hw/arm/aspeed_ast2600.c | 2 ++
target/arm/cpu_tcg.c | 2 +-
2 files changed, 3 insertions(+), 1 deletion(-)
diff --git a/hw/arm/aspeed_ast2600.c b/hw/arm/aspeed_ast2600.c
index dcdc9bc54456..af987fd418ec 100644
--- a/hw/arm/aspeed_ast2600.c
+++ b/hw/arm/aspeed_ast2600.c
@@ -330,6 +330,8 @@ static void aspeed_soc_ast2600_realize(DeviceState *dev, Error **errp)
object_property_set_int(OBJECT(&s->cpu[i]), "cntfrq", 1125000000,
&error_abort);
+ object_property_set_bool(OBJECT(&s->cpu[i]), "neon", false,
+ &error_abort);
object_property_set_link(OBJECT(&s->cpu[i]), "memory",
OBJECT(s->memory), &error_abort);
diff --git a/target/arm/cpu_tcg.c b/target/arm/cpu_tcg.c
index 98b5ba216041..b3f93783a061 100644
--- a/target/arm/cpu_tcg.c
+++ b/target/arm/cpu_tcg.c
@@ -545,7 +545,7 @@ static void cortex_a7_initfn(Object *obj)
cpu->kvm_target = QEMU_KVM_ARM_TARGET_CORTEX_A7;
cpu->midr = 0x410fc075;
cpu->reset_fpsid = 0x41023075;
- cpu->isar.mvfr0 = 0x10110222;
+ cpu->isar.mvfr0 = 0x10110221; /* SIMDREG == 0x1 -> 16 registers */
cpu->isar.mvfr1 = 0x11111111;
cpu->ctr = 0x84448003;
cpu->reset_sctlr = 0x00c50078;
--
2.37.3
On 9/26/22 08:26, Cédric Le Goater wrote: > Currently, the CPU features exposed to the AST2600 QEMU machines are : > > half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt > vfpd32 lpae evtstrm > > But, the features of the Cortex A7 CPU on the Aspeed AST2600 A3 SoC > are : > > half thumb fastmult vfp edsp vfpv3 vfpv3d16 tls vfpv4 idiva idivt > lpae evtstrm > > The vfpv3d16 feature bit is common to both vfpv3 and vfpv4, and for > this SoC, QEMU should advertise a VFPv4 unit with 16 double-precision > registers, and not 32 registers. > > Drop neon support and hack the default mvfr0 register value of the > cortex A7 to advertise 16 registers. > > How can that be done cleanly ? Should we : > > * introduce a new A7 CPU with its own _initfn routine ? > * introduce a new CPU property to set the number of "Advanced SIMD > and floating-point" registers in arm_cpu_realizefn() ? This is a note in the Cortex A7 MPCore Technical reference saying : "When FPU option is selected without NEON, the FPU is VFPv4-D16 and uses 16 double-precision registers. When the FPU is implemented with NEON, the FPU is VFPv4-D32 and uses 32 double-precision registers. This register bank is shared with NEON." Could we deduce the number of registers from the availability of the NEON feature, on A7 only ? Thanks, C. > > This problem was raised by a buildroot rootfs compiled with vfpv4. > Boot went fine under QEMU but on real HW, user space binaries had > issues with output. Compiling buildroot with vfpv4d16 fixed it and > I didn't dig further. Nevertheless, it would be nice to catch such > issues with QEMU. > > Signed-off-by: Cédric Le Goater <clg@kaod.org> > --- > hw/arm/aspeed_ast2600.c | 2 ++ > target/arm/cpu_tcg.c | 2 +- > 2 files changed, 3 insertions(+), 1 deletion(-) > > diff --git a/hw/arm/aspeed_ast2600.c b/hw/arm/aspeed_ast2600.c > index dcdc9bc54456..af987fd418ec 100644 > --- a/hw/arm/aspeed_ast2600.c > +++ b/hw/arm/aspeed_ast2600.c > @@ -330,6 +330,8 @@ static void aspeed_soc_ast2600_realize(DeviceState *dev, Error **errp) > > object_property_set_int(OBJECT(&s->cpu[i]), "cntfrq", 1125000000, > &error_abort); > + object_property_set_bool(OBJECT(&s->cpu[i]), "neon", false, > + &error_abort); > object_property_set_link(OBJECT(&s->cpu[i]), "memory", > OBJECT(s->memory), &error_abort); > > diff --git a/target/arm/cpu_tcg.c b/target/arm/cpu_tcg.c > index 98b5ba216041..b3f93783a061 100644 > --- a/target/arm/cpu_tcg.c > +++ b/target/arm/cpu_tcg.c > @@ -545,7 +545,7 @@ static void cortex_a7_initfn(Object *obj) > cpu->kvm_target = QEMU_KVM_ARM_TARGET_CORTEX_A7; > cpu->midr = 0x410fc075; > cpu->reset_fpsid = 0x41023075; > - cpu->isar.mvfr0 = 0x10110222; > + cpu->isar.mvfr0 = 0x10110221; /* SIMDREG == 0x1 -> 16 registers */ > cpu->isar.mvfr1 = 0x11111111; > cpu->ctr = 0x84448003; > cpu->reset_sctlr = 0x00c50078;
On Mon, 26 Sept 2022 at 07:05, Cédric Le Goater <clg@kaod.org> wrote: > > On 9/26/22 08:26, Cédric Le Goater wrote: > > Currently, the CPU features exposed to the AST2600 QEMU machines are : > > > > half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt > > vfpd32 lpae evtstrm > > > > But, the features of the Cortex A7 CPU on the Aspeed AST2600 A3 SoC > > are : > > > > half thumb fastmult vfp edsp vfpv3 vfpv3d16 tls vfpv4 idiva idivt > > lpae evtstrm > > > > The vfpv3d16 feature bit is common to both vfpv3 and vfpv4, and for > > this SoC, QEMU should advertise a VFPv4 unit with 16 double-precision > > registers, and not 32 registers. > > > > Drop neon support and hack the default mvfr0 register value of the > > cortex A7 to advertise 16 registers. > > > > How can that be done cleanly ? Should we : > > > > * introduce a new A7 CPU with its own _initfn routine ? > > * introduce a new CPU property to set the number of "Advanced SIMD > > and floating-point" registers in arm_cpu_realizefn() ? > > This is a note in the Cortex A7 MPCore Technical reference saying : > > "When FPU option is selected without NEON, the FPU is VFPv4-D16 and uses 16 > double-precision registers. When the FPU is implemented with NEON, the FPU is > VFPv4-D32 and uses 32 double-precision registers. This register bank is shared > with NEON." The datasheet only has this to say: "1.2GHz dual-core ARM Cortex A7 (r0p5) 32-bit CPU with FPU" With no details about the FPU. The hardware is a golden reference though: fpsid: 41023075 mvfr0: 10110221 mvfr1: 11000011 $ bitfield mvfr0 0x10110221 decoding as Media and VFP Feature Register 0 0x10110221 [269550113] A_SIMD registers: 0x1 [16 x 64-bit registers] Single precision: 0x2 [Supported, VFPv4 or VFPv3] Double precision: 0x2 [Supported, VFPv4 or VFPv3] VFP exception trapping: 0x0 [Not supported] Divide: 0x1 [Hardware divide is supported] Square Root: 0x1 [Hardware square root supported] Short vectors: 0x0 [Not supported] VFP Rounding Modes: 0x1 [All modes supported] $ bitfield mvfr1 0x11000011 decoding as Media and VFP Feature Register 1 0x11000011 [285212689] FZ: 0x1 D_NaN mode: 0x1 A_SIMD load/store: 0x0 A_SIMD integer: 0x0 A_SIMD SPFP: 0x0 A_SIMD HPFP: 0x0 VFP HPFP: 0x2 A_SIMD FMAC: 0x1 As you say, no NEON and 16 64-bit registers. > > Could we deduce the number of registers from the availability of the NEON > feature, on A7 only ? We certainly should make the NEON property match the mvfr1 value. Linux tests for NEON with this: (fmrx(MVFR1) & 0x000fff00) == 0x00011100) https://elixir.bootlin.com/linux/v5.19/source/arch/arm/vfp/vfpmodule.c#L812 Cheers, Joel > > > > This problem was raised by a buildroot rootfs compiled with vfpv4. > > Boot went fine under QEMU but on real HW, user space binaries had > > issues with output. Compiling buildroot with vfpv4d16 fixed it and > > I didn't dig further. Nevertheless, it would be nice to catch such > > issues with QEMU. > > > > Signed-off-by: Cédric Le Goater <clg@kaod.org> > > --- > > hw/arm/aspeed_ast2600.c | 2 ++ > > target/arm/cpu_tcg.c | 2 +- > > 2 files changed, 3 insertions(+), 1 deletion(-) > > > > diff --git a/hw/arm/aspeed_ast2600.c b/hw/arm/aspeed_ast2600.c > > index dcdc9bc54456..af987fd418ec 100644 > > --- a/hw/arm/aspeed_ast2600.c > > +++ b/hw/arm/aspeed_ast2600.c > > @@ -330,6 +330,8 @@ static void aspeed_soc_ast2600_realize(DeviceState *dev, Error **errp) > > > > object_property_set_int(OBJECT(&s->cpu[i]), "cntfrq", 1125000000, > > &error_abort); > > + object_property_set_bool(OBJECT(&s->cpu[i]), "neon", false, > > + &error_abort); > > object_property_set_link(OBJECT(&s->cpu[i]), "memory", > > OBJECT(s->memory), &error_abort); > > > > diff --git a/target/arm/cpu_tcg.c b/target/arm/cpu_tcg.c > > index 98b5ba216041..b3f93783a061 100644 > > --- a/target/arm/cpu_tcg.c > > +++ b/target/arm/cpu_tcg.c > > @@ -545,7 +545,7 @@ static void cortex_a7_initfn(Object *obj) > > cpu->kvm_target = QEMU_KVM_ARM_TARGET_CORTEX_A7; > > cpu->midr = 0x410fc075; > > cpu->reset_fpsid = 0x41023075; > > - cpu->isar.mvfr0 = 0x10110222; > > + cpu->isar.mvfr0 = 0x10110221; /* SIMDREG == 0x1 -> 16 registers */ > > cpu->isar.mvfr1 = 0x11111111; > > cpu->ctr = 0x84448003; > > cpu->reset_sctlr = 0x00c50078; >
On 9/27/22 03:49, Joel Stanley wrote: > On Mon, 26 Sept 2022 at 07:05, Cédric Le Goater <clg@kaod.org> wrote: >> >> On 9/26/22 08:26, Cédric Le Goater wrote: >>> Currently, the CPU features exposed to the AST2600 QEMU machines are : >>> >>> half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt >>> vfpd32 lpae evtstrm >>> >>> But, the features of the Cortex A7 CPU on the Aspeed AST2600 A3 SoC >>> are : >>> >>> half thumb fastmult vfp edsp vfpv3 vfpv3d16 tls vfpv4 idiva idivt >>> lpae evtstrm >>> >>> The vfpv3d16 feature bit is common to both vfpv3 and vfpv4, and for >>> this SoC, QEMU should advertise a VFPv4 unit with 16 double-precision >>> registers, and not 32 registers. >>> >>> Drop neon support and hack the default mvfr0 register value of the >>> cortex A7 to advertise 16 registers. >>> >>> How can that be done cleanly ? Should we : >>> >>> * introduce a new A7 CPU with its own _initfn routine ? >>> * introduce a new CPU property to set the number of "Advanced SIMD >>> and floating-point" registers in arm_cpu_realizefn() ? >> >> This is a note in the Cortex A7 MPCore Technical reference saying : >> >> "When FPU option is selected without NEON, the FPU is VFPv4-D16 and uses 16 >> double-precision registers. When the FPU is implemented with NEON, the FPU is >> VFPv4-D32 and uses 32 double-precision registers. This register bank is shared >> with NEON." > > The datasheet only has this to say: > > "1.2GHz dual-core ARM Cortex A7 (r0p5) 32-bit CPU with FPU" > > With no details about the FPU. The hardware is a golden reference though: > > fpsid: 41023075 > mvfr0: 10110221 > mvfr1: 11000011 > > $ bitfield mvfr0 0x10110221 > decoding as Media and VFP Feature Register 0 > 0x10110221 [269550113] > A_SIMD registers: 0x1 [16 x 64-bit registers] > Single precision: 0x2 [Supported, VFPv4 or VFPv3] > Double precision: 0x2 [Supported, VFPv4 or VFPv3] > VFP exception trapping: 0x0 [Not supported] > Divide: 0x1 [Hardware divide is supported] > Square Root: 0x1 [Hardware square root supported] > Short vectors: 0x0 [Not supported] > VFP Rounding Modes: 0x1 [All modes supported] > > $ bitfield mvfr1 0x11000011 > decoding as Media and VFP Feature Register 1 > 0x11000011 [285212689] > FZ: 0x1 > D_NaN mode: 0x1 > A_SIMD load/store: 0x0 > A_SIMD integer: 0x0 > A_SIMD SPFP: 0x0 > A_SIMD HPFP: 0x0 > VFP HPFP: 0x2 > A_SIMD FMAC: 0x1 > > As you say, no NEON and 16 64-bit registers. > >> >> Could we deduce the number of registers from the availability of the NEON >> feature, on A7 only ? > > We certainly should make the NEON property match the mvfr1 value. > Linux tests for NEON with this: > > (fmrx(MVFR1) & 0x000fff00) == 0x00011100) > > https://elixir.bootlin.com/linux/v5.19/source/arch/arm/vfp/vfpmodule.c#L812 ok. I will resend with 2 patches. An obvious first one removing NEON from the AsT2600 SoC and a second decreasing the number of registers to 16 when NEON is off. Thanks, C. > Cheers, > > Joel > >>> >>> This problem was raised by a buildroot rootfs compiled with vfpv4. >>> Boot went fine under QEMU but on real HW, user space binaries had >>> issues with output. Compiling buildroot with vfpv4d16 fixed it and >>> I didn't dig further. Nevertheless, it would be nice to catch such >>> issues with QEMU. >>> >>> Signed-off-by: Cédric Le Goater <clg@kaod.org> >>> --- >>> hw/arm/aspeed_ast2600.c | 2 ++ >>> target/arm/cpu_tcg.c | 2 +- >>> 2 files changed, 3 insertions(+), 1 deletion(-) >>> >>> diff --git a/hw/arm/aspeed_ast2600.c b/hw/arm/aspeed_ast2600.c >>> index dcdc9bc54456..af987fd418ec 100644 >>> --- a/hw/arm/aspeed_ast2600.c >>> +++ b/hw/arm/aspeed_ast2600.c >>> @@ -330,6 +330,8 @@ static void aspeed_soc_ast2600_realize(DeviceState *dev, Error **errp) >>> >>> object_property_set_int(OBJECT(&s->cpu[i]), "cntfrq", 1125000000, >>> &error_abort); >>> + object_property_set_bool(OBJECT(&s->cpu[i]), "neon", false, >>> + &error_abort); >>> object_property_set_link(OBJECT(&s->cpu[i]), "memory", >>> OBJECT(s->memory), &error_abort); >>> >>> diff --git a/target/arm/cpu_tcg.c b/target/arm/cpu_tcg.c >>> index 98b5ba216041..b3f93783a061 100644 >>> --- a/target/arm/cpu_tcg.c >>> +++ b/target/arm/cpu_tcg.c >>> @@ -545,7 +545,7 @@ static void cortex_a7_initfn(Object *obj) >>> cpu->kvm_target = QEMU_KVM_ARM_TARGET_CORTEX_A7; >>> cpu->midr = 0x410fc075; >>> cpu->reset_fpsid = 0x41023075; >>> - cpu->isar.mvfr0 = 0x10110222; >>> + cpu->isar.mvfr0 = 0x10110221; /* SIMDREG == 0x1 -> 16 registers */ >>> cpu->isar.mvfr1 = 0x11111111; >>> cpu->ctr = 0x84448003; >>> cpu->reset_sctlr = 0x00c50078; >>
© 2016 - 2024 Red Hat, Inc.