[v2] RE: [PATCH v2 0/4] target/arm/cpu: Introduce sve_vq_supported bitmap

RE: [PATCH v2 0/4] target/arm/cpu: Introduce sve_vq_supported bitmap
Posted by ishii.shuuichir@fujitsu.com 4 years, 5 months ago
Thanks for the comments.
We're sorry for the time it took you due to our lack of understanding.

> As I've stated a few different times, the sve-max-vq property is of marginal use, as
> it only works for CPU models that support all vector lengths, including
> non-power-of-2 lengths. This is because the property says to enable all vector
> lengths smaller and including sve-max-vq and to disable all the rest, but if the
> CPU doesn't support non-power-of-2 vector lengths, then sve-max-vq will not
> work for anything other than max-vq=1 and max-vq=2. For this reason, I didn't
> even bring this property over to the 'host' cpu type. I think I once suggested that
> you don't bring it over to A64FX either and I'll suggest that again.

The setting process of cpu->sve_max_vq was left untouched in the a64fx_cpu_set_sve function
that we originally tried to implement, and our understanding of it was lacking.
Andrew is right about the specification of the sve-max-vq property.

As you suggested, we will not implement the sve-max-vq setting process in a64fx.

> > Therefore, We have applied the following fix, is this ok?
> 
> No :-)

We'd like to repost the patch to reflect the comments noted.

Best regards.

> -----Original Message-----
> From: Andrew Jones <drjones@redhat.com>
> Sent: Friday, August 27, 2021 7:04 PM
> To: Ishii, Shuuichirou/石井 周一郎 <ishii.shuuichir@fujitsu.com>
> Cc: richard.henderson@linaro.org; peter.maydell@linaro.org;
> philmd@redhat.com; qemu-devel@nongnu.org; qemu-arm@nongnu.org
> Subject: Re: [PATCH v2 0/4] target/arm/cpu: Introduce sve_vq_supported bitmap
> 
> On Fri, Aug 27, 2021 at 08:30:07AM +0000, ishii.shuuichir@fujitsu.com wrote:
> >
> > Thank you, Andrew, for creating the patches.
> > And thank you all for your comments.
> >
> > I have applied the suggested v2 patch series by andrew locally, and
> > reviewed the next version of the a64fx patch series as follows.
> > I would appreciate if you could comment on whether there are any
> > problems with the fixes before I post the next version of the patch.
> > (The a64fx patch series to be posted later will be split as previously
> > posted.)
> >
> > ----------------------------------------------------------------------
> > -------------------------- diff --git a/docs/system/arm/virt.rst
> > b/docs/system/arm/virt.rst index 59acf0eeaf..850787495b 100644
> > --- a/docs/system/arm/virt.rst
> > +++ b/docs/system/arm/virt.rst
> > @@ -55,6 +55,7 @@ Supported guest CPU types:
> >  - ``cortex-a53`` (64-bit)
> >  - ``cortex-a57`` (64-bit)
> >  - ``cortex-a72`` (64-bit)
> > +- ``a64fx`` (64-bit)
> >  - ``host`` (with KVM only)
> >  - ``max`` (same as ``host`` for KVM; best possible emulation with
> > TCG)
> >
> > diff --git a/hw/arm/virt.c b/hw/arm/virt.c index
> > 81eda46b0b..10286d3fd6 100644
> > --- a/hw/arm/virt.c
> > +++ b/hw/arm/virt.c
> > @@ -200,6 +200,7 @@ static const char *valid_cpus[] = {
> >      ARM_CPU_TYPE_NAME("cortex-a53"),
> >      ARM_CPU_TYPE_NAME("cortex-a57"),
> >      ARM_CPU_TYPE_NAME("cortex-a72"),
> > +    ARM_CPU_TYPE_NAME("a64fx"),
> >      ARM_CPU_TYPE_NAME("host"),
> >      ARM_CPU_TYPE_NAME("max"),
> >  };
> > diff --git a/target/arm/cpu.c b/target/arm/cpu.c index
> > 2866dd7658..2d9f779cb6 100644
> > --- a/target/arm/cpu.c
> > +++ b/target/arm/cpu.c
> > @@ -1350,10 +1350,12 @@ void arm_cpu_finalize_features(ARMCPU *cpu,
> Error **errp)
> >      Error *local_err = NULL;
> >
> >      if (arm_feature(&cpu->env, ARM_FEATURE_AARCH64)) {
> > -        arm_cpu_sve_finalize(cpu, &local_err);
> > -        if (local_err != NULL) {
> > -            error_propagate(errp, local_err);
> > -            return;
> > +        if (!arm_feature(&cpu->env, ARM_FEATURE_A64FX)) {
> > +            arm_cpu_sve_finalize(cpu, &local_err);
> > +            if (local_err != NULL) {
> > +                error_propagate(errp, local_err);
> > +                return;
> > +            }
> 
> We don't want to do this anymore. Now we want to call
> arm_cpu_sve_finalize() for all cpu models.
> 
> >          }
> >
> >          /*
> > diff --git a/target/arm/cpu.h b/target/arm/cpu.h index
> > cc645b5742..bf8d9ddaa1 100644
> > --- a/target/arm/cpu.h
> > +++ b/target/arm/cpu.h
> > @@ -2149,6 +2149,7 @@ enum arm_features {
> >      ARM_FEATURE_M_SECURITY, /* M profile Security Extension */
> >      ARM_FEATURE_M_MAIN, /* M profile Main Extension */
> >      ARM_FEATURE_V8_1M, /* M profile extras only in v8.1M and later */
> > +    ARM_FEATURE_A64FX, /* Fujitsu A64FX processor */
> >  };
> >
> >  static inline int arm_feature(CPUARMState *env, int feature) diff
> > --git a/target/arm/cpu64.c b/target/arm/cpu64.c index
> > 2f0cbddab5..18e813264a 100644
> > --- a/target/arm/cpu64.c
> > +++ b/target/arm/cpu64.c
> > @@ -841,10 +841,80 @@ static void aarch64_max_initfn(Object *obj)
> >                          cpu_max_set_sve_max_vq, NULL, NULL);  }
> >
> > +static void a64fx_cpu_set_sve(Object *obj) {
> > +    int i;
> > +    Error *errp = NULL;
> > +    ARMCPU *cpu = ARM_CPU(obj);
> > +    /* Suppport of A64FX's vector length are 128,256 and 512bit only */
> > +    const char *vl[] = {"sve128", "sve256", "sve512"};
> > +
> > +    object_property_add_bool(obj, "sve", cpu_arm_get_sve,
> cpu_arm_set_sve);
> > +    object_property_set_bool(obj, "sve", true, &errp);
> > +
> > +    for(i = 0; i <sizeof(vl)/sizeof(vl[0]); i++){
> > +        object_property_add(obj, vl[i], "bool", cpu_arm_get_sve_vq,
> > +                            cpu_arm_set_sve_vq, NULL, NULL);
> > +        object_property_set_bool(obj, vl[i], true, &errp);
> > +    }
> 
> We don't want to do any of the above anymore either. We can add all the properties
> with aarch64_add_sve_properties() now because only the supported vector
> lengths may be enabled.
> 
> > +
> > +    bitmap_zero(cpu->sve_vq_supported, ARM_MAX_VQ);
> > +    set_bit(0, cpu->sve_vq_supported); /* 128bit */
> > +    set_bit(1, cpu->sve_vq_supported); /* 256bit */
> > +    set_bit(3, cpu->sve_vq_supported); /* 512bit */
> 
> This is correct, but since it's only a few lines it can be put directly into
> aarch64_a64fx_initfn().
> 
> > +
> > +    cpu->sve_max_vq = find_last_bit(cpu->sve_vq_map, ARM_MAX_VQ) +
> 1;
> 
> This isn't necessary. arm_cpu_sve_finalize() will manage it.
> 
> > +}
> > +
> > +static void aarch64_a64fx_initfn(Object *obj) {
> > +    ARMCPU *cpu = ARM_CPU(obj);
> > +
> > +    cpu->dtb_compatible = "arm,a64fx";
> > +    set_feature(&cpu->env, ARM_FEATURE_A64FX);
> > +    set_feature(&cpu->env, ARM_FEATURE_V8);
> > +    set_feature(&cpu->env, ARM_FEATURE_NEON);
> > +    set_feature(&cpu->env, ARM_FEATURE_GENERIC_TIMER);
> > +    set_feature(&cpu->env, ARM_FEATURE_AARCH64);
> > +    set_feature(&cpu->env, ARM_FEATURE_EL2);
> > +    set_feature(&cpu->env, ARM_FEATURE_EL3);
> > +    set_feature(&cpu->env, ARM_FEATURE_PMU);
> > +    cpu->midr = 0x461f0010;
> > +    cpu->revidr = 0x00000000;
> > +    cpu->ctr = 0x86668006;
> > +    cpu->reset_sctlr = 0x30000180;
> > +    cpu->isar.id_aa64pfr0 =   0x0000000101111111; /* No RAS Extensions
> */
> > +    cpu->isar.id_aa64pfr1 = 0x0000000000000000;
> > +    cpu->isar.id_aa64dfr0 = 0x0000000010305408;
> > +    cpu->isar.id_aa64dfr1 = 0x0000000000000000;
> > +    cpu->id_aa64afr0 = 0x0000000000000000;
> > +    cpu->id_aa64afr1 = 0x0000000000000000;
> > +    cpu->isar.id_aa64mmfr0 = 0x0000000000001122;
> > +    cpu->isar.id_aa64mmfr1 = 0x0000000011212100;
> > +    cpu->isar.id_aa64mmfr2 = 0x0000000000001011;
> > +    cpu->isar.id_aa64isar0 = 0x0000000010211120;
> > +    cpu->isar.id_aa64isar1 = 0x0000000000010001;
> > +    cpu->isar.id_aa64zfr0 = 0x0000000000000000;
> > +    cpu->clidr = 0x0000000080000023;
> > +    cpu->ccsidr[0] = 0x7007e01c; /* 64KB L1 dcache */
> > +    cpu->ccsidr[1] = 0x2007e01c; /* 64KB L1 icache */
> > +    cpu->ccsidr[2] = 0x70ffe07c; /* 8MB L2 cache */
> > +    cpu->dcz_blocksize = 6; /* 256 bytes */
> > +    cpu->gic_num_lrs = 4;
> > +    cpu->gic_vpribits = 5;
> > +    cpu->gic_vprebits = 5;
> > +
> > +    /* Set SVE properties */
> > +    a64fx_cpu_set_sve(obj);
> > +
> > +    /* TODO:  Add A64FX specific HPC extension registers */ }
> > +
> >  static const ARMCPUInfo aarch64_cpus[] = {
> >      { .name = "cortex-a57",         .initfn = aarch64_a57_initfn },
> >      { .name = "cortex-a53",         .initfn = aarch64_a53_initfn },
> >      { .name = "cortex-a72",         .initfn = aarch64_a72_initfn },
> > +    { .name = "a64fx",              .initfn = aarch64_a64fx_initfn },
> >      { .name = "max",                .initfn = aarch64_max_initfn },
> >  };
> >
> > diff --git a/tests/qtest/arm-cpu-features.c
> > b/tests/qtest/arm-cpu-features.c index 8252b85bb8..bf30bee462 100644
> > --- a/tests/qtest/arm-cpu-features.c
> > +++ b/tests/qtest/arm-cpu-features.c
> > @@ -472,6 +472,11 @@ static void test_query_cpu_model_expansion(const
> void *data)
> >          assert_has_feature_enabled(qts, "max", "sve128");
> >          assert_has_feature_enabled(qts, "cortex-a57", "pmu");
> >          assert_has_feature_enabled(qts, "cortex-a57", "aarch64");
> > +        assert_has_feature_enabled(qts, "a64fx", "pmu");
> > +        assert_has_feature_enabled(qts, "a64fx", "aarch64");
> > +        assert_has_feature_enabled(qts, "a64fx", "sve");
> > +        assert_has_feature_enabled(qts, "a64fx", "sve128");
> > +        assert_has_feature_enabled(qts, "a64fx", "sve256");
> > +        assert_has_feature_enabled(qts, "a64fx", "sve512");
> 
> You should also add tests that make sure no other vectors may be enabled.
> Also, you can use assert_sve_vls() to check more than one vector length is
> enabled at once. So, something like
> 
> /*
>  * For A64FX SVE should be enabled by default with the vectors
>  * sve128, sve256, and sve512
>  */
>  assert_has_feature_enabled(qts, "a64fx", "sve");  assert_sve_vls(qts, "a64fx",
> 0xb, NULL);
> 
> /*
>  * A64FX does not support any other vector lengths besides those
>  * that are enabled by default.
>  */
>  assert_error(qts, "a64fx", "cannot enable sve384", "{ 'sve384': true }");
> assert_error(qts, "a64fx", "cannot enable sve640", "{ 'sve640': true }");
> 
> 
> You could also test that completely disabling sve works and that disabling
> sve256 (which also disables sve512) and/or only test disabling sve512 works, but
> that would only be testing the same code covered by tests with the 'max' cpu type,
> since that code is shared.
> 
> 
> >
> >          sve_tests_default(qts, "max");
> >          pauth_tests_default(qts, "max");
> > ----------------------------------------------------------------------
> > --------------------------
> >
> > By the way, There is one thing that bothers me about the above modification.
> > The A64FX supports only 128-bit, 256-bit, and 512-bit SVE vector
> > lengths, but the The following process in the arm_cpu_sve_finalize() function
> sets all bits of cpu->sve_vq_map to 1.
> >
> > /* Set all bits not explicitly set within sve-max-vq. */
> > bitmap_complement(tmp, cpu->sve_vq_init, max_vq);
> > bitmap_or(cpu->sve_vq_map, cpu->sve_vq_map, tmp, max_vq);
> >
> > As a result, enter a routine where the following process will be executed.
> >
> > error_setg(errp, "cannot set sve-max-vq=%d", cpu->sve_max_vq);
> 
> This is the correct behavior and the rest of the error message explains why
> 
>  error_append_hint(errp, "This CPU does not support "
>                    "the vector length %d-bits.\n", vq * 128);
> 
> In A64FX's case that will point out the vector length 384-bits, which indeed A64FX
> does not support.
> 
> As I've stated a few different times, the sve-max-vq property is of marginal use, as
> it only works for CPU models that support all vector lengths, including
> non-power-of-2 lengths. This is because the property says to enable all vector
> lengths smaller and including sve-max-vq and to disable all the rest, but if the
> CPU doesn't support non-power-of-2 vector lengths, then sve-max-vq will not
> work for anything other than max-vq=1 and max-vq=2. For this reason, I didn't
> even bring this property over to the 'host' cpu type. I think I once suggested that
> you don't bring it over to A64FX either and I'll suggest that again.
> 
> >
> > Therefore, We have applied the following fix, is this ok?
> 
> No :-)
> 
> Thanks,
> drew
> 
> > ----------------------------------------------------------------------
> > --------------------------
> > --- a/target/arm/cpu.c
> > +++ b/target/arm/cpu.c
> > @@ -1350,10 +1350,12 @@ void arm_cpu_finalize_features(ARMCPU *cpu,
> Error **errp)
> >      Error *local_err = NULL;
> >
> >      if (arm_feature(&cpu->env, ARM_FEATURE_AARCH64)) {
> > -        arm_cpu_sve_finalize(cpu, &local_err);
> > -        if (local_err != NULL) {
> > -            error_propagate(errp, local_err);
> > -            return;
> > +        if (!arm_feature(&cpu->env, ARM_FEATURE_A64FX)) {
> > +            arm_cpu_sve_finalize(cpu, &local_err);
> > +            if (local_err != NULL) {
> > +                error_propagate(errp, local_err);
> > +                return;
> > +            }
> >          }
> > ----------------------------------------------------------------------
> > --------------------------
> >
> > Please your comments if we have misunderstood.
> > Best regards.
> >
> > > -----Original Message-----
> > > From: Andrew Jones <drjones@redhat.com>
> > > Sent: Tuesday, August 24, 2021 1:07 AM
> > > To: qemu-devel@nongnu.org; qemu-arm@nongnu.org
> > > Cc: Ishii, Shuuichirou/石井 周一郎 <ishii.shuuichir@fujitsu.com>;
> > > richard.henderson@linaro.org; peter.maydell@linaro.org;
> > > philmd@redhat.com
> > > Subject: [PATCH v2 0/4] target/arm/cpu: Introduce sve_vq_supported
> > > bitmap
> > >
> > > v2:
> > >  - Completed testing
> > >  - Removed extra space in an error message
> > >  - Added Phil's r-b's
> > >
> > > While reviewing the new A64FX CPU type it became clear that CPU
> > > types should be able to specify which SVE vector lengths are
> > > supported. This series adds a new bitmap member to ARMCPU and
> > > modifies arm_cpu_sve_finalize() to validate inputs against it.
> > > So far we only need to set the bitmap for the 'max' CPU type though
> > > and, since it supports all vector lengths, we just fill the whole thing.
> > >
> > > This series was inspired by Richard Henderson's suggestion to
> > > replace arm_cpu_sve_finalize's kvm_supported bitmap with something
> > > that could be shared with TCG.
> > >
> > > Thanks,
> > > drew
> > >
> > >
> > > Andrew Jones (4):
> > >   target/arm/cpu: Introduce sve_vq_supported bitmap
> > >   target/arm/kvm64: Ensure sve vls map is completely clear
> > >   target/arm/cpu64: Replace kvm_supported with sve_vq_supported
> > >   target/arm/cpu64: Validate sve vector lengths are supported
> > >
> > >  target/arm/cpu.h   |   4 ++
> > >  target/arm/cpu64.c | 118
> > > +++++++++++++++++++++------------------------
> > >  target/arm/kvm64.c |   2 +-
> > >  3 files changed, 61 insertions(+), 63 deletions(-)
> > >
> > > --
> > > 2.31.1
> >