hw/arm: MPAM Emulation + PPTT cache description.

[RFC PATCH 2/5] HACK: target/arm/tcg: Add some more caches to cpu=max

Posted by Jonathan Cameron 1 year, 3 months ago

Used to drive the MPAM cache initialization and to exercise more
of the PPTT cache entry generation code. Perhaps a default
L3 cache is acceptable for max?

Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
---
 target/arm/tcg/cpu64.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/target/arm/tcg/cpu64.c b/target/arm/tcg/cpu64.c
index 8019f00bc3..2af67739f6 100644
--- a/target/arm/tcg/cpu64.c
+++ b/target/arm/tcg/cpu64.c
@@ -711,6 +711,17 @@ void aarch64_max_tcg_initfn(Object *obj)
     uint64_t t;
     uint32_t u;
 
+    /*
+     * Expanded cache set
+     */
+    cpu->clidr = 0x8204923; /* 4 4 4 4 3 in 3 bit fields */
+    cpu->ccsidr[0] = 0x000000ff0000001aull; /* 64KB L1 dcache */
+    cpu->ccsidr[1] = 0x000000ff0000001aull; /* 64KB L1 icache */
+    cpu->ccsidr[2] = 0x000007ff0000003aull; /* 1MB L2 unified cache */
+    cpu->ccsidr[4] = 0x000007ff0000007cull; /* 8MB L3 cache 256B line */
+    cpu->ccsidr[6] = 0x00007fff0000007cull; /* 128MB L4 cache 256B line */
+    cpu->ccsidr[8] = 0x0007ffff0000007cull; /* 2048MB L5 cache 256B line */
+
     /*
      * Reset MIDR so the guest doesn't mistake our 'max' CPU type for a real
      * one and try to apply errata workarounds or use impdef features we
@@ -828,6 +839,7 @@ void aarch64_max_tcg_initfn(Object *obj)
     t = FIELD_DP64(t, ID_AA64MMFR2, BBM, 2);      /* FEAT_BBM at level 2 */
     t = FIELD_DP64(t, ID_AA64MMFR2, EVT, 2);      /* FEAT_EVT */
     t = FIELD_DP64(t, ID_AA64MMFR2, E0PD, 1);     /* FEAT_E0PD */
+    t = FIELD_DP64(t, ID_AA64MMFR2, CCIDX, 1);    /* FEAT_CCIDX */
     cpu->isar.id_aa64mmfr2 = t;
 
     t = cpu->isar.id_aa64zfr0;
-- 
2.39.2

Re: [RFC PATCH 2/5] HACK: target/arm/tcg: Add some more caches to cpu=max

Posted by Alex Bennée 1 year, 3 months ago

Jonathan Cameron <Jonathan.Cameron@huawei.com> writes:

> Used to drive the MPAM cache initialization and to exercise more
> of the PPTT cache entry generation code. Perhaps a default
> L3 cache is acceptable for max?
>
> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> ---
>  target/arm/tcg/cpu64.c | 12 ++++++++++++
>  1 file changed, 12 insertions(+)
>
> diff --git a/target/arm/tcg/cpu64.c b/target/arm/tcg/cpu64.c
> index 8019f00bc3..2af67739f6 100644
> --- a/target/arm/tcg/cpu64.c
> +++ b/target/arm/tcg/cpu64.c
> @@ -711,6 +711,17 @@ void aarch64_max_tcg_initfn(Object *obj)
>      uint64_t t;
>      uint32_t u;
>  
> +    /*
> +     * Expanded cache set
> +     */
> +    cpu->clidr = 0x8204923; /* 4 4 4 4 3 in 3 bit fields */
> +    cpu->ccsidr[0] = 0x000000ff0000001aull; /* 64KB L1 dcache */
> +    cpu->ccsidr[1] = 0x000000ff0000001aull; /* 64KB L1 icache */
> +    cpu->ccsidr[2] = 0x000007ff0000003aull; /* 1MB L2 unified cache */
> +    cpu->ccsidr[4] = 0x000007ff0000007cull; /* 8MB L3 cache 256B line */
> +    cpu->ccsidr[6] = 0x00007fff0000007cull; /* 128MB L4 cache 256B line */
> +    cpu->ccsidr[8] = 0x0007ffff0000007cull; /* 2048MB L5 cache 256B line */
> +

I think Peter in another thread wondered if we should have a generic
function for expanding the cache idr registers based on an abstract lane
definition. 
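
For illustration only, a minimal sketch of what such a helper could look
like. It assumes the 64-bit CCSIDR layout that applies when FEAT_CCIDX is
implemented (LineSize in bits [2:0], Associativity in [23:3], NumSets in
[55:32]). ExampleCacheGeom and the example_* functions are hypothetical
names, not existing QEMU code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical abstract description of one cache; not a QEMU type. */
typedef struct ExampleCacheGeom {
    uint32_t line_bytes; /* power of two, 16..2048 */
    uint32_t ways;       /* associativity, >= 1 */
    uint32_t sets;       /* number of sets, >= 1 */
} ExampleCacheGeom;

/*
 * Encode a geometry in the FEAT_CCIDX CCSIDR layout:
 *   LineSize      [2:0]   = log2(line bytes) - 4
 *   Associativity [23:3]  = ways - 1
 *   NumSets       [55:32] = sets - 1
 */
static uint64_t example_ccsidr_encode(ExampleCacheGeom g)
{
    uint32_t log2_line = 0;

    assert(g.line_bytes >= 16 && g.line_bytes <= 2048);
    assert((g.line_bytes & (g.line_bytes - 1)) == 0);
    assert(g.ways >= 1 && g.ways <= (1u << 21));
    assert(g.sets >= 1 && g.sets <= (1u << 24));

    while ((1u << log2_line) < g.line_bytes) {
        log2_line++;
    }
    return (uint64_t)(log2_line - 4)
         | ((uint64_t)(g.ways - 1) << 3)
         | ((uint64_t)(g.sets - 1) << 32);
}

/* Inverse of example_ccsidr_encode(): recover the geometry from a value. */
static ExampleCacheGeom example_ccsidr_decode(uint64_t ccsidr)
{
    ExampleCacheGeom g;

    g.line_bytes = 1u << ((ccsidr & 0x7) + 4);
    g.ways = (uint32_t)((ccsidr >> 3) & 0x1fffff) + 1;
    g.sets = (uint32_t)((ccsidr >> 32) & 0xffffff) + 1;
    return g;
}

/* Total cache size in bytes: line size x ways x sets. */
static uint64_t example_cache_bytes(ExampleCacheGeom g)
{
    return (uint64_t)g.line_bytes * g.ways * g.sets;
}
```

Under these assumptions, the two L1 values in the patch decode to
64 bytes x 4 ways x 256 sets = 64KB, and a low word of 0x7c decodes to a
256-byte line with 16 ways.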

>      /*
>       * Reset MIDR so the guest doesn't mistake our 'max' CPU type for a real
>       * one and try to apply errata workarounds or use impdef features we
> @@ -828,6 +839,7 @@ void aarch64_max_tcg_initfn(Object *obj)
>      t = FIELD_DP64(t, ID_AA64MMFR2, BBM, 2);      /* FEAT_BBM at level 2 */
>      t = FIELD_DP64(t, ID_AA64MMFR2, EVT, 2);      /* FEAT_EVT */
>      t = FIELD_DP64(t, ID_AA64MMFR2, E0PD, 1);     /* FEAT_E0PD */
> +    t = FIELD_DP64(t, ID_AA64MMFR2, CCIDX, 1);    /* FEAT_CCIDX */
>      cpu->isar.id_aa64mmfr2 = t;
>  
>      t = cpu->isar.id_aa64zfr0;


-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro

Re: [RFC PATCH 2/5] HACK: target/arm/tcg: Add some more caches to cpu=max

Posted by Jonathan Cameron 1 year, 3 months ago

On Mon, 14 Aug 2023 11:13:58 +0100
Alex Bennée <alex.bennee@linaro.org> wrote:

> Jonathan Cameron <Jonathan.Cameron@huawei.com> writes:
> 
> > Used to drive the MPAM cache initialization and to exercise more
> > of the PPTT cache entry generation code. Perhaps a default
> > L3 cache is acceptable for max?
> >
> > Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> > ---
> >  target/arm/tcg/cpu64.c | 12 ++++++++++++
> >  1 file changed, 12 insertions(+)
> >
> > diff --git a/target/arm/tcg/cpu64.c b/target/arm/tcg/cpu64.c
> > index 8019f00bc3..2af67739f6 100644
> > --- a/target/arm/tcg/cpu64.c
> > +++ b/target/arm/tcg/cpu64.c
> > @@ -711,6 +711,17 @@ void aarch64_max_tcg_initfn(Object *obj)
> >      uint64_t t;
> >      uint32_t u;
> >  
> > +    /*
> > +     * Expanded cache set
> > +     */
> > +    cpu->clidr = 0x8204923; /* 4 4 4 4 3 in 3 bit fields */
> > +    cpu->ccsidr[0] = 0x000000ff0000001aull; /* 64KB L1 dcache */
> > +    cpu->ccsidr[1] = 0x000000ff0000001aull; /* 64KB L1 icache */
> > +    cpu->ccsidr[2] = 0x000007ff0000003aull; /* 1MB L2 unified cache */
> > +    cpu->ccsidr[4] = 0x000007ff0000007cull; /* 8MB L3 cache 256B line */
> > +    cpu->ccsidr[6] = 0x00007fff0000007cull; /* 128MB L4 cache 256B line */
> > +    cpu->ccsidr[8] = 0x0007ffff0000007cull; /* 2048MB L5 cache 256B line */
> > +  
> 
> I think Peter in another thread wondered if we should have a generic
> function for expanding the cache idr registers based on an abstract lane
> definition. 
> 

Great!

This response?
https://lore.kernel.org/qemu-devel/CAFEAcA_Lzj1LEutMro72fCfqiCWtOpd+5b-YPcfKv8Bg1f+rCg@mail.gmail.com/

That might get us somewhere, but ultimately I think we need a general way to
push this stuff in as parameters of the CPU, or a CPU definition with a wide
enough set of caches to let us poke the boundaries and hang a typical MPAM
setup off it. Would people mind adding at least an L3 to max? L4 and above
are useful for checking that the PPTT building code works, but that's
probably more a development-time activity than an everyday one.
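
As a rough sketch of the "parameters" idea, the CLIDR value could be
derived from a per-level list of cache types rather than hard-coded. This
assumes the architectural CLIDR_EL1 layout (3-bit Ctype<n> fields from
bit 0, LoUIS in [23:21], LoC in [26:24], LoUU in [29:27]).
example_clidr_build() and the EX_CTYPE_* names are hypothetical, not
existing QEMU code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Ctype<n> encodings from the CLIDR_EL1 register description. */
enum {
    EX_CTYPE_NONE     = 0, /* no cache at this level */
    EX_CTYPE_I        = 1, /* instruction cache only */
    EX_CTYPE_D        = 2, /* data cache only */
    EX_CTYPE_SEPARATE = 3, /* separate instruction and data caches */
    EX_CTYPE_UNIFIED  = 4, /* unified cache */
};

/*
 * Build a CLIDR value from per-level cache types (ctype[0] is level 1)
 * plus the LoUIS, LoC and LoUU level fields.
 */
static uint32_t example_clidr_build(const uint8_t *ctype, size_t levels,
                                    uint32_t louis, uint32_t loc,
                                    uint32_t louu)
{
    uint32_t clidr = 0;

    assert(levels <= 7);
    assert(louis <= 7 && loc <= 7 && louu <= 7);

    for (size_t i = 0; i < levels; i++) {
        assert(ctype[i] <= EX_CTYPE_UNIFIED);
        clidr |= (uint32_t)ctype[i] << (3 * i);
    }
    clidr |= louis << 21;
    clidr |= loc << 24;
    clidr |= louu << 27;
    return clidr;
}
```

With level types {3, 4, 4, 4, 4}, LoUIS = 1, LoC = 0 and LoUU = 1 this
sketch reproduces the 0x8204923 used in the patch.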

Jonathan



> >      /*
> >       * Reset MIDR so the guest doesn't mistake our 'max' CPU type for a real
> >       * one and try to apply errata workarounds or use impdef features we
> > @@ -828,6 +839,7 @@ void aarch64_max_tcg_initfn(Object *obj)
> >      t = FIELD_DP64(t, ID_AA64MMFR2, BBM, 2);      /* FEAT_BBM at level 2 */
> >      t = FIELD_DP64(t, ID_AA64MMFR2, EVT, 2);      /* FEAT_EVT */
> >      t = FIELD_DP64(t, ID_AA64MMFR2, E0PD, 1);     /* FEAT_E0PD */
> > +    t = FIELD_DP64(t, ID_AA64MMFR2, CCIDX, 1);    /* FEAT_CCIDX */
> >      cpu->isar.id_aa64mmfr2 = t;
> >  
> >      t = cpu->isar.id_aa64zfr0;  
> 
>

Re: [RFC PATCH 2/5] HACK: target/arm/tcg: Add some more caches to cpu=max

Posted by Richard Henderson 1 year, 3 months ago

On 8/23/23 07:59, Jonathan Cameron wrote:
> On Mon, 14 Aug 2023 11:13:58 +0100
> Alex Bennée <alex.bennee@linaro.org> wrote:
> 
>> Jonathan Cameron <Jonathan.Cameron@huawei.com> writes:
>>
>>> Used to drive the MPAM cache initialization and to exercise more
>>> of the PPTT cache entry generation code. Perhaps a default
>>> L3 cache is acceptable for max?
>>>
>>> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
>>> ---
>>>   target/arm/tcg/cpu64.c | 12 ++++++++++++
>>>   1 file changed, 12 insertions(+)
>>>
>>> diff --git a/target/arm/tcg/cpu64.c b/target/arm/tcg/cpu64.c
>>> index 8019f00bc3..2af67739f6 100644
>>> --- a/target/arm/tcg/cpu64.c
>>> +++ b/target/arm/tcg/cpu64.c
>>> @@ -711,6 +711,17 @@ void aarch64_max_tcg_initfn(Object *obj)
>>>       uint64_t t;
>>>       uint32_t u;
>>>   
>>> +    /*
>>> +     * Expanded cache set
>>> +     */
>>> +    cpu->clidr = 0x8204923; /* 4 4 4 4 3 in 3 bit fields */
>>> +    cpu->ccsidr[0] = 0x000000ff0000001aull; /* 64KB L1 dcache */
>>> +    cpu->ccsidr[1] = 0x000000ff0000001aull; /* 64KB L1 icache */
>>> +    cpu->ccsidr[2] = 0x000007ff0000003aull; /* 1MB L2 unified cache */
>>> +    cpu->ccsidr[4] = 0x000007ff0000007cull; /* 8MB L3 cache 256B line */
>>> +    cpu->ccsidr[6] = 0x00007fff0000007cull; /* 128MB L4 cache 256B line */
>>> +    cpu->ccsidr[8] = 0x0007ffff0000007cull; /* 2048MB L5 cache 256B line */
>>> +
>>
>> I think Peter in another thread wondered if we should have a generic
>> function for expanding the cache idr registers based on an abstract lane
>> definition.
>>
> 
> Great!
> 
> This response?
> https://lore.kernel.org/qemu-devel/CAFEAcA_Lzj1LEutMro72fCfqiCWtOpd+5b-YPcfKv8Bg1f+rCg@mail.gmail.com/

Followed up with

https://lore.kernel.org/qemu-devel/20230811214031.171020-6-richard.henderson@linaro.org/


r~