[v3] Support for new CPU model SapphireRapids

[PATCH v3 0/6] Support for new CPU model SapphireRapids

Posted by Lei Wang 3 years, 1 month ago

This series adds a new CPU model, SapphireRapids, and tries to
address the problem stated in
https://lore.kernel.org/all/20220812055751.14553-1-lei4.wang@intel.com/T/#mcf67dbd1ad37c65d7988c36a2b267be9afd2fb30,
so that a named CPU model can define its own AMX values and QEMU won't
pass the wrong AMX values to KVM on future platforms that support
different values.

The original patch is
https://lore.kernel.org/all/20220812055751.14553-1-lei4.wang@intel.com/T/#u.

---

Changelog:

v3:
 - Rebase on the latest QEMU (d1852caab131ea898134fdcea8c14bc2ee75fbe9).
 - v2: https://lore.kernel.org/qemu-devel/20221102085256.81139-1-lei4.wang@intel.com/

v2:
 - Fix QEMU warning about unsupported features when all AMX-related
   CPUID values are zero.
 - Remove unnecessary function definition and make code cleaner.
 - Fix some typos.
 - v1: https://lore.kernel.org/qemu-devel/20221027020036.373140-1-lei4.wang@intel.com/T/#t


Lei Wang (6):
  i386: Introduce FeatureWordInfo for AMX CPUID leaf 0x1D and 0x1E
  i386: Remove unused parameter "uint32_t bit" in
    feature_word_description()
  i386: Introduce new struct "MultiBitFeatureInfo" for multi-bit
    features
  i386: Mask and report unavailable multi-bit feature values
  i386: Initialize AMX CPUID leaves with corresponding env->features[]
    leaves
  i386: Add new CPU model SapphireRapids

 target/i386/cpu-internal.h |  11 ++
 target/i386/cpu.c          | 311 +++++++++++++++++++++++++++++++++++--
 target/i386/cpu.h          |  16 ++
 3 files changed, 322 insertions(+), 16 deletions(-)


base-commit: d1852caab131ea898134fdcea8c14bc2ee75fbe9
-- 
2.34.1

Re: [PATCH v3 0/6] Support for new CPU model SapphireRapids

Posted by Igor Mammedov 3 years ago

On Fri,  6 Jan 2023 00:38:20 -0800
Lei Wang <lei4.wang@intel.com> wrote:

> This series aims to add a new CPU model SapphireRapids, and tries to
> address the problem stated in
> https://lore.kernel.org/all/20220812055751.14553-1-lei4.wang@intel.com/T/#mcf67dbd1ad37c65d7988c36a2b267be9afd2fb30,
> so that named CPU model can define its own AMX values, and QEMU won't
> pass the wrong AMX values to KVM in future platforms if they have
> different values supported.
> 
> The original patch is
> https://lore.kernel.org/all/20220812055751.14553-1-lei4.wang@intel.com/T/#u.

MultiBitFeatureInfo looks like an interesting idea, but besides
fixing whatever issues it has at the moment, you'd probably need
to introduce a new qdev_bitfield property infrastructure so that
such features can be treated like any other qdev properties.
Also, when MultiBitFeatureInfo is added, all other use cases where
it is applicable should be converted (not only new code), so that
we end up with a consolidated approach instead of a zoo.

I'm not sure all the MultiBitFeatureInfo complexity is necessary
just for adding a new CPU model; I'd rather use the ad hoc approach
that we have been using for non-boolean features.

MultiBitFeatureInfo & co could then be developed as a separate
series that demonstrates how much it improves the current
CPU model definitions.

PS:
 'make check-acceptance' is broken with this series


Re: [PATCH v3 0/6] Support for new CPU model SapphireRapids

Posted by Robert Hoo 2 years, 11 months ago

On Thu, 2023-02-02 at 12:05 +0100, Igor Mammedov wrote:
> MultiBitFeatureInfo looks like an interesting
> idea 

Yeah, we can feel how much effort Lei spent on this.

> but among fixing whatever issues this has atm,
> you'd probably need to introduce a new  qdev_bitfield property
> infrastructure so that such features could be treated like any
> other qdev properties.
> Also when MultiBitFeatureInfo is added, one should convert all
> other usecases where it's applicable (not only for new code)
> so that we would end up with consolidated approach instead of
> zoo mess.
> 
> I'm not sure all MultiBitFeatureInfo complexity is necessary

Kinda ack.

> just for adding a new CPU model, I'd rather use ad hoc approach
> that we were using before for non boolean features.
> 
> And then try to develop MultiBitFeatureInfo & co as a separate
> series to demonstrate how much it will improve current
> cpu models definitions.
> 

A CPUID word isn't always bitwise, i.e. with each bit representing a
feature; this isn't new.

e.g.
CPUID.01H:EBX[bits 23:16] -- Maximum number of addressable IDs for logical
processors in this physical package
CPUID.04H
etc.
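
To make the "numbers to decode" point concrete, here is a minimal C sketch of pulling such a multi-bit field out of a register value. The helper names cpuid_field() and max_logical_ids() are hypothetical and exist only for this illustration; they are not QEMU functions.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, not a QEMU API: return the `width`-bit field
 * that starts at bit `lo` of a 32-bit CPUID register value.
 */
static uint32_t cpuid_field(uint32_t reg, unsigned lo, unsigned width)
{
    assert(width >= 1 && width <= 32 && lo + width <= 32);
    /* Avoid shifting a 32-bit value by 32, which is undefined in C. */
    uint32_t mask = (width == 32) ? UINT32_MAX
                                  : ((UINT32_C(1) << width) - 1);
    return (reg >> lo) & mask;
}

/*
 * CPUID.01H:EBX[23:16]: maximum number of addressable IDs for logical
 * processors in this physical package. The field is a number, not a
 * set of independent feature flags.
 */
static uint32_t max_logical_ids(uint32_t ebx)
{
    return cpuid_field(ebx, 16, 8);
}
```

For an EBX value of 0x00100800, max_logical_ids() returns 0x10, i.e. the eight bits are read together as one number.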

And interestingly, among so many CPUID leaves (each of which in
turn contains the *words* EAX, EBX, ECX, EDX), only a few have a
corresponding feature word defined in

typedef enum FeatureWord {
    FEAT_1_EDX,
    FEAT_1_ECX,
    ...
} FeatureWord;

Why?

Those CPUID return values are not *feature words (names)*; they're numbers to
decode, strings to be interpreted, etc. The same goes for CPUID.1DH/1EH, I
think.
Why can't we handle them the way we handle CPUID.04H?

Re: [PATCH v3 0/6] Support for new CPU model SapphireRapids

Posted by Xiaoyao Li 3 years ago

On 2/2/2023 7:05 PM, Igor Mammedov wrote:
> On Fri,  6 Jan 2023 00:38:20 -0800
> Lei Wang <lei4.wang@intel.com> wrote:
> 
>> This series aims to add a new CPU model SapphireRapids, and tries to
>> address the problem stated in
>> https://lore.kernel.org/all/20220812055751.14553-1-lei4.wang@intel.com/T/#mcf67dbd1ad37c65d7988c36a2b267be9afd2fb30,
>> so that named CPU model can define its own AMX values, and QEMU won't
>> pass the wrong AMX values to KVM in future platforms if they have
>> different values supported.
>>
>> The original patch is
>> https://lore.kernel.org/all/20220812055751.14553-1-lei4.wang@intel.com/T/#u.
> 
> MultiBitFeatureInfo looks like an interesting
> idea but among fixing whatever issues this has atm,
> you'd probably need to introduce a new  qdev_bitfield property
> infrastructure so that such features could be treated like any
> other qdev properties.
> Also when MultiBitFeatureInfo is added, one should convert all
> other usecases where it's applicable (not only for new code)
> so that we would end up with consolidated approach instead of
> zoo mess.
> 
> I'm not sure all MultiBitFeatureInfo complexity is necessary
> just for adding a new CPU model, I'd rather use ad hoc approach
> that we were using before for non boolean features.

We have to introduce MultiBitFeatureInfo for the SPR CPU model if AMX is
supposed to be included in it. In fact, MultiBitFeatureInfo
should have been introduced when AMX virtualization support was added to
QEMU. That is, the current AMX virtualization design is problematic, just like
Intel-PT virtualization.

Ideally, this series would be split into two: 1) fix AMX virtualization (by
introducing MultiBitFeatureInfo), 2) define the SPR CPU model.

> 
> And then try to develop MultiBitFeatureInfo & co as a separate
> series to demonstrate how much it will improve current
> cpu models definitions.
> 
> PS:
>   'make check-acceptance' are broken with this
> 

Re: [PATCH v3 0/6] Support for new CPU model SapphireRapids

Posted by Wang, Lei 3 years ago

On 2/2/2023 7:05 PM, Igor Mammedov wrote:
> On Fri,  6 Jan 2023 00:38:20 -0800
> Lei Wang <lei4.wang@intel.com> wrote:
> 
>> This series aims to add a new CPU model SapphireRapids, and tries to
>> address the problem stated in
>> https://lore.kernel.org/all/20220812055751.14553-1-lei4.wang@intel.com/T/#mcf67dbd1ad37c65d7988c36a2b267be9afd2fb30,
>> so that named CPU model can define its own AMX values, and QEMU won't
>> pass the wrong AMX values to KVM in future platforms if they have
>> different values supported.
>>
>> The original patch is
>> https://lore.kernel.org/all/20220812055751.14553-1-lei4.wang@intel.com/T/#u.
> 
> MultiBitFeatureInfo looks like an interesting
> idea but among fixing whatever issues this has atm,
> you'd probably need to introduce a new  qdev_bitfield property
> infrastructure so that such features could be treated like any
> other qdev properties.
> Also when MultiBitFeatureInfo is added, one should convert all
> other usecases where it's applicable (not only for new code)
> so that we would end up with consolidated approach instead of
> zoo mess.
> 
> I'm not sure all MultiBitFeatureInfo complexity is necessary
> just for adding a new CPU model, I'd rather use ad hoc approach
> that we were using before for non boolean features.

Hi, Igor. I do not quite understand what the "ad hoc approach" means.
Currently, suppose we specify for a CPU model a multi-bit, non-boolean CPUID
value that differs from the host value. For example, consider the following
scenario:

- KVM **ONLY** supports value 5 (101), and
- the QEMU user wants to pass value 3 (011) to it.

Following the current logic:

    uint64_t unavailable_features = requested_features & ~host_feat;

then:

1. The warning message will be messy and unintuitive:

requested_features bit 1 is 1 and host_feat bit 1 is 0, so QEMU will warn about
this bit, which is meaningless on its own.


2. Some CPUID bits will "leak" into the final CPUID passed to KVM:

requested_features bit 0 is 1 and host_feat bit 0 is also 1, so this CPUID bit
is passed to KVM. The requested_features value is 3 (011), but we finally get 1
(001), which is not what we want.

Am I understanding it correctly?
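
The arithmetic in the scenario above can be checked with a small standalone C sketch. The helper names are hypothetical and are not QEMU functions; the sketch only reproduces the per-bit expression quoted above and assumes that the bits it flags are then dropped from the request.

```c
#include <stdint.h>

/* Hypothetical helpers for illustration only; not part of QEMU. */

/* Per-bit view: bits that are requested but not set in the host value. */
static uint64_t unavailable_bits(uint64_t requested, uint64_t host)
{
    return requested & ~host;
}

/*
 * Assumption: the flagged bits are cleared from the request and the
 * rest is passed on unchanged.
 */
static uint64_t value_after_filter(uint64_t requested, uint64_t host)
{
    return requested & ~unavailable_bits(requested, host);
}
```

With host = 5 (101) and requested = 3 (011), unavailable_bits() yields 2, so only bit 1 is reported, and value_after_filter() yields 1 (001), which is neither the requested value nor the value the host supports.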

> 
> And then try to develop MultiBitFeatureInfo & co as a separate
> series to demonstrate how much it will improve current
> cpu models definitions.
> 
> PS:
>  'make check-acceptance' are broken with this
> 

Re: [PATCH v3 0/6] Support for new CPU model SapphireRapids

Posted by Igor Mammedov 2 years, 11 months ago

On Tue, 7 Feb 2023 10:50:56 +0800
"Wang, Lei" <lei4.wang@intel.com> wrote:

> On 2/2/2023 7:05 PM, Igor Mammedov wrote:
> > On Fri,  6 Jan 2023 00:38:20 -0800
> > Lei Wang <lei4.wang@intel.com> wrote:
> >   
> >> This series aims to add a new CPU model SapphireRapids, and tries to
> >> address the problem stated in
> >> https://lore.kernel.org/all/20220812055751.14553-1-lei4.wang@intel.com/T/#mcf67dbd1ad37c65d7988c36a2b267be9afd2fb30,
> >> so that named CPU model can define its own AMX values, and QEMU won't
> >> pass the wrong AMX values to KVM in future platforms if they have
> >> different values supported.
> >>
> >> The original patch is
> >> https://lore.kernel.org/all/20220812055751.14553-1-lei4.wang@intel.com/T/#u.  
> > 
> > MultiBitFeatureInfo looks like an interesting
> > idea but among fixing whatever issues this has atm,
> > you'd probably need to introduce a new  qdev_bitfield property
> > infrastructure so that such features could be treated like any
> > other qdev properties.
> > Also when MultiBitFeatureInfo is added, one should convert all
> > other usecases where it's applicable (not only for new code)
> > so that we would end up with consolidated approach instead of
> > zoo mess.
> > 
> > I'm not sure all MultiBitFeatureInfo complexity is necessary
> > just for adding a new CPU model, I'd rather use ad hoc approach
> > that we were using before for non boolean features.  
> 
> Hi, Igor. I do not quite understand what does the "ad hoc approach" mean,

By ad hoc I mean: instead of introducing MultiBitFeatureInfo,
try to open-code fixups and checks for the AMX properties
(we do have a number of such CPUID 'features').
For example, look at [x]level (it's just a case of MultiBitFeatureInfo that takes 32 bits).
Yes, that would be ugly, but much less complicated than new infrastructure.

And when/if  MultiBitFeatureInfo is ready for usage, you can convert
cpu models to it (AMX and all other 'legacy' features that were open coded).
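
As a rough illustration of what an open-coded check could look like, here is a standalone C sketch that compares a multi-bit field as a whole value instead of bit by bit. Everything in it is an assumption made for illustration: the FieldCheck struct, the check_multibit_field() name, and the policy of falling back to the host value are not taken from QEMU or from this series, and the sketch assumes the host supports exactly one value for the field.

```c
#include <inttypes.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical result type, for illustration only. */
typedef struct FieldCheck {
    uint32_t value;    /* value that would be exposed to the guest */
    bool mismatch;     /* true if the request could not be honoured */
} FieldCheck;

/*
 * Hypothetical open-coded check: treat the field as one number.
 * Assumed policy: on mismatch, warn once about the whole field and
 * fall back to the single value the host supports.
 */
static FieldCheck check_multibit_field(const char *name,
                                       uint32_t requested, uint32_t host)
{
    FieldCheck r = { .value = requested, .mismatch = false };

    if (requested != host) {
        fprintf(stderr,
                "warning: %s: requested value %" PRIu32
                ", host supports %" PRIu32 "\n",
                name, requested, host);
        r.value = host;
        r.mismatch = true;
    }
    return r;
}
```

Called with requested = 3 and host = 5, this reports one mismatch for the whole field rather than a warning about an individual bit, and no partial value such as 1 is produced.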

> currently if we specify a multi-bit non-boolean CPUID value which is different
> from the host value to CPU model, e.g., consider the following scenario:
> 
> - KVM **ONLY** supports value 5 (101) and,
> - QEMU user want to pass value 3 (011) to it,
> 
> and follow the current logic:
> 
>     uint64_t unavailable_features = requested_features & ~host_feat;
> 
> then:
> 
> 1. The warning message will be messy and not intuitive:
> 
> requested_features bit 1 is 1 and host_feat bit 1 is 0, so it will warn on this
> non-sense bit.
> 
> 
> 2. Some CPUID bits will "leak" into the final CPUID passed to KVM:
> 
> requested_features bit 0 is 1 and host_feat bit 0 is also 1, so it will pass
> this CPUID bit to host, the request_features value is 3 (011), finally we get 1
> (001), this is not we want.
> 
> Am I understanding it correctly?
> 
> > 
> > And then try to develop MultiBitFeatureInfo & co as a separate
> > series to demonstrate how much it will improve current
> > cpu models definitions.
> > 
> > PS:
> >  'make check-acceptance' are broken with this
> >   