[v3] target/i386/cpu: introduce new CPU models for x86-64 ABI levels

[PATCH v3 0/4] target/i386/cpu: introduce new CPU models for x86-64 ABI levels

Posted by Daniel P. Berrangé 2 years, 10 months ago

This series is motivated by this blog that describes how RHEL-9
will recommend use of the x86-64-v2 microarchitectural ABI level:

  https://developers.redhat.com/blog/2021/01/05/building-red-hat-enterprise-linux-9-for-the-x86-64-v2-microarchitecture-level/

The implication of compiling code with -march=x86-64-v2 is that
this code will no longer be guaranteed to be runnable on a
number of the CPU models exposed by the x86_64 target emulator,
most notably qemu64 which is the default.
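
To make that implication concrete, here is a minimal sketch (not part
of this series) of how one might check whether a Linux system
advertises the features that x86-64-v2 adds over the baseline. The
flag spellings are the ones Linux prints in /proc/cpuinfo, where SSE3
appears as "pni"; the helper names are hypothetical.

```python
# Features that x86-64-v2 adds on top of the x86-64 baseline, spelled
# the way Linux reports them in /proc/cpuinfo (SSE3 is listed as "pni").
X86_64_V2_EXTRA = {"cx16", "lahf_lm", "popcnt", "pni",
                   "sse4_1", "sse4_2", "ssse3"}


def parse_cpuinfo_flags(text):
    """Return the flag set from the first 'flags' line of cpuinfo text."""
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def missing_for_v2(flags):
    """Return the x86-64-v2 features absent from the given flag set."""
    return X86_64_V2_EXTRA - set(flags)


def host_missing_for_v2(path="/proc/cpuinfo"):
    """Hypothetical convenience wrapper: run the check on this machine."""
    with open(path) as f:
        return missing_for_v2(parse_cpuinfo_flags(f.read()))
```

Run inside a guest, a non-empty result from host_missing_for_v2()
would suggest that binaries built with -march=x86-64-v2 are not
guaranteed to run on the CPU model that guest was given.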

This series is not proposing to change the QEMU default CPU model
for x86_64 target. I show how this can trivially be done in
patch 4, but am not suggesting that we actually do that, as upstream
is quite conservative in dropping support for old host hardware.

New CPU models
==============

It is instead exploring the possibilities of defining new CPU
models in QEMU to closely match the x86-64 uarch ABI levels in
a relatively vendor agnostic manner. This could be used by
downstream vendors who wish to target specific uarch ABI levels
in custom machine types.

One of the nice things about "qemu64" is that its naming presents
it as effectively being a vendor-neutral model (if we ignore that
vendor=AMD is in fact reported in CPUID).

If we look at the feature set of the x86-64-v2 ABI, we see that the
QEMU "Nehalem" model is the closest match. This also happens
to be runnable on AMD Opteron G4/G5 and EPYC hosts. Nonetheless,
the use of an Intel specific CPU model name on an AMD host feels
uncomfortable.

Vendor neutral naming
=====================

The idea behind this series is thus to introduce new CPU model
names with vendor neutral naming, to more directly correlate
with defined x86-64 uarch ABI levels. We don't want to just
invent CPUs with a completely arbitrary set of CPU features as
history has shown that brings its own problems, e.g. a guest
uses features A and B, but only does CPUID checks for existence
of feature B, assuming that B implies A.

The specification for x86-64 ABI levels uses x86-64-vNN naming
but this clashes with QEMU's use of "vNN" for versioning. I
felt it would be confusing to end up with CPU model names
like "x86-64-v1-v1". Thus I've used an "-abiNN" suffix
instead. Also note I have an "-abi1" suffix here for the
baseline. Arguably we could just drop the suffix entirely for
the baseline.

A further note is that we could have used "qemu64" as the
naming prefix, e.g. qemu64-abi2, qemu64-abi3, etc. A lot of
people hold negative opinions of the qemu64 model in general
though, so I felt a clean break with the past might be
desirable, even though the x86-64-abi1 CPU model is effectively
identical to qemu64.

Runnability of new models
=========================

The goal of the x86-64-abiNN CPU models is that they should be
runnable on any physical host which supports the CPUID features
for that uarch ABI level. It is hard to figure out what exact
set of CPUID features we should report. The uarch ABI document
only specifies the minimum expectation, but we can't define a
CPU in QEMU using only the minimum set as that would omit
countless important features.

Thus to determine the feature set to use in x86-64-abiNN CPU
models, this series used a script (see patch 3) which did the
following:

  * Find list of current CPU models that satisfy ABI NN
  * From this list, calculate the lowest common denominator (LCD)
    of CPUID features
  * From the LCD, find the existing CPU model that has the fewest
    extra features over the LCD.
  * Use that closest match, minus the extra features
    to define x86-64-abiNN
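
The steps above can be sketched roughly as follows. This is a toy
illustration rather than the script from this series: the function
name, the model names and the feature sets are all made up, and CPU
models are reduced to plain sets of feature names.

```python
def derive_abi_model(models, abi_required):
    """Pick a base model and feature set for one ABI level.

    models:       dict mapping CPU model name -> set of feature names
    abi_required: set of feature names the ABI level mandates
    Returns (base_model_name, features_for_new_model, dropped_extras).
    """
    # Step 1: keep only the models that satisfy the ABI level.
    candidates = {name: set(feats) for name, feats in models.items()
                  if abi_required <= set(feats)}
    if not candidates:
        raise ValueError("no CPU model satisfies this ABI level")

    # Step 2: the lowest common denominator (LCD) is the set of
    # features shared by every candidate.
    lcd = set.intersection(*candidates.values())

    # Step 3: the closest match is the candidate with the fewest
    # features beyond the LCD (ties broken by name for determinism).
    base = min(sorted(candidates), key=lambda n: len(candidates[n] - lcd))

    # Step 4: the new model is that base minus its extra features.
    extras = candidates[base] - lcd
    return base, candidates[base] - extras, extras


# Hypothetical input data, for illustration only.
toy_models = {
    "ModelA": {"sse", "sse2", "popcnt", "aes", "x2apic"},
    "ModelB": {"sse", "sse2", "popcnt", "x2apic", "pcid", "erms"},
    "ModelC": {"sse", "sse2", "popcnt", "svm"},
    "Old":    {"sse", "sse2"},
}
base, features, dropped = derive_abi_model(toy_models,
                                           {"sse", "sse2", "popcnt"})
```

In this simplified set model the resulting feature set is just the
LCD itself; choosing a base model only matters for whatever a real CPU
definition carries beyond its feature flags.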

This approach works well for the baseline ABI and the level 2 ABI.

For ABI level 3 there is a somewhat larger divergence in feature
sets, because the set of features that must be dropped to reach the
LCD of the Intel and EPYC models is relatively large. We omit aes,
pcid, erms, invpcid, tsc-deadline, x2apic and pclmulqdq.

For ABI level 4 the divergence is small. I believe this is a
deceptive situation that arises because only Intel CPUs in QEMU
are currently able to satisfy ABI level 4. None
of the EPYC CPUs defined in QEMU are able to satisfy it.

I'm concerned that if future EPYC CPU generations are added to
QEMU which *do* indeed satisfy x86-64 v4, then QEMU's x86-64-abi4
CPUID may prove to be too greedy and thus be unable to run on
the future CPUs. In theory we can address this through use of
CPU versioning, but ideally we would not have to deal with that.

Nested virt caveats
===================

A further caveat is that none of these generic models define
any VMX feature set. At the very least I believe this means
that nested virt can't live migrate, but maybe this breaks
ability to use nested virt entirely. I'm unclear what the
best approach is to deal with this, that would let the user
do both "-cpu x86-64-abi2,svm=on" and "-cpu x86-64-abi2,vmx=on"

Side channel mitigation caveats
===============================

Finally none of the x86-64-abiNN models include any of the
features related to CPU side channel mitigations. We can't
add them without sacrificing the ability to run on some
physical hardware that would otherwise satisfy the uarch
ABI level.

This makes me a little uneasy. One of the main downsides
of using "qemu64" is that it leaves guests vulnerable to the
side channel attacks. Introducing new general purpose CPUs
that still have this flaw feels undesirable. I don't know
whether to prioritize safety, while sacrificing runnability.

Crypto accelerator caveats
==========================

Similarly I'm not a huge fan of leaving out the "aes"
instruction for accelerated crypto, as missing "aes" is
also one of the key factors in making qemu64 a bad choice.

If we include 'aes' in x86-64-abi2, then we lose support
for Nehalem hosts.

If we include 'aes' in x86-64-abi3 then we further lose
support for Dhyana hosts (an EPYC derived CPU).

Wrap up
=======

Overall this series shows some CPU models for QEMU that
can map to each x86-64 ABI level, but I'm unsure whether
it is wise to actually go forward with this idea given
the various caveats.

I think they'll mostly only be useful as a built-in default
for machine types. If anyone is going to explicitly specify
a CPU model they'll almost always be better off picking a
vendor specific model or host passthrough. There can be
cases where people have a mix of AMD and Intel bare metal
machines they need portability across but this feels rare.

Whatever else comes of this series, at least patch 1 should
be useful, giving people an idea of what ABI levels each
QEMU CPU can satisfy in the documentation.

Changed in v3:

 - Rebased to resolve conflicts with git master

Changed in v2:

 - Don't document any of the CPU aliases, since they vary per
   machine type, only the concrete types.
 - Formally add the script for re-generating the docs

Daniel P. Berrangé (4):
  docs: add a table showing x86-64 ABI compatibility levels
  target/i386: define CPU models to model x86-64 ABI levels
  scripts: helper to generate x86_64 CPU ABI compat info
  NOT FOR MERGE target/i386: use x86-64-abi1 CPU model as default on
    x86_64

 MAINTAINERS                        |   2 +-
 docs/system/cpu-models-x86-abi.csv |  75 +++++++++++
 docs/system/cpu-models-x86.rst.inc |  22 ++++
 hw/i386/pc_piix.c                  |   3 +
 hw/i386/pc_q35.c                   |   3 +
 scripts/cpu-x86-uarch-abi.py       | 194 +++++++++++++++++++++++++++++
 target/i386/cpu.c                  | 156 +++++++++++++++++++++++
 target/i386/cpu.h                  |   2 +-
 8 files changed, 455 insertions(+), 2 deletions(-)
 create mode 100644 docs/system/cpu-models-x86-abi.csv
 create mode 100644 scripts/cpu-x86-uarch-abi.py

-- 
2.31.1

Re: [PATCH v3 0/4] target/i386/cpu: introduce new CPU models for x86-64 ABI levels

Posted by Eduardo Habkost 2 years, 10 months ago

On Mon, Jun 07, 2021 at 02:58:39PM +0100, Daniel P. Berrangé wrote:
> The goal of the x86-64-abiNN CPU models is that they should be
> runnable on any physical host which supports the CPUID features
> for that uarch ABI level. It is hard to figure out what exact
> set of CPUID features we should report. The uarch ABI document
> only specifies the minimum expectation, but we can't define a
> CPU in QEMU using only the minimum set as that would omit
> countless important features.
> 

Do you have a list of features that were not in the ABI document
but were included in the CPU models you've added?  What exactly
makes them important enough for us, but not important enough for
the ABI level specification writers?


In patch 2/3 you wrote:

| Historically we've found that defining CPU models with an arbitrary
| combination of CPU features can be problematic, as some guest OS
| will not check all features they use, and instead assume that if
| they see feature "XX", then "YY" will always exist. This is reasonable
| in bare metal, but subject to breakage in virtualization.

Do we know how likely that is?  Any examples where this has
happened?

What if we simply used the raw list of features from the
specification, add those as CPU models experimentally, and let
people report issues if they find them?  Too risky?  Not useful
in practice?

Maybe we could provide both a "x86-64-abiNN" and a
"x86-64-abiNN-minimal" CPU model.  We could even use the -minimal
CPU model definition to write a test case that ensures the
algorithm you describe below is always being followed.  The name
should make it clear that the CPU model is more conservative but
less likely to be useful.


> Thus to determine the feature set to use in x86-64-abiNN CPU
> models, this series used a script (see patch 3) which did the
> following:
> 
>   * Find list of current CPU models that satisfy ABI NN
>   * From this list, calculate the lowest common denominator (LCD)
>     of CPUID features
>   * From the LCD, find the existing CPU model that has the fewest
>     extra features over the LCD.
>   * Use that closest match, minus the extra features
>     to define x86-64-abiNN
> 
> This approach works well for the baseline ABI and the level 2 ABI.
> 
> For ABI level 3 there is a somewhat larger divergence in feature
> sets, because the set of features that must be dropped to reach the
> LCD of the Intel and EPYC models is relatively large. We omit aes,
> pcid, erms, invpcid, tsc-deadline, x2apic and pclmulqdq.
> 
> For ABI level 4 the divergence is small. I believe this is a
> deceptive situation that arises because only Intel CPUs in QEMU
> are currently able to satisfy ABI level 4. None
> of the EPYC CPUs defined in QEMU are able to satisfy it.
> 
> I'm concerned that if future EPYC CPU generations are added to
> QEMU which *do* indeed satisfy x86-64 v4, then QEMU's x86-64-abi4
> CPUID may prove to be too greedy and thus be unable to run on
> the future CPUs. In theory we can address this through use of
> CPU versioning, but ideally we would not have to deal with that.

If no AMD CPU model satisfies v4 yet, maybe we should avoid adding
any feature that is not present in any known AMD CPUs, unless
they are already required by the ABI level specification.
Otherwise, the list is likely to contain Intel features that AMD
has no plans to implement at all.

Or maybe we should just refrain from adding a x86-64-abi4 CPU
model until we have a CPU model from AMD that satisfies v4.

> 
> Nested virt caveats
> ===================
> 
> A further caveat is that none of these generic models define
> any VMX feature set. At the very least I believe this means
> that nested virt can't live migrate, but maybe this breaks
> ability to use nested virt entirely. I'm unclear what the
> best approach is to deal with this, that would let the user
> do both "-cpu x86-64-abi2,svm=on" and "-cpu x86-64-abi2,vmx=on"

I don't get this part.  Why can't the CPU model include a VMX
feature set that is the LCD of the VMX-capable CPUs that satisfy
abiNN?  The VMX feature set is supposed to be just a default when
using vmx=on.

> 
> Side channel mitigation caveats
> ===============================
> 
> Finally none of the x86-64-abiNN models include any of the
> features related to CPU side channel mitigations. We can't
> add them without sacrificing the ability to run on some
> physical hardware that would otherwise satisfy the uarch
> ABI level.
> 
> This makes me a little uneasy. One of the main downsides
> of using "qemu64" is that it leaves guests vulnerable to the
> side channel attacks. Introducing new general purpose CPUs
> that still have this flaw feels undesirable. I don't know
> whether to prioritize safety, while sacrificing runnability.

That's the biggest caveat for me.

I believe the default should prioritize safety.  I believe you
implicitly established that runnability is not an absolute
requirement when you added features that are required by the ABI
level specification.


> 
> Crypto accelerator caveats
> ==========================
> 
> Similarly I'm not a huge fan of leaving out the "aes"
> instruction for accelerated crypto, as missing "aes" is
> also one of the key factors in making qemu64 a bad choice.
> 
> If we include 'aes' in x86-64-abi2, then we lose support
> for Nehalem hosts.
> 
> If we include 'aes' in x86-64-abi3 then we further lose
> support for Dhyana hosts (an EPYC derived CPU).
> 
> Wrap up
> =======
> 
> Overall this series shows some CPU models for QEMU that
> can map to each x86-64 ABI level, but I'm unsure whether
> it is wise to actually go forward with this idea given
> the various caveats.
> 
> I think they'll mostly only be useful as a built-in default
> for machine types. If anyone is going to explicitly specify
> a CPU model they'll almost always be better off picking a
> vendor specific model or host passthrough. There can be
> cases where people have a mix of AMD and Intel bare metal
> machines they need portability across but this feels rare.

I would prefer to address the problem of poor defaults the
following way:

- We should start requiring management software to pass an explicit
  live-migration-policy argument if live migration needs to work
  (after documenting that and waiting for the deprecation period,
  of course).
- If live migration is not required, we can default to host
  passthrough.
- If live migration is required, we should stop providing a
  default CPU model, and require it to be configured
  explicitly.
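
As a rough sketch of that policy (an illustration only, not existing
QEMU behaviour): the function name and its parameters are
hypothetical, and the live-migration-policy argument is the proposed
one from the first bullet, not something QEMU accepts today. "host"
is the existing name for host passthrough.

```python
def pick_cpu_model(live_migration_required, cpu_model=None):
    """Resolve the CPU model under the proposed defaults policy.

    live_migration_required: what management software would declare
                             through the proposed policy argument
    cpu_model:               explicitly configured model, if any
    """
    # An explicit choice always wins.
    if cpu_model is not None:
        return cpu_model

    # Without live migration there is no portability constraint, so
    # host passthrough is a reasonable default.
    if not live_migration_required:
        return "host"

    # With live migration there is not enough information to guess a
    # good model, so refuse instead of silently picking a poor one.
    raise ValueError("live migration requires an explicit CPU model")
```

For example, pick_cpu_model(False) falls back to host passthrough,
while pick_cpu_model(True) with no model is rejected.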

QEMU doesn't have enough information to choose a good CPU model
default when live migration is required.  Even if we add the
x86-64-abiNN CPU models in this series, we have no way to decide
if abi1, abi2, abi3, or abi4 is a better default, do we?

That doesn't mean the CPU models you are adding aren't useful.
They may still be useful for management software or for
downstream vendors to choose their own defaults.


-- 
Eduardo

Re: [PATCH v3 0/4] target/i386/cpu: introduce new CPU models for x86-64 ABI levels

Posted by Daniel P. Berrangé 2 years, 10 months ago

On Mon, Jun 07, 2021 at 06:33:10PM -0300, Eduardo Habkost wrote:
> On Mon, Jun 07, 2021 at 02:58:39PM +0100, Daniel P. Berrangé wrote:
> > The goal of the x86-64-abiNN CPU models is that they should be
> > runnable on any physical host which supports the CPUID features
> > for that uarch ABI level. It is hard to figure out what exact
> > set of CPUID features we should report. The uarch ABI document
> > only specifies the minimum expectation, but we can't define a
> > CPU in QEMU using only the minimum set as that would omit
> > countless important features.
> > 
> 
> Do you have a list of features that were not in the ABI document
> but were included in the CPU models you've added?  What exactly
> makes them important enough for us, but not important enough for
> the ABI level specification writers?

The ABI specification isn't trying to define a real CPU, so it
doesn't define a full set of features. It is merely setting a
minimum bar, against which vendor specific CPUs are evaluated
for compliance. IOW, there are multiple AMD and Intel CPUs that
satisfy x86-64-abi1. They will all have a distinct set of
features, but share a common core. The ABI spec is fairly
focused on the various SIMD matrix instructions in particular.

In this QEMU patch meanwhile, we are attempting to define a
real CPU, so need to spec more than just the core featureset.

> In patch 2/3 you wrote:
> 
> | Historically we've found that defining CPU models with an arbitrary
> | combination of CPU features can be problematic, as some guest OS
> | will not check all features they use, and instead assume that if
> | they see feature "XX", then "YY" will always exist. This is reasonable
> | in bare metal, but subject to breakage in virtualization.
> 
> Do we know how likely that is?  Any examples where this has
> happened?

I don't have specific bugs off hand. I just know that the traditional
way libvirt implemented host-model by taking a base model and turning
on/off arbitrary features has been a source of bugs.

> What if we simply used the raw list of features from the
> specification, add those as CPU models experimentally, and let
> people report issues if they find them?  Too risky?  Not useful
> in practice?

Taking ABI level 1.

The combination of level 0 and level 1 ABI specifies
these features:

{'popcnt', 'sse4.1', 'sse', 'mmx', 'ssse3', 'pni', 'cx16', 'fpu', 'sse2', 'sse4.2', 'cx8', 'syscall', 'fxsr', 'cmov', 'lahf-lm'}

The Nehalem CPU which I used as a basis for this level provides:

{'hypervisor', 'pge', 'mmx', 'ssse3', 'kvm-nopiodelay', 'pse', 'de', 'sep', 'tsc', 'nx', 'popcnt', 'mtrr', 'kvm-steal-time', 'apic', 'sse2', 'clflush', 'fxsr', 'sse4.1', 'sse', 'cx16', 'syscall', 'x2apic', 'kvmclock-stable-bit', 'pni', 'kvm-pv-eoi', 'kvmclock', 'kvm-asyncpf', 'vme', 'mce', 'msr', 'mca', 'sse4.2', 'fpu', 'cx8', 'lm', 'pse36', 'pat', 'cmov', 'lahf-lm', 'pae'}

So if we removed all features except those explicitly documented
by the ABI spec we lose:

{'hypervisor', 'pge', 'pse', 'kvm-nopiodelay', 'de', 'sep', 'tsc', 'nx', 'mtrr', 'kvm-steal-time', 'apic', 'clflush', 'x2apic', 'kvmclock-stable-bit', 'kvm-pv-eoi', 'kvmclock', 'kvm-asyncpf', 'vme', 'mce', 'msr', 'mca', 'pse36', 'lm', 'pat', 'pae'}

Many (most) of these are pretty important features IMHO. I don't
think you can define a useful CPU without them.
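
The lost set above is simply the set difference of the two feature
lists. A small sketch to reproduce it, using the feature names exactly
as quoted in this mail (the variable names are only illustrative):

```python
# Feature names as QEMU spells them, copied from the two sets above.
abi_features = {
    "popcnt", "sse4.1", "sse", "mmx", "ssse3", "pni", "cx16", "fpu",
    "sse2", "sse4.2", "cx8", "syscall", "fxsr", "cmov", "lahf-lm",
}
nehalem_features = {
    "hypervisor", "pge", "mmx", "ssse3", "kvm-nopiodelay", "pse", "de",
    "sep", "tsc", "nx", "popcnt", "mtrr", "kvm-steal-time", "apic",
    "sse2", "clflush", "fxsr", "sse4.1", "sse", "cx16", "syscall",
    "x2apic", "kvmclock-stable-bit", "pni", "kvm-pv-eoi", "kvmclock",
    "kvm-asyncpf", "vme", "mce", "msr", "mca", "sse4.2", "fpu", "cx8",
    "lm", "pse36", "pat", "cmov", "lahf-lm", "pae",
}

# Every feature the ABI spec names is already provided by Nehalem ...
assert abi_features <= nehalem_features

# ... so a spec-only model would lose exactly the remainder.
lost = nehalem_features - abi_features
print(len(lost), "features lost:", " ".join(sorted(lost)))
```

The dropped names include basics such as "lm", "nx" and "apic".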


> Maybe we could provide both a "x86-64-abiNN" and a
> "x86-64-abiNN-minimal" CPU model.  We could even use the -minimal
> CPU model definition to write a test case that ensures the
> algorithm you describe below is always being followed.  The name
> should make it clear that the CPU model is more conservative but
> less likely to be useful.

I don't think the minimal is usable in practice.

> > Thus to determine the feature set to use in x86-64-abiNN CPU
> > models, this series used a script (see patch 3) which did the
> > following:
> > 
> >   * Find list of current CPU models that satisfy ABI NN
> >   * From this list, calculate the lowest common denominator (LCD)
> >     of CPUID features
> >   * From the LCD, find the existing CPU model that has the fewest
> >     extra features over the LCD.
> >   * Use that closest match, minus the extra features
> >     to define x86-64-abiNN
> > 
> > This approach works well for the baseline ABI and the level 2 ABI.
> > 
> > For ABI level 3 there is a somewhat larger divergence in feature
> > sets, because the set of features that must be dropped to reach the
> > LCD of the Intel and EPYC models is relatively large. We omit aes,
> > pcid, erms, invpcid, tsc-deadline, x2apic and pclmulqdq.
> > 
> > For ABI level 4 the divergence is small. I believe this is a
> > deceptive situation that arises because only Intel CPUs in QEMU
> > are currently able to satisfy ABI level 4. None
> > of the EPYC CPUs defined in QEMU are able to satisfy it.
> > 
> > I'm concerned that if future EPYC CPU generations are added to
> > QEMU which *do* indeed satisfy x86-64 v4, then QEMU's x86-64-abi4
> > CPUID may prove to be too greedy and thus be unable to run on
> > the future CPUs. In theory we can address this through use of
> > CPU versioning, but ideally we would not have to deal with that.
> 
> If no AMD CPU model satisfies v4 yet, maybe we should avoid adding
> any feature that is not present in any known AMD CPUs, unless
> they are already required by the ABI level specification.
> Otherwise, the list is likely to contain Intel features that AMD
> has no plans to implement at all.
> 
> Or maybe we should just refrain from adding a x86-64-abi4 CPU
> model until we have a CPU model from AMD that satisfies v4.

I'd probably go for the latter option and just ignore v4.


> > Nested virt caveats
> > ===================
> > 
> > A further caveat is that none of these generic models define
> > any VMX feature set. At the very least I believe this means
> > that nested virt can't live migrate, but maybe this breaks
> > ability to use nested virt entirely. I'm unclear what the
> > best approach is to deal with this, that would let the user
> > do both "-cpu x86-64-abi2,svm=on" and "-cpu x86-64-abi2,vmx=on"
> 
> I don't get this part.  Why can't the CPU model include a VMX
> feature set that is the LCD of the VMX-capable CPUs that satisfy
> abiNN?  The VMX feature set is supposed to be just a default when
> using vmx=on.

Maybe I'm mis-interpreting things. I didn't think it made sense
to define all the VMX capabilities, if the CPU was later intended
to be used with svm=on, but it sounds like you're saying this is
harmless.

> > 
> > Side channel mitigation caveats
> > ===============================
> > 
> > Finally none of the x86-64-abiNN models include any of the
> > features related to CPU side channel mitigations. We can't
> > add them without sacrificing the ability to run on some
> > physical hardware that would otherwise satisfy the uarch
> > ABI level.
> > 
> > This makes me a little uneasy. One of the main downsides
> > of using "qemu64" is that it leaves guests vulnerable to the
> > side channel attacks. Introducing new general purpose CPUs
> > that still have this flaw feels undesirable. I don't know
> > whether to prioritize safety, while sacrificing runnability.
> 
> That's the biggest caveat for me.
> 
> I believe the default should prioritize safety.  I believe you
> implicitly established that runnability is not an absolute
> requirement when you added features that are required by the ABI
> level specification.


The challenge with the side channel mitigations is that the flags
to be exposed are rather vendor specific, so I don't see a way
to define a single CPU that works for both Intel and AMD hosts
and includes the side channel mitigations.

> > Crypto accelerator caveats
> > ==========================
> > 
> > Similarly I'm not a huge fan of leaving out the "aes"
> > instruction for accelerated crypto, as missing "aes" is
> > also one of the key factors in making qemu64 a bad choice.
> > 
> > If we include 'aes' in x86-64-abi2, then we lose support
> > for Nehalem hosts.
> > 
> > If we include 'aes' in x86-64-abi3 then we further lose
> > support for Dhyana hosts (an EPYC derived CPU).
> > 
> > Wrap up
> > =======
> > 
> > Overall this series shows some CPU models for QEMU that
> > can map to each x86-64 ABI level, but I'm unsure whether
> > it is wise to actually go forward with this idea given
> > the various caveats.
> > 
> > I think they'll mostly only be useful as a built-in default
> > for machine types. If anyone is going to explicitly specify
> > a CPU model they'll almost always be better off picking a
> > vendor specific model or host passthrough. There can be
> > cases where people have a mix of AMD and Intel bare metal
> > machines they need portability across but this feels rare.
> 
> I would prefer to address the problem of poor defaults the
> following way:
> 
> - We should start requiring management software to pass an explicit
>   live-migration-policy argument if live migration needs to work
>   (after documenting that and waiting for the deprecation period,
>   of course).
> - If live migration is not required, we can default to host
>   passthrough.
> - If live migration is required, we should stop providing a
>   default CPU model, and require it to be configured
>   explicitly.

I don't think this will make a significant difference, because
libvirt will end up saying live migration is required for
everything it starts, unless libvirt has in turn been told by
a mgmt app that it doesn't need live migration. The biggest
mgmt apps I know (oVirt, OpenStack, KubeVirt) all want live
migration in general. There may be some cases where live
migration is not desired, but I expect it would be 10% or less
of deployments.

> QEMU doesn't have enough information to choose a good CPU model
> default when live migration is required.  Even if we add the
> x86-64-abiNN CPU models in this series, we have no way to decide
> if abi1, abi2, abi3, or abi4 is a better default, do we?

Nope, QEMU doesn't have enough info to choose. The only way I
can see is if the OS vendor decides what their baseline is
going to be. eg Fedora explicitly supports all CPUs right back
to first generation x86_64 chips, so to provide runnability on
all possible Fedora hosts, we need qemu64.  In RHEL-9, it was
decided to compile everything with march=x86_64-abi1, so we
can guarantee that x86_64-abi1 is always runnable on any RHEL9
host, and thus can use that instead of qemu64. 


FWIW, I'm totally fine with saying that this concept of defining
CPUs for x86_64 ABI levels is a failed experiment.

To me the real killers are the lack of side channel mitigations
and lack of AES accelerator instructions. Without those, it just
doesn't feel like we're offering a big enough step up over the
qemu64 CPU. We can't add the side channels as it feels too
vendor specific, and we can't add the AES accel, because there
are physical CPUs satisfying abi1, which don't have AES.

At least the documentation of ABI compatibility levels is useful
though.

Given that my x86_64-abi1 CPU is a copy of Nehalem, if RHEL9
did want a newer CPU default, it could just pick Nehalem as
that.

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|