[PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode

Babu Moger posted 8 patches 3 years, 8 months ago
Test docker-quick@centos7 failed
Test docker-mingw@fedora failed
Test checkpatch failed
Test FreeBSD failed
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/qemu tags/patchew/159804762216.39954.15502128500494116468.stgit@naples-babu.amd.com
Maintainers: Eduardo Habkost <ehabkost@redhat.com>, Paolo Bonzini <pbonzini@redhat.com>, "Michael S. Tsirkin" <mst@redhat.com>, Richard Henderson <rth@twiddle.net>, Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
There is a newer version of this series
[PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode
Posted by Babu Moger 3 years, 8 months ago
To support some complex topologies, we introduced EPYC mode apicid decode.
However, the EPYC mode decode is running into problems, and it could become
quite a maintenance problem in the future. So it was decided to remove that
code and use the generic decode, which works for the majority of topologies.
Most SPECed configurations would work just fine. With some non-SPECed user
inputs, it will create a sub-optimal configuration.
Here is the discussion thread.
https://lore.kernel.org/qemu-devel/c0bcc1a6-1d84-a6e7-e468-d5b437c1b254@amd.com/

This series removes all the EPYC mode specific apicid changes and uses the
generic apicid decode.

---
v5:
 Revert EPYC specific decode.
 Simplify CPUID_8000_001E

v4:
  https://lore.kernel.org/qemu-devel/159744083536.39197.13827776633866601278.stgit@naples-babu.amd.com/
  Not much of a change. Just added a few text changes.
  Error out on the configuration instead of warning if dies are not configured in EPYC.
  A few other text changes to clarify the removal of node_id, nr_nodes and nodes_per_pkg.

v3:
  https://lore.kernel.org/qemu-devel/159681772267.9679.1334429994189974662.stgit@naples-babu.amd.com/#r
  Added a new check to pass the dies for EPYC numa configuration.
  Added Simplify CPUID_8000_001E patch with some changes suggested by Igor.
  Dropped the patch to build the topology from CpuInstanceProperties.
  TODO: Not sure if we still need the Autonuma changes Igor mentioned.
  Needs more clarity on that.

v2:
  https://lore.kernel.org/qemu-devel/159362436285.36204.986406297373871949.stgit@naples-babu.amd.com/
  Used the numa information from CpuInstanceProperties for building
  the apic_id suggested by Igor.
  Also did some minor code rearrangement to take care of changes.
  Dropped the patch "Simplify CPUID_8000_001E" from v1. Will send
  it later.

v1:
 https://lore.kernel.org/qemu-devel/159164739269.20543.3074052993891532749.stgit@naples-babu.amd.com

Babu Moger (8):
      hw/i386: Remove node_id, nr_nodes and nodes_per_pkg from topology
      Revert "i386: Fix pkg_id offset for EPYC cpu models"
      Revert "target/i386: Enable new apic id encoding for EPYC based cpus models"
      Revert "hw/i386: Move arch_id decode inside x86_cpus_init"
      Revert "i386: Introduce use_epyc_apic_id_encoding in X86CPUDefinition"
      Revert "hw/i386: Introduce apicid functions inside X86MachineState"
      Revert "hw/386: Add EPYC mode topology decoding functions"
      i386: Simplify CPUID_8000_001E for AMD


 hw/i386/pc.c               |    8 +--
 hw/i386/x86.c              |   43 +++-------------
 include/hw/i386/topology.h |  101 ---------------------------------------
 include/hw/i386/x86.h      |    9 ---
 target/i386/cpu.c          |  115 ++++++++++++++++----------------------------
 target/i386/cpu.h          |    3 -
 tests/test-x86-cpuid.c     |   40 ++++++++-------
 7 files changed, 73 insertions(+), 246 deletions(-)

--
Signature

Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode
Posted by Dr. David Alan Gilbert 3 years, 8 months ago
* Babu Moger (babu.moger@amd.com) wrote:
> To support some of the complex topology, we introduced EPYC mode apicid decode.
> But, EPYC mode decode is running into problems. Also it can become quite a
> maintenance problem in the future. So, it was decided to remove that code and
> use the generic decode which works for majority of the topology. Most of the
> SPECed configuration would work just fine. With some non-SPECed user inputs,
> it will create some sub-optimal configuration.
> Here is the discussion thread.
> https://lore.kernel.org/qemu-devel/c0bcc1a6-1d84-a6e7-e468-d5b437c1b254@amd.com/
> 
> This series removes all the EPYC mode specific apicid changes and use the generic
> apicid decode.

Hi Babu,
  This does simplify things a lot!
One worry: what happens with a live migration of a VM from an old qemu
that was using the node-id to a qemu with this new scheme?

Dave

-- 
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK


Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode
Posted by Babu Moger 3 years, 8 months ago
Hi Dave,

On 8/24/20 1:41 PM, Dr. David Alan Gilbert wrote:
> * Babu Moger (babu.moger@amd.com) wrote:
>> To support some of the complex topology, we introduced EPYC mode apicid decode.
>> But, EPYC mode decode is running into problems. Also it can become quite a
>> maintenance problem in the future. So, it was decided to remove that code and
>> use the generic decode which works for majority of the topology. Most of the
>> SPECed configuration would work just fine. With some non-SPECed user inputs,
>> it will create some sub-optimal configuration.
>> Here is the discussion thread.
>> https://lore.kernel.org/qemu-devel/c0bcc1a6-1d84-a6e7-e468-d5b437c1b254@amd.com/
>>
>> This series removes all the EPYC mode specific apicid changes and use the generic
>> apicid decode.
> 
> Hi Babu,
>   This does simplify things a lot!
> One worry, what happens about a live migration of a VM from an old qemu
> that was using the node-id to a qemu with this new scheme?

The node_id which we introduced was only used internally. This wasn't
exposed outside. I don't think live migration will be an issue.

Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode
Posted by Dr. David Alan Gilbert 3 years, 8 months ago
* Babu Moger (babu.moger@amd.com) wrote:
> Hi Dave,
> 
> On 8/24/20 1:41 PM, Dr. David Alan Gilbert wrote:
> > * Babu Moger (babu.moger@amd.com) wrote:
> >> To support some of the complex topology, we introduced EPYC mode apicid decode.
> >> But, EPYC mode decode is running into problems. Also it can become quite a
> >> maintenance problem in the future. So, it was decided to remove that code and
> >> use the generic decode which works for majority of the topology. Most of the
> >> SPECed configuration would work just fine. With some non-SPECed user inputs,
> >> it will create some sub-optimal configuration.
> >> Here is the discussion thread.
> >> https://lore.kernel.org/qemu-devel/c0bcc1a6-1d84-a6e7-e468-d5b437c1b254@amd.com/
> >>
> >> This series removes all the EPYC mode specific apicid changes and use the generic
> >> apicid decode.
> > 
> > Hi Babu,
> >   This does simplify things a lot!
> > One worry, what happens about a live migration of a VM from an old qemu
> > that was using the node-id to a qemu with this new scheme?
> 
> The node_id which we introduced was only used internally. This wasn't
> exposed outside. I don't think live migration will be an issue.

Didn't it become part of the APIC ID visible to the guest?

Dave

-- 
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK


Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode
Posted by Igor Mammedov 3 years, 8 months ago
On Tue, 25 Aug 2020 09:15:04 +0100
"Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:

> * Babu Moger (babu.moger@amd.com) wrote:
> > Hi Dave,
> > 
> > On 8/24/20 1:41 PM, Dr. David Alan Gilbert wrote:  
> > > * Babu Moger (babu.moger@amd.com) wrote:  
> > >> To support some of the complex topology, we introduced EPYC mode apicid decode.
> > >> But, EPYC mode decode is running into problems. Also it can become quite a
> > >> maintenance problem in the future. So, it was decided to remove that code and
> > >> use the generic decode which works for majority of the topology. Most of the
> > >> SPECed configuration would work just fine. With some non-SPECed user inputs,
> > >> it will create some sub-optimal configuration.
> > >> Here is the discussion thread.
> > >> https://lore.kernel.org/qemu-devel/c0bcc1a6-1d84-a6e7-e468-d5b437c1b254@amd.com/
> > >>
> > >> This series removes all the EPYC mode specific apicid changes and use the generic
> > >> apicid decode.  
> > > 
> > > Hi Babu,
> > >   This does simplify things a lot!
> > > One worry, what happens about a live migration of a VM from an old qemu
> > > that was using the node-id to a qemu with this new scheme?  
> > 
> > The node_id which we introduced was only used internally. This wasn't
> > exposed outside. I don't think live migration will be an issue.  
> 
> Didn't it become part of the APIC ID visible to the guest?

Daniel asked a similar question wrt the hard error on startup
when the CLI is not sufficient to create an EPYC CPU.

https://www.mail-archive.com/qemu-devel@nongnu.org/msg728536.html

Migration might fall into the same category.
Also looking at the history, 5.0 commit 
  247b18c593ec29 target/i386: Enable new apic id encoding for EPYC based cpus models
silently broke the APIC ID (without versioning) for all EPYC models (there were 1 new and 1 old one).

(I'm not aware of somebody complaining about it)

Another commit ed78467a21459, changed CPUID_8000_001E without versioning as well.


With the current EPYC apicid code, if all stars align (no numa, or only 1 numa node on
the CLI, and no -smp dies=), it might produce a valid CPU (apicid+CPUID_8000_001E).
No numa is a gray area, since the EPYC spec implies that a real EPYC CPU has to be a numa machine.
Multi-node configs would be correct only if the user assigns cpus to numa nodes
by duplicating the internal node_id algorithm that this series removes.

There might be other broken cases that I don't recall anymore
(should be mentioned in previous versions of this series)


To summarize from migration pov (ignoring ed78467a21459 change):

 1) old qemu pre-5.0 ==>  qemu 5.0, 5.1 - broken migration
 2) with this series (let's call it qemu 5.2)
     pre-5.0 ==> qemu 5.2 - should work, as the series basically rolls back the current code to pre-5.0
     qemu 5.0, 5.1 ==> qemu 5.2 - broken

It's all about picking which poison to choose.
I'd prefer the 2nd case, as it lets us drop a lot of complicated code that
doesn't work as expected.

PS:
 I didn't review it yet, but with this series we aren't
 making up internal node_ids that should match user provided numa node ids somehow.
 It seems the series lost the patch that would enforce numa in case of -smp dies>1,
 but otherwise it heads in the right direction.



Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode
Posted by Dr. David Alan Gilbert 3 years, 8 months ago
* Igor Mammedov (imammedo@redhat.com) wrote:
> On Tue, 25 Aug 2020 09:15:04 +0100
> "Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
> 
> > * Babu Moger (babu.moger@amd.com) wrote:
> > > Hi Dave,
> > > 
> > > On 8/24/20 1:41 PM, Dr. David Alan Gilbert wrote:  
> > > > * Babu Moger (babu.moger@amd.com) wrote:  
> > > >> To support some of the complex topology, we introduced EPYC mode apicid decode.
> > > >> But, EPYC mode decode is running into problems. Also it can become quite a
> > > >> maintenance problem in the future. So, it was decided to remove that code and
> > > >> use the generic decode which works for majority of the topology. Most of the
> > > >> SPECed configuration would work just fine. With some non-SPECed user inputs,
> > > >> it will create some sub-optimal configuration.
> > > >> Here is the discussion thread.
> > > >> https://lore.kernel.org/qemu-devel/c0bcc1a6-1d84-a6e7-e468-d5b437c1b254@amd.com/
> > > >>
> > > >> This series removes all the EPYC mode specific apicid changes and use the generic
> > > >> apicid decode.  
> > > > 
> > > > Hi Babu,
> > > >   This does simplify things a lot!
> > > > One worry, what happens about a live migration of a VM from an old qemu
> > > > that was using the node-id to a qemu with this new scheme?  
> > > 
> > > The node_id which we introduced was only used internally. This wasn't
> > > exposed outside. I don't think live migration will be an issue.  
> > 
> > Didn't it become part of the APIC ID visible to the guest?
> 
> Daniel asked similar question wrt hard error on start up,
> when CLI is not sufficient to create EPYC cpu.
> 
> https://www.mail-archive.com/qemu-devel@nongnu.org/msg728536.html
> 
> Migration might fall into the same category.
> Also looking at the history, 5.0 commit 
>   247b18c593ec29 target/i386: Enable new apic id encoding for EPYC based cpus models
> silently broke APIC ID (without versioning), for all EPYC models (that's were 1 new and 1 old one).
> 
> (I'm not aware of somebody complaining about it)
> 
> Another commit ed78467a21459, changed CPUID_8000_001E without versioning as well.
> 
> 
> With current EPYC apicid code, if all stars align (no numa or 1 numa node only on
> CLI and no -smp dies=) it might produce a valid CPU (apicid+CPUID_8000_001E).
> No numa is gray area, since EPYC spec implies that it has to be numa machine in case of real EPYC cpus.
> Multi-node configs would be correct only if user assigns cpus to numa nodes
> by duplicating internal node_id algorithm that this series removes.
> 
> There might be other broken cases that I don't recall anymore
> (should be mentioned in previous versions of this series)
> 
> 
> To summarize from migration pov (ignoring ed78467a21459 change):
> 
>  1) old qemu pre-5.0 ==>  qemu 5.0, 5.1 - broken migration

Oh ....

>  2) with this series (lets call it qemu 5.2)
>      pre-5.0 ==> qemu 5.2 - should work as series basically rollbacks current code to pre-5.0
>      qemu 5.0, 5.1 ==> qemu 5.2 - broken
> 
> It's all about picking which poison to choose,
> I'd prefer 2nd case as it lets drop a lot of complicated code that
> doesn't work as expected.

I think that would make our lives easier for other reasons; so I'm happy
to go with that.

> PS:
>  I didn't review it yet, but with this series we aren't
>  making up internal node_ids that should match user provided numa node ids somehow.
>  It seems series lost the patch that would enforce numa in case -smp dies>1,
>  but otherwise it heads in the right direction.

Dave

-- 
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK


Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode
Posted by Igor Mammedov 3 years, 8 months ago
On Tue, 25 Aug 2020 16:25:21 +0100
"Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:

> * Igor Mammedov (imammedo@redhat.com) wrote:
> > On Tue, 25 Aug 2020 09:15:04 +0100
> > "Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
> >   
> > > * Babu Moger (babu.moger@amd.com) wrote:  
> > > > Hi Dave,
> > > > 
> > > > On 8/24/20 1:41 PM, Dr. David Alan Gilbert wrote:    
> > > > > * Babu Moger (babu.moger@amd.com) wrote:    
> > > > >> To support some of the complex topology, we introduced EPYC mode apicid decode.
> > > > >> But, EPYC mode decode is running into problems. Also it can become quite a
> > > > >> maintenance problem in the future. So, it was decided to remove that code and
> > > > >> use the generic decode which works for majority of the topology. Most of the
> > > > >> SPECed configuration would work just fine. With some non-SPECed user inputs,
> > > > >> it will create some sub-optimal configuration.
> > > > >> Here is the discussion thread.
> > > > >> https://lore.kernel.org/qemu-devel/c0bcc1a6-1d84-a6e7-e468-d5b437c1b254@amd.com/
> > > > >>
> > > > >> This series removes all the EPYC mode specific apicid changes and use the generic
> > > > >> apicid decode.    
> > > > > 
> > > > > Hi Babu,
> > > > >   This does simplify things a lot!
> > > > > One worry, what happens about a live migration of a VM from an old qemu
> > > > > that was using the node-id to a qemu with this new scheme?    
> > > > 
> > > > The node_id which we introduced was only used internally. This wasn't
> > > > exposed outside. I don't think live migration will be an issue.    
> > > 
> > > Didn't it become part of the APIC ID visible to the guest?  
> > 
> > Daniel asked similar question wrt hard error on start up,
> > when CLI is not sufficient to create EPYC cpu.
> > 
> > https://www.mail-archive.com/qemu-devel@nongnu.org/msg728536.html
> > 
> > Migration might fall into the same category.
> > Also looking at the history, 5.0 commit 
> >   247b18c593ec29 target/i386: Enable new apic id encoding for EPYC based cpus models
> > silently broke APIC ID (without versioning), for all EPYC models (that's were 1 new and 1 old one).
> > 
> > (I'm not aware of somebody complaining about it)
> > 
> > Another commit ed78467a21459, changed CPUID_8000_001E without versioning as well.
> > 
> > 
> > With current EPYC apicid code, if all stars align (no numa or 1 numa node only on
> > CLI and no -smp dies=) it might produce a valid CPU (apicid+CPUID_8000_001E).
> > No numa is gray area, since EPYC spec implies that it has to be numa machine in case of real EPYC cpus.
> > Multi-node configs would be correct only if user assigns cpus to numa nodes
> > by duplicating internal node_id algorithm that this series removes.
> > 
> > There might be other broken cases that I don't recall anymore
> > (should be mentioned in previous versions of this series)
> > 
> > 
> > To summarize from migration pov (ignoring ed78467a21459 change):
> > 
> >  1) old qemu pre-5.0 ==>  qemu 5.0, 5.1 - broken migration  
> 
> Oh ....
> 
> >  2) with this series (lets call it qemu 5.2)
> >      pre-5.0 ==> qemu 5.2 - should work as series basically rollbacks current code to pre-5.0
> >      qemu 5.0, 5.1 ==> qemu 5.2 - broken
> > 
> > It's all about picking which poison to choose,
> > I'd prefer 2nd case as it lets drop a lot of complicated code that
> > doesn't work as expected.  
> 
> I think that would make our lives easier for other reasons; so I'm happy
> to go with that.

To make things less painful for users, I wonder if there is a way
to block migration when EPYC and specific QEMU versions are used?



Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode
Posted by Dr. David Alan Gilbert 3 years, 8 months ago
* Igor Mammedov (imammedo@redhat.com) wrote:
> On Tue, 25 Aug 2020 16:25:21 +0100
> "Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
> 
> > * Igor Mammedov (imammedo@redhat.com) wrote:
> > > On Tue, 25 Aug 2020 09:15:04 +0100
> > > "Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
> > >   
> > > > * Babu Moger (babu.moger@amd.com) wrote:  
> > > > > Hi Dave,
> > > > > 
> > > > > On 8/24/20 1:41 PM, Dr. David Alan Gilbert wrote:    
> > > > > > * Babu Moger (babu.moger@amd.com) wrote:    
> > > > > >> To support some of the complex topology, we introduced EPYC mode apicid decode.
> > > > > >> But, EPYC mode decode is running into problems. Also it can become quite a
> > > > > >> maintenance problem in the future. So, it was decided to remove that code and
> > > > > >> use the generic decode which works for majority of the topology. Most of the
> > > > > >> SPECed configuration would work just fine. With some non-SPECed user inputs,
> > > > > >> it will create some sub-optimal configuration.
> > > > > >> Here is the discussion thread.
> > > > > >> https://lore.kernel.org/qemu-devel/c0bcc1a6-1d84-a6e7-e468-d5b437c1b254@amd.com/
> > > > > >>
> > > > > >> This series removes all the EPYC mode specific apicid changes and use the generic
> > > > > >> apicid decode.    
> > > > > > 
> > > > > > Hi Babu,
> > > > > >   This does simplify things a lot!
> > > > > > One worry, what happens about a live migration of a VM from an old qemu
> > > > > > that was using the node-id to a qemu with this new scheme?    
> > > > > 
> > > > > The node_id which we introduced was only used internally. This wasn't
> > > > > exposed outside. I don't think live migration will be an issue.    
> > > > 
> > > > Didn't it become part of the APIC ID visible to the guest?  
> > > 
> > > Daniel asked similar question wrt hard error on start up,
> > > when CLI is not sufficient to create EPYC cpu.
> > > 
> > > https://www.mail-archive.com/qemu-devel@nongnu.org/msg728536.html
> > > 
> > > Migration might fall into the same category.
> > > Also looking at the history, 5.0 commit 
> > >   247b18c593ec29 target/i386: Enable new apic id encoding for EPYC based cpus models
> > > silently broke APIC ID (without versioning), for all EPYC models (that's were 1 new and 1 old one).
> > > 
> > > (I'm not aware of somebody complaining about it)
> > > 
> > > Another commit ed78467a21459, changed CPUID_8000_001E without versioning as well.
> > > 
> > > 
> > > With current EPYC apicid code, if all stars align (no numa or 1 numa node only on
> > > CLI and no -smp dies=) it might produce a valid CPU (apicid+CPUID_8000_001E).
> > > No numa is gray area, since EPYC spec implies that it has to be numa machine in case of real EPYC cpus.
> > > Multi-node configs would be correct only if user assigns cpus to numa nodes
> > > by duplicating internal node_id algorithm that this series removes.
> > > 
> > > There might be other broken cases that I don't recall anymore
> > > (should be mentioned in previous versions of this series)
> > > 
> > > 
> > > To summarize from migration pov (ignoring ed78467a21459 change):
> > > 
> > >  1) old qemu pre-5.0 ==>  qemu 5.0, 5.1 - broken migration  
> > 
> > Oh ....
> > 
> > >  2) with this series (lets call it qemu 5.2)
> > >      pre-5.0 ==> qemu 5.2 - should work as series basically rollbacks current code to pre-5.0
> > >      qemu 5.0, 5.1 ==> qemu 5.2 - broken
> > > 
> > > It's all about picking which poison to choose,
> > > I'd prefer 2nd case as it lets drop a lot of complicated code that
> > > doesn't work as expected.  
> > 
> > I think that would make our lives easier for other reasons; so I'm happy
> > to go with that.
> 
> to make things less painful for users, me wonders if there is a way
> to block migration if epyc and specific QEMU versions are used?

We have no way to block based on version - and that's a pretty painful
thing to do; we can block based on machine type.

But before we get there, can we understand in which combinations
things break and why exactly? Would it break on a 1 or 2 vCPU guest,
or would it only break when we get to the point where the upper bits start
being used, for example?  Why exactly would it break, i.e. is it going
to change the name of sections in the migration stream, or are the
values we need actually going to migrate OK?

Dave


-- 
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK


Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode
Posted by Igor Mammedov 3 years, 8 months ago
On Wed, 26 Aug 2020 15:10:46 +0100
"Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:

> * Igor Mammedov (imammedo@redhat.com) wrote:
> > On Tue, 25 Aug 2020 16:25:21 +0100
> > "Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
> > 
> > > * Igor Mammedov (imammedo@redhat.com) wrote:
> > > > On Tue, 25 Aug 2020 09:15:04 +0100
> > > > "Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
> > > >   
> > > > > * Babu Moger (babu.moger@amd.com) wrote:  
> > > > > > Hi Dave,
> > > > > > 
> > > > > > On 8/24/20 1:41 PM, Dr. David Alan Gilbert wrote:    
> > > > > > > * Babu Moger (babu.moger@amd.com) wrote:    
> > > > > > >> To support some of the complex topology, we introduced EPYC mode apicid decode.
> > > > > > >> But, EPYC mode decode is running into problems. Also it can become quite a
> > > > > > >> maintenance problem in the future. So, it was decided to remove that code and
> > > > > > >> use the generic decode which works for majority of the topology. Most of the
> > > > > > >> SPECed configuration would work just fine. With some non-SPECed user inputs,
> > > > > > >> it will create some sub-optimal configuration.
> > > > > > >> Here is the discussion thread.
> > > > > > >> https://lore.kernel.org/qemu-devel/c0bcc1a6-1d84-a6e7-e468-d5b437c1b254@amd.com/
> > > > > > >>
> > > > > > >> This series removes all the EPYC mode specific apicid changes and use the generic
> > > > > > >> apicid decode.    
> > > > > > > 
> > > > > > > Hi Babu,
> > > > > > >   This does simplify things a lot!
> > > > > > > One worry, what happens about a live migration of a VM from an old qemu
> > > > > > > that was using the node-id to a qemu with this new scheme?    
> > > > > > 
> > > > > > The node_id which we introduced was only used internally. This wasn't
> > > > > > exposed outside. I don't think live migration will be an issue.    
> > > > > 
> > > > > Didn't it become part of the APIC ID visible to the guest?  
> > > > 
> > > > Daniel asked similar question wrt hard error on start up,
> > > > when CLI is not sufficient to create EPYC cpu.
> > > > 
> > > > https://www.mail-archive.com/qemu-devel@nongnu.org/msg728536.html
> > > > 
> > > > Migration might fall into the same category.
> > > > Also looking at the history, 5.0 commit 
> > > >   247b18c593ec29 target/i386: Enable new apic id encoding for EPYC based cpus models
> > > > silently broke APIC ID (without versioning), for all EPYC models (that's were 1 new and 1 old one).
> > > > 
> > > > (I'm not aware of somebody complaining about it)
> > > > 
> > > > Another commit ed78467a21459, changed CPUID_8000_001E without versioning as well.
> > > > 
> > > > 
> > > > With current EPYC apicid code, if all stars align (no numa or 1 numa node only on
> > > > CLI and no -smp dies=) it might produce a valid CPU (apicid+CPUID_8000_001E).
> > > > No numa is gray area, since EPYC spec implies that it has to be numa machine in case of real EPYC cpus.
> > > > Multi-node configs would be correct only if user assigns cpus to numa nodes
> > > > by duplicating internal node_id algorithm that this series removes.
> > > > 
> > > > There might be other broken cases that I don't recall anymore
> > > > (should be mentioned in previous versions of this series)
> > > > 
> > > > 
> > > > To summarize from migration pov (ignoring ed78467a21459 change):
> > > > 
> > > >  1) old qemu pre-5.0 ==>  qemu 5.0, 5.1 - broken migration  
> > > 
> > > Oh ....
> > > 
> > > >  2) with this series (lets call it qemu 5.2)
> > > >      pre-5.0 ==> qemu 5.2 - should work as series basically rollbacks current code to pre-5.0
> > > >      qemu 5.0, 5.1 ==> qemu 5.2 - broken
> > > > 
> > > > It's all about picking which poison to choose,
> > > > I'd preffer 2nd case as it lets drop a lot of complicated code that
> > > > doesn't work as expected.  
> > > 
> > > I think that would make our lives easier for other reasons; so I'm happy
> > > to go with that.
> > 
> > to make things less painful for users, me wonders if there is a way
> > to block migration if epyc and specific QEMU versions are used?
> 
> We have no way to block based on version - and that's a pretty painful
> thing to do; we can block based on machine type.
> 
> But before we get there; can we understand in which combinations that
> things break and why exactly - would it break on a 1 or 2 vCPU guest -
> or would it only break when we get to the point the upper bits start
> being used for example?  Why exaclty would it break - i.e. is it going
> to change the name of sections in the migration stream - or are the
> values we need actually going to migrate OK?

It's the values of the APIC ID: QEMU 4.2 and 5.0 use different values
when numa is enabled.
I'd expect the guest to be very confused when this happens.

Here is an example:
qemu-4.2 -cpu EPYC -smp 8,sockets=1,cores=8 -numa node,cpus=0-3 -numa node,cpus=4-7

(QEMU) qom-get path=/machine/unattached/device[8] property=apic-id
{
    "return": 7
}

vs

qemu-5.1 -cpu EPYC -smp 8,sockets=1,cores=8 -numa node,cpus=0-3 -numa node,cpus=4-7
(QEMU) qom-get path=/machine/unattached/device[8] property=apic-id
{
    "return": 15
}
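
For illustration only, the two results above can be reproduced with a small
sketch. The function names are hypothetical, and the second layout (a node
field placed directly above the die/core bits) is a simplification assumed
here to match the example; it is not a copy of the QEMU 5.0 code, which may
differ in details such as minimum field widths.

```c
#include <stdint.h>

/* Number of bits needed to hold 'count' distinct ids (0 when count <= 1). */
static unsigned bits_for_count(unsigned count)
{
    unsigned bits = 0;
    while ((1u << bits) < count) {
        bits++;
    }
    return bits;
}

/* Generic layout: [ pkg | die | core | smt ], no NUMA node field. */
static uint32_t apicid_generic(unsigned cpu_index, unsigned dies,
                               unsigned cores, unsigned threads)
{
    unsigned smt_w = bits_for_count(threads);
    unsigned core_w = bits_for_count(cores);
    unsigned die_w = bits_for_count(dies);

    unsigned smt_id = cpu_index % threads;
    unsigned core_id = cpu_index / threads % cores;
    unsigned die_id = cpu_index / (cores * threads) % dies;
    unsigned pkg_id = cpu_index / (dies * cores * threads);

    return (pkg_id << (smt_w + core_w + die_w)) |
           (die_id << (smt_w + core_w)) |
           (core_id << smt_w) | smt_id;
}

/*
 * Assumed layout with an extra node field: [ pkg | node | die | core | smt ].
 * node_id is derived from cpu_index by splitting the package evenly
 * across 'nodes' (an assumption made for this sketch).
 */
static uint32_t apicid_with_node(unsigned cpu_index, unsigned nodes,
                                 unsigned dies, unsigned cores,
                                 unsigned threads)
{
    unsigned smt_w = bits_for_count(threads);
    unsigned core_w = bits_for_count(cores);
    unsigned die_w = bits_for_count(dies);
    unsigned node_w = bits_for_count(nodes);

    unsigned cpus_per_pkg = dies * cores * threads;
    unsigned cpus_per_node = (cpus_per_pkg + nodes - 1) / nodes;

    unsigned smt_id = cpu_index % threads;
    unsigned core_id = cpu_index / threads % cores;
    unsigned die_id = cpu_index / (cores * threads) % dies;
    unsigned node_id = cpu_index / cpus_per_node % nodes;
    unsigned pkg_id = cpu_index / cpus_per_pkg;

    return (pkg_id << (smt_w + core_w + die_w + node_w)) |
           (node_id << (smt_w + core_w + die_w)) |
           (die_id << (smt_w + core_w)) |
           (core_id << smt_w) | smt_id;
}
```

With -smp 8,sockets=1,cores=8 and two numa nodes, apicid_generic(7, 1, 8, 1)
gives 7 and apicid_with_node(7, 2, 1, 8, 1) gives 15 for the last CPU, which
is the mismatch a migrated guest would see.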

We probably can't do anything based on machine type versions, as
machine types 4.2 and older, when run on qemu-5.0 and newer, use a different
algorithm to calculate the apic-id.

Hence the suggestion to leave 5.0/5.1 with the broken APIC ID and revert to the
4.2 algorithm, which should encode the APIC ID correctly when '-smp dies' is used.


> Dave
> 
> 
> > > > PS:
> > > >  I didn't review it yet, but with this series we aren't
> > > >  making up internal node_ids that should match user provided numa node ids somehow.
> > > >  It seems series lost the patch that would enforce numa in case -smp dies>1,
> > > >  but otherwise it heads in the right direction.  
> > > 
> > > Dave
> > > 
> > > > > 
> > > > > Dave
> > > > >   
> > > >   
> > 


RE: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode
Posted by Babu Moger 3 years, 8 months ago

> -----Original Message-----
> From: Igor Mammedov <imammedo@redhat.com>
> Sent: Thursday, August 27, 2020 4:19 PM
> To: Dr. David Alan Gilbert <dgilbert@redhat.com>
> Cc: ehabkost@redhat.com; mst@redhat.com; qemu-devel@nongnu.org;
> Moger, Babu <Babu.Moger@amd.com>; pbonzini@redhat.com;
> rth@twiddle.net
> Subject: Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use
> generic decode
> 
> On Wed, 26 Aug 2020 15:10:46 +0100
> "Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
> 
> > * Igor Mammedov (imammedo@redhat.com) wrote:
> > > On Tue, 25 Aug 2020 16:25:21 +0100
> > > "Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
> > >
> > > > * Igor Mammedov (imammedo@redhat.com) wrote:
> > > > > On Tue, 25 Aug 2020 09:15:04 +0100 "Dr. David Alan Gilbert"
> > > > > <dgilbert@redhat.com> wrote:
> > > > >
> > > > > > * Babu Moger (babu.moger@amd.com) wrote:
> > > > > > > Hi Dave,
> > > > > > >
> > > > > > > On 8/24/20 1:41 PM, Dr. David Alan Gilbert wrote:
> > > > > > > > * Babu Moger (babu.moger@amd.com) wrote:
> > > > > > > >> To support some of the complex topology, we introduced EPYC
> mode apicid decode.
> > > > > > > >> But, EPYC mode decode is running into problems. Also it
> > > > > > > >> can become quite a maintenance problem in the future. So,
> > > > > > > >> it was decided to remove that code and use the generic
> > > > > > > >> decode which works for majority of the topology. Most of
> > > > > > > >> the SPECed configuration would work just fine. With some
> non-SPECed user inputs, it will create some sub-optimal configuration.
> > > > > > > >> Here is the discussion thread.
> > > > > > > > >> https://lore.kernel.org/qemu-devel/c0bcc1a6-1d84-a6e7-e468-d5b437c1b254@amd.com/
> > > > > > > >>
> > > > > > > >> This series removes all the EPYC mode specific apicid changes
> and use the generic
> > > > > > > >> apicid decode.
> > > > > > > >
> > > > > > > > Hi Babu,
> > > > > > > >   This does simplify things a lot!
> > > > > > > > One worry, what happens about a live migration of a VM from
> an old qemu
> > > > > > > > that was using the node-id to a qemu with this new scheme?
> > > > > > >
> > > > > > > The node_id which we introduced was only used internally. This
> wasn't
> > > > > > > exposed outside. I don't think live migration will be an issue.
> > > > > >
> > > > > > Didn't it become part of the APIC ID visible to the guest?
> > > > >
> > > > > Daniel asked similar question wrt hard error on start up, when
> > > > > CLI is not sufficient to create EPYC cpu.
> > > > >
> > > > >
> > > > > > https://www.mail-archive.com/qemu-devel@nongnu.org/msg728536.html
> > > > >
> > > > > Migration might fall into the same category.
> > > > > Also looking at the history, 5.0 commit
> > > > >   247b18c593ec29 target/i386: Enable new apic id encoding for
> > > > > EPYC based cpus models silently broke APIC ID (without versioning),
> for all EPYC models (that's were 1 new and 1 old one).
> > > > >
> > > > > (I'm not aware of somebody complaining about it)
> > > > >
> > > > > Another commit ed78467a21459, changed CPUID_8000_001E without
> versioning as well.
> > > > >
> > > > >
> > > > > With current EPYC apicid code, if all starts align (no numa or 1
> > > > > numa node only on CLI and no -smp dies=) it might produce a valid
> CPU (apicid+CPUID_8000_001E).
> > > > > No numa is gray area, since EPYC spec implies that it has to be numa
> machine in case of real EPYC cpus.
> > > > > Multi-node configs would be correct only if user assigns cpus to
> > > > > numa nodes by duplicating internal node_id algorithm that this series
> removes.
> > > > >
> > > > > There might be other broken cases that I don't recall anymore
> > > > > (should be mentioned in previous versions of this series)
> > > > >
> > > > >
> > > > > To summarize from migration pov (ignoring ed78467a21459 change):
> > > > >
> > > > >  1) old qemu pre-5.0 ==>  qemu 5.0, 5.1 - broken migration
> > > >
> > > > Oh ....
> > > >
> > > > >  2) with this series (lets call it qemu 5.2)
> > > > >      pre-5.0 ==> qemu 5.2 - should work as series basically rollbacks
> current code to pre-5.0
> > > > >      qemu 5.0, 5.1 ==> qemu 5.2 - broken
> > > > >
> > > > > It's all about picking which poison to choose, I'd preffer 2nd
> > > > > case as it lets drop a lot of complicated code that doesn't work
> > > > > as expected.
> > > >
> > > > I think that would make our lives easier for other reasons; so I'm
> > > > happy to go with that.
> > >
> > > to make things less painful for users, me wonders if there is a way
> > > to block migration if epyc and specific QEMU versions are used?
> >
> > We have no way to block based on version - and that's a pretty painful
> > thing to do; we can block based on machine type.
> >
> > But before we get there; can we understand in which combinations that
> > things break and why exactly - would it break on a 1 or 2 vCPU guest -
> > or would it only break when we get to the point the upper bits start
> > being used for example?  Why exaclty would it break - i.e. is it going
> > to change the name of sections in the migration stream - or are the
> > values we need actually going to migrate OK?
> 
> it's values of APIC ID, where 4.2 and 5.0 QEMU use different values if numa is
> enabled.
> I'd expect guest to be very confused in when this happens.
> 
> here is an example:
> qemu-4.2 -cpu EPYC -smp 8,sockets=1,cores=8 -numa node,cpus=0-3 -numa
> node,cpus=4-7
> 
> (QEMU) qom-get path=/machine/unattached/device[8] property=apic-id {
>     "return": 7
> }
> 
> vs
> 
> qemu-5.1 -cpu EPYC -smp 8,sockets=1,cores=8 -numa node,cpus=0-3 -numa
> node,cpus=4-7
> (QEMU) qom-get path=/machine/unattached/device[8] property=apic-id {
>     "return": 15
> }
> 
> we probably can't do anything based on machine type versions, as
> 4.2 and older versions on qemu-5.0 and newer use different algorithm to
> calculate apic-id.
> 
> Hence was suggestion to leave 5.0/5.1 with broken apic id and revert back to
> 4.2 algorithm, which should encode APIC ID correctly when '-smp dies' is
> used.

That is correct. When we revert all the node_id related changes, we will
go back to the 4.2 algorithm. It will work fine when the user passes "-smp
dies=n". It also keeps the code simple. That is why I kept the decoding of
0x8000001e as shown below. This will also match the apicid decoding.

*ecx = ((topo_info->dies_per_pkg - 1) << 8) |
       ((cpu->apic_id >> apicid_die_offset(topo_info)) & 0xFF);
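
As a standalone sketch of the same expression (the function name is
hypothetical, and the die_offset parameter stands in for
apicid_die_offset(topo_info), so that it compiles outside of QEMU):

```c
#include <stdint.h>

/*
 * Sketch of the ECX value for CPUID 0x8000001E as written above:
 *   bits 10:8 - dies per package minus one
 *   bits  7:0 - die id, taken from the APIC ID bits at and above die_offset
 */
static uint32_t encode_8000001e_ecx(uint32_t apic_id, unsigned dies_per_pkg,
                                    unsigned die_offset)
{
    return ((dies_per_pkg - 1) << 8) | ((apic_id >> die_offset) & 0xFF);
}
```

For example, assuming -smp 16,sockets=1,dies=2,cores=8,threads=1 the die bits
start at offset 3, so APIC ID 7 (die 0) encodes to 0x100 and APIC ID 15
(die 1) encodes to 0x101.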


It is still not clear whether we need to add a warning when numa nodes != dies.
I am worried about adding that check and then removing it again later.

What about auto_enable_numa? Do we still need it?

I can send the patches tomorrow if these things are clarified.
Thanks

> 
> 
> > Dave
> >
> >
> > > > > PS:
> > > > >  I didn't review it yet, but with this series we aren't  making
> > > > > up internal node_ids that should match user provided numa node ids
> somehow.
> > > > >  It seems series lost the patch that would enforce numa in case
> > > > > -smp dies>1,  but otherwise it heads in the right direction.
> > > >
> > > > Dave
> > > >
> > > > > >
> > > > > > Dave
> > > > > >
> > > > >
> > >


Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode
Posted by Igor Mammedov 3 years, 8 months ago
On Thu, 27 Aug 2020 17:58:01 -0500
Babu Moger <babu.moger@amd.com> wrote:

> > -----Original Message-----
> > From: Igor Mammedov <imammedo@redhat.com>
> > Sent: Thursday, August 27, 2020 4:19 PM
> > To: Dr. David Alan Gilbert <dgilbert@redhat.com>
> > Cc: ehabkost@redhat.com; mst@redhat.com; qemu-devel@nongnu.org;
> > Moger, Babu <Babu.Moger@amd.com>; pbonzini@redhat.com;
> > rth@twiddle.net
> > Subject: Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use
> > generic decode
> > 
> > On Wed, 26 Aug 2020 15:10:46 +0100
> > "Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
> >   
> > > * Igor Mammedov (imammedo@redhat.com) wrote:  
> > > > On Tue, 25 Aug 2020 16:25:21 +0100
> > > > "Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
> > > >  
> > > > > * Igor Mammedov (imammedo@redhat.com) wrote:  
> > > > > > On Tue, 25 Aug 2020 09:15:04 +0100 "Dr. David Alan Gilbert"
> > > > > > <dgilbert@redhat.com> wrote:
> > > > > >  
> > > > > > > * Babu Moger (babu.moger@amd.com) wrote:  
> > > > > > > > Hi Dave,
> > > > > > > >
> > > > > > > > On 8/24/20 1:41 PM, Dr. David Alan Gilbert wrote:  
> > > > > > > > > * Babu Moger (babu.moger@amd.com) wrote:  
> > > > > > > > >> To support some of the complex topology, we introduced EPYC  
> > mode apicid decode.  
> > > > > > > > >> But, EPYC mode decode is running into problems. Also it
> > > > > > > > >> can become quite a maintenance problem in the future. So,
> > > > > > > > >> it was decided to remove that code and use the generic
> > > > > > > > >> decode which works for majority of the topology. Most of
> > > > > > > > >> the SPECed configuration would work just fine. With some  
> > non-SPECed user inputs, it will create some sub-optimal configuration.  
> > > > > > > > >> Here is the discussion thread.
> > > > > > > > > >> https://lore.kernel.org/qemu-devel/c0bcc1a6-1d84-a6e7-e468-d5b437c1b254@amd.com/
> > > > > > > > >>
> > > > > > > > >> This series removes all the EPYC mode specific apicid changes  
> > and use the generic  
> > > > > > > > >> apicid decode.  
> > > > > > > > >
> > > > > > > > > Hi Babu,
> > > > > > > > >   This does simplify things a lot!
> > > > > > > > > One worry, what happens about a live migration of a VM from  
> > an old qemu  
> > > > > > > > > that was using the node-id to a qemu with this new scheme?  
> > > > > > > >
> > > > > > > > The node_id which we introduced was only used internally. This  
> > wasn't  
> > > > > > > > exposed outside. I don't think live migration will be an issue.  
> > > > > > >
> > > > > > > Didn't it become part of the APIC ID visible to the guest?  
> > > > > >
> > > > > > Daniel asked similar question wrt hard error on start up, when
> > > > > > CLI is not sufficient to create EPYC cpu.
> > > > > >
> > > > > >  
> > > > > > > https://www.mail-archive.com/qemu-devel@nongnu.org/msg728536.html
> > > > > >
> > > > > > Migration might fall into the same category.
> > > > > > Also looking at the history, 5.0 commit
> > > > > >   247b18c593ec29 target/i386: Enable new apic id encoding for
> > > > > > EPYC based cpus models silently broke APIC ID (without versioning),  
> > for all EPYC models (that's were 1 new and 1 old one).  
> > > > > >
> > > > > > (I'm not aware of somebody complaining about it)
> > > > > >
> > > > > > Another commit ed78467a21459, changed CPUID_8000_001E without  
> > versioning as well.  
> > > > > >
> > > > > >
> > > > > > With current EPYC apicid code, if all starts align (no numa or 1
> > > > > > numa node only on CLI and no -smp dies=) it might produce a valid  
> > CPU (apicid+CPUID_8000_001E).  
> > > > > > No numa is gray area, since EPYC spec implies that it has to be numa  
> > machine in case of real EPYC cpus.  
> > > > > > Multi-node configs would be correct only if user assigns cpus to
> > > > > > numa nodes by duplicating internal node_id algorithm that this series  
> > removes.  
> > > > > >
> > > > > > There might be other broken cases that I don't recall anymore
> > > > > > (should be mentioned in previous versions of this series)
> > > > > >
> > > > > >
> > > > > > To summarize from migration pov (ignoring ed78467a21459 change):
> > > > > >
> > > > > >  1) old qemu pre-5.0 ==>  qemu 5.0, 5.1 - broken migration  
> > > > >
> > > > > Oh ....
> > > > >  
> > > > > >  2) with this series (lets call it qemu 5.2)
> > > > > >      pre-5.0 ==> qemu 5.2 - should work as series basically rollbacks  
> > current code to pre-5.0  
> > > > > >      qemu 5.0, 5.1 ==> qemu 5.2 - broken
> > > > > >
> > > > > > It's all about picking which poison to choose, I'd preffer 2nd
> > > > > > case as it lets drop a lot of complicated code that doesn't work
> > > > > > as expected.  
> > > > >
> > > > > I think that would make our lives easier for other reasons; so I'm
> > > > > happy to go with that.  
> > > >
> > > > to make things less painful for users, me wonders if there is a way
> > > > to block migration if epyc and specific QEMU versions are used?  
> > >
> > > We have no way to block based on version - and that's a pretty painful
> > > thing to do; we can block based on machine type.
> > >
> > > But before we get there; can we understand in which combinations that
> > > things break and why exactly - would it break on a 1 or 2 vCPU guest -
> > > or would it only break when we get to the point the upper bits start
> > > being used for example?  Why exaclty would it break - i.e. is it going
> > > to change the name of sections in the migration stream - or are the
> > > values we need actually going to migrate OK?  
> > 
> > it's values of APIC ID, where 4.2 and 5.0 QEMU use different values if numa is
> > enabled.
> > I'd expect guest to be very confused in when this happens.
> > 
> > here is an example:
> > qemu-4.2 -cpu EPYC -smp 8,sockets=1,cores=8 -numa node,cpus=0-3 -numa
> > node,cpus=4-7
> > 
> > (QEMU) qom-get path=/machine/unattached/device[8] property=apic-id {
> >     "return": 7
> > }
> > 
> > vs
> > 
> > qemu-5.1 -cpu EPYC -smp 8,sockets=1,cores=8 -numa node,cpus=0-3 -numa
> > node,cpus=4-7
> > (QEMU) qom-get path=/machine/unattached/device[8] property=apic-id {
> >     "return": 15
> > }
> > 
> > we probably can't do anything based on machine type versions, as
> > 4.2 and older versions on qemu-5.0 and newer use different algorithm to
> > calculate apic-id.
> > 
> > Hence was suggestion to leave 5.0/5.1 with broken apic id and revert back to
> > 4.2 algorithm, which should encode APIC ID correctly when '-smp dies' is
> > used.  
> 
> That is correct. When we revert all the node_id related changes, we will
> go back to 4.2 algorithm. It will work fine with user passing "-smp
> dies=n". It also keeps the code simple. That is why I kept the decoding of
> 0x8000001e like this below. This will also match apicid decoding.
> 
> *ecx = ((topo_info->dies_per_pkg - 1) << 8) |  ((cpu->apic_id >>
> apicid_die_offset(topo_info)) & 0xFF);
That will work when there is no -numa on the CLI; when -numa is used,
we should use the node id that the user provided,
like you did in the previous revision
   "[PATCH v4 1/3] i386: Simplify CPUID_8000_001E for AMD"

> Still not clear if we need to add a warning when numa nodes != dies.
> Worried about adding that check and remove it again later.
Since there is an objection to making it an error, I'd go with a warning for now;
it makes life a bit easier for the person who has to figure out what's wrong.
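
A minimal sketch of such a check, under assumptions not settled in this
thread: the function name is hypothetical, "dies" is taken to mean the total
number of dies (sockets * dies per socket), and no warning is issued when
-numa is not used at all. Real QEMU code would use warn_report() rather than
fprintf().

```c
#include <stdbool.h>
#include <stdio.h>

/*
 * Returns true (and prints a warning) when numa nodes were configured
 * but their count differs from the total number of dies.
 * numa_nodes == 0 means no -numa option was given.
 */
static bool warn_if_nodes_mismatch_dies(unsigned numa_nodes, unsigned sockets,
                                        unsigned dies_per_pkg)
{
    unsigned total_dies = sockets * dies_per_pkg;

    if (numa_nodes != 0 && numa_nodes != total_dies) {
        fprintf(stderr,
                "warning: %u numa nodes configured but topology has %u dies\n",
                numa_nodes, total_dies);
        return true;
    }
    return false;
}
```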

> What about auto_enable_numa? Do we still need it?
>
> 
> I can send the patches tomorrow if these things are clarified.
> Thanks
With auto_enable_numa it would be cleaner, as you can reuse
the same numa code to set 0x8000001e.ecx instead of hardcoding it as above.

Maybe post the series without auto_enable_numa so we fix the migration
regression ASAP, and then switch to auto_enable_numa on top.
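
To illustrate the choice discussed above (the user-provided node id when
-numa is used, otherwise the die id from the APIC ID), here is a rough
sketch. All names are hypothetical and this is not the v4 patch itself; with
auto_enable_numa the fallback branch would presumably become unnecessary,
since a node id would always be available.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Low byte of CPUID 0x8000001E ECX (node id field).
 * numa_on_cli: whether the user passed -numa
 * user_node_id: node the user assigned this CPU to (ignored without -numa)
 * die_offset: bit position of the die id inside the APIC ID
 */
static uint32_t ecx_node_field(bool numa_on_cli, unsigned user_node_id,
                               uint32_t apic_id, unsigned die_offset)
{
    uint32_t node = numa_on_cli ? user_node_id : (apic_id >> die_offset);

    return node & 0xFF;
}
```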


> 
> > 
> >   
> > > Dave
> > >
> > >  
> > > > > > PS:
> > > > > >  I didn't review it yet, but with this series we aren't  making
> > > > > > up internal node_ids that should match user provided numa node ids  
> > somehow.  
> > > > > >  It seems series lost the patch that would enforce numa in case
> > > > > > -smp dies>1,  but otherwise it heads in the right direction.  
> > > > >
> > > > > Dave
> > > > >  
> > > > > > >
> > > > > > > Dave
> > > > > > >  
> > > > > >  
> > > >  
> 


RE: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode
Posted by Babu Moger 3 years, 8 months ago

> -----Original Message-----
> From: Igor Mammedov <imammedo@redhat.com>
> Sent: Friday, August 28, 2020 3:43 AM
> To: Moger, Babu <Babu.Moger@amd.com>
> Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>; ehabkost@redhat.com;
> mst@redhat.com; qemu-devel@nongnu.org; pbonzini@redhat.com;
> rth@twiddle.net
> Subject: Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use
> generic decode
> 
> On Thu, 27 Aug 2020 17:58:01 -0500
> Babu Moger <babu.moger@amd.com> wrote:
> 
> > > -----Original Message-----
> > > From: Igor Mammedov <imammedo@redhat.com>
> > > Sent: Thursday, August 27, 2020 4:19 PM
> > > To: Dr. David Alan Gilbert <dgilbert@redhat.com>
> > > Cc: ehabkost@redhat.com; mst@redhat.com; qemu-devel@nongnu.org;
> > > Moger, Babu <Babu.Moger@amd.com>; pbonzini@redhat.com;
> > > rth@twiddle.net
> > > Subject: Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use
> > > generic decode
> > >
> > > On Wed, 26 Aug 2020 15:10:46 +0100
> > > "Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
> > >
> > > > * Igor Mammedov (imammedo@redhat.com) wrote:
> > > > > On Tue, 25 Aug 2020 16:25:21 +0100 "Dr. David Alan Gilbert"
> > > > > <dgilbert@redhat.com> wrote:
> > > > >
> > > > > > * Igor Mammedov (imammedo@redhat.com) wrote:
> > > > > > > On Tue, 25 Aug 2020 09:15:04 +0100 "Dr. David Alan Gilbert"
> > > > > > > <dgilbert@redhat.com> wrote:
> > > > > > >
> > > > > > > > * Babu Moger (babu.moger@amd.com) wrote:
> > > > > > > > > Hi Dave,
> > > > > > > > >
> > > > > > > > > On 8/24/20 1:41 PM, Dr. David Alan Gilbert wrote:
> > > > > > > > > > * Babu Moger (babu.moger@amd.com) wrote:
> > > > > > > > > >> To support some of the complex topology, we
> > > > > > > > > >> introduced EPYC
> > > mode apicid decode.
> > > > > > > > > >> But, EPYC mode decode is running into problems. Also
> > > > > > > > > >> it can become quite a maintenance problem in the
> > > > > > > > > >> future. So, it was decided to remove that code and
> > > > > > > > > >> use the generic decode which works for majority of
> > > > > > > > > >> the topology. Most of the SPECed configuration would
> > > > > > > > > >> work just fine. With some
> > > non-SPECed user inputs, it will create some sub-optimal configuration.
> > > > > > > > > >> Here is the discussion thread.
> > > > > > > > > > >> https://lore.kernel.org/qemu-devel/c0bcc1a6-1d84-a6e7-e468-d5b437c1b254@amd.com/
> > > > > > > > > >>
> > > > > > > > > >> This series removes all the EPYC mode specific apicid
> > > > > > > > > >> changes
> > > and use the generic
> > > > > > > > > >> apicid decode.
> > > > > > > > > >
> > > > > > > > > > Hi Babu,
> > > > > > > > > >   This does simplify things a lot!
> > > > > > > > > > One worry, what happens about a live migration of a VM
> > > > > > > > > > from
> > > an old qemu
> > > > > > > > > > that was using the node-id to a qemu with this new scheme?
> > > > > > > > >
> > > > > > > > > The node_id which we introduced was only used
> > > > > > > > > internally. This
> > > wasn't
> > > > > > > > > exposed outside. I don't think live migration will be an issue.
> > > > > > > >
> > > > > > > > Didn't it become part of the APIC ID visible to the guest?
> > > > > > >
> > > > > > > Daniel asked similar question wrt hard error on start up,
> > > > > > > when CLI is not sufficient to create EPYC cpu.
> > > > > > >
> > > > > > >
> > > > > > > > https://www.mail-archive.com/qemu-devel@nongnu.org/msg728536.html
> > > > > > >
> > > > > > > Migration might fall into the same category.
> > > > > > > Also looking at the history, 5.0 commit
> > > > > > >   247b18c593ec29 target/i386: Enable new apic id encoding
> > > > > > > for EPYC based cpus models silently broke APIC ID (without
> > > > > > > versioning),
> > > for all EPYC models (that's were 1 new and 1 old one).
> > > > > > >
> > > > > > > (I'm not aware of somebody complaining about it)
> > > > > > >
> > > > > > > Another commit ed78467a21459, changed CPUID_8000_001E
> > > > > > > without
> > > versioning as well.
> > > > > > >
> > > > > > >
> > > > > > > With current EPYC apicid code, if all starts align (no numa
> > > > > > > or 1 numa node only on CLI and no -smp dies=) it might
> > > > > > > produce a valid
> > > CPU (apicid+CPUID_8000_001E).
> > > > > > > No numa is gray area, since EPYC spec implies that it has to
> > > > > > > be numa
> > > machine in case of real EPYC cpus.
> > > > > > > Multi-node configs would be correct only if user assigns
> > > > > > > cpus to numa nodes by duplicating internal node_id algorithm
> > > > > > > that this series
> > > removes.
> > > > > > >
> > > > > > > There might be other broken cases that I don't recall
> > > > > > > anymore (should be mentioned in previous versions of this
> > > > > > > series)
> > > > > > >
> > > > > > >
> > > > > > > To summarize from migration pov (ignoring ed78467a21459
> change):
> > > > > > >
> > > > > > >  1) old qemu pre-5.0 ==>  qemu 5.0, 5.1 - broken migration
> > > > > >
> > > > > > Oh ....
> > > > > >
> > > > > > >  2) with this series (lets call it qemu 5.2)
> > > > > > >      pre-5.0 ==> qemu 5.2 - should work as series basically
> > > > > > > rollbacks
> > > current code to pre-5.0
> > > > > > >      qemu 5.0, 5.1 ==> qemu 5.2 - broken
> > > > > > >
> > > > > > > It's all about picking which poison to choose, I'd preffer
> > > > > > > 2nd case as it lets drop a lot of complicated code that
> > > > > > > doesn't work as expected.
> > > > > >
> > > > > > I think that would make our lives easier for other reasons; so
> > > > > > I'm happy to go with that.
> > > > >
> > > > > to make things less painful for users, me wonders if there is a
> > > > > way to block migration if epyc and specific QEMU versions are used?
> > > >
> > > > We have no way to block based on version - and that's a pretty
> > > > painful thing to do; we can block based on machine type.
> > > >
> > > > But before we get there; can we understand in which combinations
> > > > that things break and why exactly - would it break on a 1 or 2
> > > > vCPU guest - or would it only break when we get to the point the
> > > > upper bits start being used for example?  Why exaclty would it
> > > > break - i.e. is it going to change the name of sections in the
> > > > migration stream - or are the values we need actually going to migrate
> OK?
> > >
> > > it's values of APIC ID, where 4.2 and 5.0 QEMU use different values
> > > if numa is enabled.
> > > I'd expect guest to be very confused in when this happens.
> > >
> > > here is an example:
> > > qemu-4.2 -cpu EPYC -smp 8,sockets=1,cores=8 -numa node,cpus=0-3
> > > -numa
> > > node,cpus=4-7
> > >
> > > (QEMU) qom-get path=/machine/unattached/device[8] property=apic-id
> {
> > >     "return": 7
> > > }
> > >
> > > vs
> > >
> > > qemu-5.1 -cpu EPYC -smp 8,sockets=1,cores=8 -numa node,cpus=0-3
> > > -numa
> > > node,cpus=4-7
> > > (QEMU) qom-get path=/machine/unattached/device[8] property=apic-id
> {
> > >     "return": 15
> > > }
> > >
> > > we probably can't do anything based on machine type versions, as
> > > 4.2 and older versions on qemu-5.0 and newer use different algorithm
> > > to calculate apic-id.
> > >
> > > Hence was suggestion to leave 5.0/5.1 with broken apic id and revert
> > > back to
> > > 4.2 algorithm, which should encode APIC ID correctly when '-smp
> > > dies' is used.
> >
> > That is correct. When we revert all the node_id related changes, we
> > will go back to 4.2 algorithm. It will work fine with user passing
> > "-smp dies=n". It also keeps the code simple. That is why I kept the
> > decoding of 0x8000001e like this below. This will also match apicid decoding.
> >
> > *ecx = ((topo_info->dies_per_pkg - 1) << 8) |  ((cpu->apic_id >>
> > apicid_die_offset(topo_info)) & 0xFF);
> that will work when there is no -numa on CLI, when -numa is used, we
> should use node id that user provided.
> like you did in previous revision
>    "[PATCH v4 1/3] i386: Simplify CPUID_8000_001E for AMD"

This might be a problem in the future with new BIOS options to change the
NPS (Nodes per Socket). Nodes and dies may not match. Then we will end up
with a wrong CPUID_8000_001E encoding. That is why I wanted to keep the two
separate. Users then have the option to configure it so that it matches their
BIOS config.


> 
> > Still not clear if we need to add a warning when numa nodes != dies.
> > Worried about adding that check and remove it again later.
> Since there is objection wrt making it error and I'd go with warning for now, it
> makes life of person who have to figure what's wrong a bit easier.
> 
> > What about auto_enable_numa? Do we still need it?
> >
> >
> > I can send the patches tomorrow if these things are clarified.
> > Thanks
> With auto_enable_numa it would be cleaner as you can reuse the same
> numa code to set 0x8000001e.ecx vs hardcodding it as above.
> 
> Maybe post series without auto_enable_numa so we fix migration
> regression ASAP and then switch to auto_enable_numa on top.
> 
> 
> >
> > >
> > >
> > > > Dave
> > > >
> > > >
> > > > > > > PS:
> > > > > > >  I didn't review it yet, but with this series we aren't
> > > > > > > making up internal node_ids that should match user provided
> > > > > > > numa node ids
> > > somehow.
> > > > > > >  It seems series lost the patch that would enforce numa in
> > > > > > > case -smp dies>1,  but otherwise it heads in the right direction.
> > > > > >
> > > > > > Dave
> > > > > >
> > > > > > > >
> > > > > > > > Dave
> > > > > > > >
> > > > > > >
> > > > >
> >


Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode
Posted by Dr. David Alan Gilbert 3 years, 8 months ago
* Igor Mammedov (imammedo@redhat.com) wrote:
> On Wed, 26 Aug 2020 15:10:46 +0100
> "Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
> 
> > * Igor Mammedov (imammedo@redhat.com) wrote:
> > > On Tue, 25 Aug 2020 16:25:21 +0100
> > > "Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
> > > 
> > > > * Igor Mammedov (imammedo@redhat.com) wrote:
> > > > > On Tue, 25 Aug 2020 09:15:04 +0100
> > > > > "Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
> > > > >   
> > > > > > * Babu Moger (babu.moger@amd.com) wrote:  
> > > > > > > Hi Dave,
> > > > > > > 
> > > > > > > On 8/24/20 1:41 PM, Dr. David Alan Gilbert wrote:    
> > > > > > > > * Babu Moger (babu.moger@amd.com) wrote:    
> > > > > > > >> To support some of the complex topology, we introduced EPYC mode apicid decode.
> > > > > > > >> But, EPYC mode decode is running into problems. Also it can become quite a
> > > > > > > >> maintenance problem in the future. So, it was decided to remove that code and
> > > > > > > >> use the generic decode which works for majority of the topology. Most of the
> > > > > > > >> SPECed configuration would work just fine. With some non-SPECed user inputs,
> > > > > > > >> it will create some sub-optimal configuration.
> > > > > > > >> Here is the discussion thread.
> > > > > > > > >> https://lore.kernel.org/qemu-devel/c0bcc1a6-1d84-a6e7-e468-d5b437c1b254@amd.com/
> > > > > > > >>
> > > > > > > >> This series removes all the EPYC mode specific apicid changes and use the generic
> > > > > > > >> apicid decode.    
> > > > > > > > 
> > > > > > > > Hi Babu,
> > > > > > > >   This does simplify things a lot!
> > > > > > > > One worry, what happens about a live migration of a VM from an old qemu
> > > > > > > > that was using the node-id to a qemu with this new scheme?    
> > > > > > > 
> > > > > > > The node_id which we introduced was only used internally. This wasn't
> > > > > > > exposed outside. I don't think live migration will be an issue.    
> > > > > > 
> > > > > > Didn't it become part of the APIC ID visible to the guest?  
> > > > > 
> > > > > Daniel asked similar question wrt hard error on start up,
> > > > > when CLI is not sufficient to create EPYC cpu.
> > > > > 
> > > > > https://www.mail-archive.com/qemu-devel@nongnu.org/msg728536.html
> > > > > 
> > > > > Migration might fall into the same category.
> > > > > Also looking at the history, 5.0 commit 
> > > > >   247b18c593ec29 target/i386: Enable new apic id encoding for EPYC based cpus models
> > > > > silently broke APIC ID (without versioning), for all EPYC models (that's were 1 new and 1 old one).
> > > > > 
> > > > > (I'm not aware of somebody complaining about it)
> > > > > 
> > > > > Another commit ed78467a21459, changed CPUID_8000_001E without versioning as well.
> > > > > 
> > > > > 
> > > > > > With current EPYC apicid code, if all stars align (no numa or 1 numa node only on
> > > > > CLI and no -smp dies=) it might produce a valid CPU (apicid+CPUID_8000_001E).
> > > > > No numa is gray area, since EPYC spec implies that it has to be numa machine in case of real EPYC cpus.
> > > > > Multi-node configs would be correct only if user assigns cpus to numa nodes
> > > > > by duplicating internal node_id algorithm that this series removes.
> > > > > 
> > > > > There might be other broken cases that I don't recall anymore
> > > > > (should be mentioned in previous versions of this series)
> > > > > 
> > > > > 
> > > > > To summarize from migration pov (ignoring ed78467a21459 change):
> > > > > 
> > > > >  1) old qemu pre-5.0 ==>  qemu 5.0, 5.1 - broken migration  
> > > > 
> > > > Oh ....
> > > > 
> > > > >  2) with this series (lets call it qemu 5.2)
> > > > >      pre-5.0 ==> qemu 5.2 - should work as series basically rollbacks current code to pre-5.0
> > > > >      qemu 5.0, 5.1 ==> qemu 5.2 - broken
> > > > > 
> > > > > It's all about picking which poison to choose,
> > > > > > I'd prefer 2nd case as it lets us drop a lot of complicated code that
> > > > > doesn't work as expected.  
> > > > 
> > > > I think that would make our lives easier for other reasons; so I'm happy
> > > > to go with that.
> > > 
> > > to make things less painful for users, me wonders if there is a way
> > > to block migration if epyc and specific QEMU versions are used?
> > 
> > We have no way to block based on version - and that's a pretty painful
> > thing to do; we can block based on machine type.
> > 
> > But before we get there; can we understand in which combinations that
> > things break and why exactly - would it break on a 1 or 2 vCPU guest -
> > or would it only break when we get to the point the upper bits start
> > being used for example?  Why exactly would it break - i.e. is it going
> > to change the name of sections in the migration stream - or are the
> > values we need actually going to migrate OK?
> 
> it's values of APIC ID, where 4.2 and 5.0 QEMU use different values
> if numa is enabled.
> I'd expect guest to be very confused in when this happens.
> 
> here is an example:
> qemu-4.2 -cpu EPYC -smp 8,sockets=1,cores=8 -numa node,cpus=0-3 -numa node,cpus=4-7

OK, but it'll probably be OK on small VMs with a single NUMA node?

Dave

> (QEMU) qom-get path=/machine/unattached/device[8] property=apic-id
> {
>     "return": 7
> }
> 
> vs
> 
> qemu-5.1 -cpu EPYC -smp 8,sockets=1,cores=8 -numa node,cpus=0-3 -numa node,cpus=4-7
> (QEMU) qom-get path=/machine/unattached/device[8] property=apic-id
> {
>     "return": 15
> }
> 
> we probably can't do anything based on machine type versions, as
> 4.2 and older versions on qemu-5.0 and newer use different algorithm to calculate apic-id.
> 
> Hence was suggestion to leave 5.0/5.1 with broken apic id and revert back to
> 4.2 algorithm, which should encode APIC ID correctly when '-smp dies' is used. 
> 
> 
> > Dave
> > 
> > 
> > > > > PS:
> > > > >  I didn't review it yet, but with this series we aren't
> > > > >  making up internal node_ids that should match user provided numa node ids somehow.
> > > > >  It seems series lost the patch that would enforce numa in case -smp dies>1,
> > > > >  but otherwise it heads in the right direction.  
> > > > 
> > > > Dave
> > > > 
> > > > > > 
> > > > > > Dave
> > > > > >   
> > > > >   
> > > 
> 
-- 
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK


Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode
Posted by Igor Mammedov 3 years, 8 months ago
On Fri, 28 Aug 2020 09:48:30 +0100
"Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:

> * Igor Mammedov (imammedo@redhat.com) wrote:
> > On Wed, 26 Aug 2020 15:10:46 +0100
> > "Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
> >   
> > > * Igor Mammedov (imammedo@redhat.com) wrote:  
> > > > On Tue, 25 Aug 2020 16:25:21 +0100
> > > > "Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
> > > >   
> > > > > * Igor Mammedov (imammedo@redhat.com) wrote:  
> > > > > > On Tue, 25 Aug 2020 09:15:04 +0100
> > > > > > "Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
> > > > > >     
> > > > > > > * Babu Moger (babu.moger@amd.com) wrote:    
> > > > > > > > Hi Dave,
> > > > > > > > 
> > > > > > > > On 8/24/20 1:41 PM, Dr. David Alan Gilbert wrote:      
> > > > > > > > > * Babu Moger (babu.moger@amd.com) wrote:      
> > > > > > > > >> To support some of the complex topology, we introduced EPYC mode apicid decode.
> > > > > > > > >> But, EPYC mode decode is running into problems. Also it can become quite a
> > > > > > > > >> maintenance problem in the future. So, it was decided to remove that code and
> > > > > > > > >> use the generic decode which works for majority of the topology. Most of the
> > > > > > > > >> SPECed configuration would work just fine. With some non-SPECed user inputs,
> > > > > > > > >> it will create some sub-optimal configuration.
> > > > > > > > >> Here is the discussion thread.
> > > > > > > > > >> https://lore.kernel.org/qemu-devel/c0bcc1a6-1d84-a6e7-e468-d5b437c1b254@amd.com/
> > > > > > > > >>
> > > > > > > > >> This series removes all the EPYC mode specific apicid changes and use the generic
> > > > > > > > >> apicid decode.      
> > > > > > > > > 
> > > > > > > > > Hi Babu,
> > > > > > > > >   This does simplify things a lot!
> > > > > > > > > One worry, what happens about a live migration of a VM from an old qemu
> > > > > > > > > that was using the node-id to a qemu with this new scheme?      
> > > > > > > > 
> > > > > > > > The node_id which we introduced was only used internally. This wasn't
> > > > > > > > exposed outside. I don't think live migration will be an issue.      
> > > > > > > 
> > > > > > > Didn't it become part of the APIC ID visible to the guest?    
> > > > > > 
> > > > > > Daniel asked similar question wrt hard error on start up,
> > > > > > when CLI is not sufficient to create EPYC cpu.
> > > > > > 
> > > > > > https://www.mail-archive.com/qemu-devel@nongnu.org/msg728536.html
> > > > > > 
> > > > > > Migration might fall into the same category.
> > > > > > Also looking at the history, 5.0 commit 
> > > > > >   247b18c593ec29 target/i386: Enable new apic id encoding for EPYC based cpus models
> > > > > > silently broke APIC ID (without versioning), for all EPYC models (that's were 1 new and 1 old one).
> > > > > > 
> > > > > > (I'm not aware of somebody complaining about it)
> > > > > > 
> > > > > > Another commit ed78467a21459, changed CPUID_8000_001E without versioning as well.
> > > > > > 
> > > > > > 
> > > > > > > With current EPYC apicid code, if all stars align (no numa or 1 numa node only on
> > > > > > CLI and no -smp dies=) it might produce a valid CPU (apicid+CPUID_8000_001E).
> > > > > > No numa is gray area, since EPYC spec implies that it has to be numa machine in case of real EPYC cpus.
> > > > > > Multi-node configs would be correct only if user assigns cpus to numa nodes
> > > > > > by duplicating internal node_id algorithm that this series removes.
> > > > > > 
> > > > > > There might be other broken cases that I don't recall anymore
> > > > > > (should be mentioned in previous versions of this series)
> > > > > > 
> > > > > > 
> > > > > > To summarize from migration pov (ignoring ed78467a21459 change):
> > > > > > 
> > > > > >  1) old qemu pre-5.0 ==>  qemu 5.0, 5.1 - broken migration    
> > > > > 
> > > > > Oh ....
> > > > >   
> > > > > >  2) with this series (lets call it qemu 5.2)
> > > > > >      pre-5.0 ==> qemu 5.2 - should work as series basically rollbacks current code to pre-5.0
> > > > > >      qemu 5.0, 5.1 ==> qemu 5.2 - broken
> > > > > > 
> > > > > > It's all about picking which poison to choose,
> > > > > > > I'd prefer 2nd case as it lets us drop a lot of complicated code that
> > > > > > doesn't work as expected.    
> > > > > 
> > > > > I think that would make our lives easier for other reasons; so I'm happy
> > > > > to go with that.  
> > > > 
> > > > to make things less painful for users, me wonders if there is a way
> > > > to block migration if epyc and specific QEMU versions are used?  
> > > 
> > > We have no way to block based on version - and that's a pretty painful
> > > thing to do; we can block based on machine type.
> > > 
> > > But before we get there; can we understand in which combinations that
> > > things break and why exactly - would it break on a 1 or 2 vCPU guest -
> > > or would it only break when we get to the point the upper bits start
> > > being used for example?  Why exactly would it break - i.e. is it going
> > > to change the name of sections in the migration stream - or are the
> > > values we need actually going to migrate OK?  
> > 
> > it's values of APIC ID, where 4.2 and 5.0 QEMU use different values
> > if numa is enabled.
> > I'd expect guest to be very confused in when this happens.
> > 
> > here is an example:
> > qemu-4.2 -cpu EPYC -smp 8,sockets=1,cores=8 -numa node,cpus=0-3 -numa node,cpus=4-7  
> 
> OK, but it'll probably be OK on small VMs with a single NUMA node?

it should be fine if -numa isn't used.
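
To illustrate why the two values in the example differ only when NUMA nodes
are configured, here is a minimal sketch. It is not the QEMU implementation:
the struct, the function names and the exact bit position of the node field
are assumptions chosen so that the sketch reproduces the 7 and 15 quoted in
this thread for -smp 8,sockets=1,cores=8 with two nodes. Only a single socket
without a die level is modelled.

```c
#include <assert.h>

/* Hypothetical topology description; field names are illustrative only. */
struct topo {
    unsigned nodes;    /* NUMA nodes in the socket; 0 is treated as 1 */
    unsigned cores;    /* cores in the socket */
    unsigned threads;  /* threads per core */
};

/* Number of bits needed to represent the values 0..count-1. */
static unsigned bits_for(unsigned count)
{
    unsigned bits = 0;

    assert(count > 0);
    while ((1u << bits) < count) {
        bits++;
    }
    return bits;
}

/* Generic layout (single socket): [ core_id | smt_id ]. */
static unsigned apicid_generic(const struct topo *t, unsigned cpu_index)
{
    unsigned smt_id = cpu_index % t->threads;
    unsigned core_id = cpu_index / t->threads % t->cores;

    return (core_id << bits_for(t->threads)) | smt_id;
}

/*
 * Assumed node-aware layout: [ node_id | core_id | smt_id ], with the
 * node field placed directly above the core field.
 */
static unsigned apicid_node_aware(const struct topo *t, unsigned cpu_index)
{
    unsigned nodes = t->nodes ? t->nodes : 1;
    unsigned cpus = t->cores * t->threads;
    unsigned per_node = (cpus + nodes - 1) / nodes;   /* round up */
    unsigned node_id = cpu_index / per_node % nodes;
    unsigned node_shift = bits_for(t->threads) + bits_for(t->cores);

    return (node_id << node_shift) | apicid_generic(t, cpu_index);
}
```

With { .nodes = 2, .cores = 8, .threads = 1 } and cpu_index 7, the generic
function gives 7 and the node-aware one gives 15. With a single node (or none)
node_id is always 0, so both functions return the same value, which is
consistent with the statement that configurations without -numa are unaffected.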
 
> Dave
> 
> > (QEMU) qom-get path=/machine/unattached/device[8] property=apic-id
> > {
> >     "return": 7
> > }
> > 
> > vs
> > 
> > qemu-5.1 -cpu EPYC -smp 8,sockets=1,cores=8 -numa node,cpus=0-3 -numa node,cpus=4-7
> > (QEMU) qom-get path=/machine/unattached/device[8] property=apic-id
> > {
> >     "return": 15
> > }
> > 
> > we probably can't do anything based on machine type versions, as
> > 4.2 and older versions on qemu-5.0 and newer use different algorithm to calculate apic-id.
> > 
> > Hence was suggestion to leave 5.0/5.1 with broken apic id and revert back to
> > 4.2 algorithm, which should encode APIC ID correctly when '-smp dies' is used. 
> > 
> >   
> > > Dave
> > > 
> > >   
> > > > > > PS:
> > > > > >  I didn't review it yet, but with this series we aren't
> > > > > >  making up internal node_ids that should match user provided numa node ids somehow.
> > > > > >  It seems series lost the patch that would enforce numa in case -smp dies>1,
> > > > > >  but otherwise it heads in the right direction.    
> > > > > 
> > > > > Dave
> > > > >   
> > > > > > > 
> > > > > > > Dave
> > > > > > >     
> > > > > >     
> > > >   
> >   


Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode
Posted by Igor Mammedov 3 years, 8 months ago
On Fri, 21 Aug 2020 17:12:19 -0500
Babu Moger <babu.moger@amd.com> wrote:

> To support some of the complex topology, we introduced EPYC mode apicid decode.
> But, EPYC mode decode is running into problems. Also it can become quite a
> maintenance problem in the future. So, it was decided to remove that code and
> use the generic decode which works for majority of the topology. Most of the
> SPECed configuration would work just fine. With some non-SPECed user inputs,
> it will create some sub-optimal configuration.
> Here is the discussion thread.
> https://lore.kernel.org/qemu-devel/c0bcc1a6-1d84-a6e7-e468-d5b437c1b254@amd.com/
> 
> This series removes all the EPYC mode specific apicid changes and use the generic
> apicid decode.

the main difference between EPYC and all other CPUs is that
it requires numa configuration (it's not optional)
so we need an extra patch on top of this series to enforce that, i.e.:

 if (epyc && !numa) 
    error("EPYC cpu requires numa to be configured")

I think there was a patch in previous revisions that aimed for this.
Simplest form would be above snippet.

More complex one, would be moving auto_enable_numa from MachineClass to
MachineState so we can change it at runtime if EPYC is used. That should
take care of use case where user hasn't provided -numa.


Eduardo,
 is there any way to tell management that a particular CPU type requires
 -numa ?

> ---
> v5:
>  Revert EPYC specific decode.
>  Simplify CPUID_8000_001E
> 
> v4:
>   https://lore.kernel.org/qemu-devel/159744083536.39197.13827776633866601278.stgit@naples-babu.amd.com/
>   Not much of a change. Just added few text changes.
>   Error out configuration instead of warning if dies are not configured in EPYC.
>   Few other text changes to clarify the removal of node_id, nr_nodes and nodes_per_pkg.
> 
> v3:
>   https://lore.kernel.org/qemu-devel/159681772267.9679.1334429994189974662.stgit@naples-babu.amd.com/#r
>   Added a new check to pass the dies for EPYC numa configuration.
>   Added Simplify CPUID_8000_001E patch with some changes suggested by Igor.
>   Dropped the patch to build the topology from CpuInstanceProperties.
>   TODO: Not sure if we still need the Autonuma changes Igor mentioned.
>   Needs more clarity on that.
> 
> v2:
>   https://lore.kernel.org/qemu-devel/159362436285.36204.986406297373871949.stgit@naples-babu.amd.com/
>   Used the numa information from CpuInstanceProperties for building
>   the apic_id suggested by Igor.
>   Also did some minor code re-aarangement to take care of changes.
>   Dropped the patch "Simplify CPUID_8000_001E" from v1. Will send
>   it later.
> 
> v1:
>  https://lore.kernel.org/qemu-devel/159164739269.20543.3074052993891532749.stgit@naples-babu.amd.com
> 
> Babu Moger (8):
>       hw/i386: Remove node_id, nr_nodes and nodes_per_pkg from topology
>       Revert "i386: Fix pkg_id offset for EPYC cpu models"
>       Revert "target/i386: Enable new apic id encoding for EPYC based cpus models"
>       Revert "hw/i386: Move arch_id decode inside x86_cpus_init"
>       Revert "i386: Introduce use_epyc_apic_id_encoding in X86CPUDefinition"
>       Revert "hw/i386: Introduce apicid functions inside X86MachineState"
>       Revert "hw/386: Add EPYC mode topology decoding functions"
>       i386: Simplify CPUID_8000_001E for AMD
> 
> 
>  hw/i386/pc.c               |    8 +--
>  hw/i386/x86.c              |   43 +++-------------
>  include/hw/i386/topology.h |  101 ---------------------------------------
>  include/hw/i386/x86.h      |    9 ---
>  target/i386/cpu.c          |  115 ++++++++++++++++----------------------------
>  target/i386/cpu.h          |    3 -
>  tests/test-x86-cpuid.c     |   40 ++++++++-------
>  7 files changed, 73 insertions(+), 246 deletions(-)
> 
> --
> Signature
> 


Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode
Posted by Daniel P. Berrangé 3 years, 8 months ago
On Wed, Aug 26, 2020 at 02:38:49PM +0200, Igor Mammedov wrote:
> On Fri, 21 Aug 2020 17:12:19 -0500
> Babu Moger <babu.moger@amd.com> wrote:
> 
> > To support some of the complex topology, we introduced EPYC mode apicid decode.
> > But, EPYC mode decode is running into problems. Also it can become quite a
> > maintenance problem in the future. So, it was decided to remove that code and
> > use the generic decode which works for majority of the topology. Most of the
> > SPECed configuration would work just fine. With some non-SPECed user inputs,
> > it will create some sub-optimal configuration.
> > Here is the discussion thread.
> > https://lore.kernel.org/qemu-devel/c0bcc1a6-1d84-a6e7-e468-d5b437c1b254@amd.com/
> > 
> > This series removes all the EPYC mode specific apicid changes and use the generic
> > apicid decode.
> 
> the main difference between EPYC and all other CPUs is that
> it requires numa configuration (it's not optional)
> so we need an extra patch on top of this series to enforce that, i.e.:
> 
>  if (epyc && !numa) 
>     error("EPYC cpu requires numa to be configured")

Please no. This will break 90% of current usage of the EPYC CPU in
real world QEMU deployments. That is way too user hostile to introduce
as a requirement.

Why do we need to force this?  People have been successfully using
EPYC CPUs without NUMA in QEMU for years now.

It might not match behaviour of bare metal silicon, but that hasn't
obviously caused the world to come crashing down.


Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|


Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode
Posted by Igor Mammedov 3 years, 8 months ago
On Wed, 26 Aug 2020 13:50:59 +0100
Daniel P. Berrangé <berrange@redhat.com> wrote:

> On Wed, Aug 26, 2020 at 02:38:49PM +0200, Igor Mammedov wrote:
> > On Fri, 21 Aug 2020 17:12:19 -0500
> > Babu Moger <babu.moger@amd.com> wrote:
> >   
> > > To support some of the complex topology, we introduced EPYC mode apicid decode.
> > > But, EPYC mode decode is running into problems. Also it can become quite a
> > > maintenance problem in the future. So, it was decided to remove that code and
> > > use the generic decode which works for majority of the topology. Most of the
> > > SPECed configuration would work just fine. With some non-SPECed user inputs,
> > > it will create some sub-optimal configuration.
> > > Here is the discussion thread.
> > > https://lore.kernel.org/qemu-devel/c0bcc1a6-1d84-a6e7-e468-d5b437c1b254@amd.com/
> > > 
> > > This series removes all the EPYC mode specific apicid changes and use the generic
> > > apicid decode.  
> > 
> > the main difference between EPYC and all other CPUs is that
> > it requires numa configuration (it's not optional)
> > so we need an extra patch on top of this series to enforce that, i.e.:
> > 
> >  if (epyc && !numa) 
> >     error("EPYC cpu requires numa to be configured")  
> 
> Please no. This will break 90% of current usage of the EPYC CPU in
> real world QEMU deployments. That is way too user hostile to introduce
> as a requirement.
> 
> Why do we need to force this?  People have been successfully using
> EPYC CPUs without NUMA in QEMU for years now.
> 
> It might not match behaviour of bare metal silicon, but that hasn't
> obviously caused the world to come crashing down.
So far it produces a warning in the Linux kernel (RHBZ1728166),
(resulting performance might be suboptimal), but I haven't seen
anyone reporting crashes yet.


What other options do we have?
Perhaps we can turn on strict check for new machine types only,
so old configs can keep broken topology (CPUID),
while new ones would require -numa and produce correct topology.


> 
> Regards,
> Daniel


Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode
Posted by Daniel P. Berrangé 3 years, 8 months ago
On Wed, Aug 26, 2020 at 03:30:34PM +0200, Igor Mammedov wrote:
> On Wed, 26 Aug 2020 13:50:59 +0100
> Daniel P. Berrangé <berrange@redhat.com> wrote:
> 
> > On Wed, Aug 26, 2020 at 02:38:49PM +0200, Igor Mammedov wrote:
> > > On Fri, 21 Aug 2020 17:12:19 -0500
> > > Babu Moger <babu.moger@amd.com> wrote:
> > >   
> > > > To support some of the complex topology, we introduced EPYC mode apicid decode.
> > > > But, EPYC mode decode is running into problems. Also it can become quite a
> > > > maintenance problem in the future. So, it was decided to remove that code and
> > > > use the generic decode which works for majority of the topology. Most of the
> > > > SPECed configuration would work just fine. With some non-SPECed user inputs,
> > > > it will create some sub-optimal configuration.
> > > > Here is the discussion thread.
> > > > https://lore.kernel.org/qemu-devel/c0bcc1a6-1d84-a6e7-e468-d5b437c1b254@amd.com/
> > > > 
> > > > This series removes all the EPYC mode specific apicid changes and use the generic
> > > > apicid decode.  
> > > 
> > > the main difference between EPYC and all other CPUs is that
> > > it requires numa configuration (it's not optional)
> > > so we need an extra patch on top of this series to enforce that, i.e.:
> > > 
> > >  if (epyc && !numa) 
> > >     error("EPYC cpu requires numa to be configured")  
> > 
> > Please no. This will break 90% of current usage of the EPYC CPU in
> > real world QEMU deployments. That is way too user hostile to introduce
> > as a requirement.
> > 
> > Why do we need to force this?  People have been successfully using
> > EPYC CPUs without NUMA in QEMU for years now.
> > 
> > It might not match behaviour of bare metal silicon, but that hasn't
> > obviously caused the world to come crashing down.
> So far it produces warning in linux kernel (RHBZ1728166),
> (resulting performance might be suboptimal), but I haven't seen
> anyone reporting crashes yet.
> 
> 
> What other options do we have?
> Perhaps we can turn on strict check for new machine types only,
> so old configs can keep broken topology (CPUID),
> while new ones would require -numa and produce correct topology.

No, tying this to machine types is not viable either. That is still
going to break essentially every single management application that
exists today using QEMU.

Breaking existing apps is not acceptable for something that is
merely reporting sub-optimal performance. That's simply a documentation
task to highlight best practice to app developers.

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|


Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode
Posted by Igor Mammedov 3 years, 8 months ago
On Wed, 26 Aug 2020 14:36:38 +0100
Daniel P. Berrangé <berrange@redhat.com> wrote:

> On Wed, Aug 26, 2020 at 03:30:34PM +0200, Igor Mammedov wrote:
> > On Wed, 26 Aug 2020 13:50:59 +0100
> > Daniel P. Berrangé <berrange@redhat.com> wrote:
> >   
> > > On Wed, Aug 26, 2020 at 02:38:49PM +0200, Igor Mammedov wrote:  
> > > > On Fri, 21 Aug 2020 17:12:19 -0500
> > > > Babu Moger <babu.moger@amd.com> wrote:
> > > >     
> > > > > To support some of the complex topology, we introduced EPYC mode apicid decode.
> > > > > But, EPYC mode decode is running into problems. Also it can become quite a
> > > > > maintenance problem in the future. So, it was decided to remove that code and
> > > > > use the generic decode which works for majority of the topology. Most of the
> > > > > SPECed configuration would work just fine. With some non-SPECed user inputs,
> > > > > it will create some sub-optimal configuration.
> > > > > Here is the discussion thread.
> > > > > https://lore.kernel.org/qemu-devel/c0bcc1a6-1d84-a6e7-e468-d5b437c1b254@amd.com/
> > > > > 
> > > > > This series removes all the EPYC mode specific apicid changes and use the generic
> > > > > apicid decode.    
> > > > 
> > > > the main difference between EPYC and all other CPUs is that
> > > > it requires numa configuration (it's not optional)
> > > > so we need an extra patch on top of this series to enforce that, i.e.:
> > > > 
> > > >  if (epyc && !numa) 
> > > >     error("EPYC cpu requires numa to be configured")    
> > > 
> > > Please no. This will break 90% of current usage of the EPYC CPU in
> > > real world QEMU deployments. That is way too user hostile to introduce
> > > as a requirement.
> > > 
> > > Why do we need to force this?  People have been successfully using
> > > EPYC CPUs without NUMA in QEMU for years now.
> > > 
> > > It might not match behaviour of bare metal silicon, but that hasn't
> > > obviously caused the world to come crashing down.  
> > So far it produces warning in linux kernel (RHBZ1728166),
> > (resulting performance might be suboptimal), but I haven't seen
> > anyone reporting crashes yet.
> > 
> > 
> > What other options do we have?
> > Perhaps we can turn on strict check for new machine types only,
> > so old configs can keep broken topology (CPUID),
> > while new ones would require -numa and produce correct topology.  
> 
> No, tying this to machine types is not viable either. That is still
> going to break essentially every single management application that
> exists today using QEMU.
for that we have deprecation process, so users could switch to new CLI
that would be required.


> Breaking existing apps is not acceptable for something that is
> merely reporting sub-optimal performance. That's simply a documentation
> task to highlight best practice to app developers.
> 
> Regards,
> Daniel


Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode
Posted by Daniel P. Berrangé 3 years, 8 months ago
On Wed, Aug 26, 2020 at 04:02:58PM +0200, Igor Mammedov wrote:
> On Wed, 26 Aug 2020 14:36:38 +0100
> Daniel P. Berrangé <berrange@redhat.com> wrote:
> 
> > On Wed, Aug 26, 2020 at 03:30:34PM +0200, Igor Mammedov wrote:
> > > On Wed, 26 Aug 2020 13:50:59 +0100
> > > Daniel P. Berrangé <berrange@redhat.com> wrote:
> > >   
> > > > On Wed, Aug 26, 2020 at 02:38:49PM +0200, Igor Mammedov wrote:  
> > > > > On Fri, 21 Aug 2020 17:12:19 -0500
> > > > > Babu Moger <babu.moger@amd.com> wrote:
> > > > >     
> > > > > > To support some of the complex topology, we introduced EPYC mode apicid decode.
> > > > > > But, EPYC mode decode is running into problems. Also it can become quite a
> > > > > > maintenance problem in the future. So, it was decided to remove that code and
> > > > > > use the generic decode which works for majority of the topology. Most of the
> > > > > > SPECed configuration would work just fine. With some non-SPECed user inputs,
> > > > > > it will create some sub-optimal configuration.
> > > > > > Here is the discussion thread.
> > > > > > https://lore.kernel.org/qemu-devel/c0bcc1a6-1d84-a6e7-e468-d5b437c1b254@amd.com/
> > > > > > 
> > > > > > This series removes all the EPYC mode specific apicid changes and use the generic
> > > > > > apicid decode.    
> > > > > 
> > > > > the main difference between EPYC and all other CPUs is that
> > > > > it requires numa configuration (it's not optional)
> > > > > so we need an extra patch on top of this series to enforce that, i.e.:
> > > > > 
> > > > >  if (epyc && !numa) 
> > > > >     error("EPYC cpu requires numa to be configured")    
> > > > 
> > > > Please no. This will break 90% of current usage of the EPYC CPU in
> > > > real world QEMU deployments. That is way too user hostile to introduce
> > > > as a requirement.
> > > > 
> > > > Why do we need to force this?  People have been successfully using
> > > > EPYC CPUs without NUMA in QEMU for years now.
> > > > 
> > > > It might not match behaviour of bare metal silicon, but that hasn't
> > > > obviously caused the world to come crashing down.  
> > > So far it produces warning in linux kernel (RHBZ1728166),
> > > (resulting performance might be suboptimal), but I haven't seen
> > > anyone reporting crashes yet.
> > > 
> > > 
> > > What other options do we have?
> > > Perhaps we can turn on strict check for new machine types only,
> > > so old configs can keep broken topology (CPUID),
> > > while new ones would require -numa and produce correct topology.  
> > 
> > No, tying this to machine types is not viable either. That is still
> > going to break essentially every single management application that
> > exists today using QEMU.
> for that we have deprecation process, so users could switch to new CLI
> that would be required.

We could, but I don't find the cost/benefit tradeoff compelling.

There are so many places where we diverge from what bare metal would
do, that I don't see a good reason to introduce this breakage, even
if we notify users via a deprecation message. 

If QEMU wants to require NUMA for EPYC, then QEMU could internally
create a single NUMA node if none was specified for new machine
types, such that there is no visible change or breakage to any
mgmt apps.  
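
The difference between the strict check proposed earlier in the thread and
the implicit single node suggested here can be sketched as below. This is only
an illustration under stated assumptions: struct machine_cfg, enum numa_policy
and apply_numa_policy() are hypothetical names and do not correspond to
existing QEMU code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified view of a machine configuration. */
struct machine_cfg {
    bool cpu_is_epyc;       /* an EPYC CPU model was requested */
    unsigned numa_nodes;    /* number of -numa node options given */
    bool new_machine_type;  /* versioned machine type new enough to opt in */
};

enum numa_policy {
    NUMA_STRICT,  /* refuse to start when EPYC is used without -numa */
    NUMA_AUTO     /* silently create one node instead */
};

/*
 * Returns true if the machine may start. Under NUMA_AUTO it may set
 * numa_nodes to 1. Old machine types are never changed, so existing
 * guests keep the behaviour they have today.
 */
static bool apply_numa_policy(struct machine_cfg *cfg, enum numa_policy policy)
{
    assert(cfg);

    if (!cfg->cpu_is_epyc || cfg->numa_nodes > 0) {
        return true;            /* nothing to enforce */
    }
    if (!cfg->new_machine_type) {
        return true;            /* keep old behaviour untouched */
    }
    if (policy == NUMA_STRICT) {
        return false;           /* the hard-error variant */
    }
    cfg->numa_nodes = 1;        /* implicit single NUMA node */
    return true;
}
```

Under NUMA_STRICT a management application that never passes -numa fails to
start an EPYC guest on a new machine type, while under NUMA_AUTO the same
command line keeps working and the guest simply sees one node.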


Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|


Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode
Posted by Eduardo Habkost 3 years, 8 months ago
On Wed, Aug 26, 2020 at 04:03:40PM +0100, Daniel P. Berrangé wrote:
> On Wed, Aug 26, 2020 at 04:02:58PM +0200, Igor Mammedov wrote:
> > On Wed, 26 Aug 2020 14:36:38 +0100
> > Daniel P. Berrangé <berrange@redhat.com> wrote:
> > 
> > > On Wed, Aug 26, 2020 at 03:30:34PM +0200, Igor Mammedov wrote:
> > > > On Wed, 26 Aug 2020 13:50:59 +0100
> > > > Daniel P. Berrangé <berrange@redhat.com> wrote:
> > > >   
> > > > > On Wed, Aug 26, 2020 at 02:38:49PM +0200, Igor Mammedov wrote:  
> > > > > > On Fri, 21 Aug 2020 17:12:19 -0500
> > > > > > Babu Moger <babu.moger@amd.com> wrote:
> > > > > >     
> > > > > > > To support some of the complex topology, we introduced EPYC mode apicid decode.
> > > > > > > But, EPYC mode decode is running into problems. Also it can become quite a
> > > > > > > maintenance problem in the future. So, it was decided to remove that code and
> > > > > > > use the generic decode which works for majority of the topology. Most of the
> > > > > > > SPECed configuration would work just fine. With some non-SPECed user inputs,
> > > > > > > it will create some sub-optimal configuration.
> > > > > > > Here is the discussion thread.
> > > > > > > https://lore.kernel.org/qemu-devel/c0bcc1a6-1d84-a6e7-e468-d5b437c1b254@amd.com/
> > > > > > > 
> > > > > > > This series removes all the EPYC mode specific apicid changes and use the generic
> > > > > > > apicid decode.    
> > > > > > 
> > > > > > the main difference between EPYC and all other CPUs is that
> > > > > > it requires numa configuration (it's not optional)
> > > > > > so we need an extra patch on top of this series to enforce that, i.e.:
> > > > > > 
> > > > > >  if (epyc && !numa) 
> > > > > >     error("EPYC cpu requires numa to be configured")    
> > > > > 
> > > > > Please no. This will break 90% of current usage of the EPYC CPU in
> > > > > real world QEMU deployments. That is way too user hostile to introduce
> > > > > as a requirement.
> > > > > 
> > > > > Why do we need to force this?  People have been successfully using
> > > > > EPYC CPUs without NUMA in QEMU for years now.
> > > > > 
> > > > > It might not match behaviour of bare metal silicon, but that hasn't
> > > > > obviously caused the world to come crashing down.  
> > > > So far it produces warning in linux kernel (RHBZ1728166),
> > > > (resulting performance might be suboptimal), but I haven't seen
> > > > anyone reporting crashes yet.
> > > > 
> > > > 
> > > > What other options do we have?
> > > > Perhaps we can turn on strict check for new machine types only,
> > > > so old configs can keep broken topology (CPUID),
> > > > while new ones would require -numa and produce correct topology.  
> > > 
> > > No, tying this to machine types is not viable either. That is still
> > > going to break essentially every single management application that
> > > exists today using QEMU.
> > for that we have deprecation process, so users could switch to new CLI
> > that would be required.
> 
> We could, but I don't find the cost/benefit tradeoff is compelling.
> 
> There are so many places where we diverge from what bare metal would
> do, that I don't see a good reason to introduce this breakage, even
> if we notify users via a deprecation message. 
> 
> If QEMU wants to require NUMA for EPYC, then QEMU could internally
> create a single NUMA node if none was specified for new machine
> types, such that there is no visible change or breakage to any
> mgmt apps.  

Is anything expected to break if we just set
auto_enable_numa=true unconditionally on pc-*-5.2 and newer?
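
A minimal sketch of what "create a single NUMA node if none was specified for new machine types" could look like. The types and function names here (MachineVersion, NumaState, machine_auto_enables_numa, numa_complete_sketch) are hypothetical and are not QEMU's actual structures; only the auto_enable_numa idea and the 5.2 cutoff come from the discussion above.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for illustration; not QEMU types. */
typedef struct {
    int major;
    int minor;
} MachineVersion;

typedef struct {
    unsigned num_nodes; /* nodes given with -numa, 0 if none */
    bool implicit;      /* true if the node was created on the user's behalf */
} NumaState;

/* Assumption: the behaviour is tied to machine types 5.2 and newer. */
static bool machine_auto_enables_numa(MachineVersion v)
{
    return v.major > 5 || (v.major == 5 && v.minor >= 2);
}

/* If the user gave no -numa and the machine type opts in, add one node. */
static void numa_complete_sketch(NumaState *s, MachineVersion v)
{
    if (s->num_nodes == 0 && machine_auto_enables_numa(v)) {
        s->num_nodes = 1;
        s->implicit = true;
    }
}
```

Older machine types keep their current behaviour in this sketch, and an explicit -numa configuration is never overridden.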

-- 
Eduardo


Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode
Posted by Igor Mammedov 3 years, 8 months ago
On Wed, 26 Aug 2020 16:03:40 +0100
Daniel P. Berrangé <berrange@redhat.com> wrote:

> On Wed, Aug 26, 2020 at 04:02:58PM +0200, Igor Mammedov wrote:
> > On Wed, 26 Aug 2020 14:36:38 +0100
> > Daniel P. Berrangé <berrange@redhat.com> wrote:
> > 
> > > On Wed, Aug 26, 2020 at 03:30:34PM +0200, Igor Mammedov wrote:
> > > > On Wed, 26 Aug 2020 13:50:59 +0100
> > > > Daniel P. Berrangé <berrange@redhat.com> wrote:
> > > >   
> > > > > On Wed, Aug 26, 2020 at 02:38:49PM +0200, Igor Mammedov wrote:  
> > > > > > On Fri, 21 Aug 2020 17:12:19 -0500
> > > > > > Babu Moger <babu.moger@amd.com> wrote:
> > > > > >     
> > > > > > > To support some of the complex topology, we introduced EPYC mode apicid decode.
> > > > > > > But, EPYC mode decode is running into problems. Also it can become quite a
> > > > > > > maintenance problem in the future. So, it was decided to remove that code and
> > > > > > > use the generic decode which works for majority of the topology. Most of the
> > > > > > > SPECed configuration would work just fine. With some non-SPECed user inputs,
> > > > > > > it will create some sub-optimal configuration.
> > > > > > > Here is the discussion thread.
> > > > > > > https://lore.kernel.org/qemu-devel/c0bcc1a6-1d84-a6e7-e468-d5b437c1b254@amd.com/
> > > > > > > 
> > > > > > > This series removes all the EPYC mode specific apicid changes and use the generic
> > > > > > > apicid decode.    
> > > > > > 
> > > > > > the main difference between EPYC and all other CPUs is that
> > > > > > it requires numa configuration (it's not optional)
> > > > > > so we need an extra patch on top of this series to enfoce that, i.e:
> > > > > > 
> > > > > >  if (epyc && !numa) 
> > > > > >     error("EPYC cpu requires numa to be configured")    
> > > > > 
> > > > > Please no. This will break 90% of current usage of the EPYC CPU in
> > > > > real world QEMU deployments. That is way too user hostile to introduce
> > > > > as a requirement.
> > > > > 
> > > > > Why do we need to force this ?  People have been successfuly using
> > > > > EPYC CPUs without NUMA in QEMU for years now.
> > > > > 
> > > > > It might not match behaviour of bare metal silicon, but that hasn't
> > > > > obviously caused the world to come crashing down.  
> > > > So far it produces warning in linux kernel (RHBZ1728166),
> > > > (resulting performance might be suboptimal), but I haven't seen
> > > > anyone reporting crashes yet.
> > > > 
> > > > 
> > > > What other options do we have?
> > > > Perhaps we can turn on strict check for new machine types only,
> > > > so old configs can keep broken topology (CPUID),
> > > > while new ones would require -numa and produce correct topology.  
> > > 
> > > No, tieing this to machine types is not viable either. That is still
> > > going to break essentially every single management application that
> > > exists today using QEMU.
> > for that we have deprecation process, so users could switch to new CLI
> > that would be required.
> 
> We could, but I don't find the cost/benefit tradeoff is compelling.
> 
> There are so many places where we diverge from what bare metal would
> do, that I don't see a good reason to introduce this breakage, even
> if we notify users via a deprecation message. 
I find (3) and (4) good enough reasons to use deprecation.

> If QEMU wants to require NUMA for EPYC, then QEMU could internally
> create a single NUMA node if none was specified for new machine
> types, such that there is no visible change or breakage to any
> mgmt apps.  

(1) for configs that started without -numa and/or without -smp dies>1,
      QEMU can do just that (enable auto_enable_numa).

(2) As for configs that are out of spec, I do not care much (junk in, junk out)
(though not having to spend time on bug reports and debugging issues, just to say
it's not supported in the end, makes deprecation sound like a reasonable
choice)

(3) However, if the config matches bare metal, i.e. the CPU has more than 1 die and
stays within the die limits of the spec, QEMU has to produce valid CPUs.
In this case QEMU can't make up multiple numa nodes and mappings of RAM/CPUs
on the user's behalf. That's where we have to error out and ask for explicit
numa configuration.

For such configs, the current code (since 5.0) will, in the best case, produce
performance issues due to mismatching data in APICID, CPUID and ACPI tables.
In the worst case, issues might be related to an invalid APIC ID if running on an
EPYC host where HW takes into account subfields of the APIC ID (according to Babu,
the real CPU uses die_id (aka node_id) internally).
I'd rather error out on nonsense configs early than debug such issues
and then error out anyway later (upsetting more users).

(4)
If I were a non-hobby user, I'd hate it if QEMU allowed me to start an invalid config
that I'd then have to spend time debugging (including performance issues),
instead of clearly telling me what's wrong and how the config should be corrected.
I'd probably jump to another hypervisor that does the job right,
instead of digging into the QEMU codebase and CPU specs to figure out how
to hack and configure it.
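
Cases (1) to (3) above can be sketched as one classification step. This is only an illustration under stated assumptions: GuestConfig, ConfigVerdict and classify_config are hypothetical names, not QEMU code, the per-socket die limit is taken as an input rather than looked up, and case (1) is read as "no -numa and at most one die".

```c
#include <stdbool.h>

/* Hypothetical summary of the relevant command-line options. */
typedef struct {
    bool is_epyc;        /* an EPYC CPU model was requested */
    unsigned dies;       /* -smp dies=N, 1 if not given */
    unsigned max_dies;   /* per-socket die limit from the spec (assumed input) */
    unsigned numa_nodes; /* number of -numa node options, 0 if none */
} GuestConfig;

typedef enum {
    CONFIG_OK,          /* nothing special to do */
    CONFIG_AUTO_NUMA,   /* case (1): create a single implicit node */
    CONFIG_OUT_OF_SPEC, /* case (2): junk in, junk out; deprecation candidate */
    CONFIG_NEEDS_NUMA   /* case (3): error out and ask for explicit -numa */
} ConfigVerdict;

static ConfigVerdict classify_config(const GuestConfig *cfg)
{
    if (!cfg->is_epyc) {
        return CONFIG_OK;
    }
    if (cfg->dies > cfg->max_dies) {
        return CONFIG_OUT_OF_SPEC;
    }
    if (cfg->numa_nodes > 0) {
        return CONFIG_OK; /* user supplied the mapping explicitly */
    }
    if (cfg->dies <= 1) {
        return CONFIG_AUTO_NUMA;
    }
    /* Multiple dies, within spec, but no -numa: QEMU cannot guess the mapping. */
    return CONFIG_NEEDS_NUMA;
}
```

Whether CONFIG_NEEDS_NUMA becomes a hard error right away or only after a deprecation period is exactly the open question in this thread; the sketch does not settle it.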


> 
> 
> Regards,
> Daniel


Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode
Posted by Eduardo Habkost 3 years, 8 months ago
On Thu, Aug 27, 2020 at 07:03:14PM +0200, Igor Mammedov wrote:
> On Wed, 26 Aug 2020 16:03:40 +0100
> Daniel P. Berrangé <berrange@redhat.com> wrote:
> 
> > On Wed, Aug 26, 2020 at 04:02:58PM +0200, Igor Mammedov wrote:
> > > On Wed, 26 Aug 2020 14:36:38 +0100
> > > Daniel P. Berrangé <berrange@redhat.com> wrote:
> > > 
> > > > On Wed, Aug 26, 2020 at 03:30:34PM +0200, Igor Mammedov wrote:
> > > > > On Wed, 26 Aug 2020 13:50:59 +0100
> > > > > Daniel P. Berrangé <berrange@redhat.com> wrote:
> > > > >   
> > > > > > On Wed, Aug 26, 2020 at 02:38:49PM +0200, Igor Mammedov wrote:  
> > > > > > > On Fri, 21 Aug 2020 17:12:19 -0500
> > > > > > > Babu Moger <babu.moger@amd.com> wrote:
> > > > > > >     
> > > > > > > > To support some of the complex topology, we introduced EPYC mode apicid decode.
> > > > > > > > But, EPYC mode decode is running into problems. Also it can become quite a
> > > > > > > > maintenance problem in the future. So, it was decided to remove that code and
> > > > > > > > use the generic decode which works for majority of the topology. Most of the
> > > > > > > > SPECed configuration would work just fine. With some non-SPECed user inputs,
> > > > > > > > it will create some sub-optimal configuration.
> > > > > > > > Here is the discussion thread.
> > > > > > > > https://lore.kernel.org/qemu-devel/c0bcc1a6-1d84-a6e7-e468-d5b437c1b254@amd.com/
> > > > > > > > 
> > > > > > > > This series removes all the EPYC mode specific apicid changes and use the generic
> > > > > > > > apicid decode.    
> > > > > > > 
> > > > > > > the main difference between EPYC and all other CPUs is that
> > > > > > > it requires numa configuration (it's not optional)
> > > > > > > so we need an extra patch on top of this series to enfoce that, i.e:
> > > > > > > 
> > > > > > >  if (epyc && !numa) 
> > > > > > >     error("EPYC cpu requires numa to be configured")    
> > > > > > 
> > > > > > Please no. This will break 90% of current usage of the EPYC CPU in
> > > > > > real world QEMU deployments. That is way too user hostile to introduce
> > > > > > as a requirement.
> > > > > > 
> > > > > > Why do we need to force this ?  People have been successfuly using
> > > > > > EPYC CPUs without NUMA in QEMU for years now.
> > > > > > 
> > > > > > It might not match behaviour of bare metal silicon, but that hasn't
> > > > > > obviously caused the world to come crashing down.  
> > > > > So far it produces warning in linux kernel (RHBZ1728166),
> > > > > (resulting performance might be suboptimal), but I haven't seen
> > > > > anyone reporting crashes yet.
> > > > > 
> > > > > 
> > > > > What other options do we have?
> > > > > Perhaps we can turn on strict check for new machine types only,
> > > > > so old configs can keep broken topology (CPUID),
> > > > > while new ones would require -numa and produce correct topology.  
> > > > 
> > > > No, tieing this to machine types is not viable either. That is still
> > > > going to break essentially every single management application that
> > > > exists today using QEMU.
> > > for that we have deprecation process, so users could switch to new CLI
> > > that would be required.
> > 
> > We could, but I don't find the cost/benefit tradeoff is compelling.
> > 
> > There are so many places where we diverge from what bare metal would
> > do, that I don't see a good reason to introduce this breakage, even
> > if we notify users via a deprecation message. 
> I find (3) and (4) good enough reasons to use deprecation.
> 
> > If QEMU wants to require NUMA for EPYC, then QEMU could internally
> > create a single NUMA node if none was specified for new machine
> > types, such that there is no visible change or breakage to any
> > mgmt apps.  
> 
> (1) for configs that started without -numa &&|| without -smp dies>1,
>       QEMU can do just that (enable auto_enable_numa).

Why exactly do we need auto_enable_numa with dies=1?

If I understand correctly, Babu said earlier in this thread[1]
that we don't need auto_enable_numa.

[1] https://lore.kernel.org/qemu-devel/11489e5f-2285-ddb4-9c35-c9f522d603a0@amd.com/

> 
> (2) As for configs that are out of spec, I do not care much (junk in - junk out)
> (though not having to spend time on bug reports and debug issues, just to say
> it's not supported in the end, makes deprecation sound like a reasonable
> choice)
> 
> (3) However if config matches bare metal i.e. CPU has more than 1 die and within
> dies limits (spec wise), QEMU has to produce valid CPUs.
> In this case QEMU can't make up multiple numa nodes and mappings of RAM/CPUs
> on user's behalf. That's where we have to error out and ask for explicit
> numa configuration.
> 
> For such configs, current code (since 5.0), will produce in the best case
> performance issues  due to mismatching data in APICID, CPUID and ACPI tables,
> in the worst case issues might be related to invalid APIC ID if running on EPYC host
> and HW takes in account subfields of APIC ID (according to Babu real CPU uses
> die_id(aka node_id) internally).
> I'd rather error out on nonsense configs earlier than debug such issues
> and than error out anyways later (upsetting more users).
> 

The requirements are not clear to me.  Is this just about making
CPU die_id match the NUMA node ID, or are there additional
constraints?


> (4)
> If I were non hobby user, I'd hate if QEMU allowed me to start invalid config,
> that I'd have to spend time on debugging issues (including performance ones),
> instead of clearly telling me what's wrong and how config should be corrected.
> I'd probably jump to another hypervisor that does the job right,
> instead of digging into QEMU codebase and CPU specs to figure out how
> to hack and configure it.
> 

-- 
Eduardo


Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode
Posted by Igor Mammedov 3 years, 8 months ago
On Thu, 27 Aug 2020 15:07:52 -0400
Eduardo Habkost <ehabkost@redhat.com> wrote:

> On Thu, Aug 27, 2020 at 07:03:14PM +0200, Igor Mammedov wrote:
> > On Wed, 26 Aug 2020 16:03:40 +0100
> > Daniel P. Berrangé <berrange@redhat.com> wrote:
> > 
> > > On Wed, Aug 26, 2020 at 04:02:58PM +0200, Igor Mammedov wrote:
> > > > On Wed, 26 Aug 2020 14:36:38 +0100
> > > > Daniel P. Berrangé <berrange@redhat.com> wrote:
> > > > 
> > > > > On Wed, Aug 26, 2020 at 03:30:34PM +0200, Igor Mammedov wrote:
> > > > > > On Wed, 26 Aug 2020 13:50:59 +0100
> > > > > > Daniel P. Berrangé <berrange@redhat.com> wrote:
> > > > > >   
> > > > > > > On Wed, Aug 26, 2020 at 02:38:49PM +0200, Igor Mammedov wrote:  
> > > > > > > > On Fri, 21 Aug 2020 17:12:19 -0500
> > > > > > > > Babu Moger <babu.moger@amd.com> wrote:
> > > > > > > >     
> > > > > > > > > To support some of the complex topology, we introduced EPYC mode apicid decode.
> > > > > > > > > But, EPYC mode decode is running into problems. Also it can become quite a
> > > > > > > > > maintenance problem in the future. So, it was decided to remove that code and
> > > > > > > > > use the generic decode which works for majority of the topology. Most of the
> > > > > > > > > SPECed configuration would work just fine. With some non-SPECed user inputs,
> > > > > > > > > it will create some sub-optimal configuration.
> > > > > > > > > Here is the discussion thread.
> > > > > > > > > https://lore.kernel.org/qemu-devel/c0bcc1a6-1d84-a6e7-e468-d5b437c1b254@amd.com/
> > > > > > > > > 
> > > > > > > > > This series removes all the EPYC mode specific apicid changes and use the generic
> > > > > > > > > apicid decode.    
> > > > > > > > 
> > > > > > > > the main difference between EPYC and all other CPUs is that
> > > > > > > > it requires numa configuration (it's not optional)
> > > > > > > > so we need an extra patch on top of this series to enfoce that, i.e:
> > > > > > > > 
> > > > > > > >  if (epyc && !numa) 
> > > > > > > >     error("EPYC cpu requires numa to be configured")    
> > > > > > > 
> > > > > > > Please no. This will break 90% of current usage of the EPYC CPU in
> > > > > > > real world QEMU deployments. That is way too user hostile to introduce
> > > > > > > as a requirement.
> > > > > > > 
> > > > > > > Why do we need to force this ?  People have been successfuly using
> > > > > > > EPYC CPUs without NUMA in QEMU for years now.
> > > > > > > 
> > > > > > > It might not match behaviour of bare metal silicon, but that hasn't
> > > > > > > obviously caused the world to come crashing down.  
> > > > > > So far it produces warning in linux kernel (RHBZ1728166),
> > > > > > (resulting performance might be suboptimal), but I haven't seen
> > > > > > anyone reporting crashes yet.
> > > > > > 
> > > > > > 
> > > > > > What other options do we have?
> > > > > > Perhaps we can turn on strict check for new machine types only,
> > > > > > so old configs can keep broken topology (CPUID),
> > > > > > while new ones would require -numa and produce correct topology.  
> > > > > 
> > > > > No, tieing this to machine types is not viable either. That is still
> > > > > going to break essentially every single management application that
> > > > > exists today using QEMU.
> > > > for that we have deprecation process, so users could switch to new CLI
> > > > that would be required.
> > > 
> > > We could, but I don't find the cost/benefit tradeoff is compelling.
> > > 
> > > There are so many places where we diverge from what bare metal would
> > > do, that I don't see a good reason to introduce this breakage, even
> > > if we notify users via a deprecation message. 
> > I find (3) and (4) good enough reasons to use deprecation.
> > 
> > > If QEMU wants to require NUMA for EPYC, then QEMU could internally
> > > create a single NUMA node if none was specified for new machine
> > > types, such that there is no visible change or breakage to any
> > > mgmt apps.  
> > 
> > (1) for configs that started without -numa &&|| without -smp dies>1,
> >       QEMU can do just that (enable auto_enable_numa).
> 
> Why exactly do we need auto_enable_numa with dies=1?
> 
> If I understand correctly, Babu said earlier in this thread[1]
> that we don't need auto_enable_numa.
> 
> [1] https://lore.kernel.org/qemu-devel/11489e5f-2285-ddb4-9c35-c9f522d603a0@amd.com/

In case of 1 die, -numa is not a must-have as it's one numa node only.
Though having auto_enable_numa will allow us to reuse the CPU.node-id property
to compose CPUID_Fn8000001E_ECX, i.e. only one code path vs numa|non-numa variants.

 
> > (2) As for configs that are out of spec, I do not care much (junk in - junk out)
> > (though not having to spend time on bug reports and debug issues, just to say
> > it's not supported in the end, makes deprecation sound like a reasonable
> > choice)
> > 
> > (3) However if config matches bare metal i.e. CPU has more than 1 die and within
> > dies limits (spec wise), QEMU has to produce valid CPUs.
> > In this case QEMU can't make up multiple numa nodes and mappings of RAM/CPUs
> > on user's behalf. That's where we have to error out and ask for explicit
> > numa configuration.
> > 
> > For such configs, current code (since 5.0), will produce in the best case
> > performance issues  due to mismatching data in APICID, CPUID and ACPI tables,
> > in the worst case issues might be related to invalid APIC ID if running on EPYC host
> > and HW takes in account subfields of APIC ID (according to Babu real CPU uses
> > die_id(aka node_id) internally).
> > I'd rather error out on nonsense configs earlier than debug such issues
> > and than error out anyways later (upsetting more users).
> > 
> 
> The requirements are not clear to me.  Is this just about making
> CPU die_id match the NUMA node ID, or are there additional
> constraints?
die_id is a per-socket numa node index, so it's not a numa node id in
the sense we use it in qemu
(that's where all the confusion started that led to the current code)

I understood that each die in an EPYC chip is a numa node, whose
NUMA node ID (system wide) is encoded in CPUID_Fn8000001E_ECX; that's why I
wrote earlier that EPYC makes -numa non-optional.

In case of only one die we can either use auto_enable_numa to ensure
that we have consistent code, or special-case it and just hardcode the
CPUID_Fn8000001E_ECX value, which is hackish but will let us avoid
enabling numa (explicitly or implicitly).

In case of multiple dies, CPUID_Fn8000001E_ECX (which encodes the number of nodes +
the system-wide numa node id, judging by the CPUID of a real EPYC machine)
shall match the -numa mapping (otherwise it's a bug where CPUID and
ACPI mismatch).
Here we can go two ways:
  1) ask the user to provide a sane config with -numa (I'd prefer that)
     and use that info to fill in CPUID_Fn8000001E_ECX
  2) pretend that it's a non-numa machine, skip the ACPI SRAT table
     but make up CPUID_Fn8000001E (i.e. another special case)
     (requires another code path in addition to the -numa one)
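
For option 1), filling in the register from the -numa information could look roughly like the sketch below. The bit layout is an assumption taken from AMD's public Family 17h documentation (ECX[7:0] = NodeId, ECX[10:8] = NodesPerProcessor stored as count minus one), and the function name is hypothetical, not an existing QEMU helper.

```c
#include <stdint.h>

/*
 * Compose CPUID Fn8000_001E ECX.
 * Assumed layout: bits 7:0 hold the system-wide node id,
 * bits 10:8 hold (nodes per socket - 1).
 */
static uint32_t epyc_cpuid_8000001e_ecx(unsigned nodes_per_socket,
                                        unsigned node_id)
{
    if (nodes_per_socket == 0) {
        nodes_per_socket = 1; /* treat "not specified" as a single node */
    }
    return (((uint32_t)(nodes_per_socket - 1) & 0x7u) << 8) |
           ((uint32_t)node_id & 0xffu);
}
```

With a single node the result is 0, which is the value that the "just hardcode it" variant for one die would use under the same layout assumption.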



> 
> 
> > (4)
> > If I were non hobby user, I'd hate if QEMU allowed me to start invalid config,
> > that I'd have to spend time on debugging issues (including performance ones),
> > instead of clearly telling me what's wrong and how config should be corrected.
> > I'd probably jump to another hypervisor that does the job right,
> > instead of digging into QEMU codebase and CPU specs to figure out how
> > to hack and configure it.
> > 
> 


Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode
Posted by Daniel P. Berrangé 3 years, 8 months ago
On Thu, Aug 27, 2020 at 10:55:26PM +0200, Igor Mammedov wrote:
> On Thu, 27 Aug 2020 15:07:52 -0400
> Eduardo Habkost <ehabkost@redhat.com> wrote:
> 
> > On Thu, Aug 27, 2020 at 07:03:14PM +0200, Igor Mammedov wrote:
> > > On Wed, 26 Aug 2020 16:03:40 +0100
> > > Daniel P. Berrangé <berrange@redhat.com> wrote:
> > > 
> > > > On Wed, Aug 26, 2020 at 04:02:58PM +0200, Igor Mammedov wrote:
> > > > > On Wed, 26 Aug 2020 14:36:38 +0100
> > > > > Daniel P. Berrangé <berrange@redhat.com> wrote:
> > > > > 
> > > > > > On Wed, Aug 26, 2020 at 03:30:34PM +0200, Igor Mammedov wrote:
> > > > > > > On Wed, 26 Aug 2020 13:50:59 +0100
> > > > > > > Daniel P. Berrangé <berrange@redhat.com> wrote:
> > > > > > >   
> > > > > > > > On Wed, Aug 26, 2020 at 02:38:49PM +0200, Igor Mammedov wrote:  
> > > > > > > > > On Fri, 21 Aug 2020 17:12:19 -0500
> > > > > > > > > Babu Moger <babu.moger@amd.com> wrote:
> > > > > > > > >     
> > > > > > > > > > To support some of the complex topology, we introduced EPYC mode apicid decode.
> > > > > > > > > > But, EPYC mode decode is running into problems. Also it can become quite a
> > > > > > > > > > maintenance problem in the future. So, it was decided to remove that code and
> > > > > > > > > > use the generic decode which works for majority of the topology. Most of the
> > > > > > > > > > SPECed configuration would work just fine. With some non-SPECed user inputs,
> > > > > > > > > > it will create some sub-optimal configuration.
> > > > > > > > > > Here is the discussion thread.
> > > > > > > > > > https://lore.kernel.org/qemu-devel/c0bcc1a6-1d84-a6e7-e468-d5b437c1b254@amd.com/
> > > > > > > > > > 
> > > > > > > > > > This series removes all the EPYC mode specific apicid changes and use the generic
> > > > > > > > > > apicid decode.    
> > > > > > > > > 
> > > > > > > > > the main difference between EPYC and all other CPUs is that
> > > > > > > > > it requires numa configuration (it's not optional)
> > > > > > > > > so we need an extra patch on top of this series to enfoce that, i.e:
> > > > > > > > > 
> > > > > > > > >  if (epyc && !numa) 
> > > > > > > > >     error("EPYC cpu requires numa to be configured")    
> > > > > > > > 
> > > > > > > > Please no. This will break 90% of current usage of the EPYC CPU in
> > > > > > > > real world QEMU deployments. That is way too user hostile to introduce
> > > > > > > > as a requirement.
> > > > > > > > 
> > > > > > > > Why do we need to force this ?  People have been successfuly using
> > > > > > > > EPYC CPUs without NUMA in QEMU for years now.
> > > > > > > > 
> > > > > > > > It might not match behaviour of bare metal silicon, but that hasn't
> > > > > > > > obviously caused the world to come crashing down.  
> > > > > > > So far it produces warning in linux kernel (RHBZ1728166),
> > > > > > > (resulting performance might be suboptimal), but I haven't seen
> > > > > > > anyone reporting crashes yet.
> > > > > > > 
> > > > > > > 
> > > > > > > What other options do we have?
> > > > > > > Perhaps we can turn on strict check for new machine types only,
> > > > > > > so old configs can keep broken topology (CPUID),
> > > > > > > while new ones would require -numa and produce correct topology.  
> > > > > > 
> > > > > > No, tieing this to machine types is not viable either. That is still
> > > > > > going to break essentially every single management application that
> > > > > > exists today using QEMU.
> > > > > for that we have deprecation process, so users could switch to new CLI
> > > > > that would be required.
> > > > 
> > > > We could, but I don't find the cost/benefit tradeoff is compelling.
> > > > 
> > > > There are so many places where we diverge from what bare metal would
> > > > do, that I don't see a good reason to introduce this breakage, even
> > > > if we notify users via a deprecation message. 
> > > I find (3) and (4) good enough reasons to use deprecation.
> > > 
> > > > If QEMU wants to require NUMA for EPYC, then QEMU could internally
> > > > create a single NUMA node if none was specified for new machine
> > > > types, such that there is no visible change or breakage to any
> > > > mgmt apps.  
> > > 
> > > (1) for configs that started without -numa &&|| without -smp dies>1,
> > >       QEMU can do just that (enable auto_enable_numa).
> > 
> > Why exactly do we need auto_enable_numa with dies=1?
> > 
> > If I understand correctly, Babu said earlier in this thread[1]
> > that we don't need auto_enable_numa.
> > 
> > [1] https://lore.kernel.org/qemu-devel/11489e5f-2285-ddb4-9c35-c9f522d603a0@amd.com/
> 
> in case of 1 die, -numa is not must have as it's one numa node only.
> Though having auto_enable_numa, will allow to reuse the CPU.node-id property
> to compose CPUID_Fn8000001E_ECX. i.e only code one path vs numa|non-numa variant.
> 
>  
> > > (2) As for configs that are out of spec, I do not care much (junk in - junk out)
> > > (though not having to spend time on bug reports and debug issues, just to say
> > > it's not supported in the end, makes deprecation sound like a reasonable
> > > choice)
> > > 
> > > (3) However if config matches bare metal i.e. CPU has more than 1 die and within
> > > dies limits (spec wise), QEMU has to produce valid CPUs.
> > > In this case QEMU can't make up multiple numa nodes and mappings of RAM/CPUs
> > > on user's behalf. That's where we have to error out and ask for explicit
> > > numa configuration.
> > > 
> > > For such configs, current code (since 5.0), will produce in the best case
> > > performance issues  due to mismatching data in APICID, CPUID and ACPI tables,
> > > in the worst case issues might be related to invalid APIC ID if running on EPYC host
> > > and HW takes in account subfields of APIC ID (according to Babu real CPU uses
> > > die_id(aka node_id) internally).
> > > I'd rather error out on nonsense configs earlier than debug such issues
> > > and than error out anyways later (upsetting more users).
> > > 
> > 
> > The requirements are not clear to me.  Is this just about making
> > CPU die_id match the NUMA node ID, or are there additional
> > constraints?
> die_id is per socket numa node index, so it's not numa node id in
> a sense we use it in qemu
> (that's where all the confusion started that led to current code)
> 
> I understood that each die in EPYC chip is a numa node, which encodes
> NUMA node ID (system wide) in CPUID_Fn8000001E_ECX, that's why I
> wrote earlier that EPYC makes -numa non optional.

AFAIK, that isn't a hard requirement.  In the bare metal EPYC machine I
have used, the BIOS lets you choose whether the dies are exposed as
1, 2 or 4 NUMA nodes. So there's no fixed die == numa node mapping
that I see.


Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|


Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode
Posted by Eduardo Habkost 3 years, 8 months ago
On Fri, Aug 28, 2020 at 09:55:33AM +0100, Daniel P. Berrangé wrote:
> On Thu, Aug 27, 2020 at 10:55:26PM +0200, Igor Mammedov wrote:
> > On Thu, 27 Aug 2020 15:07:52 -0400
> > Eduardo Habkost <ehabkost@redhat.com> wrote:
> > 
> > > On Thu, Aug 27, 2020 at 07:03:14PM +0200, Igor Mammedov wrote:
> > > > On Wed, 26 Aug 2020 16:03:40 +0100
> > > > Daniel P. Berrangé <berrange@redhat.com> wrote:
> > > > 
> > > > > On Wed, Aug 26, 2020 at 04:02:58PM +0200, Igor Mammedov wrote:
> > > > > > On Wed, 26 Aug 2020 14:36:38 +0100
> > > > > > Daniel P. Berrangé <berrange@redhat.com> wrote:
> > > > > > 
> > > > > > > On Wed, Aug 26, 2020 at 03:30:34PM +0200, Igor Mammedov wrote:
> > > > > > > > On Wed, 26 Aug 2020 13:50:59 +0100
> > > > > > > > Daniel P. Berrangé <berrange@redhat.com> wrote:
> > > > > > > >   
> > > > > > > > > On Wed, Aug 26, 2020 at 02:38:49PM +0200, Igor Mammedov wrote:  
> > > > > > > > > > On Fri, 21 Aug 2020 17:12:19 -0500
> > > > > > > > > > Babu Moger <babu.moger@amd.com> wrote:
> > > > > > > > > >     
> > > > > > > > > > > To support some of the complex topology, we introduced EPYC mode apicid decode.
> > > > > > > > > > > But, EPYC mode decode is running into problems. Also it can become quite a
> > > > > > > > > > > maintenance problem in the future. So, it was decided to remove that code and
> > > > > > > > > > > use the generic decode which works for majority of the topology. Most of the
> > > > > > > > > > > SPECed configuration would work just fine. With some non-SPECed user inputs,
> > > > > > > > > > > it will create some sub-optimal configuration.
> > > > > > > > > > > Here is the discussion thread.
> > > > > > > > > > > https://lore.kernel.org/qemu-devel/c0bcc1a6-1d84-a6e7-e468-d5b437c1b254@amd.com/
> > > > > > > > > > > 
> > > > > > > > > > > This series removes all the EPYC mode specific apicid changes and use the generic
> > > > > > > > > > > apicid decode.    
> > > > > > > > > > 
> > > > > > > > > > the main difference between EPYC and all other CPUs is that
> > > > > > > > > > it requires numa configuration (it's not optional)
> > > > > > > > > > so we need an extra patch on top of this series to enfoce that, i.e:
> > > > > > > > > > 
> > > > > > > > > >  if (epyc && !numa) 
> > > > > > > > > >     error("EPYC cpu requires numa to be configured")    
> > > > > > > > > 
> > > > > > > > > Please no. This will break 90% of current usage of the EPYC CPU in
> > > > > > > > > real world QEMU deployments. That is way too user hostile to introduce
> > > > > > > > > as a requirement.
> > > > > > > > > 
> > > > > > > > > Why do we need to force this ?  People have been successfuly using
> > > > > > > > > EPYC CPUs without NUMA in QEMU for years now.
> > > > > > > > > 
> > > > > > > > > It might not match behaviour of bare metal silicon, but that hasn't
> > > > > > > > > obviously caused the world to come crashing down.  
> > > > > > > > So far it produces warning in linux kernel (RHBZ1728166),
> > > > > > > > (resulting performance might be suboptimal), but I haven't seen
> > > > > > > > anyone reporting crashes yet.
> > > > > > > > 
> > > > > > > > 
> > > > > > > > What other options do we have?
> > > > > > > > Perhaps we can turn on strict check for new machine types only,
> > > > > > > > so old configs can keep broken topology (CPUID),
> > > > > > > > while new ones would require -numa and produce correct topology.  
> > > > > > > 
> > > > > > > No, tieing this to machine types is not viable either. That is still
> > > > > > > going to break essentially every single management application that
> > > > > > > exists today using QEMU.
> > > > > > for that we have deprecation process, so users could switch to new CLI
> > > > > > that would be required.
> > > > > 
> > > > > We could, but I don't find the cost/benefit tradeoff is compelling.
> > > > > 
> > > > > There are so many places where we diverge from what bare metal would
> > > > > do, that I don't see a good reason to introduce this breakage, even
> > > > > if we notify users via a deprecation message. 
> > > > I find (3) and (4) good enough reasons to use deprecation.
> > > > 
> > > > > If QEMU wants to require NUMA for EPYC, then QEMU could internally
> > > > > create a single NUMA node if none was specified for new machine
> > > > > types, such that there is no visible change or breakage to any
> > > > > mgmt apps.  
> > > > 
> > > > (1) for configs that started without -numa &&|| without -smp dies>1,
> > > >       QEMU can do just that (enable auto_enable_numa).
> > > 
> > > Why exactly do we need auto_enable_numa with dies=1?
> > > 
> > > If I understand correctly, Babu said earlier in this thread[1]
> > > that we don't need auto_enable_numa.
> > > 
> > > [1] https://lore.kernel.org/qemu-devel/11489e5f-2285-ddb4-9c35-c9f522d603a0@amd.com/
> > 
> > in case of 1 die, -numa is not must have as it's one numa node only.
> > Though having auto_enable_numa, will allow to reuse the CPU.node-id property
> > to compose CPUID_Fn8000001E_ECX. i.e only code one path vs numa|non-numa variant.
> > 
> >  
> > > > (2) As for configs that are out of spec, I do not care much (junk in - junk out)
> > > > (though not having to spend time on bug reports and debug issues, just to say
> > > > it's not supported in the end, makes deprecation sound like a reasonable
> > > > choice)
> > > > 
> > > > (3) However if config matches bare metal i.e. CPU has more than 1 die and within
> > > > dies limits (spec wise), QEMU has to produce valid CPUs.
> > > > In this case QEMU can't make up multiple numa nodes and mappings of RAM/CPUs
> > > > on user's behalf. That's where we have to error out and ask for explicit
> > > > numa configuration.
> > > > 
> > > > For such configs, current code (since 5.0), will produce in the best case
> > > > performance issues  due to mismatching data in APICID, CPUID and ACPI tables,
> > > > in the worst case issues might be related to invalid APIC ID if running on EPYC host
> > > > and HW takes in account subfields of APIC ID (according to Babu real CPU uses
> > > > die_id(aka node_id) internally).
> > > > I'd rather error out on nonsense configs earlier than debug such issues
> > > > and than error out anyways later (upsetting more users).
> > > > 
> > > 
> > > The requirements are not clear to me.  Is this just about making
> > > CPU die_id match the NUMA node ID, or are there additional
> > > constraints?
> > die_id is per socket numa node index, so it's not numa node id in
> > a sense we use it in qemu
> > (that's where all the confusion started that led to current code)
> > 
> > I understood that each die in EPYC chip is a numa node, which encodes
> > NUMA node ID (system wide) in CPUID_Fn8000001E_ECX, that's why I
> > wrote earlier that EPYC makes -numa non optional.
> 
> AFAIK, that isnt a hard requirement.  In bare metal EPYC machine I
> have used, the BIOS lets you choose whether the dies are exposed as
> 1, 2 or 4 NUMA nodes. So there's no fixed  die == numa node mapping
> that I see.

If you change that setting, will all CPUID bits be kept the same,
or will the die topology seen by the OS change?

-- 
Eduardo


Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode
Posted by Daniel P. Berrangé 3 years, 8 months ago
On Fri, Aug 28, 2020 at 12:29:31PM -0400, Eduardo Habkost wrote:
> On Fri, Aug 28, 2020 at 09:55:33AM +0100, Daniel P. Berrangé wrote:
> > On Thu, Aug 27, 2020 at 10:55:26PM +0200, Igor Mammedov wrote:
> > > On Thu, 27 Aug 2020 15:07:52 -0400
> > > Eduardo Habkost <ehabkost@redhat.com> wrote:
> > > 
> > > > On Thu, Aug 27, 2020 at 07:03:14PM +0200, Igor Mammedov wrote:
> > > > > On Wed, 26 Aug 2020 16:03:40 +0100
> > > > > Daniel P. Berrangé <berrange@redhat.com> wrote:
> > > > > 
> > > > > > On Wed, Aug 26, 2020 at 04:02:58PM +0200, Igor Mammedov wrote:
> > > > > > > On Wed, 26 Aug 2020 14:36:38 +0100
> > > > > > > Daniel P. Berrangé <berrange@redhat.com> wrote:
> > > > > > > 
> > > > > > > > On Wed, Aug 26, 2020 at 03:30:34PM +0200, Igor Mammedov wrote:
> > > > > > > > > On Wed, 26 Aug 2020 13:50:59 +0100
> > > > > > > > > Daniel P. Berrangé <berrange@redhat.com> wrote:
> > > > > > > > >   
> > > > > > > > > > On Wed, Aug 26, 2020 at 02:38:49PM +0200, Igor Mammedov wrote:  
> > > > > > > > > > > On Fri, 21 Aug 2020 17:12:19 -0500
> > > > > > > > > > > Babu Moger <babu.moger@amd.com> wrote:
> > > > > > > > > > >     
> > > > > > > > > > > > To support some of the complex topology, we introduced EPYC mode apicid decode.
> > > > > > > > > > > > But, EPYC mode decode is running into problems. Also it can become quite a
> > > > > > > > > > > > maintenance problem in the future. So, it was decided to remove that code and
> > > > > > > > > > > > use the generic decode which works for majority of the topology. Most of the
> > > > > > > > > > > > SPECed configuration would work just fine. With some non-SPECed user inputs,
> > > > > > > > > > > > it will create some sub-optimal configuration.
> > > > > > > > > > > > Here is the discussion thread.
> > > > > > > > > > > > https://lore.kernel.org/qemu-devel/c0bcc1a6-1d84-a6e7-e468-d5b437c1b254@amd.com/
> > > > > > > > > > > > 
> > > > > > > > > > > > This series removes all the EPYC mode specific apicid changes and use the generic
> > > > > > > > > > > > apicid decode.    
> > > > > > > > > > > 
> > > > > > > > > > > the main difference between EPYC and all other CPUs is that
> > > > > > > > > > > it requires numa configuration (it's not optional)
> > > > > > > > > > > so we need an extra patch on top of this series to enfoce that, i.e:
> > > > > > > > > > > 
> > > > > > > > > > >  if (epyc && !numa) 
> > > > > > > > > > >     error("EPYC cpu requires numa to be configured")    
> > > > > > > > > > 
> > > > > > > > > > Please no. This will break 90% of current usage of the EPYC CPU in
> > > > > > > > > > real world QEMU deployments. That is way too user hostile to introduce
> > > > > > > > > > as a requirement.
> > > > > > > > > > 
> > > > > > > > > > Why do we need to force this ?  People have been successfuly using
> > > > > > > > > > EPYC CPUs without NUMA in QEMU for years now.
> > > > > > > > > > 
> > > > > > > > > > It might not match behaviour of bare metal silicon, but that hasn't
> > > > > > > > > > obviously caused the world to come crashing down.  
> > > > > > > > > So far it produces warning in linux kernel (RHBZ1728166),
> > > > > > > > > (resulting performance might be suboptimal), but I haven't seen
> > > > > > > > > anyone reporting crashes yet.
> > > > > > > > > 
> > > > > > > > > 
> > > > > > > > > What other options do we have?
> > > > > > > > > Perhaps we can turn on strict check for new machine types only,
> > > > > > > > > so old configs can keep broken topology (CPUID),
> > > > > > > > > while new ones would require -numa and produce correct topology.  
> > > > > > > > 
> > > > > > > > No, tieing this to machine types is not viable either. That is still
> > > > > > > > going to break essentially every single management application that
> > > > > > > > exists today using QEMU.
> > > > > > > for that we have deprecation process, so users could switch to new CLI
> > > > > > > that would be required.
> > > > > > 
> > > > > > We could, but I don't find the cost/benefit tradeoff is compelling.
> > > > > > 
> > > > > > There are so many places where we diverge from what bare metal would
> > > > > > do, that I don't see a good reason to introduce this breakage, even
> > > > > > if we notify users via a deprecation message. 
> > > > > I find (3) and (4) good enough reasons to use deprecation.
> > > > > 
> > > > > > If QEMU wants to require NUMA for EPYC, then QEMU could internally
> > > > > > create a single NUMA node if none was specified for new machine
> > > > > > types, such that there is no visible change or breakage to any
> > > > > > mgmt apps.  
> > > > > 
> > > > > (1) for configs that started without -numa &&|| without -smp dies>1,
> > > > >       QEMU can do just that (enable auto_enable_numa).
> > > > 
> > > > Why exactly do we need auto_enable_numa with dies=1?
> > > > 
> > > > If I understand correctly, Babu said earlier in this thread[1]
> > > > that we don't need auto_enable_numa.
> > > > 
> > > > [1] https://lore.kernel.org/qemu-devel/11489e5f-2285-ddb4-9c35-c9f522d603a0@amd.com/
> > > 
> > > in case of 1 die, -numa is not must have as it's one numa node only.
> > > Though having auto_enable_numa, will allow to reuse the CPU.node-id property
> > > to compose CPUID_Fn8000001E_ECX. i.e only code one path vs numa|non-numa variant.
> > > 
> > >  
> > > > > (2) As for configs that are out of spec, I do not care much (junk in - junk out)
> > > > > (though not having to spend time on bug reports and debug issues, just to say
> > > > > it's not supported in the end, makes deprecation sound like a reasonable
> > > > > choice)
> > > > > 
> > > > > (3) However if config matches bare metal i.e. CPU has more than 1 die and within
> > > > > dies limits (spec wise), QEMU has to produce valid CPUs.
> > > > > In this case QEMU can't make up multiple numa nodes and mappings of RAM/CPUs
> > > > > on user's behalf. That's where we have to error out and ask for explicit
> > > > > numa configuration.
> > > > > 
> > > > > For such configs, current code (since 5.0), will produce in the best case
> > > > > performance issues  due to mismatching data in APICID, CPUID and ACPI tables,
> > > > > in the worst case issues might be related to invalid APIC ID if running on EPYC host
> > > > > and HW takes in account subfields of APIC ID (according to Babu real CPU uses
> > > > > die_id(aka node_id) internally).
> > > > > I'd rather error out on nonsense configs earlier than debug such issues
> > > > > and than error out anyways later (upsetting more users).
> > > > > 
> > > > 
> > > > The requirements are not clear to me.  Is this just about making
> > > > CPU die_id match the NUMA node ID, or are there additional
> > > > constraints?
> > > die_id is per socket numa node index, so it's not numa node id in
> > > a sense we use it in qemu
> > > (that's where all the confusion started that led to current code)
> > > 
> > > I understood that each die in EPYC chip is a numa node, which encodes
> > > NUMA node ID (system wide) in CPUID_Fn8000001E_ECX, that's why I
> > > wrote earlier that EPYC makes -numa non optional.
> > 
> > AFAIK, that isnt a hard requirement.  In bare metal EPYC machine I
> > have used, the BIOS lets you choose whether the dies are exposed as
> > 1, 2 or 4 NUMA nodes. So there's no fixed  die == numa node mapping
> > that I see.
> 
> If you change that setting, will all CPUID bits be kept the same,
> or the die topology seen by the OS will change?

I don't know offhand, and don't currently have access to the hardware.
All I know is that I was able to change between 1, 2 and 4 NUMA nodes
and that was reflected in the numactl display. I didn't check the CPUID
when I was testing previously.


Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|


Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode
Posted by Eduardo Habkost 3 years, 8 months ago
On Fri, Aug 28, 2020 at 05:32:51PM +0100, Daniel P. Berrangé wrote:
> On Fri, Aug 28, 2020 at 12:29:31PM -0400, Eduardo Habkost wrote:
> > On Fri, Aug 28, 2020 at 09:55:33AM +0100, Daniel P. Berrangé wrote:
> > > On Thu, Aug 27, 2020 at 10:55:26PM +0200, Igor Mammedov wrote:
> > > > On Thu, 27 Aug 2020 15:07:52 -0400
> > > > Eduardo Habkost <ehabkost@redhat.com> wrote:
> > > > 
> > > > > On Thu, Aug 27, 2020 at 07:03:14PM +0200, Igor Mammedov wrote:
> > > > > > On Wed, 26 Aug 2020 16:03:40 +0100
> > > > > > Daniel P. Berrangé <berrange@redhat.com> wrote:
> > > > > > 
> > > > > > > On Wed, Aug 26, 2020 at 04:02:58PM +0200, Igor Mammedov wrote:
> > > > > > > > On Wed, 26 Aug 2020 14:36:38 +0100
> > > > > > > > Daniel P. Berrangé <berrange@redhat.com> wrote:
> > > > > > > > 
> > > > > > > > > On Wed, Aug 26, 2020 at 03:30:34PM +0200, Igor Mammedov wrote:
> > > > > > > > > > On Wed, 26 Aug 2020 13:50:59 +0100
> > > > > > > > > > Daniel P. Berrangé <berrange@redhat.com> wrote:
> > > > > > > > > >   
> > > > > > > > > > > On Wed, Aug 26, 2020 at 02:38:49PM +0200, Igor Mammedov wrote:  
> > > > > > > > > > > > On Fri, 21 Aug 2020 17:12:19 -0500
> > > > > > > > > > > > Babu Moger <babu.moger@amd.com> wrote:
> > > > > > > > > > > >     
> > > > > > > > > > > > > To support some of the complex topology, we introduced EPYC mode apicid decode.
> > > > > > > > > > > > > But, EPYC mode decode is running into problems. Also it can become quite a
> > > > > > > > > > > > > maintenance problem in the future. So, it was decided to remove that code and
> > > > > > > > > > > > > use the generic decode which works for majority of the topology. Most of the
> > > > > > > > > > > > > SPECed configuration would work just fine. With some non-SPECed user inputs,
> > > > > > > > > > > > > it will create some sub-optimal configuration.
> > > > > > > > > > > > > Here is the discussion thread.
> > > > > > > > > > > > > https://lore.kernel.org/qemu-devel/c0bcc1a6-1d84-a6e7-e468-d5b437c1b254@amd.com/
> > > > > > > > > > > > > 
> > > > > > > > > > > > > This series removes all the EPYC mode specific apicid changes and use the generic
> > > > > > > > > > > > > apicid decode.    
> > > > > > > > > > > > 
> > > > > > > > > > > > the main difference between EPYC and all other CPUs is that
> > > > > > > > > > > > it requires numa configuration (it's not optional)
> > > > > > > > > > > > so we need an extra patch on top of this series to enfoce that, i.e:
> > > > > > > > > > > > 
> > > > > > > > > > > >  if (epyc && !numa) 
> > > > > > > > > > > >     error("EPYC cpu requires numa to be configured")    
> > > > > > > > > > > 
> > > > > > > > > > > Please no. This will break 90% of current usage of the EPYC CPU in
> > > > > > > > > > > real world QEMU deployments. That is way too user hostile to introduce
> > > > > > > > > > > as a requirement.
> > > > > > > > > > > 
> > > > > > > > > > > Why do we need to force this ?  People have been successfuly using
> > > > > > > > > > > EPYC CPUs without NUMA in QEMU for years now.
> > > > > > > > > > > 
> > > > > > > > > > > It might not match behaviour of bare metal silicon, but that hasn't
> > > > > > > > > > > obviously caused the world to come crashing down.  
> > > > > > > > > > So far it produces warning in linux kernel (RHBZ1728166),
> > > > > > > > > > (resulting performance might be suboptimal), but I haven't seen
> > > > > > > > > > anyone reporting crashes yet.
> > > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > > > What other options do we have?
> > > > > > > > > > Perhaps we can turn on strict check for new machine types only,
> > > > > > > > > > so old configs can keep broken topology (CPUID),
> > > > > > > > > > while new ones would require -numa and produce correct topology.  
> > > > > > > > > 
> > > > > > > > > No, tieing this to machine types is not viable either. That is still
> > > > > > > > > going to break essentially every single management application that
> > > > > > > > > exists today using QEMU.
> > > > > > > > for that we have deprecation process, so users could switch to new CLI
> > > > > > > > that would be required.
> > > > > > > 
> > > > > > > We could, but I don't find the cost/benefit tradeoff is compelling.
> > > > > > > 
> > > > > > > There are so many places where we diverge from what bare metal would
> > > > > > > do, that I don't see a good reason to introduce this breakage, even
> > > > > > > if we notify users via a deprecation message. 
> > > > > > I find (3) and (4) good enough reasons to use deprecation.
> > > > > > 
> > > > > > > If QEMU wants to require NUMA for EPYC, then QEMU could internally
> > > > > > > create a single NUMA node if none was specified for new machine
> > > > > > > types, such that there is no visible change or breakage to any
> > > > > > > mgmt apps.  
> > > > > > 
> > > > > > (1) for configs that started without -numa &&|| without -smp dies>1,
> > > > > >       QEMU can do just that (enable auto_enable_numa).
> > > > > 
> > > > > Why exactly do we need auto_enable_numa with dies=1?
> > > > > 
> > > > > If I understand correctly, Babu said earlier in this thread[1]
> > > > > that we don't need auto_enable_numa.
> > > > > 
> > > > > [1] https://lore.kernel.org/qemu-devel/11489e5f-2285-ddb4-9c35-c9f522d603a0@amd.com/
> > > > 
> > > > in case of 1 die, -numa is not must have as it's one numa node only.
> > > > Though having auto_enable_numa, will allow to reuse the CPU.node-id property
> > > > to compose CPUID_Fn8000001E_ECX. i.e only code one path vs numa|non-numa variant.
> > > > 
> > > >  
> > > > > > (2) As for configs that are out of spec, I do not care much (junk in - junk out)
> > > > > > (though not having to spend time on bug reports and debug issues, just to say
> > > > > > it's not supported in the end, makes deprecation sound like a reasonable
> > > > > > choice)
> > > > > > 
> > > > > > (3) However if config matches bare metal i.e. CPU has more than 1 die and within
> > > > > > dies limits (spec wise), QEMU has to produce valid CPUs.
> > > > > > In this case QEMU can't make up multiple numa nodes and mappings of RAM/CPUs
> > > > > > on user's behalf. That's where we have to error out and ask for explicit
> > > > > > numa configuration.
> > > > > > 
> > > > > > For such configs, current code (since 5.0), will produce in the best case
> > > > > > performance issues  due to mismatching data in APICID, CPUID and ACPI tables,
> > > > > > in the worst case issues might be related to invalid APIC ID if running on EPYC host
> > > > > > and HW takes in account subfields of APIC ID (according to Babu real CPU uses
> > > > > > die_id(aka node_id) internally).
> > > > > > I'd rather error out on nonsense configs earlier than debug such issues
> > > > > > and than error out anyways later (upsetting more users).
> > > > > > 
> > > > > 
> > > > > The requirements are not clear to me.  Is this just about making
> > > > > CPU die_id match the NUMA node ID, or are there additional
> > > > > constraints?
> > > > die_id is per socket numa node index, so it's not numa node id in
> > > > a sense we use it in qemu
> > > > (that's where all the confusion started that led to current code)
> > > > 
> > > > I understood that each die in EPYC chip is a numa node, which encodes
> > > > NUMA node ID (system wide) in CPUID_Fn8000001E_ECX, that's why I
> > > > wrote earlier that EPYC makes -numa non optional.
> > > 
> > > AFAIK, that isnt a hard requirement.  In bare metal EPYC machine I
> > > have used, the BIOS lets you choose whether the dies are exposed as
> > > 1, 2 or 4 NUMA nodes. So there's no fixed  die == numa node mapping
> > > that I see.
> > 
> > If you change that setting, will all CPUID bits be kept the same,
> > or the die topology seen by the OS will change?
> 
> I don't know offhand, and don't currently have access to the hardware.
> All I know is that I was able to change between 1, 2 and 4 NUMA nodes
> and that was reflected in numactl display, I didn't check the CPUID
> when I was testing previously.

Babu, do you know the answer here?

If CPUID is kept the same with 1, 2 and 4 NUMA nodes, then having
NUMA configured is not a requirement at all.

-- 
Eduardo


Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode
Posted by Babu Moger 3 years, 8 months ago
Responding to Eduardo's question. Some emails are not coming to my
mailbox for some reason, so I am responding with git send-email --in-reply-to.


>> > > > I understood that each die in EPYC chip is a numa node, which encodes
>> > > > NUMA node ID (system wide) in CPUID_Fn8000001E_ECX, that's why I
>> > > > wrote earlier that EPYC makes -numa non optional.
>> > > 
>> > > AFAIK, that isnt a hard requirement.  In bare metal EPYC machine I
>> > > have used, the BIOS lets you choose whether the dies are exposed as
>> > > 1, 2 or 4 NUMA nodes. So there's no fixed  die == numa node mapping
>> > > that I see.
>> > 
>> > If you change that setting, will all CPUID bits be kept the same,
>> > or the die topology seen by the OS will change?
>> 
>> I don't know offhand, and don't currently have access to the hardware.
>> All I know is that I was able to change between 1, 2 and 4 NUMA nodes
>> and that was reflected in numactl display, I didn't check the CPUID
>> when I was testing previously.
>
>Babu, do you know the answer here?
>
>If CPUID is kept the same with 1, 2 and 4 NUMA nodes, then having
>NUMA configured is not a requirement at all.

Yes. The CPUID values are kept the same with 1, 2 and 4 NUMA nodes.
So, having NUMA configured is not a requirement. Following are the
details for NPS2 and NPS4. I am seeing the same behaviour with NPS1 also.

NPS2:
================================
#lscpu
Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
CPU(s):              256
On-line CPU(s) list: 0-255
Thread(s) per core:  2
Core(s) per socket:  64
Socket(s):           2
NUMA node(s):        4
Vendor ID:           AuthenticAMD
CPU family:          23
Model:               49
Model name:          AMD EPYC 7742 64-Core Processor
Stepping:            0
CPU MHz:             1785.033
CPU max MHz:         2250.0000
CPU min MHz:         1500.0000
BogoMIPS:            4491.51
Virtualization:      AMD-V
L1d cache:           32K
L1i cache:           32K
L2 cache:            512K
L3 cache:            16384K
NUMA node0 CPU(s):   0-31,128-159
NUMA node1 CPU(s):   32-63,160-191
NUMA node2 CPU(s):   64-95,192-223
NUMA node3 CPU(s):   96-127,224-255
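
To make the raw leaf 0x8000001e values in the dump easier to compare with
the lscpu NUMA layout above, here is a minimal decoding sketch. The field
layout (EAX = extended APIC ID, EBX[7:0] = core ID, EBX[15:8] = threads per
core minus one, ECX[7:0] = node ID, ECX[10:8] = nodes per processor minus
one) is an assumption taken from AMD's public family 17h documentation, not
from this thread, and the names topoext, decode_8000001e and print_topoext
are hypothetical helpers, not QEMU code.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Decoded view of CPUID leaf 0x8000001E.
 * Layout assumed from AMD's family 17h documentation (see note above).
 */
struct topoext {
    uint32_t ext_apic_id;         /* EAX[31:0]               */
    unsigned core_id;             /* EBX[7:0]                */
    unsigned threads_per_core;    /* EBX[15:8] + 1           */
    unsigned node_id;             /* ECX[7:0]                */
    unsigned nodes_per_processor; /* ECX[10:8] + 1           */
};

/* Split the raw register values into the fields listed above. */
struct topoext decode_8000001e(uint32_t eax, uint32_t ebx, uint32_t ecx)
{
    struct topoext t;

    t.ext_apic_id = eax;
    t.core_id = ebx & 0xff;
    t.threads_per_core = ((ebx >> 8) & 0xff) + 1;
    t.node_id = ecx & 0xff;
    t.nodes_per_processor = ((ecx >> 8) & 0x7) + 1;
    return t;
}

/* Print one decoded entry on a single line, for eyeballing a dump. */
void print_topoext(unsigned cpu, struct topoext t)
{
    printf("CPU %u: apicid=0x%x core=%u threads/core=%u node=%u nodes/proc=%u\n",
           cpu, (unsigned)t.ext_apic_id, t.core_id, t.threads_per_core,
           t.node_id, t.nodes_per_processor);
}
```

Fed with the values from the dump, CPU 32 (listed under NUMA node1 by lscpu)
decodes to node_id 0, while CPU 64 (listed under NUMA node2) decodes to
node_id 1. So in this NPS2 capture the node ID field in ECX does not follow
the NUMA node numbering that the OS reports.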

#cpuid -l 0x8000001e -r

CPU 0:
   0x8000001e 0x00: eax=0x00000000 ebx=0x00000100 ecx=0x00000000 edx=0x00000000
CPU 1:
   0x8000001e 0x00: eax=0x00000002 ebx=0x00000101 ecx=0x00000000 edx=0x00000000
CPU 2:
   0x8000001e 0x00: eax=0x00000004 ebx=0x00000102 ecx=0x00000000 edx=0x00000000
CPU 3:
   0x8000001e 0x00: eax=0x00000006 ebx=0x00000103 ecx=0x00000000 edx=0x00000000
CPU 4:
   0x8000001e 0x00: eax=0x00000008 ebx=0x00000104 ecx=0x00000000 edx=0x00000000
CPU 5:
   0x8000001e 0x00: eax=0x0000000a ebx=0x00000105 ecx=0x00000000 edx=0x00000000
CPU 6:
   0x8000001e 0x00: eax=0x0000000c ebx=0x00000106 ecx=0x00000000 edx=0x00000000
CPU 7:
   0x8000001e 0x00: eax=0x0000000e ebx=0x00000107 ecx=0x00000000 edx=0x00000000
CPU 8:
   0x8000001e 0x00: eax=0x00000010 ebx=0x00000108 ecx=0x00000000 edx=0x00000000
CPU 9:
   0x8000001e 0x00: eax=0x00000012 ebx=0x00000109 ecx=0x00000000 edx=0x00000000
CPU 10:
   0x8000001e 0x00: eax=0x00000014 ebx=0x0000010a ecx=0x00000000 edx=0x00000000
CPU 11:
   0x8000001e 0x00: eax=0x00000016 ebx=0x0000010b ecx=0x00000000 edx=0x00000000
CPU 12:
   0x8000001e 0x00: eax=0x00000018 ebx=0x0000010c ecx=0x00000000 edx=0x00000000
CPU 13:
   0x8000001e 0x00: eax=0x0000001a ebx=0x0000010d ecx=0x00000000 edx=0x00000000
CPU 14:
   0x8000001e 0x00: eax=0x0000001c ebx=0x0000010e ecx=0x00000000 edx=0x00000000
CPU 15:
   0x8000001e 0x00: eax=0x0000001e ebx=0x0000010f ecx=0x00000000 edx=0x00000000
CPU 16:
   0x8000001e 0x00: eax=0x00000020 ebx=0x00000110 ecx=0x00000000 edx=0x00000000
CPU 17:
   0x8000001e 0x00: eax=0x00000022 ebx=0x00000111 ecx=0x00000000 edx=0x00000000
CPU 18:
   0x8000001e 0x00: eax=0x00000024 ebx=0x00000112 ecx=0x00000000 edx=0x00000000
CPU 19:
   0x8000001e 0x00: eax=0x00000026 ebx=0x00000113 ecx=0x00000000 edx=0x00000000
CPU 20:
   0x8000001e 0x00: eax=0x00000028 ebx=0x00000114 ecx=0x00000000 edx=0x00000000
CPU 21:
   0x8000001e 0x00: eax=0x0000002a ebx=0x00000115 ecx=0x00000000 edx=0x00000000
CPU 22:
   0x8000001e 0x00: eax=0x0000002c ebx=0x00000116 ecx=0x00000000 edx=0x00000000
CPU 23:
   0x8000001e 0x00: eax=0x0000002e ebx=0x00000117 ecx=0x00000000 edx=0x00000000
CPU 24:
   0x8000001e 0x00: eax=0x00000030 ebx=0x00000118 ecx=0x00000000 edx=0x00000000
CPU 25:
   0x8000001e 0x00: eax=0x00000032 ebx=0x00000119 ecx=0x00000000 edx=0x00000000
CPU 26:
   0x8000001e 0x00: eax=0x00000034 ebx=0x0000011a ecx=0x00000000 edx=0x00000000
CPU 27:
   0x8000001e 0x00: eax=0x00000036 ebx=0x0000011b ecx=0x00000000 edx=0x00000000
CPU 28:
   0x8000001e 0x00: eax=0x00000038 ebx=0x0000011c ecx=0x00000000 edx=0x00000000
CPU 29:
   0x8000001e 0x00: eax=0x0000003a ebx=0x0000011d ecx=0x00000000 edx=0x00000000
CPU 30:
   0x8000001e 0x00: eax=0x0000003c ebx=0x0000011e ecx=0x00000000 edx=0x00000000
CPU 31:
   0x8000001e 0x00: eax=0x0000003e ebx=0x0000011f ecx=0x00000000 edx=0x00000000
CPU 32:
   0x8000001e 0x00: eax=0x00000040 ebx=0x00000120 ecx=0x00000000 edx=0x00000000
CPU 33:
   0x8000001e 0x00: eax=0x00000042 ebx=0x00000121 ecx=0x00000000 edx=0x00000000
CPU 34:
   0x8000001e 0x00: eax=0x00000044 ebx=0x00000122 ecx=0x00000000 edx=0x00000000
CPU 35:
   0x8000001e 0x00: eax=0x00000046 ebx=0x00000123 ecx=0x00000000 edx=0x00000000
CPU 36:
   0x8000001e 0x00: eax=0x00000048 ebx=0x00000124 ecx=0x00000000 edx=0x00000000
CPU 37:
   0x8000001e 0x00: eax=0x0000004a ebx=0x00000125 ecx=0x00000000 edx=0x00000000
CPU 38:
   0x8000001e 0x00: eax=0x0000004c ebx=0x00000126 ecx=0x00000000 edx=0x00000000
CPU 39:
   0x8000001e 0x00: eax=0x0000004e ebx=0x00000127 ecx=0x00000000 edx=0x00000000
CPU 40:
   0x8000001e 0x00: eax=0x00000050 ebx=0x00000128 ecx=0x00000000 edx=0x00000000
CPU 41:
   0x8000001e 0x00: eax=0x00000052 ebx=0x00000129 ecx=0x00000000 edx=0x00000000
CPU 42:
   0x8000001e 0x00: eax=0x00000054 ebx=0x0000012a ecx=0x00000000 edx=0x00000000
CPU 43:
   0x8000001e 0x00: eax=0x00000056 ebx=0x0000012b ecx=0x00000000 edx=0x00000000
CPU 44:
   0x8000001e 0x00: eax=0x00000058 ebx=0x0000012c ecx=0x00000000 edx=0x00000000
CPU 45:
   0x8000001e 0x00: eax=0x0000005a ebx=0x0000012d ecx=0x00000000 edx=0x00000000
CPU 46:
   0x8000001e 0x00: eax=0x0000005c ebx=0x0000012e ecx=0x00000000 edx=0x00000000
CPU 47:
   0x8000001e 0x00: eax=0x0000005e ebx=0x0000012f ecx=0x00000000 edx=0x00000000
CPU 48:
   0x8000001e 0x00: eax=0x00000060 ebx=0x00000130 ecx=0x00000000 edx=0x00000000
CPU 49:
   0x8000001e 0x00: eax=0x00000062 ebx=0x00000131 ecx=0x00000000 edx=0x00000000
CPU 50:
   0x8000001e 0x00: eax=0x00000064 ebx=0x00000132 ecx=0x00000000 edx=0x00000000
CPU 51:
   0x8000001e 0x00: eax=0x00000066 ebx=0x00000133 ecx=0x00000000 edx=0x00000000
CPU 52:
   0x8000001e 0x00: eax=0x00000068 ebx=0x00000134 ecx=0x00000000 edx=0x00000000
CPU 53:
   0x8000001e 0x00: eax=0x0000006a ebx=0x00000135 ecx=0x00000000 edx=0x00000000
CPU 54:
   0x8000001e 0x00: eax=0x0000006c ebx=0x00000136 ecx=0x00000000 edx=0x00000000
CPU 55:
   0x8000001e 0x00: eax=0x0000006e ebx=0x00000137 ecx=0x00000000 edx=0x00000000
CPU 56:
   0x8000001e 0x00: eax=0x00000070 ebx=0x00000138 ecx=0x00000000 edx=0x00000000
CPU 57:
   0x8000001e 0x00: eax=0x00000072 ebx=0x00000139 ecx=0x00000000 edx=0x00000000
CPU 58:
   0x8000001e 0x00: eax=0x00000074 ebx=0x0000013a ecx=0x00000000 edx=0x00000000
CPU 59:
   0x8000001e 0x00: eax=0x00000076 ebx=0x0000013b ecx=0x00000000 edx=0x00000000
CPU 60:
   0x8000001e 0x00: eax=0x00000078 ebx=0x0000013c ecx=0x00000000 edx=0x00000000
CPU 61:
   0x8000001e 0x00: eax=0x0000007a ebx=0x0000013d ecx=0x00000000 edx=0x00000000
CPU 62:
   0x8000001e 0x00: eax=0x0000007c ebx=0x0000013e ecx=0x00000000 edx=0x00000000
CPU 63:
   0x8000001e 0x00: eax=0x0000007e ebx=0x0000013f ecx=0x00000000 edx=0x00000000
CPU 64:
   0x8000001e 0x00: eax=0x00000080 ebx=0x00000100 ecx=0x00000001 edx=0x00000000
CPU 65:
   0x8000001e 0x00: eax=0x00000082 ebx=0x00000101 ecx=0x00000001 edx=0x00000000
CPU 66:
   0x8000001e 0x00: eax=0x00000084 ebx=0x00000102 ecx=0x00000001 edx=0x00000000
CPU 67:
   0x8000001e 0x00: eax=0x00000086 ebx=0x00000103 ecx=0x00000001 edx=0x00000000
CPU 68:
   0x8000001e 0x00: eax=0x00000088 ebx=0x00000104 ecx=0x00000001 edx=0x00000000
CPU 69:
   0x8000001e 0x00: eax=0x0000008a ebx=0x00000105 ecx=0x00000001 edx=0x00000000
CPU 70:
   0x8000001e 0x00: eax=0x0000008c ebx=0x00000106 ecx=0x00000001 edx=0x00000000
CPU 71:
   0x8000001e 0x00: eax=0x0000008e ebx=0x00000107 ecx=0x00000001 edx=0x00000000
CPU 72:
   0x8000001e 0x00: eax=0x00000090 ebx=0x00000108 ecx=0x00000001 edx=0x00000000
CPU 73:
   0x8000001e 0x00: eax=0x00000092 ebx=0x00000109 ecx=0x00000001 edx=0x00000000
CPU 74:
   0x8000001e 0x00: eax=0x00000094 ebx=0x0000010a ecx=0x00000001 edx=0x00000000
CPU 75:
   0x8000001e 0x00: eax=0x00000096 ebx=0x0000010b ecx=0x00000001 edx=0x00000000
CPU 76:
   0x8000001e 0x00: eax=0x00000098 ebx=0x0000010c ecx=0x00000001 edx=0x00000000
CPU 77:
   0x8000001e 0x00: eax=0x0000009a ebx=0x0000010d ecx=0x00000001 edx=0x00000000
CPU 78:
   0x8000001e 0x00: eax=0x0000009c ebx=0x0000010e ecx=0x00000001 edx=0x00000000
CPU 79:
   0x8000001e 0x00: eax=0x0000009e ebx=0x0000010f ecx=0x00000001 edx=0x00000000
CPU 80:
   0x8000001e 0x00: eax=0x000000a0 ebx=0x00000110 ecx=0x00000001 edx=0x00000000
CPU 81:
   0x8000001e 0x00: eax=0x000000a2 ebx=0x00000111 ecx=0x00000001 edx=0x00000000
CPU 82:
   0x8000001e 0x00: eax=0x000000a4 ebx=0x00000112 ecx=0x00000001 edx=0x00000000
CPU 83:
   0x8000001e 0x00: eax=0x000000a6 ebx=0x00000113 ecx=0x00000001 edx=0x00000000
CPU 84:
   0x8000001e 0x00: eax=0x000000a8 ebx=0x00000114 ecx=0x00000001 edx=0x00000000
CPU 85:
   0x8000001e 0x00: eax=0x000000aa ebx=0x00000115 ecx=0x00000001 edx=0x00000000
CPU 86:
   0x8000001e 0x00: eax=0x000000ac ebx=0x00000116 ecx=0x00000001 edx=0x00000000
CPU 87:
   0x8000001e 0x00: eax=0x000000ae ebx=0x00000117 ecx=0x00000001 edx=0x00000000
CPU 88:
   0x8000001e 0x00: eax=0x000000b0 ebx=0x00000118 ecx=0x00000001 edx=0x00000000
CPU 89:
   0x8000001e 0x00: eax=0x000000b2 ebx=0x00000119 ecx=0x00000001 edx=0x00000000
CPU 90:
   0x8000001e 0x00: eax=0x000000b4 ebx=0x0000011a ecx=0x00000001 edx=0x00000000
CPU 91:
   0x8000001e 0x00: eax=0x000000b6 ebx=0x0000011b ecx=0x00000001 edx=0x00000000
CPU 92:
   0x8000001e 0x00: eax=0x000000b8 ebx=0x0000011c ecx=0x00000001 edx=0x00000000
CPU 93:
   0x8000001e 0x00: eax=0x000000ba ebx=0x0000011d ecx=0x00000001 edx=0x00000000
CPU 94:
   0x8000001e 0x00: eax=0x000000bc ebx=0x0000011e ecx=0x00000001 edx=0x00000000
CPU 95:
   0x8000001e 0x00: eax=0x000000be ebx=0x0000011f ecx=0x00000001 edx=0x00000000
CPU 96:
   0x8000001e 0x00: eax=0x000000c0 ebx=0x00000120 ecx=0x00000001 edx=0x00000000
CPU 97:
   0x8000001e 0x00: eax=0x000000c2 ebx=0x00000121 ecx=0x00000001 edx=0x00000000
CPU 98:
   0x8000001e 0x00: eax=0x000000c4 ebx=0x00000122 ecx=0x00000001 edx=0x00000000
CPU 99:
   0x8000001e 0x00: eax=0x000000c6 ebx=0x00000123 ecx=0x00000001 edx=0x00000000
CPU 100:
   0x8000001e 0x00: eax=0x000000c8 ebx=0x00000124 ecx=0x00000001 edx=0x00000000
CPU 101:
   0x8000001e 0x00: eax=0x000000ca ebx=0x00000125 ecx=0x00000001 edx=0x00000000
CPU 102:
   0x8000001e 0x00: eax=0x000000cc ebx=0x00000126 ecx=0x00000001 edx=0x00000000
CPU 103:
   0x8000001e 0x00: eax=0x000000ce ebx=0x00000127 ecx=0x00000001 edx=0x00000000
CPU 104:
   0x8000001e 0x00: eax=0x000000d0 ebx=0x00000128 ecx=0x00000001 edx=0x00000000
CPU 105:
   0x8000001e 0x00: eax=0x000000d2 ebx=0x00000129 ecx=0x00000001 edx=0x00000000
CPU 106:
   0x8000001e 0x00: eax=0x000000d4 ebx=0x0000012a ecx=0x00000001 edx=0x00000000
CPU 107:
   0x8000001e 0x00: eax=0x000000d6 ebx=0x0000012b ecx=0x00000001 edx=0x00000000
CPU 108:
   0x8000001e 0x00: eax=0x000000d8 ebx=0x0000012c ecx=0x00000001 edx=0x00000000
CPU 109:
   0x8000001e 0x00: eax=0x000000da ebx=0x0000012d ecx=0x00000001 edx=0x00000000
CPU 110:
   0x8000001e 0x00: eax=0x000000dc ebx=0x0000012e ecx=0x00000001 edx=0x00000000
CPU 111:
   0x8000001e 0x00: eax=0x000000de ebx=0x0000012f ecx=0x00000001 edx=0x00000000
CPU 112:
   0x8000001e 0x00: eax=0x000000e0 ebx=0x00000130 ecx=0x00000001 edx=0x00000000
CPU 113:
   0x8000001e 0x00: eax=0x000000e2 ebx=0x00000131 ecx=0x00000001 edx=0x00000000
CPU 114:
   0x8000001e 0x00: eax=0x000000e4 ebx=0x00000132 ecx=0x00000001 edx=0x00000000
CPU 115:
   0x8000001e 0x00: eax=0x000000e6 ebx=0x00000133 ecx=0x00000001 edx=0x00000000
CPU 116:
   0x8000001e 0x00: eax=0x000000e8 ebx=0x00000134 ecx=0x00000001 edx=0x00000000
CPU 117:
   0x8000001e 0x00: eax=0x000000ea ebx=0x00000135 ecx=0x00000001 edx=0x00000000
CPU 118:
   0x8000001e 0x00: eax=0x000000ec ebx=0x00000136 ecx=0x00000001 edx=0x00000000
CPU 119:
   0x8000001e 0x00: eax=0x000000ee ebx=0x00000137 ecx=0x00000001 edx=0x00000000
CPU 120:
   0x8000001e 0x00: eax=0x000000f0 ebx=0x00000138 ecx=0x00000001 edx=0x00000000
CPU 121:
   0x8000001e 0x00: eax=0x000000f2 ebx=0x00000139 ecx=0x00000001 edx=0x00000000
CPU 122:
   0x8000001e 0x00: eax=0x000000f4 ebx=0x0000013a ecx=0x00000001 edx=0x00000000
CPU 123:
   0x8000001e 0x00: eax=0x000000f6 ebx=0x0000013b ecx=0x00000001 edx=0x00000000
CPU 124:
   0x8000001e 0x00: eax=0x000000f8 ebx=0x0000013c ecx=0x00000001 edx=0x00000000
CPU 125:
   0x8000001e 0x00: eax=0x000000fa ebx=0x0000013d ecx=0x00000001 edx=0x00000000
CPU 126:
   0x8000001e 0x00: eax=0x000000fc ebx=0x0000013e ecx=0x00000001 edx=0x00000000
CPU 127:
   0x8000001e 0x00: eax=0x000000fe ebx=0x0000013f ecx=0x00000001 edx=0x00000000
CPU 128:
   0x8000001e 0x00: eax=0x00000001 ebx=0x00000100 ecx=0x00000000 edx=0x00000000
CPU 129:
   0x8000001e 0x00: eax=0x00000003 ebx=0x00000101 ecx=0x00000000 edx=0x00000000
CPU 130:
   0x8000001e 0x00: eax=0x00000005 ebx=0x00000102 ecx=0x00000000 edx=0x00000000
CPU 131:
   0x8000001e 0x00: eax=0x00000007 ebx=0x00000103 ecx=0x00000000 edx=0x00000000
CPU 132:
   0x8000001e 0x00: eax=0x00000009 ebx=0x00000104 ecx=0x00000000 edx=0x00000000
CPU 133:
   0x8000001e 0x00: eax=0x0000000b ebx=0x00000105 ecx=0x00000000 edx=0x00000000
CPU 134:
   0x8000001e 0x00: eax=0x0000000d ebx=0x00000106 ecx=0x00000000 edx=0x00000000
CPU 135:
   0x8000001e 0x00: eax=0x0000000f ebx=0x00000107 ecx=0x00000000 edx=0x00000000
CPU 136:
   0x8000001e 0x00: eax=0x00000011 ebx=0x00000108 ecx=0x00000000 edx=0x00000000
CPU 137:
   0x8000001e 0x00: eax=0x00000013 ebx=0x00000109 ecx=0x00000000 edx=0x00000000
CPU 138:
   0x8000001e 0x00: eax=0x00000015 ebx=0x0000010a ecx=0x00000000 edx=0x00000000
CPU 139:
   0x8000001e 0x00: eax=0x00000017 ebx=0x0000010b ecx=0x00000000 edx=0x00000000
CPU 140:
   0x8000001e 0x00: eax=0x00000019 ebx=0x0000010c ecx=0x00000000 edx=0x00000000
CPU 141:
   0x8000001e 0x00: eax=0x0000001b ebx=0x0000010d ecx=0x00000000 edx=0x00000000
CPU 142:
   0x8000001e 0x00: eax=0x0000001d ebx=0x0000010e ecx=0x00000000 edx=0x00000000
CPU 143:
   0x8000001e 0x00: eax=0x0000001f ebx=0x0000010f ecx=0x00000000 edx=0x00000000
CPU 144:
   0x8000001e 0x00: eax=0x00000021 ebx=0x00000110 ecx=0x00000000 edx=0x00000000
CPU 145:
   0x8000001e 0x00: eax=0x00000023 ebx=0x00000111 ecx=0x00000000 edx=0x00000000
CPU 146:
   0x8000001e 0x00: eax=0x00000025 ebx=0x00000112 ecx=0x00000000 edx=0x00000000
CPU 147:
   0x8000001e 0x00: eax=0x00000027 ebx=0x00000113 ecx=0x00000000 edx=0x00000000
CPU 148:
   0x8000001e 0x00: eax=0x00000029 ebx=0x00000114 ecx=0x00000000 edx=0x00000000
CPU 149:
   0x8000001e 0x00: eax=0x0000002b ebx=0x00000115 ecx=0x00000000 edx=0x00000000
CPU 150:
   0x8000001e 0x00: eax=0x0000002d ebx=0x00000116 ecx=0x00000000 edx=0x00000000
CPU 151:
   0x8000001e 0x00: eax=0x0000002f ebx=0x00000117 ecx=0x00000000 edx=0x00000000
CPU 152:
   0x8000001e 0x00: eax=0x00000031 ebx=0x00000118 ecx=0x00000000 edx=0x00000000
CPU 153:
   0x8000001e 0x00: eax=0x00000033 ebx=0x00000119 ecx=0x00000000 edx=0x00000000
CPU 154:
   0x8000001e 0x00: eax=0x00000035 ebx=0x0000011a ecx=0x00000000 edx=0x00000000
CPU 155:
   0x8000001e 0x00: eax=0x00000037 ebx=0x0000011b ecx=0x00000000 edx=0x00000000
CPU 156:
   0x8000001e 0x00: eax=0x00000039 ebx=0x0000011c ecx=0x00000000 edx=0x00000000
CPU 157:
   0x8000001e 0x00: eax=0x0000003b ebx=0x0000011d ecx=0x00000000 edx=0x00000000
CPU 158:
   0x8000001e 0x00: eax=0x0000003d ebx=0x0000011e ecx=0x00000000 edx=0x00000000
CPU 159:
   0x8000001e 0x00: eax=0x0000003f ebx=0x0000011f ecx=0x00000000 edx=0x00000000
CPU 160:
   0x8000001e 0x00: eax=0x00000041 ebx=0x00000120 ecx=0x00000000 edx=0x00000000
CPU 161:
   0x8000001e 0x00: eax=0x00000043 ebx=0x00000121 ecx=0x00000000 edx=0x00000000
CPU 162:
   0x8000001e 0x00: eax=0x00000045 ebx=0x00000122 ecx=0x00000000 edx=0x00000000
CPU 163:
   0x8000001e 0x00: eax=0x00000047 ebx=0x00000123 ecx=0x00000000 edx=0x00000000
CPU 164:
   0x8000001e 0x00: eax=0x00000049 ebx=0x00000124 ecx=0x00000000 edx=0x00000000
CPU 165:
   0x8000001e 0x00: eax=0x0000004b ebx=0x00000125 ecx=0x00000000 edx=0x00000000
CPU 166:
   0x8000001e 0x00: eax=0x0000004d ebx=0x00000126 ecx=0x00000000 edx=0x00000000
CPU 167:
   0x8000001e 0x00: eax=0x0000004f ebx=0x00000127 ecx=0x00000000 edx=0x00000000
CPU 168:
   0x8000001e 0x00: eax=0x00000051 ebx=0x00000128 ecx=0x00000000 edx=0x00000000
CPU 169:
   0x8000001e 0x00: eax=0x00000053 ebx=0x00000129 ecx=0x00000000 edx=0x00000000
CPU 170:
   0x8000001e 0x00: eax=0x00000055 ebx=0x0000012a ecx=0x00000000 edx=0x00000000
CPU 171:
   0x8000001e 0x00: eax=0x00000057 ebx=0x0000012b ecx=0x00000000 edx=0x00000000
CPU 172:
   0x8000001e 0x00: eax=0x00000059 ebx=0x0000012c ecx=0x00000000 edx=0x00000000
CPU 173:
   0x8000001e 0x00: eax=0x0000005b ebx=0x0000012d ecx=0x00000000 edx=0x00000000
CPU 174:
   0x8000001e 0x00: eax=0x0000005d ebx=0x0000012e ecx=0x00000000 edx=0x00000000
CPU 175:
   0x8000001e 0x00: eax=0x0000005f ebx=0x0000012f ecx=0x00000000 edx=0x00000000
CPU 176:
   0x8000001e 0x00: eax=0x00000061 ebx=0x00000130 ecx=0x00000000 edx=0x00000000
CPU 177:
   0x8000001e 0x00: eax=0x00000063 ebx=0x00000131 ecx=0x00000000 edx=0x00000000
CPU 178:
   0x8000001e 0x00: eax=0x00000065 ebx=0x00000132 ecx=0x00000000 edx=0x00000000
CPU 179:
   0x8000001e 0x00: eax=0x00000067 ebx=0x00000133 ecx=0x00000000 edx=0x00000000
CPU 180:
   0x8000001e 0x00: eax=0x00000069 ebx=0x00000134 ecx=0x00000000 edx=0x00000000
CPU 181:
   0x8000001e 0x00: eax=0x0000006b ebx=0x00000135 ecx=0x00000000 edx=0x00000000
CPU 182:
   0x8000001e 0x00: eax=0x0000006d ebx=0x00000136 ecx=0x00000000 edx=0x00000000
CPU 183:
   0x8000001e 0x00: eax=0x0000006f ebx=0x00000137 ecx=0x00000000 edx=0x00000000
CPU 184:
   0x8000001e 0x00: eax=0x00000071 ebx=0x00000138 ecx=0x00000000 edx=0x00000000
CPU 185:
   0x8000001e 0x00: eax=0x00000073 ebx=0x00000139 ecx=0x00000000 edx=0x00000000
CPU 186:
   0x8000001e 0x00: eax=0x00000075 ebx=0x0000013a ecx=0x00000000 edx=0x00000000
CPU 187:
   0x8000001e 0x00: eax=0x00000077 ebx=0x0000013b ecx=0x00000000 edx=0x00000000
CPU 188:
   0x8000001e 0x00: eax=0x00000079 ebx=0x0000013c ecx=0x00000000 edx=0x00000000
CPU 189:
   0x8000001e 0x00: eax=0x0000007b ebx=0x0000013d ecx=0x00000000 edx=0x00000000
CPU 190:
   0x8000001e 0x00: eax=0x0000007d ebx=0x0000013e ecx=0x00000000 edx=0x00000000
CPU 191:
   0x8000001e 0x00: eax=0x0000007f ebx=0x0000013f ecx=0x00000000 edx=0x00000000
CPU 192:
   0x8000001e 0x00: eax=0x00000081 ebx=0x00000100 ecx=0x00000001 edx=0x00000000
CPU 193:
   0x8000001e 0x00: eax=0x00000083 ebx=0x00000101 ecx=0x00000001 edx=0x00000000
CPU 194:
   0x8000001e 0x00: eax=0x00000085 ebx=0x00000102 ecx=0x00000001 edx=0x00000000
CPU 195:
   0x8000001e 0x00: eax=0x00000087 ebx=0x00000103 ecx=0x00000001 edx=0x00000000
CPU 196:
   0x8000001e 0x00: eax=0x00000089 ebx=0x00000104 ecx=0x00000001 edx=0x00000000
CPU 197:
   0x8000001e 0x00: eax=0x0000008b ebx=0x00000105 ecx=0x00000001 edx=0x00000000
CPU 198:
   0x8000001e 0x00: eax=0x0000008d ebx=0x00000106 ecx=0x00000001 edx=0x00000000
CPU 199:
   0x8000001e 0x00: eax=0x0000008f ebx=0x00000107 ecx=0x00000001 edx=0x00000000
CPU 200:
   0x8000001e 0x00: eax=0x00000091 ebx=0x00000108 ecx=0x00000001 edx=0x00000000
CPU 201:
   0x8000001e 0x00: eax=0x00000093 ebx=0x00000109 ecx=0x00000001 edx=0x00000000
CPU 202:
   0x8000001e 0x00: eax=0x00000095 ebx=0x0000010a ecx=0x00000001 edx=0x00000000
CPU 203:
   0x8000001e 0x00: eax=0x00000097 ebx=0x0000010b ecx=0x00000001 edx=0x00000000
CPU 204:
   0x8000001e 0x00: eax=0x00000099 ebx=0x0000010c ecx=0x00000001 edx=0x00000000
CPU 205:
   0x8000001e 0x00: eax=0x0000009b ebx=0x0000010d ecx=0x00000001 edx=0x00000000
CPU 206:
   0x8000001e 0x00: eax=0x0000009d ebx=0x0000010e ecx=0x00000001 edx=0x00000000
CPU 207:
   0x8000001e 0x00: eax=0x0000009f ebx=0x0000010f ecx=0x00000001 edx=0x00000000
CPU 208:
   0x8000001e 0x00: eax=0x000000a1 ebx=0x00000110 ecx=0x00000001 edx=0x00000000
CPU 209:
   0x8000001e 0x00: eax=0x000000a3 ebx=0x00000111 ecx=0x00000001 edx=0x00000000
CPU 210:
   0x8000001e 0x00: eax=0x000000a5 ebx=0x00000112 ecx=0x00000001 edx=0x00000000
CPU 211:
   0x8000001e 0x00: eax=0x000000a7 ebx=0x00000113 ecx=0x00000001 edx=0x00000000
CPU 212:
   0x8000001e 0x00: eax=0x000000a9 ebx=0x00000114 ecx=0x00000001 edx=0x00000000
CPU 213:
   0x8000001e 0x00: eax=0x000000ab ebx=0x00000115 ecx=0x00000001 edx=0x00000000
CPU 214:
   0x8000001e 0x00: eax=0x000000ad ebx=0x00000116 ecx=0x00000001 edx=0x00000000
CPU 215:
   0x8000001e 0x00: eax=0x000000af ebx=0x00000117 ecx=0x00000001 edx=0x00000000
CPU 216:
   0x8000001e 0x00: eax=0x000000b1 ebx=0x00000118 ecx=0x00000001 edx=0x00000000
CPU 217:
   0x8000001e 0x00: eax=0x000000b3 ebx=0x00000119 ecx=0x00000001 edx=0x00000000
CPU 218:
   0x8000001e 0x00: eax=0x000000b5 ebx=0x0000011a ecx=0x00000001 edx=0x00000000
CPU 219:
   0x8000001e 0x00: eax=0x000000b7 ebx=0x0000011b ecx=0x00000001 edx=0x00000000
CPU 220:
   0x8000001e 0x00: eax=0x000000b9 ebx=0x0000011c ecx=0x00000001 edx=0x00000000
CPU 221:
   0x8000001e 0x00: eax=0x000000bb ebx=0x0000011d ecx=0x00000001 edx=0x00000000
CPU 222:
   0x8000001e 0x00: eax=0x000000bd ebx=0x0000011e ecx=0x00000001 edx=0x00000000
CPU 223:
   0x8000001e 0x00: eax=0x000000bf ebx=0x0000011f ecx=0x00000001 edx=0x00000000
CPU 224:
   0x8000001e 0x00: eax=0x000000c1 ebx=0x00000120 ecx=0x00000001 edx=0x00000000
CPU 225:
   0x8000001e 0x00: eax=0x000000c3 ebx=0x00000121 ecx=0x00000001 edx=0x00000000
CPU 226:
   0x8000001e 0x00: eax=0x000000c5 ebx=0x00000122 ecx=0x00000001 edx=0x00000000
CPU 227:
   0x8000001e 0x00: eax=0x000000c7 ebx=0x00000123 ecx=0x00000001 edx=0x00000000
CPU 228:
   0x8000001e 0x00: eax=0x000000c9 ebx=0x00000124 ecx=0x00000001 edx=0x00000000
CPU 229:
   0x8000001e 0x00: eax=0x000000cb ebx=0x00000125 ecx=0x00000001 edx=0x00000000
CPU 230:
   0x8000001e 0x00: eax=0x000000cd ebx=0x00000126 ecx=0x00000001 edx=0x00000000
CPU 231:
   0x8000001e 0x00: eax=0x000000cf ebx=0x00000127 ecx=0x00000001 edx=0x00000000
CPU 232:
   0x8000001e 0x00: eax=0x000000d1 ebx=0x00000128 ecx=0x00000001 edx=0x00000000
CPU 233:
   0x8000001e 0x00: eax=0x000000d3 ebx=0x00000129 ecx=0x00000001 edx=0x00000000
CPU 234:
   0x8000001e 0x00: eax=0x000000d5 ebx=0x0000012a ecx=0x00000001 edx=0x00000000
CPU 235:
   0x8000001e 0x00: eax=0x000000d7 ebx=0x0000012b ecx=0x00000001 edx=0x00000000
CPU 236:
   0x8000001e 0x00: eax=0x000000d9 ebx=0x0000012c ecx=0x00000001 edx=0x00000000
CPU 237:
   0x8000001e 0x00: eax=0x000000db ebx=0x0000012d ecx=0x00000001 edx=0x00000000
CPU 238:
   0x8000001e 0x00: eax=0x000000dd ebx=0x0000012e ecx=0x00000001 edx=0x00000000
CPU 239:
   0x8000001e 0x00: eax=0x000000df ebx=0x0000012f ecx=0x00000001 edx=0x00000000
CPU 240:
   0x8000001e 0x00: eax=0x000000e1 ebx=0x00000130 ecx=0x00000001 edx=0x00000000
CPU 241:
   0x8000001e 0x00: eax=0x000000e3 ebx=0x00000131 ecx=0x00000001 edx=0x00000000
CPU 242:
   0x8000001e 0x00: eax=0x000000e5 ebx=0x00000132 ecx=0x00000001 edx=0x00000000
CPU 243:
   0x8000001e 0x00: eax=0x000000e7 ebx=0x00000133 ecx=0x00000001 edx=0x00000000
CPU 244:
   0x8000001e 0x00: eax=0x000000e9 ebx=0x00000134 ecx=0x00000001 edx=0x00000000
CPU 245:
   0x8000001e 0x00: eax=0x000000eb ebx=0x00000135 ecx=0x00000001 edx=0x00000000
CPU 246:
   0x8000001e 0x00: eax=0x000000ed ebx=0x00000136 ecx=0x00000001 edx=0x00000000
CPU 247:
   0x8000001e 0x00: eax=0x000000ef ebx=0x00000137 ecx=0x00000001 edx=0x00000000
CPU 248:
   0x8000001e 0x00: eax=0x000000f1 ebx=0x00000138 ecx=0x00000001 edx=0x00000000
CPU 249:
   0x8000001e 0x00: eax=0x000000f3 ebx=0x00000139 ecx=0x00000001 edx=0x00000000
CPU 250:
   0x8000001e 0x00: eax=0x000000f5 ebx=0x0000013a ecx=0x00000001 edx=0x00000000
CPU 251:
   0x8000001e 0x00: eax=0x000000f7 ebx=0x0000013b ecx=0x00000001 edx=0x00000000
CPU 252:
   0x8000001e 0x00: eax=0x000000f9 ebx=0x0000013c ecx=0x00000001 edx=0x00000000
CPU 253:
   0x8000001e 0x00: eax=0x000000fb ebx=0x0000013d ecx=0x00000001 edx=0x00000000
CPU 254:
   0x8000001e 0x00: eax=0x000000fd ebx=0x0000013e ecx=0x00000001 edx=0x00000000
CPU 255:
   0x8000001e 0x00: eax=0x000000ff ebx=0x0000013f ecx=0x00000001 edx=0x00000000
[root@rome NPS]#
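
For readers who want to interpret the raw register values in these dumps, here is a minimal sketch that decodes one line of `cpuid -l 0x8000001e -r` output. The field layout is an assumption taken from the AMD programmer's manual description of CPUID Fn8000_001E for family 17h (EAX = extended APIC ID, EBX[7:0] = core ID, EBX[15:8] = threads per core minus 1, ECX[7:0] = node ID, ECX[10:8] = nodes per processor minus 1). The helper name `decode_8000001e` is hypothetical and is not part of this series.

```python
import re

# Matches one raw line as printed by `cpuid -l 0x8000001e -r`.
LINE_RE = re.compile(
    r"0x8000001e 0x00: eax=0x([0-9a-f]{8}) ebx=0x([0-9a-f]{8}) "
    r"ecx=0x([0-9a-f]{8}) edx=0x([0-9a-f]{8})"
)


def decode_8000001e(line):
    """Decode one raw CPUID 0x8000001e line into topology fields.

    Field positions are assumed from the AMD manual for family 17h;
    this is an illustration, not code from the patch series.
    """
    m = LINE_RE.search(line)
    if m is None:
        raise ValueError("not a CPUID 0x8000001e line: %r" % line)
    eax, ebx, ecx, _edx = (int(g, 16) for g in m.groups())
    return {
        "ext_apic_id": eax,                          # EAX[31:0]
        "core_id": ebx & 0xFF,                       # EBX[7:0]
        "threads_per_core": ((ebx >> 8) & 0xFF) + 1,  # EBX[15:8] + 1
        "node_id": ecx & 0xFF,                       # ECX[7:0]
        "nodes_per_processor": ((ecx >> 8) & 0x7) + 1,  # ECX[10:8] + 1
    }


if __name__ == "__main__":
    # Line reported for CPU 255 in the dump above.
    sample = ("   0x8000001e 0x00: eax=0x000000ff ebx=0x0000013f "
              "ecx=0x00000001 edx=0x00000000")
    print(decode_8000001e(sample))
```

Under that assumed layout, the CPU 255 line decodes to APIC ID 255, core ID 63, two threads per core, and node ID 1.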


NPS4:
================================
Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
CPU(s):              256
On-line CPU(s) list: 0-255
Thread(s) per core:  2
Core(s) per socket:  64
Socket(s):           2
NUMA node(s):        8
Vendor ID:           AuthenticAMD
CPU family:          23
Model:               49
Model name:          AMD EPYC 7742 64-Core Processor
Stepping:            0
CPU MHz:             1862.249
CPU max MHz:         2250.0000
CPU min MHz:         1500.0000
BogoMIPS:            4491.24
Virtualization:      AMD-V
L1d cache:           32K
L1i cache:           32K
L2 cache:            512K
L3 cache:            16384K
NUMA node0 CPU(s):   0-15,128-143
NUMA node1 CPU(s):   16-31,144-159
NUMA node2 CPU(s):   32-47,160-175
NUMA node3 CPU(s):   48-63,176-191
NUMA node4 CPU(s):   64-79,192-207
NUMA node5 CPU(s):   80-95,208-223
NUMA node6 CPU(s):   96-111,224-239
NUMA node7 CPU(s):   112-127,240-255

#cpuid -l 0x8000001e -r

CPU 0:
   0x8000001e 0x00: eax=0x00000000 ebx=0x00000100 ecx=0x00000000 edx=0x00000000
CPU 1:
   0x8000001e 0x00: eax=0x00000002 ebx=0x00000101 ecx=0x00000000 edx=0x00000000
CPU 2:
   0x8000001e 0x00: eax=0x00000004 ebx=0x00000102 ecx=0x00000000 edx=0x00000000
CPU 3:
   0x8000001e 0x00: eax=0x00000006 ebx=0x00000103 ecx=0x00000000 edx=0x00000000
CPU 4:
   0x8000001e 0x00: eax=0x00000008 ebx=0x00000104 ecx=0x00000000 edx=0x00000000
CPU 5:
   0x8000001e 0x00: eax=0x0000000a ebx=0x00000105 ecx=0x00000000 edx=0x00000000
CPU 6:
   0x8000001e 0x00: eax=0x0000000c ebx=0x00000106 ecx=0x00000000 edx=0x00000000
CPU 7:
   0x8000001e 0x00: eax=0x0000000e ebx=0x00000107 ecx=0x00000000 edx=0x00000000
CPU 8:
   0x8000001e 0x00: eax=0x00000010 ebx=0x00000108 ecx=0x00000000 edx=0x00000000
CPU 9:
   0x8000001e 0x00: eax=0x00000012 ebx=0x00000109 ecx=0x00000000 edx=0x00000000
CPU 10:
   0x8000001e 0x00: eax=0x00000014 ebx=0x0000010a ecx=0x00000000 edx=0x00000000
CPU 11:
   0x8000001e 0x00: eax=0x00000016 ebx=0x0000010b ecx=0x00000000 edx=0x00000000
CPU 12:
   0x8000001e 0x00: eax=0x00000018 ebx=0x0000010c ecx=0x00000000 edx=0x00000000
CPU 13:
   0x8000001e 0x00: eax=0x0000001a ebx=0x0000010d ecx=0x00000000 edx=0x00000000
CPU 14:
   0x8000001e 0x00: eax=0x0000001c ebx=0x0000010e ecx=0x00000000 edx=0x00000000
CPU 15:
   0x8000001e 0x00: eax=0x0000001e ebx=0x0000010f ecx=0x00000000 edx=0x00000000
CPU 16:
   0x8000001e 0x00: eax=0x00000020 ebx=0x00000110 ecx=0x00000000 edx=0x00000000
CPU 17:
   0x8000001e 0x00: eax=0x00000022 ebx=0x00000111 ecx=0x00000000 edx=0x00000000
CPU 18:
   0x8000001e 0x00: eax=0x00000024 ebx=0x00000112 ecx=0x00000000 edx=0x00000000
CPU 19:
   0x8000001e 0x00: eax=0x00000026 ebx=0x00000113 ecx=0x00000000 edx=0x00000000
CPU 20:
   0x8000001e 0x00: eax=0x00000028 ebx=0x00000114 ecx=0x00000000 edx=0x00000000
CPU 21:
   0x8000001e 0x00: eax=0x0000002a ebx=0x00000115 ecx=0x00000000 edx=0x00000000
CPU 22:
   0x8000001e 0x00: eax=0x0000002c ebx=0x00000116 ecx=0x00000000 edx=0x00000000
CPU 23:
   0x8000001e 0x00: eax=0x0000002e ebx=0x00000117 ecx=0x00000000 edx=0x00000000
CPU 24:
   0x8000001e 0x00: eax=0x00000030 ebx=0x00000118 ecx=0x00000000 edx=0x00000000
CPU 25:
   0x8000001e 0x00: eax=0x00000032 ebx=0x00000119 ecx=0x00000000 edx=0x00000000
CPU 26:
   0x8000001e 0x00: eax=0x00000034 ebx=0x0000011a ecx=0x00000000 edx=0x00000000
CPU 27:
   0x8000001e 0x00: eax=0x00000036 ebx=0x0000011b ecx=0x00000000 edx=0x00000000
CPU 28:
   0x8000001e 0x00: eax=0x00000038 ebx=0x0000011c ecx=0x00000000 edx=0x00000000
CPU 29:
   0x8000001e 0x00: eax=0x0000003a ebx=0x0000011d ecx=0x00000000 edx=0x00000000
CPU 30:
   0x8000001e 0x00: eax=0x0000003c ebx=0x0000011e ecx=0x00000000 edx=0x00000000
CPU 31:
   0x8000001e 0x00: eax=0x0000003e ebx=0x0000011f ecx=0x00000000 edx=0x00000000
CPU 32:
   0x8000001e 0x00: eax=0x00000040 ebx=0x00000120 ecx=0x00000000 edx=0x00000000
CPU 33:
   0x8000001e 0x00: eax=0x00000042 ebx=0x00000121 ecx=0x00000000 edx=0x00000000
CPU 34:
   0x8000001e 0x00: eax=0x00000044 ebx=0x00000122 ecx=0x00000000 edx=0x00000000
CPU 35:
   0x8000001e 0x00: eax=0x00000046 ebx=0x00000123 ecx=0x00000000 edx=0x00000000
CPU 36:
   0x8000001e 0x00: eax=0x00000048 ebx=0x00000124 ecx=0x00000000 edx=0x00000000
CPU 37:
   0x8000001e 0x00: eax=0x0000004a ebx=0x00000125 ecx=0x00000000 edx=0x00000000
CPU 38:
   0x8000001e 0x00: eax=0x0000004c ebx=0x00000126 ecx=0x00000000 edx=0x00000000
CPU 39:
   0x8000001e 0x00: eax=0x0000004e ebx=0x00000127 ecx=0x00000000 edx=0x00000000
CPU 40:
   0x8000001e 0x00: eax=0x00000050 ebx=0x00000128 ecx=0x00000000 edx=0x00000000
CPU 41:
   0x8000001e 0x00: eax=0x00000052 ebx=0x00000129 ecx=0x00000000 edx=0x00000000
CPU 42:
   0x8000001e 0x00: eax=0x00000054 ebx=0x0000012a ecx=0x00000000 edx=0x00000000
CPU 43:
   0x8000001e 0x00: eax=0x00000056 ebx=0x0000012b ecx=0x00000000 edx=0x00000000
CPU 44:
   0x8000001e 0x00: eax=0x00000058 ebx=0x0000012c ecx=0x00000000 edx=0x00000000
CPU 45:
   0x8000001e 0x00: eax=0x0000005a ebx=0x0000012d ecx=0x00000000 edx=0x00000000
CPU 46:
   0x8000001e 0x00: eax=0x0000005c ebx=0x0000012e ecx=0x00000000 edx=0x00000000
CPU 47:
   0x8000001e 0x00: eax=0x0000005e ebx=0x0000012f ecx=0x00000000 edx=0x00000000
CPU 48:
   0x8000001e 0x00: eax=0x00000060 ebx=0x00000130 ecx=0x00000000 edx=0x00000000
CPU 49:
   0x8000001e 0x00: eax=0x00000062 ebx=0x00000131 ecx=0x00000000 edx=0x00000000
CPU 50:
   0x8000001e 0x00: eax=0x00000064 ebx=0x00000132 ecx=0x00000000 edx=0x00000000
CPU 51:
   0x8000001e 0x00: eax=0x00000066 ebx=0x00000133 ecx=0x00000000 edx=0x00000000
CPU 52:
   0x8000001e 0x00: eax=0x00000068 ebx=0x00000134 ecx=0x00000000 edx=0x00000000
CPU 53:
   0x8000001e 0x00: eax=0x0000006a ebx=0x00000135 ecx=0x00000000 edx=0x00000000
CPU 54:
   0x8000001e 0x00: eax=0x0000006c ebx=0x00000136 ecx=0x00000000 edx=0x00000000
CPU 55:
   0x8000001e 0x00: eax=0x0000006e ebx=0x00000137 ecx=0x00000000 edx=0x00000000
CPU 56:
   0x8000001e 0x00: eax=0x00000070 ebx=0x00000138 ecx=0x00000000 edx=0x00000000
CPU 57:
   0x8000001e 0x00: eax=0x00000072 ebx=0x00000139 ecx=0x00000000 edx=0x00000000
CPU 58:
   0x8000001e 0x00: eax=0x00000074 ebx=0x0000013a ecx=0x00000000 edx=0x00000000
CPU 59:
   0x8000001e 0x00: eax=0x00000076 ebx=0x0000013b ecx=0x00000000 edx=0x00000000
CPU 60:
   0x8000001e 0x00: eax=0x00000078 ebx=0x0000013c ecx=0x00000000 edx=0x00000000
CPU 61:
   0x8000001e 0x00: eax=0x0000007a ebx=0x0000013d ecx=0x00000000 edx=0x00000000
CPU 62:
   0x8000001e 0x00: eax=0x0000007c ebx=0x0000013e ecx=0x00000000 edx=0x00000000
CPU 63:
   0x8000001e 0x00: eax=0x0000007e ebx=0x0000013f ecx=0x00000000 edx=0x00000000
CPU 64:
   0x8000001e 0x00: eax=0x00000080 ebx=0x00000100 ecx=0x00000001 edx=0x00000000
CPU 65:
   0x8000001e 0x00: eax=0x00000082 ebx=0x00000101 ecx=0x00000001 edx=0x00000000
CPU 66:
   0x8000001e 0x00: eax=0x00000084 ebx=0x00000102 ecx=0x00000001 edx=0x00000000
CPU 67:
   0x8000001e 0x00: eax=0x00000086 ebx=0x00000103 ecx=0x00000001 edx=0x00000000
CPU 68:
   0x8000001e 0x00: eax=0x00000088 ebx=0x00000104 ecx=0x00000001 edx=0x00000000
CPU 69:
   0x8000001e 0x00: eax=0x0000008a ebx=0x00000105 ecx=0x00000001 edx=0x00000000
CPU 70:
   0x8000001e 0x00: eax=0x0000008c ebx=0x00000106 ecx=0x00000001 edx=0x00000000
CPU 71:
   0x8000001e 0x00: eax=0x0000008e ebx=0x00000107 ecx=0x00000001 edx=0x00000000
CPU 72:
   0x8000001e 0x00: eax=0x00000090 ebx=0x00000108 ecx=0x00000001 edx=0x00000000
CPU 73:
   0x8000001e 0x00: eax=0x00000092 ebx=0x00000109 ecx=0x00000001 edx=0x00000000
CPU 74:
   0x8000001e 0x00: eax=0x00000094 ebx=0x0000010a ecx=0x00000001 edx=0x00000000
CPU 75:
   0x8000001e 0x00: eax=0x00000096 ebx=0x0000010b ecx=0x00000001 edx=0x00000000
CPU 76:
   0x8000001e 0x00: eax=0x00000098 ebx=0x0000010c ecx=0x00000001 edx=0x00000000
CPU 77:
   0x8000001e 0x00: eax=0x0000009a ebx=0x0000010d ecx=0x00000001 edx=0x00000000
CPU 78:
   0x8000001e 0x00: eax=0x0000009c ebx=0x0000010e ecx=0x00000001 edx=0x00000000
CPU 79:
   0x8000001e 0x00: eax=0x0000009e ebx=0x0000010f ecx=0x00000001 edx=0x00000000
CPU 80:
   0x8000001e 0x00: eax=0x000000a0 ebx=0x00000110 ecx=0x00000001 edx=0x00000000
CPU 81:
   0x8000001e 0x00: eax=0x000000a2 ebx=0x00000111 ecx=0x00000001 edx=0x00000000
CPU 82:
   0x8000001e 0x00: eax=0x000000a4 ebx=0x00000112 ecx=0x00000001 edx=0x00000000
CPU 83:
   0x8000001e 0x00: eax=0x000000a6 ebx=0x00000113 ecx=0x00000001 edx=0x00000000
CPU 84:
   0x8000001e 0x00: eax=0x000000a8 ebx=0x00000114 ecx=0x00000001 edx=0x00000000
CPU 85:
   0x8000001e 0x00: eax=0x000000aa ebx=0x00000115 ecx=0x00000001 edx=0x00000000
CPU 86:
   0x8000001e 0x00: eax=0x000000ac ebx=0x00000116 ecx=0x00000001 edx=0x00000000
CPU 87:
   0x8000001e 0x00: eax=0x000000ae ebx=0x00000117 ecx=0x00000001 edx=0x00000000
CPU 88:
   0x8000001e 0x00: eax=0x000000b0 ebx=0x00000118 ecx=0x00000001 edx=0x00000000
CPU 89:
   0x8000001e 0x00: eax=0x000000b2 ebx=0x00000119 ecx=0x00000001 edx=0x00000000
CPU 90:
   0x8000001e 0x00: eax=0x000000b4 ebx=0x0000011a ecx=0x00000001 edx=0x00000000
CPU 91:
   0x8000001e 0x00: eax=0x000000b6 ebx=0x0000011b ecx=0x00000001 edx=0x00000000
CPU 92:
   0x8000001e 0x00: eax=0x000000b8 ebx=0x0000011c ecx=0x00000001 edx=0x00000000
CPU 93:
   0x8000001e 0x00: eax=0x000000ba ebx=0x0000011d ecx=0x00000001 edx=0x00000000
CPU 94:
   0x8000001e 0x00: eax=0x000000bc ebx=0x0000011e ecx=0x00000001 edx=0x00000000
CPU 95:
   0x8000001e 0x00: eax=0x000000be ebx=0x0000011f ecx=0x00000001 edx=0x00000000
CPU 96:
   0x8000001e 0x00: eax=0x000000c0 ebx=0x00000120 ecx=0x00000001 edx=0x00000000
CPU 97:
   0x8000001e 0x00: eax=0x000000c2 ebx=0x00000121 ecx=0x00000001 edx=0x00000000
CPU 98:
   0x8000001e 0x00: eax=0x000000c4 ebx=0x00000122 ecx=0x00000001 edx=0x00000000
CPU 99:
   0x8000001e 0x00: eax=0x000000c6 ebx=0x00000123 ecx=0x00000001 edx=0x00000000
CPU 100:
   0x8000001e 0x00: eax=0x000000c8 ebx=0x00000124 ecx=0x00000001 edx=0x00000000
CPU 101:
   0x8000001e 0x00: eax=0x000000ca ebx=0x00000125 ecx=0x00000001 edx=0x00000000
CPU 102:
   0x8000001e 0x00: eax=0x000000cc ebx=0x00000126 ecx=0x00000001 edx=0x00000000
CPU 103:
   0x8000001e 0x00: eax=0x000000ce ebx=0x00000127 ecx=0x00000001 edx=0x00000000
CPU 104:
   0x8000001e 0x00: eax=0x000000d0 ebx=0x00000128 ecx=0x00000001 edx=0x00000000
CPU 105:
   0x8000001e 0x00: eax=0x000000d2 ebx=0x00000129 ecx=0x00000001 edx=0x00000000
CPU 106:
   0x8000001e 0x00: eax=0x000000d4 ebx=0x0000012a ecx=0x00000001 edx=0x00000000
CPU 107:
   0x8000001e 0x00: eax=0x000000d6 ebx=0x0000012b ecx=0x00000001 edx=0x00000000
CPU 108:
   0x8000001e 0x00: eax=0x000000d8 ebx=0x0000012c ecx=0x00000001 edx=0x00000000
CPU 109:
   0x8000001e 0x00: eax=0x000000da ebx=0x0000012d ecx=0x00000001 edx=0x00000000
CPU 110:
   0x8000001e 0x00: eax=0x000000dc ebx=0x0000012e ecx=0x00000001 edx=0x00000000
CPU 111:
   0x8000001e 0x00: eax=0x000000de ebx=0x0000012f ecx=0x00000001 edx=0x00000000
CPU 112:
   0x8000001e 0x00: eax=0x000000e0 ebx=0x00000130 ecx=0x00000001 edx=0x00000000
CPU 113:
   0x8000001e 0x00: eax=0x000000e2 ebx=0x00000131 ecx=0x00000001 edx=0x00000000
CPU 114:
   0x8000001e 0x00: eax=0x000000e4 ebx=0x00000132 ecx=0x00000001 edx=0x00000000
CPU 115:
   0x8000001e 0x00: eax=0x000000e6 ebx=0x00000133 ecx=0x00000001 edx=0x00000000
CPU 116:
   0x8000001e 0x00: eax=0x000000e8 ebx=0x00000134 ecx=0x00000001 edx=0x00000000
CPU 117:
   0x8000001e 0x00: eax=0x000000ea ebx=0x00000135 ecx=0x00000001 edx=0x00000000
CPU 118:
   0x8000001e 0x00: eax=0x000000ec ebx=0x00000136 ecx=0x00000001 edx=0x00000000
CPU 119:
   0x8000001e 0x00: eax=0x000000ee ebx=0x00000137 ecx=0x00000001 edx=0x00000000
CPU 120:
   0x8000001e 0x00: eax=0x000000f0 ebx=0x00000138 ecx=0x00000001 edx=0x00000000
CPU 121:
   0x8000001e 0x00: eax=0x000000f2 ebx=0x00000139 ecx=0x00000001 edx=0x00000000
CPU 122:
   0x8000001e 0x00: eax=0x000000f4 ebx=0x0000013a ecx=0x00000001 edx=0x00000000
CPU 123:
   0x8000001e 0x00: eax=0x000000f6 ebx=0x0000013b ecx=0x00000001 edx=0x00000000
CPU 124:
   0x8000001e 0x00: eax=0x000000f8 ebx=0x0000013c ecx=0x00000001 edx=0x00000000
CPU 125:
   0x8000001e 0x00: eax=0x000000fa ebx=0x0000013d ecx=0x00000001 edx=0x00000000
CPU 126:
   0x8000001e 0x00: eax=0x000000fc ebx=0x0000013e ecx=0x00000001 edx=0x00000000
CPU 127:
   0x8000001e 0x00: eax=0x000000fe ebx=0x0000013f ecx=0x00000001 edx=0x00000000
CPU 128:
   0x8000001e 0x00: eax=0x00000001 ebx=0x00000100 ecx=0x00000000 edx=0x00000000
CPU 129:
   0x8000001e 0x00: eax=0x00000003 ebx=0x00000101 ecx=0x00000000 edx=0x00000000
CPU 130:
   0x8000001e 0x00: eax=0x00000005 ebx=0x00000102 ecx=0x00000000 edx=0x00000000
CPU 131:
   0x8000001e 0x00: eax=0x00000007 ebx=0x00000103 ecx=0x00000000 edx=0x00000000
CPU 132:
   0x8000001e 0x00: eax=0x00000009 ebx=0x00000104 ecx=0x00000000 edx=0x00000000
CPU 133:
   0x8000001e 0x00: eax=0x0000000b ebx=0x00000105 ecx=0x00000000 edx=0x00000000
CPU 134:
   0x8000001e 0x00: eax=0x0000000d ebx=0x00000106 ecx=0x00000000 edx=0x00000000
CPU 135:
   0x8000001e 0x00: eax=0x0000000f ebx=0x00000107 ecx=0x00000000 edx=0x00000000
CPU 136:
   0x8000001e 0x00: eax=0x00000011 ebx=0x00000108 ecx=0x00000000 edx=0x00000000
CPU 137:
   0x8000001e 0x00: eax=0x00000013 ebx=0x00000109 ecx=0x00000000 edx=0x00000000
CPU 138:
   0x8000001e 0x00: eax=0x00000015 ebx=0x0000010a ecx=0x00000000 edx=0x00000000
CPU 139:
   0x8000001e 0x00: eax=0x00000017 ebx=0x0000010b ecx=0x00000000 edx=0x00000000
CPU 140:
   0x8000001e 0x00: eax=0x00000019 ebx=0x0000010c ecx=0x00000000 edx=0x00000000
CPU 141:
   0x8000001e 0x00: eax=0x0000001b ebx=0x0000010d ecx=0x00000000 edx=0x00000000
CPU 142:
   0x8000001e 0x00: eax=0x0000001d ebx=0x0000010e ecx=0x00000000 edx=0x00000000
CPU 143:
   0x8000001e 0x00: eax=0x0000001f ebx=0x0000010f ecx=0x00000000 edx=0x00000000
CPU 144:
   0x8000001e 0x00: eax=0x00000021 ebx=0x00000110 ecx=0x00000000 edx=0x00000000
CPU 145:
   0x8000001e 0x00: eax=0x00000023 ebx=0x00000111 ecx=0x00000000 edx=0x00000000
CPU 146:
   0x8000001e 0x00: eax=0x00000025 ebx=0x00000112 ecx=0x00000000 edx=0x00000000
CPU 147:
   0x8000001e 0x00: eax=0x00000027 ebx=0x00000113 ecx=0x00000000 edx=0x00000000
CPU 148:
   0x8000001e 0x00: eax=0x00000029 ebx=0x00000114 ecx=0x00000000 edx=0x00000000
CPU 149:
   0x8000001e 0x00: eax=0x0000002b ebx=0x00000115 ecx=0x00000000 edx=0x00000000
CPU 150:
   0x8000001e 0x00: eax=0x0000002d ebx=0x00000116 ecx=0x00000000 edx=0x00000000
CPU 151:
   0x8000001e 0x00: eax=0x0000002f ebx=0x00000117 ecx=0x00000000 edx=0x00000000
CPU 152:
   0x8000001e 0x00: eax=0x00000031 ebx=0x00000118 ecx=0x00000000 edx=0x00000000
CPU 153:
   0x8000001e 0x00: eax=0x00000033 ebx=0x00000119 ecx=0x00000000 edx=0x00000000
CPU 154:
   0x8000001e 0x00: eax=0x00000035 ebx=0x0000011a ecx=0x00000000 edx=0x00000000
CPU 155:
   0x8000001e 0x00: eax=0x00000037 ebx=0x0000011b ecx=0x00000000 edx=0x00000000
CPU 156:
   0x8000001e 0x00: eax=0x00000039 ebx=0x0000011c ecx=0x00000000 edx=0x00000000
CPU 157:
   0x8000001e 0x00: eax=0x0000003b ebx=0x0000011d ecx=0x00000000 edx=0x00000000
CPU 158:
   0x8000001e 0x00: eax=0x0000003d ebx=0x0000011e ecx=0x00000000 edx=0x00000000
CPU 159:
   0x8000001e 0x00: eax=0x0000003f ebx=0x0000011f ecx=0x00000000 edx=0x00000000
CPU 160:
   0x8000001e 0x00: eax=0x00000041 ebx=0x00000120 ecx=0x00000000 edx=0x00000000
CPU 161:
   0x8000001e 0x00: eax=0x00000043 ebx=0x00000121 ecx=0x00000000 edx=0x00000000
CPU 162:
   0x8000001e 0x00: eax=0x00000045 ebx=0x00000122 ecx=0x00000000 edx=0x00000000
CPU 163:
   0x8000001e 0x00: eax=0x00000047 ebx=0x00000123 ecx=0x00000000 edx=0x00000000
CPU 164:
   0x8000001e 0x00: eax=0x00000049 ebx=0x00000124 ecx=0x00000000 edx=0x00000000
CPU 165:
   0x8000001e 0x00: eax=0x0000004b ebx=0x00000125 ecx=0x00000000 edx=0x00000000
CPU 166:
   0x8000001e 0x00: eax=0x0000004d ebx=0x00000126 ecx=0x00000000 edx=0x00000000
CPU 167:
   0x8000001e 0x00: eax=0x0000004f ebx=0x00000127 ecx=0x00000000 edx=0x00000000
CPU 168:
   0x8000001e 0x00: eax=0x00000051 ebx=0x00000128 ecx=0x00000000 edx=0x00000000
CPU 169:
   0x8000001e 0x00: eax=0x00000053 ebx=0x00000129 ecx=0x00000000 edx=0x00000000
CPU 170:
   0x8000001e 0x00: eax=0x00000055 ebx=0x0000012a ecx=0x00000000 edx=0x00000000
CPU 171:
   0x8000001e 0x00: eax=0x00000057 ebx=0x0000012b ecx=0x00000000 edx=0x00000000
CPU 172:
   0x8000001e 0x00: eax=0x00000059 ebx=0x0000012c ecx=0x00000000 edx=0x00000000
CPU 173:
   0x8000001e 0x00: eax=0x0000005b ebx=0x0000012d ecx=0x00000000 edx=0x00000000
CPU 174:
   0x8000001e 0x00: eax=0x0000005d ebx=0x0000012e ecx=0x00000000 edx=0x00000000
CPU 175:
   0x8000001e 0x00: eax=0x0000005f ebx=0x0000012f ecx=0x00000000 edx=0x00000000
CPU 176:
   0x8000001e 0x00: eax=0x00000061 ebx=0x00000130 ecx=0x00000000 edx=0x00000000
CPU 177:
   0x8000001e 0x00: eax=0x00000063 ebx=0x00000131 ecx=0x00000000 edx=0x00000000
CPU 178:
   0x8000001e 0x00: eax=0x00000065 ebx=0x00000132 ecx=0x00000000 edx=0x00000000
CPU 179:
   0x8000001e 0x00: eax=0x00000067 ebx=0x00000133 ecx=0x00000000 edx=0x00000000
CPU 180:
   0x8000001e 0x00: eax=0x00000069 ebx=0x00000134 ecx=0x00000000 edx=0x00000000
CPU 181:
   0x8000001e 0x00: eax=0x0000006b ebx=0x00000135 ecx=0x00000000 edx=0x00000000
CPU 182:
   0x8000001e 0x00: eax=0x0000006d ebx=0x00000136 ecx=0x00000000 edx=0x00000000
CPU 183:
   0x8000001e 0x00: eax=0x0000006f ebx=0x00000137 ecx=0x00000000 edx=0x00000000
CPU 184:
   0x8000001e 0x00: eax=0x00000071 ebx=0x00000138 ecx=0x00000000 edx=0x00000000
CPU 185:
   0x8000001e 0x00: eax=0x00000073 ebx=0x00000139 ecx=0x00000000 edx=0x00000000
CPU 186:
   0x8000001e 0x00: eax=0x00000075 ebx=0x0000013a ecx=0x00000000 edx=0x00000000
CPU 187:
   0x8000001e 0x00: eax=0x00000077 ebx=0x0000013b ecx=0x00000000 edx=0x00000000
CPU 188:
   0x8000001e 0x00: eax=0x00000079 ebx=0x0000013c ecx=0x00000000 edx=0x00000000
CPU 189:
   0x8000001e 0x00: eax=0x0000007b ebx=0x0000013d ecx=0x00000000 edx=0x00000000
CPU 190:
   0x8000001e 0x00: eax=0x0000007d ebx=0x0000013e ecx=0x00000000 edx=0x00000000
CPU 191:
   0x8000001e 0x00: eax=0x0000007f ebx=0x0000013f ecx=0x00000000 edx=0x00000000
CPU 192:
   0x8000001e 0x00: eax=0x00000081 ebx=0x00000100 ecx=0x00000001 edx=0x00000000
CPU 193:
   0x8000001e 0x00: eax=0x00000083 ebx=0x00000101 ecx=0x00000001 edx=0x00000000
CPU 194:
   0x8000001e 0x00: eax=0x00000085 ebx=0x00000102 ecx=0x00000001 edx=0x00000000
CPU 195:
   0x8000001e 0x00: eax=0x00000087 ebx=0x00000103 ecx=0x00000001 edx=0x00000000
CPU 196:
   0x8000001e 0x00: eax=0x00000089 ebx=0x00000104 ecx=0x00000001 edx=0x00000000
CPU 197:
   0x8000001e 0x00: eax=0x0000008b ebx=0x00000105 ecx=0x00000001 edx=0x00000000
CPU 198:
   0x8000001e 0x00: eax=0x0000008d ebx=0x00000106 ecx=0x00000001 edx=0x00000000
CPU 199:
   0x8000001e 0x00: eax=0x0000008f ebx=0x00000107 ecx=0x00000001 edx=0x00000000
CPU 200:
   0x8000001e 0x00: eax=0x00000091 ebx=0x00000108 ecx=0x00000001 edx=0x00000000
CPU 201:
   0x8000001e 0x00: eax=0x00000093 ebx=0x00000109 ecx=0x00000001 edx=0x00000000
CPU 202:
   0x8000001e 0x00: eax=0x00000095 ebx=0x0000010a ecx=0x00000001 edx=0x00000000
CPU 203:
   0x8000001e 0x00: eax=0x00000097 ebx=0x0000010b ecx=0x00000001 edx=0x00000000
CPU 204:
   0x8000001e 0x00: eax=0x00000099 ebx=0x0000010c ecx=0x00000001 edx=0x00000000
CPU 205:
   0x8000001e 0x00: eax=0x0000009b ebx=0x0000010d ecx=0x00000001 edx=0x00000000
CPU 206:
   0x8000001e 0x00: eax=0x0000009d ebx=0x0000010e ecx=0x00000001 edx=0x00000000
CPU 207:
   0x8000001e 0x00: eax=0x0000009f ebx=0x0000010f ecx=0x00000001 edx=0x00000000
CPU 208:
   0x8000001e 0x00: eax=0x000000a1 ebx=0x00000110 ecx=0x00000001 edx=0x00000000
CPU 209:
   0x8000001e 0x00: eax=0x000000a3 ebx=0x00000111 ecx=0x00000001 edx=0x00000000
CPU 210:
   0x8000001e 0x00: eax=0x000000a5 ebx=0x00000112 ecx=0x00000001 edx=0x00000000
CPU 211:
   0x8000001e 0x00: eax=0x000000a7 ebx=0x00000113 ecx=0x00000001 edx=0x00000000
CPU 212:
   0x8000001e 0x00: eax=0x000000a9 ebx=0x00000114 ecx=0x00000001 edx=0x00000000
CPU 213:
   0x8000001e 0x00: eax=0x000000ab ebx=0x00000115 ecx=0x00000001 edx=0x00000000
CPU 214:
   0x8000001e 0x00: eax=0x000000ad ebx=0x00000116 ecx=0x00000001 edx=0x00000000
CPU 215:
   0x8000001e 0x00: eax=0x000000af ebx=0x00000117 ecx=0x00000001 edx=0x00000000
CPU 216:
   0x8000001e 0x00: eax=0x000000b1 ebx=0x00000118 ecx=0x00000001 edx=0x00000000
CPU 217:
   0x8000001e 0x00: eax=0x000000b3 ebx=0x00000119 ecx=0x00000001 edx=0x00000000
CPU 218:
   0x8000001e 0x00: eax=0x000000b5 ebx=0x0000011a ecx=0x00000001 edx=0x00000000
CPU 219:
   0x8000001e 0x00: eax=0x000000b7 ebx=0x0000011b ecx=0x00000001 edx=0x00000000
CPU 220:
   0x8000001e 0x00: eax=0x000000b9 ebx=0x0000011c ecx=0x00000001 edx=0x00000000
CPU 221:
   0x8000001e 0x00: eax=0x000000bb ebx=0x0000011d ecx=0x00000001 edx=0x00000000
CPU 222:
   0x8000001e 0x00: eax=0x000000bd ebx=0x0000011e ecx=0x00000001 edx=0x00000000
CPU 223:
   0x8000001e 0x00: eax=0x000000bf ebx=0x0000011f ecx=0x00000001 edx=0x00000000
CPU 224:
   0x8000001e 0x00: eax=0x000000c1 ebx=0x00000120 ecx=0x00000001 edx=0x00000000
CPU 225:
   0x8000001e 0x00: eax=0x000000c3 ebx=0x00000121 ecx=0x00000001 edx=0x00000000
CPU 226:
   0x8000001e 0x00: eax=0x000000c5 ebx=0x00000122 ecx=0x00000001 edx=0x00000000
CPU 227:
   0x8000001e 0x00: eax=0x000000c7 ebx=0x00000123 ecx=0x00000001 edx=0x00000000
CPU 228:
   0x8000001e 0x00: eax=0x000000c9 ebx=0x00000124 ecx=0x00000001 edx=0x00000000
CPU 229:
   0x8000001e 0x00: eax=0x000000cb ebx=0x00000125 ecx=0x00000001 edx=0x00000000
CPU 230:
   0x8000001e 0x00: eax=0x000000cd ebx=0x00000126 ecx=0x00000001 edx=0x00000000
CPU 231:
   0x8000001e 0x00: eax=0x000000cf ebx=0x00000127 ecx=0x00000001 edx=0x00000000
CPU 232:
   0x8000001e 0x00: eax=0x000000d1 ebx=0x00000128 ecx=0x00000001 edx=0x00000000
CPU 233:
   0x8000001e 0x00: eax=0x000000d3 ebx=0x00000129 ecx=0x00000001 edx=0x00000000
CPU 234:
   0x8000001e 0x00: eax=0x000000d5 ebx=0x0000012a ecx=0x00000001 edx=0x00000000
CPU 235:
   0x8000001e 0x00: eax=0x000000d7 ebx=0x0000012b ecx=0x00000001 edx=0x00000000
CPU 236:
   0x8000001e 0x00: eax=0x000000d9 ebx=0x0000012c ecx=0x00000001 edx=0x00000000
CPU 237:
   0x8000001e 0x00: eax=0x000000db ebx=0x0000012d ecx=0x00000001 edx=0x00000000
CPU 238:
   0x8000001e 0x00: eax=0x000000dd ebx=0x0000012e ecx=0x00000001 edx=0x00000000
CPU 239:
   0x8000001e 0x00: eax=0x000000df ebx=0x0000012f ecx=0x00000001 edx=0x00000000
CPU 240:
   0x8000001e 0x00: eax=0x000000e1 ebx=0x00000130 ecx=0x00000001 edx=0x00000000
CPU 241:
   0x8000001e 0x00: eax=0x000000e3 ebx=0x00000131 ecx=0x00000001 edx=0x00000000
CPU 242:
   0x8000001e 0x00: eax=0x000000e5 ebx=0x00000132 ecx=0x00000001 edx=0x00000000
CPU 243:
   0x8000001e 0x00: eax=0x000000e7 ebx=0x00000133 ecx=0x00000001 edx=0x00000000
CPU 244:
   0x8000001e 0x00: eax=0x000000e9 ebx=0x00000134 ecx=0x00000001 edx=0x00000000
CPU 245:
   0x8000001e 0x00: eax=0x000000eb ebx=0x00000135 ecx=0x00000001 edx=0x00000000
CPU 246:
   0x8000001e 0x00: eax=0x000000ed ebx=0x00000136 ecx=0x00000001 edx=0x00000000
CPU 247:
   0x8000001e 0x00: eax=0x000000ef ebx=0x00000137 ecx=0x00000001 edx=0x00000000
CPU 248:
   0x8000001e 0x00: eax=0x000000f1 ebx=0x00000138 ecx=0x00000001 edx=0x00000000
CPU 249:
   0x8000001e 0x00: eax=0x000000f3 ebx=0x00000139 ecx=0x00000001 edx=0x00000000
CPU 250:
   0x8000001e 0x00: eax=0x000000f5 ebx=0x0000013a ecx=0x00000001 edx=0x00000000
CPU 251:
   0x8000001e 0x00: eax=0x000000f7 ebx=0x0000013b ecx=0x00000001 edx=0x00000000
CPU 252:
   0x8000001e 0x00: eax=0x000000f9 ebx=0x0000013c ecx=0x00000001 edx=0x00000000
CPU 253:
   0x8000001e 0x00: eax=0x000000fb ebx=0x0000013d ecx=0x00000001 edx=0x00000000
CPU 254:
   0x8000001e 0x00: eax=0x000000fd ebx=0x0000013e ecx=0x00000001 edx=0x00000000
CPU 255:
   0x8000001e 0x00: eax=0x000000ff ebx=0x0000013f ecx=0x00000001 edx=0x00000000
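
For readers decoding the dump above by hand, here is a minimal sketch. It
assumes the Family 17h field layout for CPUID leaf 0x8000001e: EAX is the
extended APIC ID, EBX[7:0] the core ID, EBX[15:8] threads per core minus one,
ECX[7:0] the node ID, and ECX[10:8] nodes per processor minus one. The struct
and function names are hypothetical and are not taken from QEMU.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decoded view of CPUID leaf 0x8000001e (hypothetical helper type). */
struct topo_ext {
    uint32_t ext_apic_id;          /* EAX[31:0] */
    unsigned core_id;              /* EBX[7:0] */
    unsigned threads_per_core;     /* EBX[15:8] + 1 */
    unsigned node_id;              /* ECX[7:0] */
    unsigned nodes_per_processor;  /* ECX[10:8] + 1 */
};

/* Split the raw register values of one dump line into named fields. */
static void decode_8000001e(uint32_t eax, uint32_t ebx, uint32_t ecx,
                            struct topo_ext *out)
{
    assert(out != NULL);
    out->ext_apic_id = eax;
    out->core_id = ebx & 0xff;
    out->threads_per_core = ((ebx >> 8) & 0xff) + 1;
    out->node_id = ecx & 0xff;
    out->nodes_per_processor = ((ecx >> 8) & 0x7) + 1;
}
```

Applied to the CPU 255 line (eax=0x000000ff ebx=0x0000013f ecx=0x00000001),
this yields core ID 63, two threads per core, node ID 1 and one node per
processor.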

RE: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode
Posted by Babu Moger 3 years, 8 months ago
> -----Original Message-----
> From: Igor Mammedov <imammedo@redhat.com>
> Sent: Wednesday, August 26, 2020 8:31 AM
> To: Daniel P. Berrangé <berrange@redhat.com>
> Cc: Moger, Babu <Babu.Moger@amd.com>; pbonzini@redhat.com;
> rth@twiddle.net; ehabkost@redhat.com; qemu-devel@nongnu.org;
> mst@redhat.com; Michal Privoznik <mprivozn@redhat.com>
> Subject: Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic
> decode
> 
> On Wed, 26 Aug 2020 13:50:59 +0100
> Daniel P. Berrangé <berrange@redhat.com> wrote:
> 
> > On Wed, Aug 26, 2020 at 02:38:49PM +0200, Igor Mammedov wrote:
> > > On Fri, 21 Aug 2020 17:12:19 -0500
> > > Babu Moger <babu.moger@amd.com> wrote:
> > >
> > > > To support some of the complex topology, we introduced EPYC mode
> > > > apicid decode.
> > > > But, EPYC mode decode is running into problems. Also it can become
> > > > quite a maintenance problem in the future. So, it was decided to
> > > > remove that code and use the generic decode which works for
> > > > majority of the topology. Most of the SPECed configuration would
> > > > work just fine. With some non-SPECed user inputs, it will create some
> > > > sub-optimal configuration.
> > > > Here is the discussion thread.
> > > > https://lore.kernel.org/qemu-devel/c0bcc1a6-1d84-a6e7-e468-d5b437c1b254@amd.com/
> > > >
> > > > This series removes all the EPYC mode specific apicid changes and
> > > > use the generic apicid decode.
> > >
> > > the main difference between EPYC and all other CPUs is that it
> > > requires numa configuration (it's not optional) so we need an extra
No, that is not true. That assumption is why we made all these apicid
changes, and here we are now.

AMD supports various mixed configurations. In the case of EPYC-Rome, we have
NPS1, NPS2 and NPS4 (NUMA nodes per socket). With NPS1, all the cores in a
socket sit under one NUMA node, which is effectively a non-NUMA
configuration.
Looking at the various configurations, and after discussing internally, it
is not advisable to have an (epyc && !numa) check.
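
As a rough illustration of the NPS point, here is a minimal sketch. It
assumes that an NPS setting splits each socket's cores evenly across its NUMA
nodes. The names below are hypothetical and are not QEMU or firmware
interfaces.

```c
#include <assert.h>

/* NUMA layout implied by an NPS (NUMA nodes per socket) setting. */
struct nps_layout {
    unsigned total_nodes;     /* NUMA nodes in the whole system */
    unsigned cores_per_node;  /* cores under each NUMA node */
};

/* Assumption: NPS is 1, 2 or 4 and divides the per-socket core count evenly. */
static struct nps_layout compute_nps_layout(unsigned sockets,
                                            unsigned cores_per_socket,
                                            unsigned nps)
{
    struct nps_layout l;

    assert(nps == 1 || nps == 2 || nps == 4);
    assert(cores_per_socket % nps == 0);
    l.total_nodes = sockets * nps;
    l.cores_per_node = cores_per_socket / nps;
    return l;
}
```

With one socket and NPS1 the result is a single node holding every core,
which is why a hard (epyc && !numa) error would reject a layout that exists on
real hardware.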

> > > patch on top of this series to enforce that, i.e.:
> > >
> > >  if (epyc && !numa)
> > >     error("EPYC cpu requires numa to be configured")
> >
> > Please no. This will break 90% of current usage of the EPYC CPU in
> > real world QEMU deployments. That is way too user hostile to introduce
> > as a requirement.
> >
> > Why do we need to force this?  People have been successfully using
> > EPYC CPUs without NUMA in QEMU for years now.
> >
> > It might not match behaviour of bare metal silicon, but that hasn't
> > obviously caused the world to come crashing down.
> So far it produces warning in linux kernel (RHBZ1728166), (resulting performance
> might be suboptimal), but I haven't seen anyone reporting crashes yet.
> 
> 
> What other options do we have?
> Perhaps we can turn on strict check for new machine types only, so old configs
> can keep broken topology (CPUID), while new ones would require -numa and
> produce correct topology.
> 
> 
> >
> > Regards,
> > Daniel


Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode
Posted by Dr. David Alan Gilbert 3 years, 8 months ago
* Babu Moger (babu.moger@amd.com) wrote:
> 
> > -----Original Message-----
> > From: Igor Mammedov <imammedo@redhat.com>
> > Sent: Wednesday, August 26, 2020 8:31 AM
> > To: Daniel P. Berrangé <berrange@redhat.com>
> > Cc: Moger, Babu <Babu.Moger@amd.com>; pbonzini@redhat.com;
> > rth@twiddle.net; ehabkost@redhat.com; qemu-devel@nongnu.org;
> > mst@redhat.com; Michal Privoznik <mprivozn@redhat.com>
> > Subject: Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic
> > decode
> > 
> > On Wed, 26 Aug 2020 13:50:59 +0100
> > Daniel P. Berrangé <berrange@redhat.com> wrote:
> > 
> > > On Wed, Aug 26, 2020 at 02:38:49PM +0200, Igor Mammedov wrote:
> > > > On Fri, 21 Aug 2020 17:12:19 -0500
> > > > Babu Moger <babu.moger@amd.com> wrote:
> > > >
> > > > > To support some of the complex topology, we introduced EPYC mode
> > > > > apicid decode.
> > > > > But, EPYC mode decode is running into problems. Also it can become
> > > > > quite a maintenance problem in the future. So, it was decided to
> > > > > remove that code and use the generic decode which works for
> > > > > majority of the topology. Most of the SPECed configuration would
> > > > > work just fine. With some non-SPECed user inputs, it will create some
> > > > > sub-optimal configuration.
> > > > > Here is the discussion thread.
> > > > > https://lore.kernel.org/qemu-devel/c0bcc1a6-1d84-a6e7-e468-d5b437c1b254@amd.com/
> > > > >
> > > > > This series removes all the EPYC mode specific apicid changes and
> > > > > use the generic apicid decode.
> > > >
> > > > the main difference between EPYC and all other CPUs is that it
> > > > requires numa configuration (it's not optional) so we need an extra
> No, That is not true. Because of that assumption we made all these apicid
> changes. And here we are now.
> 
> AMD supports various mixed configurations. In case of EPYC-Rome, we have
> NPS1, NPS2 and NPS4(Numa Nodes per socket). In case of NPS1, basically we
> have all the cores in a socket under one numa node. This is non-numa
> configuration.
> Looking at the various configurations and also discussing internally, it
> is not advisable to have (epyc && !numa) check.

Indeed on real hardware, I don't think we always see NUMA; my single
socket, 16 core/32 thread 7302P Dell box, shows the kernel printing
'No NUMA configuration found...Faking a node.'

So if real hardware hasn't got a NUMA node, what's the real problem?

Dave

> > > > patch on top of this series to enforce that, i.e.:
> > > >
> > > >  if (epyc && !numa)
> > > >     error("EPYC cpu requires numa to be configured")
> > >
> > > Please no. This will break 90% of current usage of the EPYC CPU in
> > > real world QEMU deployments. That is way too user hostile to introduce
> > > as a requirement.
> > >
> > > Why do we need to force this?  People have been successfully using
> > > EPYC CPUs without NUMA in QEMU for years now.
> > >
> > > It might not match behaviour of bare metal silicon, but that hasn't
> > > obviously caused the world to come crashing down.
> > So far it produces warning in linux kernel (RHBZ1728166), (resulting performance
> > might be suboptimal), but I haven't seen anyone reporting crashes yet.
> > 
> > 
> > What other options do we have?
> > Perhaps we can turn on strict check for new machine types only, so old configs
> > can keep broken topology (CPUID), while new ones would require -numa and
> > produce correct topology.
> > 
> > 
> > >
> > > Regards,
> > > Daniel
> 
> 
-- 
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK


RE: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode
Posted by Babu Moger 3 years, 8 months ago

> -----Original Message-----
> From: Dr. David Alan Gilbert <dgilbert@redhat.com>
> Sent: Wednesday, August 26, 2020 1:34 PM
> To: Moger, Babu <Babu.Moger@amd.com>
> Cc: Igor Mammedov <imammedo@redhat.com>; Daniel P. Berrangé
> <berrange@redhat.com>; ehabkost@redhat.com; mst@redhat.com; Michal
> Privoznik <mprivozn@redhat.com>; qemu-devel@nongnu.org;
> pbonzini@redhat.com; rth@twiddle.net
> Subject: Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic
> decode
> 
> * Babu Moger (babu.moger@amd.com) wrote:
> >
> > > -----Original Message-----
> > > From: Igor Mammedov <imammedo@redhat.com>
> > > Sent: Wednesday, August 26, 2020 8:31 AM
> > > To: Daniel P. Berrangé <berrange@redhat.com>
> > > Cc: Moger, Babu <Babu.Moger@amd.com>; pbonzini@redhat.com;
> > > rth@twiddle.net; ehabkost@redhat.com; qemu-devel@nongnu.org;
> > > mst@redhat.com; Michal Privoznik <mprivozn@redhat.com>
> > > Subject: Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use
> > > generic decode
> > >
> > > On Wed, 26 Aug 2020 13:50:59 +0100
> > > Daniel P. Berrangé <berrange@redhat.com> wrote:
> > >
> > > > On Wed, Aug 26, 2020 at 02:38:49PM +0200, Igor Mammedov wrote:
> > > > > On Fri, 21 Aug 2020 17:12:19 -0500 Babu Moger
> > > > > <babu.moger@amd.com> wrote:
> > > > >
> > > > > > To support some of the complex topology, we introduced EPYC
> > > > > > mode apicid decode.
> > > > > > But, EPYC mode decode is running into problems. Also it can
> > > > > > become quite a maintenance problem in the future. So, it was
> > > > > > decided to remove that code and use the generic decode which
> > > > > > works for majority of the topology. Most of the SPECed
> > > > > > configuration would work just fine. With some non-SPECed user
> > > > > > inputs, it will create some sub-optimal configuration.
> > > > > > Here is the discussion thread.
> > > > > > https://lore.kernel.org/qemu-devel/c0bcc1a6-1d84-a6e7-e468-d5b437c1b254@amd.com/
> > > > > >
> > > > > > This series removes all the EPYC mode specific apicid changes
> > > > > > and use the generic apicid decode.
> > > > >
> > > > > the main difference between EPYC and all other CPUs is that it
> > > > > requires numa configuration (it's not optional) so we need an
> > > > > extra
> > No, That is not true. Because of that assumption we made all these
> > apicid changes. And here we are now.
> >
> > AMD supports various mixed configurations. In case of EPYC-Rome, we
> > have NPS1, NPS2 and NPS4(Numa Nodes per socket). In case of NPS1,
> > basically we have all the cores in a socket under one numa node. This
> > is non-numa configuration.
> > Looking at the various configurations and also discussing internally,
> > it is not advisable to have (epyc && !numa) check.
> 
> Indeed on real hardware, I don't think we always see NUMA; my single socket,
> 16 core/32 thread 7302P Dell box, shows the kernel printing 'No NUMA
> configuration found...Faking a node.'
> 
> So if real hardware hasn't got a NUMA node, what's the real problem?

I don't see any problem once we revert all these changes (patches 1-7).
We don't need the if (epyc && !numa) error check, nor do we need to set
auto_enable_numa=true unconditionally.

> 
> Dave
> 
> > > > > patch on top of this series to enforce that, i.e.:
> > > > >
> > > > >  if (epyc && !numa)
> > > > >     error("EPYC cpu requires numa to be configured")
> > > >
> > > > Please no. This will break 90% of current usage of the EPYC CPU in
> > > > real world QEMU deployments. That is way too user hostile to
> > > > introduce as a requirement.
> > > >
> > > > Why do we need to force this?  People have been successfully using
> > > > EPYC CPUs without NUMA in QEMU for years now.
> > > >
> > > > It might not match behaviour of bare metal silicon, but that
> > > > hasn't obviously caused the world to come crashing down.
> > > So far it produces warning in linux kernel (RHBZ1728166), (resulting
> > > performance might be suboptimal), but I haven't seen anyone reporting
> crashes yet.
> > >
> > >
> > > What other options do we have?
> > > Perhaps we can turn on strict check for new machine types only, so
> > > old configs can keep broken topology (CPUID), while new ones would
> > > require -numa and produce correct topology.
> > >
> > >
> > > >
> > > > Regards,
> > > > Daniel
> >
> >
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK


Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode
Posted by Igor Mammedov 3 years, 8 months ago
On Wed, 26 Aug 2020 13:45:51 -0500
Babu Moger <babu.moger@amd.com> wrote:

> 
> 
> > -----Original Message-----
> > From: Dr. David Alan Gilbert <dgilbert@redhat.com>
> > Sent: Wednesday, August 26, 2020 1:34 PM
> > To: Moger, Babu <Babu.Moger@amd.com>
> > Cc: Igor Mammedov <imammedo@redhat.com>; Daniel P. Berrangé
> > <berrange@redhat.com>; ehabkost@redhat.com; mst@redhat.com; Michal
> > Privoznik <mprivozn@redhat.com>; qemu-devel@nongnu.org;
> > pbonzini@redhat.com; rth@twiddle.net
> > Subject: Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic
> > decode
> > 
> > * Babu Moger (babu.moger@amd.com) wrote:
> > >
> > > > -----Original Message-----
> > > > From: Igor Mammedov <imammedo@redhat.com>
> > > > Sent: Wednesday, August 26, 2020 8:31 AM
> > > > To: Daniel P. Berrangé <berrange@redhat.com>
> > > > Cc: Moger, Babu <Babu.Moger@amd.com>; pbonzini@redhat.com;
> > > > rth@twiddle.net; ehabkost@redhat.com; qemu-devel@nongnu.org;
> > > > mst@redhat.com; Michal Privoznik <mprivozn@redhat.com>
> > > > Subject: Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use
> > > > generic decode
> > > >
> > > > On Wed, 26 Aug 2020 13:50:59 +0100
> > > > Daniel P. Berrangé <berrange@redhat.com> wrote:
> > > >
> > > > > On Wed, Aug 26, 2020 at 02:38:49PM +0200, Igor Mammedov wrote:
> > > > > > On Fri, 21 Aug 2020 17:12:19 -0500 Babu Moger
> > > > > > <babu.moger@amd.com> wrote:
> > > > > >
> > > > > > > To support some of the complex topology, we introduced EPYC
> > > > > > > mode apicid decode.
> > > > > > > But, EPYC mode decode is running into problems. Also it can
> > > > > > > become quite a maintenance problem in the future. So, it was
> > > > > > > decided to remove that code and use the generic decode which
> > > > > > > works for majority of the topology. Most of the SPECed
> > > > > > > configuration would work just fine. With some non-SPECed user
> > > > > > > inputs, it will create some sub-optimal configuration.
> > > > > > > Here is the discussion thread.
> > > > > > > https://lore.kernel.org/qemu-devel/c0bcc1a6-1d84-a6e7-e468-d5b437c1b254@amd.com/
> > > > > > >
> > > > > > > This series removes all the EPYC mode specific apicid changes
> > > > > > > and use the generic apicid decode.
> > > > > >
> > > > > > the main difference between EPYC and all other CPUs is that it
> > > > > > requires numa configuration (it's not optional) so we need an
> > > > > > extra
> > > No, That is not true. Because of that assumption we made all these
> > > apicid changes. And here we are now.
> > >
> > > AMD supports various mixed configurations. In case of EPYC-Rome, we
> > > have NPS1, NPS2 and NPS4(Numa Nodes per socket). In case of NPS1,
> > > basically we have all the cores in a socket under one numa node. This
> > > is non-numa configuration.
> > > Looking at the various configurations and also discussing internally,
> > > it is not advisable to have (epyc && !numa) check.
> > 
> > Indeed on real hardware, I don't think we always see NUMA; my single socket,
> > 16 core/32 thread 7302P Dell box, shows the kernel printing 'No NUMA
> > configuration found...Faking a node.'
That looks like a firmware bug, or maybe it's a feature and there is a knob in fw
to turn it on/off in case the OS in use doesn't like it for some reason.


> > So if real hardware hasn't got a NUMA node, what's the real problem?
> 
> I don't see any problem once we revert all these changes(patch 1-7).
> We don't need if (epyc && !numa) error check or auto_enable_numa=true
> unconditionally.

We need the revert to unbreak migration from QEMU < 5.0;
everything else (fixes for CPUID_Fn8000001E) could go on top.

So what goes on top (the old code also wasn't correct once
CPUID_Fn8000001E is taken into account, that's why we are at this point):

When starting QEMU without -numa, we can indeed skip the
"if (epyc && !numa)" error check or auto_enable_numa=true
in the case where there is 1 die (NPS1).
(1) The user, however, may set a core/thread count bigger than the spec allows,
    in which case CPUID_Fn8000001E_EBX/CPUID_Fn8000001E_ECX will not be
    valid spec-wise and could trigger oopses in the guest kernel.
    Given that we allow going out of spec, perhaps we should add a warning at
    realize time saying that the -smp config in use is not supported, since it
    doesn't match the AMD EPYC spec and might not work.

(2) Earlier we agreed that we can reuse the existing die_id instead of the
    internal one (topo_ids.node_id in current code).
    (The spec calls it DIE_ID and NODE ID interchangeably.)
    Same as (1): add a warning when '-smp dies' goes beyond spec limits.

(3) "-smp dies>1": ''if'' we allow running it without -numa,
    then the system-wide NUMA node id in CPUID_Fn8000001E_ECX probably doesn't matter.
    It could be something like in the spec, but taking the die offset into
    account to produce a unique id.

    Same again: add a warning that there is more than 1 die but numa is not
    enabled, and suggest enabling numa.

    The current code produces an invalid APIC ID for a valid '-smp' combination;
    however, if we revert it and switch to die_id, then it should produce
    a valid APIC ID once again (as in 4.2).
    Given it produces an invalid APIC ID, maybe we should just ditch this case and
    fold it into (4) (i.e. require -numa if "-smp dies>1").

(4) -numa is used (RHBZ1728166):
    we need to ensure that socket*dies == ms->numa_state->num_nodes
    and make sure that CPUID_Fn8000001E_ECX is consistent with the
    cpu mapping provided with the "-numa cpu=" option.

Warnings won't help a lot, but at least they will point at a
possible problem when someone complains.
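
The four cases above could be sketched as a single check that returns a set
of warning flags. This is only an illustration of the proposal, not QEMU code:
the struct, the flag names and the function are hypothetical, and the spec
limits are assumed to be supplied by the caller for the CPU model in question.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of the -smp / -numa inputs for an EPYC guest. */
struct epyc_topo_cfg {
    unsigned sockets, dies, cores, threads;     /* -smp values */
    bool numa_enabled;                          /* was -numa given? */
    unsigned num_numa_nodes;                    /* nodes defined by -numa */
    unsigned max_dies, max_cores, max_threads;  /* assumed spec limits */
};

enum {
    TOPO_WARN_SMP_OUT_OF_SPEC     = 1 << 0,  /* case (1) */
    TOPO_WARN_DIES_OUT_OF_SPEC    = 1 << 1,  /* case (2) */
    TOPO_WARN_DIES_WITHOUT_NUMA   = 1 << 2,  /* case (3) */
    TOPO_WARN_NODE_COUNT_MISMATCH = 1 << 3,  /* case (4) */
};

/* Return a bitmask of the warnings that apply; 0 means nothing to report. */
static unsigned check_epyc_topology(const struct epyc_topo_cfg *c)
{
    unsigned warn = 0;

    assert(c != NULL);
    if (c->cores > c->max_cores || c->threads > c->max_threads) {
        warn |= TOPO_WARN_SMP_OUT_OF_SPEC;
    }
    if (c->dies > c->max_dies) {
        warn |= TOPO_WARN_DIES_OUT_OF_SPEC;
    }
    if (!c->numa_enabled && c->dies > 1) {
        warn |= TOPO_WARN_DIES_WITHOUT_NUMA;
    }
    if (c->numa_enabled && c->sockets * c->dies != c->num_numa_nodes) {
        warn |= TOPO_WARN_NODE_COUNT_MISMATCH;
    }
    return warn;
}
```

Returning flags instead of failing keeps existing configurations running,
which matches the intent that these are warnings and not hard errors.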

> > 
> > Dave
> > 
> > > > > > patch on top of this series to enforce that, i.e.:
> > > > > >
> > > > > >  if (epyc && !numa)
> > > > > >     error("EPYC cpu requires numa to be configured")
> > > > >
> > > > > Please no. This will break 90% of current usage of the EPYC CPU in
> > > > > real world QEMU deployments. That is way too user hostile to
> > > > > introduce as a requirement.
> > > > >
> > > > > Why do we need to force this?  People have been successfully using
> > > > > EPYC CPUs without NUMA in QEMU for years now.
> > > > >
> > > > > It might not match behaviour of bare metal silicon, but that
> > > > > hasn't obviously caused the world to come crashing down.
> > > > So far it produces warning in linux kernel (RHBZ1728166), (resulting
> > > > performance might be suboptimal), but I haven't seen anyone reporting
> > crashes yet.
> > > >
> > > >
> > > > What other options do we have?
> > > > Perhaps we can turn on strict check for new machine types only, so
> > > > old configs can keep broken topology (CPUID), while new ones would
> > > > require -numa and produce correct topology.
> > > >
> > > >
> > > > >
> > > > > Regards,
> > > > > Daniel
> > >
> > >
> > --
> > Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
> 


Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode
Posted by Daniel P. Berrangé 3 years, 8 months ago
On Thu, Aug 27, 2020 at 10:21:10PM +0200, Igor Mammedov wrote:
> On Wed, 26 Aug 2020 13:45:51 -0500
> Babu Moger <babu.moger@amd.com> wrote:
> 
> > 
> > 
> > > -----Original Message-----
> > > From: Dr. David Alan Gilbert <dgilbert@redhat.com>
> > > Sent: Wednesday, August 26, 2020 1:34 PM
> > > To: Moger, Babu <Babu.Moger@amd.com>
> > > Cc: Igor Mammedov <imammedo@redhat.com>; Daniel P. Berrangé
> > > <berrange@redhat.com>; ehabkost@redhat.com; mst@redhat.com; Michal
> > > Privoznik <mprivozn@redhat.com>; qemu-devel@nongnu.org;
> > > pbonzini@redhat.com; rth@twiddle.net
> > > Subject: Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic
> > > decode
> > > 
> > > * Babu Moger (babu.moger@amd.com) wrote:
> > > >
> > > > > -----Original Message-----
> > > > > From: Igor Mammedov <imammedo@redhat.com>
> > > > > Sent: Wednesday, August 26, 2020 8:31 AM
> > > > > To: Daniel P. Berrangé <berrange@redhat.com>
> > > > > Cc: Moger, Babu <Babu.Moger@amd.com>; pbonzini@redhat.com;
> > > > > rth@twiddle.net; ehabkost@redhat.com; qemu-devel@nongnu.org;
> > > > > mst@redhat.com; Michal Privoznik <mprivozn@redhat.com>
> > > > > Subject: Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use
> > > > > generic decode
> > > > >
> > > > > On Wed, 26 Aug 2020 13:50:59 +0100
> > > > > Daniel P. Berrangé <berrange@redhat.com> wrote:
> > > > >
> > > > > > On Wed, Aug 26, 2020 at 02:38:49PM +0200, Igor Mammedov wrote:
> > > > > > > On Fri, 21 Aug 2020 17:12:19 -0500 Babu Moger
> > > > > > > <babu.moger@amd.com> wrote:
> > > > > > >
> > > > > > > > To support some of the complex topology, we introduced EPYC
> > > > > > > > mode apicid decode.
> > > > > > > > But, EPYC mode decode is running into problems. Also it can
> > > > > > > > become quite a maintenance problem in the future. So, it was
> > > > > > > > decided to remove that code and use the generic decode which
> > > > > > > > works for majority of the topology. Most of the SPECed
> > > > > > > > configuration would work just fine. With some non-SPECed user
> > > > > > > > inputs, it will create some sub-optimal configuration.
> > > > > > > > Here is the discussion thread.
> > > > > > > > https://lore.kernel.org/qemu-devel/c0bcc1a6-1d84-a6e7-e468-d5b437c1b254@amd.com/
> > > > > > > >
> > > > > > > > This series removes all the EPYC mode specific apicid changes
> > > > > > > > and use the generic apicid decode.
> > > > > > >
> > > > > > > the main difference between EPYC and all other CPUs is that it
> > > > > > > requires numa configuration (it's not optional) so we need an
> > > > > > > extra
> > > > No, That is not true. Because of that assumption we made all these
> > > > apicid changes. And here we are now.
> > > >
> > > > AMD supports various mixed configurations. In case of EPYC-Rome, we
> > > > have NPS1, NPS2 and NPS4(Numa Nodes per socket). In case of NPS1,
> > > > basically we have all the cores in a socket under one numa node. This
> > > > is non-numa configuration.
> > > > Looking at the various configurations and also discussing internally,
> > > > it is not advisable to have (epyc && !numa) check.
> > > 
> > > Indeed on real hardware, I don't think we always see NUMA; my single socket,
> > > 16 core/32 thread 7302P Dell box, shows the kernel printing 'No NUMA
> > > configuration found...Faking a node.'
> That looks like a firmware bug, or maybe it's a feature and there is a knob in fw
> to turn it on/off in case the OS in use doesn't like it for some reason.
> 
> 
> > > So if real hardware hasn't got a NUMA node, what's the real problem?
> > 
> > I don't see any problem once we revert all these changes(patch 1-7).
> > We don't need if (epyc && !numa) error check or auto_enable_numa=true
> > unconditionally.
> 
> We need the revert to unbreak migration from QEMU < 5.0;
> everything else (fixes for CPUID_Fn8000001E) could go on top.
> 
> So what goes on top (the old code also wasn't correct once
> CPUID_Fn8000001E is taken into account, that's why we are at this point):
> 
> When starting QEMU without -numa, we can indeed skip the
> "if (epyc && !numa)" error check or auto_enable_numa=true
> in the case where there is 1 die (NPS1).
> (1) The user, however, may set a core/thread count bigger than the spec allows,
>     in which case CPUID_Fn8000001E_EBX/CPUID_Fn8000001E_ECX will not be
>     valid spec-wise and could trigger oopses in the guest kernel.
>     Given that we allow going out of spec, perhaps we should add a warning at
>     realize time saying that the -smp config in use is not supported, since it
>     doesn't match the AMD EPYC spec and might not work.
> 
> (2) Earlier we agreed that we can reuse the existing die_id instead of the
>     internal one (topo_ids.node_id in current code).
>     (The spec calls it DIE_ID and NODE ID interchangeably.)
>     Same as (1): add a warning when '-smp dies' goes beyond spec limits.
> 
> (3) "-smp dies>1": ''if'' we allow running it without -numa,
>     then the system-wide NUMA node id in CPUID_Fn8000001E_ECX probably doesn't matter.
>     It could be something like in the spec, but taking the die offset into
>     account to produce a unique id.
> 
>     Same again: add a warning that there is more than 1 die but numa is not
>     enabled, and suggest enabling numa.
> 
>     The current code produces an invalid APIC ID for a valid '-smp' combination;
>     however, if we revert it and switch to die_id, then it should produce
>     a valid APIC ID once again (as in 4.2).
>     Given it produces an invalid APIC ID, maybe we should just ditch this case and
>     fold it into (4) (i.e. require -numa if "-smp dies>1").
> 
> (4) -numa is used (RHBZ1728166):
>     we need to ensure that socket*dies == ms->numa_state->num_nodes
>     and make sure that CPUID_Fn8000001E_ECX is consistent with the
>     cpu mapping provided with the "-numa cpu=" option.

Why do we need socket*dies == ms->numa_state->num_nodes? That doesn't
seem to be the case on the bare metal EPYC machines I've used, whose firmware
lets you configure how many NUMA nodes there are.


Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|


Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode
Posted by Igor Mammedov 3 years, 8 months ago
On Fri, 28 Aug 2020 09:58:03 +0100
Daniel P. Berrangé <berrange@redhat.com> wrote:

> On Thu, Aug 27, 2020 at 10:21:10PM +0200, Igor Mammedov wrote:
> > On Wed, 26 Aug 2020 13:45:51 -0500
> > Babu Moger <babu.moger@amd.com> wrote:
> >   
> > > 
> > >   
> > > > -----Original Message-----
> > > > From: Dr. David Alan Gilbert <dgilbert@redhat.com>
> > > > Sent: Wednesday, August 26, 2020 1:34 PM
> > > > To: Moger, Babu <Babu.Moger@amd.com>
> > > > Cc: Igor Mammedov <imammedo@redhat.com>; Daniel P. Berrangé
> > > > <berrange@redhat.com>; ehabkost@redhat.com; mst@redhat.com; Michal
> > > > Privoznik <mprivozn@redhat.com>; qemu-devel@nongnu.org;
> > > > pbonzini@redhat.com; rth@twiddle.net
> > > > Subject: Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic
> > > > decode
> > > > 
> > > > * Babu Moger (babu.moger@amd.com) wrote:  
> > > > >  
> > > > > > -----Original Message-----
> > > > > > From: Igor Mammedov <imammedo@redhat.com>
> > > > > > Sent: Wednesday, August 26, 2020 8:31 AM
> > > > > > To: Daniel P. Berrangé <berrange@redhat.com>
> > > > > > Cc: Moger, Babu <Babu.Moger@amd.com>; pbonzini@redhat.com;
> > > > > > rth@twiddle.net; ehabkost@redhat.com; qemu-devel@nongnu.org;
> > > > > > mst@redhat.com; Michal Privoznik <mprivozn@redhat.com>
> > > > > > Subject: Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use
> > > > > > generic decode
> > > > > >
> > > > > > On Wed, 26 Aug 2020 13:50:59 +0100
> > > > > > Daniel P. Berrangé <berrange@redhat.com> wrote:
> > > > > >  
> > > > > > > On Wed, Aug 26, 2020 at 02:38:49PM +0200, Igor Mammedov wrote:  
> > > > > > > > On Fri, 21 Aug 2020 17:12:19 -0500 Babu Moger
> > > > > > > > <babu.moger@amd.com> wrote:
> > > > > > > >  
> > > > > > > > > To support some of the complex topology, we introduced EPYC
> > > > > > > > > mode apicid decode.
> > > > > > > > > But, EPYC mode decode is running into problems. Also it can
> > > > > > > > > become quite a maintenance problem in the future. So, it was
> > > > > > > > > decided to remove that code and use the generic decode which
> > > > > > > > > works for majority of the topology. Most of the SPECed
> > > > > > > > > configuration would work just fine. With some non-SPECed user
> > > > > > > > > inputs, it will create some sub-optimal configuration.
> > > > > > > > > Here is the discussion thread.
> > > > > > > > > https://lore.kernel.org/qemu-devel/c0bcc1a6-1d84-a6e7-e468-d5b437c1b254@amd.com/
> > > > > > > > >
> > > > > > > > > This series removes all the EPYC mode specific apicid changes
> > > > > > > > > and use the generic apicid decode.
> > > > > > > >
> > > > > > > > the main difference between EPYC and all other CPUs is that it
> > > > > > > > requires numa configuration (it's not optional) so we need an
> > > > > > > > extra  
> > > > > No, That is not true. Because of that assumption we made all these
> > > > > apicid changes. And here we are now.
> > > > >
> > > > > AMD supports various mixed configurations. In case of EPYC-Rome, we
> > > > > have NPS1, NPS2 and NPS4 (NUMA nodes per socket). In case of NPS1,
> > > > > basically we have all the cores in a socket under one numa node. This
> > > > > is a non-numa configuration.
> > > > > Looking at the various configurations and also discussing internally,
> > > > > it is not advisable to have (epyc && !numa) check.  
> > > > 
> > > > Indeed on real hardware, I don't think we always see NUMA; my single socket,
> > > > 16 core/32 thread 7302P Dell box, shows the kernel printing 'No NUMA
> > > > configuration found...Faking a node.'  
> > looks like firmware bug or maybe it's feature and there is a knob in fw
> > to turn it on/off in case used OS doesn't like it for some reason.
> > 
> >   
> > > > So if real hardware hasn't got a NUMA node, what's the real problem?  
> > > 
> > > I don't see any problem once we revert all these changes(patch 1-7).
> > > We don't need if (epyc && !numa) error check or auto_enable_numa=true
> > > unconditionally.  
> > 
> > We need the revert to unbreak migration from QEMU < 5.0; everything else
> > (fixes for CPUID_Fn8000001E) could go on top.
> >
> > So what's on top (because the old code also wasn't correct when
> > CPUID_Fn8000001E is taken into account, that's why we are at this point):
> >
> > When starting QEMU without -numa
> > Indeed we can skip "if (epyc && !numa) error check or auto_enable_numa=true"
> > in the case where there is 1 die (NPS1).
> > (1) The user, however, may set a core/thread count bigger than the spec allows,
> >     in which case CPUID_Fn8000001E_EBX/CPUID_Fn8000001E_ECX will not be
> >     valid spec-wise and could trigger oopses in the guest kernel.
> >     Given we allow going out of spec, perhaps we should add a warning at
> >     realize time saying that the -smp config used is not supported since it
> >     doesn't match the AMD EPYC spec and might not work.
> >
> > (2) Earlier we agreed that we can reuse the existing die_id instead of the
> >     internal one (topo_ids.node_id in current code).
> >     (It is called DIE_ID and NODE ID interchangeably in the spec.)
> >     Same as (1), add a warning when '-smp dies' goes beyond spec limits.
> >
> > (3) "-smp dies>1": ''if'' we allow running it without -numa,
> >     then the system-wide NUMA node id in CPUID_Fn8000001E_ECX probably doesn't
> >     matter. It could be something like in the spec but taking the die offset
> >     into account, to produce a unique id.
> >
> >     Same, add a warning that there is more than 1 die but numa is not enabled,
> >     and suggest enabling numa.
> >
> >     The current code produces an invalid APIC ID for a valid '-smp' combination;
> >     however, if we revert it and switch to die_id then it should produce a
> >     valid APIC ID once again (as in 4.2).
> >     Given it produces an invalid APIC ID, maybe we should just ditch the case
> >     and fold it into (4) (i.e. require -numa if "-smp dies>1").
> >
> > (4) -numa is used (RHBZ1728166)
> >     we need to ensure that socket*dies == ms->numa_state->num_nodes
> >     and make sure that CPUID_Fn8000001E_ECX is consistent with the
> >     cpu mapping provided with the "-numa cpu=" option.
> 
> Why do we need to socket*dies == ms->numa_state->num_nodes ? That doesn't
> seem to be the case in bare metal EPYC nodes I've used which lets you
> configure how many NUMA nodes in firmware.

(From dumps Babu has provided earlier, it was dies == nodes and
CPUID_Fn8000001E_ECX == numa node ids in SRAT.)

Dumping CPUID_Fn8000001E and the SRAT table for such configs will help us
figure out if we need socket*dies != nodes and how to compose a config
where SRAT differs from CPUID_Fn8000001E_ECX.

Babu, can you provide CPUID_Fn8000001E and SRAT dumps for the
above config combinations? Or point to some spec/guide on how it should be.
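
To make cases (3) and (4) from the quoted list concrete, here is a minimal
sketch of the check as a standalone function. All names in it (topo_cfg,
topo_check, the verdict enum) are hypothetical and do not exist in QEMU; it
only illustrates the rule "warn when dies > 1 without -numa, and require
sockets * dies == num_nodes when -numa is used", which is itself still under
discussion in this thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified view of the -smp/-numa inputs (not QEMU's types). */
struct topo_cfg {
    unsigned sockets;
    unsigned dies;        /* dies per socket, from "-smp dies=" */
    unsigned num_nodes;   /* nodes created with -numa, 0 when -numa is absent */
};

enum topo_verdict {
    TOPO_OK,              /* nothing to complain about */
    TOPO_WARN_NO_NUMA,    /* case (3): dies > 1 but -numa not given */
    TOPO_NODE_MISMATCH,   /* case (4): sockets * dies != num_nodes */
};

static enum topo_verdict topo_check(const struct topo_cfg *c)
{
    assert(c && c->sockets >= 1 && c->dies >= 1);

    if (c->num_nodes == 0) {
        /* No -numa: a single die is fine, more than one deserves a warning. */
        return c->dies > 1 ? TOPO_WARN_NO_NUMA : TOPO_OK;
    }
    /* -numa given: every die is expected to map to exactly one node. */
    return c->sockets * c->dies == c->num_nodes ? TOPO_OK : TOPO_NODE_MISMATCH;
}
```

If the dumps show hardware where SRAT nodes and dies legitimately differ, the
last comparison is the part that would have to be relaxed.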


> 
> 
> Regards,
> Daniel


RE: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode
Posted by Babu Moger 3 years, 8 months ago

> -----Original Message-----
> From: Igor Mammedov <imammedo@redhat.com>
> Sent: Friday, August 28, 2020 6:25 AM
> To: Daniel P. Berrangé <berrange@redhat.com>
> Cc: Moger, Babu <Babu.Moger@amd.com>; Dr. David Alan Gilbert
> <dgilbert@redhat.com>; ehabkost@redhat.com; mst@redhat.com; Michal
> Privoznik <mprivozn@redhat.com>; qemu-devel@nongnu.org;
> pbonzini@redhat.com; rth@twiddle.net
> Subject: Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use
> generic decode
> 
> On Fri, 28 Aug 2020 09:58:03 +0100
> Daniel P. Berrangé <berrange@redhat.com> wrote:
> 
> > On Thu, Aug 27, 2020 at 10:21:10PM +0200, Igor Mammedov wrote:
> > > On Wed, 26 Aug 2020 13:45:51 -0500
> > > Babu Moger <babu.moger@amd.com> wrote:
> > >
> > > >
> > > >
> > > > > -----Original Message-----
> > > > > From: Dr. David Alan Gilbert <dgilbert@redhat.com>
> > > > > Sent: Wednesday, August 26, 2020 1:34 PM
> > > > > To: Moger, Babu <Babu.Moger@amd.com>
> > > > > Cc: Igor Mammedov <imammedo@redhat.com>; Daniel P. Berrangé
> > > > > <berrange@redhat.com>; ehabkost@redhat.com; mst@redhat.com;
> > > > > Michal Privoznik <mprivozn@redhat.com>; qemu-
> devel@nongnu.org;
> > > > > pbonzini@redhat.com; rth@twiddle.net
> > > > > Subject: Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and
> > > > > use generic decode
> > > > >
> > > > > * Babu Moger (babu.moger@amd.com) wrote:
> > > > > >
> > > > > > > -----Original Message-----
> > > > > > > From: Igor Mammedov <imammedo@redhat.com>
> > > > > > > Sent: Wednesday, August 26, 2020 8:31 AM
> > > > > > > To: Daniel P. Berrangé <berrange@redhat.com>
> > > > > > > Cc: Moger, Babu <Babu.Moger@amd.com>;
> pbonzini@redhat.com;
> > > > > > > rth@twiddle.net; ehabkost@redhat.com; qemu-
> devel@nongnu.org;
> > > > > > > mst@redhat.com; Michal Privoznik <mprivozn@redhat.com>
> > > > > > > Subject: Re: [PATCH v5 0/8] Remove EPYC mode apicid decode
> > > > > > > and use generic decode
> > > > > > >
> > > > > > > On Wed, 26 Aug 2020 13:50:59 +0100 Daniel P. Berrangé
> > > > > > > <berrange@redhat.com> wrote:
> > > > > > >
> > > > > > > > On Wed, Aug 26, 2020 at 02:38:49PM +0200, Igor Mammedov
> wrote:
> > > > > > > > > On Fri, 21 Aug 2020 17:12:19 -0500 Babu Moger
> > > > > > > > > <babu.moger@amd.com> wrote:
> > > > > > > > >
> > > > > > > > > > To support some of the complex topology, we introduced
> > > > > > > > > > EPYC mode apicid decode.
> > > > > > > > > > But, EPYC mode decode is running into problems. Also
> > > > > > > > > > it can become quite a maintenance problem in the
> > > > > > > > > > future. So, it was decided to remove that code and use
> > > > > > > > > > the generic decode which works for majority of the
> > > > > > > > > > topology. Most of the SPECed configuration would work
> > > > > > > > > > just fine. With some non-SPECed user inputs, it will
> > > > > > > > > > create some sub-optimal configuration.
> > > > > > > > > > Here is the discussion thread.
> > > > > > > > > > https://lore.kernel.org/qemu-devel/c0bcc1a6-1d84-a6e7-e468-d5b437c1b254@amd.com/
> > > > > > > > > >
> > > > > > > > > > This series removes all the EPYC mode specific apicid
> > > > > > > > > > changes and use the generic apicid decode.
> > > > > > > > >
> > > > > > > > > the main difference between EPYC and all other CPUs is
> > > > > > > > > that it requires numa configuration (it's not optional)
> > > > > > > > > so we need an extra
> > > > > > No, That is not true. Because of that assumption we made all
> > > > > > these apicid changes. And here we are now.
> > > > > >
> > > > > > AMD supports various mixed configurations. In case of
> > > > > > EPYC-Rome, we have NPS1, NPS2 and NPS4 (NUMA nodes per socket).
> > > > > > In case of NPS1, basically we have all the cores in a socket
> > > > > > under one numa node. This is a non-numa configuration.
> > > > > > Looking at the various configurations and also discussing
> > > > > > internally, it is not advisable to have (epyc && !numa) check.
> > > > >
> > > > > Indeed on real hardware, I don't think we always see NUMA; my
> > > > > single socket,
> > > > > 16 core/32 thread 7302P Dell box, shows the kernel printing 'No
> > > > > NUMA configuration found...Faking a node.'
> > > looks like firmware bug or maybe it's feature and there is a knob in
> > > fw to turn it on/off in case used OS doesn't like it for some reason.
> > >
> > >
> > > > > So if real hardware hasn't got a NUMA node, what's the real problem?
> > > >
> > > > I don't see any problem once we revert all these changes(patch 1-7).
> > > > We don't need if (epyc && !numa) error check or
> > > > auto_enable_numa=true unconditionally.
> > >
> > > We need the revert to unbreak migration from QEMU < 5.0; everything else
> > > (fixes for CPUID_Fn8000001E) could go on top.
> > >
> > > So what's on top (because the old code also wasn't correct when
> > > CPUID_Fn8000001E is taken into account, that's why we are at this point):
> > >
> > > When starting QEMU without -numa
> > > Indeed we can skip "if (epyc && !numa) error check or auto_enable_numa=true"
> > > in the case where there is 1 die (NPS1).
> > > (1) The user, however, may set a core/thread count bigger than the spec allows,
> > >     in which case CPUID_Fn8000001E_EBX/CPUID_Fn8000001E_ECX will not be
> > >     valid spec-wise and could trigger oopses in the guest kernel.
> > >     Given we allow going out of spec, perhaps we should add a warning at
> > >     realize time saying that the -smp config used is not supported since it
> > >     doesn't match the AMD EPYC spec and might not work.
> > >
> > > (2) Earlier we agreed that we can reuse the existing die_id instead of the
> > >     internal one (topo_ids.node_id in current code).
> > >     (It is called DIE_ID and NODE ID interchangeably in the spec.)
> > >     Same as (1), add a warning when '-smp dies' goes beyond spec limits.
> > >
> > > (3) "-smp dies>1": ''if'' we allow running it without -numa,
> > >     then the system-wide NUMA node id in CPUID_Fn8000001E_ECX probably doesn't
> > >     matter. It could be something like in the spec but taking the die offset
> > >     into account, to produce a unique id.
> > >
> > >     Same, add a warning that there is more than 1 die but numa is not enabled,
> > >     and suggest enabling numa.
> > >
> > >     The current code produces an invalid APIC ID for a valid '-smp' combination;
> > >     however, if we revert it and switch to die_id then it should produce a
> > >     valid APIC ID once again (as in 4.2).
> > >     Given it produces an invalid APIC ID, maybe we should just ditch the case
> > >     and fold it into (4) (i.e. require -numa if "-smp dies>1").
> > >
> > > (4) -numa is used (RHBZ1728166)
> > >     we need to ensure that socket*dies == ms->numa_state->num_nodes
> > >     and make sure that CPUID_Fn8000001E_ECX is consistent with the
> > >     cpu mapping provided with the "-numa cpu=" option.
> >
> > Why do we need to socket*dies == ms->numa_state->num_nodes ? That
> > doesn't seem to be the case in bare metal EPYC nodes I've used which
> > lets you configure how many NUMA nodes in firmware.
> 
> (From dumps Babu has provided earlier, it was dies == nodes and
> CPUID_Fn8000001E_ECX == numa node ids in SRAT.)

Yes, that is correct. In most cases dies == nodes.

But that is going to change. In the future (even on EPYC-Rome), with a new
firmware BIOS option, users can configure their NUMA nodes. It will give the
option to choose NPS1, NPS2 or NPS4 (nodes per socket). In those cases dies and
nodes will not match. That is why I wanted to keep them separate. The user can
change dies or -numa to match their BIOS config.
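
As a rough illustration of why dies and nodes can diverge, the sketch below
computes the node count implied by an NPS setting and compares it with a die
count. The helper names are hypothetical (nothing here is QEMU or firmware
API), and it assumes the NPS value applies uniformly to every socket.

```c
#include <assert.h>
#include <stdbool.h>

/* Nodes-per-socket settings named in this thread for EPYC-Rome. */
enum nps_mode { NPS1 = 1, NPS2 = 2, NPS4 = 4 };

/* Total NUMA nodes a guest would need to mirror a given firmware setting. */
static unsigned nps_total_nodes(unsigned sockets, enum nps_mode nps)
{
    assert(sockets >= 1);
    return sockets * (unsigned)nps;
}

/* True only when the per-socket die count happens to equal the NPS value. */
static bool nps_matches_dies(unsigned dies_per_socket, enum nps_mode nps)
{
    return dies_per_socket == (unsigned)nps;
}
```

For example, a two-socket guest mirroring NPS4 would need 8 nodes, while a
guest with 4 dies per socket mirroring NPS1 no longer satisfies dies == nodes.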

> 
> Dumping CPUID_Fn8000001E and the SRAT table for such configs will help us
> figure out if we need socket*dies != nodes and how to compose a config
> where SRAT differs from CPUID_Fn8000001E_ECX.
>
> Babu, can you provide CPUID_Fn8000001E and SRAT dumps for the
> above config combinations? Or point to some spec/guide on how it should be.

I don't have the config right now, but I will try to get one.

> 
> 
> >
> >
> > Regards,
> > Daniel


Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode
Posted by Igor Mammedov 3 years, 8 months ago
On Fri, 28 Aug 2020 09:17:42 -0500
Babu Moger <babu.moger@amd.com> wrote:

> > -----Original Message-----
> > From: Igor Mammedov <imammedo@redhat.com>
> > Sent: Friday, August 28, 2020 6:25 AM
> > To: Daniel P. Berrangé <berrange@redhat.com>
> > Cc: Moger, Babu <Babu.Moger@amd.com>; Dr. David Alan Gilbert
> > <dgilbert@redhat.com>; ehabkost@redhat.com; mst@redhat.com; Michal
> > Privoznik <mprivozn@redhat.com>; qemu-devel@nongnu.org;
> > pbonzini@redhat.com; rth@twiddle.net
> > Subject: Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use
> > generic decode
> > 
> > On Fri, 28 Aug 2020 09:58:03 +0100
> > Daniel P. Berrangé <berrange@redhat.com> wrote:
> >   
> > > On Thu, Aug 27, 2020 at 10:21:10PM +0200, Igor Mammedov wrote:  
> > > > On Wed, 26 Aug 2020 13:45:51 -0500
> > > > Babu Moger <babu.moger@amd.com> wrote:
> > > >  
> > > > >
> > > > >  
> > > > > > -----Original Message-----
> > > > > > From: Dr. David Alan Gilbert <dgilbert@redhat.com>
> > > > > > Sent: Wednesday, August 26, 2020 1:34 PM
> > > > > > To: Moger, Babu <Babu.Moger@amd.com>
> > > > > > Cc: Igor Mammedov <imammedo@redhat.com>; Daniel P. Berrangé
> > > > > > <berrange@redhat.com>; ehabkost@redhat.com; mst@redhat.com;
> > > > > > Michal Privoznik <mprivozn@redhat.com>; qemu-  
> > devel@nongnu.org;  
> > > > > > pbonzini@redhat.com; rth@twiddle.net
> > > > > > Subject: Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and
> > > > > > use generic decode
> > > > > >
> > > > > > * Babu Moger (babu.moger@amd.com) wrote:  
> > > > > > >  
> > > > > > > > -----Original Message-----
> > > > > > > > From: Igor Mammedov <imammedo@redhat.com>
> > > > > > > > Sent: Wednesday, August 26, 2020 8:31 AM
> > > > > > > > To: Daniel P. Berrangé <berrange@redhat.com>
> > > > > > > > Cc: Moger, Babu <Babu.Moger@amd.com>;  
> > pbonzini@redhat.com;  
> > > > > > > > rth@twiddle.net; ehabkost@redhat.com; qemu-  
> > devel@nongnu.org;  
> > > > > > > > mst@redhat.com; Michal Privoznik <mprivozn@redhat.com>
> > > > > > > > Subject: Re: [PATCH v5 0/8] Remove EPYC mode apicid decode
> > > > > > > > and use generic decode
> > > > > > > >
> > > > > > > > On Wed, 26 Aug 2020 13:50:59 +0100 Daniel P. Berrangé
> > > > > > > > <berrange@redhat.com> wrote:
> > > > > > > >  
> > > > > > > > > On Wed, Aug 26, 2020 at 02:38:49PM +0200, Igor Mammedov  
> > wrote:  
> > > > > > > > > > On Fri, 21 Aug 2020 17:12:19 -0500 Babu Moger
> > > > > > > > > > <babu.moger@amd.com> wrote:
> > > > > > > > > >  
> > > > > > > > > > > To support some of the complex topology, we introduced
> > > > > > > > > > > EPYC mode apicid decode.
> > > > > > > > > > > But, EPYC mode decode is running into problems. Also
> > > > > > > > > > > it can become quite a maintenance problem in the
> > > > > > > > > > > future. So, it was decided to remove that code and use
> > > > > > > > > > > the generic decode which works for majority of the
> > > > > > > > > > > topology. Most of the SPECed configuration would work
> > > > > > > > > > > just fine. With some non-SPECed user inputs, it will
> > > > > > > > > > > create some sub-optimal configuration.
> > > > > > > > > > > Here is the discussion thread.
> > > > > > > > > > > https://lore.kernel.org/qemu-devel/c0bcc1a6-1d84-a6e7-e468-d5b437c1b254@amd.com/
> > > > > > > > > > >
> > > > > > > > > > > This series removes all the EPYC mode specific apicid
> > > > > > > > > > > changes and use the generic apicid decode.
> > > > > > > > > >
> > > > > > > > > > the main difference between EPYC and all other CPUs is
> > > > > > > > > > that it requires numa configuration (it's not optional)
> > > > > > > > > > so we need an extra  
> > > > > > > No, That is not true. Because of that assumption we made all
> > > > > > > these apicid changes. And here we are now.
> > > > > > >
> > > > > > > AMD supports various mixed configurations. In case of
> > > > > > > EPYC-Rome, we have NPS1, NPS2 and NPS4 (NUMA nodes per socket).
> > > > > > > In case of NPS1, basically we have all the cores in a socket
> > > > > > > under one numa node. This is a non-numa configuration.
> > > > > > > Looking at the various configurations and also discussing
> > > > > > > internally, it is not advisable to have (epyc && !numa) check.  
> > > > > >
> > > > > > Indeed on real hardware, I don't think we always see NUMA; my
> > > > > > single socket,
> > > > > > 16 core/32 thread 7302P Dell box, shows the kernel printing 'No
> > > > > > NUMA configuration found...Faking a node.'  
> > > > looks like firmware bug or maybe it's feature and there is a knob in
> > > > fw to turn it on/off in case used OS doesn't like it for some reason.
> > > >
> > > >  
> > > > > > So if real hardware hasn't got a NUMA node, what's the real problem?  
> > > > >
> > > > > I don't see any problem once we revert all these changes(patch 1-7).
> > > > > We don't need if (epyc && !numa) error check or
> > > > > auto_enable_numa=true unconditionally.  
> > > >
> > > > We need the revert to unbreak migration from QEMU < 5.0; everything else
> > > > (fixes for CPUID_Fn8000001E) could go on top.
> > > >
> > > > So what's on top (because the old code also wasn't correct when
> > > > CPUID_Fn8000001E is taken into account, that's why we are at this point):
> > > >
> > > > When starting QEMU without -numa
> > > > Indeed we can skip "if (epyc && !numa) error check or auto_enable_numa=true"
> > > > in the case where there is 1 die (NPS1).
> > > > (1) The user, however, may set a core/thread count bigger than the spec allows,
> > > >     in which case CPUID_Fn8000001E_EBX/CPUID_Fn8000001E_ECX will not be
> > > >     valid spec-wise and could trigger oopses in the guest kernel.
> > > >     Given we allow going out of spec, perhaps we should add a warning at
> > > >     realize time saying that the -smp config used is not supported since it
> > > >     doesn't match the AMD EPYC spec and might not work.
> > > >
> > > > (2) Earlier we agreed that we can reuse the existing die_id instead of the
> > > >     internal one (topo_ids.node_id in current code).
> > > >     (It is called DIE_ID and NODE ID interchangeably in the spec.)
> > > >     Same as (1), add a warning when '-smp dies' goes beyond spec limits.
> > > >
> > > > (3) "-smp dies>1": ''if'' we allow running it without -numa,
> > > >     then the system-wide NUMA node id in CPUID_Fn8000001E_ECX probably doesn't
> > > >     matter. It could be something like in the spec but taking the die offset
> > > >     into account, to produce a unique id.
> > > >
> > > >     Same, add a warning that there is more than 1 die but numa is not enabled,
> > > >     and suggest enabling numa.
> > > >
> > > >     The current code produces an invalid APIC ID for a valid '-smp' combination;
> > > >     however, if we revert it and switch to die_id then it should produce a
> > > >     valid APIC ID once again (as in 4.2).
> > > >     Given it produces an invalid APIC ID, maybe we should just ditch the case
> > > >     and fold it into (4) (i.e. require -numa if "-smp dies>1").
> > > >
> > > > (4) -numa is used (RHBZ1728166)
> > > >     we need to ensure that socket*dies == ms->numa_state->num_nodes
> > > >     and make sure that CPUID_Fn8000001E_ECX is consistent with the
> > > >     cpu mapping provided with the "-numa cpu=" option.
> > >
> > > Why do we need to socket*dies == ms->numa_state->num_nodes ? That
> > > doesn't seem to be the case in bare metal EPYC nodes I've used which
> > > lets you configure how many NUMA nodes in firmware.  
> > 
> > (From dumps Babu has provided earlier, it was dies == nodes and
> > CPUID_Fn8000001E_ECX == numa node ids in SRAT.)  
> 
> Yes, that is correct. In most cases dies == nodes.
>
> But that is going to change. In the future (even on EPYC-Rome), with a new
> firmware BIOS option, users can configure their NUMA nodes. It will give the
> option to choose NPS1, NPS2 or NPS4 (nodes per socket). In those cases dies and
> nodes will not match. That is why I wanted to keep them separate. The user can
> change dies or -numa to match their BIOS config.

If real hw will do that, then that's fine.
It will be the hw vendor who fixes issues, if any, when it comes to guest OSes
(i.e. Windows).


> > Dumping CPUID_Fn8000001E and the SRAT table for such configs will help us
> > figure out if we need socket*dies != nodes and how to compose a config
> > where SRAT differs from CPUID_Fn8000001E_ECX.
> >
> > Babu, can you provide CPUID_Fn8000001E and SRAT dumps for the
> > above config combinations? Or point to some spec/guide on how it should be.
> 
> I don't have the config right now, but I will try to get one.
> 
> > 
> >   
> > >
> > >
> > > Regards,
> > > Daniel  
> 
> 


Re: [PATCH v5 0/8] Remove EPYC mode apicid decode and use generic decode
Posted by Eduardo Habkost 3 years, 8 months ago
On Wed, Aug 26, 2020 at 02:38:49PM +0200, Igor Mammedov wrote:
> On Fri, 21 Aug 2020 17:12:19 -0500
> Babu Moger <babu.moger@amd.com> wrote:
> 
> > To support some of the complex topology, we introduced EPYC mode apicid decode.
> > But, EPYC mode decode is running into problems. Also it can become quite a
> > maintenance problem in the future. So, it was decided to remove that code and
> > use the generic decode which works for majority of the topology. Most of the
> > SPECed configuration would work just fine. With some non-SPECed user inputs,
> > it will create some sub-optimal configuration.
> > Here is the discussion thread.
> > https://lore.kernel.org/qemu-devel/c0bcc1a6-1d84-a6e7-e468-d5b437c1b254@amd.com/
> > 
> > This series removes all the EPYC mode specific apicid changes and use the generic
> > apicid decode.
> 
> the main difference between EPYC and all other CPUs is that
> it requires numa configuration (it's not optional)
> so we need an extra patch on top of this series to enforce that, i.e:
> 
>  if (epyc && !numa) 
>     error("EPYC cpu requires numa to be configured")
> 
> I think there was a patch in previous revisions that aimed for this.
> Simplest form would be above snippet.
> 
> More complex one, would be moving auto_enable_numa from MachineClass to
> MachineState so we can change it at runtime if EPYC is used. That should
> take care of use case where user hasn't provided -numa.

This sounds like a good solution.  It actually sounds simpler
than the alternatives (which just move the complexity to other
components).

We can keep MachineClass::auto_enable_numa as is, and just use it
to initialize the default value of MachineState::auto_enable_numa.
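
Roughly what I have in mind, as a standalone sketch rather than a real patch.
The struct and function names below are pared-down, hypothetical stand-ins (the
real MachineClass/MachineState carry far more, and where the EPYC hook would be
called from is left open):

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-machine-type class data. */
typedef struct MachineClassSketch {
    bool auto_enable_numa;   /* default chosen by the machine type */
} MachineClassSketch;

/* Hypothetical stand-in for the per-instance machine state. */
typedef struct MachineStateSketch {
    bool auto_enable_numa;   /* per-instance copy, may change at runtime */
} MachineStateSketch;

/* Instance init: the class value only seeds the per-instance default. */
static void machine_state_init(MachineStateSketch *ms,
                               const MachineClassSketch *mc)
{
    assert(ms && mc);
    ms->auto_enable_numa = mc->auto_enable_numa;
}

/* Assumed hook, run once the CPU model is known: EPYC forces the flag on. */
static void machine_apply_cpu_model(MachineStateSketch *ms, bool cpu_is_epyc)
{
    assert(ms);
    if (cpu_is_epyc) {
        ms->auto_enable_numa = true;
    }
}
```

The class default stays untouched, so other machine types and CPU models keep
their current behaviour.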

-- 
Eduardo