[PATCH v2 09/11] docs: Improve documentation for CPU topology

Andrea Bolognani posted 11 patches 1 year ago
[PATCH v2 09/11] docs: Improve documentation for CPU topology
Posted by Andrea Bolognani 1 year ago
On the guest configuration side, mention that support for the
"dies" attribute was introduced in libvirt 6.1.0 and clarify
that the ability to use non-default values is subject to
architecuture and machine limitations.

On the host capabilities side, the documentation was pretty
much entirely missing. It's still far from perfect, but anything
is better than having no information at all.

Signed-off-by: Andrea Bolognani <abologna@redhat.com>
---
 docs/formatcaps.rst   | 48 +++++++++++++++++++++++++++++++++++++------
 docs/formatdomain.rst | 16 +++++++++------
 2 files changed, 52 insertions(+), 12 deletions(-)

diff --git a/docs/formatcaps.rst b/docs/formatcaps.rst
index 3cccf70882..60f8b7caca 100644
--- a/docs/formatcaps.rst
+++ b/docs/formatcaps.rst
@@ -37,6 +37,12 @@ The ``<host/>`` element consists of the following child elements:
    The host UUID.
 ``cpu``
    The host CPU architecture and features.
+
+   Note that, while this element contains a ``topology`` sub-element,
+   the information contained therein is farily high-level and likely
+   not very useful when it comes to optimizing guest vCPU placement.
+   Look into the ``topology`` *element*, described below, for more
+   detailed information.
 ``power_management``
    whether host is capable of memory suspend, disk hibernation, or hybrid
    suspend.
@@ -44,12 +50,42 @@ The ``<host/>`` element consists of the following child elements:
    This element exposes information on the hypervisor's migration capabilities,
    like live migration, supported URI transports, and so on.
 ``topology``
-   This element embodies the host internal topology. Management applications may
-   want to learn this information when orchestrating new guests - e.g. due to
-   reduce inter-NUMA node transfers. Note that the ``sockets`` value reported
-   here is per-NUMA-node; this is in contrast to the value given in domain
-   definitions, which is interpreted as a total number of sockets for the
-   domain.
+   This element describes the host CPU topology in detail.
+
+   Management applications may want to use this information when defining new
+   guests: for example, in order to ensure that all vCPUs are scheduled on
+   CPUs that are in the same NUMA node or even CPU core.
+
+   The ``cells`` sub-element contains a list of NUMA nodes, each one
+   represented by a single ``cell`` element. Within each ``cell``, a ``cpus``
+   sub-element contains a list of logical CPUs, each one represented by a
+   single ``cpu`` element. In both cases, the ``num`` attribute of the
+   top-level element contains the number of children.
+
+   Each ``cpu`` element contains the following attributes:
+
+   ``id``
+     CPU identifier. Can be used to refer to it in the context of
+     `CPU tuning <formatdomain.html#cpu-tuning>`__.
+
+   ``socket_id``
+     Identifier for the physical package the CPU is in.
+
+   ``die_id``
+     Identifier for the die the CPU is in.
+
+     Note that not all architectures support CPU dies: if the current
+     architecture doesn't, the value will be 0 for all CPUs.
+
+   ``core_id``
+     Identifier for the core the CPU is in.
+
+   ``siblings``
+     List of CPUs that are in the same core.
+
+     The list will include the current CPU, plus all other CPUs that have the
+     same values for ``socket_id``, ``die_id`` and ``core_id``.
+
 ``secmodel``
    To find out default security labels for different security models you need to
    parse this element. In contrast with the former elements, this is repeated
diff --git a/docs/formatdomain.rst b/docs/formatdomain.rst
index 298ad46a45..73deaa5cb3 100644
--- a/docs/formatdomain.rst
+++ b/docs/formatdomain.rst
@@ -1578,14 +1578,18 @@ In case no restrictions need to be put on CPU model and its features, a simpler
    supported vendors can be found in ``cpu_map/*_vendors.xml``.
 ``topology``
    The ``topology`` element specifies requested topology of virtual CPU provided
-   to the guest. Four attributes, ``sockets``, ``dies``, ``cores``, and
-   ``threads``, accept non-zero positive integer values. They refer to the
-   total number of CPU sockets, number of dies per socket, number of cores per
-   die, and number of threads per core, respectively. The ``dies`` attribute is
-   optional and will default to 1 if omitted, while the other attributes are all
-   mandatory. Hypervisors may require that the maximum number of vCPUs specified
+   to the guest.
+   Its attributes ``sockets``, ``dies`` (:since:`Since 6.1.0`), ``cores``,
+   and ``threads`` accept non-zero positive integer values.
+   They refer to the total number of CPU sockets, number of dies per socket,
+   number of cores per die, and number of threads per core, respectively.
+   The ``dies`` attribute is optional and will default to 1 if omitted, while
+   the other attributes are all mandatory.
+   Hypervisors may require that the maximum number of vCPUs specified
    by the ``cpus`` element equals to the number of vcpus resulting from the
    topology.
+   Moreover, not all architectures and machine types support specifying a value
+   other than 1 for all attributes.
 ``feature``
    The ``cpu`` element can contain zero or more ``feature`` elements used to
    fine-tune features provided by the selected CPU model. The list of known
-- 
2.43.0
_______________________________________________
Devel mailing list -- devel@lists.libvirt.org
To unsubscribe send an email to devel-leave@lists.libvirt.org
Re: [PATCH v2 09/11] docs: Improve documentation for CPU topology
Posted by Peter Krempa 1 year ago
On Thu, Jan 11, 2024 at 15:26:41 +0100, Andrea Bolognani wrote:
> On the guest configuration side, mention that support for the
> "dies" attribute was introduced in libvirt 6.1.0 and clarify
> that the ability to use non-default values is subject to
> architecuture and machine limitations.
> 
> On the host capabilities side, the documentation was pretty
> much entirely missing. It's still far from perfect, but anything
> is better than having no information at all.
> 
> Signed-off-by: Andrea Bolognani <abologna@redhat.com>
> ---
>  docs/formatcaps.rst   | 48 +++++++++++++++++++++++++++++++++++++------
>  docs/formatdomain.rst | 16 +++++++++------
>  2 files changed, 52 insertions(+), 12 deletions(-)
> 
> diff --git a/docs/formatcaps.rst b/docs/formatcaps.rst
> index 3cccf70882..60f8b7caca 100644
> --- a/docs/formatcaps.rst
> +++ b/docs/formatcaps.rst

[....]

> @@ -44,12 +50,42 @@ The ``<host/>`` element consists of the following child elements:
>     This element exposes information on the hypervisor's migration capabilities,
>     like live migration, supported URI transports, and so on.
>  ``topology``
> -   This element embodies the host internal topology. Management applications may
> -   want to learn this information when orchestrating new guests - e.g. due to
> -   reduce inter-NUMA node transfers. Note that the ``sockets`` value reported
> -   here is per-NUMA-node; this is in contrast to the value given in domain
> -   definitions, which is interpreted as a total number of sockets for the
> -   domain.
> +   This element describes the host CPU topology in detail.
> +
> +   Management applications may want to use this information when defining new
> +   guests: for example, in order to ensure that all vCPUs are scheduled on
> +   CPUs that are in the same NUMA node or even CPU core.
> +
> +   The ``cells`` sub-element contains a list of NUMA nodes, each one
> +   represented by a single ``cell`` element. Within each ``cell``, a ``cpus``
> +   sub-element contains a list of logical CPUs, each one represented by a
> +   single ``cpu`` element. In both cases, the ``num`` attribute of the
> +   top-level element contains the number of children.
> +
> +   Each ``cpu`` element contains the following attributes:
> +
> +   ``id``
> +     CPU identifier. Can be used to refer to it in the context of
> +     `CPU tuning <formatdomain.html#cpu-tuning>`__.
> +
> +   ``socket_id``
> +     Identifier for the physical package the CPU is in.
> +
> +   ``die_id``
> +     Identifier for the die the CPU is in.
> +
> +     Note that not all architectures support CPU dies: if the current
> +     architecture doesn't, the value will be 0 for all CPUs.
> +
> +   ``core_id``
> +     Identifier for the core the CPU is in.
> +
> +   ``siblings``
> +     List of CPUs that are in the same core.
> +
> +     The list will include the current CPU, plus all other CPUs that have the
> +     same values for ``socket_id``, ``die_id`` and ``core_id``.

IIRC the bit about 'core_id' is not true, at least for some older AMD
cpus which had two fixed point units (each having it's own core id)
sharing a FPU and some other less-used modules.

That was a long time ago though, but the distinction was that the lowest
level cache was shared at this level (again IIRC)

See commit 828820e2d371205d6a6061301165d58a1a92e611 ; the 'bulldozer'
example.


>  ``secmodel``
>     To find out default security labels for different security models you need to
>     parse this element. In contrast with the former elements, this is repeated
> diff --git a/docs/formatdomain.rst b/docs/formatdomain.rst
> index 298ad46a45..73deaa5cb3 100644
> --- a/docs/formatdomain.rst
> +++ b/docs/formatdomain.rst
> @@ -1578,14 +1578,18 @@ In case no restrictions need to be put on CPU model and its features, a simpler
>     supported vendors can be found in ``cpu_map/*_vendors.xml``.
>  ``topology``
>     The ``topology`` element specifies requested topology of virtual CPU provided
> -   to the guest. Four attributes, ``sockets``, ``dies``, ``cores``, and
> -   ``threads``, accept non-zero positive integer values. They refer to the
> -   total number of CPU sockets, number of dies per socket, number of cores per
> -   die, and number of threads per core, respectively. The ``dies`` attribute is
> -   optional and will default to 1 if omitted, while the other attributes are all
> -   mandatory. Hypervisors may require that the maximum number of vCPUs specified
> +   to the guest.
> +   Its attributes ``sockets``, ``dies`` (:since:`Since 6.1.0`), ``cores``,
> +   and ``threads`` accept non-zero positive integer values.
> +   They refer to the total number of CPU sockets, number of dies per socket,
> +   number of cores per die, and number of threads per core, respectively.
> +   The ``dies`` attribute is optional and will default to 1 if omitted, while
> +   the other attributes are all mandatory.
> +   Hypervisors may require that the maximum number of vCPUs specified
>     by the ``cpus`` element equals to the number of vcpus resulting from the
>     topology.
> +   Moreover, not all architectures and machine types support specifying a value
> +   other than 1 for all attributes.
>  ``feature``
>     The ``cpu`` element can contain zero or more ``feature`` elements used toa

I'm not sure what to do with the siblings thing, but the rest:

Reviewed-by: Peter Krempa <pkrempa@redhat.com>
_______________________________________________
Devel mailing list -- devel@lists.libvirt.org
To unsubscribe send an email to devel-leave@lists.libvirt.org
Re: Re: [PATCH v2 09/11] docs: Improve documentation for CPU topology
Posted by Andrea Bolognani 1 year ago
On Fri, Jan 12, 2024 at 05:18:31PM +0100, Peter Krempa wrote:
> On Thu, Jan 11, 2024 at 15:26:41 +0100, Andrea Bolognani wrote:
> > +   Each ``cpu`` element contains the following attributes:
> > +
> > +   ``core_id``
> > +     Identifier for the core the CPU is in.
> > +
> > +   ``siblings``
> > +     List of CPUs that are in the same core.
> > +
> > +     The list will include the current CPU, plus all other CPUs that have the
> > +     same values for ``socket_id``, ``die_id`` and ``core_id``.
>
> IIRC the bit about 'core_id' is not true, at least for some older AMD
> cpus which had two fixed point units (each having it's own core id)
> sharing a FPU and some other less-used modules.
>
> That was a long time ago though, but the distinction was that the lowest
> level cache was shared at this level (again IIRC)
>
> See commit 828820e2d371205d6a6061301165d58a1a92e611 ; the 'bulldozer'
> example.

I've heard the AMD Bulldozer being mentioned as a curiosity several
times over the years. My understanding is that the architecture has
now been completely abandoned, and that the most recent hardware
that employs it was manufactured roughly a decade ago.

The kernel documentation[1] for the files that we parse to produce
those values is the following:

  What:        /sys/devices/system/cpu/cpuX/topology/core_id
  Description: the CPU core ID of cpuX. Typically it is the hardware platform's
               identifier (rather than the kernel's). The actual value is
               architecture and platform dependent.
  Values:      integer

  What:        /sys/devices/system/cpu/cpuX/topology/core_cpus_list
  Description: human-readable list of CPUs within the same core.
               The format is like 0-3, 8-11, 14,17.
               (deprecated name: "thread_siblings_list").
  Values:      decimal list.

So I think that, for all cases that are actually relevant today, the
explanations I'm introducing are accurate. If you have reservations
about them, please let me know how you'd like to change them and we
can certainly find a compromise :)


[1] https://www.kernel.org/doc/Documentation/ABI/stable/sysfs-devices-system-cpu
-- 
Andrea Bolognani / Red Hat / Virtualization
_______________________________________________
Devel mailing list -- devel@lists.libvirt.org
To unsubscribe send an email to devel-leave@lists.libvirt.org
Re: Re: Re: [PATCH v2 09/11] docs: Improve documentation for CPU topology
Posted by Andrea Bolognani 1 year ago
On Fri, Jan 12, 2024 at 08:58:52AM -0800, Andrea Bolognani wrote:
> On Fri, Jan 12, 2024 at 05:18:31PM +0100, Peter Krempa wrote:
> > On Thu, Jan 11, 2024 at 15:26:41 +0100, Andrea Bolognani wrote:
> > > +   Each ``cpu`` element contains the following attributes:
> > > +
> > > +   ``core_id``
> > > +     Identifier for the core the CPU is in.
> > > +
> > > +   ``siblings``
> > > +     List of CPUs that are in the same core.
> > > +
> > > +     The list will include the current CPU, plus all other CPUs that have the
> > > +     same values for ``socket_id``, ``die_id`` and ``core_id``.
> >
> > IIRC the bit about 'core_id' is not true, at least for some older AMD
> > cpus which had two fixed point units (each having it's own core id)
> > sharing a FPU and some other less-used modules.
> >
> > That was a long time ago though, but the distinction was that the lowest
> > level cache was shared at this level (again IIRC)
> >
> > See commit 828820e2d371205d6a6061301165d58a1a92e611 ; the 'bulldozer'
> > example.
>
> I've heard the AMD Bulldozer being mentioned as a curiosity several
> times over the years. My understanding is that the architecture has
> now been completely abandoned, and that the most recent hardware
> that employs it was manufactured roughly a decade ago.
>
> The kernel documentation[1] for the files that we parse to produce
> those values is the following:
>
>   What:        /sys/devices/system/cpu/cpuX/topology/core_id
>   Description: the CPU core ID of cpuX. Typically it is the hardware platform's
>                identifier (rather than the kernel's). The actual value is
>                architecture and platform dependent.
>   Values:      integer
>
>   What:        /sys/devices/system/cpu/cpuX/topology/core_cpus_list
>   Description: human-readable list of CPUs within the same core.
>                The format is like 0-3, 8-11, 14,17.
>                (deprecated name: "thread_siblings_list").
>   Values:      decimal list.
>
> So I think that, for all cases that are actually relevant today, the
> explanations I'm introducing are accurate. If you have reservations
> about them, please let me know how you'd like to change them and we
> can certainly find a compromise :)

So, can I push this as is with your R-b, or do you want me to make
further tweaks?

> [1] https://www.kernel.org/doc/Documentation/ABI/stable/sysfs-devices-system-cpu
-- 
Andrea Bolognani / Red Hat / Virtualization
_______________________________________________
Devel mailing list -- devel@lists.libvirt.org
To unsubscribe send an email to devel-leave@lists.libvirt.org
Re: Re: Re: [PATCH v2 09/11] docs: Improve documentation for CPU topology
Posted by Peter Krempa 1 year ago
On Mon, Jan 15, 2024 at 05:58:18 -0800, Andrea Bolognani wrote:
> On Fri, Jan 12, 2024 at 08:58:52AM -0800, Andrea Bolognani wrote:
> > On Fri, Jan 12, 2024 at 05:18:31PM +0100, Peter Krempa wrote:
> > > On Thu, Jan 11, 2024 at 15:26:41 +0100, Andrea Bolognani wrote:
> > > > +   Each ``cpu`` element contains the following attributes:
> > > > +
> > > > +   ``core_id``
> > > > +     Identifier for the core the CPU is in.
> > > > +
> > > > +   ``siblings``
> > > > +     List of CPUs that are in the same core.
> > > > +
> > > > +     The list will include the current CPU, plus all other CPUs that have the
> > > > +     same values for ``socket_id``, ``die_id`` and ``core_id``.
> > >
> > > IIRC the bit about 'core_id' is not true, at least for some older AMD
> > > cpus which had two fixed point units (each having it's own core id)
> > > sharing a FPU and some other less-used modules.
> > >
> > > That was a long time ago though, but the distinction was that the lowest
> > > level cache was shared at this level (again IIRC)
> > >
> > > See commit 828820e2d371205d6a6061301165d58a1a92e611 ; the 'bulldozer'
> > > example.
> >
> > I've heard the AMD Bulldozer being mentioned as a curiosity several
> > times over the years. My understanding is that the architecture has
> > now been completely abandoned, and that the most recent hardware
> > that employs it was manufactured roughly a decade ago.
> >
> > The kernel documentation[1] for the files that we parse to produce
> > those values is the following:
> >
> >   What:        /sys/devices/system/cpu/cpuX/topology/core_id
> >   Description: the CPU core ID of cpuX. Typically it is the hardware platform's
> >                identifier (rather than the kernel's). The actual value is
> >                architecture and platform dependent.
> >   Values:      integer
> >
> >   What:        /sys/devices/system/cpu/cpuX/topology/core_cpus_list
> >   Description: human-readable list of CPUs within the same core.
> >                The format is like 0-3, 8-11, 14,17.
> >                (deprecated name: "thread_siblings_list").
> >   Values:      decimal list.
> >
> > So I think that, for all cases that are actually relevant today, the
> > explanations I'm introducing are accurate. If you have reservations
> > about them, please let me know how you'd like to change them and we
> > can certainly find a compromise :)
> 
> So, can I push this as is with your R-b, or do you want me to make
> further tweaks?

Ah, sorry, I forgot to respond. I think this explanation makes sense,
and since the HW I've mentioned is now obsolete as well as kernel
markign the fields as deprecated:

Reviewed-by: Peter Krempa <pkrempa@redhat.com>
_______________________________________________
Devel mailing list -- devel@lists.libvirt.org
To unsubscribe send an email to devel-leave@lists.libvirt.org