[v1] numa: add '-numa cpu' option

[Qemu-devel] [PATCH for-2.10 22/23] numa: add '-numa cpu, ...' option for property based node mapping

Posted by Igor Mammedov 8 years, 10 months ago

Legacy CPU-to-node mapping uses CPU index values to map a
VCPU to a node with the '-numa node,nodeid=node,cpus=x[-y]'
option. However, the CPU index is an internal concept, and QEMU
users have to guess (or reimplement QEMU's logic) to map it to
a concrete CPU socket/core/thread in order to place CPUs
sanely across NUMA nodes.

This patch allows mapping CPU objects to NUMA nodes using
the same properties as are used for CPUs with -device/device_add
(socket-id/core-id/thread-id/node-id).

At present, the valid properties/values for addressing CPUs can be
fetched with the hotpluggable-cpus monitor/QMP command. This
requires the user to start QEMU twice when creating a domain:
first to fetch the possible CPUs for a machine type/-smp layout,
and then a second time with the explicit NUMA mapping for actual
use. The results of the first step can be saved and reused to
set/change the mapping later, as long as the machine type/-smp
layout stays the same.

The proposed implementation supports exact and wildcard matching
to simplify the CLI. It allows setting the mapping for a specific
CPU or for a group of CPU objects selected by the matched properties.

For example:

   # exact mapping x86
   -numa cpu,node-id=x,socket-id=y,core-id=z,thread-id=n

   # exact mapping SPAPR
   -numa cpu,node-id=x,core-id=y

   # wildcard mapping, all cpu objects that match socket-id=y
   # are mapped to node-id=x
   -numa cpu,node-id=x,socket-id=y
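
The exact/wildcard behaviour above can be sketched roughly as follows.
This is only an illustration, not the code in this series: CpuProps is a
hypothetical stand-in for CpuInstanceProperties, and cpu_props_match()
and map_cpus_to_node() are made-up helper names. The assumption is that
a property omitted from the request matches any value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for CpuInstanceProperties: every property is
 * optional, and a has_* flag records whether it was given. */
typedef struct {
    bool has_node_id;   int64_t node_id;
    bool has_socket_id; int64_t socket_id;
    bool has_core_id;   int64_t core_id;
    bool has_thread_id; int64_t thread_id;
} CpuProps;

/* One property matches when the request omits it (wildcard) or when
 * the slot carries the same value. */
static bool prop_matches(bool req_has, int64_t req_val,
                         bool slot_has, int64_t slot_val)
{
    return !req_has || (slot_has && req_val == slot_val);
}

/* A request matches a slot when all topology properties match. */
static bool cpu_props_match(const CpuProps *req, const CpuProps *slot)
{
    return prop_matches(req->has_socket_id, req->socket_id,
                        slot->has_socket_id, slot->socket_id) &&
           prop_matches(req->has_core_id, req->core_id,
                        slot->has_core_id, slot->core_id) &&
           prop_matches(req->has_thread_id, req->thread_id,
                        slot->has_thread_id, slot->thread_id);
}

/* Assign req->node_id to every matching slot and return how many
 * slots were updated. */
static size_t map_cpus_to_node(CpuProps *slots, size_t n, const CpuProps *req)
{
    size_t matched = 0;

    assert(req->has_node_id); /* node-id is mandatory */
    for (size_t i = 0; i < n; i++) {
        if (cpu_props_match(req, &slots[i])) {
            slots[i].has_node_id = true;
            slots[i].node_id = req->node_id;
            matched++;
        }
    }
    return matched;
}
```

With two sockets of two cores each, a request carrying only socket-id=1
would update both cores of socket 1, while adding core-id narrows it to
a single slot.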

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
---
 numa.c           | 13 +++++++++++++
 qapi-schema.json |  7 +++++--
 qemu-options.hx  | 23 ++++++++++++++++++++++-
 3 files changed, 40 insertions(+), 3 deletions(-)

diff --git a/numa.c b/numa.c
index 088fae3..588586b 100644
--- a/numa.c
+++ b/numa.c
@@ -246,6 +246,19 @@ static int parse_numa(void *opaque, QemuOpts *opts, Error **errp)
         }
         nb_numa_nodes++;
         break;
+    case NUMA_OPTIONS_TYPE_CPU:
+        if (!object->u.cpu.has_node_id) {
+            error_setg(&err, "Missing mandatory node-id property");
+            goto end;
+        }
+        if (!numa_info[object->u.cpu.node_id].present) {
+            error_setg(&err, "Invalid node-id=%" PRId64 ", NUMA node must be "
+                "defined with -numa node,nodeid=ID before it's used with "
+                "-numa cpu,node-id=ID", object->u.cpu.node_id);
+            goto end;
+        }
+        machine_set_cpu_numa_node(ms, &object->u.cpu, &err);
+        break;
     default:
         abort();
     }
diff --git a/qapi-schema.json b/qapi-schema.json
index a6b5955..a9a1d5e 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -5673,10 +5673,12 @@
 ##
 # @NumaOptionsType:
 #
+# @cpu: property based CPU(s) to node mapping (Since: 2.10)
+#
 # Since: 2.1
 ##
 { 'enum': 'NumaOptionsType',
-  'data': [ 'node' ] }
+  'data': [ 'node', 'cpu' ] }
 
 ##
 # @NumaOptions:
@@ -5689,7 +5691,8 @@
   'base': { 'type': 'NumaOptionsType' },
   'discriminator': 'type',
   'data': {
-    'node': 'NumaNodeOptions' }}
+    'node': 'NumaNodeOptions',
+    'cpu': 'CpuInstanceProperties' }}
 
 ##
 # @NumaNodeOptions:
diff --git a/qemu-options.hx b/qemu-options.hx
index 99af8ed..2185c34 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -139,13 +139,16 @@ ETEXI
 
 DEF("numa", HAS_ARG, QEMU_OPTION_numa,
     "-numa node[,mem=size][,cpus=firstcpu[-lastcpu]][,nodeid=node]\n"
-    "-numa node[,memdev=id][,cpus=firstcpu[-lastcpu]][,nodeid=node]\n", QEMU_ARCH_ALL)
+    "-numa node[,memdev=id][,cpus=firstcpu[-lastcpu]][,nodeid=node]\n"
+    "-numa cpu,node-id=node[,socket-id=x][,core-id=y][,thread-id=z]\n", QEMU_ARCH_ALL)
 STEXI
 @item -numa node[,mem=@var{size}][,cpus=@var{firstcpu}[-@var{lastcpu}]][,nodeid=@var{node}]
 @itemx -numa node[,memdev=@var{id}][,cpus=@var{firstcpu}[-@var{lastcpu}]][,nodeid=@var{node}]
+@itemx -numa cpu,node-id=@var{node}[,socket-id=@var{x}][,core-id=@var{y}][,thread-id=@var{z}]
 @findex -numa
 Define a NUMA node and assign RAM and VCPUs to it.
 
+Legacy VCPU assignment uses @samp{cpus} option where
 @var{firstcpu} and @var{lastcpu} are CPU indexes. Each
 @samp{cpus} option represent a contiguous range of CPU indexes
 (or a single VCPU if @var{lastcpu} is omitted). A non-contiguous
@@ -159,6 +162,24 @@ a NUMA node:
 -numa node,cpus=0-2,cpus=5
 @end example
 
+@samp{cpu} option is new alternative to @samp{cpus} option
+uses @samp{socket-id|core-id|thread-id} properties to assign
+CPU objects to a @var{node} using topology layout properties of CPU.
+Set of properties is machine specific, and depends on used machine
+type/@samp{smp} options. It could be queried with @samp{hotpluggable-cpus}
+monitor command.
+@samp{node-id} property specifies @var{node} to which CPU object
+will be assigned, it's required for @var{node} to be declared
+with @samp{node} option before it's used with @samp{cpu} option.
+
+For example:
+@example
+-M pc \
+-smp 1,sockets=2,maxcpus=2 \
+-numa node,nodeid=0 -numa node,nodeid=1 \
+-numa cpu,node-id=0,socket-id=0 -numa cpu,node-id=1,socket-id=1
+@end example
+
 @samp{mem} assigns a given RAM amount to a node. @samp{memdev}
 assigns RAM from a given memory backend device to a node. If
 @samp{mem} and @samp{memdev} are omitted in all nodes, RAM is
-- 
2.7.4

Re: [Qemu-devel] [PATCH for-2.10 22/23] numa: add '-numa cpu, ...' option for property based node mapping

Posted by Eric Blake 8 years, 10 months ago

On 03/22/2017 08:32 AM, Igor Mammedov wrote:
> legacy cpu to node mapping is using cpu index values to map
> VCPU to node with help of '-numa node,nodeid=node,cpus=x[-y]'
> option. However cpu index is internal concept and QEMU users
> have to guess /reimplement qemu's logic/ to map it to
> a concrete cpu socket/core/thread to make sane CPUs
> placement across numa nodes.
> 
> This patch allows to map cpu objects to numa nodes using
> the same properties as used for cpus with -device/device_add
> (socket-id/core-id/thread-id/node-id).
> 
> At present valid properties/values to address CPUs could be
> fetched using hotpluggable-cpus monitor/qmp command, it will
> require user to start qemu twice when creating domain to fetch
> possible CPUs for a machine type/-smp layout first and
> then the second time with numa explicit mapping for actual
> usage. The first step results could be saved and reused to
> set/change mapping later as far as machine type/-smp stays
> the same.
> 
> Proposed impl. supports exact and wildcard matching to
> simplify CLI and allow to set mapping for a specific cpu
> or group of cpu objects specified by matched properties.
> 
> For example:
> 
>    # exact mapping x86
>    -numa cpu,node-id=x,socket-id=y,core-id=z,thread-id=n
> 
>    # exact mapping SPAPR
>    -numa cpu,node-id=x,core-id=y
> 
>    # wildcard mapping, all cpu objects that match socket-id=y
>    # are mapped to node-id=x
>    -numa cpu,node-id=x,socket-id=y
> 
> Signed-off-by: Igor Mammedov <imammedo@redhat.com>
> ---
>  numa.c           | 13 +++++++++++++
>  qapi-schema.json |  7 +++++--
>  qemu-options.hx  | 23 ++++++++++++++++++++++-
>  3 files changed, 40 insertions(+), 3 deletions(-)
> 

>  
> +@samp{cpu} option is new alternative to @samp{cpus} option

s/is/is a/

> +uses @samp{socket-id|core-id|thread-id} properties to assign

s/uses/which uses/

> +CPU objects to a @var{node} using topology layout properties of CPU.
> +Set of properties is machine specific, and depends on used machine

s/Set/The set/

> +type/@samp{smp} options. It could be queried with @samp{hotpluggable-cpus}
> +monitor command.
> +@samp{node-id} property specifies @var{node} to which CPU object
> +will be assigned, it's required for @var{node} to be declared
> +with @samp{node} option before it's used with @samp{cpu} option.
> +
> +For example:
> +@example
> +-M pc \
> +-smp 1,sockets=2,maxcpus=2 \
> +-numa node,nodeid=0 -numa node,nodeid=1 \
> +-numa cpu,node-id=0,socket-id=0 -numa cpu,node-id=1,socket-id=1
> +@end example
> +
>  @samp{mem} assigns a given RAM amount to a node. @samp{memdev}
>  assigns RAM from a given memory backend device to a node. If
>  @samp{mem} and @samp{memdev} are omitted in all nodes, RAM is
> 

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org

Re: [Qemu-devel] [PATCH for-2.10 22/23] numa: add '-numa cpu, ...' option for property based node mapping

Posted by Igor Mammedov 8 years, 10 months ago

On Thu, 23 Mar 2017 08:23:32 -0500
Eric Blake <eblake@redhat.com> wrote:

...
> >  
> > +@samp{cpu} option is new alternative to @samp{cpus} option  
> 
> s/is/is a/
> 
> > +uses @samp{socket-id|core-id|thread-id} properties to assign  
> 
> s/uses/which uses/
> 
> > +CPU objects to a @var{node} using topology layout properties of CPU.
> > +Set of properties is machine specific, and depends on used machine  
> 
> s/Set/The set/
> 
Fixed in v2 branch

Re: [Qemu-devel] [PATCH for-2.10 22/23] numa: add '-numa cpu, ...' option for property based node mapping

Posted by David Gibson 8 years, 10 months ago

On Wed, Mar 22, 2017 at 02:32:47PM +0100, Igor Mammedov wrote:
> legacy cpu to node mapping is using cpu index values to map
> VCPU to node with help of '-numa node,nodeid=node,cpus=x[-y]'
> option. However cpu index is internal concept and QEMU users
> have to guess /reimplement qemu's logic/ to map it to
> a concrete cpu socket/core/thread to make sane CPUs
> placement across numa nodes.
> 
> This patch allows to map cpu objects to numa nodes using
> the same properties as used for cpus with -device/device_add
> (socket-id/core-id/thread-id/node-id).
> 
> At present valid properties/values to address CPUs could be
> fetched using hotpluggable-cpus monitor/qmp command, it will
> require user to start qemu twice when creating domain to fetch
> possible CPUs for a machine type/-smp layout first and
> then the second time with numa explicit mapping for actual
> usage. The first step results could be saved and reused to
> set/change mapping later as far as machine type/-smp stays
> the same.
> 
> Proposed impl. supports exact and wildcard matching to
> simplify CLI and allow to set mapping for a specific cpu
> or group of cpu objects specified by matched properties.
> 
> For example:
> 
>    # exact mapping x86
>    -numa cpu,node-id=x,socket-id=y,core-id=z,thread-id=n
> 
>    # exact mapping SPAPR
>    -numa cpu,node-id=x,core-id=y
> 
>    # wildcard mapping, all cpu objects that match socket-id=y
>    # are mapped to node-id=x
>    -numa cpu,node-id=x,socket-id=y
> 
> Signed-off-by: Igor Mammedov <imammedo@redhat.com>

What's the rationale for adding a new CLI, rather than adding node-id
properties to the appropriate objects with -device, -global or -set as
appropriate?


-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

Re: [Qemu-devel] [PATCH for-2.10 22/23] numa: add '-numa cpu, ...' option for property based node mapping

Posted by Igor Mammedov 8 years, 10 months ago

On Tue, 28 Mar 2017 16:16:02 +1100
David Gibson <david@gibson.dropbear.id.au> wrote:

> On Wed, Mar 22, 2017 at 02:32:47PM +0100, Igor Mammedov wrote:
> > legacy cpu to node mapping is using cpu index values to map
> > VCPU to node with help of '-numa node,nodeid=node,cpus=x[-y]'
> > option. However cpu index is internal concept and QEMU users
> > have to guess /reimplement qemu's logic/ to map it to
> > a concrete cpu socket/core/thread to make sane CPUs
> > placement across numa nodes.
> > 
> > This patch allows to map cpu objects to numa nodes using
> > the same properties as used for cpus with -device/device_add
> > (socket-id/core-id/thread-id/node-id).
> > 
> > At present valid properties/values to address CPUs could be
> > fetched using hotpluggable-cpus monitor/qmp command, it will
> > require user to start qemu twice when creating domain to fetch
> > possible CPUs for a machine type/-smp layout first and
> > then the second time with numa explicit mapping for actual
> > usage. The first step results could be saved and reused to
> > set/change mapping later as far as machine type/-smp stays
> > the same.
> > 
> > Proposed impl. supports exact and wildcard matching to
> > simplify CLI and allow to set mapping for a specific cpu
> > or group of cpu objects specified by matched properties.
> > 
> > For example:
> > 
> >    # exact mapping x86
> >    -numa cpu,node-id=x,socket-id=y,core-id=z,thread-id=n
> > 
> >    # exact mapping SPAPR
> >    -numa cpu,node-id=x,core-id=y
> > 
> >    # wildcard mapping, all cpu objects that match socket-id=y
> >    # are mapped to node-id=x
> >    -numa cpu,node-id=x,socket-id=y
> > 
> > Signed-off-by: Igor Mammedov <imammedo@redhat.com>
> 
> What's the rationale for adding a new CLI, rather than adding node-id
> properties to the appropriate objects with -device, -global or -set as
> appropriate?
 '-global' applies to all CPUs, while '-device' and '-set' apply only to
 CPUs that are present at boot time. So they do not work for objects that
 are possible but not present at boot time. For ACPI-based targets, we need
 the NUMA mapping at boot time to build the ACPI SRAT table.
 I don't know if that's important for spapr/fdt, but it currently uses the
 predefined mapping from -numa node,cpus=x-y. The new CLI hides the internal
 cpu_index from the user and allows using the same properties as we use for
 -device cpu,... to define the mapping to NUMA nodes for present/possible CPUs.
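
 To illustrate the boot-time requirement, here is a rough sketch that
 assumes a boot-time affinity table needs one entry per possible CPU,
 whether or not it is plugged in yet. PossibleCpu, AffinityEntry and
 build_affinity_entries() are hypothetical names and the entry layout is
 not the real SRAT format:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical possible-CPU slot: the node mapping must be known at
 * boot even when the CPU is not present yet. */
typedef struct {
    int64_t socket_id;
    int64_t node_id;
    bool present;   /* false: may be hotplugged later */
} PossibleCpu;

/* Hypothetical, simplified affinity entry (not the real SRAT layout). */
typedef struct {
    int64_t socket_id;
    int64_t node_id;
    bool enabled;
} AffinityEntry;

/* Emit one entry per possible CPU, present or not, and return the
 * number of entries written (at most cap). */
static size_t build_affinity_entries(const PossibleCpu *cpus, size_t n,
                                     AffinityEntry *out, size_t cap)
{
    size_t count = 0;

    assert(cpus != NULL && out != NULL);
    for (size_t i = 0; i < n && count < cap; i++) {
        out[count].socket_id = cpus[i].socket_id;
        out[count].node_id = cpus[i].node_id;
        out[count].enabled = cpus[i].present;
        count++;
    }
    return count;
}
```

 With '-smp 1,sockets=2,maxcpus=2' the second slot is not present at
 boot, so there is no '-device' to hang a node-id on, yet the table
 still needs its node.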


Re: [Qemu-devel] [PATCH for-2.10 22/23] numa: add '-numa cpu, ...' option for property based node mapping

Posted by David Gibson 8 years, 10 months ago

On Tue, Mar 28, 2017 at 01:09:11PM +0200, Igor Mammedov wrote:
> On Tue, 28 Mar 2017 16:16:02 +1100
> David Gibson <david@gibson.dropbear.id.au> wrote:
> 
> > On Wed, Mar 22, 2017 at 02:32:47PM +0100, Igor Mammedov wrote:
> > > legacy cpu to node mapping is using cpu index values to map
> > > VCPU to node with help of '-numa node,nodeid=node,cpus=x[-y]'
> > > option. However cpu index is internal concept and QEMU users
> > > have to guess /reimplement qemu's logic/ to map it to
> > > a concrete cpu socket/core/thread to make sane CPUs
> > > placement across numa nodes.
> > > 
> > > This patch allows to map cpu objects to numa nodes using
> > > the same properties as used for cpus with -device/device_add
> > > (socket-id/core-id/thread-id/node-id).
> > > 
> > > At present valid properties/values to address CPUs could be
> > > fetched using hotpluggable-cpus monitor/qmp command, it will
> > > require user to start qemu twice when creating domain to fetch
> > > possible CPUs for a machine type/-smp layout first and
> > > then the second time with numa explicit mapping for actual
> > > usage. The first step results could be saved and reused to
> > > set/change mapping later as far as machine type/-smp stays
> > > the same.
> > > 
> > > Proposed impl. supports exact and wildcard matching to
> > > simplify CLI and allow to set mapping for a specific cpu
> > > or group of cpu objects specified by matched properties.
> > > 
> > > For example:
> > > 
> > >    # exact mapping x86
> > >    -numa cpu,node-id=x,socket-id=y,core-id=z,thread-id=n
> > > 
> > >    # exact mapping SPAPR
> > >    -numa cpu,node-id=x,core-id=y
> > > 
> > >    # wildcard mapping, all cpu objects that match socket-id=y
> > >    # are mapped to node-id=x
> > >    -numa cpu,node-id=x,socket-id=y
> > > 
> > > Signed-off-by: Igor Mammedov <imammedo@redhat.com>
> > 
> > What's the rationale for adding a new CLI, rather than adding node-id
> > properties to the appropriate objects with -device, -global or -set as
> > appropriate?
>  '-global' applies to all cpus, while '-device,-set' applies to present
>  at boot time cpus only. So they do not work for the case of possible but
>  not present at boot time objects.

Ah!  Of course.

> For ACPI based targets, we need to have
>  numa mapping at boot time to build ACPI SRAT table.
>  I don't know if it's important for spapr/fdt,

Not in the same way.  For spapr the device tree fragment for the new
cpu is supplied to the guest at hotplug time rather than having to be
in the initial device tree.  So for us, node could be supplied with
device_add.

> but it uses current predefined
>  mapping with -numa node,cpus=x-y and new CLI hides from user internal
>  cpu_index and allows to use the same properties as we use for -device cpu,...
>  to define mapping to numa nodes for present/possible cpus.
> 

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

Re: [Qemu-devel] [PATCH for-2.10 22/23] numa: add '-numa cpu, ...' option for property based node mapping

Posted by Igor Mammedov 8 years, 10 months ago

On Wed, 29 Mar 2017 13:27:23 +1100
David Gibson <david@gibson.dropbear.id.au> wrote:

> On Tue, Mar 28, 2017 at 01:09:11PM +0200, Igor Mammedov wrote:
> > On Tue, 28 Mar 2017 16:16:02 +1100
> > David Gibson <david@gibson.dropbear.id.au> wrote:
> >   
> > > On Wed, Mar 22, 2017 at 02:32:47PM +0100, Igor Mammedov wrote:  
> > > > legacy cpu to node mapping is using cpu index values to map
> > > > VCPU to node with help of '-numa node,nodeid=node,cpus=x[-y]'
> > > > option. However cpu index is internal concept and QEMU users
> > > > have to guess /reimplement qemu's logic/ to map it to
> > > > a concrete cpu socket/core/thread to make sane CPUs
> > > > placement across numa nodes.
> > > > 
> > > > This patch allows to map cpu objects to numa nodes using
> > > > the same properties as used for cpus with -device/device_add
> > > > (socket-id/core-id/thread-id/node-id).
> > > > 
> > > > At present valid properties/values to address CPUs could be
> > > > fetched using hotpluggable-cpus monitor/qmp command, it will
> > > > require user to start qemu twice when creating domain to fetch
> > > > possible CPUs for a machine type/-smp layout first and
> > > > then the second time with numa explicit mapping for actual
> > > > usage. The first step results could be saved and reused to
> > > > set/change mapping later as far as machine type/-smp stays
> > > > the same.
> > > > 
> > > > Proposed impl. supports exact and wildcard matching to
> > > > simplify CLI and allow to set mapping for a specific cpu
> > > > or group of cpu objects specified by matched properties.
> > > > 
> > > > For example:
> > > > 
> > > >    # exact mapping x86
> > > >    -numa cpu,node-id=x,socket-id=y,core-id=z,thread-id=n
> > > > 
> > > >    # exact mapping SPAPR
> > > >    -numa cpu,node-id=x,core-id=y
> > > > 
> > > >    # wildcard mapping, all cpu objects that match socket-id=y
> > > >    # are mapped to node-id=x
> > > >    -numa cpu,node-id=x,socket-id=y
> > > > 
> > > > Signed-off-by: Igor Mammedov <imammedo@redhat.com>  
> > > 
> > > What's the rationale for adding a new CLI, rather than adding node-id
> > > properties to the appropriate objects with -device, -global or -set as
> > > appropriate?  
> >  '-global' applies to all cpus, while '-device,-set' applies to present
> >  at boot time cpus only. So they do not work for the case of possible but
> >  not present at boot time objects.  
> 
> Ah!  Of course.
> 
> > For ACPI based targets, we need to have
> >  numa mapping at boot time to build ACPI SRAT table.
> >  I don't know if it's important for spapr/fdt,  
> 
> Not in the same way.  For spapr the device tree fragment for the new
> cpu is supplied to the guest at hotplug time rather than having to be
> in the initial device tree.  So for us, node could be supplied with
> device_add.
I've implemented the cpu.node-id check in the same way for all targets.
For spapr it's patch 06/23, which forces cpu.node-id to match
whatever mapping has been provided with -numa cpu[s],
or the implied default /0/ if a mapping for the cpu hasn't been
specified with -numa explicitly.

That way it won't break legacy machines and no compat code is necessary.
I'd leave it up to you, with a patch on top of this, to lift the restriction/make
it more relaxed for spapr if you think it won't break anything.

Although from libvirt's pov, I'd prefer to treat all targets uniformly,
which narrows the choice down to the '-numa' mapping approach that it uses now.
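
To make the rule above concrete, here is a minimal sketch of how wildcard
'-numa cpu' matching and the cpu.node-id check could fit together. It is
not the code from patch 06/23: CpuSlot, NumaCpuSpec, apply_mapping(),
effective_node() and node_id_ok() are hypothetical names made up for this
illustration, and "negative field means wildcard" is an assumption of the
sketch, not QEMU's actual representation.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for one possible-CPU slot (not QEMU's real type). */
typedef struct {
    int socket_id, core_id, thread_id;
    bool has_node;  /* true once an explicit mapping matched this slot */
    int node_id;    /* only meaningful when has_node is true */
} CpuSlot;

/* Hypothetical match spec; a negative id is treated as "not specified". */
typedef struct {
    int node_id;
    int socket_id, core_id, thread_id;
} NumaCpuSpec;

/* Assign spec->node_id to every slot whose specified properties match.
 * Unspecified (negative) properties act as wildcards.
 * Returns the number of slots that were mapped. */
static int apply_mapping(CpuSlot *slots, int n, const NumaCpuSpec *spec)
{
    int matched = 0;

    for (int i = 0; i < n; i++) {
        if (spec->socket_id >= 0 && spec->socket_id != slots[i].socket_id) {
            continue;
        }
        if (spec->core_id >= 0 && spec->core_id != slots[i].core_id) {
            continue;
        }
        if (spec->thread_id >= 0 && spec->thread_id != slots[i].thread_id) {
            continue;
        }
        slots[i].has_node = true;
        slots[i].node_id = spec->node_id;
        matched++;
    }
    return matched;
}

/* Node a slot belongs to: the explicit mapping, else the implied default 0. */
static int effective_node(const CpuSlot *slot)
{
    return slot->has_node ? slot->node_id : 0;
}

/* A CPU created with a given node-id is accepted only if it agrees with
 * the node the slot was mapped to (or with the default 0). */
static bool node_id_ok(const CpuSlot *slot, int requested_node)
{
    return requested_node == effective_node(slot);
}
```

With two sockets of two cores each, a spec equivalent to
'-numa cpu,node-id=1,socket-id=1' would map both cores of socket 1 to
node 1 and leave socket 0 on the default node 0, so a later CPU for
socket 1 asking for node-id=0 would be rejected by node_id_ok().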

> > but it uses current predefined
> >  mapping with -numa node,cpus=x-y and new CLI hides from user internal
> >  cpu_index and allows to use the same properties as we use for -device cpu,...
> >  to define mapping to numa nodes for present/possible cpus.
...

Re: [Qemu-devel] [PATCH for-2.10 22/23] numa: add '-numa cpu, ...' option for property based node mapping

Posted by David Gibson 8 years, 10 months ago

On Wed, Mar 29, 2017 at 02:08:58PM +0200, Igor Mammedov wrote:
> On Wed, 29 Mar 2017 13:27:23 +1100
> David Gibson <david@gibson.dropbear.id.au> wrote:
> 
> > On Tue, Mar 28, 2017 at 01:09:11PM +0200, Igor Mammedov wrote:
> > > On Tue, 28 Mar 2017 16:16:02 +1100
> > > David Gibson <david@gibson.dropbear.id.au> wrote:
> > >   
> > > > On Wed, Mar 22, 2017 at 02:32:47PM +0100, Igor Mammedov wrote:  
> > > > > legacy cpu to node mapping is using cpu index values to map
> > > > > VCPU to node with help of '-numa node,nodeid=node,cpus=x[-y]'
> > > > > option. However cpu index is internal concept and QEMU users
> > > > > have to guess /reimplement qemu's logic/ to map it to
> > > > > a concrete cpu socket/core/thread to make sane CPUs
> > > > > placement across numa nodes.
> > > > > 
> > > > > This patch allows to map cpu objects to numa nodes using
> > > > > the same properties as used for cpus with -device/device_add
> > > > > (socket-id/core-id/thread-id/node-id).
> > > > > 
> > > > > At present valid properties/values to address CPUs could be
> > > > > fetched using hotpluggable-cpus monitor/qmp command, it will
> > > > > require user to start qemu twice when creating domain to fetch
> > > > > possible CPUs for a machine type/-smp layout first and
> > > > > then the second time with numa explicit mapping for actual
> > > > > usage. The first step results could be saved and reused to
> > > > > set/change mapping later as far as machine type/-smp stays
> > > > > the same.
> > > > > 
> > > > > Proposed impl. supports exact and wildcard matching to
> > > > > simplify CLI and allow to set mapping for a specific cpu
> > > > > or group of cpu objects specified by matched properties.
> > > > > 
> > > > > For example:
> > > > > 
> > > > >    # exact mapping x86
> > > > >    -numa cpu,node-id=x,socket-id=y,core-id=z,thread-id=n
> > > > > 
> > > > >    # exact mapping SPAPR
> > > > >    -numa cpu,node-id=x,core-id=y
> > > > > 
> > > > >    # wildcard mapping, all cpu objects that match socket-id=y
> > > > >    # are mapped to node-id=x
> > > > >    -numa cpu,node-id=x,socket-id=y
> > > > > 
> > > > > Signed-off-by: Igor Mammedov <imammedo@redhat.com>  
> > > > 
> > > > What's the rationale for adding a new CLI, rather than adding node-id
> > > > properties to the appropriate objects with -device, -global or -set as
> > > > appropriate?  
> > >  '-global' applies to all cpus, while '-device,-set' applies to present
> > >  at boot time cpus only. So they do not work for the case of possible but
> > >  not present at boot time objects.  
> > 
> > Ah!  Of course.
> > 
> > > For ACPI based targets, we need to have
> > >  numa mapping at boot time to build ACPI SRAT table.
> > >  I don't know if it's important for spapr/fdt,  
> > 
> > Not in the same way.  For spapr the device tree fragment for the new
> > cpu is supplied to the guest at hotplug time rather than having to be
> > in the initial device tree.  So for us, node could be supplied with
> > device_add.
> I've implemented the cpu.node-id check in the same way for all targets.
> For spapr it's patch 06/23, which forces cpu.node-id to match
> whatever mapping has been provided with -numa cpu[s],
> or the implied default /0/ if a mapping for the cpu hasn't been
> specified with -numa explicitly.
> 
> That way it won't break legacy machines and no compat code is necessary.
> I'd leave it up to you, with a patch on top of this, to lift the restriction/make
> it more relaxed for spapr if you think it won't break anything.
> 
> Although from libvirt's pov, I'd prefer to treat all targets uniformly,
> which narrows the choice down to the '-numa' mapping approach that it uses
> now.

Yeah, that makes sense.  If we ever have a compelling reason to allow
node designation at device_add time on Power, we can relax the
restrictions then.  I doubt it will ever happen.

> 
> > > but it uses current predefined
> > >  mapping with -numa node,cpus=x-y and new CLI hides from user internal
> > >  cpu_index and allows to use the same properties as we use for -device cpu,...
> > >  to define mapping to numa nodes for present/possible cpus.
> ...
> 

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson