Legacy CPU-to-node mapping uses CPU index values to map
VCPUs to nodes with the '-numa node,nodeid=node,cpus=x[-y]'
option. However, the CPU index is an internal concept, and QEMU
users have to guess (or reimplement QEMU's logic) to map it to
a concrete CPU socket/core/thread in order to place CPUs
sanely across NUMA nodes.

This patch allows mapping CPU objects to NUMA nodes using
the same properties as are used for CPUs with -device/device_add
(socket-id/core-id/thread-id/node-id).

At present, valid properties/values to address CPUs can be
fetched using the hotpluggable-cpus monitor/QMP command. This
requires the user to start QEMU twice when creating a domain:
first to fetch the possible CPUs for a machine type/-smp layout,
and then a second time with the explicit NUMA mapping for actual
usage. The results of the first step can be saved and reused to
set/change the mapping later, as long as the machine type/-smp
stays the same.

The proposed implementation supports exact and wildcard matching
to simplify the CLI and to allow setting the mapping for a specific
CPU or for a group of CPU objects specified by matched properties.

For example:

# exact mapping x86
-numa cpu,node-id=x,socket-id=y,core-id=z,thread-id=n

# exact mapping SPAPR
-numa cpu,node-id=x,core-id=y

# wildcard mapping, all cpu objects that match socket-id=y
# are mapped to node-id=x
-numa cpu,node-id=x,socket-id=y

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
---
numa.c | 13 +++++++++++++
qapi-schema.json | 7 +++++--
qemu-options.hx | 23 ++++++++++++++++++++++-
3 files changed, 40 insertions(+), 3 deletions(-)
diff --git a/numa.c b/numa.c
index 088fae3..588586b 100644
--- a/numa.c
+++ b/numa.c
@@ -246,6 +246,19 @@ static int parse_numa(void *opaque, QemuOpts *opts, Error **errp)
}
nb_numa_nodes++;
break;
+ case NUMA_OPTIONS_TYPE_CPU:
+ if (!object->u.cpu.has_node_id) {
+ error_setg(&err, "Missing mandatory node-id property");
+ goto end;
+ }
+ if (!numa_info[object->u.cpu.node_id].present) {
+ error_setg(&err, "Invalid node-id=%" PRId64 ", NUMA node must be "
+ "defined with -numa node,nodeid=ID before it's used with "
+ "-numa cpu,node-id=ID", object->u.cpu.node_id);
+ goto end;
+ }
+ machine_set_cpu_numa_node(ms, &object->u.cpu, &err);
+ break;
default:
abort();
}
diff --git a/qapi-schema.json b/qapi-schema.json
index a6b5955..a9a1d5e 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -5673,10 +5673,12 @@
##
# @NumaOptionsType:
#
+# @cpu: property based CPU(s) to node mapping (Since: 2.10)
+#
# Since: 2.1
##
{ 'enum': 'NumaOptionsType',
- 'data': [ 'node' ] }
+ 'data': [ 'node', 'cpu' ] }
##
# @NumaOptions:
@@ -5689,7 +5691,8 @@
'base': { 'type': 'NumaOptionsType' },
'discriminator': 'type',
'data': {
- 'node': 'NumaNodeOptions' }}
+ 'node': 'NumaNodeOptions',
+ 'cpu': 'CpuInstanceProperties' }}
##
# @NumaNodeOptions:
diff --git a/qemu-options.hx b/qemu-options.hx
index 99af8ed..2185c34 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -139,13 +139,16 @@ ETEXI
DEF("numa", HAS_ARG, QEMU_OPTION_numa,
"-numa node[,mem=size][,cpus=firstcpu[-lastcpu]][,nodeid=node]\n"
- "-numa node[,memdev=id][,cpus=firstcpu[-lastcpu]][,nodeid=node]\n", QEMU_ARCH_ALL)
+ "-numa node[,memdev=id][,cpus=firstcpu[-lastcpu]][,nodeid=node]\n"
+ "-numa cpu,node-id=node[,socket-id=x][,core-id=y][,thread-id=z]\n", QEMU_ARCH_ALL)
STEXI
@item -numa node[,mem=@var{size}][,cpus=@var{firstcpu}[-@var{lastcpu}]][,nodeid=@var{node}]
@itemx -numa node[,memdev=@var{id}][,cpus=@var{firstcpu}[-@var{lastcpu}]][,nodeid=@var{node}]
+@itemx -numa cpu,node-id=@var{node}[,socket-id=@var{x}][,core-id=@var{y}][,thread-id=@var{z}]
@findex -numa
Define a NUMA node and assign RAM and VCPUs to it.
+Legacy VCPU assignment uses @samp{cpus} option where
@var{firstcpu} and @var{lastcpu} are CPU indexes. Each
@samp{cpus} option represent a contiguous range of CPU indexes
(or a single VCPU if @var{lastcpu} is omitted). A non-contiguous
@@ -159,6 +162,24 @@ a NUMA node:
-numa node,cpus=0-2,cpus=5
@end example
+@samp{cpu} option is new alternative to @samp{cpus} option
+uses @samp{socket-id|core-id|thread-id} properties to assign
+CPU objects to a @var{node} using topology layout properties of CPU.
+Set of properties is machine specific, and depends on used machine
+type/@samp{smp} options. It could be queried with @samp{hotpluggable-cpus}
+monitor command.
+@samp{node-id} property specifies @var{node} to which CPU object
+will be assigned, it's required for @var{node} to be declared
+with @samp{node} option before it's used with @samp{cpu} option.
+
+For example:
+@example
+-M pc \
+-smp 1,sockets=2,maxcpus=2 \
+-numa node,nodeid=0 -numa node,nodeid=1 \
+-numa cpu,node-id=0,socket-id=0 -numa cpu,node-id=1,socket-id=1
+@end example
+
@samp{mem} assigns a given RAM amount to a node. @samp{memdev}
assigns RAM from a given memory backend device to a node. If
@samp{mem} and @samp{memdev} are omitted in all nodes, RAM is
--
2.7.4
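
Editorial illustration (not part of the patch): the exact and wildcard
matching described in the commit message can be pictured with a small
shell sketch. The CPU list and the match_cpus helper below are
hypothetical and only model the selection rule; the real matching is
done inside QEMU by machine_set_cpu_numa_node(), whose body is not shown
in this patch. The layout (2 sockets, 2 cores, 1 thread) is an assumed
example, and "*" stands for a property omitted from '-numa cpu'.

```shell
#!/bin/sh
# Hypothetical list of possible CPUs, one per line, as
# "socket-id core-id thread-id". Real values would come from the
# hotpluggable-cpus command for the chosen machine type/-smp layout.
possible_cpus='0 0 0
0 1 0
1 0 0
1 1 0'

# match_cpus SOCKET CORE THREAD
# Print every possible CPU selected by a "-numa cpu" style spec.
# Pass "*" for a property that is left out (wildcard).
match_cpus() {
    echo "$possible_cpus" | while read -r s c t; do
        # A CPU is skipped as soon as one given property differs.
        [ "$1" = "*" ] || [ "$1" = "$s" ] || continue
        [ "$2" = "*" ] || [ "$2" = "$c" ] || continue
        [ "$3" = "*" ] || [ "$3" = "$t" ] || continue
        echo "socket-id=$s,core-id=$c,thread-id=$t"
    done
}

# Wildcard: models "-numa cpu,node-id=x,socket-id=1"
match_cpus 1 '*' '*'
# Exact: models "-numa cpu,node-id=x,socket-id=0,core-id=1,thread-id=0"
match_cpus 0 1 0
```

In this model a wildcard spec selects every possible CPU whose given
properties match, while a fully specified spec selects at most one CPU.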
On 03/22/2017 08:32 AM, Igor Mammedov wrote:
> legacy cpu to node mapping is using cpu index values to map
> VCPU to node with help of '-numa node,nodeid=node,cpus=x[-y]'
> option. However cpu index is internal concept and QEMU users
> have to guess /reimplement qemu's logic/ to map it to
> a concrete cpu socket/core/thread to make sane CPUs
> placement across numa nodes.
>
> This patch allows to map cpu objects to numa nodes using
> the same properties as used for cpus with -device/device_add
> (socket-id/core-id/thread-id/node-id).
>
> At present valid properties/values to address CPUs could be
> fetched using hotpluggable-cpus monitor/qmp command, it will
> require user to start qemu twice when creating domain to fetch
> possible CPUs for a machine type/-smp layout first and
> then the second time with numa explicit mapping for actual
> usage. The first step results could be saved and reused to
> set/change mapping later as far as machine type/-smp stays
> the same.
>
> Proposed impl. supports exact and wildcard matching to
> simplify CLI and allow to set mapping for a specific cpu
> or group of cpu objects specified by matched properties.
>
> For example:
>
> # exact mapping x86
> -numa cpu,node-id=x,socket-id=y,core-id=z,thread-id=n
>
> # exact mapping SPAPR
> -numa cpu,node-id=x,core-id=y
>
> # wildcard mapping, all cpu objects that match socket-id=y
> # are mapped to node-id=x
> -numa cpu,node-id=x,socket-id=y
>
> Signed-off-by: Igor Mammedov <imammedo@redhat.com>
> ---
> numa.c | 13 +++++++++++++
> qapi-schema.json | 7 +++++--
> qemu-options.hx | 23 ++++++++++++++++++++++-
> 3 files changed, 40 insertions(+), 3 deletions(-)
>
>
> +@samp{cpu} option is new alternative to @samp{cpus} option
s/is/is a/
> +uses @samp{socket-id|core-id|thread-id} properties to assign
s/uses/which uses/
> +CPU objects to a @var{node} using topology layout properties of CPU.
> +Set of properties is machine specific, and depends on used machine
s/Set/The set/
> +type/@samp{smp} options. It could be queried with @samp{hotpluggable-cpus}
> +monitor command.
> +@samp{node-id} property specifies @var{node} to which CPU object
> +will be assigned, it's required for @var{node} to be declared
> +with @samp{node} option before it's used with @samp{cpu} option.
> +
> +For example:
> +@example
> +-M pc \
> +-smp 1,sockets=2,maxcpus=2 \
> +-numa node,nodeid=0 -numa node,nodeid=1 \
> +-numa cpu,node-id=0,socket-id=0 -numa cpu,node-id=1,socket-id=1
> +@end example
> +
> @samp{mem} assigns a given RAM amount to a node. @samp{memdev}
> assigns RAM from a given memory backend device to a node. If
> @samp{mem} and @samp{memdev} are omitted in all nodes, RAM is
>
--
Eric Blake eblake redhat com +1-919-301-3266
Libvirt virtualization library http://libvirt.org
On Thu, 23 Mar 2017 08:23:32 -0500
Eric Blake <eblake@redhat.com> wrote:
...
> >
> > +@samp{cpu} option is new alternative to @samp{cpus} option
>
> s/is/is a/
>
> > +uses @samp{socket-id|core-id|thread-id} properties to assign
>
> s/uses/which uses/
>
> > +CPU objects to a @var{node} using topology layout properties of CPU.
> > +Set of properties is machine specific, and depends on used machine
>
> s/Set/The set/
>
Fixed in v2 branch
On Wed, Mar 22, 2017 at 02:32:47PM +0100, Igor Mammedov wrote:
> legacy cpu to node mapping is using cpu index values to map
> VCPU to node with help of '-numa node,nodeid=node,cpus=x[-y]'
> option. However cpu index is internal concept and QEMU users
> have to guess /reimplement qemu's logic/ to map it to
> a concrete cpu socket/core/thread to make sane CPUs
> placement across numa nodes.
>
> This patch allows to map cpu objects to numa nodes using
> the same properties as used for cpus with -device/device_add
> (socket-id/core-id/thread-id/node-id).
>
> At present valid properties/values to address CPUs could be
> fetched using hotpluggable-cpus monitor/qmp command, it will
> require user to start qemu twice when creating domain to fetch
> possible CPUs for a machine type/-smp layout first and
> then the second time with numa explicit mapping for actual
> usage. The first step results could be saved and reused to
> set/change mapping later as far as machine type/-smp stays
> the same.
>
> Proposed impl. supports exact and wildcard matching to
> simplify CLI and allow to set mapping for a specific cpu
> or group of cpu objects specified by matched properties.
>
> For example:
>
> # exact mapping x86
> -numa cpu,node-id=x,socket-id=y,core-id=z,thread-id=n
>
> # exact mapping SPAPR
> -numa cpu,node-id=x,core-id=y
>
> # wildcard mapping, all cpu objects that match socket-id=y
> # are mapped to node-id=x
> -numa cpu,node-id=x,socket-id=y
>
> Signed-off-by: Igor Mammedov <imammedo@redhat.com>
What's the rationale for adding a new CLI, rather than adding node-id
properties to the appropriate objects with -device, -global or -set as
appropriate?
> ---
> numa.c | 13 +++++++++++++
> qapi-schema.json | 7 +++++--
> qemu-options.hx | 23 ++++++++++++++++++++++-
> 3 files changed, 40 insertions(+), 3 deletions(-)
>
> diff --git a/numa.c b/numa.c
> index 088fae3..588586b 100644
> --- a/numa.c
> +++ b/numa.c
> @@ -246,6 +246,19 @@ static int parse_numa(void *opaque, QemuOpts *opts, Error **errp)
> }
> nb_numa_nodes++;
> break;
> + case NUMA_OPTIONS_TYPE_CPU:
> + if (!object->u.cpu.has_node_id) {
> + error_setg(&err, "Missing mandatory node-id property");
> + goto end;
> + }
> + if (!numa_info[object->u.cpu.node_id].present) {
> + error_setg(&err, "Invalid node-id=%" PRId64 ", NUMA node must be "
> + "defined with -numa node,nodeid=ID before it's used with "
> + "-numa cpu,node-id=ID", object->u.cpu.node_id);
> + goto end;
> + }
> + machine_set_cpu_numa_node(ms, &object->u.cpu, &err);
> + break;
> default:
> abort();
> }
> diff --git a/qapi-schema.json b/qapi-schema.json
> index a6b5955..a9a1d5e 100644
> --- a/qapi-schema.json
> +++ b/qapi-schema.json
> @@ -5673,10 +5673,12 @@
> ##
> # @NumaOptionsType:
> #
> +# @cpu: property based CPU(s) to node mapping (Since: 2.10)
> +#
> # Since: 2.1
> ##
> { 'enum': 'NumaOptionsType',
> - 'data': [ 'node' ] }
> + 'data': [ 'node', 'cpu' ] }
>
> ##
> # @NumaOptions:
> @@ -5689,7 +5691,8 @@
> 'base': { 'type': 'NumaOptionsType' },
> 'discriminator': 'type',
> 'data': {
> - 'node': 'NumaNodeOptions' }}
> + 'node': 'NumaNodeOptions',
> + 'cpu': 'CpuInstanceProperties' }}
>
> ##
> # @NumaNodeOptions:
> diff --git a/qemu-options.hx b/qemu-options.hx
> index 99af8ed..2185c34 100644
> --- a/qemu-options.hx
> +++ b/qemu-options.hx
> @@ -139,13 +139,16 @@ ETEXI
>
> DEF("numa", HAS_ARG, QEMU_OPTION_numa,
> "-numa node[,mem=size][,cpus=firstcpu[-lastcpu]][,nodeid=node]\n"
> - "-numa node[,memdev=id][,cpus=firstcpu[-lastcpu]][,nodeid=node]\n", QEMU_ARCH_ALL)
> + "-numa node[,memdev=id][,cpus=firstcpu[-lastcpu]][,nodeid=node]\n"
> + "-numa cpu,node-id=node[,socket-id=x][,core-id=y][,thread-id=z]\n", QEMU_ARCH_ALL)
> STEXI
> @item -numa node[,mem=@var{size}][,cpus=@var{firstcpu}[-@var{lastcpu}]][,nodeid=@var{node}]
> @itemx -numa node[,memdev=@var{id}][,cpus=@var{firstcpu}[-@var{lastcpu}]][,nodeid=@var{node}]
> +@itemx -numa cpu,node-id=@var{node}[,socket-id=@var{x}][,core-id=@var{y}][,thread-id=@var{z}]
> @findex -numa
> Define a NUMA node and assign RAM and VCPUs to it.
>
> +Legacy VCPU assignment uses @samp{cpus} option where
> @var{firstcpu} and @var{lastcpu} are CPU indexes. Each
> @samp{cpus} option represent a contiguous range of CPU indexes
> (or a single VCPU if @var{lastcpu} is omitted). A non-contiguous
> @@ -159,6 +162,24 @@ a NUMA node:
> -numa node,cpus=0-2,cpus=5
> @end example
>
> +@samp{cpu} option is new alternative to @samp{cpus} option
> +uses @samp{socket-id|core-id|thread-id} properties to assign
> +CPU objects to a @var{node} using topology layout properties of CPU.
> +Set of properties is machine specific, and depends on used machine
> +type/@samp{smp} options. It could be queried with @samp{hotpluggable-cpus}
> +monitor command.
> +@samp{node-id} property specifies @var{node} to which CPU object
> +will be assigned, it's required for @var{node} to be declared
> +with @samp{node} option before it's used with @samp{cpu} option.
> +
> +For example:
> +@example
> +-M pc \
> +-smp 1,sockets=2,maxcpus=2 \
> +-numa node,nodeid=0 -numa node,nodeid=1 \
> +-numa cpu,node-id=0,socket-id=0 -numa cpu,node-id=1,socket-id=1
> +@end example
> +
> @samp{mem} assigns a given RAM amount to a node. @samp{memdev}
> assigns RAM from a given memory backend device to a node. If
> @samp{mem} and @samp{memdev} are omitted in all nodes, RAM is
--
David Gibson | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson
On Tue, 28 Mar 2017 16:16:02 +1100
David Gibson <david@gibson.dropbear.id.au> wrote:
> On Wed, Mar 22, 2017 at 02:32:47PM +0100, Igor Mammedov wrote:
> > legacy cpu to node mapping is using cpu index values to map
> > VCPU to node with help of '-numa node,nodeid=node,cpus=x[-y]'
> > option. However cpu index is internal concept and QEMU users
> > have to guess /reimplement qemu's logic/ to map it to
> > a concrete cpu socket/core/thread to make sane CPUs
> > placement across numa nodes.
> >
> > This patch allows to map cpu objects to numa nodes using
> > the same properties as used for cpus with -device/device_add
> > (socket-id/core-id/thread-id/node-id).
> >
> > At present valid properties/values to address CPUs could be
> > fetched using hotpluggable-cpus monitor/qmp command, it will
> > require user to start qemu twice when creating domain to fetch
> > possible CPUs for a machine type/-smp layout first and
> > then the second time with numa explicit mapping for actual
> > usage. The first step results could be saved and reused to
> > set/change mapping later as far as machine type/-smp stays
> > the same.
> >
> > Proposed impl. supports exact and wildcard matching to
> > simplify CLI and allow to set mapping for a specific cpu
> > or group of cpu objects specified by matched properties.
> >
> > For example:
> >
> > # exact mapping x86
> > -numa cpu,node-id=x,socket-id=y,core-id=z,thread-id=n
> >
> > # exact mapping SPAPR
> > -numa cpu,node-id=x,core-id=y
> >
> > # wildcard mapping, all cpu objects that match socket-id=y
> > # are mapped to node-id=x
> > -numa cpu,node-id=x,socket-id=y
> >
> > Signed-off-by: Igor Mammedov <imammedo@redhat.com>
>
> What's the rationale for adding a new CLI, rather than adding node-id
> properties to the appropriate objects with -device, -global or -set as
> appropriate?
'-global' applies to all CPUs, while '-device'/'-set' apply only to CPUs
present at boot time. So they do not work for objects that are possible
but not present at boot time. For ACPI-based targets, we need to have the
NUMA mapping at boot time to build the ACPI SRAT table.
I don't know if it's important for spapr/fdt, but it uses the current
predefined mapping with -numa node,cpus=x-y, and the new CLI hides the
internal cpu_index from the user and allows using the same properties as
we use for -device cpu,... to define the mapping to NUMA nodes for
present/possible CPUs.
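
Editorial illustration (not from the thread): a minimal sketch of how a
management script might emit a mapping that covers possible as well as
present CPUs. The helper name numa_args_per_socket is hypothetical, and
the sketch assumes one NUMA node per socket with socket IDs 0..N-1,
which should be checked against the hotpluggable-cpus output for the
machine type/-smp layout in use. Only the '-numa node,nodeid=' and
'-numa cpu,node-id=,socket-id=' forms shown in the patch are used.

```shell
#!/bin/sh
# numa_args_per_socket NSOCKETS
# Hypothetical helper: print -numa arguments that declare one node per
# socket and then map each socket to its node. Nodes are emitted first
# because the patch rejects "-numa cpu" for a node not yet declared.
numa_args_per_socket() {
    nsockets=$1
    out=""
    s=0
    while [ "$s" -lt "$nsockets" ]; do
        out="$out -numa node,nodeid=$s"
        s=$((s + 1))
    done
    s=0
    while [ "$s" -lt "$nsockets" ]; do
        out="$out -numa cpu,node-id=$s,socket-id=$s"
        s=$((s + 1))
    done
    # Drop the single leading space added by the first append.
    echo "${out# }"
}

# With "-smp 1,sockets=2,maxcpus=2" only one CPU is present at boot,
# yet the second socket still gets a mapping.
numa_args_per_socket 2
```

The point of the sketch is that the mapping is keyed on topology
properties, so it can be written down before any CPU is plugged into
the second socket.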
>
> > ---
> > numa.c | 13 +++++++++++++
> > qapi-schema.json | 7 +++++--
> > qemu-options.hx | 23 ++++++++++++++++++++++-
> > 3 files changed, 40 insertions(+), 3 deletions(-)
> >
> > diff --git a/numa.c b/numa.c
> > index 088fae3..588586b 100644
> > --- a/numa.c
> > +++ b/numa.c
> > @@ -246,6 +246,19 @@ static int parse_numa(void *opaque, QemuOpts *opts, Error **errp)
> > }
> > nb_numa_nodes++;
> > break;
> > + case NUMA_OPTIONS_TYPE_CPU:
> > + if (!object->u.cpu.has_node_id) {
> > + error_setg(&err, "Missing mandatory node-id property");
> > + goto end;
> > + }
> > + if (!numa_info[object->u.cpu.node_id].present) {
> > + error_setg(&err, "Invalid node-id=%" PRId64 ", NUMA node must be "
> > + "defined with -numa node,nodeid=ID before it's used with "
> > + "-numa cpu,node-id=ID", object->u.cpu.node_id);
> > + goto end;
> > + }
> > + machine_set_cpu_numa_node(ms, &object->u.cpu, &err);
> > + break;
> > default:
> > abort();
> > }
> > diff --git a/qapi-schema.json b/qapi-schema.json
> > index a6b5955..a9a1d5e 100644
> > --- a/qapi-schema.json
> > +++ b/qapi-schema.json
> > @@ -5673,10 +5673,12 @@
> > ##
> > # @NumaOptionsType:
> > #
> > +# @cpu: property based CPU(s) to node mapping (Since: 2.10)
> > +#
> > # Since: 2.1
> > ##
> > { 'enum': 'NumaOptionsType',
> > - 'data': [ 'node' ] }
> > + 'data': [ 'node', 'cpu' ] }
> >
> > ##
> > # @NumaOptions:
> > @@ -5689,7 +5691,8 @@
> > 'base': { 'type': 'NumaOptionsType' },
> > 'discriminator': 'type',
> > 'data': {
> > - 'node': 'NumaNodeOptions' }}
> > + 'node': 'NumaNodeOptions',
> > + 'cpu': 'CpuInstanceProperties' }}
> >
> > ##
> > # @NumaNodeOptions:
> > diff --git a/qemu-options.hx b/qemu-options.hx
> > index 99af8ed..2185c34 100644
> > --- a/qemu-options.hx
> > +++ b/qemu-options.hx
> > @@ -139,13 +139,16 @@ ETEXI
> >
> > DEF("numa", HAS_ARG, QEMU_OPTION_numa,
> > "-numa node[,mem=size][,cpus=firstcpu[-lastcpu]][,nodeid=node]\n"
> > - "-numa node[,memdev=id][,cpus=firstcpu[-lastcpu]][,nodeid=node]\n", QEMU_ARCH_ALL)
> > + "-numa node[,memdev=id][,cpus=firstcpu[-lastcpu]][,nodeid=node]\n"
> > + "-numa cpu,node-id=node[,socket-id=x][,core-id=y][,thread-id=z]\n", QEMU_ARCH_ALL)
> > STEXI
> > @item -numa node[,mem=@var{size}][,cpus=@var{firstcpu}[-@var{lastcpu}]][,nodeid=@var{node}]
> > @itemx -numa node[,memdev=@var{id}][,cpus=@var{firstcpu}[-@var{lastcpu}]][,nodeid=@var{node}]
> > +@itemx -numa cpu,node-id=@var{node}[,socket-id=@var{x}][,core-id=@var{y}][,thread-id=@var{z}]
> > @findex -numa
> > Define a NUMA node and assign RAM and VCPUs to it.
> >
> > +Legacy VCPU assignment uses @samp{cpus} option where
> > @var{firstcpu} and @var{lastcpu} are CPU indexes. Each
> > @samp{cpus} option represent a contiguous range of CPU indexes
> > (or a single VCPU if @var{lastcpu} is omitted). A non-contiguous
> > @@ -159,6 +162,24 @@ a NUMA node:
> > -numa node,cpus=0-2,cpus=5
> > @end example
> >
> > +@samp{cpu} option is new alternative to @samp{cpus} option
> > +uses @samp{socket-id|core-id|thread-id} properties to assign
> > +CPU objects to a @var{node} using topology layout properties of CPU.
> > +Set of properties is machine specific, and depends on used machine
> > +type/@samp{smp} options. It could be queried with @samp{hotpluggable-cpus}
> > +monitor command.
> > +@samp{node-id} property specifies @var{node} to which CPU object
> > +will be assigned, it's required for @var{node} to be declared
> > +with @samp{node} option before it's used with @samp{cpu} option.
> > +
> > +For example:
> > +@example
> > +-M pc \
> > +-smp 1,sockets=2,maxcpus=2 \
> > +-numa node,nodeid=0 -numa node,nodeid=1 \
> > +-numa cpu,node-id=0,socket-id=0 -numa cpu,node-id=1,socket-id=1
> > +@end example
> > +
> > @samp{mem} assigns a given RAM amount to a node. @samp{memdev}
> > assigns RAM from a given memory backend device to a node. If
> > @samp{mem} and @samp{memdev} are omitted in all nodes, RAM is
>
On Tue, Mar 28, 2017 at 01:09:11PM +0200, Igor Mammedov wrote:
> On Tue, 28 Mar 2017 16:16:02 +1100
> David Gibson <david@gibson.dropbear.id.au> wrote:
>
> > On Wed, Mar 22, 2017 at 02:32:47PM +0100, Igor Mammedov wrote:
> > > legacy cpu to node mapping is using cpu index values to map
> > > VCPU to node with help of '-numa node,nodeid=node,cpus=x[-y]'
> > > option. However cpu index is internal concept and QEMU users
> > > have to guess /reimplement qemu's logic/ to map it to
> > > a concrete cpu socket/core/thread to make sane CPUs
> > > placement across numa nodes.
> > >
> > > This patch allows to map cpu objects to numa nodes using
> > > the same properties as used for cpus with -device/device_add
> > > (socket-id/core-id/thread-id/node-id).
> > >
> > > At present valid properties/values to address CPUs could be
> > > fetched using hotpluggable-cpus monitor/qmp command, it will
> > > require user to start qemu twice when creating domain to fetch
> > > possible CPUs for a machine type/-smp layout first and
> > > then the second time with numa explicit mapping for actual
> > > usage. The first step results could be saved and reused to
> > > set/change mapping later as far as machine type/-smp stays
> > > the same.
> > >
> > > Proposed impl. supports exact and wildcard matching to
> > > simplify CLI and allow to set mapping for a specific cpu
> > > or group of cpu objects specified by matched properties.
> > >
> > > For example:
> > >
> > > # exact mapping x86
> > > -numa cpu,node-id=x,socket-id=y,core-id=z,thread-id=n
> > >
> > > # exact mapping SPAPR
> > > -numa cpu,node-id=x,core-id=y
> > >
> > > # wildcard mapping, all cpu objects that match socket-id=y
> > > # are mapped to node-id=x
> > > -numa cpu,node-id=x,socket-id=y
> > >
> > > Signed-off-by: Igor Mammedov <imammedo@redhat.com>
> >
> > What's the rationale for adding a new CLI, rather than adding node-id
> > properties to the appropriate objects with -device, -global or -set as
> > appropriate?
> '-global' applies to all cpus, while '-device,-set' applies to present
> at boot time cpus only. So they do not work for the case of possible but
> not present at boot time objects.
Ah! Of course.
> For ACPI based targets, we need to have
> numa mapping at boot time to build ACPI SRAT table.
> I don't know if it's important for spapr/fdt,
Not in the same way. For spapr the device tree fragment for the new
cpu is supplied to the guest at hotplug time rather than having to be
in the initial device tree. So for us, node could be supplied with
device_add.
> but it uses current predefined
> mapping with -numa node,cpus=x-y and new CLI hides from user internal
> cpu_index and allows to use the same properties as we use for -device cpu,...
> to define mapping to numa nodes for present/possible cpus.
>
> >
> > > ---
> > > numa.c | 13 +++++++++++++
> > > qapi-schema.json | 7 +++++--
> > > qemu-options.hx | 23 ++++++++++++++++++++++-
> > > 3 files changed, 40 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/numa.c b/numa.c
> > > index 088fae3..588586b 100644
> > > --- a/numa.c
> > > +++ b/numa.c
> > > @@ -246,6 +246,19 @@ static int parse_numa(void *opaque, QemuOpts *opts, Error **errp)
> > > }
> > > nb_numa_nodes++;
> > > break;
> > > + case NUMA_OPTIONS_TYPE_CPU:
> > > + if (!object->u.cpu.has_node_id) {
> > > + error_setg(&err, "Missing mandatory node-id property");
> > > + goto end;
> > > + }
> > > + if (!numa_info[object->u.cpu.node_id].present) {
> > > + error_setg(&err, "Invalid node-id=%" PRId64 ", NUMA node must be "
> > > + "defined with -numa node,nodeid=ID before it's used with "
> > > + "-numa cpu,node-id=ID", object->u.cpu.node_id);
> > > + goto end;
> > > + }
> > > + machine_set_cpu_numa_node(ms, &object->u.cpu, &err);
> > > + break;
> > > default:
> > > abort();
> > > }
> > > diff --git a/qapi-schema.json b/qapi-schema.json
> > > index a6b5955..a9a1d5e 100644
> > > --- a/qapi-schema.json
> > > +++ b/qapi-schema.json
> > > @@ -5673,10 +5673,12 @@
> > > ##
> > > # @NumaOptionsType:
> > > #
> > > +# @cpu: property based CPU(s) to node mapping (Since: 2.10)
> > > +#
> > > # Since: 2.1
> > > ##
> > > { 'enum': 'NumaOptionsType',
> > > - 'data': [ 'node' ] }
> > > + 'data': [ 'node', 'cpu' ] }
> > >
> > > ##
> > > # @NumaOptions:
> > > @@ -5689,7 +5691,8 @@
> > > 'base': { 'type': 'NumaOptionsType' },
> > > 'discriminator': 'type',
> > > 'data': {
> > > - 'node': 'NumaNodeOptions' }}
> > > + 'node': 'NumaNodeOptions',
> > > + 'cpu': 'CpuInstanceProperties' }}
> > >
> > > ##
> > > # @NumaNodeOptions:
> > > diff --git a/qemu-options.hx b/qemu-options.hx
> > > index 99af8ed..2185c34 100644
> > > --- a/qemu-options.hx
> > > +++ b/qemu-options.hx
> > > @@ -139,13 +139,16 @@ ETEXI
> > >
> > > DEF("numa", HAS_ARG, QEMU_OPTION_numa,
> > > "-numa node[,mem=size][,cpus=firstcpu[-lastcpu]][,nodeid=node]\n"
> > > - "-numa node[,memdev=id][,cpus=firstcpu[-lastcpu]][,nodeid=node]\n", QEMU_ARCH_ALL)
> > > + "-numa node[,memdev=id][,cpus=firstcpu[-lastcpu]][,nodeid=node]\n"
> > > + "-numa cpu,node-id=node[,socket-id=x][,core-id=y][,thread-id=z]\n", QEMU_ARCH_ALL)
> > > STEXI
> > > @item -numa node[,mem=@var{size}][,cpus=@var{firstcpu}[-@var{lastcpu}]][,nodeid=@var{node}]
> > > @itemx -numa node[,memdev=@var{id}][,cpus=@var{firstcpu}[-@var{lastcpu}]][,nodeid=@var{node}]
> > > +@itemx -numa cpu,node-id=@var{node}[,socket-id=@var{x}][,core-id=@var{y}][,thread-id=@var{z}]
> > > @findex -numa
> > > Define a NUMA node and assign RAM and VCPUs to it.
> > >
> > > +Legacy VCPU assignment uses @samp{cpus} option where
> > > @var{firstcpu} and @var{lastcpu} are CPU indexes. Each
> > > @samp{cpus} option represent a contiguous range of CPU indexes
> > > (or a single VCPU if @var{lastcpu} is omitted). A non-contiguous
> > > @@ -159,6 +162,24 @@ a NUMA node:
> > > -numa node,cpus=0-2,cpus=5
> > > @end example
> > >
> > > +@samp{cpu} option is new alternative to @samp{cpus} option
> > > +uses @samp{socket-id|core-id|thread-id} properties to assign
> > > +CPU objects to a @var{node} using topology layout properties of CPU.
> > > +Set of properties is machine specific, and depends on used machine
> > > +type/@samp{smp} options. It could be queried with @samp{hotpluggable-cpus}
> > > +monitor command.
> > > +@samp{node-id} property specifies @var{node} to which CPU object
> > > +will be assigned, it's required for @var{node} to be declared
> > > +with @samp{node} option before it's used with @samp{cpu} option.
> > > +
> > > +For example:
> > > +@example
> > > +-M pc \
> > > +-smp 1,sockets=2,maxcpus=2 \
> > > +-numa node,nodeid=0 -numa node,nodeid=1 \
> > > +-numa cpu,node-id=0,socket-id=0 -numa cpu,node-id=1,socket-id=1
> > > +@end example
> > > +
> > > @samp{mem} assigns a given RAM amount to a node. @samp{memdev}
> > > assigns RAM from a given memory backend device to a node. If
> > > @samp{mem} and @samp{memdev} are omitted in all nodes, RAM is
> >
>
--
David Gibson | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson
On Wed, 29 Mar 2017 13:27:23 +1100
David Gibson <david@gibson.dropbear.id.au> wrote:

> On Tue, Mar 28, 2017 at 01:09:11PM +0200, Igor Mammedov wrote:
> > On Tue, 28 Mar 2017 16:16:02 +1100
> > David Gibson <david@gibson.dropbear.id.au> wrote:
> >
> > > On Wed, Mar 22, 2017 at 02:32:47PM +0100, Igor Mammedov wrote:
> > > > legacy cpu to node mapping is using cpu index values to map
> > > > VCPU to node with help of '-numa node,nodeid=node,cpus=x[-y]'
> > > > option. However cpu index is internal concept and QEMU users
> > > > have to guess /reimplement qemu's logic/ to map it to
> > > > a concrete cpu socket/core/thread to make sane CPUs
> > > > placement across numa nodes.
> > > >
> > > > This patch allows to map cpu objects to numa nodes using
> > > > the same properties as used for cpus with -device/device_add
> > > > (socket-id/core-id/thread-id/node-id).
> > > >
> > > > At present valid properties/values to address CPUs could be
> > > > fetched using hotpluggable-cpus monitor/qmp command, it will
> > > > require user to start qemu twice when creating domain to fetch
> > > > possible CPUs for a machine type/-smp layout first and
> > > > then the second time with numa explicit mapping for actual
> > > > usage. The first step results could be saved and reused to
> > > > set/change mapping later as far as machine type/-smp stays
> > > > the same.
> > > >
> > > > Proposed impl. supports exact and wildcard matching to
> > > > simplify CLI and allow to set mapping for a specific cpu
> > > > or group of cpu objects specified by matched properties.
> > > >
> > > > For example:
> > > >
> > > > # exact mapping x86
> > > > -numa cpu,node-id=x,socket-id=y,core-id=z,thread-id=n
> > > >
> > > > # exact mapping SPAPR
> > > > -numa cpu,node-id=x,core-id=y
> > > >
> > > > # wildcard mapping, all cpu objects that match socket-id=y
> > > > # are mapped to node-id=x
> > > > -numa cpu,node-id=x,socket-id=y
> > > >
> > > > Signed-off-by: Igor Mammedov <imammedo@redhat.com>
> > >
> > > What's the rationale for adding a new CLI, rather than adding node-id
> > > properties to the appropriate objects with -device, -global or -set as
> > > appropriate?
> > '-global' applies to all cpus, while '-device,-set' applies to present
> > at boot time cpus only. So they do not work for the case of possible but
> > not present at boot time objects.
>
> Ah! Of course.
>
> > For ACPI based targets, we need to have
> > numa mapping at boot time to build ACPI SRAT table.
> > I don't know if it's important for spapr/fdt,
>
> Not in the same way. For spapr the device tree fragment for the new
> cpu is supplied to the guest at hotplug time rather than having to be
> in the initial device tree. So for us, node could be supplied with
> device_add.
I've implemented cpu.node-id check in the same way for all targets,
for spapr it's patch 06/23 which forces cpu.node-id to match
whatever mapping has been provided with -numa cpu[s]
OR
with implied default /0/ if mapping for cpu hasn't been specified
with -numa explicitly.

That way it won't break legacy machines and no compat code is necessary,
I'd leave it up to you with patch on top of this to lift restriction/make
it more relaxed for spapr if you think it won't break anything.

Although from libvirt pov, I'd prefer to treat all targets uniformly,
which narrows choice down to '-numa' mapping approach that it uses
now.

> > but it uses current predefined
> > mapping with -numa node,cpus=x-y and new CLI hides from user internal
> > cpu_index and allows to use the same properties as we use for -device cpu,...
> > to define mapping to numa nodes for present/possible cpus.
...
On Wed, Mar 29, 2017 at 02:08:58PM +0200, Igor Mammedov wrote:
> On Wed, 29 Mar 2017 13:27:23 +1100
> David Gibson <david@gibson.dropbear.id.au> wrote:
>
> > On Tue, Mar 28, 2017 at 01:09:11PM +0200, Igor Mammedov wrote:
> > > On Tue, 28 Mar 2017 16:16:02 +1100
> > > David Gibson <david@gibson.dropbear.id.au> wrote:
> > >
> > > > On Wed, Mar 22, 2017 at 02:32:47PM +0100, Igor Mammedov wrote:
> > > > > legacy cpu to node mapping is using cpu index values to map
> > > > > VCPU to node with help of '-numa node,nodeid=node,cpus=x[-y]'
> > > > > option. However cpu index is internal concept and QEMU users
> > > > > have to guess /reimplement qemu's logic/ to map it to
> > > > > a concrete cpu socket/core/thread to make sane CPUs
> > > > > placement across numa nodes.
> > > > >
> > > > > This patch allows to map cpu objects to numa nodes using
> > > > > the same properties as used for cpus with -device/device_add
> > > > > (socket-id/core-id/thread-id/node-id).
> > > > >
> > > > > At present valid properties/values to address CPUs could be
> > > > > fetched using hotpluggable-cpus monitor/qmp command, it will
> > > > > require user to start qemu twice when creating domain to fetch
> > > > > possible CPUs for a machine type/-smp layout first and
> > > > > then the second time with numa explicit mapping for actual
> > > > > usage. The first step results could be saved and reused to
> > > > > set/change mapping later as far as machine type/-smp stays
> > > > > the same.
> > > > >
> > > > > Proposed impl. supports exact and wildcard matching to
> > > > > simplify CLI and allow to set mapping for a specific cpu
> > > > > or group of cpu objects specified by matched properties.
> > > > >
> > > > > For example:
> > > > >
> > > > > # exact mapping x86
> > > > > -numa cpu,node-id=x,socket-id=y,core-id=z,thread-id=n
> > > > >
> > > > > # exact mapping SPAPR
> > > > > -numa cpu,node-id=x,core-id=y
> > > > >
> > > > > # wildcard mapping, all cpu objects that match socket-id=y
> > > > > # are mapped to node-id=x
> > > > > -numa cpu,node-id=x,socket-id=y
> > > > >
> > > > > Signed-off-by: Igor Mammedov <imammedo@redhat.com>
> > > >
> > > > What's the rationale for adding a new CLI, rather than adding node-id
> > > > properties to the appropriate objects with -device, -global or -set as
> > > > appropriate?
> > > '-global' applies to all cpus, while '-device,-set' applies to present
> > > at boot time cpus only. So they do not work for the case of possible but
> > > not present at boot time objects.
> >
> > Ah! Of course.
> >
> > > For ACPI based targets, we need to have
> > > numa mapping at boot time to build ACPI SRAT table.
> > > I don't know if it's important for spapr/fdt,
> >
> > Not in the same way. For spapr the device tree fragment for the new
> > cpu is supplied to the guest at hotplug time rather than having to be
> > in the initial device tree. So for us, node could be supplied with
> > device_add.
> I've implemented cpu.node-id check in the same way for all targets,
> for spapr it's patch 06/23 which forces cpu.node-id to match
> whatever mapping has been provided with -numa cpu[s]
> OR
> with implied default /0/ if mapping for cpu hasn't been specified
> with -numa explicitly.
>
> That way it won't break legacy machines and no compat code is necessary,
> I'd leave it up to you with patch on top of this to lift restriction/make
> it more relaxed for spapr if you think it won't break anything.
>
> Although from libvirt pov, I'd prefer to treat all targets uniformly,
> which narrows choice down to '-numa' mapping approach that it uses
> now.

Yeah, that makes sense. If we ever have a compelling reason to allow
node designation at device_add time on Power, we can relax the
restrictions then. I doubt it will ever happen.

> > > but it uses current predefined
> > > mapping with -numa node,cpus=x-y and new CLI hides from user internal
> > > cpu_index and allows to use the same properties as we use for -device cpu,...
> > > to define mapping to numa nodes for present/possible cpus.
> ...
>
--
David Gibson | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson