[Qemu-devel] [PATCH v3] numa: improve cpu hotplug error message with a wrong node-id

Laurent Vivier posted 1 patch 4 years, 11 months ago
Test FreeBSD passed
Test docker-clang@ubuntu passed
Test s390x passed
Test docker-mingw@fedora passed
Test checkpatch passed
Test asan passed
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/qemu tags/patchew/20190524103521.13847-1-lvivier@redhat.com
Maintainers: Eduardo Habkost <ehabkost@redhat.com>
There is a newer version of this series
numa.c | 25 ++++++++++++++++++++++++-
1 file changed, 24 insertions(+), 1 deletion(-)
[Qemu-devel] [PATCH v3] numa: improve cpu hotplug error message with a wrong node-id
Posted by Laurent Vivier 4 years, 11 months ago
On pseries, core-ids are strongly binded to a node-id by the command
line option. If an user tries to add a CPU to the wrong node, he has
an error but it is not really helpful:

  qemu-system-ppc64 ... -smp 1,maxcpus=64,cores=1,threads=1,sockets=1 \
                        -numa node,nodeid=0 -numa node,nodeid=1 ...

  (qemu) device_add power9_v2.0-spapr-cpu-core,core-id=30,node-id=1
  Error: node-id=1 must match numa node specified with -numa option

This patch improves this error message by giving to the user the good
topology information (node-id, socket-id and thread-id if they are
available) to use with the core-id he's providing:

  Error: node-id=1 must match numa node specified with -numa option 'node-id 0'

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
---

Notes:
    v3: only add the topology to the existing message
        As suggested by Igor replace
          Error: core-id 30 can only be plugged into node-id 0
        by
          Error: node-id=1 must match numa node specified with -numa option 'node-id 0'
    
    v2: display full topology in the error message

 numa.c | 25 ++++++++++++++++++++++++-
 1 file changed, 24 insertions(+), 1 deletion(-)

diff --git a/numa.c b/numa.c
index 3875e1efda3a..7882ec294be4 100644
--- a/numa.c
+++ b/numa.c
@@ -458,6 +458,27 @@ void qmp_set_numa_node(NumaOptions *cmd, Error **errp)
     set_numa_options(MACHINE(qdev_get_machine()), cmd, errp);
 }
 
+static char *cpu_topology_to_string(const CPUArchId *cpu)
+{
+    GString *s = g_string_new(NULL);
+    if (cpu->props.has_socket_id) {
+        g_string_append_printf(s, "socket-id %"PRId64, cpu->props.socket_id);
+    }
+    if (cpu->props.has_node_id) {
+        if (s->len) {
+            g_string_append_printf(s, ", ");
+        }
+        g_string_append_printf(s, "node-id %"PRId64, cpu->props.node_id);
+    }
+    if (cpu->props.has_thread_id) {
+        if (s->len) {
+            g_string_append_printf(s, ", ");
+        }
+        g_string_append_printf(s, "thread-id %"PRId64, cpu->props.thread_id);
+    }
+    return g_string_free(s, false);
+}
+
 void numa_cpu_pre_plug(const CPUArchId *slot, DeviceState *dev, Error **errp)
 {
     int node_id = object_property_get_int(OBJECT(dev), "node-id", &error_abort);
@@ -470,8 +491,10 @@ void numa_cpu_pre_plug(const CPUArchId *slot, DeviceState *dev, Error **errp)
                                     "node-id", errp);
         }
     } else if (node_id != slot->props.node_id) {
+        char *topology = cpu_topology_to_string(slot);
         error_setg(errp, "node-id=%d must match numa node specified "
-                   "with -numa option", node_id);
+                   "with -numa option '%s'", node_id, topology);
+        g_free(topology);
     }
 }
 
-- 
2.20.1


Re: [Qemu-devel] [PATCH v3] numa: improve cpu hotplug error message with a wrong node-id
Posted by Igor Mammedov 4 years, 11 months ago
On Fri, 24 May 2019 12:35:21 +0200
Laurent Vivier <lvivier@redhat.com> wrote:

> On pseries, core-ids are strongly binded to a node-id by the command
> line option. If an user tries to add a CPU to the wrong node, he has
> an error but it is not really helpful:
> 
>   qemu-system-ppc64 ... -smp 1,maxcpus=64,cores=1,threads=1,sockets=1 \
>                         -numa node,nodeid=0 -numa node,nodeid=1 ...
> 
>   (qemu) device_add power9_v2.0-spapr-cpu-core,core-id=30,node-id=1
>   Error: node-id=1 must match numa node specified with -numa option
> 
> This patch improves this error message by giving to the user the good
> topology information (node-id, socket-id and thread-id if they are
> available) to use with the core-id he's providing:
> 
>   Error: node-id=1 must match numa node specified with -numa option 'node-id 0'
> 
> Signed-off-by: Laurent Vivier <lvivier@redhat.com>
> ---
> 
> Notes:
>     v3: only add the topology to the existing message
>         As suggested by Igor replace
>           Error: core-id 30 can only be plugged into node-id 0
>         by
>           Error: node-id=1 must match numa node specified with -numa option 'node-id 0'
>     
>     v2: display full topology in the error message
> 
>  numa.c | 25 ++++++++++++++++++++++++-
>  1 file changed, 24 insertions(+), 1 deletion(-)
> 
> diff --git a/numa.c b/numa.c
> index 3875e1efda3a..7882ec294be4 100644
> --- a/numa.c
> +++ b/numa.c
> @@ -458,6 +458,27 @@ void qmp_set_numa_node(NumaOptions *cmd, Error **errp)
>      set_numa_options(MACHINE(qdev_get_machine()), cmd, errp);
>  }
>  
> +static char *cpu_topology_to_string(const CPUArchId *cpu)
> +{
> +    GString *s = g_string_new(NULL);
> +    if (cpu->props.has_socket_id) {
> +        g_string_append_printf(s, "socket-id %"PRId64, cpu->props.socket_id);
> +    }
> +    if (cpu->props.has_node_id) {
> +        if (s->len) {
> +            g_string_append_printf(s, ", ");
> +        }
> +        g_string_append_printf(s, "node-id %"PRId64, cpu->props.node_id);
> +    }
> +    if (cpu->props.has_thread_id) {
> +        if (s->len) {
> +            g_string_append_printf(s, ", ");
> +        }
> +        g_string_append_printf(s, "thread-id %"PRId64, cpu->props.thread_id);
> +    }
> +    return g_string_free(s, false);
> +}

turns out we already have such helper: cpu_slot_to_string()

> +
>  void numa_cpu_pre_plug(const CPUArchId *slot, DeviceState *dev, Error **errp)
>  {
>      int node_id = object_property_get_int(OBJECT(dev), "node-id", &error_abort);
> @@ -470,8 +491,10 @@ void numa_cpu_pre_plug(const CPUArchId *slot, DeviceState *dev, Error **errp)
>                                      "node-id", errp);
>          }
>      } else if (node_id != slot->props.node_id) {
> +        char *topology = cpu_topology_to_string(slot);
>          error_setg(errp, "node-id=%d must match numa node specified "
> -                   "with -numa option", node_id);
> +                   "with -numa option '%s'", node_id, topology);
> +        g_free(topology);
>      }
>  }
>  


Re: [Qemu-devel] [PATCH v3] numa: improve cpu hotplug error message with a wrong node-id
Posted by Laurent Vivier 4 years, 11 months ago
On 24/05/2019 16:10, Igor Mammedov wrote:
> On Fri, 24 May 2019 12:35:21 +0200
> Laurent Vivier <lvivier@redhat.com> wrote:
> 
>> On pseries, core-ids are strongly binded to a node-id by the command
>> line option. If an user tries to add a CPU to the wrong node, he has
>> an error but it is not really helpful:
>>
>>    qemu-system-ppc64 ... -smp 1,maxcpus=64,cores=1,threads=1,sockets=1 \
>>                          -numa node,nodeid=0 -numa node,nodeid=1 ...
>>
>>    (qemu) device_add power9_v2.0-spapr-cpu-core,core-id=30,node-id=1
>>    Error: node-id=1 must match numa node specified with -numa option
>>
>> This patch improves this error message by giving to the user the good
>> topology information (node-id, socket-id and thread-id if they are
>> available) to use with the core-id he's providing:
>>
>>    Error: node-id=1 must match numa node specified with -numa option 'node-id 0'
>>
>> Signed-off-by: Laurent Vivier <lvivier@redhat.com>
>> ---
>>
>> Notes:
>>      v3: only add the topology to the existing message
>>          As suggested by Igor replace
>>            Error: core-id 30 can only be plugged into node-id 0
>>          by
>>            Error: node-id=1 must match numa node specified with -numa option 'node-id 0'
>>      
>>      v2: display full topology in the error message
>>
>>   numa.c | 25 ++++++++++++++++++++++++-
>>   1 file changed, 24 insertions(+), 1 deletion(-)
>>
>> diff --git a/numa.c b/numa.c
>> index 3875e1efda3a..7882ec294be4 100644
>> --- a/numa.c
>> +++ b/numa.c
>> @@ -458,6 +458,27 @@ void qmp_set_numa_node(NumaOptions *cmd, Error **errp)
>>       set_numa_options(MACHINE(qdev_get_machine()), cmd, errp);
>>   }
>>   
>> +static char *cpu_topology_to_string(const CPUArchId *cpu)
>> +{
>> +    GString *s = g_string_new(NULL);
>> +    if (cpu->props.has_socket_id) {
>> +        g_string_append_printf(s, "socket-id %"PRId64, cpu->props.socket_id);
>> +    }
>> +    if (cpu->props.has_node_id) {
>> +        if (s->len) {
>> +            g_string_append_printf(s, ", ");
>> +        }
>> +        g_string_append_printf(s, "node-id %"PRId64, cpu->props.node_id);
>> +    }
>> +    if (cpu->props.has_thread_id) {
>> +        if (s->len) {
>> +            g_string_append_printf(s, ", ");
>> +        }
>> +        g_string_append_printf(s, "thread-id %"PRId64, cpu->props.thread_id);
>> +    }
>> +    return g_string_free(s, false);
>> +}
> 
> turns out we already have such helper: cpu_slot_to_string()

It doesn't display the node-id but the core-id. And node-id is what we 
need to know.

Thanks,
Laurent

Re: [Qemu-devel] [PATCH v3] numa: improve cpu hotplug error message with a wrong node-id
Posted by Eduardo Habkost 4 years, 11 months ago
On Fri, May 24, 2019 at 04:39:12PM +0200, Laurent Vivier wrote:
> On 24/05/2019 16:10, Igor Mammedov wrote:
> > On Fri, 24 May 2019 12:35:21 +0200
> > Laurent Vivier <lvivier@redhat.com> wrote:
> > 
> > > On pseries, core-ids are strongly binded to a node-id by the command
> > > line option. If an user tries to add a CPU to the wrong node, he has
> > > an error but it is not really helpful:
> > > 
> > >    qemu-system-ppc64 ... -smp 1,maxcpus=64,cores=1,threads=1,sockets=1 \
> > >                          -numa node,nodeid=0 -numa node,nodeid=1 ...
> > > 
> > >    (qemu) device_add power9_v2.0-spapr-cpu-core,core-id=30,node-id=1
> > >    Error: node-id=1 must match numa node specified with -numa option
> > > 
> > > This patch improves this error message by giving to the user the good
> > > topology information (node-id, socket-id and thread-id if they are
> > > available) to use with the core-id he's providing:
> > > 
> > >    Error: node-id=1 must match numa node specified with -numa option 'node-id 0'
> > > 
> > > Signed-off-by: Laurent Vivier <lvivier@redhat.com>
> > > ---
> > > 
> > > Notes:
> > >      v3: only add the topology to the existing message
> > >          As suggested by Igor replace
> > >            Error: core-id 30 can only be plugged into node-id 0
> > >          by
> > >            Error: node-id=1 must match numa node specified with -numa option 'node-id 0'
> > >      v2: display full topology in the error message
> > > > > >   numa.c | 25 ++++++++++++++++++++++++-
> > >   1 file changed, 24 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/numa.c b/numa.c
> > > index 3875e1efda3a..7882ec294be4 100644
> > > --- a/numa.c
> > > +++ b/numa.c
> > > @@ -458,6 +458,27 @@ void qmp_set_numa_node(NumaOptions *cmd, Error **errp)
> > >       set_numa_options(MACHINE(qdev_get_machine()), cmd, errp);
> > >   }
> > > +static char *cpu_topology_to_string(const CPUArchId *cpu)
> > > +{
> > > +    GString *s = g_string_new(NULL);
> > > +    if (cpu->props.has_socket_id) {
> > > +        g_string_append_printf(s, "socket-id %"PRId64, cpu->props.socket_id);
> > > +    }
> > > +    if (cpu->props.has_node_id) {
> > > +        if (s->len) {
> > > +            g_string_append_printf(s, ", ");
> > > +        }
> > > +        g_string_append_printf(s, "node-id %"PRId64, cpu->props.node_id);
> > > +    }
> > > +    if (cpu->props.has_thread_id) {
> > > +        if (s->len) {
> > > +            g_string_append_printf(s, ", ");
> > > +        }
> > > +        g_string_append_printf(s, "thread-id %"PRId64, cpu->props.thread_id);
> > > +    }
> > > +    return g_string_free(s, false);
> > > +}
> > 
> > turns out we already have such helper: cpu_slot_to_string()
> 
> It doesn't display the node-id but the core-id. And node-id is what we need
> to know.

I'm confused about what you are trying to do here.

On v1, the message looked like:
  Error: core-id 30 can only be plugged into node-id 0

which is probably good for spapr.


Then I suggested you added the other cpu->props fields.  e.g. on
PC the message would look like:
  Error: socket-id 20, core-id 30, thread-id 40 can only be plugged into node-id 0


But you sent a v2 patch that would print this on PC:
  Error: core-id 30 can only be plugged into socket-id 20, node-id 0, thread-id 40

which doesn't make sense to me.


Then in a reply to v2, Igor suggested:

 error_setg(errp, "node-id=%d must match numa node specified "
                   "with -numa option '%s'", node_id, topology);


Igor suggest would address the problem above.  I expected it to become:
  node-id=0 must match numa node specified with -numa option core-id=30
and on PC:
  node-id=0 must match numa node specified with -numa option socket-id=20,core-id=30,thread-id=40

Or maybe it could include the input node-id too:
  node-id=0 must match numa node specified with -numa option node-id=1,core-id=30
and on PC:
  node-id=0 must match numa node specified with -numa option node-id=1,socket-id=20,core-id=30,thread-id=40

Both options would work.


But you implemented code that would print:
  Error: node-id=0 must match numa node specified with -numa option 'node-id 1'
and on PC it would print:
  Error: node-id=0 must match numa node specified with -numa option 'socket-id 20 node-id 1 thread-id=40'

which doesn't make sense to me.


I was expecting something like:
  Error: CPU slot core-id=30 is bound to node-id 0, but node-id 1 was specified
and on PC:
  Error: CPU slot socket-id=20,core-id=30,thread-id=40 is bound to node-id 0, but node-id 1 was specified


-- 
Eduardo

Re: [Qemu-devel] [PATCH v3] numa: improve cpu hotplug error message with a wrong node-id
Posted by Laurent Vivier 4 years, 11 months ago
On 24/05/2019 22:14, Eduardo Habkost wrote:
> On Fri, May 24, 2019 at 04:39:12PM +0200, Laurent Vivier wrote:
>> On 24/05/2019 16:10, Igor Mammedov wrote:
>>> On Fri, 24 May 2019 12:35:21 +0200
>>> Laurent Vivier <lvivier@redhat.com> wrote:
>>>
>>>> On pseries, core-ids are strongly binded to a node-id by the command
>>>> line option. If an user tries to add a CPU to the wrong node, he has
>>>> an error but it is not really helpful:
>>>>
>>>>     qemu-system-ppc64 ... -smp 1,maxcpus=64,cores=1,threads=1,sockets=1 \
>>>>                           -numa node,nodeid=0 -numa node,nodeid=1 ...
>>>>
>>>>     (qemu) device_add power9_v2.0-spapr-cpu-core,core-id=30,node-id=1
>>>>     Error: node-id=1 must match numa node specified with -numa option
>>>>
>>>> This patch improves this error message by giving to the user the good
>>>> topology information (node-id, socket-id and thread-id if they are
>>>> available) to use with the core-id he's providing:
>>>>
>>>>     Error: node-id=1 must match numa node specified with -numa option 'node-id 0'
>>>>
>>>> Signed-off-by: Laurent Vivier <lvivier@redhat.com>
>>>> ---
>>>>
>>>> Notes:
>>>>       v3: only add the topology to the existing message
>>>>           As suggested by Igor replace
>>>>             Error: core-id 30 can only be plugged into node-id 0
>>>>           by
>>>>             Error: node-id=1 must match numa node specified with -numa option 'node-id 0'
>>>>       v2: display full topology in the error message
>>>>>>>    numa.c | 25 ++++++++++++++++++++++++-
>>>>    1 file changed, 24 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/numa.c b/numa.c
>>>> index 3875e1efda3a..7882ec294be4 100644
>>>> --- a/numa.c
>>>> +++ b/numa.c
>>>> @@ -458,6 +458,27 @@ void qmp_set_numa_node(NumaOptions *cmd, Error **errp)
>>>>        set_numa_options(MACHINE(qdev_get_machine()), cmd, errp);
>>>>    }
>>>> +static char *cpu_topology_to_string(const CPUArchId *cpu)
>>>> +{
>>>> +    GString *s = g_string_new(NULL);
>>>> +    if (cpu->props.has_socket_id) {
>>>> +        g_string_append_printf(s, "socket-id %"PRId64, cpu->props.socket_id);
>>>> +    }
>>>> +    if (cpu->props.has_node_id) {
>>>> +        if (s->len) {
>>>> +            g_string_append_printf(s, ", ");
>>>> +        }
>>>> +        g_string_append_printf(s, "node-id %"PRId64, cpu->props.node_id);
>>>> +    }
>>>> +    if (cpu->props.has_thread_id) {
>>>> +        if (s->len) {
>>>> +            g_string_append_printf(s, ", ");
>>>> +        }
>>>> +        g_string_append_printf(s, "thread-id %"PRId64, cpu->props.thread_id);
>>>> +    }
>>>> +    return g_string_free(s, false);
>>>> +}
>>>
>>> turns out we already have such helper: cpu_slot_to_string()
>>
>> It doesn't display the node-id but the core-id. And node-id is what we need
>> to know.
> 
> I'm confused about what you are trying to do here.
> 
> On v1, the message looked like:
>    Error: core-id 30 can only be plugged into node-id 0
> 
> which is probably good for spapr.
> 
> 
> Then I suggested you added the other cpu->props fields.  e.g. on
> PC the message would look like:
>    Error: socket-id 20, core-id 30, thread-id 40 can only be plugged into node-id 0
> 
> 
> But you sent a v2 patch that would print this on PC:
>    Error: core-id 30 can only be plugged into socket-id 20, node-id 0, thread-id 40
> 
> which doesn't make sense to me.
> 
> 
> Then in a reply to v2, Igor suggested:
> 
>   error_setg(errp, "node-id=%d must match numa node specified "
>                     "with -numa option '%s'", node_id, topology);
> 
> 
> Igor suggest would address the problem above.  I expected it to become:
>    node-id=0 must match numa node specified with -numa option core-id=30
> and on PC:
>    node-id=0 must match numa node specified with -numa option socket-id=20,core-id=30,thread-id=40
> 
> Or maybe it could include the input node-id too:
>    node-id=0 must match numa node specified with -numa option node-id=1,core-id=30
> and on PC:
>    node-id=0 must match numa node specified with -numa option node-id=1,socket-id=20,core-id=30,thread-id=40
> 
> Both options would work.
> 
> 
> But you implemented code that would print:
>    Error: node-id=0 must match numa node specified with -numa option 'node-id 1'
> and on PC it would print:
>    Error: node-id=0 must match numa node specified with -numa option 'socket-id 20 node-id 1 thread-id=40'
> 
> which doesn't make sense to me.
> 
> 
> I was expecting something like:
>    Error: CPU slot core-id=30 is bound to node-id 0, but node-id 1 was specified
> and on PC:
>    Error: CPU slot socket-id=20,core-id=30,thread-id=40 is bound to node-id 0, but node-id 1 was specified
> 
> 

The idea is to provide the information to the user to help him to know 
where the cpu can be plugged when it cannot on the node-id he originally 
provided.

So all the solutions you propose sounds good to me.

I only need you and Igor agree on the same one.

Thanks,
Laurent

Re: [Qemu-devel] [PATCH v3] numa: improve cpu hotplug error message with a wrong node-id
Posted by Igor Mammedov 4 years, 11 months ago
On Mon, 27 May 2019 08:55:49 +0200
Laurent Vivier <lvivier@redhat.com> wrote:

> On 24/05/2019 22:14, Eduardo Habkost wrote:
> > On Fri, May 24, 2019 at 04:39:12PM +0200, Laurent Vivier wrote:  
> >> On 24/05/2019 16:10, Igor Mammedov wrote:  
> >>> On Fri, 24 May 2019 12:35:21 +0200
> >>> Laurent Vivier <lvivier@redhat.com> wrote:
> >>>  
> >>>> On pseries, core-ids are strongly binded to a node-id by the command
> >>>> line option. If an user tries to add a CPU to the wrong node, he has
> >>>> an error but it is not really helpful:
> >>>>
> >>>>     qemu-system-ppc64 ... -smp 1,maxcpus=64,cores=1,threads=1,sockets=1 \
> >>>>                           -numa node,nodeid=0 -numa node,nodeid=1 ...
> >>>>
> >>>>     (qemu) device_add power9_v2.0-spapr-cpu-core,core-id=30,node-id=1
> >>>>     Error: node-id=1 must match numa node specified with -numa option
> >>>>
> >>>> This patch improves this error message by giving to the user the good
> >>>> topology information (node-id, socket-id and thread-id if they are
> >>>> available) to use with the core-id he's providing:
> >>>>
> >>>>     Error: node-id=1 must match numa node specified with -numa option 'node-id 0'
> >>>>
> >>>> Signed-off-by: Laurent Vivier <lvivier@redhat.com>
> >>>> ---
> >>>>
> >>>> Notes:
> >>>>       v3: only add the topology to the existing message
> >>>>           As suggested by Igor replace
> >>>>             Error: core-id 30 can only be plugged into node-id 0
> >>>>           by
> >>>>             Error: node-id=1 must match numa node specified with -numa option 'node-id 0'
> >>>>       v2: display full topology in the error message  
> >>>>>>>    numa.c | 25 ++++++++++++++++++++++++-  
> >>>>    1 file changed, 24 insertions(+), 1 deletion(-)
> >>>>
> >>>> diff --git a/numa.c b/numa.c
> >>>> index 3875e1efda3a..7882ec294be4 100644
> >>>> --- a/numa.c
> >>>> +++ b/numa.c
> >>>> @@ -458,6 +458,27 @@ void qmp_set_numa_node(NumaOptions *cmd, Error **errp)
> >>>>        set_numa_options(MACHINE(qdev_get_machine()), cmd, errp);
> >>>>    }
> >>>> +static char *cpu_topology_to_string(const CPUArchId *cpu)
> >>>> +{
> >>>> +    GString *s = g_string_new(NULL);
> >>>> +    if (cpu->props.has_socket_id) {
> >>>> +        g_string_append_printf(s, "socket-id %"PRId64, cpu->props.socket_id);
> >>>> +    }
> >>>> +    if (cpu->props.has_node_id) {
> >>>> +        if (s->len) {
> >>>> +            g_string_append_printf(s, ", ");
> >>>> +        }
> >>>> +        g_string_append_printf(s, "node-id %"PRId64, cpu->props.node_id);
> >>>> +    }
> >>>> +    if (cpu->props.has_thread_id) {
> >>>> +        if (s->len) {
> >>>> +            g_string_append_printf(s, ", ");
> >>>> +        }
> >>>> +        g_string_append_printf(s, "thread-id %"PRId64, cpu->props.thread_id);
> >>>> +    }
> >>>> +    return g_string_free(s, false);
> >>>> +}  
> >>>
> >>> turns out we already have such helper: cpu_slot_to_string()  
> >>
> >> It doesn't display the node-id but the core-id. And node-id is what we need
> >> to know.  
> > 
> > I'm confused about what you are trying to do here.
> > 
> > On v1, the message looked like:
> >    Error: core-id 30 can only be plugged into node-id 0
> > 
> > which is probably good for spapr.
> > 
> > 
> > Then I suggested you added the other cpu->props fields.  e.g. on
> > PC the message would look like:
> >    Error: socket-id 20, core-id 30, thread-id 40 can only be plugged into node-id 0
> > 
> > 
> > But you sent a v2 patch that would print this on PC:
> >    Error: core-id 30 can only be plugged into socket-id 20, node-id 0, thread-id 40
> > 
> > which doesn't make sense to me.
> > 
> > 
> > Then in a reply to v2, Igor suggested:
> > 
> >   error_setg(errp, "node-id=%d must match numa node specified "
> >                     "with -numa option '%s'", node_id, topology);
> > 
> > 
> > Igor suggest would address the problem above.  I expected it to become:
> >    node-id=0 must match numa node specified with -numa option core-id=30
> > and on PC:
> >    node-id=0 must match numa node specified with -numa option socket-id=20,core-id=30,thread-id=40
> > 
> > Or maybe it could include the input node-id too:
> >    node-id=0 must match numa node specified with -numa option node-id=1,core-id=30
> > and on PC:
> >    node-id=0 must match numa node specified with -numa option node-id=1,socket-id=20,core-id=30,thread-id=40
> > 
> > Both options would work.
> > 
> > 
> > But you implemented code that would print:
> >    Error: node-id=0 must match numa node specified with -numa option 'node-id 1'
> > and on PC it would print:
> >    Error: node-id=0 must match numa node specified with -numa option 'socket-id 20 node-id 1 thread-id=40'
> > 
> > which doesn't make sense to me.
> > 
> > 
> > I was expecting something like:
> >    Error: CPU slot core-id=30 is bound to node-id 0, but node-id 1 was specified
> > and on PC:
> >    Error: CPU slot socket-id=20,core-id=30,thread-id=40 is bound to node-id 0, but node-id 1 was specified
> > 
> >   
> 
> The idea is to provide the information to the user to help him to know 
> where the cpu can be plugged when it cannot on the node-id he originally 
> provided.
> 
> So all the solutions you propose sounds good to me.
> 
> I only need you and Igor agree on the same one.

We with Eduardo basically agree on contents/set of properties to print,
it is only different phrasing (Eduardo's suggestion is better than what we have now).
But lets get to what problem you are going to fix/improve. SO I've went ahead and tried
with following CLI:

   qemu-system-x86_64 -smp 1,maxcpus=4 -numa node,cpus=0-1 -numa node,cpus=2-3 -monitor stdio -device qemu64-x86_64-cpu,socket-id=1,core-id=0,thread-id=0,node-id=1

end it errored out with:

  qemu-system-x86_64: -device qemu64-x86_64-cpu,socket-id=1,core-id=0,thread-id=0,node-id=1: node-id=1 must match numa node specified with -numa option

As you see we already have all user provide properties for cpu (including invalid ones) reported,
what we are missing is suggestion for valid node-id. How about following error message:

qemu-system-x86_64: -device qemu64-x86_64-cpu,socket-id=1,core-id=0,thread-id=0,node-id=1: invalid node-id, must be 0

> 
> Thanks,
> Laurent


Re: [Qemu-devel] [PATCH v3] numa: improve cpu hotplug error message with a wrong node-id
Posted by Laurent Vivier 4 years, 11 months ago
On 27/05/2019 14:50, Igor Mammedov wrote:
> On Mon, 27 May 2019 08:55:49 +0200
> Laurent Vivier <lvivier@redhat.com> wrote:
> 
>> On 24/05/2019 22:14, Eduardo Habkost wrote:
>>> On Fri, May 24, 2019 at 04:39:12PM +0200, Laurent Vivier wrote:
>>>> On 24/05/2019 16:10, Igor Mammedov wrote:
>>>>> On Fri, 24 May 2019 12:35:21 +0200
>>>>> Laurent Vivier <lvivier@redhat.com> wrote:
>>>>>   
>>>>>> On pseries, core-ids are strongly binded to a node-id by the command
>>>>>> line option. If an user tries to add a CPU to the wrong node, he has
>>>>>> an error but it is not really helpful:
>>>>>>
>>>>>>      qemu-system-ppc64 ... -smp 1,maxcpus=64,cores=1,threads=1,sockets=1 \
>>>>>>                            -numa node,nodeid=0 -numa node,nodeid=1 ...
>>>>>>
>>>>>>      (qemu) device_add power9_v2.0-spapr-cpu-core,core-id=30,node-id=1
>>>>>>      Error: node-id=1 must match numa node specified with -numa option
>>>>>>
>>>>>> This patch improves this error message by giving to the user the good
>>>>>> topology information (node-id, socket-id and thread-id if they are
>>>>>> available) to use with the core-id he's providing:
>>>>>>
>>>>>>      Error: node-id=1 must match numa node specified with -numa option 'node-id 0'
>>>>>>
>>>>>> Signed-off-by: Laurent Vivier <lvivier@redhat.com>
>>>>>> ---
>>>>>>
>>>>>> Notes:
>>>>>>        v3: only add the topology to the existing message
>>>>>>            As suggested by Igor replace
>>>>>>              Error: core-id 30 can only be plugged into node-id 0
>>>>>>            by
>>>>>>              Error: node-id=1 must match numa node specified with -numa option 'node-id 0'
>>>>>>        v2: display full topology in the error message
>>>>>>>>>     numa.c | 25 ++++++++++++++++++++++++-
>>>>>>     1 file changed, 24 insertions(+), 1 deletion(-)
>>>>>>
>>>>>> diff --git a/numa.c b/numa.c
>>>>>> index 3875e1efda3a..7882ec294be4 100644
>>>>>> --- a/numa.c
>>>>>> +++ b/numa.c
>>>>>> @@ -458,6 +458,27 @@ void qmp_set_numa_node(NumaOptions *cmd, Error **errp)
>>>>>>         set_numa_options(MACHINE(qdev_get_machine()), cmd, errp);
>>>>>>     }
>>>>>> +static char *cpu_topology_to_string(const CPUArchId *cpu)
>>>>>> +{
>>>>>> +    GString *s = g_string_new(NULL);
>>>>>> +    if (cpu->props.has_socket_id) {
>>>>>> +        g_string_append_printf(s, "socket-id %"PRId64, cpu->props.socket_id);
>>>>>> +    }
>>>>>> +    if (cpu->props.has_node_id) {
>>>>>> +        if (s->len) {
>>>>>> +            g_string_append_printf(s, ", ");
>>>>>> +        }
>>>>>> +        g_string_append_printf(s, "node-id %"PRId64, cpu->props.node_id);
>>>>>> +    }
>>>>>> +    if (cpu->props.has_thread_id) {
>>>>>> +        if (s->len) {
>>>>>> +            g_string_append_printf(s, ", ");
>>>>>> +        }
>>>>>> +        g_string_append_printf(s, "thread-id %"PRId64, cpu->props.thread_id);
>>>>>> +    }
>>>>>> +    return g_string_free(s, false);
>>>>>> +}
>>>>>
>>>>> turns out we already have such helper: cpu_slot_to_string()
>>>>
>>>> It doesn't display the node-id but the core-id. And node-id is what we need
>>>> to know.
>>>
>>> I'm confused about what you are trying to do here.
>>>
>>> On v1, the message looked like:
>>>     Error: core-id 30 can only be plugged into node-id 0
>>>
>>> which is probably good for spapr.
>>>
>>>
>>> Then I suggested you added the other cpu->props fields.  e.g. on
>>> PC the message would look like:
>>>     Error: socket-id 20, core-id 30, thread-id 40 can only be plugged into node-id 0
>>>
>>>
>>> But you sent a v2 patch that would print this on PC:
>>>     Error: core-id 30 can only be plugged into socket-id 20, node-id 0, thread-id 40
>>>
>>> which doesn't make sense to me.
>>>
>>>
>>> Then in a reply to v2, Igor suggested:
>>>
>>>    error_setg(errp, "node-id=%d must match numa node specified "
>>>                      "with -numa option '%s'", node_id, topology);
>>>
>>>
>>> Igor suggest would address the problem above.  I expected it to become:
>>>     node-id=0 must match numa node specified with -numa option core-id=30
>>> and on PC:
>>>     node-id=0 must match numa node specified with -numa option socket-id=20,core-id=30,thread-id=40
>>>
>>> Or maybe it could include the input node-id too:
>>>     node-id=0 must match numa node specified with -numa option node-id=1,core-id=30
>>> and on PC:
>>>     node-id=0 must match numa node specified with -numa option node-id=1,socket-id=20,core-id=30,thread-id=40
>>>
>>> Both options would work.
>>>
>>>
>>> But you implemented code that would print:
>>>     Error: node-id=0 must match numa node specified with -numa option 'node-id 1'
>>> and on PC it would print:
>>>     Error: node-id=0 must match numa node specified with -numa option 'socket-id 20 node-id 1 thread-id=40'
>>>
>>> which doesn't make sense to me.
>>>
>>>
>>> I was expecting something like:
>>>     Error: CPU slot core-id=30 is bound to node-id 0, but node-id 1 was specified
>>> and on PC:
>>>     Error: CPU slot socket-id=20,core-id=30,thread-id=40 is bound to node-id 0, but node-id 1 was specified
>>>
>>>    
>>
>> The idea is to provide the information to the user to help him to know
>> where the cpu can be plugged when it cannot on the node-id he originally
>> provided.
>>
>> So all the solutions you propose sounds good to me.
>>
>> I only need you and Igor agree on the same one.
> 
> We with Eduardo basically agree on contents/set of properties to print,
> it is only different phrasing (Eduardo's suggestion is better than what we have now).
> But lets get to what problem you are going to fix/improve. SO I've went ahead and tried
> with following CLI:
> 
>     qemu-system-x86_64 -smp 1,maxcpus=4 -numa node,cpus=0-1 -numa node,cpus=2-3 -monitor stdio -device qemu64-x86_64-cpu,socket-id=1,core-id=0,thread-id=0,node-id=1
> 
> end it errored out with:
> 
>    qemu-system-x86_64: -device qemu64-x86_64-cpu,socket-id=1,core-id=0,thread-id=0,node-id=1: node-id=1 must match numa node specified with -numa option
> 
> As you see we already have all user provide properties for cpu (including invalid ones) reported,
> what we are missing is suggestion for valid node-id. How about following error message:
> 
> qemu-system-x86_64: -device qemu64-x86_64-cpu,socket-id=1,core-id=0,thread-id=0,node-id=1: invalid node-id, must be 0

The case I'm worrying about is when the cpu is hotplugged: we don't have the "-device ..." information.

  $ qemu-system-ppc64 -nodefaults -nographic -monitor stdio -m 1G -smp 1,maxcpus=64,cores=1,threads=1,sockets=1 -numa node,nodeid=0 -numa node,nodeid=1
  QEMU 3.0.1 monitor - type 'help' for more information
  (qemu) device_add power8_v2.0-spapr-cpu-core,core-id=30,node-id=1
  node-id=1 must match numa node specified with -numa option

So you can see the needed information is missing.

Thanks,
Laurent

Re: [Qemu-devel] [PATCH v3] numa: improve cpu hotplug error message with a wrong node-id
Posted by Eduardo Habkost 4 years, 11 months ago
On Mon, May 27, 2019 at 03:52:30PM +0200, Laurent Vivier wrote:
> On 27/05/2019 14:50, Igor Mammedov wrote:
> > On Mon, 27 May 2019 08:55:49 +0200
> > Laurent Vivier <lvivier@redhat.com> wrote:
> > 
> >> On 24/05/2019 22:14, Eduardo Habkost wrote:
> >>> On Fri, May 24, 2019 at 04:39:12PM +0200, Laurent Vivier wrote:
> >>>> On 24/05/2019 16:10, Igor Mammedov wrote:
> >>>>> On Fri, 24 May 2019 12:35:21 +0200
> >>>>> Laurent Vivier <lvivier@redhat.com> wrote:
> >>>>>   
> >>>>>> On pseries, core-ids are strongly binded to a node-id by the command
> >>>>>> line option. If an user tries to add a CPU to the wrong node, he has
> >>>>>> an error but it is not really helpful:
> >>>>>>
> >>>>>>      qemu-system-ppc64 ... -smp 1,maxcpus=64,cores=1,threads=1,sockets=1 \
> >>>>>>                            -numa node,nodeid=0 -numa node,nodeid=1 ...
> >>>>>>
> >>>>>>      (qemu) device_add power9_v2.0-spapr-cpu-core,core-id=30,node-id=1
> >>>>>>      Error: node-id=1 must match numa node specified with -numa option
> >>>>>>
> >>>>>> This patch improves this error message by giving to the user the good
> >>>>>> topology information (node-id, socket-id and thread-id if they are
> >>>>>> available) to use with the core-id he's providing:
> >>>>>>
> >>>>>>      Error: node-id=1 must match numa node specified with -numa option 'node-id 0'
> >>>>>>
> >>>>>> Signed-off-by: Laurent Vivier <lvivier@redhat.com>
> >>>>>> ---
> >>>>>>
> >>>>>> Notes:
> >>>>>>        v3: only add the topology to the existing message
> >>>>>>            As suggested by Igor replace
> >>>>>>              Error: core-id 30 can only be plugged into node-id 0
> >>>>>>            by
> >>>>>>              Error: node-id=1 must match numa node specified with -numa option 'node-id 0'
> >>>>>>        v2: display full topology in the error message
> >>>>>>>>>     numa.c | 25 ++++++++++++++++++++++++-
> >>>>>>     1 file changed, 24 insertions(+), 1 deletion(-)
> >>>>>>
> >>>>>> diff --git a/numa.c b/numa.c
> >>>>>> index 3875e1efda3a..7882ec294be4 100644
> >>>>>> --- a/numa.c
> >>>>>> +++ b/numa.c
> >>>>>> @@ -458,6 +458,27 @@ void qmp_set_numa_node(NumaOptions *cmd, Error **errp)
> >>>>>>         set_numa_options(MACHINE(qdev_get_machine()), cmd, errp);
> >>>>>>     }
> >>>>>> +static char *cpu_topology_to_string(const CPUArchId *cpu)
> >>>>>> +{
> >>>>>> +    GString *s = g_string_new(NULL);
> >>>>>> +    if (cpu->props.has_socket_id) {
> >>>>>> +        g_string_append_printf(s, "socket-id %"PRId64, cpu->props.socket_id);
> >>>>>> +    }
> >>>>>> +    if (cpu->props.has_node_id) {
> >>>>>> +        if (s->len) {
> >>>>>> +            g_string_append_printf(s, ", ");
> >>>>>> +        }
> >>>>>> +        g_string_append_printf(s, "node-id %"PRId64, cpu->props.node_id);
> >>>>>> +    }
> >>>>>> +    if (cpu->props.has_thread_id) {
> >>>>>> +        if (s->len) {
> >>>>>> +            g_string_append_printf(s, ", ");
> >>>>>> +        }
> >>>>>> +        g_string_append_printf(s, "thread-id %"PRId64, cpu->props.thread_id);
> >>>>>> +    }
> >>>>>> +    return g_string_free(s, false);
> >>>>>> +}
> >>>>>
> >>>>> turns out we already have such helper: cpu_slot_to_string()
> >>>>
> >>>> It doesn't display the node-id but the core-id. And node-id is what we need
> >>>> to know.
> >>>
> >>> I'm confused about what you are trying to do here.
> >>>
> >>> On v1, the message looked like:
> >>>     Error: core-id 30 can only be plugged into node-id 0
> >>>
> >>> which is probably good for spapr.
> >>>
> >>>
> >>> Then I suggested you added the other cpu->props fields.  e.g. on
> >>> PC the message would look like:
> >>>     Error: socket-id 20, core-id 30, thread-id 40 can only be plugged into node-id 0
> >>>
> >>>
> >>> But you sent a v2 patch that would print this on PC:
> >>>     Error: core-id 30 can only be plugged into socket-id 20, node-id 0, thread-id 40
> >>>
> >>> which doesn't make sense to me.
> >>>
> >>>
> >>> Then in a reply to v2, Igor suggested:
> >>>
> >>>    error_setg(errp, "node-id=%d must match numa node specified "
> >>>                      "with -numa option '%s'", node_id, topology);
> >>>
> >>>
> >>> Igor suggest would address the problem above.  I expected it to become:
> >>>     node-id=0 must match numa node specified with -numa option core-id=30
> >>> and on PC:
> >>>     node-id=0 must match numa node specified with -numa option socket-id=20,core-id=30,thread-id=40
> >>>
> >>> Or maybe it could include the input node-id too:
> >>>     node-id=0 must match numa node specified with -numa option node-id=1,core-id=30
> >>> and on PC:
> >>>     node-id=0 must match numa node specified with -numa option node-id=1,socket-id=20,core-id=30,thread-id=40
> >>>
> >>> Both options would work.
> >>>
> >>>
> >>> But you implemented code that would print:
> >>>     Error: node-id=0 must match numa node specified with -numa option 'node-id 1'
> >>> and on PC it would print:
> >>>     Error: node-id=0 must match numa node specified with -numa option 'socket-id 20 node-id 1 thread-id=40'
> >>>
> >>> which doesn't make sense to me.
> >>>
> >>>
> >>> I was expecting something like:
> >>>     Error: CPU slot core-id=30 is bound to node-id 0, but node-id 1 was specified
> >>> and on PC:
> >>>     Error: CPU slot socket-id=20,core-id=30,thread-id=40 is bound to node-id 0, but node-id 1 was specified
> >>>
> >>>    
> >>
> >> The idea is to provide the information to the user to help him to know
> >> where the cpu can be plugged when it cannot on the node-id he originally
> >> provided.
> >>
> >> So all the solutions you propose sounds good to me.
> >>
> >> I only need you and Igor agree on the same one.
> > 
> > We with Eduardo basically agree on contents/set of properties to print,
> > it is only different phrasing (Eduardo's suggestion is better than what we have now).
> > But lets get to what problem you are going to fix/improve. SO I've went ahead and tried
> > with following CLI:
> > 
> >     qemu-system-x86_64 -smp 1,maxcpus=4 -numa node,cpus=0-1 -numa node,cpus=2-3 -monitor stdio -device qemu64-x86_64-cpu,socket-id=1,core-id=0,thread-id=0,node-id=1
> > 
> > end it errored out with:
> > 
> >    qemu-system-x86_64: -device qemu64-x86_64-cpu,socket-id=1,core-id=0,thread-id=0,node-id=1: node-id=1 must match numa node specified with -numa option
> > 
> > As you see we already have all user provide properties for cpu (including invalid ones) reported,
> > what we are missing is suggestion for valid node-id. How about following error message:
> > 
> > qemu-system-x86_64: -device qemu64-x86_64-cpu,socket-id=1,core-id=0,thread-id=0,node-id=1: invalid node-id, must be 0
> 
> The case I'm worrying about is when the cpu is hotplugged: we don't have the "-device ..." information.
> 
>   $ qemu-system-ppc64 -nodefaults -nographic -monitor stdio -m 1G -smp 1,maxcpus=64,cores=1,threads=1,sockets=1 -numa node,nodeid=0 -numa node,nodeid=1
>   QEMU 3.0.1 monitor - type 'help' for more information
>   (qemu) device_add power8_v2.0-spapr-cpu-core,core-id=30,node-id=1
>   node-id=1 must match numa node specified with -numa option
> 
> So you can see the needed information is missing.

What kind of essential information would be missing if we
followed Igor's suggestion?

   (qemu) device_add power8_v2.0-spapr-cpu-core,core-id=30,node-id=1
   invalid node-id, must be 0

If you really want to identify core-id on the error message (like
you did in v1 and v2), it seems OK too.  It just requires extra
work because on PC core-id alone doesn't identify a CPU slot (you
need socket-id, core-id, thread-id).

-- 
Eduardo

Re: [Qemu-devel] [PATCH v3] numa: improve cpu hotplug error message with a wrong node-id
Posted by Laurent Vivier 4 years, 11 months ago
On 28/05/2019 15:44, Eduardo Habkost wrote:
> On Mon, May 27, 2019 at 03:52:30PM +0200, Laurent Vivier wrote:
>> On 27/05/2019 14:50, Igor Mammedov wrote:
>>> On Mon, 27 May 2019 08:55:49 +0200
>>> Laurent Vivier <lvivier@redhat.com> wrote:
>>>
>>>> On 24/05/2019 22:14, Eduardo Habkost wrote:
>>>>> On Fri, May 24, 2019 at 04:39:12PM +0200, Laurent Vivier wrote:
>>>>>> On 24/05/2019 16:10, Igor Mammedov wrote:
>>>>>>> On Fri, 24 May 2019 12:35:21 +0200
>>>>>>> Laurent Vivier <lvivier@redhat.com> wrote:
>>>>>>>    
>>>>>>>> On pseries, core-ids are strongly binded to a node-id by the command
>>>>>>>> line option. If an user tries to add a CPU to the wrong node, he has
>>>>>>>> an error but it is not really helpful:
>>>>>>>>
>>>>>>>>       qemu-system-ppc64 ... -smp 1,maxcpus=64,cores=1,threads=1,sockets=1 \
>>>>>>>>                             -numa node,nodeid=0 -numa node,nodeid=1 ...
>>>>>>>>
>>>>>>>>       (qemu) device_add power9_v2.0-spapr-cpu-core,core-id=30,node-id=1
>>>>>>>>       Error: node-id=1 must match numa node specified with -numa option
>>>>>>>>
>>>>>>>> This patch improves this error message by giving to the user the good
>>>>>>>> topology information (node-id, socket-id and thread-id if they are
>>>>>>>> available) to use with the core-id he's providing:
>>>>>>>>
>>>>>>>>       Error: node-id=1 must match numa node specified with -numa option 'node-id 0'
>>>>>>>>
>>>>>>>> Signed-off-by: Laurent Vivier <lvivier@redhat.com>
>>>>>>>> ---
>>>>>>>>
>>>>>>>> Notes:
>>>>>>>>         v3: only add the topology to the existing message
>>>>>>>>             As suggested by Igor replace
>>>>>>>>               Error: core-id 30 can only be plugged into node-id 0
>>>>>>>>             by
>>>>>>>>               Error: node-id=1 must match numa node specified with -numa option 'node-id 0'
>>>>>>>>         v2: display full topology in the error message
>>>>>>>>>>>      numa.c | 25 ++++++++++++++++++++++++-
>>>>>>>>      1 file changed, 24 insertions(+), 1 deletion(-)
>>>>>>>>
>>>>>>>> diff --git a/numa.c b/numa.c
>>>>>>>> index 3875e1efda3a..7882ec294be4 100644
>>>>>>>> --- a/numa.c
>>>>>>>> +++ b/numa.c
>>>>>>>> @@ -458,6 +458,27 @@ void qmp_set_numa_node(NumaOptions *cmd, Error **errp)
>>>>>>>>          set_numa_options(MACHINE(qdev_get_machine()), cmd, errp);
>>>>>>>>      }
>>>>>>>> +static char *cpu_topology_to_string(const CPUArchId *cpu)
>>>>>>>> +{
>>>>>>>> +    GString *s = g_string_new(NULL);
>>>>>>>> +    if (cpu->props.has_socket_id) {
>>>>>>>> +        g_string_append_printf(s, "socket-id %"PRId64, cpu->props.socket_id);
>>>>>>>> +    }
>>>>>>>> +    if (cpu->props.has_node_id) {
>>>>>>>> +        if (s->len) {
>>>>>>>> +            g_string_append_printf(s, ", ");
>>>>>>>> +        }
>>>>>>>> +        g_string_append_printf(s, "node-id %"PRId64, cpu->props.node_id);
>>>>>>>> +    }
>>>>>>>> +    if (cpu->props.has_thread_id) {
>>>>>>>> +        if (s->len) {
>>>>>>>> +            g_string_append_printf(s, ", ");
>>>>>>>> +        }
>>>>>>>> +        g_string_append_printf(s, "thread-id %"PRId64, cpu->props.thread_id);
>>>>>>>> +    }
>>>>>>>> +    return g_string_free(s, false);
>>>>>>>> +}
>>>>>>>
>>>>>>> turns out we already have such helper: cpu_slot_to_string()
>>>>>>
>>>>>> It doesn't display the node-id but the core-id. And node-id is what we need
>>>>>> to know.
>>>>>
>>>>> I'm confused about what you are trying to do here.
>>>>>
>>>>> On v1, the message looked like:
>>>>>      Error: core-id 30 can only be plugged into node-id 0
>>>>>
>>>>> which is probably good for spapr.
>>>>>
>>>>>
>>>>> Then I suggested you added the other cpu->props fields.  e.g. on
>>>>> PC the message would look like:
>>>>>      Error: socket-id 20, core-id 30, thread-id 40 can only be plugged into node-id 0
>>>>>
>>>>>
>>>>> But you sent a v2 patch that would print this on PC:
>>>>>      Error: core-id 30 can only be plugged into socket-id 20, node-id 0, thread-id 40
>>>>>
>>>>> which doesn't make sense to me.
>>>>>
>>>>>
>>>>> Then in a reply to v2, Igor suggested:
>>>>>
>>>>>     error_setg(errp, "node-id=%d must match numa node specified "
>>>>>                       "with -numa option '%s'", node_id, topology);
>>>>>
>>>>>
>>>>> Igor suggest would address the problem above.  I expected it to become:
>>>>>      node-id=0 must match numa node specified with -numa option core-id=30
>>>>> and on PC:
>>>>>      node-id=0 must match numa node specified with -numa option socket-id=20,core-id=30,thread-id=40
>>>>>
>>>>> Or maybe it could include the input node-id too:
>>>>>      node-id=0 must match numa node specified with -numa option node-id=1,core-id=30
>>>>> and on PC:
>>>>>      node-id=0 must match numa node specified with -numa option node-id=1,socket-id=20,core-id=30,thread-id=40
>>>>>
>>>>> Both options would work.
>>>>>
>>>>>
>>>>> But you implemented code that would print:
>>>>>      Error: node-id=0 must match numa node specified with -numa option 'node-id 1'
>>>>> and on PC it would print:
>>>>>      Error: node-id=0 must match numa node specified with -numa option 'socket-id 20 node-id 1 thread-id=40'
>>>>>
>>>>> which doesn't make sense to me.
>>>>>
>>>>>
>>>>> I was expecting something like:
>>>>>      Error: CPU slot core-id=30 is bound to node-id 0, but node-id 1 was specified
>>>>> and on PC:
>>>>>      Error: CPU slot socket-id=20,core-id=30,thread-id=40 is bound to node-id 0, but node-id 1 was specified
>>>>>
>>>>>     
>>>>
>>>> The idea is to provide the information to the user to help him to know
>>>> where the cpu can be plugged when it cannot on the node-id he originally
>>>> provided.
>>>>
>>>> So all the solutions you propose sounds good to me.
>>>>
>>>> I only need you and Igor agree on the same one.
>>>
>>> We with Eduardo basically agree on contents/set of properties to print,
>>> it is only different phrasing (Eduardo's suggestion is better than what we have now).
>>> But lets get to what problem you are going to fix/improve. SO I've went ahead and tried
>>> with following CLI:
>>>
>>>      qemu-system-x86_64 -smp 1,maxcpus=4 -numa node,cpus=0-1 -numa node,cpus=2-3 -monitor stdio -device qemu64-x86_64-cpu,socket-id=1,core-id=0,thread-id=0,node-id=1
>>>
>>> end it errored out with:
>>>
>>>     qemu-system-x86_64: -device qemu64-x86_64-cpu,socket-id=1,core-id=0,thread-id=0,node-id=1: node-id=1 must match numa node specified with -numa option
>>>
>>> As you see we already have all user provide properties for cpu (including invalid ones) reported,
>>> what we are missing is suggestion for valid node-id. How about following error message:
>>>
>>> qemu-system-x86_64: -device qemu64-x86_64-cpu,socket-id=1,core-id=0,thread-id=0,node-id=1: invalid node-id, must be 0
>>
>> The case I'm worrying about is when the cpu is hotplugged: we don't have the "-device ..." information.
>>
>>    $ qemu-system-ppc64 -nodefaults -nographic -monitor stdio -m 1G -smp 1,maxcpus=64,cores=1,threads=1,sockets=1 -numa node,nodeid=0 -numa node,nodeid=1
>>    QEMU 3.0.1 monitor - type 'help' for more information
>>    (qemu) device_add power8_v2.0-spapr-cpu-core,core-id=30,node-id=1
>>    node-id=1 must match numa node specified with -numa option
>>
>> So you can see the needed information is missing.
> 
> What kind of essential information would be missing if we
> followed Igor's suggestion?
> 
>     (qemu) device_add power8_v2.0-spapr-cpu-core,core-id=30,node-id=1
>     invalid node-id, must be 0
> 

"invalid node-id, must be 0" is perfect for me. I update the patch.

Thank you,
Laurent


Re: [Qemu-devel] [PATCH v3] numa: improve cpu hotplug error message with a wrong node-id
Posted by Igor Mammedov 4 years, 11 months ago
On Mon, 27 May 2019 15:52:30 +0200
Laurent Vivier <lvivier@redhat.com> wrote:

> On 27/05/2019 14:50, Igor Mammedov wrote:
> > On Mon, 27 May 2019 08:55:49 +0200
> > Laurent Vivier <lvivier@redhat.com> wrote:
> >   
> >> On 24/05/2019 22:14, Eduardo Habkost wrote:  
> >>> On Fri, May 24, 2019 at 04:39:12PM +0200, Laurent Vivier wrote:  
> >>>> On 24/05/2019 16:10, Igor Mammedov wrote:  
> >>>>> On Fri, 24 May 2019 12:35:21 +0200
> >>>>> Laurent Vivier <lvivier@redhat.com> wrote:
> >>>>>     
> >>>>>> On pseries, core-ids are strongly binded to a node-id by the command
> >>>>>> line option. If an user tries to add a CPU to the wrong node, he has
> >>>>>> an error but it is not really helpful:
> >>>>>>
> >>>>>>      qemu-system-ppc64 ... -smp 1,maxcpus=64,cores=1,threads=1,sockets=1 \
> >>>>>>                            -numa node,nodeid=0 -numa node,nodeid=1 ...
> >>>>>>
> >>>>>>      (qemu) device_add power9_v2.0-spapr-cpu-core,core-id=30,node-id=1
> >>>>>>      Error: node-id=1 must match numa node specified with -numa option
> >>>>>>
> >>>>>> This patch improves this error message by giving to the user the good
> >>>>>> topology information (node-id, socket-id and thread-id if they are
> >>>>>> available) to use with the core-id he's providing:
> >>>>>>
> >>>>>>      Error: node-id=1 must match numa node specified with -numa option 'node-id 0'
> >>>>>>
> >>>>>> Signed-off-by: Laurent Vivier <lvivier@redhat.com>
> >>>>>> ---
> >>>>>>
> >>>>>> Notes:
> >>>>>>        v3: only add the topology to the existing message
> >>>>>>            As suggested by Igor replace
> >>>>>>              Error: core-id 30 can only be plugged into node-id 0
> >>>>>>            by
> >>>>>>              Error: node-id=1 must match numa node specified with -numa option 'node-id 0'
> >>>>>>        v2: display full topology in the error message  
> >>>>>>>>>     numa.c | 25 ++++++++++++++++++++++++-  
> >>>>>>     1 file changed, 24 insertions(+), 1 deletion(-)
> >>>>>>
> >>>>>> diff --git a/numa.c b/numa.c
> >>>>>> index 3875e1efda3a..7882ec294be4 100644
> >>>>>> --- a/numa.c
> >>>>>> +++ b/numa.c
> >>>>>> @@ -458,6 +458,27 @@ void qmp_set_numa_node(NumaOptions *cmd, Error **errp)
> >>>>>>         set_numa_options(MACHINE(qdev_get_machine()), cmd, errp);
> >>>>>>     }
> >>>>>> +static char *cpu_topology_to_string(const CPUArchId *cpu)
> >>>>>> +{
> >>>>>> +    GString *s = g_string_new(NULL);
> >>>>>> +    if (cpu->props.has_socket_id) {
> >>>>>> +        g_string_append_printf(s, "socket-id %"PRId64, cpu->props.socket_id);
> >>>>>> +    }
> >>>>>> +    if (cpu->props.has_node_id) {
> >>>>>> +        if (s->len) {
> >>>>>> +            g_string_append_printf(s, ", ");
> >>>>>> +        }
> >>>>>> +        g_string_append_printf(s, "node-id %"PRId64, cpu->props.node_id);
> >>>>>> +    }
> >>>>>> +    if (cpu->props.has_thread_id) {
> >>>>>> +        if (s->len) {
> >>>>>> +            g_string_append_printf(s, ", ");
> >>>>>> +        }
> >>>>>> +        g_string_append_printf(s, "thread-id %"PRId64, cpu->props.thread_id);
> >>>>>> +    }
> >>>>>> +    return g_string_free(s, false);
> >>>>>> +}  
> >>>>>
> >>>>> turns out we already have such helper: cpu_slot_to_string()  
> >>>>
> >>>> It doesn't display the node-id but the core-id. And node-id is what we need
> >>>> to know.  
> >>>
> >>> I'm confused about what you are trying to do here.
> >>>
> >>> On v1, the message looked like:
> >>>     Error: core-id 30 can only be plugged into node-id 0
> >>>
> >>> which is probably good for spapr.
> >>>
> >>>
> >>> Then I suggested you added the other cpu->props fields.  e.g. on
> >>> PC the message would look like:
> >>>     Error: socket-id 20, core-id 30, thread-id 40 can only be plugged into node-id 0
> >>>
> >>>
> >>> But you sent a v2 patch that would print this on PC:
> >>>     Error: core-id 30 can only be plugged into socket-id 20, node-id 0, thread-id 40
> >>>
> >>> which doesn't make sense to me.
> >>>
> >>>
> >>> Then in a reply to v2, Igor suggested:
> >>>
> >>>    error_setg(errp, "node-id=%d must match numa node specified "
> >>>                      "with -numa option '%s'", node_id, topology);
> >>>
> >>>
> >>> Igor suggest would address the problem above.  I expected it to become:
> >>>     node-id=0 must match numa node specified with -numa option core-id=30
> >>> and on PC:
> >>>     node-id=0 must match numa node specified with -numa option socket-id=20,core-id=30,thread-id=40
> >>>
> >>> Or maybe it could include the input node-id too:
> >>>     node-id=0 must match numa node specified with -numa option node-id=1,core-id=30
> >>> and on PC:
> >>>     node-id=0 must match numa node specified with -numa option node-id=1,socket-id=20,core-id=30,thread-id=40
> >>>
> >>> Both options would work.
> >>>
> >>>
> >>> But you implemented code that would print:
> >>>     Error: node-id=0 must match numa node specified with -numa option 'node-id 1'
> >>> and on PC it would print:
> >>>     Error: node-id=0 must match numa node specified with -numa option 'socket-id 20 node-id 1 thread-id=40'
> >>>
> >>> which doesn't make sense to me.
> >>>
> >>>
> >>> I was expecting something like:
> >>>     Error: CPU slot core-id=30 is bound to node-id 0, but node-id 1 was specified
> >>> and on PC:
> >>>     Error: CPU slot socket-id=20,core-id=30,thread-id=40 is bound to node-id 0, but node-id 1 was specified
> >>>
> >>>      
> >>
> >> The idea is to provide the information to the user to help him to know
> >> where the cpu can be plugged when it cannot on the node-id he originally
> >> provided.
> >>
> >> So all the solutions you propose sounds good to me.
> >>
> >> I only need you and Igor agree on the same one.  
> > 
> > We with Eduardo basically agree on contents/set of properties to print,
> > it is only different phrasing (Eduardo's suggestion is better than what we have now).
> > But lets get to what problem you are going to fix/improve. SO I've went ahead and tried
> > with following CLI:
> > 
> >     qemu-system-x86_64 -smp 1,maxcpus=4 -numa node,cpus=0-1 -numa node,cpus=2-3 -monitor stdio -device qemu64-x86_64-cpu,socket-id=1,core-id=0,thread-id=0,node-id=1
> > 
> > end it errored out with:
> > 
> >    qemu-system-x86_64: -device qemu64-x86_64-cpu,socket-id=1,core-id=0,thread-id=0,node-id=1: node-id=1 must match numa node specified with -numa option
> > 
> > As you see we already have all user provide properties for cpu (including invalid ones) reported,
> > what we are missing is suggestion for valid node-id. How about following error message:
> > 
> > qemu-system-x86_64: -device qemu64-x86_64-cpu,socket-id=1,core-id=0,thread-id=0,node-id=1: invalid node-id, must be 0  
> 
> The case I'm worrying about is when the cpu is hotplugged: we don't have the "-device ..." information.
> 
>   $ qemu-system-ppc64 -nodefaults -nographic -monitor stdio -m 1G -smp 1,maxcpus=64,cores=1,threads=1,sockets=1 -numa node,nodeid=0 -numa node,nodeid=1
>   QEMU 3.0.1 monitor - type 'help' for more information
>   (qemu) device_add power8_v2.0-spapr-cpu-core,core-id=30,node-id=1
>   node-id=1 must match numa node specified with -numa option
> 
> So you can see the needed information is missing.

device-add is synchronous command so user (monitor) has a invalid properties command right above error, similar thing applies to QMP where user can match command with reply.
Repeating device properties looks to me like unnecessary date duplication.

> 
> Thanks,
> Laurent