[PATCH v2 20/25] x86: Fix x86_cpu_new() error API violations

Markus Armbruster posted 25 patches 5 years, 4 months ago
There is a newer version of this series
Posted by Markus Armbruster 5 years, 4 months ago
The Error ** argument must be NULL, &error_abort, &error_fatal, or a
pointer to a variable containing NULL.  Passing an argument of the
latter kind twice without clearing it in between is wrong: if the
first call sets an error, it no longer points to NULL for the second
call.

x86_cpu_new() is wrong that way: it passes &local_err to
object_property_set_uint() without checking it, and then to
qdev_realize().  Harmless, because the former can't actually fail
here.

Fix by checking for failure right away.  While there, replace
qdev_realize(); object_unref() by qdev_realize_and_unref().

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
---
 hw/i386/x86.c | 12 +++---------
 1 file changed, 3 insertions(+), 9 deletions(-)

diff --git a/hw/i386/x86.c b/hw/i386/x86.c
index 34229b45c7..3a7029e6db 100644
--- a/hw/i386/x86.c
+++ b/hw/i386/x86.c
@@ -118,16 +118,10 @@ uint32_t x86_cpu_apic_id_from_index(X86MachineState *x86ms,
 
 void x86_cpu_new(X86MachineState *x86ms, int64_t apic_id, Error **errp)
 {
-    Object *cpu = NULL;
-    Error *local_err = NULL;
+    Object *cpu = object_new(MACHINE(x86ms)->cpu_type);
 
-    cpu = object_new(MACHINE(x86ms)->cpu_type);
-
-    object_property_set_uint(cpu, apic_id, "apic-id", &local_err);
-    qdev_realize(DEVICE(cpu), NULL, &local_err);
-
-    object_unref(cpu);
-    error_propagate(errp, local_err);
+    object_property_set_uint(cpu, apic_id, "apic-id", &error_abort);
+    qdev_realize_and_unref(DEVICE(cpu), NULL, errp);
 }
 
 void x86_cpus_init(X86MachineState *x86ms, int default_cpu_version)
-- 
2.26.2
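
The rule in the commit message can be illustrated with a small model. This is
only a sketch under stated assumptions: Error, errp_is_usable(),
toy_error_set() and toy_step() are toy stand-ins invented for illustration,
not QEMU's real API, and the special values &error_abort and &error_fatal are
left out of the model. It shows why reusing &local_err after a failed call
breaks the contract.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-in for QEMU's Error object (assumption, not the real type). */
typedef struct Error {
    const char *msg;
} Error;

/* Hypothetical helper: may this errp be handed to a callee right now? */
static bool errp_is_usable(Error **errp)
{
    /* NULL means "ignore errors"; otherwise it must point to NULL. */
    return errp == NULL || *errp == NULL;
}

/* Modeled on error_setg(): store a new error unless the caller ignores it. */
static void toy_error_set(Error **errp, const char *msg)
{
    Error *err;

    if (errp == NULL) {
        return;
    }
    assert(*errp == NULL);      /* a second error here is the "pileup" */
    err = malloc(sizeof(*err));
    assert(err != NULL);
    err->msg = msg;
    *errp = err;
}

/*
 * One fallible step.  Returns 0 on success, -1 on failure, and -2 when
 * the caller broke the contract (where toy_error_set() would abort).
 */
static int toy_step(bool fail, Error **errp)
{
    if (!errp_is_usable(errp)) {
        return -2;
    }
    if (fail) {
        toy_error_set(errp, "step failed");
        return -1;
    }
    return 0;
}
```

In this model, a first failing toy_step() leaves local_err set, so a second
call with the same &local_err is already a contract violation, whether or not
that second step would itself have failed.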


Re: [PATCH v2 20/25] x86: Fix x86_cpu_new() error API violations
Posted by Igor Mammedov 5 years, 4 months ago
On Wed, 24 Jun 2020 10:37:32 +0200
Markus Armbruster <armbru@redhat.com> wrote:

> The Error ** argument must be NULL, &error_abort, &error_fatal, or a
> pointer to a variable containing NULL.  Passing an argument of the
> latter kind twice without clearing it in between is wrong: if the
> first call sets an error, it no longer points to NULL for the second
> call.
> 
> x86_cpu_new() is wrong that way: it passes &local_err to
> object_property_set_uint() without checking it, and then to
> qdev_realize().  Harmless, because the former can't actually fail
> here.
> 
> Fix by checking for failure right away.  While there, replace
> qdev_realize(); object_unref() by qdev_realize_and_unref().
> 
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: Richard Henderson <rth@twiddle.net>
> Cc: Eduardo Habkost <ehabkost@redhat.com>
> Signed-off-by: Markus Armbruster <armbru@redhat.com>
> ---
>  hw/i386/x86.c | 12 +++---------
>  1 file changed, 3 insertions(+), 9 deletions(-)
> 
> diff --git a/hw/i386/x86.c b/hw/i386/x86.c
> index 34229b45c7..3a7029e6db 100644
> --- a/hw/i386/x86.c
> +++ b/hw/i386/x86.c
> @@ -118,16 +118,10 @@ uint32_t x86_cpu_apic_id_from_index(X86MachineState *x86ms,
>  
>  void x86_cpu_new(X86MachineState *x86ms, int64_t apic_id, Error **errp)
>  {
> -    Object *cpu = NULL;
> -    Error *local_err = NULL;
> +    Object *cpu = object_new(MACHINE(x86ms)->cpu_type);
>  
> -    cpu = object_new(MACHINE(x86ms)->cpu_type);
> -
> -    object_property_set_uint(cpu, apic_id, "apic-id", &local_err);
> -    qdev_realize(DEVICE(cpu), NULL, &local_err);
> -
> -    object_unref(cpu);
> -    error_propagate(errp, local_err);
> +    object_property_set_uint(cpu, apic_id, "apic-id", &error_abort);
it may fail here if user specified wrong cpu flags, but there is nothing we can do to fix it.
perhaps error_fatal would suit this case better?
 

> +    qdev_realize_and_unref(DEVICE(cpu), NULL, errp);
>  }
>  
>  void x86_cpus_init(X86MachineState *x86ms, int default_cpu_version)


Re: [PATCH v2 20/25] x86: Fix x86_cpu_new() error API violations
Posted by Paolo Bonzini 5 years, 4 months ago
On 24/06/20 16:17, Igor Mammedov wrote:
>> -    cpu = object_new(MACHINE(x86ms)->cpu_type);
>> -
>> -    object_property_set_uint(cpu, apic_id, "apic-id", &local_err);
>> -    qdev_realize(DEVICE(cpu), NULL, &local_err);
>> -
>> -    object_unref(cpu);
>> -    error_propagate(errp, local_err);
>> +    object_property_set_uint(cpu, apic_id, "apic-id", &error_abort);
> it may fail here if user specified wrong cpu flags, but there is nothing we can do to fix it.
> perhaps error_fatal would suit this case better?

No, we need to add the error_propagate dance instead.

Paolo
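
For readers unfamiliar with the "error_propagate dance", here is a minimal
sketch of its usual shape: collect errors in a local variable, check it after
every fallible call, and propagate once at the end. All names below
(toy_error_set(), toy_error_propagate(), toy_set_apic_id(), toy_realize(),
toy_cpu_new(), toy_error_is()) are hypothetical stand-ins, not QEMU functions,
and this is not necessarily the code that went into the next revision of the
patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for QEMU's Error object (assumption, not the real type). */
typedef struct Error {
    const char *msg;
} Error;

/* Modeled on error_setg(): the destination must not already hold an error. */
static void toy_error_set(Error **errp, const char *msg)
{
    Error *err;

    if (errp == NULL) {
        return;
    }
    assert(*errp == NULL);
    err = malloc(sizeof(*err));
    assert(err != NULL);
    err->msg = msg;
    *errp = err;
}

/* Modeled on error_propagate(): hand local_err to the caller or drop it. */
static void toy_error_propagate(Error **dst_errp, Error *local_err)
{
    if (local_err == NULL) {
        return;
    }
    if (dst_errp != NULL && *dst_errp == NULL) {
        *dst_errp = local_err;
    } else {
        free(local_err);
    }
}

/* Two fallible steps standing in for the property set and the realize. */
static void toy_set_apic_id(bool fail, Error **errp)
{
    if (fail) {
        toy_error_set(errp, "set failed");
    }
}

static void toy_realize(bool fail, Error **errp)
{
    if (fail) {
        toy_error_set(errp, "realize failed");
    }
}

/* The dance: check local_err right after each call, propagate once. */
static void toy_cpu_new(bool set_fails, bool realize_fails, Error **errp)
{
    Error *local_err = NULL;

    toy_set_apic_id(set_fails, &local_err);
    if (local_err != NULL) {
        goto out;
    }
    toy_realize(realize_fails, &local_err);
out:
    toy_error_propagate(errp, local_err);
}

/* Convenience check used when inspecting the result. */
static bool toy_error_is(const Error *err, const char *msg)
{
    return err != NULL && strcmp(err->msg, msg) == 0;
}
```

With both steps forced to fail, the caller sees only the first error and the
second step never runs, so no error is ever stored into a non-NULL local_err.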


Re: [PATCH v2 20/25] x86: Fix x86_cpu_new() error API violations
Posted by Igor Mammedov 5 years, 4 months ago
On Wed, 24 Jun 2020 16:20:16 +0200
Paolo Bonzini <pbonzini@redhat.com> wrote:

> On 24/06/20 16:17, Igor Mammedov wrote:
> >> -    cpu = object_new(MACHINE(x86ms)->cpu_type);
> >> -
> >> -    object_property_set_uint(cpu, apic_id, "apic-id", &local_err);
> >> -    qdev_realize(DEVICE(cpu), NULL, &local_err);
> >> -
> >> -    object_unref(cpu);
> >> -    error_propagate(errp, local_err);
> >> +    object_property_set_uint(cpu, apic_id, "apic-id", &error_abort);  
> > it may fail here if user specified wrong cpu flags, but there is nothing we can do to fix it.
> > perhaps error_fatal would suit this case better?  
> 
> No, we need to add the error_propagate dance instead.

yep, it can be used by legacy cpu-add, so just dying isn't an option.

we need to deprecate cpu-add, since device-add has been supported by all
interested boards for quite a while; once it's gone, we can use error_fatal here.


> 
> Paolo
> 


Re: [PATCH v2 20/25] x86: Fix x86_cpu_new() error API violations
Posted by Markus Armbruster 5 years, 4 months ago
Paolo Bonzini <pbonzini@redhat.com> writes:

> On 24/06/20 16:17, Igor Mammedov wrote:
>>> -    cpu = object_new(MACHINE(x86ms)->cpu_type);
>>> -
>>> -    object_property_set_uint(cpu, apic_id, "apic-id", &local_err);
>>> -    qdev_realize(DEVICE(cpu), NULL, &local_err);
>>> -
>>> -    object_unref(cpu);
>>> -    error_propagate(errp, local_err);
>>> +    object_property_set_uint(cpu, apic_id, "apic-id", &error_abort);
>> it may fail here if user specified wrong cpu flags, but there is nothing we can do to fix it.
>> perhaps error_fatal would suit this case better?
>
> No, we need to add the error_propagate dance instead.

Thanks, will dance!


Re: [PATCH v2 20/25] x86: Fix x86_cpu_new() error API violations
Posted by Markus Armbruster 5 years, 4 months ago
Igor Mammedov <imammedo@redhat.com> writes:

> On Wed, 24 Jun 2020 10:37:32 +0200
> Markus Armbruster <armbru@redhat.com> wrote:
>
>> The Error ** argument must be NULL, &error_abort, &error_fatal, or a
>> pointer to a variable containing NULL.  Passing an argument of the
>> latter kind twice without clearing it in between is wrong: if the
>> first call sets an error, it no longer points to NULL for the second
>> call.
>> 
>> x86_cpu_new() is wrong that way: it passes &local_err to
>> object_property_set_uint() without checking it, and then to
>> qdev_realize().  Harmless, because the former can't actually fail
>> here.
>> 
>> Fix by checking for failure right away.  While there, replace
>> qdev_realize(); object_unref() by qdev_realize_and_unref().
>> 
>> Cc: Paolo Bonzini <pbonzini@redhat.com>
>> Cc: Richard Henderson <rth@twiddle.net>
>> Cc: Eduardo Habkost <ehabkost@redhat.com>
>> Signed-off-by: Markus Armbruster <armbru@redhat.com>
>> ---
>>  hw/i386/x86.c | 12 +++---------
>>  1 file changed, 3 insertions(+), 9 deletions(-)
>> 
>> diff --git a/hw/i386/x86.c b/hw/i386/x86.c
>> index 34229b45c7..3a7029e6db 100644
>> --- a/hw/i386/x86.c
>> +++ b/hw/i386/x86.c
>> @@ -118,16 +118,10 @@ uint32_t x86_cpu_apic_id_from_index(X86MachineState *x86ms,
>>  
>>  void x86_cpu_new(X86MachineState *x86ms, int64_t apic_id, Error **errp)
>>  {
>> -    Object *cpu = NULL;
>> -    Error *local_err = NULL;
>> +    Object *cpu = object_new(MACHINE(x86ms)->cpu_type);
>>  
>> -    cpu = object_new(MACHINE(x86ms)->cpu_type);
>> -
>> -    object_property_set_uint(cpu, apic_id, "apic-id", &local_err);
>> -    qdev_realize(DEVICE(cpu), NULL, &local_err);
>> -
>> -    object_unref(cpu);
>> -    error_propagate(errp, local_err);
>> +    object_property_set_uint(cpu, apic_id, "apic-id", &error_abort);
> it may fail here if user specified wrong cpu flags, but there is nothing we can do to fix it.

Really?

object_property_set_uint() fails when property "apic-id" doesn't exist,
has no ->set() method, or its ->set() fails.

Property "apic-id" is defined in x86_cpu_properties[] as

    DEFINE_PROP_UINT32("apic-id", X86CPU, apic_id, UNASSIGNED_APIC_ID),

This means "apic-id" exists, and its ->set() is set_uint32().  That
leaves only set_uint32() as possible source of failure.

It fails when

* the device is already realized: programming error

* the value to be stored is not an integer: object_property_set_uint()
  makes it one, can't fail

* the value is not representable as uint32_t: @apic_id is declared as
  int64_t, but:

  - pc_hot_add_cpu() passes x86_cpu_apic_id_from_index(), which is
    uint32_t, converted to int64_t.  Can't fail.

  - x86_cpus_init() passes possible_cpus->cpus[i].arch_id, which is
    uint64_t.  Is this the "if user specified wrong cpu flags" case?

  Aside: should the integer types be cleaned up?
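
The representability argument above can be sketched as a range check. This is
an illustration only: fits_uint32(), apic_id_param_fits() and
check_hot_add_case() are hypothetical helpers, and the assumption that the
value reaches the property setter as an unsigned 64-bit integer is mine.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: can a 64-bit value be stored in a uint32 property? */
static bool fits_uint32(uint64_t value)
{
    return value <= UINT32_MAX;
}

/*
 * Model of the call chain: the int64_t parameter is converted to an
 * unsigned 64-bit value before the range check (assumption).
 */
static bool apic_id_param_fits(int64_t apic_id)
{
    return fits_uint32((uint64_t)apic_id);
}

/* The hot-add case: a uint32_t widened to int64_t always fits again. */
static void check_hot_add_case(void)
{
    uint32_t from_index = UINT32_MAX;   /* largest possible uint32_t */

    assert(apic_id_param_fits((int64_t)from_index));
}
```

Under these assumptions, only a caller that starts from a genuinely 64-bit
value, such as the arch_id path, could hand over something out of range.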

To assess the bug's impact, we need to know when the other call in this
error pileup fails.  If we can make both fail, we have a crash bug.
Else, we have a harmless API violation.

Any ideas on how to make the qdev_realize() fail?

[...]


Re: [PATCH v2 20/25] x86: Fix x86_cpu_new() error API violations
Posted by Igor Mammedov 5 years, 4 months ago
On Fri, 26 Jun 2020 14:54:38 +0200
Markus Armbruster <armbru@redhat.com> wrote:

> Igor Mammedov <imammedo@redhat.com> writes:
> 
> > On Wed, 24 Jun 2020 10:37:32 +0200
> > Markus Armbruster <armbru@redhat.com> wrote:
> >  
> >> The Error ** argument must be NULL, &error_abort, &error_fatal, or a
> >> pointer to a variable containing NULL.  Passing an argument of the
> >> latter kind twice without clearing it in between is wrong: if the
> >> first call sets an error, it no longer points to NULL for the second
> >> call.
> >> 
> >> x86_cpu_new() is wrong that way: it passes &local_err to
> >> object_property_set_uint() without checking it, and then to
> >> qdev_realize().  Harmless, because the former can't actually fail
> >> here.
> >> 
> >> Fix by checking for failure right away.  While there, replace
> >> qdev_realize(); object_unref() by qdev_realize_and_unref().
> >> 
> >> Cc: Paolo Bonzini <pbonzini@redhat.com>
> >> Cc: Richard Henderson <rth@twiddle.net>
> >> Cc: Eduardo Habkost <ehabkost@redhat.com>
> >> Signed-off-by: Markus Armbruster <armbru@redhat.com>
> >> ---
> >>  hw/i386/x86.c | 12 +++---------
> >>  1 file changed, 3 insertions(+), 9 deletions(-)
> >> 
> >> diff --git a/hw/i386/x86.c b/hw/i386/x86.c
> >> index 34229b45c7..3a7029e6db 100644
> >> --- a/hw/i386/x86.c
> >> +++ b/hw/i386/x86.c
> >> @@ -118,16 +118,10 @@ uint32_t x86_cpu_apic_id_from_index(X86MachineState *x86ms,
> >>  
> >>  void x86_cpu_new(X86MachineState *x86ms, int64_t apic_id, Error **errp)
> >>  {
> >> -    Object *cpu = NULL;
> >> -    Error *local_err = NULL;
> >> +    Object *cpu = object_new(MACHINE(x86ms)->cpu_type);
> >>  
> >> -    cpu = object_new(MACHINE(x86ms)->cpu_type);
> >> -
> >> -    object_property_set_uint(cpu, apic_id, "apic-id", &local_err);
> >> -    qdev_realize(DEVICE(cpu), NULL, &local_err);
> >> -
> >> -    object_unref(cpu);
> >> -    error_propagate(errp, local_err);
> >> +    object_property_set_uint(cpu, apic_id, "apic-id", &error_abort);  
> > it may fail here if user specified wrong cpu flags, but there is nothing we can do to fix it.  
> 
> Really?
> 
> object_property_set_uint() fails when property "apic-id" doesn't exist,
> has no ->set() method, or its ->set() fails.
> 
> Property "apic-id" is defined in x86_cpu_properties[] as
> 
>     DEFINE_PROP_UINT32("apic-id", X86CPU, apic_id, UNASSIGNED_APIC_ID),
> 
> This means "apic-id" exists, and its ->set() is set_uint32().  That
> leaves only set_uint32() as possible source of failure.
> 
> It fails when
> 
> * the device is already realized: programming error
> 
> * the value to be stored is not an integer: object_property_set_uint()
>   makes it one, can't fail
> 
> * the value is not representable as uint32_t: @apic_id is declared as
>   int64_t, but:
> 
>   - pc_hot_add_cpu() passes x86_cpu_apic_id_from_index(), which is
>     uint32_t, converted to int64_t.  Can't fail.
> 
>   - x86_cpus_init() passes possible_cpus->cpus[i].arch_id, which is
>     uint64_t.  Is this the "if user specified wrong cpu flags" case?

looking at it more closely, object_property_set_uint() can't really fail

>   Aside: should the integer types be cleaned up?

apic_id is an x86-specific subset of .arch_id.
The latter is used by other targets, which may need an integer larger than 32 bits
(if I recall correctly, virt-arm uses a 64-bit id).


> To assess the bug's impact, we need to know when the other call in this
> error pileup fails.  If we can make both fail, we have a crash bug.
> Else, we have a harmless API violation.
> 
> Any ideas on how to make the qdev_realize() fail?
qemu CLI case
  QEMU -cpu qemu64,enforce,topoext

legacy hotplug case:
  QEMU -smp 1,maxcpus=2
  (monitor) cpu-add 1
  (monitor) cpu-add 1  <= fail
 



Re: [PATCH v2 20/25] x86: Fix x86_cpu_new() error API violations
Posted by Markus Armbruster 5 years, 4 months ago
Igor Mammedov <imammedo@redhat.com> writes:

> On Fri, 26 Jun 2020 14:54:38 +0200
> Markus Armbruster <armbru@redhat.com> wrote:
>
>> Igor Mammedov <imammedo@redhat.com> writes:
>> 
>> > On Wed, 24 Jun 2020 10:37:32 +0200
>> > Markus Armbruster <armbru@redhat.com> wrote:
>> >  
>> >> The Error ** argument must be NULL, &error_abort, &error_fatal, or a
>> >> pointer to a variable containing NULL.  Passing an argument of the
>> >> latter kind twice without clearing it in between is wrong: if the
>> >> first call sets an error, it no longer points to NULL for the second
>> >> call.
>> >> 
>> >> x86_cpu_new() is wrong that way: it passes &local_err to
>> >> object_property_set_uint() without checking it, and then to
>> >> qdev_realize().  Harmless, because the former can't actually fail
>> >> here.
>> >> 
>> >> Fix by checking for failure right away.  While there, replace
>> >> qdev_realize(); object_unref() by qdev_realize_and_unref().
>> >> 
>> >> Cc: Paolo Bonzini <pbonzini@redhat.com>
>> >> Cc: Richard Henderson <rth@twiddle.net>
>> >> Cc: Eduardo Habkost <ehabkost@redhat.com>
>> >> Signed-off-by: Markus Armbruster <armbru@redhat.com>
>> >> ---
>> >>  hw/i386/x86.c | 12 +++---------
>> >>  1 file changed, 3 insertions(+), 9 deletions(-)
>> >> 
>> >> diff --git a/hw/i386/x86.c b/hw/i386/x86.c
>> >> index 34229b45c7..3a7029e6db 100644
>> >> --- a/hw/i386/x86.c
>> >> +++ b/hw/i386/x86.c
>> >> @@ -118,16 +118,10 @@ uint32_t x86_cpu_apic_id_from_index(X86MachineState *x86ms,
>> >>  
>> >>  void x86_cpu_new(X86MachineState *x86ms, int64_t apic_id, Error **errp)
>> >>  {
>> >> -    Object *cpu = NULL;
>> >> -    Error *local_err = NULL;
>> >> +    Object *cpu = object_new(MACHINE(x86ms)->cpu_type);
>> >>  
>> >> -    cpu = object_new(MACHINE(x86ms)->cpu_type);
>> >> -
>> >> -    object_property_set_uint(cpu, apic_id, "apic-id", &local_err);
>> >> -    qdev_realize(DEVICE(cpu), NULL, &local_err);
>> >> -
>> >> -    object_unref(cpu);
>> >> -    error_propagate(errp, local_err);
>> >> +    object_property_set_uint(cpu, apic_id, "apic-id", &error_abort);  
>> > it may fail here if user specified wrong cpu flags, but there is nothing we can do to fix it.  
>> 
>> Really?
>> 
>> object_property_set_uint() fails when property "apic-id" doesn't exist,
>> has no ->set() method, or its ->set() fails.
>> 
>> Property "apic-id" is defined in x86_cpu_properties[] as
>> 
>>     DEFINE_PROP_UINT32("apic-id", X86CPU, apic_id, UNASSIGNED_APIC_ID),
>> 
>> This means "apic-id" exists, and its ->set() is set_uint32().  That
>> leaves only set_uint32() as possible source of failure.
>> 
>> It fails when
>> 
>> * the device is already realized: programming error
>> 
>> * the value to be stored is not an integer: object_property_set_uint()
>>   makes it one, can't fail
>> 
>> * the value is not representable as uint32_t: @apic_id is declared as
>>   int64_t, but:
>> 
>>   - pc_hot_add_cpu() passes x86_cpu_apic_id_from_index(), which is
>>     uint32_t, converted to int64_t.  Can't fail.
>> 
>>   - x86_cpus_init() passes possible_cpus->cpus[i].arch_id, which is
>>     uint64_t.  Is this the "if user specified wrong cpu flags" case?
>
> looking at it more closely, object_property_set_uint() can't really fail

Correct.

>>   Aside: should the integer types be cleaned up?
>
> apic_id is an x86-specific subset of .arch_id.
> The latter is used by other targets, which may need an integer larger than 32 bits
> (if I recall correctly, virt-arm uses a 64-bit id).

I trust this works and makes sense, but the implicit conversions still
give me an uneasy feeling.

>> To assess the bug's impact, we need to know when the other call in this
>> error pileup fails.  If we can make both fail, we have a crash bug.
>> Else, we have a harmless API violation.
>> 
>> Any ideas on how to make the qdev_realize() fail?
> qemu CLI case
>   QEMU -cpu qemu64,enforce,topoext
>
> legacy hotplug case:
>   QEMU -smp 1,maxcpus=2
>   (monitor) cpu-add 1
>   (monitor) cpu-add 1  <= fail

Testing:

    $ qemu-system-x86_64 -nodefaults -display none -S -monitor stdio -smp 1,maxcpus=2
    QEMU 5.0.50 monitor - type 'help' for more information
    (qemu) cpu-add 1
    cpu_add is deprecated, please use device_add instead
    (qemu) cpu-add 1
    cpu_add is deprecated, please use device_add instead
    Error: CPU[1] with APIC ID 1 exists
    (qemu) 

We're good.

    $ qemu-system-x86_64 -cpu qemu64,enforce,topoext
    qemu-system-x86_64: warning: TCG doesn't support requested feature: CPUID.80000001H:ECX.topoext [bit 22]
    qemu-system-x86_64: TCG doesn't support requested features
    [Exit 1 ]

Are we good?

To finish the job in time for the freeze, I made do with this
non-assessment (commit 18d588fe1e1):

    To assess the bug's impact, we'd need to figure out how to make both
    calls fail.  Too much work for ignorant me, sorry.

Thanks!