Currently, the instance_post_init calls are performed from the leaf
class and all the way up to Object. This is incorrect because the
leaf class cannot observe property values applied by the superclasses;
for example, a compat property will be set on a device *after*
the class's post_init callback has run.
In particular this makes it impossible for implementations of
accel_cpu_instance_init() to operate based on the actual values of
the properties, though it seems that cxl_dsp_instance_post_init and
rp_instance_post_init might have similar issues.
Follow instead the same order as instance_init, starting with Object
and running the child class's instance_post_init after the parent.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
include/qom/object.h | 3 ++-
qom/object.c | 8 ++++----
2 files changed, 6 insertions(+), 5 deletions(-)
diff --git a/include/qom/object.h b/include/qom/object.h
index 1d5b0337242..26df6137b91 100644
--- a/include/qom/object.h
+++ b/include/qom/object.h
@@ -445,7 +445,8 @@ struct Object
* class will have already been initialized so the type is only responsible
* for initializing its own members.
* @instance_post_init: This function is called to finish initialization of
- * an object, after all @instance_init functions were called.
+ * an object, after all @instance_init functions were called, as well as
+ * @instance_post_init functions for the parent classes.
* @instance_finalize: This function is called during object destruction. This
* is called before the parent @instance_finalize function has been called.
* An object should only free the members that are unique to its type in this
diff --git a/qom/object.c b/qom/object.c
index 7b013f40a0c..1856bb36c74 100644
--- a/qom/object.c
+++ b/qom/object.c
@@ -431,13 +431,13 @@ static void object_init_with_type(Object *obj, TypeImpl *ti)
static void object_post_init_with_type(Object *obj, TypeImpl *ti)
{
- if (ti->instance_post_init) {
- ti->instance_post_init(obj);
- }
-
if (type_has_parent(ti)) {
object_post_init_with_type(obj, type_get_parent(ti));
}
+
+ if (ti->instance_post_init) {
+ ti->instance_post_init(obj);
+ }
}
bool object_apply_global_props(Object *obj, const GPtrArray *props,
--
2.49.0
This commit may broken the "vendor=" configuration.
For instance, the hypervisor CPU vendor is AMD.
I am going to use "-cpu Skylake-Server,vendor=GenuineIntel".
Because of the commit, the vendor is still AMD.
[root@vm ~]# cpuid -1 -l 0x0
CPU:
vendor_id = "AuthenticAMD"
If I revert this patch, the vendor because the expected Intel.
[root@vm ~]# cpuid -1 -l 0x0
CPU:
vendor_id = "GenuineIntel"
Thank you very much!
Dongli Zhang
On 5/20/25 4:05 AM, Paolo Bonzini wrote:
> Currently, the instance_post_init calls are performed from the leaf
> class and all the way up to Object. This is incorrect because the
> leaf class cannot observe property values applied by the superclasses;
> for example, a compat property will be set on a device *after*
> the class's post_init callback has run.
>
> In particular this makes it impossible for implementations of
> accel_cpu_instance_init() to operate based on the actual values of
> the properties, though it seems that cxl_dsp_instance_post_init and
> rp_instance_post_init might have similar issues.
>
> Follow instead the same order as instance_init, starting with Object
> and running the child class's instance_post_init after the parent.
>
> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
> Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
> include/qom/object.h | 3 ++-
> qom/object.c | 8 ++++----
> 2 files changed, 6 insertions(+), 5 deletions(-)
>
> diff --git a/include/qom/object.h b/include/qom/object.h
> index 1d5b0337242..26df6137b91 100644
> --- a/include/qom/object.h
> +++ b/include/qom/object.h
> @@ -445,7 +445,8 @@ struct Object
> * class will have already been initialized so the type is only responsible
> * for initializing its own members.
> * @instance_post_init: This function is called to finish initialization of
> - * an object, after all @instance_init functions were called.
> + * an object, after all @instance_init functions were called, as well as
> + * @instance_post_init functions for the parent classes.
> * @instance_finalize: This function is called during object destruction. This
> * is called before the parent @instance_finalize function has been called.
> * An object should only free the members that are unique to its type in this
> diff --git a/qom/object.c b/qom/object.c
> index 7b013f40a0c..1856bb36c74 100644
> --- a/qom/object.c
> +++ b/qom/object.c
> @@ -431,13 +431,13 @@ static void object_init_with_type(Object *obj, TypeImpl *ti)
>
> static void object_post_init_with_type(Object *obj, TypeImpl *ti)
> {
> - if (ti->instance_post_init) {
> - ti->instance_post_init(obj);
> - }
> -
> if (type_has_parent(ti)) {
> object_post_init_with_type(obj, type_get_parent(ti));
> }
> +
> + if (ti->instance_post_init) {
> + ti->instance_post_init(obj);
> + }
> }
>
> bool object_apply_global_props(Object *obj, const GPtrArray *props,
On Mon, Jun 23, 2025 at 09:56:14AM -0700, Dongli Zhang wrote:
> Date: Mon, 23 Jun 2025 09:56:14 -0700
> From: Dongli Zhang <dongli.zhang@oracle.com>
> Subject: [Regression] Re: [PULL 35/35] qom: reverse order of
> instance_post_init calls
>
> This commit may broken the "vendor=" configuration.
>
> For instance, the hypervisor CPU vendor is AMD.
>
> I am going to use "-cpu Skylake-Server,vendor=GenuineIntel".
>
>
> Because of the commit, the vendor is still AMD.
>
> [root@vm ~]# cpuid -1 -l 0x0
> CPU:
> vendor_id = "AuthenticAMD"
>
>
> If I revert this patch, the vendor because the expected Intel.
>
> [root@vm ~]# cpuid -1 -l 0x0
> CPU:
> vendor_id = "GenuineIntel"
>
>
> Thank you very much!
Thank you Dongli!
(+Like)
While testing my cache model series, I also noticed the similar behavior
for KVM. Additionally, Like Xu reported to me that this commit caused
a failure in a KVM unit test case. Your report helped me connect these
two issues I met (though due to my environment issues, I haven't
confirmed yet).
The "vendor" property from cli is registered as the global property in
x86_cpu_parse_featurestr(), and is applied to x86 CPUs in
device_post_init().
With this commit, now KVM will override the "vendor" in
host_cpu_instance_init() (called in x86_cpu_post_initfn()) after
device_post_init(), regardless the previous global "vendor" property.
Back to this commit, I think current order of post_init makes sense.
Instead, the place of host_cpu_instance_init() doesn't seem quite
right. So, I think this commit might have exposed some drawbacks in the
previous x86 CPU initialization order:
f5cc5a5c1686 ("i386: split cpu accelerators from cpu.c, using AccelCPUClass")
5b8978d80426 ("i386: do not call cpudef-only models functions for max, host, base")
(cc Thomas for bug reporting on kvm-unit-test...)
On Tue, Jun 24, 2025 at 04:57:21PM +0800, Zhao Liu wrote:
> Date: Tue, 24 Jun 2025 16:57:21 +0800
> From: Zhao Liu <zhao1.liu@intel.com>
> Subject: Re: [Regression] Re: [PULL 35/35] qom: reverse order of
> instance_post_init calls
>
> On Mon, Jun 23, 2025 at 09:56:14AM -0700, Dongli Zhang wrote:
> > Date: Mon, 23 Jun 2025 09:56:14 -0700
> > From: Dongli Zhang <dongli.zhang@oracle.com>
> > Subject: [Regression] Re: [PULL 35/35] qom: reverse order of
> > instance_post_init calls
> >
> > This commit may broken the "vendor=" configuration.
> >
> > For instance, the hypervisor CPU vendor is AMD.
> >
> > I am going to use "-cpu Skylake-Server,vendor=GenuineIntel".
> >
> >
> > Because of the commit, the vendor is still AMD.
> >
> > [root@vm ~]# cpuid -1 -l 0x0
> > CPU:
> > vendor_id = "AuthenticAMD"
> >
> >
> > If I revert this patch, the vendor because the expected Intel.
> >
> > [root@vm ~]# cpuid -1 -l 0x0
> > CPU:
> > vendor_id = "GenuineIntel"
> >
> >
> > Thank you very much!
>
> Thank you Dongli!
>
> (+Like)
>
> While testing my cache model series, I also noticed the similar behavior
> for KVM. Additionally, Like Xu reported to me that this commit caused
> a failure in a KVM unit test case. Your report helped me connect these
> two issues I met (though due to my environment issues, I haven't
> confirmed yet).
Ok, now I can confirm this commit cause KUT failure:
* On AMD platform, the "msr.flat" case fails since this case requires
vendor=GenuineIntel (tested by Like).
* On Intel platform, the "syscall.flat" case fails because it requires
vendor=AuthenticAMD (tested by myself).
> The "vendor" property from cli is registered as the global property in
> x86_cpu_parse_featurestr(), and is applied to x86 CPUs in
> device_post_init().
>
> With this commit, now KVM will override the "vendor" in
> host_cpu_instance_init() (called in x86_cpu_post_initfn()) after
> device_post_init(), regardless the previous global "vendor" property.
This is the root cause for the above failure.
> Back to this commit, I think current order of post_init makes sense.
> Instead, the place of host_cpu_instance_init() doesn't seem quite
> right. So, I think this commit might have exposed some drawbacks in the
> previous x86 CPU initialization order:
>
> f5cc5a5c1686 ("i386: split cpu accelerators from cpu.c, using AccelCPUClass")
> 5b8978d80426 ("i386: do not call cpudef-only models functions for max, host, base")
To fix this issue, we need to initialize "vendor" property in the initfn
of max/host/named CPUs instead of current post_initfn.
This will need to split the cpu_instance_init() of x86 kvm (and maybe hvf/tcg)
into 2 hooks:
* AccelCPUClass.cpu_instance_init() - called in x86 CPUs' initfn.
* AccelCPUClass.cpu_instance_post_init() - called in x86 CPUs'
post_initfn.
I just did a POC and this solution works in principle.
But there are very many more details here, including considering more
properties/considering hvf and tcg and so on. It still need to take more
time for me figuring everything out. Once that's done, I'll post a series
to fix this issue.
Thanks,
Zhao
On 6/30/2025 11:22 PM, Zhao Liu wrote:
> (cc Thomas for bug reporting on kvm-unit-test...)
>
> On Tue, Jun 24, 2025 at 04:57:21PM +0800, Zhao Liu wrote:
>> Date: Tue, 24 Jun 2025 16:57:21 +0800
>> From: Zhao Liu <zhao1.liu@intel.com>
>> Subject: Re: [Regression] Re: [PULL 35/35] qom: reverse order of
>> instance_post_init calls
>>
>> On Mon, Jun 23, 2025 at 09:56:14AM -0700, Dongli Zhang wrote:
>>> Date: Mon, 23 Jun 2025 09:56:14 -0700
>>> From: Dongli Zhang <dongli.zhang@oracle.com>
>>> Subject: [Regression] Re: [PULL 35/35] qom: reverse order of
>>> instance_post_init calls
>>>
>>> This commit may broken the "vendor=" configuration.
>>>
>>> For instance, the hypervisor CPU vendor is AMD.
>>>
>>> I am going to use "-cpu Skylake-Server,vendor=GenuineIntel".
>>>
>>>
>>> Because of the commit, the vendor is still AMD.
>>>
>>> [root@vm ~]# cpuid -1 -l 0x0
>>> CPU:
>>> vendor_id = "AuthenticAMD"
>>>
>>>
>>> If I revert this patch, the vendor because the expected Intel.
>>>
>>> [root@vm ~]# cpuid -1 -l 0x0
>>> CPU:
>>> vendor_id = "GenuineIntel"
>>>
>>>
>>> Thank you very much!
>>
>> Thank you Dongli!
>>
>> (+Like)
>>
>> While testing my cache model series, I also noticed the similar behavior
>> for KVM. Additionally, Like Xu reported to me that this commit caused
>> a failure in a KVM unit test case. Your report helped me connect these
>> two issues I met (though due to my environment issues, I haven't
>> confirmed yet).
>
> Ok, now I can confirm this commit cause KUT failure:
> * On AMD platform, the "msr.flat" case fails since this case requires
> vendor=GenuineIntel (tested by Like).
> * On Intel platform, the "syscall.flat" case fails because it requires
> vendor=AuthenticAMD (tested by myself).
>
>> The "vendor" property from cli is registered as the global property in
>> x86_cpu_parse_featurestr(), and is applied to x86 CPUs in
>> device_post_init().
>>
>> With this commit, now KVM will override the "vendor" in
>> host_cpu_instance_init() (called in x86_cpu_post_initfn()) after
>> device_post_init(), regardless the previous global "vendor" property.
>
> This is the root cause for the above failure.
>
>> Back to this commit, I think current order of post_init makes sense.
>> Instead, the place of host_cpu_instance_init() doesn't seem quite
>> right. So, I think this commit might have exposed some drawbacks in the
>> previous x86 CPU initialization order:
>>
>> f5cc5a5c1686 ("i386: split cpu accelerators from cpu.c, using AccelCPUClass")
>> 5b8978d80426 ("i386: do not call cpudef-only models functions for max, host, base")
>
> To fix this issue, we need to initialize "vendor" property in the initfn
> of max/host/named CPUs instead of current post_initfn.
>
> This will need to split the cpu_instance_init() of x86 kvm (and maybe hvf/tcg)
> into 2 hooks:
> * AccelCPUClass.cpu_instance_init() - called in x86 CPUs' initfn.
> * AccelCPUClass.cpu_instance_post_init() - called in x86 CPUs'
> post_initfn.
Split accel.cpu_instance_init() into cpu's instance_init() and
post_instance_init() does not seem right way to go.
The reason .post_instance_init() was implemented and put
accel_cpu_instance_init() in it for x86 cpu was that, we don't want to
scatter acceletor specific instance_init operation into different
subclass of x86 cpu (max/host/named cpu model).
I think something like below should be enough.
-----------8<-------------
Author: Xiaoyao Li <xiaoyao.li@intel.com>
Date: Tue Jul 1 13:33:43 2025 +0800
i386/cpu: Re-apply the global props as the last step of post_init
Commit 220c739903ce ("qom: reverse order of instance_post_init calls")
reverses the order instance_post_init calls, which leads to
device_post_init() called before x86 cpu specific
.instance_post_init().
However, x86 cpu replies on qdev_prop_set_globals() (inside
device_post_init()) to apply the cpu option like "feature[=foo]" passed
via '-cpu' as the last step to make the '-cpu' option highest priority.
After the order change of .instance_post_init(), x86_cpu_post_initfn()
is called after device_post_init(), and it will change some property
value even though "-cpu" option specify a different one.
Re-apply the global props as the last step to ensure "-cpu" option
always takes highest priority.
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 0d35e95430fe..bf290262cbfe 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -9044,6 +9044,12 @@ static void x86_cpu_post_initfn(Object *obj)
X86_CONFIDENTIAL_GUEST(current_machine->cgs), (CPU(obj)));
}
#endif
+
+ /*
+ * Re-apply the "feature[=foo]" from '-cpu' option since they might
+ * be overwritten by above
+ */
+ qdev_prop_set_globals(DEVICE(obj));
}
static void x86_cpu_init_default_topo(X86CPU *cpu)
On 1/7/25 08:50, Xiaoyao Li wrote:
> On 6/30/2025 11:22 PM, Zhao Liu wrote:
>> (cc Thomas for bug reporting on kvm-unit-test...)
>>
>> On Tue, Jun 24, 2025 at 04:57:21PM +0800, Zhao Liu wrote:
>>> Date: Tue, 24 Jun 2025 16:57:21 +0800
>>> From: Zhao Liu <zhao1.liu@intel.com>
>>> Subject: Re: [Regression] Re: [PULL 35/35] qom: reverse order of
>>> instance_post_init calls
>>>
>>> On Mon, Jun 23, 2025 at 09:56:14AM -0700, Dongli Zhang wrote:
>>>> Date: Mon, 23 Jun 2025 09:56:14 -0700
>>>> From: Dongli Zhang <dongli.zhang@oracle.com>
>>>> Subject: [Regression] Re: [PULL 35/35] qom: reverse order of
>>>> instance_post_init calls
>>>>
>>>> This commit may broken the "vendor=" configuration.
>>>>
>>>> For instance, the hypervisor CPU vendor is AMD.
>>>>
>>>> I am going to use "-cpu Skylake-Server,vendor=GenuineIntel".
>>>>
>>>>
>>>> Because of the commit, the vendor is still AMD.
>>>>
>>>> [root@vm ~]# cpuid -1 -l 0x0
>>>> CPU:
>>>> vendor_id = "AuthenticAMD"
>>>>
>>>>
>>>> If I revert this patch, the vendor because the expected Intel.
>>>>
>>>> [root@vm ~]# cpuid -1 -l 0x0
>>>> CPU:
>>>> vendor_id = "GenuineIntel"
>>>>
>>>>
>>>> Thank you very much!
>>>
>>> Thank you Dongli!
>>>
>>> (+Like)
>>>
>>> While testing my cache model series, I also noticed the similar behavior
>>> for KVM. Additionally, Like Xu reported to me that this commit caused
>>> a failure in a KVM unit test case. Your report helped me connect these
>>> two issues I met (though due to my environment issues, I haven't
>>> confirmed yet).
>>
>> Ok, now I can confirm this commit cause KUT failure:
>> * On AMD platform, the "msr.flat" case fails since this case requires
>> vendor=GenuineIntel (tested by Like).
>> * On Intel platform, the "syscall.flat" case fails because it requires
>> vendor=AuthenticAMD (tested by myself).
>>
>>> The "vendor" property from cli is registered as the global property in
>>> x86_cpu_parse_featurestr(), and is applied to x86 CPUs in
>>> device_post_init().
>>>
>>> With this commit, now KVM will override the "vendor" in
>>> host_cpu_instance_init() (called in x86_cpu_post_initfn()) after
>>> device_post_init(), regardless the previous global "vendor" property.
>>
>> This is the root cause for the above failure.
>>
>>> Back to this commit, I think current order of post_init makes sense.
>>> Instead, the place of host_cpu_instance_init() doesn't seem quite
>>> right. So, I think this commit might have exposed some drawbacks in the
>>> previous x86 CPU initialization order:
>>>
>>> f5cc5a5c1686 ("i386: split cpu accelerators from cpu.c, using
>>> AccelCPUClass")
>>> 5b8978d80426 ("i386: do not call cpudef-only models functions for
>>> max, host, base")
>>
>> To fix this issue, we need to initialize "vendor" property in the initfn
>> of max/host/named CPUs instead of current post_initfn.
>>
>> This will need to split the cpu_instance_init() of x86 kvm (and maybe
>> hvf/tcg)
>> into 2 hooks:
>> * AccelCPUClass.cpu_instance_init() - called in x86 CPUs' initfn.
>> * AccelCPUClass.cpu_instance_post_init() - called in x86 CPUs'
>> post_initfn.
>
> Split accel.cpu_instance_init() into cpu's instance_init() and
> post_instance_init() does not seem right way to go.
Yeah, please don't. I'm trying to consolidate this code but it takes
(too) long.
> The reason .post_instance_init() was implemented and put
> accel_cpu_instance_init() in it for x86 cpu was that, we don't want to
> scatter acceletor specific instance_init operation into different
> subclass of x86 cpu (max/host/named cpu model).
>
> I think something like below should be enough.
>
> -----------8<-------------
> Author: Xiaoyao Li <xiaoyao.li@intel.com>
> Date: Tue Jul 1 13:33:43 2025 +0800
>
> i386/cpu: Re-apply the global props as the last step of post_init
>
> Commit 220c739903ce ("qom: reverse order of instance_post_init calls")
> reverses the order instance_post_init calls, which leads to
> device_post_init() called before x86 cpu
> specific .instance_post_init().
>
> However, x86 cpu replies on qdev_prop_set_globals() (inside
> device_post_init()) to apply the cpu option like "feature[=foo]"
> passed
> via '-cpu' as the last step to make the '-cpu' option highest
> priority.
>
> After the order change of .instance_post_init(), x86_cpu_post_initfn()
> is called after device_post_init(), and it will change some property
> value even though "-cpu" option specify a different one.
>
> Re-apply the global props as the last step to ensure "-cpu" option
> always takes highest priority.
>
> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
>
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index 0d35e95430fe..bf290262cbfe 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -9044,6 +9044,12 @@ static void x86_cpu_post_initfn(Object *obj)
> X86_CONFIDENTIAL_GUEST(current_machine->cgs), (CPU(obj)));
> }
> #endif
> +
> + /*
> + * Re-apply the "feature[=foo]" from '-cpu' option since they might
> + * be overwritten by above
> + */
> + qdev_prop_set_globals(DEVICE(obj));
> }
This patch LGTM.
Regards,
Phil.
Il mer 2 lug 2025, 02:54 Philippe Mathieu-Daudé <philmd@linaro.org> ha
scritto:
> >>> Back to this commit, I think current order of post_init makes sense.
> >>> Instead, the place of host_cpu_instance_init() doesn't seem quite
> >>> right. So, I think this commit might have exposed some drawbacks in the
> >>> previous x86 CPU initialization order:
> >>>
> >>> f5cc5a5c1686 ("i386: split cpu accelerators from cpu.c, using
> >>> AccelCPUClass")
> >>> 5b8978d80426 ("i386: do not call cpudef-only models functions for
> >>> max, host, base")
> >>
> >> To fix this issue, we need to initialize "vendor" property in the initfn
> >> of max/host/named CPUs instead of current post_initfn.
> >>
> >> This will need to split the cpu_instance_init() of x86 kvm (and maybe
> >> hvf/tcg)
> >> into 2 hooks:
> >> * AccelCPUClass.cpu_instance_init() - called in x86 CPUs' initfn.
> >> * AccelCPUClass.cpu_instance_post_init() - called in x86 CPUs'
> >> post_initfn.
> >
> > Split accel.cpu_instance_init() into cpu's instance_init() and
> > post_instance_init() does not seem right way to go.
>
> Yeah, please don't. I'm trying to consolidate this code but it takes
> (too) long.
>
Unfortunately I agree with Zhao. The globals, being user provided, must
come after the AccelCPUClass initialization.
My hope is that it's possible to avoid a split and move everything earlier
as in target-riscv/. I will take a look next week if Zhao doesn't beat me,
and/or discuss with him.
Paolo
> > The reason .post_instance_init() was implemented and put
> > accel_cpu_instance_init() in it for x86 cpu was that, we don't want to
> > scatter acceletor specific instance_init operation into different
> > subclass of x86 cpu (max/host/named cpu model).
> >
> > I think something like below should be enough.
> >
> > -----------8<-------------
> > Author: Xiaoyao Li <xiaoyao.li@intel.com>
> > Date: Tue Jul 1 13:33:43 2025 +0800
> >
> > i386/cpu: Re-apply the global props as the last step of post_init
> >
> > Commit 220c739903ce ("qom: reverse order of instance_post_init
> calls")
> > reverses the order instance_post_init calls, which leads to
> > device_post_init() called before x86 cpu
> > specific .instance_post_init().
> >
> > However, x86 cpu replies on qdev_prop_set_globals() (inside
> > device_post_init()) to apply the cpu option like "feature[=foo]"
> > passed
> > via '-cpu' as the last step to make the '-cpu' option highest
> > priority.
> >
> > After the order change of .instance_post_init(),
> x86_cpu_post_initfn()
> > is called after device_post_init(), and it will change some property
> > value even though "-cpu" option specify a different one.
> >
> > Re-apply the global props as the last step to ensure "-cpu" option
> > always takes highest priority.
> >
> > Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
> >
> > diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> > index 0d35e95430fe..bf290262cbfe 100644
> > --- a/target/i386/cpu.c
> > +++ b/target/i386/cpu.c
> > @@ -9044,6 +9044,12 @@ static void x86_cpu_post_initfn(Object *obj)
> > X86_CONFIDENTIAL_GUEST(current_machine->cgs), (CPU(obj)));
> > }
> > #endif
> > +
> > + /*
> > + * Re-apply the "feature[=foo]" from '-cpu' option since they might
> > + * be overwritten by above
> > + */
> > + qdev_prop_set_globals(DEVICE(obj));
> > }
>
> This patch LGTM.
>
> Regards,
>
> Phil.
>
>
> > diff --git a/target/i386/cpu.c b/target/i386/cpu.c > > index 0d35e95430fe..bf290262cbfe 100644 > > --- a/target/i386/cpu.c > > +++ b/target/i386/cpu.c > > @@ -9044,6 +9044,12 @@ static void x86_cpu_post_initfn(Object *obj) > > X86_CONFIDENTIAL_GUEST(current_machine->cgs), (CPU(obj))); > > } > > #endif > > + > > + /* > > + * Re-apply the "feature[=foo]" from '-cpu' option since they might > > + * be overwritten by above > > + */ > > + qdev_prop_set_globals(DEVICE(obj)); > > } > > This patch LGTM. This solution will call qdev_prop_set_globals() twice. My concern is that this masks the problem: previous x86 CPU assumptions about the order of global property initialization break down... Per the commit message of Paolo's commit: "This is incorrect because the leaf class cannot observe property values applied by the superclasses; for example, a compat property will be set on a device *after* the class's post_init callback has run." X86 CPUs have the issue (e.g., "vendor" doesn't work) now because they - as leaf class, don't care about the property values of superclass - the DeviceState. If a property is just for initialization, like "vendor", it should be placed in the instance_init() instead of instance_post_init(). In addition, if other places handle it similarly, the device's post_init seems pointless. :-( Thanks, Zhao
On 7/2/2025 3:56 PM, Zhao Liu wrote:
>>> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
>>> index 0d35e95430fe..bf290262cbfe 100644
>>> --- a/target/i386/cpu.c
>>> +++ b/target/i386/cpu.c
>>> @@ -9044,6 +9044,12 @@ static void x86_cpu_post_initfn(Object *obj)
>>> X86_CONFIDENTIAL_GUEST(current_machine->cgs), (CPU(obj)));
>>> }
>>> #endif
>>> +
>>> + /*
>>> + * Re-apply the "feature[=foo]" from '-cpu' option since they might
>>> + * be overwritten by above
>>> + */
>>> + qdev_prop_set_globals(DEVICE(obj));
>>> }
>>
>> This patch LGTM.
>
> This solution will call qdev_prop_set_globals() twice. My concern is
> that this masks the problem: previous x86 CPU assumptions about the
> order of global property initialization break down...
>
> Per the commit message of Paolo's commit:
>
> "This is incorrect because the leaf class cannot observe property
> values applied by the superclasses; for example, a compat property
> will be set on a device *after* the class's post_init callback has
> run."
After checking the history of why .post_instance_init() was introduced
in the first place: 8231c2dd2203 ("qom: Introduce instance_post_init
hook"). It turns out to be used for qdev_prop_set_globals(), to ensure
global properties being applied after all sub-device specific
instance_init() were called. And I think the order from child to parent
was defined purposely.
And the reverse of Paolo's patch breaks the usage of "-global" than it
won't take effect if the sub-device changes the property in its
post_instance_init() callback.
Back to Paolo's example of "a compat property will be set on a device
*after* the class's post_init callback has run". I think the behavior of
compat property is applied after the class's post_init callback is also
what we want. If reversing the order, then compat prop can be
overwritten by subclass's post_init callback, and doesn't it fail the
purpose of compat prop?
So I think we might revert this patch, and document clearly the reverse
order of .post_instance_init() callback.
> X86 CPUs have the issue (e.g., "vendor" doesn't work) now because
> they - as leaf class, don't care about the property values of
> superclass - the DeviceState. If a property is just for initialization,
> like "vendor", it should be placed in the instance_init() instead of
> instance_post_init().
>
> In addition, if other places handle it similarly, the device's
> post_init seems pointless. :-(
>
> Thanks,
> Zhao
>
Il mer 2 lug 2025, 07:42 Xiaoyao Li <xiaoyao.li@intel.com> ha scritto: > Back to Paolo's example of "a compat property will be set on a device > *after* the class's post_init callback has run". I think the behavior of > compat property is applied after the class's post_init callback is also > what we want. If reversing the order, then compat prop can be > overwritten by subclass's post_init callback, and doesn't it fail the > purpose of compat prop? > I wrote the patch because of a latent issue with TDX. The issue was roughly that a compat property was overwriting the TDX-specific modifications to CPUID. I think this case shows that you *do* want the subclass to overwrite the compat property—of course the subclass code must be aware that compat properties exist and limit changes appropriately. In general I don't see how the reverse order makes sense: the subclass knows what the superclass does, so it can do the right thing if it runs last; but the superclass cannot know what all of its subclasses do in post_init, so it makes less sense to run it last. Paolo > So I think we might revert this patch, and document clearly the reverse > order of .post_instance_init() callback. > > > X86 CPUs have the issue (e.g., "vendor" doesn't work) now because > > they - as leaf class, don't care about the property values of > > superclass - the DeviceState. If a property is just for initialization, > > like "vendor", it should be placed in the instance_init() instead of > > instance_post_init(). > > > > In addition, if other places handle it similarly, the device's > > post_init seems pointless. :-( > > > > Thanks, > > Zhao > > > >
On 7/2/2025 8:12 PM, Paolo Bonzini wrote: > Il mer 2 lug 2025, 07:42 Xiaoyao Li <xiaoyao.li@intel.com> ha scritto: > >> Back to Paolo's example of "a compat property will be set on a device >> *after* the class's post_init callback has run". I think the behavior of >> compat property is applied after the class's post_init callback is also >> what we want. If reversing the order, then compat prop can be >> overwritten by subclass's post_init callback, and doesn't it fail the >> purpose of compat prop? >> > > I wrote the patch because of a latent issue with TDX. The issue was roughly > that a compat property was overwriting the TDX-specific modifications to > CPUID. I think this case shows that you *do* want the subclass to overwrite > the compat property—of course the subclass code must be aware that compat > properties exist and limit changes appropriately. IIRC that's on rhel QEMU which ports the TDX code before it's merged upstream. Now TDX is upstreamed, it works with upstream compat property and I think future new compat property won't affect TDX or anything, since it's compat property and it's to guarantee the existing behavior when introducing new behavior? > In general I don't see how the reverse order makes sense: the subclass > knows what the superclass does, so it can do the right thing if it runs > last; but the superclass cannot know what all of its subclasses do in > post_init, so it makes less sense to run it last. I agree in general the parent to child order makes more sense, especially when we treat .instance_init() as the phase 1 init and .post_instance_init() as the phase 2 init. But the original purpose of introducing .post_instance_init() was to ensure qdev_prop_set_globals() is called at last for Device. Reverse the order breaks this purpose. It seems not good option like the reversed order from child to parent that can achieve it by just putting the last step in top parent's. I'm looking forward to seeing a better solution. > Paolo > > >> So I think we might revert this patch, and document clearly the reverse >> order of .post_instance_init() callback. >> >>> X86 CPUs have the issue (e.g., "vendor" doesn't work) now because >>> they - as leaf class, don't care about the property values of >>> superclass - the DeviceState. If a property is just for initialization, >>> like "vendor", it should be placed in the instance_init() instead of >>> instance_post_init(). >>> >>> In addition, if other places handle it similarly, the device's >>> post_init seems pointless. :-( >>> >>> Thanks, >>> Zhao >>> >> >> >
Il mer 2 lug 2025, 09:25 Xiaoyao Li <xiaoyao.li@intel.com> ha scritto: > IIRC that's on rhel QEMU which ports the TDX code before it's merged > upstream. Now TDX is upstreamed, it works with upstream compat property > and I think future new compat property won't affect TDX or anything, > since it's compat property and it's to guarantee the existing behavior > when introducing new behavior? > It's a compat property that is only added by RHEL-specific machine types. But the bug is not specific to RHEL, it just happens that no upstream machine type has compat properties that overlap with TDX adjustments of CPUID. > In general I don't see how the reverse order makes sense: the subclass > > knows what the superclass does, so it can do the right thing if it runs > > last; but the superclass cannot know what all of its subclasses do in > > post_init, so it makes less sense to run it last. > > I agree in general the parent to child order makes more sense, > especially when we treat .instance_init() as the phase 1 init and > .post_instance_init() as the phase 2 init. > > But the original purpose of introducing .post_instance_init() was to > ensure qdev_prop_set_globals() is called at last for Device. Reverse the > order breaks this purpose. > Not "last", but "after instance_init". Anything that happens in the child class's instance_post_init can be moved at the end of instance_init, just like the refactoring that I did for risc-v. Paolo It seems not good option like the reversed order from child to parent > that can achieve it by just putting the last step in top parent's. > > I'm looking forward to seeing a better solution. > > > Paolo > > > > > >> So I think we might revert this patch, and document clearly the reverse > >> order of .post_instance_init() callback. > >> > >>> X86 CPUs have the issue (e.g., "vendor" doesn't work) now because > >>> they - as leaf class, don't care about the property values of > >>> superclass - the DeviceState. If a property is just for initialization, > >>> like "vendor", it should be placed in the instance_init() instead of > >>> instance_post_init(). > >>> > >>> In addition, if other places handle it similarly, the device's > >>> post_init seems pointless. :-( > >>> > >>> Thanks, > >>> Zhao > >>> > >> > >> > > > >
On 7/3/2025 2:54 AM, Paolo Bonzini wrote: > Il mer 2 lug 2025, 09:25 Xiaoyao Li <xiaoyao.li@intel.com> ha scritto: > >> IIRC that's on rhel QEMU which ports the TDX code before it's merged >> upstream. Now TDX is upstreamed, it works with upstream compat property >> and I think future new compat property won't affect TDX or anything, >> since it's compat property and it's to guarantee the existing behavior >> when introducing new behavior? >> > > It's a compat property that is only added by RHEL-specific machine types. > But the bug is not specific to RHEL, it just happens that no upstream > machine type has compat properties that overlap with TDX adjustments of > CPUID. > >> In general I don't see how the reverse order makes sense: the subclass >>> knows what the superclass does, so it can do the right thing if it runs >>> last; but the superclass cannot know what all of its subclasses do in >>> post_init, so it makes less sense to run it last. >> >> I agree in general the parent to child order makes more sense, >> especially when we treat .instance_init() as the phase 1 init and >> .post_instance_init() as the phase 2 init. >> >> But the original purpose of introducing .post_instance_init() was to >> ensure qdev_prop_set_globals() is called at last for Device. Reverse the >> order breaks this purpose. >> > > Not "last", but "after instance_init". Anything that happens in the child > class's instance_post_init can be moved at the end of instance_init, just > like the refactoring that I did for risc-v. Move into the end of instance_init() can surely work. But it requires to split the code in instance_post_init() to different child's instance_init() or making sure the code in instance_post_init() is called at the end of each lowest child class. Besides, it also leads to a rule that child of Device's .post_instance_init() needs to be careful about changing the property or anything that might affect the property, because it might break the usage of global properties. This can surely work. But to me, it seems to make code worse not better.
On Thu, Jul 03, 2025 at 09:03:10AM +0800, Xiaoyao Li wrote: > Date: Thu, 3 Jul 2025 09:03:10 +0800 > From: Xiaoyao Li <xiaoyao.li@intel.com> > Subject: Re: [Regression] Re: [PULL 35/35] qom: reverse order of > instance_post_init calls > > On 7/3/2025 2:54 AM, Paolo Bonzini wrote: > > Il mer 2 lug 2025, 09:25 Xiaoyao Li <xiaoyao.li@intel.com> ha scritto: > > > > > IIRC that's on rhel QEMU which ports the TDX code before it's merged > > > upstream. Now TDX is upstreamed, it works with upstream compat property > > > and I think future new compat property won't affect TDX or anything, > > > since it's compat property and it's to guarantee the existing behavior > > > when introducing new behavior? > > > > > > > It's a compat property that is only added by RHEL-specific machine types. > > But the bug is not specific to RHEL, it just happens that no upstream > > machine type has compat properties that overlap with TDX adjustments of > > CPUID. > > > > > In general I don't see how the reverse order makes sense: the subclass > > > > knows what the superclass does, so it can do the right thing if it runs > > > > last; but the superclass cannot know what all of its subclasses do in > > > > post_init, so it makes less sense to run it last. > > > > > > I agree in general the parent to child order makes more sense, > > > especially when we treat .instance_init() as the phase 1 init and > > > .post_instance_init() as the phase 2 init. > > > > > > But the original purpose of introducing .post_instance_init() was to > > > ensure qdev_prop_set_globals() is called at last for Device. Reverse the > > > order breaks this purpose. > > > > > > > Not "last", but "after instance_init". Anything that happens in the child > > class's instance_post_init can be moved at the end of instance_init, just > > like the refactoring that I did for risc-v. > > Move into the end of instance_init() can surely work. But it requires to > split the code in instance_post_init() to different child's instance_init() > or making sure the code in instance_post_init() is called at the end of each > lowest child class. Initially, when I proposed the split approach, it wasn't about splitting for the sake of splitting, nor for the sake of "work". A more granular split is just a means, and the goal is to place things at different stages in the most appropriate locations. > Besides, it also leads to a rule that child of Device's > .post_instance_init() needs to be careful about changing the property or > anything that might affect the property, I believe that's how things should be. instance_post_init() provides an opportunity to tweak properties. The order of instance_post_init() reflects the dependency relationships for property adjustments. As I mentioned earlier, if a property doesn't need to consider other factors and is simply being initialized, instance_init() is the most appropriate place for it. > because it might break the usage of global properties. Breaking global properties is just one example. Essentially, properties like "vendor" don't adhere well to the semantics of QOM. > This can surely work. But to me, it seems to make code worse not better. IMO, it's not the split that makes things worse, but rather the improper use of instance_post_init() that makes everything about the x86 CPU fragile. Ideally, QOM/qdev should focus on their own abstraction order, while the leaf-class should do the right thing in the right place, which is the most solid situation.
On 7/3/2025 11:08 AM, Zhao Liu wrote:
> On Thu, Jul 03, 2025 at 09:03:10AM +0800, Xiaoyao Li wrote:
>> Date: Thu, 3 Jul 2025 09:03:10 +0800
>> From: Xiaoyao Li <xiaoyao.li@intel.com>
>> Subject: Re: [Regression] Re: [PULL 35/35] qom: reverse order of
>> instance_post_init calls
>>
>> On 7/3/2025 2:54 AM, Paolo Bonzini wrote:
>>> Il mer 2 lug 2025, 09:25 Xiaoyao Li <xiaoyao.li@intel.com> ha scritto:
>>>
>>>> IIRC that's on rhel QEMU which ports the TDX code before it's merged
>>>> upstream. Now TDX is upstreamed, it works with upstream compat property
>>>> and I think future new compat property won't affect TDX or anything,
>>>> since it's compat property and it's to guarantee the existing behavior
>>>> when introducing new behavior?
>>>>
>>>
>>> It's a compat property that is only added by RHEL-specific machine types.
>>> But the bug is not specific to RHEL, it just happens that no upstream
>>> machine type has compat properties that overlap with TDX adjustments of
>>> CPUID.
>>>
>>>> In general I don't see how the reverse order makes sense: the subclass
>>>>> knows what the superclass does, so it can do the right thing if it runs
>>>>> last; but the superclass cannot know what all of its subclasses do in
>>>>> post_init, so it makes less sense to run it last.
>>>>
>>>> I agree in general the parent to child order makes more sense,
>>>> especially when we treat .instance_init() as the phase 1 init and
>>>> .post_instance_init() as the phase 2 init.
>>>>
>>>> But the original purpose of introducing .post_instance_init() was to
>>>> ensure qdev_prop_set_globals() is called at last for Device. Reverse the
>>>> order breaks this purpose.
>>>>
>>>
>>> Not "last", but "after instance_init". Anything that happens in the child
>>> class's instance_post_init can be moved at the end of instance_init, just
>>> like the refactoring that I did for risc-v.
>>
>> Move into the end of instance_init() can surely work. But it requires to
>> split the code in instance_post_init() to different child's instance_init()
>> or making sure the code in instance_post_init() is called at the end of each
>> lowest child class.
>
> Initially, when I proposed the split approach, it wasn't about
> splitting for the sake of splitting, nor for the sake of "work".
>
> A more granular split is just a means, and the goal is to place things
> at different stages in the most appropriate locations.
>
>> Besides, it also leads to a rule that child of Device's
>> .post_instance_init() needs to be careful about changing the property or
>> anything that might affect the property,
>
> I believe that's how things should be. instance_post_init() provides an
> opportunity to tweak properties. The order of instance_post_init()
> reflects the dependency relationships for property adjustments. As I
> mentioned earlier, if a property doesn't need to consider other factors
> and is simply being initialized, instance_init() is the most appropriate
> place for it.
The reason why accelerator's instance_init() was moved to post_init, was
just it needs to consider other factors. Please see commit 4db4385a7ab6
("i386: run accel_cpu_instance_init as post_init")
accelerator needs to do different tweak for "max/host", "xcc->model".
Of course we can split it and put specific handling at the end of each
sub-class's instance_init(), like below:
- in TYPE_X86_CPU instance_init()
if (accelerator_kvm) {
kvm_instance_init_common_for_x86_cpu();
}
- in "base" instance_init()
if (accelerator_kvm) {
kvm_instance_init_for_base();
}
- in "max" instance_init()
if (accelerator_kvm) {
kvm_instance_init_for_max();
}
- in "host" instance_init()
if (accelerator_kvm) {
kvm_instance_init_for_host();
}
- in "named cpu model" instance_init()
if (xcc->model) {
kvm_instance_init_for_xcc_model();
}
Contrast to the current implementation in post_init()
if (accelerator_kvm) {
kvm_instance_init_common_for_x86_cpu();
if (base)
...
if (max)
...
if (host)
...
if (xcc->model)
...
}
The reality for the former might be simpler since "base" doesn't have
instance_init(), and "max/host" are same to KVM as "cpu->max_features"
But I still like the latter.
Il mer 2 lug 2025, 23:36 Xiaoyao Li <xiaoyao.li@intel.com> ha scritto:
> The reason why accelerator's instance_init() was moved to post_init, was
> just it needs to consider other factors. Please see commit 4db4385a7ab6
> ("i386: run accel_cpu_instance_init as post_init")
>
You're right and this can be a problem with the simple split that Zhao
proposed. But the root cause is that post_init is confusing many kinds of
defaults (the KVM vendor case, globals which are different for compat
properties and -global and which CPUs also abuse to implement -cpu by the
way, max_features handling, maybe more; all of which have different
priorities).
TDX added checks on top, and instance_post_init worked when applying class
defaults but not for checks. But as mentioned in the commit message for
this patch, cxl_dsp_instance_post_init and
rp_instance_post_init have similar issues so reverting is also incorrect.
Maybe DeviceClass needs another callback that is called before Device's own
instance_post_init. The accelerator could use it.
Or maybe, more long term, instance_post_init could be replaced by a set of
Notifiers that are registered by instance_init and that have a priority
(FIXUP_CLASS_DEFAULTS, APPLY_GLOBAL_DEFAULTS, and a third for TDX).
Paolo
accelerator needs to do different tweak for "max/host", "xcc->model".
>
> Of course we can split it and put specific handling at the end of each
> sub-class's instance_init(), like below:
>
> - in TYPE_X86_CPU instance_init()
>
> if (accelerator_kvm) {
> kvm_instance_init_common_for_x86_cpu();
> }
>
> - in "base" instance_init()
>
> if (accelerator_kvm) {
> kvm_instance_init_for_base();
> }
>
> - in "max" instance_init()
>
> if (accelerator_kvm) {
> kvm_instance_init_for_max();
> }
>
> - in "host" instance_init()
>
> if (accelerator_kvm) {
> kvm_instance_init_for_host();
> }
>
> - in "named cpu model" instance_init()
>
> if (xcc->model) {
> kvm_instance_init_for_xcc_model();
> }
>
> Contrast to the current implementation in post_init()
>
> if (accelerator_kvm) {
> kvm_instance_init_common_for_x86_cpu();
>
> if (base)
> ...
> if (max)
> ...
> if (host)
> ...
> if (xcc->model)
> ...
> }
>
> The reality for the former might be simpler since "base" doesn't have
> instance_init(), and "max/host" are same to KVM as "cpu->max_features"
>
> But I still like the latter.
>
>
>
Il gio 3 lug 2025, 06:51 Paolo Bonzini <pbonzini@redhat.com> ha scritto:
>
>
> Il mer 2 lug 2025, 23:36 Xiaoyao Li <xiaoyao.li@intel.com> ha scritto:
>
>> The reason why accelerator's instance_init() was moved to post_init, was
>> just it needs to consider other factors. Please see commit 4db4385a7ab6
>> ("i386: run accel_cpu_instance_init as post_init")
>>
>
> You're right and this can be a problem with the simple split that Zhao
> proposed. But the root cause is that post_init is confusing many kinds of
> defaults (the KVM vendor case, globals which are different for compat
> properties and -global and which CPUs also abuse to implement -cpu by the
> way, max_features handling, maybe more; all of which have different
> priorities).
>
I checked and it seems to me that all of the accelerator-specific
initialization should be doable in instance_init (which risc-v already
does), the only problem is the max_features field. But max_features is
*not* a qdev property so it can be moved to the class side.
I will try to send patches tomorrow.
Paolo
TDX added checks on top, and instance_post_init worked when applying class
> defaults but not for checks. But as mentioned in the commit message for
> this patch, cxl_dsp_instance_post_init and
> rp_instance_post_init have similar issues so reverting is also incorrect.
>
> Maybe DeviceClass needs another callback that is called before Device's
> own instance_post_init. The accelerator could use it.
>
> Or maybe, more long term, instance_post_init could be replaced by a set of
> Notifiers that are registered by instance_init and that have a priority
> (FIXUP_CLASS_DEFAULTS, APPLY_GLOBAL_DEFAULTS, and a third for TDX).
>
> Paolo
>
> accelerator needs to do different tweak for "max/host", "xcc->model".
>>
>> Of course we can split it and put specific handling at the end of each
>> sub-class's instance_init(), like below:
>>
>> - in TYPE_X86_CPU instance_init()
>>
>> if (accelerator_kvm) {
>> kvm_instance_init_common_for_x86_cpu();
>> }
>>
>> - in "base" instance_init()
>>
>> if (accelerator_kvm) {
>> kvm_instance_init_for_base();
>> }
>>
>> - in "max" instance_init()
>>
>> if (accelerator_kvm) {
>> kvm_instance_init_for_max();
>> }
>>
>> - in "host" instance_init()
>>
>> if (accelerator_kvm) {
>> kvm_instance_init_for_host();
>> }
>>
>> - in "named cpu model" instance_init()
>>
>> if (xcc->model) {
>> kvm_instance_init_for_xcc_model();
>> }
>>
>> Contrast to the current implementation in post_init()
>>
>> if (accelerator_kvm) {
>> kvm_instance_init_common_for_x86_cpu();
>>
>> if (base)
>> ...
>> if (max)
>> ...
>> if (host)
>> ...
>> if (xcc->model)
>> ...
>> }
>>
>> The reality for the former might be simpler since "base" doesn't have
>> instance_init(), and "max/host" are same to KVM as "cpu->max_features"
>>
>> But I still like the latter.
>>
>>
>>
© 2016 - 2025 Red Hat, Inc.