[PATCH v7 2/9] target/i386: disable PERFCORE when "-pmu" is configured

Dongli Zhang posted 9 patches 3 months ago
Maintainers: Paolo Bonzini <pbonzini@redhat.com>, Zhao Liu <zhao1.liu@intel.com>, Marcelo Tosatti <mtosatti@redhat.com>
There is a newer version of this series
[PATCH v7 2/9] target/i386: disable PERFCORE when "-pmu" is configured
Posted by Dongli Zhang 3 months ago
Currently, AMD PMU support isn't determined based on CPUID, that is, the
"-pmu" option does not fully disable KVM AMD PMU virtualization.

To minimize AMD PMU features, remove PERFCORE when "-pmu" is configured.

To completely disable AMD PMU virtualization will be implemented via
KVM_CAP_PMU_CAPABILITY in upcoming patches.

As a reminder, neither CPUID_EXT3_PERFCORE nor
CPUID_8000_0022_EAX_PERFMON_V2 is removed from env->features[] when "-pmu"
is configured. Developers should query whether they are supported via
cpu_x86_cpuid() rather than relying on env->features[] in future patches.

Suggested-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Sandipan Das <sandipan.das@amd.com>
---
Changed since v2:
  - No need to check "kvm_enabled() && IS_AMD_CPU(env)".
Changed since v4:
  - Add Reviewed-by from Sandipan.

 target/i386/cpu.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 3653f8953e..4fcade89bc 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -8360,6 +8360,10 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
             !(env->hflags & HF_LMA_MASK)) {
             *edx &= ~CPUID_EXT2_SYSCALL;
         }
+
+        if (!cpu->enable_pmu) {
+            *ecx &= ~CPUID_EXT3_PERFCORE;
+        }
         break;
     case 0x80000002:
     case 0x80000003:
-- 
2.39.3
Re: [PATCH v7 2/9] target/i386: disable PERFCORE when "-pmu" is configured
Posted by Zhao Liu 2 months, 3 weeks ago
Hi Dongli,

> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index 3653f8953e..4fcade89bc 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -8360,6 +8360,10 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
>              !(env->hflags & HF_LMA_MASK)) {
>              *edx &= ~CPUID_EXT2_SYSCALL;
>          }
> +
> +        if (!cpu->enable_pmu) {
> +            *ecx &= ~CPUID_EXT3_PERFCORE;
> +        }

Learned the lessons from PDCM [*]:

[*] https://lore.kernel.org/qemu-devel/20250923104136.133875-3-pbonzini@redhat.com/

Directly masking CPUID bit off is not a good idea... modifying the
CPUID, even when fixing or adding dependencies, can easily break
migration.

So a safe way is to add a compat option. And I think it would be better
if patch 1 also has a compat option. A single compat option could cover
patch 1 & 2.

I have these thoughts:

* For "-pmu" dependency, it can be checked as [*] did.
* For normal feature bit dependency (patch 1), it seems possible to add
  compat_prop condition in feature_dependencies[].

I attached a draft for discussion (which is on the top of the whole
series).

Note, we are currently in the RC phase for v10.2, and the
pc_compat_10_2[] array is not yet added, which will be added before the
release of v10.2. Therefore, we have enough time for discussion I think.

If you think it's fine, I'll post compat_prop support separately as an
RFC. The specific compat option can be left to this series.

Although wait for a while, in a way, it is lucky :) - there's an
opportunity to avoid making mistakes for us.

Regards,
Zhao

------------------------------------------------
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index f8b919cb6c47..d2b8e83e556d 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -83,6 +83,7 @@

 GlobalProperty pc_compat_10_1[] = {
     { "mch", "extended-tseg-mbytes", "16" },
+    { TYPE_X86_CPU, "x-amd-perfmon-always-on", "true" }, // This should be added to pc_compat_10_2!
 };
 const size_t pc_compat_10_1_len = G_N_ELEMENTS(pc_compat_10_1);

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 4fcade89bcea..8c56789eb296 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -1997,6 +1997,8 @@ static FeatureDep feature_dependencies[] = {
     {
         .from = { FEAT_8000_0001_ECX,       CPUID_EXT3_PERFCORE },
         .to = { FEAT_8000_0022_EAX,         CPUID_8000_0022_EAX_PERFMON_V2 },
+        .compat_prop =
+            { "x-amd-perfmon-always-on",    "false" },
     },
 };

@@ -8998,15 +9000,36 @@ void x86_cpu_expand_features(X86CPU *cpu, Error **errp)
         }
     }

-    if (!cpu->pdcm_on_even_without_pmu) {
+    if (!cpu->enable_pmu) {
         /* PDCM is fixed1 bit for TDX */
-        if (!cpu->enable_pmu && !is_tdx_vm()) {
+        if (!cpu->pdcm_on_even_without_pmu && !is_tdx_vm()) {
             env->features[FEAT_1_ECX] &= ~CPUID_EXT_PDCM;
         }
+
+        if (!cpu->amd_perfmon_always_on) {
+            env->features[FEAT_8000_0001_ECX] &= ~CPUID_EXT3_PERFCORE;
+        }
     }

     for (i = 0; i < ARRAY_SIZE(feature_dependencies); i++) {
         FeatureDep *d = &feature_dependencies[i];
+        bool compat_omit = false;
+        PropValue *prop = &d->compat_prop;
+
+        if (prop->value) {
+            char *v = object_property_print(OBJECT(cpu), prop->prop, false, NULL);
+            if (v) {
+                compat_omit = !strcmp(prop->value, v);
+                printf("feature_dependencies: v: %s, value: %s, omit: %d\n",
+                       v, prop->value, compat_omit);
+                g_free(v);
+            }
+
+            if (compat_omit) {
+                continue;
+            }
+        }
+
         if (!(env->features[d->from.index] & d->from.mask)) {
             uint64_t unavailable_features = env->features[d->to.index] & d->to.mask;

@@ -10065,6 +10088,8 @@ static const Property x86_cpu_properties[] = {
                      arch_cap_always_on, false),
     DEFINE_PROP_BOOL("x-pdcm-on-even-without-pmu", X86CPU,
                      pdcm_on_even_without_pmu, false),
+    DEFINE_PROP_BOOL("x-amd-perfmon-always-on", X86CPU,
+                     amd_perfmon_always_on, false),
 };

 #ifndef CONFIG_USER_ONLY
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 6d78f3995b6b..ac8dd92efc59 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -696,8 +696,14 @@ typedef struct FeatureMask {
     uint64_t mask;
 } FeatureMask;

+typedef struct PropValue {
+    const char *prop, *value;
+} PropValue;
+
 typedef struct FeatureDep {
     FeatureMask from, to;
+    /* Once set, only when prop is satisfied, dependency can be checked. */
+    PropValue compat_prop;
 } FeatureDep;

 typedef uint64_t FeatureWordArray[FEATURE_WORDS];
@@ -2353,6 +2359,16 @@ struct ArchCPU {
      */
     bool pdcm_on_even_without_pmu;

+    /*
+     * Backwards compatibility with QEMU <10.2. (TODO: change 10.2 to 11.0!)
+     * This flag covers 2 parts:
+     * - The PERFCORE feature is now disabled when PMU is not available, but prior to
+     *   10.2 it was enabled even if PMU is off.
+     * - The PerfMonV2 feature is now disabled when PERFCORE is disabled, but prior to
+     *   10.2 it was enabled even if PERFCORE is off.
+     */
+    bool amd_perfmon_always_on;
+
     /* Number of physical address bits supported */
     uint32_t phys_bits;

@@ -2579,9 +2595,7 @@ bool cpu_x86_xrstor(CPUX86State *s, void *host, size_t len, uint64_t rbfm);
 /* cpu.c */
 void x86_cpu_vendor_words2str(char *dst, uint32_t vendor1,
                               uint32_t vendor2, uint32_t vendor3);
-typedef struct PropValue {
-    const char *prop, *value;
-} PropValue;
+
 void x86_cpu_apply_props(X86CPU *cpu, PropValue *props);

 void x86_cpu_after_reset(X86CPU *cpu);
Re: [PATCH v7 2/9] target/i386: disable PERFCORE when "-pmu" is configured
Posted by Dongli Zhang 2 months, 3 weeks ago
Hi Zhao,

On 11/19/25 3:06 AM, Zhao Liu wrote:
> Hi Dongli,
> 
>> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
>> index 3653f8953e..4fcade89bc 100644
>> --- a/target/i386/cpu.c
>> +++ b/target/i386/cpu.c
>> @@ -8360,6 +8360,10 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
>>              !(env->hflags & HF_LMA_MASK)) {
>>              *edx &= ~CPUID_EXT2_SYSCALL;
>>          }
>> +
>> +        if (!cpu->enable_pmu) {
>> +            *ecx &= ~CPUID_EXT3_PERFCORE;
>> +        }
> 
> Learned the lessons from PDCM [*]:
> 
> [*] https://urldefense.com/v3/__https://lore.kernel.org/qemu-devel/20250923104136.133875-3-pbonzini@redhat.com/__;!!ACWV5N9M2RV99hQ!IGoeOCDjD7razBgwrJ-A9SjI1XIwq5kxmkPA5_NLOucYzMbsbCOQFe3jc7YyN0Ks1bNpP70F77klp0_--kV1$ 
> 
> Directly masking CPUID bit off is not a good idea... modifying the
> CPUID, even when fixing or adding dependencies, can easily break
> migration.
> 
> So a safe way is to add a compat option. And I think it would be better
> if patch 1 also has a compat option. A single compat option could cover
> patch 1 & 2.
> 
> I have these thoughts:
> 
> * For "-pmu" dependency, it can be checked as [*] did.
> * For normal feature bit dependency (patch 1), it seems possible to add
>   compat_prop condition in feature_dependencies[].
> 
> I attached a draft for discussion (which is on the top of the whole
> series).
> 
> Note, we are currently in the RC phase for v10.2, and the
> pc_compat_10_2[] array is not yet added, which will be added before the
> release of v10.2. Therefore, we have enough time for discussion I think.

Thank you very much for feedback!

It is very important to maintain live migration compatibility.

> 
> If you think it's fine, I'll post compat_prop support separately as an

I think it's fine. I don’t have a better idea. Please send the patch for further
discussion.

> RFC. The specific compat option can be left to this series.

Please let me know whether you would like to include Patch 2 on
"amd_perfmon_always_on" as part of the "compat_prop" patch, or if you'd prefer
that I re-create Patch 2 with your Suggested-by.

Either option works for me.


It seems the Patches 3 - 9 are not impacted by this Live Migration issue.
Perhaps they may be accepted (or as well as Patch 2 "amd_perfmon_always_on")
without "compat_prop" patch? They are independent with each other.

Another concern is Patch 3. Something unexpected may occur when live migrating
from a KVM host without KVM_PMU_CAP_DISABLE to one that has it enabled. The
migration will succeed, but the perceived perf/vPMU support could change.
Can we assume it is the user's responsibility to ensure compatibility between
KVM hosts when "-pmu" is specified?

> 
> Although wait for a while, in a way, it is lucky :) - there's an
> opportunity to avoid making mistakes for us.
> 

Thank you very much!

Dongli Zhang

Re: [PATCH v7 2/9] target/i386: disable PERFCORE when "-pmu" is configured
Posted by Zhao Liu 2 months, 2 weeks ago
> Please let me know whether you would like to include Patch 2 on
> "amd_perfmon_always_on" as part of the "compat_prop" patch, or if you'd prefer
> that I re-create Patch 2 with your Suggested-by.
> 
> Either option works for me.

I took some time to revisit the dependency issues with PDCM again, and I
do think the approach mentioned in previous reply should work.

Ok, let me pick your patch 1 & 2. I will rebase these on the CET series
(since I've also modified the dependency for Arch LBR). The entire
dependency fix series may take some time and may need to wait for
several weeks.

However, at least it's decoupled from the rest. :)

At the same time, I'll help go through the remaining patches 3-9 again,
as it's been quite a while since I last reviewed them.

> It seems the Patches 3 - 9 are not impacted by this Live Migration issue.
> Perhaps they may be accepted (or as well as Patch 2 "amd_perfmon_always_on")
> without "compat_prop" patch? They are independent with each other.

Yes, I think so.

> Another concern is Patch 3. Something unexpected may occur when live migrating
> from a KVM host without KVM_PMU_CAP_DISABLE to one that has it enabled. The
> migration will succeed, but the perceived perf/vPMU support could change.
> Can we assume it is the user's responsibility to ensure compatibility between
> KVM hosts when "-pmu" is specified?

Yes, I think so, too. I understand that QEMU needs to ensure vmstate
migration compatibility.

Thanks,
Zhao