PMU MSRs are set by QEMU only at levels >= KVM_PUT_RESET_STATE,
excluding runtime. Therefore, updating these MSRs without stopping events
should be acceptable.
In addition, KVM creates kernel perf events with host mode excluded
(exclude_host = 1). While the events remain active, they don't increment
the counter during QEMU vCPU userspace mode.
Finally, The kvm_put_msrs() sets the MSRs using KVM_SET_MSRS. The x86 KVM
processes these MSRs one by one in a loop, only saving the config and
triggering the KVM_REQ_PMU request. This approach does not immediately stop
the event before updating PMC. This approach is true since Linux kernel
commit 68fb4757e867 ("KVM: x86/pmu: Defer reprogram_counter() to
kvm_pmu_handle_event"), that is, v6.2.
No Fixed tag is going to be added for the commit 0d89436786b0 ("kvm:
migrate vPMU state"), because this isn't a bugfix.
Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
---
Changed since v3:
- Re-order reasons in commit messages.
- Mention KVM's commit 68fb4757e867 (v6.2).
- Keep Zhao's review as there isn't code change.
Changed since v6:
- Add Reviewed-by from Dapeng Mi.
target/i386/kvm/kvm.c | 9 ---------
1 file changed, 9 deletions(-)
diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index 99837048b8..742dc6ac0d 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -4222,13 +4222,6 @@ static int kvm_put_msrs(X86CPU *cpu, KvmPutState level)
}
if ((IS_INTEL_CPU(env) || IS_ZHAOXIN_CPU(env)) && pmu_version > 0) {
- if (pmu_version > 1) {
- /* Stop the counter. */
- kvm_msr_entry_add(cpu, MSR_CORE_PERF_FIXED_CTR_CTRL, 0);
- kvm_msr_entry_add(cpu, MSR_CORE_PERF_GLOBAL_CTRL, 0);
- }
-
- /* Set the counter values. */
for (i = 0; i < num_pmu_fixed_counters; i++) {
kvm_msr_entry_add(cpu, MSR_CORE_PERF_FIXED_CTR0 + i,
env->msr_fixed_counters[i]);
@@ -4244,8 +4237,6 @@ static int kvm_put_msrs(X86CPU *cpu, KvmPutState level)
env->msr_global_status);
kvm_msr_entry_add(cpu, MSR_CORE_PERF_GLOBAL_OVF_CTRL,
env->msr_global_ovf_ctrl);
-
- /* Now start the PMU. */
kvm_msr_entry_add(cpu, MSR_CORE_PERF_FIXED_CTR_CTRL,
env->msr_fixed_ctr_ctrl);
kvm_msr_entry_add(cpu, MSR_CORE_PERF_GLOBAL_CTRL,
--
2.39.3
On 12/29/2025 11:42 PM, Dongli Zhang wrote:
> PMU MSRs are set by QEMU only at levels >= KVM_PUT_RESET_STATE,
> excluding runtime. Therefore, updating these MSRs without stopping events
> should be acceptable.
It seems preferable to keep the existing logic. The sequence of
disabling -> setting new counters -> re-enabling is complete and
reasonable. Re-enabling the PMU implicitly tell KVM to do whatever
actions are needed to make the new counters take effect.
If the purpose of this patch to improve performance, given that this is
a non-critical path, trading this clear and robust logic for a minor
performance gain does not seem necessary.
> In addition, KVM creates kernel perf events with host mode excluded
> (exclude_host = 1). While the events remain active, they don't increment
> the counter during QEMU vCPU userspace mode.
>
> Finally, The kvm_put_msrs() sets the MSRs using KVM_SET_MSRS. The x86 KVM
> processes these MSRs one by one in a loop, only saving the config and
> triggering the KVM_REQ_PMU request. This approach does not immediately stop
> the event before updating PMC. This approach is true since Linux kernel
> commit 68fb4757e867 ("KVM: x86/pmu: Defer reprogram_counter() to
> kvm_pmu_handle_event"), that is, v6.2.
This seems to assume KVM's internal behavior. While that is true today
(and possibly in the future), it’s not necessary for QEMU to make such
assumptions, as that could unnecessarily limit KVM’s flexibility to
change its behavior later.
> No Fixed tag is going to be added for the commit 0d89436786b0 ("kvm:
> migrate vPMU state"), because this isn't a bugfix.
>
> Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
> Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
> Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
> ---
> Changed since v3:
> - Re-order reasons in commit messages.
> - Mention KVM's commit 68fb4757e867 (v6.2).
> - Keep Zhao's review as there isn't code change.
> Changed since v6:
> - Add Reviewed-by from Dapeng Mi.
>
> target/i386/kvm/kvm.c | 9 ---------
> 1 file changed, 9 deletions(-)
>
> diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
> index 99837048b8..742dc6ac0d 100644
> --- a/target/i386/kvm/kvm.c
> +++ b/target/i386/kvm/kvm.c
> @@ -4222,13 +4222,6 @@ static int kvm_put_msrs(X86CPU *cpu, KvmPutState level)
> }
>
> if ((IS_INTEL_CPU(env) || IS_ZHAOXIN_CPU(env)) && pmu_version > 0) {
> - if (pmu_version > 1) {
> - /* Stop the counter. */
> - kvm_msr_entry_add(cpu, MSR_CORE_PERF_FIXED_CTR_CTRL, 0);
> - kvm_msr_entry_add(cpu, MSR_CORE_PERF_GLOBAL_CTRL, 0);
> - }
> -
> - /* Set the counter values. */
> for (i = 0; i < num_pmu_fixed_counters; i++) {
> kvm_msr_entry_add(cpu, MSR_CORE_PERF_FIXED_CTR0 + i,
> env->msr_fixed_counters[i]);
> @@ -4244,8 +4237,6 @@ static int kvm_put_msrs(X86CPU *cpu, KvmPutState level)
> env->msr_global_status);
> kvm_msr_entry_add(cpu, MSR_CORE_PERF_GLOBAL_OVF_CTRL,
> env->msr_global_ovf_ctrl);
> -
> - /* Now start the PMU. */
> kvm_msr_entry_add(cpu, MSR_CORE_PERF_FIXED_CTR_CTRL,
> env->msr_fixed_ctr_ctrl);
> kvm_msr_entry_add(cpu, MSR_CORE_PERF_GLOBAL_CTRL,
Hi Zide,
On 1/2/26 4:27 PM, Chen, Zide wrote:
>
>
> On 12/29/2025 11:42 PM, Dongli Zhang wrote:
>> PMU MSRs are set by QEMU only at levels >= KVM_PUT_RESET_STATE,
>> excluding runtime. Therefore, updating these MSRs without stopping events
>> should be acceptable.
>
> It seems preferable to keep the existing logic. The sequence of
> disabling -> setting new counters -> re-enabling is complete and
> reasonable. Re-enabling the PMU implicitly tell KVM to do whatever
> actions are needed to make the new counters take effect.
>
> If the purpose of this patch to improve performance, given that this is
> a non-critical path, trading this clear and robust logic for a minor
> performance gain does not seem necessary.
>
>
>> In addition, KVM creates kernel perf events with host mode excluded
>> (exclude_host = 1). While the events remain active, they don't increment
>> the counter during QEMU vCPU userspace mode.
>>
>> Finally, The kvm_put_msrs() sets the MSRs using KVM_SET_MSRS. The x86 KVM
>> processes these MSRs one by one in a loop, only saving the config and
>> triggering the KVM_REQ_PMU request. This approach does not immediately stop
>> the event before updating PMC. This approach is true since Linux kernel
>> commit 68fb4757e867 ("KVM: x86/pmu: Defer reprogram_counter() to
>> kvm_pmu_handle_event"), that is, v6.2.
>
> This seems to assume KVM's internal behavior. While that is true today
> (and possibly in the future), it’s not necessary for QEMU to make such
> assumptions, as that could unnecessarily limit KVM’s flexibility to
> change its behavior later.
>
To "assume KVM's internal behavior" is only one of the two reasons. The first
reason is that QEMU controls the state of the vCPU to ensure this action only
occurs when "levels >= KVM_PUT_RESET_STATE."
Thanks to "(level >= KVM_PUT_RESET_STATE)", QEMU is able to avoid unnecessary
updates to many MSR registers during runtime.
The main objective is to sync the implementation for Intel and AMD.
Both MSR_CORE_PERF_FIXED_CTR_CTRL and MSR_CORE_PERF_GLOBAL_CTRL are reset to
zero only in the case where "has_pmu_version > 1." Otherwise, we may need to
reset the MSR_P6_PERFCTR_N registers before writing to the counter registers.
Without PATCH 7/7, an additional patch will be required to fix the workflow for
handling PMU registers to reset control registers before counter registers.
If the plan is to maintain the current logic, we may need to adjust the logic
for the AMD registers as well. In PATCH 6/7, we never reset global registers
before writing to control and counter registers.
Would you mine sharing your thoughts on it?
Thank you very much!
Dongli Zhang
On 1/5/2026 12:24 PM, Dongli Zhang wrote:
> Hi Zide,
>
> On 1/2/26 4:27 PM, Chen, Zide wrote:
>>
>>
>> On 12/29/2025 11:42 PM, Dongli Zhang wrote:
>>> PMU MSRs are set by QEMU only at levels >= KVM_PUT_RESET_STATE,
>>> excluding runtime. Therefore, updating these MSRs without stopping events
>>> should be acceptable.
>>
>> It seems preferable to keep the existing logic. The sequence of
>> disabling -> setting new counters -> re-enabling is complete and
>> reasonable. Re-enabling the PMU implicitly tell KVM to do whatever
>> actions are needed to make the new counters take effect.
>>
>> If the purpose of this patch to improve performance, given that this is
>> a non-critical path, trading this clear and robust logic for a minor
>> performance gain does not seem necessary.
>>
>>
>>> In addition, KVM creates kernel perf events with host mode excluded
>>> (exclude_host = 1). While the events remain active, they don't increment
>>> the counter during QEMU vCPU userspace mode.
>>>
>>> Finally, The kvm_put_msrs() sets the MSRs using KVM_SET_MSRS. The x86 KVM
>>> processes these MSRs one by one in a loop, only saving the config and
>>> triggering the KVM_REQ_PMU request. This approach does not immediately stop
>>> the event before updating PMC. This approach is true since Linux kernel
>>> commit 68fb4757e867 ("KVM: x86/pmu: Defer reprogram_counter() to
>>> kvm_pmu_handle_event"), that is, v6.2.
>>
>> This seems to assume KVM's internal behavior. While that is true today
>> (and possibly in the future), it’s not necessary for QEMU to make such
>> assumptions, as that could unnecessarily limit KVM’s flexibility to
>> change its behavior later.
>>
>
> To "assume KVM's internal behavior" is only one of the two reasons. The first
> reason is that QEMU controls the state of the vCPU to ensure this action only
> occurs when "levels >= KVM_PUT_RESET_STATE."
>
> Thanks to "(level >= KVM_PUT_RESET_STATE)", QEMU is able to avoid unnecessary
> updates to many MSR registers during runtime.
>
>
> The main objective is to sync the implementation for Intel and AMD.
>
> Both MSR_CORE_PERF_FIXED_CTR_CTRL and MSR_CORE_PERF_GLOBAL_CTRL are reset to
> zero only in the case where "has_pmu_version > 1." Otherwise, we may need to
> reset the MSR_P6_PERFCTR_N registers before writing to the counter registers.
> Without PATCH 7/7, an additional patch will be required to fix the workflow for
> handling PMU registers to reset control registers before counter registers.
I might be missing something here, but since this is not for runtime,
I don’t quite understand the need to reset the control registers.
> If the plan is to maintain the current logic, we may need to adjust the logic
> for the AMD registers as well. In PATCH 6/7, we never reset global registers
> before writing to control and counter registers.
>
> Would you mine sharing your thoughts on it?
Personally, I would lean towards keeping the current logic and instead
adjusting patch 6/7 to reset the global registers. This is just my
view, and please don’t feel obligated to follow it.
> Thank you very much!
>
> Dongli Zhang
Hi Zide,
On 1/6/26 1:04 PM, Chen, Zide wrote:
>
>
> On 1/5/2026 12:24 PM, Dongli Zhang wrote:
>> Hi Zide,
>>
>> On 1/2/26 4:27 PM, Chen, Zide wrote:
>>>
>>>
>>> On 12/29/2025 11:42 PM, Dongli Zhang wrote:
>>>> PMU MSRs are set by QEMU only at levels >= KVM_PUT_RESET_STATE,
>>>> excluding runtime. Therefore, updating these MSRs without stopping events
>>>> should be acceptable.
>>>
>>> It seems preferable to keep the existing logic. The sequence of
>>> disabling -> setting new counters -> re-enabling is complete and
>>> reasonable. Re-enabling the PMU implicitly tell KVM to do whatever
>>> actions are needed to make the new counters take effect.
>>>
>>> If the purpose of this patch to improve performance, given that this is
>>> a non-critical path, trading this clear and robust logic for a minor
>>> performance gain does not seem necessary.
>>>
>>>
>>>> In addition, KVM creates kernel perf events with host mode excluded
>>>> (exclude_host = 1). While the events remain active, they don't increment
>>>> the counter during QEMU vCPU userspace mode.
>>>>
>>>> Finally, The kvm_put_msrs() sets the MSRs using KVM_SET_MSRS. The x86 KVM
>>>> processes these MSRs one by one in a loop, only saving the config and
>>>> triggering the KVM_REQ_PMU request. This approach does not immediately stop
>>>> the event before updating PMC. This approach is true since Linux kernel
>>>> commit 68fb4757e867 ("KVM: x86/pmu: Defer reprogram_counter() to
>>>> kvm_pmu_handle_event"), that is, v6.2.
>>>
>>> This seems to assume KVM's internal behavior. While that is true today
>>> (and possibly in the future), it’s not necessary for QEMU to make such
>>> assumptions, as that could unnecessarily limit KVM’s flexibility to
>>> change its behavior later.
>>>
>>
>> To "assume KVM's internal behavior" is only one of the two reasons. The first
>> reason is that QEMU controls the state of the vCPU to ensure this action only
>> occurs when "levels >= KVM_PUT_RESET_STATE."
>>
>> Thanks to "(level >= KVM_PUT_RESET_STATE)", QEMU is able to avoid unnecessary
>> updates to many MSR registers during runtime.
>>
>>
>> The main objective is to sync the implementation for Intel and AMD.
>>
>> Both MSR_CORE_PERF_FIXED_CTR_CTRL and MSR_CORE_PERF_GLOBAL_CTRL are reset to
>> zero only in the case where "has_pmu_version > 1." Otherwise, we may need to
>> reset the MSR_P6_PERFCTR_N registers before writing to the counter registers.
>> Without PATCH 7/7, an additional patch will be required to fix the workflow for
>> handling PMU registers to reset control registers before counter registers.
>
> I might be missing something here, but since this is not for runtime,
> I don’t quite understand the need to reset the control registers.
Global control registers take priority over per-counter control registers. If a
counter is disabled by the global register, enabling it in the per-counter
control register will have no effect.
However, assuming !(has_architectural_pmu_version > 1), which is highly unlikely
in modern systems, there will be no support for the global register.
As a result, QEMU should:
1. Set the control register to 0 for each counter.
2. Write to the counter.
3. Write to the control register.
From the KVM source code, the minimum version is 2, although QEMU assumes any
version can be utilized.
>
>> If the plan is to maintain the current logic, we may need to adjust the logic
>> for the AMD registers as well. In PATCH 6/7, we never reset global registers
>> before writing to control and counter registers.
>>
>> Would you mine sharing your thoughts on it?
>
> Personally, I would lean towards keeping the current logic and instead
> adjusting patch 6/7 to reset the global registers. This is just my
> view, and please don’t feel obligated to follow it.
>
It just seems odd to me that we would need to emulate PMU MSR usage in a VM,
i.e., stop all counters in the global registers at the beginning, when they
won't actually take effect.
Perhaps other reviewers have some insights?
Thank you very much!
Dongli Zhang
© 2016 - 2026 Red Hat, Inc.