[PATCH] perf/x86/intel: Save/restore cpuc->active_pebs_data_cfg when using guest PEBS

Posted by Like Xu 2 years, 8 months ago
From: Like Xu <likexu@tencent.com>

After commit b752ea0c28e3 ("perf/x86/intel/ds: Flush PEBS DS when changing
PEBS_DATA_CFG"), cpuc->pebs_data_cfg may hold software-only bits that real
hardware does not support, such as PEBS_UPDATE_DS_SW. The VMX hardware MSR
switching mechanism would then save/restore an invalid value for the
PEBS_DATA_CFG MSR, crashing the host when PEBS is used for a guest.
Fix it by using the active host value from cpuc->active_pebs_data_cfg.

Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Like Xu <likexu@tencent.com>
---
 arch/x86/events/intel/core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 070cc4ef2672..89b9c1cebb61 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -4074,7 +4074,7 @@ static struct perf_guest_switch_msr *intel_guest_get_msrs(int *nr, void *data)
 	if (x86_pmu.intel_cap.pebs_baseline) {
 		arr[(*nr)++] = (struct perf_guest_switch_msr){
 			.msr = MSR_PEBS_DATA_CFG,
-			.host = cpuc->pebs_data_cfg,
+			.host = cpuc->active_pebs_data_cfg,
 			.guest = kvm_pmu->pebs_data_cfg,
 		};
 	}
-- 
2.40.1
Re: [PATCH] perf/x86/intel: Save/restore cpuc->active_pebs_data_cfg when using guest PEBS
Posted by Sean Christopherson 2 years, 7 months ago
+KVM

On Wed, May 17, 2023, Like Xu wrote:
> From: Like Xu <likexu@tencent.com>
> 
> After commit b752ea0c28e3 ("perf/x86/intel/ds: Flush PEBS DS when changing
> PEBS_DATA_CFG"), cpuc->pebs_data_cfg may hold software-only bits that real
> hardware does not support, such as PEBS_UPDATE_DS_SW. The VMX hardware MSR
> switching mechanism would then save/restore an invalid value for the
> PEBS_DATA_CFG MSR, crashing the host when PEBS is used for a guest.
> Fix it by using the active host value from cpuc->active_pebs_data_cfg.

In the future, please Cc: kvm@vger.kernel.org when posting fixes that obviously
affect KVM.  I wasted several hours bisecting these crashes.  In hindsight, I
should have searched all of lore sooner, but it really shouldn't have been that
hard for me to find this fix.

Re: [PATCH] perf/x86/intel: Save/restore cpuc->active_pebs_data_cfg when using guest PEBS
Posted by Liang, Kan 2 years, 8 months ago

On 2023-05-17 9:38 a.m., Like Xu wrote:
> From: Like Xu <likexu@tencent.com>
> 
> After commit b752ea0c28e3 ("perf/x86/intel/ds: Flush PEBS DS when changing
> PEBS_DATA_CFG"), cpuc->pebs_data_cfg may hold software-only bits that real
> hardware does not support, such as PEBS_UPDATE_DS_SW. The VMX hardware MSR
> switching mechanism would then save/restore an invalid value for the
> PEBS_DATA_CFG MSR, crashing the host when PEBS is used for a guest.

I believe we clear the SW bit when it takes effect.

+	if (cpuc->pebs_data_cfg & PEBS_UPDATE_DS_SW) {
+		cpuc->pebs_data_cfg = pebs_data_cfg;
+		pebs_update_threshold(cpuc);
+	}

I think the SW bit can only be seen in a short period between add() and
enable(). Is it caused by a VM-entry that happens to fall within that period?

> Fix it by using the active host value from cpuc->active_pebs_data_cfg.

I don't see a problem with using active_pebs_data_cfg, since it reflects
the current MSR setting. I'm just curious about how it's triggered.
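
For readers following the thread, the add()/enable() window described above
can be illustrated with a small userspace model in C. This is only a sketch
under stated assumptions: struct pebs_model and the model_* functions are
hypothetical stand-ins, not kernel code, and PEBS_UPDATE_DS_SW is assumed to
be bit 63.

```c
#include <assert.h>
#include <stdint.h>

/* Software-only flag; assumed here to be bit 63. */
#define PEBS_UPDATE_DS_SW (1ULL << 63)

/* Hypothetical stand-in for the two fields discussed in this thread. */
struct pebs_model {
	uint64_t pebs_data_cfg;        /* requested config, may carry the SW flag */
	uint64_t active_pebs_data_cfg; /* value last written to the MSR */
};

/* Models add(): record the requested config and mark a pending DS update. */
static void model_pebs_add(struct pebs_model *m, uint64_t cfg)
{
	m->pebs_data_cfg = cfg | PEBS_UPDATE_DS_SW;
}

/* Models enable(): the flag takes effect, is cleared, and the MSR shadow is synced. */
static void model_pebs_enable(struct pebs_model *m)
{
	uint64_t hw_cfg = m->pebs_data_cfg & ~PEBS_UPDATE_DS_SW;

	if (m->pebs_data_cfg & PEBS_UPDATE_DS_SW)
		m->pebs_data_cfg = hw_cfg;

	m->active_pebs_data_cfg = hw_cfg;

	/* The value that reaches the MSR never carries the software flag. */
	assert(!(m->active_pebs_data_cfg & PEBS_UPDATE_DS_SW));
}
```

In this model, anything that reads pebs_data_cfg between model_pebs_add() and
model_pebs_enable() sees the flag, while active_pebs_data_cfg never carries it.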

> 
> Cc: Kan Liang <kan.liang@linux.intel.com>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Signed-off-by: Like Xu <likexu@tencent.com>
> ---

Reviewed-by: Kan Liang <kan.liang@linux.intel.com>

Thanks,
Kan

Re: [PATCH] perf/x86/intel: Save/restore cpuc->active_pebs_data_cfg when using guest PEBS
Posted by Like Xu 2 years, 8 months ago
On 19/5/2023 12:31 am, Liang, Kan wrote:
> 
> 
> On 2023-05-17 9:38 a.m., Like Xu wrote:
>> From: Like Xu <likexu@tencent.com>
>>
>> After commit b752ea0c28e3 ("perf/x86/intel/ds: Flush PEBS DS when changing
>> PEBS_DATA_CFG"), cpuc->pebs_data_cfg may hold software-only bits that real
>> hardware does not support, such as PEBS_UPDATE_DS_SW. The VMX hardware MSR
>> switching mechanism would then save/restore an invalid value for the
>> PEBS_DATA_CFG MSR, crashing the host when PEBS is used for a guest.
> 
> I believe we clear the SW bit when it takes effect.
> 
> +	if (cpuc->pebs_data_cfg & PEBS_UPDATE_DS_SW) {
> +		cpuc->pebs_data_cfg = pebs_data_cfg;
> +		pebs_update_threshold(cpuc);
> +	}
> 
> I think the SW bit can only be seen in a short period between add() and
> enable(). Is it caused by a VM-entry that happens to fall within that period?

What happens here is that when *intel_pmu_pebs_del()* is called,
pebs_update_state() also executes:
	cpuc->pebs_data_cfg |= PEBS_UPDATE_DS_SW;
and the new value is then used for the next kvm_entry.

The KVM-created PEBS perf_event is not added/enabled at this point,
yet cpuc->pebs_data_cfg unexpectedly holds a non-zero value.

Perhaps there is more room for perf fixes here, but for guest PEBS usage,
using active_pebs_data_cfg in intel_guest_get_msrs() is part of what is needed.
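
The del() path described above can be sketched as a userspace model in C.
Everything here is illustrative: struct pebs_model, struct model_switch_msr,
the model_* functions and MODEL_HW_VALID_MASK are hypothetical names, the mask
value is made up, and PEBS_UPDATE_DS_SW is assumed to be bit 63. The sketch
only shows which of the two fields yields a host value free of the software
flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PEBS_UPDATE_DS_SW   (1ULL << 63) /* assumed software-only flag */
#define MODEL_HW_VALID_MASK 0xfULL       /* hypothetical set of hardware-backed bits */

struct pebs_model {
	uint64_t pebs_data_cfg;        /* requested config, may carry the SW flag */
	uint64_t active_pebs_data_cfg; /* value last written to the MSR */
};

/* Simplified stand-in for one host/guest MSR switch entry. */
struct model_switch_msr {
	uint64_t host;
	uint64_t guest;
};

/* Models the removal path: the state update sets the flag again. */
static void model_pebs_del(struct pebs_model *m)
{
	m->pebs_data_cfg |= PEBS_UPDATE_DS_SW;
}

/* Builds the switch entry from either the pending or the active field. */
static struct model_switch_msr model_get_msr(const struct pebs_model *m,
					     uint64_t guest_cfg, bool use_active)
{
	assert(m);

	struct model_switch_msr e = {
		.host  = use_active ? m->active_pebs_data_cfg : m->pebs_data_cfg,
		.guest = guest_cfg,
	};
	return e;
}

/* A value is acceptable in this model only if no bit outside the mask is set. */
static bool model_host_value_valid(uint64_t v)
{
	return (v & ~MODEL_HW_VALID_MASK) == 0;
}
```

With no enable() following the removal, the pending field still carries the
flag at the next guest entry, so only the entry built from the active field
passes model_host_value_valid().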

Re: [PATCH] perf/x86/intel: Save/restore cpuc->active_pebs_data_cfg when using guest PEBS
Posted by Liang, Kan 2 years, 8 months ago

On 2023-05-19 3:40 a.m., Like Xu wrote:
> On 19/5/2023 12:31 am, Liang, Kan wrote:
>>
>>
>> On 2023-05-17 9:38 a.m., Like Xu wrote:
>>> From: Like Xu <likexu@tencent.com>
>>>
>>> After commit b752ea0c28e3 ("perf/x86/intel/ds: Flush PEBS DS when changing
>>> PEBS_DATA_CFG"), cpuc->pebs_data_cfg may hold software-only bits that real
>>> hardware does not support, such as PEBS_UPDATE_DS_SW. The VMX hardware MSR
>>> switching mechanism would then save/restore an invalid value for the
>>> PEBS_DATA_CFG MSR, crashing the host when PEBS is used for a guest.
>>
>> I believe we clear the SW bit when it takes effect.
>>
>> +    if (cpuc->pebs_data_cfg & PEBS_UPDATE_DS_SW) {
>> +        cpuc->pebs_data_cfg = pebs_data_cfg;
>> +        pebs_update_threshold(cpuc);
>> +    }
>>
>> I think the SW bit can only be seen in a short period between add() and
>> enable(). Is it caused by a VM-entry that happens to fall within that period?
> 
> What happens here is that when *intel_pmu_pebs_del()* is called,

Ah, yes. I think it's useless to set the bit for the removal. We may
need a cleanup patch to improve it.

Thanks,
Kan
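
One possible shape for the cleanup mentioned above, shown purely as a
hypothetical userspace sketch in C: the deferred-update flag is requested only
when an event is added, never on removal. struct pebs_model and
model_pebs_update_state() are invented names, PEBS_UPDATE_DS_SW is assumed to
be bit 63, and this is not taken from any posted patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PEBS_UPDATE_DS_SW (1ULL << 63) /* assumed software-only flag */

/* Hypothetical stand-in for the pending PEBS config field. */
struct pebs_model {
	uint64_t pebs_data_cfg;
};

/* Hypothetical variant: only additions request a deferred DS update. */
static void model_pebs_update_state(struct pebs_model *m, bool add)
{
	assert(m);

	if (add)
		m->pebs_data_cfg |= PEBS_UPDATE_DS_SW;
}
```

In this variant, a removal leaves pebs_data_cfg untouched, so no stale flag is
left behind when no enable() follows.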

> pebs_update_state() also executes:
>     cpuc->pebs_data_cfg |= PEBS_UPDATE_DS_SW;
> and the new value is then used for the next kvm_entry.
> 
> The KVM-created PEBS perf_event is not added/enabled at this point,
> yet cpuc->pebs_data_cfg unexpectedly holds a non-zero value.
> 
> Perhaps there is more room for perf fixes here, but for guest PEBS usage,
> using active_pebs_data_cfg in intel_guest_get_msrs() is part of what is
> needed.
[tip: perf/urgent] perf/x86/intel: Save/restore cpuc->active_pebs_data_cfg when using guest PEBS
Posted by tip-bot2 for Like Xu 2 years, 8 months ago
The following commit has been merged into the perf/urgent branch of tip:

Commit-ID:     3c845304d2d723f20d5b91fef5d133ff94825d76
Gitweb:        https://git.kernel.org/tip/3c845304d2d723f20d5b91fef5d133ff94825d76
Author:        Like Xu <likexu@tencent.com>
AuthorDate:    Wed, 17 May 2023 21:38:08 +08:00
Committer:     Peter Zijlstra <peterz@infradead.org>
CommitterDate: Tue, 23 May 2023 10:01:13 +02:00

perf/x86/intel: Save/restore cpuc->active_pebs_data_cfg when using guest PEBS

After commit b752ea0c28e3 ("perf/x86/intel/ds: Flush PEBS DS when changing
PEBS_DATA_CFG"), cpuc->pebs_data_cfg may hold software-only bits that real
hardware does not support, such as PEBS_UPDATE_DS_SW. The VMX hardware MSR
switching mechanism would then save/restore an invalid value for the
PEBS_DATA_CFG MSR, crashing the host when PEBS is used for a guest.
Fix it by using the active host value from cpuc->active_pebs_data_cfg.

Fixes: b752ea0c28e3 ("perf/x86/intel/ds: Flush PEBS DS when changing PEBS_DATA_CFG")
Signed-off-by: Like Xu <likexu@tencent.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Link: https://lore.kernel.org/r/20230517133808.67885-1-likexu@tencent.com
---
 arch/x86/events/intel/core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 070cc4e..89b9c1c 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -4074,7 +4074,7 @@ static struct perf_guest_switch_msr *intel_guest_get_msrs(int *nr, void *data)
 	if (x86_pmu.intel_cap.pebs_baseline) {
 		arr[(*nr)++] = (struct perf_guest_switch_msr){
 			.msr = MSR_PEBS_DATA_CFG,
-			.host = cpuc->pebs_data_cfg,
+			.host = cpuc->active_pebs_data_cfg,
 			.guest = kvm_pmu->pebs_data_cfg,
 		};
 	}