[PATCH 03/20] perf/x86/intel: Parse CPUID archPerfmonExt leaves for non-hybrid CPUs

Posted by Dapeng Mi 1 year ago
The CPUID archPerfmonExt (0x23) leaves enumerate CPU-level PMU
capabilities on non-hybrid processors as well.

Parse the archPerfmonExt leaves on non-hybrid processors too.
Architectural PEBS uses archPerfmonExt sub-leaves 0x4 and 0x5 to
enumerate its capabilities. This patch is a precursor of the
subsequent arch-PEBS enabling patches.

Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
---
 arch/x86/events/intel/core.c | 27 ++++++++++++++++++++-------
 1 file changed, 20 insertions(+), 7 deletions(-)

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 12eb96219740..d29e7ada96aa 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -4955,27 +4955,27 @@ static inline bool intel_pmu_broken_perf_cap(void)
 	return false;
 }
 
-static void update_pmu_cap(struct x86_hybrid_pmu *pmu)
+static void update_pmu_cap(struct pmu *pmu)
 {
 	unsigned int sub_bitmaps, eax, ebx, ecx, edx;
 
 	cpuid(ARCH_PERFMON_EXT_LEAF, &sub_bitmaps, &ebx, &ecx, &edx);
 
 	if (ebx & ARCH_PERFMON_EXT_UMASK2)
-		pmu->config_mask |= ARCH_PERFMON_EVENTSEL_UMASK2;
+		hybrid(pmu, config_mask) |= ARCH_PERFMON_EVENTSEL_UMASK2;
 	if (ebx & ARCH_PERFMON_EXT_EQ)
-		pmu->config_mask |= ARCH_PERFMON_EVENTSEL_EQ;
+		hybrid(pmu, config_mask) |= ARCH_PERFMON_EVENTSEL_EQ;
 
 	if (sub_bitmaps & ARCH_PERFMON_NUM_COUNTER_LEAF) {
 		cpuid_count(ARCH_PERFMON_EXT_LEAF, ARCH_PERFMON_NUM_COUNTER_LEAF_BIT,
 			    &eax, &ebx, &ecx, &edx);
-		pmu->cntr_mask64 = eax;
-		pmu->fixed_cntr_mask64 = ebx;
+		hybrid(pmu, cntr_mask64) = eax;
+		hybrid(pmu, fixed_cntr_mask64) = ebx;
 	}
 
 	if (!intel_pmu_broken_perf_cap()) {
 		/* Perf Metric (Bit 15) and PEBS via PT (Bit 16) are hybrid enumeration */
-		rdmsrl(MSR_IA32_PERF_CAPABILITIES, pmu->intel_cap.capabilities);
+		rdmsrl(MSR_IA32_PERF_CAPABILITIES, hybrid(pmu, intel_cap).capabilities);
 	}
 }
 
@@ -5066,7 +5066,7 @@ static bool init_hybrid_pmu(int cpu)
 		goto end;
 
 	if (this_cpu_has(X86_FEATURE_ARCH_PERFMON_EXT))
-		update_pmu_cap(pmu);
+		update_pmu_cap(&pmu->pmu);
 
 	intel_pmu_check_hybrid_pmus(pmu);
 
@@ -6564,6 +6564,7 @@ __init int intel_pmu_init(void)
 
 	x86_pmu.pebs_events_mask	= intel_pmu_pebs_mask(x86_pmu.cntr_mask64);
 	x86_pmu.pebs_capable		= PEBS_COUNTER_MASK;
+	x86_pmu.config_mask		= X86_RAW_EVENT_MASK;
 
 	/*
 	 * Quirk: v2 perfmon does not report fixed-purpose events, so
@@ -7374,6 +7375,18 @@ __init int intel_pmu_init(void)
 		x86_pmu.attr_update = hybrid_attr_update;
 	}
 
+	/*
+	 * The archPerfmonExt (0x23) includes an enhanced enumeration of
+	 * PMU architectural features with a per-core view. For non-hybrid,
+	 * each core has the same PMU capabilities. It's good enough to
+	 * update the x86_pmu from the booting CPU. For hybrid, the x86_pmu
+	 * is used to keep the common capabilities. Still keep the values
+	 * from the leaf 0xa. The core specific update will be done later
+	 * when a new type is online.
+	 */
+	if (!is_hybrid() && boot_cpu_has(X86_FEATURE_ARCH_PERFMON_EXT))
+		update_pmu_cap(NULL);
+
 	intel_pmu_check_counters_mask(&x86_pmu.cntr_mask64,
 				      &x86_pmu.fixed_cntr_mask64,
 				      &x86_pmu.intel_ctrl);
-- 
2.40.1
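The core refactor above is changing update_pmu_cap() to take a generic `struct pmu *` (NULL on the boot-time non-hybrid path) and routing every field access through the kernel's `hybrid()` accessor, which falls back to the global `x86_pmu` when no hybrid PMU applies. A minimal user-space sketch of that dispatch pattern follows; it uses a NULL check in place of the kernel's is_hybrid() test, and the struct and field names merely mirror the kernel's — this is an illustrative model, not kernel code:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for struct x86_pmu /
 * struct x86_hybrid_pmu: just the fields the patch touches. */
struct pmu_caps {
	uint64_t config_mask;
	uint64_t cntr_mask64;
	uint64_t fixed_cntr_mask64;
};

/* Global capabilities (stands in for the kernel's x86_pmu). */
static struct pmu_caps x86_pmu_model;

/* Model of hybrid(pmu, field): with a NULL pmu, fall back to the
 * global copy; otherwise use the per-PMU copy. The real kernel macro
 * keys off is_hybrid() rather than a NULL pointer, but the effect is
 * the same: one function body serves both code paths. */
#define hybrid(pmu, field) \
	(*((pmu) ? &(pmu)->field : &x86_pmu_model.field))

/* Sketch of the reworked update_pmu_cap(): eax/ebx stand in for the
 * counter bitmaps that CPUID leaf 0x23 would return. */
static void update_pmu_cap_model(struct pmu_caps *pmu,
				 uint32_t eax, uint32_t ebx)
{
	hybrid(pmu, cntr_mask64) = eax;
	hybrid(pmu, fixed_cntr_mask64) = ebx;
}
```

Calling `update_pmu_cap_model(NULL, ...)` updates the global capabilities (the non-hybrid boot path added by this patch), while passing a per-PMU pointer leaves the global copy untouched (the per-core hybrid path in init_hybrid_pmu()).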
Re: [PATCH 03/20] perf/x86/intel: Parse CPUID archPerfmonExt leaves for non-hybrid CPUs
Posted by Andi Kleen 1 year ago
> +	/*
> +	 * The archPerfmonExt (0x23) includes an enhanced enumeration of
> +	 * PMU architectural features with a per-core view. For non-hybrid,
> +	 * each core has the same PMU capabilities. It's good enough to
> +	 * update the x86_pmu from the booting CPU. For hybrid, the x86_pmu
> +	 * is used to keep the common capabilities. Still keep the values
> +	 * from the leaf 0xa. The core specific update will be done later
> +	 * when a new type is online.
> +	 */
> +	if (!is_hybrid() && boot_cpu_has(X86_FEATURE_ARCH_PERFMON_EXT))
> +		update_pmu_cap(NULL);

It seems ugly to have these different code paths. Couldn't non hybrid
use x86_pmu in the same way? I assume it would be a larger patch.

-Andi
Re: [PATCH 03/20] perf/x86/intel: Parse CPUID archPerfmonExt leaves for non-hybrid CPUs
Posted by Liang, Kan 1 year ago

On 2025-01-23 1:58 p.m., Andi Kleen wrote:
>> +	/*
>> +	 * The archPerfmonExt (0x23) includes an enhanced enumeration of
>> +	 * PMU architectural features with a per-core view. For non-hybrid,
>> +	 * each core has the same PMU capabilities. It's good enough to
>> +	 * update the x86_pmu from the booting CPU. For hybrid, the x86_pmu
>> +	 * is used to keep the common capabilities. Still keep the values
>> +	 * from the leaf 0xa. The core specific update will be done later
>> +	 * when a new type is online.
>> +	 */
>> +	if (!is_hybrid() && boot_cpu_has(X86_FEATURE_ARCH_PERFMON_EXT))
>> +		update_pmu_cap(NULL);
> 
> It seems ugly to have these different code paths. Couldn't non hybrid
> use x86_pmu in the same way? I assume it would be a larger patch.

The current non-hybrid PMU is initialized in intel_pmu_init(), but some
of the initialization code for hybrid is in intel_pmu_cpu_starting().
Yes, it would be better to move them together, but that would be a
larger patch. Since it impacts other features, a separate patch set
would be required.

Thanks,
Kan
Re: [PATCH 03/20] perf/x86/intel: Parse CPUID archPerfmonExt leaves for non-hybrid CPUs
Posted by Peter Zijlstra 1 year ago
On Mon, Jan 27, 2025 at 10:19:34AM -0500, Liang, Kan wrote:
> 
> 
> On 2025-01-23 1:58 p.m., Andi Kleen wrote:
> >> +	/*
> >> +	 * The archPerfmonExt (0x23) includes an enhanced enumeration of
> >> +	 * PMU architectural features with a per-core view. For non-hybrid,
> >> +	 * each core has the same PMU capabilities. It's good enough to
> >> +	 * update the x86_pmu from the booting CPU. For hybrid, the x86_pmu
> >> +	 * is used to keep the common capabilities. Still keep the values
> >> +	 * from the leaf 0xa. The core specific update will be done later
> >> +	 * when a new type is online.
> >> +	 */
> >> +	if (!is_hybrid() && boot_cpu_has(X86_FEATURE_ARCH_PERFMON_EXT))
> >> +		update_pmu_cap(NULL);
> > 
> > It seems ugly to have these different code paths. Couldn't non hybrid
> > use x86_pmu in the same way? I assume it would be a larger patch.
> 
> The current non-hybrid PMU is initialized in intel_pmu_init(), but some
> of the initialization code for hybrid is in intel_pmu_cpu_starting().
> Yes, it would be better to move them together, but that would be a
> larger patch. Since it impacts other features, a separate patch set
> would be required.

IIRC the problem was that there were SKUs with the same FMS that were
both hybrid and non-hybrid and we wouldn't know until we brought up the
CPUs.

Thomas rewrote the topology bits since, so maybe we can do better these
days.
Re: [PATCH 03/20] perf/x86/intel: Parse CPUID archPerfmonExt leaves for non-hybrid CPUs
Posted by Mi, Dapeng 1 year ago
On 1/28/2025 12:44 AM, Peter Zijlstra wrote:
> On Mon, Jan 27, 2025 at 10:19:34AM -0500, Liang, Kan wrote:
>>
>> On 2025-01-23 1:58 p.m., Andi Kleen wrote:
>>>> +	/*
>>>> +	 * The archPerfmonExt (0x23) includes an enhanced enumeration of
>>>> +	 * PMU architectural features with a per-core view. For non-hybrid,
>>>> +	 * each core has the same PMU capabilities. It's good enough to
>>>> +	 * update the x86_pmu from the booting CPU. For hybrid, the x86_pmu
>>>> +	 * is used to keep the common capabilities. Still keep the values
>>>> +	 * from the leaf 0xa. The core specific update will be done later
>>>> +	 * when a new type is online.
>>>> +	 */
>>>> +	if (!is_hybrid() && boot_cpu_has(X86_FEATURE_ARCH_PERFMON_EXT))
>>>> +		update_pmu_cap(NULL);
>>> It seems ugly to have these different code paths. Couldn't non hybrid
>>> use x86_pmu in the same way? I assume it would be a larger patch.
>> The current non-hybrid PMU is initialized in intel_pmu_init(), but some
>> of the initialization code for hybrid is in intel_pmu_cpu_starting().
>> Yes, it would be better to move them together, but that would be a
>> larger patch. Since it impacts other features, a separate patch set
>> would be required.
> IIRC the problem was that there were SKUs with the same FMS that were
> both hybrid and non-hybrid and we wouldn't know until we brought up the
> CPUs.
>
> Thomas rewrote the topology bits since, so maybe we can do better these
> days.

This optimization would be a fundamental and large change. As Kan said,
we'd better make it a separate patch series so that it doesn't block
the arch-PEBS enabling patches.
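For reference, the sub-leaf guard the patch applies (only read the counter-bitmap sub-leaf when leaf 0x23 sub-leaf 0 advertises it, otherwise keep the values enumerated from leaf 0xa) can be modeled in a few lines of user-space C. The constant names and the bit position below are illustrative stand-ins for the kernel's ARCH_PERFMON_* definitions, not authoritative values:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical constants mirroring ARCH_PERFMON_NUM_COUNTER_LEAF*;
 * the bit position is an assumption for illustration only. */
#define NUM_COUNTER_LEAF_BIT	1
#define NUM_COUNTER_LEAF	(1u << NUM_COUNTER_LEAF_BIT)

struct counters {
	uint64_t cntr_mask64;       /* general-purpose counter bitmap (EAX) */
	uint64_t fixed_cntr_mask64; /* fixed counter bitmap (EBX) */
};

/* sub_bitmaps models EAX of leaf 0x23 sub-leaf 0 (the valid-sub-leaf
 * bitmap); eax/ebx model the output of the counter sub-leaf. Fill in
 * the masks only when the sub-leaf is actually enumerated, matching
 * the guard in update_pmu_cap() before cpuid_count(). */
static int parse_counter_leaf(uint32_t sub_bitmaps, uint32_t eax,
			      uint32_t ebx, struct counters *c)
{
	if (!(sub_bitmaps & NUM_COUNTER_LEAF))
		return 0;	/* sub-leaf absent: keep leaf-0xa values */
	c->cntr_mask64 = eax;
	c->fixed_cntr_mask64 = ebx;
	return 1;
}
```

When the sub-leaf bit is clear, the previously enumerated masks survive unchanged, which is exactly why the patch can call update_pmu_cap() late in intel_pmu_init() without clobbering the leaf-0xa defaults.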