From: dongsheng <dongsheng.x.zhang@intel.com>
For Intel Atom CPUs, the PMU events "Instruction Retired" and
"Branch Instruction Retired" may be overcounted on certain
instructions, such as FAR CALL/JMP, RETF, IRET, VMENTRY/VMEXIT/VMPTRLD
and complex SGX/SMX/CSTATE instructions/flows.
Details can be found in erratum SRF7 of the specification update:
https://edc.intel.com/content/www/us/en/design/products-and-solutions/processors-and-chipsets/sierra-forest/xeon-6700-series-processor-with-e-cores-specification-update/errata-details/
For Atom platforms up to and including Sierra Forest, both the
"Instruction Retired" and "Branch Instruction Retired" events are
overcounted on these instructions, while on Clearwater Forest only
the "Instruction Retired" event is overcounted.
So add a helper, detect_inst_overcount_flags(), to detect whether the
platform has the overcount issue. Later patches will use the overcount
flags returned by this helper to relax the precise count checks.
Signed-off-by: dongsheng <dongsheng.x.zhang@intel.com>
[Rewrite comments and commit message - Dapeng]
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Tested-by: Yi Lai <yi1.lai@intel.com>
---
lib/x86/processor.h | 17 ++++++++++++++++
x86/pmu.c | 47 +++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 64 insertions(+)
diff --git a/lib/x86/processor.h b/lib/x86/processor.h
index 62f3d578..3f475c21 100644
--- a/lib/x86/processor.h
+++ b/lib/x86/processor.h
@@ -1188,4 +1188,21 @@ static inline bool is_lam_u57_enabled(void)
return !!(read_cr3() & X86_CR3_LAM_U57);
}
+static inline u32 x86_family(u32 eax)
+{
+ u32 x86;
+
+ x86 = (eax >> 8) & 0xf;
+
+ if (x86 == 0xf)
+ x86 += (eax >> 20) & 0xff;
+
+ return x86;
+}
+
+static inline u32 x86_model(u32 eax)
+{
+ return ((eax >> 12) & 0xf0) | ((eax >> 4) & 0x0f);
+}
+
#endif
diff --git a/x86/pmu.c b/x86/pmu.c
index a6b0cfcc..87365aff 100644
--- a/x86/pmu.c
+++ b/x86/pmu.c
@@ -159,6 +159,14 @@ static struct pmu_event *gp_events;
static unsigned int gp_events_size;
static unsigned int fixed_counters_num;
+/*
+ * Flags for Intel "Instruction Retired" and "Branch Instruction Retired"
+ * overcount flaws.
+ */
+#define INST_RETIRED_OVERCOUNT BIT(0)
+#define BR_RETIRED_OVERCOUNT BIT(1)
+static u32 intel_inst_overcount_flags;
+
static int has_ibpb(void)
{
return this_cpu_has(X86_FEATURE_SPEC_CTRL) ||
@@ -959,6 +967,43 @@ static void check_invalid_rdpmc_gp(void)
"Expected #GP on RDPMC(64)");
}
+/*
+ * For Intel Atom CPUs, the PMU events "Instruction Retired" and
+ * "Branch Instruction Retired" may be overcounted on certain
+ * instructions, such as FAR CALL/JMP, RETF, IRET, VMENTRY/VMEXIT/VMPTRLD
+ * and complex SGX/SMX/CSTATE instructions/flows.
+ *
+ * Details can be found in erratum SRF7 of the specification update:
+ * https://edc.intel.com/content/www/us/en/design/products-and-solutions/processors-and-chipsets/sierra-forest/xeon-6700-series-processor-with-e-cores-specification-update/errata-details/
+ *
+ * For Atom platforms up to and including Sierra Forest, both the
+ * "Instruction Retired" and "Branch Instruction Retired" events are
+ * overcounted on these instructions, while on Clearwater Forest only
+ * the "Instruction Retired" event is overcounted.
+ */
+static u32 detect_inst_overcount_flags(void)
+{
+ u32 flags = 0;
+ struct cpuid c = cpuid(1);
+
+ if (x86_family(c.a) == 0x6) {
+ switch (x86_model(c.a)) {
+ case 0xDD: /* Clearwater Forest */
+ flags = INST_RETIRED_OVERCOUNT;
+ break;
+
+ case 0xAF: /* Sierra Forest */
+ case 0x4D: /* Avoton, Rangeley */
+ case 0x5F: /* Denverton */
+ case 0x86: /* Jacobsville */
+ flags = INST_RETIRED_OVERCOUNT | BR_RETIRED_OVERCOUNT;
+ break;
+ }
+ }
+
+ return flags;
+}
+
int main(int ac, char **av)
{
int instruction_idx;
@@ -985,6 +1030,8 @@ int main(int ac, char **av)
branch_idx = INTEL_BRANCHES_IDX;
branch_miss_idx = INTEL_BRANCH_MISS_IDX;
+ intel_inst_overcount_flags = detect_inst_overcount_flags();
+
/*
* For legacy Intel CPUS without clflush/clflushopt support,
* there is no way to force to trigger a LLC miss, thus set
--
2.43.0
On 7/13/2025 1:49 AM, Dapeng Mi wrote:
> From: dongsheng <dongsheng.x.zhang@intel.com>
>
> [...]
>
> +static inline u32 x86_family(u32 eax)
> +{
> +	u32 x86;
> +
> +	x86 = (eax >> 8) & 0xf;
> +
> +	if (x86 == 0xf)
> +		x86 += (eax >> 20) & 0xff;
> +
> +	return x86;
> +}
> +
> +static inline u32 x86_model(u32 eax)
> +{
> +	return ((eax >> 12) & 0xf0) | ((eax >> 4) & 0x0f);
> +}

This seems to copy the implementation from the KVM selftests.

I need to point out that it is not correct (I fixed a similar issue in QEMU recently).

We cannot count the Extended Model ID unconditionally. Intel counts the Extended Model when the (base) Family is 0x6 or 0xF, while AMD counts the Extended Model when the (base) Family is 0xF.

You can refer to the kernel's x86_model() in arch/x86/lib/cpu.c, which optimizes the condition to "family >= 0x6"; this seems to assume that Intel has no processors with family IDs from 7 to 0xe and AMD has none with family IDs from 6 to 0xe.
On 7/15/2025 9:27 PM, Xiaoyao Li wrote:
> On 7/13/2025 1:49 AM, Dapeng Mi wrote:
>
> [...]
>
> We cannot count the Extended Model ID unconditionally. Intel counts the
> Extended Model when the (base) Family is 0x6 or 0xF, while AMD counts
> the Extended Model when the (base) Family is 0xF.
>
> You can refer to the kernel's x86_model() in arch/x86/lib/cpu.c, which
> optimizes the condition to "family >= 0x6".

Sure. Thanks for reviewing.