Auto counter reload may have a group of events with software events
present within it. The software event's PMU isn't an x86_hybrid_pmu,
so a container_of() operation in intel_pmu_set_acr_caused_constr()
(via the hybrid() helper) could cause out-of-bounds memory reads.
Avoid this by guarding the call to intel_pmu_set_acr_caused_constr()
with an is_x86_event() check.
Fixes: ec980e4facef ("perf/x86/intel: Support auto counter reload")
Signed-off-by: Ian Rogers <irogers@google.com>
---
This fix was prompted by a failure to trigger a BUG_ON in this series:
https://lore.kernel.org/lkml/a61eae6d-7a6d-40bd-83ec-bd4ea7657b9d@linux.intel.com/
so I ran an AI analysis to see if there were similar bad casts to the
one spotted in:
https://lore.kernel.org/lkml/20260311075201.2951073-2-dapeng1.mi@linux.intel.com/
The AI analysis found this issue; its much more verbose description
follows:
I have evaluated all callers of the hybrid_pmu function within the
arch/x86/events directory. The vast majority of usages are safe
because they operate on an event that is currently being initialized
by the x86 PMU subsystem, which guarantees that event->pmu is
inherently an x86 PMU.
However, there is a distinct bug where a non-x86 PMU (e.g., a software
PMU) can be inadvertently passed into the hybrid_pmu function.
This issue occurs during group event validation and configuration in
intel_pmu_hw_config.
The Vulnerability Flow
1. The Context:
Inside intel_pmu_hw_config (located in arch/x86/events/intel/core.c),
there is logic to handle Automatic Counter Reload (ACR) capabilities
for event groups. The code needs to identify siblings that cause other
events to reload and apply constraints to them.
2. The Missing Check:
In the first pass through the sibling events, the code correctly
checks if each sibling is an x86 event via !is_x86_event(sibling)
before building a cause_mask. However, in the second pass to apply
the constraints, the code iterates over all siblings again but omits
the is_x86_event(sibling) check: arch/x86/events/intel/core.c#n4847
(https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/x86/events/intel/core.c#n4847)
	if (leader->nr_siblings) {
		for_each_sibling_event(sibling, leader)
			intel_pmu_set_acr_caused_constr(sibling, idx++, cause_mask); // <-- Missing is_x86_event() check!
	}
3. The Invalid Cast:
The intel_pmu_set_acr_caused_constr function takes this sibling event
(which could be a software event) and executes the hybrid macro over
its pmu pointer: arch/x86/events/intel/core.c#n4624
(https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/x86/events/intel/core.c#n4624)
static inline void intel_pmu_set_acr_caused_constr(struct perf_event *event,
						   int idx, u64 cause_mask)
{
	if (test_bit(idx, (unsigned long *)&cause_mask))
		event->hw.dyn_constraint &= hybrid(event->pmu, acr_cause_mask64);
}
4. The Root Cause:
The hybrid macro expands and passes the event->pmu to hybrid_pmu:
arch/x86/events/perf_event.h#n788
(https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/x86/events/perf_event.h#n788)
#define hybrid(_pmu, _field) \
	...
	if (is_hybrid() && (_pmu)) \
		__Fp = &hybrid_pmu(_pmu)->_field; \
Which subsequently results in a blind container_of on a non-x86 PMU
pointer: arch/x86/events/perf_event.h#n780
(https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/x86/events/perf_event.h#n780)
static __always_inline struct x86_hybrid_pmu *hybrid_pmu(struct pmu *pmu)
{
	return container_of(pmu, struct x86_hybrid_pmu, pmu);
}
Conclusion
If a user creates an event group led by an x86 ACR event but includes
a non-x86 sibling event (such as a software event), the second traversal
in intel_pmu_hw_config will blindly pass the software PMU to
hybrid_pmu. Because container_of assumes the PMU is embedded inside an
x86_hybrid_pmu struct, the resulting pointer is invalid, leading to
memory corruption or an out-of-bounds read when the acr_cause_mask64
field is accessed.
---
arch/x86/events/intel/core.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index cf3a4fe06ff2..26e829b8a882 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -4844,8 +4844,10 @@ static int intel_pmu_hw_config(struct perf_event *event)
intel_pmu_set_acr_caused_constr(leader, idx++, cause_mask);
if (leader->nr_siblings) {
- for_each_sibling_event(sibling, leader)
- intel_pmu_set_acr_caused_constr(sibling, idx++, cause_mask);
+ for_each_sibling_event(sibling, leader) {
+ if (is_x86_event(sibling))
+ intel_pmu_set_acr_caused_constr(sibling, idx++, cause_mask);
+ }
}
if (leader != event)
--
2.53.0.851.ga537e3e6e9-goog
On Thu, Mar 12, 2026 at 12:43 PM Ian Rogers <irogers@google.com> wrote:
>
> [...]
+Thomas Falcon
Thanks,
Ian
On Thu, 2026-03-12 at 12:44 -0700, Ian Rogers wrote:
> On Thu, Mar 12, 2026 at 12:43 PM Ian Rogers <irogers@google.com> wrote:
> >
> > Auto counter reload may have a group of events with software events
> > present within it. The software event PMU isn't the x86_hybrid_pmu and
> > a container_of operation in intel_pmu_set_acr_caused_constr (via the
> > hybrid helper) could cause out of bound memory reads. Avoid this by
> > guarding the call to intel_pmu_set_acr_caused_constr with an
> > is_x86_event check.
> >
> > Fixes: ec980e4facef ("perf/x86/intel: Support auto counter reload")
> > Signed-off-by: Ian Rogers <irogers@google.com>
>
> +Thomas Falcon
>
> Thanks,
> Ian
>
> > [...]
> > 2. The Missing Check:
> >
> > In the first pass through the sibling events, the code correctly
> > checks if each sibling is an x86 event via !is_x86_event(sibling)
> > before building a cause_mask. However, in the second pass to apply
> > the constraints, the code iterates over all siblings again but omits
> > the is_x86_event(sibling) check: arch/x86/events/intel/core.c#n4847
> > (https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/x86/events/intel/core.c#n4847)
> >
> > 	if (leader->nr_siblings) {
> > 		for_each_sibling_event(sibling, leader)
> > 			intel_pmu_set_acr_caused_constr(sibling, idx++, cause_mask); // <-- Missing is_x86_event() check!
> > 	}
> >
Hi Ian, thanks for cc'ing me. The fix looks good to me. I was confused
initially because the first pass described above looks like it is
checking for non-x86 events and bailing out, but it is really only
checking for such events in between ACR events, because "ACR events
must be contiguous." Software events placed outside of those events
should still be valid, so the is_x86_event() check in the second pass
is needed.
Thanks,
Reviewed-by: Thomas Falcon <thomas.falcon@intel.com>
On Thu, Mar 12, 2026 at 12:44 PM Ian Rogers <irogers@google.com> wrote:
>
> On Thu, Mar 12, 2026 at 12:43 PM Ian Rogers <irogers@google.com> wrote:
> >
> > [...]
> > diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
> > index cf3a4fe06ff2..26e829b8a882 100644
> > --- a/arch/x86/events/intel/core.c
> > +++ b/arch/x86/events/intel/core.c
> > @@ -4844,8 +4844,10 @@ static int intel_pmu_hw_config(struct perf_event *event)
> > intel_pmu_set_acr_caused_constr(leader, idx++, cause_mask);
> >
> > if (leader->nr_siblings) {
> > - for_each_sibling_event(sibling, leader)
> > - intel_pmu_set_acr_caused_constr(sibling, idx++, cause_mask);
> > + for_each_sibling_event(sibling, leader) {
So, the fix appears functional, but my AI overlord is spotting more
issues with the ACR code that I'll relay below:
Since intel_pmu_hw_config() is called during perf_event_alloc() ->
perf_init_event() in sys_perf_event_open(), it appears sys_perf_event_open()
has not yet acquired the required ctx->mutex for the group leader's context.
Could a concurrent thread modify the sibling_list (e.g., by closing a
sibling's file descriptor, triggering perf_event_release_kernel() and
removing the sibling from the list) and cause list corruption or a
use-after-free?
Additionally, for_each_sibling_event() explicitly asserts
lockdep_assert_event_ctx(leader).
Will this predictably trigger a kernel warning when CONFIG_PROVE_LOCKING is
enabled?
> > + if (is_x86_event(sibling))
> > + intel_pmu_set_acr_caused_constr(sibling, idx++, cause_mask);
When adding a new event to an existing ACR group, the existing leader and
siblings might be actively scheduled on the PMU by another CPU.
Modifying their dynamic constraints via a non-atomic bitwise AND update
(event->hw.dyn_constraint &= hybrid(event->pmu, acr_cause_mask64)) without
holding ctx->lock or using the PMU event update protocol might allow the
PMU scheduler to read a stale or torn dyn_constraint value.
Could this unsynchronized call to intel_pmu_set_acr_caused_constr()
result in a data race?
Also, cause_mask is defined as a local 8-byte u64 stack variable. Since
perf_event_validate_size() allows event groups to contain several hundred
siblings, a user could construct a group with over 64 x86 siblings where
attr.config2 == 0 to bypass the num constraints.
(Ian note: the config2 sounds off here, but if the events are disabled
I think you can put as many as you like in a group)
When idx exceeds 63, the call to intel_pmu_set_acr_caused_constr() executes
test_bit(idx, (unsigned long *)&cause_mask).
Since cause_mask is passed by value and its address is taken in the inline
function, this will read memory past the bounds of the 8-byte variable,
accessing adjacent kernel stack data.
Can an attacker observe the resulting scheduling success or failure to
reliably infer the bits of the adjacent kernel stack memory?
Thanks,
Ian
The following commit has been merged into the perf/urgent branch of tip:
Commit-ID: dbde07f06226438cd2cf1179745fa1bec5d8914a
Gitweb: https://git.kernel.org/tip/dbde07f06226438cd2cf1179745fa1bec5d8914a
Author: Ian Rogers <irogers@google.com>
AuthorDate: Thu, 12 Mar 2026 12:43:05 -07:00
Committer: Peter Zijlstra <peterz@infradead.org>
CommitterDate: Thu, 02 Apr 2026 13:49:16 +02:00
perf/x86: Fix potential bad container_of in intel_pmu_hw_config
Auto counter reload may have a group of events with software events
present within it. The software event's PMU isn't an x86_hybrid_pmu,
so a container_of() operation in intel_pmu_set_acr_caused_constr()
(via the hybrid() helper) could cause out-of-bounds memory reads.
Avoid this by guarding the call to intel_pmu_set_acr_caused_constr()
with an is_x86_event() check.
Fixes: ec980e4facef ("perf/x86/intel: Support auto counter reload")
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Thomas Falcon <thomas.falcon@intel.com>
Link: https://patch.msgid.link/20260312194305.1834035-1-irogers@google.com
---
arch/x86/events/intel/core.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 36c6821..793335c 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -4855,8 +4855,10 @@ static int intel_pmu_hw_config(struct perf_event *event)
intel_pmu_set_acr_caused_constr(leader, idx++, cause_mask);
if (leader->nr_siblings) {
- for_each_sibling_event(sibling, leader)
- intel_pmu_set_acr_caused_constr(sibling, idx++, cause_mask);
+ for_each_sibling_event(sibling, leader) {
+ if (is_x86_event(sibling))
+ intel_pmu_set_acr_caused_constr(sibling, idx++, cause_mask);
+ }
}
if (leader != event)