Calling perf_allow_kernel() from the NMI context is unsafe and could be
fatal. Capture the permission at event-initialization time by storing it
in event->hw.flags, and have the NMI handler rely on that cached flag
instead of making the call directly.
Reported-by: Sadasivan Shaiju <sadasivan.shaiju2@amd.com>
Fixes: 50a53b60e141d ("perf/amd/ibs: Prevent leaking sensitive data to userspace")
Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
---
arch/x86/events/amd/ibs.c | 5 ++++-
arch/x86/events/perf_event_flags.h | 1 +
2 files changed, 5 insertions(+), 1 deletion(-)
diff --git a/arch/x86/events/amd/ibs.c b/arch/x86/events/amd/ibs.c
index 05b7c9f2ec33..004226b52ac7 100644
--- a/arch/x86/events/amd/ibs.c
+++ b/arch/x86/events/amd/ibs.c
@@ -313,6 +313,9 @@ static int perf_ibs_init(struct perf_event *event)
if (ret)
return ret;
+ if (perf_allow_kernel())
+ hwc->flags |= PERF_X86_EVENT_UNPRIVILEGED;
+
if (hwc->sample_period) {
if (config & perf_ibs->cnt_mask)
/* raw max_cnt may not be set */
@@ -1349,7 +1352,7 @@ static int perf_ibs_handle_irq(struct perf_ibs *perf_ibs, struct pt_regs *iregs)
* unprivileged users.
*/
if ((event->attr.sample_type & PERF_SAMPLE_RAW) &&
- perf_allow_kernel()) {
+ (hwc->flags & PERF_X86_EVENT_UNPRIVILEGED)) {
perf_ibs_phyaddr_clear(perf_ibs, &ibs_data);
}
diff --git a/arch/x86/events/perf_event_flags.h b/arch/x86/events/perf_event_flags.h
index 70078334e4a3..47f84ee8f540 100644
--- a/arch/x86/events/perf_event_flags.h
+++ b/arch/x86/events/perf_event_flags.h
@@ -23,3 +23,4 @@ PERF_ARCH(PEBS_LAT_HYBRID, 0x0020000) /* ld and st lat for hybrid */
PERF_ARCH(NEEDS_BRANCH_STACK, 0x0040000) /* require branch stack setup */
PERF_ARCH(BRANCH_COUNTERS, 0x0080000) /* logs the counters in the extra space of each branch */
PERF_ARCH(ACR, 0x0100000) /* Auto counter reload */
+PERF_ARCH(UNPRIVILEGED, 0x0200000) /* Unprivileged event (wrt perf_allow_kernel()) */
--
2.43.0
Hello Ravi.
By evaluating the privileges at the event-init moment and using cached
value later, couldn't it lead to the cached value being obsolete at
some point? E.g. a setuid program later dropping the privileges but
still being able to read physical addresses? I don't think it can be
a solid attack-surface, but still may be worth of a thought... ?
Michael
On Mon, 16 Feb 2026, Ravi Bangoria wrote:
> Calling perf_allow_kernel() from the NMI context is unsafe and could be
> fatal. Capture the permission at event-initialization time by storing it
> in event->hw.flags, and have the NMI handler rely on that cached flag
> instead of making the call directly.
>
> Reported-by: Sadasivan Shaiju <sadasivan.shaiju2@amd.com>
> Fixes: 50a53b60e141d ("perf/amd/ibs: Prevent leaking sensitive data to userspace")
> Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
> ---
> arch/x86/events/amd/ibs.c | 5 ++++-
> arch/x86/events/perf_event_flags.h | 1 +
> 2 files changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/events/amd/ibs.c b/arch/x86/events/amd/ibs.c
> index 05b7c9f2ec33..004226b52ac7 100644
> --- a/arch/x86/events/amd/ibs.c
> +++ b/arch/x86/events/amd/ibs.c
> @@ -313,6 +313,9 @@ static int perf_ibs_init(struct perf_event *event)
> if (ret)
> return ret;
>
> + if (perf_allow_kernel())
> + hwc->flags |= PERF_X86_EVENT_UNPRIVILEGED;
> +
> if (hwc->sample_period) {
> if (config & perf_ibs->cnt_mask)
> /* raw max_cnt may not be set */
> @@ -1349,7 +1352,7 @@ static int perf_ibs_handle_irq(struct perf_ibs *perf_ibs, struct pt_regs *iregs)
> * unprivileged users.
> */
> if ((event->attr.sample_type & PERF_SAMPLE_RAW) &&
> - perf_allow_kernel()) {
> + (hwc->flags & PERF_X86_EVENT_UNPRIVILEGED)) {
> perf_ibs_phyaddr_clear(perf_ibs, &ibs_data);
> }
>
> diff --git a/arch/x86/events/perf_event_flags.h b/arch/x86/events/perf_event_flags.h
> index 70078334e4a3..47f84ee8f540 100644
> --- a/arch/x86/events/perf_event_flags.h
> +++ b/arch/x86/events/perf_event_flags.h
> @@ -23,3 +23,4 @@ PERF_ARCH(PEBS_LAT_HYBRID, 0x0020000) /* ld and st lat for hybrid */
> PERF_ARCH(NEEDS_BRANCH_STACK, 0x0040000) /* require branch stack setup */
> PERF_ARCH(BRANCH_COUNTERS, 0x0080000) /* logs the counters in the extra space of each branch */
> PERF_ARCH(ACR, 0x0100000) /* Auto counter reload */
> +PERF_ARCH(UNPRIVILEGED, 0x0200000) /* Unprivileged event (wrt perf_allow_kernel()) */
> --
> 2.43.0
>
>
>
Hi Michael, > By evaluating the privileges at the event-init moment and using cached > value later, couldn't it lead to the cached value being obsolete at > some point? E.g. a setuid program later dropping the privileges but > still being able to read physical addresses? Yes, but wouldn't the same concern apply to all other call sites of perf_allow_kernel() ? Thanks, Ravi
On Mon, 9 Mar 2026, Ravi Bangoria wrote: > Hi Michael, > > > By evaluating the privileges at the event-init moment and using cached > > value later, couldn't it lead to the cached value being obsolete at > > some point? E.g. a setuid program later dropping the privileges but > > still being able to read physical addresses? > > Yes, but wouldn't the same concern apply to all other call sites of > perf_allow_kernel() ? Well, I don't say this case is worse than the others, just raising a more generic question whether this design is fairly acceptable or would deserve e.g. privilege adjusting at the time when they are dropped or something... Nothing more, just a note... :) Michael > > Thanks, > Ravi > >
Hi Michael,
>>> By evaluating the privileges at the event-init moment and using cached
>>> value later, couldn't it lead to the cached value being obsolete at
>>> some point? E.g. a setuid program later dropping the privileges but
>>> still being able to read physical addresses?
>>
>> Yes, but wouldn't the same concern apply to all other call sites of
>> perf_allow_kernel() ?
>
> Well, I don't say this case is worse than the others, just raising a
> more generic question whether this design is fairly acceptable or would
> deserve e.g. privilege adjusting at the time when they are dropped or
> something... Nothing more, just a note... :)
I understand. However, performing the privilege check at the beginning
and then continuing to allow accessing the privileged resource even
after privileges have been dropped is a common design pattern, isn't it?
For example, I assume something like this is allowed:
fd = open("/root/only/file");
setresuid(); /* drop the privilege */
read(fd);
Thanks,
Ravi
On Mon, 16 Feb 2026, Ravi Bangoria wrote:
> Calling perf_allow_kernel() from the NMI context is unsafe and could be
> fatal. Capture the permission at event-initialization time by storing it
> in event->hw.flags, and have the NMI handler rely on that cached flag
> instead of making the call directly.
>
Hi Ravi.
I have tested this fix on a machine (below), which without this fix can
be easily crashed by recording ibs_op/l3missonly=1/ event.
With this fix, it survived 4900+ iterations without any crash.
CPU details
===========
BIOS CPU family: 107
CPU family: 26
Model: 2
Stepping: 1
Thanks for fixing.
Cheers,
Michael
> Reported-by: Sadasivan Shaiju <sadasivan.shaiju2@amd.com>
> Fixes: 50a53b60e141d ("perf/amd/ibs: Prevent leaking sensitive data to userspace")
> Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
Tested-by: Michael Petlan <mpetlan@redhat.com>
> ---
> arch/x86/events/amd/ibs.c | 5 ++++-
> arch/x86/events/perf_event_flags.h | 1 +
> 2 files changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/events/amd/ibs.c b/arch/x86/events/amd/ibs.c
> index 05b7c9f2ec33..004226b52ac7 100644
> --- a/arch/x86/events/amd/ibs.c
> +++ b/arch/x86/events/amd/ibs.c
> @@ -313,6 +313,9 @@ static int perf_ibs_init(struct perf_event *event)
> if (ret)
> return ret;
>
> + if (perf_allow_kernel())
> + hwc->flags |= PERF_X86_EVENT_UNPRIVILEGED;
> +
> if (hwc->sample_period) {
> if (config & perf_ibs->cnt_mask)
> /* raw max_cnt may not be set */
> @@ -1349,7 +1352,7 @@ static int perf_ibs_handle_irq(struct perf_ibs *perf_ibs, struct pt_regs *iregs)
> * unprivileged users.
> */
> if ((event->attr.sample_type & PERF_SAMPLE_RAW) &&
> - perf_allow_kernel()) {
> + (hwc->flags & PERF_X86_EVENT_UNPRIVILEGED)) {
> perf_ibs_phyaddr_clear(perf_ibs, &ibs_data);
> }
>
> diff --git a/arch/x86/events/perf_event_flags.h b/arch/x86/events/perf_event_flags.h
> index 70078334e4a3..47f84ee8f540 100644
> --- a/arch/x86/events/perf_event_flags.h
> +++ b/arch/x86/events/perf_event_flags.h
> @@ -23,3 +23,4 @@ PERF_ARCH(PEBS_LAT_HYBRID, 0x0020000) /* ld and st lat for hybrid */
> PERF_ARCH(NEEDS_BRANCH_STACK, 0x0040000) /* require branch stack setup */
> PERF_ARCH(BRANCH_COUNTERS, 0x0080000) /* logs the counters in the extra space of each branch */
> PERF_ARCH(ACR, 0x0100000) /* Auto counter reload */
> +PERF_ARCH(UNPRIVILEGED, 0x0200000) /* Unprivileged event (wrt perf_allow_kernel()) */
> --
> 2.43.0
>
>
>
The following commit has been merged into the perf/core branch of tip:
Commit-ID: b0a09142622a994c4f4088c3f61db5da87cfc711
Gitweb: https://git.kernel.org/tip/b0a09142622a994c4f4088c3f61db5da87cfc711
Author: Ravi Bangoria <ravi.bangoria@amd.com>
AuthorDate: Mon, 16 Feb 2026 04:22:15
Committer: Peter Zijlstra <peterz@infradead.org>
CommitterDate: Fri, 27 Feb 2026 16:40:23 +01:00
perf/amd/ibs: Avoid calling perf_allow_kernel() from the IBS NMI handler
Calling perf_allow_kernel() from the NMI context is unsafe and could be
fatal. Capture the permission at event-initialization time by storing it
in event->hw.flags, and have the NMI handler rely on that cached flag
instead of making the call directly.
Fixes: 50a53b60e141d ("perf/amd/ibs: Prevent leaking sensitive data to userspace")
Reported-by: Sadasivan Shaiju <sadasivan.shaiju2@amd.com>
Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Link: https://patch.msgid.link/20260216042216.1440-5-ravi.bangoria@amd.com
---
arch/x86/events/amd/ibs.c | 5 ++++-
arch/x86/events/perf_event_flags.h | 1 +
2 files changed, 5 insertions(+), 1 deletion(-)
diff --git a/arch/x86/events/amd/ibs.c b/arch/x86/events/amd/ibs.c
index 05b7c9f..004226b 100644
--- a/arch/x86/events/amd/ibs.c
+++ b/arch/x86/events/amd/ibs.c
@@ -313,6 +313,9 @@ static int perf_ibs_init(struct perf_event *event)
if (ret)
return ret;
+ if (perf_allow_kernel())
+ hwc->flags |= PERF_X86_EVENT_UNPRIVILEGED;
+
if (hwc->sample_period) {
if (config & perf_ibs->cnt_mask)
/* raw max_cnt may not be set */
@@ -1349,7 +1352,7 @@ fail:
* unprivileged users.
*/
if ((event->attr.sample_type & PERF_SAMPLE_RAW) &&
- perf_allow_kernel()) {
+ (hwc->flags & PERF_X86_EVENT_UNPRIVILEGED)) {
perf_ibs_phyaddr_clear(perf_ibs, &ibs_data);
}
diff --git a/arch/x86/events/perf_event_flags.h b/arch/x86/events/perf_event_flags.h
index 7007833..47f84ee 100644
--- a/arch/x86/events/perf_event_flags.h
+++ b/arch/x86/events/perf_event_flags.h
@@ -23,3 +23,4 @@ PERF_ARCH(PEBS_LAT_HYBRID, 0x0020000) /* ld and st lat for hybrid */
PERF_ARCH(NEEDS_BRANCH_STACK, 0x0040000) /* require branch stack setup */
PERF_ARCH(BRANCH_COUNTERS, 0x0080000) /* logs the counters in the extra space of each branch */
PERF_ARCH(ACR, 0x0100000) /* Auto counter reload */
+PERF_ARCH(UNPRIVILEGED, 0x0200000) /* Unprivileged event (wrt perf_allow_kernel()) */
© 2016 - 2026 Red Hat, Inc.