[PATCH] perf: Fix refcount warning on event->mmap_count increment

Will Rosenberg posted 1 patch 1 month, 1 week ago
Posted by Will Rosenberg 1 month, 1 week ago
When calling refcount_inc(&event->mmap_count) inside perf_mmap_rb(), the
following warning is triggered:

	refcount_t: addition on 0; use-after-free.
	WARNING: lib/refcount.c:25

PoC:

    struct perf_event_attr attr = {0};
    int fd = syscall(__NR_perf_event_open, &attr, 0, -1, -1, 0);
    mmap(NULL, 0x3000, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    int victim = syscall(__NR_perf_event_open, &attr, 0, -1, fd,
                         PERF_FLAG_FD_OUTPUT);
    mmap(NULL, 0x3000, PROT_READ | PROT_WRITE, MAP_SHARED, victim, 0);

This occurs when creating a group member event with the flag
PERF_FLAG_FD_OUTPUT. The group leader should be mmap-ed and then mmap-ing
the event triggers the warning.

Since the event has copied the output_event in perf_event_set_output(),
event->rb is set. As a result, perf_mmap_rb() calls
refcount_inc(&event->mmap_count) when event->mmap_count = 0.
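
A simplified, single-threaded userspace model of the two refcount_t
operations involved may make the warning easier to follow. This is only
a sketch: the toy_* names are invented for illustration, the kernel's
implementation is atomic, and the saturation value is an assumption
modelled on the kernel's REFCOUNT_SATURATED. It shows why an increment
from zero warns, while the "not zero" variant merely reports failure.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdio.h>

/* Assumption: mirrors the kernel's REFCOUNT_SATURATED (INT_MIN / 2). */
#define TOY_SATURATED (INT_MIN / 2)

/* Toy, non-atomic stand-in for refcount_t (hypothetical type). */
struct toy_refcount {
	int refs;
	int warnings;	/* how many "addition on 0" warnings were raised */
};

static void toy_refcount_set(struct toy_refcount *r, int n)
{
	assert(n >= 0);
	r->refs = n;
}

static int toy_refcount_read(const struct toy_refcount *r)
{
	return r->refs;
}

/* Like refcount_inc(): incrementing from zero is treated as use-after-free. */
static void toy_refcount_inc(struct toy_refcount *r)
{
	if (r->refs == 0) {
		fprintf(stderr, "toy_refcount: addition on 0; use-after-free.\n");
		r->warnings++;
		r->refs = TOY_SATURATED;	/* pin the counter so it is never freed */
		return;
	}
	r->refs++;
}

/* Like refcount_inc_not_zero(): refuses to revive an empty counter. */
static bool toy_refcount_inc_not_zero(struct toy_refcount *r)
{
	if (r->refs == 0)
		return false;
	r->refs++;
	return true;
}
```

In this model, calling toy_refcount_inc_not_zero() on an empty counter
and then toy_refcount_set(&rc, 1) on failure is exactly the "re-enabling
an empty refcount" that the next paragraph describes.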

Account for the case when event->mmap_count = 0. This patch goes against
the design philosophy of the refcount library by re-enabling an empty
refcount, but the patch remains in line with the current treatment of
mmap_count.

Signed-off-by: Will Rosenberg <whrosenb@asu.edu>
---

Notes:
    I also have a related concern about code that handles the mmap_count.
    In perf_mmap_close(), if refcount_dec_and_mutex_lock() decrements
    event->mmap_count to zero, then event->rb is set to NULL. This
    effectively undoes our output_event copy. However, is this desired
    behavior? Should event->rb remain unchanged since it may still be
    mmap-ed by other events and can still be used?

 kernel/events/core.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 376fb07d869b..49709b627b1f 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -7279,7 +7279,8 @@ static int perf_mmap_rb(struct vm_area_struct *vma, struct perf_event *event,
 			 * multiple times.
 			 */
 			perf_mmap_account(vma, user_extra, extra);
-			refcount_inc(&event->mmap_count);
+			if (!refcount_inc_not_zero(&event->mmap_count))
+				refcount_set(&event->mmap_count, 1);
 			return 0;
 		}
 

base-commit: 538254cd98afb31b09c4cc58219217d8127c79be
-- 
2.34.1
Re: [PATCH] perf: Fix refcount warning on event->mmap_count increment
Posted by Lai, Yi 1 month ago
On Mon, Dec 29, 2025 at 02:03:55PM -0700, Will Rosenberg wrote:
> When calling refcount_inc(&event->mmap_count) inside perf_mmap_rb(), the
> following warning is triggered:
> 
> 	refcount_t: addition on 0; use-after-free.
> 	WARNING: lib/refcount.c:25
> 
> PoC:
> 
>     struct perf_event_attr attr = {0};
>     int fd = syscall(__NR_perf_event_open, &attr, 0, -1, -1, 0);
>     mmap(NULL, 0x3000, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
>     int victim = syscall(__NR_perf_event_open, &attr, 0, -1, fd,
>                          PERF_FLAG_FD_OUTPUT);
>     mmap(NULL, 0x3000, PROT_READ | PROT_WRITE, MAP_SHARED, victim, 0);
> 
> This occurs when creating a group member event with the flag
> PERF_FLAG_FD_OUTPUT. The group leader should be mmap-ed and then mmap-ing
> the event triggers the warning.
> 
> Since the event has copied the output_event in perf_event_set_output(),
> event->rb is set. As a result, perf_mmap_rb() calls
> refcount_inc(&event->mmap_count) when event->mmap_count = 0.
> 
> Account for the case when event->mmap_count = 0. This patch goes against
> the design philosophy of the refcount library by re-enabling an empty
> refcount, but the patch remains in line with the current treatment of
> mmap_count.
> 
> Signed-off-by: Will Rosenberg <whrosenb@asu.edu>
> ---
> 
> Notes:
>     I also have a related concern about code that handles the mmap_count.
>     In perf_mmap_close(), if refcount_dec_and_mutex_lock() decrements
>     event->mmap_count to zero, then event->rb is set to NULL. This
>     effectively undoes our output_event copy. However, is this desired
>     behavior? Should event->rb remain unchanged since it may still be
>     mmap-ed by other events and can still be used?
> 
>  kernel/events/core.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>

I also hit the refcount bug in perf_mmap using v6.19-rc4 kernel. Here is
the call trace:

[   17.446965] ------------[ cut here ]------------
[   17.447357] refcount_t: addition on 0; use-after-free.
[   17.447732] WARNING: lib/refcount.c:25 at refcount_warn_saturate+0xc5/0x150, CPU#1: repro/713
[   17.448358] Modules linked in:
[   17.448613] CPU: 1 UID: 0 PID: 713 Comm: repro Not tainted 6.19.0-rc1-v6.19-rc1 #1 PREEMPT(voluntary)
[   17.449277] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebu4
[   17.450209] RIP: 0010:refcount_warn_saturate+0xc5/0x150
[   17.450589] Code: 31 fa 61 fe 48 8d 3d fa 66 0e 05 67 48 0f b9 3a e8 20 fa 61 fe 5b 41 5c 5d c3 cc cc cc9
[   17.451950] RSP: 0018:ff11000014ddf5d0 EFLAGS: 00010293
[   17.452333] RAX: 0000000000000000 RBX: 0000000000000002 RCX: ffffffff8327b5d2
[   17.452827] RDX: ff1100001a1d8000 RSI: ffffffff8327b62e RDI: ffffffff88361d20
[   17.453322] RBP: ff11000014ddf5e0 R08: 0000000000000001 R09: ffe21c00028c3a78
[   17.453813] R10: 0000000000000002 R11: ff1100001a1d8e98 R12: ff1100001461d3c0
[   17.454401] R13: ff1100001461d3c0 R14: 0000000000000000 R15: ff11000010325dc0
[   17.454908] FS:  00007f16e9af0740(0000) GS:ff110000e35b9000(0000) knlGS:0000000000000000
[   17.455475] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   17.455881] CR2: 0000200000000080 CR3: 000000000ee14003 CR4: 0000000000771ef0
[   17.456383] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   17.456877] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000602
[   17.457376] PKRU: 55555554
[   17.457576] Call Trace:
[   17.457757]  <TASK>
[   17.457918]  perf_mmap+0x1a35/0x2390
[   17.458250]  ? mas_preallocate+0x2cb/0xe70
[   17.458554]  ? __pfx_perf_mmap+0x10/0x10
[   17.458849]  ? lockdep_init_map_type+0x50/0x260
[   17.459187]  __mmap_region+0x11a7/0x2970
[   17.459482]  ? __sanitizer_cov_trace_cmp4+0x1a/0x20
[   17.459834]  ? __sanitizer_cov_trace_const_cmp8+0x1c/0x30
[   17.460227]  ? __pfx___mmap_region+0x10/0x10
[   17.460552]  ? __sanitizer_cov_trace_const_cmp4+0x1a/0x20
[   17.460941]  ? perf_ctx_unlock+0xfa/0x180
[   17.461252]  ? perf_ctx_unlock+0xfa/0x180
[   17.461580]  ? __this_cpu_preempt_check+0x21/0x30
[   17.461990]  ? lock_is_held_type+0xef/0x150
[   17.462308]  mmap_region+0x307/0x3e0
[   17.462577]  do_mmap+0xa5d/0x12e0
[   17.462835]  ? lock_acquire+0x180/0x2f0
[   17.463130]  ? __pfx_do_mmap+0x10/0x10
[   17.463408]  ? down_write_killable+0x163/0x250
[   17.463729]  ? __pfx_down_write_killable+0x10/0x10
[   17.464076]  vm_mmap_pgoff+0x2b8/0x4a0
[   17.464376]  ? __pfx_vm_mmap_pgoff+0x10/0x10
[   17.464692]  ? __fget_files+0x204/0x3b0
[   17.464982]  ksys_mmap_pgoff+0x3dc/0x520
[   17.465277]  __x64_sys_mmap+0x139/0x1d0
[   17.465561]  x64_sys_call+0x19a4/0x21b0
[   17.465842]  do_syscall_64+0x6d/0x1180
[   17.466180]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[   17.466544] RIP: 0033:0x7f16e983ee5d
[   17.466811] Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca8
[   17.468073] RSP: 002b:00007ffefb1af538 EFLAGS: 00000216 ORIG_RAX: 0000000000000009
[   17.468607] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f16e983ee5d
[   17.469103] RDX: 0000000000000000 RSI: 0000000000002000 RDI: 0000200000ab9000
[   17.469595] RBP: 00007ffefb1af560 R08: 0000000000000004 R09: 0000000000000000
[   17.470145] R10: 0000000004002013 R11: 0000000000000216 R12: 00007ffefb1af678
[   17.470652] R13: 0000000000401136 R14: 0000000000404e08 R15: 00007f16e9b3d000
[   17.471172]  </TASK>
[   17.471338] irq event stamp: 2257
[   17.471579] hardirqs last  enabled at (2265): [<ffffffff816661b5>] __up_console_sem+0x95/0xb0
[   17.472185] hardirqs last disabled at (2272): [<ffffffff8166619a>] __up_console_sem+0x7a/0xb0
[   17.472773] softirqs last  enabled at (2230): [<ffffffff81489a9e>] __irq_exit_rcu+0x10e/0x170
[   17.473368] softirqs last disabled at (2185): [<ffffffff81489a9e>] __irq_exit_rcu+0x10e/0x170
[   17.474011] ---[ end trace 0000000000000000 ]---

After applying the fix patch, the issue cannot be reproduced.

Tested-by: Yi Lai <yi1.lai@intel.com>

Regards,
Yi Lai
 
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index 376fb07d869b..49709b627b1f 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -7279,7 +7279,8 @@ static int perf_mmap_rb(struct vm_area_struct *vma, struct perf_event *event,
>  			 * multiple times.
>  			 */
>  			perf_mmap_account(vma, user_extra, extra);
> -			refcount_inc(&event->mmap_count);
> +			if (!refcount_inc_not_zero(&event->mmap_count))
> +				refcount_set(&event->mmap_count, 1);
>  			return 0;
>  		}
>  
> 
> base-commit: 538254cd98afb31b09c4cc58219217d8127c79be
> -- 
> 2.34.1
>
[PATCH v2] perf: Fix refcount warning on event->mmap_count increment
Posted by Will Rosenberg 1 month ago
When calling refcount_inc(&event->mmap_count) inside perf_mmap_rb(), the
following warning is triggered:

	refcount_t: addition on 0; use-after-free.
	WARNING: lib/refcount.c:25

PoC:

    struct perf_event_attr attr = {0};
    int fd = syscall(__NR_perf_event_open, &attr, 0, -1, -1, 0);
    mmap(NULL, 0x3000, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    int victim = syscall(__NR_perf_event_open, &attr, 0, -1, fd,
                         PERF_FLAG_FD_OUTPUT);
    mmap(NULL, 0x3000, PROT_READ | PROT_WRITE, MAP_SHARED, victim, 0);

This occurs when creating a group member event with the flag
PERF_FLAG_FD_OUTPUT. The group leader should be mmap-ed and then mmap-ing
the event triggers the warning.

Since the event has copied the output_event in perf_event_set_output(),
event->rb is set. As a result, perf_mmap_rb() calls
refcount_inc(&event->mmap_count) when event->mmap_count = 0.

Account for the case when event->mmap_count = 0. This patch goes against
the design philosophy of the refcount library by re-enabling an empty
refcount, but the patch remains in line with the current treatment of
mmap_count.

Fixes: 448f97fba901 ("perf: Convert mmap() refcounts to refcount_t")
Signed-off-by: Will Rosenberg <whrosenb@asu.edu>
---

Notes:
    v1 -> v2: Add Fixes tag
    
    I also have a related concern about code that handles the mmap_count.
    In perf_mmap_close(), if refcount_dec_and_mutex_lock() decrements
    event->mmap_count to zero, then event->rb is set to NULL. This
    effectively undoes our ring buffer copy. However, is this desired
    behavior? Should event->rb remain unchanged since it may still be
    mmap-ed by other events and can still be used?

 kernel/events/core.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 376fb07d869b..49709b627b1f 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -7279,7 +7279,8 @@ static int perf_mmap_rb(struct vm_area_struct *vma, struct perf_event *event,
 			 * multiple times.
 			 */
 			perf_mmap_account(vma, user_extra, extra);
-			refcount_inc(&event->mmap_count);
+			if (!refcount_inc_not_zero(&event->mmap_count))
+				refcount_set(&event->mmap_count, 1);
 			return 0;
 		}
 

base-commit: 538254cd98afb31b09c4cc58219217d8127c79be
-- 
2.34.1
Re: [PATCH v2] perf: Fix refcount warning on event->mmap_count increment
Posted by Peter Zijlstra 1 month ago
On Mon, Jan 05, 2026 at 09:51:49AM -0700, Will Rosenberg wrote:
> When calling refcount_inc(&event->mmap_count) inside perf_mmap_rb(), the
> following warning is triggered:
> 
> 	refcount_t: addition on 0; use-after-free.
> 	WARNING: lib/refcount.c:25
> 
> PoC:
> 
>     struct perf_event_attr attr = {0};
>     int fd = syscall(__NR_perf_event_open, &attr, 0, -1, -1, 0);
>     mmap(NULL, 0x3000, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
>     int victim = syscall(__NR_perf_event_open, &attr, 0, -1, fd,
>                          PERF_FLAG_FD_OUTPUT);
>     mmap(NULL, 0x3000, PROT_READ | PROT_WRITE, MAP_SHARED, victim, 0);
> 
> This occurs when creating a group member event with the flag
> PERF_FLAG_FD_OUTPUT. The group leader should be mmap-ed and then mmap-ing
> the event triggers the warning.
> 
> Since the event has copied the output_event in perf_event_set_output(),
> event->rb is set. As a result, perf_mmap_rb() calls
> refcount_inc(&event->mmap_count) when event->mmap_count = 0.
> 
> Account for the case when event->mmap_count = 0. This patch goes against
> the design philosophy of the refcount library by re-enabling an empty
> refcount, but the patch remains in line with the current treatment of
> mmap_count.
> 
> Fixes: 448f97fba901 ("perf: Convert mmap() refcounts to refcount_t")
> Signed-off-by: Will Rosenberg <whrosenb@asu.edu>
> ---
> 
> Notes:
>     v1 -> v2: Add Fixes tag
>     
>     I also have a related concern about code that handles the mmap_count.
>     In perf_mmap_close(), if refcount_dec_and_mutex_lock() decrements
>     event->mmap_count to zero, then event->rb is set to NULL. This
>     effectively undoes our ring buffer copy. However, is this desired
>     behavior? Should event->rb remain unchanged since it may still be
>     mmap-ed by other events and can still be used?
> 
>  kernel/events/core.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index 376fb07d869b..49709b627b1f 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -7279,7 +7279,8 @@ static int perf_mmap_rb(struct vm_area_struct *vma, struct perf_event *event,
>  			 * multiple times.
>  			 */
>  			perf_mmap_account(vma, user_extra, extra);
> -			refcount_inc(&event->mmap_count);
> +			if (!refcount_inc_not_zero(&event->mmap_count))
> +				refcount_set(&event->mmap_count, 1);

So this pattern was an instant red flag; this cannot be right.

Yes, this makes the error go away, but I think the result is bad. 

The sequence as provided will create a mapping for event fd, and create
victim such that its events are redirected to this buffer.

So far so good.

However, the mmap() of victim will create an alias of the earlier buffer
(which is pointless but isn't a problem per se), but by setting
event->mmap_count, you're saying this second event should also update
the user_page (struct perf_event_mmap_page at offset +0).

This means that if you make this succeed you end up with both events
(fd, victim) writing to the same page (which is mapped twice). And that
is broken.
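
A toy userspace model of that clobbering, as a sketch only: the toy_*
types are hypothetical and heavily reduced, borrowing just the index and
offset field names from struct perf_event_mmap_page. The assumption,
taken from the description above, is that an event publishes its state
into the user page only when it has a mapping of its own.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced stand-in for struct perf_event_mmap_page. */
struct toy_user_page {
	uint32_t index;		/* which counter the page currently describes */
	int64_t offset;		/* that counter's value */
};

/* Hypothetical, reduced stand-in for struct perf_event. */
struct toy_event {
	uint32_t hw_index;
	int64_t count;
	struct toy_user_page *user_page;	/* NULL: event has no mmap_count */
};

/* Each event that owns a mapping publishes its own state into the page. */
static void toy_update_userpage(struct toy_event *ev)
{
	assert(ev != NULL);
	if (!ev->user_page)
		return;
	ev->user_page->index = ev->hw_index;
	ev->user_page->offset = ev->count;
}
```

If two toy events point at the same toy_user_page, whichever one
updates last overwrites the other's index and offset, so a reader of the
page cannot tell which event the values belong to. With the second
event's user_page left NULL, the page keeps describing the first event.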

I'm thinking we should disallow this mmap()... something like so, hmm?

---
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 3c2a491200c6..ccf3aecbfff5 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -7273,6 +7273,15 @@ static int perf_mmap_rb(struct vm_area_struct *vma, struct perf_event *event,
 		if (data_page_nr(event->rb) != nr_pages)
 			return -EINVAL;
 
+		/*
+		 * If this event doesn't have mmap_count, we're attempting to
+		 * create an alias of another event's mmap(); this would mean
+		 * both events will end up scribbling the same user_page;
+		 * which makes no sense.
+		 */
+		if (refcount_read(&event->mmap_count))
+			return -EBUSY;
+
 		if (refcount_inc_not_zero(&event->rb->mmap_count)) {
 			/*
 			 * Success -- managed to mmap() the same buffer
[PATCH v3] perf: Fix refcount warning on event->mmap_count increment
Posted by Will Rosenberg 1 month ago
When calling refcount_inc(&event->mmap_count) inside perf_mmap_rb(), the
following warning is triggered:

        refcount_t: addition on 0; use-after-free.
        WARNING: lib/refcount.c:25

PoC:

    struct perf_event_attr attr = {0};
    int fd = syscall(__NR_perf_event_open, &attr, 0, -1, -1, 0);
    mmap(NULL, 0x3000, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    int victim = syscall(__NR_perf_event_open, &attr, 0, -1, fd,
                         PERF_FLAG_FD_OUTPUT);
    mmap(NULL, 0x3000, PROT_READ | PROT_WRITE, MAP_SHARED, victim, 0);

This occurs when creating a group member event with the flag
PERF_FLAG_FD_OUTPUT. The group leader should be mmap-ed and then mmap-ing
the event triggers the warning.

Since the event has copied the output_event in perf_event_set_output(),
event->rb is set. As a result, perf_mmap_rb() calls
refcount_inc(&event->mmap_count) when event->mmap_count = 0.

Disallow the case when event->mmap_count = 0. This prevents two
events from updating the same user_page.

Fixes: 448f97fba901 ("perf: Convert mmap() refcounts to refcount_t")
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Will Rosenberg <whrosenb@asu.edu>
---

Notes:
    v2 -> v3: Update patch to error out instead of incrementing.
    
    Thank you, this is a much better solution. I was not thinking
    that the mmap itself was unintended.
    
    I believe you are missing a "!" in your patch. After adding
    that, I tested the patch, and it fixed the bug.
    
    Thank you for your help.

 kernel/events/core.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 3c2a491200c6..ac7f12560172 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -7273,6 +7273,15 @@ static int perf_mmap_rb(struct vm_area_struct *vma, struct perf_event *event,
 		if (data_page_nr(event->rb) != nr_pages)
 			return -EINVAL;
 
+		/*
+		 * If this event doesn't have mmap_count, we're attempting to
+		 * create an alias of another event's mmap(); this would mean
+		 * both events will end up scribbling the same user_page;
+		 * which makes no sense.
+		 */
+		if (!refcount_read(&event->mmap_count))
+			return -EBUSY;
+
 		if (refcount_inc_not_zero(&event->rb->mmap_count)) {
 			/*
 			 * Success -- managed to mmap() the same buffer

base-commit: 5d3b0106245d467fd5ba0bd9a373a13356684f6e
-- 
2.34.1
[PATCH v3 RESEND] perf: Fix refcount warning on event->mmap_count increment
Posted by Will Rosenberg 2 weeks, 4 days ago
When calling refcount_inc(&event->mmap_count) inside perf_mmap_rb(), the
following warning is triggered:

        refcount_t: addition on 0; use-after-free.
        WARNING: lib/refcount.c:25

PoC:

    struct perf_event_attr attr = {0};
    int fd = syscall(__NR_perf_event_open, &attr, 0, -1, -1, 0);
    mmap(NULL, 0x3000, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    int victim = syscall(__NR_perf_event_open, &attr, 0, -1, fd,
                         PERF_FLAG_FD_OUTPUT);
    mmap(NULL, 0x3000, PROT_READ | PROT_WRITE, MAP_SHARED, victim, 0);

This occurs when creating a group member event with the flag
PERF_FLAG_FD_OUTPUT. The group leader should be mmap-ed and then mmap-ing
the event triggers the warning.

Since the event has copied the output_event in perf_event_set_output(),
event->rb is set. As a result, perf_mmap_rb() calls
refcount_inc(&event->mmap_count) when event->mmap_count = 0.

Disallow the case when event->mmap_count = 0. This also prevents two
events from updating the same user_page.

Fixes: 448f97fba901 ("perf: Convert mmap() refcounts to refcount_t")
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Will Rosenberg <whrosenb@asu.edu>
---

Notes:
    v2 -> v3: Update patch to error out instead of incrementing.
    
    Thank you, this is a much better solution. I was not thinking
    that the mmap itself was unintended.
    
    I believe you are missing a "!" in your patch. After adding
    that, I tested the patch, and it fixed the bug.
    
    I also wanted to check my understanding of the race with
    perf_mmap_close() to double check this patch will not cause
    an issue. perf_mmap_rb() should always hold the
    event->mmap_mutex, so there should be no race on
    event->mmap_count with perf_mmap_close()'s
    refcount_dec_and_mutex_lock(). If there was a race, we would
    risk returning -EBUSY when we should "continue as if !event->rb."
    
    Thank you for your help.

 kernel/events/core.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 3c2a491200c6..ac7f12560172 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -7273,6 +7273,15 @@ static int perf_mmap_rb(struct vm_area_struct *vma, struct perf_event *event,
 		if (data_page_nr(event->rb) != nr_pages)
 			return -EINVAL;
 
+		/*
+		 * If this event doesn't have mmap_count, we're attempting to
+		 * create an alias of another event's mmap(); this would mean
+		 * both events will end up scribbling the same user_page;
+		 * which makes no sense.
+		 */
+		if (!refcount_read(&event->mmap_count))
+			return -EBUSY;
+
 		if (refcount_inc_not_zero(&event->rb->mmap_count)) {
 			/*
 			 * Success -- managed to mmap() the same buffer

base-commit: 5d3b0106245d467fd5ba0bd9a373a13356684f6e
-- 
2.34.1
Re: [PATCH v3 RESEND] perf: Fix refcount warning on event->mmap_count increment
Posted by Peter Zijlstra 2 weeks, 4 days ago
On Mon, Jan 19, 2026 at 11:49:56AM -0700, Will Rosenberg wrote:

> 
> Notes:
>     v2 -> v3: Update patch to error out instead of incrementing.
>     
>     Thank you, this is a much better solution. I was not thinking
>     that the mmap itself was unintended.
>     
>     I believe you are missing a "!" in your patch. After adding
>     that, I tested the patch, and it fixed the bug.

D'oh indeed. Sometimes typing is so very hard ;-)

>     I also wanted to check my understanding of the race with
>     perf_mmap_close() to double check this patch will not cause
>     an issue. perf_mmap_rb() should always hold the
>     event->mmap_mutex, so there should be no race on
>     event->mmap_count with perf_mmap_close()'s
>     refcount_dec_and_mutex_lock(). If there was a race, we would
>     risk returning -EBUSY when we should "continue as if !event->rb."

Since we're failing perf_mmap_rb(), it won't call ->close(), right?

Also, we already have an error path on data_page_nr() mismatch.

The caller of perf_mmap_rb() has if (ret) return ret; nothing is
modified before calling perf_mmap_rb() and perf_mmap_rb() itself hasn't
modified anything yet at the point of failure.

So afaict we're good.
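
The ordering argument can be sketched with a small userspace model. It
is an illustration under stated assumptions, not the kernel function:
the toy_* names are hypothetical, plain ints stand in for refcount_t,
locking is omitted, and the paths that allocate a new buffer are
collapsed into a placeholder return value.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Placeholder result for paths this sketch does not model. */
#define TOY_NOT_MODELLED 1

/* Hypothetical stand-in for the ring buffer. */
struct toy_rb {
	unsigned long data_pages;
	int mmap_count;
};

/* Hypothetical stand-in for struct perf_event. */
struct toy_mmap_event {
	struct toy_rb *rb;
	int mmap_count;
};

/* Mirrors only the order of checks in the existing-buffer branch. */
static int toy_perf_mmap_rb(struct toy_mmap_event *event,
			    unsigned long nr_pages)
{
	assert(event != NULL);

	if (!event->rb)
		return TOY_NOT_MODELLED;	/* fresh allocation path */

	if (event->rb->data_pages != nr_pages)
		return -EINVAL;			/* pre-existing error path */

	/* The new check: no mapping of our own means this is an alias. */
	if (event->mmap_count == 0)
		return -EBUSY;

	if (event->rb->mmap_count != 0) {
		/* First point at which any state is modified. */
		event->rb->mmap_count++;
		event->mmap_count++;
		return 0;
	}

	return TOY_NOT_MODELLED;	/* "continue as if !event->rb" */
}
```

Both error returns sit above the first write to either counter, so a
failed call leaves the event and the buffer exactly as they were.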


Anyway, thanks for the patch, I'll get it applied!
[tip: perf/urgent] perf: Fix refcount warning on event->mmap_count increment
Posted by tip-bot2 for Will Rosenberg 2 weeks, 3 days ago
The following commit has been merged into the perf/urgent branch of tip:

Commit-ID:     d06bf78e55d5159c1b00072e606ab924ffbbad35
Gitweb:        https://git.kernel.org/tip/d06bf78e55d5159c1b00072e606ab924ffbbad35
Author:        Will Rosenberg <whrosenb@asu.edu>
AuthorDate:    Mon, 19 Jan 2026 11:49:56 -07:00
Committer:     Peter Zijlstra <peterz@infradead.org>
CommitterDate: Wed, 21 Jan 2026 16:28:58 +01:00

perf: Fix refcount warning on event->mmap_count increment

When calling refcount_inc(&event->mmap_count) inside perf_mmap_rb(), the
following warning is triggered:

        refcount_t: addition on 0; use-after-free.
        WARNING: lib/refcount.c:25

PoC:

    struct perf_event_attr attr = {0};
    int fd = syscall(__NR_perf_event_open, &attr, 0, -1, -1, 0);
    mmap(NULL, 0x3000, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    int victim = syscall(__NR_perf_event_open, &attr, 0, -1, fd,
                         PERF_FLAG_FD_OUTPUT);
    mmap(NULL, 0x3000, PROT_READ | PROT_WRITE, MAP_SHARED, victim, 0);

This occurs when creating a group member event with the flag
PERF_FLAG_FD_OUTPUT. The group leader should be mmap-ed and then mmap-ing
the event triggers the warning.

Since the event has copied the output_event in perf_event_set_output(),
event->rb is set. As a result, perf_mmap_rb() calls
refcount_inc(&event->mmap_count) when event->mmap_count = 0.

Disallow the case when event->mmap_count = 0. This also prevents two
events from updating the same user_page.

Fixes: 448f97fba901 ("perf: Convert mmap() refcounts to refcount_t")
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Will Rosenberg <whrosenb@asu.edu>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20260119184956.801238-1-whrosenb@asu.edu
---
 kernel/events/core.c |  9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 5b5cb62..a0fa488 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -6997,6 +6997,15 @@ static int perf_mmap_rb(struct vm_area_struct *vma, struct perf_event *event,
 		if (data_page_nr(event->rb) != nr_pages)
 			return -EINVAL;
 
+		/*
+		 * If this event doesn't have mmap_count, we're attempting to
+		 * create an alias of another event's mmap(); this would mean
+		 * both events will end up scribbling the same user_page;
+		 * which makes no sense.
+		 */
+		if (!refcount_read(&event->mmap_count))
+			return -EBUSY;
+
 		if (refcount_inc_not_zero(&event->rb->mmap_count)) {
 			/*
 			 * Success -- managed to mmap() the same buffer