From nobody Sat Jun 13 06:52:18 2026
Date: Sat, 9 May 2026 03:40:44 +0200
From: Ingo Molnar
To: Linus Torvalds
Cc: linux-kernel@vger.kernel.org, Peter Zijlstra, Arnaldo Carvalho de Melo,
	Jiri Olsa, Alexander Shishkin, Mark Rutland, Namhyung Kim
Subject: [GIT PULL] perf events fixes
X-Mailing-List: linux-kernel@vger.kernel.org

Linus,

Please pull the latest perf/urgent Git tree from:

   git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git perf-urgent-2026-05-09

for you to fetch changes up to aa4384bc8f4360167f3c3d5322121fe892289ea2:

Miscellaneous perf events fixes:

 - Fix deadlock in the perf_mmap() failure path (Peter Zijlstra)

 - Intel ACR (Auto Counter Reload) fixes (Dapeng Mi):

    - Fix validation and configuration of ACR masks
    - Fix ACR rescheduling bug causing stale masks
    - Disable the PMI on ACR-enabled hardware
    - Enable ACR on Panther Cove uarch too

 Thanks,

	Ingo

------------------>
Dapeng Mi (4):
      perf/x86/intel: Improve validation and configuration of ACR masks
      perf/x86/intel: Always reprogram ACR events to prevent stale masks
      perf/x86/intel: Disable PMI for self-reloaded ACR events
      perf/x86/intel: Enable auto counter reload for DMR

Peter Zijlstra (1):
      perf/core: Fix deadlock in perf_mmap() failure path


 arch/x86/events/core.c       | 13 ++++----
 arch/x86/events/intel/core.c | 50 ++++++++++++++++++++++++-------
 arch/x86/events/perf_event.h | 10 +++++++
 kernel/events/core.c         | 70 ++++++++++++++++++++++++++++++++++----------
 kernel/events/internal.h     |  1 +
 kernel/events/ring_buffer.c  |  2 ++
 6 files changed, 115 insertions(+), 31 deletions(-)

diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index 810ab21ffd99..4b9e105309c6 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -1294,13 +1294,16 @@ int x86_perf_rdpmc_index(struct perf_event *event)
 	return event->hw.event_base_rdpmc;
 }
 
-static inline int match_prev_assignment(struct hw_perf_event *hwc,
+static inline int match_prev_assignment(struct perf_event *event,
 					struct cpu_hw_events *cpuc,
 					int i)
 {
+	struct hw_perf_event *hwc = &event->hw;
+
 	return hwc->idx == cpuc->assign[i] &&
-		hwc->last_cpu == smp_processor_id() &&
-		hwc->last_tag == cpuc->tags[i];
+	       hwc->last_cpu == smp_processor_id() &&
+	       hwc->last_tag == cpuc->tags[i] &&
+	       !is_acr_event_group(event);
 }
 
 static void x86_pmu_start(struct perf_event *event, int flags);
@@ -1346,7 +1349,7 @@ static void x86_pmu_enable(struct pmu *pmu)
 			 * - no other event has used the counter since
 			 */
 			if (hwc->idx == -1 ||
-			    match_prev_assignment(hwc, cpuc, i))
+			    match_prev_assignment(event, cpuc, i))
 				continue;
 
 			/*
@@ -1367,7 +1370,7 @@ static void x86_pmu_enable(struct pmu *pmu)
 			event = cpuc->event_list[i];
 			hwc = &event->hw;
 
-			if (!match_prev_assignment(hwc, cpuc, i))
+			if (!match_prev_assignment(event, cpuc, i))
 				x86_assign_hw_event(event, cpuc, i);
 			else if (i < n_running)
 				continue;
diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index d9488ade0f8e..dd1e3aa75ee9 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -3118,11 +3118,11 @@ static void intel_pmu_enable_fixed(struct perf_event *event)
 	intel_set_masks(event, idx);
 
 	/*
-	 * Enable IRQ generation (0x8), if not PEBS,
-	 * and enable ring-3 counting (0x2) and ring-0 counting (0x1)
-	 * if requested:
+	 * Enable IRQ generation (0x8), if not PEBS or self-reloaded
+	 * ACR event, and enable ring-3 counting (0x2) and ring-0
+	 * counting (0x1) if requested:
 	 */
-	if (!event->attr.precise_ip)
+	if (!event->attr.precise_ip && !is_acr_self_reload_event(event))
 		bits |= INTEL_FIXED_0_ENABLE_PMI;
 	if (hwc->config & ARCH_PERFMON_EVENTSEL_USR)
 		bits |= INTEL_FIXED_0_USER;
@@ -3306,6 +3306,15 @@ static void intel_pmu_enable_event(struct perf_event *event)
 		intel_set_masks(event, idx);
 		static_call_cond(intel_pmu_enable_acr_event)(event);
 		static_call_cond(intel_pmu_enable_event_ext)(event);
+		/*
+		 * For self-reloaded ACR event, don't enable PMI since
+		 * HW won't set overflow bit in GLOBAL_STATUS. Otherwise,
+		 * the PMI would be recognized as a suspicious NMI.
+		 */
+		if (is_acr_self_reload_event(event))
+			hwc->config &= ~ARCH_PERFMON_EVENTSEL_INT;
+		else if (!event->attr.precise_ip)
+			hwc->config |= ARCH_PERFMON_EVENTSEL_INT;
 		__x86_pmu_enable_event(hwc, enable_mask);
 		break;
 	case INTEL_PMC_IDX_FIXED ... INTEL_PMC_IDX_FIXED_BTS - 1:
@@ -3332,23 +3341,41 @@ static void intel_pmu_enable_event(struct perf_event *event)
 static void intel_pmu_acr_late_setup(struct cpu_hw_events *cpuc)
 {
 	struct perf_event *event, *leader;
-	int i, j, idx;
+	int i, j, k, bit, idx;
 
+	/*
+	 * FIXME: ACR mask parsing relies on cpuc->event_list[] (active events only).
+	 * Disabling an ACR event causes bit-shifting errors in the acr_mask of
+	 * remaining group members. As ACR sampling requires all events to be active,
+	 * this limitation is acceptable for now. Revisit if independent event toggling
+	 * is required.
+	 */
 	for (i = 0; i < cpuc->n_events; i++) {
 		leader = cpuc->event_list[i];
 		if (!is_acr_event_group(leader))
 			continue;
 
-		/* The ACR events must be contiguous. */
+		/* Find the last event of the ACR group. */
 		for (j = i; j < cpuc->n_events; j++) {
 			event = cpuc->event_list[j];
 			if (event->group_leader != leader->group_leader)
 				break;
-			for_each_set_bit(idx, (unsigned long *)&event->attr.config2, X86_PMC_IDX_MAX) {
-				if (i + idx >= cpuc->n_events ||
-				    !is_acr_event_group(cpuc->event_list[i + idx]))
-					return;
-				__set_bit(cpuc->assign[i + idx], (unsigned long *)&event->hw.config1);
+		}
+
+		/*
+		 * Translate the user-space ACR mask (attr.config2) into the physical
+		 * counter bitmask (hw.config1) for each ACR event in the group.
+		 * NOTE: ACR event contiguity is guaranteed by intel_pmu_hw_config().
+		 */
+		for (k = i; k < j; k++) {
+			event = cpuc->event_list[k];
+			event->hw.config1 = 0;
+			for_each_set_bit(bit, (unsigned long *)&event->attr.config2, X86_PMC_IDX_MAX) {
+				idx = i + bit;
+				/* Event index of ACR group must locate in [i, j). */
+				if (idx >= j || !is_acr_event_group(cpuc->event_list[idx]))
+					continue;
+				__set_bit(cpuc->assign[idx], (unsigned long *)&event->hw.config1);
 			}
 		}
 		i = j - 1;
@@ -7504,6 +7531,7 @@ static __always_inline void intel_pmu_init_pnc(struct pmu *pmu)
 	hybrid(pmu, event_constraints) = intel_pnc_event_constraints;
 	hybrid(pmu, pebs_constraints) = intel_pnc_pebs_event_constraints;
 	hybrid(pmu, extra_regs) = intel_pnc_extra_regs;
+	static_call_update(intel_pmu_enable_acr_event, intel_pmu_enable_acr);
 }
 
 static __always_inline void intel_pmu_init_skt(struct pmu *pmu)
diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index fad87d3c8b2c..524668dcf4cc 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -137,6 +137,16 @@ static inline bool is_acr_event_group(struct perf_event *event)
 	return check_leader_group(event->group_leader, PERF_X86_EVENT_ACR);
 }
 
+static inline bool is_acr_self_reload_event(struct perf_event *event)
+{
+	struct hw_perf_event *hwc = &event->hw;
+
+	if (hwc->idx < 0)
+		return false;
+
+	return test_bit(hwc->idx, (unsigned long *)&hwc->config1);
+}
+
 struct amd_nb {
 	int nb_id; /* NorthBridge id */
 	int refcnt; /* reference count */
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 6d1f8bad7e1c..7935d5663944 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -7006,6 +7006,7 @@ static void perf_mmap_open(struct vm_area_struct *vma)
 }
 
 static void perf_pmu_output_stop(struct perf_event *event);
+static void perf_mmap_unaccount(struct vm_area_struct *vma, struct perf_buffer *rb);
 
 /*
  * A buffer can be mmap()ed multiple times; either directly through the same
@@ -7021,8 +7022,6 @@ static void perf_mmap_close(struct vm_area_struct *vma)
 	mapped_f unmapped = get_mapped(event, event_unmapped);
 	struct perf_buffer *rb = ring_buffer_get(event);
 	struct user_struct *mmap_user = rb->mmap_user;
-	int mmap_locked = rb->mmap_locked;
-	unsigned long size = perf_data_size(rb);
 	bool detach_rest = false;
 
 	/* FIXIES vs perf_pmu_unregister() */
@@ -7117,11 +7116,7 @@ static void perf_mmap_close(struct vm_area_struct *vma)
 	 * Aside from that, this buffer is 'fully' detached and unmapped,
 	 * undo the VM accounting.
 	 */
-
-	atomic_long_sub((size >> PAGE_SHIFT) + 1 - mmap_locked,
-			&mmap_user->locked_vm);
-	atomic64_sub(mmap_locked, &vma->vm_mm->pinned_vm);
-	free_uid(mmap_user);
+	perf_mmap_unaccount(vma, rb);
 
 out_put:
 	ring_buffer_put(rb); /* could be last */
@@ -7261,6 +7256,15 @@ static void perf_mmap_account(struct vm_area_struct *vma, long user_extra, long
 	atomic64_add(extra, &vma->vm_mm->pinned_vm);
 }
 
+static void perf_mmap_unaccount(struct vm_area_struct *vma, struct perf_buffer *rb)
+{
+	struct user_struct *user = rb->mmap_user;
+
+	atomic_long_sub((perf_data_size(rb) >> PAGE_SHIFT) + 1 - rb->mmap_locked,
+			&user->locked_vm);
+	atomic64_sub(rb->mmap_locked, &vma->vm_mm->pinned_vm);
+}
+
 static int perf_mmap_rb(struct vm_area_struct *vma, struct perf_event *event,
 			unsigned long nr_pages)
 {
@@ -7323,8 +7327,6 @@ static int perf_mmap_rb(struct vm_area_struct *vma, struct perf_event *event,
 	if (!rb)
 		return -ENOMEM;
 
-	refcount_set(&rb->mmap_count, 1);
-	rb->mmap_user = get_current_user();
 	rb->mmap_locked = extra;
 
 	ring_buffer_attach(event, rb);
@@ -7474,16 +7476,54 @@ static int perf_mmap(struct file *file, struct vm_area_struct *vma)
 		mapped(event, vma->vm_mm);
 
 		/*
-		 * Try to map it into the page table. On fail, invoke
-		 * perf_mmap_close() to undo the above, as the callsite expects
-		 * full cleanup in this case and therefore does not invoke
-		 * vmops::close().
+		 * Try to map it into the page table. On fail undo the above,
+		 * as the callsite expects full cleanup in this case and
+		 * therefore does not invoke vmops::close().
 		 */
 		ret = map_range(event->rb, vma);
-		if (ret)
-			perf_mmap_close(vma);
+		if (likely(!ret))
+			return 0;
+
+		/* Error path */
+
+		/*
+		 * If this is the first mmap(), then event->mmap_count should
+		 * be stable at 1. It is only modified by:
+		 * perf_mmap_{open,close}() and perf_mmap().
+		 *
+		 * The former are not possible because this mmap() hasn't been
+		 * successful yet, and the latter is serialized by
+		 * event->mmap_mutex which we still hold (note that mmap_lock
+		 * is not strictly sufficient here, because the event fd can
+		 * be passed to another process through trivial means like
+		 * fork(), leading to concurrent mmap() from different mm).
+		 *
+		 * Make sure to remove event->rb before releasing
+		 * event->mmap_mutex, such that any concurrent mmap() will not
+		 * attempt use this failed buffer.
+		 */
+		if (refcount_read(&event->mmap_count) == 1) {
+			/*
+			 * Minimal perf_mmap_close(); there can't be AUX or
+			 * other events on account of this being the first.
+			 */
+			mapped = get_mapped(event, event_unmapped);
+			if (mapped)
+				mapped(event, vma->vm_mm);
+			perf_mmap_unaccount(vma, event->rb);
+			ring_buffer_attach(event, NULL); /* drops last rb->refcount */
+			refcount_set(&event->mmap_count, 0);
+			return ret;
+		}
+
+		/*
+		 * Otherwise this is an already existing buffer, and there is
+		 * no race vs first exposure, so fall-through and call
+		 * perf_mmap_close().
+		 */
 	}
 
+	perf_mmap_close(vma);
 	return ret;
 }
 
diff --git a/kernel/events/internal.h b/kernel/events/internal.h
index d9cc57083091..c03c4f2eea57 100644
--- a/kernel/events/internal.h
+++ b/kernel/events/internal.h
@@ -67,6 +67,7 @@ static inline void rb_free_rcu(struct rcu_head *rcu_head)
 	struct perf_buffer *rb;
 
 	rb = container_of(rcu_head, struct perf_buffer, rcu_head);
+	free_uid(rb->mmap_user);
 	rb_free(rb);
 }
 
diff --git a/kernel/events/ring_buffer.c b/kernel/events/ring_buffer.c
index 3e7de2661417..9fe92161715e 100644
--- a/kernel/events/ring_buffer.c
+++ b/kernel/events/ring_buffer.c
@@ -340,6 +340,8 @@ ring_buffer_init(struct perf_buffer *rb, long watermark, int flags)
 		rb->paused = 1;
 
 	mutex_init(&rb->aux_mutex);
+	rb->mmap_user = get_current_user();
+	refcount_set(&rb->mmap_count, 1);
 }
 
 void perf_aux_output_flag(struct perf_output_handle *handle, u64 flags)
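
For readers following the ACR mask change in intel_pmu_acr_late_setup(): the reworked loop translates a group-relative bitmask (attr.config2) into a physical-counter bitmask (hw.config1), skipping bits that point outside the group's [i, j) slice of the event list. Below is a minimal user-space sketch of that translation only. It is not kernel code: the function name acr_translate_mask() and its parameters are hypothetical, a plain array stands in for cpuc->assign[], and counter indices are assumed to be below 64.

```c
#include <stdint.h>

/*
 * Hypothetical stand-alone model of the mask translation (not a kernel API).
 *
 *   user_mask    group-relative mask: bit n refers to the n-th event of
 *                the group (models attr.config2)
 *   group_start  index of the group's first event in the event list (i)
 *   group_end    one past the group's last event in the event list (j)
 *   assign       physical counter index for each event-list slot
 *                (models cpuc->assign[]; values assumed to be < 64)
 *
 * Returns the physical counter bitmask (models hw.config1).
 */
static uint64_t acr_translate_mask(uint64_t user_mask, int group_start,
				   int group_end, const int *assign)
{
	uint64_t hw_mask = 0;	/* start from zero so no stale bits survive */
	int bit;

	for (bit = 0; bit < 64; bit++) {
		int idx;

		if (!(user_mask & (1ULL << bit)))
			continue;

		idx = group_start + bit;
		/* Bits pointing outside [group_start, group_end) are skipped. */
		if (idx >= group_end)
			continue;

		hw_mask |= 1ULL << assign[idx];
	}

	return hw_mask;
}
```

As an illustration, with assign[] = { 5, 2, 7, 0 } and a group occupying slots 1 and 2, a user mask of 0x3 selects counters 2 and 7 and yields 0x84. A user mask of 0x7 yields the same value, because bit 2 would land on slot 3, which is outside the group. The sketch omits the kernel's additional is_acr_event_group() check on each referenced slot.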