From: Anshuman Khandual <anshuman.khandual@arm.com>
To: linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org,
    linux-arm-kernel@lists.infradead.org, peterz@infradead.org,
    acme@kernel.org, mark.rutland@arm.com, will@kernel.org,
    catalin.marinas@arm.com
Cc: Anshuman Khandual, James Clark, Rob Herring, Marc Zyngier, Ingo Molnar
Subject: [PATCH V2 5/7] arm64/perf: Drive BRBE from perf event states
Date: Thu, 8 Sep 2022 10:40:44 +0530
Message-Id: <20220908051046.465307-6-anshuman.khandual@arm.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20220908051046.465307-1-anshuman.khandual@arm.com>
References: <20220908051046.465307-1-anshuman.khandual@arm.com>

Branch stack sampling rides along the normal perf event, and all the
branch records get captured during the PMU interrupt. This change adapts
perf event handling on the arm64 platform to accommodate the BRBE
operations required for enabling branch stack sampling support. It also
adds a new 'hw_perf_event.flags' element, i.e. ARMPMU_EVT_PRIV, which
enables caching the perf event privilege information required for
capturing certain branch record types.
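For context, a minimal user space sketch (not part of this patch) of how
branch stack samples get requested; setting PERF_SAMPLE_BRANCH_STACK in
the event attributes is what makes has_branch_stack(event) true in the
hunks below, while the counter choice and sample period here are purely
illustrative:

/*
 * Hypothetical usage sketch, not part of this patch: request branch
 * records alongside a cycle counting event via perf_event_open(2).
 */
#include <linux/perf_event.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

static int open_branch_sampling_event(void)
{
	struct perf_event_attr attr;

	memset(&attr, 0, sizeof(attr));
	attr.size = sizeof(attr);
	attr.type = PERF_TYPE_HARDWARE;
	attr.config = PERF_COUNT_HW_CPU_CYCLES;
	attr.sample_period = 100000;
	/* PERF_SAMPLE_BRANCH_STACK makes has_branch_stack(event) true */
	attr.sample_type = PERF_SAMPLE_IP | PERF_SAMPLE_BRANCH_STACK;
	attr.branch_sample_type = PERF_SAMPLE_BRANCH_ANY;

	/* Measure the calling thread, on any CPU */
	return syscall(__NR_perf_event_open, &attr, 0, -1, -1, 0);
}

The captured branch records are then delivered in each sample's
PERF_SAMPLE_BRANCH_STACK payload through the regular perf ring buffer.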
Cc: Peter Zijlstra
Cc: Ingo Molnar
Cc: Arnaldo Carvalho de Melo
Cc: Mark Rutland
Cc: Will Deacon
Cc: Catalin Marinas
Cc: linux-perf-users@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-arm-kernel@lists.infradead.org
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
---
 arch/arm64/kernel/perf_event.c |  6 ++++
 drivers/perf/arm_pmu.c         | 50 ++++++++++++++++++++++++++++++++++
 include/linux/perf/arm_pmu.h   |  4 +++
 3 files changed, 60 insertions(+)

diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
index e7013699171f..5bfaba8edad1 100644
--- a/arch/arm64/kernel/perf_event.c
+++ b/arch/arm64/kernel/perf_event.c
@@ -874,6 +874,12 @@ static irqreturn_t armv8pmu_handle_irq(struct arm_pmu *cpu_pmu)
 		if (!armpmu_event_set_period(event))
 			continue;
 
+		if (has_branch_stack(event)) {
+			cpu_pmu->brbe_read(cpuc, event);
+			data.br_stack = &cpuc->brbe_stack;
+			cpu_pmu->brbe_reset(cpuc);
+		}
+
 		/*
 		 * Perf event overflow will queue the processing of the event as
 		 * an irq_work which will be taken care of in the handling of
diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
index 59d3980b8ca2..1fe5d6238b81 100644
--- a/drivers/perf/arm_pmu.c
+++ b/drivers/perf/arm_pmu.c
@@ -271,12 +271,22 @@ armpmu_stop(struct perf_event *event, int flags)
 {
 	struct arm_pmu *armpmu = to_arm_pmu(event->pmu);
 	struct hw_perf_event *hwc = &event->hw;
+	struct pmu_hw_events *hw_events = this_cpu_ptr(armpmu->hw_events);
 
 	/*
 	 * ARM pmu always has to update the counter, so ignore
 	 * PERF_EF_UPDATE, see comments in armpmu_start().
 	 */
 	if (!(hwc->state & PERF_HES_STOPPED)) {
+		if (has_branch_stack(event)) {
+			WARN_ON_ONCE(!hw_events->brbe_users);
+			hw_events->brbe_users--;
+			if (!hw_events->brbe_users) {
+				hw_events->brbe_context = NULL;
+				armpmu->brbe_disable(hw_events);
+			}
+		}
+
 		armpmu->disable(event);
 		armpmu_event_update(event);
 		hwc->state |= PERF_HES_STOPPED | PERF_HES_UPTODATE;
@@ -287,6 +297,7 @@ static void armpmu_start(struct perf_event *event, int flags)
 {
 	struct arm_pmu *armpmu = to_arm_pmu(event->pmu);
 	struct hw_perf_event *hwc = &event->hw;
+	struct pmu_hw_events *hw_events = this_cpu_ptr(armpmu->hw_events);
 
 	/*
 	 * ARM pmu always has to reprogram the period, so ignore
@@ -304,6 +315,14 @@ static void armpmu_start(struct perf_event *event, int flags)
 	 * happened since disabling.
 	 */
 	armpmu_event_set_period(event);
+	if (has_branch_stack(event)) {
+		if (event->ctx->task && hw_events->brbe_context != event->ctx) {
+			armpmu->brbe_reset(hw_events);
+			hw_events->brbe_context = event->ctx;
+		}
+		armpmu->brbe_enable(hw_events);
+		hw_events->brbe_users++;
+	}
 	armpmu->enable(event);
 }
 
@@ -349,6 +368,10 @@ armpmu_add(struct perf_event *event, int flags)
 	hw_events->events[idx] = event;
 
 	hwc->state = PERF_HES_STOPPED | PERF_HES_UPTODATE;
+
+	if (has_branch_stack(event))
+		armpmu->brbe_filter(hw_events, event);
+
 	if (flags & PERF_EF_START)
 		armpmu_start(event, PERF_EF_RELOAD);
 
@@ -443,6 +466,7 @@ __hw_perf_event_init(struct perf_event *event)
 {
 	struct arm_pmu *armpmu = to_arm_pmu(event->pmu);
 	struct hw_perf_event *hwc = &event->hw;
+	struct pmu_hw_events *hw_events = this_cpu_ptr(armpmu->hw_events);
 	int mapping;
 
 	hwc->flags = 0;
@@ -492,6 +516,19 @@ __hw_perf_event_init(struct perf_event *event)
 		local64_set(&hwc->period_left, hwc->sample_period);
 	}
 
+	if (has_branch_stack(event)) {
+		/*
+		 * Cache whether the perf event is allowed to capture exception
+		 * and exception return branch records. It allows us to perform
+		 * the privilege check via perfmon_capable(), in the context of
+		 * the event owner, just once, during the pmu->event_init().
+		 */
+		if (perfmon_capable())
+			event->hw.flags |= ARMPMU_EVT_PRIV;
+
+		armpmu->brbe_filter(hw_events, event);
+	}
+
 	return validate_group(event);
 }
 
@@ -520,6 +557,18 @@ static int armpmu_event_init(struct perf_event *event)
 	return __hw_perf_event_init(event);
 }
 
+static void armpmu_sched_task(struct perf_event_context *ctx, bool sched_in)
+{
+	struct arm_pmu *armpmu = to_arm_pmu(ctx->pmu);
+	struct pmu_hw_events *hw_events = this_cpu_ptr(armpmu->hw_events);
+
+	if (!hw_events->brbe_users)
+		return;
+
+	if (sched_in)
+		armpmu->brbe_reset(hw_events);
+}
+
 static void armpmu_enable(struct pmu *pmu)
 {
 	struct arm_pmu *armpmu = to_arm_pmu(pmu);
@@ -877,6 +926,7 @@ static struct arm_pmu *__armpmu_alloc(gfp_t flags)
 	}
 
 	pmu->pmu = (struct pmu) {
+		.sched_task	= armpmu_sched_task,
 		.pmu_enable	= armpmu_enable,
 		.pmu_disable	= armpmu_disable,
 		.event_init	= armpmu_event_init,
diff --git a/include/linux/perf/arm_pmu.h b/include/linux/perf/arm_pmu.h
index 18e519e4e658..67f44020a736 100644
--- a/include/linux/perf/arm_pmu.h
+++ b/include/linux/perf/arm_pmu.h
@@ -29,6 +29,10 @@
 /* Event uses a 47bit counter */
 #define ARMPMU_EVT_47BIT		2
 
+#define ARMPMU_EVT_PRIV			0x00004	/* Event is privileged */
+
+static_assert((PERF_EVENT_FLAG_ARCH & ARMPMU_EVT_PRIV) == ARMPMU_EVT_PRIV);
+
 #define HW_OP_UNSUPPORTED		0xFFFF
 #define C(_x)				PERF_COUNT_HW_CACHE_##_x
 #define CACHE_OP_UNSUPPORTED		0xFFFF
-- 
2.25.1
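
For illustration only, a hypothetical sketch of how a BRBE back end's
brbe_filter() callback might consume the cached ARMPMU_EVT_PRIV flag; the
'brbe_config' field and the BRBE_RECORD_* bits are invented names, not
part of this series:

/*
 * Hypothetical sketch (not from this series): a BRBE back end gating
 * exception branch records on the privilege flag cached at init time.
 * 'brbe_config' and the BRBE_RECORD_* bits are invented for illustration.
 */
static void brbe_filter_sketch(struct pmu_hw_events *hw_events,
			       struct perf_event *event)
{
	u64 branch_type = event->attr.branch_sample_type;

	if (branch_type & PERF_SAMPLE_BRANCH_ANY)
		hw_events->brbe_config |= BRBE_RECORD_ANY;	/* hypothetical */

	/*
	 * Exception and exception return branch records can expose kernel
	 * addresses, so allow them only if the event owner passed the
	 * perfmon_capable() check cached as ARMPMU_EVT_PRIV.
	 */
	if (event->hw.flags & ARMPMU_EVT_PRIV)
		hw_events->brbe_config |= BRBE_RECORD_EXCEPTION; /* hypothetical */
}

The point of caching the flag is that perfmon_capable() gets evaluated
just once, in the event owner's context during pmu->event_init(), rather
than in atomic PMU contexts later on.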