From:
 kan.liang@linux.intel.com
To: peterz@infradead.org, acme@kernel.org, mingo@redhat.com, eranian@google.com, mpe@ellerman.id.au, linux-kernel@vger.kernel.org
Cc: ak@linux.intel.com, andreas.kogler.0x@gmail.com, atrajeev@linux.vnet.ibm.com, Kan Liang
Subject: [PATCH 3/6] perf: Use sample_flags for branch stack
Date: Wed, 31 Aug 2022 07:55:11 -0700
Message-Id: <20220831145514.190514-4-kan.liang@linux.intel.com>
X-Mailer: git-send-email 2.35.1
In-Reply-To: <20220831145514.190514-1-kan.liang@linux.intel.com>
References: <20220831145514.190514-1-kan.liang@linux.intel.com>
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Type: text/plain; charset="utf-8"

From: Kan Liang

Use the new sample_flags to indicate whether the branch stack is filled
by the PMU driver.

Remove the br_stack from the perf_sample_data_init() to minimize the number
of cache lines touched.

Signed-off-by: Kan Liang
---
 arch/powerpc/perf/core-book3s.c | 1 +
 arch/x86/events/core.c          | 4 +++-
 arch/x86/events/intel/core.c    | 4 +++-
 arch/x86/events/intel/ds.c      | 5 ++++-
 include/linux/perf_event.h      | 4 ++--
 kernel/events/core.c            | 4 ++--
 6 files changed, 15 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
index 13919eb96931..1ad1efdb33f9 100644
--- a/arch/powerpc/perf/core-book3s.c
+++ b/arch/powerpc/perf/core-book3s.c
@@ -2297,6 +2297,7 @@ static void record_and_restart(struct perf_event *event, unsigned long val,
 			cpuhw = this_cpu_ptr(&cpu_hw_events);
 			power_pmu_bhrb_read(event, cpuhw);
 			data.br_stack = &cpuhw->bhrb_stack;
+			data.sample_flags |= PERF_SAMPLE_BRANCH_STACK;
 		}
 
 		if (event->attr.sample_type & PERF_SAMPLE_DATA_SRC &&
diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index f969410d0c90..bb34a28fa71b 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -1714,8 +1714,10 @@ int x86_pmu_handle_irq(struct pt_regs *regs)
 
 		perf_sample_data_init(&data, 0, event->hw.last_period);
 
-		if (has_branch_stack(event))
+		if (has_branch_stack(event)) {
 			data.br_stack = &cpuc->lbr_stack;
+			data.sample_flags |= PERF_SAMPLE_BRANCH_STACK;
+		}
 
 		if (perf_event_overflow(event, &data, regs))
 			x86_pmu_stop(event, 0);
diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 4fce2bdbbf87..36f95894dd1c 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -3004,8 +3004,10 @@ static int handle_pmi_common(struct pt_regs *regs, u64 status)
 
 		perf_sample_data_init(&data, 0, event->hw.last_period);
 
-		if (has_branch_stack(event))
+		if (has_branch_stack(event)) {
 			data.br_stack = &cpuc->lbr_stack;
+			data.sample_flags |= PERF_SAMPLE_BRANCH_STACK;
+		}
 
 		if (perf_event_overflow(event, &data, regs))
 			x86_pmu_stop(event, 0);
diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index 3af24c4891fb..d5f3007af59d 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -1645,8 +1645,10 @@ static void setup_pebs_fixed_sample_data(struct perf_event *event,
 		data->sample_flags |= PERF_SAMPLE_TIME;
 	}
 
-	if (has_branch_stack(event))
+	if (has_branch_stack(event)) {
 		data->br_stack = &cpuc->lbr_stack;
+		data->sample_flags |= PERF_SAMPLE_BRANCH_STACK;
+	}
 }
 
 static void adaptive_pebs_save_regs(struct pt_regs *regs,
@@ -1796,6 +1798,7 @@ static void setup_pebs_adaptive_sample_data(struct perf_event *event,
 		if (has_branch_stack(event)) {
 			intel_pmu_store_pebs_lbrs(lbr);
 			data->br_stack = &cpuc->lbr_stack;
+			data->sample_flags |= PERF_SAMPLE_BRANCH_STACK;
 		}
 	}
 
diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index b0ebbb1377b9..2aec1765b3d5 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -1010,7 +1010,6 @@ struct perf_sample_data {
 	u64				sample_flags;
 	u64				addr;
 	struct perf_raw_record		*raw;
-	struct perf_branch_stack	*br_stack;
 	u64				period;
 	union perf_sample_weight	weight;
 	u64				txn;
@@ -1020,6 +1019,8 @@ struct perf_sample_data {
 	 * The other fields, optionally {set,used} by
 	 * perf_{prepare,output}_sample().
 	 */
+	struct perf_branch_stack	*br_stack;
+
 	u64				type;
 	u64				ip;
 	struct {
@@ -1060,7 +1061,6 @@ static inline void perf_sample_data_init(struct perf_sample_data *data,
 	data->sample_flags = 0;
 	data->addr = addr;
 	data->raw = NULL;
-	data->br_stack = NULL;
 	data->period = period;
 	data->weight.full = 0;
 	data->data_src.val = PERF_MEM_NA;
diff --git a/kernel/events/core.c b/kernel/events/core.c
index c9b9cb79231a..104c0c9f4e6f 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -7052,7 +7052,7 @@ void perf_output_sample(struct perf_output_handle *handle,
 	}
 
 	if (sample_type & PERF_SAMPLE_BRANCH_STACK) {
-		if (data->br_stack) {
+		if (data->sample_flags & PERF_SAMPLE_BRANCH_STACK) {
 			size_t size;
 
 			size = data->br_stack->nr
@@ -7358,7 +7358,7 @@ void perf_prepare_sample(struct perf_event_header *header,
 
 	if (sample_type & PERF_SAMPLE_BRANCH_STACK) {
 		int size = sizeof(u64); /* nr */
-		if (data->br_stack) {
+		if (data->sample_flags & PERF_SAMPLE_BRANCH_STACK) {
 			if (perf_sample_save_hw_index(event))
 				size += sizeof(u64);
 
-- 
2.35.1
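
Illustration only, not part of the patch: the pattern used above (the
init path no longer writes br_stack, and consumers test sample_flags
rather than the pointer) can be sketched in user space roughly as
follows. All names here (sample_data, SAMPLE_BRANCH_STACK,
driver_fill_branch_stack, branch_entries) are hypothetical stand-ins
and are not the kernel definitions; the struct is reduced to the fields
needed to show the idea.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag bit standing in for PERF_SAMPLE_BRANCH_STACK. */
#define SAMPLE_BRANCH_STACK	(1ULL << 11)

/* Hypothetical, reduced stand-in for struct perf_branch_stack. */
struct branch_stack {
	uint64_t nr;		/* number of recorded branch entries */
};

/* Hypothetical, reduced stand-in for struct perf_sample_data. */
struct sample_data {
	/* Fields written on every init. */
	uint64_t sample_flags;
	uint64_t period;
	/* Valid only when SAMPLE_BRANCH_STACK is set in sample_flags. */
	struct branch_stack *br_stack;
};

/* Init clears the flags but deliberately leaves br_stack untouched. */
static void sample_data_init(struct sample_data *d, uint64_t period)
{
	d->sample_flags = 0;
	d->period = period;
}

/* The "driver" stores the pointer and marks it valid in one place. */
static void driver_fill_branch_stack(struct sample_data *d,
				     struct branch_stack *bs)
{
	assert(bs != NULL);
	d->br_stack = bs;
	d->sample_flags |= SAMPLE_BRANCH_STACK;
}

/* The consumer checks the flag, never the possibly stale pointer. */
static uint64_t branch_entries(const struct sample_data *d)
{
	if (d->sample_flags & SAMPLE_BRANCH_STACK)
		return d->br_stack->nr;
	return 0;
}
```

In this sketch a stale br_stack value left over from an earlier sample
is harmless, because branch_entries() dereferences the pointer only
after the flag says the driver filled it for the current sample.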