From nobody Tue Apr 7 00:42:41 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id A8B4BC0502A for ; Wed, 31 Aug 2022 14:56:38 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231453AbiHaO4g (ORCPT ); Wed, 31 Aug 2022 10:56:36 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53478 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231331AbiHaO4U (ORCPT ); Wed, 31 Aug 2022 10:56:20 -0400 Received: from mga09.intel.com (mga09.intel.com [134.134.136.24]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id EE150C9908 for ; Wed, 31 Aug 2022 07:55:45 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1661957745; x=1693493745; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=gus0aTfcf/kp+LWpuOTpUd2EeTfjQ4el7EkieP/ZXEM=; b=dQJKP/efKft1nayRq671sysuHwluVv3Uycj9VKcMMrkYNpC/HZ9KQ5Cn ONBNsWNtiGwNRSaGAA8qhLjdMzsBinwgiMSw3wqbsGq4mp/m9Q4WBNzdY ASV9cdTpxvxWiErnjpcppOAVgWKOSwDyBkJA99IwQmBgUO8VN+TGpjIFz lH0ENvU4AViTnL0U+k3o/vueSrakl+/uAPZnSoKdEmxXcPF2Gvcry6pZv tS1UTRXwfSoZdqVm6d0PDn2mZY0bmozhU7nwZLxpqvsbRvtgagKcGWPHc PmRAFFF+sxZHN1irqQO8sZ7xC6OJovKdikbFdkaY6Qzg+oid+WRoydMR+ Q==; X-IronPort-AV: E=McAfee;i="6500,9779,10456"; a="296248192" X-IronPort-AV: E=Sophos;i="5.93,278,1654585200"; d="scan'208";a="296248192" Received: from fmsmga007.fm.intel.com ([10.253.24.52]) by orsmga102.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 31 Aug 2022 07:55:44 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.93,278,1654585200"; d="scan'208";a="614991679" Received: from kanliang-dev.jf.intel.com ([10.165.154.102]) by fmsmga007.fm.intel.com with ESMTP; 31 Aug 2022 07:55:44 -0700 From: 
kan.liang@linux.intel.com
To: peterz@infradead.org, acme@kernel.org, mingo@redhat.com, eranian@google.com, mpe@ellerman.id.au, linux-kernel@vger.kernel.org
Cc: ak@linux.intel.com, andreas.kogler.0x@gmail.com, atrajeev@linux.vnet.ibm.com, Kan Liang
Subject: [PATCH 1/6] perf: Add sample_flags to indicate the PMU-filled sample data
Date: Wed, 31 Aug 2022 07:55:09 -0700
Message-Id: <20220831145514.190514-2-kan.liang@linux.intel.com>
X-Mailer: git-send-email 2.35.1
In-Reply-To: <20220831145514.190514-1-kan.liang@linux.intel.com>
References: <20220831145514.190514-1-kan.liang@linux.intel.com>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Type: text/plain; charset="utf-8"

From: Kan Liang

On some platforms, some data, e.g., timestamps, can be retrieved from the
PMU driver. Usually, the data from the PMU driver is more accurate. The
perf kernel code should output the PMU-filled sample data if it is
available.

To check the availability of the PMU-filled sample data, the current
perf kernel code initializes the related fields in
perf_sample_data_init(). When outputting a sample, perf checks whether
the field was updated by the PMU driver. If so, the updated value is
output. If not, perf calculates the value in software, or outputs the
initialized value if no software method is available.

As more and more data is provided by the PMU driver, more fields have to
be initialized in perf_sample_data_init(). That increases the number of
cache lines touched in perf_sample_data_init() and hurts performance.

Add a new "sample_flags" field to indicate the PMU-filled sample data.
The PMU driver should set the corresponding PERF_SAMPLE_ flag when it
updates a field. The corresponding field then no longer needs to be
initialized.
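As an illustration only, the mechanism described above can be sketched in
user-space C. Every name below (the sketch_ functions, the SKETCH_SAMPLE_
bits, the struct) is a hypothetical stand-in and not the kernel API; the
sketch assumes a single flags word in which a set bit means "the PMU
driver already filled this field".

```c
#include <stdint.h>

/* Hypothetical stand-ins for the kernel's PERF_SAMPLE_* bits. */
#define SKETCH_SAMPLE_TIME   (1ULL << 0)
#define SKETCH_SAMPLE_WEIGHT (1ULL << 1)

struct sketch_sample_data {
	uint64_t sample_flags;	/* fields already filled by the PMU driver */
	uint64_t time;
	uint64_t weight;
};

/* Init clears only the flags word; time and weight are left untouched. */
static void sketch_sample_data_init(struct sketch_sample_data *data)
{
	data->sample_flags = 0;
}

/* A driver with a hardware timestamp stores it and marks it as filled. */
static void sketch_pmu_fill_time(struct sketch_sample_data *data,
				 uint64_t hw_time)
{
	data->time = hw_time;
	data->sample_flags |= SKETCH_SAMPLE_TIME;
}

/*
 * Generic code fills only the fields that were requested and that the
 * driver has not already provided.
 */
static void sketch_prepare_sample(struct sketch_sample_data *data,
				  uint64_t sample_type, uint64_t sw_time)
{
	uint64_t filtered = sample_type & ~data->sample_flags;

	if (filtered & SKETCH_SAMPLE_TIME)
		data->time = sw_time;	/* software fallback */
	if (filtered & SKETCH_SAMPLE_WEIGHT)
		data->weight = 0;	/* default value, set lazily */
}
```

In this sketch, a timestamp stored through sketch_pmu_fill_time() survives
a later sketch_prepare_sample() call, while a sample without a
driver-provided timestamp falls back to the software value.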
The following patches make use of it and remove the corresponding fields
from perf_sample_data_init(), which further reduces the number of cache
lines touched.

In perf_prepare_sample(), mask out of the sample type only the flags for
fields that the PMU driver has already filled. This applies to
PERF_RECORD_SAMPLE only. For the other PERF_RECORD_ event types, the
sample data is not available.

Suggested-by: Peter Zijlstra (Intel)
Signed-off-by: Kan Liang
---
 include/linux/perf_event.h |  2 ++
 kernel/events/core.c       | 17 +++++++++++------
 2 files changed, 13 insertions(+), 6 deletions(-)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index ee8b9ecdc03b..b0ebbb1377b9 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -1007,6 +1007,7 @@ struct perf_sample_data {
 	 * Fields set by perf_sample_data_init(), group so as to
 	 * minimize the cachelines touched.
 	 */
+	u64 sample_flags;
 	u64 addr;
 	struct perf_raw_record *raw;
 	struct perf_branch_stack *br_stack;
@@ -1056,6 +1057,7 @@ static inline void perf_sample_data_init(struct perf_sample_data *data,
 					 u64 addr, u64 period)
 {
 	/* remaining struct members initialized in perf_prepare_sample() */
+	data->sample_flags = 0;
 	data->addr = addr;
 	data->raw = NULL;
 	data->br_stack = NULL;
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 2621fd24ad26..c9b9cb79231a 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -6794,11 +6794,10 @@ static void perf_aux_sample_output(struct perf_event *event,
 
 static void __perf_event_header__init_id(struct perf_event_header *header,
 					 struct perf_sample_data *data,
-					 struct perf_event *event)
+					 struct perf_event *event,
+					 u64 sample_type)
 {
-	u64 sample_type = event->attr.sample_type;
-
-	data->type = sample_type;
+	data->type = event->attr.sample_type;
 	header->size += event->id_header_size;
 
 	if (sample_type & PERF_SAMPLE_TID) {
@@ -6827,7 +6826,7 @@ void perf_event_header__init_id(struct perf_event_header *header,
 				struct perf_event *event)
 {
 	if (event->attr.sample_id_all)
-		__perf_event_header__init_id(header, data, event);
+		__perf_event_header__init_id(header, data, event, event->attr.sample_type);
 }
 
 static void __perf_event__output_id_sample(struct perf_output_handle *handle,
@@ -7303,6 +7302,7 @@ void perf_prepare_sample(struct perf_event_header *header,
 			 struct pt_regs *regs)
 {
 	u64 sample_type = event->attr.sample_type;
+	u64 filtered_sample_type;
 
 	header->type = PERF_RECORD_SAMPLE;
 	header->size = sizeof(*header) + event->header_size;
@@ -7310,7 +7310,12 @@ void perf_prepare_sample(struct perf_event_header *header,
 	header->misc = 0;
 	header->misc |= perf_misc_flags(regs);
 
-	__perf_event_header__init_id(header, data, event);
+	/*
+	 * Clear the sample flags that have already been done by the
+	 * PMU driver.
+	 */
+	filtered_sample_type = sample_type & ~data->sample_flags;
+	__perf_event_header__init_id(header, data, event, filtered_sample_type);
 
 	if (sample_type & (PERF_SAMPLE_IP | PERF_SAMPLE_CODE_PAGE_SIZE))
 		data->ip = perf_instruction_pointer(regs);
-- 
2.35.1

From: kan.liang@linux.intel.com
Subject: [PATCH 2/6] perf/x86/intel/pebs: Fix PEBS timestamps overwritten
Date: Wed, 31 Aug 2022 07:55:10 -0700
Message-Id: <20220831145514.190514-3-kan.liang@linux.intel.com>
In-Reply-To: <20220831145514.190514-1-kan.liang@linux.intel.com>
References: <20220831145514.190514-1-kan.liang@linux.intel.com>

From: Kan Liang

The PEBS TSC-based timestamps do not appear correctly in the final
perf.data output file from perf record.
The data->time field set up by PEBS in setup_pebs_fixed_sample_data() is
later overwritten by the generic perf_events code in
perf_prepare_sample(). This is an ordering problem.

Set the PERF_SAMPLE_TIME flag in sample_flags when data->time is updated
by PEBS, so that the data->time field is no longer overwritten.

Reported-by: Andreas Kogler
Reported-by: Stephane Eranian
Signed-off-by: Kan Liang
---
 arch/x86/events/intel/ds.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index 6ce73b4ae2f3..3af24c4891fb 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -1640,8 +1640,10 @@ static void setup_pebs_fixed_sample_data(struct perf_event *event,
 	 * We can only do this for the default trace clock.
 	 */
 	if (x86_pmu.intel_cap.pebs_format >= 3 &&
-		event->attr.use_clockid == 0)
+		event->attr.use_clockid == 0) {
 		data->time = native_sched_clock_from_tsc(pebs->tsc);
+		data->sample_flags |= PERF_SAMPLE_TIME;
+	}
 
 	if (has_branch_stack(event))
 		data->br_stack = &cpuc->lbr_stack;
@@ -1702,8 +1704,10 @@ static void setup_pebs_adaptive_sample_data(struct perf_event *event,
 	perf_sample_data_init(data, 0, event->hw.last_period);
 	data->period = event->hw.last_period;
 
-	if (event->attr.use_clockid == 0)
+	if (event->attr.use_clockid == 0) {
 		data->time = native_sched_clock_from_tsc(basic->tsc);
+		data->sample_flags |= PERF_SAMPLE_TIME;
+	}
 
 	/*
 	 * We must however always use iregs for the unwinder to stay sane; the
-- 
2.35.1

From: kan.liang@linux.intel.com
Subject: [PATCH 3/6] perf: Use sample_flags for branch stack
Date: Wed, 31 Aug 2022 07:55:11 -0700
Message-Id: <20220831145514.190514-4-kan.liang@linux.intel.com>
In-Reply-To: <20220831145514.190514-1-kan.liang@linux.intel.com>
References: <20220831145514.190514-1-kan.liang@linux.intel.com>

From: Kan Liang

Use the new sample_flags to indicate whether the branch stack is filled
by the PMU driver.

Remove br_stack from perf_sample_data_init() to minimize the number of
cache lines touched.

Signed-off-by: Kan Liang
---
 arch/powerpc/perf/core-book3s.c | 1 +
 arch/x86/events/core.c          | 4 +++-
 arch/x86/events/intel/core.c    | 4 +++-
 arch/x86/events/intel/ds.c      | 5 ++++-
 include/linux/perf_event.h      | 4 ++--
 kernel/events/core.c            | 4 ++--
 6 files changed, 15 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
index 13919eb96931..1ad1efdb33f9 100644
--- a/arch/powerpc/perf/core-book3s.c
+++ b/arch/powerpc/perf/core-book3s.c
@@ -2297,6 +2297,7 @@ static void record_and_restart(struct perf_event *event, unsigned long val,
 			cpuhw = this_cpu_ptr(&cpu_hw_events);
 			power_pmu_bhrb_read(event, cpuhw);
 			data.br_stack = &cpuhw->bhrb_stack;
+			data.sample_flags |= PERF_SAMPLE_BRANCH_STACK;
 		}
 
 		if (event->attr.sample_type & PERF_SAMPLE_DATA_SRC &&
diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index f969410d0c90..bb34a28fa71b 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -1714,8 +1714,10 @@ int x86_pmu_handle_irq(struct pt_regs *regs)
 
 		perf_sample_data_init(&data, 0, event->hw.last_period);
 
-		if (has_branch_stack(event))
+		if (has_branch_stack(event)) {
 			data.br_stack = &cpuc->lbr_stack;
+			data.sample_flags |= PERF_SAMPLE_BRANCH_STACK;
+		}
 
 		if (perf_event_overflow(event, &data, regs))
 			x86_pmu_stop(event, 0);
diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 4fce2bdbbf87..36f95894dd1c 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -3004,8 +3004,10 @@ static int handle_pmi_common(struct pt_regs *regs, u64 status)
 
 		perf_sample_data_init(&data, 0, event->hw.last_period);
 
-		if (has_branch_stack(event))
+		if (has_branch_stack(event)) {
 			data.br_stack = &cpuc->lbr_stack;
+			data.sample_flags |= PERF_SAMPLE_BRANCH_STACK;
+		}
 
 		if (perf_event_overflow(event, &data, regs))
 			x86_pmu_stop(event, 0);
diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index 3af24c4891fb..d5f3007af59d 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -1645,8 +1645,10 @@ static void setup_pebs_fixed_sample_data(struct perf_event *event,
 		data->sample_flags |= PERF_SAMPLE_TIME;
 	}
 
-	if (has_branch_stack(event))
+	if (has_branch_stack(event)) {
 		data->br_stack = &cpuc->lbr_stack;
+		data->sample_flags |= PERF_SAMPLE_BRANCH_STACK;
+	}
 }
 
 static void adaptive_pebs_save_regs(struct pt_regs *regs,
@@ -1796,6 +1798,7 @@ static void setup_pebs_adaptive_sample_data(struct perf_event *event,
 		if (has_branch_stack(event)) {
 			intel_pmu_store_pebs_lbrs(lbr);
 			data->br_stack = &cpuc->lbr_stack;
+			data->sample_flags |= PERF_SAMPLE_BRANCH_STACK;
 		}
 	}
 
diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index b0ebbb1377b9..2aec1765b3d5 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -1010,7 +1010,6 @@ struct perf_sample_data {
 	u64 sample_flags;
 	u64 addr;
 	struct perf_raw_record *raw;
-	struct perf_branch_stack *br_stack;
 	u64 period;
 	union perf_sample_weight weight;
 	u64 txn;
@@ -1020,6 +1019,8 @@ struct perf_sample_data {
 	 * The other fields, optionally {set,used} by
 	 * perf_{prepare,output}_sample().
 	 */
+	struct perf_branch_stack *br_stack;
+
 	u64 type;
 	u64 ip;
 	struct {
@@ -1060,7 +1061,6 @@ static inline void perf_sample_data_init(struct perf_sample_data *data,
 	data->sample_flags = 0;
 	data->addr = addr;
 	data->raw = NULL;
-	data->br_stack = NULL;
 	data->period = period;
 	data->weight.full = 0;
 	data->data_src.val = PERF_MEM_NA;
diff --git a/kernel/events/core.c b/kernel/events/core.c
index c9b9cb79231a..104c0c9f4e6f 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -7052,7 +7052,7 @@ void perf_output_sample(struct perf_output_handle *handle,
 	}
 
 	if (sample_type & PERF_SAMPLE_BRANCH_STACK) {
-		if (data->br_stack) {
+		if (data->sample_flags & PERF_SAMPLE_BRANCH_STACK) {
 			size_t size;
 
 			size = data->br_stack->nr
@@ -7358,7 +7358,7 @@ void perf_prepare_sample(struct perf_event_header *header,
 
 	if (sample_type & PERF_SAMPLE_BRANCH_STACK) {
 		int size = sizeof(u64); /* nr */
-		if (data->br_stack) {
+		if (data->sample_flags & PERF_SAMPLE_BRANCH_STACK) {
 			if (perf_sample_save_hw_index(event))
 				size += sizeof(u64);
 
-- 
2.35.1

From: kan.liang@linux.intel.com
Subject: [PATCH 4/6] perf: Use sample_flags for weight
Date: Wed, 31 Aug 2022 07:55:12 -0700
Message-Id: <20220831145514.190514-5-kan.liang@linux.intel.com>
In-Reply-To: <20220831145514.190514-1-kan.liang@linux.intel.com>
References: <20220831145514.190514-1-kan.liang@linux.intel.com>

From: Kan Liang

Use the new sample_flags to indicate whether the weight field is filled
by the PMU driver.
Remove the weight field from perf_sample_data_init() to minimize the
number of cache lines touched.

Signed-off-by: Kan Liang
---
 arch/powerpc/perf/core-book3s.c |  5 +++--
 arch/x86/events/intel/ds.c      | 10 +++++++---
 include/linux/perf_event.h      |  3 +--
 kernel/events/core.c            |  3 +++
 4 files changed, 14 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
index 1ad1efdb33f9..a5c95a2006ea 100644
--- a/arch/powerpc/perf/core-book3s.c
+++ b/arch/powerpc/perf/core-book3s.c
@@ -2305,9 +2305,10 @@ static void record_and_restart(struct perf_event *event, unsigned long val,
 			ppmu->get_mem_data_src(&data.data_src, ppmu->flags, regs);
 
 		if (event->attr.sample_type & PERF_SAMPLE_WEIGHT_TYPE &&
-						ppmu->get_mem_weight)
+						ppmu->get_mem_weight) {
 			ppmu->get_mem_weight(&data.weight.full, event->attr.sample_type);
-
+			data.sample_flags |= PERF_SAMPLE_WEIGHT_TYPE;
+		}
 		if (perf_event_overflow(event, &data, regs))
 			power_pmu_stop(event, 0);
 	} else if (period) {
diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index d5f3007af59d..e80632a575d1 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -1532,8 +1532,10 @@ static void setup_pebs_fixed_sample_data(struct perf_event *event,
 	/*
 	 * Use latency for weight (only avail with PEBS-LL)
 	 */
-	if (fll && (sample_type & PERF_SAMPLE_WEIGHT_TYPE))
+	if (fll && (sample_type & PERF_SAMPLE_WEIGHT_TYPE)) {
 		data->weight.full = pebs->lat;
+		data->sample_flags |= PERF_SAMPLE_WEIGHT_TYPE;
+	}
 
 	/*
 	 * data.data_src encodes the data source
@@ -1625,9 +1627,10 @@ static void setup_pebs_fixed_sample_data(struct perf_event *event,
 
 	if (x86_pmu.intel_cap.pebs_format >= 2) {
 		/* Only set the TSX weight when no memory weight. */
-		if ((sample_type & PERF_SAMPLE_WEIGHT_TYPE) && !fll)
+		if ((sample_type & PERF_SAMPLE_WEIGHT_TYPE) && !fll) {
 			data->weight.full = intel_get_tsx_weight(pebs->tsx_tuning);
-
+			data->sample_flags |= PERF_SAMPLE_WEIGHT_TYPE;
+		}
 		if (sample_type & PERF_SAMPLE_TRANSACTION)
 			data->txn = intel_get_tsx_transaction(pebs->tsx_tuning,
 							      pebs->ax);
@@ -1769,6 +1772,7 @@ static void setup_pebs_adaptive_sample_data(struct perf_event *event,
 				data->weight.var1_dw = (u32)(weight & PEBS_LATENCY_MASK) ?:
 					intel_get_tsx_weight(meminfo->tsx_tuning);
 			}
+			data->sample_flags |= PERF_SAMPLE_WEIGHT_TYPE;
 		}
 
 		if (sample_type & PERF_SAMPLE_DATA_SRC)
diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 2aec1765b3d5..c030d1d1c675 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -1011,7 +1011,6 @@ struct perf_sample_data {
 	u64 addr;
 	struct perf_raw_record *raw;
 	u64 period;
-	union perf_sample_weight weight;
 	u64 txn;
 	union perf_mem_data_src data_src;
 
@@ -1020,6 +1019,7 @@ struct perf_sample_data {
 	 * perf_{prepare,output}_sample().
 	 */
 	struct perf_branch_stack *br_stack;
+	union perf_sample_weight weight;
 
 	u64 type;
 	u64 ip;
@@ -1062,7 +1062,6 @@ static inline void perf_sample_data_init(struct perf_sample_data *data,
 	data->addr = addr;
 	data->raw = NULL;
 	data->period = period;
-	data->weight.full = 0;
 	data->data_src.val = PERF_MEM_NA;
 	data->txn = 0;
 }
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 104c0c9f4e6f..f0af45db02b3 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -7408,6 +7408,9 @@ void perf_prepare_sample(struct perf_event_header *header,
 		header->size += size;
 	}
 
+	if (filtered_sample_type & PERF_SAMPLE_WEIGHT_TYPE)
+		data->weight.full = 0;
+
 	if (sample_type & PERF_SAMPLE_REGS_INTR) {
 		/* regs dump ABI info */
 		int size = sizeof(u64);
-- 
2.35.1

From: kan.liang@linux.intel.com
Subject: [PATCH 5/6] perf: Use sample_flags for data_src
Date: Wed, 31 Aug 2022 07:55:13 -0700
Message-Id: <20220831145514.190514-6-kan.liang@linux.intel.com>
In-Reply-To: <20220831145514.190514-1-kan.liang@linux.intel.com>
References: <20220831145514.190514-1-kan.liang@linux.intel.com>

From: Kan Liang

Use the new sample_flags to indicate whether the data_src field is
filled by the PMU driver.

Remove the data_src field from perf_sample_data_init() to minimize the
number of cache lines touched.

Signed-off-by: Kan Liang
---
 arch/powerpc/perf/core-book3s.c | 4 +++-
 arch/x86/events/intel/ds.c      | 8 ++++++--
 include/linux/perf_event.h      | 3 +--
 kernel/events/core.c            | 3 +++
 4 files changed, 13 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
index a5c95a2006ea..6ec7069e6482 100644
--- a/arch/powerpc/perf/core-book3s.c
+++ b/arch/powerpc/perf/core-book3s.c
@@ -2301,8 +2301,10 @@ static void record_and_restart(struct perf_event *event, unsigned long val,
 		}
 
 		if (event->attr.sample_type & PERF_SAMPLE_DATA_SRC &&
-						ppmu->get_mem_data_src)
+						ppmu->get_mem_data_src) {
 			ppmu->get_mem_data_src(&data.data_src, ppmu->flags, regs);
+			data.sample_flags |= PERF_SAMPLE_DATA_SRC;
+		}
 
 		if (event->attr.sample_type & PERF_SAMPLE_WEIGHT_TYPE &&
 						ppmu->get_mem_weight) {
diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index e80632a575d1..9a10457ff32a 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -1540,8 +1540,10 @@ static void setup_pebs_fixed_sample_data(struct perf_event *event,
 	/*
 	 * data.data_src encodes the data source
 	 */
-	if (sample_type & PERF_SAMPLE_DATA_SRC)
+	if (sample_type & PERF_SAMPLE_DATA_SRC) {
 		data->data_src.val = get_data_src(event, pebs->dse);
+		data->sample_flags |= PERF_SAMPLE_DATA_SRC;
+	}
 
 	/*
 	 * We must however always use iregs for the unwinder to stay sane; the
@@ -1775,8 +1777,10 @@ static void setup_pebs_adaptive_sample_data(struct perf_event *event,
 			data->sample_flags |= PERF_SAMPLE_WEIGHT_TYPE;
 		}
 
-		if (sample_type & PERF_SAMPLE_DATA_SRC)
+		if (sample_type & PERF_SAMPLE_DATA_SRC) {
 			data->data_src.val = get_data_src(event, meminfo->aux);
+			data->sample_flags |= PERF_SAMPLE_DATA_SRC;
+		}
 
 		if (sample_type & PERF_SAMPLE_ADDR_TYPE)
 			data->addr = meminfo->address;
diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index c030d1d1c675..79b44084c15d 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -1012,7 +1012,6 @@ struct perf_sample_data {
 	struct perf_raw_record *raw;
 	u64 period;
 	u64 txn;
-	union perf_mem_data_src data_src;
 
 	/*
 	 * The other fields, optionally {set,used} by
@@ -1020,6 +1019,7 @@ struct perf_sample_data {
 	 */
 	struct perf_branch_stack *br_stack;
 	union perf_sample_weight weight;
+	union perf_mem_data_src data_src;
 
 	u64 type;
 	u64 ip;
@@ -1062,7 +1062,6 @@ static inline void perf_sample_data_init(struct perf_sample_data *data,
 	data->addr = addr;
 	data->raw = NULL;
 	data->period = period;
-	data->data_src.val = PERF_MEM_NA;
 	data->txn = 0;
 }
 
diff --git a/kernel/events/core.c b/kernel/events/core.c
index f0af45db02b3..163e2f478e61 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -7411,6 +7411,9 @@ void perf_prepare_sample(struct perf_event_header *header,
 	if (filtered_sample_type & PERF_SAMPLE_WEIGHT_TYPE)
 		data->weight.full = 0;
 
+	if (filtered_sample_type & PERF_SAMPLE_DATA_SRC)
+		data->data_src.val = PERF_MEM_NA;
+
 	if (sample_type & PERF_SAMPLE_REGS_INTR) {
 		/* regs dump ABI info */
 		int size = sizeof(u64);
-- 
2.35.1

From: kan.liang@linux.intel.com
Subject: [PATCH 6/6] perf: Use sample_flags for txn
Date: Wed, 31 Aug 2022 07:55:14 -0700
Message-Id: <20220831145514.190514-7-kan.liang@linux.intel.com>
In-Reply-To: <20220831145514.190514-1-kan.liang@linux.intel.com>
References: <20220831145514.190514-1-kan.liang@linux.intel.com>

From: Kan Liang

Use the new sample_flags to indicate whether the txn field is filled by
the PMU driver.

Remove the txn initialization from perf_sample_data_init() to minimize
the number of cache lines touched.

Signed-off-by: Kan Liang
---
 arch/x86/events/intel/ds.c | 8 ++++++--
 include/linux/perf_event.h | 3 +--
 kernel/events/core.c       | 3 +++
 3 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index 9a10457ff32a..3c6a68d7fe42 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -1633,9 +1633,11 @@ static void setup_pebs_fixed_sample_data(struct perf_event *event,
 			data->weight.full = intel_get_tsx_weight(pebs->tsx_tuning);
 			data->sample_flags |= PERF_SAMPLE_WEIGHT_TYPE;
 		}
-		if (sample_type & PERF_SAMPLE_TRANSACTION)
+		if (sample_type & PERF_SAMPLE_TRANSACTION) {
 			data->txn = intel_get_tsx_transaction(pebs->tsx_tuning,
 							      pebs->ax);
+			data->sample_flags |= PERF_SAMPLE_TRANSACTION;
+		}
 	}
 
 	/*
@@ -1785,9 +1787,11 @@ static void setup_pebs_adaptive_sample_data(struct perf_event *event,
 		if (sample_type & PERF_SAMPLE_ADDR_TYPE)
 			data->addr = meminfo->address;
 
-		if (sample_type & PERF_SAMPLE_TRANSACTION)
+		if (sample_type & PERF_SAMPLE_TRANSACTION) {
 			data->txn = intel_get_tsx_transaction(meminfo->tsx_tuning,
 							  gprs ? gprs->ax : 0);
+			data->sample_flags |= PERF_SAMPLE_TRANSACTION;
+		}
 	}
 
 	if (format_size & PEBS_DATACFG_XMMS) {
diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 79b44084c15d..d7c9fdd82bc3 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -1011,7 +1011,6 @@ struct perf_sample_data {
 	u64 addr;
 	struct perf_raw_record *raw;
 	u64 period;
-	u64 txn;
 
 	/*
 	 * The other fields, optionally {set,used} by
@@ -1020,6 +1019,7 @@ struct perf_sample_data {
 	struct perf_branch_stack *br_stack;
 	union perf_sample_weight weight;
 	union perf_mem_data_src data_src;
+	u64 txn;
 
 	u64 type;
 	u64 ip;
@@ -1062,7 +1062,6 @@ static inline void perf_sample_data_init(struct perf_sample_data *data,
 	data->addr = addr;
 	data->raw = NULL;
 	data->period = period;
-	data->txn = 0;
 }
 
 /*
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 163e2f478e61..15d27b14c827 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -7414,6 +7414,9 @@ void perf_prepare_sample(struct perf_event_header *header,
 	if (filtered_sample_type & PERF_SAMPLE_DATA_SRC)
 		data->data_src.val = PERF_MEM_NA;
 
+	if (filtered_sample_type & PERF_SAMPLE_TRANSACTION)
+		data->txn = 0;
+
 	if (sample_type & PERF_SAMPLE_REGS_INTR) {
 		/* regs dump ABI info */
 		int size = sizeof(u64);
-- 
2.35.1
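
For readers following the series, the pattern the two patches above apply to data_src and txn can be modelled in user space roughly as follows. This is only a sketch, not kernel code: every demo_* and DEMO_* name is a hypothetical stand-in, and the way filtered_sample_type is derived (the requested bits minus those the driver already flagged) is an assumption, since that computation is not part of the hunks shown here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel's PERF_SAMPLE_* bits and PERF_MEM_NA. */
#define DEMO_SAMPLE_DATA_SRC    (1ULL << 0)
#define DEMO_SAMPLE_TRANSACTION (1ULL << 1)
#define DEMO_MEM_NA             0x1ULL

/* Reduced model of struct perf_sample_data. */
struct demo_sample_data {
	uint64_t sample_flags;	/* which fields the "PMU driver" has filled */
	uint64_t data_src;
	uint64_t txn;
};

/* Init touches only the flags word; data_src and txn are left alone. */
static void demo_sample_data_init(struct demo_sample_data *data)
{
	data->sample_flags = 0;
}

/* Driver side: store the hardware value and flag the field as filled. */
static void demo_pmu_fill_txn(struct demo_sample_data *data,
			      uint64_t sample_type, uint64_t hw_txn)
{
	if (sample_type & DEMO_SAMPLE_TRANSACTION) {
		data->txn = hw_txn;
		data->sample_flags |= DEMO_SAMPLE_TRANSACTION;
	}
}

/* Core side: give defaults only to requested fields the driver skipped. */
static void demo_prepare_sample(struct demo_sample_data *data,
				uint64_t sample_type)
{
	/* Assumption: requested bits minus the bits already flagged. */
	uint64_t filtered_sample_type = sample_type & ~data->sample_flags;

	if (filtered_sample_type & DEMO_SAMPLE_DATA_SRC)
		data->data_src = DEMO_MEM_NA;

	if (filtered_sample_type & DEMO_SAMPLE_TRANSACTION)
		data->txn = 0;
}
```

In this model a field is written at most once per sample: by the driver when it has a value, otherwise by the prepare step and only if the event requested it. Fields that were not requested are never touched, which is where the cache line saving described in the commit message would come from.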