From nobody Mon Apr 27 03:25:06 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 28B96C433EF for ; Wed, 20 Jul 2022 06:58:46 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237195AbiGTG6o (ORCPT ); Wed, 20 Jul 2022 02:58:44 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:52192 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236335AbiGTG6m (ORCPT ); Wed, 20 Jul 2022 02:58:42 -0400 Received: from out30-131.freemail.mail.aliyun.com (out30-131.freemail.mail.aliyun.com [115.124.30.131]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 708ED51428 for ; Tue, 19 Jul 2022 23:58:39 -0700 (PDT) X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R141e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=ay29a033018046049;MF=xueshuai@linux.alibaba.com;NM=1;PH=DS;RN=11;SR=0;TI=SMTPD_---0VJvhhU8_1658300314; Received: from localhost.localdomain(mailfrom:xueshuai@linux.alibaba.com fp:SMTPD_---0VJvhhU8_1658300314) by smtp.aliyun-inc.com; Wed, 20 Jul 2022 14:58:35 +0800 From: Shuai Xue To: Jonathan.Cameron@Huawei.com, rdunlap@infradead.org, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, will@kernel.org, mark.rutland@arm.com Cc: baolin.wang@linux.alibaba.com, yaohongbo@linux.alibaba.com, nengchen@linux.alibaba.com, zhuo.song@linux.alibaba.com, xueshuai@linux.alibaba.com Subject: [PATCH v3 1/3] docs: perf: Add description for Alibaba's T-Head PMU driver Date: Wed, 20 Jul 2022 14:58:13 +0800 Message-Id: <20220720065815.60772-2-xueshuai@linux.alibaba.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20220617111825.92911-1-xueshuai@linux.alibaba.com> References: <20220617111825.92911-1-xueshuai@linux.alibaba.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" Alibaba's T-Head SoC implements uncore PMU for performance and functional debugging to facilitate system maintenance. Document it to provide guidance on how to use it. Signed-off-by: Shuai Xue Reviewed-by: Jonathan Cameron Reviewed-by: Baolin Wang --- .../admin-guide/perf/alibaba_pmu.rst | 100 ++++++++++++++++++ Documentation/admin-guide/perf/index.rst | 1 + 2 files changed, 101 insertions(+) create mode 100644 Documentation/admin-guide/perf/alibaba_pmu.rst diff --git a/Documentation/admin-guide/perf/alibaba_pmu.rst b/Documentation= /admin-guide/perf/alibaba_pmu.rst new file mode 100644 index 000000000000..11de998bb480 --- /dev/null +++ b/Documentation/admin-guide/perf/alibaba_pmu.rst @@ -0,0 +1,100 @@ +=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D +Alibaba's T-Head SoC Uncore Performance Monitoring Unit (PMU) +=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D + +The Yitian 710, custom-built by Alibaba Group's chip development business, +T-Head, implements uncore PMU for performance and functional debugging to +facilitate system maintenance. + +DDR Sub-System Driveway (DRW) PMU Driver +=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D + +Yitian 710 employs eight DDR5/4 channels, four on each die. Each DDR5 chan= nel +is independent of others to service system memory requests. And one DDR5 +channel is split into two independent sub-channels. The DDR Sub-System Dri= veway +implements separate PMUs for each sub-channel to monitor various performan= ce +metrics. + +The Driveway PMU devices are named as ali_drw_ with perf. +For example, ali_drw_21000 and ali_drw_21080 are two PMU devices for two +sub-channels of the same channel in die 0. And the PMU device of die 1 is +prefixed with ali_drw_400XXXXX, e.g. ali_drw_40021000. + +Each sub-channel has 36 PMU counters in total, which is classified into +four groups: + +- Group 0: PMU Cycle Counter. This group has one pair of counters + pmu_cycle_cnt_low and pmu_cycle_cnt_high, that is used as the cycle count + based on DDRC core clock. + +- Group 1: PMU Bandwidth Counters. This group has 8 counters that are used + to count the total access number of either the eight bank groups in a + selected rank, or four ranks separately in the first 4 counters. The base + transfer unit is 64B. + +- Group 2: PMU Retry Counters. This group has 10 counters, that intend to + count the total retry number of each type of uncorrectable error. + +- Group 3: PMU Common Counters. This group has 16 counters, that are used + to count the common events. + +For now, the Driveway PMU driver only uses counters in group 0 and group 3. + +The DDR Controller (DDRCTL) and DDR PHY combine to create a complete solut= ion +for connecting an SoC application bus to DDR memory devices. The DDRCTL +receives transactions Host Interface (HIF) which is custom-defined by Syno= psys. +These transactions are queued internally and scheduled for access while +satisfying the SDRAM protocol timing requirements, transaction priorities,= and +dependencies between the transactions. The DDRCTL in turn issues commands = on +the DDR PHY Interface (DFI) to the PHY module, which launches and captures= data +to and from the SDRAM. The driveway PMUs have hardware logic to gather +statistics and performance logging signals on HIF, DFI, etc. + +By counting the READ, WRITE and RMW commands sent to the DDRC through the = HIF +interface, we could calculate the bandwidth. Example usage of counting mem= ory +data bandwidth:: + + perf stat \ + -e ali_drw_21000/hif_wr/ \ + -e ali_drw_21000/hif_rd/ \ + -e ali_drw_21000/hif_rmw/ \ + -e ali_drw_21000/cycle/ \ + -e ali_drw_21080/hif_wr/ \ + -e ali_drw_21080/hif_rd/ \ + -e ali_drw_21080/hif_rmw/ \ + -e ali_drw_21080/cycle/ \ + -e ali_drw_23000/hif_wr/ \ + -e ali_drw_23000/hif_rd/ \ + -e ali_drw_23000/hif_rmw/ \ + -e ali_drw_23000/cycle/ \ + -e ali_drw_23080/hif_wr/ \ + -e ali_drw_23080/hif_rd/ \ + -e ali_drw_23080/hif_rmw/ \ + -e ali_drw_23080/cycle/ \ + -e ali_drw_25000/hif_wr/ \ + -e ali_drw_25000/hif_rd/ \ + -e ali_drw_25000/hif_rmw/ \ + -e ali_drw_25000/cycle/ \ + -e ali_drw_25080/hif_wr/ \ + -e ali_drw_25080/hif_rd/ \ + -e ali_drw_25080/hif_rmw/ \ + -e ali_drw_25080/cycle/ \ + -e ali_drw_27000/hif_wr/ \ + -e ali_drw_27000/hif_rd/ \ + -e ali_drw_27000/hif_rmw/ \ + -e ali_drw_27000/cycle/ \ + -e ali_drw_27080/hif_wr/ \ + -e ali_drw_27080/hif_rd/ \ + -e ali_drw_27080/hif_rmw/ \ + -e ali_drw_27080/cycle/ -- sleep 10 + +The average DRAM bandwidth can be calculated as follows: + +- Read Bandwidth =3D perf_hif_rd * DDRC_WIDTH * DDRC_Freq / DDRC_Cycle +- Write Bandwidth =3D (perf_hif_wr + perf_hif_rmw) * DDRC_WIDTH * DDRC_Fre= q / DDRC_Cycle + +Here, DDRC_WIDTH =3D 64 bytes. + +The current driver does not support sampling. So "perf record" is +unsupported. Also attach to a task is unsupported as the events are all +uncore. diff --git a/Documentation/admin-guide/perf/index.rst b/Documentation/admin= -guide/perf/index.rst index 69b23f087c05..823db08863db 100644 --- a/Documentation/admin-guide/perf/index.rst +++ b/Documentation/admin-guide/perf/index.rst @@ -17,3 +17,4 @@ Performance monitor support xgene-pmu arm_dsu_pmu thunderx2-pmu + alibaba_pmu --=20 2.20.1.9.gb50a0d7 From nobody Mon Apr 27 03:25:06 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id DB019C433EF for ; Fri, 15 Jul 2022 15:13:40 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233560AbiGOPNg (ORCPT ); Fri, 15 Jul 2022 11:13:36 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:51230 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233504AbiGOPNb (ORCPT ); Fri, 15 Jul 2022 11:13:31 -0400 Received: from out30-42.freemail.mail.aliyun.com (out30-42.freemail.mail.aliyun.com [115.124.30.42]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D30747CB4D for ; Fri, 15 Jul 2022 08:13:29 -0700 (PDT) X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R151e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=ay29a033018046050;MF=xueshuai@linux.alibaba.com;NM=1;PH=DS;RN=10;SR=0;TI=SMTPD_---0VJPkRLx_1657898001; Received: from localhost.localdomain(mailfrom:xueshuai@linux.alibaba.com fp:SMTPD_---0VJPkRLx_1657898001) by smtp.aliyun-inc.com; Fri, 15 Jul 2022 23:13:23 +0800 From: Shuai Xue To: Jonathan.Cameron@Huawei.com, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, will@kernel.org, mark.rutland@arm.com Cc: baolin.wang@linux.alibaba.com, yaohongbo@linux.alibaba.com, nengchen@linux.alibaba.com, zhuo.song@linux.alibaba.com, xueshuai@linux.alibaba.com Subject: [RESEND PATCH v2 1/3] docs: perf: Add description for Alibaba's T-Head PMU driver Date: Fri, 15 Jul 2022 23:13:08 +0800 Message-Id: <20220715151310.90091-2-xueshuai@linux.alibaba.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20220617111825.92911-1-xueshuai@linux.alibaba.com> References: <20220617111825.92911-1-xueshuai@linux.alibaba.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" Alibaba's T-Head SoC implements uncore PMU for performance and functional debugging to facilitate system maintenance. Document it to provide guidance on how to use it. Signed-off-by: Shuai Xue Reviewed-by: Baolin Wang Reviewed-by: Jonathan Cameron --- .../admin-guide/perf/alibaba_pmu.rst | 100 ++++++++++++++++++ Documentation/admin-guide/perf/index.rst | 1 + 2 files changed, 101 insertions(+) create mode 100644 Documentation/admin-guide/perf/alibaba_pmu.rst diff --git a/Documentation/admin-guide/perf/alibaba_pmu.rst b/Documentation= /admin-guide/perf/alibaba_pmu.rst new file mode 100644 index 000000000000..11de998bb480 --- /dev/null +++ b/Documentation/admin-guide/perf/alibaba_pmu.rst @@ -0,0 +1,100 @@ +=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D +Alibaba's T-Head SoC Uncore Performance Monitoring Unit (PMU) +=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D + +The Yitian 710, custom-built by Alibaba Group's chip development business, +T-Head, implements uncore PMU for performance and functional debugging to +facilitate system maintenance. + +DDR Sub-System Driveway (DRW) PMU Driver +=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D + +Yitian 710 employs eight DDR5/4 channels, four on each die. Each DDR5 chan= nel +is independent of others to service system memory requests. And one DDR5 +channel is split into two independent sub-channels. The DDR Sub-System Dri= veway +implements separate PMUs for each sub-channel to monitor various performan= ce +metrics. + +The Driveway PMU devices are named as ali_drw_ with perf. +For example, ali_drw_21000 and ali_drw_21080 are two PMU devices for two +sub-channels of the same channel in die 0. And the PMU device of die 1 is +prefixed with ali_drw_400XXXXX, e.g. ali_drw_40021000. + +Each sub-channel has 36 PMU counters in total, which is classified into +four groups: + +- Group 0: PMU Cycle Counter. This group has one pair of counters + pmu_cycle_cnt_low and pmu_cycle_cnt_high, that is used as the cycle count + based on DDRC core clock. + +- Group 1: PMU Bandwidth Counters. This group has 8 counters that are used + to count the total access number of either the eight bank groups in a + selected rank, or four ranks separately in the first 4 counters. The base + transfer unit is 64B. + +- Group 2: PMU Retry Counters. This group has 10 counters, that intend to + count the total retry number of each type of uncorrectable error. + +- Group 3: PMU Common Counters. This group has 16 counters, that are used + to count the common events. + +For now, the Driveway PMU driver only uses counters in group 0 and group 3. + +The DDR Controller (DDRCTL) and DDR PHY combine to create a complete solut= ion +for connecting an SoC application bus to DDR memory devices. The DDRCTL +receives transactions Host Interface (HIF) which is custom-defined by Syno= psys. +These transactions are queued internally and scheduled for access while +satisfying the SDRAM protocol timing requirements, transaction priorities,= and +dependencies between the transactions. The DDRCTL in turn issues commands = on +the DDR PHY Interface (DFI) to the PHY module, which launches and captures= data +to and from the SDRAM. The driveway PMUs have hardware logic to gather +statistics and performance logging signals on HIF, DFI, etc. + +By counting the READ, WRITE and RMW commands sent to the DDRC through the = HIF +interface, we could calculate the bandwidth. Example usage of counting mem= ory +data bandwidth:: + + perf stat \ + -e ali_drw_21000/hif_wr/ \ + -e ali_drw_21000/hif_rd/ \ + -e ali_drw_21000/hif_rmw/ \ + -e ali_drw_21000/cycle/ \ + -e ali_drw_21080/hif_wr/ \ + -e ali_drw_21080/hif_rd/ \ + -e ali_drw_21080/hif_rmw/ \ + -e ali_drw_21080/cycle/ \ + -e ali_drw_23000/hif_wr/ \ + -e ali_drw_23000/hif_rd/ \ + -e ali_drw_23000/hif_rmw/ \ + -e ali_drw_23000/cycle/ \ + -e ali_drw_23080/hif_wr/ \ + -e ali_drw_23080/hif_rd/ \ + -e ali_drw_23080/hif_rmw/ \ + -e ali_drw_23080/cycle/ \ + -e ali_drw_25000/hif_wr/ \ + -e ali_drw_25000/hif_rd/ \ + -e ali_drw_25000/hif_rmw/ \ + -e ali_drw_25000/cycle/ \ + -e ali_drw_25080/hif_wr/ \ + -e ali_drw_25080/hif_rd/ \ + -e ali_drw_25080/hif_rmw/ \ + -e ali_drw_25080/cycle/ \ + -e ali_drw_27000/hif_wr/ \ + -e ali_drw_27000/hif_rd/ \ + -e ali_drw_27000/hif_rmw/ \ + -e ali_drw_27000/cycle/ \ + -e ali_drw_27080/hif_wr/ \ + -e ali_drw_27080/hif_rd/ \ + -e ali_drw_27080/hif_rmw/ \ + -e ali_drw_27080/cycle/ -- sleep 10 + +The average DRAM bandwidth can be calculated as follows: + +- Read Bandwidth =3D perf_hif_rd * DDRC_WIDTH * DDRC_Freq / DDRC_Cycle +- Write Bandwidth =3D (perf_hif_wr + perf_hif_rmw) * DDRC_WIDTH * DDRC_Fre= q / DDRC_Cycle + +Here, DDRC_WIDTH =3D 64 bytes. + +The current driver does not support sampling. So "perf record" is +unsupported. Also attach to a task is unsupported as the events are all +uncore. diff --git a/Documentation/admin-guide/perf/index.rst b/Documentation/admin= -guide/perf/index.rst index 69b23f087c05..823db08863db 100644 --- a/Documentation/admin-guide/perf/index.rst +++ b/Documentation/admin-guide/perf/index.rst @@ -17,3 +17,4 @@ Performance monitor support xgene-pmu arm_dsu_pmu thunderx2-pmu + alibaba_pmu --=20 2.20.1.9.gb50a0d7 From nobody Mon Apr 27 03:25:06 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9A181C43334 for ; Fri, 17 Jun 2022 11:18:51 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1381585AbiFQLSu (ORCPT ); Fri, 17 Jun 2022 07:18:50 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47924 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1380688AbiFQLSl (ORCPT ); Fri, 17 Jun 2022 07:18:41 -0400 Received: from out30-130.freemail.mail.aliyun.com (out30-130.freemail.mail.aliyun.com [115.124.30.130]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 7EF126A018 for ; Fri, 17 Jun 2022 04:18:39 -0700 (PDT) X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R751e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=ay29a033018045170;MF=xueshuai@linux.alibaba.com;NM=1;PH=DS;RN=8;SR=0;TI=SMTPD_---0VGernk5_1655464716; Received: from localhost.localdomain(mailfrom:xueshuai@linux.alibaba.com fp:SMTPD_---0VGernk5_1655464716) by smtp.aliyun-inc.com; Fri, 17 Jun 2022 19:18:37 +0800 From: Shuai Xue To: linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, will@kernel.org, mark.rutland@arm.com Cc: baolin.wang@linux.alibaba.com, yaohongbo@linux.alibaba.com, nengchen@linux.alibaba.com, zhuo.song@linux.alibaba.com Subject: [PATCH v1 1/3] docs: perf: Add description for Alibaba's T-Head PMU driver Date: Fri, 17 Jun 2022 19:18:23 +0800 Message-Id: <20220617111825.92911-2-xueshuai@linux.alibaba.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20220617111825.92911-1-xueshuai@linux.alibaba.com> References: <20220617111825.92911-1-xueshuai@linux.alibaba.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" Alibaba's T-Head SoC implements uncore PMU for performance and functional debugging to facilitate system maintenance. Document it to provide guidance on how to use it. Signed-off-by: Shuai Xue Reported-by: kernel test robot Reviewed-by: Baolin Wang --- .../admin-guide/perf/alibaba_pmu.rst | 94 +++++++++++++++++++ Documentation/admin-guide/perf/index.rst | 1 + 2 files changed, 95 insertions(+) create mode 100644 Documentation/admin-guide/perf/alibaba_pmu.rst diff --git a/Documentation/admin-guide/perf/alibaba_pmu.rst b/Documentation= /admin-guide/perf/alibaba_pmu.rst new file mode 100644 index 000000000000..337f4f1d4c54 --- /dev/null +++ b/Documentation/admin-guide/perf/alibaba_pmu.rst @@ -0,0 +1,94 @@ +=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D +Alibaba's T-Head SoC Uncore Performance Monitoring Unit (PMU) +=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D + +The Yitian 710, custom-built by Alibaba Group's chip development business, +T-Head, implements uncore PMU for performance and functional debugging to +facilitate system maintenance. + +DDR Sub-System Driveway (DRW) PMU Driver +=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D + +The Yitian 710 SoC supports the most advanced DDR5/4 DRAM to provide +tremendous memory bandwidth for cloud computing and HPC. The Driveway is a +module which is a bridge between a router of mesh network and memory +controller. It provides various functions like secure control, address map +and so on. + +Yitian 710 employs eight DDR5/4 channels, four on each die. Each channel is +independent of others to service system memory requests. And one DDR5 +channel is split into two independent sub-channels. The DDR Sub-System +Driveway implements separate PMUs for each sub-channel to monitor various +performance metrics. + +The Driveway PMU devices are named as ali_drw_ with perf. +For example, ali_drw_21000 and ali_drw_21080 are two PMU devices for two +sub-channels of the same channel in die 0. And the PMU device of die 1 is +prefixed with ali_drw_400XXXXX, e.g. ali_drw_40021000. + +Each sub-channel has 36 PMU counters in total, which is classified into +four groups: + +- Group 0: PMU Cycle Counter. This group has one pair of counters + pmu_cycle_cnt_low and pmu_cycle_cnt_high, that is used as the cycle count + based on DDRC core clock. + +- Group 1: PMU Bandwidth Counters. This group has 8 counters that are used + to count the total access number of either the eight bank groups in a + selected rank, or four ranks separately in the first 4 counters. The base + transfer unit is 64B. + +- Group 2: PMU Retry Counters. This group has 10 counters, that intend to + count the total retry number of each type of uncorrectable error. + +- Group 3: PMU Common Counters. This group has 16 counters, that are used + to count the common events. + +For now, the Driveway PMU driver only uses counters in group 0 and group 3. + +Example usage of counting memory data bandwidth:: + + perf stat \ + -e ali_drw_21000/perf_hif_wr/ \ + -e ali_drw_21000/perf_hif_rd/ \ + -e ali_drw_21000/perf_hif_rmw/ \ + -e ali_drw_21000/perf_cycle/ \ + -e ali_drw_21080/perf_hif_wr/ \ + -e ali_drw_21080/perf_hif_rd/ \ + -e ali_drw_21080/perf_hif_rmw/ \ + -e ali_drw_21080/perf_cycle/ \ + -e ali_drw_23000/perf_hif_wr/ \ + -e ali_drw_23000/perf_hif_rd/ \ + -e ali_drw_23000/perf_hif_rmw/ \ + -e ali_drw_23000/perf_cycle/ \ + -e ali_drw_23080/perf_hif_wr/ \ + -e ali_drw_23080/perf_hif_rd/ \ + -e ali_drw_23080/perf_hif_rmw/ \ + -e ali_drw_23080/perf_cycle/ \ + -e ali_drw_25000/perf_hif_wr/ \ + -e ali_drw_25000/perf_hif_rd/ \ + -e ali_drw_25000/perf_hif_rmw/ \ + -e ali_drw_25000/perf_cycle/ \ + -e ali_drw_25080/perf_hif_wr/ \ + -e ali_drw_25080/perf_hif_rd/ \ + -e ali_drw_25080/perf_hif_rmw/ \ + -e ali_drw_25080/perf_cycle/ \ + -e ali_drw_27000/perf_hif_wr/ \ + -e ali_drw_27000/perf_hif_rd/ \ + -e ali_drw_27000/perf_hif_rmw/ \ + -e ali_drw_27000/perf_cycle/ \ + -e ali_drw_27080/perf_hif_wr/ \ + -e ali_drw_27080/perf_hif_rd/ \ + -e ali_drw_27080/perf_hif_rmw/ \ + -e ali_drw_27080/perf_cycle/ -- sleep 10 + +The average DRAM bandwidth can be calculated as follows: + +- Read Bandwidth =3D perf_hif_rd * DDRC_WIDTH * DDRC_Freq / DDRC_Cycle +- Write Bandwidth =3D (perf_hif_wr + perf_hif_rmw) * DDRC_WIDTH * DDRC_Fre= q / DDRC_Cycle + +Here, DDRC_WIDTH =3D 64 bytes. + +The current driver does not support sampling. So "perf record" is +unsupported. Also attach to a task is unsupported as the events are all +uncore. diff --git a/Documentation/admin-guide/perf/index.rst b/Documentation/admin= -guide/perf/index.rst index 69b23f087c05..bf466ae91c6c 100644 --- a/Documentation/admin-guide/perf/index.rst +++ b/Documentation/admin-guide/perf/index.rst @@ -17,3 +17,4 @@ Performance monitor support xgene-pmu arm_dsu_pmu thunderx2-pmu + thead_pmu --=20 2.20.1.12.g72788fdb From nobody Mon Apr 27 03:25:06 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6F91EC54EE9 for ; Wed, 14 Sep 2022 02:23:43 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229875AbiINCXl (ORCPT ); Tue, 13 Sep 2022 22:23:41 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:51704 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229999AbiINCXh (ORCPT ); Tue, 13 Sep 2022 22:23:37 -0400 Received: from out30-42.freemail.mail.aliyun.com (out30-42.freemail.mail.aliyun.com [115.124.30.42]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D6DF55D12F for ; Tue, 13 Sep 2022 19:23:35 -0700 (PDT) X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R891e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=ay29a033018046056;MF=xueshuai@linux.alibaba.com;NM=1;PH=DS;RN=10;SR=0;TI=SMTPD_---0VPl45k7_1663122211; Received: from localhost.localdomain(mailfrom:xueshuai@linux.alibaba.com fp:SMTPD_---0VPl45k7_1663122211) by smtp.aliyun-inc.com; Wed, 14 Sep 2022 10:23:32 +0800 From: Shuai Xue To: will@kernel.org, Jonathan.Cameron@Huawei.com, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org Cc: rdunlap@infradead.org, robin.murphy@arm.com, mark.rutland@arm.com, baolin.wang@linux.alibaba.com, zhuo.song@linux.alibaba.com, xueshuai@linux.alibaba.com Subject: [RESEND PATCH v4 1/3] docs: perf: Add description for Alibaba's T-Head PMU driver Date: Wed, 14 Sep 2022 10:23:24 +0800 Message-Id: <20220914022326.88550-2-xueshuai@linux.alibaba.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20220617111825.92911-1-xueshuai@linux.alibaba.com> References: <20220617111825.92911-1-xueshuai@linux.alibaba.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" Alibaba's T-Head SoC implements uncore PMU for performance and functional debugging to facilitate system maintenance. Document it to provide guidance on how to use it. Signed-off-by: Shuai Xue Reviewed-by: Jonathan Cameron Reviewed-by: Baolin Wang --- .../admin-guide/perf/alibaba_pmu.rst | 100 ++++++++++++++++++ Documentation/admin-guide/perf/index.rst | 1 + 2 files changed, 101 insertions(+) create mode 100644 Documentation/admin-guide/perf/alibaba_pmu.rst diff --git a/Documentation/admin-guide/perf/alibaba_pmu.rst b/Documentation= /admin-guide/perf/alibaba_pmu.rst new file mode 100644 index 000000000000..11de998bb480 --- /dev/null +++ b/Documentation/admin-guide/perf/alibaba_pmu.rst @@ -0,0 +1,100 @@ +=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D +Alibaba's T-Head SoC Uncore Performance Monitoring Unit (PMU) +=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D + +The Yitian 710, custom-built by Alibaba Group's chip development business, +T-Head, implements uncore PMU for performance and functional debugging to +facilitate system maintenance. + +DDR Sub-System Driveway (DRW) PMU Driver +=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D + +Yitian 710 employs eight DDR5/4 channels, four on each die. Each DDR5 chan= nel +is independent of others to service system memory requests. And one DDR5 +channel is split into two independent sub-channels. The DDR Sub-System Dri= veway +implements separate PMUs for each sub-channel to monitor various performan= ce +metrics. + +The Driveway PMU devices are named as ali_drw_ with perf. +For example, ali_drw_21000 and ali_drw_21080 are two PMU devices for two +sub-channels of the same channel in die 0. And the PMU device of die 1 is +prefixed with ali_drw_400XXXXX, e.g. ali_drw_40021000. + +Each sub-channel has 36 PMU counters in total, which is classified into +four groups: + +- Group 0: PMU Cycle Counter. This group has one pair of counters + pmu_cycle_cnt_low and pmu_cycle_cnt_high, that is used as the cycle count + based on DDRC core clock. + +- Group 1: PMU Bandwidth Counters. This group has 8 counters that are used + to count the total access number of either the eight bank groups in a + selected rank, or four ranks separately in the first 4 counters. The base + transfer unit is 64B. + +- Group 2: PMU Retry Counters. This group has 10 counters, that intend to + count the total retry number of each type of uncorrectable error. + +- Group 3: PMU Common Counters. This group has 16 counters, that are used + to count the common events. + +For now, the Driveway PMU driver only uses counters in group 0 and group 3. + +The DDR Controller (DDRCTL) and DDR PHY combine to create a complete solut= ion +for connecting an SoC application bus to DDR memory devices. The DDRCTL +receives transactions Host Interface (HIF) which is custom-defined by Syno= psys. +These transactions are queued internally and scheduled for access while +satisfying the SDRAM protocol timing requirements, transaction priorities,= and +dependencies between the transactions. The DDRCTL in turn issues commands = on +the DDR PHY Interface (DFI) to the PHY module, which launches and captures= data +to and from the SDRAM. The driveway PMUs have hardware logic to gather +statistics and performance logging signals on HIF, DFI, etc. + +By counting the READ, WRITE and RMW commands sent to the DDRC through the = HIF +interface, we could calculate the bandwidth. Example usage of counting mem= ory +data bandwidth:: + + perf stat \ + -e ali_drw_21000/hif_wr/ \ + -e ali_drw_21000/hif_rd/ \ + -e ali_drw_21000/hif_rmw/ \ + -e ali_drw_21000/cycle/ \ + -e ali_drw_21080/hif_wr/ \ + -e ali_drw_21080/hif_rd/ \ + -e ali_drw_21080/hif_rmw/ \ + -e ali_drw_21080/cycle/ \ + -e ali_drw_23000/hif_wr/ \ + -e ali_drw_23000/hif_rd/ \ + -e ali_drw_23000/hif_rmw/ \ + -e ali_drw_23000/cycle/ \ + -e ali_drw_23080/hif_wr/ \ + -e ali_drw_23080/hif_rd/ \ + -e ali_drw_23080/hif_rmw/ \ + -e ali_drw_23080/cycle/ \ + -e ali_drw_25000/hif_wr/ \ + -e ali_drw_25000/hif_rd/ \ + -e ali_drw_25000/hif_rmw/ \ + -e ali_drw_25000/cycle/ \ + -e ali_drw_25080/hif_wr/ \ + -e ali_drw_25080/hif_rd/ \ + -e ali_drw_25080/hif_rmw/ \ + -e ali_drw_25080/cycle/ \ + -e ali_drw_27000/hif_wr/ \ + -e ali_drw_27000/hif_rd/ \ + -e ali_drw_27000/hif_rmw/ \ + -e ali_drw_27000/cycle/ \ + -e ali_drw_27080/hif_wr/ \ + -e ali_drw_27080/hif_rd/ \ + -e ali_drw_27080/hif_rmw/ \ + -e ali_drw_27080/cycle/ -- sleep 10 + +The average DRAM bandwidth can be calculated as follows: + +- Read Bandwidth =3D perf_hif_rd * DDRC_WIDTH * DDRC_Freq / DDRC_Cycle +- Write Bandwidth =3D (perf_hif_wr + perf_hif_rmw) * DDRC_WIDTH * DDRC_Fre= q / DDRC_Cycle + +Here, DDRC_WIDTH =3D 64 bytes. + +The current driver does not support sampling. So "perf record" is +unsupported. Also attach to a task is unsupported as the events are all +uncore. diff --git a/Documentation/admin-guide/perf/index.rst b/Documentation/admin= -guide/perf/index.rst index 9c9ece88ce53..793e1970bc05 100644 --- a/Documentation/admin-guide/perf/index.rst +++ b/Documentation/admin-guide/perf/index.rst @@ -18,3 +18,4 @@ Performance monitor support xgene-pmu arm_dsu_pmu thunderx2-pmu + alibaba_pmu --=20 2.20.1.12.g72788fdb From nobody Mon Apr 27 03:25:06 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 82814C25B08 for ; Thu, 18 Aug 2022 03:18:45 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S243067AbiHRDSn (ORCPT ); Wed, 17 Aug 2022 23:18:43 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56720 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S243035AbiHRDSi (ORCPT ); Wed, 17 Aug 2022 23:18:38 -0400 Received: from out30-56.freemail.mail.aliyun.com (out30-56.freemail.mail.aliyun.com [115.124.30.56]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id ADC925FC4 for ; Wed, 17 Aug 2022 20:18:34 -0700 (PDT) X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R191e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=ay29a033018046056;MF=xueshuai@linux.alibaba.com;NM=1;PH=DS;RN=10;SR=0;TI=SMTPD_---0VMZAdEL_1660792709; Received: from localhost.localdomain(mailfrom:xueshuai@linux.alibaba.com fp:SMTPD_---0VMZAdEL_1660792709) by smtp.aliyun-inc.com; Thu, 18 Aug 2022 11:18:30 +0800 From: Shuai Xue To: will@kernel.org, Jonathan.Cameron@Huawei.com, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org Cc: rdunlap@infradead.org, robin.murphy@arm.com, mark.rutland@arm.com, baolin.wang@linux.alibaba.com, zhuo.song@linux.alibaba.com, xueshuai@linux.alibaba.com Subject: [PATCH v4 1/3] docs: perf: Add description for Alibaba's T-Head PMU driver Date: Thu, 18 Aug 2022 11:18:20 +0800 Message-Id: <20220818031822.38415-2-xueshuai@linux.alibaba.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20220617111825.92911-1-xueshuai@linux.alibaba.com> References: <20220617111825.92911-1-xueshuai@linux.alibaba.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" Alibaba's T-Head SoC implements uncore PMU for performance and functional debugging to facilitate system maintenance. Document it to provide guidance on how to use it. Signed-off-by: Shuai Xue Reviewed-by: Jonathan Cameron Reviewed-by: Baolin Wang --- .../admin-guide/perf/alibaba_pmu.rst | 100 ++++++++++++++++++ Documentation/admin-guide/perf/index.rst | 1 + 2 files changed, 101 insertions(+) create mode 100644 Documentation/admin-guide/perf/alibaba_pmu.rst diff --git a/Documentation/admin-guide/perf/alibaba_pmu.rst b/Documentation= /admin-guide/perf/alibaba_pmu.rst new file mode 100644 index 000000000000..11de998bb480 --- /dev/null +++ b/Documentation/admin-guide/perf/alibaba_pmu.rst @@ -0,0 +1,100 @@ +=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D +Alibaba's T-Head SoC Uncore Performance Monitoring Unit (PMU) +=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D + +The Yitian 710, custom-built by Alibaba Group's chip development business, +T-Head, implements uncore PMU for performance and functional debugging to +facilitate system maintenance. + +DDR Sub-System Driveway (DRW) PMU Driver +=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D + +Yitian 710 employs eight DDR5/4 channels, four on each die. Each DDR5 chan= nel +is independent of others to service system memory requests. And one DDR5 +channel is split into two independent sub-channels. The DDR Sub-System Dri= veway +implements separate PMUs for each sub-channel to monitor various performan= ce +metrics. + +The Driveway PMU devices are named as ali_drw_ with perf. +For example, ali_drw_21000 and ali_drw_21080 are two PMU devices for two +sub-channels of the same channel in die 0. And the PMU device of die 1 is +prefixed with ali_drw_400XXXXX, e.g. ali_drw_40021000. + +Each sub-channel has 36 PMU counters in total, which is classified into +four groups: + +- Group 0: PMU Cycle Counter. This group has one pair of counters + pmu_cycle_cnt_low and pmu_cycle_cnt_high, that is used as the cycle count + based on DDRC core clock. + +- Group 1: PMU Bandwidth Counters. This group has 8 counters that are used + to count the total access number of either the eight bank groups in a + selected rank, or four ranks separately in the first 4 counters. The base + transfer unit is 64B. + +- Group 2: PMU Retry Counters. This group has 10 counters, that intend to + count the total retry number of each type of uncorrectable error. + +- Group 3: PMU Common Counters. This group has 16 counters, that are used + to count the common events. + +For now, the Driveway PMU driver only uses counters in group 0 and group 3. + +The DDR Controller (DDRCTL) and DDR PHY combine to create a complete solut= ion +for connecting an SoC application bus to DDR memory devices. The DDRCTL +receives transactions Host Interface (HIF) which is custom-defined by Syno= psys. +These transactions are queued internally and scheduled for access while +satisfying the SDRAM protocol timing requirements, transaction priorities,= and +dependencies between the transactions. The DDRCTL in turn issues commands = on +the DDR PHY Interface (DFI) to the PHY module, which launches and captures= data +to and from the SDRAM. The driveway PMUs have hardware logic to gather +statistics and performance logging signals on HIF, DFI, etc. + +By counting the READ, WRITE and RMW commands sent to the DDRC through the = HIF +interface, we could calculate the bandwidth. Example usage of counting mem= ory +data bandwidth:: + + perf stat \ + -e ali_drw_21000/hif_wr/ \ + -e ali_drw_21000/hif_rd/ \ + -e ali_drw_21000/hif_rmw/ \ + -e ali_drw_21000/cycle/ \ + -e ali_drw_21080/hif_wr/ \ + -e ali_drw_21080/hif_rd/ \ + -e ali_drw_21080/hif_rmw/ \ + -e ali_drw_21080/cycle/ \ + -e ali_drw_23000/hif_wr/ \ + -e ali_drw_23000/hif_rd/ \ + -e ali_drw_23000/hif_rmw/ \ + -e ali_drw_23000/cycle/ \ + -e ali_drw_23080/hif_wr/ \ + -e ali_drw_23080/hif_rd/ \ + -e ali_drw_23080/hif_rmw/ \ + -e ali_drw_23080/cycle/ \ + -e ali_drw_25000/hif_wr/ \ + -e ali_drw_25000/hif_rd/ \ + -e ali_drw_25000/hif_rmw/ \ + -e ali_drw_25000/cycle/ \ + -e ali_drw_25080/hif_wr/ \ + -e ali_drw_25080/hif_rd/ \ + -e ali_drw_25080/hif_rmw/ \ + -e ali_drw_25080/cycle/ \ + -e ali_drw_27000/hif_wr/ \ + -e ali_drw_27000/hif_rd/ \ + -e ali_drw_27000/hif_rmw/ \ + -e ali_drw_27000/cycle/ \ + -e ali_drw_27080/hif_wr/ \ + -e ali_drw_27080/hif_rd/ \ + -e ali_drw_27080/hif_rmw/ \ + -e ali_drw_27080/cycle/ -- sleep 10 + +The average DRAM bandwidth can be calculated as follows: + +- Read Bandwidth =3D perf_hif_rd * DDRC_WIDTH * DDRC_Freq / DDRC_Cycle +- Write Bandwidth =3D (perf_hif_wr + perf_hif_rmw) * DDRC_WIDTH * DDRC_Fre= q / DDRC_Cycle + +Here, DDRC_WIDTH =3D 64 bytes. + +The current driver does not support sampling. So "perf record" is +unsupported. Also attach to a task is unsupported as the events are all +uncore. diff --git a/Documentation/admin-guide/perf/index.rst b/Documentation/admin= -guide/perf/index.rst index 9c9ece88ce53..793e1970bc05 100644 --- a/Documentation/admin-guide/perf/index.rst +++ b/Documentation/admin-guide/perf/index.rst @@ -18,3 +18,4 @@ Performance monitor support xgene-pmu arm_dsu_pmu thunderx2-pmu + alibaba_pmu --=20 2.20.1.12.g72788fdb From nobody Mon Apr 27 03:25:06 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0D3B8C433EF for ; Tue, 28 Jun 2022 03:16:51 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230043AbiF1DQt (ORCPT ); Mon, 27 Jun 2022 23:16:49 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38588 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229928AbiF1DQo (ORCPT ); Mon, 27 Jun 2022 23:16:44 -0400 Received: from out30-54.freemail.mail.aliyun.com (out30-54.freemail.mail.aliyun.com [115.124.30.54]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 447BC1CFF0 for ; Mon, 27 Jun 2022 20:16:43 -0700 (PDT) X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R181e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=ay29a033018046059;MF=xueshuai@linux.alibaba.com;NM=1;PH=DS;RN=9;SR=0;TI=SMTPD_---0VHfG1xv_1656386199; Received: from localhost.localdomain(mailfrom:xueshuai@linux.alibaba.com fp:SMTPD_---0VHfG1xv_1656386199) by smtp.aliyun-inc.com; Tue, 28 Jun 2022 11:16:40 +0800 From: Shuai Xue To: linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, will@kernel.org, mark.rutland@arm.com Cc: baolin.wang@linux.alibaba.com, yaohongbo@linux.alibaba.com, nengchen@linux.alibaba.com, zhuo.song@linux.alibaba.com, xueshuai@linux.alibaba.com Subject: [PATCH v2 1/3] docs: perf: Add description for Alibaba's T-Head PMU driver Date: Tue, 28 Jun 2022 11:16:28 +0800 Message-Id: <20220628031630.18674-2-xueshuai@linux.alibaba.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20220617111825.92911-1-xueshuai@linux.alibaba.com> References: <20220617111825.92911-1-xueshuai@linux.alibaba.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" Alibaba's T-Head SoC implements uncore PMU for performance and functional debugging to facilitate system maintenance. Document it to provide guidance on how to use it. Signed-off-by: Shuai Xue Reviewed-by: Baolin Wang --- .../admin-guide/perf/alibaba_pmu.rst | 97 +++++++++++++++++++ Documentation/admin-guide/perf/index.rst | 1 + 2 files changed, 98 insertions(+) create mode 100644 Documentation/admin-guide/perf/alibaba_pmu.rst diff --git a/Documentation/admin-guide/perf/alibaba_pmu.rst b/Documentation= /admin-guide/perf/alibaba_pmu.rst new file mode 100644 index 000000000000..784885ce0dbb --- /dev/null +++ b/Documentation/admin-guide/perf/alibaba_pmu.rst @@ -0,0 +1,97 @@ +=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D +Alibaba's T-Head SoC Uncore Performance Monitoring Unit (PMU) +=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D + +The Yitian 710, custom-built by Alibaba Group's chip development business, +T-Head, implements uncore PMU for performance and functional debugging to +facilitate system maintenance. + +DDR Sub-System Driveway (DRW) PMU Driver +=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D + +Yitian 710 employs eight DDR5/4 channels, four on each die. Each DDR5 chan= nel +is independent of others to service system memory requests. And one DDR5 +channel is split into two independent sub-channels. The DDR Sub-System Dri= veway +implements separate PMUs for each sub-channel to monitor various performan= ce +metrics. + +The Driveway PMU devices are named as ali_drw_ with perf. +For example, ali_drw_21000 and ali_drw_21080 are two PMU devices for two +sub-channels of the same channel in die 0. And the PMU device of die 1 is +prefixed with ali_drw_400XXXXX, e.g. ali_drw_40021000. + +Each sub-channel has 36 PMU counters in total, which is classified into +four groups: + +- Group 0: PMU Cycle Counter. This group has one pair of counters + pmu_cycle_cnt_low and pmu_cycle_cnt_high, that is used as the cycle count + based on DDRC core clock. + +- Group 1: PMU Bandwidth Counters. This group has 8 counters that are used + to count the total access number of either the eight bank groups in a + selected rank, or four ranks separately in the first 4 counters. The base + transfer unit is 64B. + +- Group 2: PMU Retry Counters. This group has 10 counters, that intend to + count the total retry number of each type of uncorrectable error. + +- Group 3: PMU Common Counters. This group has 16 counters, that are used + to count the common events. + +For now, the Driveway PMU driver only uses counters in group 0 and group 3. + +The DDR Controller (DDRCTL) and DDR PHY combine to create a complete solut= ion +for connecting an SoC application bus to DDR memory devices. The DDRCTL +receives transactions Host Interface (HIF) which is custom-defined by Syno= psys. +These transactions are queued internally and scheduled for access while +satisfying the SDRAM protocol timing requirements, transaction priorities,= and +dependencies between the transactions. The DDRCTL in turn issues commands = on +the DDR PHY Interface (DFI) to the PHY module, which launches and captures= data +to and from the SDRAM. The driveway PMUs have hardware logic to gather sta= tistics and performance logging signals on HIF, DFI, etc. + +By counting the READ, WRITE and RMW commands sent to the DDRC through the = HIF interface, we could calculate the bandwidth. Example usage of counting = memory data bandwidth:: + + perf stat \ + -e ali_drw_21000/hif_wr/ \ + -e ali_drw_21000/hif_rd/ \ + -e ali_drw_21000/hif_rmw/ \ + -e ali_drw_21000/cycle/ \ + -e ali_drw_21080/hif_wr/ \ + -e ali_drw_21080/hif_rd/ \ + -e ali_drw_21080/hif_rmw/ \ + -e ali_drw_21080/cycle/ \ + -e ali_drw_23000/hif_wr/ \ + -e ali_drw_23000/hif_rd/ \ + -e ali_drw_23000/hif_rmw/ \ + -e ali_drw_23000/cycle/ \ + -e ali_drw_23080/hif_wr/ \ + -e ali_drw_23080/hif_rd/ \ + -e ali_drw_23080/hif_rmw/ \ + -e ali_drw_23080/cycle/ \ + -e ali_drw_25000/hif_wr/ \ + -e ali_drw_25000/hif_rd/ \ + -e ali_drw_25000/hif_rmw/ \ + -e ali_drw_25000/cycle/ \ + -e ali_drw_25080/hif_wr/ \ + -e ali_drw_25080/hif_rd/ \ + -e ali_drw_25080/hif_rmw/ \ + -e ali_drw_25080/cycle/ \ + -e ali_drw_27000/hif_wr/ \ + -e ali_drw_27000/hif_rd/ \ + -e ali_drw_27000/hif_rmw/ \ + -e ali_drw_27000/cycle/ \ + -e ali_drw_27080/hif_wr/ \ + -e ali_drw_27080/hif_rd/ \ + -e ali_drw_27080/hif_rmw/ \ + -e ali_drw_27080/cycle/ -- sleep 10 + +The average DRAM bandwidth can be calculated as follows: + +- Read Bandwidth =3D perf_hif_rd * DDRC_WIDTH * DDRC_Freq / DDRC_Cycle +- Write Bandwidth =3D (perf_hif_wr + perf_hif_rmw) * DDRC_WIDTH * DDRC_Fre= q / DDRC_Cycle + +Here, DDRC_WIDTH =3D 64 bytes. + +The current driver does not support sampling. So "perf record" is +unsupported. Also attach to a task is unsupported as the events are all +uncore. diff --git a/Documentation/admin-guide/perf/index.rst b/Documentation/admin= -guide/perf/index.rst index 69b23f087c05..bf466ae91c6c 100644 --- a/Documentation/admin-guide/perf/index.rst +++ b/Documentation/admin-guide/perf/index.rst @@ -17,3 +17,4 @@ Performance monitor support xgene-pmu arm_dsu_pmu thunderx2-pmu + thead_pmu --=20 2.20.1.9.gb50a0d7 From nobody Mon Apr 27 03:25:06 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 007C6C433EF for ; Fri, 15 Jul 2022 15:13:44 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234657AbiGOPNm (ORCPT ); Fri, 15 Jul 2022 11:13:42 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:51334 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233559AbiGOPNf (ORCPT ); Fri, 15 Jul 2022 11:13:35 -0400 Received: from out30-131.freemail.mail.aliyun.com (out30-131.freemail.mail.aliyun.com [115.124.30.131]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 893D97CB4D for ; Fri, 15 Jul 2022 08:13:31 -0700 (PDT) X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R191e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=ay29a033018045170;MF=xueshuai@linux.alibaba.com;NM=1;PH=DS;RN=10;SR=0;TI=SMTPD_---0VJPkRMS_1657898004; Received: from localhost.localdomain(mailfrom:xueshuai@linux.alibaba.com fp:SMTPD_---0VJPkRMS_1657898004) by smtp.aliyun-inc.com; Fri, 15 Jul 2022 23:13:26 +0800 From: Shuai Xue To: Jonathan.Cameron@Huawei.com, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, will@kernel.org, mark.rutland@arm.com Cc: baolin.wang@linux.alibaba.com, yaohongbo@linux.alibaba.com, nengchen@linux.alibaba.com, zhuo.song@linux.alibaba.com, xueshuai@linux.alibaba.com Subject: [RESEND PATCH v2 2/3] drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710 SoC Date: Fri, 15 Jul 2022 23:13:09 +0800 Message-Id: <20220715151310.90091-3-xueshuai@linux.alibaba.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20220617111825.92911-1-xueshuai@linux.alibaba.com> References: <20220617111825.92911-1-xueshuai@linux.alibaba.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" Add the DDR Sub-System Driveway Performance Monitoring Unit (PMU) driver support for Alibaba T-Head Yitian 710 SoC chip. Yitian supports DDR5/4 DRAM and targets cloud computing and HPC. Each PMU is registered as a device in /sys/bus/event_source/devices, and users can select event to monitor in each sub-channel, independently. For example, ali_drw_21000 and ali_drw_21080 are two PMU devices for two sub-channels of the same channel in die 0. And the PMU device of die 1 is prefixed with ali_drw_400XXXXX, e.g. ali_drw_40021000. Due to hardware limitation, one of DDRSS Driveway PMU overflow interrupt shares the same irq number with MPAM ERR_IRQ. To register DDRSS PMU and MPAM drivers successfully, add IRQF_SHARED flag. Signed-off-by: Shuai Xue Co-developed-by: Hongbo Yao Signed-off-by: Hongbo Yao Co-developed-by: Neng Chen Signed-off-by: Neng Chen Reviewed-by: Baolin Wang Reviewed-by: Jonathan Cameron --- drivers/perf/Kconfig | 8 + drivers/perf/Makefile | 1 + drivers/perf/alibaba_uncore_drw_pmu.c | 793 ++++++++++++++++++++++++++ 3 files changed, 802 insertions(+) create mode 100644 drivers/perf/alibaba_uncore_drw_pmu.c diff --git a/drivers/perf/Kconfig b/drivers/perf/Kconfig index 1e2d69453771..dfafba4cb066 100644 --- a/drivers/perf/Kconfig +++ b/drivers/perf/Kconfig @@ -183,6 +183,14 @@ config APPLE_M1_CPU_PMU Provides support for the non-architectural CPU PMUs present on the Apple M1 SoCs and derivatives. =20 +config ALIBABA_UNCORE_DRW_PMU + tristate "Alibaba T-Head Yitian 710 DDR Sub-system Driveway PMU driver" + depends on (ARM64 && ACPI) + default m + help + Support for Driveway PMU events monitoring on Yitian 710 DDR + Sub-system. + source "drivers/perf/hisilicon/Kconfig" =20 config MARVELL_CN10K_DDR_PMU diff --git a/drivers/perf/Makefile b/drivers/perf/Makefile index 57a279c61df5..050d04ee19dd 100644 --- a/drivers/perf/Makefile +++ b/drivers/perf/Makefile @@ -20,3 +20,4 @@ obj-$(CONFIG_ARM_DMC620_PMU) +=3D arm_dmc620_pmu.o obj-$(CONFIG_MARVELL_CN10K_TAD_PMU) +=3D marvell_cn10k_tad_pmu.o obj-$(CONFIG_MARVELL_CN10K_DDR_PMU) +=3D marvell_cn10k_ddr_pmu.o obj-$(CONFIG_APPLE_M1_CPU_PMU) +=3D apple_m1_cpu_pmu.o +obj-$(CONFIG_ALIBABA_UNCORE_DRW_PMU) +=3D alibaba_uncore_drw_pmu.o diff --git a/drivers/perf/alibaba_uncore_drw_pmu.c b/drivers/perf/alibaba_u= ncore_drw_pmu.c new file mode 100644 index 000000000000..4022bc50ffae --- /dev/null +++ b/drivers/perf/alibaba_uncore_drw_pmu.c @@ -0,0 +1,793 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Alibaba DDR Sub-System Driveway PMU driver + * + * Copyright (C) 2022 Alibaba Inc + */ + +#define ALI_DRW_PMUNAME "ali_drw" +#define ALI_DRW_DRVNAME ALI_DRW_PMUNAME "_pmu" +#define pr_fmt(fmt) ALI_DRW_DRVNAME ": " fmt + +#include +#include +#include + +#define ALI_DRW_PMU_COMMON_MAX_COUNTERS 16 +#define ALI_DRW_PMU_TEST_SEL_COMMON_COUNTER_BASE 19 + +#define ALI_DRW_PMU_PA_SHIFT 12 +#define ALI_DRW_PMU_CNT_INIT 0x00000000 +#define ALI_DRW_CNT_MAX_PERIOD 0xffffffff +#define ALI_DRW_PMU_CYCLE_EVT_ID 0x80 + +#define ALI_DRW_PMU_CNT_CTRL 0xC00 +#define ALI_DRW_PMU_CNT_RST BIT(2) +#define ALI_DRW_PMU_CNT_STOP BIT(1) +#define ALI_DRW_PMU_CNT_START BIT(0) + +#define ALI_DRW_PMU_CNT_STATE 0xC04 +#define ALI_DRW_PMU_TEST_CTRL 0xC08 +#define ALI_DRW_PMU_CNT_PRELOAD 0xC0C + +#define ALI_DRW_PMU_CYCLE_CNT_HIGH_MASK GENMASK(23, 0) +#define ALI_DRW_PMU_CYCLE_CNT_LOW_MASK GENMASK(31, 0) +#define ALI_DRW_PMU_CYCLE_CNT_HIGH 0xC10 +#define ALI_DRW_PMU_CYCLE_CNT_LOW 0xC14 + +/* PMU EVENT SEL 0-3 are paired in 32-bit registers on a 4-byte stride */ +#define ALI_DRW_PMU_EVENT_SEL0 0xC68 +/* counter 0-3 use sel0, counter 4-7 use sel1...*/ +#define ALI_DRW_PMU_EVENT_SELn(n) \ + (ALI_DRW_PMU_EVENT_SEL0 + (n / 4) * 0x4) +#define ALI_DRW_PMCOM_CNT_EN BIT(7) +#define ALI_DRW_PMCOM_CNT_EVENT_MASK GENMASK(5, 0) +#define ALI_DRW_PMCOM_CNT_EVENT_OFFSET(n) \ + (8 * (n % 4)) + +/* PMU COMMON COUNTER 0-15, are paired in 32-bit registers on a 4-byte str= ide */ +#define ALI_DRW_PMU_COMMON_COUNTER0 0xC78 +#define ALI_DRW_PMU_COMMON_COUNTERn(n) \ + (ALI_DRW_PMU_COMMON_COUNTER0 + 0x4 * (n)) + +#define ALI_DRW_PMU_OV_INTR_ENABLE_CTL 0xCB8 +#define ALI_DRW_PMU_OV_INTR_DISABLE_CTL 0xCBC +#define ALI_DRW_PMU_OV_INTR_ENABLE_STATUS 0xCC0 +#define ALI_DRW_PMU_OV_INTR_CLR 0xCC4 +#define ALI_DRW_PMU_OV_INTR_STATUS 0xCC8 +#define ALI_DRW_PMCOM_CNT_OV_INTR_MASK GENMASK(23, 8) +#define ALI_DRW_PMBW_CNT_OV_INTR_MASK GENMASK(7, 0) +#define ALI_DRW_PMU_OV_INTR_MASK GENMASK_ULL(63, 0) + +static int ali_drw_cpuhp_state_num; + +static LIST_HEAD(ali_drw_pmu_irqs); +static DEFINE_MUTEX(ali_drw_pmu_irqs_lock); + +struct ali_drw_pmu_irq { + struct hlist_node node; + struct list_head irqs_node; + struct list_head pmus_node; + int irq_num; + int cpu; + refcount_t refcount; +}; + +struct ali_drw_pmu { + void __iomem *cfg_base; + struct device *dev; + + struct list_head pmus_node; + struct ali_drw_pmu_irq *irq; + int irq_num; + int cpu; + DECLARE_BITMAP(used_mask, ALI_DRW_PMU_COMMON_MAX_COUNTERS); + struct perf_event *events[ALI_DRW_PMU_COMMON_MAX_COUNTERS]; + int evtids[ALI_DRW_PMU_COMMON_MAX_COUNTERS]; + + struct pmu pmu; +}; + +#define to_ali_drw_pmu(p) (container_of(p, struct ali_drw_pmu, pmu)) + +#define DRW_CONFIG_EVENTID GENMASK(7, 0) +#define GET_DRW_EVENTID(event) FIELD_GET(DRW_CONFIG_EVENTID, (event)->attr= .config) + +static ssize_t ali_drw_pmu_format_show(struct device *dev, + struct device_attribute *attr, char *buf) +{ + struct dev_ext_attribute *eattr; + + eattr =3D container_of(attr, struct dev_ext_attribute, attr); + + return sprintf(buf, "%s\n", (char *)eattr->var); +} + +/* + * PMU event attributes + */ +static ssize_t ali_drw_pmu_event_show(struct device *dev, + struct device_attribute *attr, char *page) +{ + struct dev_ext_attribute *eattr; + + eattr =3D container_of(attr, struct dev_ext_attribute, attr); + + return sprintf(page, "config=3D0x%lx\n", (unsigned long)eattr->var); +} + +#define ALI_DRW_PMU_ATTR(_name, _func, _config) = \ + (&((struct dev_ext_attribute[]) { \ + { __ATTR(_name, 0444, _func, NULL), (void *)_config } \ + })[0].attr.attr) + +#define ALI_DRW_PMU_FORMAT_ATTR(_name, _config) \ + ALI_DRW_PMU_ATTR(_name, ali_drw_pmu_format_show, (void *)_config) +#define ALI_DRW_PMU_EVENT_ATTR(_name, _config) \ + ALI_DRW_PMU_ATTR(_name, ali_drw_pmu_event_show, (unsigned long)_config) + +static struct attribute *ali_drw_pmu_events_attrs[] =3D { + ALI_DRW_PMU_EVENT_ATTR(hif_rd_or_wr, 0x0), + ALI_DRW_PMU_EVENT_ATTR(hif_wr, 0x1), + ALI_DRW_PMU_EVENT_ATTR(hif_rd, 0x2), + ALI_DRW_PMU_EVENT_ATTR(hif_rmw, 0x3), + ALI_DRW_PMU_EVENT_ATTR(hif_hi_pri_rd, 0x4), + ALI_DRW_PMU_EVENT_ATTR(dfi_wr_data_cycles, 0x7), + ALI_DRW_PMU_EVENT_ATTR(dfi_rd_data_cycles, 0x8), + ALI_DRW_PMU_EVENT_ATTR(hpr_xact_when_critical, 0x9), + ALI_DRW_PMU_EVENT_ATTR(lpr_xact_when_critical, 0xA), + ALI_DRW_PMU_EVENT_ATTR(wr_xact_when_critical, 0xB), + ALI_DRW_PMU_EVENT_ATTR(op_is_activate, 0xC), + ALI_DRW_PMU_EVENT_ATTR(op_is_rd_or_wr, 0xD), + ALI_DRW_PMU_EVENT_ATTR(op_is_rd_activate, 0xE), + ALI_DRW_PMU_EVENT_ATTR(op_is_rd, 0xF), + ALI_DRW_PMU_EVENT_ATTR(op_is_wr, 0x10), + ALI_DRW_PMU_EVENT_ATTR(op_is_mwr, 0x11), + ALI_DRW_PMU_EVENT_ATTR(op_is_precharge, 0x12), + ALI_DRW_PMU_EVENT_ATTR(precharge_for_rdwr, 0x13), + ALI_DRW_PMU_EVENT_ATTR(precharge_for_other, 0x14), + ALI_DRW_PMU_EVENT_ATTR(rdwr_transitions, 0x15), + ALI_DRW_PMU_EVENT_ATTR(write_combine, 0x16), + ALI_DRW_PMU_EVENT_ATTR(war_hazard, 0x17), + ALI_DRW_PMU_EVENT_ATTR(raw_hazard, 0x18), + ALI_DRW_PMU_EVENT_ATTR(waw_hazard, 0x19), + ALI_DRW_PMU_EVENT_ATTR(op_is_enter_selfref_rk0, 0x1A), + ALI_DRW_PMU_EVENT_ATTR(op_is_enter_selfref_rk1, 0x1B), + ALI_DRW_PMU_EVENT_ATTR(op_is_enter_selfref_rk2, 0x1C), + ALI_DRW_PMU_EVENT_ATTR(op_is_enter_selfref_rk3, 0x1D), + ALI_DRW_PMU_EVENT_ATTR(op_is_enter_powerdown_rk0, 0x1E), + ALI_DRW_PMU_EVENT_ATTR(op_is_enter_powerdown_rk1, 0x1F), + ALI_DRW_PMU_EVENT_ATTR(op_is_enter_powerdown_rk2, 0x20), + ALI_DRW_PMU_EVENT_ATTR(op_is_enter_powerdown_rk3, 0x21), + ALI_DRW_PMU_EVENT_ATTR(selfref_mode_rk0, 0x26), + ALI_DRW_PMU_EVENT_ATTR(selfref_mode_rk1, 0x27), + ALI_DRW_PMU_EVENT_ATTR(selfref_mode_rk2, 0x28), + ALI_DRW_PMU_EVENT_ATTR(selfref_mode_rk3, 0x29), + ALI_DRW_PMU_EVENT_ATTR(op_is_refresh, 0x2A), + ALI_DRW_PMU_EVENT_ATTR(op_is_crit_ref, 0x2B), + ALI_DRW_PMU_EVENT_ATTR(op_is_load_mode, 0x2D), + ALI_DRW_PMU_EVENT_ATTR(op_is_zqcl, 0x2E), + ALI_DRW_PMU_EVENT_ATTR(visible_window_limit_reached_rd, 0x30), + ALI_DRW_PMU_EVENT_ATTR(visible_window_limit_reached_wr, 0x31), + ALI_DRW_PMU_EVENT_ATTR(op_is_dqsosc_mpc, 0x34), + ALI_DRW_PMU_EVENT_ATTR(op_is_dqsosc_mrr, 0x35), + ALI_DRW_PMU_EVENT_ATTR(op_is_tcr_mrr, 0x36), + ALI_DRW_PMU_EVENT_ATTR(op_is_zqstart, 0x37), + ALI_DRW_PMU_EVENT_ATTR(op_is_zqlatch, 0x38), + ALI_DRW_PMU_EVENT_ATTR(chi_txreq, 0x39), + ALI_DRW_PMU_EVENT_ATTR(chi_txdat, 0x3A), + ALI_DRW_PMU_EVENT_ATTR(chi_rxdat, 0x3B), + ALI_DRW_PMU_EVENT_ATTR(chi_rxrsp, 0x3C), + ALI_DRW_PMU_EVENT_ATTR(tsz_vio, 0x3D), + ALI_DRW_PMU_EVENT_ATTR(cycle, 0x80), + NULL, +}; + +static struct attribute_group ali_drw_pmu_events_attr_group =3D { + .name =3D "events", + .attrs =3D ali_drw_pmu_events_attrs, +}; + +static struct attribute *ali_drw_pmu_format_attr[] =3D { + ALI_DRW_PMU_FORMAT_ATTR(event, "config:0-7"), + NULL, +}; + +static const struct attribute_group ali_drw_pmu_format_group =3D { + .name =3D "format", + .attrs =3D ali_drw_pmu_format_attr, +}; + +static ssize_t ali_drw_pmu_cpumask_show(struct device *dev, + struct device_attribute *attr, + char *buf) +{ + struct ali_drw_pmu *drw_pmu =3D to_ali_drw_pmu(dev_get_drvdata(dev)); + + return cpumap_print_to_pagebuf(true, buf, cpumask_of(drw_pmu->cpu)); +} + +static struct device_attribute ali_drw_pmu_cpumask_attr =3D + __ATTR(cpumask, 0444, ali_drw_pmu_cpumask_show, NULL); + +static struct attribute *ali_drw_pmu_cpumask_attrs[] =3D { + &ali_drw_pmu_cpumask_attr.attr, + NULL, +}; + +static const struct attribute_group ali_drw_pmu_cpumask_attr_group =3D { + .attrs =3D ali_drw_pmu_cpumask_attrs, +}; + +static const struct attribute_group *ali_drw_pmu_attr_groups[] =3D { + &ali_drw_pmu_events_attr_group, + &ali_drw_pmu_cpumask_attr_group, + &ali_drw_pmu_format_group, + NULL, +}; + +/* find a counter for event, then in add func, hw.idx will equal to counte= r */ +static int ali_drw_get_counter_idx(struct perf_event *event) +{ + struct ali_drw_pmu *drw_pmu =3D to_ali_drw_pmu(event->pmu); + int idx; + + for (idx =3D 0; idx < ALI_DRW_PMU_COMMON_MAX_COUNTERS; ++idx) { + if (!test_and_set_bit(idx, drw_pmu->used_mask)) + return idx; + } + + /* The counters are all in use. */ + return -EBUSY; +} + +static u64 ali_drw_pmu_read_counter(struct perf_event *event) +{ + struct ali_drw_pmu *drw_pmu =3D to_ali_drw_pmu(event->pmu); + u64 cycle_high, cycle_low; + + if (GET_DRW_EVENTID(event) =3D=3D ALI_DRW_PMU_CYCLE_EVT_ID) { + cycle_high =3D readl(drw_pmu->cfg_base + ALI_DRW_PMU_CYCLE_CNT_HIGH); + cycle_high &=3D ALI_DRW_PMU_CYCLE_CNT_HIGH_MASK; + cycle_low =3D readl(drw_pmu->cfg_base + ALI_DRW_PMU_CYCLE_CNT_LOW); + cycle_low &=3D ALI_DRW_PMU_CYCLE_CNT_LOW_MASK; + return (cycle_high << 32 | cycle_low); + } + + return readl(drw_pmu->cfg_base + + ALI_DRW_PMU_COMMON_COUNTERn(event->hw.idx)); +} + +static void ali_drw_pmu_event_update(struct perf_event *event) +{ + struct hw_perf_event *hwc =3D &event->hw; + u64 delta, prev, now; + + do { + prev =3D local64_read(&hwc->prev_count); + now =3D ali_drw_pmu_read_counter(event); + } while (local64_cmpxchg(&hwc->prev_count, prev, now) !=3D prev); + + /* handle overflow. */ + delta =3D now - prev; + if (GET_DRW_EVENTID(event) =3D=3D ALI_DRW_PMU_CYCLE_EVT_ID) + delta &=3D ALI_DRW_PMU_OV_INTR_MASK; + else + delta &=3D ALI_DRW_CNT_MAX_PERIOD; + local64_add(delta, &event->count); +} + +static void ali_drw_pmu_event_set_period(struct perf_event *event) +{ + u64 pre_val; + struct ali_drw_pmu *drw_pmu =3D to_ali_drw_pmu(event->pmu); + + /* set a preload counter for test purpose */ + writel(ALI_DRW_PMU_TEST_SEL_COMMON_COUNTER_BASE + event->hw.idx, + drw_pmu->cfg_base + ALI_DRW_PMU_TEST_CTRL); + + /* set conunter initial value */ + pre_val =3D ALI_DRW_PMU_CNT_INIT; + writel(pre_val, drw_pmu->cfg_base + ALI_DRW_PMU_CNT_PRELOAD); + local64_set(&event->hw.prev_count, pre_val); + + /* set sel mode to zero to start test */ + writel(0x0, drw_pmu->cfg_base + ALI_DRW_PMU_TEST_CTRL); +} + +static void ali_drw_pmu_enable_counter(struct perf_event *event) +{ + u32 val, subval, reg, shift; + int counter =3D event->hw.idx; + struct ali_drw_pmu *drw_pmu =3D to_ali_drw_pmu(event->pmu); + + reg =3D ALI_DRW_PMU_EVENT_SELn(counter); + val =3D readl(drw_pmu->cfg_base + reg); + subval =3D FIELD_PREP(ALI_DRW_PMCOM_CNT_EN, 1) | + FIELD_PREP(ALI_DRW_PMCOM_CNT_EVENT_MASK, drw_pmu->evtids[counter]); + + shift =3D ALI_DRW_PMCOM_CNT_EVENT_OFFSET(counter); + val &=3D ~(GENMASK(7, 0) << shift); + val |=3D subval << shift; + + writel(val, drw_pmu->cfg_base + reg); +} + +static void ali_drw_pmu_disable_counter(struct perf_event *event) +{ + u32 val, reg, subval, shift; + struct ali_drw_pmu *drw_pmu =3D to_ali_drw_pmu(event->pmu); + int counter =3D event->hw.idx; + + reg =3D ALI_DRW_PMU_EVENT_SELn(counter); + val =3D readl(drw_pmu->cfg_base + reg); + subval =3D FIELD_PREP(ALI_DRW_PMCOM_CNT_EN, 0) | + FIELD_PREP(ALI_DRW_PMCOM_CNT_EVENT_MASK, 0); + + shift =3D ALI_DRW_PMCOM_CNT_EVENT_OFFSET(counter); + val &=3D ~(GENMASK(7, 0) << shift); + val |=3D subval << shift; + + writel(val, drw_pmu->cfg_base + reg); +} + +static irqreturn_t ali_drw_pmu_isr(int irq_num, void *data) +{ + struct ali_drw_pmu_irq *irq =3D data; + struct ali_drw_pmu *drw_pmu; + irqreturn_t ret =3D IRQ_NONE; + + rcu_read_lock(); + list_for_each_entry_rcu(drw_pmu, &irq->pmus_node, pmus_node) { + unsigned long status, clr_status; + struct perf_event *event; + unsigned int idx; + + for (idx =3D 0; idx < ALI_DRW_PMU_COMMON_MAX_COUNTERS; idx++) { + event =3D drw_pmu->events[idx]; + if (!event) + continue; + ali_drw_pmu_disable_counter(event); + } + + /* common counter intr status */ + status =3D readl(drw_pmu->cfg_base + ALI_DRW_PMU_OV_INTR_STATUS); + status =3D FIELD_GET(ALI_DRW_PMCOM_CNT_OV_INTR_MASK, status); + if (status) { + for_each_set_bit(idx, &status, + ALI_DRW_PMU_COMMON_MAX_COUNTERS) { + event =3D drw_pmu->events[idx]; + if (WARN_ON_ONCE(!event)) + continue; + ali_drw_pmu_event_update(event); + ali_drw_pmu_event_set_period(event); + } + + /* clear common counter intr status */ + clr_status =3D FIELD_PREP(ALI_DRW_PMCOM_CNT_OV_INTR_MASK, 1); + writel(clr_status, + drw_pmu->cfg_base + ALI_DRW_PMU_OV_INTR_CLR); + } + + for (idx =3D 0; idx < ALI_DRW_PMU_COMMON_MAX_COUNTERS; idx++) { + event =3D drw_pmu->events[idx]; + if (!event) + continue; + if (!(event->hw.state & PERF_HES_STOPPED)) + ali_drw_pmu_enable_counter(event); + } + if (status) + ret =3D IRQ_HANDLED; + } + rcu_read_unlock(); + return ret; +} + +static struct ali_drw_pmu_irq *__ali_drw_pmu_init_irq(struct platform_devi= ce + *pdev, int irq_num) +{ + int ret; + struct ali_drw_pmu_irq *irq; + + list_for_each_entry(irq, &ali_drw_pmu_irqs, irqs_node) { + if (irq->irq_num =3D=3D irq_num + && refcount_inc_not_zero(&irq->refcount)) + return irq; + } + + irq =3D kzalloc(sizeof(*irq), GFP_KERNEL); + if (!irq) + return ERR_PTR(-ENOMEM); + + INIT_LIST_HEAD(&irq->pmus_node); + + /* Pick one CPU to be the preferred one to use */ + irq->cpu =3D smp_processor_id(); + refcount_set(&irq->refcount, 1); + + /* + * FIXME: one of DDRSS Driveway PMU overflow interrupt shares the same + * irq number with MPAM ERR_IRQ. To register DDRSS PMU and MPAM drivers + * successfully, add IRQF_SHARED flag. Howerer, PMU interrupt should not + * share with other component. + */ + ret =3D devm_request_irq(&pdev->dev, irq_num, ali_drw_pmu_isr, + IRQF_SHARED, dev_name(&pdev->dev), irq); + if (ret < 0) { + dev_err(&pdev->dev, + "Fail to request IRQ:%d ret:%d\n", irq_num, ret); + goto out_free; + } + + ret =3D irq_set_affinity_hint(irq_num, cpumask_of(irq->cpu)); + if (ret) + goto out_free; + + ret =3D cpuhp_state_add_instance_nocalls(ali_drw_cpuhp_state_num, + &irq->node); + if (ret) + goto out_free; + + irq->irq_num =3D irq_num; + list_add(&irq->irqs_node, &ali_drw_pmu_irqs); + + return irq; + +out_free: + kfree(irq); + return ERR_PTR(ret); +} + +static int ali_drw_pmu_init_irq(struct ali_drw_pmu *drw_pmu, + struct platform_device *pdev) +{ + int irq_num; + struct ali_drw_pmu_irq *irq; + + /* Read and init IRQ */ + irq_num =3D platform_get_irq(pdev, 0); + if (irq_num < 0) + return irq_num; + + mutex_lock(&ali_drw_pmu_irqs_lock); + irq =3D __ali_drw_pmu_init_irq(pdev, irq_num); + mutex_unlock(&ali_drw_pmu_irqs_lock); + + if (IS_ERR(irq)) + return PTR_ERR(irq); + + drw_pmu->irq =3D irq; + + mutex_lock(&ali_drw_pmu_irqs_lock); + list_add_rcu(&drw_pmu->pmus_node, &irq->pmus_node); + mutex_unlock(&ali_drw_pmu_irqs_lock); + + return 0; +} + +static void ali_drw_pmu_uninit_irq(struct ali_drw_pmu *drw_pmu) +{ + struct ali_drw_pmu_irq *irq =3D drw_pmu->irq; + + mutex_lock(&ali_drw_pmu_irqs_lock); + list_del_rcu(&drw_pmu->pmus_node); + + if (!refcount_dec_and_test(&irq->refcount)) { + mutex_unlock(&ali_drw_pmu_irqs_lock); + return; + } + + list_del(&irq->irqs_node); + mutex_unlock(&ali_drw_pmu_irqs_lock); + + WARN_ON(irq_set_affinity_hint(irq->irq_num, NULL)); + cpuhp_state_remove_instance_nocalls(ali_drw_cpuhp_state_num, + &irq->node); + kfree(irq); +} + +static int ali_drw_pmu_event_init(struct perf_event *event) +{ + struct ali_drw_pmu *drw_pmu =3D to_ali_drw_pmu(event->pmu); + struct hw_perf_event *hwc =3D &event->hw; + struct perf_event *sibling; + struct device *dev =3D drw_pmu->pmu.dev; + + if (event->attr.type !=3D event->pmu->type) + return -ENOENT; + + if (is_sampling_event(event)) { + dev_err(dev, "Sampling not supported!\n"); + return -EOPNOTSUPP; + } + + if (event->attach_state & PERF_ATTACH_TASK) { + dev_err(dev, "Per-task counter cannot allocate!\n"); + return -EOPNOTSUPP; + } + + event->cpu =3D drw_pmu->cpu; + if (event->cpu < 0) { + dev_err(dev, "Per-task mode not supported!\n"); + return -EOPNOTSUPP; + } + + if (event->group_leader !=3D event && + !is_software_event(event->group_leader)) { + dev_err(dev, "driveway only allow one event!\n"); + return -EINVAL; + } + + for_each_sibling_event(sibling, event->group_leader) { + if (sibling !=3D event && !is_software_event(sibling)) { + dev_err(dev, "driveway event not allowed!\n"); + return -EINVAL; + } + } + + /* reset all the pmu counters */ + writel(ALI_DRW_PMU_CNT_RST, drw_pmu->cfg_base + ALI_DRW_PMU_CNT_CTRL); + + hwc->idx =3D -1; + + return 0; +} + +static void ali_drw_pmu_start(struct perf_event *event, int flags) +{ + struct ali_drw_pmu *drw_pmu =3D to_ali_drw_pmu(event->pmu); + + event->hw.state =3D 0; + + if (GET_DRW_EVENTID(event) =3D=3D ALI_DRW_PMU_CYCLE_EVT_ID) { + writel(ALI_DRW_PMU_CNT_START, + drw_pmu->cfg_base + ALI_DRW_PMU_CNT_CTRL); + return; + } + + ali_drw_pmu_event_set_period(event); + if (flags & PERF_EF_RELOAD) { + unsigned long prev_raw_count =3D + local64_read(&event->hw.prev_count); + writel(prev_raw_count, + drw_pmu->cfg_base + ALI_DRW_PMU_CNT_PRELOAD); + } + + ali_drw_pmu_enable_counter(event); + + writel(ALI_DRW_PMU_CNT_START, drw_pmu->cfg_base + ALI_DRW_PMU_CNT_CTRL); +} + +static void ali_drw_pmu_stop(struct perf_event *event, int flags) +{ + struct ali_drw_pmu *drw_pmu =3D to_ali_drw_pmu(event->pmu); + + if (event->hw.state & PERF_HES_STOPPED) + return; + + if (GET_DRW_EVENTID(event) !=3D ALI_DRW_PMU_CYCLE_EVT_ID) + ali_drw_pmu_disable_counter(event); + + writel(ALI_DRW_PMU_CNT_STOP, drw_pmu->cfg_base + ALI_DRW_PMU_CNT_CTRL); + + ali_drw_pmu_event_update(event); + event->hw.state |=3D PERF_HES_STOPPED | PERF_HES_UPTODATE; +} + +static int ali_drw_pmu_add(struct perf_event *event, int flags) +{ + struct ali_drw_pmu *drw_pmu =3D to_ali_drw_pmu(event->pmu); + struct hw_perf_event *hwc =3D &event->hw; + int idx =3D -1; + int evtid; + + evtid =3D GET_DRW_EVENTID(event); + + if (evtid !=3D ALI_DRW_PMU_CYCLE_EVT_ID) { + idx =3D ali_drw_get_counter_idx(event); + if (idx < 0) + return idx; + drw_pmu->events[idx] =3D event; + drw_pmu->evtids[idx] =3D evtid; + } + hwc->idx =3D idx; + + hwc->state =3D PERF_HES_STOPPED | PERF_HES_UPTODATE; + + if (flags & PERF_EF_START) + ali_drw_pmu_start(event, PERF_EF_RELOAD); + + /* Propagate our changes to the userspace mapping. */ + perf_event_update_userpage(event); + + return 0; +} + +static void ali_drw_pmu_del(struct perf_event *event, int flags) +{ + struct ali_drw_pmu *drw_pmu =3D to_ali_drw_pmu(event->pmu); + struct hw_perf_event *hwc =3D &event->hw; + int idx =3D hwc->idx; + + ali_drw_pmu_stop(event, PERF_EF_UPDATE); + + if (idx >=3D 0 && idx < ALI_DRW_PMU_COMMON_MAX_COUNTERS) { + drw_pmu->events[idx] =3D NULL; + drw_pmu->evtids[idx] =3D 0; + clear_bit(idx, drw_pmu->used_mask); + } + + perf_event_update_userpage(event); +} + +static void ali_drw_pmu_read(struct perf_event *event) +{ + ali_drw_pmu_event_update(event); +} + +static int ali_drw_pmu_probe(struct platform_device *pdev) +{ + struct ali_drw_pmu *drw_pmu; + struct resource *res; + char *name; + int ret; + + drw_pmu =3D devm_kzalloc(&pdev->dev, sizeof(*drw_pmu), GFP_KERNEL); + if (!drw_pmu) + return -ENOMEM; + + drw_pmu->dev =3D &pdev->dev; + platform_set_drvdata(pdev, drw_pmu); + + res =3D platform_get_resource(pdev, IORESOURCE_MEM, 0); + drw_pmu->cfg_base =3D devm_ioremap_resource(&pdev->dev, res); + if (!drw_pmu->cfg_base) + return -ENOMEM; + + name =3D devm_kasprintf(drw_pmu->dev, GFP_KERNEL, "ali_drw_%llx", + (u64) (res->start >> ALI_DRW_PMU_PA_SHIFT)); + if (!name) + return -ENOMEM; + + writel(ALI_DRW_PMU_CNT_RST, drw_pmu->cfg_base + ALI_DRW_PMU_CNT_CTRL); + + /* enable the generation of interrupt by all common counters */ + writel(ALI_DRW_PMCOM_CNT_OV_INTR_MASK, + drw_pmu->cfg_base + ALI_DRW_PMU_OV_INTR_ENABLE_CTL); + + /* clearing interrupt status */ + writel(0xffffff, drw_pmu->cfg_base + ALI_DRW_PMU_OV_INTR_CLR); + + drw_pmu->cpu =3D smp_processor_id(); + + ret =3D ali_drw_pmu_init_irq(drw_pmu, pdev); + if (ret) + return ret; + + drw_pmu->pmu =3D (struct pmu) { + .module =3D THIS_MODULE, + .task_ctx_nr =3D perf_invalid_context, + .event_init =3D ali_drw_pmu_event_init, + .add =3D ali_drw_pmu_add, + .del =3D ali_drw_pmu_del, + .start =3D ali_drw_pmu_start, + .stop =3D ali_drw_pmu_stop, + .read =3D ali_drw_pmu_read, + .attr_groups =3D ali_drw_pmu_attr_groups, + .capabilities =3D PERF_PMU_CAP_NO_EXCLUDE, + }; + + ret =3D perf_pmu_register(&drw_pmu->pmu, name, -1); + if (ret) { + dev_err(drw_pmu->dev, "DRW Driveway PMU PMU register failed!\n"); + ali_drw_pmu_uninit_irq(drw_pmu); + } + + return ret; +} + +static int ali_drw_pmu_remove(struct platform_device *pdev) +{ + struct ali_drw_pmu *drw_pmu =3D platform_get_drvdata(pdev); + + /* disable the generation of interrupt by all common counters */ + writel(ALI_DRW_PMCOM_CNT_OV_INTR_MASK, + drw_pmu->cfg_base + ALI_DRW_PMU_OV_INTR_DISABLE_CTL); + + ali_drw_pmu_uninit_irq(drw_pmu); + perf_pmu_unregister(&drw_pmu->pmu); + + return 0; +} + +static int ali_drw_pmu_offline_cpu(unsigned int cpu, struct hlist_node *no= de) +{ + struct ali_drw_pmu_irq *irq; + struct ali_drw_pmu *drw_pmu; + unsigned int target; + int ret; + cpumask_t node_online_cpus; + + irq =3D hlist_entry_safe(node, struct ali_drw_pmu_irq, node); + if (cpu !=3D irq->cpu) + return 0; + + ret =3D cpumask_and(&node_online_cpus, + cpumask_of_node(cpu_to_node(cpu)), cpu_online_mask); + if (ret) + target =3D cpumask_any_but(&node_online_cpus, cpu); + else + target =3D cpumask_any_but(cpu_online_mask, cpu); + + if (target >=3D nr_cpu_ids) + return 0; + + /* We're only reading, but this isn't the place to be involving RCU */ + mutex_lock(&ali_drw_pmu_irqs_lock); + list_for_each_entry(drw_pmu, &irq->pmus_node, pmus_node) + perf_pmu_migrate_context(&drw_pmu->pmu, irq->cpu, target); + mutex_unlock(&ali_drw_pmu_irqs_lock); + + WARN_ON(irq_set_affinity_hint(irq->irq_num, cpumask_of(target))); + irq->cpu =3D target; + + return 0; +} + +/* + * Due to historical reasons, the HID used in the production environment is + * ARMHD700, so we leave ARMHD700 as Compatible ID. + */ +static const struct acpi_device_id ali_drw_acpi_match[] =3D { + {"BABA5000", 0}, + {"ARMHD700", 0}, + {} +}; + +MODULE_DEVICE_TABLE(acpi, ali_drw_acpi_match); + +static struct platform_driver ali_drw_pmu_driver =3D { + .driver =3D { + .name =3D "ali_drw_pmu", + .acpi_match_table =3D ACPI_PTR(ali_drw_acpi_match), + }, + .probe =3D ali_drw_pmu_probe, + .remove =3D ali_drw_pmu_remove, +}; + +static int __init ali_drw_pmu_init(void) +{ + int ret; + + ret =3D cpuhp_setup_state_multi(CPUHP_AP_ONLINE_DYN, + "ali_drw_pmu:online", + NULL, ali_drw_pmu_offline_cpu); + + if (ret < 0) { + pr_err("DRW Driveway PMU: setup hotplug failed, ret =3D %d\n", + ret); + return ret; + } + ali_drw_cpuhp_state_num =3D ret; + + ret =3D platform_driver_register(&ali_drw_pmu_driver); + if (ret) + cpuhp_remove_multi_state(ali_drw_cpuhp_state_num); + + return ret; +} + +static void __exit ali_drw_pmu_exit(void) +{ + platform_driver_unregister(&ali_drw_pmu_driver); + cpuhp_remove_multi_state(ali_drw_cpuhp_state_num); +} + +module_init(ali_drw_pmu_init); +module_exit(ali_drw_pmu_exit); + +MODULE_AUTHOR("Hongbo Yao "); +MODULE_AUTHOR("Neng Chen "); +MODULE_AUTHOR("Shuai Xue "); +MODULE_DESCRIPTION("Alibaba DDR Sub-System Driveway PMU driver"); +MODULE_LICENSE("GPL v2"); --=20 2.20.1.9.gb50a0d7 From nobody Mon Apr 27 03:25:06 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id C939BECAAD8 for ; Wed, 14 Sep 2022 02:23:52 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230041AbiINCXu (ORCPT ); Tue, 13 Sep 2022 22:23:50 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:51720 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229636AbiINCXj (ORCPT ); Tue, 13 Sep 2022 22:23:39 -0400 Received: from out30-130.freemail.mail.aliyun.com (out30-130.freemail.mail.aliyun.com [115.124.30.130]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E01755E303 for ; Tue, 13 Sep 2022 19:23:36 -0700 (PDT) X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R171e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=ay29a033018045170;MF=xueshuai@linux.alibaba.com;NM=1;PH=DS;RN=10;SR=0;TI=SMTPD_---0VPl45ki_1663122213; Received: from localhost.localdomain(mailfrom:xueshuai@linux.alibaba.com fp:SMTPD_---0VPl45ki_1663122213) by smtp.aliyun-inc.com; Wed, 14 Sep 2022 10:23:34 +0800 From: Shuai Xue To: will@kernel.org, Jonathan.Cameron@Huawei.com, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org Cc: rdunlap@infradead.org, robin.murphy@arm.com, mark.rutland@arm.com, baolin.wang@linux.alibaba.com, zhuo.song@linux.alibaba.com, xueshuai@linux.alibaba.com Subject: [RESEND PATCH v4 2/3] drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710 SoC Date: Wed, 14 Sep 2022 10:23:25 +0800 Message-Id: <20220914022326.88550-3-xueshuai@linux.alibaba.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20220617111825.92911-1-xueshuai@linux.alibaba.com> References: <20220617111825.92911-1-xueshuai@linux.alibaba.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" Add the DDR Sub-System Driveway Performance Monitoring Unit (PMU) driver support for Alibaba T-Head Yitian 710 SoC chip. Yitian supports DDR5/4 DRAM and targets cloud computing and HPC. Each PMU is registered as a device in /sys/bus/event_source/devices, and users can select event to monitor in each sub-channel, independently. For example, ali_drw_21000 and ali_drw_21080 are two PMU devices for two sub-channels of the same channel in die 0. And the PMU device of die 1 is prefixed with ali_drw_400XXXXX, e.g. ali_drw_40021000. Due to hardware limitation, one of DDRSS Driveway PMU overflow interrupt shares the same irq number with MPAM ERR_IRQ. To register DDRSS PMU and MPAM drivers successfully, add IRQF_SHARED flag. Signed-off-by: Shuai Xue Co-developed-by: Hongbo Yao Signed-off-by: Hongbo Yao Co-developed-by: Neng Chen Signed-off-by: Neng Chen Reviewed-by: Jonathan Cameron Reviewed-by: Baolin Wang --- drivers/perf/Kconfig | 7 + drivers/perf/Makefile | 1 + drivers/perf/alibaba_uncore_drw_pmu.c | 810 ++++++++++++++++++++++++++ 3 files changed, 818 insertions(+) create mode 100644 drivers/perf/alibaba_uncore_drw_pmu.c diff --git a/drivers/perf/Kconfig b/drivers/perf/Kconfig index 1e2d69453771..44c07ea487f4 100644 --- a/drivers/perf/Kconfig +++ b/drivers/perf/Kconfig @@ -183,6 +183,13 @@ config APPLE_M1_CPU_PMU Provides support for the non-architectural CPU PMUs present on the Apple M1 SoCs and derivatives. =20 +config ALIBABA_UNCORE_DRW_PMU + tristate "Alibaba T-Head Yitian 710 DDR Sub-system Driveway PMU driver" + depends on ARM64 || COMPILE_TEST + help + Support for Driveway PMU events monitoring on Yitian 710 DDR + Sub-system. + source "drivers/perf/hisilicon/Kconfig" =20 config MARVELL_CN10K_DDR_PMU diff --git a/drivers/perf/Makefile b/drivers/perf/Makefile index 57a279c61df5..050d04ee19dd 100644 --- a/drivers/perf/Makefile +++ b/drivers/perf/Makefile @@ -20,3 +20,4 @@ obj-$(CONFIG_ARM_DMC620_PMU) +=3D arm_dmc620_pmu.o obj-$(CONFIG_MARVELL_CN10K_TAD_PMU) +=3D marvell_cn10k_tad_pmu.o obj-$(CONFIG_MARVELL_CN10K_DDR_PMU) +=3D marvell_cn10k_ddr_pmu.o obj-$(CONFIG_APPLE_M1_CPU_PMU) +=3D apple_m1_cpu_pmu.o +obj-$(CONFIG_ALIBABA_UNCORE_DRW_PMU) +=3D alibaba_uncore_drw_pmu.o diff --git a/drivers/perf/alibaba_uncore_drw_pmu.c b/drivers/perf/alibaba_u= ncore_drw_pmu.c new file mode 100644 index 000000000000..82729b874f09 --- /dev/null +++ b/drivers/perf/alibaba_uncore_drw_pmu.c @@ -0,0 +1,810 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Alibaba DDR Sub-System Driveway PMU driver + * + * Copyright (C) 2022 Alibaba Inc + */ + +#define ALI_DRW_PMUNAME "ali_drw" +#define ALI_DRW_DRVNAME ALI_DRW_PMUNAME "_pmu" +#define pr_fmt(fmt) ALI_DRW_DRVNAME ": " fmt + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + + +#define ALI_DRW_PMU_COMMON_MAX_COUNTERS 16 +#define ALI_DRW_PMU_TEST_SEL_COMMON_COUNTER_BASE 19 + +#define ALI_DRW_PMU_PA_SHIFT 12 +#define ALI_DRW_PMU_CNT_INIT 0x00000000 +#define ALI_DRW_CNT_MAX_PERIOD 0xffffffff +#define ALI_DRW_PMU_CYCLE_EVT_ID 0x80 + +#define ALI_DRW_PMU_CNT_CTRL 0xC00 +#define ALI_DRW_PMU_CNT_RST BIT(2) +#define ALI_DRW_PMU_CNT_STOP BIT(1) +#define ALI_DRW_PMU_CNT_START BIT(0) + +#define ALI_DRW_PMU_CNT_STATE 0xC04 +#define ALI_DRW_PMU_TEST_CTRL 0xC08 +#define ALI_DRW_PMU_CNT_PRELOAD 0xC0C + +#define ALI_DRW_PMU_CYCLE_CNT_HIGH_MASK GENMASK(23, 0) +#define ALI_DRW_PMU_CYCLE_CNT_LOW_MASK GENMASK(31, 0) +#define ALI_DRW_PMU_CYCLE_CNT_HIGH 0xC10 +#define ALI_DRW_PMU_CYCLE_CNT_LOW 0xC14 + +/* PMU EVENT SEL 0-3 are paired in 32-bit registers on a 4-byte stride */ +#define ALI_DRW_PMU_EVENT_SEL0 0xC68 +/* counter 0-3 use sel0, counter 4-7 use sel1...*/ +#define ALI_DRW_PMU_EVENT_SELn(n) \ + (ALI_DRW_PMU_EVENT_SEL0 + (n / 4) * 0x4) +#define ALI_DRW_PMCOM_CNT_EN BIT(7) +#define ALI_DRW_PMCOM_CNT_EVENT_MASK GENMASK(5, 0) +#define ALI_DRW_PMCOM_CNT_EVENT_OFFSET(n) \ + (8 * (n % 4)) + +/* PMU COMMON COUNTER 0-15, are paired in 32-bit registers on a 4-byte str= ide */ +#define ALI_DRW_PMU_COMMON_COUNTER0 0xC78 +#define ALI_DRW_PMU_COMMON_COUNTERn(n) \ + (ALI_DRW_PMU_COMMON_COUNTER0 + 0x4 * (n)) + +#define ALI_DRW_PMU_OV_INTR_ENABLE_CTL 0xCB8 +#define ALI_DRW_PMU_OV_INTR_DISABLE_CTL 0xCBC +#define ALI_DRW_PMU_OV_INTR_ENABLE_STATUS 0xCC0 +#define ALI_DRW_PMU_OV_INTR_CLR 0xCC4 +#define ALI_DRW_PMU_OV_INTR_STATUS 0xCC8 +#define ALI_DRW_PMCOM_CNT_OV_INTR_MASK GENMASK(23, 8) +#define ALI_DRW_PMBW_CNT_OV_INTR_MASK GENMASK(7, 0) +#define ALI_DRW_PMU_OV_INTR_MASK GENMASK_ULL(63, 0) + +static int ali_drw_cpuhp_state_num; + +static LIST_HEAD(ali_drw_pmu_irqs); +static DEFINE_MUTEX(ali_drw_pmu_irqs_lock); + +struct ali_drw_pmu_irq { + struct hlist_node node; + struct list_head irqs_node; + struct list_head pmus_node; + int irq_num; + int cpu; + refcount_t refcount; +}; + +struct ali_drw_pmu { + void __iomem *cfg_base; + struct device *dev; + + struct list_head pmus_node; + struct ali_drw_pmu_irq *irq; + int irq_num; + int cpu; + DECLARE_BITMAP(used_mask, ALI_DRW_PMU_COMMON_MAX_COUNTERS); + struct perf_event *events[ALI_DRW_PMU_COMMON_MAX_COUNTERS]; + int evtids[ALI_DRW_PMU_COMMON_MAX_COUNTERS]; + + struct pmu pmu; +}; + +#define to_ali_drw_pmu(p) (container_of(p, struct ali_drw_pmu, pmu)) + +#define DRW_CONFIG_EVENTID GENMASK(7, 0) +#define GET_DRW_EVENTID(event) FIELD_GET(DRW_CONFIG_EVENTID, (event)->attr= .config) + +static ssize_t ali_drw_pmu_format_show(struct device *dev, + struct device_attribute *attr, char *buf) +{ + struct dev_ext_attribute *eattr; + + eattr =3D container_of(attr, struct dev_ext_attribute, attr); + + return sprintf(buf, "%s\n", (char *)eattr->var); +} + +/* + * PMU event attributes + */ +static ssize_t ali_drw_pmu_event_show(struct device *dev, + struct device_attribute *attr, char *page) +{ + struct dev_ext_attribute *eattr; + + eattr =3D container_of(attr, struct dev_ext_attribute, attr); + + return sprintf(page, "config=3D0x%lx\n", (unsigned long)eattr->var); +} + +#define ALI_DRW_PMU_ATTR(_name, _func, _config) = \ + (&((struct dev_ext_attribute[]) { \ + { __ATTR(_name, 0444, _func, NULL), (void *)_config } \ + })[0].attr.attr) + +#define ALI_DRW_PMU_FORMAT_ATTR(_name, _config) \ + ALI_DRW_PMU_ATTR(_name, ali_drw_pmu_format_show, (void *)_config) +#define ALI_DRW_PMU_EVENT_ATTR(_name, _config) \ + ALI_DRW_PMU_ATTR(_name, ali_drw_pmu_event_show, (unsigned long)_config) + +static struct attribute *ali_drw_pmu_events_attrs[] =3D { + ALI_DRW_PMU_EVENT_ATTR(hif_rd_or_wr, 0x0), + ALI_DRW_PMU_EVENT_ATTR(hif_wr, 0x1), + ALI_DRW_PMU_EVENT_ATTR(hif_rd, 0x2), + ALI_DRW_PMU_EVENT_ATTR(hif_rmw, 0x3), + ALI_DRW_PMU_EVENT_ATTR(hif_hi_pri_rd, 0x4), + ALI_DRW_PMU_EVENT_ATTR(dfi_wr_data_cycles, 0x7), + ALI_DRW_PMU_EVENT_ATTR(dfi_rd_data_cycles, 0x8), + ALI_DRW_PMU_EVENT_ATTR(hpr_xact_when_critical, 0x9), + ALI_DRW_PMU_EVENT_ATTR(lpr_xact_when_critical, 0xA), + ALI_DRW_PMU_EVENT_ATTR(wr_xact_when_critical, 0xB), + ALI_DRW_PMU_EVENT_ATTR(op_is_activate, 0xC), + ALI_DRW_PMU_EVENT_ATTR(op_is_rd_or_wr, 0xD), + ALI_DRW_PMU_EVENT_ATTR(op_is_rd_activate, 0xE), + ALI_DRW_PMU_EVENT_ATTR(op_is_rd, 0xF), + ALI_DRW_PMU_EVENT_ATTR(op_is_wr, 0x10), + ALI_DRW_PMU_EVENT_ATTR(op_is_mwr, 0x11), + ALI_DRW_PMU_EVENT_ATTR(op_is_precharge, 0x12), + ALI_DRW_PMU_EVENT_ATTR(precharge_for_rdwr, 0x13), + ALI_DRW_PMU_EVENT_ATTR(precharge_for_other, 0x14), + ALI_DRW_PMU_EVENT_ATTR(rdwr_transitions, 0x15), + ALI_DRW_PMU_EVENT_ATTR(write_combine, 0x16), + ALI_DRW_PMU_EVENT_ATTR(war_hazard, 0x17), + ALI_DRW_PMU_EVENT_ATTR(raw_hazard, 0x18), + ALI_DRW_PMU_EVENT_ATTR(waw_hazard, 0x19), + ALI_DRW_PMU_EVENT_ATTR(op_is_enter_selfref_rk0, 0x1A), + ALI_DRW_PMU_EVENT_ATTR(op_is_enter_selfref_rk1, 0x1B), + ALI_DRW_PMU_EVENT_ATTR(op_is_enter_selfref_rk2, 0x1C), + ALI_DRW_PMU_EVENT_ATTR(op_is_enter_selfref_rk3, 0x1D), + ALI_DRW_PMU_EVENT_ATTR(op_is_enter_powerdown_rk0, 0x1E), + ALI_DRW_PMU_EVENT_ATTR(op_is_enter_powerdown_rk1, 0x1F), + ALI_DRW_PMU_EVENT_ATTR(op_is_enter_powerdown_rk2, 0x20), + ALI_DRW_PMU_EVENT_ATTR(op_is_enter_powerdown_rk3, 0x21), + ALI_DRW_PMU_EVENT_ATTR(selfref_mode_rk0, 0x26), + ALI_DRW_PMU_EVENT_ATTR(selfref_mode_rk1, 0x27), + ALI_DRW_PMU_EVENT_ATTR(selfref_mode_rk2, 0x28), + ALI_DRW_PMU_EVENT_ATTR(selfref_mode_rk3, 0x29), + ALI_DRW_PMU_EVENT_ATTR(op_is_refresh, 0x2A), + ALI_DRW_PMU_EVENT_ATTR(op_is_crit_ref, 0x2B), + ALI_DRW_PMU_EVENT_ATTR(op_is_load_mode, 0x2D), + ALI_DRW_PMU_EVENT_ATTR(op_is_zqcl, 0x2E), + ALI_DRW_PMU_EVENT_ATTR(visible_window_limit_reached_rd, 0x30), + ALI_DRW_PMU_EVENT_ATTR(visible_window_limit_reached_wr, 0x31), + ALI_DRW_PMU_EVENT_ATTR(op_is_dqsosc_mpc, 0x34), + ALI_DRW_PMU_EVENT_ATTR(op_is_dqsosc_mrr, 0x35), + ALI_DRW_PMU_EVENT_ATTR(op_is_tcr_mrr, 0x36), + ALI_DRW_PMU_EVENT_ATTR(op_is_zqstart, 0x37), + ALI_DRW_PMU_EVENT_ATTR(op_is_zqlatch, 0x38), + ALI_DRW_PMU_EVENT_ATTR(chi_txreq, 0x39), + ALI_DRW_PMU_EVENT_ATTR(chi_txdat, 0x3A), + ALI_DRW_PMU_EVENT_ATTR(chi_rxdat, 0x3B), + ALI_DRW_PMU_EVENT_ATTR(chi_rxrsp, 0x3C), + ALI_DRW_PMU_EVENT_ATTR(tsz_vio, 0x3D), + ALI_DRW_PMU_EVENT_ATTR(cycle, 0x80), + NULL, +}; + +static struct attribute_group ali_drw_pmu_events_attr_group =3D { + .name =3D "events", + .attrs =3D ali_drw_pmu_events_attrs, +}; + +static struct attribute *ali_drw_pmu_format_attr[] =3D { + ALI_DRW_PMU_FORMAT_ATTR(event, "config:0-7"), + NULL, +}; + +static const struct attribute_group ali_drw_pmu_format_group =3D { + .name =3D "format", + .attrs =3D ali_drw_pmu_format_attr, +}; + +static ssize_t ali_drw_pmu_cpumask_show(struct device *dev, + struct device_attribute *attr, + char *buf) +{ + struct ali_drw_pmu *drw_pmu =3D to_ali_drw_pmu(dev_get_drvdata(dev)); + + return cpumap_print_to_pagebuf(true, buf, cpumask_of(drw_pmu->cpu)); +} + +static struct device_attribute ali_drw_pmu_cpumask_attr =3D + __ATTR(cpumask, 0444, ali_drw_pmu_cpumask_show, NULL); + +static struct attribute *ali_drw_pmu_cpumask_attrs[] =3D { + &ali_drw_pmu_cpumask_attr.attr, + NULL, +}; + +static const struct attribute_group ali_drw_pmu_cpumask_attr_group =3D { + .attrs =3D ali_drw_pmu_cpumask_attrs, +}; + +static const struct attribute_group *ali_drw_pmu_attr_groups[] =3D { + &ali_drw_pmu_events_attr_group, + &ali_drw_pmu_cpumask_attr_group, + &ali_drw_pmu_format_group, + NULL, +}; + +/* find a counter for event, then in add func, hw.idx will equal to counte= r */ +static int ali_drw_get_counter_idx(struct perf_event *event) +{ + struct ali_drw_pmu *drw_pmu =3D to_ali_drw_pmu(event->pmu); + int idx; + + for (idx =3D 0; idx < ALI_DRW_PMU_COMMON_MAX_COUNTERS; ++idx) { + if (!test_and_set_bit(idx, drw_pmu->used_mask)) + return idx; + } + + /* The counters are all in use. */ + return -EBUSY; +} + +static u64 ali_drw_pmu_read_counter(struct perf_event *event) +{ + struct ali_drw_pmu *drw_pmu =3D to_ali_drw_pmu(event->pmu); + u64 cycle_high, cycle_low; + + if (GET_DRW_EVENTID(event) =3D=3D ALI_DRW_PMU_CYCLE_EVT_ID) { + cycle_high =3D readl(drw_pmu->cfg_base + ALI_DRW_PMU_CYCLE_CNT_HIGH); + cycle_high &=3D ALI_DRW_PMU_CYCLE_CNT_HIGH_MASK; + cycle_low =3D readl(drw_pmu->cfg_base + ALI_DRW_PMU_CYCLE_CNT_LOW); + cycle_low &=3D ALI_DRW_PMU_CYCLE_CNT_LOW_MASK; + return (cycle_high << 32 | cycle_low); + } + + return readl(drw_pmu->cfg_base + + ALI_DRW_PMU_COMMON_COUNTERn(event->hw.idx)); +} + +static void ali_drw_pmu_event_update(struct perf_event *event) +{ + struct hw_perf_event *hwc =3D &event->hw; + u64 delta, prev, now; + + do { + prev =3D local64_read(&hwc->prev_count); + now =3D ali_drw_pmu_read_counter(event); + } while (local64_cmpxchg(&hwc->prev_count, prev, now) !=3D prev); + + /* handle overflow. */ + delta =3D now - prev; + if (GET_DRW_EVENTID(event) =3D=3D ALI_DRW_PMU_CYCLE_EVT_ID) + delta &=3D ALI_DRW_PMU_OV_INTR_MASK; + else + delta &=3D ALI_DRW_CNT_MAX_PERIOD; + local64_add(delta, &event->count); +} + +static void ali_drw_pmu_event_set_period(struct perf_event *event) +{ + u64 pre_val; + struct ali_drw_pmu *drw_pmu =3D to_ali_drw_pmu(event->pmu); + + /* set a preload counter for test purpose */ + writel(ALI_DRW_PMU_TEST_SEL_COMMON_COUNTER_BASE + event->hw.idx, + drw_pmu->cfg_base + ALI_DRW_PMU_TEST_CTRL); + + /* set conunter initial value */ + pre_val =3D ALI_DRW_PMU_CNT_INIT; + writel(pre_val, drw_pmu->cfg_base + ALI_DRW_PMU_CNT_PRELOAD); + local64_set(&event->hw.prev_count, pre_val); + + /* set sel mode to zero to start test */ + writel(0x0, drw_pmu->cfg_base + ALI_DRW_PMU_TEST_CTRL); +} + +static void ali_drw_pmu_enable_counter(struct perf_event *event) +{ + u32 val, subval, reg, shift; + int counter =3D event->hw.idx; + struct ali_drw_pmu *drw_pmu =3D to_ali_drw_pmu(event->pmu); + + reg =3D ALI_DRW_PMU_EVENT_SELn(counter); + val =3D readl(drw_pmu->cfg_base + reg); + subval =3D FIELD_PREP(ALI_DRW_PMCOM_CNT_EN, 1) | + FIELD_PREP(ALI_DRW_PMCOM_CNT_EVENT_MASK, drw_pmu->evtids[counter]); + + shift =3D ALI_DRW_PMCOM_CNT_EVENT_OFFSET(counter); + val &=3D ~(GENMASK(7, 0) << shift); + val |=3D subval << shift; + + writel(val, drw_pmu->cfg_base + reg); +} + +static void ali_drw_pmu_disable_counter(struct perf_event *event) +{ + u32 val, reg, subval, shift; + struct ali_drw_pmu *drw_pmu =3D to_ali_drw_pmu(event->pmu); + int counter =3D event->hw.idx; + + reg =3D ALI_DRW_PMU_EVENT_SELn(counter); + val =3D readl(drw_pmu->cfg_base + reg); + subval =3D FIELD_PREP(ALI_DRW_PMCOM_CNT_EN, 0) | + FIELD_PREP(ALI_DRW_PMCOM_CNT_EVENT_MASK, 0); + + shift =3D ALI_DRW_PMCOM_CNT_EVENT_OFFSET(counter); + val &=3D ~(GENMASK(7, 0) << shift); + val |=3D subval << shift; + + writel(val, drw_pmu->cfg_base + reg); +} + +static irqreturn_t ali_drw_pmu_isr(int irq_num, void *data) +{ + struct ali_drw_pmu_irq *irq =3D data; + struct ali_drw_pmu *drw_pmu; + irqreturn_t ret =3D IRQ_NONE; + + rcu_read_lock(); + list_for_each_entry_rcu(drw_pmu, &irq->pmus_node, pmus_node) { + unsigned long status, clr_status; + struct perf_event *event; + unsigned int idx; + + for (idx =3D 0; idx < ALI_DRW_PMU_COMMON_MAX_COUNTERS; idx++) { + event =3D drw_pmu->events[idx]; + if (!event) + continue; + ali_drw_pmu_disable_counter(event); + } + + /* common counter intr status */ + status =3D readl(drw_pmu->cfg_base + ALI_DRW_PMU_OV_INTR_STATUS); + status =3D FIELD_GET(ALI_DRW_PMCOM_CNT_OV_INTR_MASK, status); + if (status) { + for_each_set_bit(idx, &status, + ALI_DRW_PMU_COMMON_MAX_COUNTERS) { + event =3D drw_pmu->events[idx]; + if (WARN_ON_ONCE(!event)) + continue; + ali_drw_pmu_event_update(event); + ali_drw_pmu_event_set_period(event); + } + + /* clear common counter intr status */ + clr_status =3D FIELD_PREP(ALI_DRW_PMCOM_CNT_OV_INTR_MASK, 1); + writel(clr_status, + drw_pmu->cfg_base + ALI_DRW_PMU_OV_INTR_CLR); + } + + for (idx =3D 0; idx < ALI_DRW_PMU_COMMON_MAX_COUNTERS; idx++) { + event =3D drw_pmu->events[idx]; + if (!event) + continue; + if (!(event->hw.state & PERF_HES_STOPPED)) + ali_drw_pmu_enable_counter(event); + } + if (status) + ret =3D IRQ_HANDLED; + } + rcu_read_unlock(); + return ret; +} + +static struct ali_drw_pmu_irq *__ali_drw_pmu_init_irq(struct platform_devi= ce + *pdev, int irq_num) +{ + int ret; + struct ali_drw_pmu_irq *irq; + + list_for_each_entry(irq, &ali_drw_pmu_irqs, irqs_node) { + if (irq->irq_num =3D=3D irq_num + && refcount_inc_not_zero(&irq->refcount)) + return irq; + } + + irq =3D kzalloc(sizeof(*irq), GFP_KERNEL); + if (!irq) + return ERR_PTR(-ENOMEM); + + INIT_LIST_HEAD(&irq->pmus_node); + + /* Pick one CPU to be the preferred one to use */ + irq->cpu =3D smp_processor_id(); + refcount_set(&irq->refcount, 1); + + /* + * FIXME: one of DDRSS Driveway PMU overflow interrupt shares the same + * irq number with MPAM ERR_IRQ. To register DDRSS PMU and MPAM drivers + * successfully, add IRQF_SHARED flag. Howerer, PMU interrupt should not + * share with other component. + */ + ret =3D devm_request_irq(&pdev->dev, irq_num, ali_drw_pmu_isr, + IRQF_SHARED, dev_name(&pdev->dev), irq); + if (ret < 0) { + dev_err(&pdev->dev, + "Fail to request IRQ:%d ret:%d\n", irq_num, ret); + goto out_free; + } + + ret =3D irq_set_affinity_hint(irq_num, cpumask_of(irq->cpu)); + if (ret) + goto out_free; + + ret =3D cpuhp_state_add_instance_nocalls(ali_drw_cpuhp_state_num, + &irq->node); + if (ret) + goto out_free; + + irq->irq_num =3D irq_num; + list_add(&irq->irqs_node, &ali_drw_pmu_irqs); + + return irq; + +out_free: + kfree(irq); + return ERR_PTR(ret); +} + +static int ali_drw_pmu_init_irq(struct ali_drw_pmu *drw_pmu, + struct platform_device *pdev) +{ + int irq_num; + struct ali_drw_pmu_irq *irq; + + /* Read and init IRQ */ + irq_num =3D platform_get_irq(pdev, 0); + if (irq_num < 0) + return irq_num; + + mutex_lock(&ali_drw_pmu_irqs_lock); + irq =3D __ali_drw_pmu_init_irq(pdev, irq_num); + mutex_unlock(&ali_drw_pmu_irqs_lock); + + if (IS_ERR(irq)) + return PTR_ERR(irq); + + drw_pmu->irq =3D irq; + + mutex_lock(&ali_drw_pmu_irqs_lock); + list_add_rcu(&drw_pmu->pmus_node, &irq->pmus_node); + mutex_unlock(&ali_drw_pmu_irqs_lock); + + return 0; +} + +static void ali_drw_pmu_uninit_irq(struct ali_drw_pmu *drw_pmu) +{ + struct ali_drw_pmu_irq *irq =3D drw_pmu->irq; + + mutex_lock(&ali_drw_pmu_irqs_lock); + list_del_rcu(&drw_pmu->pmus_node); + + if (!refcount_dec_and_test(&irq->refcount)) { + mutex_unlock(&ali_drw_pmu_irqs_lock); + return; + } + + list_del(&irq->irqs_node); + mutex_unlock(&ali_drw_pmu_irqs_lock); + + WARN_ON(irq_set_affinity_hint(irq->irq_num, NULL)); + cpuhp_state_remove_instance_nocalls(ali_drw_cpuhp_state_num, + &irq->node); + kfree(irq); +} + +static int ali_drw_pmu_event_init(struct perf_event *event) +{ + struct ali_drw_pmu *drw_pmu =3D to_ali_drw_pmu(event->pmu); + struct hw_perf_event *hwc =3D &event->hw; + struct perf_event *sibling; + struct device *dev =3D drw_pmu->pmu.dev; + + if (event->attr.type !=3D event->pmu->type) + return -ENOENT; + + if (is_sampling_event(event)) { + dev_err(dev, "Sampling not supported!\n"); + return -EOPNOTSUPP; + } + + if (event->attach_state & PERF_ATTACH_TASK) { + dev_err(dev, "Per-task counter cannot allocate!\n"); + return -EOPNOTSUPP; + } + + event->cpu =3D drw_pmu->cpu; + if (event->cpu < 0) { + dev_err(dev, "Per-task mode not supported!\n"); + return -EOPNOTSUPP; + } + + if (event->group_leader !=3D event && + !is_software_event(event->group_leader)) { + dev_err(dev, "driveway only allow one event!\n"); + return -EINVAL; + } + + for_each_sibling_event(sibling, event->group_leader) { + if (sibling !=3D event && !is_software_event(sibling)) { + dev_err(dev, "driveway event not allowed!\n"); + return -EINVAL; + } + } + + /* reset all the pmu counters */ + writel(ALI_DRW_PMU_CNT_RST, drw_pmu->cfg_base + ALI_DRW_PMU_CNT_CTRL); + + hwc->idx =3D -1; + + return 0; +} + +static void ali_drw_pmu_start(struct perf_event *event, int flags) +{ + struct ali_drw_pmu *drw_pmu =3D to_ali_drw_pmu(event->pmu); + + event->hw.state =3D 0; + + if (GET_DRW_EVENTID(event) =3D=3D ALI_DRW_PMU_CYCLE_EVT_ID) { + writel(ALI_DRW_PMU_CNT_START, + drw_pmu->cfg_base + ALI_DRW_PMU_CNT_CTRL); + return; + } + + ali_drw_pmu_event_set_period(event); + if (flags & PERF_EF_RELOAD) { + unsigned long prev_raw_count =3D + local64_read(&event->hw.prev_count); + writel(prev_raw_count, + drw_pmu->cfg_base + ALI_DRW_PMU_CNT_PRELOAD); + } + + ali_drw_pmu_enable_counter(event); + + writel(ALI_DRW_PMU_CNT_START, drw_pmu->cfg_base + ALI_DRW_PMU_CNT_CTRL); +} + +static void ali_drw_pmu_stop(struct perf_event *event, int flags) +{ + struct ali_drw_pmu *drw_pmu =3D to_ali_drw_pmu(event->pmu); + + if (event->hw.state & PERF_HES_STOPPED) + return; + + if (GET_DRW_EVENTID(event) !=3D ALI_DRW_PMU_CYCLE_EVT_ID) + ali_drw_pmu_disable_counter(event); + + writel(ALI_DRW_PMU_CNT_STOP, drw_pmu->cfg_base + ALI_DRW_PMU_CNT_CTRL); + + ali_drw_pmu_event_update(event); + event->hw.state |=3D PERF_HES_STOPPED | PERF_HES_UPTODATE; +} + +static int ali_drw_pmu_add(struct perf_event *event, int flags) +{ + struct ali_drw_pmu *drw_pmu =3D to_ali_drw_pmu(event->pmu); + struct hw_perf_event *hwc =3D &event->hw; + int idx =3D -1; + int evtid; + + evtid =3D GET_DRW_EVENTID(event); + + if (evtid !=3D ALI_DRW_PMU_CYCLE_EVT_ID) { + idx =3D ali_drw_get_counter_idx(event); + if (idx < 0) + return idx; + drw_pmu->events[idx] =3D event; + drw_pmu->evtids[idx] =3D evtid; + } + hwc->idx =3D idx; + + hwc->state =3D PERF_HES_STOPPED | PERF_HES_UPTODATE; + + if (flags & PERF_EF_START) + ali_drw_pmu_start(event, PERF_EF_RELOAD); + + /* Propagate our changes to the userspace mapping. */ + perf_event_update_userpage(event); + + return 0; +} + +static void ali_drw_pmu_del(struct perf_event *event, int flags) +{ + struct ali_drw_pmu *drw_pmu =3D to_ali_drw_pmu(event->pmu); + struct hw_perf_event *hwc =3D &event->hw; + int idx =3D hwc->idx; + + ali_drw_pmu_stop(event, PERF_EF_UPDATE); + + if (idx >=3D 0 && idx < ALI_DRW_PMU_COMMON_MAX_COUNTERS) { + drw_pmu->events[idx] =3D NULL; + drw_pmu->evtids[idx] =3D 0; + clear_bit(idx, drw_pmu->used_mask); + } + + perf_event_update_userpage(event); +} + +static void ali_drw_pmu_read(struct perf_event *event) +{ + ali_drw_pmu_event_update(event); +} + +static int ali_drw_pmu_probe(struct platform_device *pdev) +{ + struct ali_drw_pmu *drw_pmu; + struct resource *res; + char *name; + int ret; + + drw_pmu =3D devm_kzalloc(&pdev->dev, sizeof(*drw_pmu), GFP_KERNEL); + if (!drw_pmu) + return -ENOMEM; + + drw_pmu->dev =3D &pdev->dev; + platform_set_drvdata(pdev, drw_pmu); + + res =3D platform_get_resource(pdev, IORESOURCE_MEM, 0); + drw_pmu->cfg_base =3D devm_ioremap_resource(&pdev->dev, res); + if (!drw_pmu->cfg_base) + return -ENOMEM; + + name =3D devm_kasprintf(drw_pmu->dev, GFP_KERNEL, "ali_drw_%llx", + (u64) (res->start >> ALI_DRW_PMU_PA_SHIFT)); + if (!name) + return -ENOMEM; + + writel(ALI_DRW_PMU_CNT_RST, drw_pmu->cfg_base + ALI_DRW_PMU_CNT_CTRL); + + /* enable the generation of interrupt by all common counters */ + writel(ALI_DRW_PMCOM_CNT_OV_INTR_MASK, + drw_pmu->cfg_base + ALI_DRW_PMU_OV_INTR_ENABLE_CTL); + + /* clearing interrupt status */ + writel(0xffffff, drw_pmu->cfg_base + ALI_DRW_PMU_OV_INTR_CLR); + + drw_pmu->cpu =3D smp_processor_id(); + + ret =3D ali_drw_pmu_init_irq(drw_pmu, pdev); + if (ret) + return ret; + + drw_pmu->pmu =3D (struct pmu) { + .module =3D THIS_MODULE, + .task_ctx_nr =3D perf_invalid_context, + .event_init =3D ali_drw_pmu_event_init, + .add =3D ali_drw_pmu_add, + .del =3D ali_drw_pmu_del, + .start =3D ali_drw_pmu_start, + .stop =3D ali_drw_pmu_stop, + .read =3D ali_drw_pmu_read, + .attr_groups =3D ali_drw_pmu_attr_groups, + .capabilities =3D PERF_PMU_CAP_NO_EXCLUDE, + }; + + ret =3D perf_pmu_register(&drw_pmu->pmu, name, -1); + if (ret) { + dev_err(drw_pmu->dev, "DRW Driveway PMU PMU register failed!\n"); + ali_drw_pmu_uninit_irq(drw_pmu); + } + + return ret; +} + +static int ali_drw_pmu_remove(struct platform_device *pdev) +{ + struct ali_drw_pmu *drw_pmu =3D platform_get_drvdata(pdev); + + /* disable the generation of interrupt by all common counters */ + writel(ALI_DRW_PMCOM_CNT_OV_INTR_MASK, + drw_pmu->cfg_base + ALI_DRW_PMU_OV_INTR_DISABLE_CTL); + + ali_drw_pmu_uninit_irq(drw_pmu); + perf_pmu_unregister(&drw_pmu->pmu); + + return 0; +} + +static int ali_drw_pmu_offline_cpu(unsigned int cpu, struct hlist_node *no= de) +{ + struct ali_drw_pmu_irq *irq; + struct ali_drw_pmu *drw_pmu; + unsigned int target; + int ret; + cpumask_t node_online_cpus; + + irq =3D hlist_entry_safe(node, struct ali_drw_pmu_irq, node); + if (cpu !=3D irq->cpu) + return 0; + + ret =3D cpumask_and(&node_online_cpus, + cpumask_of_node(cpu_to_node(cpu)), cpu_online_mask); + if (ret) + target =3D cpumask_any_but(&node_online_cpus, cpu); + else + target =3D cpumask_any_but(cpu_online_mask, cpu); + + if (target >=3D nr_cpu_ids) + return 0; + + /* We're only reading, but this isn't the place to be involving RCU */ + mutex_lock(&ali_drw_pmu_irqs_lock); + list_for_each_entry(drw_pmu, &irq->pmus_node, pmus_node) + perf_pmu_migrate_context(&drw_pmu->pmu, irq->cpu, target); + mutex_unlock(&ali_drw_pmu_irqs_lock); + + WARN_ON(irq_set_affinity_hint(irq->irq_num, cpumask_of(target))); + irq->cpu =3D target; + + return 0; +} + +/* + * Due to historical reasons, the HID used in the production environment is + * ARMHD700, so we leave ARMHD700 as Compatible ID. + */ +static const struct acpi_device_id ali_drw_acpi_match[] =3D { + {"BABA5000", 0}, + {"ARMHD700", 0}, + {} +}; + +MODULE_DEVICE_TABLE(acpi, ali_drw_acpi_match); + +static struct platform_driver ali_drw_pmu_driver =3D { + .driver =3D { + .name =3D "ali_drw_pmu", + .acpi_match_table =3D ali_drw_acpi_match, + }, + .probe =3D ali_drw_pmu_probe, + .remove =3D ali_drw_pmu_remove, +}; + +static int __init ali_drw_pmu_init(void) +{ + int ret; + + ret =3D cpuhp_setup_state_multi(CPUHP_AP_ONLINE_DYN, + "ali_drw_pmu:online", + NULL, ali_drw_pmu_offline_cpu); + + if (ret < 0) { + pr_err("DRW Driveway PMU: setup hotplug failed, ret =3D %d\n", + ret); + return ret; + } + ali_drw_cpuhp_state_num =3D ret; + + ret =3D platform_driver_register(&ali_drw_pmu_driver); + if (ret) + cpuhp_remove_multi_state(ali_drw_cpuhp_state_num); + + return ret; +} + +static void __exit ali_drw_pmu_exit(void) +{ + platform_driver_unregister(&ali_drw_pmu_driver); + cpuhp_remove_multi_state(ali_drw_cpuhp_state_num); +} + +module_init(ali_drw_pmu_init); +module_exit(ali_drw_pmu_exit); + +MODULE_AUTHOR("Hongbo Yao "); +MODULE_AUTHOR("Neng Chen "); +MODULE_AUTHOR("Shuai Xue "); +MODULE_DESCRIPTION("Alibaba DDR Sub-System Driveway PMU driver"); +MODULE_LICENSE("GPL v2"); --=20 2.20.1.12.g72788fdb From nobody Mon Apr 27 03:25:06 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 180D4C25B08 for ; Thu, 18 Aug 2022 03:18:57 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S243081AbiHRDSy (ORCPT ); Wed, 17 Aug 2022 23:18:54 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56364 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S243044AbiHRDSl (ORCPT ); Wed, 17 Aug 2022 23:18:41 -0400 Received: from out30-43.freemail.mail.aliyun.com (out30-43.freemail.mail.aliyun.com [115.124.30.43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 1D27F15735 for ; Wed, 17 Aug 2022 20:18:35 -0700 (PDT) X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R101e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=ay29a033018045192;MF=xueshuai@linux.alibaba.com;NM=1;PH=DS;RN=10;SR=0;TI=SMTPD_---0VMZAdEb_1660792711; Received: from localhost.localdomain(mailfrom:xueshuai@linux.alibaba.com fp:SMTPD_---0VMZAdEb_1660792711) by smtp.aliyun-inc.com; Thu, 18 Aug 2022 11:18:31 +0800 From: Shuai Xue To: will@kernel.org, Jonathan.Cameron@Huawei.com, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org Cc: rdunlap@infradead.org, robin.murphy@arm.com, mark.rutland@arm.com, baolin.wang@linux.alibaba.com, zhuo.song@linux.alibaba.com, xueshuai@linux.alibaba.com Subject: [PATCH v4 2/3] drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710 SoC Date: Thu, 18 Aug 2022 11:18:21 +0800 Message-Id: <20220818031822.38415-3-xueshuai@linux.alibaba.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20220617111825.92911-1-xueshuai@linux.alibaba.com> References: <20220617111825.92911-1-xueshuai@linux.alibaba.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" Add the DDR Sub-System Driveway Performance Monitoring Unit (PMU) driver support for Alibaba T-Head Yitian 710 SoC chip. Yitian supports DDR5/4 DRAM and targets cloud computing and HPC. Each PMU is registered as a device in /sys/bus/event_source/devices, and users can select event to monitor in each sub-channel, independently. For example, ali_drw_21000 and ali_drw_21080 are two PMU devices for two sub-channels of the same channel in die 0. And the PMU device of die 1 is prefixed with ali_drw_400XXXXX, e.g. ali_drw_40021000. Due to hardware limitation, one of DDRSS Driveway PMU overflow interrupt shares the same irq number with MPAM ERR_IRQ. To register DDRSS PMU and MPAM drivers successfully, add IRQF_SHARED flag. Signed-off-by: Shuai Xue Co-developed-by: Hongbo Yao Signed-off-by: Hongbo Yao Co-developed-by: Neng Chen Signed-off-by: Neng Chen Reviewed-by: Jonathan Cameron Reviewed-by: Baolin Wang --- drivers/perf/Kconfig | 7 + drivers/perf/Makefile | 1 + drivers/perf/alibaba_uncore_drw_pmu.c | 810 ++++++++++++++++++++++++++ 3 files changed, 818 insertions(+) create mode 100644 drivers/perf/alibaba_uncore_drw_pmu.c diff --git a/drivers/perf/Kconfig b/drivers/perf/Kconfig index 1e2d69453771..44c07ea487f4 100644 --- a/drivers/perf/Kconfig +++ b/drivers/perf/Kconfig @@ -183,6 +183,13 @@ config APPLE_M1_CPU_PMU Provides support for the non-architectural CPU PMUs present on the Apple M1 SoCs and derivatives. =20 +config ALIBABA_UNCORE_DRW_PMU + tristate "Alibaba T-Head Yitian 710 DDR Sub-system Driveway PMU driver" + depends on ARM64 || COMPILE_TEST + help + Support for Driveway PMU events monitoring on Yitian 710 DDR + Sub-system. + source "drivers/perf/hisilicon/Kconfig" =20 config MARVELL_CN10K_DDR_PMU diff --git a/drivers/perf/Makefile b/drivers/perf/Makefile index 57a279c61df5..050d04ee19dd 100644 --- a/drivers/perf/Makefile +++ b/drivers/perf/Makefile @@ -20,3 +20,4 @@ obj-$(CONFIG_ARM_DMC620_PMU) +=3D arm_dmc620_pmu.o obj-$(CONFIG_MARVELL_CN10K_TAD_PMU) +=3D marvell_cn10k_tad_pmu.o obj-$(CONFIG_MARVELL_CN10K_DDR_PMU) +=3D marvell_cn10k_ddr_pmu.o obj-$(CONFIG_APPLE_M1_CPU_PMU) +=3D apple_m1_cpu_pmu.o +obj-$(CONFIG_ALIBABA_UNCORE_DRW_PMU) +=3D alibaba_uncore_drw_pmu.o diff --git a/drivers/perf/alibaba_uncore_drw_pmu.c b/drivers/perf/alibaba_u= ncore_drw_pmu.c new file mode 100644 index 000000000000..82729b874f09 --- /dev/null +++ b/drivers/perf/alibaba_uncore_drw_pmu.c @@ -0,0 +1,810 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Alibaba DDR Sub-System Driveway PMU driver + * + * Copyright (C) 2022 Alibaba Inc + */ + +#define ALI_DRW_PMUNAME "ali_drw" +#define ALI_DRW_DRVNAME ALI_DRW_PMUNAME "_pmu" +#define pr_fmt(fmt) ALI_DRW_DRVNAME ": " fmt + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + + +#define ALI_DRW_PMU_COMMON_MAX_COUNTERS 16 +#define ALI_DRW_PMU_TEST_SEL_COMMON_COUNTER_BASE 19 + +#define ALI_DRW_PMU_PA_SHIFT 12 +#define ALI_DRW_PMU_CNT_INIT 0x00000000 +#define ALI_DRW_CNT_MAX_PERIOD 0xffffffff +#define ALI_DRW_PMU_CYCLE_EVT_ID 0x80 + +#define ALI_DRW_PMU_CNT_CTRL 0xC00 +#define ALI_DRW_PMU_CNT_RST BIT(2) +#define ALI_DRW_PMU_CNT_STOP BIT(1) +#define ALI_DRW_PMU_CNT_START BIT(0) + +#define ALI_DRW_PMU_CNT_STATE 0xC04 +#define ALI_DRW_PMU_TEST_CTRL 0xC08 +#define ALI_DRW_PMU_CNT_PRELOAD 0xC0C + +#define ALI_DRW_PMU_CYCLE_CNT_HIGH_MASK GENMASK(23, 0) +#define ALI_DRW_PMU_CYCLE_CNT_LOW_MASK GENMASK(31, 0) +#define ALI_DRW_PMU_CYCLE_CNT_HIGH 0xC10 +#define ALI_DRW_PMU_CYCLE_CNT_LOW 0xC14 + +/* PMU EVENT SEL 0-3 are paired in 32-bit registers on a 4-byte stride */ +#define ALI_DRW_PMU_EVENT_SEL0 0xC68 +/* counter 0-3 use sel0, counter 4-7 use sel1...*/ +#define ALI_DRW_PMU_EVENT_SELn(n) \ + (ALI_DRW_PMU_EVENT_SEL0 + (n / 4) * 0x4) +#define ALI_DRW_PMCOM_CNT_EN BIT(7) +#define ALI_DRW_PMCOM_CNT_EVENT_MASK GENMASK(5, 0) +#define ALI_DRW_PMCOM_CNT_EVENT_OFFSET(n) \ + (8 * (n % 4)) + +/* PMU COMMON COUNTER 0-15, are paired in 32-bit registers on a 4-byte str= ide */ +#define ALI_DRW_PMU_COMMON_COUNTER0 0xC78 +#define ALI_DRW_PMU_COMMON_COUNTERn(n) \ + (ALI_DRW_PMU_COMMON_COUNTER0 + 0x4 * (n)) + +#define ALI_DRW_PMU_OV_INTR_ENABLE_CTL 0xCB8 +#define ALI_DRW_PMU_OV_INTR_DISABLE_CTL 0xCBC +#define ALI_DRW_PMU_OV_INTR_ENABLE_STATUS 0xCC0 +#define ALI_DRW_PMU_OV_INTR_CLR 0xCC4 +#define ALI_DRW_PMU_OV_INTR_STATUS 0xCC8 +#define ALI_DRW_PMCOM_CNT_OV_INTR_MASK GENMASK(23, 8) +#define ALI_DRW_PMBW_CNT_OV_INTR_MASK GENMASK(7, 0) +#define ALI_DRW_PMU_OV_INTR_MASK GENMASK_ULL(63, 0) + +static int ali_drw_cpuhp_state_num; + +static LIST_HEAD(ali_drw_pmu_irqs); +static DEFINE_MUTEX(ali_drw_pmu_irqs_lock); + +struct ali_drw_pmu_irq { + struct hlist_node node; + struct list_head irqs_node; + struct list_head pmus_node; + int irq_num; + int cpu; + refcount_t refcount; +}; + +struct ali_drw_pmu { + void __iomem *cfg_base; + struct device *dev; + + struct list_head pmus_node; + struct ali_drw_pmu_irq *irq; + int irq_num; + int cpu; + DECLARE_BITMAP(used_mask, ALI_DRW_PMU_COMMON_MAX_COUNTERS); + struct perf_event *events[ALI_DRW_PMU_COMMON_MAX_COUNTERS]; + int evtids[ALI_DRW_PMU_COMMON_MAX_COUNTERS]; + + struct pmu pmu; +}; + +#define to_ali_drw_pmu(p) (container_of(p, struct ali_drw_pmu, pmu)) + +#define DRW_CONFIG_EVENTID GENMASK(7, 0) +#define GET_DRW_EVENTID(event) FIELD_GET(DRW_CONFIG_EVENTID, (event)->attr= .config) + +static ssize_t ali_drw_pmu_format_show(struct device *dev, + struct device_attribute *attr, char *buf) +{ + struct dev_ext_attribute *eattr; + + eattr =3D container_of(attr, struct dev_ext_attribute, attr); + + return sprintf(buf, "%s\n", (char *)eattr->var); +} + +/* + * PMU event attributes + */ +static ssize_t ali_drw_pmu_event_show(struct device *dev, + struct device_attribute *attr, char *page) +{ + struct dev_ext_attribute *eattr; + + eattr =3D container_of(attr, struct dev_ext_attribute, attr); + + return sprintf(page, "config=3D0x%lx\n", (unsigned long)eattr->var); +} + +#define ALI_DRW_PMU_ATTR(_name, _func, _config) = \ + (&((struct dev_ext_attribute[]) { \ + { __ATTR(_name, 0444, _func, NULL), (void *)_config } \ + })[0].attr.attr) + +#define ALI_DRW_PMU_FORMAT_ATTR(_name, _config) \ + ALI_DRW_PMU_ATTR(_name, ali_drw_pmu_format_show, (void *)_config) +#define ALI_DRW_PMU_EVENT_ATTR(_name, _config) \ + ALI_DRW_PMU_ATTR(_name, ali_drw_pmu_event_show, (unsigned long)_config) + +static struct attribute *ali_drw_pmu_events_attrs[] =3D { + ALI_DRW_PMU_EVENT_ATTR(hif_rd_or_wr, 0x0), + ALI_DRW_PMU_EVENT_ATTR(hif_wr, 0x1), + ALI_DRW_PMU_EVENT_ATTR(hif_rd, 0x2), + ALI_DRW_PMU_EVENT_ATTR(hif_rmw, 0x3), + ALI_DRW_PMU_EVENT_ATTR(hif_hi_pri_rd, 0x4), + ALI_DRW_PMU_EVENT_ATTR(dfi_wr_data_cycles, 0x7), + ALI_DRW_PMU_EVENT_ATTR(dfi_rd_data_cycles, 0x8), + ALI_DRW_PMU_EVENT_ATTR(hpr_xact_when_critical, 0x9), + ALI_DRW_PMU_EVENT_ATTR(lpr_xact_when_critical, 0xA), + ALI_DRW_PMU_EVENT_ATTR(wr_xact_when_critical, 0xB), + ALI_DRW_PMU_EVENT_ATTR(op_is_activate, 0xC), + ALI_DRW_PMU_EVENT_ATTR(op_is_rd_or_wr, 0xD), + ALI_DRW_PMU_EVENT_ATTR(op_is_rd_activate, 0xE), + ALI_DRW_PMU_EVENT_ATTR(op_is_rd, 0xF), + ALI_DRW_PMU_EVENT_ATTR(op_is_wr, 0x10), + ALI_DRW_PMU_EVENT_ATTR(op_is_mwr, 0x11), + ALI_DRW_PMU_EVENT_ATTR(op_is_precharge, 0x12), + ALI_DRW_PMU_EVENT_ATTR(precharge_for_rdwr, 0x13), + ALI_DRW_PMU_EVENT_ATTR(precharge_for_other, 0x14), + ALI_DRW_PMU_EVENT_ATTR(rdwr_transitions, 0x15), + ALI_DRW_PMU_EVENT_ATTR(write_combine, 0x16), + ALI_DRW_PMU_EVENT_ATTR(war_hazard, 0x17), + ALI_DRW_PMU_EVENT_ATTR(raw_hazard, 0x18), + ALI_DRW_PMU_EVENT_ATTR(waw_hazard, 0x19), + ALI_DRW_PMU_EVENT_ATTR(op_is_enter_selfref_rk0, 0x1A), + ALI_DRW_PMU_EVENT_ATTR(op_is_enter_selfref_rk1, 0x1B), + ALI_DRW_PMU_EVENT_ATTR(op_is_enter_selfref_rk2, 0x1C), + ALI_DRW_PMU_EVENT_ATTR(op_is_enter_selfref_rk3, 0x1D), + ALI_DRW_PMU_EVENT_ATTR(op_is_enter_powerdown_rk0, 0x1E), + ALI_DRW_PMU_EVENT_ATTR(op_is_enter_powerdown_rk1, 0x1F), + ALI_DRW_PMU_EVENT_ATTR(op_is_enter_powerdown_rk2, 0x20), + ALI_DRW_PMU_EVENT_ATTR(op_is_enter_powerdown_rk3, 0x21), + ALI_DRW_PMU_EVENT_ATTR(selfref_mode_rk0, 0x26), + ALI_DRW_PMU_EVENT_ATTR(selfref_mode_rk1, 0x27), + ALI_DRW_PMU_EVENT_ATTR(selfref_mode_rk2, 0x28), + ALI_DRW_PMU_EVENT_ATTR(selfref_mode_rk3, 0x29), + ALI_DRW_PMU_EVENT_ATTR(op_is_refresh, 0x2A), + ALI_DRW_PMU_EVENT_ATTR(op_is_crit_ref, 0x2B), + ALI_DRW_PMU_EVENT_ATTR(op_is_load_mode, 0x2D), + ALI_DRW_PMU_EVENT_ATTR(op_is_zqcl, 0x2E), + ALI_DRW_PMU_EVENT_ATTR(visible_window_limit_reached_rd, 0x30), + ALI_DRW_PMU_EVENT_ATTR(visible_window_limit_reached_wr, 0x31), + ALI_DRW_PMU_EVENT_ATTR(op_is_dqsosc_mpc, 0x34), + ALI_DRW_PMU_EVENT_ATTR(op_is_dqsosc_mrr, 0x35), + ALI_DRW_PMU_EVENT_ATTR(op_is_tcr_mrr, 0x36), + ALI_DRW_PMU_EVENT_ATTR(op_is_zqstart, 0x37), + ALI_DRW_PMU_EVENT_ATTR(op_is_zqlatch, 0x38), + ALI_DRW_PMU_EVENT_ATTR(chi_txreq, 0x39), + ALI_DRW_PMU_EVENT_ATTR(chi_txdat, 0x3A), + ALI_DRW_PMU_EVENT_ATTR(chi_rxdat, 0x3B), + ALI_DRW_PMU_EVENT_ATTR(chi_rxrsp, 0x3C), + ALI_DRW_PMU_EVENT_ATTR(tsz_vio, 0x3D), + ALI_DRW_PMU_EVENT_ATTR(cycle, 0x80), + NULL, +}; + +static struct attribute_group ali_drw_pmu_events_attr_group =3D { + .name =3D "events", + .attrs =3D ali_drw_pmu_events_attrs, +}; + +static struct attribute *ali_drw_pmu_format_attr[] =3D { + ALI_DRW_PMU_FORMAT_ATTR(event, "config:0-7"), + NULL, +}; + +static const struct attribute_group ali_drw_pmu_format_group =3D { + .name =3D "format", + .attrs =3D ali_drw_pmu_format_attr, +}; + +static ssize_t ali_drw_pmu_cpumask_show(struct device *dev, + struct device_attribute *attr, + char *buf) +{ + struct ali_drw_pmu *drw_pmu =3D to_ali_drw_pmu(dev_get_drvdata(dev)); + + return cpumap_print_to_pagebuf(true, buf, cpumask_of(drw_pmu->cpu)); +} + +static struct device_attribute ali_drw_pmu_cpumask_attr =3D + __ATTR(cpumask, 0444, ali_drw_pmu_cpumask_show, NULL); + +static struct attribute *ali_drw_pmu_cpumask_attrs[] =3D { + &ali_drw_pmu_cpumask_attr.attr, + NULL, +}; + +static const struct attribute_group ali_drw_pmu_cpumask_attr_group =3D { + .attrs =3D ali_drw_pmu_cpumask_attrs, +}; + +static const struct attribute_group *ali_drw_pmu_attr_groups[] =3D { + &ali_drw_pmu_events_attr_group, + &ali_drw_pmu_cpumask_attr_group, + &ali_drw_pmu_format_group, + NULL, +}; + +/* find a counter for event, then in add func, hw.idx will equal to counte= r */ +static int ali_drw_get_counter_idx(struct perf_event *event) +{ + struct ali_drw_pmu *drw_pmu =3D to_ali_drw_pmu(event->pmu); + int idx; + + for (idx =3D 0; idx < ALI_DRW_PMU_COMMON_MAX_COUNTERS; ++idx) { + if (!test_and_set_bit(idx, drw_pmu->used_mask)) + return idx; + } + + /* The counters are all in use. */ + return -EBUSY; +} + +static u64 ali_drw_pmu_read_counter(struct perf_event *event) +{ + struct ali_drw_pmu *drw_pmu =3D to_ali_drw_pmu(event->pmu); + u64 cycle_high, cycle_low; + + if (GET_DRW_EVENTID(event) =3D=3D ALI_DRW_PMU_CYCLE_EVT_ID) { + cycle_high =3D readl(drw_pmu->cfg_base + ALI_DRW_PMU_CYCLE_CNT_HIGH); + cycle_high &=3D ALI_DRW_PMU_CYCLE_CNT_HIGH_MASK; + cycle_low =3D readl(drw_pmu->cfg_base + ALI_DRW_PMU_CYCLE_CNT_LOW); + cycle_low &=3D ALI_DRW_PMU_CYCLE_CNT_LOW_MASK; + return (cycle_high << 32 | cycle_low); + } + + return readl(drw_pmu->cfg_base + + ALI_DRW_PMU_COMMON_COUNTERn(event->hw.idx)); +} + +static void ali_drw_pmu_event_update(struct perf_event *event) +{ + struct hw_perf_event *hwc =3D &event->hw; + u64 delta, prev, now; + + do { + prev =3D local64_read(&hwc->prev_count); + now =3D ali_drw_pmu_read_counter(event); + } while (local64_cmpxchg(&hwc->prev_count, prev, now) !=3D prev); + + /* handle overflow. */ + delta =3D now - prev; + if (GET_DRW_EVENTID(event) =3D=3D ALI_DRW_PMU_CYCLE_EVT_ID) + delta &=3D ALI_DRW_PMU_OV_INTR_MASK; + else + delta &=3D ALI_DRW_CNT_MAX_PERIOD; + local64_add(delta, &event->count); +} + +static void ali_drw_pmu_event_set_period(struct perf_event *event) +{ + u64 pre_val; + struct ali_drw_pmu *drw_pmu =3D to_ali_drw_pmu(event->pmu); + + /* set a preload counter for test purpose */ + writel(ALI_DRW_PMU_TEST_SEL_COMMON_COUNTER_BASE + event->hw.idx, + drw_pmu->cfg_base + ALI_DRW_PMU_TEST_CTRL); + + /* set conunter initial value */ + pre_val =3D ALI_DRW_PMU_CNT_INIT; + writel(pre_val, drw_pmu->cfg_base + ALI_DRW_PMU_CNT_PRELOAD); + local64_set(&event->hw.prev_count, pre_val); + + /* set sel mode to zero to start test */ + writel(0x0, drw_pmu->cfg_base + ALI_DRW_PMU_TEST_CTRL); +} + +static void ali_drw_pmu_enable_counter(struct perf_event *event) +{ + u32 val, subval, reg, shift; + int counter =3D event->hw.idx; + struct ali_drw_pmu *drw_pmu =3D to_ali_drw_pmu(event->pmu); + + reg =3D ALI_DRW_PMU_EVENT_SELn(counter); + val =3D readl(drw_pmu->cfg_base + reg); + subval =3D FIELD_PREP(ALI_DRW_PMCOM_CNT_EN, 1) | + FIELD_PREP(ALI_DRW_PMCOM_CNT_EVENT_MASK, drw_pmu->evtids[counter]); + + shift =3D ALI_DRW_PMCOM_CNT_EVENT_OFFSET(counter); + val &=3D ~(GENMASK(7, 0) << shift); + val |=3D subval << shift; + + writel(val, drw_pmu->cfg_base + reg); +} + +static void ali_drw_pmu_disable_counter(struct perf_event *event) +{ + u32 val, reg, subval, shift; + struct ali_drw_pmu *drw_pmu =3D to_ali_drw_pmu(event->pmu); + int counter =3D event->hw.idx; + + reg =3D ALI_DRW_PMU_EVENT_SELn(counter); + val =3D readl(drw_pmu->cfg_base + reg); + subval =3D FIELD_PREP(ALI_DRW_PMCOM_CNT_EN, 0) | + FIELD_PREP(ALI_DRW_PMCOM_CNT_EVENT_MASK, 0); + + shift =3D ALI_DRW_PMCOM_CNT_EVENT_OFFSET(counter); + val &=3D ~(GENMASK(7, 0) << shift); + val |=3D subval << shift; + + writel(val, drw_pmu->cfg_base + reg); +} + +static irqreturn_t ali_drw_pmu_isr(int irq_num, void *data) +{ + struct ali_drw_pmu_irq *irq =3D data; + struct ali_drw_pmu *drw_pmu; + irqreturn_t ret =3D IRQ_NONE; + + rcu_read_lock(); + list_for_each_entry_rcu(drw_pmu, &irq->pmus_node, pmus_node) { + unsigned long status, clr_status; + struct perf_event *event; + unsigned int idx; + + for (idx =3D 0; idx < ALI_DRW_PMU_COMMON_MAX_COUNTERS; idx++) { + event =3D drw_pmu->events[idx]; + if (!event) + continue; + ali_drw_pmu_disable_counter(event); + } + + /* common counter intr status */ + status =3D readl(drw_pmu->cfg_base + ALI_DRW_PMU_OV_INTR_STATUS); + status =3D FIELD_GET(ALI_DRW_PMCOM_CNT_OV_INTR_MASK, status); + if (status) { + for_each_set_bit(idx, &status, + ALI_DRW_PMU_COMMON_MAX_COUNTERS) { + event =3D drw_pmu->events[idx]; + if (WARN_ON_ONCE(!event)) + continue; + ali_drw_pmu_event_update(event); + ali_drw_pmu_event_set_period(event); + } + + /* clear common counter intr status */ + clr_status =3D FIELD_PREP(ALI_DRW_PMCOM_CNT_OV_INTR_MASK, 1); + writel(clr_status, + drw_pmu->cfg_base + ALI_DRW_PMU_OV_INTR_CLR); + } + + for (idx =3D 0; idx < ALI_DRW_PMU_COMMON_MAX_COUNTERS; idx++) { + event =3D drw_pmu->events[idx]; + if (!event) + continue; + if (!(event->hw.state & PERF_HES_STOPPED)) + ali_drw_pmu_enable_counter(event); + } + if (status) + ret =3D IRQ_HANDLED; + } + rcu_read_unlock(); + return ret; +} + +static struct ali_drw_pmu_irq *__ali_drw_pmu_init_irq(struct platform_devi= ce + *pdev, int irq_num) +{ + int ret; + struct ali_drw_pmu_irq *irq; + + list_for_each_entry(irq, &ali_drw_pmu_irqs, irqs_node) { + if (irq->irq_num =3D=3D irq_num + && refcount_inc_not_zero(&irq->refcount)) + return irq; + } + + irq =3D kzalloc(sizeof(*irq), GFP_KERNEL); + if (!irq) + return ERR_PTR(-ENOMEM); + + INIT_LIST_HEAD(&irq->pmus_node); + + /* Pick one CPU to be the preferred one to use */ + irq->cpu =3D smp_processor_id(); + refcount_set(&irq->refcount, 1); + + /* + * FIXME: one of DDRSS Driveway PMU overflow interrupt shares the same + * irq number with MPAM ERR_IRQ. To register DDRSS PMU and MPAM drivers + * successfully, add IRQF_SHARED flag. Howerer, PMU interrupt should not + * share with other component. + */ + ret =3D devm_request_irq(&pdev->dev, irq_num, ali_drw_pmu_isr, + IRQF_SHARED, dev_name(&pdev->dev), irq); + if (ret < 0) { + dev_err(&pdev->dev, + "Fail to request IRQ:%d ret:%d\n", irq_num, ret); + goto out_free; + } + + ret =3D irq_set_affinity_hint(irq_num, cpumask_of(irq->cpu)); + if (ret) + goto out_free; + + ret =3D cpuhp_state_add_instance_nocalls(ali_drw_cpuhp_state_num, + &irq->node); + if (ret) + goto out_free; + + irq->irq_num =3D irq_num; + list_add(&irq->irqs_node, &ali_drw_pmu_irqs); + + return irq; + +out_free: + kfree(irq); + return ERR_PTR(ret); +} + +static int ali_drw_pmu_init_irq(struct ali_drw_pmu *drw_pmu, + struct platform_device *pdev) +{ + int irq_num; + struct ali_drw_pmu_irq *irq; + + /* Read and init IRQ */ + irq_num =3D platform_get_irq(pdev, 0); + if (irq_num < 0) + return irq_num; + + mutex_lock(&ali_drw_pmu_irqs_lock); + irq =3D __ali_drw_pmu_init_irq(pdev, irq_num); + mutex_unlock(&ali_drw_pmu_irqs_lock); + + if (IS_ERR(irq)) + return PTR_ERR(irq); + + drw_pmu->irq =3D irq; + + mutex_lock(&ali_drw_pmu_irqs_lock); + list_add_rcu(&drw_pmu->pmus_node, &irq->pmus_node); + mutex_unlock(&ali_drw_pmu_irqs_lock); + + return 0; +} + +static void ali_drw_pmu_uninit_irq(struct ali_drw_pmu *drw_pmu) +{ + struct ali_drw_pmu_irq *irq =3D drw_pmu->irq; + + mutex_lock(&ali_drw_pmu_irqs_lock); + list_del_rcu(&drw_pmu->pmus_node); + + if (!refcount_dec_and_test(&irq->refcount)) { + mutex_unlock(&ali_drw_pmu_irqs_lock); + return; + } + + list_del(&irq->irqs_node); + mutex_unlock(&ali_drw_pmu_irqs_lock); + + WARN_ON(irq_set_affinity_hint(irq->irq_num, NULL)); + cpuhp_state_remove_instance_nocalls(ali_drw_cpuhp_state_num, + &irq->node); + kfree(irq); +} + +static int ali_drw_pmu_event_init(struct perf_event *event) +{ + struct ali_drw_pmu *drw_pmu =3D to_ali_drw_pmu(event->pmu); + struct hw_perf_event *hwc =3D &event->hw; + struct perf_event *sibling; + struct device *dev =3D drw_pmu->pmu.dev; + + if (event->attr.type !=3D event->pmu->type) + return -ENOENT; + + if (is_sampling_event(event)) { + dev_err(dev, "Sampling not supported!\n"); + return -EOPNOTSUPP; + } + + if (event->attach_state & PERF_ATTACH_TASK) { + dev_err(dev, "Per-task counter cannot allocate!\n"); + return -EOPNOTSUPP; + } + + event->cpu =3D drw_pmu->cpu; + if (event->cpu < 0) { + dev_err(dev, "Per-task mode not supported!\n"); + return -EOPNOTSUPP; + } + + if (event->group_leader !=3D event && + !is_software_event(event->group_leader)) { + dev_err(dev, "driveway only allow one event!\n"); + return -EINVAL; + } + + for_each_sibling_event(sibling, event->group_leader) { + if (sibling !=3D event && !is_software_event(sibling)) { + dev_err(dev, "driveway event not allowed!\n"); + return -EINVAL; + } + } + + /* reset all the pmu counters */ + writel(ALI_DRW_PMU_CNT_RST, drw_pmu->cfg_base + ALI_DRW_PMU_CNT_CTRL); + + hwc->idx =3D -1; + + return 0; +} + +static void ali_drw_pmu_start(struct perf_event *event, int flags) +{ + struct ali_drw_pmu *drw_pmu =3D to_ali_drw_pmu(event->pmu); + + event->hw.state =3D 0; + + if (GET_DRW_EVENTID(event) =3D=3D ALI_DRW_PMU_CYCLE_EVT_ID) { + writel(ALI_DRW_PMU_CNT_START, + drw_pmu->cfg_base + ALI_DRW_PMU_CNT_CTRL); + return; + } + + ali_drw_pmu_event_set_period(event); + if (flags & PERF_EF_RELOAD) { + unsigned long prev_raw_count =3D + local64_read(&event->hw.prev_count); + writel(prev_raw_count, + drw_pmu->cfg_base + ALI_DRW_PMU_CNT_PRELOAD); + } + + ali_drw_pmu_enable_counter(event); + + writel(ALI_DRW_PMU_CNT_START, drw_pmu->cfg_base + ALI_DRW_PMU_CNT_CTRL); +} + +static void ali_drw_pmu_stop(struct perf_event *event, int flags) +{ + struct ali_drw_pmu *drw_pmu =3D to_ali_drw_pmu(event->pmu); + + if (event->hw.state & PERF_HES_STOPPED) + return; + + if (GET_DRW_EVENTID(event) !=3D ALI_DRW_PMU_CYCLE_EVT_ID) + ali_drw_pmu_disable_counter(event); + + writel(ALI_DRW_PMU_CNT_STOP, drw_pmu->cfg_base + ALI_DRW_PMU_CNT_CTRL); + + ali_drw_pmu_event_update(event); + event->hw.state |=3D PERF_HES_STOPPED | PERF_HES_UPTODATE; +} + +static int ali_drw_pmu_add(struct perf_event *event, int flags) +{ + struct ali_drw_pmu *drw_pmu =3D to_ali_drw_pmu(event->pmu); + struct hw_perf_event *hwc =3D &event->hw; + int idx =3D -1; + int evtid; + + evtid =3D GET_DRW_EVENTID(event); + + if (evtid !=3D ALI_DRW_PMU_CYCLE_EVT_ID) { + idx =3D ali_drw_get_counter_idx(event); + if (idx < 0) + return idx; + drw_pmu->events[idx] =3D event; + drw_pmu->evtids[idx] =3D evtid; + } + hwc->idx =3D idx; + + hwc->state =3D PERF_HES_STOPPED | PERF_HES_UPTODATE; + + if (flags & PERF_EF_START) + ali_drw_pmu_start(event, PERF_EF_RELOAD); + + /* Propagate our changes to the userspace mapping. */ + perf_event_update_userpage(event); + + return 0; +} + +static void ali_drw_pmu_del(struct perf_event *event, int flags) +{ + struct ali_drw_pmu *drw_pmu =3D to_ali_drw_pmu(event->pmu); + struct hw_perf_event *hwc =3D &event->hw; + int idx =3D hwc->idx; + + ali_drw_pmu_stop(event, PERF_EF_UPDATE); + + if (idx >=3D 0 && idx < ALI_DRW_PMU_COMMON_MAX_COUNTERS) { + drw_pmu->events[idx] =3D NULL; + drw_pmu->evtids[idx] =3D 0; + clear_bit(idx, drw_pmu->used_mask); + } + + perf_event_update_userpage(event); +} + +static void ali_drw_pmu_read(struct perf_event *event) +{ + ali_drw_pmu_event_update(event); +} + +static int ali_drw_pmu_probe(struct platform_device *pdev) +{ + struct ali_drw_pmu *drw_pmu; + struct resource *res; + char *name; + int ret; + + drw_pmu =3D devm_kzalloc(&pdev->dev, sizeof(*drw_pmu), GFP_KERNEL); + if (!drw_pmu) + return -ENOMEM; + + drw_pmu->dev =3D &pdev->dev; + platform_set_drvdata(pdev, drw_pmu); + + res =3D platform_get_resource(pdev, IORESOURCE_MEM, 0); + drw_pmu->cfg_base =3D devm_ioremap_resource(&pdev->dev, res); + if (!drw_pmu->cfg_base) + return -ENOMEM; + + name =3D devm_kasprintf(drw_pmu->dev, GFP_KERNEL, "ali_drw_%llx", + (u64) (res->start >> ALI_DRW_PMU_PA_SHIFT)); + if (!name) + return -ENOMEM; + + writel(ALI_DRW_PMU_CNT_RST, drw_pmu->cfg_base + ALI_DRW_PMU_CNT_CTRL); + + /* enable the generation of interrupt by all common counters */ + writel(ALI_DRW_PMCOM_CNT_OV_INTR_MASK, + drw_pmu->cfg_base + ALI_DRW_PMU_OV_INTR_ENABLE_CTL); + + /* clearing interrupt status */ + writel(0xffffff, drw_pmu->cfg_base + ALI_DRW_PMU_OV_INTR_CLR); + + drw_pmu->cpu =3D smp_processor_id(); + + ret =3D ali_drw_pmu_init_irq(drw_pmu, pdev); + if (ret) + return ret; + + drw_pmu->pmu =3D (struct pmu) { + .module =3D THIS_MODULE, + .task_ctx_nr =3D perf_invalid_context, + .event_init =3D ali_drw_pmu_event_init, + .add =3D ali_drw_pmu_add, + .del =3D ali_drw_pmu_del, + .start =3D ali_drw_pmu_start, + .stop =3D ali_drw_pmu_stop, + .read =3D ali_drw_pmu_read, + .attr_groups =3D ali_drw_pmu_attr_groups, + .capabilities =3D PERF_PMU_CAP_NO_EXCLUDE, + }; + + ret =3D perf_pmu_register(&drw_pmu->pmu, name, -1); + if (ret) { + dev_err(drw_pmu->dev, "DRW Driveway PMU PMU register failed!\n"); + ali_drw_pmu_uninit_irq(drw_pmu); + } + + return ret; +} + +static int ali_drw_pmu_remove(struct platform_device *pdev) +{ + struct ali_drw_pmu *drw_pmu =3D platform_get_drvdata(pdev); + + /* disable the generation of interrupt by all common counters */ + writel(ALI_DRW_PMCOM_CNT_OV_INTR_MASK, + drw_pmu->cfg_base + ALI_DRW_PMU_OV_INTR_DISABLE_CTL); + + ali_drw_pmu_uninit_irq(drw_pmu); + perf_pmu_unregister(&drw_pmu->pmu); + + return 0; +} + +static int ali_drw_pmu_offline_cpu(unsigned int cpu, struct hlist_node *no= de) +{ + struct ali_drw_pmu_irq *irq; + struct ali_drw_pmu *drw_pmu; + unsigned int target; + int ret; + cpumask_t node_online_cpus; + + irq =3D hlist_entry_safe(node, struct ali_drw_pmu_irq, node); + if (cpu !=3D irq->cpu) + return 0; + + ret =3D cpumask_and(&node_online_cpus, + cpumask_of_node(cpu_to_node(cpu)), cpu_online_mask); + if (ret) + target =3D cpumask_any_but(&node_online_cpus, cpu); + else + target =3D cpumask_any_but(cpu_online_mask, cpu); + + if (target >=3D nr_cpu_ids) + return 0; + + /* We're only reading, but this isn't the place to be involving RCU */ + mutex_lock(&ali_drw_pmu_irqs_lock); + list_for_each_entry(drw_pmu, &irq->pmus_node, pmus_node) + perf_pmu_migrate_context(&drw_pmu->pmu, irq->cpu, target); + mutex_unlock(&ali_drw_pmu_irqs_lock); + + WARN_ON(irq_set_affinity_hint(irq->irq_num, cpumask_of(target))); + irq->cpu =3D target; + + return 0; +} + +/* + * Due to historical reasons, the HID used in the production environment is + * ARMHD700, so we leave ARMHD700 as Compatible ID. + */ +static const struct acpi_device_id ali_drw_acpi_match[] =3D { + {"BABA5000", 0}, + {"ARMHD700", 0}, + {} +}; + +MODULE_DEVICE_TABLE(acpi, ali_drw_acpi_match); + +static struct platform_driver ali_drw_pmu_driver =3D { + .driver =3D { + .name =3D "ali_drw_pmu", + .acpi_match_table =3D ali_drw_acpi_match, + }, + .probe =3D ali_drw_pmu_probe, + .remove =3D ali_drw_pmu_remove, +}; + +static int __init ali_drw_pmu_init(void) +{ + int ret; + + ret =3D cpuhp_setup_state_multi(CPUHP_AP_ONLINE_DYN, + "ali_drw_pmu:online", + NULL, ali_drw_pmu_offline_cpu); + + if (ret < 0) { + pr_err("DRW Driveway PMU: setup hotplug failed, ret =3D %d\n", + ret); + return ret; + } + ali_drw_cpuhp_state_num =3D ret; + + ret =3D platform_driver_register(&ali_drw_pmu_driver); + if (ret) + cpuhp_remove_multi_state(ali_drw_cpuhp_state_num); + + return ret; +} + +static void __exit ali_drw_pmu_exit(void) +{ + platform_driver_unregister(&ali_drw_pmu_driver); + cpuhp_remove_multi_state(ali_drw_cpuhp_state_num); +} + +module_init(ali_drw_pmu_init); +module_exit(ali_drw_pmu_exit); + +MODULE_AUTHOR("Hongbo Yao "); +MODULE_AUTHOR("Neng Chen "); +MODULE_AUTHOR("Shuai Xue "); +MODULE_DESCRIPTION("Alibaba DDR Sub-System Driveway PMU driver"); +MODULE_LICENSE("GPL v2"); --=20 2.20.1.12.g72788fdb From nobody Mon Apr 27 03:25:06 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 06D24C433EF for ; Wed, 20 Jul 2022 06:58:51 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229674AbiGTG6t (ORCPT ); Wed, 20 Jul 2022 02:58:49 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:52226 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237019AbiGTG6n (ORCPT ); Wed, 20 Jul 2022 02:58:43 -0400 Received: from out30-44.freemail.mail.aliyun.com (out30-44.freemail.mail.aliyun.com [115.124.30.44]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 573DC2E9DE for ; Tue, 19 Jul 2022 23:58:41 -0700 (PDT) X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R131e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=ay29a033018046060;MF=xueshuai@linux.alibaba.com;NM=1;PH=DS;RN=11;SR=0;TI=SMTPD_---0VJvhhV3_1658300316; Received: from localhost.localdomain(mailfrom:xueshuai@linux.alibaba.com fp:SMTPD_---0VJvhhV3_1658300316) by smtp.aliyun-inc.com; Wed, 20 Jul 2022 14:58:37 +0800 From: Shuai Xue To: Jonathan.Cameron@Huawei.com, rdunlap@infradead.org, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, will@kernel.org, mark.rutland@arm.com Cc: baolin.wang@linux.alibaba.com, yaohongbo@linux.alibaba.com, nengchen@linux.alibaba.com, zhuo.song@linux.alibaba.com, xueshuai@linux.alibaba.com Subject: [PATCH v3 2/3] drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710 SoC Date: Wed, 20 Jul 2022 14:58:14 +0800 Message-Id: <20220720065815.60772-3-xueshuai@linux.alibaba.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20220617111825.92911-1-xueshuai@linux.alibaba.com> References: <20220617111825.92911-1-xueshuai@linux.alibaba.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" Add the DDR Sub-System Driveway Performance Monitoring Unit (PMU) driver support for Alibaba T-Head Yitian 710 SoC chip. Yitian supports DDR5/4 DRAM and targets cloud computing and HPC. Each PMU is registered as a device in /sys/bus/event_source/devices, and users can select event to monitor in each sub-channel, independently. For example, ali_drw_21000 and ali_drw_21080 are two PMU devices for two sub-channels of the same channel in die 0. And the PMU device of die 1 is prefixed with ali_drw_400XXXXX, e.g. ali_drw_40021000. Due to hardware limitation, one of DDRSS Driveway PMU overflow interrupt shares the same irq number with MPAM ERR_IRQ. To register DDRSS PMU and MPAM drivers successfully, add IRQF_SHARED flag. Signed-off-by: Shuai Xue Co-developed-by: Hongbo Yao Signed-off-by: Hongbo Yao Co-developed-by: Neng Chen Signed-off-by: Neng Chen Reviewed-by: Jonathan Cameron Reviewed-by: Baolin Wang --- drivers/perf/Kconfig | 7 + drivers/perf/Makefile | 1 + drivers/perf/alibaba_uncore_drw_pmu.c | 810 ++++++++++++++++++++++++++ 3 files changed, 818 insertions(+) create mode 100644 drivers/perf/alibaba_uncore_drw_pmu.c diff --git a/drivers/perf/Kconfig b/drivers/perf/Kconfig index 1e2d69453771..44c07ea487f4 100644 --- a/drivers/perf/Kconfig +++ b/drivers/perf/Kconfig @@ -183,6 +183,13 @@ config APPLE_M1_CPU_PMU Provides support for the non-architectural CPU PMUs present on the Apple M1 SoCs and derivatives. =20 +config ALIBABA_UNCORE_DRW_PMU + tristate "Alibaba T-Head Yitian 710 DDR Sub-system Driveway PMU driver" + depends on ARM64 || COMPILE_TEST + help + Support for Driveway PMU events monitoring on Yitian 710 DDR + Sub-system. + source "drivers/perf/hisilicon/Kconfig" =20 config MARVELL_CN10K_DDR_PMU diff --git a/drivers/perf/Makefile b/drivers/perf/Makefile index 57a279c61df5..050d04ee19dd 100644 --- a/drivers/perf/Makefile +++ b/drivers/perf/Makefile @@ -20,3 +20,4 @@ obj-$(CONFIG_ARM_DMC620_PMU) +=3D arm_dmc620_pmu.o obj-$(CONFIG_MARVELL_CN10K_TAD_PMU) +=3D marvell_cn10k_tad_pmu.o obj-$(CONFIG_MARVELL_CN10K_DDR_PMU) +=3D marvell_cn10k_ddr_pmu.o obj-$(CONFIG_APPLE_M1_CPU_PMU) +=3D apple_m1_cpu_pmu.o +obj-$(CONFIG_ALIBABA_UNCORE_DRW_PMU) +=3D alibaba_uncore_drw_pmu.o diff --git a/drivers/perf/alibaba_uncore_drw_pmu.c b/drivers/perf/alibaba_u= ncore_drw_pmu.c new file mode 100644 index 000000000000..82729b874f09 --- /dev/null +++ b/drivers/perf/alibaba_uncore_drw_pmu.c @@ -0,0 +1,810 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Alibaba DDR Sub-System Driveway PMU driver + * + * Copyright (C) 2022 Alibaba Inc + */ + +#define ALI_DRW_PMUNAME "ali_drw" +#define ALI_DRW_DRVNAME ALI_DRW_PMUNAME "_pmu" +#define pr_fmt(fmt) ALI_DRW_DRVNAME ": " fmt + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + + +#define ALI_DRW_PMU_COMMON_MAX_COUNTERS 16 +#define ALI_DRW_PMU_TEST_SEL_COMMON_COUNTER_BASE 19 + +#define ALI_DRW_PMU_PA_SHIFT 12 +#define ALI_DRW_PMU_CNT_INIT 0x00000000 +#define ALI_DRW_CNT_MAX_PERIOD 0xffffffff +#define ALI_DRW_PMU_CYCLE_EVT_ID 0x80 + +#define ALI_DRW_PMU_CNT_CTRL 0xC00 +#define ALI_DRW_PMU_CNT_RST BIT(2) +#define ALI_DRW_PMU_CNT_STOP BIT(1) +#define ALI_DRW_PMU_CNT_START BIT(0) + +#define ALI_DRW_PMU_CNT_STATE 0xC04 +#define ALI_DRW_PMU_TEST_CTRL 0xC08 +#define ALI_DRW_PMU_CNT_PRELOAD 0xC0C + +#define ALI_DRW_PMU_CYCLE_CNT_HIGH_MASK GENMASK(23, 0) +#define ALI_DRW_PMU_CYCLE_CNT_LOW_MASK GENMASK(31, 0) +#define ALI_DRW_PMU_CYCLE_CNT_HIGH 0xC10 +#define ALI_DRW_PMU_CYCLE_CNT_LOW 0xC14 + +/* PMU EVENT SEL 0-3 are paired in 32-bit registers on a 4-byte stride */ +#define ALI_DRW_PMU_EVENT_SEL0 0xC68 +/* counter 0-3 use sel0, counter 4-7 use sel1...*/ +#define ALI_DRW_PMU_EVENT_SELn(n) \ + (ALI_DRW_PMU_EVENT_SEL0 + (n / 4) * 0x4) +#define ALI_DRW_PMCOM_CNT_EN BIT(7) +#define ALI_DRW_PMCOM_CNT_EVENT_MASK GENMASK(5, 0) +#define ALI_DRW_PMCOM_CNT_EVENT_OFFSET(n) \ + (8 * (n % 4)) + +/* PMU COMMON COUNTER 0-15, are paired in 32-bit registers on a 4-byte str= ide */ +#define ALI_DRW_PMU_COMMON_COUNTER0 0xC78 +#define ALI_DRW_PMU_COMMON_COUNTERn(n) \ + (ALI_DRW_PMU_COMMON_COUNTER0 + 0x4 * (n)) + +#define ALI_DRW_PMU_OV_INTR_ENABLE_CTL 0xCB8 +#define ALI_DRW_PMU_OV_INTR_DISABLE_CTL 0xCBC +#define ALI_DRW_PMU_OV_INTR_ENABLE_STATUS 0xCC0 +#define ALI_DRW_PMU_OV_INTR_CLR 0xCC4 +#define ALI_DRW_PMU_OV_INTR_STATUS 0xCC8 +#define ALI_DRW_PMCOM_CNT_OV_INTR_MASK GENMASK(23, 8) +#define ALI_DRW_PMBW_CNT_OV_INTR_MASK GENMASK(7, 0) +#define ALI_DRW_PMU_OV_INTR_MASK GENMASK_ULL(63, 0) + +static int ali_drw_cpuhp_state_num; + +static LIST_HEAD(ali_drw_pmu_irqs); +static DEFINE_MUTEX(ali_drw_pmu_irqs_lock); + +struct ali_drw_pmu_irq { + struct hlist_node node; + struct list_head irqs_node; + struct list_head pmus_node; + int irq_num; + int cpu; + refcount_t refcount; +}; + +struct ali_drw_pmu { + void __iomem *cfg_base; + struct device *dev; + + struct list_head pmus_node; + struct ali_drw_pmu_irq *irq; + int irq_num; + int cpu; + DECLARE_BITMAP(used_mask, ALI_DRW_PMU_COMMON_MAX_COUNTERS); + struct perf_event *events[ALI_DRW_PMU_COMMON_MAX_COUNTERS]; + int evtids[ALI_DRW_PMU_COMMON_MAX_COUNTERS]; + + struct pmu pmu; +}; + +#define to_ali_drw_pmu(p) (container_of(p, struct ali_drw_pmu, pmu)) + +#define DRW_CONFIG_EVENTID GENMASK(7, 0) +#define GET_DRW_EVENTID(event) FIELD_GET(DRW_CONFIG_EVENTID, (event)->attr= .config) + +static ssize_t ali_drw_pmu_format_show(struct device *dev, + struct device_attribute *attr, char *buf) +{ + struct dev_ext_attribute *eattr; + + eattr =3D container_of(attr, struct dev_ext_attribute, attr); + + return sprintf(buf, "%s\n", (char *)eattr->var); +} + +/* + * PMU event attributes + */ +static ssize_t ali_drw_pmu_event_show(struct device *dev, + struct device_attribute *attr, char *page) +{ + struct dev_ext_attribute *eattr; + + eattr =3D container_of(attr, struct dev_ext_attribute, attr); + + return sprintf(page, "config=3D0x%lx\n", (unsigned long)eattr->var); +} + +#define ALI_DRW_PMU_ATTR(_name, _func, _config) = \ + (&((struct dev_ext_attribute[]) { \ + { __ATTR(_name, 0444, _func, NULL), (void *)_config } \ + })[0].attr.attr) + +#define ALI_DRW_PMU_FORMAT_ATTR(_name, _config) \ + ALI_DRW_PMU_ATTR(_name, ali_drw_pmu_format_show, (void *)_config) +#define ALI_DRW_PMU_EVENT_ATTR(_name, _config) \ + ALI_DRW_PMU_ATTR(_name, ali_drw_pmu_event_show, (unsigned long)_config) + +static struct attribute *ali_drw_pmu_events_attrs[] =3D { + ALI_DRW_PMU_EVENT_ATTR(hif_rd_or_wr, 0x0), + ALI_DRW_PMU_EVENT_ATTR(hif_wr, 0x1), + ALI_DRW_PMU_EVENT_ATTR(hif_rd, 0x2), + ALI_DRW_PMU_EVENT_ATTR(hif_rmw, 0x3), + ALI_DRW_PMU_EVENT_ATTR(hif_hi_pri_rd, 0x4), + ALI_DRW_PMU_EVENT_ATTR(dfi_wr_data_cycles, 0x7), + ALI_DRW_PMU_EVENT_ATTR(dfi_rd_data_cycles, 0x8), + ALI_DRW_PMU_EVENT_ATTR(hpr_xact_when_critical, 0x9), + ALI_DRW_PMU_EVENT_ATTR(lpr_xact_when_critical, 0xA), + ALI_DRW_PMU_EVENT_ATTR(wr_xact_when_critical, 0xB), + ALI_DRW_PMU_EVENT_ATTR(op_is_activate, 0xC), + ALI_DRW_PMU_EVENT_ATTR(op_is_rd_or_wr, 0xD), + ALI_DRW_PMU_EVENT_ATTR(op_is_rd_activate, 0xE), + ALI_DRW_PMU_EVENT_ATTR(op_is_rd, 0xF), + ALI_DRW_PMU_EVENT_ATTR(op_is_wr, 0x10), + ALI_DRW_PMU_EVENT_ATTR(op_is_mwr, 0x11), + ALI_DRW_PMU_EVENT_ATTR(op_is_precharge, 0x12), + ALI_DRW_PMU_EVENT_ATTR(precharge_for_rdwr, 0x13), + ALI_DRW_PMU_EVENT_ATTR(precharge_for_other, 0x14), + ALI_DRW_PMU_EVENT_ATTR(rdwr_transitions, 0x15), + ALI_DRW_PMU_EVENT_ATTR(write_combine, 0x16), + ALI_DRW_PMU_EVENT_ATTR(war_hazard, 0x17), + ALI_DRW_PMU_EVENT_ATTR(raw_hazard, 0x18), + ALI_DRW_PMU_EVENT_ATTR(waw_hazard, 0x19), + ALI_DRW_PMU_EVENT_ATTR(op_is_enter_selfref_rk0, 0x1A), + ALI_DRW_PMU_EVENT_ATTR(op_is_enter_selfref_rk1, 0x1B), + ALI_DRW_PMU_EVENT_ATTR(op_is_enter_selfref_rk2, 0x1C), + ALI_DRW_PMU_EVENT_ATTR(op_is_enter_selfref_rk3, 0x1D), + ALI_DRW_PMU_EVENT_ATTR(op_is_enter_powerdown_rk0, 0x1E), + ALI_DRW_PMU_EVENT_ATTR(op_is_enter_powerdown_rk1, 0x1F), + ALI_DRW_PMU_EVENT_ATTR(op_is_enter_powerdown_rk2, 0x20), + ALI_DRW_PMU_EVENT_ATTR(op_is_enter_powerdown_rk3, 0x21), + ALI_DRW_PMU_EVENT_ATTR(selfref_mode_rk0, 0x26), + ALI_DRW_PMU_EVENT_ATTR(selfref_mode_rk1, 0x27), + ALI_DRW_PMU_EVENT_ATTR(selfref_mode_rk2, 0x28), + ALI_DRW_PMU_EVENT_ATTR(selfref_mode_rk3, 0x29), + ALI_DRW_PMU_EVENT_ATTR(op_is_refresh, 0x2A), + ALI_DRW_PMU_EVENT_ATTR(op_is_crit_ref, 0x2B), + ALI_DRW_PMU_EVENT_ATTR(op_is_load_mode, 0x2D), + ALI_DRW_PMU_EVENT_ATTR(op_is_zqcl, 0x2E), + ALI_DRW_PMU_EVENT_ATTR(visible_window_limit_reached_rd, 0x30), + ALI_DRW_PMU_EVENT_ATTR(visible_window_limit_reached_wr, 0x31), + ALI_DRW_PMU_EVENT_ATTR(op_is_dqsosc_mpc, 0x34), + ALI_DRW_PMU_EVENT_ATTR(op_is_dqsosc_mrr, 0x35), + ALI_DRW_PMU_EVENT_ATTR(op_is_tcr_mrr, 0x36), + ALI_DRW_PMU_EVENT_ATTR(op_is_zqstart, 0x37), + ALI_DRW_PMU_EVENT_ATTR(op_is_zqlatch, 0x38), + ALI_DRW_PMU_EVENT_ATTR(chi_txreq, 0x39), + ALI_DRW_PMU_EVENT_ATTR(chi_txdat, 0x3A), + ALI_DRW_PMU_EVENT_ATTR(chi_rxdat, 0x3B), + ALI_DRW_PMU_EVENT_ATTR(chi_rxrsp, 0x3C), + ALI_DRW_PMU_EVENT_ATTR(tsz_vio, 0x3D), + ALI_DRW_PMU_EVENT_ATTR(cycle, 0x80), + NULL, +}; + +static struct attribute_group ali_drw_pmu_events_attr_group =3D { + .name =3D "events", + .attrs =3D ali_drw_pmu_events_attrs, +}; + +static struct attribute *ali_drw_pmu_format_attr[] =3D { + ALI_DRW_PMU_FORMAT_ATTR(event, "config:0-7"), + NULL, +}; + +static const struct attribute_group ali_drw_pmu_format_group =3D { + .name =3D "format", + .attrs =3D ali_drw_pmu_format_attr, +}; + +static ssize_t ali_drw_pmu_cpumask_show(struct device *dev, + struct device_attribute *attr, + char *buf) +{ + struct ali_drw_pmu *drw_pmu =3D to_ali_drw_pmu(dev_get_drvdata(dev)); + + return cpumap_print_to_pagebuf(true, buf, cpumask_of(drw_pmu->cpu)); +} + +static struct device_attribute ali_drw_pmu_cpumask_attr =3D + __ATTR(cpumask, 0444, ali_drw_pmu_cpumask_show, NULL); + +static struct attribute *ali_drw_pmu_cpumask_attrs[] =3D { + &ali_drw_pmu_cpumask_attr.attr, + NULL, +}; + +static const struct attribute_group ali_drw_pmu_cpumask_attr_group =3D { + .attrs =3D ali_drw_pmu_cpumask_attrs, +}; + +static const struct attribute_group *ali_drw_pmu_attr_groups[] =3D { + &ali_drw_pmu_events_attr_group, + &ali_drw_pmu_cpumask_attr_group, + &ali_drw_pmu_format_group, + NULL, +}; + +/* find a counter for event, then in add func, hw.idx will equal to counte= r */ +static int ali_drw_get_counter_idx(struct perf_event *event) +{ + struct ali_drw_pmu *drw_pmu =3D to_ali_drw_pmu(event->pmu); + int idx; + + for (idx =3D 0; idx < ALI_DRW_PMU_COMMON_MAX_COUNTERS; ++idx) { + if (!test_and_set_bit(idx, drw_pmu->used_mask)) + return idx; + } + + /* The counters are all in use. */ + return -EBUSY; +} + +static u64 ali_drw_pmu_read_counter(struct perf_event *event) +{ + struct ali_drw_pmu *drw_pmu =3D to_ali_drw_pmu(event->pmu); + u64 cycle_high, cycle_low; + + if (GET_DRW_EVENTID(event) =3D=3D ALI_DRW_PMU_CYCLE_EVT_ID) { + cycle_high =3D readl(drw_pmu->cfg_base + ALI_DRW_PMU_CYCLE_CNT_HIGH); + cycle_high &=3D ALI_DRW_PMU_CYCLE_CNT_HIGH_MASK; + cycle_low =3D readl(drw_pmu->cfg_base + ALI_DRW_PMU_CYCLE_CNT_LOW); + cycle_low &=3D ALI_DRW_PMU_CYCLE_CNT_LOW_MASK; + return (cycle_high << 32 | cycle_low); + } + + return readl(drw_pmu->cfg_base + + ALI_DRW_PMU_COMMON_COUNTERn(event->hw.idx)); +} + +static void ali_drw_pmu_event_update(struct perf_event *event) +{ + struct hw_perf_event *hwc =3D &event->hw; + u64 delta, prev, now; + + do { + prev =3D local64_read(&hwc->prev_count); + now =3D ali_drw_pmu_read_counter(event); + } while (local64_cmpxchg(&hwc->prev_count, prev, now) !=3D prev); + + /* handle overflow. */ + delta =3D now - prev; + if (GET_DRW_EVENTID(event) =3D=3D ALI_DRW_PMU_CYCLE_EVT_ID) + delta &=3D ALI_DRW_PMU_OV_INTR_MASK; + else + delta &=3D ALI_DRW_CNT_MAX_PERIOD; + local64_add(delta, &event->count); +} + +static void ali_drw_pmu_event_set_period(struct perf_event *event) +{ + u64 pre_val; + struct ali_drw_pmu *drw_pmu =3D to_ali_drw_pmu(event->pmu); + + /* set a preload counter for test purpose */ + writel(ALI_DRW_PMU_TEST_SEL_COMMON_COUNTER_BASE + event->hw.idx, + drw_pmu->cfg_base + ALI_DRW_PMU_TEST_CTRL); + + /* set conunter initial value */ + pre_val =3D ALI_DRW_PMU_CNT_INIT; + writel(pre_val, drw_pmu->cfg_base + ALI_DRW_PMU_CNT_PRELOAD); + local64_set(&event->hw.prev_count, pre_val); + + /* set sel mode to zero to start test */ + writel(0x0, drw_pmu->cfg_base + ALI_DRW_PMU_TEST_CTRL); +} + +static void ali_drw_pmu_enable_counter(struct perf_event *event) +{ + u32 val, subval, reg, shift; + int counter =3D event->hw.idx; + struct ali_drw_pmu *drw_pmu =3D to_ali_drw_pmu(event->pmu); + + reg =3D ALI_DRW_PMU_EVENT_SELn(counter); + val =3D readl(drw_pmu->cfg_base + reg); + subval =3D FIELD_PREP(ALI_DRW_PMCOM_CNT_EN, 1) | + FIELD_PREP(ALI_DRW_PMCOM_CNT_EVENT_MASK, drw_pmu->evtids[counter]); + + shift =3D ALI_DRW_PMCOM_CNT_EVENT_OFFSET(counter); + val &=3D ~(GENMASK(7, 0) << shift); + val |=3D subval << shift; + + writel(val, drw_pmu->cfg_base + reg); +} + +static void ali_drw_pmu_disable_counter(struct perf_event *event) +{ + u32 val, reg, subval, shift; + struct ali_drw_pmu *drw_pmu =3D to_ali_drw_pmu(event->pmu); + int counter =3D event->hw.idx; + + reg =3D ALI_DRW_PMU_EVENT_SELn(counter); + val =3D readl(drw_pmu->cfg_base + reg); + subval =3D FIELD_PREP(ALI_DRW_PMCOM_CNT_EN, 0) | + FIELD_PREP(ALI_DRW_PMCOM_CNT_EVENT_MASK, 0); + + shift =3D ALI_DRW_PMCOM_CNT_EVENT_OFFSET(counter); + val &=3D ~(GENMASK(7, 0) << shift); + val |=3D subval << shift; + + writel(val, drw_pmu->cfg_base + reg); +} + +static irqreturn_t ali_drw_pmu_isr(int irq_num, void *data) +{ + struct ali_drw_pmu_irq *irq =3D data; + struct ali_drw_pmu *drw_pmu; + irqreturn_t ret =3D IRQ_NONE; + + rcu_read_lock(); + list_for_each_entry_rcu(drw_pmu, &irq->pmus_node, pmus_node) { + unsigned long status, clr_status; + struct perf_event *event; + unsigned int idx; + + for (idx =3D 0; idx < ALI_DRW_PMU_COMMON_MAX_COUNTERS; idx++) { + event =3D drw_pmu->events[idx]; + if (!event) + continue; + ali_drw_pmu_disable_counter(event); + } + + /* common counter intr status */ + status =3D readl(drw_pmu->cfg_base + ALI_DRW_PMU_OV_INTR_STATUS); + status =3D FIELD_GET(ALI_DRW_PMCOM_CNT_OV_INTR_MASK, status); + if (status) { + for_each_set_bit(idx, &status, + ALI_DRW_PMU_COMMON_MAX_COUNTERS) { + event =3D drw_pmu->events[idx]; + if (WARN_ON_ONCE(!event)) + continue; + ali_drw_pmu_event_update(event); + ali_drw_pmu_event_set_period(event); + } + + /* clear common counter intr status */ + clr_status =3D FIELD_PREP(ALI_DRW_PMCOM_CNT_OV_INTR_MASK, 1); + writel(clr_status, + drw_pmu->cfg_base + ALI_DRW_PMU_OV_INTR_CLR); + } + + for (idx =3D 0; idx < ALI_DRW_PMU_COMMON_MAX_COUNTERS; idx++) { + event =3D drw_pmu->events[idx]; + if (!event) + continue; + if (!(event->hw.state & PERF_HES_STOPPED)) + ali_drw_pmu_enable_counter(event); + } + if (status) + ret =3D IRQ_HANDLED; + } + rcu_read_unlock(); + return ret; +} + +static struct ali_drw_pmu_irq *__ali_drw_pmu_init_irq(struct platform_devi= ce + *pdev, int irq_num) +{ + int ret; + struct ali_drw_pmu_irq *irq; + + list_for_each_entry(irq, &ali_drw_pmu_irqs, irqs_node) { + if (irq->irq_num =3D=3D irq_num + && refcount_inc_not_zero(&irq->refcount)) + return irq; + } + + irq =3D kzalloc(sizeof(*irq), GFP_KERNEL); + if (!irq) + return ERR_PTR(-ENOMEM); + + INIT_LIST_HEAD(&irq->pmus_node); + + /* Pick one CPU to be the preferred one to use */ + irq->cpu =3D smp_processor_id(); + refcount_set(&irq->refcount, 1); + + /* + * FIXME: one of DDRSS Driveway PMU overflow interrupt shares the same + * irq number with MPAM ERR_IRQ. To register DDRSS PMU and MPAM drivers + * successfully, add IRQF_SHARED flag. Howerer, PMU interrupt should not + * share with other component. + */ + ret =3D devm_request_irq(&pdev->dev, irq_num, ali_drw_pmu_isr, + IRQF_SHARED, dev_name(&pdev->dev), irq); + if (ret < 0) { + dev_err(&pdev->dev, + "Fail to request IRQ:%d ret:%d\n", irq_num, ret); + goto out_free; + } + + ret =3D irq_set_affinity_hint(irq_num, cpumask_of(irq->cpu)); + if (ret) + goto out_free; + + ret =3D cpuhp_state_add_instance_nocalls(ali_drw_cpuhp_state_num, + &irq->node); + if (ret) + goto out_free; + + irq->irq_num =3D irq_num; + list_add(&irq->irqs_node, &ali_drw_pmu_irqs); + + return irq; + +out_free: + kfree(irq); + return ERR_PTR(ret); +} + +static int ali_drw_pmu_init_irq(struct ali_drw_pmu *drw_pmu, + struct platform_device *pdev) +{ + int irq_num; + struct ali_drw_pmu_irq *irq; + + /* Read and init IRQ */ + irq_num =3D platform_get_irq(pdev, 0); + if (irq_num < 0) + return irq_num; + + mutex_lock(&ali_drw_pmu_irqs_lock); + irq =3D __ali_drw_pmu_init_irq(pdev, irq_num); + mutex_unlock(&ali_drw_pmu_irqs_lock); + + if (IS_ERR(irq)) + return PTR_ERR(irq); + + drw_pmu->irq =3D irq; + + mutex_lock(&ali_drw_pmu_irqs_lock); + list_add_rcu(&drw_pmu->pmus_node, &irq->pmus_node); + mutex_unlock(&ali_drw_pmu_irqs_lock); + + return 0; +} + +static void ali_drw_pmu_uninit_irq(struct ali_drw_pmu *drw_pmu) +{ + struct ali_drw_pmu_irq *irq =3D drw_pmu->irq; + + mutex_lock(&ali_drw_pmu_irqs_lock); + list_del_rcu(&drw_pmu->pmus_node); + + if (!refcount_dec_and_test(&irq->refcount)) { + mutex_unlock(&ali_drw_pmu_irqs_lock); + return; + } + + list_del(&irq->irqs_node); + mutex_unlock(&ali_drw_pmu_irqs_lock); + + WARN_ON(irq_set_affinity_hint(irq->irq_num, NULL)); + cpuhp_state_remove_instance_nocalls(ali_drw_cpuhp_state_num, + &irq->node); + kfree(irq); +} + +static int ali_drw_pmu_event_init(struct perf_event *event) +{ + struct ali_drw_pmu *drw_pmu =3D to_ali_drw_pmu(event->pmu); + struct hw_perf_event *hwc =3D &event->hw; + struct perf_event *sibling; + struct device *dev =3D drw_pmu->pmu.dev; + + if (event->attr.type !=3D event->pmu->type) + return -ENOENT; + + if (is_sampling_event(event)) { + dev_err(dev, "Sampling not supported!\n"); + return -EOPNOTSUPP; + } + + if (event->attach_state & PERF_ATTACH_TASK) { + dev_err(dev, "Per-task counter cannot allocate!\n"); + return -EOPNOTSUPP; + } + + event->cpu =3D drw_pmu->cpu; + if (event->cpu < 0) { + dev_err(dev, "Per-task mode not supported!\n"); + return -EOPNOTSUPP; + } + + if (event->group_leader !=3D event && + !is_software_event(event->group_leader)) { + dev_err(dev, "driveway only allow one event!\n"); + return -EINVAL; + } + + for_each_sibling_event(sibling, event->group_leader) { + if (sibling !=3D event && !is_software_event(sibling)) { + dev_err(dev, "driveway event not allowed!\n"); + return -EINVAL; + } + } + + /* reset all the pmu counters */ + writel(ALI_DRW_PMU_CNT_RST, drw_pmu->cfg_base + ALI_DRW_PMU_CNT_CTRL); + + hwc->idx =3D -1; + + return 0; +} + +static void ali_drw_pmu_start(struct perf_event *event, int flags) +{ + struct ali_drw_pmu *drw_pmu =3D to_ali_drw_pmu(event->pmu); + + event->hw.state =3D 0; + + if (GET_DRW_EVENTID(event) =3D=3D ALI_DRW_PMU_CYCLE_EVT_ID) { + writel(ALI_DRW_PMU_CNT_START, + drw_pmu->cfg_base + ALI_DRW_PMU_CNT_CTRL); + return; + } + + ali_drw_pmu_event_set_period(event); + if (flags & PERF_EF_RELOAD) { + unsigned long prev_raw_count =3D + local64_read(&event->hw.prev_count); + writel(prev_raw_count, + drw_pmu->cfg_base + ALI_DRW_PMU_CNT_PRELOAD); + } + + ali_drw_pmu_enable_counter(event); + + writel(ALI_DRW_PMU_CNT_START, drw_pmu->cfg_base + ALI_DRW_PMU_CNT_CTRL); +} + +static void ali_drw_pmu_stop(struct perf_event *event, int flags) +{ + struct ali_drw_pmu *drw_pmu =3D to_ali_drw_pmu(event->pmu); + + if (event->hw.state & PERF_HES_STOPPED) + return; + + if (GET_DRW_EVENTID(event) !=3D ALI_DRW_PMU_CYCLE_EVT_ID) + ali_drw_pmu_disable_counter(event); + + writel(ALI_DRW_PMU_CNT_STOP, drw_pmu->cfg_base + ALI_DRW_PMU_CNT_CTRL); + + ali_drw_pmu_event_update(event); + event->hw.state |=3D PERF_HES_STOPPED | PERF_HES_UPTODATE; +} + +static int ali_drw_pmu_add(struct perf_event *event, int flags) +{ + struct ali_drw_pmu *drw_pmu =3D to_ali_drw_pmu(event->pmu); + struct hw_perf_event *hwc =3D &event->hw; + int idx =3D -1; + int evtid; + + evtid =3D GET_DRW_EVENTID(event); + + if (evtid !=3D ALI_DRW_PMU_CYCLE_EVT_ID) { + idx =3D ali_drw_get_counter_idx(event); + if (idx < 0) + return idx; + drw_pmu->events[idx] =3D event; + drw_pmu->evtids[idx] =3D evtid; + } + hwc->idx =3D idx; + + hwc->state =3D PERF_HES_STOPPED | PERF_HES_UPTODATE; + + if (flags & PERF_EF_START) + ali_drw_pmu_start(event, PERF_EF_RELOAD); + + /* Propagate our changes to the userspace mapping. */ + perf_event_update_userpage(event); + + return 0; +} + +static void ali_drw_pmu_del(struct perf_event *event, int flags) +{ + struct ali_drw_pmu *drw_pmu =3D to_ali_drw_pmu(event->pmu); + struct hw_perf_event *hwc =3D &event->hw; + int idx =3D hwc->idx; + + ali_drw_pmu_stop(event, PERF_EF_UPDATE); + + if (idx >=3D 0 && idx < ALI_DRW_PMU_COMMON_MAX_COUNTERS) { + drw_pmu->events[idx] =3D NULL; + drw_pmu->evtids[idx] =3D 0; + clear_bit(idx, drw_pmu->used_mask); + } + + perf_event_update_userpage(event); +} + +static void ali_drw_pmu_read(struct perf_event *event) +{ + ali_drw_pmu_event_update(event); +} + +static int ali_drw_pmu_probe(struct platform_device *pdev) +{ + struct ali_drw_pmu *drw_pmu; + struct resource *res; + char *name; + int ret; + + drw_pmu =3D devm_kzalloc(&pdev->dev, sizeof(*drw_pmu), GFP_KERNEL); + if (!drw_pmu) + return -ENOMEM; + + drw_pmu->dev =3D &pdev->dev; + platform_set_drvdata(pdev, drw_pmu); + + res =3D platform_get_resource(pdev, IORESOURCE_MEM, 0); + drw_pmu->cfg_base =3D devm_ioremap_resource(&pdev->dev, res); + if (!drw_pmu->cfg_base) + return -ENOMEM; + + name =3D devm_kasprintf(drw_pmu->dev, GFP_KERNEL, "ali_drw_%llx", + (u64) (res->start >> ALI_DRW_PMU_PA_SHIFT)); + if (!name) + return -ENOMEM; + + writel(ALI_DRW_PMU_CNT_RST, drw_pmu->cfg_base + ALI_DRW_PMU_CNT_CTRL); + + /* enable the generation of interrupt by all common counters */ + writel(ALI_DRW_PMCOM_CNT_OV_INTR_MASK, + drw_pmu->cfg_base + ALI_DRW_PMU_OV_INTR_ENABLE_CTL); + + /* clearing interrupt status */ + writel(0xffffff, drw_pmu->cfg_base + ALI_DRW_PMU_OV_INTR_CLR); + + drw_pmu->cpu =3D smp_processor_id(); + + ret =3D ali_drw_pmu_init_irq(drw_pmu, pdev); + if (ret) + return ret; + + drw_pmu->pmu =3D (struct pmu) { + .module =3D THIS_MODULE, + .task_ctx_nr =3D perf_invalid_context, + .event_init =3D ali_drw_pmu_event_init, + .add =3D ali_drw_pmu_add, + .del =3D ali_drw_pmu_del, + .start =3D ali_drw_pmu_start, + .stop =3D ali_drw_pmu_stop, + .read =3D ali_drw_pmu_read, + .attr_groups =3D ali_drw_pmu_attr_groups, + .capabilities =3D PERF_PMU_CAP_NO_EXCLUDE, + }; + + ret =3D perf_pmu_register(&drw_pmu->pmu, name, -1); + if (ret) { + dev_err(drw_pmu->dev, "DRW Driveway PMU PMU register failed!\n"); + ali_drw_pmu_uninit_irq(drw_pmu); + } + + return ret; +} + +static int ali_drw_pmu_remove(struct platform_device *pdev) +{ + struct ali_drw_pmu *drw_pmu =3D platform_get_drvdata(pdev); + + /* disable the generation of interrupt by all common counters */ + writel(ALI_DRW_PMCOM_CNT_OV_INTR_MASK, + drw_pmu->cfg_base + ALI_DRW_PMU_OV_INTR_DISABLE_CTL); + + ali_drw_pmu_uninit_irq(drw_pmu); + perf_pmu_unregister(&drw_pmu->pmu); + + return 0; +} + +static int ali_drw_pmu_offline_cpu(unsigned int cpu, struct hlist_node *no= de) +{ + struct ali_drw_pmu_irq *irq; + struct ali_drw_pmu *drw_pmu; + unsigned int target; + int ret; + cpumask_t node_online_cpus; + + irq =3D hlist_entry_safe(node, struct ali_drw_pmu_irq, node); + if (cpu !=3D irq->cpu) + return 0; + + ret =3D cpumask_and(&node_online_cpus, + cpumask_of_node(cpu_to_node(cpu)), cpu_online_mask); + if (ret) + target =3D cpumask_any_but(&node_online_cpus, cpu); + else + target =3D cpumask_any_but(cpu_online_mask, cpu); + + if (target >=3D nr_cpu_ids) + return 0; + + /* We're only reading, but this isn't the place to be involving RCU */ + mutex_lock(&ali_drw_pmu_irqs_lock); + list_for_each_entry(drw_pmu, &irq->pmus_node, pmus_node) + perf_pmu_migrate_context(&drw_pmu->pmu, irq->cpu, target); + mutex_unlock(&ali_drw_pmu_irqs_lock); + + WARN_ON(irq_set_affinity_hint(irq->irq_num, cpumask_of(target))); + irq->cpu =3D target; + + return 0; +} + +/* + * Due to historical reasons, the HID used in the production environment is + * ARMHD700, so we leave ARMHD700 as Compatible ID. + */ +static const struct acpi_device_id ali_drw_acpi_match[] =3D { + {"BABA5000", 0}, + {"ARMHD700", 0}, + {} +}; + +MODULE_DEVICE_TABLE(acpi, ali_drw_acpi_match); + +static struct platform_driver ali_drw_pmu_driver =3D { + .driver =3D { + .name =3D "ali_drw_pmu", + .acpi_match_table =3D ali_drw_acpi_match, + }, + .probe =3D ali_drw_pmu_probe, + .remove =3D ali_drw_pmu_remove, +}; + +static int __init ali_drw_pmu_init(void) +{ + int ret; + + ret =3D cpuhp_setup_state_multi(CPUHP_AP_ONLINE_DYN, + "ali_drw_pmu:online", + NULL, ali_drw_pmu_offline_cpu); + + if (ret < 0) { + pr_err("DRW Driveway PMU: setup hotplug failed, ret =3D %d\n", + ret); + return ret; + } + ali_drw_cpuhp_state_num =3D ret; + + ret =3D platform_driver_register(&ali_drw_pmu_driver); + if (ret) + cpuhp_remove_multi_state(ali_drw_cpuhp_state_num); + + return ret; +} + +static void __exit ali_drw_pmu_exit(void) +{ + platform_driver_unregister(&ali_drw_pmu_driver); + cpuhp_remove_multi_state(ali_drw_cpuhp_state_num); +} + +module_init(ali_drw_pmu_init); +module_exit(ali_drw_pmu_exit); + +MODULE_AUTHOR("Hongbo Yao "); +MODULE_AUTHOR("Neng Chen "); +MODULE_AUTHOR("Shuai Xue "); +MODULE_DESCRIPTION("Alibaba DDR Sub-System Driveway PMU driver"); +MODULE_LICENSE("GPL v2"); --=20 2.20.1.9.gb50a0d7 From nobody Mon Apr 27 03:25:06 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1CDB5C43334 for ; Tue, 28 Jun 2022 03:16:58 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230399AbiF1DQ4 (ORCPT ); Mon, 27 Jun 2022 23:16:56 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38612 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230063AbiF1DQq (ORCPT ); Mon, 27 Jun 2022 23:16:46 -0400 Received: from out30-131.freemail.mail.aliyun.com (out30-131.freemail.mail.aliyun.com [115.124.30.131]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 0D3621CFF5 for ; Mon, 27 Jun 2022 20:16:43 -0700 (PDT) X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R461e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=ay29a033018046050;MF=xueshuai@linux.alibaba.com;NM=1;PH=DS;RN=9;SR=0;TI=SMTPD_---0VHfG1yG_1656386200; Received: from localhost.localdomain(mailfrom:xueshuai@linux.alibaba.com fp:SMTPD_---0VHfG1yG_1656386200) by smtp.aliyun-inc.com; Tue, 28 Jun 2022 11:16:41 +0800 From: Shuai Xue To: linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, will@kernel.org, mark.rutland@arm.com Cc: baolin.wang@linux.alibaba.com, yaohongbo@linux.alibaba.com, nengchen@linux.alibaba.com, zhuo.song@linux.alibaba.com, xueshuai@linux.alibaba.com Subject: [PATCH v2 2/3] drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710 SoC Date: Tue, 28 Jun 2022 11:16:29 +0800 Message-Id: <20220628031630.18674-3-xueshuai@linux.alibaba.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20220617111825.92911-1-xueshuai@linux.alibaba.com> References: <20220617111825.92911-1-xueshuai@linux.alibaba.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" Add the DDR Sub-System Driveway Performance Monitoring Unit (PMU) driver support for Alibaba T-Head Yitian 710 SoC chip. Yitian supports DDR5/4 DRAM and targets cloud computing and HPC. Each PMU is registered as a device in /sys/bus/event_source/devices, and users can select event to monitor in each sub-channel, independently. For example, ali_drw_21000 and ali_drw_21080 are two PMU devices for two sub-channels of the same channel in die 0. And the PMU device of die 1 is prefixed with ali_drw_400XXXXX, e.g. ali_drw_40021000. Due to hardware limitation, one of DDRSS Driveway PMU overflow interrupt shares the same irq number with MPAM ERR_IRQ. To register DDRSS PMU and MPAM drivers successfully, add IRQF_SHARED flag. Signed-off-by: Shuai Xue Co-developed-by: Hongbo Yao Signed-off-by: Hongbo Yao Co-developed-by: Neng Chen Signed-off-by: Neng Chen Reviewed-by: Baolin Wang --- drivers/perf/Kconfig | 8 + drivers/perf/Makefile | 1 + drivers/perf/alibaba_uncore_drw_pmu.c | 793 ++++++++++++++++++++++++++ 3 files changed, 802 insertions(+) create mode 100644 drivers/perf/alibaba_uncore_drw_pmu.c diff --git a/drivers/perf/Kconfig b/drivers/perf/Kconfig index 1e2d69453771..dfafba4cb066 100644 --- a/drivers/perf/Kconfig +++ b/drivers/perf/Kconfig @@ -183,6 +183,14 @@ config APPLE_M1_CPU_PMU Provides support for the non-architectural CPU PMUs present on the Apple M1 SoCs and derivatives. =20 +config ALIBABA_UNCORE_DRW_PMU + tristate "Alibaba T-Head Yitian 710 DDR Sub-system Driveway PMU driver" + depends on (ARM64 && ACPI) + default m + help + Support for Driveway PMU events monitoring on Yitian 710 DDR + Sub-system. + source "drivers/perf/hisilicon/Kconfig" =20 config MARVELL_CN10K_DDR_PMU diff --git a/drivers/perf/Makefile b/drivers/perf/Makefile index 57a279c61df5..050d04ee19dd 100644 --- a/drivers/perf/Makefile +++ b/drivers/perf/Makefile @@ -20,3 +20,4 @@ obj-$(CONFIG_ARM_DMC620_PMU) +=3D arm_dmc620_pmu.o obj-$(CONFIG_MARVELL_CN10K_TAD_PMU) +=3D marvell_cn10k_tad_pmu.o obj-$(CONFIG_MARVELL_CN10K_DDR_PMU) +=3D marvell_cn10k_ddr_pmu.o obj-$(CONFIG_APPLE_M1_CPU_PMU) +=3D apple_m1_cpu_pmu.o +obj-$(CONFIG_ALIBABA_UNCORE_DRW_PMU) +=3D alibaba_uncore_drw_pmu.o diff --git a/drivers/perf/alibaba_uncore_drw_pmu.c b/drivers/perf/alibaba_u= ncore_drw_pmu.c new file mode 100644 index 000000000000..62c1aeeac2bd --- /dev/null +++ b/drivers/perf/alibaba_uncore_drw_pmu.c @@ -0,0 +1,793 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Alibaba DDR Sub-System Driveway PMU driver + * + * Copyright (C) 2022 Alibaba Inc + */ + +#define ALI_DRW_PMUNAME "ali_drw" +#define ALI_DRW_DRVNAME ALI_DRW_PMUNAME "_pmu" +#define pr_fmt(fmt) ALI_DRW_DRVNAME ": " fmt + +#include +#include +#include + +#define ALI_DRW_PMU_COMMON_MAX_COUNTERS 16 +#define ALI_DRW_PMU_TEST_SEL_COMMON_COUNTER_BASE 19 + +#define ALI_DRW_PMU_PA_SHIFT 12 +#define ALI_DRW_PMU_CNT_INIT 0x00000000 +#define ALI_DRW_CNT_MAX_PERIOD 0xffffffff +#define ALI_DRW_PMU_CYCLE_EVT_ID 0x80 + +#define ALI_DRW_PMU_CNT_CTRL 0xC00 +#define ALI_DRW_PMU_CNT_RST BIT(2) +#define ALI_DRW_PMU_CNT_STOP BIT(1) +#define ALI_DRW_PMU_CNT_START BIT(0) + +#define ALI_DRW_PMU_CNT_STATE 0xC04 +#define ALI_DRW_PMU_TEST_CTRL 0xC08 +#define ALI_DRW_PMU_CNT_PRELOAD 0xC0C + +#define ALI_DRW_PMU_CYCLE_CNT_HIGH_MASK GENMASK(23, 0) +#define ALI_DRW_PMU_CYCLE_CNT_LOW_MASK GENMASK(31, 0) +#define ALI_DRW_PMU_CYCLE_CNT_HIGH 0xC10 +#define ALI_DRW_PMU_CYCLE_CNT_LOW 0xC14 + +/* PMU EVENT SEL 0-3 are paired in 32-bit registers on a 4-byte stride */ +#define ALI_DRW_PMU_EVENT_SEL0 0xC68 +/* counter 0-3 use sel0, counter 4-7 use sel1...*/ +#define ALI_DRW_PMU_EVENT_SELn(n) \ + (ALI_DRW_PMU_EVENT_SEL0 + (n / 4) * 0x4) +#define ALI_DRW_PMCOM_CNT_EN BIT(7) +#define ALI_DRW_PMCOM_CNT_EVENT_MASK GENMASK(5, 0) +#define ALI_DRW_PMCOM_CNT_EVENT_OFFSET(n) \ + (8 * (n % 4)) + +/* PMU COMMON COUNTER 0-15, are paired in 32-bit registers on a 4-byte str= ide */ +#define ALI_DRW_PMU_COMMON_COUNTER0 0xC78 +#define ALI_DRW_PMU_COMMON_COUNTERn(n) \ + (ALI_DRW_PMU_COMMON_COUNTER0 + 0x4 * (n)) + +#define ALI_DRW_PMU_OV_INTR_ENABLE_CTL 0xCB8 +#define ALI_DRW_PMU_OV_INTR_DISABLE_CTL 0xCBC +#define ALI_DRW_PMU_OV_INTR_ENABLE_STATUS 0xCC0 +#define ALI_DRW_PMU_OV_INTR_CLR 0xCC4 +#define ALI_DRW_PMU_OV_INTR_STATUS 0xCC8 +#define ALI_DRW_PMCOM_CNT_OV_INTR_MASK GENMASK(23, 8) +#define ALI_DRW_PMBW_CNT_OV_INTR_MASK GENMASK(7, 0) +#define ALI_DRW_PMU_OV_INTR_MASK GENMASK_ULL(63, 0) + +static int ali_drw_cpuhp_state_num; + +static LIST_HEAD(ali_drw_pmu_irqs); +static DEFINE_MUTEX(ali_drw_pmu_irqs_lock); + +struct ali_drw_pmu_irq { + struct hlist_node node; + struct list_head irqs_node; + struct list_head pmus_node; + int irq_num; + int cpu; + refcount_t refcount; +}; + +struct ali_drw_pmu { + void __iomem *cfg_base; + struct device *dev; + + struct list_head pmus_node; + struct ali_drw_pmu_irq *irq; + int irq_num; + int cpu; + DECLARE_BITMAP(used_mask, ALI_DRW_PMU_COMMON_MAX_COUNTERS); + struct perf_event *events[ALI_DRW_PMU_COMMON_MAX_COUNTERS]; + int evtids[ALI_DRW_PMU_COMMON_MAX_COUNTERS]; + + struct pmu pmu; +}; + +#define to_ali_drw_pmu(p) (container_of(p, struct ali_drw_pmu, pmu)) + +#define DRW_CONFIG_EVENTID GENMASK(7, 0) +#define GET_DRW_EVENTID(event) FIELD_GET(DRW_CONFIG_EVENTID, (event)->attr= .config) + +static ssize_t ali_drw_pmu_format_show(struct device *dev, + struct device_attribute *attr, char *buf) +{ + struct dev_ext_attribute *eattr; + + eattr =3D container_of(attr, struct dev_ext_attribute, attr); + + return sprintf(buf, "%s\n", (char *)eattr->var); +} + +/* + * PMU event attributes + */ +statuc ssize_t ali_drw_pmu_event_show(struct device *dev, + struct device_attribute *attr, char *page) +{ + struct dev_ext_attribute *eattr; + + eattr =3D container_of(attr, struct dev_ext_attribute, attr); + + return sprintf(page, "config=3D0x%lx\n", (unsigned long)eattr->var); +} + +#define ALI_DRW_PMU_ATTR(_name, _func, _config) = \ + (&((struct dev_ext_attribute[]) { \ + { __ATTR(_name, 0444, _func, NULL), (void *)_config } \ + })[0].attr.attr) + +#define ALI_DRW_PMU_FORMAT_ATTR(_name, _config) \ + ALI_DRW_PMU_ATTR(_name, ali_drw_pmu_format_show, (void *)_config) +#define ALI_DRW_PMU_EVENT_ATTR(_name, _config) \ + ALI_DRW_PMU_ATTR(_name, ali_drw_pmu_event_show, (unsigned long)_config) + +static struct attribute *ali_drw_pmu_events_attrs[] =3D { + ALI_DRW_PMU_EVENT_ATTR(hif_rd_or_wr, 0x0), + ALI_DRW_PMU_EVENT_ATTR(hif_wr, 0x1), + ALI_DRW_PMU_EVENT_ATTR(hif_rd, 0x2), + ALI_DRW_PMU_EVENT_ATTR(hif_rmw, 0x3), + ALI_DRW_PMU_EVENT_ATTR(hif_hi_pri_rd, 0x4), + ALI_DRW_PMU_EVENT_ATTR(dfi_wr_data_cycles, 0x7), + ALI_DRW_PMU_EVENT_ATTR(dfi_rd_data_cycles, 0x8), + ALI_DRW_PMU_EVENT_ATTR(hpr_xact_when_critical, 0x9), + ALI_DRW_PMU_EVENT_ATTR(lpr_xact_when_critical, 0xA), + ALI_DRW_PMU_EVENT_ATTR(wr_xact_when_critical, 0xB), + ALI_DRW_PMU_EVENT_ATTR(op_is_activate, 0xC), + ALI_DRW_PMU_EVENT_ATTR(op_is_rd_or_wr, 0xD), + ALI_DRW_PMU_EVENT_ATTR(op_is_rd_activate, 0xE), + ALI_DRW_PMU_EVENT_ATTR(op_is_rd, 0xF), + ALI_DRW_PMU_EVENT_ATTR(op_is_wr, 0x10), + ALI_DRW_PMU_EVENT_ATTR(op_is_mwr, 0x11), + ALI_DRW_PMU_EVENT_ATTR(op_is_precharge, 0x12), + ALI_DRW_PMU_EVENT_ATTR(precharge_for_rdwr, 0x13), + ALI_DRW_PMU_EVENT_ATTR(precharge_for_other, 0x14), + ALI_DRW_PMU_EVENT_ATTR(rdwr_transitions, 0x15), + ALI_DRW_PMU_EVENT_ATTR(write_combine, 0x16), + ALI_DRW_PMU_EVENT_ATTR(war_hazard, 0x17), + ALI_DRW_PMU_EVENT_ATTR(raw_hazard, 0x18), + ALI_DRW_PMU_EVENT_ATTR(waw_hazard, 0x19), + ALI_DRW_PMU_EVENT_ATTR(op_is_enter_selfref_rk0, 0x1A), + ALI_DRW_PMU_EVENT_ATTR(op_is_enter_selfref_rk1, 0x1B), + ALI_DRW_PMU_EVENT_ATTR(op_is_enter_selfref_rk2, 0x1C), + ALI_DRW_PMU_EVENT_ATTR(op_is_enter_selfref_rk3, 0x1D), + ALI_DRW_PMU_EVENT_ATTR(op_is_enter_powerdown_rk0, 0x1E), + ALI_DRW_PMU_EVENT_ATTR(op_is_enter_powerdown_rk1, 0x1F), + ALI_DRW_PMU_EVENT_ATTR(op_is_enter_powerdown_rk2, 0x20), + ALI_DRW_PMU_EVENT_ATTR(op_is_enter_powerdown_rk3, 0x21), + ALI_DRW_PMU_EVENT_ATTR(selfref_mode_rk0, 0x26), + ALI_DRW_PMU_EVENT_ATTR(selfref_mode_rk1, 0x27), + ALI_DRW_PMU_EVENT_ATTR(selfref_mode_rk2, 0x28), + ALI_DRW_PMU_EVENT_ATTR(selfref_mode_rk3, 0x29), + ALI_DRW_PMU_EVENT_ATTR(op_is_refresh, 0x2A), + ALI_DRW_PMU_EVENT_ATTR(op_is_crit_ref, 0x2B), + ALI_DRW_PMU_EVENT_ATTR(op_is_load_mode, 0x2D), + ALI_DRW_PMU_EVENT_ATTR(op_is_zqcl, 0x2E), + ALI_DRW_PMU_EVENT_ATTR(visible_window_limit_reached_rd, 0x30), + ALI_DRW_PMU_EVENT_ATTR(visible_window_limit_reached_wr, 0x31), + ALI_DRW_PMU_EVENT_ATTR(op_is_dqsosc_mpc, 0x34), + ALI_DRW_PMU_EVENT_ATTR(op_is_dqsosc_mrr, 0x35), + ALI_DRW_PMU_EVENT_ATTR(op_is_tcr_mrr, 0x36), + ALI_DRW_PMU_EVENT_ATTR(op_is_zqstart, 0x37), + ALI_DRW_PMU_EVENT_ATTR(op_is_zqlatch, 0x38), + ALI_DRW_PMU_EVENT_ATTR(chi_txreq, 0x39), + ALI_DRW_PMU_EVENT_ATTR(chi_txdat, 0x3A), + ALI_DRW_PMU_EVENT_ATTR(chi_rxdat, 0x3B), + ALI_DRW_PMU_EVENT_ATTR(chi_rxrsp, 0x3C), + ALI_DRW_PMU_EVENT_ATTR(tsz_vio, 0x3D), + ALI_DRW_PMU_EVENT_ATTR(cycle, 0x80), + NULL, +}; + +static struct attribute_group ali_drw_pmu_events_attr_group =3D { + .name =3D "events", + .attrs =3D ali_drw_pmu_events_attrs, +}; + +static struct attribute *ali_drw_pmu_format_attr[] =3D { + ALI_DRW_PMU_FORMAT_ATTR(event, "config:0-7"), + NULL, +}; + +static const struct attribute_group ali_drw_pmu_format_group =3D { + .name =3D "format", + .attrs =3D ali_drw_pmu_format_attr, +}; + +static ssize_t ali_drw_pmu_cpumask_show(struct device *dev, + struct device_attribute *attr, + char *buf) +{ + struct ali_drw_pmu *drw_pmu =3D to_ali_drw_pmu(dev_get_drvdata(dev)); + + return cpumap_print_to_pagebuf(true, buf, cpumask_of(drw_pmu->cpu)); +} + +static struct device_attribute ali_drw_pmu_cpumask_attr =3D + __ATTR(cpumask, 0444, ali_drw_pmu_cpumask_show, NULL); + +static struct attribute *ali_drw_pmu_cpumask_attrs[] =3D { + &ali_drw_pmu_cpumask_attr.attr, + NULL, +}; + +static const struct attribute_group ali_drw_pmu_cpumask_attr_group =3D { + .attrs =3D ali_drw_pmu_cpumask_attrs, +}; + +static const struct attribute_group *ali_drw_pmu_attr_groups[] =3D { + &ali_drw_pmu_events_attr_group, + &ali_drw_pmu_cpumask_attr_group, + &ali_drw_pmu_format_group, + NULL, +}; + +/* find a counter for event, then in add func, hw.idx will equal to counte= r */ +static int ali_drw_get_counter_idx(struct perf_event *event) +{ + struct ali_drw_pmu *drw_pmu =3D to_ali_drw_pmu(event->pmu); + int idx; + + for (idx =3D 0; idx < ALI_DRW_PMU_COMMON_MAX_COUNTERS; ++idx) { + if (!test_and_set_bit(idx, drw_pmu->used_mask)) + return idx; + } + + /* The counters are all in use. */ + return -EBUSY; +} + +static u64 ali_drw_pmu_read_counter(struct perf_event *event) +{ + struct ali_drw_pmu *drw_pmu =3D to_ali_drw_pmu(event->pmu); + u64 cycle_high, cycle_low; + + if (GET_DRW_EVENTID(event) =3D=3D ALI_DRW_PMU_CYCLE_EVT_ID) { + cycle_high =3D readl(drw_pmu->cfg_base + ALI_DRW_PMU_CYCLE_CNT_HIGH); + cycle_high &=3D ALI_DRW_PMU_CYCLE_CNT_HIGH_MASK; + cycle_low =3D readl(drw_pmu->cfg_base + ALI_DRW_PMU_CYCLE_CNT_LOW); + cycle_low &=3D ALI_DRW_PMU_CYCLE_CNT_LOW_MASK; + return (cycle_high << 32 | cycle_low); + } + + return readl(drw_pmu->cfg_base + + ALI_DRW_PMU_COMMON_COUNTERn(event->hw.idx)); +} + +static void ali_drw_pmu_event_update(struct perf_event *event) +{ + struct hw_perf_event *hwc =3D &event->hw; + u64 delta, prev, now; + + do { + prev =3D local64_read(&hwc->prev_count); + now =3D ali_drw_pmu_read_counter(event); + } while (local64_cmpxchg(&hwc->prev_count, prev, now) !=3D prev); + + /* handle overflow. */ + delta =3D now - prev; + if (GET_DRW_EVENTID(event) =3D=3D ALI_DRW_PMU_CYCLE_EVT_ID) + delta &=3D ALI_DRW_PMU_OV_INTR_MASK; + else + delta &=3D ALI_DRW_CNT_MAX_PERIOD; + local64_add(delta, &event->count); +} + +static void ali_drw_pmu_event_set_period(struct perf_event *event) +{ + u64 pre_val; + struct ali_drw_pmu *drw_pmu =3D to_ali_drw_pmu(event->pmu); + + /* set a preload counter for test purpose */ + writel(ALI_DRW_PMU_TEST_SEL_COMMON_COUNTER_BASE + event->hw.idx, + drw_pmu->cfg_base + ALI_DRW_PMU_TEST_CTRL); + + /* set conunter initial value */ + pre_val =3D ALI_DRW_PMU_CNT_INIT; + writel(pre_val, drw_pmu->cfg_base + ALI_DRW_PMU_CNT_PRELOAD); + local64_set(&event->hw.prev_count, pre_val); + + /* set sel mode to zero to start test */ + writel(0x0, drw_pmu->cfg_base + ALI_DRW_PMU_TEST_CTRL); +} + +static void ali_drw_pmu_enable_counter(struct perf_event *event) +{ + u32 val, subval, reg, shift; + int counter =3D event->hw.idx; + struct ali_drw_pmu *drw_pmu =3D to_ali_drw_pmu(event->pmu); + + reg =3D ALI_DRW_PMU_EVENT_SELn(counter); + val =3D readl(drw_pmu->cfg_base + reg); + subval =3D FIELD_PREP(ALI_DRW_PMCOM_CNT_EN, 1) | + FIELD_PREP(ALI_DRW_PMCOM_CNT_EVENT_MASK, drw_pmu->evtids[counter]); + + shift =3D ALI_DRW_PMCOM_CNT_EVENT_OFFSET(counter); + val &=3D ~(GENMASK(7, 0) << shift); + val |=3D subval << shift; + + writel(val, drw_pmu->cfg_base + reg); +} + +static void ali_drw_pmu_disable_counter(struct perf_event *event) +{ + u32 val, reg, subval, shift; + struct ali_drw_pmu *drw_pmu =3D to_ali_drw_pmu(event->pmu); + int counter =3D event->hw.idx; + + reg =3D ALI_DRW_PMU_EVENT_SELn(counter); + val =3D readl(drw_pmu->cfg_base + reg); + subval =3D FIELD_PREP(ALI_DRW_PMCOM_CNT_EN, 0) | + FIELD_PREP(ALI_DRW_PMCOM_CNT_EVENT_MASK, 0); + + shift =3D ALI_DRW_PMCOM_CNT_EVENT_OFFSET(counter); + val &=3D ~(GENMASK(7, 0) << shift); + val |=3D subval << shift; + + writel(val, drw_pmu->cfg_base + reg); +} + +static irqreturn_t ali_drw_pmu_isr(int irq_num, void *data) +{ + struct ali_drw_pmu_irq *irq =3D data; + struct ali_drw_pmu *drw_pmu; + irqreturn_t ret =3D IRQ_NONE; + + rcu_read_lock(); + list_for_each_entry_rcu(drw_pmu, &irq->pmus_node, pmus_node) { + unsigned long status, clr_status; + struct perf_event *event; + unsigned int idx; + + for (idx =3D 0; idx < ALI_DRW_PMU_COMMON_MAX_COUNTERS; idx++) { + event =3D drw_pmu->events[idx]; + if (!event) + continue; + ali_drw_pmu_disable_counter(event); + } + + /* common counter intr status */ + status =3D readl(drw_pmu->cfg_base + ALI_DRW_PMU_OV_INTR_STATUS); + status =3D FIELD_GET(ALI_DRW_PMCOM_CNT_OV_INTR_MASK, status); + if (status) { + for_each_set_bit(idx, &status, + ALI_DRW_PMU_COMMON_MAX_COUNTERS) { + event =3D drw_pmu->events[idx]; + if (WARN_ON_ONCE(!event)) + continue; + ali_drw_pmu_event_update(event); + ali_drw_pmu_event_set_period(event); + } + + /* clear common counter intr status */ + clr_status =3D FIELD_PREP(ALI_DRW_PMCOM_CNT_OV_INTR_MASK, 1); + writel(clr_status, + drw_pmu->cfg_base + ALI_DRW_PMU_OV_INTR_CLR); + } + + for (idx =3D 0; idx < ALI_DRW_PMU_COMMON_MAX_COUNTERS; idx++) { + event =3D drw_pmu->events[idx]; + if (!event) + continue; + if (!(event->hw.state & PERF_HES_STOPPED)) + ali_drw_pmu_enable_counter(event); + } + if (status) + ret =3D IRQ_HANDLED; + } + rcu_read_unlock(); + return ret; +} + +static struct ali_drw_pmu_irq *__ali_drw_pmu_init_irq(struct platform_devi= ce + *pdev, int irq_num) +{ + int ret; + struct ali_drw_pmu_irq *irq; + + list_for_each_entry(irq, &ali_drw_pmu_irqs, irqs_node) { + if (irq->irq_num =3D=3D irq_num + && refcount_inc_not_zero(&irq->refcount)) + return irq; + } + + irq =3D kzalloc(sizeof(*irq), GFP_KERNEL); + if (!irq) + return ERR_PTR(-ENOMEM); + + INIT_LIST_HEAD(&irq->pmus_node); + + /* Pick one CPU to be the preferred one to use */ + irq->cpu =3D smp_processor_id(); + refcount_set(&irq->refcount, 1); + + /* + * FIXME: one of DDRSS Driveway PMU overflow interrupt shares the same + * irq number with MPAM ERR_IRQ. To register DDRSS PMU and MPAM drivers + * successfully, add IRQF_SHARED flag. Howerer, PMU interrupt should not + * share with other component. + */ + ret =3D devm_request_irq(&pdev->dev, irq_num, ali_drw_pmu_isr, + IRQF_SHARED, dev_name(&pdev->dev), irq); + if (ret < 0) { + dev_err(&pdev->dev, + "Fail to request IRQ:%d ret:%d\n", irq_num, ret); + goto out_free; + } + + ret =3D irq_set_affinity_hint(irq_num, cpumask_of(irq->cpu)); + if (ret) + goto out_free; + + ret =3D cpuhp_state_add_instance_nocalls(ali_drw_cpuhp_state_num, + &irq->node); + if (ret) + goto out_free; + + irq->irq_num =3D irq_num; + list_add(&irq->irqs_node, &ali_drw_pmu_irqs); + + return irq; + +out_free: + kfree(irq); + return ERR_PTR(ret); +} + +static int ali_drw_pmu_init_irq(struct ali_drw_pmu *drw_pmu, + struct platform_device *pdev) +{ + int irq_num; + struct ali_drw_pmu_irq *irq; + + /* Read and init IRQ */ + irq_num =3D platform_get_irq(pdev, 0); + if (irq_num < 0) + return irq_num; + + mutex_lock(&ali_drw_pmu_irqs_lock); + irq =3D __ali_drw_pmu_init_irq(pdev, irq_num); + mutex_unlock(&ali_drw_pmu_irqs_lock); + + if (IS_ERR(irq)) + return PTR_ERR(irq); + + drw_pmu->irq =3D irq; + + mutex_lock(&ali_drw_pmu_irqs_lock); + list_add_rcu(&drw_pmu->pmus_node, &irq->pmus_node); + mutex_unlock(&ali_drw_pmu_irqs_lock); + + return 0; +} + +static void ali_drw_pmu_uninit_irq(struct ali_drw_pmu *drw_pmu) +{ + struct ali_drw_pmu_irq *irq =3D drw_pmu->irq; + + mutex_lock(&ali_drw_pmu_irqs_lock); + list_del_rcu(&drw_pmu->pmus_node); + + if (!refcount_dec_and_test(&irq->refcount)) { + mutex_unlock(&ali_drw_pmu_irqs_lock); + return; + } + + list_del(&irq->irqs_node); + mutex_unlock(&ali_drw_pmu_irqs_lock); + + WARN_ON(irq_set_affinity_hint(irq->irq_num, NULL)); + cpuhp_state_remove_instance_nocalls(ali_drw_cpuhp_state_num, + &irq->node); + kfree(irq); +} + +static int ali_drw_pmu_event_init(struct perf_event *event) +{ + struct ali_drw_pmu *drw_pmu =3D to_ali_drw_pmu(event->pmu); + struct hw_perf_event *hwc =3D &event->hw; + struct perf_event *sibling; + struct device *dev =3D drw_pmu->pmu.dev; + + if (event->attr.type !=3D event->pmu->type) + return -ENOENT; + + if (is_sampling_event(event)) { + dev_err(dev, "Sampling not supported!\n"); + return -EOPNOTSUPP; + } + + if (event->attach_state & PERF_ATTACH_TASK) { + dev_err(dev, "Per-task counter cannot allocate!\n"); + return -EOPNOTSUPP; + } + + event->cpu =3D drw_pmu->cpu; + if (event->cpu < 0) { + dev_err(dev, "Per-task mode not supported!\n"); + return -EOPNOTSUPP; + } + + if (event->group_leader !=3D event && + !is_software_event(event->group_leader)) { + dev_err(dev, "driveway only allow one event!\n"); + return -EINVAL; + } + + for_each_sibling_event(sibling, event->group_leader) { + if (sibling !=3D event && !is_software_event(sibling)) { + dev_err(dev, "driveway event not allowed!\n"); + return -EINVAL; + } + } + + /* reset all the pmu counters */ + writel(ALI_DRW_PMU_CNT_RST, drw_pmu->cfg_base + ALI_DRW_PMU_CNT_CTRL); + + hwc->idx =3D -1; + + return 0; +} + +static void ali_drw_pmu_start(struct perf_event *event, int flags) +{ + struct ali_drw_pmu *drw_pmu =3D to_ali_drw_pmu(event->pmu); + + event->hw.state =3D 0; + + if (GET_DRW_EVENTID(event) =3D=3D ALI_DRW_PMU_CYCLE_EVT_ID) { + writel(ALI_DRW_PMU_CNT_START, + drw_pmu->cfg_base + ALI_DRW_PMU_CNT_CTRL); + return; + } + + ali_drw_pmu_event_set_period(event); + if (flags & PERF_EF_RELOAD) { + unsigned long prev_raw_count =3D + local64_read(&event->hw.prev_count); + writel(prev_raw_count, + drw_pmu->cfg_base + ALI_DRW_PMU_CNT_PRELOAD); + } + + ali_drw_pmu_enable_counter(event); + + writel(ALI_DRW_PMU_CNT_START, drw_pmu->cfg_base + ALI_DRW_PMU_CNT_CTRL); +} + +static void ali_drw_pmu_stop(struct perf_event *event, int flags) +{ + struct ali_drw_pmu *drw_pmu =3D to_ali_drw_pmu(event->pmu); + + if (event->hw.state & PERF_HES_STOPPED) + return; + + if (GET_DRW_EVENTID(event) !=3D ALI_DRW_PMU_CYCLE_EVT_ID) + ali_drw_pmu_disable_counter(event); + + writel(ALI_DRW_PMU_CNT_STOP, drw_pmu->cfg_base + ALI_DRW_PMU_CNT_CTRL); + + ali_drw_pmu_event_update(event); + event->hw.state |=3D PERF_HES_STOPPED | PERF_HES_UPTODATE; +} + +static int ali_drw_pmu_add(struct perf_event *event, int flags) +{ + struct ali_drw_pmu *drw_pmu =3D to_ali_drw_pmu(event->pmu); + struct hw_perf_event *hwc =3D &event->hw; + int idx =3D -1; + int evtid; + + evtid =3D GET_DRW_EVENTID(event); + + if (evtid !=3D ALI_DRW_PMU_CYCLE_EVT_ID) { + idx =3D ali_drw_get_counter_idx(event); + if (idx < 0) + return idx; + drw_pmu->events[idx] =3D event; + drw_pmu->evtids[idx] =3D evtid; + } + hwc->idx =3D idx; + + hwc->state =3D PERF_HES_STOPPED | PERF_HES_UPTODATE; + + if (flags & PERF_EF_START) + ali_drw_pmu_start(event, PERF_EF_RELOAD); + + /* Propagate our changes to the userspace mapping. */ + perf_event_update_userpage(event); + + return 0; +} + +static void ali_drw_pmu_del(struct perf_event *event, int flags) +{ + struct ali_drw_pmu *drw_pmu =3D to_ali_drw_pmu(event->pmu); + struct hw_perf_event *hwc =3D &event->hw; + int idx =3D hwc->idx; + + ali_drw_pmu_stop(event, PERF_EF_UPDATE); + + if (idx >=3D 0 && idx < ALI_DRW_PMU_COMMON_MAX_COUNTERS) { + drw_pmu->events[idx] =3D NULL; + drw_pmu->evtids[idx] =3D 0; + clear_bit(idx, drw_pmu->used_mask); + } + + perf_event_update_userpage(event); +} + +static void ali_drw_pmu_read(struct perf_event *event) +{ + ali_drw_pmu_event_update(event); +} + +static int ali_drw_pmu_probe(struct platform_device *pdev) +{ + struct ali_drw_pmu *drw_pmu; + struct resource *res; + char *name; + int ret; + + drw_pmu =3D devm_kzalloc(&pdev->dev, sizeof(*drw_pmu), GFP_KERNEL); + if (!drw_pmu) + return -ENOMEM; + + drw_pmu->dev =3D &pdev->dev; + platform_set_drvdata(pdev, drw_pmu); + + res =3D platform_get_resource(pdev, IORESOURCE_MEM, 0); + drw_pmu->cfg_base =3D devm_ioremap_resource(&pdev->dev, res); + if (!drw_pmu->cfg_base) + return -ENOMEM; + + name =3D devm_kasprintf(drw_pmu->dev, GFP_KERNEL, "ali_drw_%llx", + (u64) (res->start >> ALI_DRW_PMU_PA_SHIFT)); + if (!name) + return -ENOMEM; + + writel(ALI_DRW_PMU_CNT_RST, drw_pmu->cfg_base + ALI_DRW_PMU_CNT_CTRL); + + /* enable the generation of interrupt by all common counters */ + writel(ALI_DRW_PMCOM_CNT_OV_INTR_MASK, + drw_pmu->cfg_base + ALI_DRW_PMU_OV_INTR_ENABLE_CTL); + + /* clearing interrupt status */ + writel(0xffffff, drw_pmu->cfg_base + ALI_DRW_PMU_OV_INTR_CLR); + + drw_pmu->cpu =3D smp_processor_id(); + + ret =3D ali_drw_pmu_init_irq(drw_pmu, pdev); + if (ret) + return ret; + + drw_pmu->pmu =3D (struct pmu) { + .module =3D THIS_MODULE, + .task_ctx_nr =3D perf_invalid_context, + .event_init =3D ali_drw_pmu_event_init, + .add =3D ali_drw_pmu_add, + .del =3D ali_drw_pmu_del, + .start =3D ali_drw_pmu_start, + .stop =3D ali_drw_pmu_stop, + .read =3D ali_drw_pmu_read, + .attr_groups =3D ali_drw_pmu_attr_groups, + .capabilities =3D PERF_PMU_CAP_NO_EXCLUDE, + }; + + ret =3D perf_pmu_register(&drw_pmu->pmu, name, -1); + if (ret) { + dev_err(drw_pmu->dev, "DRW Driveway PMU PMU register failed!\n"); + ali_drw_pmu_uninit_irq(drw_pmu); + } + + return ret; +} + +static int ali_drw_pmu_remove(struct platform_device *pdev) +{ + struct ali_drw_pmu *drw_pmu =3D platform_get_drvdata(pdev); + + /* disable the generation of interrupt by all common counters */ + writel(ALI_DRW_PMCOM_CNT_OV_INTR_MASK, + drw_pmu->cfg_base + ALI_DRW_PMU_OV_INTR_DISABLE_CTL); + + ali_drw_pmu_uninit_irq(drw_pmu); + perf_pmu_unregister(&drw_pmu->pmu); + + return 0; +} + +static int ali_drw_pmu_offline_cpu(unsigned int cpu, struct hlist_node *no= de) +{ + struct ali_drw_pmu_irq *irq; + struct ali_drw_pmu *drw_pmu; + unsigned int target; + int ret; + cpumask_t node_online_cpus; + + irq =3D hlist_entry_safe(node, struct ali_drw_pmu_irq, node); + if (cpu !=3D irq->cpu) + return 0; + + ret =3D cpumask_and(&node_online_cpus, + cpumask_of_node(cpu_to_node(cpu)), cpu_online_mask); + if (ret) + target =3D cpumask_any_but(&node_online_cpus, cpu); + else + target =3D cpumask_any_but(cpu_online_mask, cpu); + + if (target >=3D nr_cpu_ids) + return 0; + + /* We're only reading, but this isn't the place to be involving RCU */ + mutex_lock(&ali_drw_pmu_irqs_lock); + list_for_each_entry(drw_pmu, &irq->pmus_node, pmus_node) + perf_pmu_migrate_context(&drw_pmu->pmu, irq->cpu, target); + mutex_unlock(&ali_drw_pmu_irqs_lock); + + WARN_ON(irq_set_affinity_hint(irq->irq_num, cpumask_of(target))); + irq->cpu =3D target; + + return 0; +} + +/* + * Due to historical reasons, the HID used in the production environment is + * ARMHD700, so we leave ARMHD700 for compatibility. + */ +static const struct acpi_device_id ali_drw_acpi_match[] =3D { + {"BABA5000", 0}, + {"ARMHD700", 0}, + {} +}; + +MODULE_DEVICE_TABLE(acpi, ali_drw_acpi_match); + +static struct platform_driver ali_drw_pmu_driver =3D { + .driver =3D { + .name =3D "ali_drw_pmu", + .acpi_match_table =3D ACPI_PTR(ali_drw_acpi_match), + }, + .probe =3D ali_drw_pmu_probe, + .remove =3D ali_drw_pmu_remove, +}; + +static int __init ali_drw_pmu_init(void) +{ + int ret; + + ret =3D cpuhp_setup_state_multi(CPUHP_AP_ONLINE_DYN, + "ali_drw_pmu:online", + NULL, ali_drw_pmu_offline_cpu); + + if (ret < 0) { + pr_err("DRW Driveway PMU: setup hotplug failed, ret =3D %d\n", + ret); + return ret; + } + ali_drw_cpuhp_state_num =3D ret; + + ret =3D platform_driver_register(&ali_drw_pmu_driver); + if (ret) + cpuhp_remove_multi_state(ali_drw_cpuhp_state_num); + + return ret; +} + +static void __exit ali_drw_pmu_exit(void) +{ + platform_driver_unregister(&ali_drw_pmu_driver); + cpuhp_remove_multi_state(ali_drw_cpuhp_state_num); +} + +module_init(ali_drw_pmu_init); +module_exit(ali_drw_pmu_exit); + +MODULE_AUTHOR("Hongbo Yao "); +MODULE_AUTHOR("Neng Chen "); +MODULE_AUTHOR("Shuai Xue "); +MODULE_DESCRIPTION("Alibaba DDR Sub-System Driveway PMU driver"); +MODULE_LICENSE("GPL v2"); --=20 2.20.1.9.gb50a0d7 From nobody Mon Apr 27 03:25:06 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id E951FC43334 for ; Fri, 17 Jun 2022 11:18:53 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1381629AbiFQLSv (ORCPT ); Fri, 17 Jun 2022 07:18:51 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47946 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235059AbiFQLSm (ORCPT ); Fri, 17 Jun 2022 07:18:42 -0400 Received: from out30-42.freemail.mail.aliyun.com (out30-42.freemail.mail.aliyun.com [115.124.30.42]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 829FA6B640 for ; Fri, 17 Jun 2022 04:18:40 -0700 (PDT) X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R131e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=ay29a033018046060;MF=xueshuai@linux.alibaba.com;NM=1;PH=DS;RN=8;SR=0;TI=SMTPD_---0VGernkM_1655464717; Received: from localhost.localdomain(mailfrom:xueshuai@linux.alibaba.com fp:SMTPD_---0VGernkM_1655464717) by smtp.aliyun-inc.com; Fri, 17 Jun 2022 19:18:38 +0800 From: Shuai Xue To: linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, will@kernel.org, mark.rutland@arm.com Cc: baolin.wang@linux.alibaba.com, yaohongbo@linux.alibaba.com, nengchen@linux.alibaba.com, zhuo.song@linux.alibaba.com Subject: [PATCH v1 2/3] drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710 SoC Date: Fri, 17 Jun 2022 19:18:24 +0800 Message-Id: <20220617111825.92911-3-xueshuai@linux.alibaba.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20220617111825.92911-1-xueshuai@linux.alibaba.com> References: <20220617111825.92911-1-xueshuai@linux.alibaba.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" Add the DDR Sub-System Driveway Performance Monitoring Unit (PMU) driver support for Alibaba T-Head Yitian 710 SoC chip. Yitian supports the most advanced DDR5/4 DRAM to provide tremendous memory bandwidth for cloud computing and HPC. Each PMU is registered as a device in /sys/bus/event_source/devices, and users can select event to monitor in each sub-channel, independently. For example, ali_drw_21000 and ali_drw_21080 are two PMU devices for two sub-channels of the same channel in die 0. And the PMU device of die 1 is prefixed with ali_drw_400XXXXX, e.g. ali_drw_40021000. Signed-off-by: Shuai Xue Signed-off-by: Hongbo Yao Signed-off-by: Neng Chen Reported-by: kernel test robot Reviewed-by: Baolin Wang --- drivers/perf/Kconfig | 8 + drivers/perf/Makefile | 1 + drivers/perf/alibaba_uncore_drw_pmu.c | 771 ++++++++++++++++++++++++++ 3 files changed, 780 insertions(+) create mode 100644 drivers/perf/alibaba_uncore_drw_pmu.c diff --git a/drivers/perf/Kconfig b/drivers/perf/Kconfig index 1e2d69453771..dfafba4cb066 100644 --- a/drivers/perf/Kconfig +++ b/drivers/perf/Kconfig @@ -183,6 +183,14 @@ config APPLE_M1_CPU_PMU Provides support for the non-architectural CPU PMUs present on the Apple M1 SoCs and derivatives. =20 +config ALIBABA_UNCORE_DRW_PMU + tristate "Alibaba T-Head Yitian 710 DDR Sub-system Driveway PMU driver" + depends on (ARM64 && ACPI) + default m + help + Support for Driveway PMU events monitoring on Yitian 710 DDR + Sub-system. + source "drivers/perf/hisilicon/Kconfig" =20 config MARVELL_CN10K_DDR_PMU diff --git a/drivers/perf/Makefile b/drivers/perf/Makefile index 57a279c61df5..050d04ee19dd 100644 --- a/drivers/perf/Makefile +++ b/drivers/perf/Makefile @@ -20,3 +20,4 @@ obj-$(CONFIG_ARM_DMC620_PMU) +=3D arm_dmc620_pmu.o obj-$(CONFIG_MARVELL_CN10K_TAD_PMU) +=3D marvell_cn10k_tad_pmu.o obj-$(CONFIG_MARVELL_CN10K_DDR_PMU) +=3D marvell_cn10k_ddr_pmu.o obj-$(CONFIG_APPLE_M1_CPU_PMU) +=3D apple_m1_cpu_pmu.o +obj-$(CONFIG_ALIBABA_UNCORE_DRW_PMU) +=3D alibaba_uncore_drw_pmu.o diff --git a/drivers/perf/alibaba_uncore_drw_pmu.c b/drivers/perf/alibaba_u= ncore_drw_pmu.c new file mode 100644 index 000000000000..9228ff672276 --- /dev/null +++ b/drivers/perf/alibaba_uncore_drw_pmu.c @@ -0,0 +1,771 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Alibaba DDR Sub-System Driveway PMU driver + * + * Copyright (C) 2022 Alibaba Inc + */ + +#define ALI_DRW_PMUNAME "ali_drw" +#define ALI_DRW_DRVNAME ALI_DRW_PMUNAME "_pmu" +#define pr_fmt(fmt) ALI_DRW_DRVNAME ": " fmt + +#include +#include +#include + +#define ALI_DRW_PMU_COMMON_MAX_COUNTERS 16 +#define ALI_DRW_PMU_TEST_SEL_COMMON_COUNTER_BASE 19 + +#define ALI_DRW_PMU_PA_SHIFT 12 +#define ALI_DRW_PMU_CNT_INIT 0x00000000 +#define ALI_DRW_CNT_MAX_PERIOD 0xffffffff +#define ALI_DRW_PMU_CYCLE_EVT_ID 0x80 + +#define ALI_DRW_PMU_CNT_CTRL 0xC00 +#define ALI_DRW_PMU_CNT_RST BIT(2) +#define ALI_DRW_PMU_CNT_STOP BIT(1) +#define ALI_DRW_PMU_CNT_START BIT(0) + +#define ALI_DRW_PMU_CNT_STATE 0xC04 +#define ALI_DRW_PMU_TEST_CTRL 0xC08 +#define ALI_DRW_PMU_CNT_PRELOAD 0xC0C + +#define ALI_DRW_PMU_CYCLE_CNT_HIGH_MASK GENMASK(23, 0) +#define ALI_DRW_PMU_CYCLE_CNT_LOW_MASK GENMASK(31, 0) +#define ALI_DRW_PMU_CYCLE_CNT_HIGH 0xC10 +#define ALI_DRW_PMU_CYCLE_CNT_LOW 0xC14 + +/* PMU EVENT SEL 0-3 are paired in 32-bit registers on a 4-byte stride */ +#define ALI_DRW_PMU_EVENT_SEL0 0xC68 +/* counter 0-3 use sel0, counter 4-7 use sel1...*/ +#define ALI_DRW_PMU_EVENT_SELn(n) \ + (ALI_DRW_PMU_EVENT_SEL0 + (n / 4) * 0x4) +#define ALI_DRW_PMCOM_CNT_EVENT_OFFSET(n) \ + (8 * (n % 4)) +/* counter 0: bit 7, counter 1: bit 15, counter 2: bit 23... */ +#define ALI_DRW_PMCOM_CNT_EN_OFFSET(n) \ + (7 + 8 * (n % 4)) + +/* PMU COMMON COUNTER 0-15, are paired in 32-bit registers on a 4-byte str= ide */ +#define ALI_DRW_PMU_COMMON_COUNTER0 0xC78 +#define ALI_DRW_PMU_COMMON_COUNTERn(n) \ + (ALI_DRW_PMU_COMMON_COUNTER0 + 0x4 * (n)) + +#define ALI_DRW_PMU_OV_INTR_ENABLE_CTL 0xCB8 +#define ALI_DRW_PMU_OV_INTR_DISABLE_CTL 0xCBC +#define ALI_DRW_PMU_OV_INTR_ENABLE_STATUS 0xCC0 +#define ALI_DRW_PMU_OV_INTR_CLR 0xCC4 +#define ALI_DRW_PMCOM_CNT_OV_INTR_MASK GENMASK(23, 8) +#define ALI_DRW_PMBW_CNT_OV_INTR_MASK GENMASK(7, 0) +#define ALI_DRW_PMU_OV_INTR_STATUS 0xCC8 +#define ALI_DRW_PMU_OV_INTR_MASK GENMASK_ULL(63, 0) + +static int ali_drw_cpuhp_state_num; + +static LIST_HEAD(ali_drw_pmu_irqs); +static DEFINE_MUTEX(ali_drw_pmu_irqs_lock); + +struct ali_drw_pmu_irq { + struct hlist_node node; + struct list_head irqs_node; + struct list_head pmus_node; + int irq_num; + int cpu; + refcount_t refcount; +}; + +struct ali_drw_pmu { + void __iomem *cfg_base; + struct device *dev; + + struct list_head pmus_node; + struct ali_drw_pmu_irq *irq; + int irq_num; + int cpu; + DECLARE_BITMAP(used_mask, ALI_DRW_PMU_COMMON_MAX_COUNTERS); + struct perf_event *events[ALI_DRW_PMU_COMMON_MAX_COUNTERS]; + int evtids[ALI_DRW_PMU_COMMON_MAX_COUNTERS]; + + struct pmu pmu; +}; + +#define to_ali_drw_pmu(p) (container_of(p, struct ali_drw_pmu, pmu)) + +#define DRW_CONFIG_EVENTID GENMASK(7, 0) +#define GET_DRW_EVENTID(event) FIELD_GET(DRW_CONFIG_EVENTID, (event)->attr= .config) + +ssize_t ali_drw_pmu_format_show(struct device *dev, + struct device_attribute *attr, char *buf) +{ + struct dev_ext_attribute *eattr; + + eattr =3D container_of(attr, struct dev_ext_attribute, attr); + + return sprintf(buf, "%s\n", (char *)eattr->var); +} + +/* + * PMU event attributes + */ +ssize_t ali_drw_pmu_event_show(struct device *dev, + struct device_attribute *attr, char *page) +{ + struct dev_ext_attribute *eattr; + + eattr =3D container_of(attr, struct dev_ext_attribute, attr); + + return sprintf(page, "config=3D0x%lx\n", (unsigned long)eattr->var); +} + +#define ALI_DRW_PMU_ATTR(_name, _func, _config) = \ + (&((struct dev_ext_attribute[]) { \ + { __ATTR(_name, 0444, _func, NULL), (void *)_config } \ + })[0].attr.attr) + +#define ALI_DRW_PMU_FORMAT_ATTR(_name, _config) \ + ALI_DRW_PMU_ATTR(_name, ali_drw_pmu_format_show, (void *)_config) +#define ALI_DRW_PMU_EVENT_ATTR(_name, _config) \ + ALI_DRW_PMU_ATTR(_name, ali_drw_pmu_event_show, (unsigned long)_config) + +static struct attribute *ali_drw_pmu_events_attrs[] =3D { + ALI_DRW_PMU_EVENT_ATTR(perf_hif_rd_or_wr, 0x0), + ALI_DRW_PMU_EVENT_ATTR(perf_hif_wr, 0x1), + ALI_DRW_PMU_EVENT_ATTR(perf_hif_rd, 0x2), + ALI_DRW_PMU_EVENT_ATTR(perf_hif_rmw, 0x3), + ALI_DRW_PMU_EVENT_ATTR(perf_hif_hi_pri_rd, 0x4), + ALI_DRW_PMU_EVENT_ATTR(perf_dfi_wr_data_cycles, 0x7), + ALI_DRW_PMU_EVENT_ATTR(perf_dfi_rd_data_cycles, 0x8), + ALI_DRW_PMU_EVENT_ATTR(perf_hpr_xact_when_critical, 0x9), + ALI_DRW_PMU_EVENT_ATTR(perf_lpr_xact_when_critical, 0xA), + ALI_DRW_PMU_EVENT_ATTR(perf_wpr_xact_when_critical, 0xB), + ALI_DRW_PMU_EVENT_ATTR(perf_op_is_activate, 0xC), + ALI_DRW_PMU_EVENT_ATTR(perf_op_is_rd_or_wr, 0xD), + ALI_DRW_PMU_EVENT_ATTR(perf_op_is_rd_activate, 0xE), + ALI_DRW_PMU_EVENT_ATTR(perf_op_is_rd, 0xF), + ALI_DRW_PMU_EVENT_ATTR(perf_op_is_wr, 0x10), + ALI_DRW_PMU_EVENT_ATTR(perf_op_is_mwr, 0x11), + ALI_DRW_PMU_EVENT_ATTR(perf_op_is_precharge, 0x12), + ALI_DRW_PMU_EVENT_ATTR(perf_precharge_for_rdwr, 0x13), + ALI_DRW_PMU_EVENT_ATTR(perf_precharge_for_other, 0x14), + ALI_DRW_PMU_EVENT_ATTR(perf_rdwr_transitions, 0x15), + ALI_DRW_PMU_EVENT_ATTR(perf_write_combine, 0x16), + ALI_DRW_PMU_EVENT_ATTR(perf_war_hazard, 0x17), + ALI_DRW_PMU_EVENT_ATTR(perf_raw_hazard, 0x18), + ALI_DRW_PMU_EVENT_ATTR(perf_waw_hazard, 0x19), + ALI_DRW_PMU_EVENT_ATTR(perf_op_is_enter_selfref_rk0, 0x1A), + ALI_DRW_PMU_EVENT_ATTR(perf_op_is_enter_selfref_rk1, 0x1B), + ALI_DRW_PMU_EVENT_ATTR(perf_op_is_enter_selfref_rk2, 0x1C), + ALI_DRW_PMU_EVENT_ATTR(perf_op_is_enter_selfref_rk3, 0x1D), + ALI_DRW_PMU_EVENT_ATTR(perf_op_is_enter_powerdown_rk0, 0x1E), + ALI_DRW_PMU_EVENT_ATTR(perf_op_is_enter_powerdown_rk1, 0x1F), + ALI_DRW_PMU_EVENT_ATTR(perf_op_is_enter_powerdown_rk2, 0x20), + ALI_DRW_PMU_EVENT_ATTR(perf_op_is_enter_powerdown_rk3, 0x21), + ALI_DRW_PMU_EVENT_ATTR(perf_selfref_mode_rk0, 0x26), + ALI_DRW_PMU_EVENT_ATTR(perf_selfref_mode_rk1, 0x27), + ALI_DRW_PMU_EVENT_ATTR(perf_selfref_mode_rk2, 0x28), + ALI_DRW_PMU_EVENT_ATTR(perf_selfref_mode_rk3, 0x29), + ALI_DRW_PMU_EVENT_ATTR(perf_op_is_refresh, 0x2A), + ALI_DRW_PMU_EVENT_ATTR(perf_op_is_crit_ref, 0x2B), + ALI_DRW_PMU_EVENT_ATTR(perf_op_is_load_mode, 0x2D), + ALI_DRW_PMU_EVENT_ATTR(perf_op_is_zqcl, 0x2E), + ALI_DRW_PMU_EVENT_ATTR(perf_visible_window_limit_reached_rd, 0x30), + ALI_DRW_PMU_EVENT_ATTR(perf_visible_window_limit_reached_wr, 0x31), + ALI_DRW_PMU_EVENT_ATTR(perf_op_is_dqsosc_mpc, 0x34), + ALI_DRW_PMU_EVENT_ATTR(perf_op_is_dqsosc_mrr, 0x35), + ALI_DRW_PMU_EVENT_ATTR(perf_op_is_tcr_mrr, 0x36), + ALI_DRW_PMU_EVENT_ATTR(perf_op_is_zqstart, 0x37), + ALI_DRW_PMU_EVENT_ATTR(perf_op_is_zqlatch, 0x38), + ALI_DRW_PMU_EVENT_ATTR(perf_chi_txreq, 0x39), + ALI_DRW_PMU_EVENT_ATTR(perf_chi_txdat, 0x3A), + ALI_DRW_PMU_EVENT_ATTR(perf_chi_rxdat, 0x3B), + ALI_DRW_PMU_EVENT_ATTR(perf_chi_rxrsp, 0x3C), + ALI_DRW_PMU_EVENT_ATTR(perf_tsz_vio, 0x3D), + ALI_DRW_PMU_EVENT_ATTR(perf_cycle, 0x80), + NULL, +}; + +static struct attribute_group ali_drw_pmu_events_attr_group =3D { + .name =3D "events", + .attrs =3D ali_drw_pmu_events_attrs, +}; + +static struct attribute *ali_drw_pmu_format_attr[] =3D { + ALI_DRW_PMU_FORMAT_ATTR(event, "config:0-7"), + NULL, +}; + +static const struct attribute_group ali_drw_pmu_format_group =3D { + .name =3D "format", + .attrs =3D ali_drw_pmu_format_attr, +}; + +static ssize_t ali_drw_pmu_cpumask_show(struct device *dev, + struct device_attribute *attr, + char *buf) +{ + struct ali_drw_pmu *drw_pmu =3D to_ali_drw_pmu(dev_get_drvdata(dev)); + + return cpumap_print_to_pagebuf(true, buf, cpumask_of(drw_pmu->cpu)); +} + +static struct device_attribute ali_drw_pmu_cpumask_attr =3D + __ATTR(cpumask, 0444, ali_drw_pmu_cpumask_show, NULL); + +static struct attribute *ali_drw_pmu_cpumask_attrs[] =3D { + &ali_drw_pmu_cpumask_attr.attr, + NULL, +}; + +static const struct attribute_group ali_drw_pmu_cpumask_attr_group =3D { + .attrs =3D ali_drw_pmu_cpumask_attrs, +}; + +static const struct attribute_group *ali_drw_pmu_attr_groups[] =3D { + &ali_drw_pmu_events_attr_group, + &ali_drw_pmu_cpumask_attr_group, + &ali_drw_pmu_format_group, + NULL, +}; + +/* find a counter for event, then in add func, hw.idx will equal to counte= r */ +static int ali_drw_get_counter_idx(struct perf_event *event) +{ + struct ali_drw_pmu *drw_pmu =3D to_ali_drw_pmu(event->pmu); + int idx; + + for (idx =3D 0; idx < ALI_DRW_PMU_COMMON_MAX_COUNTERS; ++idx) { + if (!test_and_set_bit(idx, drw_pmu->used_mask)) + return idx; + } + + /* The counters are all in use. */ + return -EBUSY; +} + +static u64 ali_drw_pmu_read_counter(struct perf_event *event) +{ + struct ali_drw_pmu *drw_pmu =3D to_ali_drw_pmu(event->pmu); + u64 cycle_high, cycle_low; + + if (GET_DRW_EVENTID(event) =3D=3D ALI_DRW_PMU_CYCLE_EVT_ID) { + cycle_high =3D readl(drw_pmu->cfg_base + ALI_DRW_PMU_CYCLE_CNT_HIGH); + cycle_high &=3D ALI_DRW_PMU_CYCLE_CNT_HIGH_MASK; + cycle_low =3D readl(drw_pmu->cfg_base + ALI_DRW_PMU_CYCLE_CNT_LOW); + cycle_low &=3D ALI_DRW_PMU_CYCLE_CNT_LOW_MASK; + return (cycle_high << 32 | cycle_low); + } + + return readl(drw_pmu->cfg_base + + ALI_DRW_PMU_COMMON_COUNTERn(event->hw.idx)); +} + +static void ali_drw_pmu_event_update(struct perf_event *event) +{ + struct hw_perf_event *hwc =3D &event->hw; + u64 delta, prev, now; + + do { + prev =3D local64_read(&hwc->prev_count); + now =3D ali_drw_pmu_read_counter(event); + } while (local64_cmpxchg(&hwc->prev_count, prev, now) !=3D prev); + + /* handle overflow. */ + delta =3D now - prev; + if (GET_DRW_EVENTID(event) =3D=3D ALI_DRW_PMU_CYCLE_EVT_ID) + delta &=3D ALI_DRW_PMU_OV_INTR_MASK; + else + delta &=3D ALI_DRW_CNT_MAX_PERIOD; + local64_add(delta, &event->count); +} + +static void ali_drw_pmu_event_set_period(struct perf_event *event) +{ + u64 pre_val; + struct ali_drw_pmu *drw_pmu =3D to_ali_drw_pmu(event->pmu); + + /* set a preload counter for test purpose */ + writel(ALI_DRW_PMU_TEST_SEL_COMMON_COUNTER_BASE + event->hw.idx, + drw_pmu->cfg_base + ALI_DRW_PMU_TEST_CTRL); + + /* set conunter initial value */ + pre_val =3D ALI_DRW_PMU_CNT_INIT; + writel(pre_val, drw_pmu->cfg_base + ALI_DRW_PMU_CNT_PRELOAD); + local64_set(&event->hw.prev_count, pre_val); + + /* set sel mode to zero to start test */ + writel(0x0, drw_pmu->cfg_base + ALI_DRW_PMU_TEST_CTRL); +} + +static void ali_drw_pmu_enable_counter(struct perf_event *event) +{ + u32 val, reg; + int counter =3D event->hw.idx; + struct ali_drw_pmu *drw_pmu =3D to_ali_drw_pmu(event->pmu); + + reg =3D ALI_DRW_PMU_EVENT_SELn(counter); + val =3D readl(drw_pmu->cfg_base + reg); + /* enable the common counter n */ + val |=3D (1 << ALI_DRW_PMCOM_CNT_EN_OFFSET(counter)); + /* decides the event which will be counted by the common counter n */ + val |=3D ((drw_pmu->evtids[counter] & 0x3F) << ALI_DRW_PMCOM_CNT_EVENT_OF= FSET(counter)); + writel(val, drw_pmu->cfg_base + reg); +} + +static void ali_drw_pmu_disable_counter(struct perf_event *event) +{ + u32 val, reg; + struct ali_drw_pmu *drw_pmu =3D to_ali_drw_pmu(event->pmu); + int counter =3D event->hw.idx; + + reg =3D ALI_DRW_PMU_EVENT_SELn(counter); + val =3D readl(drw_pmu->cfg_base + reg); + val &=3D ~(1 << ALI_DRW_PMCOM_CNT_EN_OFFSET(counter)); + val &=3D ~(0x3F << ALI_DRW_PMCOM_CNT_EVENT_OFFSET(counter)); + + writel(val, drw_pmu->cfg_base + reg); +} + +static irqreturn_t ali_drw_pmu_isr(int irq_num, void *data) +{ + struct ali_drw_pmu_irq *irq =3D data; + struct ali_drw_pmu *drw_pmu; + irqreturn_t ret =3D IRQ_NONE; + + rcu_read_lock(); + list_for_each_entry_rcu(drw_pmu, &irq->pmus_node, pmus_node) { + unsigned long status; + struct perf_event *event; + unsigned int idx; + + for (idx =3D 0; idx < ALI_DRW_PMU_COMMON_MAX_COUNTERS; idx++) { + event =3D drw_pmu->events[idx]; + if (!event) + continue; + ali_drw_pmu_disable_counter(event); + } + + /* common counter intr status */ + status =3D readl(drw_pmu->cfg_base + ALI_DRW_PMU_OV_INTR_STATUS); + status =3D (status >> 8) & 0xFFFF; + if (status) { + for_each_set_bit(idx, &status, + ALI_DRW_PMU_COMMON_MAX_COUNTERS) { + event =3D drw_pmu->events[idx]; + if (WARN_ON_ONCE(!event)) + continue; + ali_drw_pmu_event_update(event); + ali_drw_pmu_event_set_period(event); + } + + /* clear common counter intr status */ + writel((status & 0xFFFF) << 8, + drw_pmu->cfg_base + ALI_DRW_PMU_OV_INTR_CLR); + } + + for (idx =3D 0; idx < ALI_DRW_PMU_COMMON_MAX_COUNTERS; idx++) { + event =3D drw_pmu->events[idx]; + if (!event) + continue; + if (!(event->hw.state & PERF_HES_STOPPED)) + ali_drw_pmu_enable_counter(event); + } + + ret =3D IRQ_HANDLED; + } + rcu_read_unlock(); + return ret; +} + +static struct ali_drw_pmu_irq *__ali_drw_pmu_init_irq(struct platform_devi= ce + *pdev, int irq_num) +{ + int ret; + struct ali_drw_pmu_irq *irq; + + list_for_each_entry(irq, &ali_drw_pmu_irqs, irqs_node) { + if (irq->irq_num =3D=3D irq_num + && refcount_inc_not_zero(&irq->refcount)) + return irq; + } + + irq =3D kzalloc(sizeof(*irq), GFP_KERNEL); + if (!irq) + return ERR_PTR(-ENOMEM); + + INIT_LIST_HEAD(&irq->pmus_node); + + /* Pick one CPU to be the preferred one to use */ + irq->cpu =3D smp_processor_id(); + refcount_set(&irq->refcount, 1); + + /* + * FIXME: DDRSS Driveway PMU overflow interrupt shares the same irq + * number with MPAM ERR_IRQ. To register DDRSS PMU and MPAM drivers + * successfully, add IRQF_SHARED flag. + */ + ret =3D devm_request_irq(&pdev->dev, irq_num, ali_drw_pmu_isr, + IRQF_SHARED, dev_name(&pdev->dev), irq); + if (ret < 0) { + dev_err(&pdev->dev, + "Fail to request IRQ:%d ret:%d\n", irq_num, ret); + goto out_free; + } + + ret =3D irq_set_affinity_hint(irq_num, cpumask_of(irq->cpu)); + if (ret) + goto out_free; + + ret =3D cpuhp_state_add_instance_nocalls(ali_drw_cpuhp_state_num, &irq->n= ode); + if (ret) + goto out_free; + + irq->irq_num =3D irq_num; + list_add(&irq->irqs_node, &ali_drw_pmu_irqs); + + return irq; + +out_free: + kfree(irq); + return ERR_PTR(ret); +} + +static int ali_drw_pmu_init_irq(struct ali_drw_pmu *drw_pmu, + struct platform_device *pdev) +{ + int irq_num; + struct ali_drw_pmu_irq *irq; + + /* Read and init IRQ */ + irq_num =3D platform_get_irq(pdev, 0); + if (irq_num < 0) + return irq_num; + + mutex_lock(&ali_drw_pmu_irqs_lock); + irq =3D __ali_drw_pmu_init_irq(pdev, irq_num); + mutex_unlock(&ali_drw_pmu_irqs_lock); + + if (IS_ERR(irq)) + return PTR_ERR(irq); + + drw_pmu->irq =3D irq; + + mutex_lock(&ali_drw_pmu_irqs_lock); + list_add_rcu(&drw_pmu->pmus_node, &irq->pmus_node); + mutex_unlock(&ali_drw_pmu_irqs_lock); + + return 0; +} + +static void ali_drw_pmu_uninit_irq(struct ali_drw_pmu *drw_pmu) +{ + struct ali_drw_pmu_irq *irq =3D drw_pmu->irq; + + mutex_lock(&ali_drw_pmu_irqs_lock); + list_del_rcu(&drw_pmu->pmus_node); + + if (!refcount_dec_and_test(&irq->refcount)) { + mutex_unlock(&ali_drw_pmu_irqs_lock); + return; + } + + list_del(&irq->irqs_node); + mutex_unlock(&ali_drw_pmu_irqs_lock); + + WARN_ON(irq_set_affinity_hint(irq->irq_num, NULL)); + cpuhp_state_remove_instance_nocalls(ali_drw_cpuhp_state_num, + &irq->node); + kfree(irq); +} + +static int ali_drw_pmu_event_init(struct perf_event *event) +{ + struct ali_drw_pmu *drw_pmu =3D to_ali_drw_pmu(event->pmu); + struct hw_perf_event *hwc =3D &event->hw; + struct perf_event *sibling; + struct device *dev =3D drw_pmu->pmu.dev; + + if (event->attr.type !=3D event->pmu->type) + return -ENOENT; + + if (is_sampling_event(event)) { + dev_err(dev, "Sampling not supported!\n"); + return -EOPNOTSUPP; + } + + if (event->attach_state & PERF_ATTACH_TASK) { + dev_err(dev, "Per-task counter cannot allocate!\n"); + return -EOPNOTSUPP; + } + + event->cpu =3D drw_pmu->cpu; + if (event->cpu < 0) { + dev_err(dev, "Per-task mode not supported!\n"); + return -EOPNOTSUPP; + } + + if (event->group_leader !=3D event && + !is_software_event(event->group_leader)) { + dev_err(dev, "driveway only allow one event!\n"); + return -EINVAL; + } + + for_each_sibling_event(sibling, event->group_leader) { + if (sibling !=3D event && !is_software_event(sibling)) { + dev_err(dev, "driveway event not allowed!\n"); + return -EINVAL; + } + } + + /* reset all the pmu counters */ + writel(ALI_DRW_PMU_CNT_RST, drw_pmu->cfg_base + ALI_DRW_PMU_CNT_CTRL); + + hwc->idx =3D -1; + + return 0; +} + +static void ali_drw_pmu_start(struct perf_event *event, int flags) +{ + struct ali_drw_pmu *drw_pmu =3D to_ali_drw_pmu(event->pmu); + + event->hw.state =3D 0; + + if (GET_DRW_EVENTID(event) =3D=3D ALI_DRW_PMU_CYCLE_EVT_ID) { + writel(ALI_DRW_PMU_CNT_START, + drw_pmu->cfg_base + ALI_DRW_PMU_CNT_CTRL); + return; + } + + ali_drw_pmu_event_set_period(event); + if (flags & PERF_EF_RELOAD) { + unsigned long prev_raw_count =3D + local64_read(&event->hw.prev_count); + writel(prev_raw_count, + drw_pmu->cfg_base + ALI_DRW_PMU_CNT_PRELOAD); + } + + ali_drw_pmu_enable_counter(event); + + writel(ALI_DRW_PMU_CNT_START, drw_pmu->cfg_base + ALI_DRW_PMU_CNT_CTRL); +} + +static void ali_drw_pmu_stop(struct perf_event *event, int flags) +{ + struct ali_drw_pmu *drw_pmu =3D to_ali_drw_pmu(event->pmu); + + if (event->hw.state & PERF_HES_STOPPED) + return; + + if (GET_DRW_EVENTID(event) !=3D ALI_DRW_PMU_CYCLE_EVT_ID) + ali_drw_pmu_disable_counter(event); + + writel(ALI_DRW_PMU_CNT_STOP, drw_pmu->cfg_base + ALI_DRW_PMU_CNT_CTRL); + + ali_drw_pmu_event_update(event); + event->hw.state |=3D PERF_HES_STOPPED | PERF_HES_UPTODATE; +} + +static int ali_drw_pmu_add(struct perf_event *event, int flags) +{ + struct ali_drw_pmu *drw_pmu =3D to_ali_drw_pmu(event->pmu); + struct hw_perf_event *hwc =3D &event->hw; + int idx =3D -1; + int evtid; + + evtid =3D GET_DRW_EVENTID(event); + + if (evtid !=3D ALI_DRW_PMU_CYCLE_EVT_ID) { + idx =3D ali_drw_get_counter_idx(event); + if (idx < 0) + return idx; + drw_pmu->events[idx] =3D event; + drw_pmu->evtids[idx] =3D evtid; + } + hwc->idx =3D idx; + + hwc->state =3D PERF_HES_STOPPED | PERF_HES_UPTODATE; + + if (flags & PERF_EF_START) + ali_drw_pmu_start(event, PERF_EF_RELOAD); + + /* Propagate our changes to the userspace mapping. */ + perf_event_update_userpage(event); + + return 0; +} + +static void ali_drw_pmu_del(struct perf_event *event, int flags) +{ + struct ali_drw_pmu *drw_pmu =3D to_ali_drw_pmu(event->pmu); + struct hw_perf_event *hwc =3D &event->hw; + int idx =3D hwc->idx; + + ali_drw_pmu_stop(event, PERF_EF_UPDATE); + + if (idx >=3D 0 && idx < ALI_DRW_PMU_COMMON_MAX_COUNTERS) { + drw_pmu->events[idx] =3D NULL; + drw_pmu->evtids[idx] =3D 0; + clear_bit(idx, drw_pmu->used_mask); + } + + perf_event_update_userpage(event); +} + +static void ali_drw_pmu_read(struct perf_event *event) +{ + ali_drw_pmu_event_update(event); +} + +static int ali_drw_pmu_probe(struct platform_device *pdev) +{ + struct ali_drw_pmu *drw_pmu; + struct resource *res; + char *name; + int ret; + + drw_pmu =3D devm_kzalloc(&pdev->dev, sizeof(*drw_pmu), GFP_KERNEL); + if (!drw_pmu) + return -ENOMEM; + + drw_pmu->dev =3D &pdev->dev; + platform_set_drvdata(pdev, drw_pmu); + + res =3D platform_get_resource(pdev, IORESOURCE_MEM, 0); + drw_pmu->cfg_base =3D devm_ioremap_resource(&pdev->dev, res); + if (!drw_pmu->cfg_base) + return -ENOMEM; + + name =3D devm_kasprintf(drw_pmu->dev, GFP_KERNEL, "ali_drw_%llx", + (u64) (res->start >> ALI_DRW_PMU_PA_SHIFT)); + if (!name) + return -ENOMEM; + + writel(ALI_DRW_PMU_CNT_RST, drw_pmu->cfg_base + ALI_DRW_PMU_CNT_CTRL); + + /* enable the generation of interrupt by all common counters */ + writel(ALI_DRW_PMCOM_CNT_OV_INTR_MASK, + drw_pmu->cfg_base + ALI_DRW_PMU_OV_INTR_ENABLE_CTL); + + /* clearing interrupt status */ + writel(0xffffff, drw_pmu->cfg_base + ALI_DRW_PMU_OV_INTR_CLR); + + drw_pmu->cpu =3D smp_processor_id(); + + ret =3D ali_drw_pmu_init_irq(drw_pmu, pdev); + if (ret) + return ret; + + drw_pmu->pmu =3D (struct pmu) { + .module =3D THIS_MODULE, + .task_ctx_nr =3D perf_invalid_context, + .event_init =3D ali_drw_pmu_event_init, + .add =3D ali_drw_pmu_add, + .del =3D ali_drw_pmu_del, + .start =3D ali_drw_pmu_start, + .stop =3D ali_drw_pmu_stop, + .read =3D ali_drw_pmu_read, + .attr_groups =3D ali_drw_pmu_attr_groups, + .capabilities =3D PERF_PMU_CAP_NO_EXCLUDE, + }; + + ret =3D perf_pmu_register(&drw_pmu->pmu, name, -1); + if (ret) { + dev_err(drw_pmu->dev, "DRW Driveway PMU PMU register failed!\n"); + ali_drw_pmu_uninit_irq(drw_pmu); + } + + return ret; +} + +static int ali_drw_pmu_remove(struct platform_device *pdev) +{ + struct ali_drw_pmu *drw_pmu =3D platform_get_drvdata(pdev); + + /* disable the generation of interrupt by all common counters */ + writel(ALI_DRW_PMCOM_CNT_OV_INTR_MASK, + drw_pmu->cfg_base + ALI_DRW_PMU_OV_INTR_DISABLE_CTL); + + ali_drw_pmu_uninit_irq(drw_pmu); + perf_pmu_unregister(&drw_pmu->pmu); + + return 0; +} + +static int ali_drw_pmu_offline_cpu(unsigned int cpu, struct hlist_node *no= de) +{ + struct ali_drw_pmu_irq *irq; + struct ali_drw_pmu *drw_pmu; + unsigned int target; + + irq =3D hlist_entry_safe(node, struct ali_drw_pmu_irq, node); + if (cpu !=3D irq->cpu) + return 0; + + target =3D cpumask_any_but(cpu_online_mask, cpu); + if (target >=3D nr_cpu_ids) + return 0; + + /* We're only reading, but this isn't the place to be involving RCU */ + mutex_lock(&ali_drw_pmu_irqs_lock); + list_for_each_entry(drw_pmu, &irq->pmus_node, pmus_node) + perf_pmu_migrate_context(&drw_pmu->pmu, irq->cpu, target); + mutex_unlock(&ali_drw_pmu_irqs_lock); + + WARN_ON(irq_set_affinity_hint(irq->irq_num, cpumask_of(target))); + irq->cpu =3D target; + + return 0; +} + +static const struct acpi_device_id ali_drw_acpi_match[] =3D { + {"ARMHD700", 0}, + {} +}; + +MODULE_DEVICE_TABLE(acpi, ali_drw_acpi_match); + +static struct platform_driver ali_drw_pmu_driver =3D { + .driver =3D { + .name =3D "ali_drw_pmu", + .acpi_match_table =3D ACPI_PTR(ali_drw_acpi_match), + }, + .probe =3D ali_drw_pmu_probe, + .remove =3D ali_drw_pmu_remove, +}; + +static int __init ali_drw_pmu_init(void) +{ + int ret; + + ali_drw_cpuhp_state_num =3D cpuhp_setup_state_multi(CPUHP_AP_ONLINE_DYN, + "ali_drw_pmu:online", + NULL, + ali_drw_pmu_offline_cpu); + + if (ali_drw_cpuhp_state_num < 0) { + pr_err("DRW Driveway PMU: setup hotplug failed, ret =3D %d\n", + ali_drw_cpuhp_state_num); + return ali_drw_cpuhp_state_num; + } + + ret =3D platform_driver_register(&ali_drw_pmu_driver); + if (ret) + cpuhp_remove_multi_state(ali_drw_cpuhp_state_num); + + return ret; +} + +static void __exit ali_drw_pmu_exit(void) +{ + platform_driver_unregister(&ali_drw_pmu_driver); + cpuhp_remove_multi_state(ali_drw_cpuhp_state_num); +} + +module_init(ali_drw_pmu_init); +module_exit(ali_drw_pmu_exit); + +MODULE_AUTHOR("Hongbo Yao "); +MODULE_AUTHOR("Neng Chen "); +MODULE_AUTHOR("Shuai Xue "); +MODULE_DESCRIPTION("DDR Sub-System PMU driver"); +MODULE_LICENSE("GPL v2"); --=20 2.20.1.12.g72788fdb From nobody Mon Apr 27 03:25:06 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 27337ECAAD8 for ; Wed, 14 Sep 2022 02:23:56 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229569AbiINCXx (ORCPT ); Tue, 13 Sep 2022 22:23:53 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:51722 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229696AbiINCXj (ORCPT ); Tue, 13 Sep 2022 22:23:39 -0400 Received: from out30-56.freemail.mail.aliyun.com (out30-56.freemail.mail.aliyun.com [115.124.30.56]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id BE9EB5D12E for ; Tue, 13 Sep 2022 19:23:38 -0700 (PDT) X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R751e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=ay29a033018046050;MF=xueshuai@linux.alibaba.com;NM=1;PH=DS;RN=10;SR=0;TI=SMTPD_---0VPl45l4_1663122214; Received: from localhost.localdomain(mailfrom:xueshuai@linux.alibaba.com fp:SMTPD_---0VPl45l4_1663122214) by smtp.aliyun-inc.com; Wed, 14 Sep 2022 10:23:35 +0800 From: Shuai Xue To: will@kernel.org, Jonathan.Cameron@Huawei.com, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org Cc: rdunlap@infradead.org, robin.murphy@arm.com, mark.rutland@arm.com, baolin.wang@linux.alibaba.com, zhuo.song@linux.alibaba.com, xueshuai@linux.alibaba.com Subject: [RESEND PATCH v4 3/3] MAINTAINERS: add maintainers for Alibaba' T-Head PMU driver Date: Wed, 14 Sep 2022 10:23:26 +0800 Message-Id: <20220914022326.88550-4-xueshuai@linux.alibaba.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20220617111825.92911-1-xueshuai@linux.alibaba.com> References: <20220617111825.92911-1-xueshuai@linux.alibaba.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" Add maintainers for Alibaba PMU document and driver Signed-off-by: Shuai Xue Reviewed-by: Baolin Wang --- MAINTAINERS | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/MAINTAINERS b/MAINTAINERS index 936490dcc97b..e8f095c2f5ac 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -749,6 +749,12 @@ S: Supported F: drivers/infiniband/hw/erdma F: include/uapi/rdma/erdma-abi.h =20 +ALIBABA PMU DRIVER +M: Shuai Xue +S: Supported +F: Documentation/admin-guide/perf/alibaba_pmu.rst +F: drivers/perf/alibaba_uncore_dwr_pmu.c + ALIENWARE WMI DRIVER L: Dell.Client.Kernel@dell.com S: Maintained --=20 2.20.1.12.g72788fdb From nobody Mon Apr 27 03:25:06 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4D28DC25B08 for ; Thu, 18 Aug 2022 03:18:49 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S243074AbiHRDSq (ORCPT ); Wed, 17 Aug 2022 23:18:46 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56736 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S243036AbiHRDSj (ORCPT ); Wed, 17 Aug 2022 23:18:39 -0400 Received: from out30-130.freemail.mail.aliyun.com (out30-130.freemail.mail.aliyun.com [115.124.30.130]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 1D31015801 for ; Wed, 17 Aug 2022 20:18:36 -0700 (PDT) X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R131e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=ay29a033018045176;MF=xueshuai@linux.alibaba.com;NM=1;PH=DS;RN=10;SR=0;TI=SMTPD_---0VMZAdFI_1660792712; Received: from localhost.localdomain(mailfrom:xueshuai@linux.alibaba.com fp:SMTPD_---0VMZAdFI_1660792712) by smtp.aliyun-inc.com; Thu, 18 Aug 2022 11:18:33 +0800 From: Shuai Xue To: will@kernel.org, Jonathan.Cameron@Huawei.com, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org Cc: rdunlap@infradead.org, robin.murphy@arm.com, mark.rutland@arm.com, baolin.wang@linux.alibaba.com, zhuo.song@linux.alibaba.com, xueshuai@linux.alibaba.com Subject: [PATCH v4 3/3] MAINTAINERS: add maintainers for Alibaba' T-Head PMU driver Date: Thu, 18 Aug 2022 11:18:22 +0800 Message-Id: <20220818031822.38415-4-xueshuai@linux.alibaba.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20220617111825.92911-1-xueshuai@linux.alibaba.com> References: <20220617111825.92911-1-xueshuai@linux.alibaba.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" Add maintainers for Alibaba PMU document and driver Signed-off-by: Shuai Xue Reviewed-by: Baolin Wang --- MAINTAINERS | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/MAINTAINERS b/MAINTAINERS index f512b430c7cb..64d95db9179a 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -749,6 +749,12 @@ S: Supported F: drivers/infiniband/hw/erdma F: include/uapi/rdma/erdma-abi.h =20 +ALIBABA PMU DRIVER +M: Shuai Xue +S: Supported +F: Documentation/admin-guide/perf/alibaba_pmu.rst +F: drivers/perf/alibaba_uncore_dwr_pmu.c + ALIENWARE WMI DRIVER L: Dell.Client.Kernel@dell.com S: Maintained --=20 2.20.1.12.g72788fdb From nobody Mon Apr 27 03:25:06 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 31161C43334 for ; Tue, 28 Jun 2022 03:16:55 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230341AbiF1DQw (ORCPT ); Mon, 27 Jun 2022 23:16:52 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38588 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229790AbiF1DQp (ORCPT ); Mon, 27 Jun 2022 23:16:45 -0400 Received: from out30-133.freemail.mail.aliyun.com (out30-133.freemail.mail.aliyun.com [115.124.30.133]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3D7BB1B78F for ; Mon, 27 Jun 2022 20:16:45 -0700 (PDT) X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R581e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=ay29a033018046049;MF=xueshuai@linux.alibaba.com;NM=1;PH=DS;RN=9;SR=0;TI=SMTPD_---0VHfG1yS_1656386201; Received: from localhost.localdomain(mailfrom:xueshuai@linux.alibaba.com fp:SMTPD_---0VHfG1yS_1656386201) by smtp.aliyun-inc.com; Tue, 28 Jun 2022 11:16:42 +0800 From: Shuai Xue To: linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, will@kernel.org, mark.rutland@arm.com Cc: baolin.wang@linux.alibaba.com, yaohongbo@linux.alibaba.com, nengchen@linux.alibaba.com, zhuo.song@linux.alibaba.com, xueshuai@linux.alibaba.com Subject: [PATCH v2 3/3] MAINTAINERS: add maintainers for Alibaba' T-Head PMU driver Date: Tue, 28 Jun 2022 11:16:30 +0800 Message-Id: <20220628031630.18674-4-xueshuai@linux.alibaba.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20220617111825.92911-1-xueshuai@linux.alibaba.com> References: <20220617111825.92911-1-xueshuai@linux.alibaba.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" Add maintainers for Alibaba PMU document and driver Signed-off-by: Shuai Xue Reviewed-by: Baolin Wang --- MAINTAINERS | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/MAINTAINERS b/MAINTAINERS index 55114e73de26..ccddb59c253c 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -733,6 +733,14 @@ S: Maintained F: Documentation/i2c/busses/i2c-ali1563.rst F: drivers/i2c/busses/i2c-ali1563.c =20 +ALIBABA PMU DRIVER +M: Shuai Xue +M: Hongbo Yao +M: Neng Chen +S: Supported +F: Documentation/admin-guide/perf/alibaba_pmu.rst +F: drivers/perf/alibaba_uncore_dwr_pmu.c + ALIENWARE WMI DRIVER L: Dell.Client.Kernel@dell.com S: Maintained --=20 2.20.1.9.gb50a0d7 From nobody Mon Apr 27 03:25:06 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7D2CACCA479 for ; Fri, 17 Jun 2022 11:18:56 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1381483AbiFQLSz (ORCPT ); Fri, 17 Jun 2022 07:18:55 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47662 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1381091AbiFQLSn (ORCPT ); Fri, 17 Jun 2022 07:18:43 -0400 Received: from out30-133.freemail.mail.aliyun.com (out30-133.freemail.mail.aliyun.com [115.124.30.133]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 1E0486C556 for ; Fri, 17 Jun 2022 04:18:41 -0700 (PDT) X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R191e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=ay29a033018046059;MF=xueshuai@linux.alibaba.com;NM=1;PH=DS;RN=8;SR=0;TI=SMTPD_---0VGernka_1655464718; Received: from localhost.localdomain(mailfrom:xueshuai@linux.alibaba.com fp:SMTPD_---0VGernka_1655464718) by smtp.aliyun-inc.com; Fri, 17 Jun 2022 19:18:39 +0800 From: Shuai Xue To: linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, will@kernel.org, mark.rutland@arm.com Cc: baolin.wang@linux.alibaba.com, yaohongbo@linux.alibaba.com, nengchen@linux.alibaba.com, zhuo.song@linux.alibaba.com Subject: [PATCH v1 3/3] MAINTAINERS: add maintainers for Alibaba' T-Head PMU driver Date: Fri, 17 Jun 2022 19:18:25 +0800 Message-Id: <20220617111825.92911-4-xueshuai@linux.alibaba.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20220617111825.92911-1-xueshuai@linux.alibaba.com> References: <20220617111825.92911-1-xueshuai@linux.alibaba.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" Add maintainers for Alibaba PMU document and driver Signed-off-by: Shuai Xue Reviewed-by: Baolin Wang --- MAINTAINERS | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/MAINTAINERS b/MAINTAINERS index 1fc9ead83d2a..68e02d471758 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -733,6 +733,14 @@ S: Maintained F: Documentation/i2c/busses/i2c-ali1563.rst F: drivers/i2c/busses/i2c-ali1563.c =20 +ALIBABA PMU DRIVER +M: Shuai Xue +M: Hongbo Yao +M: Neng Chen +S: Supported +F: Documentation/admin-guide/perf/alibaba_pmu.rst +F: drivers/perf/alibaba_uncore_dwr_pmu.c + ALIENWARE WMI DRIVER L: Dell.Client.Kernel@dell.com S: Maintained --=20 2.20.1.12.g72788fdb From nobody Mon Apr 27 03:25:06 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id E3964C43334 for ; Fri, 15 Jul 2022 15:13:47 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234764AbiGOPNp (ORCPT ); Fri, 15 Jul 2022 11:13:45 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:51482 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233583AbiGOPNk (ORCPT ); Fri, 15 Jul 2022 11:13:40 -0400 Received: from out30-130.freemail.mail.aliyun.com (out30-130.freemail.mail.aliyun.com [115.124.30.130]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 6FF252A1 for ; Fri, 15 Jul 2022 08:13:34 -0700 (PDT) X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R211e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=ay29a033018046049;MF=xueshuai@linux.alibaba.com;NM=1;PH=DS;RN=10;SR=0;TI=SMTPD_---0VJPkRNI_1657898007; Received: from localhost.localdomain(mailfrom:xueshuai@linux.alibaba.com fp:SMTPD_---0VJPkRNI_1657898007) by smtp.aliyun-inc.com; Fri, 15 Jul 2022 23:13:29 +0800 From: Shuai Xue To: Jonathan.Cameron@Huawei.com, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, will@kernel.org, mark.rutland@arm.com Cc: baolin.wang@linux.alibaba.com, yaohongbo@linux.alibaba.com, nengchen@linux.alibaba.com, zhuo.song@linux.alibaba.com, xueshuai@linux.alibaba.com Subject: [RESEND PATCH v2 3/3] MAINTAINERS: add maintainers for Alibaba' T-Head PMU driver Date: Fri, 15 Jul 2022 23:13:10 +0800 Message-Id: <20220715151310.90091-4-xueshuai@linux.alibaba.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20220617111825.92911-1-xueshuai@linux.alibaba.com> References: <20220617111825.92911-1-xueshuai@linux.alibaba.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" Add maintainers for Alibaba PMU document and driver Signed-off-by: Shuai Xue Reviewed-by: Baolin Wang --- MAINTAINERS | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/MAINTAINERS b/MAINTAINERS index 55114e73de26..ccddb59c253c 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -733,6 +733,14 @@ S: Maintained F: Documentation/i2c/busses/i2c-ali1563.rst F: drivers/i2c/busses/i2c-ali1563.c =20 +ALIBABA PMU DRIVER +M: Shuai Xue +M: Hongbo Yao +M: Neng Chen +S: Supported +F: Documentation/admin-guide/perf/alibaba_pmu.rst +F: drivers/perf/alibaba_uncore_dwr_pmu.c + ALIENWARE WMI DRIVER L: Dell.Client.Kernel@dell.com S: Maintained --=20 2.20.1.9.gb50a0d7 From nobody Mon Apr 27 03:25:06 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2CCF8C433EF for ; Wed, 20 Jul 2022 06:58:48 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237135AbiGTG6q (ORCPT ); Wed, 20 Jul 2022 02:58:46 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:52200 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236951AbiGTG6m (ORCPT ); Wed, 20 Jul 2022 02:58:42 -0400 Received: from out30-57.freemail.mail.aliyun.com (out30-57.freemail.mail.aliyun.com [115.124.30.57]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D95874B0C5 for ; Tue, 19 Jul 2022 23:58:41 -0700 (PDT) X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R531e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=ay29a033018046049;MF=xueshuai@linux.alibaba.com;NM=1;PH=DS;RN=11;SR=0;TI=SMTPD_---0VJvhhVw_1658300318; Received: from localhost.localdomain(mailfrom:xueshuai@linux.alibaba.com fp:SMTPD_---0VJvhhVw_1658300318) by smtp.aliyun-inc.com; Wed, 20 Jul 2022 14:58:39 +0800 From: Shuai Xue To: Jonathan.Cameron@Huawei.com, rdunlap@infradead.org, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, will@kernel.org, mark.rutland@arm.com Cc: baolin.wang@linux.alibaba.com, yaohongbo@linux.alibaba.com, nengchen@linux.alibaba.com, zhuo.song@linux.alibaba.com, xueshuai@linux.alibaba.com Subject: [PATCH v3 3/3] MAINTAINERS: add maintainers for Alibaba' T-Head PMU driver Date: Wed, 20 Jul 2022 14:58:15 +0800 Message-Id: <20220720065815.60772-4-xueshuai@linux.alibaba.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20220617111825.92911-1-xueshuai@linux.alibaba.com> References: <20220617111825.92911-1-xueshuai@linux.alibaba.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" Add maintainers for Alibaba PMU document and driver Signed-off-by: Shuai Xue Reviewed-by: Baolin Wang --- MAINTAINERS | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/MAINTAINERS b/MAINTAINERS index 55114e73de26..8c05de35828d 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -733,6 +733,13 @@ S: Maintained F: Documentation/i2c/busses/i2c-ali1563.rst F: drivers/i2c/busses/i2c-ali1563.c =20 +ALIBABA PMU DRIVER +M: Shuai Xue +M: Hongbo Yao +S: Supported +F: Documentation/admin-guide/perf/alibaba_pmu.rst +F: drivers/perf/alibaba_uncore_dwr_pmu.c + ALIENWARE WMI DRIVER L: Dell.Client.Kernel@dell.com S: Maintained --=20 2.20.1.9.gb50a0d7