From: Besar Wicaksono
Subject: [PATCH 1/8] perf/arm_cspmu: nvidia: Rename doc to Tegra241
Date: Mon, 26 Jan 2026 18:11:48 +0000
Message-ID: <20260126181155.2776097-2-bwicaksono@nvidia.com>
In-Reply-To: <20260126181155.2776097-1-bwicaksono@nvidia.com>
References: <20260126181155.2776097-1-bwicaksono@nvidia.com>

The documentation in nvidia-pmu.rst describes PMUs specific to the NVIDIA
Tegra241 SoC. Rename the file after this specific SoC to better distinguish
it from the documentation for other NVIDIA SoCs.
Signed-off-by: Besar Wicaksono
---
 Documentation/admin-guide/perf/index.rst                  | 2 +-
 .../perf/{nvidia-pmu.rst => nvidia-tegra241-pmu.rst}      | 8 ++++----
 2 files changed, 5 insertions(+), 5 deletions(-)
 rename Documentation/admin-guide/perf/{nvidia-pmu.rst => nvidia-tegra241-pmu.rst} (98%)

diff --git a/Documentation/admin-guide/perf/index.rst b/Documentation/admin-guide/perf/index.rst
index 47d9a3df6329..c407bb44b08e 100644
--- a/Documentation/admin-guide/perf/index.rst
+++ b/Documentation/admin-guide/perf/index.rst
@@ -24,7 +24,7 @@ Performance monitor support
    thunderx2-pmu
    alibaba_pmu
    dwc_pcie_pmu
-   nvidia-pmu
+   nvidia-tegra241-pmu
    meson-ddr-pmu
    cxl
    ampere_cspmu
diff --git a/Documentation/admin-guide/perf/nvidia-pmu.rst b/Documentation/admin-guide/perf/nvidia-tegra241-pmu.rst
similarity index 98%
rename from Documentation/admin-guide/perf/nvidia-pmu.rst
rename to Documentation/admin-guide/perf/nvidia-tegra241-pmu.rst
index f538ef67e0e8..fad5bc4cee6c 100644
--- a/Documentation/admin-guide/perf/nvidia-pmu.rst
+++ b/Documentation/admin-guide/perf/nvidia-tegra241-pmu.rst
@@ -1,8 +1,8 @@
-=========================================================
-NVIDIA Tegra SoC Uncore Performance Monitoring Unit (PMU)
-=========================================================
+============================================================
+NVIDIA Tegra241 SoC Uncore Performance Monitoring Unit (PMU)
+============================================================
 
-The NVIDIA Tegra SoC includes various system PMUs to measure key performance
+The NVIDIA Tegra241 SoC includes various system PMUs to measure key performance
 metrics like memory bandwidth, latency, and utilization:
 
 * Scalable Coherency Fabric (SCF)
-- 
2.43.0
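
(Editorial aside, not part of the patch: a rename like this is easy to sanity-check
with a targeted Sphinx build of the admin-guide book, assuming the usual kernel
documentation toolchain is installed.)

    # Suggested check only; SPHINXDIRS limits the build to the admin-guide book,
    # so a stale toctree entry for the renamed page would show up as a warning.
    make SPHINXDIRS="admin-guide" htmldocs
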
From: Besar Wicaksono
Subject: [PATCH 2/8] perf/arm_cspmu: nvidia: Add Tegra410 UCF PMU
Date: Mon, 26 Jan 2026 18:11:49 +0000
Message-ID: <20260126181155.2776097-3-bwicaksono@nvidia.com>
In-Reply-To: <20260126181155.2776097-1-bwicaksono@nvidia.com>
References: <20260126181155.2776097-1-bwicaksono@nvidia.com>

Add Unified Coherence Fabric (UCF) PMU support for the Tegra410 SoC.

Signed-off-by: Besar Wicaksono
---
 Documentation/admin-guide/perf/index.rst     |   1 +
 .../admin-guide/perf/nvidia-tegra410-pmu.rst | 106 ++++++++++++++++++
 drivers/perf/arm_cspmu/nvidia_cspmu.c        |  90 ++++++++++++++-
 3 files changed, 196 insertions(+), 1 deletion(-)
 create mode 100644 Documentation/admin-guide/perf/nvidia-tegra410-pmu.rst

diff --git a/Documentation/admin-guide/perf/index.rst b/Documentation/admin-guide/perf/index.rst
index c407bb44b08e..aa12708ddb96 100644
--- a/Documentation/admin-guide/perf/index.rst
+++ b/Documentation/admin-guide/perf/index.rst
@@ -25,6 +25,7 @@ Performance monitor support
    alibaba_pmu
    dwc_pcie_pmu
    nvidia-tegra241-pmu
+   nvidia-tegra410-pmu
    meson-ddr-pmu
    cxl
    ampere_cspmu
diff --git a/Documentation/admin-guide/perf/nvidia-tegra410-pmu.rst b/Documentation/admin-guide/perf/nvidia-tegra410-pmu.rst
new file mode 100644
index 000000000000..7b7ba5700ca1
--- /dev/null
+++ b/Documentation/admin-guide/perf/nvidia-tegra410-pmu.rst
@@ -0,0 +1,106 @@
+============================================================
+NVIDIA Tegra410 SoC Uncore Performance Monitoring Unit (PMU)
+============================================================
+
+The NVIDIA Tegra410 SoC includes various system PMUs to measure key performance
+metrics like memory bandwidth, latency, and utilization:
+
+* Unified Coherence Fabric (UCF)
+
+PMU Driver
+----------
+
+The PMU driver describes the available events and configuration of each PMU in
+sysfs. Please see the sections below to get the sysfs path of each PMU. Like
+other uncore PMU drivers, the driver provides a "cpumask" sysfs attribute to show
+the CPU id used to handle the PMU event. There is also an "associated_cpus"
+sysfs attribute, which contains a list of CPUs associated with the PMU instance.
+
+UCF PMU
+-------
+
+The Unified Coherence Fabric (UCF) in the NVIDIA Tegra410 SoC serves as a
+distributed last-level cache for CPU memory and CXL memory, and as a cache
+coherent interconnect that supports hardware coherence across multiple
+coherently caching agents, including:
+
+  * CPU clusters
+  * GPU
+  * PCIe Ordering Controller Unit (OCU)
+  * Other IO-coherent requesters
+
+The events and configuration options of this PMU device are described in sysfs,
+see /sys/bus/event_source/devices/nvidia_ucf_pmu_<socket-id>.
+
+Some of the events available in this PMU can be used to measure bandwidth and
+utilization:
+
+  * slc_access_rd: count the number of read requests to SLC.
+  * slc_access_wr: count the number of write requests to SLC.
+  * slc_bytes_rd: count the number of bytes transferred by slc_access_rd.
+  * slc_bytes_wr: count the number of bytes transferred by slc_access_wr.
+  * mem_access_rd: count the number of read requests to local or remote memory.
+  * mem_access_wr: count the number of write requests to local or remote memory.
+  * mem_bytes_rd: count the number of bytes transferred by mem_access_rd.
+  * mem_bytes_wr: count the number of bytes transferred by mem_access_wr.
+  * cycles: counts the UCF cycles.
+
+The average bandwidth is calculated as::
+
+  AVG_SLC_READ_BANDWIDTH_IN_GBPS  = SLC_BYTES_RD / ELAPSED_TIME_IN_NS
+  AVG_SLC_WRITE_BANDWIDTH_IN_GBPS = SLC_BYTES_WR / ELAPSED_TIME_IN_NS
+  AVG_MEM_READ_BANDWIDTH_IN_GBPS  = MEM_BYTES_RD / ELAPSED_TIME_IN_NS
+  AVG_MEM_WRITE_BANDWIDTH_IN_GBPS = MEM_BYTES_WR / ELAPSED_TIME_IN_NS
+
+The average request rate is calculated as::
+
+  AVG_SLC_READ_REQUEST_RATE  = SLC_ACCESS_RD / CYCLES
+  AVG_SLC_WRITE_REQUEST_RATE = SLC_ACCESS_WR / CYCLES
+  AVG_MEM_READ_REQUEST_RATE  = MEM_ACCESS_RD / CYCLES
+  AVG_MEM_WRITE_REQUEST_RATE = MEM_ACCESS_WR / CYCLES
+
+More details about what other events are available can be found in the Tegra410
+SoC technical reference manual.
+
+The events can be filtered based on source or destination. The source filter
+indicates the traffic initiator to the SLC, e.g. local CPU, non-CPU device, or
+remote socket. The destination filter specifies the destination memory type,
+e.g. local system memory (CMEM), local GPU memory (GMEM), or remote memory. The
+local/remote classification of the destination filter is based on the home
+socket of the address, not where the data actually resides. The available
+filters are described in
+/sys/bus/event_source/devices/nvidia_ucf_pmu_<socket-id>/format/.
+
+The list of UCF PMU event filters:
+
+* Source filter:
+
+  * src_loc_cpu: if set, count events from local CPU
+  * src_loc_noncpu: if set, count events from local non-CPU device
+  * src_rem: if set, count events from CPU, GPU, PCIE devices of remote socket
+
+* Destination filter:
+
+  * dst_loc_cmem: if set, count events to local system memory (CMEM) address
+  * dst_loc_gmem: if set, count events to local GPU memory (GMEM) address
+  * dst_loc_other: if set, count events to local CXL memory address
+  * dst_rem: if set, count events to CPU, GPU, and CXL memory address of remote socket
+
+If the source is not specified, the PMU will count events from all sources. If
+the destination is not specified, the PMU will count events to all destinations.
+
+Example usage:
+
+* Count event id 0x0 in socket 0 from all sources and to all destinations::
+
+  perf stat -a -e nvidia_ucf_pmu_0/event=0x0/
+
+* Count event id 0x0 in socket 0 with source filter = local CPU and destination
+  filter = local system memory (CMEM)::
+
+  perf stat -a -e nvidia_ucf_pmu_0/event=0x0,src_loc_cpu=0x1,dst_loc_cmem=0x1/
+
+* Count event id 0x0 in socket 1 with source filter = local non-CPU device and
+  destination filter = remote memory::
+
+  perf stat -a -e nvidia_ucf_pmu_1/event=0x0,src_loc_noncpu=0x1,dst_rem=0x1/
diff --git a/drivers/perf/arm_cspmu/nvidia_cspmu.c b/drivers/perf/arm_cspmu/nvidia_cspmu.c
index e06a06d3407b..c67667097a3c 100644
--- a/drivers/perf/arm_cspmu/nvidia_cspmu.c
+++ b/drivers/perf/arm_cspmu/nvidia_cspmu.c
@@ -1,6 +1,6 @@
 // SPDX-License-Identifier: GPL-2.0
 /*
- * Copyright (c) 2022-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+ * Copyright (c) 2022-2026, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
  *
  */
 
@@ -21,6 +21,13 @@
 #define NV_CNVL_PORT_COUNT      4ULL
 #define NV_CNVL_FILTER_ID_MASK  GENMASK_ULL(NV_CNVL_PORT_COUNT - 1, 0)
 
+#define NV_UCF_SRC_COUNT        3ULL
+#define NV_UCF_DST_COUNT        4ULL
+#define NV_UCF_FILTER_ID_MASK   GENMASK_ULL(11, 0)
+#define NV_UCF_FILTER_SRC       GENMASK_ULL(2, 0)
+#define NV_UCF_FILTER_DST       GENMASK_ULL(11, 8)
+#define NV_UCF_FILTER_DEFAULT   (NV_UCF_FILTER_SRC | NV_UCF_FILTER_DST)
+
 #define NV_GENERIC_FILTER_ID_MASK GENMASK_ULL(31, 0)
 
 #define NV_PRODID_MASK (PMIIDR_PRODUCTID | PMIIDR_VARIANT | PMIIDR_REVISION)
@@ -124,6 +131,37 @@ static struct attribute *mcf_pmu_event_attrs[] = {
         NULL,
 };
 
+static struct attribute *ucf_pmu_event_attrs[] = {
+        ARM_CSPMU_EVENT_ATTR(bus_cycles, 0x1D),
+
+        ARM_CSPMU_EVENT_ATTR(slc_allocate, 0xF0),
+        ARM_CSPMU_EVENT_ATTR(slc_wb, 0xF3),
+        ARM_CSPMU_EVENT_ATTR(slc_refill_rd, 0x109),
+        ARM_CSPMU_EVENT_ATTR(slc_refill_wr, 0x10A),
+        ARM_CSPMU_EVENT_ATTR(slc_hit_rd, 0x119),
+
+        ARM_CSPMU_EVENT_ATTR(slc_access_dataless, 0x183),
+        ARM_CSPMU_EVENT_ATTR(slc_access_atomic, 0x184),
+
+        ARM_CSPMU_EVENT_ATTR(slc_access, 0xF2),
+        ARM_CSPMU_EVENT_ATTR(slc_access_rd, 0x111),
+        ARM_CSPMU_EVENT_ATTR(slc_access_wr, 0x112),
+        ARM_CSPMU_EVENT_ATTR(slc_bytes_rd, 0x113),
+        ARM_CSPMU_EVENT_ATTR(slc_bytes_wr, 0x114),
+
+        ARM_CSPMU_EVENT_ATTR(mem_access_rd, 0x121),
+        ARM_CSPMU_EVENT_ATTR(mem_access_wr, 0x122),
+        ARM_CSPMU_EVENT_ATTR(mem_bytes_rd, 0x123),
+        ARM_CSPMU_EVENT_ATTR(mem_bytes_wr, 0x124),
+
+        ARM_CSPMU_EVENT_ATTR(local_snoop, 0x180),
+        ARM_CSPMU_EVENT_ATTR(ext_snp_access, 0x181),
+        ARM_CSPMU_EVENT_ATTR(ext_snp_evict, 0x182),
+
+        ARM_CSPMU_EVENT_ATTR(cycles, ARM_CSPMU_EVT_CYCLES_DEFAULT),
+        NULL,
+};
+
 static struct attribute *generic_pmu_event_attrs[] = {
         ARM_CSPMU_EVENT_ATTR(cycles, ARM_CSPMU_EVT_CYCLES_DEFAULT),
         NULL,
@@ -152,6 +190,18 @@ static struct attribute *cnvlink_pmu_format_attrs[] = {
         NULL,
 };
 
+static struct attribute *ucf_pmu_format_attrs[] = {
+        ARM_CSPMU_FORMAT_EVENT_ATTR,
+        ARM_CSPMU_FORMAT_ATTR(src_loc_noncpu, "config1:0"),
+        ARM_CSPMU_FORMAT_ATTR(src_loc_cpu, "config1:1"),
+        ARM_CSPMU_FORMAT_ATTR(src_rem, "config1:2"),
+        ARM_CSPMU_FORMAT_ATTR(dst_loc_cmem, "config1:8"),
+        ARM_CSPMU_FORMAT_ATTR(dst_loc_gmem, "config1:9"),
+        ARM_CSPMU_FORMAT_ATTR(dst_loc_other, "config1:10"),
+        ARM_CSPMU_FORMAT_ATTR(dst_rem, "config1:11"),
+        NULL,
+};
+
 static struct attribute *generic_pmu_format_attrs[] = {
         ARM_CSPMU_FORMAT_EVENT_ATTR,
         ARM_CSPMU_FORMAT_FILTER_ATTR,
@@ -236,6 +286,27 @@ static void nv_cspmu_set_cc_filter(struct arm_cspmu *cspmu,
         writel(filter, cspmu->base0 + PMCCFILTR);
 }
 
+static u32 ucf_pmu_event_filter(const struct perf_event *event)
+{
+        u32 ret, filter, src, dst;
+
+        filter = nv_cspmu_event_filter(event);
+
+        /* Monitor all sources if none is selected. */
+        src = FIELD_GET(NV_UCF_FILTER_SRC, filter);
+        if (src == 0)
+                src = GENMASK_ULL(NV_UCF_SRC_COUNT - 1, 0);
+
+        /* Monitor all destinations if none is selected. */
+        dst = FIELD_GET(NV_UCF_FILTER_DST, filter);
+        if (dst == 0)
+                dst = GENMASK_ULL(NV_UCF_DST_COUNT - 1, 0);
+
+        ret = FIELD_PREP(NV_UCF_FILTER_SRC, src);
+        ret |= FIELD_PREP(NV_UCF_FILTER_DST, dst);
+
+        return ret;
+}
 
 enum nv_cspmu_name_fmt {
         NAME_FMT_GENERIC,
@@ -342,6 +413,23 @@ static const struct nv_cspmu_match nv_cspmu_match[] = {
                         .init_data = NULL
                         },
         },
+        {
+                .prodid = 0x2CF20000,
+                .prodid_mask = NV_PRODID_MASK,
+                .name_pattern = "nvidia_ucf_pmu_%u",
+                .name_fmt = NAME_FMT_SOCKET,
+                .template_ctx = {
+                        .event_attr = ucf_pmu_event_attrs,
+                        .format_attr = ucf_pmu_format_attrs,
+                        .filter_mask = NV_UCF_FILTER_ID_MASK,
+                        .filter_default_val = NV_UCF_FILTER_DEFAULT,
+                        .filter2_mask = 0x0,
+                        .filter2_default_val = 0x0,
+                        .get_filter = ucf_pmu_event_filter,
+                        .get_filter2 = NULL,
+                        .init_data = NULL
+                        },
+        },
         {
                 .prodid = 0,
                 .prodid_mask = 0,
-- 
2.43.0
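
(Editorial aside, not part of the series: the UCF documentation above defines
bandwidth as bytes divided by elapsed time in nanoseconds. The sketch below
shows one way to apply that formula with perf; it assumes the socket-0 instance
name nvidia_ucf_pmu_0 and the slc_bytes_rd event from the documentation.)

    #!/bin/bash
    # Illustrative sketch: average SLC read bandwidth on socket 0 over a 10 s
    # window, per AVG_SLC_READ_BANDWIDTH_IN_GBPS = SLC_BYTES_RD / ELAPSED_TIME_IN_NS.
    dur=10
    bytes=$(perf stat -a -x, -e nvidia_ucf_pmu_0/slc_bytes_rd/ -- sleep "$dur" 2>&1 |
            awk -F, '/slc_bytes_rd/ { print $1 }')
    awk -v b="$bytes" -v d="$dur" \
        'BEGIN { printf "avg SLC read bandwidth: %.3f GB/s\n", b / (d * 1e9) }'
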
From: Besar Wicaksono
Subject: [PATCH 3/8] perf/arm_cspmu: Add arm_cspmu_acpi_dev_get
Date: Mon, 26 Jan 2026 18:11:50 +0000
Message-ID: <20260126181155.2776097-4-bwicaksono@nvidia.com>
In-Reply-To: <20260126181155.2776097-1-bwicaksono@nvidia.com>
References: <20260126181155.2776097-1-bwicaksono@nvidia.com>

Add an interface to get the ACPI device associated with the PMU. This ACPI
device may contain additional properties beyond the standard ones.

Signed-off-by: Besar Wicaksono
---
 drivers/perf/arm_cspmu/arm_cspmu.c | 24 +++++++++++++++++++++++-
 drivers/perf/arm_cspmu/arm_cspmu.h | 17 ++++++++++++++++-
 2 files changed, 39 insertions(+), 2 deletions(-)

diff --git a/drivers/perf/arm_cspmu/arm_cspmu.c b/drivers/perf/arm_cspmu/arm_cspmu.c
index 34430b68f602..dadc9b765d80 100644
--- a/drivers/perf/arm_cspmu/arm_cspmu.c
+++ b/drivers/perf/arm_cspmu/arm_cspmu.c
@@ -16,7 +16,7 @@
  * The user should refer to the vendor technical documentation to get details
  * about the supported events.
  *
- * Copyright (c) 2022-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+ * Copyright (c) 2022-2026, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
  *
  */
 
@@ -1132,6 +1132,28 @@ static int arm_cspmu_acpi_get_cpus(struct arm_cspmu *cspmu)
 
         return 0;
 }
+
+struct acpi_device *arm_cspmu_acpi_dev_get(const struct arm_cspmu *cspmu)
+{
+        char hid[16];
+        char uid[16];
+        struct acpi_device *adev;
+        const struct acpi_apmt_node *apmt_node;
+
+        apmt_node = arm_cspmu_apmt_node(cspmu->dev);
+        if (!apmt_node || apmt_node->type != ACPI_APMT_NODE_TYPE_ACPI)
+                return NULL;
+
+        memset(hid, 0, sizeof(hid));
+        memset(uid, 0, sizeof(uid));
+
+        memcpy(hid, &apmt_node->inst_primary, sizeof(apmt_node->inst_primary));
+        snprintf(uid, sizeof(uid), "%u", apmt_node->inst_secondary);
+
+        adev = acpi_dev_get_first_match_dev(hid, uid, -1);
+        return adev;
+}
+EXPORT_SYMBOL_GPL(arm_cspmu_acpi_dev_get);
 #else
 static int arm_cspmu_acpi_get_cpus(struct arm_cspmu *cspmu)
 {
diff --git a/drivers/perf/arm_cspmu/arm_cspmu.h b/drivers/perf/arm_cspmu/arm_cspmu.h
index cd65a58dbd88..320096673200 100644
--- a/drivers/perf/arm_cspmu/arm_cspmu.h
+++ b/drivers/perf/arm_cspmu/arm_cspmu.h
@@ -1,13 +1,14 @@
 /* SPDX-License-Identifier: GPL-2.0
  *
  * ARM CoreSight Architecture PMU driver.
- * Copyright (c) 2022-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+ * Copyright (c) 2022-2026, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
  *
  */
 
 #ifndef __ARM_CSPMU_H__
 #define __ARM_CSPMU_H__
 
+#include <linux/acpi.h>
 #include
 #include
 #include
@@ -255,4 +256,18 @@ int arm_cspmu_impl_register(const struct arm_cspmu_impl_match *impl_match);
 /* Unregister vendor backend. */
 void arm_cspmu_impl_unregister(const struct arm_cspmu_impl_match *impl_match);
 
+#if defined(CONFIG_ACPI)
+/**
+ * Get ACPI device associated with the PMU.
+ * The caller is responsible for calling acpi_dev_put() on the returned device.
+ */
+struct acpi_device *arm_cspmu_acpi_dev_get(const struct arm_cspmu *cspmu);
+#else
+static inline struct acpi_device *
+arm_cspmu_acpi_dev_get(const struct arm_cspmu *cspmu)
+{
+        return NULL;
+}
+#endif
+
 #endif /* __ARM_CSPMU_H__ */
-- 
2.43.0
From: Besar Wicaksono
Subject: [PATCH 4/8] perf/arm_cspmu: nvidia: Add Tegra410 PCIE PMU
Date: Mon, 26 Jan 2026 18:11:51 +0000
Message-ID: <20260126181155.2776097-5-bwicaksono@nvidia.com>
In-Reply-To: <20260126181155.2776097-1-bwicaksono@nvidia.com>
References: <20260126181155.2776097-1-bwicaksono@nvidia.com>

Add PCIE PMU support for the Tegra410 SoC.

Signed-off-by: Besar Wicaksono
---
 .../admin-guide/perf/nvidia-tegra410-pmu.rst | 162 ++++++++++++++
 drivers/perf/arm_cspmu/nvidia_cspmu.c        | 208 +++++++++++++++++-
 2 files changed, 368 insertions(+), 2 deletions(-)

diff --git a/Documentation/admin-guide/perf/nvidia-tegra410-pmu.rst b/Documentation/admin-guide/perf/nvidia-tegra410-pmu.rst
index 7b7ba5700ca1..8528685ddb61 100644
--- a/Documentation/admin-guide/perf/nvidia-tegra410-pmu.rst
+++ b/Documentation/admin-guide/perf/nvidia-tegra410-pmu.rst
@@ -6,6 +6,7 @@ The NVIDIA Tegra410 SoC includes various system PMUs to measure key performance
 metrics like memory bandwidth, latency, and utilization:
 
 * Unified Coherence Fabric (UCF)
+* PCIE
 
 PMU Driver
 ----------
@@ -104,3 +105,164 @@ Example usage:
   destination filter = remote memory::
 
   perf stat -a -e nvidia_ucf_pmu_1/event=0x0,src_loc_noncpu=0x1,dst_rem=0x1/
+
+PCIE PMU
+--------
+
+This PMU monitors all read/write traffic from the root port(s) or a particular
+BDF in a PCIE root complex (RC) to local or remote memory. There is one PMU per
+PCIE RC in the SoC. Each RC can have up to 16 lanes that can be bifurcated into
+up to 8 root ports. The traffic from each root port can be filtered using the
+RP or BDF filter. For example, specifying "src_rp_mask=0xFF" means the PMU
+counter will capture traffic from all RPs. Please see below for more details.
+
+The events and configuration options of this PMU device are described in sysfs,
+see /sys/bus/event_source/devices/nvidia_pcie_pmu_<socket-id>_rc_<rc-id>.
+
+The events in this PMU can be used to measure bandwidth, utilization, and
+latency:
+
+  * rd_req: count the number of read requests by PCIE device.
+  * wr_req: count the number of write requests by PCIE device.
+  * rd_bytes: count the number of bytes transferred by rd_req.
+  * wr_bytes: count the number of bytes transferred by wr_req.
+  * rd_cum_outs: count outstanding rd_req each cycle.
+  * cycles: counts the PCIE cycles.
+
+The average bandwidth is calculated as::
+
+  AVG_RD_BANDWIDTH_IN_GBPS = RD_BYTES / ELAPSED_TIME_IN_NS
+  AVG_WR_BANDWIDTH_IN_GBPS = WR_BYTES / ELAPSED_TIME_IN_NS
+
+The average request rate is calculated as::
+
+  AVG_RD_REQUEST_RATE = RD_REQ / CYCLES
+  AVG_WR_REQUEST_RATE = WR_REQ / CYCLES
+
+The average latency is calculated as::
+
+  FREQ_IN_GHZ           = CYCLES / ELAPSED_TIME_IN_NS
+  AVG_LATENCY_IN_CYCLES = RD_CUM_OUTS / RD_REQ
+  AVERAGE_LATENCY_IN_NS = AVG_LATENCY_IN_CYCLES / FREQ_IN_GHZ
+
+The PMU events can be filtered based on the traffic source and destination.
+The source filter indicates the PCIE devices that will be monitored. The
+destination filter specifies the destination memory type, e.g. local system
+memory (CMEM), local GPU memory (GMEM), or remote memory. The local/remote
+classification of the destination filter is based on the home socket of the
+address, not where the data actually resides. These filters can be found in
+/sys/bus/event_source/devices/nvidia_pcie_pmu_<socket-id>_rc_<rc-id>/format/.
+
+The list of event filters:
+
+* Source filter:
+
+  * src_rp_mask: bitmask of root ports that will be monitored. Each bit in this
+    bitmask represents the RP index in the RC. If the bit is set, all devices
+    under the associated RP will be monitored. E.g. "src_rp_mask=0xF" will
+    monitor devices in root ports 0 to 3.
+  * src_bdf: the BDF that will be monitored. This is a 16-bit value that
+    follows the formula: (bus << 8) + (device << 3) + (function). For example,
+    the value of BDF 27:01.1 is 0x2781.
+  * src_bdf_en: enable the BDF filter.
+
+The PMU events can be filtered based on the traffic source and destination.
+The source filter indicates the PCIE devices that will be monitored. The
+destination filter specifies the destination memory type, e.g. local system
+memory (CMEM), local GPU memory (GMEM), or remote memory. The local/remote
+classification of the destination filter is based on the home socket of the
+address, not where the data actually resides. These filters can be found in
+/sys/bus/event_source/devices/nvidia_pcie_pmu_<socket-id>_rc_<rc-id>/format/.
+
+The list of event filters:
+
+* Source filter:
+
+  * src_rp_mask: bitmask of root ports that will be monitored. Each bit in this
+    bitmask represents the RP index in the RC. If the bit is set, all devices
+    under the associated RP will be monitored. E.g. "src_rp_mask=0xF" will
+    monitor devices under root ports 0 to 3.
+  * src_bdf: the BDF that will be monitored. This is a 16-bit value that follows
+    the formula (bus << 8) + (device << 3) + (function). For example, the value
+    of BDF 27:01.1 is 0x2709.
+  * src_bdf_en: enable the BDF filter. If this is set, the BDF filter value in
+    "src_bdf" is used to filter the traffic.
+
+  Note that root-port and BDF filters are mutually exclusive and the PMU in each
+  RC supports only one BDF filter value, shared by all of its counters. If the
+  BDF filter is enabled, its value is applied to all events.
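The src_bdf encoding can be reproduced with a few lines of shell; the helper
below only illustrates the (bus << 8) + (device << 3) + (function) packing and
is not part of the driver or of the patch::

   #!/bin/bash
   # Usage: bdf_to_src_bdf.sh <bus>:<device>.<function>, e.g. 27:01.1
   bdf="$1"
   bus=$((16#${bdf%%:*}))
   devfn="${bdf#*:}"
   dev=$((16#${devfn%%.*}))
   fn=$((16#${devfn#*.}))
   printf 'src_bdf=0x%04x\n' $(( (bus << 8) + (dev << 3) + fn ))

For BDF 27:01.1 this prints src_bdf=0x2709, which can then be combined with
src_bdf_en=0x1 in the perf event string.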
+
+* Destination filter:
+
+  * dst_loc_cmem: if set, count events to a local system memory (CMEM) address
+  * dst_loc_gmem: if set, count events to a local GPU memory (GMEM) address
+  * dst_loc_pcie_p2p: if set, count events to a local PCIE peer address
+  * dst_loc_pcie_cxl: if set, count events to a local CXL memory address
+  * dst_rem: if set, count events to a remote memory address
+
+If the source filter is not specified, the PMU will count events from all root
+ports. If the destination filter is not specified, the PMU will count events
+to all destinations.
+
+Example usage:
+
+* Count event id 0x0 from root port 0 of PCIE RC-0 on socket 0 targeting all
+  destinations::
+
+   perf stat -a -e nvidia_pcie_pmu_0_rc_0/event=0x0,src_rp_mask=0x1/
+
+* Count event id 0x1 from root ports 0 and 1 of PCIE RC-1 on socket 0,
+  targeting just local CMEM of socket 0::
+
+   perf stat -a -e nvidia_pcie_pmu_0_rc_1/event=0x1,src_rp_mask=0x3,dst_loc_cmem=0x1/
+
+* Count event id 0x2 from root port 0 of PCIE RC-2 on socket 1 targeting all
+  destinations::
+
+   perf stat -a -e nvidia_pcie_pmu_1_rc_2/event=0x2,src_rp_mask=0x1/
+
+* Count event id 0x3 from root ports 0 and 1 of PCIE RC-3 on socket 1,
+  targeting just local CMEM of socket 1::
+
+   perf stat -a -e nvidia_pcie_pmu_1_rc_3/event=0x3,src_rp_mask=0x3,dst_loc_cmem=0x1/
+
+* Count event id 0x4 from BDF 01:01.0 of PCIE RC-4 on socket 0 targeting all
+  destinations::
+
+   perf stat -a -e nvidia_pcie_pmu_0_rc_4/event=0x4,src_bdf=0x0108,src_bdf_en=0x1/
+
+Mapping the RC# to the lspci segment number can be non-trivial; hence a new
+NVIDIA Designated Vendor-Specific Extended Capability (DVSEC) register is added
+into the PCIE config space for each RP. This DVSEC has vendor id "10de" and a
+DVSEC id of "0x4". The DVSEC register contains the following information to map
+PCIE devices under the RP back to its RC#:
+
+ - Bus# (byte 0xc): bus number as reported by the lspci output
+ - Segment# (byte 0xd): segment number as reported by the lspci output
+ - RP# (byte 0xe): port number as reported by the LnkCap attribute from lspci
+   for a device with Root Port capability
+ - RC# (byte 0xf): root complex number associated with the RP
+ - Socket# (byte 0x10): socket number associated with the RP
+
+Example script for mapping an lspci BDF to its RC# and socket#::
+
+   #!/bin/bash
+   while read bdf rest; do
+     dvsec4_reg=$(lspci -vv -s $bdf | awk '
+       /Designated Vendor-Specific: Vendor=10de ID=0004/ {
+         match($0, /\[([0-9a-fA-F]+)/, arr);
+         print "0x" arr[1];
+         exit
+       }
+     ')
+     if [ -n "$dvsec4_reg" ]; then
+       bus=$(setpci -s $bdf $(printf '0x%x' $((${dvsec4_reg} + 0xc))).b)
+       segment=$(setpci -s $bdf $(printf '0x%x' $((${dvsec4_reg} + 0xd))).b)
+       rp=$(setpci -s $bdf $(printf '0x%x' $((${dvsec4_reg} + 0xe))).b)
+       rc=$(setpci -s $bdf $(printf '0x%x' $((${dvsec4_reg} + 0xf))).b)
+       socket=$(setpci -s $bdf $(printf '0x%x' $((${dvsec4_reg} + 0x10))).b)
+       echo "$bdf: Bus=$bus, Segment=$segment, RP=$rp, RC=$rc, Socket=$socket"
+     fi
+   done < <(lspci -d 10de:)
+
+Example output::
+
+   0001:00:00.0: Bus=00, Segment=01, RP=00, RC=00, Socket=00
+   0002:80:00.0: Bus=80, Segment=02, RP=01, RC=01, Socket=00
+   0002:a0:00.0: Bus=a0, Segment=02, RP=02, RC=01, Socket=00
+   0002:c0:00.0: Bus=c0, Segment=02, RP=03, RC=01, Socket=00
+   0002:e0:00.0: Bus=e0, Segment=02, RP=04, RC=01, Socket=00
+   0003:00:00.0: Bus=00, Segment=03, RP=00, RC=02, Socket=00
+   0004:00:00.0: Bus=00, Segment=04, RP=00, RC=03, Socket=00
+   0005:00:00.0: Bus=00, Segment=05, RP=00, RC=04, Socket=00
+   0005:40:00.0: Bus=40, Segment=05, RP=01, RC=04, Socket=00
+   0005:c0:00.0: Bus=c0, Segment=05, RP=02, RC=04, Socket=00
+   0006:00:00.0: Bus=00, Segment=06, RP=00, RC=05, Socket=00
+   0009:00:00.0: Bus=00, Segment=09, RP=00, RC=00, Socket=01
+   000a:80:00.0: Bus=80, Segment=0a, RP=01, RC=01, Socket=01
+   000a:a0:00.0: Bus=a0, Segment=0a, RP=02, RC=01, Socket=01
+   000a:e0:00.0: Bus=e0, Segment=0a, RP=03, RC=01, Socket=01
+   000b:00:00.0: Bus=00, Segment=0b, RP=00, RC=02, Socket=01
+   000c:00:00.0: Bus=00, Segment=0c, RP=00, RC=03, Socket=01
+   000d:00:00.0: Bus=00, Segment=0d, RP=00, RC=04, Socket=01
+   000d:40:00.0: Bus=40, Segment=0d, RP=01, RC=04, Socket=01
+   000d:c0:00.0: Bus=c0, Segment=0d, RP=02, RC=04, Socket=01
+   000e:00:00.0: Bus=00, Segment=0e, RP=00, RC=05, Socket=01
diff --git a/drivers/perf/arm_cspmu/nvidia_cspmu.c b/drivers/perf/arm_cspmu/nvidia_cspmu.c
index c67667097a3c..3a5531d1f94c 100644
--- a/drivers/perf/arm_cspmu/nvidia_cspmu.c
+++ b/drivers/perf/arm_cspmu/nvidia_cspmu.c
@@ -8,6 +8,7 @@
 
 #include
 #include
+#include
 #include
 
 #include "arm_cspmu.h"
@@ -28,6 +29,19 @@
 #define NV_UCF_FILTER_DST		GENMASK_ULL(11, 8)
 #define NV_UCF_FILTER_DEFAULT	(NV_UCF_FILTER_SRC | NV_UCF_FILTER_DST)
 
+#define NV_PCIE_V2_PORT_COUNT		8ULL
+#define NV_PCIE_V2_FILTER_ID_MASK	GENMASK_ULL(24, 0)
+#define NV_PCIE_V2_FILTER_PORT		GENMASK_ULL(NV_PCIE_V2_PORT_COUNT - 1, 0)
+#define NV_PCIE_V2_FILTER_BDF_VAL	GENMASK_ULL(23, NV_PCIE_V2_PORT_COUNT)
+#define NV_PCIE_V2_FILTER_BDF_EN	BIT(24)
+#define NV_PCIE_V2_FILTER_BDF_VAL_EN	GENMASK_ULL(24,
NV_PCIE_V2_PORT_COUNT) +#define NV_PCIE_V2_FILTER_DEFAULT NV_PCIE_V2_FILTER_PORT + +#define NV_PCIE_V2_DST_COUNT 5ULL +#define NV_PCIE_V2_FILTER2_ID_MASK GENMASK_ULL(4, 0) +#define NV_PCIE_V2_FILTER2_DST GENMASK_ULL(NV_PCIE_V2_DST_COUNT - 1,= 0) +#define NV_PCIE_V2_FILTER2_DEFAULT NV_PCIE_V2_FILTER2_DST + #define NV_GENERIC_FILTER_ID_MASK GENMASK_ULL(31, 0) =20 #define NV_PRODID_MASK (PMIIDR_PRODUCTID | PMIIDR_VARIANT | PMIIDR_REVISIO= N) @@ -162,6 +176,16 @@ static struct attribute *ucf_pmu_event_attrs[] =3D { NULL, }; =20 +static struct attribute *pcie_v2_pmu_event_attrs[] =3D { + ARM_CSPMU_EVENT_ATTR(rd_bytes, 0x0), + ARM_CSPMU_EVENT_ATTR(wr_bytes, 0x1), + ARM_CSPMU_EVENT_ATTR(rd_req, 0x2), + ARM_CSPMU_EVENT_ATTR(wr_req, 0x3), + ARM_CSPMU_EVENT_ATTR(rd_cum_outs, 0x4), + ARM_CSPMU_EVENT_ATTR(cycles, ARM_CSPMU_EVT_CYCLES_DEFAULT), + NULL, +}; + static struct attribute *generic_pmu_event_attrs[] =3D { ARM_CSPMU_EVENT_ATTR(cycles, ARM_CSPMU_EVT_CYCLES_DEFAULT), NULL, @@ -202,6 +226,19 @@ static struct attribute *ucf_pmu_format_attrs[] =3D { NULL, }; =20 +static struct attribute *pcie_v2_pmu_format_attrs[] =3D { + ARM_CSPMU_FORMAT_EVENT_ATTR, + ARM_CSPMU_FORMAT_ATTR(src_rp_mask, "config1:0-7"), + ARM_CSPMU_FORMAT_ATTR(src_bdf, "config1:8-23"), + ARM_CSPMU_FORMAT_ATTR(src_bdf_en, "config1:24"), + ARM_CSPMU_FORMAT_ATTR(dst_loc_cmem, "config2:0"), + ARM_CSPMU_FORMAT_ATTR(dst_loc_gmem, "config2:1"), + ARM_CSPMU_FORMAT_ATTR(dst_loc_pcie_p2p, "config2:2"), + ARM_CSPMU_FORMAT_ATTR(dst_loc_pcie_cxl, "config2:3"), + ARM_CSPMU_FORMAT_ATTR(dst_rem, "config2:4"), + NULL, +}; + static struct attribute *generic_pmu_format_attrs[] =3D { ARM_CSPMU_FORMAT_EVENT_ATTR, ARM_CSPMU_FORMAT_FILTER_ATTR, @@ -233,6 +270,32 @@ nv_cspmu_get_name(const struct arm_cspmu *cspmu) return ctx->name; } =20 +#if defined(CONFIG_ACPI) +static int nv_cspmu_get_inst_id(const struct arm_cspmu *cspmu, u32 *id) +{ + struct fwnode_handle *fwnode; + struct acpi_device *adev; + int ret; + + adev =3D arm_cspmu_acpi_dev_get(cspmu); + if (!adev) + return -ENODEV; + + fwnode =3D acpi_fwnode_handle(adev); + ret =3D fwnode_property_read_u32(fwnode, "instance_id", id); + if (ret) + dev_err(cspmu->dev, "Failed to get instance ID\n"); + + acpi_dev_put(adev); + return ret; +} +#else +static int nv_cspmu_get_inst_id(const struct arm_cspmu *cspmu, u32 *id) +{ + return -EINVAL; +} +#endif + static u32 nv_cspmu_event_filter(const struct perf_event *event) { const struct nv_cspmu_ctx *ctx =3D @@ -278,6 +341,20 @@ static void nv_cspmu_set_ev_filter(struct arm_cspmu *c= spmu, } } =20 +static void nv_cspmu_reset_ev_filter(struct arm_cspmu *cspmu, + const struct perf_event *event) +{ + const struct nv_cspmu_ctx *ctx =3D + to_nv_cspmu_ctx(to_arm_cspmu(event->pmu)); + const u32 offset =3D 4 * event->hw.idx; + + if (ctx->get_filter) + writel(0, cspmu->base0 + PMEVFILTR + offset); + + if (ctx->get_filter2) + writel(0, cspmu->base0 + PMEVFILT2R + offset); +} + static void nv_cspmu_set_cc_filter(struct arm_cspmu *cspmu, const struct perf_event *event) { @@ -308,9 +385,103 @@ static u32 ucf_pmu_event_filter(const struct perf_eve= nt *event) return ret; } =20 +static u32 pcie_v2_pmu_bdf_val_en(u32 filter) +{ + const u32 bdf_en =3D FIELD_GET(NV_PCIE_V2_FILTER_BDF_EN, filter); + + /* Returns both BDF value and enable bit if BDF filtering is enabled. */ + if (bdf_en) + return FIELD_GET(NV_PCIE_V2_FILTER_BDF_VAL_EN, filter); + + /* Ignore the BDF value if BDF filter is not enabled. 
*/ + return 0; +} + +static u32 pcie_v2_pmu_event_filter(const struct perf_event *event) +{ + u32 filter, lead_filter, lead_bdf; + struct perf_event *leader; + const struct nv_cspmu_ctx *ctx =3D + to_nv_cspmu_ctx(to_arm_cspmu(event->pmu)); + + filter =3D event->attr.config1 & ctx->filter_mask; + if (filter !=3D 0) + return filter; + + leader =3D event->group_leader; + + /* Use leader's filter value if its BDF filtering is enabled. */ + if (event !=3D leader) { + lead_filter =3D pcie_v2_pmu_event_filter(leader); + lead_bdf =3D pcie_v2_pmu_bdf_val_en(lead_filter); + if (lead_bdf !=3D 0) + return lead_filter; + } + + /* Otherwise, return default filter value. */ + return ctx->filter_default_val; +} + +static int pcie_v2_pmu_validate_event(struct arm_cspmu *cspmu, + struct perf_event *new_ev) +{ + /* + * Make sure the events are using same BDF filter since the PCIE-SRC PMU + * only supports one common BDF filter setting for all of the counters. + */ + + int idx; + u32 new_filter, new_rp, new_bdf, new_lead_filter, new_lead_bdf; + struct perf_event *leader, *new_leader; + + if (cspmu->impl.ops.is_cycle_counter_event(new_ev)) + return 0; + + new_leader =3D new_ev->group_leader; + + new_filter =3D pcie_v2_pmu_event_filter(new_ev); + new_lead_filter =3D pcie_v2_pmu_event_filter(new_leader); + + new_bdf =3D pcie_v2_pmu_bdf_val_en(new_filter); + new_lead_bdf =3D pcie_v2_pmu_bdf_val_en(new_lead_filter); + + new_rp =3D FIELD_GET(NV_PCIE_V2_FILTER_PORT, new_filter); + + if (new_rp !=3D 0 && new_bdf !=3D 0) { + dev_err(cspmu->dev, + "RP and BDF filtering are mutually exclusive\n"); + return -EINVAL; + } + + if (new_bdf !=3D new_lead_bdf) { + dev_err(cspmu->dev, + "sibling and leader BDF value should be equal\n"); + return -EINVAL; + } + + /* Compare BDF filter on existing events. 
*/ + idx =3D find_first_bit(cspmu->hw_events.used_ctrs, + cspmu->cycle_counter_logical_idx); + + if (idx !=3D cspmu->cycle_counter_logical_idx) { + leader =3D cspmu->hw_events.events[idx]->group_leader; + + const u32 lead_filter =3D pcie_v2_pmu_event_filter(leader); + const u32 lead_bdf =3D pcie_v2_pmu_bdf_val_en(lead_filter); + + if (new_lead_bdf !=3D lead_bdf) { + dev_err(cspmu->dev, "only one BDF value is supported\n"); + return -EINVAL; + } + } + + return 0; +} + enum nv_cspmu_name_fmt { NAME_FMT_GENERIC, - NAME_FMT_SOCKET + NAME_FMT_SOCKET, + NAME_FMT_SOCKET_INST }; =20 struct nv_cspmu_match { @@ -430,6 +601,27 @@ static const struct nv_cspmu_match nv_cspmu_match[] = =3D { .init_data =3D NULL }, }, + { + .prodid =3D 0x10301000, + .prodid_mask =3D NV_PRODID_MASK, + .name_pattern =3D "nvidia_pcie_pmu_%u_rc_%u", + .name_fmt =3D NAME_FMT_SOCKET_INST, + .template_ctx =3D { + .event_attr =3D pcie_v2_pmu_event_attrs, + .format_attr =3D pcie_v2_pmu_format_attrs, + .filter_mask =3D NV_PCIE_V2_FILTER_ID_MASK, + .filter_default_val =3D NV_PCIE_V2_FILTER_DEFAULT, + .filter2_mask =3D NV_PCIE_V2_FILTER2_ID_MASK, + .filter2_default_val =3D NV_PCIE_V2_FILTER2_DEFAULT, + .get_filter =3D pcie_v2_pmu_event_filter, + .get_filter2 =3D nv_cspmu_event_filter2, + .init_data =3D NULL + }, + .ops =3D { + .validate_event =3D pcie_v2_pmu_validate_event, + .reset_ev_filter =3D nv_cspmu_reset_ev_filter, + } + }, { .prodid =3D 0, .prodid_mask =3D 0, @@ -453,7 +645,7 @@ static const struct nv_cspmu_match nv_cspmu_match[] =3D= { static char *nv_cspmu_format_name(const struct arm_cspmu *cspmu, const struct nv_cspmu_match *match) { - char *name; + char *name =3D NULL; struct device *dev =3D cspmu->dev; =20 static atomic_t pmu_generic_idx =3D {0}; @@ -467,6 +659,16 @@ static char *nv_cspmu_format_name(const struct arm_csp= mu *cspmu, socket); break; } + case NAME_FMT_SOCKET_INST: { + const int cpu =3D cpumask_first(&cspmu->associated_cpus); + const int socket =3D cpu_to_node(cpu); + u32 inst_id; + + if (!nv_cspmu_get_inst_id(cspmu, &inst_id)) + name =3D devm_kasprintf(dev, GFP_KERNEL, + match->name_pattern, socket, inst_id); + break; + } case NAME_FMT_GENERIC: name =3D devm_kasprintf(dev, GFP_KERNEL, match->name_pattern, atomic_fetch_inc(&pmu_generic_idx)); @@ -514,8 +716,10 @@ static int nv_cspmu_init_ops(struct arm_cspmu *cspmu) cspmu->impl.ctx =3D ctx; =20 /* NVIDIA specific callbacks. 
*/
+	SET_OP(validate_event, impl_ops, match, NULL);
 	SET_OP(set_cc_filter, impl_ops, match, nv_cspmu_set_cc_filter);
 	SET_OP(set_ev_filter, impl_ops, match, nv_cspmu_set_ev_filter);
+	SET_OP(reset_ev_filter, impl_ops, match, NULL);
 	SET_OP(get_event_attrs, impl_ops, match, nv_cspmu_get_event_attrs);
 	SET_OP(get_format_attrs, impl_ops, match, nv_cspmu_get_format_attrs);
 	SET_OP(get_name, impl_ops, match, nv_cspmu_get_name);
-- 
2.43.0

From nobody Mon Jan 26 22:50:07 2026
From: Besar Wicaksono
Subject: [PATCH 5/8] perf/arm_cspmu: nvidia: Add Tegra410 PCIE-TGT PMU
Date: Mon, 26 Jan 2026 18:11:52 +0000
Message-ID: <20260126181155.2776097-6-bwicaksono@nvidia.com>
In-Reply-To: <20260126181155.2776097-1-bwicaksono@nvidia.com>
References: <20260126181155.2776097-1-bwicaksono@nvidia.com>
Content-Type: text/plain; charset="utf-8"

Add PCIE-TGT PMU support to the Tegra410 SoC.
Signed-off-by: Besar Wicaksono
---
 .../admin-guide/perf/nvidia-tegra410-pmu.rst  |  76 ++++
 drivers/perf/arm_cspmu/nvidia_cspmu.c         | 324 ++++++++++++++++++
 2 files changed, 400 insertions(+)

diff --git a/Documentation/admin-guide/perf/nvidia-tegra410-pmu.rst b/Documentation/admin-guide/perf/nvidia-tegra410-pmu.rst
index 8528685ddb61..07dc447eead7 100644
--- a/Documentation/admin-guide/perf/nvidia-tegra410-pmu.rst
+++ b/Documentation/admin-guide/perf/nvidia-tegra410-pmu.rst
@@ -7,6 +7,7 @@ metrics like memory bandwidth, latency, and utilization:
 
 * Unified Coherence Fabric (UCF)
 * PCIE
+* PCIE-TGT
 
 PMU Driver
 ----------
@@ -211,6 +212,11 @@ Example usage:
 
    perf stat -a -e nvidia_pcie_pmu_0_rc_4/event=0x4,src_bdf=0x0108,src_bdf_en=0x1/
 
+.. _NVIDIA_T410_PCIE_PMU_RC_Mapping_Section:
+
+Mapping the RC# to lspci segment number
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
 Mapping the RC# to the lspci segment number can be non-trivial; hence a new
 NVIDIA Designated Vendor-Specific Extended Capability (DVSEC) register is added
 into the PCIE config space for each RP. This DVSEC has vendor id "10de" and a
@@ -266,3 +272,73 @@ Example output::
    000d:40:00.0: Bus=40, Segment=0d, RP=01, RC=04, Socket=01
    000d:c0:00.0: Bus=c0, Segment=0d, RP=02, RC=04, Socket=01
    000e:00:00.0: Bus=00, Segment=0e, RP=00, RC=05, Socket=01
+
+PCIE-TGT PMU
+------------
+
+The PCIE-TGT PMU monitors traffic targeting PCIE BAR and CXL HDM ranges.
+There is one PCIE-TGT PMU per PCIE root complex (RC) in the SoC. Each RC in the
+Tegra410 SoC can have up to 16 lanes that can be bifurcated into up to 8 root
+ports (RP). The PMU provides an RP filter to count PCIE BAR traffic to each RP
+and an address filter to count accesses to PCIE BAR or CXL HDM ranges. The
+details of the filters are described in the following sections.
+
+Mapping the RC# to the lspci segment number works the same way as for the PCIE
+PMU. Please see :ref:`NVIDIA_T410_PCIE_PMU_RC_Mapping_Section` for more info.
+
+The events and configuration options of this PMU device are available in sysfs,
+see /sys/bus/event_source/devices/nvidia_pcie_tgt_pmu_<socket-id>_rc_<rc-id>.
+
+The events in this PMU can be used to measure bandwidth and utilization:
+
+ * rd_req: count the number of read requests to PCIE.
+ * wr_req: count the number of write requests to PCIE.
+ * rd_bytes: count the number of bytes transferred by rd_req.
+ * wr_bytes: count the number of bytes transferred by wr_req.
+ * cycles: count the PCIE cycles.
+
+The average bandwidth is calculated as::
+
+   AVG_RD_BANDWIDTH_IN_GBPS = RD_BYTES / ELAPSED_TIME_IN_NS
+   AVG_WR_BANDWIDTH_IN_GBPS = WR_BYTES / ELAPSED_TIME_IN_NS
+
+The average request rate is calculated as::
+
+   AVG_RD_REQUEST_RATE = RD_REQ / CYCLES
+   AVG_WR_REQUEST_RATE = WR_REQ / CYCLES
+
+The PMU events can be filtered based on the destination root port or the target
+address range. Filtering based on RP is only available for PCIE BAR traffic.
+The address filter works for both PCIE BAR and CXL HDM ranges. These filters
+can be found in sysfs, see
+/sys/bus/event_source/devices/nvidia_pcie_tgt_pmu_<socket-id>_rc_<rc-id>/format/.
+
+Destination filter settings:
+
+* dst_rp_mask: bitmask to select the root port(s) to monitor. E.g.
+  "dst_rp_mask=0xFF" corresponds to all root ports (from 0 to 7) in the PCIE RC.
+  Note that this filter is only available for PCIE BAR traffic.
+* dst_addr_base: BAR or CXL HDM filter base address.
+* dst_addr_mask: BAR or CXL HDM filter address mask.
+* dst_addr_en: enable the BAR or CXL HDM address range filter. If this is set,
+  the address range specified by "dst_addr_base" and "dst_addr_mask" will be
+  used to filter the PCIE BAR and CXL HDM traffic addresses. The PMU uses the
+  following comparison to determine whether the traffic destination address
+  falls within the filter range::
+
+   (txn's addr & dst_addr_mask) == (dst_addr_base & dst_addr_mask)
+
+  If the comparison succeeds, then the event will be counted.
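The address comparison above can be checked outside of perf with plain shell
arithmetic. The sketch below is illustrative only (the transaction addresses
are made up); it uses the same base/mask pair as the example at the end of this
section::

   # Base/mask pair covering the 0x10000..0x100FF window.
   dst_addr_base=0x10000
   dst_addr_mask=0xFFF00

   for addr in 0x10000 0x100FF 0x10100 0x20000; do
       if (( (addr & dst_addr_mask) == (dst_addr_base & dst_addr_mask) )); then
           printf '0x%x would be counted\n' "$addr"
       else
           printf '0x%x would be filtered out\n' "$addr"
       fi
   done

Only the first two addresses fall inside the window and would be counted.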
+
+If the destination filter is not specified, the RP filter will be configured by
+default to count PCIE BAR traffic to all root ports.
+
+Example usage:
+
+* Count event id 0x0 to root ports 0 and 1 of PCIE RC-0 on socket 0::
+
+   perf stat -a -e nvidia_pcie_tgt_pmu_0_rc_0/event=0x0,dst_rp_mask=0x3/
+
+* Count event id 0x1 for accesses to the PCIE BAR or CXL HDM address range
+  0x10000 to 0x100FF on socket 0's PCIE RC-1::
+
+   perf stat -a -e nvidia_pcie_tgt_pmu_0_rc_1/event=0x1,dst_addr_base=0x10000,dst_addr_mask=0xFFF00,dst_addr_en=0x1/
diff --git a/drivers/perf/arm_cspmu/nvidia_cspmu.c b/drivers/perf/arm_cspmu/nvidia_cspmu.c
index 3a5531d1f94c..095d2f322c6f 100644
--- a/drivers/perf/arm_cspmu/nvidia_cspmu.c
+++ b/drivers/perf/arm_cspmu/nvidia_cspmu.c
@@ -42,6 +42,24 @@
 #define NV_PCIE_V2_FILTER2_DST		GENMASK_ULL(NV_PCIE_V2_DST_COUNT - 1, 0)
 #define NV_PCIE_V2_FILTER2_DEFAULT	NV_PCIE_V2_FILTER2_DST
 
+#define NV_PCIE_TGT_PORT_COUNT		8ULL
+#define NV_PCIE_TGT_EV_TYPE_CC		0x4
+#define NV_PCIE_TGT_EV_TYPE_COUNT	3ULL
+#define NV_PCIE_TGT_EV_TYPE_MASK	GENMASK_ULL(NV_PCIE_TGT_EV_TYPE_COUNT - 1, 0)
+#define NV_PCIE_TGT_FILTER2_MASK	GENMASK_ULL(NV_PCIE_TGT_PORT_COUNT, 0)
+#define NV_PCIE_TGT_FILTER2_PORT	GENMASK_ULL(NV_PCIE_TGT_PORT_COUNT - 1, 0)
+#define NV_PCIE_TGT_FILTER2_ADDR_EN	BIT(NV_PCIE_TGT_PORT_COUNT)
+#define NV_PCIE_TGT_FILTER2_ADDR	GENMASK_ULL(15, NV_PCIE_TGT_PORT_COUNT)
+#define NV_PCIE_TGT_FILTER2_DEFAULT	NV_PCIE_TGT_FILTER2_PORT
+
+#define NV_PCIE_TGT_ADDR_COUNT		8ULL
+#define NV_PCIE_TGT_ADDR_STRIDE		20
+#define NV_PCIE_TGT_ADDR_CTRL		0xD38
+#define NV_PCIE_TGT_ADDR_BASE_LO	0xD3C
+#define NV_PCIE_TGT_ADDR_BASE_HI	0xD40
+#define NV_PCIE_TGT_ADDR_MASK_LO	0xD44
+#define NV_PCIE_TGT_ADDR_MASK_HI	0xD48
+
 #define NV_GENERIC_FILTER_ID_MASK	GENMASK_ULL(31, 0)
 
 #define NV_PRODID_MASK (PMIIDR_PRODUCTID | PMIIDR_VARIANT | PMIIDR_REVISION)
@@ -186,6 +204,15 @@ static struct attribute *pcie_v2_pmu_event_attrs[] = {
 	NULL,
 };
 
+static struct attribute *pcie_tgt_pmu_event_attrs[] = {
+	ARM_CSPMU_EVENT_ATTR(rd_bytes, 0x0),
+	ARM_CSPMU_EVENT_ATTR(wr_bytes, 0x1),
+	ARM_CSPMU_EVENT_ATTR(rd_req, 0x2),
+	ARM_CSPMU_EVENT_ATTR(wr_req, 0x3),
+	ARM_CSPMU_EVENT_ATTR(cycles, NV_PCIE_TGT_EV_TYPE_CC),
+	NULL,
+};
+
 static struct attribute *generic_pmu_event_attrs[] = {
 	ARM_CSPMU_EVENT_ATTR(cycles, ARM_CSPMU_EVT_CYCLES_DEFAULT),
 	NULL,
@@ -239,6 +266,15 @@ static struct attribute *pcie_v2_pmu_format_attrs[] = {
 	NULL,
 };
 
+static struct attribute *pcie_tgt_pmu_format_attrs[] = {
+	ARM_CSPMU_FORMAT_ATTR(event, "config:0-2"),
+	ARM_CSPMU_FORMAT_ATTR(dst_rp_mask, "config:3-10"),
+	ARM_CSPMU_FORMAT_ATTR(dst_addr_en, "config:11"),
+	ARM_CSPMU_FORMAT_ATTR(dst_addr_base, "config1:0-63"),
+	ARM_CSPMU_FORMAT_ATTR(dst_addr_mask, "config2:0-63"),
+	NULL,
+};
+
 static struct attribute *generic_pmu_format_attrs[] = {
 	ARM_CSPMU_FORMAT_EVENT_ATTR,
 	ARM_CSPMU_FORMAT_FILTER_ATTR,
@@ -478,6 +514,268 @@ static int pcie_v2_pmu_validate_event(struct arm_cspmu *cspmu,
 				      struct perf_event *new_ev)
 {
 	return 0;
 }
 
+struct pcie_tgt_addr_filter {
+	u32
refcount; + u64 base; + u64 mask; +}; + +struct pcie_tgt_data { + struct pcie_tgt_addr_filter addr_filter[NV_PCIE_TGT_ADDR_COUNT]; + void __iomem *addr_filter_reg; +}; + +#if defined(CONFIG_ACPI) +static int pcie_tgt_init_data(struct arm_cspmu *cspmu) +{ + int ret; + struct acpi_device *adev; + struct pcie_tgt_data *data; + struct list_head resource_list; + struct resource_entry *rentry; + struct nv_cspmu_ctx *ctx =3D to_nv_cspmu_ctx(cspmu); + struct device *dev =3D cspmu->dev; + + data =3D devm_kzalloc(dev, sizeof(struct pcie_tgt_data), GFP_KERNEL); + if (!data) + return -ENOMEM; + + adev =3D arm_cspmu_acpi_dev_get(cspmu); + if (!adev) { + dev_err(dev, "failed to get associated PCIE-TGT device\n"); + return -ENODEV; + } + + INIT_LIST_HEAD(&resource_list); + ret =3D acpi_dev_get_memory_resources(adev, &resource_list); + if (ret < 0) { + dev_err(dev, "failed to get PCIE-TGT device memory resources\n"); + acpi_dev_put(adev); + return ret; + } + + rentry =3D list_first_entry_or_null( + &resource_list, struct resource_entry, node); + if (rentry) { + data->addr_filter_reg =3D devm_ioremap_resource(dev, rentry->res); + ret =3D 0; + } + + if (IS_ERR(data->addr_filter_reg)) { + dev_err(dev, "failed to get address filter resource\n"); + ret =3D PTR_ERR(data->addr_filter_reg); + } + + acpi_dev_free_resource_list(&resource_list); + acpi_dev_put(adev); + + ctx->data =3D data; + + return ret; +} +#else +static int pcie_tgt_init_data(struct arm_cspmu *cspmu) +{ + return -ENODEV; +} +#endif + +static struct pcie_tgt_data *pcie_tgt_get_data(struct arm_cspmu *cspmu) +{ + struct nv_cspmu_ctx *ctx =3D to_nv_cspmu_ctx(cspmu); + + return ctx->data; +} + +/* Find the first available address filter slot. */ +static int pcie_tgt_find_addr_idx(struct arm_cspmu *cspmu, u64 base, u64 m= ask, + bool is_reset) +{ + int i; + struct pcie_tgt_data *data =3D pcie_tgt_get_data(cspmu); + + for (i =3D 0; i < NV_PCIE_TGT_ADDR_COUNT; i++) { + if (!is_reset && data->addr_filter[i].refcount =3D=3D 0) + return i; + + if (data->addr_filter[i].base =3D=3D base && + data->addr_filter[i].mask =3D=3D mask) + return i; + } + + return -ENODEV; +} + +static u32 pcie_tgt_pmu_event_filter(const struct perf_event *event) +{ + u32 filter; + + filter =3D (event->attr.config >> NV_PCIE_TGT_EV_TYPE_COUNT) & + NV_PCIE_TGT_FILTER2_MASK; + + return filter; +} + +static bool pcie_tgt_pmu_addr_en(const struct perf_event *event) +{ + u32 filter =3D pcie_tgt_pmu_event_filter(event); + + return FIELD_GET(NV_PCIE_TGT_FILTER2_ADDR_EN, filter) !=3D 0; +} + +static u32 pcie_tgt_pmu_port_filter(const struct perf_event *event) +{ + u32 filter =3D pcie_tgt_pmu_event_filter(event); + + return FIELD_GET(NV_PCIE_TGT_FILTER2_PORT, filter); +} + +static u64 pcie_tgt_pmu_dst_addr_base(const struct perf_event *event) +{ + return event->attr.config1; +} + +static u64 pcie_tgt_pmu_dst_addr_mask(const struct perf_event *event) +{ + return event->attr.config2; +} + +static int pcie_tgt_pmu_validate_event(struct arm_cspmu *cspmu, + struct perf_event *new_ev) +{ + u64 base, mask; + int idx; + + if (!pcie_tgt_pmu_addr_en(new_ev)) + return 0; + + /* Make sure there is a slot available for the address filter. 
*/ + base =3D pcie_tgt_pmu_dst_addr_base(new_ev); + mask =3D pcie_tgt_pmu_dst_addr_mask(new_ev); + idx =3D pcie_tgt_find_addr_idx(cspmu, base, mask, false); + if (idx < 0) + return -EINVAL; + + return 0; +} + +static void pcie_tgt_pmu_config_addr_filter(struct arm_cspmu *cspmu, + bool en, u64 base, u64 mask, int idx) +{ + struct pcie_tgt_data *data; + struct pcie_tgt_addr_filter *filter; + void __iomem *filter_reg; + + data =3D pcie_tgt_get_data(cspmu); + filter =3D &data->addr_filter[idx]; + filter_reg =3D data->addr_filter_reg + (idx * NV_PCIE_TGT_ADDR_STRIDE); + + if (en) { + filter->refcount++; + if (filter->refcount =3D=3D 1) { + filter->base =3D base; + filter->mask =3D mask; + + writel(lower_32_bits(base), filter_reg + NV_PCIE_TGT_ADDR_BASE_LO); + writel(upper_32_bits(base), filter_reg + NV_PCIE_TGT_ADDR_BASE_HI); + writel(lower_32_bits(mask), filter_reg + NV_PCIE_TGT_ADDR_MASK_LO); + writel(upper_32_bits(mask), filter_reg + NV_PCIE_TGT_ADDR_MASK_HI); + writel(1, filter_reg + NV_PCIE_TGT_ADDR_CTRL); + } + } else { + filter->refcount--; + if (filter->refcount =3D=3D 0) { + writel(0, filter_reg + NV_PCIE_TGT_ADDR_CTRL); + writel(0, filter_reg + NV_PCIE_TGT_ADDR_BASE_LO); + writel(0, filter_reg + NV_PCIE_TGT_ADDR_BASE_HI); + writel(0, filter_reg + NV_PCIE_TGT_ADDR_MASK_LO); + writel(0, filter_reg + NV_PCIE_TGT_ADDR_MASK_HI); + + filter->base =3D 0; + filter->mask =3D 0; + } + } +} + +static void pcie_tgt_pmu_set_ev_filter(struct arm_cspmu *cspmu, + const struct perf_event *event) +{ + bool addr_filter_en; + int idx; + u32 filter2_val, filter2_offset, port_filter; + u64 base, mask; + + filter2_val =3D 0; + filter2_offset =3D PMEVFILT2R + (4 * event->hw.idx); + + addr_filter_en =3D pcie_tgt_pmu_addr_en(event); + if (addr_filter_en) { + base =3D pcie_tgt_pmu_dst_addr_base(event); + mask =3D pcie_tgt_pmu_dst_addr_mask(event); + idx =3D pcie_tgt_find_addr_idx(cspmu, base, mask, false); + + if (idx < 0) { + dev_err(cspmu->dev, + "Unable to find a slot for address filtering\n"); + writel(0, cspmu->base0 + filter2_offset); + return; + } + + /* Configure address range filter registers.*/ + pcie_tgt_pmu_config_addr_filter(cspmu, true, base, mask, idx); + + /* Config the counter to use the selected address filter slot. */ + filter2_val |=3D FIELD_PREP(NV_PCIE_TGT_FILTER2_ADDR, 1U << idx); + } + + port_filter =3D pcie_tgt_pmu_port_filter(event); + + /* Monitor all ports if no filter is selected. 
*/ + if (!addr_filter_en && port_filter =3D=3D 0) + port_filter =3D NV_PCIE_TGT_FILTER2_PORT; + + filter2_val |=3D FIELD_PREP(NV_PCIE_TGT_FILTER2_PORT, port_filter); + + writel(filter2_val, cspmu->base0 + filter2_offset); +} + +static void pcie_tgt_pmu_reset_ev_filter(struct arm_cspmu *cspmu, + const struct perf_event *event) +{ + bool addr_filter_en; + u64 base, mask; + int idx; + + addr_filter_en =3D pcie_tgt_pmu_addr_en(event); + if (!addr_filter_en) + return; + + base =3D pcie_tgt_pmu_dst_addr_base(event); + mask =3D pcie_tgt_pmu_dst_addr_mask(event); + idx =3D pcie_tgt_find_addr_idx(cspmu, base, mask, true); + + if (idx < 0) { + dev_err(cspmu->dev, + "Unable to find the address filter slot to reset\n"); + return; + } + + pcie_tgt_pmu_config_addr_filter( + cspmu, false, base, mask, idx); +} + +static u32 pcie_tgt_pmu_event_type(const struct perf_event *event) +{ + return event->attr.config & NV_PCIE_TGT_EV_TYPE_MASK; +} + +static bool pcie_tgt_pmu_is_cycle_counter_event(const struct perf_event *e= vent) +{ + u32 event_type =3D pcie_tgt_pmu_event_type(event); + + return event_type =3D=3D NV_PCIE_TGT_EV_TYPE_CC; +} + enum nv_cspmu_name_fmt { NAME_FMT_GENERIC, NAME_FMT_SOCKET, @@ -622,6 +920,30 @@ static const struct nv_cspmu_match nv_cspmu_match[] = =3D { .reset_ev_filter =3D nv_cspmu_reset_ev_filter, } }, + { + .prodid =3D 0x10700000, + .prodid_mask =3D NV_PRODID_MASK, + .name_pattern =3D "nvidia_pcie_tgt_pmu_%u_rc_%u", + .name_fmt =3D NAME_FMT_SOCKET_INST, + .template_ctx =3D { + .event_attr =3D pcie_tgt_pmu_event_attrs, + .format_attr =3D pcie_tgt_pmu_format_attrs, + .filter_mask =3D 0x0, + .filter_default_val =3D 0x0, + .filter2_mask =3D NV_PCIE_TGT_FILTER2_MASK, + .filter2_default_val =3D NV_PCIE_TGT_FILTER2_DEFAULT, + .get_filter =3D NULL, + .get_filter2 =3D NULL, + .init_data =3D pcie_tgt_init_data + }, + .ops =3D { + .is_cycle_counter_event =3D pcie_tgt_pmu_is_cycle_counter_event, + .event_type =3D pcie_tgt_pmu_event_type, + .validate_event =3D pcie_tgt_pmu_validate_event, + .set_ev_filter =3D pcie_tgt_pmu_set_ev_filter, + .reset_ev_filter =3D pcie_tgt_pmu_reset_ev_filter, + } + }, { .prodid =3D 0, .prodid_mask =3D 0, @@ -717,6 +1039,8 @@ static int nv_cspmu_init_ops(struct arm_cspmu *cspmu) =20 /* NVIDIA specific callbacks. 
*/
 	SET_OP(validate_event, impl_ops, match, NULL);
+	SET_OP(event_type, impl_ops, match, NULL);
+	SET_OP(is_cycle_counter_event, impl_ops, match, NULL);
 	SET_OP(set_cc_filter, impl_ops, match, nv_cspmu_set_cc_filter);
 	SET_OP(set_ev_filter, impl_ops, match, nv_cspmu_set_ev_filter);
 	SET_OP(reset_ev_filter, impl_ops, match, NULL);
-- 
2.43.0

From nobody Mon Jan 26 22:50:07 2026
From: Besar Wicaksono
Subject: [PATCH 6/8] perf: add NVIDIA Tegra410 CPU Memory Latency PMU
Date: Mon, 26 Jan 2026 18:11:53 +0000
Message-ID: <20260126181155.2776097-7-bwicaksono@nvidia.com>
In-Reply-To: <20260126181155.2776097-1-bwicaksono@nvidia.com>
References: <20260126181155.2776097-1-bwicaksono@nvidia.com>
Content-Type: text/plain; charset="utf-8"

Add CPU Memory (CMEM) Latency PMU support to the Tegra410 SoC.
Signed-off-by: Besar Wicaksono
---
 .../admin-guide/perf/nvidia-tegra410-pmu.rst  |  25 +
 drivers/perf/Kconfig                          |   7 +
 drivers/perf/Makefile                         |   1 +
 drivers/perf/nvidia_t410_cmem_latency_pmu.c   | 727 ++++++++++++++++++
 4 files changed, 760 insertions(+)
 create mode 100644 drivers/perf/nvidia_t410_cmem_latency_pmu.c

diff --git a/Documentation/admin-guide/perf/nvidia-tegra410-pmu.rst b/Documentation/admin-guide/perf/nvidia-tegra410-pmu.rst
index 07dc447eead7..11fc1c88346a 100644
--- a/Documentation/admin-guide/perf/nvidia-tegra410-pmu.rst
+++ b/Documentation/admin-guide/perf/nvidia-tegra410-pmu.rst
@@ -8,6 +8,7 @@ metrics like memory bandwidth, latency, and utilization:
 * Unified Coherence Fabric (UCF)
 * PCIE
 * PCIE-TGT
+* CPU Memory (CMEM) Latency
 
 PMU Driver
 ----------
@@ -342,3 +343,27 @@ Example usage:
   0x10000 to 0x100FF on socket 0's PCIE RC-1::
 
    perf stat -a -e nvidia_pcie_tgt_pmu_0_rc_1/event=0x1,dst_addr_base=0x10000,dst_addr_mask=0xFFF00,dst_addr_en=0x1/
+
+CPU Memory (CMEM) Latency PMU
+-----------------------------
+
+This PMU monitors latency events of memory read requests to local
+CPU DRAM:
+
+ * RD_REQ counters: count read requests (32B per request).
+ * RD_CUM_OUTS counters: accumulated outstanding request counters, which track
+   how many cycles the read requests are in flight.
+ * CYCLES counter: counts the number of elapsed cycles.
+
+The average latency is calculated as::
+
+   FREQ_IN_GHZ = CYCLES / ELAPSED_TIME_IN_NS
+   AVG_LATENCY_IN_CYCLES = RD_CUM_OUTS / RD_REQ
+   AVG_LATENCY_IN_NS = AVG_LATENCY_IN_CYCLES / FREQ_IN_GHZ
+
+The events and configuration options of this PMU device are described in sysfs,
+see /sys/bus/event_source/devices/nvidia_cmem_latency_pmu_<socket-id>.
+
+Example usage::
+
+   perf stat -a -e '{nvidia_cmem_latency_pmu_0/rd_req/,nvidia_cmem_latency_pmu_0/rd_cum_outs/,nvidia_cmem_latency_pmu_0/cycles/}'
diff --git a/drivers/perf/Kconfig b/drivers/perf/Kconfig
index 638321fc9800..9fed3c41d5ea 100644
--- a/drivers/perf/Kconfig
+++ b/drivers/perf/Kconfig
@@ -311,4 +311,11 @@ config MARVELL_PEM_PMU
 	  Enable support for PCIe Interface performance monitoring
 	  on Marvell platform.
 
+config NVIDIA_TEGRA410_CMEM_LATENCY_PMU
+	tristate "NVIDIA Tegra410 CPU Memory Latency PMU"
+	depends on ARM64
+	help
+	  Enable perf support for CPU memory latency counter monitoring on
+	  the NVIDIA Tegra410 SoC.
+
 endmenu
diff --git a/drivers/perf/Makefile b/drivers/perf/Makefile
index ea52711a87e3..4aa6aad393c2 100644
--- a/drivers/perf/Makefile
+++ b/drivers/perf/Makefile
@@ -35,3 +35,4 @@ obj-$(CONFIG_DWC_PCIE_PMU) += dwc_pcie_pmu.o
 obj-$(CONFIG_ARM_CORESIGHT_PMU_ARCH_SYSTEM_PMU) += arm_cspmu/
 obj-$(CONFIG_MESON_DDR_PMU) += amlogic/
 obj-$(CONFIG_CXL_PMU) += cxl_pmu.o
+obj-$(CONFIG_NVIDIA_TEGRA410_CMEM_LATENCY_PMU) += nvidia_t410_cmem_latency_pmu.o
diff --git a/drivers/perf/nvidia_t410_cmem_latency_pmu.c b/drivers/perf/nvidia_t410_cmem_latency_pmu.c
new file mode 100644
index 000000000000..9b466581c8fc
--- /dev/null
+++ b/drivers/perf/nvidia_t410_cmem_latency_pmu.c
@@ -0,0 +1,727 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * NVIDIA Tegra410 CPU Memory (CMEM) Latency PMU driver.
+ *
+ * Copyright (c) 2026, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+ */
+
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+#define NUM_INSTANCES		14
+#define BCAST(pmu)		pmu->base[NUM_INSTANCES]
+
+/* Register offsets.
*/ +#define CG_CTRL 0x800 +#define CTRL 0x808 +#define STATUS 0x810 +#define CYCLE_CNTR 0x818 +#define MC0_REQ_CNTR 0x820 +#define MC0_AOR_CNTR 0x830 +#define MC1_REQ_CNTR 0x838 +#define MC1_AOR_CNTR 0x848 +#define MC2_REQ_CNTR 0x850 +#define MC2_AOR_CNTR 0x860 + +/* CTRL values. */ +#define CTRL_DISABLE 0x0ULL +#define CTRL_ENABLE 0x1ULL +#define CTRL_CLR 0x2ULL + +/* CG_CTRL values. */ +#define CG_CTRL_DISABLE 0x0ULL +#define CG_CTRL_ENABLE 0x1ULL + +/* STATUS register field. */ +#define STATUS_CYCLE_OVF BIT(0) +#define STATUS_MC0_AOR_OVF BIT(1) +#define STATUS_MC0_REQ_OVF BIT(3) +#define STATUS_MC1_AOR_OVF BIT(4) +#define STATUS_MC1_REQ_OVF BIT(6) +#define STATUS_MC2_AOR_OVF BIT(7) +#define STATUS_MC2_REQ_OVF BIT(9) + +/* Events. */ +#define EVENT_CYCLES 0x0 +#define EVENT_REQ 0x1 +#define EVENT_AOR 0x2 + +#define NUM_EVENTS 0x3 +#define MASK_EVENT 0x3 +#define MAX_ACTIVE_EVENTS 32 + +#define ACTIVE_CPU_MASK 0x0 +#define ASSOCIATED_CPU_MASK 0x1 + +static unsigned long cmem_lat_pmu_cpuhp_state; + +struct cmem_lat_pmu_hw_events { + struct perf_event *events[MAX_ACTIVE_EVENTS]; + DECLARE_BITMAP(used_ctrs, MAX_ACTIVE_EVENTS); +}; + +struct cmem_lat_pmu { + struct pmu pmu; + struct device *dev; + const char *name; + const char *identifier; + void __iomem *base[NUM_INSTANCES + 1]; + cpumask_t associated_cpus; + cpumask_t active_cpu; + struct hlist_node node; + struct cmem_lat_pmu_hw_events hw_events; +}; + +#define to_cmem_lat_pmu(p) \ + container_of(p, struct cmem_lat_pmu, pmu) + + +/* Get event type from perf_event. */ +static inline u32 get_event_type(struct perf_event *event) +{ + return (event->attr.config) & MASK_EVENT; +} + +/* PMU operations. */ +static int cmem_lat_pmu_get_event_idx(struct cmem_lat_pmu_hw_events *hw_ev= ents, + struct perf_event *event) +{ + unsigned int idx; + + idx =3D find_first_zero_bit(hw_events->used_ctrs, MAX_ACTIVE_EVENTS); + if (idx >=3D MAX_ACTIVE_EVENTS) + return -EAGAIN; + + set_bit(idx, hw_events->used_ctrs); + + return idx; +} + +static bool cmem_lat_pmu_validate_event(struct pmu *pmu, + struct cmem_lat_pmu_hw_events *hw_events, + struct perf_event *event) +{ + if (is_software_event(event)) + return true; + + /* Reject groups spanning multiple HW PMUs. */ + if (event->pmu !=3D pmu) + return false; + + return (cmem_lat_pmu_get_event_idx(hw_events, event) >=3D 0); +} + +/* + * Make sure the group of events can be scheduled at once + * on the PMU. + */ +static bool cmem_lat_pmu_validate_group(struct perf_event *event) +{ + struct perf_event *sibling, *leader =3D event->group_leader; + struct cmem_lat_pmu_hw_events fake_hw_events; + + if (event->group_leader =3D=3D event) + return true; + + memset(&fake_hw_events, 0, sizeof(fake_hw_events)); + + if (!cmem_lat_pmu_validate_event(event->pmu, &fake_hw_events, leader)) + return false; + + for_each_sibling_event(sibling, leader) { + if (!cmem_lat_pmu_validate_event(event->pmu, &fake_hw_events, + sibling)) + return false; + } + + return cmem_lat_pmu_validate_event(event->pmu, &fake_hw_events, event); +} + +static int cmem_lat_pmu_event_init(struct perf_event *event) +{ + struct cmem_lat_pmu *cmem_lat_pmu =3D to_cmem_lat_pmu(event->pmu); + struct hw_perf_event *hwc =3D &event->hw; + u32 event_type =3D get_event_type(event); + + if (event->attr.type !=3D event->pmu->type || + event_type >=3D NUM_EVENTS) + return -ENOENT; + + /* + * Following other "uncore" PMUs, we do not support sampling mode or + * attach to a task (per-process mode). 
+ */ + if (is_sampling_event(event)) { + dev_dbg(cmem_lat_pmu->pmu.dev, + "Can't support sampling events\n"); + return -EOPNOTSUPP; + } + + if (event->cpu < 0 || event->attach_state & PERF_ATTACH_TASK) { + dev_dbg(cmem_lat_pmu->pmu.dev, + "Can't support per-task counters\n"); + return -EINVAL; + } + + /* + * Make sure the CPU assignment is on one of the CPUs associated with + * this PMU. + */ + if (!cpumask_test_cpu(event->cpu, &cmem_lat_pmu->associated_cpus)) { + dev_dbg(cmem_lat_pmu->pmu.dev, + "Requested cpu is not associated with the PMU\n"); + return -EINVAL; + } + + /* Enforce the current active CPU to handle the events in this PMU. */ + event->cpu =3D cpumask_first(&cmem_lat_pmu->active_cpu); + if (event->cpu >=3D nr_cpu_ids) + return -EINVAL; + + if (!cmem_lat_pmu_validate_group(event)) + return -EINVAL; + + hwc->idx =3D -1; + hwc->config =3D event_type; + + return 0; +} + +static u64 cmem_lat_pmu_read_status(struct cmem_lat_pmu *cmem_lat_pmu, + unsigned int inst) +{ + return readq(cmem_lat_pmu->base[inst] + STATUS); +} + +static u64 cmem_lat_pmu_read_cycle_counter(struct perf_event *event) +{ + const unsigned int instance =3D 0; + u64 status; + struct cmem_lat_pmu *cmem_lat_pmu =3D to_cmem_lat_pmu(event->pmu); + struct device *dev =3D cmem_lat_pmu->dev; + + /* + * Use the reading from first instance since all instances are + * identical. + */ + status =3D cmem_lat_pmu_read_status(cmem_lat_pmu, instance); + if (status & STATUS_CYCLE_OVF) + dev_warn(dev, "Cycle counter overflow\n"); + + return readq(cmem_lat_pmu->base[instance] + CYCLE_CNTR); +} + +static u64 cmem_lat_pmu_read_req_counter(struct perf_event *event) +{ + unsigned int i; + u64 status, val =3D 0; + struct cmem_lat_pmu *cmem_lat_pmu =3D to_cmem_lat_pmu(event->pmu); + struct device *dev =3D cmem_lat_pmu->dev; + + /* Sum up the counts from all instances. */ + for (i =3D 0; i < NUM_INSTANCES; i++) { + status =3D cmem_lat_pmu_read_status(cmem_lat_pmu, i); + if (status & STATUS_MC0_REQ_OVF) + dev_warn(dev, "MC0 request counter overflow\n"); + if (status & STATUS_MC1_REQ_OVF) + dev_warn(dev, "MC1 request counter overflow\n"); + if (status & STATUS_MC2_REQ_OVF) + dev_warn(dev, "MC2 request counter overflow\n"); + + val +=3D readq(cmem_lat_pmu->base[i] + MC0_REQ_CNTR); + val +=3D readq(cmem_lat_pmu->base[i] + MC1_REQ_CNTR); + val +=3D readq(cmem_lat_pmu->base[i] + MC2_REQ_CNTR); + } + + return val; +} + +static u64 cmem_lat_pmu_read_aor_counter(struct perf_event *event) +{ + unsigned int i; + u64 status, val =3D 0; + struct cmem_lat_pmu *cmem_lat_pmu =3D to_cmem_lat_pmu(event->pmu); + struct device *dev =3D cmem_lat_pmu->dev; + + /* Sum up the counts from all instances. 
*/ + for (i =3D 0; i < NUM_INSTANCES; i++) { + status =3D cmem_lat_pmu_read_status(cmem_lat_pmu, i); + if (status & STATUS_MC0_AOR_OVF) + dev_warn(dev, "MC0 AOR counter overflow\n"); + if (status & STATUS_MC1_AOR_OVF) + dev_warn(dev, "MC1 AOR counter overflow\n"); + if (status & STATUS_MC2_AOR_OVF) + dev_warn(dev, "MC2 AOR counter overflow\n"); + + val +=3D readq(cmem_lat_pmu->base[i] + MC0_AOR_CNTR); + val +=3D readq(cmem_lat_pmu->base[i] + MC1_AOR_CNTR); + val +=3D readq(cmem_lat_pmu->base[i] + MC2_AOR_CNTR); + } + + return val; +} + +static u64 (*read_counter_fn[NUM_EVENTS])(struct perf_event *) =3D { + [EVENT_CYCLES] =3D cmem_lat_pmu_read_cycle_counter, + [EVENT_REQ] =3D cmem_lat_pmu_read_req_counter, + [EVENT_AOR] =3D cmem_lat_pmu_read_aor_counter, +}; + +static void cmem_lat_pmu_event_update(struct perf_event *event) +{ + u32 event_type; + u64 prev, now; + struct hw_perf_event *hwc =3D &event->hw; + + if (hwc->state & PERF_HES_STOPPED) + return; + + event_type =3D hwc->config; + + do { + prev =3D local64_read(&hwc->prev_count); + now =3D read_counter_fn[event_type](event); + } while (local64_cmpxchg(&hwc->prev_count, prev, now) !=3D prev); + + local64_add(now - prev, &event->count); + + hwc->state |=3D PERF_HES_UPTODATE; +} + +static void cmem_lat_pmu_start(struct perf_event *event, int pmu_flags) +{ + event->hw.state =3D 0; +} + +static void cmem_lat_pmu_stop(struct perf_event *event, int pmu_flags) +{ + event->hw.state |=3D PERF_HES_STOPPED; +} + +static int cmem_lat_pmu_add(struct perf_event *event, int flags) +{ + struct cmem_lat_pmu *cmem_lat_pmu =3D to_cmem_lat_pmu(event->pmu); + struct cmem_lat_pmu_hw_events *hw_events =3D &cmem_lat_pmu->hw_events; + struct hw_perf_event *hwc =3D &event->hw; + int idx; + + if (WARN_ON_ONCE(!cpumask_test_cpu(smp_processor_id(), + &cmem_lat_pmu->associated_cpus))) + return -ENOENT; + + idx =3D cmem_lat_pmu_get_event_idx(hw_events, event); + if (idx < 0) + return idx; + + hw_events->events[idx] =3D event; + hwc->idx =3D idx; + hwc->state =3D PERF_HES_STOPPED | PERF_HES_UPTODATE; + + if (flags & PERF_EF_START) + cmem_lat_pmu_start(event, PERF_EF_RELOAD); + + /* Propagate changes to the userspace mapping. */ + perf_event_update_userpage(event); + + return 0; +} + +static void cmem_lat_pmu_del(struct perf_event *event, int flags) +{ + struct cmem_lat_pmu *cmem_lat_pmu =3D to_cmem_lat_pmu(event->pmu); + struct cmem_lat_pmu_hw_events *hw_events =3D &cmem_lat_pmu->hw_events; + struct hw_perf_event *hwc =3D &event->hw; + int idx =3D hwc->idx; + + cmem_lat_pmu_stop(event, PERF_EF_UPDATE); + + hw_events->events[idx] =3D NULL; + + clear_bit(idx, hw_events->used_ctrs); + + perf_event_update_userpage(event); +} + +static void cmem_lat_pmu_read(struct perf_event *event) +{ + cmem_lat_pmu_event_update(event); +} + +static inline void cmem_lat_pmu_cg_ctrl(struct cmem_lat_pmu *cmem_lat_pmu,= u64 val) +{ + writeq(val, BCAST(cmem_lat_pmu) + CG_CTRL); +} + +static inline void cmem_lat_pmu_ctrl(struct cmem_lat_pmu *cmem_lat_pmu, u6= 4 val) +{ + writeq(val, BCAST(cmem_lat_pmu) + CTRL); +} + +static void cmem_lat_pmu_enable(struct pmu *pmu) +{ + bool disabled; + struct cmem_lat_pmu *cmem_lat_pmu =3D to_cmem_lat_pmu(pmu); + + disabled =3D bitmap_empty( + cmem_lat_pmu->hw_events.used_ctrs, MAX_ACTIVE_EVENTS); + + if (disabled) + return; + + /* Enable all the counters. 
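+ * Both writes go through the broadcast region: CG_CTRL first, then CTRL.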
*/ + cmem_lat_pmu_cg_ctrl(cmem_lat_pmu, CG_CTRL_ENABLE); + cmem_lat_pmu_ctrl(cmem_lat_pmu, CTRL_ENABLE); +} + +static void cmem_lat_pmu_disable(struct pmu *pmu) +{ + int idx; + struct perf_event *event; + struct cmem_lat_pmu *cmem_lat_pmu =3D to_cmem_lat_pmu(pmu); + + /* Disable all the counters. */ + cmem_lat_pmu_ctrl(cmem_lat_pmu, CTRL_DISABLE); + + /* + * The counters will start from 0 again on restart. + * Update the events immediately to avoid losing the counts. + */ + for_each_set_bit( + idx, cmem_lat_pmu->hw_events.used_ctrs, MAX_ACTIVE_EVENTS) { + event =3D cmem_lat_pmu->hw_events.events[idx]; + + if (!event) + continue; + + cmem_lat_pmu_event_update(event); + + local64_set(&event->hw.prev_count, 0ULL); + } + + cmem_lat_pmu_ctrl(cmem_lat_pmu, CTRL_CLR); + cmem_lat_pmu_cg_ctrl(cmem_lat_pmu, CG_CTRL_DISABLE); +} + +/* PMU identifier attribute. */ + +static ssize_t cmem_lat_pmu_identifier_show(struct device *dev, + struct device_attribute *attr, + char *page) +{ + struct cmem_lat_pmu *cmem_lat_pmu =3D to_cmem_lat_pmu(dev_get_drvdata(dev= )); + + return sysfs_emit(page, "%s\n", cmem_lat_pmu->identifier); +} + +static struct device_attribute cmem_lat_pmu_identifier_attr =3D + __ATTR(identifier, 0444, cmem_lat_pmu_identifier_show, NULL); + +static struct attribute *cmem_lat_pmu_identifier_attrs[] =3D { + &cmem_lat_pmu_identifier_attr.attr, + NULL, +}; + +static struct attribute_group cmem_lat_pmu_identifier_attr_group =3D { + .attrs =3D cmem_lat_pmu_identifier_attrs, +}; + +/* Format attributes. */ + +#define NV_PMU_EXT_ATTR(_name, _func, _config) \ + (&((struct dev_ext_attribute[]){ \ + { \ + .attr =3D __ATTR(_name, 0444, _func, NULL), \ + .var =3D (void *)_config \ + } \ + })[0].attr.attr) + +static struct attribute *cmem_lat_pmu_formats[] =3D { + NV_PMU_EXT_ATTR(event, device_show_string, "config:0-1"), + NULL, +}; + +static const struct attribute_group cmem_lat_pmu_format_group =3D { + .name =3D "format", + .attrs =3D cmem_lat_pmu_formats, +}; + +/* Event attributes. */ + +static ssize_t cmem_lat_pmu_sysfs_event_show(struct device *dev, + struct device_attribute *attr, char *buf) +{ + struct perf_pmu_events_attr *pmu_attr; + + pmu_attr =3D container_of(attr, typeof(*pmu_attr), attr); + return sysfs_emit(buf, "event=3D0x%llx\n", pmu_attr->id); +} + +#define NV_PMU_EVENT_ATTR(_name, _config) \ + PMU_EVENT_ATTR_ID(_name, cmem_lat_pmu_sysfs_event_show, _config) + +static struct attribute *cmem_lat_pmu_events[] =3D { + NV_PMU_EVENT_ATTR(cycles, EVENT_CYCLES), + NV_PMU_EVENT_ATTR(rd_req, EVENT_REQ), + NV_PMU_EVENT_ATTR(rd_cum_outs, EVENT_AOR), + NULL +}; + +static const struct attribute_group cmem_lat_pmu_events_group =3D { + .name =3D "events", + .attrs =3D cmem_lat_pmu_events, +}; + +/* Cpumask attributes. 
*/ + +static ssize_t cmem_lat_pmu_cpumask_show(struct device *dev, + struct device_attribute *attr, char *buf) +{ + struct pmu *pmu =3D dev_get_drvdata(dev); + struct cmem_lat_pmu *cmem_lat_pmu =3D to_cmem_lat_pmu(pmu); + struct dev_ext_attribute *eattr =3D + container_of(attr, struct dev_ext_attribute, attr); + unsigned long mask_id =3D (unsigned long)eattr->var; + const cpumask_t *cpumask; + + switch (mask_id) { + case ACTIVE_CPU_MASK: + cpumask =3D &cmem_lat_pmu->active_cpu; + break; + case ASSOCIATED_CPU_MASK: + cpumask =3D &cmem_lat_pmu->associated_cpus; + break; + default: + return 0; + } + return cpumap_print_to_pagebuf(true, buf, cpumask); +} + +#define NV_PMU_CPUMASK_ATTR(_name, _config) \ + NV_PMU_EXT_ATTR(_name, cmem_lat_pmu_cpumask_show, \ + (unsigned long)_config) + +static struct attribute *cmem_lat_pmu_cpumask_attrs[] =3D { + NV_PMU_CPUMASK_ATTR(cpumask, ACTIVE_CPU_MASK), + NV_PMU_CPUMASK_ATTR(associated_cpus, ASSOCIATED_CPU_MASK), + NULL, +}; + +static const struct attribute_group cmem_lat_pmu_cpumask_attr_group =3D { + .attrs =3D cmem_lat_pmu_cpumask_attrs, +}; + +/* Per PMU device attribute groups. */ + +static const struct attribute_group *cmem_lat_pmu_attr_groups[] =3D { + &cmem_lat_pmu_identifier_attr_group, + &cmem_lat_pmu_format_group, + &cmem_lat_pmu_events_group, + &cmem_lat_pmu_cpumask_attr_group, + NULL, +}; + +static int cmem_lat_pmu_cpu_online(unsigned int cpu, struct hlist_node *no= de) +{ + struct cmem_lat_pmu *cmem_lat_pmu =3D + hlist_entry_safe(node, struct cmem_lat_pmu, node); + + if (!cpumask_test_cpu(cpu, &cmem_lat_pmu->associated_cpus)) + return 0; + + /* If the PMU is already managed, there is nothing to do */ + if (!cpumask_empty(&cmem_lat_pmu->active_cpu)) + return 0; + + /* Use this CPU for event counting */ + cpumask_set_cpu(cpu, &cmem_lat_pmu->active_cpu); + + return 0; +} + +static int cmem_lat_pmu_cpu_teardown(unsigned int cpu, struct hlist_node *= node) +{ + unsigned int dst; + + struct cmem_lat_pmu *cmem_lat_pmu =3D + hlist_entry_safe(node, struct cmem_lat_pmu, node); + + /* Nothing to do if this CPU doesn't own the PMU */ + if (!cpumask_test_and_clear_cpu(cpu, &cmem_lat_pmu->active_cpu)) + return 0; + + /* Choose a new CPU to migrate ownership of the PMU to */ + dst =3D cpumask_any_and_but(&cmem_lat_pmu->associated_cpus, + cpu_online_mask, cpu); + if (dst >=3D nr_cpu_ids) + return 0; + + /* Use this CPU for event counting */ + perf_pmu_migrate_context(&cmem_lat_pmu->pmu, cpu, dst); + cpumask_set_cpu(dst, &cmem_lat_pmu->active_cpu); + + return 0; +} + +static int cmem_lat_pmu_get_cpus(struct cmem_lat_pmu *cmem_lat_pmu, + unsigned int socket) +{ + int ret =3D 0, cpu; + + for_each_possible_cpu(cpu) { + if (cpu_to_node(cpu) =3D=3D socket) + cpumask_set_cpu(cpu, &cmem_lat_pmu->associated_cpus); + } + + if (cpumask_empty(&cmem_lat_pmu->associated_cpus)) { + dev_dbg(cmem_lat_pmu->dev, + "No cpu associated with PMU socket-%u\n", socket); + ret =3D -ENODEV; + } + + return ret; +} + +static int cmem_lat_pmu_probe(struct platform_device *pdev) +{ + struct device *dev =3D &pdev->dev; + struct acpi_device *acpi_dev; + struct cmem_lat_pmu *cmem_lat_pmu; + char *name, *uid_str; + int ret, i; + u32 socket; + + acpi_dev =3D ACPI_COMPANION(dev); + if (!acpi_dev) + return -ENODEV; + + uid_str =3D acpi_device_uid(acpi_dev); + if (!uid_str) + return -ENODEV; + + ret =3D kstrtou32(uid_str, 0, &socket); + if (ret) + return ret; + + cmem_lat_pmu =3D devm_kzalloc(dev, sizeof(*cmem_lat_pmu), GFP_KERNEL); + name =3D devm_kasprintf(dev, GFP_KERNEL, 
"nvidia_cmem_latency_pmu_%u", so= cket); + if (!cmem_lat_pmu || !name) + return -ENOMEM; + + cmem_lat_pmu->dev =3D dev; + cmem_lat_pmu->name =3D name; + cmem_lat_pmu->identifier =3D acpi_device_hid(acpi_dev); + platform_set_drvdata(pdev, cmem_lat_pmu); + + cmem_lat_pmu->pmu =3D (struct pmu) { + .parent =3D &pdev->dev, + .task_ctx_nr =3D perf_invalid_context, + .pmu_enable =3D cmem_lat_pmu_enable, + .pmu_disable =3D cmem_lat_pmu_disable, + .event_init =3D cmem_lat_pmu_event_init, + .add =3D cmem_lat_pmu_add, + .del =3D cmem_lat_pmu_del, + .start =3D cmem_lat_pmu_start, + .stop =3D cmem_lat_pmu_stop, + .read =3D cmem_lat_pmu_read, + .attr_groups =3D cmem_lat_pmu_attr_groups, + .capabilities =3D PERF_PMU_CAP_NO_EXCLUDE | + PERF_PMU_CAP_NO_INTERRUPT, + }; + + /* Map the address of all the instances plus one for the broadcast. */ + for (i =3D 0; i < NUM_INSTANCES + 1; i++) { + cmem_lat_pmu->base[i] =3D devm_platform_ioremap_resource(pdev, i); + if (IS_ERR(cmem_lat_pmu->base[i])) { + dev_err(dev, "Failed map address for instance %d\n", i); + return PTR_ERR(cmem_lat_pmu->base[i]); + } + } + + ret =3D cmem_lat_pmu_get_cpus(cmem_lat_pmu, socket); + if (ret) + return ret; + + ret =3D cpuhp_state_add_instance(cmem_lat_pmu_cpuhp_state, + &cmem_lat_pmu->node); + if (ret) { + dev_err(&pdev->dev, "Error %d registering hotplug\n", ret); + return ret; + } + + cmem_lat_pmu_cg_ctrl(cmem_lat_pmu, CG_CTRL_ENABLE); + cmem_lat_pmu_ctrl(cmem_lat_pmu, CTRL_CLR); + cmem_lat_pmu_cg_ctrl(cmem_lat_pmu, CG_CTRL_DISABLE); + + ret =3D perf_pmu_register(&cmem_lat_pmu->pmu, name, -1); + if (ret) { + dev_err(&pdev->dev, "Failed to register PMU: %d\n", ret); + cpuhp_state_remove_instance(cmem_lat_pmu_cpuhp_state, + &cmem_lat_pmu->node); + return ret; + } + + dev_dbg(&pdev->dev, "Registered %s PMU\n", name); + + return 0; +} + +static void cmem_lat_pmu_device_remove(struct platform_device *pdev) +{ + struct cmem_lat_pmu *cmem_lat_pmu =3D platform_get_drvdata(pdev); + + perf_pmu_unregister(&cmem_lat_pmu->pmu); + cpuhp_state_remove_instance(cmem_lat_pmu_cpuhp_state, + &cmem_lat_pmu->node); +} + +static const struct acpi_device_id cmem_lat_pmu_acpi_match[] =3D { + { "NVDA2021", }, + { } +}; +MODULE_DEVICE_TABLE(acpi, cmem_lat_pmu_acpi_match); + +static struct platform_driver cmem_lat_pmu_driver =3D { + .driver =3D { + .name =3D "nvidia-t410-cmem-latency-pmu", + .acpi_match_table =3D ACPI_PTR(cmem_lat_pmu_acpi_match), + .suppress_bind_attrs =3D true, + }, + .probe =3D cmem_lat_pmu_probe, + .remove =3D cmem_lat_pmu_device_remove, +}; + +static int __init cmem_lat_pmu_init(void) +{ + int ret; + + ret =3D cpuhp_setup_state_multi(CPUHP_AP_ONLINE_DYN, + "perf/nvidia/cmem_latency:online", + cmem_lat_pmu_cpu_online, + cmem_lat_pmu_cpu_teardown); + if (ret < 0) + return ret; + + cmem_lat_pmu_cpuhp_state =3D ret; + + return platform_driver_register(&cmem_lat_pmu_driver); +} + +static void __exit cmem_lat_pmu_exit(void) +{ + platform_driver_unregister(&cmem_lat_pmu_driver); + cpuhp_remove_multi_state(cmem_lat_pmu_cpuhp_state); +} + +module_init(cmem_lat_pmu_init); +module_exit(cmem_lat_pmu_exit); + +MODULE_LICENSE("GPL"); +MODULE_DESCRIPTION("NVIDIA Tegra410 CPU Memory Latency PMU driver"); +MODULE_AUTHOR("Besar Wicaksono "); --=20 2.43.0 From nobody Mon Jan 26 22:50:07 2026 Received: from PH0PR06CU001.outbound.protection.outlook.com (mail-westus3azon11011004.outbound.protection.outlook.com [40.107.208.4]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org 
(Postfix) with ESMTPS id 3F83634DB6D; Mon, 26 Jan 2026 18:13:23 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=fail smtp.client-ip=40.107.208.4 ARC-Seal: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1769451206; cv=fail; b=QQJciabNN/Yn3MoJQvMdcvjaIT72X1wXE61MhwtPtOQfSEqBM0Kf6M5iK+sCQ0/0y9esrH/slUvF7FNaWjGKdDSmVhpgkvaxupE1nVu8I+8KSfpngvMsKLLonQ9xmPiVxFqFvoyqiMeFzOQREMWrfrQpicjyavIXo68gURFPG5M= ARC-Message-Signature: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1769451206; c=relaxed/simple; bh=yl6zxSw91Yx8qkR2e+gTGSliiz6E7PM+dWAYI8rmG0M=; h=From:To:CC:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=aDzK1RdOHJtw+KgE1rQdcMU2yFEuw9/mC7yF6UXmCbDM7lTN9GGpdQwEzCjbblhqC/fJUaMWEINQ6Cgt5UEQMG5N83uyeGPjDyXvoZBWc+qiiOFJMD0+pk4OKtP0kXAh0ZPYtj8Nxlo6uIh/vC9lqGGjGs9U946Z5S7CKGcOxuQ= ARC-Authentication-Results: i=2; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=nvidia.com; spf=fail smtp.mailfrom=nvidia.com; dkim=pass (2048-bit key) header.d=Nvidia.com header.i=@Nvidia.com header.b=H3TYYh5b; arc=fail smtp.client-ip=40.107.208.4 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=nvidia.com Authentication-Results: smtp.subspace.kernel.org; spf=fail smtp.mailfrom=nvidia.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=Nvidia.com header.i=@Nvidia.com header.b="H3TYYh5b" ARC-Seal: i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none; b=nB7wT8Hk48aF58sQo9mgeFg6tgs6bchx94aa7fPKvV8yDCQXXZwbSSY7nk3bSE9sj3uHgeBIjlydDDeiZYHHu2OnmaamqZbV7OfgaYWjkKZGjVSdicpGYtQuJPsqH1kOQcj2afN3ugFuS1NSHMhAbFLv/lRhjckkQh6gShxTQ5yTKc25wn4TO0cKMrg4DgvTw/D9vR2Vg2H/eGpb2tC1+RWwAancedSG1zsh678fQFqHyvjkWMMMOeNJi3qUOKrPTRZb/L9U+f4sVyHxtNuIOxIUusbGI8ofl9RMWihNxf5sYV9EnMO6y2YNz3Fn3WsAArBMI/gKTv+DhPCQWgHRUg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=nJDSG7d5kYxZnIsI0GgZZi/fMZvdLqZQqBdxLxX0BQM=; b=GDnOJoMNpxctpVey95nIhTCwLSw1lnZnqXq9+lycCFca1sdNchySX4FlVa6AAdDDlTrL8ZDFPYGkIk81B7Y4u65XYLbhdu8Nv1PuSlKd9Erm5abyjo9rJ04RuWdFUEPH4GaHO5JBQqQyseQYz+mKIavjSv4LlO0QgK0hy6rNlBFutjL070C42pGB7HSXc/QEu6iZLFjgWu7mUFd/th8sWlPg1BhOaCdTfbsyIMr/RilddIwc0/UhSwuuGgtFC6HRcWc78ZE9rw4a6xjRVmOKMbb1rl28vqfgUpWgmzlJzqLcgWbc/uudKdek6Pw0jxHt1MBTCza6RsQXS9bc78Lt3g== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass (sender ip is 216.228.118.233) smtp.rcpttodomain=kernel.org smtp.mailfrom=nvidia.com; dmarc=pass (p=reject sp=reject pct=100) action=none header.from=nvidia.com; dkim=none (message not signed); arc=none (0) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=Nvidia.com; s=selector2; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=nJDSG7d5kYxZnIsI0GgZZi/fMZvdLqZQqBdxLxX0BQM=; b=H3TYYh5bOidQHOW5plXj7oZBYwXxX2guAxAO/O138xSf5Uc5GHW5CmJ6lNxqwAufKgHwOD2fBuyDKRoIhmpQ3GfXuPq86RgJLA7zpRVBlOk9QytgBUlqo4HA34oZWdTHGxAbuW7hRhNu/m2bGZ1s8HrVxeyQILorfmgogv+TDvhwYMfIlyDaYUwdKTwt4qhkr9LDBuhqizkln1dzjPNZkFLCOqs9sXKAZ0APnWKF4SCFDlWG3A01Ery2HD6pfxLC9k4o+2ERwj+qHdiH5wNH97teZBb1gIVviYbNAaSoBh9mytTzf1iZuqkRFVJPyIgEVo3zUM47CVPM+b1aO5c0lw== Received: from SJ0PR05CA0206.namprd05.prod.outlook.com (2603:10b6:a03:330::31) by LV8PR12MB9714.namprd12.prod.outlook.com (2603:10b6:408:2a0::5) with Microsoft SMTP Server 
(version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.9542.16; Mon, 26 Jan 2026 18:13:11 +0000 Received: from SJ5PEPF000001F1.namprd05.prod.outlook.com (2603:10b6:a03:330:cafe::aa) by SJ0PR05CA0206.outlook.office365.com (2603:10b6:a03:330::31) with Microsoft SMTP Server (version=TLS1_3, cipher=TLS_AES_256_GCM_SHA384) id 15.20.9564.7 via Frontend Transport; Mon, 26 Jan 2026 18:13:02 +0000 X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 216.228.118.233) smtp.mailfrom=nvidia.com; dkim=none (message not signed) header.d=none;dmarc=pass action=none header.from=nvidia.com; Received-SPF: Pass (protection.outlook.com: domain of nvidia.com designates 216.228.118.233 as permitted sender) receiver=protection.outlook.com; client-ip=216.228.118.233; helo=mail.nvidia.com; pr=C Received: from mail.nvidia.com (216.228.118.233) by SJ5PEPF000001F1.mail.protection.outlook.com (10.167.242.69) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.9564.3 via Frontend Transport; Mon, 26 Jan 2026 18:13:11 +0000 Received: from drhqmail203.nvidia.com (10.126.190.182) by mail.nvidia.com (10.127.129.6) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.2562.20; Mon, 26 Jan 2026 10:13:00 -0800 Received: from drhqmail201.nvidia.com (10.126.190.180) by drhqmail203.nvidia.com (10.126.190.182) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.2562.20; Mon, 26 Jan 2026 10:12:59 -0800 Received: from build-bwicaksono-noble-20251018.internal (10.127.8.11) by mail.nvidia.com (10.126.190.180) with Microsoft SMTP Server id 15.2.2562.20 via Frontend Transport; Mon, 26 Jan 2026 10:12:59 -0800 From: Besar Wicaksono To: , , , CC: , , , , , , , , , , , , , Besar Wicaksono Subject: [PATCH 7/8] perf: add NVIDIA Tegra410 C2C PMU Date: Mon, 26 Jan 2026 18:11:54 +0000 Message-ID: <20260126181155.2776097-8-bwicaksono@nvidia.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20260126181155.2776097-1-bwicaksono@nvidia.com> References: <20260126181155.2776097-1-bwicaksono@nvidia.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-NV-OnPremToCloud: ExternallySecured X-EOPAttributedMessage: 0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: SJ5PEPF000001F1:EE_|LV8PR12MB9714:EE_ X-MS-Office365-Filtering-Correlation-Id: 7dca0bf2-8639-464b-d275-08de5d0690b1 X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0;ARA:13230040|1800799024|36860700013|82310400026|376014; X-Microsoft-Antispam-Message-Info: =?us-ascii?Q?OSM/+Y82C7Nzw4irKb1L6Jj1Ed0iBW/+kZeWxPd9Lohp54i5o3FZwoVuk6af?= =?us-ascii?Q?GBLThktB+MgOZRSY4FCFXxTOhO8yZhEu3A0AwmDFIjE9dnEt173T5fYnxwUh?= =?us-ascii?Q?B9z9eaj7eBmpS+BfhyHmIu1U0atev6cBspOwcytGWHG5SND3ySN9GkUYW56e?= =?us-ascii?Q?qnPBLxvJ3RfItU+wIIMYeAiCLiG7CD62NptdvnNUzAAyi2YaumOeVOfFDJxo?= =?us-ascii?Q?wdejA1mxCcFo6MeLA/LWY5V3sG4Svr/PUzZkhlmwiWHk9HtUwBrUZrrOfVFo?= =?us-ascii?Q?6dG2BGSIW5hocwbkkJ4Wt3BDkiEsRvBmiEYmpyEZHHYfrwcfiaP/LFjfteTm?= =?us-ascii?Q?fKMWuvjanMIT3sQXR8yCmvOILszmwdqTsgD3sU+ZK40LDFDx/olei11J1dmB?= =?us-ascii?Q?2cD78K/1D0SveEsaPXlq4mUjlHpAnOE2rP0NmU3yb3KN/Q3GxdAJyCi3UjtM?= =?us-ascii?Q?6t2tE3OZfB8SjQQX/kpWDUwtkFs9uDtNSrlQVY0m/JaXojKf+X5nDl4Doqei?= =?us-ascii?Q?wjbdVEtY06WOl2ALNpMVqd0CQVfogDiQnkkwEVJGZmqNHfSgc81Dab1FifoC?= 
=?us-ascii?Q?KE5e8Q0dYADgBT7kRASpfgs3U+1VJpzp8p969dJgL6CACytGGavT8NLxhsqS?= =?us-ascii?Q?HTjlCDCk2GgpV96TxevaskV/OHaSH6HF9AZFfGTpuEyFmyqE4rmsx5sAfNuu?= =?us-ascii?Q?Us3g6XxhklT/I43hcudtemO8W1LocS2PF6YgYIb5lTFuFshgMlobCLpXHD0f?= =?us-ascii?Q?tytX9JRwsQ7tj2yZOJVhboxEq568SRaxWO+pjD75Gufw4Ekbqk9R3Vo+uZ//?= =?us-ascii?Q?gXYewchMk3ZZHg9lejeZGfh894vEJjncDkX+KmWpQqsnbLI3qFy6i3wFvFB7?= =?us-ascii?Q?78zbWmL9++Ku2kozlLDLrezptwhRogylrC4EvMxVrGwAys7SUxyAbc6w/9D9?= =?us-ascii?Q?84988+18zJktS3kqHPGZCeNngWCHR3z+s2/8jQ/Cw4wRPKGQ7J0K0bzzw06p?= =?us-ascii?Q?94/b4THUpJep7IwUkI8wxbxpEuc/0C1RTy/VnYYm5dLuir5IRDCPR1ac3wRq?= =?us-ascii?Q?0nxDNbqWdCNFRNQf4NqLsyh1dsOoWjROi79bAF0UaeLIF1eEcamr9aL0R4ns?= =?us-ascii?Q?p/+QreL8tniNBLytQ0ysH3y0QmzfJaQVJjra3bODorVVaFtNduNFY4poHd3W?= =?us-ascii?Q?1qqrHOa56hAg7UJXsTglkx/ts0kO8RyXCJFsDIepV5HuMlR4fjIHCkC63oY2?= =?us-ascii?Q?+YEDM64eULTggF222SeVaE6oOGoL3LpY3EyxyD74HkhTxqhWogXEM6pfeGLO?= =?us-ascii?Q?8Jq7s4Amk8e3BRHmBCb2xc89uKoxYNh0dHobx/SbfmthQSYyZe7ciOZL/aSW?= =?us-ascii?Q?GM7s0zu0SDtSxCxaZuOD25KVvIN+cJPypbHUXvx1fS3Mf19D9Elg01zt4dMI?= =?us-ascii?Q?rbVwHP+YDFnM5QGPZLdQmRk09as9yJhUn+hbO5Vk14jFhIGDzACkTQHMxNQV?= =?us-ascii?Q?mE/13K3XyWraYKouCEDJDCsET0SmxXuORWWlCJzO+KGaUjjUigZGuVwlvld1?= =?us-ascii?Q?TFguVTWY60qhv9NfCnHyLWMiQ+2Kmr/x7VKv6gILDzRqzlBQYC1SeSe+y3U7?= =?us-ascii?Q?39yqbqxQE292SnL+FbLfAdcwl55oxNO3EXFt6P1qgZ8UQ0eXNT+NvSoWKRPf?= =?us-ascii?Q?ACPGoQ=3D=3D?= X-Forefront-Antispam-Report: CIP:216.228.118.233;CTRY:US;LANG:en;SCL:1;SRV:;IPV:NLI;SFV:NSPM;H:mail.nvidia.com;PTR:dc7edge2.nvidia.com;CAT:NONE;SFS:(13230040)(1800799024)(36860700013)(82310400026)(376014);DIR:OUT;SFP:1101; X-OriginatorOrg: Nvidia.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 26 Jan 2026 18:13:11.1611 (UTC) X-MS-Exchange-CrossTenant-Network-Message-Id: 7dca0bf2-8639-464b-d275-08de5d0690b1 X-MS-Exchange-CrossTenant-Id: 43083d15-7273-40c1-b7db-39efd9ccc17a X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=43083d15-7273-40c1-b7db-39efd9ccc17a;Ip=[216.228.118.233];Helo=[mail.nvidia.com] X-MS-Exchange-CrossTenant-AuthSource: SJ5PEPF000001F1.namprd05.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Anonymous X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem X-MS-Exchange-Transport-CrossTenantHeadersStamped: LV8PR12MB9714 Content-Type: text/plain; charset="utf-8" Adds NVIDIA C2C PMU support in Tegra410 SOC. Signed-off-by: Besar Wicaksono --- .../admin-guide/perf/nvidia-tegra410-pmu.rst | 151 +++ drivers/perf/Kconfig | 7 + drivers/perf/Makefile | 1 + drivers/perf/nvidia_t410_c2c_pmu.c | 1061 +++++++++++++++++ 4 files changed, 1220 insertions(+) create mode 100644 drivers/perf/nvidia_t410_c2c_pmu.c diff --git a/Documentation/admin-guide/perf/nvidia-tegra410-pmu.rst b/Docum= entation/admin-guide/perf/nvidia-tegra410-pmu.rst index 11fc1c88346a..f81f356debe1 100644 --- a/Documentation/admin-guide/perf/nvidia-tegra410-pmu.rst +++ b/Documentation/admin-guide/perf/nvidia-tegra410-pmu.rst @@ -9,6 +9,9 @@ metrics like memory bandwidth, latency, and utilization: * PCIE * PCIE-TGT * CPU Memory (CMEM) Latency +* NVLink-C2C +* NV-CLink +* NV-DLink =20 PMU Driver ---------- @@ -367,3 +370,151 @@ see /sys/bus/event_source/devices/nvidia_cmem_latency= _pmu_. Example usage:: =20 perf stat -a -e '{nvidia_cmem_latency_pmu_0/rd_req/,nvidia_cmem_latency_= pmu_0/rd_cum_outs/,nvidia_cmem_latency_pmu_0/cycles/}' + +NVLink-C2C PMU +-------------- + +This PMU monitors latency events of memory read/write requests that pass t= hrough +the NVIDIA Chip-to-Chip (C2C) interface. 
Bandwidth events are not available +in this PMU, unlike the C2C PMU in Grace (Tegra241 SoC). + +The events and configuration options of this PMU device are available in s= ysfs, +see /sys/bus/event_source/devices/nvidia_nvlink_c2c_pmu_. + +The list of events: + + * IN_RD_CUM_OUTS: accumulated outstanding request (in cycles) of incomin= g read requests. + * IN_RD_REQ: the number of incoming read requests. + * IN_WR_CUM_OUTS: accumulated outstanding request (in cycles) of incomin= g write requests. + * IN_WR_REQ: the number of incoming write requests. + * OUT_RD_CUM_OUTS: accumulated outstanding request (in cycles) of outgoi= ng read requests. + * OUT_RD_REQ: the number of outgoing read requests. + * OUT_WR_CUM_OUTS: accumulated outstanding request (in cycles) of outgoi= ng write requests. + * OUT_WR_REQ: the number of outgoing write requests. + * CYCLES: NVLink-C2C interface cycle counts. + +The incoming events count the reads/writes from remote device to the SoC. +The outgoing events count the reads/writes from the SoC to remote device. + +The sysfs /sys/bus/event_source/devices/nvidia_nvlink_c2c_pmu_/= peer +contains the information about the connected device. + +When the C2C interface is connected to GPU(s), the user can use the +"gpu_mask" parameter to filter traffic to/from specific GPU(s). Each bit r= epresents the GPU +index, e.g. "gpu_mask=3D0x1" corresponds to GPU 0 and "gpu_mask=3D0x3" is = for GPU 0 and 1. +The PMU will monitor all GPUs by default if not specified. + +When connected to another SoC, only the read events are available. + +The events can be used to calculate the average latency of the read/write = requests:: + + C2C_FREQ_IN_GHZ =3D CYCLES / ELAPSED_TIME_IN_NS + + IN_RD_AVG_LATENCY_IN_CYCLES =3D IN_RD_CUM_OUTS / IN_RD_REQ + IN_RD_AVG_LATENCY_IN_NS =3D IN_RD_AVG_LATENCY_IN_CYCLES / C2C_FREQ_IN_G= HZ + + IN_WR_AVG_LATENCY_IN_CYCLES =3D IN_WR_CUM_OUTS / IN_WR_REQ + IN_WR_AVG_LATENCY_IN_NS =3D IN_WR_AVG_LATENCY_IN_CYCLES / C2C_FREQ_IN_G= HZ + + OUT_RD_AVG_LATENCY_IN_CYCLES =3D OUT_RD_CUM_OUTS / OUT_RD_REQ + OUT_RD_AVG_LATENCY_IN_NS =3D OUT_RD_AVG_LATENCY_IN_CYCLES / C2C_FREQ_IN= _GHZ + + OUT_WR_AVG_LATENCY_IN_CYCLES =3D OUT_WR_CUM_OUTS / OUT_WR_REQ + OUT_WR_AVG_LATENCY_IN_NS =3D OUT_WR_AVG_LATENCY_IN_CYCLES / C2C_FREQ_IN= _GHZ + +Example usage: + + * Count incoming traffic from all GPUs connected via NVLink-C2C:: + + perf stat -a -e nvidia_nvlink_c2c_pmu_0/in_rd_req/ + + * Count incoming traffic from GPU 0 connected via NVLink-C2C:: + + perf stat -a -e nvidia_nvlink_c2c_pmu_0/in_rd_cum_outs,gpu_mask=3D0x= 1/ + + * Count incoming traffic from GPU 1 connected via NVLink-C2C:: + + perf stat -a -e nvidia_nvlink_c2c_pmu_0/in_rd_cum_outs,gpu_mask=3D0x= 2/ + + * Count outgoing traffic to all GPUs connected via NVLink-C2C:: + + perf stat -a -e nvidia_nvlink_c2c_pmu_0/out_rd_req/ + + * Count outgoing traffic to GPU 0 connected via NVLink-C2C:: + + perf stat -a -e nvidia_nvlink_c2c_pmu_0/out_rd_cum_outs,gpu_mask=3D0= x1/ + + * Count outgoing traffic to GPU 1 connected via NVLink-C2C:: + + perf stat -a -e nvidia_nvlink_c2c_pmu_0/out_rd_cum_outs,gpu_mask=3D0= x2/ + +NV-CLink PMU +------------ + +This PMU monitors latency events of memory read requests that pass through +the NV-CLINK interface. Bandwidth events are not available in this PMU. +In Tegra410 SoC, the NV-CLink interface is used to connect to another Tegr= a410 +SoC and this PMU only counts read traffic. 
+ +The events and configuration options of this PMU device are available in s= ysfs, +see /sys/bus/event_source/devices/nvidia_nvclink_pmu_. + +The list of events: + + * IN_RD_CUM_OUTS: accumulated outstanding request (in cycles) of incomin= g read requests. + * IN_RD_REQ: the number of incoming read requests. + * OUT_RD_CUM_OUTS: accumulated outstanding request (in cycles) of outgoi= ng read requests. + * OUT_RD_REQ: the number of outgoing read requests. + * CYCLES: NV-CLINK interface cycle counts. + +The incoming events count the reads from remote device to the SoC. +The outgoing events count the reads from the SoC to remote device. + +The events can be used to calculate the average latency of the read reques= ts:: + + CLINK_FREQ_IN_GHZ =3D CYCLES / ELAPSED_TIME_IN_NS + + IN_RD_AVG_LATENCY_IN_CYCLES =3D IN_RD_CUM_OUTS / IN_RD_REQ + IN_RD_AVG_LATENCY_IN_NS =3D IN_RD_AVG_LATENCY_IN_CYCLES / CLINK_FREQ_IN= _GHZ + + OUT_RD_AVG_LATENCY_IN_CYCLES =3D OUT_RD_CUM_OUTS / OUT_RD_REQ + OUT_RD_AVG_LATENCY_IN_NS =3D OUT_RD_AVG_LATENCY_IN_CYCLES / CLINK_FREQ_= IN_GHZ + +Example usage: + + * Count incoming read traffic from remote SoC connected via NV-CLINK:: + + perf stat -a -e nvidia_nvclink_pmu_0/in_rd_req/ + + * Count outgoing read traffic to remote SoC connected via NV-CLINK:: + + perf stat -a -e nvidia_nvclink_pmu_0/out_rd_req/ + +NV-DLink PMU +------------ + +This PMU monitors latency events of memory read requests that pass through +the NV-DLINK interface. Bandwidth events are not available in this PMU. +In Tegra410 SoC, this PMU only counts CXL memory read traffic. + +The events and configuration options of this PMU device are available in s= ysfs, +see /sys/bus/event_source/devices/nvidia_nvdlink_pmu_. + +The list of events: + + * IN_RD_CUM_OUTS: accumulated outstanding read requests (in cycles) to C= XL memory. + * IN_RD_REQ: the number of read requests to CXL memory. + * CYCLES: NV-DLINK interface cycle counts. + +The events can be used to calculate the average latency of the read reques= ts:: + + DLINK_FREQ_IN_GHZ =3D CYCLES / ELAPSED_TIME_IN_NS + + IN_RD_AVG_LATENCY_IN_CYCLES =3D IN_RD_CUM_OUTS / IN_RD_REQ + IN_RD_AVG_LATENCY_IN_NS =3D IN_RD_AVG_LATENCY_IN_CYCLES / DLINK_FREQ_IN= _GHZ + +Example usage: + + * Count read events to CXL memory:: + + perf stat -a -e '{nvidia_nvdlink_pmu_0/in_rd_req/,nvidia_nvdlink_pmu= _0/in_rd_cum_outs/}' diff --git a/drivers/perf/Kconfig b/drivers/perf/Kconfig index 9fed3c41d5ea..7ee36efe6bc0 100644 --- a/drivers/perf/Kconfig +++ b/drivers/perf/Kconfig @@ -318,4 +318,11 @@ config NVIDIA_TEGRA410_CMEM_LATENCY_PMU Enable perf support for CPU memory latency counters monitoring on NVIDIA Tegra410 SoC. =20 +config NVIDIA_TEGRA410_C2C_PMU + tristate "NVIDIA Tegra410 C2C PMU" + depends on ARM64 && ACPI + help + Enable perf support for counters in NVIDIA C2C interface of NVIDIA + Tegra410 SoC. 
+ endmenu diff --git a/drivers/perf/Makefile b/drivers/perf/Makefile index 4aa6aad393c2..eb8a022dad9a 100644 --- a/drivers/perf/Makefile +++ b/drivers/perf/Makefile @@ -36,3 +36,4 @@ obj-$(CONFIG_ARM_CORESIGHT_PMU_ARCH_SYSTEM_PMU) +=3D arm_= cspmu/ obj-$(CONFIG_MESON_DDR_PMU) +=3D amlogic/ obj-$(CONFIG_CXL_PMU) +=3D cxl_pmu.o obj-$(CONFIG_NVIDIA_TEGRA410_CMEM_LATENCY_PMU) +=3D nvidia_t410_cmem_laten= cy_pmu.o +obj-$(CONFIG_NVIDIA_TEGRA410_C2C_PMU) +=3D nvidia_t410_c2c_pmu.o diff --git a/drivers/perf/nvidia_t410_c2c_pmu.c b/drivers/perf/nvidia_t410_= c2c_pmu.c new file mode 100644 index 000000000000..362e0e5f8b24 --- /dev/null +++ b/drivers/perf/nvidia_t410_c2c_pmu.c @@ -0,0 +1,1061 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * NVIDIA Tegra410 C2C PMU driver. + * + * Copyright (c) 2026, NVIDIA CORPORATION & AFFILIATES. All rights reserve= d. + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +/* The C2C interface types in Tegra410. */ +#define C2C_TYPE_NVLINK 0x0 +#define C2C_TYPE_NVCLINK 0x1 +#define C2C_TYPE_NVDLINK 0x2 +#define C2C_TYPE_COUNT 0x3 + +/* The type of the peer device connected to the C2C interface. */ +#define C2C_PEER_TYPE_CPU 0x0 +#define C2C_PEER_TYPE_GPU 0x1 +#define C2C_PEER_TYPE_CXLMEM 0x2 +#define C2C_PEER_TYPE_COUNT 0x3 + +/* The number of peer devices can be connected to the C2C interface. */ +#define C2C_NR_PEER_CPU 0x1 +#define C2C_NR_PEER_GPU 0x2 +#define C2C_NR_PEER_CXLMEM 0x1 +#define C2C_NR_PEER_MAX 0x2 + +/* Number of instances on each interface. */ +#define C2C_NR_INST_NVLINK 14 +#define C2C_NR_INST_NVCLINK 12 +#define C2C_NR_INST_NVDLINK 16 +#define C2C_NR_INST_MAX 16 + +/* Register offsets. */ +#define C2C_CTRL 0x864 +#define C2C_IN_STATUS 0x868 +#define C2C_CYCLE_CNTR 0x86c +#define C2C_IN_RD_CUM_OUTS_CNTR 0x874 +#define C2C_IN_RD_REQ_CNTR 0x87c +#define C2C_IN_WR_CUM_OUTS_CNTR 0x884 +#define C2C_IN_WR_REQ_CNTR 0x88c +#define C2C_OUT_STATUS 0x890 +#define C2C_OUT_RD_CUM_OUTS_CNTR 0x898 +#define C2C_OUT_RD_REQ_CNTR 0x8a0 +#define C2C_OUT_WR_CUM_OUTS_CNTR 0x8a8 +#define C2C_OUT_WR_REQ_CNTR 0x8b0 + +/* C2C_IN_STATUS register field. */ +#define C2C_IN_STATUS_CYCLE_OVF BIT(0) +#define C2C_IN_STATUS_IN_RD_CUM_OUTS_OVF BIT(1) +#define C2C_IN_STATUS_IN_RD_REQ_OVF BIT(2) +#define C2C_IN_STATUS_IN_WR_CUM_OUTS_OVF BIT(3) +#define C2C_IN_STATUS_IN_WR_REQ_OVF BIT(4) + +/* C2C_OUT_STATUS register field. */ +#define C2C_OUT_STATUS_OUT_RD_CUM_OUTS_OVF BIT(0) +#define C2C_OUT_STATUS_OUT_RD_REQ_OVF BIT(1) +#define C2C_OUT_STATUS_OUT_WR_CUM_OUTS_OVF BIT(2) +#define C2C_OUT_STATUS_OUT_WR_REQ_OVF BIT(3) + +/* Events. */ +#define C2C_EVENT_CYCLES 0x0 +#define C2C_EVENT_IN_RD_CUM_OUTS 0x1 +#define C2C_EVENT_IN_RD_REQ 0x2 +#define C2C_EVENT_IN_WR_CUM_OUTS 0x3 +#define C2C_EVENT_IN_WR_REQ 0x4 +#define C2C_EVENT_OUT_RD_CUM_OUTS 0x5 +#define C2C_EVENT_OUT_RD_REQ 0x6 +#define C2C_EVENT_OUT_WR_CUM_OUTS 0x7 +#define C2C_EVENT_OUT_WR_REQ 0x8 + +#define C2C_NUM_EVENTS 0x9 +#define C2C_MASK_EVENT 0xFF +#define C2C_MAX_ACTIVE_EVENTS 32 + +#define C2C_ACTIVE_CPU_MASK 0x0 +#define C2C_ASSOCIATED_CPU_MASK 0x1 + +/* + * Maximum poll count for reading counter value using high-low-high sequen= ce. + */ +#define HILOHI_MAX_POLL 1000 + +static unsigned long nv_c2c_pmu_cpuhp_state; + +/* PMU descriptor. */ + +/* Tracks the events assigned to the PMU for a given logical index. */ +struct nv_c2c_pmu_hw_events { + /* The events that are active. 
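+ * Indexed by the logical counter id kept in hwc->idx.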
*/ + struct perf_event *events[C2C_MAX_ACTIVE_EVENTS]; + + /* + * Each bit indicates a logical counter is being used (or not) for an + * event. + */ + DECLARE_BITMAP(used_ctrs, C2C_MAX_ACTIVE_EVENTS); +}; + +struct nv_c2c_pmu { + struct pmu pmu; + struct device *dev; + struct acpi_device *acpi_dev; + + const char *name; + const char *identifier; + + unsigned int c2c_type; + unsigned int peer_type; + unsigned int socket; + unsigned int nr_inst; + unsigned int nr_peer; + unsigned long peer_insts[C2C_NR_PEER_MAX][BITS_TO_LONGS(C2C_NR_INST_MAX)]; + u32 filter_default; + + struct nv_c2c_pmu_hw_events hw_events; + + cpumask_t associated_cpus; + cpumask_t active_cpu; + + struct hlist_node cpuhp_node; + + struct attribute **formats; + const struct attribute_group *attr_groups[6]; + + void __iomem *base_broadcast; + void __iomem *base[C2C_NR_INST_MAX]; +}; + +#define to_c2c_pmu(p) (container_of(p, struct nv_c2c_pmu, pmu)) + +/* Get event type from perf_event. */ +static inline u32 get_event_type(struct perf_event *event) +{ + return (event->attr.config) & C2C_MASK_EVENT; +} + +static inline u32 get_filter_mask(struct perf_event *event) +{ + u32 filter; + struct nv_c2c_pmu *c2c_pmu =3D to_c2c_pmu(event->pmu); + + filter =3D ((u32)event->attr.config1) & c2c_pmu->filter_default; + if (filter =3D=3D 0) + filter =3D c2c_pmu->filter_default; + + return filter; +} + +/* PMU operations. */ + +static int nv_c2c_pmu_get_event_idx(struct nv_c2c_pmu_hw_events *hw_events, + struct perf_event *event) +{ + u32 idx; + + idx =3D find_first_zero_bit(hw_events->used_ctrs, C2C_MAX_ACTIVE_EVENTS); + if (idx >=3D C2C_MAX_ACTIVE_EVENTS) + return -EAGAIN; + + set_bit(idx, hw_events->used_ctrs); + + return idx; +} + +static bool +nv_c2c_pmu_validate_event(struct pmu *pmu, + struct nv_c2c_pmu_hw_events *hw_events, + struct perf_event *event) +{ + if (is_software_event(event)) + return true; + + /* Reject groups spanning multiple HW PMUs. */ + if (event->pmu !=3D pmu) + return false; + + return nv_c2c_pmu_get_event_idx(hw_events, event) >=3D 0; +} + +/* + * Make sure the group of events can be scheduled at once + * on the PMU. + */ +static bool nv_c2c_pmu_validate_group(struct perf_event *event) +{ + struct perf_event *sibling, *leader =3D event->group_leader; + struct nv_c2c_pmu_hw_events fake_hw_events; + + if (event->group_leader =3D=3D event) + return true; + + memset(&fake_hw_events, 0, sizeof(fake_hw_events)); + + if (!nv_c2c_pmu_validate_event(event->pmu, &fake_hw_events, leader)) + return false; + + for_each_sibling_event(sibling, leader) { + if (!nv_c2c_pmu_validate_event(event->pmu, &fake_hw_events, + sibling)) + return false; + } + + return nv_c2c_pmu_validate_event(event->pmu, &fake_hw_events, event); +} + +static int nv_c2c_pmu_event_init(struct perf_event *event) +{ + struct nv_c2c_pmu *c2c_pmu =3D to_c2c_pmu(event->pmu); + struct hw_perf_event *hwc =3D &event->hw; + u32 event_type =3D get_event_type(event); + + if (event->attr.type !=3D event->pmu->type || + event_type >=3D C2C_NUM_EVENTS) + return -ENOENT; + + /* + * Following other "uncore" PMUs, we do not support sampling mode or + * attach to a task (per-process mode). + */ + if (is_sampling_event(event)) { + dev_dbg(c2c_pmu->pmu.dev, "Can't support sampling events\n"); + return -EOPNOTSUPP; + } + + if (event->cpu < 0 || event->attach_state & PERF_ATTACH_TASK) { + dev_dbg(c2c_pmu->pmu.dev, "Can't support per-task counters\n"); + return -EINVAL; + } + + /* + * Make sure the CPU assignment is on one of the CPUs associated with + * this PMU. 
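+ * Associated CPUs are those whose NUMA node matches this PMU's socket, see nv_c2c_pmu_get_cpus().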
+ */ + if (!cpumask_test_cpu(event->cpu, &c2c_pmu->associated_cpus)) { + dev_dbg(c2c_pmu->pmu.dev, + "Requested cpu is not associated with the PMU\n"); + return -EINVAL; + } + + /* Enforce the current active CPU to handle the events in this PMU. */ + event->cpu =3D cpumask_first(&c2c_pmu->active_cpu); + if (event->cpu >=3D nr_cpu_ids) + return -EINVAL; + + if (!nv_c2c_pmu_validate_group(event)) + return -EINVAL; + + hwc->idx =3D -1; + hwc->config =3D event_type; + + return 0; +} + +/* + * Read 64-bit register as a pair of 32-bit registers using hi-lo-hi seque= nce. + */ +static u64 read_reg64_hilohi(const void __iomem *addr, u32 max_poll_count) +{ + u32 val_lo, val_hi; + u64 val; + + /* Use high-low-high sequence to avoid tearing */ + do { + if (max_poll_count-- =3D=3D 0) { + pr_err("NV C2C PMU: timeout hi-low-high sequence\n"); + return 0; + } + + val_hi =3D readl(addr + 4); + val_lo =3D readl(addr); + } while (val_hi !=3D readl(addr + 4)); + + val =3D (((u64)val_hi << 32) | val_lo); + + return val; +} + +static void nv_c2c_pmu_check_status(struct nv_c2c_pmu *c2c_pmu, u32 instan= ce) +{ + u32 in_status, out_status; + + in_status =3D readl(c2c_pmu->base[instance] + C2C_IN_STATUS); + out_status =3D readl(c2c_pmu->base[instance] + C2C_OUT_STATUS); + + if (in_status || out_status) + dev_warn(c2c_pmu->dev, + "C2C PMU overflow in: 0x%x, out: 0x%x\n", + in_status, out_status); +} + +static u32 nv_c2c_ctr_offset[C2C_NUM_EVENTS] =3D { + [C2C_EVENT_CYCLES] =3D C2C_CYCLE_CNTR, + [C2C_EVENT_IN_RD_CUM_OUTS] =3D C2C_IN_RD_CUM_OUTS_CNTR, + [C2C_EVENT_IN_RD_REQ] =3D C2C_IN_RD_REQ_CNTR, + [C2C_EVENT_IN_WR_CUM_OUTS] =3D C2C_IN_WR_CUM_OUTS_CNTR, + [C2C_EVENT_IN_WR_REQ] =3D C2C_IN_WR_REQ_CNTR, + [C2C_EVENT_OUT_RD_CUM_OUTS] =3D C2C_OUT_RD_CUM_OUTS_CNTR, + [C2C_EVENT_OUT_RD_REQ] =3D C2C_OUT_RD_REQ_CNTR, + [C2C_EVENT_OUT_WR_CUM_OUTS] =3D C2C_OUT_WR_CUM_OUTS_CNTR, + [C2C_EVENT_OUT_WR_REQ] =3D C2C_OUT_WR_REQ_CNTR, +}; + +static u64 nv_c2c_pmu_read_counter(struct perf_event *event) +{ + u32 ctr_id, ctr_offset, filter_mask, filter_idx, inst_idx; + unsigned long *inst_mask; + DECLARE_BITMAP(filter_bitmap, C2C_NR_PEER_MAX); + struct nv_c2c_pmu *c2c_pmu =3D to_c2c_pmu(event->pmu); + u64 val =3D 0; + + filter_mask =3D get_filter_mask(event); + bitmap_from_arr32(filter_bitmap, &filter_mask, c2c_pmu->nr_peer); + + ctr_id =3D event->hw.config; + ctr_offset =3D nv_c2c_ctr_offset[ctr_id]; + + for_each_set_bit(filter_idx, filter_bitmap, c2c_pmu->nr_peer) { + inst_mask =3D c2c_pmu->peer_insts[filter_idx]; + for_each_set_bit(inst_idx, inst_mask, c2c_pmu->nr_inst) { + nv_c2c_pmu_check_status(c2c_pmu, inst_idx); + + /* + * Each instance share same clock and the driver always + * enables all instances. So we can use the counts from + * one instance for cycle counter. + */ + if (ctr_id =3D=3D C2C_EVENT_CYCLES) + return read_reg64_hilohi( + c2c_pmu->base[inst_idx] + ctr_offset, + HILOHI_MAX_POLL); + + /* + * For other events, sum up the counts from all instances. 
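+ * Only instances that belong to the peers selected by the config1 filter (gpu_mask, where exposed) are summed; an unset filter selects all peers.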
+ */ + val +=3D read_reg64_hilohi( + c2c_pmu->base[inst_idx] + ctr_offset, + HILOHI_MAX_POLL); + } + } + + return val; +} + +static void nv_c2c_pmu_event_update(struct perf_event *event) +{ + struct hw_perf_event *hwc =3D &event->hw; + u64 prev, now; + + do { + prev =3D local64_read(&hwc->prev_count); + now =3D nv_c2c_pmu_read_counter(event); + } while (local64_cmpxchg(&hwc->prev_count, prev, now) !=3D prev); + + local64_add(now - prev, &event->count); +} + +static void nv_c2c_pmu_start(struct perf_event *event, int pmu_flags) +{ + event->hw.state =3D 0; +} + +static void nv_c2c_pmu_stop(struct perf_event *event, int pmu_flags) +{ + event->hw.state |=3D PERF_HES_STOPPED; +} + +static int nv_c2c_pmu_add(struct perf_event *event, int flags) +{ + struct nv_c2c_pmu *c2c_pmu =3D to_c2c_pmu(event->pmu); + struct nv_c2c_pmu_hw_events *hw_events =3D &c2c_pmu->hw_events; + struct hw_perf_event *hwc =3D &event->hw; + int idx; + + if (WARN_ON_ONCE(!cpumask_test_cpu(smp_processor_id(), + &c2c_pmu->associated_cpus))) + return -ENOENT; + + idx =3D nv_c2c_pmu_get_event_idx(hw_events, event); + if (idx < 0) + return idx; + + hw_events->events[idx] =3D event; + hwc->idx =3D idx; + hwc->state =3D PERF_HES_STOPPED | PERF_HES_UPTODATE; + + if (flags & PERF_EF_START) + nv_c2c_pmu_start(event, PERF_EF_RELOAD); + + /* Propagate changes to the userspace mapping. */ + perf_event_update_userpage(event); + + return 0; +} + +static void nv_c2c_pmu_del(struct perf_event *event, int flags) +{ + struct nv_c2c_pmu *c2c_pmu =3D to_c2c_pmu(event->pmu); + struct nv_c2c_pmu_hw_events *hw_events =3D &c2c_pmu->hw_events; + struct hw_perf_event *hwc =3D &event->hw; + int idx =3D hwc->idx; + + nv_c2c_pmu_stop(event, PERF_EF_UPDATE); + + hw_events->events[idx] =3D NULL; + + clear_bit(idx, hw_events->used_ctrs); + + perf_event_update_userpage(event); +} + +static void nv_c2c_pmu_read(struct perf_event *event) +{ + nv_c2c_pmu_event_update(event); +} + +static void nv_c2c_pmu_enable(struct pmu *pmu) +{ + void __iomem *bcast; + struct nv_c2c_pmu *c2c_pmu =3D to_c2c_pmu(pmu); + + /* Check if any filter is enabled. */ + if (bitmap_empty(c2c_pmu->hw_events.used_ctrs, C2C_MAX_ACTIVE_EVENTS)) + return; + + /* Enable all the counters. */ + bcast =3D c2c_pmu->base_broadcast; + writel(0x1UL, bcast + C2C_CTRL); +} + +static void nv_c2c_pmu_disable(struct pmu *pmu) +{ + unsigned int idx; + void __iomem *bcast; + struct perf_event *event; + struct nv_c2c_pmu *c2c_pmu =3D to_c2c_pmu(pmu); + + /* Disable all the counters. */ + bcast =3D c2c_pmu->base_broadcast; + writel(0x0UL, bcast + C2C_CTRL); + + /* + * The counters will start from 0 again on restart. + * Update the events immediately to avoid losing the counts. + */ + for_each_set_bit(idx, c2c_pmu->hw_events.used_ctrs, + C2C_MAX_ACTIVE_EVENTS) { + event =3D c2c_pmu->hw_events.events[idx]; + + if (!event) + continue; + + nv_c2c_pmu_event_update(event); + + local64_set(&event->hw.prev_count, 0ULL); + } +} + +/* PMU identifier attribute. 
*/ + +static ssize_t nv_c2c_pmu_identifier_show(struct device *dev, + struct device_attribute *attr, + char *page) +{ + struct nv_c2c_pmu *c2c_pmu =3D to_c2c_pmu(dev_get_drvdata(dev)); + + return sysfs_emit(page, "%s\n", c2c_pmu->identifier); +} + +static struct device_attribute nv_c2c_pmu_identifier_attr =3D + __ATTR(identifier, 0444, nv_c2c_pmu_identifier_show, NULL); + +static struct attribute *nv_c2c_pmu_identifier_attrs[] =3D { + &nv_c2c_pmu_identifier_attr.attr, + NULL, +}; + +static struct attribute_group nv_c2c_pmu_identifier_attr_group =3D { + .attrs =3D nv_c2c_pmu_identifier_attrs, +}; + +/* Peer attribute. */ + +static ssize_t nv_c2c_pmu_peer_show(struct device *dev, + struct device_attribute *attr, + char *page) +{ + const char *peer_type[C2C_PEER_TYPE_COUNT] =3D { + [C2C_PEER_TYPE_CPU] =3D "cpu", + [C2C_PEER_TYPE_GPU] =3D "gpu", + [C2C_PEER_TYPE_CXLMEM] =3D "cxlmem", + }; + + struct nv_c2c_pmu *c2c_pmu =3D to_c2c_pmu(dev_get_drvdata(dev)); + return sysfs_emit(page, "nr_%s=3D%u\n", peer_type[c2c_pmu->peer_type], + c2c_pmu->nr_peer); +} + +static struct device_attribute nv_c2c_pmu_peer_attr =3D + __ATTR(peer, 0444, nv_c2c_pmu_peer_show, NULL); + +static struct attribute *nv_c2c_pmu_peer_attrs[] =3D { + &nv_c2c_pmu_peer_attr.attr, + NULL, +}; + +static struct attribute_group nv_c2c_pmu_peer_attr_group =3D { + .attrs =3D nv_c2c_pmu_peer_attrs, +}; + +/* Format attributes. */ + +#define NV_C2C_PMU_EXT_ATTR(_name, _func, _config) \ + (&((struct dev_ext_attribute[]){ \ + { \ + .attr =3D __ATTR(_name, 0444, _func, NULL), \ + .var =3D (void *)_config \ + } \ + })[0].attr.attr) + +#define NV_C2C_PMU_FORMAT_ATTR(_name, _config) \ + NV_C2C_PMU_EXT_ATTR(_name, device_show_string, _config) + +#define NV_C2C_PMU_FORMAT_EVENT_ATTR \ + NV_C2C_PMU_FORMAT_ATTR(event, "config:0-3") + +static struct attribute *nv_c2c_nvlink_pmu_formats[] =3D { + NV_C2C_PMU_FORMAT_EVENT_ATTR, + NV_C2C_PMU_FORMAT_ATTR(gpu_mask, "config1:0-1"), + NULL, +}; + +static struct attribute *nv_c2c_pmu_formats[] =3D { + NV_C2C_PMU_FORMAT_EVENT_ATTR, + NULL, +}; + +static struct attribute_group * +nv_c2c_pmu_alloc_format_attr_group(struct nv_c2c_pmu *c2c_pmu) +{ + struct attribute_group *format_group; + struct device *dev =3D c2c_pmu->dev; + + format_group =3D + devm_kzalloc(dev, sizeof(struct attribute_group), GFP_KERNEL); + if (!format_group) + return NULL; + + format_group->name =3D "format"; + format_group->attrs =3D c2c_pmu->formats; + + return format_group; +} + +/* Event attributes. 
*/ + +static ssize_t nv_c2c_pmu_sysfs_event_show(struct device *dev, + struct device_attribute *attr, + char *buf) +{ + struct perf_pmu_events_attr *pmu_attr; + + pmu_attr =3D container_of(attr, typeof(*pmu_attr), attr); + return sysfs_emit(buf, "event=3D0x%llx\n", pmu_attr->id); +} + +#define NV_C2C_PMU_EVENT_ATTR(_name, _config) \ + PMU_EVENT_ATTR_ID(_name, nv_c2c_pmu_sysfs_event_show, _config) + +static struct attribute *nv_c2c_pmu_events[] =3D { + NV_C2C_PMU_EVENT_ATTR(cycles, C2C_EVENT_CYCLES), + NV_C2C_PMU_EVENT_ATTR(in_rd_cum_outs, C2C_EVENT_IN_RD_CUM_OUTS), + NV_C2C_PMU_EVENT_ATTR(in_rd_req, C2C_EVENT_IN_RD_REQ), + NV_C2C_PMU_EVENT_ATTR(in_wr_cum_outs, C2C_EVENT_IN_WR_CUM_OUTS), + NV_C2C_PMU_EVENT_ATTR(in_wr_req, C2C_EVENT_IN_WR_REQ), + NV_C2C_PMU_EVENT_ATTR(out_rd_cum_outs, C2C_EVENT_OUT_RD_CUM_OUTS), + NV_C2C_PMU_EVENT_ATTR(out_rd_req, C2C_EVENT_OUT_RD_REQ), + NV_C2C_PMU_EVENT_ATTR(out_wr_cum_outs, C2C_EVENT_OUT_WR_CUM_OUTS), + NV_C2C_PMU_EVENT_ATTR(out_wr_req, C2C_EVENT_OUT_WR_REQ), + NULL +}; + +static umode_t +nv_c2c_pmu_event_attr_is_visible(struct kobject *kobj, struct attribute *a= ttr, + int unused) +{ + struct device *dev =3D kobj_to_dev(kobj); + struct nv_c2c_pmu *c2c_pmu =3D to_c2c_pmu(dev_get_drvdata(dev)); + struct perf_pmu_events_attr *eattr; + + eattr =3D container_of(attr, typeof(*eattr), attr.attr); + + if (c2c_pmu->c2c_type =3D=3D C2C_TYPE_NVDLINK) { + /* Only incoming reads are available. */ + switch (eattr->id) { + case C2C_EVENT_IN_WR_CUM_OUTS: + case C2C_EVENT_IN_WR_REQ: + case C2C_EVENT_OUT_RD_CUM_OUTS: + case C2C_EVENT_OUT_RD_REQ: + case C2C_EVENT_OUT_WR_CUM_OUTS: + case C2C_EVENT_OUT_WR_REQ: + return 0; + default: + return attr->mode; + } + } else { + /* Hide the write events if C2C connected to another SoC. */ + if (c2c_pmu->peer_type =3D=3D C2C_PEER_TYPE_CPU) { + switch (eattr->id) { + case C2C_EVENT_IN_WR_CUM_OUTS: + case C2C_EVENT_IN_WR_REQ: + case C2C_EVENT_OUT_WR_CUM_OUTS: + case C2C_EVENT_OUT_WR_REQ: + return 0; + default: + return attr->mode; + } + } + } + + return attr->mode; +} + +static const struct attribute_group nv_c2c_pmu_events_group =3D { + .name =3D "events", + .attrs =3D nv_c2c_pmu_events, + .is_visible =3D nv_c2c_pmu_event_attr_is_visible, +}; + +/* Cpumask attributes. */ + +static ssize_t nv_c2c_pmu_cpumask_show(struct device *dev, + struct device_attribute *attr, char *buf) +{ + struct pmu *pmu =3D dev_get_drvdata(dev); + struct nv_c2c_pmu *c2c_pmu =3D to_c2c_pmu(pmu); + struct dev_ext_attribute *eattr =3D + container_of(attr, struct dev_ext_attribute, attr); + unsigned long mask_id =3D (unsigned long)eattr->var; + const cpumask_t *cpumask; + + switch (mask_id) { + case C2C_ACTIVE_CPU_MASK: + cpumask =3D &c2c_pmu->active_cpu; + break; + case C2C_ASSOCIATED_CPU_MASK: + cpumask =3D &c2c_pmu->associated_cpus; + break; + default: + return 0; + } + return cpumap_print_to_pagebuf(true, buf, cpumask); +} + +#define NV_C2C_PMU_CPUMASK_ATTR(_name, _config) \ + NV_C2C_PMU_EXT_ATTR(_name, nv_c2c_pmu_cpumask_show, \ + (unsigned long)_config) + +static struct attribute *nv_c2c_pmu_cpumask_attrs[] =3D { + NV_C2C_PMU_CPUMASK_ATTR(cpumask, C2C_ACTIVE_CPU_MASK), + NV_C2C_PMU_CPUMASK_ATTR(associated_cpus, C2C_ASSOCIATED_CPU_MASK), + NULL, +}; + +static const struct attribute_group nv_c2c_pmu_cpumask_attr_group =3D { + .attrs =3D nv_c2c_pmu_cpumask_attrs, +}; + +/* Per PMU device attribute groups. 
*/ + +static int nv_c2c_pmu_alloc_attr_groups(struct nv_c2c_pmu *c2c_pmu) +{ + const struct attribute_group **attr_groups =3D c2c_pmu->attr_groups; + + attr_groups[0] =3D nv_c2c_pmu_alloc_format_attr_group(c2c_pmu); + attr_groups[1] =3D &nv_c2c_pmu_events_group; + attr_groups[2] =3D &nv_c2c_pmu_cpumask_attr_group; + attr_groups[3] =3D &nv_c2c_pmu_identifier_attr_group; + attr_groups[4] =3D &nv_c2c_pmu_peer_attr_group; + + if (!attr_groups[0]) + return -ENOMEM; + + return 0; +} + +static int nv_c2c_pmu_online_cpu(unsigned int cpu, struct hlist_node *node) +{ + struct nv_c2c_pmu *c2c_pmu =3D + hlist_entry_safe(node, struct nv_c2c_pmu, cpuhp_node); + + if (!cpumask_test_cpu(cpu, &c2c_pmu->associated_cpus)) + return 0; + + /* If the PMU is already managed, there is nothing to do */ + if (!cpumask_empty(&c2c_pmu->active_cpu)) + return 0; + + /* Use this CPU for event counting */ + cpumask_set_cpu(cpu, &c2c_pmu->active_cpu); + + return 0; +} + +static int nv_c2c_pmu_cpu_teardown(unsigned int cpu, struct hlist_node *no= de) +{ + unsigned int dst; + + struct nv_c2c_pmu *c2c_pmu =3D + hlist_entry_safe(node, struct nv_c2c_pmu, cpuhp_node); + + /* Nothing to do if this CPU doesn't own the PMU */ + if (!cpumask_test_and_clear_cpu(cpu, &c2c_pmu->active_cpu)) + return 0; + + /* Choose a new CPU to migrate ownership of the PMU to */ + dst =3D cpumask_any_and_but(&c2c_pmu->associated_cpus, + cpu_online_mask, cpu); + if (dst >=3D nr_cpu_ids) + return 0; + + /* Use this CPU for event counting */ + perf_pmu_migrate_context(&c2c_pmu->pmu, cpu, dst); + cpumask_set_cpu(dst, &c2c_pmu->active_cpu); + + return 0; +} + +static int nv_c2c_pmu_get_cpus(struct nv_c2c_pmu *c2c_pmu) +{ + int ret =3D 0, socket =3D c2c_pmu->socket, cpu; + + for_each_possible_cpu(cpu) { + if (cpu_to_node(cpu) =3D=3D socket) + cpumask_set_cpu(cpu, &c2c_pmu->associated_cpus); + } + + if (cpumask_empty(&c2c_pmu->associated_cpus)) { + dev_dbg(c2c_pmu->dev, + "No cpu associated with C2C PMU socket-%u\n", socket); + ret =3D -ENODEV; + } + + return ret; +} + +static int nv_c2c_pmu_init_socket(struct nv_c2c_pmu *c2c_pmu) +{ + const char *uid_str; + int ret, socket; + + uid_str =3D acpi_device_uid(c2c_pmu->acpi_dev); + if (!uid_str) { + ret =3D -ENODEV; + goto fail; + } + + ret =3D kstrtou32(uid_str, 0, &socket); + if (ret) + goto fail; + + c2c_pmu->socket =3D socket; + return 0; + +fail: + dev_err(c2c_pmu->dev, "Failed to initialize socket\n"); + return ret; +} + +static int nv_c2c_pmu_init_id(struct nv_c2c_pmu *c2c_pmu) +{ + const char *name_fmt[C2C_TYPE_COUNT] =3D { + [C2C_TYPE_NVLINK] =3D "nvidia_nvlink_c2c_pmu_%u", + [C2C_TYPE_NVCLINK] =3D "nvidia_nvclink_pmu_%u", + [C2C_TYPE_NVDLINK] =3D "nvidia_nvdlink_pmu_%u", + }; + + char *name; + int ret; + + name =3D devm_kasprintf(c2c_pmu->dev, GFP_KERNEL, + name_fmt[c2c_pmu->c2c_type], c2c_pmu->socket); + if (!name) { + ret =3D -ENOMEM; + goto fail; + } + + c2c_pmu->name =3D name; + + c2c_pmu->identifier =3D acpi_device_hid(c2c_pmu->acpi_dev); + + return 0; + +fail: + dev_err(c2c_pmu->dev, "Failed to initialize name\n"); + return ret; +} + +static int nv_c2c_pmu_init_filter(struct nv_c2c_pmu *c2c_pmu) +{ + u32 cpu_en =3D 0; + struct device *dev =3D c2c_pmu->dev; + + if (c2c_pmu->c2c_type =3D=3D C2C_TYPE_NVDLINK) { + c2c_pmu->peer_type =3D C2C_PEER_TYPE_CXLMEM; + + c2c_pmu->nr_inst =3D C2C_NR_INST_NVDLINK; + c2c_pmu->peer_insts[0][0] =3D (1UL << c2c_pmu->nr_inst) - 1; + + c2c_pmu->nr_peer =3D C2C_NR_PEER_CXLMEM; + c2c_pmu->filter_default =3D (1 << c2c_pmu->nr_peer) - 1; + + c2c_pmu->formats =3D 
nv_c2c_pmu_formats; + + return 0; + } + + c2c_pmu->nr_inst =3D (c2c_pmu->c2c_type =3D=3D C2C_TYPE_NVLINK) ? + C2C_NR_INST_NVLINK : C2C_NR_INST_NVCLINK; + + if (device_property_read_u32(dev, "cpu_en_mask", &cpu_en)) + dev_dbg(dev, "no cpu_en_mask property\n"); + + if (cpu_en) { + c2c_pmu->peer_type =3D C2C_PEER_TYPE_CPU; + + /* Fill peer_insts bitmap with instances connected to peer CPU. */ + bitmap_from_arr32(c2c_pmu->peer_insts[0], &cpu_en, + c2c_pmu->nr_inst); + + c2c_pmu->nr_peer =3D 1; + c2c_pmu->formats =3D nv_c2c_pmu_formats; + } else { + u32 i; + u32 gpu_en =3D 0; + const char *props[C2C_NR_PEER_MAX] =3D { + "gpu0_en_mask", "gpu1_en_mask" + }; + + for (i =3D 0; i < C2C_NR_PEER_MAX; i++) { + if (device_property_read_u32(dev, props[i], &gpu_en)) + dev_dbg(dev, "no %s property\n", props[i]); + + if (gpu_en) { + /* Fill peer_insts bitmap with instances connected to peer GPU. */ + bitmap_from_arr32(c2c_pmu->peer_insts[i], &gpu_en, + c2c_pmu->nr_inst); + + c2c_pmu->nr_peer++; + } + } + + if (c2c_pmu->nr_peer =3D=3D 0) { + dev_err(dev, "No GPU is enabled\n"); + return -EINVAL; + } + + c2c_pmu->peer_type =3D C2C_PEER_TYPE_GPU; + c2c_pmu->formats =3D nv_c2c_nvlink_pmu_formats; + } + + c2c_pmu->filter_default =3D (1 << c2c_pmu->nr_peer) - 1; + + return 0; +} + +static void *nv_c2c_pmu_init_pmu(struct platform_device *pdev) +{ + int ret; + struct nv_c2c_pmu *c2c_pmu; + struct acpi_device *acpi_dev; + struct device *dev =3D &pdev->dev; + + acpi_dev =3D ACPI_COMPANION(dev); + if (!acpi_dev) + return ERR_PTR(-ENODEV); + + c2c_pmu =3D devm_kzalloc(dev, sizeof(*c2c_pmu), GFP_KERNEL); + if (!c2c_pmu) + return ERR_PTR(-ENOMEM); + + c2c_pmu->dev =3D dev; + c2c_pmu->acpi_dev =3D acpi_dev; + c2c_pmu->c2c_type =3D (unsigned int)(unsigned long)device_get_match_data(= dev); + platform_set_drvdata(pdev, c2c_pmu); + + ret =3D nv_c2c_pmu_init_socket(c2c_pmu); + if (ret) + goto done; + + ret =3D nv_c2c_pmu_init_id(c2c_pmu); + if (ret) + goto done; + + ret =3D nv_c2c_pmu_init_filter(c2c_pmu); + if (ret) + goto done; + +done: + if (ret) + return ERR_PTR(ret); + + return c2c_pmu; +} + +static int nv_c2c_pmu_init_mmio(struct nv_c2c_pmu *c2c_pmu) +{ + int i; + struct device *dev =3D c2c_pmu->dev; + struct platform_device *pdev =3D to_platform_device(dev); + + /* Map the address of all the instances. */ + for (i =3D 0; i < c2c_pmu->nr_inst; i++) { + c2c_pmu->base[i] =3D devm_platform_ioremap_resource(pdev, i); + if (IS_ERR(c2c_pmu->base[i])) { + dev_err(dev, "Failed map address for instance %d\n", i); + return PTR_ERR(c2c_pmu->base[i]); + } + } + + /* Map broadcast address. 
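+ * The broadcast window is the platform resource that follows the nr_inst per-instance windows.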
*/ + c2c_pmu->base_broadcast =3D devm_platform_ioremap_resource(pdev, + c2c_pmu->nr_inst); + if (IS_ERR(c2c_pmu->base_broadcast)) { + dev_err(dev, "Failed map broadcast address\n"); + return PTR_ERR(c2c_pmu->base_broadcast); + } + + return 0; +} + +static int nv_c2c_pmu_register_pmu(struct nv_c2c_pmu *c2c_pmu) +{ + int ret; + + ret =3D cpuhp_state_add_instance(nv_c2c_pmu_cpuhp_state, + &c2c_pmu->cpuhp_node); + if (ret) { + dev_err(c2c_pmu->dev, "Error %d registering hotplug\n", ret); + return ret; + } + + c2c_pmu->pmu =3D (struct pmu) { + .parent =3D c2c_pmu->dev, + .task_ctx_nr =3D perf_invalid_context, + .pmu_enable =3D nv_c2c_pmu_enable, + .pmu_disable =3D nv_c2c_pmu_disable, + .event_init =3D nv_c2c_pmu_event_init, + .add =3D nv_c2c_pmu_add, + .del =3D nv_c2c_pmu_del, + .start =3D nv_c2c_pmu_start, + .stop =3D nv_c2c_pmu_stop, + .read =3D nv_c2c_pmu_read, + .attr_groups =3D c2c_pmu->attr_groups, + .capabilities =3D PERF_PMU_CAP_NO_EXCLUDE | + PERF_PMU_CAP_NO_INTERRUPT, + }; + + ret =3D perf_pmu_register(&c2c_pmu->pmu, c2c_pmu->name, -1); + if (ret) { + dev_err(c2c_pmu->dev, "Failed to register C2C PMU: %d\n", ret); + cpuhp_state_remove_instance(nv_c2c_pmu_cpuhp_state, + &c2c_pmu->cpuhp_node); + return ret; + } + + return 0; +} + +static int nv_c2c_pmu_probe(struct platform_device *pdev) +{ + int ret; + struct nv_c2c_pmu *c2c_pmu; + + c2c_pmu =3D nv_c2c_pmu_init_pmu(pdev); + if (IS_ERR(c2c_pmu)) + return PTR_ERR(c2c_pmu); + + ret =3D nv_c2c_pmu_init_mmio(c2c_pmu); + if (ret) + return ret; + + ret =3D nv_c2c_pmu_get_cpus(c2c_pmu); + if (ret) + return ret; + + ret =3D nv_c2c_pmu_alloc_attr_groups(c2c_pmu); + if (ret) + return ret; + + ret =3D nv_c2c_pmu_register_pmu(c2c_pmu); + if (ret) + return ret; + + dev_dbg(c2c_pmu->dev, "Registered %s PMU\n", c2c_pmu->name); + + return 0; +} + +static void nv_c2c_pmu_device_remove(struct platform_device *pdev) +{ + struct nv_c2c_pmu *c2c_pmu =3D platform_get_drvdata(pdev); + + perf_pmu_unregister(&c2c_pmu->pmu); + cpuhp_state_remove_instance(nv_c2c_pmu_cpuhp_state, &c2c_pmu->cpuhp_node); +} + +static const struct acpi_device_id nv_c2c_pmu_acpi_match[] =3D { + { "NVDA2023", (kernel_ulong_t)C2C_TYPE_NVLINK }, + { "NVDA2022", (kernel_ulong_t)C2C_TYPE_NVCLINK }, + { "NVDA2020", (kernel_ulong_t)C2C_TYPE_NVDLINK }, + { } +}; +MODULE_DEVICE_TABLE(acpi, nv_c2c_pmu_acpi_match); + +static struct platform_driver nv_c2c_pmu_driver =3D { + .driver =3D { + .name =3D "nvidia-t410-c2c-pmu", + .acpi_match_table =3D ACPI_PTR(nv_c2c_pmu_acpi_match), + .suppress_bind_attrs =3D true, + }, + .probe =3D nv_c2c_pmu_probe, + .remove =3D nv_c2c_pmu_device_remove, +}; + +static int __init nv_c2c_pmu_init(void) +{ + int ret; + + ret =3D cpuhp_setup_state_multi(CPUHP_AP_ONLINE_DYN, + "perf/nvidia/c2c:online", + nv_c2c_pmu_online_cpu, + nv_c2c_pmu_cpu_teardown); + if (ret < 0) + return ret; + + nv_c2c_pmu_cpuhp_state =3D ret; + return platform_driver_register(&nv_c2c_pmu_driver); +} + +static void __exit nv_c2c_pmu_exit(void) +{ + platform_driver_unregister(&nv_c2c_pmu_driver); + cpuhp_remove_multi_state(nv_c2c_pmu_cpuhp_state); +} + +module_init(nv_c2c_pmu_init); +module_exit(nv_c2c_pmu_exit); + +MODULE_LICENSE("GPL"); +MODULE_DESCRIPTION("NVIDIA Tegra410 C2C PMU driver"); +MODULE_AUTHOR("Besar Wicaksono "); --=20 2.43.0 From nobody Mon Jan 26 22:50:07 2026 Received: from PH8PR06CU001.outbound.protection.outlook.com (mail-westus3azon11012047.outbound.protection.outlook.com [40.107.209.47]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client 
From nobody Mon Jan 26 22:50:07 2026
From: Besar Wicaksono
Subject: [PATCH 8/8] arm64: defconfig: Enable NVIDIA TEGRA410 PMU
Date: Mon, 26 Jan 2026 18:11:55 +0000
Message-ID: <20260126181155.2776097-9-bwicaksono@nvidia.com>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20260126181155.2776097-1-bwicaksono@nvidia.com>
References: <20260126181155.2776097-1-bwicaksono@nvidia.com>
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

Enable the drivers for the NVIDIA TEGRA410 CMEM Latency and C2C PMU
devices.

Signed-off-by: Besar Wicaksono
---
 arch/arm64/configs/defconfig | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/arm64/configs/defconfig b/arch/arm64/configs/defconfig
index 45288ec9eaf7..3d0e438cb997 100644
--- a/arch/arm64/configs/defconfig
+++ b/arch/arm64/configs/defconfig
@@ -1723,6 +1723,8 @@ CONFIG_ARM_DMC620_PMU=m
 CONFIG_HISI_PMU=y
 CONFIG_ARM_CORESIGHT_PMU_ARCH_SYSTEM_PMU=m
 CONFIG_NVIDIA_CORESIGHT_PMU_ARCH_SYSTEM_PMU=m
+CONFIG_NVIDIA_TEGRA410_CMEM_LATENCY_PMU=m
+CONFIG_NVIDIA_TEGRA410_C2C_PMU=m
 CONFIG_MESON_DDR_PMU=m
 CONFIG_NVMEM_LAYOUT_SL28_VPD=m
 CONFIG_NVMEM_IMX_OCOTP=y
-- 
2.43.0