From nobody Sun Feb 8 07:21:52 2026
From: Ravi Bangoria
To: Ingo Molnar, Arnaldo Carvalho de Melo, Namhyung Kim
CC: Ravi Bangoria, Peter Zijlstra, Joe Mario, Stephane Eranian, Jiri Olsa, Ian Rogers, Kan Liang, "Santosh Shukla", Ananth Narayan, Sandipan Das
Subject: [PATCH v4 1/4] perf amd ibs: Add Load Latency bits in raw dump
Date: Tue, 29 Apr 2025 03:59:35 +0000
Message-ID: <20250429035938.1301-2-ravi.bangoria@amd.com>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20250429035938.1301-1-ravi.bangoria@amd.com>
References: <20250429035938.1301-1-ravi.bangoria@amd.com>
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

IBS OP PMU on Zen5 supports Load Latency filtering. Decode and dump Load
Latency filtering related bits into perf script raw dump. Also add oneliner
example in the perf-amd-ibs man page.
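
As an illustration only (not part of the patch): the man-page text added by this patch states that ldlat thresholds from 128 to 2048 are supported and that multiples of 128 cost slightly less profiling overhead. A minimal sketch of those two checks follows. The helper and macro names are hypothetical and do not exist in perf; only the numeric bounds come from the patch text.

```c
#include <stdbool.h>

/* Bounds quoted from the perf-amd-ibs man-page text added by this patch. */
#define IBS_LDLAT_MIN  128u
#define IBS_LDLAT_MAX  2048u
#define IBS_LDLAT_STEP 128u

/* Hypothetical helper: is the value within the documented range? */
static bool ibs_ldlat_in_range(unsigned int ldlat)
{
	return ldlat >= IBS_LDLAT_MIN && ldlat <= IBS_LDLAT_MAX;
}

/*
 * Hypothetical helper: is the value an in-range multiple of 128, i.e. one
 * of the values the man page describes as slightly cheaper to profile with?
 */
static bool ibs_ldlat_is_low_overhead(unsigned int ldlat)
{
	return ibs_ldlat_in_range(ldlat) && ldlat % IBS_LDLAT_STEP == 0;
}
```

With these helpers, 128 and 2048 are both accepted, while a value such as 200 is in range but is not one of the lower-overhead multiples.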

Signed-off-by: Ravi Bangoria
---
 tools/perf/Documentation/perf-amd-ibs.txt |  9 +++++++++
 tools/perf/util/amd-sample-raw.c          | 14 ++++++++++++--
 2 files changed, 21 insertions(+), 2 deletions(-)

diff --git a/tools/perf/Documentation/perf-amd-ibs.txt b/tools/perf/Documentation/perf-amd-ibs.txt
index 2fd31d9d7b71..55f80beae037 100644
--- a/tools/perf/Documentation/perf-amd-ibs.txt
+++ b/tools/perf/Documentation/perf-amd-ibs.txt
@@ -85,6 +85,15 @@ System-wide profile, uOps event, sampling period: 100000, L3MissOnly (Zen4 onwar
 
 	# perf record -e ibs_op/cnt_ctl=1,l3missonly=1/ -c 100000 -a
 
+System-wide profile, cycles event, sampling period: 100000, LdLat filtering (Zen5
+onward)
+
+	# perf record -e ibs_op/ldlat=128/ -c 100000 -a
+
+	Supported load latency threshold values are 128 to 2048 (both inclusive).
+	Latency value which is a multiple of 128 incurs a little less profiling
+	overhead compared to other values.
+
 Per process(upstream v6.2 onward), uOps event, sampling period: 100000
 
 	# perf record -e ibs_op/cnt_ctl=1/ -c 100000 -p 1234
diff --git a/tools/perf/util/amd-sample-raw.c b/tools/perf/util/amd-sample-raw.c
index 9d0ce88e90e4..ac34b18ccc0c 100644
--- a/tools/perf/util/amd-sample-raw.c
+++ b/tools/perf/util/amd-sample-raw.c
@@ -19,6 +19,7 @@
 
 static u32 cpu_family, cpu_model, ibs_fetch_type, ibs_op_type;
 static bool zen4_ibs_extensions;
+static bool ldlat_cap;
 
 static void pr_ibs_fetch_ctl(union ibs_fetch_ctl reg)
 {
@@ -78,14 +79,20 @@ static void pr_ic_ibs_extd_ctl(union ic_ibs_extd_ctl reg)
 static void pr_ibs_op_ctl(union ibs_op_ctl reg)
 {
 	char l3_miss_only[sizeof(" L3MissOnly _")] = "";
+	char ldlat[sizeof(" LdLatThrsh __ LdLatEn _")] = "";
 
 	if (zen4_ibs_extensions)
 		snprintf(l3_miss_only, sizeof(l3_miss_only), " L3MissOnly %d", reg.l3_miss_only);
 
-	printf("ibs_op_ctl:\t%016llx MaxCnt %9d%s En %d Val %d CntCtl %d=%s CurCnt %9d\n",
+	if (ldlat_cap) {
+		snprintf(ldlat, sizeof(ldlat), " LdLatThrsh %2d LdLatEn %d",
+			 reg.ldlat_thrsh, reg.ldlat_en);
+	}
+
+	printf("ibs_op_ctl:\t%016llx MaxCnt %9d%s En %d Val %d CntCtl %d=%s CurCnt %9d%s\n",
 		reg.val, ((reg.opmaxcnt_ext << 16) | reg.opmaxcnt) << 4, l3_miss_only,
 		reg.op_en, reg.op_val, reg.cnt_ctl,
-		reg.cnt_ctl ? "uOps" : "cycles", reg.opcurcnt);
+		reg.cnt_ctl ? "uOps" : "cycles", reg.opcurcnt, ldlat);
 }
 
 static void pr_ibs_op_data(union ibs_op_data reg)
@@ -331,6 +338,9 @@ bool evlist__has_amd_ibs(struct evlist *evlist)
 	if (perf_env__find_pmu_cap(env, "ibs_op", "zen4_ibs_extensions"))
 		zen4_ibs_extensions = 1;
 
+	if (perf_env__find_pmu_cap(env, "ibs_op", "ldlat"))
+		ldlat_cap = 1;
+
 	if (ibs_fetch_type || ibs_op_type) {
 		if (!cpu_family)
 			parse_cpuid(env);
-- 
2.43.0

From nobody Sun Feb 8 07:21:52 2026
From: Ravi Bangoria
To: Ingo Molnar, Arnaldo Carvalho de Melo, Namhyung Kim
CC: Ravi Bangoria, Peter Zijlstra, Joe Mario, Stephane Eranian, Jiri Olsa, Ian Rogers, Kan Liang, "Santosh Shukla", Ananth Narayan, Sandipan Das
Subject: [PATCH v4 2/4] perf amd ibs: Incorporate Zen5 DTLB and PageSize information
Date: Tue, 29 Apr 2025 03:59:36 +0000
Message-ID: <20250429035938.1301-3-ravi.bangoria@amd.com>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20250429035938.1301-1-ravi.bangoria@amd.com>
References: <20250429035938.1301-1-ravi.bangoria@amd.com>
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

IBS Op PMU on Zen5 reports DTLB and page size information differently
compared to prior generation.
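
As an illustration only (not part of the patch): the Zen5 encoding given in this commit message's bit-layout table, where bits 5:4 of IBS_OP_DATA3 hold a PageSize value that is valid only if IbsDcPhyAddrValid = 1, could be decoded roughly as below. The function name is hypothetical. The physical-address-valid flag is passed in separately so the sketch does not assume where that bit sits in the register.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: return the Zen5 data-cache page size as a string,
 * or NULL when the field is not valid (IbsDcPhyAddrValid == 0).
 * Encoding per the commit message: 0 - 4K, 1 - 2M, 2 - 1G, 3 - Reserved.
 */
static const char *zen5_dc_page_size(uint64_t op_data3, bool phy_addr_valid)
{
	static const char * const sizes[] = { "4K", "2M", "1G", "??" };

	if (!phy_addr_valid)
		return NULL;

	/* Bits 5:4 carry PageSize on Zen5; mask to a 2-bit index. */
	return sizes[(op_data3 >> 4) & 0x3];
}
```

The patch itself builds the same 2-bit index from the existing dc_l1tlb_hit_1g and dc_l1tlb_hit_2m bitfields rather than shifting the raw register value.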
IBS_OP_DATA3 Zen3/4 Zen5 ---------------------------------------------------------------- 19 IbsDcL2TlbHit1G Reserved ---------------------------------------------------------------- 6 IbsDcL2tlbHit2M Reserved ---------------------------------------------------------------- 5 IbsDcL1TlbHit1G PageSize: 4 IbsDcL1TlbHit2M 0 - 4K 1 - 2M 2 - 1G 3 - Reserved Valid only if IbsDcPhyAddrValid =3D 1 ---------------------------------------------------------------- 3 IbsDcL2TlbMiss IbsDcL2TlbMiss Valid only if IbsDcPhyAddrValid =3D 1 ---------------------------------------------------------------- 2 IbsDcL1tlbMiss IbsDcL1tlbMiss Valid only if IbsDcPhyAddrValid =3D 1 ---------------------------------------------------------------- Kernel expose this change as "dtlb_pgsize" capability in PMU sysfs. Change IBS register raw-dump logic according to new bit definitions. Signed-off-by: Ravi Bangoria --- tools/perf/util/amd-sample-raw.c | 63 ++++++++++++++++++++++++++------ 1 file changed, 51 insertions(+), 12 deletions(-) diff --git a/tools/perf/util/amd-sample-raw.c b/tools/perf/util/amd-sample-= raw.c index ac34b18ccc0c..022c9eb39509 100644 --- a/tools/perf/util/amd-sample-raw.c +++ b/tools/perf/util/amd-sample-raw.c @@ -20,6 +20,7 @@ static u32 cpu_family, cpu_model, ibs_fetch_type, ibs_op_type; static bool zen4_ibs_extensions; static bool ldlat_cap; +static bool dtlb_pgsize_cap; =20 static void pr_ibs_fetch_ctl(union ibs_fetch_ctl reg) { @@ -161,9 +162,20 @@ static void pr_ibs_op_data2(union ibs_op_data2 reg) =20 static void pr_ibs_op_data3(union ibs_op_data3 reg) { - char l2_miss_str[sizeof(" L2Miss _")] =3D ""; - char op_mem_width_str[sizeof(" OpMemWidth _____ bytes")] =3D ""; + static const char * const dc_page_sizes[] =3D { + " 4K", + " 2M", + " 1G", + " ??", + }; char op_dc_miss_open_mem_reqs_str[sizeof(" OpDcMissOpenMemReqs __")] =3D = ""; + char dc_l1_l2tlb_miss_str[sizeof(" DcL1TlbMiss _ DcL2TlbMiss _")] =3D ""; + char dc_l1tlb_hit_str[sizeof(" DcL1TlbHit2M _ DcL1TlbHit1G 
_")] =3D ""; + char op_mem_width_str[sizeof(" OpMemWidth _____ bytes")] =3D ""; + char dc_l2tlb_hit_2m_str[sizeof(" DcL2TlbHit2M _")] =3D ""; + char dc_l2tlb_hit_1g_str[sizeof(" DcL2TlbHit1G _")] =3D ""; + char dc_page_size_str[sizeof(" DcPageSize ____")] =3D ""; + char l2_miss_str[sizeof(" L2Miss _")] =3D ""; =20 /* * Erratum #1293 @@ -179,16 +191,40 @@ static void pr_ibs_op_data3(union ibs_op_data3 reg) snprintf(op_mem_width_str, sizeof(op_mem_width_str), " OpMemWidth %2d bytes", 1 << (reg.op_mem_width - 1)); =20 - printf("ibs_op_data3:\t%016llx LdOp %d StOp %d DcL1TlbMiss %d DcL2TlbMiss= %d " - "DcL1TlbHit2M %d DcL1TlbHit1G %d DcL2TlbHit2M %d DcMiss %d DcMisAcc %d " - "DcWcMemAcc %d DcUcMemAcc %d DcLockedOp %d DcMissNoMabAlloc %d DcLinAddr= Valid %d " - "DcPhyAddrValid %d DcL2TlbHit1G %d%s SwPf %d%s%s DcMissLat %5d TlbRefill= Lat %5d\n", - reg.val, reg.ld_op, reg.st_op, reg.dc_l1tlb_miss, reg.dc_l2tlb_miss, - reg.dc_l1tlb_hit_2m, reg.dc_l1tlb_hit_1g, reg.dc_l2tlb_hit_2m, reg.dc_mi= ss, - reg.dc_mis_acc, reg.dc_wc_mem_acc, reg.dc_uc_mem_acc, reg.dc_locked_op, - reg.dc_miss_no_mab_alloc, reg.dc_lin_addr_valid, reg.dc_phy_addr_valid, - reg.dc_l2_tlb_hit_1g, l2_miss_str, reg.sw_pf, op_mem_width_str, - op_dc_miss_open_mem_reqs_str, reg.dc_miss_lat, reg.tlb_refill_lat); + if (dtlb_pgsize_cap) { + if (reg.dc_phy_addr_valid) { + int idx =3D (reg.dc_l1tlb_hit_1g << 1) | reg.dc_l1tlb_hit_2m; + + snprintf(dc_l1_l2tlb_miss_str, sizeof(dc_l1_l2tlb_miss_str), + " DcL1TlbMiss %d DcL2TlbMiss %d", + reg.dc_l1tlb_miss, reg.dc_l2tlb_miss); + snprintf(dc_page_size_str, sizeof(dc_page_size_str), + " DcPageSize %4s", dc_page_sizes[idx]); + } + } else { + snprintf(dc_l1_l2tlb_miss_str, sizeof(dc_l1_l2tlb_miss_str), + " DcL1TlbMiss %d DcL2TlbMiss %d", + reg.dc_l1tlb_miss, reg.dc_l2tlb_miss); + snprintf(dc_l1tlb_hit_str, sizeof(dc_l1tlb_hit_str), + " DcL1TlbHit2M %d DcL1TlbHit1G %d", + reg.dc_l1tlb_hit_2m, reg.dc_l1tlb_hit_1g); + snprintf(dc_l2tlb_hit_2m_str, 
sizeof(dc_l2tlb_hit_2m_str), + " DcL2TlbHit2M %d", reg.dc_l2tlb_hit_2m); + snprintf(dc_l2tlb_hit_1g_str, sizeof(dc_l2tlb_hit_1g_str), + " DcL2TlbHit1G %d", reg.dc_l2_tlb_hit_1g); + } + + printf("ibs_op_data3:\t%016llx LdOp %d StOp %d%s%s%s DcMiss %d DcMisAcc %= d " + "DcWcMemAcc %d DcUcMemAcc %d DcLockedOp %d DcMissNoMabAlloc %d " + "DcLinAddrValid %d DcPhyAddrValid %d%s%s SwPf %d%s%s " + "DcMissLat %5d TlbRefillLat %5d\n", + reg.val, reg.ld_op, reg.st_op, dc_l1_l2tlb_miss_str, + dtlb_pgsize_cap ? dc_page_size_str : dc_l1tlb_hit_str, + dc_l2tlb_hit_2m_str, reg.dc_miss, reg.dc_mis_acc, reg.dc_wc_mem_acc, + reg.dc_uc_mem_acc, reg.dc_locked_op, reg.dc_miss_no_mab_alloc, + reg.dc_lin_addr_valid, reg.dc_phy_addr_valid, dc_l2tlb_hit_1g_str, + l2_miss_str, reg.sw_pf, op_mem_width_str, op_dc_miss_open_mem_reqs_str, + reg.dc_miss_lat, reg.tlb_refill_lat); } =20 /* @@ -341,6 +377,9 @@ bool evlist__has_amd_ibs(struct evlist *evlist) if (perf_env__find_pmu_cap(env, "ibs_op", "ldlat")) ldlat_cap =3D 1; =20 + if (perf_env__find_pmu_cap(env, "ibs_op", "dtlb_pgsize")) + dtlb_pgsize_cap =3D 1; + if (ibs_fetch_type || ibs_op_type) { if (!cpu_family) parse_cpuid(env); --=20 2.43.0 From nobody Sun Feb 8 07:21:52 2026 Received: from NAM11-DM6-obe.outbound.protection.outlook.com (mail-dm6nam11on2071.outbound.protection.outlook.com [40.107.223.71]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 53806270ED7; Tue, 29 Apr 2025 04:00:51 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=fail smtp.client-ip=40.107.223.71 ARC-Seal: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1745899253; cv=fail; b=qJoeKBoiv6jU4IgdsacEddoc5uzY2PJkCGQhPk1b4DMTZIuq9D13xBJINiuQDrg6gqhqYp+lEFLyT4PBfHnU5T+GFXlQ2dg0a5f+c3xo49sT191uPmIVf2qrm+BEyJIfQOYAF8nMuJ//tHmIlPLNKXI8z5HiEW6mLLPnSEziGYM= ARC-Message-Signature: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; 
t=1745899253; c=relaxed/simple; bh=SRfGjtHMDf5kt8oO3YSUQuGneOpuUyb3tPJO/rxJ45M=; h=From:To:CC:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=TJyE0yoApsV0UjZcQhy7jPrcc6PWDehP4uEz16kaMICeYM2Z+Bk9dsZQ+YAQQA/ptNuZZKBiLciq7rFUNotD84JvdEYuzPThUp3KwhRX5hVYxSNp2DAWCueEJ/DpKLhNjQkD94CvujW5T4Mmf2XFOsmRss04y9ybaXHOSSM4Plo= ARC-Authentication-Results: i=2; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=amd.com; spf=fail smtp.mailfrom=amd.com; dkim=pass (1024-bit key) header.d=amd.com header.i=@amd.com header.b=bpzEdj9F; arc=fail smtp.client-ip=40.107.223.71 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=amd.com Authentication-Results: smtp.subspace.kernel.org; spf=fail smtp.mailfrom=amd.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=amd.com header.i=@amd.com header.b="bpzEdj9F" ARC-Seal: i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none; b=ewwSTlAh34V1FofGok+Xg4EpbYVpTi85u74XzFJydZbtdTc7PrX7WJ8qx5WEcWi6rfjcDAXOHW/evn/1LrJDycjN09ya7m7FQBsrhC2YwfSHTb63NytSmt/5678gBiy9UGj+K0Xijde+NLY9M8Osbi1nP/P8T1Ia0j4nizkxR7PRG2hpMmSCqt/GHWkINEgglhUH0HvRvRBgdF9XUVokuKNFMG+o/eLKILb6x16I3yGj3IgnlpHlHkZINY9j8UgR8fXFDYudn/HEhIz5WvyV+T/7Fc3cxr/sv8lbJdcLqMq//4yT46g41mygOv3IUrI3m5I3sO3YUUDTlDhm80ATYA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=5QTp/KC4a0jDGDJgHsArdBYT5uvGQGh0pE3YOjdGCQo=; 
b=cX137SRAGdLJmRGrQGk+jpP7QOQW3N5aMVX5+0E3DzhKESl50rWqA5VmgseyI8YXokYrhfNKZvlwyi/qDepIbK9ykQEvQMy2euHEJCpSa3VE/vOQJqX5hSFkJ35bDU/kYlF3oCzTyLhz3Iqa+ondym6kWAIe7gHeuqxk/XuYvQI1rP7Bwyh4Qli6AUVKfF4zlYJZmTePv8nnVoN1wlHQLWNuJ40cUyfXnZ17z98vvnEbWbzokSsR4NgpKWBLD0+tSw8Xao5m/qxkgfItOXk/ZgbaOHVmKC1rlLpzLq8CkSp6NIH1mn/DWKsIZfwLWZ0d/8lPuIQO0/5k4aR/MeZlmg== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass (sender ip is 165.204.84.17) smtp.rcpttodomain=redhat.com smtp.mailfrom=amd.com; dmarc=pass (p=quarantine sp=quarantine pct=100) action=none header.from=amd.com; dkim=none (message not signed); arc=none (0) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amd.com; s=selector1; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=5QTp/KC4a0jDGDJgHsArdBYT5uvGQGh0pE3YOjdGCQo=; b=bpzEdj9FgIQQxdlOpP+3HNfi/1TZrqF2S8mUsh3xv+frsLEYJerZ6u6rUF8Zvl45J3HzgpGPtXq6S/DlZ/zKLChcR6Gjz+PBAwl4ol0XEGgw3VjM/kEvJSgM52aFaLl46z5cKLTg8P5MlS+6/Ac82suFHLUVdcKMu3t+K3El3po= Received: from CH0PR03CA0025.namprd03.prod.outlook.com (2603:10b6:610:b0::30) by LV8PR12MB9335.namprd12.prod.outlook.com (2603:10b6:408:1fc::22) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.8678.33; Tue, 29 Apr 2025 04:00:47 +0000 Received: from CH3PEPF0000000B.namprd04.prod.outlook.com (2603:10b6:610:b0:cafe::a0) by CH0PR03CA0025.outlook.office365.com (2603:10b6:610:b0::30) with Microsoft SMTP Server (version=TLS1_3, cipher=TLS_AES_256_GCM_SHA384) id 15.20.8655.40 via Frontend Transport; Tue, 29 Apr 2025 04:00:46 +0000 X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 165.204.84.17) smtp.mailfrom=amd.com; dkim=none (message not signed) header.d=none;dmarc=pass action=none header.from=amd.com; Received-SPF: Pass (protection.outlook.com: domain of amd.com designates 165.204.84.17 as permitted sender) receiver=protection.outlook.com; client-ip=165.204.84.17; helo=SATLEXMB04.amd.com; pr=C Received: from 
SATLEXMB04.amd.com (165.204.84.17) by CH3PEPF0000000B.mail.protection.outlook.com (10.167.244.38) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.20.8678.33 via Frontend Transport; Tue, 29 Apr 2025 04:00:46 +0000 Received: from BLR-L-RBANGORI.amd.com (10.180.168.240) by SATLEXMB04.amd.com (10.181.40.145) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.39; Mon, 28 Apr 2025 23:00:41 -0500 From: Ravi Bangoria To: Ingo Molnar , Arnaldo Carvalho de Melo , Namhyung Kim CC: Ravi Bangoria , Peter Zijlstra , Joe Mario , Stephane Eranian , Jiri Olsa , Ian Rogers , Kan Liang , , , "Santosh Shukla" , Ananth Narayan , Sandipan Das Subject: [PATCH v4 3/4] perf mem/c2c amd: Add ldlat support Date: Tue, 29 Apr 2025 03:59:37 +0000 Message-ID: <20250429035938.1301-4-ravi.bangoria@amd.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20250429035938.1301-1-ravi.bangoria@amd.com> References: <20250429035938.1301-1-ravi.bangoria@amd.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-ClientProxiedBy: SATLEXMB04.amd.com (10.181.40.145) To SATLEXMB04.amd.com (10.181.40.145) X-EOPAttributedMessage: 0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: CH3PEPF0000000B:EE_|LV8PR12MB9335:EE_ X-MS-Office365-Filtering-Correlation-Id: bce6ef5b-1778-4273-87b2-08dd86d26be7 X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0;ARA:13230040|82310400026|36860700013|1800799024|7416014|376014; X-Microsoft-Antispam-Message-Info: =?us-ascii?Q?O8DASZE8tRjnZJioGIM8HbohUk5QA2yf5mC/Wwi+qcwgAU1fM/R2RAep4BR6?= =?us-ascii?Q?/Q2VE1acEutyWfZ7DiuBYVDHd9uFazpp6wkEmVH8LlW08nfKYsFj9dv0894h?= =?us-ascii?Q?AZs1yHoOqyM/cn3UJD7ps6MVRyq/XmE1tIk5ImR4cxM7HTN7Y6AOuSaShlGh?= =?us-ascii?Q?gV+XlwUXKal/A7oQYT04rboj8tnJ/GWGMUiGCuwIyMDjQtkzqxWAclcC6ftc?= 
X-MS-Exchange-CrossTenant-Network-Message-Id: bce6ef5b-1778-4273-87b2-08dd86d26be7 X-MS-Exchange-CrossTenant-Id: 3dd8961f-e488-4e60-8e11-a82d994e183d X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=3dd8961f-e488-4e60-8e11-a82d994e183d;Ip=[165.204.84.17];Helo=[SATLEXMB04.amd.com] X-MS-Exchange-CrossTenant-AuthSource: CH3PEPF0000000B.namprd04.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Anonymous X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem X-MS-Exchange-Transport-CrossTenantHeadersStamped: LV8PR12MB9335 Content-Type: text/plain; charset="utf-8" Perf mem and c2c use the IBS Op PMU on AMD platforms. The IBS Op PMU on the Zen5 uarch adds support for Load Latency filtering. Implement perf mem/c2c --ldlat using the IBS Op Load Latency filtering capability. Some subtle differences between AMD and other archs: o --ldlat is disabled by default on AMD o Supported values are 128 to 2048. Signed-off-by: Ravi Bangoria --- tools/perf/Documentation/perf-c2c.txt | 11 ++++++-- tools/perf/Documentation/perf-mem.txt | 13 ++++++++-- tools/perf/arch/x86/util/mem-events.c | 6 +++++ tools/perf/arch/x86/util/mem-events.h | 1 + tools/perf/arch/x86/util/pmu.c | 20 ++++++++++++--- tools/perf/tests/shell/test_data_symbol.sh | 29 +++++++++++++++++++--- tools/perf/util/pmu.c | 11 ++++++++ tools/perf/util/pmu.h | 2 ++ 8 files changed, 83 insertions(+), 10 deletions(-) diff --git a/tools/perf/Documentation/perf-c2c.txt b/tools/perf/Documentati= on/perf-c2c.txt index 856f0dfb8e5a..f4af2dd6ab31 100644 --- a/tools/perf/Documentation/perf-c2c.txt +++ b/tools/perf/Documentation/perf-c2c.txt @@ -54,8 +54,15 @@ RECORD OPTIONS =20 -l:: --ldlat:: - Configure mem-loads latency. Supported on Intel and Arm64 processors - only. Ignored on other archs. + Configure mem-loads latency. Supported on Intel, Arm64 and some AMD + processors. Ignored on other archs. + + On supported AMD processors: + - /sys/bus/event_source/devices/ibs_op/caps/ldlat file contains '1'.
+ - Supported latency values are 128 to 2048 (both inclusive). + - Latency value which is a multiple of 128 incurs a little less profiling + overhead compared to other values. + - Load latency filtering is disabled by default. =20 -k:: --all-kernel:: diff --git a/tools/perf/Documentation/perf-mem.txt b/tools/perf/Documentati= on/perf-mem.txt index 8a1bd9ff0f86..a9e3c71a2205 100644 --- a/tools/perf/Documentation/perf-mem.txt +++ b/tools/perf/Documentation/perf-mem.txt @@ -28,6 +28,8 @@ and kernel support is required. See linkperf:perf-arm-spe= [1] for a setup guide. Due to the statistical nature of SPE sampling, not every memory operation = will be sampled. =20 +On AMD, this uses the IBS Op PMU to sample load-store operations. + COMMON OPTIONS -------------- -f:: @@ -67,8 +69,15 @@ RECORD OPTIONS Configure all used events to run in user space. =20 --ldlat :: - Specify desired latency for loads event. Supported on Intel and Arm64 - processors only. Ignored on other archs. + Specify desired latency for loads event. Supported on Intel, Arm64 and + some AMD processors. Ignored on other archs. + + On supported AMD processors: + - /sys/bus/event_source/devices/ibs_op/caps/ldlat file contains '1'. + - Supported latency values are 128 to 2048 (both inclusive). + - Latency value which is a multiple of 128 incurs a little less profiling + overhead compared to other values. + - Load latency filtering is disabled by default.
=20 REPORT OPTIONS -------------- diff --git a/tools/perf/arch/x86/util/mem-events.c b/tools/perf/arch/x86/ut= il/mem-events.c index 62df03e91c7e..b38f519020ff 100644 --- a/tools/perf/arch/x86/util/mem-events.c +++ b/tools/perf/arch/x86/util/mem-events.c @@ -26,3 +26,9 @@ struct perf_mem_event perf_mem_events_amd[PERF_MEM_EVENTS= __MAX] =3D { E(NULL, NULL, NULL, false, 0), E("mem-ldst", "%s//", NULL, false, 0), }; + +struct perf_mem_event perf_mem_events_amd_ldlat[PERF_MEM_EVENTS__MAX] =3D { + E(NULL, NULL, NULL, false, 0), + E(NULL, NULL, NULL, false, 0), + E("mem-ldst", "%s/ldlat=3D%u/", NULL, true, 0), +}; diff --git a/tools/perf/arch/x86/util/mem-events.h b/tools/perf/arch/x86/ut= il/mem-events.h index f55c8d3b7d59..11e09a256f5b 100644 --- a/tools/perf/arch/x86/util/mem-events.h +++ b/tools/perf/arch/x86/util/mem-events.h @@ -6,5 +6,6 @@ extern struct perf_mem_event perf_mem_events_intel[PERF_MEM= _EVENTS__MAX]; extern struct perf_mem_event perf_mem_events_intel_aux[PERF_MEM_EVENTS__MA= X]; =20 extern struct perf_mem_event perf_mem_events_amd[PERF_MEM_EVENTS__MAX]; +extern struct perf_mem_event perf_mem_events_amd_ldlat[PERF_MEM_EVENTS__MA= X]; =20 #endif /* _X86_MEM_EVENTS_H */ diff --git a/tools/perf/arch/x86/util/pmu.c b/tools/perf/arch/x86/util/pmu.c index e0060dac2a9f..8712cbbbc712 100644 --- a/tools/perf/arch/x86/util/pmu.c +++ b/tools/perf/arch/x86/util/pmu.c @@ -18,8 +18,10 @@ #include "mem-events.h" #include "util/env.h" =20 -void perf_pmu__arch_init(struct perf_pmu *pmu __maybe_unused) +void perf_pmu__arch_init(struct perf_pmu *pmu) { + struct perf_pmu_caps *ldlat_cap; + #ifdef HAVE_AUXTRACE_SUPPORT if (!strcmp(pmu->name, INTEL_PT_PMU_NAME)) { pmu->auxtrace =3D true; @@ -33,8 +35,20 @@ void perf_pmu__arch_init(struct perf_pmu *pmu __maybe_un= used) #endif =20 if (x86__is_amd_cpu()) { - if (!strcmp(pmu->name, "ibs_op")) - pmu->mem_events =3D perf_mem_events_amd; + if (strcmp(pmu->name, "ibs_op")) + return; + + pmu->mem_events =3D perf_mem_events_amd; + 
+ if (!perf_pmu__caps_parse(pmu)) + return; + + ldlat_cap =3D perf_pmu__get_cap(pmu, "ldlat"); + if (!ldlat_cap || strcmp(ldlat_cap->value, "1")) + return; + + perf_mem_events__loads_ldlat =3D 0; + pmu->mem_events =3D perf_mem_events_amd_ldlat; } else if (pmu->is_core) { if (perf_pmu__have_event(pmu, "mem-loads-aux")) pmu->mem_events =3D perf_mem_events_intel_aux; diff --git a/tools/perf/tests/shell/test_data_symbol.sh b/tools/perf/tests/= shell/test_data_symbol.sh index bbe8277496ae..d61b5659a46d 100755 --- a/tools/perf/tests/shell/test_data_symbol.sh +++ b/tools/perf/tests/shell/test_data_symbol.sh @@ -54,11 +54,34 @@ trap cleanup_files exit term int =20 echo "Recording workload..." =20 -# perf mem/c2c internally uses IBS PMU on AMD CPU which doesn't support -# user/kernel filtering and per-process monitoring, spin program on -# specific CPU and test in per-CPU mode. is_amd=3D$(grep -E -c 'vendor_id.*AuthenticAMD' /proc/cpuinfo) if (($is_amd >=3D 1)); then + mem_events=3D"$(perf mem record -v -e list 2>&1)" + if ! [[ "$mem_events" =3D~ ^mem\-ldst.*ibs_op/(.*)/.*available ]]; then + echo "ERROR: mem-ldst event is not matching" + exit 1 + fi + + # --ldlat on AMD: + # o Zen4 and earlier uarch does not support ldlat + # o Even on supported platforms, it's disabled (--ldlat=3D0) by default. + ldlat=3D${BASH_REMATCH[1]} + if [[ -n $ldlat ]]; then + if ! [[ "$ldlat" =3D~ ldlat=3D0 ]]; then + echo "ERROR: ldlat not initialized to 0?" + exit 1 + fi + + mem_events=3D"$(perf mem record -v --ldlat=3D150 -e list 2>&1)" + if ! [[ "$mem_events" =3D~ ^mem-ldst.*ibs_op/ldlat=3D150/.*available ]];= then + echo "ERROR: --ldlat not honored?" + exit 1 + fi + fi + + # perf mem/c2c internally uses IBS PMU on AMD CPU which doesn't + # support user/kernel filtering and per-process monitoring on older + # kernels, spin program on specific CPU and test in per-CPU mode. 
perf mem record -vvv -o ${PERF_DATA} -C 0 -- taskset -c 0 $TEST_PROGRAM 2= >"${ERR_FILE}" else perf mem record -vvv --all-user -o ${PERF_DATA} -- $TEST_PROGRAM 2>"${ERR= _FILE}" diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c index bbb906bb2159..d08972aa461c 100644 --- a/tools/perf/util/pmu.c +++ b/tools/perf/util/pmu.c @@ -2259,6 +2259,17 @@ static void perf_pmu__del_caps(struct perf_pmu *pmu) } } =20 +struct perf_pmu_caps *perf_pmu__get_cap(struct perf_pmu *pmu, const char *= name) +{ + struct perf_pmu_caps *caps; + + list_for_each_entry(caps, &pmu->caps, list) { + if (!strcmp(caps->name, name)) + return caps; + } + return NULL; +} + /* * Reading/parsing the given pmu capabilities, which should be located at: * /sys/bus/event_source/devices//caps as sysfs group attributes. diff --git a/tools/perf/util/pmu.h b/tools/perf/util/pmu.h index 13dd3511f504..a1fdd6d50c53 100644 --- a/tools/perf/util/pmu.h +++ b/tools/perf/util/pmu.h @@ -277,6 +277,8 @@ bool pmu_uncore_identifier_match(const char *compat, co= nst char *id); =20 int perf_pmu__convert_scale(const char *scale, char **end, double *sval); =20 +struct perf_pmu_caps *perf_pmu__get_cap(struct perf_pmu *pmu, const char *= name); + int perf_pmu__caps_parse(struct perf_pmu *pmu); =20 void perf_pmu__warn_invalid_config(struct perf_pmu *pmu, __u64 config, --=20 2.43.0 From nobody Sun Feb 8 07:21:52 2026 Received: from NAM12-BN8-obe.outbound.protection.outlook.com (mail-bn8nam12on2078.outbound.protection.outlook.com [40.107.237.78]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 74E2227055C; Tue, 29 Apr 2025 04:01:23 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=fail smtp.client-ip=40.107.237.78 ARC-Seal: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1745899286; cv=fail; 
b=ISAMYS3mlQHHPi8CDF2OI0zPhs7EXpaI4VHw+IWUMRT82JQqIisWh/qER4I0/5cdaPidm9QBRjB3oOoOezslzhB8mfoKCQXne1VItQgsVMGfTTfyF2psdq5n4glFMUw534Q03Wnd/SauVHNvOnGjQUCToM1U5UjIhfP7Ges3yl4= ARC-Message-Signature: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1745899286; c=relaxed/simple; bh=hj6t1L7eVbSMtCpmS1WlQyHv3CMp+WFDyopf/mgMerg=; h=From:To:CC:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=OgbDAXMdTrQM9Zc31KPDuTsNGxS6pHK5EuutM8j1gQLpcqRgBAieVdQqIYV5U7Zc5pxJ5BPfCE0a+EuOxMSoOafmUfdv3qc3tTPkxxFTlF1hkGaOJBa2QEJEFRpCvoIVvSjP2+ZdkYf3QfrmJbzCN3Qsx9j7HWw6ZLycDbq8a+A= ARC-Authentication-Results: i=2; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=amd.com; spf=fail smtp.mailfrom=amd.com; dkim=pass (1024-bit key) header.d=amd.com header.i=@amd.com header.b=H6JBOzmv; arc=fail smtp.client-ip=40.107.237.78 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=amd.com Authentication-Results: smtp.subspace.kernel.org; spf=fail smtp.mailfrom=amd.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=amd.com header.i=@amd.com header.b="H6JBOzmv" ARC-Seal: i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none; b=Fe/xWuNyqViahd3pPrFvO0QFYkQZkJTY5w3j6/uC1URAtrksDOpw36U4ldkflyKZ/MlaxrOEXOqotflInV2KoOgSw/JnLEw1rR7cnncz3hBXBitKyIClfQF/t3s1xSXV4gBSbqtbT5M/dAmb5OqV9diP0l2sgVtBtnKnXTeRkKrDgpoL73Q+U1DsCOfLNPX51aiq1UdDiqfwhOOJ/GQ9Gj0uKGoMELyMURNCC7JuMIfJhSsUh+a7mV+cnqH6t+Cs3ogMt5aGjwHdRdrACCPEgqMk7VLyq9K1uNoNILSSaSfYxtCelhipbRY7GKKvr7ECUbxGLqR1mWadKqCQM18XVg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=FS2aoN5Xb2rm6HXXD74GZOYlJUYczs2p+AqGIjYgBtQ=; 
b=EwDGy+XSAyZjcU5Dyf3TmJYJ3Mu7Zc8G75ZnIPrGVrgNLuHXXuCbIgaQ0/LBg7nKRRqZ3f051ZJ0RhqXyGrNPg3wQoDqSmV+BirovoHA2ugtS1T4ELMjNmuttJ5mQUgcbk5GQuVwtJpBSsi/Snk0gLGq/I/pgYFA4K46oA3EzhUPgxjqFaRy7ZiYOarBT+H+Itg7nTTMgmAecjFKZUdrIntjuSmNCcJIns+cw3L504pO8dl7W9j4QqYUU1ck3DI4nWpDjaAxZvF8KXYV6mAPLaeeNAMOO87B0PiNB7ez+SoDJfJmBk6kA0njNLYOOyRM8opQxTy+TjAz6UqjkzRRtA== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass (sender ip is 165.204.84.17) smtp.rcpttodomain=redhat.com smtp.mailfrom=amd.com; dmarc=pass (p=quarantine sp=quarantine pct=100) action=none header.from=amd.com; dkim=none (message not signed); arc=none (0) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amd.com; s=selector1; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=FS2aoN5Xb2rm6HXXD74GZOYlJUYczs2p+AqGIjYgBtQ=; b=H6JBOzmvhbiO6/U+MBas5zQYsDk2roWKT0LSjtQcE4PuUjIX2p6UpMPQTSklO8XDg8mANmGHbtga8LoRTnI59kPcbzhu6arL/uNcqmYd0fpsNn2sDm5e1YmUZANMNjMbS8e2s7vIwTLDq7AoBKCtslojOfySeZqTHd7iLerJG7E= Received: from CH0PR03CA0028.namprd03.prod.outlook.com (2603:10b6:610:b0::33) by DS7PR12MB5912.namprd12.prod.outlook.com (2603:10b6:8:7d::15) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.8678.31; Tue, 29 Apr 2025 04:01:16 +0000 Received: from CH3PEPF0000000B.namprd04.prod.outlook.com (2603:10b6:610:b0:cafe::21) by CH0PR03CA0028.outlook.office365.com (2603:10b6:610:b0::33) with Microsoft SMTP Server (version=TLS1_3, cipher=TLS_AES_256_GCM_SHA384) id 15.20.8655.37 via Frontend Transport; Tue, 29 Apr 2025 04:01:16 +0000 X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 165.204.84.17) smtp.mailfrom=amd.com; dkim=none (message not signed) header.d=none;dmarc=pass action=none header.from=amd.com; Received-SPF: Pass (protection.outlook.com: domain of amd.com designates 165.204.84.17 as permitted sender) receiver=protection.outlook.com; client-ip=165.204.84.17; helo=SATLEXMB04.amd.com; pr=C Received: from SATLEXMB04.amd.com 
(165.204.84.17) by CH3PEPF0000000B.mail.protection.outlook.com (10.167.244.38) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.20.8678.33 via Frontend Transport; Tue, 29 Apr 2025 04:01:16 +0000 Received: from BLR-L-RBANGORI.amd.com (10.180.168.240) by SATLEXMB04.amd.com (10.181.40.145) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.39; Mon, 28 Apr 2025 23:00:46 -0500 From: Ravi Bangoria To: Ingo Molnar , Arnaldo Carvalho de Melo , Namhyung Kim CC: Ravi Bangoria , Peter Zijlstra , Joe Mario , Stephane Eranian , Jiri Olsa , Ian Rogers , Kan Liang , , , "Santosh Shukla" , Ananth Narayan , Sandipan Das Subject: [PATCH v4 4/4] perf test amd ibs: Add sample period unit test Date: Tue, 29 Apr 2025 03:59:38 +0000 Message-ID: <20250429035938.1301-5-ravi.bangoria@amd.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20250429035938.1301-1-ravi.bangoria@amd.com> References: <20250429035938.1301-1-ravi.bangoria@amd.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-ClientProxiedBy: SATLEXMB04.amd.com (10.181.40.145) To SATLEXMB04.amd.com (10.181.40.145) X-EOPAttributedMessage: 0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: CH3PEPF0000000B:EE_|DS7PR12MB5912:EE_ X-MS-Office365-Filtering-Correlation-Id: 401a4d5f-a40f-473d-b4dd-08dd86d27d99 X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0;ARA:13230040|1800799024|376014|36860700013|7416014|82310400026; X-Microsoft-Antispam-Message-Info: =?us-ascii?Q?yYHLFKAMWEVmAOJFnhlVO1tjm+WKQiMh+RGxMJw1MQg/CE78Da/FMxeyhgYe?= =?us-ascii?Q?J5seLnyv1sctSClD13fSHfLNH/7+7b2tXBEvxD5Xe5GlrLNhJiWk9BsCCk7T?= =?us-ascii?Q?e+uqGgVyhcODyhk3uDSflNuW/zNmvYFfaboMkANWxy55gdp5+Z2ZTKDN2/zv?= =?us-ascii?Q?83gCK88+vv55JbXyag3r/QzGH8WzqSDXtwZPCLmroAsGAXkkajPvz3vMzdPK?= 
X-MS-Exchange-CrossTenant-Network-Message-Id: 401a4d5f-a40f-473d-b4dd-08dd86d27d99 X-MS-Exchange-CrossTenant-Id: 3dd8961f-e488-4e60-8e11-a82d994e183d X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=3dd8961f-e488-4e60-8e11-a82d994e183d;Ip=[165.204.84.17];Helo=[SATLEXMB04.amd.com] X-MS-Exchange-CrossTenant-AuthSource: CH3PEPF0000000B.namprd04.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Anonymous X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem X-MS-Exchange-Transport-CrossTenantHeadersStamped: DS7PR12MB5912 Content-Type: text/plain; charset="utf-8" IBS Fetch and IBS Op PMUs have various constraints on supported sample periods. Add perf unit tests to test those. Running the test in parallel with other tests causes intermittent failures, so mark it exclusive to force it to run sequentially. Sample output on a Zen5 machine: Without kernel fixes: $ sudo ./perf test -vv 112 112: AMD IBS sample period: --- start --- test child forked, pid 8774 Using CPUID AuthenticAMD-26-2-1 IBS config tests: ----------------- Fetch PMU tests: 0xffff : Ok (nr samples: 1078) 0x1000 : Ok (nr samples: 17030) 0xff : Ok (nr samples: 41068) 0x1 : Ok (nr samples: 40543) 0x0 : Ok 0x10000 : Ok Op PMU tests: 0x0 : Ok 0x1 : Fail 0x8 : Fail 0x9 : Ok (nr samples: 40543) 0xf : Ok (nr samples: 40543) 0x1000 : Ok (nr samples: 18736) 0xffff : Ok (nr samples: 1168) 0x10000 : Ok 0x100000 : Fail (nr samples: 14) 0xf00000 : Fail (nr samples: 1) 0xf0ffff : Fail (nr samples: 1) 0x1f0ffff : Fail (nr samples: 1) 0x7f0ffff : Fail (nr samples: 0) 0x8f0ffff : Ok 0x17f0ffff : Ok IBS sample period constraint tests: ----------------------------------- Fetch PMU test: freq 0, sample_freq 0: Ok freq 0, sample_freq 1: Fail freq 0, sample_freq 15: Fail freq 0, sample_freq 16: Ok (nr samples: 1604) freq 0, sample_freq 17: Ok (nr samples: 1604) freq 0, sample_freq 143: Ok (nr samples: 1604) freq 0, sample_freq 144: Ok (nr samples: 1604) freq 0, sample_freq 145: Ok (nr samples: 1604) freq 0, 
sample_freq 1234: Ok (nr samples: 1566) freq 0, sample_freq 4103: Ok (nr samples: 1119) freq 0, sample_freq 65520: Ok (nr samples: 2264) freq 0, sample_freq 65535: Ok (nr samples: 2263) freq 0, sample_freq 65552: Ok (nr samples: 1166) freq 0, sample_freq 8388607: Ok (nr samples: 268) freq 0, sample_freq 268435455: Ok (nr samples: 8) freq 1, sample_freq 0: Ok freq 1, sample_freq 1: Ok (nr samples: 4) freq 1, sample_freq 15: Ok (nr samples: 4) freq 1, sample_freq 16: Ok (nr samples: 4) freq 1, sample_freq 17: Ok (nr samples: 4) freq 1, sample_freq 143: Ok (nr samples: 5) freq 1, sample_freq 144: Ok (nr samples: 5) freq 1, sample_freq 145: Ok (nr samples: 5) freq 1, sample_freq 1234: Ok (nr samples: 7) freq 1, sample_freq 4103: Ok (nr samples: 35) freq 1, sample_freq 65520: Ok (nr samples: 642) freq 1, sample_freq 65535: Ok (nr samples: 636) freq 1, sample_freq 65552: Ok (nr samples: 651) freq 1, sample_freq 8388607: Ok Op PMU test: freq 0, sample_freq 0: Ok freq 0, sample_freq 1: Fail freq 0, sample_freq 15: Fail freq 0, sample_freq 16: Fail freq 0, sample_freq 17: Fail freq 0, sample_freq 143: Fail freq 0, sample_freq 144: Ok (nr samples: 1604) freq 0, sample_freq 145: Ok (nr samples: 1604) freq 0, sample_freq 1234: Ok (nr samples: 1604) freq 0, sample_freq 4103: Ok (nr samples: 1604) freq 0, sample_freq 65520: Ok (nr samples: 2227) freq 0, sample_freq 65535: Ok (nr samples: 2296) freq 0, sample_freq 65552: Ok (nr samples: 2213) freq 0, sample_freq 8388607: Ok (nr samples: 250) freq 0, sample_freq 268435455: Ok (nr samples: 8) freq 1, sample_freq 0: Ok freq 1, sample_freq 1: Fail (nr samples: 4) freq 1, sample_freq 15: Fail (nr samples: 4) freq 1, sample_freq 16: Fail (nr samples: 4) freq 1, sample_freq 17: Fail (nr samples: 4) freq 1, sample_freq 143: Fail (nr samples: 5) freq 1, sample_freq 144: Fail (nr samples: 5) freq 1, sample_freq 145: Fail (nr samples: 5) freq 1, sample_freq 1234: Fail (nr samples: 8) freq 1, sample_freq 4103: Fail (nr samples: 33) freq 1, 
sample_freq 65520: Fail (nr samples: 546) freq 1, sample_freq 65535: Fail (nr samples: 544) freq 1, sample_freq 65552: Fail (nr samples: 555) freq 1, sample_freq 8388607: Ok IBS ioctl() tests: ------------------ Fetch PMU tests ioctl(period =3D 0x0 ): Ok ioctl(period =3D 0x1 ): Fail ioctl(period =3D 0xf ): Fail ioctl(period =3D 0x10 ): Ok ioctl(period =3D 0x11 ): Fail ioctl(period =3D 0x1f ): Fail ioctl(period =3D 0x20 ): Ok ioctl(period =3D 0x80 ): Ok ioctl(period =3D 0x8f ): Fail ioctl(period =3D 0x90 ): Ok ioctl(period =3D 0x91 ): Fail ioctl(period =3D 0x100 ): Ok ioctl(period =3D 0xfff0 ): Ok ioctl(period =3D 0xffff ): Fail ioctl(period =3D 0x10000 ): Ok ioctl(period =3D 0x1fff0 ): Ok ioctl(period =3D 0x1fff5 ): Fail ioctl(freq =3D 0x0 ): Ok ioctl(freq =3D 0x1 ): Ok ioctl(freq =3D 0xf ): Ok ioctl(freq =3D 0x10 ): Ok ioctl(freq =3D 0x11 ): Ok ioctl(freq =3D 0x1f ): Ok ioctl(freq =3D 0x20 ): Ok ioctl(freq =3D 0x80 ): Ok ioctl(freq =3D 0x8f ): Ok ioctl(freq =3D 0x90 ): Ok ioctl(freq =3D 0x91 ): Ok ioctl(freq =3D 0x100 ): Ok Op PMU tests ioctl(period =3D 0x0 ): Ok ioctl(period =3D 0x1 ): Fail ioctl(period =3D 0xf ): Fail ioctl(period =3D 0x10 ): Fail ioctl(period =3D 0x11 ): Fail ioctl(period =3D 0x1f ): Fail ioctl(period =3D 0x20 ): Fail ioctl(period =3D 0x80 ): Fail ioctl(period =3D 0x8f ): Fail ioctl(period =3D 0x90 ): Ok ioctl(period =3D 0x91 ): Fail ioctl(period =3D 0x100 ): Ok ioctl(period =3D 0xfff0 ): Ok ioctl(period =3D 0xffff ): Fail ioctl(period =3D 0x10000 ): Ok ioctl(period =3D 0x1fff0 ): Ok ioctl(period =3D 0x1fff5 ): Fail ioctl(freq =3D 0x0 ): Ok ioctl(freq =3D 0x1 ): Ok ioctl(freq =3D 0xf ): Ok ioctl(freq =3D 0x10 ): Ok ioctl(freq =3D 0x11 ): Ok ioctl(freq =3D 0x1f ): Ok ioctl(freq =3D 0x20 ): Ok ioctl(freq =3D 0x80 ): Ok ioctl(freq =3D 0x8f ): Ok ioctl(freq =3D 0x90 ): Ok ioctl(freq =3D 0x91 ): Ok ioctl(freq =3D 0x100 ): Ok IBS freq (negative) tests: -------------------------- freq 1, sample_freq 200000: Fail IBS L3MissOnly test: (takes a while) 
-------------------- Fetch L3MissOnly: Fail (nr_samples: 1213) Op L3MissOnly: Ok (nr_samples: 1193) ---- end(-1) ---- 112: AMD IBS sample period : FA= ILED! With kernel fixes: $ sudo ./perf test -vv 112 112: AMD IBS sample period: --- start --- test child forked, pid 6939 Using CPUID AuthenticAMD-26-2-1 IBS config tests: ----------------- Fetch PMU tests: 0xffff : Ok (nr samples: 969) 0x1000 : Ok (nr samples: 15540) 0xff : Ok (nr samples: 40555) 0x1 : Ok (nr samples: 40543) 0x0 : Ok 0x10000 : Ok Op PMU tests: 0x0 : Ok 0x1 : Ok 0x8 : Ok 0x9 : Ok (nr samples: 40543) 0xf : Ok (nr samples: 40543) 0x1000 : Ok (nr samples: 19156) 0xffff : Ok (nr samples: 1169) 0x10000 : Ok 0x100000 : Ok (nr samples: 1151) 0xf00000 : Ok (nr samples: 76) 0xf0ffff : Ok (nr samples: 73) 0x1f0ffff : Ok (nr samples: 33) 0x7f0ffff : Ok (nr samples: 10) 0x8f0ffff : Ok 0x17f0ffff : Ok IBS sample period constraint tests: ----------------------------------- Fetch PMU test: freq 0, sample_freq 0: Ok freq 0, sample_freq 1: Ok freq 0, sample_freq 15: Ok freq 0, sample_freq 16: Ok (nr samples: 1203) freq 0, sample_freq 17: Ok (nr samples: 1604) freq 0, sample_freq 143: Ok (nr samples: 1604) freq 0, sample_freq 144: Ok (nr samples: 1604) freq 0, sample_freq 145: Ok (nr samples: 1604) freq 0, sample_freq 1234: Ok (nr samples: 1604) freq 0, sample_freq 4103: Ok (nr samples: 1343) freq 0, sample_freq 65520: Ok (nr samples: 2254) freq 0, sample_freq 65535: Ok (nr samples: 2136) freq 0, sample_freq 65552: Ok (nr samples: 1158) freq 0, sample_freq 8388607: Ok (nr samples: 257) freq 0, sample_freq 268435455: Ok (nr samples: 8) freq 1, sample_freq 0: Ok freq 1, sample_freq 1: Ok (nr samples: 4) freq 1, sample_freq 15: Ok (nr samples: 4) freq 1, sample_freq 16: Ok (nr samples: 4) freq 1, sample_freq 17: Ok (nr samples: 4) freq 1, sample_freq 143: Ok (nr samples: 5) freq 1, sample_freq 144: Ok (nr samples: 5) freq 1, sample_freq 145: Ok (nr samples: 5) freq 1, sample_freq 1234: Ok (nr samples: 8) freq 1, 
sample_freq 4103: Ok (nr samples: 34) freq 1, sample_freq 65520: Ok (nr samples: 458) freq 1, sample_freq 65535: Ok (nr samples: 628) freq 1, sample_freq 65552: Ok (nr samples: 396) freq 1, sample_freq 8388607: Ok Op PMU test: freq 0, sample_freq 0: Ok freq 0, sample_freq 1: Ok freq 0, sample_freq 15: Ok freq 0, sample_freq 16: Ok freq 0, sample_freq 17: Ok freq 0, sample_freq 143: Ok freq 0, sample_freq 144: Ok (nr samples: 1604) freq 0, sample_freq 145: Ok (nr samples: 1604) freq 0, sample_freq 1234: Ok (nr samples: 1604) freq 0, sample_freq 4103: Ok (nr samples: 1604) freq 0, sample_freq 65520: Ok (nr samples: 2250) freq 0, sample_freq 65535: Ok (nr samples: 2158) freq 0, sample_freq 65552: Ok (nr samples: 2296) freq 0, sample_freq 8388607: Ok (nr samples: 243) freq 0, sample_freq 268435455: Ok (nr samples: 6) freq 1, sample_freq 0: Ok freq 1, sample_freq 1: Ok (nr samples: 4) freq 1, sample_freq 15: Ok (nr samples: 4) freq 1, sample_freq 16: Ok (nr samples: 4) freq 1, sample_freq 17: Ok (nr samples: 4) freq 1, sample_freq 143: Ok (nr samples: 4) freq 1, sample_freq 144: Ok (nr samples: 5) freq 1, sample_freq 145: Ok (nr samples: 4) freq 1, sample_freq 1234: Ok (nr samples: 6) freq 1, sample_freq 4103: Ok (nr samples: 27) freq 1, sample_freq 65520: Ok (nr samples: 542) freq 1, sample_freq 65535: Ok (nr samples: 550) freq 1, sample_freq 65552: Ok (nr samples: 552) freq 1, sample_freq 8388607: Ok IBS ioctl() tests: ------------------ Fetch PMU tests ioctl(period =3D 0x0 ): Ok ioctl(period =3D 0x1 ): Ok ioctl(period =3D 0xf ): Ok ioctl(period =3D 0x10 ): Ok ioctl(period =3D 0x11 ): Ok ioctl(period =3D 0x1f ): Ok ioctl(period =3D 0x20 ): Ok ioctl(period =3D 0x80 ): Ok ioctl(period =3D 0x8f ): Ok ioctl(period =3D 0x90 ): Ok ioctl(period =3D 0x91 ): Ok ioctl(period =3D 0x100 ): Ok ioctl(period =3D 0xfff0 ): Ok ioctl(period =3D 0xffff ): Ok ioctl(period =3D 0x10000 ): Ok ioctl(period =3D 0x1fff0 ): Ok ioctl(period =3D 0x1fff5 ): Ok ioctl(freq =3D 0x0 ): Ok ioctl(freq 
=3D 0x1 ): Ok ioctl(freq =3D 0xf ): Ok ioctl(freq =3D 0x10 ): Ok ioctl(freq =3D 0x11 ): Ok ioctl(freq =3D 0x1f ): Ok ioctl(freq =3D 0x20 ): Ok ioctl(freq =3D 0x80 ): Ok ioctl(freq =3D 0x8f ): Ok ioctl(freq =3D 0x90 ): Ok ioctl(freq =3D 0x91 ): Ok ioctl(freq =3D 0x100 ): Ok Op PMU tests ioctl(period =3D 0x0 ): Ok ioctl(period =3D 0x1 ): Ok ioctl(period =3D 0xf ): Ok ioctl(period =3D 0x10 ): Ok ioctl(period =3D 0x11 ): Ok ioctl(period =3D 0x1f ): Ok ioctl(period =3D 0x20 ): Ok ioctl(period =3D 0x80 ): Ok ioctl(period =3D 0x8f ): Ok ioctl(period =3D 0x90 ): Ok ioctl(period =3D 0x91 ): Ok ioctl(period =3D 0x100 ): Ok ioctl(period =3D 0xfff0 ): Ok ioctl(period =3D 0xffff ): Ok ioctl(period =3D 0x10000 ): Ok ioctl(period =3D 0x1fff0 ): Ok ioctl(period =3D 0x1fff5 ): Ok ioctl(freq =3D 0x0 ): Ok ioctl(freq =3D 0x1 ): Ok ioctl(freq =3D 0xf ): Ok ioctl(freq =3D 0x10 ): Ok ioctl(freq =3D 0x11 ): Ok ioctl(freq =3D 0x1f ): Ok ioctl(freq =3D 0x20 ): Ok ioctl(freq =3D 0x80 ): Ok ioctl(freq =3D 0x8f ): Ok ioctl(freq =3D 0x90 ): Ok ioctl(freq =3D 0x91 ): Ok ioctl(freq =3D 0x100 ): Ok IBS freq (negative) tests: -------------------------- freq 1, sample_freq 200000: Ok IBS L3MissOnly test: (takes a while) -------------------- Fetch L3MissOnly: Ok (nr_samples: 1301) Op L3MissOnly: Ok (nr_samples: 1590) ---- end(0) ---- 112: AMD IBS sample period : Ok Signed-off-by: Ravi Bangoria --- tools/perf/arch/x86/include/arch-tests.h | 1 + tools/perf/arch/x86/tests/Build | 1 + tools/perf/arch/x86/tests/amd-ibs-period.c | 1001 ++++++++++++++++++++ tools/perf/arch/x86/tests/arch-tests.c | 2 + 4 files changed, 1005 insertions(+) create mode 100644 tools/perf/arch/x86/tests/amd-ibs-period.c diff --git a/tools/perf/arch/x86/include/arch-tests.h b/tools/perf/arch/x86= /include/arch-tests.h index c0421a26b875..4fd425157d7d 100644 --- a/tools/perf/arch/x86/include/arch-tests.h +++ b/tools/perf/arch/x86/include/arch-tests.h @@ -14,6 +14,7 @@ int test__intel_pt_hybrid_compat(struct test_suite *test,= int 
subtest); int test__bp_modify(struct test_suite *test, int subtest); int test__x86_sample_parsing(struct test_suite *test, int subtest); int test__amd_ibs_via_core_pmu(struct test_suite *test, int subtest); +int test__amd_ibs_period(struct test_suite *test, int subtest); int test__hybrid(struct test_suite *test, int subtest); =20 extern struct test_suite *arch_tests[]; diff --git a/tools/perf/arch/x86/tests/Build b/tools/perf/arch/x86/tests/Bu= ild index 86262c720857..5e00cbfd2d56 100644 --- a/tools/perf/arch/x86/tests/Build +++ b/tools/perf/arch/x86/tests/Build @@ -10,6 +10,7 @@ perf-test-$(CONFIG_AUXTRACE) +=3D insn-x86.o endif perf-test-$(CONFIG_X86_64) +=3D bp-modify.o perf-test-y +=3D amd-ibs-via-core-pmu.o +perf-test-y +=3D amd-ibs-period.o =20 ifdef SHELLCHECK SHELL_TESTS :=3D gen-insn-x86-dat.sh diff --git a/tools/perf/arch/x86/tests/amd-ibs-period.c b/tools/perf/arch/x= 86/tests/amd-ibs-period.c new file mode 100644 index 000000000000..0cf3656e4b9b --- /dev/null +++ b/tools/perf/arch/x86/tests/amd-ibs-period.c @@ -0,0 +1,1001 @@ +// SPDX-License-Identifier: GPL-2.0 +#include +#include +#include +#include +#include + +#include "arch-tests.h" +#include "linux/perf_event.h" +#include "linux/zalloc.h" +#include "tests/tests.h" +#include "../perf-sys.h" +#include "pmu.h" +#include "pmus.h" +#include "debug.h" +#include "util.h" +#include "strbuf.h" +#include "../util/env.h" + +#define PAGE_SIZE sysconf(_SC_PAGESIZE) + +#define PERF_MMAP_DATA_PAGES 32L +#define PERF_MMAP_DATA_SIZE (PERF_MMAP_DATA_PAGES * PAGE_SIZE) +#define PERF_MMAP_DATA_MASK (PERF_MMAP_DATA_SIZE - 1) +#define PERF_MMAP_TOTAL_PAGES (PERF_MMAP_DATA_PAGES + 1) +#define PERF_MMAP_TOTAL_SIZE (PERF_MMAP_TOTAL_PAGES * PAGE_SIZE) + +#define rmb() asm volatile("lfence":::"memory") + +enum { + FD_ERROR, + FD_SUCCESS, +}; + +enum { + IBS_FETCH, + IBS_OP, +}; + +struct perf_pmu *fetch_pmu; +struct perf_pmu *op_pmu; +unsigned int perf_event_max_sample_rate; + +/* Dummy workload to generate IBS samples. 
*/ +static int dummy_workload_1(unsigned long count) +{ + int (*func)(void); + int ret =3D 0; + char *p; + char insn1[] =3D { + 0xb8, 0x01, 0x00, 0x00, 0x00, /* mov 1,%eax */ + 0xc3, /* ret */ + 0xcc, /* int 3 */ + }; + + char insn2[] =3D { + 0xb8, 0x02, 0x00, 0x00, 0x00, /* mov 2,%eax */ + 0xc3, /* ret */ + 0xcc, /* int 3 */ + }; + + p =3D zalloc(2 * PAGE_SIZE); + if (!p) { + printf("malloc() failed. %m"); + return 1; + } + + func =3D (void *)((unsigned long)(p + PAGE_SIZE - 1) & ~(PAGE_SIZE - 1)); + + ret =3D mprotect(func, PAGE_SIZE, PROT_READ | PROT_WRITE | PROT_EXEC); + if (ret) { + printf("mprotect() failed. %m"); + goto out; + } + + if (count < 100000) + count =3D 100000; + else if (count > 10000000) + count =3D 10000000; + while (count--) { + memcpy(func, insn1, sizeof(insn1)); + if (func() !=3D 1) { + pr_debug("ERROR insn1\n"); + ret =3D -1; + goto out; + } + memcpy(func, insn2, sizeof(insn2)); + if (func() !=3D 2) { + pr_debug("ERROR insn2\n"); + ret =3D -1; + goto out; + } + } + +out: + free(p); + return ret; +} + +/* Another dummy workload to generate IBS samples. */ +static void dummy_workload_2(char *perf) +{ + char bench[] =3D " bench sched messaging -g 10 -l 5000 > /dev/null 2>&1"; + char taskset[] =3D "taskset -c 0 "; + int ret __maybe_unused; + struct strbuf sb; + char *cmd; + + strbuf_init(&sb, 0); + strbuf_add(&sb, taskset, strlen(taskset)); + strbuf_add(&sb, perf, strlen(perf)); + strbuf_add(&sb, bench, strlen(bench)); + cmd =3D strbuf_detach(&sb, NULL); + ret =3D system(cmd); + free(cmd); +} + +static int sched_affine(int cpu) +{ + cpu_set_t set; + + CPU_ZERO(&set); + CPU_SET(cpu, &set); + if (sched_setaffinity(getpid(), sizeof(set), &set) =3D=3D -1) { + pr_debug("sched_setaffinity() failed. 
[%m]"); + return -1; + } + return 0; +} + +static void +copy_sample_data(void *src, unsigned long offset, void *dest, size_t size) +{ + size_t chunk1_size, chunk2_size; + + if ((offset + size) < (size_t)PERF_MMAP_DATA_SIZE) { + memcpy(dest, src + offset, size); + } else { + chunk1_size =3D PERF_MMAP_DATA_SIZE - offset; + chunk2_size =3D size - chunk1_size; + + memcpy(dest, src + offset, chunk1_size); + memcpy(dest + chunk1_size, src, chunk2_size); + } +} + +static int rb_read(struct perf_event_mmap_page *rb, void *dest, size_t siz= e) +{ + void *base; + unsigned long data_tail, data_head; + + /* Casting to (void *) is needed. */ + base =3D (void *)rb + PAGE_SIZE; + + data_head =3D rb->data_head; + rmb(); + data_tail =3D rb->data_tail; + + if ((data_head - data_tail) < size) + return -1; + + data_tail &=3D PERF_MMAP_DATA_MASK; + copy_sample_data(base, data_tail, dest, size); + rb->data_tail +=3D size; + return 0; +} + +static void rb_skip(struct perf_event_mmap_page *rb, size_t size) +{ + size_t data_head =3D rb->data_head; + + rmb(); + + if ((rb->data_tail + size) > data_head) + rb->data_tail =3D data_head; + else + rb->data_tail +=3D size; +} + +/* Sample period value taken from perf sample must match with expected val= ue. */ +static int period_equal(unsigned long exp_period, unsigned long act_period) +{ + return exp_period =3D=3D act_period ? 0 : -1; +} + +/* + * Sample period value taken from perf sample must be >=3D minimum sample = period + * supported by IBS HW. + */ +static int period_higher(unsigned long min_period, unsigned long act_perio= d) +{ + return min_period <=3D act_period ? 
0 : -1; +} + +static int rb_drain_samples(struct perf_event_mmap_page *rb, + unsigned long exp_period, + int *nr_samples, + int (*callback)(unsigned long, unsigned long)) +{ + struct perf_event_header hdr; + unsigned long period; + int ret =3D 0; + + /* + * PERF_RECORD_SAMPLE: + * struct { + * struct perf_event_header hdr; + * { u64 period; } && PERF_SAMPLE_PERIOD + * }; + */ + while (1) { + if (rb_read(rb, &hdr, sizeof(hdr))) + return ret; + + if (hdr.type =3D=3D PERF_RECORD_SAMPLE) { + (*nr_samples)++; + period =3D 0; + if (rb_read(rb, &period, sizeof(period))) + pr_debug("rb_read(period) error. [%m]"); + ret |=3D callback(exp_period, period); + } else { + rb_skip(rb, hdr.size - sizeof(hdr)); + } + } + return ret; +} + +static long perf_event_open(struct perf_event_attr *attr, pid_t pid, + int cpu, int group_fd, unsigned long flags) +{ + return syscall(__NR_perf_event_open, attr, pid, cpu, group_fd, flags); +} + +static void fetch_prepare_attr(struct perf_event_attr *attr, + unsigned long long config, int freq, + unsigned long sample_period) +{ + memset(attr, 0, sizeof(struct perf_event_attr)); + + attr->type =3D fetch_pmu->type; + attr->size =3D sizeof(struct perf_event_attr); + attr->config =3D config; + attr->disabled =3D 1; + attr->sample_type =3D PERF_SAMPLE_PERIOD; + attr->freq =3D freq; + attr->sample_period =3D sample_period; /* =3D ->sample_freq */ +} + +static void op_prepare_attr(struct perf_event_attr *attr, + unsigned long config, int freq, + unsigned long sample_period) +{ + memset(attr, 0, sizeof(struct perf_event_attr)); + + attr->type =3D op_pmu->type; + attr->size =3D sizeof(struct perf_event_attr); + attr->config =3D config; + attr->disabled =3D 1; + attr->sample_type =3D PERF_SAMPLE_PERIOD; + attr->freq =3D freq; + attr->sample_period =3D sample_period; /* =3D ->sample_freq */ +} + +struct ibs_configs { + /* Input */ + unsigned long config; + + /* Expected output */ + unsigned long period; + int fd; +}; + +/* + * Somehow first Fetch event with 
sample period =3D 0x10 causes 0 + * samples. So start with large period and decrease it gradually. + */ +struct ibs_configs fetch_configs[] =3D { + { .config =3D 0xffff, .period =3D 0xffff0, .fd =3D FD_SUCCESS }, + { .config =3D 0x1000, .period =3D 0x10000, .fd =3D FD_SUCCESS }, + { .config =3D 0xff, .period =3D 0xff0, .fd =3D FD_SUCCESS }, + { .config =3D 0x1, .period =3D 0x10, .fd =3D FD_SUCCESS }, + { .config =3D 0x0, .period =3D -1, .fd =3D FD_ERROR }, + { .config =3D 0x10000, .period =3D -1, .fd =3D FD_ERROR }, +}; + +struct ibs_configs op_configs[] =3D { + { .config =3D 0x0, .period =3D -1, .fd =3D FD_ERROR }, + { .config =3D 0x1, .period =3D -1, .fd =3D FD_ERROR }, + { .config =3D 0x8, .period =3D -1, .fd =3D FD_ERROR }, + { .config =3D 0x9, .period =3D 0x90, .fd =3D FD_SUCCESS }, + { .config =3D 0xf, .period =3D 0xf0, .fd =3D FD_SUCCESS }, + { .config =3D 0x1000, .period =3D 0x10000, .fd =3D FD_SUCCESS }, + { .config =3D 0xffff, .period =3D 0xffff0, .fd =3D FD_SUCCESS }, + { .config =3D 0x10000, .period =3D -1, .fd =3D FD_ERROR }, + { .config =3D 0x100000, .period =3D 0x100000, .fd =3D FD_SUCCESS }, + { .config =3D 0xf00000, .period =3D 0xf00000, .fd =3D FD_SUCCESS }, + { .config =3D 0xf0ffff, .period =3D 0xfffff0, .fd =3D FD_SUCCESS }, + { .config =3D 0x1f0ffff, .period =3D 0x1fffff0, .fd =3D FD_SUCCESS }, + { .config =3D 0x7f0ffff, .period =3D 0x7fffff0, .fd =3D FD_SUCCESS }, + { .config =3D 0x8f0ffff, .period =3D -1, .fd =3D FD_ERROR }, + { .config =3D 0x17f0ffff, .period =3D -1, .fd =3D FD_ERROR }, +}; + +static int __ibs_config_test(int ibs_type, struct ibs_configs *config, int= *nr_samples) +{ + struct perf_event_attr attr; + int fd, i; + void *rb; + int ret =3D 0; + + if (ibs_type =3D=3D IBS_FETCH) + fetch_prepare_attr(&attr, config->config, 0, 0); + else + op_prepare_attr(&attr, config->config, 0, 0); + + /* CPU0, All processes */ + fd =3D perf_event_open(&attr, -1, 0, -1, 0); + if (config->fd =3D=3D FD_ERROR) { + if (fd !=3D -1) { + close(fd); + 
return -1; + } + return 0; + } + if (fd <=3D -1) + return -1; + + rb =3D mmap(NULL, PERF_MMAP_TOTAL_SIZE, PROT_READ | PROT_WRITE, + MAP_SHARED, fd, 0); + if (rb =3D=3D MAP_FAILED) { + pr_debug("mmap() failed. [%m]\n"); + return -1; + } + + ioctl(fd, PERF_EVENT_IOC_RESET, 0); + ioctl(fd, PERF_EVENT_IOC_ENABLE, 0); + + i =3D 5; + while (i--) { + dummy_workload_1(1000000); + + ret =3D rb_drain_samples(rb, config->period, nr_samples, + period_equal); + if (ret) + break; + } + + ioctl(fd, PERF_EVENT_IOC_DISABLE, 0); + munmap(rb, PERF_MMAP_TOTAL_SIZE); + close(fd); + return ret; +} + +static int ibs_config_test(void) +{ + int nr_samples =3D 0; + unsigned long i; + int ret =3D 0; + int r; + + pr_debug("\nIBS config tests:\n"); + pr_debug("-----------------\n"); + + pr_debug("Fetch PMU tests:\n"); + for (i =3D 0; i < ARRAY_SIZE(fetch_configs); i++) { + nr_samples =3D 0; + r =3D __ibs_config_test(IBS_FETCH, &(fetch_configs[i]), &nr_samples); + + if (fetch_configs[i].fd =3D=3D FD_ERROR) { + pr_debug("0x%-16lx: %-4s\n", fetch_configs[i].config, + !r ? "Ok" : "Fail"); + } else { + /* + * Although nr_samples =3D=3D 0 is reported as Fail here, + * the failure status is not cascaded up because, we + * can not decide whether test really failed or not + * without actual samples. + */ + pr_debug("0x%-16lx: %-4s (nr samples: %d)\n", fetch_configs[i].config, + (!r && nr_samples !=3D 0) ? "Ok" : "Fail", nr_samples); + } + + ret |=3D r; + } + + pr_debug("Op PMU tests:\n"); + for (i =3D 0; i < ARRAY_SIZE(op_configs); i++) { + nr_samples =3D 0; + r =3D __ibs_config_test(IBS_OP, &(op_configs[i]), &nr_samples); + + if (op_configs[i].fd =3D=3D FD_ERROR) { + pr_debug("0x%-16lx: %-4s\n", op_configs[i].config, + !r ? "Ok" : "Fail"); + } else { + /* + * Although nr_samples =3D=3D 0 is reported as Fail here, + * the failure status is not cascaded up because, we + * can not decide whether test really failed or not + * without actual samples. 
+ */ + pr_debug("0x%-16lx: %-4s (nr samples: %d)\n", op_configs[i].config, + (!r && nr_samples !=3D 0) ? "Ok" : "Fail", nr_samples); + } + + ret |=3D r; + } + + return ret; +} + +struct ibs_period { + /* Input */ + int freq; + unsigned long sample_freq; + + /* Output */ + int ret; + unsigned long period; +}; + +struct ibs_period fetch_period[] =3D { + { .freq =3D 0, .sample_freq =3D 0, .ret =3D FD_ERROR, .period = =3D -1 }, + { .freq =3D 0, .sample_freq =3D 1, .ret =3D FD_ERROR, .period = =3D -1 }, + { .freq =3D 0, .sample_freq =3D 0xf, .ret =3D FD_ERROR, .period = =3D -1 }, + { .freq =3D 0, .sample_freq =3D 0x10, .ret =3D FD_SUCCESS, .period = =3D 0x10 }, + { .freq =3D 0, .sample_freq =3D 0x11, .ret =3D FD_SUCCESS, .period = =3D 0x10 }, + { .freq =3D 0, .sample_freq =3D 0x8f, .ret =3D FD_SUCCESS, .period = =3D 0x80 }, + { .freq =3D 0, .sample_freq =3D 0x90, .ret =3D FD_SUCCESS, .period = =3D 0x90 }, + { .freq =3D 0, .sample_freq =3D 0x91, .ret =3D FD_SUCCESS, .period = =3D 0x90 }, + { .freq =3D 0, .sample_freq =3D 0x4d2, .ret =3D FD_SUCCESS, .period = =3D 0x4d0 }, + { .freq =3D 0, .sample_freq =3D 0x1007, .ret =3D FD_SUCCESS, .period = =3D 0x1000 }, + { .freq =3D 0, .sample_freq =3D 0xfff0, .ret =3D FD_SUCCESS, .period = =3D 0xfff0 }, + { .freq =3D 0, .sample_freq =3D 0xffff, .ret =3D FD_SUCCESS, .period = =3D 0xfff0 }, + { .freq =3D 0, .sample_freq =3D 0x10010, .ret =3D FD_SUCCESS, .period = =3D 0x10010 }, + { .freq =3D 0, .sample_freq =3D 0x7fffff, .ret =3D FD_SUCCESS, .period = =3D 0x7ffff0 }, + { .freq =3D 0, .sample_freq =3D 0xfffffff, .ret =3D FD_SUCCESS, .period = =3D 0xffffff0 }, + { .freq =3D 1, .sample_freq =3D 0, .ret =3D FD_ERROR, .period = =3D -1 }, + { .freq =3D 1, .sample_freq =3D 1, .ret =3D FD_SUCCESS, .period = =3D 0x10 }, + { .freq =3D 1, .sample_freq =3D 0xf, .ret =3D FD_SUCCESS, .period = =3D 0x10 }, + { .freq =3D 1, .sample_freq =3D 0x10, .ret =3D FD_SUCCESS, .period = =3D 0x10 }, + { .freq =3D 1, .sample_freq =3D 0x11, .ret =3D FD_SUCCESS, 
.period = =3D 0x10 }, + { .freq =3D 1, .sample_freq =3D 0x8f, .ret =3D FD_SUCCESS, .period = =3D 0x10 }, + { .freq =3D 1, .sample_freq =3D 0x90, .ret =3D FD_SUCCESS, .period = =3D 0x10 }, + { .freq =3D 1, .sample_freq =3D 0x91, .ret =3D FD_SUCCESS, .period = =3D 0x10 }, + { .freq =3D 1, .sample_freq =3D 0x4d2, .ret =3D FD_SUCCESS, .period = =3D 0x10 }, + { .freq =3D 1, .sample_freq =3D 0x1007, .ret =3D FD_SUCCESS, .period = =3D 0x10 }, + { .freq =3D 1, .sample_freq =3D 0xfff0, .ret =3D FD_SUCCESS, .period = =3D 0x10 }, + { .freq =3D 1, .sample_freq =3D 0xffff, .ret =3D FD_SUCCESS, .period = =3D 0x10 }, + { .freq =3D 1, .sample_freq =3D 0x10010, .ret =3D FD_SUCCESS, .period = =3D 0x10 }, + /* ret=3DFD_ERROR because freq > default perf_event_max_sample_rate (1000= 00) */ + { .freq =3D 1, .sample_freq =3D 0x7fffff, .ret =3D FD_ERROR, .period = =3D -1 }, +}; + +struct ibs_period op_period[] =3D { + { .freq =3D 0, .sample_freq =3D 0, .ret =3D FD_ERROR, .period = =3D -1 }, + { .freq =3D 0, .sample_freq =3D 1, .ret =3D FD_ERROR, .period = =3D -1 }, + { .freq =3D 0, .sample_freq =3D 0xf, .ret =3D FD_ERROR, .period = =3D -1 }, + { .freq =3D 0, .sample_freq =3D 0x10, .ret =3D FD_ERROR, .period = =3D -1 }, + { .freq =3D 0, .sample_freq =3D 0x11, .ret =3D FD_ERROR, .period = =3D -1 }, + { .freq =3D 0, .sample_freq =3D 0x8f, .ret =3D FD_ERROR, .period = =3D -1 }, + { .freq =3D 0, .sample_freq =3D 0x90, .ret =3D FD_SUCCESS, .period = =3D 0x90 }, + { .freq =3D 0, .sample_freq =3D 0x91, .ret =3D FD_SUCCESS, .period = =3D 0x90 }, + { .freq =3D 0, .sample_freq =3D 0x4d2, .ret =3D FD_SUCCESS, .period = =3D 0x4d0 }, + { .freq =3D 0, .sample_freq =3D 0x1007, .ret =3D FD_SUCCESS, .period = =3D 0x1000 }, + { .freq =3D 0, .sample_freq =3D 0xfff0, .ret =3D FD_SUCCESS, .period = =3D 0xfff0 }, + { .freq =3D 0, .sample_freq =3D 0xffff, .ret =3D FD_SUCCESS, .period = =3D 0xfff0 }, + { .freq =3D 0, .sample_freq =3D 0x10010, .ret =3D FD_SUCCESS, .period = =3D 0x10010 }, + { .freq =3D 0, 
.sample_freq =3D 0x7fffff, .ret =3D FD_SUCCESS, .period = =3D 0x7ffff0 }, + { .freq =3D 0, .sample_freq =3D 0xfffffff, .ret =3D FD_SUCCESS, .period = =3D 0xffffff0 }, + { .freq =3D 1, .sample_freq =3D 0, .ret =3D FD_ERROR, .period = =3D -1 }, + { .freq =3D 1, .sample_freq =3D 1, .ret =3D FD_SUCCESS, .period = =3D 0x90 }, + { .freq =3D 1, .sample_freq =3D 0xf, .ret =3D FD_SUCCESS, .period = =3D 0x90 }, + { .freq =3D 1, .sample_freq =3D 0x10, .ret =3D FD_SUCCESS, .period = =3D 0x90 }, + { .freq =3D 1, .sample_freq =3D 0x11, .ret =3D FD_SUCCESS, .period = =3D 0x90 }, + { .freq =3D 1, .sample_freq =3D 0x8f, .ret =3D FD_SUCCESS, .period = =3D 0x90 }, + { .freq =3D 1, .sample_freq =3D 0x90, .ret =3D FD_SUCCESS, .period = =3D 0x90 }, + { .freq =3D 1, .sample_freq =3D 0x91, .ret =3D FD_SUCCESS, .period = =3D 0x90 }, + { .freq =3D 1, .sample_freq =3D 0x4d2, .ret =3D FD_SUCCESS, .period = =3D 0x90 }, + { .freq =3D 1, .sample_freq =3D 0x1007, .ret =3D FD_SUCCESS, .period = =3D 0x90 }, + { .freq =3D 1, .sample_freq =3D 0xfff0, .ret =3D FD_SUCCESS, .period = =3D 0x90 }, + { .freq =3D 1, .sample_freq =3D 0xffff, .ret =3D FD_SUCCESS, .period = =3D 0x90 }, + { .freq =3D 1, .sample_freq =3D 0x10010, .ret =3D FD_SUCCESS, .period = =3D 0x90 }, + /* ret=3DFD_ERROR because freq > default perf_event_max_sample_rate (1000= 00) */ + { .freq =3D 1, .sample_freq =3D 0x7fffff, .ret =3D FD_ERROR, .period = =3D -1 }, +}; + +static int __ibs_period_constraint_test(int ibs_type, struct ibs_period *p= eriod, + int *nr_samples) +{ + struct perf_event_attr attr; + int ret =3D 0; + void *rb; + int fd; + + if (period->freq && period->sample_freq > perf_event_max_sample_rate) + period->ret =3D FD_ERROR; + + if (ibs_type =3D=3D IBS_FETCH) + fetch_prepare_attr(&attr, 0, period->freq, period->sample_freq); + else + op_prepare_attr(&attr, 0, period->freq, period->sample_freq); + + /* CPU0, All processes */ + fd =3D perf_event_open(&attr, -1, 0, -1, 0); + if (period->ret =3D=3D FD_ERROR) { + if (fd !=3D 
-1) { + close(fd); + return -1; + } + return 0; + } + if (fd <=3D -1) + return -1; + + rb =3D mmap(NULL, PERF_MMAP_TOTAL_SIZE, PROT_READ | PROT_WRITE, + MAP_SHARED, fd, 0); + if (rb =3D=3D MAP_FAILED) { + pr_debug("mmap() failed. [%m]\n"); + close(fd); + return -1; + } + + ioctl(fd, PERF_EVENT_IOC_RESET, 0); + ioctl(fd, PERF_EVENT_IOC_ENABLE, 0); + + if (period->freq) { + dummy_workload_1(100000); + ret =3D rb_drain_samples(rb, period->period, nr_samples, + period_higher); + } else { + dummy_workload_1(period->sample_freq * 10); + ret =3D rb_drain_samples(rb, period->period, nr_samples, + period_equal); + } + + ioctl(fd, PERF_EVENT_IOC_DISABLE, 0); + munmap(rb, PERF_MMAP_TOTAL_SIZE); + close(fd); + return ret; +} + +static int ibs_period_constraint_test(void) +{ + unsigned long i; + int nr_samples; + int ret =3D 0; + int r; + + pr_debug("\nIBS sample period constraint tests:\n"); + pr_debug("-----------------------------------\n"); + + pr_debug("Fetch PMU test:\n"); + for (i =3D 0; i < ARRAY_SIZE(fetch_period); i++) { + nr_samples =3D 0; + r =3D __ibs_period_constraint_test(IBS_FETCH, &fetch_period[i], + &nr_samples); + + if (fetch_period[i].ret =3D=3D FD_ERROR) { + pr_debug("freq %d, sample_freq %9ld: %-4s\n", + fetch_period[i].freq, fetch_period[i].sample_freq, + !r ? "Ok" : "Fail"); + } else { + /* + * Although nr_samples =3D=3D 0 is reported as Fail here, + * the failure status is not cascaded up because, we + * can not decide whether test really failed or not + * without actual samples. + */ + pr_debug("freq %d, sample_freq %9ld: %-4s (nr samples: %d)\n", + fetch_period[i].freq, fetch_period[i].sample_freq, + (!r && nr_samples !=3D 0) ? 
"Ok" : "Fail", nr_samples); + } + ret |=3D r; + } + + pr_debug("Op PMU test:\n"); + for (i =3D 0; i < ARRAY_SIZE(op_period); i++) { + nr_samples =3D 0; + r =3D __ibs_period_constraint_test(IBS_OP, &op_period[i], + &nr_samples); + + if (op_period[i].ret =3D=3D FD_ERROR) { + pr_debug("freq %d, sample_freq %9ld: %-4s\n", + op_period[i].freq, op_period[i].sample_freq, + !r ? "Ok" : "Fail"); + } else { + /* + * Although nr_samples =3D=3D 0 is reported as Fail here, + * the failure status is not cascaded up because, we + * can not decide whether test really failed or not + * without actual samples. + */ + pr_debug("freq %d, sample_freq %9ld: %-4s (nr samples: %d)\n", + op_period[i].freq, op_period[i].sample_freq, + (!r && nr_samples !=3D 0) ? "Ok" : "Fail", nr_samples); + } + ret |=3D r; + } + + return ret; +} + +struct ibs_ioctl { + /* Input */ + int freq; + unsigned long period; + + /* Expected output */ + int ret; +}; + +struct ibs_ioctl fetch_ioctl[] =3D { + { .freq =3D 0, .period =3D 0x0, .ret =3D FD_ERROR }, + { .freq =3D 0, .period =3D 0x1, .ret =3D FD_ERROR }, + { .freq =3D 0, .period =3D 0xf, .ret =3D FD_ERROR }, + { .freq =3D 0, .period =3D 0x10, .ret =3D FD_SUCCESS }, + { .freq =3D 0, .period =3D 0x11, .ret =3D FD_ERROR }, + { .freq =3D 0, .period =3D 0x1f, .ret =3D FD_ERROR }, + { .freq =3D 0, .period =3D 0x20, .ret =3D FD_SUCCESS }, + { .freq =3D 0, .period =3D 0x80, .ret =3D FD_SUCCESS }, + { .freq =3D 0, .period =3D 0x8f, .ret =3D FD_ERROR }, + { .freq =3D 0, .period =3D 0x90, .ret =3D FD_SUCCESS }, + { .freq =3D 0, .period =3D 0x91, .ret =3D FD_ERROR }, + { .freq =3D 0, .period =3D 0x100, .ret =3D FD_SUCCESS }, + { .freq =3D 0, .period =3D 0xfff0, .ret =3D FD_SUCCESS }, + { .freq =3D 0, .period =3D 0xffff, .ret =3D FD_ERROR }, + { .freq =3D 0, .period =3D 0x10000, .ret =3D FD_SUCCESS }, + { .freq =3D 0, .period =3D 0x1fff0, .ret =3D FD_SUCCESS }, + { .freq =3D 0, .period =3D 0x1fff5, .ret =3D FD_ERROR }, + { .freq =3D 1, .period =3D 0x0, .ret =3D FD_ERROR 
}, + { .freq =3D 1, .period =3D 0x1, .ret =3D FD_SUCCESS }, + { .freq =3D 1, .period =3D 0xf, .ret =3D FD_SUCCESS }, + { .freq =3D 1, .period =3D 0x10, .ret =3D FD_SUCCESS }, + { .freq =3D 1, .period =3D 0x11, .ret =3D FD_SUCCESS }, + { .freq =3D 1, .period =3D 0x1f, .ret =3D FD_SUCCESS }, + { .freq =3D 1, .period =3D 0x20, .ret =3D FD_SUCCESS }, + { .freq =3D 1, .period =3D 0x80, .ret =3D FD_SUCCESS }, + { .freq =3D 1, .period =3D 0x8f, .ret =3D FD_SUCCESS }, + { .freq =3D 1, .period =3D 0x90, .ret =3D FD_SUCCESS }, + { .freq =3D 1, .period =3D 0x91, .ret =3D FD_SUCCESS }, + { .freq =3D 1, .period =3D 0x100, .ret =3D FD_SUCCESS }, +}; + +struct ibs_ioctl op_ioctl[] =3D { + { .freq =3D 0, .period =3D 0x0, .ret =3D FD_ERROR }, + { .freq =3D 0, .period =3D 0x1, .ret =3D FD_ERROR }, + { .freq =3D 0, .period =3D 0xf, .ret =3D FD_ERROR }, + { .freq =3D 0, .period =3D 0x10, .ret =3D FD_ERROR }, + { .freq =3D 0, .period =3D 0x11, .ret =3D FD_ERROR }, + { .freq =3D 0, .period =3D 0x1f, .ret =3D FD_ERROR }, + { .freq =3D 0, .period =3D 0x20, .ret =3D FD_ERROR }, + { .freq =3D 0, .period =3D 0x80, .ret =3D FD_ERROR }, + { .freq =3D 0, .period =3D 0x8f, .ret =3D FD_ERROR }, + { .freq =3D 0, .period =3D 0x90, .ret =3D FD_SUCCESS }, + { .freq =3D 0, .period =3D 0x91, .ret =3D FD_ERROR }, + { .freq =3D 0, .period =3D 0x100, .ret =3D FD_SUCCESS }, + { .freq =3D 0, .period =3D 0xfff0, .ret =3D FD_SUCCESS }, + { .freq =3D 0, .period =3D 0xffff, .ret =3D FD_ERROR }, + { .freq =3D 0, .period =3D 0x10000, .ret =3D FD_SUCCESS }, + { .freq =3D 0, .period =3D 0x1fff0, .ret =3D FD_SUCCESS }, + { .freq =3D 0, .period =3D 0x1fff5, .ret =3D FD_ERROR }, + { .freq =3D 1, .period =3D 0x0, .ret =3D FD_ERROR }, + { .freq =3D 1, .period =3D 0x1, .ret =3D FD_SUCCESS }, + { .freq =3D 1, .period =3D 0xf, .ret =3D FD_SUCCESS }, + { .freq =3D 1, .period =3D 0x10, .ret =3D FD_SUCCESS }, + { .freq =3D 1, .period =3D 0x11, .ret =3D FD_SUCCESS }, + { .freq =3D 1, .period =3D 0x1f, .ret =3D FD_SUCCESS }, + 
{ .freq =3D 1, .period =3D 0x20, .ret =3D FD_SUCCESS }, + { .freq =3D 1, .period =3D 0x80, .ret =3D FD_SUCCESS }, + { .freq =3D 1, .period =3D 0x8f, .ret =3D FD_SUCCESS }, + { .freq =3D 1, .period =3D 0x90, .ret =3D FD_SUCCESS }, + { .freq =3D 1, .period =3D 0x91, .ret =3D FD_SUCCESS }, + { .freq =3D 1, .period =3D 0x100, .ret =3D FD_SUCCESS }, +}; + +static int __ibs_ioctl_test(int ibs_type, struct ibs_ioctl *ibs_ioctl) +{ + struct perf_event_attr attr; + int ret =3D 0; + int fd; + int r; + + if (ibs_type =3D=3D IBS_FETCH) + fetch_prepare_attr(&attr, 0, ibs_ioctl->freq, 1000); + else + op_prepare_attr(&attr, 0, ibs_ioctl->freq, 1000); + + /* CPU0, All processes */ + fd =3D perf_event_open(&attr, -1, 0, -1, 0); + if (fd <=3D -1) { + pr_debug("event_open() Failed\n"); + return -1; + } + + r =3D ioctl(fd, PERF_EVENT_IOC_PERIOD, &ibs_ioctl->period); + if ((ibs_ioctl->ret =3D=3D FD_SUCCESS && r <=3D -1) || + (ibs_ioctl->ret =3D=3D FD_ERROR && r >=3D 0)) { + ret =3D -1; + } + + close(fd); + return ret; +} + +static int ibs_ioctl_test(void) +{ + unsigned long i; + int ret =3D 0; + int r; + + pr_debug("\nIBS ioctl() tests:\n"); + pr_debug("------------------\n"); + + pr_debug("Fetch PMU tests\n"); + for (i =3D 0; i < ARRAY_SIZE(fetch_ioctl); i++) { + r =3D __ibs_ioctl_test(IBS_FETCH, &fetch_ioctl[i]); + + pr_debug("ioctl(%s =3D 0x%-7lx): %s\n", + fetch_ioctl[i].freq ? "freq " : "period", + fetch_ioctl[i].period, r ? "Fail" : "Ok"); + ret |=3D r; + } + + pr_debug("Op PMU tests\n"); + for (i =3D 0; i < ARRAY_SIZE(op_ioctl); i++) { + r =3D __ibs_ioctl_test(IBS_OP, &op_ioctl[i]); + + pr_debug("ioctl(%s =3D 0x%-7lx): %s\n", + op_ioctl[i].freq ? "freq " : "period", + op_ioctl[i].period, r ? 
"Fail" : "Ok"); + ret |=3D r; + } + + return ret; +} + +static int ibs_freq_neg_test(void) +{ + struct perf_event_attr attr; + int fd; + + pr_debug("\nIBS freq (negative) tests:\n"); + pr_debug("--------------------------\n"); + + /* + * Assuming perf_event_max_sample_rate <=3D 100000, + * config: 0x300D40 =3D=3D> MaxCnt: 200000 + */ + op_prepare_attr(&attr, 0x300D40, 1, 0); + + /* CPU0, All processes */ + fd =3D perf_event_open(&attr, -1, 0, -1, 0); + if (fd !=3D -1) { + pr_debug("freq 1, sample_freq 200000: Fail\n"); + close(fd); + return -1; + } + + pr_debug("freq 1, sample_freq 200000: Ok\n"); + + return 0; +} + +struct ibs_l3missonly { + /* Input */ + int freq; + unsigned long sample_freq; + + /* Expected output */ + int ret; + unsigned long min_period; +}; + +struct ibs_l3missonly fetch_l3missonly =3D { + .freq =3D 1, + .sample_freq =3D 10000, + .ret =3D FD_SUCCESS, + .min_period =3D 0x10, +}; + +struct ibs_l3missonly op_l3missonly =3D { + .freq =3D 1, + .sample_freq =3D 10000, + .ret =3D FD_SUCCESS, + .min_period =3D 0x90, +}; + +static int __ibs_l3missonly_test(char *perf, int ibs_type, int *nr_samples, + struct ibs_l3missonly *l3missonly) +{ + struct perf_event_attr attr; + int ret =3D 0; + void *rb; + int fd; + + if (l3missonly->sample_freq > perf_event_max_sample_rate) + l3missonly->ret =3D FD_ERROR; + + if (ibs_type =3D=3D IBS_FETCH) { + fetch_prepare_attr(&attr, 0x800000000000000UL, l3missonly->freq, + l3missonly->sample_freq); + } else { + op_prepare_attr(&attr, 0x10000, l3missonly->freq, + l3missonly->sample_freq); + } + + /* CPU0, All processes */ + fd =3D perf_event_open(&attr, -1, 0, -1, 0); + if (l3missonly->ret =3D=3D FD_ERROR) { + if (fd !=3D -1) { + close(fd); + return -1; + } + return 0; + } + if (fd =3D=3D -1) { + pr_debug("perf_event_open() failed. [%m]\n"); + return -1; + } + + rb =3D mmap(NULL, PERF_MMAP_TOTAL_SIZE, PROT_READ | PROT_WRITE, + MAP_SHARED, fd, 0); + if (rb =3D=3D MAP_FAILED) { + pr_debug("mmap() failed. 
[%m]\n"); + close(fd); + return -1; + } + + ioctl(fd, PERF_EVENT_IOC_RESET, 0); + ioctl(fd, PERF_EVENT_IOC_ENABLE, 0); + + dummy_workload_2(perf); + + ioctl(fd, PERF_EVENT_IOC_DISABLE, 0); + + ret =3D rb_drain_samples(rb, l3missonly->min_period, nr_samples, period_h= igher); + + munmap(rb, PERF_MMAP_TOTAL_SIZE); + close(fd); + return ret; +} + +static int ibs_l3missonly_test(char *perf) +{ + int nr_samples =3D 0; + int ret =3D 0; + int r =3D 0; + + pr_debug("\nIBS L3MissOnly test: (takes a while)\n"); + pr_debug("--------------------\n"); + + if (perf_pmu__has_format(fetch_pmu, "l3missonly")) { + nr_samples =3D 0; + r =3D __ibs_l3missonly_test(perf, IBS_FETCH, &nr_samples, &fetch_l3misso= nly); + if (fetch_l3missonly.ret =3D=3D FD_ERROR) { + pr_debug("Fetch L3MissOnly: %-4s\n", !r ? "Ok" : "Fail"); + } else { + /* + * Although nr_samples =3D=3D 0 is reported as Fail here, + * the failure status is not cascaded up because, we + * can not decide whether test really failed or not + * without actual samples. + */ + pr_debug("Fetch L3MissOnly: %-4s (nr_samples: %d)\n", + (!r && nr_samples !=3D 0) ? "Ok" : "Fail", nr_samples); + } + ret |=3D r; + } + + if (perf_pmu__has_format(op_pmu, "l3missonly")) { + nr_samples =3D 0; + r =3D __ibs_l3missonly_test(perf, IBS_OP, &nr_samples, &op_l3missonly); + if (op_l3missonly.ret =3D=3D FD_ERROR) { + pr_debug("Op L3MissOnly: %-4s\n", !r ? "Ok" : "Fail"); + } else { + /* + * Although nr_samples =3D=3D 0 is reported as Fail here, + * the failure status is not cascaded up because, we + * can not decide whether test really failed or not + * without actual samples. + */ + pr_debug("Op L3MissOnly: %-4s (nr_samples: %d)\n", + (!r && nr_samples !=3D 0) ? 
"Ok" : "Fail", nr_samples); + } + ret |=3D r; + } + + return ret; +} + +static unsigned int get_perf_event_max_sample_rate(void) +{ + unsigned int max_sample_rate =3D 100000; + FILE *fp; + int ret; + + fp =3D fopen("/proc/sys/kernel/perf_event_max_sample_rate", "r"); + if (!fp) { + pr_debug("Can't open perf_event_max_sample_rate. Assuming %u\n", + max_sample_rate); + goto out; + } + + ret =3D fscanf(fp, "%u", &max_sample_rate); + if (ret !=3D 1) { + pr_debug("Can't read perf_event_max_sample_rate. Assuming 100000\n"); + max_sample_rate =3D 100000; + } + fclose(fp); + +out: + return max_sample_rate; +} + +int test__amd_ibs_period(struct test_suite *test __maybe_unused, + int subtest __maybe_unused) +{ + char perf[PATH_MAX] =3D {'\0'}; + int ret =3D TEST_OK; + + /* + * Reading perf_event_max_sample_rate only once _might_ cause some + * of the tests to fail if the kernel changes it after reading it here. + */ + perf_event_max_sample_rate =3D get_perf_event_max_sample_rate(); + fetch_pmu =3D perf_pmus__find("ibs_fetch"); + op_pmu =3D perf_pmus__find("ibs_op"); + + if (!x86__is_amd_cpu() || !fetch_pmu || !op_pmu) + return TEST_SKIP; + + perf_exe(perf, sizeof(perf)); + + if (sched_affine(0)) + return TEST_FAIL; + + /* + * Perf event can be opened in two modes: + * 1 Freq mode + * perf_event_attr->freq =3D 1, ->sample_freq =3D + * 2 Sample period mode + * perf_event_attr->freq =3D 0, ->sample_period =3D + * + * Instead of using the above interface, an IBS event in 'sample period + * mode' can also be opened by passing the value directly in the MaxCnt + * bitfields of perf_event_attr->config. Test this IBS specific special + * interface. + */ + if (ibs_config_test()) + ret =3D TEST_FAIL; + + /* + * IBS Fetch and Op PMUs have HW constraints on minimum sample period. + * Also, sample period value must be a multiple of 0x10. Test that IBS + * driver honors HW constraints for various possible values in Freq as + * well as Sample Period mode IBS events. 
+ */ + if (ibs_period_constraint_test()) + ret =3D TEST_FAIL; + + /* + * Test ioctl() with various sample period values for IBS event. + */ + if (ibs_ioctl_test()) + ret =3D TEST_FAIL; + + /* + * Test that opening of freq mode IBS event fails when the freq value + * is passed through ->config, not explicitly in ->sample_freq. Also + * use a high freq value (beyond perf_event_max_sample_rate) to test that + * the IBS driver does not bypass perf_event_max_sample_rate checks. + */ + if (ibs_freq_neg_test()) + ret =3D TEST_FAIL; + + /* + * L3MissOnly is a post-processing filter, i.e. IBS HW checks for L3 + * Miss at the completion of the tagged uOp. The sample is discarded + * if the tagged uOp did not cause L3Miss. Also, IBS HW internally + * resets CurCnt to a small pseudo-random value and resumes counting. + * A new uOp is tagged once CurCnt reaches MaxCnt. The process + * repeats until the tagged uOp causes an L3 Miss. + * + * With the freq mode event, the next sample period is calculated by + * the generic kernel code on every sample to achieve the desired freq. + * + * Since the number of times HW internally resets CurCnt and the + * pseudo-random value of CurCnt for each of those resets are not known + * to SW, the sample period adjustment by kernel goes for a toss for freq + * mode IBS events. Kernel will set very small period for the next sample + * if the window between current sample and prev sample is too high due + * to multiple samples being discarded internally by IBS HW. + * + * Test that IBS sample period constraints are honored when L3MissOnly + * is ON. 
+ */ + if (ibs_l3missonly_test(perf)) + ret =3D TEST_FAIL; + + return ret; +} diff --git a/tools/perf/arch/x86/tests/arch-tests.c b/tools/perf/arch/x86/tests/arch-tests.c index a216a5d172ed..bfee2432515b 100644 --- a/tools/perf/arch/x86/tests/arch-tests.c +++ b/tools/perf/arch/x86/tests/arch-tests.c @@ -25,6 +25,7 @@ DEFINE_SUITE("x86 bp modify", bp_modify); #endif DEFINE_SUITE("x86 Sample parsing", x86_sample_parsing); DEFINE_SUITE("AMD IBS via core pmu", amd_ibs_via_core_pmu); +DEFINE_SUITE_EXCLUSIVE("AMD IBS sample period", amd_ibs_period); static struct test_case hybrid_tests[] =3D { TEST_CASE_REASON("x86 hybrid event parsing", hybrid, "not hybrid"), { .name =3D NULL, } @@ -50,6 +51,7 @@ struct test_suite *arch_tests[] =3D { #endif &suite__x86_sample_parsing, &suite__amd_ibs_via_core_pmu, + &suite__amd_ibs_period, &suite__hybrid, NULL, }; --=20 2.43.0
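
For readers of the expectation tables in amd-ibs-period.c, the rules they encode can be summarized with a small sketch. This is not part of the patch and not the IBS driver's code: the helper names ibs_expected_period() and ibs_config_period() are hypothetical, and the bit layout is only inferred from the fetch_configs[], op_configs[], fetch_period[] and op_period[] entries (minimum period 0x10 for Fetch and 0x90 for Op, low 4 bits dropped at perf_event_open() time, config bits 0-15 holding period bits 4-19, and Op config bits 20-26 holding period bits 20-26). The ioctl() tables behave differently (an unaligned period is rejected there rather than rounded), so the sketch does not cover them.

```c
#include <stdint.h>

/* Minimum periods, taken from the expected values in the test tables. */
#define IBS_FETCH_MIN_PERIOD 0x10ULL
#define IBS_OP_MIN_PERIOD    0x90ULL

/*
 * Hypothetical helper: expected period for a sample-period-mode event
 * opened with ->sample_period = requested. Returns 0 where the tables
 * expect perf_event_open() to fail.
 */
static uint64_t ibs_expected_period(uint64_t requested, uint64_t min_period)
{
	if (requested < min_period)
		return 0;
	return requested & ~0xfULL;	/* low 4 bits are dropped */
}

/*
 * Hypothetical helper: period implied by a MaxCnt value passed through
 * ->config. Bits 0-15 hold period bits 4-19. For the Op PMU the tables
 * also treat config bits 20-26 as period bits 20-26 (an assumption that
 * only holds on hardware with the extended Op counter).
 */
static uint64_t ibs_config_period(uint64_t config, int is_op)
{
	uint64_t period = (config & 0xffffULL) << 4;

	if (is_op)
		period |= config & 0x7f00000ULL;
	return period;
}
```

For example, ibs_expected_period(0x4d2, IBS_FETCH_MIN_PERIOD) gives 0x4d0, matching the fetch_period[] entry for sample_freq 0x4d2, and ibs_config_period(0xf0ffff, 1) gives 0xfffff0, matching op_configs[].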