From: Dapeng Mi
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin, Kan Liang, Andi Kleen, Eranian Stephane
Cc: linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org, Dapeng Mi, Dapeng Mi
Subject: [Patch v4 01/13] perf/x86/intel: Replace x86_pmu.drain_pebs calling with static call
Date: Fri, 20 Jun 2025 10:38:57 +0000
Message-ID: <20250620103909.1586595-2-dapeng1.mi@linux.intel.com>
In-Reply-To: <20250620103909.1586595-1-dapeng1.mi@linux.intel.com>
References: <20250620103909.1586595-1-dapeng1.mi@linux.intel.com>

Use the x86_pmu_drain_pebs static call instead of calling the
x86_pmu.drain_pebs function pointer.
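For background, the kernel's static-call infrastructure patches the call site into a direct call when the target is updated, avoiding the indirect branch (and its retpoline cost) of a function-pointer call. The following is only a userspace sketch of the dispatch shape; the names (`drain_pebs_call`, `drain_pebs_target`, the stub types) are invented for illustration and are not the kernel's `DEFINE_STATIC_CALL()`/`static_call()` API, which rewrites the call instruction itself:

```c
#include <assert.h>
#include <stddef.h>

/* opaque stand-ins for the kernel types */
struct pt_regs;
struct perf_sample_data { int nr_drained; };

/* one possible drain implementation */
static void drain_pebs_nhm(struct pt_regs *regs, struct perf_sample_data *d)
{
	(void)regs;
	d->nr_drained++;
}

/* single updatable target, chosen once at PMU init time */
static void (*drain_pebs_target)(struct pt_regs *, struct perf_sample_data *) =
	drain_pebs_nhm;

/* stand-in for static_call(x86_pmu_drain_pebs)(regs, &data) */
static void drain_pebs_call(struct pt_regs *regs, struct perf_sample_data *d)
{
	drain_pebs_target(regs, d);
}
```

The point of the kernel mechanism is that, unlike this sketch, the resolved call involves no load-and-branch through a pointer at all once the target is installed.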
Suggested-by: Peter Zijlstra
Signed-off-by: Dapeng Mi
---
 arch/x86/events/intel/core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 741b229f0718..fb6e5c2251a2 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -3272,7 +3272,7 @@ static int handle_pmi_common(struct pt_regs *regs, u64 status)
		 * The PEBS buffer has to be drained before handling the A-PMI
		 */
		if (is_pebs_counter_event_group(event))
-			x86_pmu.drain_pebs(regs, &data);
+			static_call(x86_pmu_drain_pebs)(regs, &data);

		last_period = event->hw.last_period;

-- 
2.43.0

From: Dapeng Mi
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin, Kan Liang, Andi Kleen, Eranian Stephane
Cc: linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org, Dapeng Mi, Dapeng Mi
Subject: [Patch v4 02/13] perf/x86/intel: Correct large PEBS flag check
Date: Fri, 20 Jun 2025 10:38:58 +0000
Message-ID: <20250620103909.1586595-3-dapeng1.mi@linux.intel.com>
In-Reply-To: <20250620103909.1586595-1-dapeng1.mi@linux.intel.com>
References: <20250620103909.1586595-1-dapeng1.mi@linux.intel.com>

The current large-PEBS flag check only verifies that sample_regs_user
contains no unsupported GPRs; it does not apply the same check to
sample_regs_intr. Current PEBS hardware can sample all perf-supported
GPRs, so the missing check causes no real issue today, but this will no
longer hold once arch-PEBS supports SSP register sampling. Check the two
register sets independently and clear only the corresponding flag.

Fixes: a47ba4d77e12 ("perf/x86: Enable free running PEBS for REGS_USER/INTR")
Signed-off-by: Dapeng Mi
---
 arch/x86/events/intel/core.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index fb6e5c2251a2..80c45c92d0da 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -4032,7 +4032,9 @@ static unsigned long intel_pmu_large_pebs_flags(struct perf_event *event)
	if (!event->attr.exclude_kernel)
		flags &= ~PERF_SAMPLE_REGS_USER;
	if (event->attr.sample_regs_user & ~PEBS_GP_REGS)
-		flags &= ~(PERF_SAMPLE_REGS_USER | PERF_SAMPLE_REGS_INTR);
+		flags &= ~PERF_SAMPLE_REGS_USER;
+	if (event->attr.sample_regs_intr & ~PEBS_GP_REGS)
+		flags &= ~PERF_SAMPLE_REGS_INTR;
	return flags;
 }

-- 
2.43.0
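The shape of the corrected check can be sketched in plain C. The flag and register-mask values below are made up for illustration; the kernel's `PEBS_GP_REGS` and `PERF_SAMPLE_*` constants differ:

```c
#include <assert.h>
#include <stdint.h>

#define SAMPLE_REGS_USER  (1u << 0)
#define SAMPLE_REGS_INTR  (1u << 1)
#define GP_REGS_MASK      0x00ffu	/* pretend bits 0..7 are PEBS-capable GPRs */

static unsigned int large_pebs_flags(uint64_t regs_user, uint64_t regs_intr)
{
	unsigned int flags = SAMPLE_REGS_USER | SAMPLE_REGS_INTR;

	/* before the fix, an unsupported user GPR cleared BOTH flags */
	if (regs_user & ~(uint64_t)GP_REGS_MASK)
		flags &= ~SAMPLE_REGS_USER;
	/* the previously missing symmetric check for the intr register set */
	if (regs_intr & ~(uint64_t)GP_REGS_MASK)
		flags &= ~SAMPLE_REGS_INTR;
	return flags;
}
```

An unsupported register in one set now disables large PEBS only for that set's sampling flag instead of both.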
From: Dapeng Mi
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin, Kan Liang, Andi Kleen, Eranian Stephane
Cc: linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org, Dapeng Mi, Dapeng Mi
Subject: [Patch v4 03/13] perf/x86/intel: Initialize architectural PEBS
Date: Fri, 20 Jun 2025 10:38:59 +0000
Message-ID: <20250620103909.1586595-4-dapeng1.mi@linux.intel.com>
In-Reply-To: <20250620103909.1586595-1-dapeng1.mi@linux.intel.com>
References: <20250620103909.1586595-1-dapeng1.mi@linux.intel.com>

arch-PEBS uses the CPUID.23H.4/5 sub-leaves to enumerate the supported
arch-PEBS capabilities and counter bitmaps. Parse these two sub-leaves
and initialize the arch-PEBS capabilities and the corresponding
structures.

Since the IA32_PEBS_ENABLE and MSR_PEBS_DATA_CFG MSRs no longer exist
for arch-PEBS, there is no need to manipulate them. Add a simple pair of
__intel_pmu_pebs_enable()/__intel_pmu_pebs_disable() callbacks for
arch-PEBS instead.
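The mask arithmetic this patch introduces folds two 32-bit CPUID outputs into one 64-bit counter mask: general-purpose counters occupy the low bits, fixed counters are shifted up by INTEL_PMC_IDX_FIXED (32). A minimal sketch, with invented CPUID values (the helper names here are illustrative, not the kernel's):

```c
#include <assert.h>
#include <stdint.h>

#define INTEL_PMC_IDX_FIXED 32

/* fold GP-counter bits (low) and fixed-counter bits (high) into one mask */
static uint64_t combine_mask(uint32_t gp, uint32_t fixed)
{
	return ((uint64_t)fixed << INTEL_PMC_IDX_FIXED) | gp;
}

/*
 * Mirrors the patch's sanity check: the PEBS-capable and pdist-capable
 * counters must be a subset of the counters that actually exist,
 * i.e. WARN_ON((pebs_mask | pdists_mask) & ~cntrs_mask) must not fire.
 */
static int pebs_mask_sane(uint64_t pebs_mask, uint64_t pdists_mask,
			  uint64_t cntrs_mask)
{
	return ((pebs_mask | pdists_mask) & ~cntrs_mask) == 0;
}
```

If the check fails, the patch clears x86_pmu.arch_pebs and falls back to treating arch-PEBS as unsupported.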
Signed-off-by: Dapeng Mi
---
 arch/x86/events/core.c            | 21 +++++++++---
 arch/x86/events/intel/core.c      | 55 +++++++++++++++++++++++--------
 arch/x86/events/intel/ds.c        | 52 ++++++++++++++++++++++++-----
 arch/x86/events/perf_event.h      | 25 ++++++++++++--
 arch/x86/include/asm/perf_event.h |  7 +++-
 5 files changed, 129 insertions(+), 31 deletions(-)

diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index 7610f26dfbd9..f30c423e4bd2 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -554,14 +554,22 @@ static inline int precise_br_compat(struct perf_event *event)
	return m == b;
 }

-int x86_pmu_max_precise(void)
+int x86_pmu_max_precise(struct pmu *pmu)
 {
	int precise = 0;

-	/* Support for constant skid */
	if (x86_pmu.pebs_active && !x86_pmu.pebs_broken) {
-		precise++;
+		/* arch PEBS */
+		if (x86_pmu.arch_pebs) {
+			precise = 2;
+			if (hybrid(pmu, arch_pebs_cap).pdists)
+				precise++;
+
+			return precise;
+		}

+		/* legacy PEBS - support for constant skid */
+		precise++;
		/* Support for IP fixup */
		if (x86_pmu.lbr_nr || x86_pmu.intel_cap.pebs_format >= 2)
			precise++;
@@ -569,13 +577,14 @@ int x86_pmu_max_precise(void)
		if (x86_pmu.pebs_prec_dist)
			precise++;
	}
+
	return precise;
 }

 int x86_pmu_hw_config(struct perf_event *event)
 {
	if (event->attr.precise_ip) {
-		int precise = x86_pmu_max_precise();
+		int precise = x86_pmu_max_precise(event->pmu);

		if (event->attr.precise_ip > precise)
			return -EOPNOTSUPP;
@@ -2627,7 +2636,9 @@ static ssize_t max_precise_show(struct device *cdev,
				struct device_attribute *attr,
				char *buf)
 {
-	return snprintf(buf, PAGE_SIZE, "%d\n", x86_pmu_max_precise());
+	struct pmu *pmu = dev_get_drvdata(cdev);
+
+	return snprintf(buf, PAGE_SIZE, "%d\n", x86_pmu_max_precise(pmu));
 }

 static DEVICE_ATTR_RO(max_precise);
diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 80c45c92d0da..faf4ab91fa4b 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -5276,34 +5276,58 @@ static inline bool intel_pmu_broken_perf_cap(void)

 static void update_pmu_cap(struct pmu *pmu)
 {
-	unsigned int cntr, fixed_cntr, ecx, edx;
-	union cpuid35_eax eax;
-	union cpuid35_ebx ebx;
+	unsigned int eax, ebx, ecx, edx;
+	union cpuid35_eax eax_0;
+	union cpuid35_ebx ebx_0;
+	u64 cntrs_mask = 0;
+	u64 pebs_mask = 0;
+	u64 pdists_mask = 0;

-	cpuid(ARCH_PERFMON_EXT_LEAF, &eax.full, &ebx.full, &ecx, &edx);
+	cpuid(ARCH_PERFMON_EXT_LEAF, &eax_0.full, &ebx_0.full, &ecx, &edx);

-	if (ebx.split.umask2)
+	if (ebx_0.split.umask2)
		hybrid(pmu, config_mask) |= ARCH_PERFMON_EVENTSEL_UMASK2;
-	if (ebx.split.eq)
+	if (ebx_0.split.eq)
		hybrid(pmu, config_mask) |= ARCH_PERFMON_EVENTSEL_EQ;

-	if (eax.split.cntr_subleaf) {
+	if (eax_0.split.cntr_subleaf) {
		cpuid_count(ARCH_PERFMON_EXT_LEAF, ARCH_PERFMON_NUM_COUNTER_LEAF,
-			    &cntr, &fixed_cntr, &ecx, &edx);
-		hybrid(pmu, cntr_mask64) = cntr;
-		hybrid(pmu, fixed_cntr_mask64) = fixed_cntr;
+			    &eax, &ebx, &ecx, &edx);
+		hybrid(pmu, cntr_mask64) = eax;
+		hybrid(pmu, fixed_cntr_mask64) = ebx;
+		cntrs_mask = (u64)ebx << INTEL_PMC_IDX_FIXED | eax;
	}

-	if (eax.split.acr_subleaf) {
+	if (eax_0.split.acr_subleaf) {
		cpuid_count(ARCH_PERFMON_EXT_LEAF, ARCH_PERFMON_ACR_LEAF,
-			    &cntr, &fixed_cntr, &ecx, &edx);
+			    &eax, &ebx, &ecx, &edx);
		/* The mask of the counters which can be reloaded */
-		hybrid(pmu, acr_cntr_mask64) = cntr | ((u64)fixed_cntr << INTEL_PMC_IDX_FIXED);
+		hybrid(pmu, acr_cntr_mask64) = eax | ((u64)ebx << INTEL_PMC_IDX_FIXED);

		/* The mask of the counters which can cause a reload of reloadable counters */
		hybrid(pmu, acr_cause_mask64) = ecx | ((u64)edx << INTEL_PMC_IDX_FIXED);
	}

+	/* Bits[5:4] should be set simultaneously if arch-PEBS is supported */
+	if (eax_0.split.pebs_caps_subleaf && eax_0.split.pebs_cnts_subleaf) {
+		cpuid_count(ARCH_PERFMON_EXT_LEAF, ARCH_PERFMON_PEBS_CAP_LEAF,
+			    &eax, &ebx, &ecx, &edx);
+		hybrid(pmu, arch_pebs_cap).caps = (u64)ebx << 32;
+
+		cpuid_count(ARCH_PERFMON_EXT_LEAF, ARCH_PERFMON_PEBS_COUNTER_LEAF,
+			    &eax, &ebx, &ecx, &edx);
+		pebs_mask = ((u64)ecx << INTEL_PMC_IDX_FIXED) | eax;
+		pdists_mask = ((u64)edx << INTEL_PMC_IDX_FIXED) | ebx;
+		hybrid(pmu, arch_pebs_cap).counters = pebs_mask;
+		hybrid(pmu, arch_pebs_cap).pdists = pdists_mask;
+
+		if (WARN_ON((pebs_mask | pdists_mask) & ~cntrs_mask))
+			x86_pmu.arch_pebs = 0;
+	} else {
+		WARN_ON(x86_pmu.arch_pebs == 1);
+		x86_pmu.arch_pebs = 0;
+	}
+
	if (!intel_pmu_broken_perf_cap()) {
		/* Perf Metric (Bit 15) and PEBS via PT (Bit 16) are hybrid enumeration */
		rdmsrq(MSR_IA32_PERF_CAPABILITIES, hybrid(pmu, intel_cap).capabilities);
@@ -6255,7 +6279,7 @@ tsx_is_visible(struct kobject *kobj, struct attribute *attr, int i)
 static umode_t
 pebs_is_visible(struct kobject *kobj, struct attribute *attr, int i)
 {
-	return x86_pmu.ds_pebs ? attr->mode : 0;
+	return intel_pmu_has_pebs() ? attr->mode : 0;
 }

 static umode_t
@@ -7731,6 +7755,9 @@ __init int intel_pmu_init(void)
	if (!is_hybrid() && boot_cpu_has(X86_FEATURE_ARCH_PERFMON_EXT))
		update_pmu_cap(NULL);

+	if (x86_pmu.arch_pebs)
+		pr_cont("Architectural PEBS, ");
+
	intel_pmu_check_counters_mask(&x86_pmu.cntr_mask64,
				      &x86_pmu.fixed_cntr_mask64,
				      &x86_pmu.intel_ctrl);
diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index c0b7ac1c7594..26e485eca0a0 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -1531,6 +1531,15 @@ static inline void intel_pmu_drain_large_pebs(struct cpu_hw_events *cpuc)
		intel_pmu_drain_pebs_buffer();
 }

+static void __intel_pmu_pebs_enable(struct perf_event *event)
+{
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
+	struct hw_perf_event *hwc = &event->hw;
+
+	hwc->config &= ~ARCH_PERFMON_EVENTSEL_INT;
+	cpuc->pebs_enabled |= 1ULL << hwc->idx;
+}
+
 void intel_pmu_pebs_enable(struct perf_event *event)
 {
	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
@@ -1539,9 +1548,7 @@ void intel_pmu_pebs_enable(struct perf_event *event)
	struct debug_store *ds = cpuc->ds;
	unsigned int idx = hwc->idx;

-	hwc->config &= ~ARCH_PERFMON_EVENTSEL_INT;
-
-	cpuc->pebs_enabled |= 1ULL << hwc->idx;
+	__intel_pmu_pebs_enable(event);

	if ((event->hw.flags & PERF_X86_EVENT_PEBS_LDLAT) && (x86_pmu.version < 5))
		cpuc->pebs_enabled |= 1ULL << (hwc->idx + 32);
@@ -1603,14 +1610,22 @@ void intel_pmu_pebs_del(struct perf_event *event)
	pebs_update_state(needed_cb, cpuc, event, false);
 }

-void intel_pmu_pebs_disable(struct perf_event *event)
+static void __intel_pmu_pebs_disable(struct perf_event *event)
 {
	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
	struct hw_perf_event *hwc = &event->hw;

	intel_pmu_drain_large_pebs(cpuc);
-
	cpuc->pebs_enabled &= ~(1ULL << hwc->idx);
+	hwc->config |= ARCH_PERFMON_EVENTSEL_INT;
+}
+
+void intel_pmu_pebs_disable(struct perf_event *event)
+{
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
+	struct hw_perf_event *hwc = &event->hw;
+
+	__intel_pmu_pebs_disable(event);

	if ((event->hw.flags & PERF_X86_EVENT_PEBS_LDLAT) && (x86_pmu.version < 5))
@@ -1622,8 +1637,6 @@ void intel_pmu_pebs_disable(struct perf_event *event)

	if (cpuc->enabled)
		wrmsrq(MSR_IA32_PEBS_ENABLE, cpuc->pebs_enabled);
-
-	hwc->config |= ARCH_PERFMON_EVENTSEL_INT;
 }

 void intel_pmu_pebs_enable_all(void)
@@ -2669,11 +2682,26 @@ static void intel_pmu_drain_pebs_icl(struct pt_regs *iregs, struct perf_sample_d
	}
 }

+static void __init intel_arch_pebs_init(void)
+{
+	/*
+	 * Current hybrid platforms always both support arch-PEBS or not
+	 * on all kinds of cores. So directly set x86_pmu.arch_pebs flag
+	 * if boot cpu supports arch-PEBS.
+	 */
+	x86_pmu.arch_pebs = 1;
+	x86_pmu.pebs_buffer_size = PEBS_BUFFER_SIZE;
+	x86_pmu.pebs_capable = ~0ULL;
+
+	x86_pmu.pebs_enable = __intel_pmu_pebs_enable;
+	x86_pmu.pebs_disable = __intel_pmu_pebs_disable;
+}
+
 /*
  * PEBS probe and setup
  */

-void __init intel_pebs_init(void)
+static void __init intel_ds_pebs_init(void)
 {
	/*
	 * No support for 32bit formats
@@ -2788,6 +2816,14 @@ void __init intel_pebs_init(void)
	}
 }

+void __init intel_pebs_init(void)
+{
+	if (x86_pmu.intel_cap.pebs_format == 0xf)
+		intel_arch_pebs_init();
+	else
+		intel_ds_pebs_init();
+}
+
 void perf_restore_debug_store(void)
 {
	struct debug_store *ds = __this_cpu_read(cpu_hw_events.ds);
diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index 2b969386dcdd..a5145e8f1ddb 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -708,6 +708,12 @@ enum hybrid_pmu_type {
	hybrid_big_small_tiny = hybrid_big | hybrid_small_tiny,
 };

+struct arch_pebs_cap {
+	u64 caps;
+	u64 counters;
+	u64 pdists;
+};
+
 struct x86_hybrid_pmu {
	struct pmu			pmu;
	const char			*name;
@@ -752,6 +758,8 @@ struct x86_hybrid_pmu {
					mid_ack		:1,
					enabled_ack	:1;

+	struct arch_pebs_cap		arch_pebs_cap;
+
	u64				pebs_data_source[PERF_PEBS_DATA_SOURCE_MAX];
 };

@@ -906,7 +914,7 @@ struct x86_pmu {
	union perf_capabilities intel_cap;

	/*
-	 * Intel DebugStore bits
+	 * Intel DebugStore and PEBS bits
	 */
	unsigned int	bts			:1,
			bts_active		:1,
@@ -917,7 +925,8 @@ struct x86_pmu {
			pebs_no_tlb		:1,
			pebs_no_isolation	:1,
			pebs_block		:1,
-			pebs_ept		:1;
+			pebs_ept		:1,
+			arch_pebs		:1;
	int		pebs_record_size;
	int		pebs_buffer_size;
	u64		pebs_events_mask;
@@ -929,6 +938,11 @@ struct x86_pmu {
	u64		rtm_abort_event;
	u64		pebs_capable;

+	/*
+	 * Intel Architectural PEBS
+	 */
+	struct arch_pebs_cap	arch_pebs_cap;
+
	/*
	 * Intel LBR
	 */
@@ -1217,7 +1231,7 @@ int x86_reserve_hardware(void);

 void x86_release_hardware(void);

-int x86_pmu_max_precise(void);
+int x86_pmu_max_precise(struct pmu *pmu);

 void hw_perf_lbr_event_destroy(struct perf_event *event);

@@ -1792,6 +1806,11 @@ static inline int intel_pmu_max_num_pebs(struct pmu *pmu)
	return fls((u32)hybrid(pmu, pebs_events_mask));
 }

+static inline bool intel_pmu_has_pebs(void)
+{
+	return x86_pmu.ds_pebs || x86_pmu.arch_pebs;
+}
+
 #else /* CONFIG_CPU_SUP_INTEL */

 static inline void reserve_ds_buffers(void)
diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h
index 70d1d94aca7e..7fca9494aae9 100644
--- a/arch/x86/include/asm/perf_event.h
+++ b/arch/x86/include/asm/perf_event.h
@@ -196,6 +196,8 @@ union cpuid10_edx {
 #define ARCH_PERFMON_EXT_LEAF			0x00000023
 #define ARCH_PERFMON_NUM_COUNTER_LEAF		0x1
 #define ARCH_PERFMON_ACR_LEAF			0x2
+#define ARCH_PERFMON_PEBS_CAP_LEAF		0x4
+#define ARCH_PERFMON_PEBS_COUNTER_LEAF		0x5

 union cpuid35_eax {
	struct {
@@ -206,7 +208,10 @@ union cpuid35_eax {
		unsigned int	acr_subleaf:1;
		/* Events Sub-Leaf */
		unsigned int	events_subleaf:1;
-		unsigned int	reserved:28;
+		/* arch-PEBS Sub-Leaves */
+		unsigned int	pebs_caps_subleaf:1;
+		unsigned int	pebs_cnts_subleaf:1;
+		unsigned int	reserved:26;
	} split;
	unsigned int	full;
 };
-- 
2.43.0
From: Dapeng Mi
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin, Kan Liang, Andi Kleen, Eranian Stephane
Cc: linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org, Dapeng Mi, Dapeng Mi
Subject: [Patch v4 04/13] perf/x86/intel/ds: Factor out PEBS record processing code to functions
Date: Fri, 20 Jun 2025 10:39:00 +0000
Message-ID: <20250620103909.1586595-5-dapeng1.mi@linux.intel.com>
In-Reply-To: <20250620103909.1586595-1-dapeng1.mi@linux.intel.com>
References: <20250620103909.1586595-1-dapeng1.mi@linux.intel.com>

Apart from some differences in record layout, arch-PEBS can share most
of the PEBS record processing code with adaptive PEBS. Factor this
common processing code out into independent inline functions so the
subsequent arch-PEBS handler can reuse them.
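The deferred-processing pattern being factored out works like this: records are counted per counter, record N-1 for a counter is processed only once record N for the same counter arrives, and a second pass flushes the final remembered record per counter. A simplified userspace sketch (names invented; `for_each_set_bit` replaced with a plain bit loop, and record processing reduced to a counter):

```c
#include <assert.h>
#include <stdint.h>

#define MAX_CNT 8

static int processed[MAX_CNT];	/* records fully processed, per counter */

/* stand-in for __intel_pmu_pebs_event()/__intel_pmu_pebs_last_event() */
static void process_record(int bit)
{
	processed[bit]++;
}

/*
 * Each PEBS record carries an "applicable counters" bitmask. A record
 * for counter i is not processed immediately: it is remembered in
 * last[i] and processed only when the NEXT record for counter i shows
 * up. The final remembered record per counter is handled in a second
 * pass, mirroring the two helpers the patch extracts.
 */
static void drain(const uint64_t *status, int nr_records, uint64_t mask)
{
	short counts[MAX_CNT] = { 0 };
	int last[MAX_CNT] = { 0 };	/* record index stands in for a pointer */

	for (int at = 0; at < nr_records; at++) {
		uint64_t pebs_status = status[at] & mask;

		for (int bit = 0; bit < MAX_CNT; bit++) {
			if (!(pebs_status & (1ULL << bit)))
				continue;
			if (counts[bit]++)
				process_record(bit);	/* process previous record */
			last[bit] = at;
		}
	}

	/* flush the last remembered record of every counter that fired */
	for (int bit = 0; bit < MAX_CNT; bit++) {
		if (counts[bit])
			process_record(bit);
	}
	(void)last;
}
```

Splitting the two loops into helpers lets a different record walker (the arch-PEBS one) reuse the identical bookkeeping with its own sample-setup callback.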
Suggested-by: Kan Liang
Signed-off-by: Dapeng Mi
---
 arch/x86/events/intel/ds.c | 80 ++++++++++++++++++++++++++------------
 1 file changed, 55 insertions(+), 25 deletions(-)

diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index 26e485eca0a0..1d95e7313cac 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -2614,6 +2614,54 @@ static void intel_pmu_drain_pebs_nhm(struct pt_regs *iregs, struct perf_sample_d
	}
 }

+static inline void __intel_pmu_handle_pebs_record(struct pt_regs *iregs,
+						  struct pt_regs *regs,
+						  struct perf_sample_data *data,
+						  void *at, u64 pebs_status,
+						  short *counts, void **last,
+						  setup_fn setup_sample)
+{
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
+	struct perf_event *event;
+	int bit;
+
+	for_each_set_bit(bit, (unsigned long *)&pebs_status, X86_PMC_IDX_MAX) {
+		event = cpuc->events[bit];
+
+		if (WARN_ON_ONCE(!event) ||
+		    WARN_ON_ONCE(!event->attr.precise_ip))
+			continue;
+
+		if (counts[bit]++)
+			__intel_pmu_pebs_event(event, iregs, regs, data,
+					       last[bit], setup_sample);
+
+		last[bit] = at;
+	}
+}
+
+static inline void
+__intel_pmu_handle_last_pebs_record(struct pt_regs *iregs, struct pt_regs *regs,
+				    struct perf_sample_data *data, u64 mask,
+				    short *counts, void **last,
+				    setup_fn setup_sample)
+{
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
+	struct perf_event *event;
+	int bit;
+
+	for_each_set_bit(bit, (unsigned long *)&mask, X86_PMC_IDX_MAX) {
+		if (!counts[bit])
+			continue;
+
+		event = cpuc->events[bit];
+
+		__intel_pmu_pebs_last_event(event, iregs, regs, data, last[bit],
+					    counts[bit], setup_sample);
+	}
+
+}
+
 static void intel_pmu_drain_pebs_icl(struct pt_regs *iregs, struct perf_sample_data *data)
 {
	short counts[INTEL_PMC_IDX_FIXED + MAX_FIXED_PEBS_EVENTS] = {};
@@ -2623,9 +2671,7 @@ static void intel_pmu_drain_pebs_icl(struct pt_regs *iregs, struct perf_sample_d
	struct x86_perf_regs perf_regs;
	struct pt_regs *regs = &perf_regs.regs;
	struct pebs_basic *basic;
-	struct perf_event *event;
	void *base, *at, *top;
-	int bit;
	u64 mask;

	if (!x86_pmu.pebs_active)
@@ -2638,6 +2684,7 @@ static void intel_pmu_drain_pebs_icl(struct pt_regs *iregs, struct perf_sample_d

	mask = hybrid(cpuc->pmu, pebs_events_mask) |
	       (hybrid(cpuc->pmu, fixed_cntr_mask64) << INTEL_PMC_IDX_FIXED);
+	mask &= cpuc->pebs_enabled;

	if (unlikely(base >= top)) {
		intel_pmu_pebs_event_update_no_drain(cpuc, mask);
@@ -2655,31 +2702,14 @@ static void intel_pmu_drain_pebs_icl(struct pt_regs *iregs, struct perf_sample_d
		if (basic->format_size != cpuc->pebs_record_size)
			continue;

-		pebs_status = basic->applicable_counters & cpuc->pebs_enabled & mask;
-		for_each_set_bit(bit, (unsigned long *)&pebs_status, X86_PMC_IDX_MAX) {
-			event = cpuc->events[bit];
-
-			if (WARN_ON_ONCE(!event) ||
-			    WARN_ON_ONCE(!event->attr.precise_ip))
-				continue;
-
-			if (counts[bit]++) {
-				__intel_pmu_pebs_event(event, iregs, regs, data, last[bit],
-						       setup_pebs_adaptive_sample_data);
-			}
-			last[bit] = at;
-		}
+		pebs_status = mask & basic->applicable_counters;
+		__intel_pmu_handle_pebs_record(iregs, regs, data, at,
+					       pebs_status, counts, last,
+					       setup_pebs_adaptive_sample_data);
	}

-	for_each_set_bit(bit, (unsigned long *)&mask, X86_PMC_IDX_MAX) {
-		if (!counts[bit])
-			continue;
-
-		event = cpuc->events[bit];
-
-		__intel_pmu_pebs_last_event(event, iregs, regs, data, last[bit],
-					    counts[bit], setup_pebs_adaptive_sample_data);
-	}
+	__intel_pmu_handle_last_pebs_record(iregs, regs, data, mask, counts, last,
+					    setup_pebs_adaptive_sample_data);
 }

 static void __init intel_arch_pebs_init(void)
-- 
2.43.0
From: Dapeng Mi
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin, Kan Liang, Andi Kleen, Eranian Stephane
Cc: linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org, Dapeng Mi
Subject: [Patch v4 05/13] perf/x86/intel/ds: Factor out PEBS group processing code to functions
Date: Fri, 20 Jun 2025 10:39:01 +0000
Message-ID: <20250620103909.1586595-6-dapeng1.mi@linux.intel.com>
In-Reply-To: <20250620103909.1586595-1-dapeng1.mi@linux.intel.com>
References: <20250620103909.1586595-1-dapeng1.mi@linux.intel.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

Adaptive PEBS and arch-PEBS share a lot of code for processing PEBS groups such as the basic, GPR and meminfo groups. Extract the shared code into generic helper functions to avoid duplication.
Signed-off-by: Dapeng Mi --- arch/x86/events/intel/ds.c | 170 +++++++++++++++++++++++-------------- 1 file changed, 104 insertions(+), 66 deletions(-) diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c index 1d95e7313cac..4fdb1c59a907 100644 --- a/arch/x86/events/intel/ds.c +++ b/arch/x86/events/intel/ds.c @@ -2072,6 +2072,90 @@ static inline void __setup_pebs_counter_group(struct= cpu_hw_events *cpuc, =20 #define PEBS_LATENCY_MASK 0xffff =20 +static inline void __setup_perf_sample_data(struct perf_event *event, + struct pt_regs *iregs, + struct perf_sample_data *data) +{ + perf_sample_data_init(data, 0, event->hw.last_period); + + /* + * We must however always use iregs for the unwinder to stay sane; the + * record BP,SP,IP can point into thin air when the record is from a + * previous PMI context or an (I)RET happened between the record and + * PMI. + */ + perf_sample_save_callchain(data, event, iregs); +} + +static inline void __setup_pebs_basic_group(struct perf_event *event, + struct pt_regs *regs, + struct perf_sample_data *data, + u64 sample_type, u64 ip, + u64 tsc, u16 retire) +{ + /* The ip in basic is EventingIP */ + set_linear_ip(regs, ip); + regs->flags =3D PERF_EFLAGS_EXACT; + setup_pebs_time(event, data, tsc); + + if (sample_type & PERF_SAMPLE_WEIGHT_STRUCT) + data->weight.var3_w =3D retire; +} + +static inline void __setup_pebs_gpr_group(struct perf_event *event, + struct pt_regs *regs, + struct pebs_gprs *gprs, + u64 sample_type) +{ + if (event->attr.precise_ip < 2) { + set_linear_ip(regs, gprs->ip); + regs->flags &=3D ~PERF_EFLAGS_EXACT; + } + + if (sample_type & (PERF_SAMPLE_REGS_INTR | PERF_SAMPLE_REGS_USER)) + adaptive_pebs_save_regs(regs, gprs); +} + +static inline void __setup_pebs_meminfo_group(struct perf_event *event, + struct perf_sample_data *data, + u64 sample_type, u64 latency, + u16 instr_latency, u64 address, + u64 aux, u64 tsx_tuning, u64 ax) +{ + if (sample_type & PERF_SAMPLE_WEIGHT_TYPE) { + u64 tsx_latency =3D 
intel_get_tsx_weight(tsx_tuning); + + data->weight.var2_w =3D instr_latency; + + /* + * Although meminfo::latency is defined as a u64, + * only the lower 32 bits include the valid data + * in practice on Ice Lake and earlier platforms. + */ + if (sample_type & PERF_SAMPLE_WEIGHT) + data->weight.full =3D latency ?: tsx_latency; + else + data->weight.var1_dw =3D (u32)latency ?: tsx_latency; + + data->sample_flags |=3D PERF_SAMPLE_WEIGHT_TYPE; + } + + if (sample_type & PERF_SAMPLE_DATA_SRC) { + data->data_src.val =3D get_data_src(event, aux); + data->sample_flags |=3D PERF_SAMPLE_DATA_SRC; + } + + if (sample_type & PERF_SAMPLE_ADDR_TYPE) { + data->addr =3D address; + data->sample_flags |=3D PERF_SAMPLE_ADDR; + } + + if (sample_type & PERF_SAMPLE_TRANSACTION) { + data->txn =3D intel_get_tsx_transaction(tsx_tuning, ax); + data->sample_flags |=3D PERF_SAMPLE_TRANSACTION; + } +} + /* * With adaptive PEBS the layout depends on what fields are configured. */ @@ -2081,12 +2165,14 @@ static void setup_pebs_adaptive_sample_data(struct = perf_event *event, struct pt_regs *regs) { struct cpu_hw_events *cpuc =3D this_cpu_ptr(&cpu_hw_events); + u64 sample_type =3D event->attr.sample_type; struct pebs_basic *basic =3D __pebs; void *next_record =3D basic + 1; - u64 sample_type, format_group; struct pebs_meminfo *meminfo =3D NULL; struct pebs_gprs *gprs =3D NULL; struct x86_perf_regs *perf_regs; + u64 format_group; + u16 retire; =20 if (basic =3D=3D NULL) return; @@ -2094,31 +2180,17 @@ static void setup_pebs_adaptive_sample_data(struct = perf_event *event, perf_regs =3D container_of(regs, struct x86_perf_regs, regs); perf_regs->xmm_regs =3D NULL; =20 - sample_type =3D event->attr.sample_type; format_group =3D basic->format_group; - perf_sample_data_init(data, 0, event->hw.last_period); =20 - setup_pebs_time(event, data, basic->tsc); - - /* - * We must however always use iregs for the unwinder to stay sane; the - * record BP,SP,IP can point into thin air when the record is from a - * 
previous PMI context or an (I)RET happened between the record and - * PMI. - */ - perf_sample_save_callchain(data, event, iregs); + __setup_perf_sample_data(event, iregs, data); =20 *regs =3D *iregs; - /* The ip in basic is EventingIP */ - set_linear_ip(regs, basic->ip); - regs->flags =3D PERF_EFLAGS_EXACT; =20 - if (sample_type & PERF_SAMPLE_WEIGHT_STRUCT) { - if (x86_pmu.flags & PMU_FL_RETIRE_LATENCY) - data->weight.var3_w =3D basic->retire_latency; - else - data->weight.var3_w =3D 0; - } + /* basic group */ + retire =3D x86_pmu.flags & PMU_FL_RETIRE_LATENCY ? + basic->retire_latency : 0; + __setup_pebs_basic_group(event, regs, data, sample_type, + basic->ip, basic->tsc, retire); =20 /* * The record for MEMINFO is in front of GP @@ -2134,54 +2206,20 @@ static void setup_pebs_adaptive_sample_data(struct = perf_event *event, gprs =3D next_record; next_record =3D gprs + 1; =20 - if (event->attr.precise_ip < 2) { - set_linear_ip(regs, gprs->ip); - regs->flags &=3D ~PERF_EFLAGS_EXACT; - } - - if (sample_type & (PERF_SAMPLE_REGS_INTR | PERF_SAMPLE_REGS_USER)) - adaptive_pebs_save_regs(regs, gprs); + __setup_pebs_gpr_group(event, regs, gprs, sample_type); } =20 if (format_group & PEBS_DATACFG_MEMINFO) { - if (sample_type & PERF_SAMPLE_WEIGHT_TYPE) { - u64 latency =3D x86_pmu.flags & PMU_FL_INSTR_LATENCY ? - meminfo->cache_latency : meminfo->mem_latency; - - if (x86_pmu.flags & PMU_FL_INSTR_LATENCY) - data->weight.var2_w =3D meminfo->instr_latency; - - /* - * Although meminfo::latency is defined as a u64, - * only the lower 32 bits include the valid data - * in practice on Ice Lake and earlier platforms. 
- */ - if (sample_type & PERF_SAMPLE_WEIGHT) { - data->weight.full =3D latency ?: - intel_get_tsx_weight(meminfo->tsx_tuning); - } else { - data->weight.var1_dw =3D (u32)latency ?: - intel_get_tsx_weight(meminfo->tsx_tuning); - } - - data->sample_flags |=3D PERF_SAMPLE_WEIGHT_TYPE; - } - - if (sample_type & PERF_SAMPLE_DATA_SRC) { - data->data_src.val =3D get_data_src(event, meminfo->aux); - data->sample_flags |=3D PERF_SAMPLE_DATA_SRC; - } - - if (sample_type & PERF_SAMPLE_ADDR_TYPE) { - data->addr =3D meminfo->address; - data->sample_flags |=3D PERF_SAMPLE_ADDR; - } - - if (sample_type & PERF_SAMPLE_TRANSACTION) { - data->txn =3D intel_get_tsx_transaction(meminfo->tsx_tuning, - gprs ? gprs->ax : 0); - data->sample_flags |=3D PERF_SAMPLE_TRANSACTION; - } + u64 latency =3D x86_pmu.flags & PMU_FL_INSTR_LATENCY ? + meminfo->cache_latency : meminfo->mem_latency; + u64 instr_latency =3D x86_pmu.flags & PMU_FL_INSTR_LATENCY ? + meminfo->instr_latency : 0; + u64 ax =3D gprs ? gprs->ax : 0; + + __setup_pebs_meminfo_group(event, data, sample_type, latency, + instr_latency, meminfo->address, + meminfo->aux, meminfo->tsx_tuning, + ax); } =20 if (format_group & PEBS_DATACFG_XMMS) { --=20 2.43.0 From nobody Thu Oct 9 04:14:03 2025 Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.18]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 2C8F922E3E1; Fri, 20 Jun 2025 07:29:29 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=192.198.163.18 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1750404571; cv=none; b=HpguxKWcgkJoBWRm1IfPceFhtvC7kLPetMF3EkLaD37ngLoqlbrMC4nvF6nfx4H0ao9MvEsTJdaMv9A86yi5BOCygKMMoKzyXP7VEjIv1nLhXeLjbJlDiCbMsShCnS1iLjdiJy8/kMXGwXJBAeKkQD6uPKkCbWHck2I1ZIHd/Pc= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1750404571; c=relaxed/simple; 
From: Dapeng Mi
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin, Kan Liang, Andi Kleen, Eranian Stephane
Cc: linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org, Dapeng Mi
Subject: [Patch v4 06/13] perf/x86/intel: Process arch-PEBS records or record fragments
Date: Fri, 20 Jun 2025 10:39:02 +0000
Message-ID: <20250620103909.1586595-7-dapeng1.mi@linux.intel.com>
In-Reply-To: <20250620103909.1586595-1-dapeng1.mi@linux.intel.com>
References: <20250620103909.1586595-1-dapeng1.mi@linux.intel.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

A significant difference from adaptive PEBS is that an arch-PEBS record supports fragments: a record may be split into several independent fragments, each carrying its own arch-PEBS header. This patch defines the architectural PEBS record layout structures and adds helpers to process arch-PEBS records and fragments. Only the legacy PEBS groups (basic, GPR, XMM and LBR) are supported by this patch; capturing the newly added YMM/ZMM/OPMASK vector registers will be supported in the future.
Signed-off-by: Dapeng Mi --- arch/x86/events/intel/core.c | 13 +++ arch/x86/events/intel/ds.c | 180 ++++++++++++++++++++++++++++++ arch/x86/include/asm/msr-index.h | 6 + arch/x86/include/asm/perf_event.h | 96 ++++++++++++++++ 4 files changed, 295 insertions(+) diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c index faf4ab91fa4b..4025ea7934ac 100644 --- a/arch/x86/events/intel/core.c +++ b/arch/x86/events/intel/core.c @@ -3218,6 +3218,19 @@ static int handle_pmi_common(struct pt_regs *regs, u= 64 status) status &=3D ~GLOBAL_STATUS_PERF_METRICS_OVF_BIT; } =20 + /* + * Arch PEBS sets bit 54 in the global status register + */ + if (__test_and_clear_bit(GLOBAL_STATUS_ARCH_PEBS_THRESHOLD_BIT, + (unsigned long *)&status)) { + handled++; + static_call(x86_pmu_drain_pebs)(regs, &data); + + if (cpuc->events[INTEL_PMC_IDX_FIXED_SLOTS] && + is_pebs_counter_event_group(cpuc->events[INTEL_PMC_IDX_FIXED_SLOTS])) + status &=3D ~GLOBAL_STATUS_PERF_METRICS_OVF_BIT; + } + /* * Intel PT */ diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c index 4fdb1c59a907..b6eface4dccd 100644 --- a/arch/x86/events/intel/ds.c +++ b/arch/x86/events/intel/ds.c @@ -2270,6 +2270,114 @@ static void setup_pebs_adaptive_sample_data(struct = perf_event *event, format_group); } =20 +static inline bool arch_pebs_record_continued(struct arch_pebs_header *hea= der) +{ + /* Continue bit or null PEBS record indicates fragment follows. 
*/ + return header->cont || !(header->format & GENMASK_ULL(63, 16)); +} + +static void setup_arch_pebs_sample_data(struct perf_event *event, + struct pt_regs *iregs, void *__pebs, + struct perf_sample_data *data, + struct pt_regs *regs) +{ + struct cpu_hw_events *cpuc =3D this_cpu_ptr(&cpu_hw_events); + u64 sample_type =3D event->attr.sample_type; + struct arch_pebs_header *header =3D NULL; + struct arch_pebs_aux *meminfo =3D NULL; + struct arch_pebs_gprs *gprs =3D NULL; + struct x86_perf_regs *perf_regs; + void *next_record; + void *at =3D __pebs; + + if (at =3D=3D NULL) + return; + + perf_regs =3D container_of(regs, struct x86_perf_regs, regs); + perf_regs->xmm_regs =3D NULL; + + __setup_perf_sample_data(event, iregs, data); + + *regs =3D *iregs; + +again: + header =3D at; + next_record =3D at + sizeof(struct arch_pebs_header); + if (header->basic) { + struct arch_pebs_basic *basic =3D next_record; + u16 retire =3D 0; + + next_record =3D basic + 1; + + if (sample_type & PERF_SAMPLE_WEIGHT_STRUCT) + retire =3D basic->valid ? basic->retire : 0; + __setup_pebs_basic_group(event, regs, data, sample_type, + basic->ip, basic->tsc, retire); + } + + /* + * The record for MEMINFO is in front of GP + * But PERF_SAMPLE_TRANSACTION needs gprs->ax. + * Save the pointer here but process later. + */ + if (header->aux) { + meminfo =3D next_record; + next_record =3D meminfo + 1; + } + + if (header->gpr) { + gprs =3D next_record; + next_record =3D gprs + 1; + + __setup_pebs_gpr_group(event, regs, (struct pebs_gprs *)gprs, + sample_type); + } + + if (header->aux) { + u64 ax =3D gprs ? 
gprs->ax : 0; + + __setup_pebs_meminfo_group(event, data, sample_type, + meminfo->cache_latency, + meminfo->instr_latency, + meminfo->address, meminfo->aux, + meminfo->tsx_tuning, ax); + } + + if (header->xmm) { + struct pebs_xmm *xmm; + + next_record +=3D sizeof(struct arch_pebs_xer_header); + + xmm =3D next_record; + perf_regs->xmm_regs =3D xmm->xmm; + next_record =3D xmm + 1; + } + + if (header->lbr) { + struct arch_pebs_lbr_header *lbr_header =3D next_record; + struct lbr_entry *lbr; + int num_lbr; + + next_record =3D lbr_header + 1; + lbr =3D next_record; + + num_lbr =3D header->lbr =3D=3D ARCH_PEBS_LBR_NUM_VAR ? lbr_header->depth= : + header->lbr * ARCH_PEBS_BASE_LBR_ENTRIES; + next_record +=3D num_lbr * sizeof(struct lbr_entry); + + if (has_branch_stack(event)) { + intel_pmu_store_pebs_lbrs(lbr); + intel_pmu_lbr_save_brstack(data, cpuc, event); + } + } + + /* Parse followed fragments if there are. */ + if (arch_pebs_record_continued(header)) { + at =3D at + header->size; + goto again; + } +} + static inline void * get_next_pebs_record_by_bit(void *base, void *top, int bit) { @@ -2750,6 +2858,77 @@ static void intel_pmu_drain_pebs_icl(struct pt_regs = *iregs, struct perf_sample_d setup_pebs_adaptive_sample_data); } =20 +static void intel_pmu_drain_arch_pebs(struct pt_regs *iregs, + struct perf_sample_data *data) +{ + short counts[INTEL_PMC_IDX_FIXED + MAX_FIXED_PEBS_EVENTS] =3D {}; + void *last[INTEL_PMC_IDX_FIXED + MAX_FIXED_PEBS_EVENTS]; + struct cpu_hw_events *cpuc =3D this_cpu_ptr(&cpu_hw_events); + union arch_pebs_index index; + struct x86_perf_regs perf_regs; + struct pt_regs *regs =3D &perf_regs.regs; + void *base, *at, *top; + u64 mask; + + rdmsrl(MSR_IA32_PEBS_INDEX, index.full); + + if (unlikely(!index.split.wr)) { + intel_pmu_pebs_event_update_no_drain(cpuc, X86_PMC_IDX_MAX); + return; + } + + base =3D cpuc->ds_pebs_vaddr; + top =3D (void *)((u64)cpuc->ds_pebs_vaddr + + (index.split.wr << ARCH_PEBS_INDEX_WR_SHIFT)); + + mask =3D hybrid(cpuc->pmu, 
arch_pebs_cap).counters & cpuc->pebs_enabled; + + if (!iregs) + iregs =3D &dummy_iregs; + + /* Process all but the last event for each counter. */ + for (at =3D base; at < top;) { + struct arch_pebs_header *header; + struct arch_pebs_basic *basic; + u64 pebs_status; + + header =3D at; + + if (WARN_ON_ONCE(!header->size)) + break; + + /* 1st fragment or single record must have basic group */ + if (!header->basic) { + at +=3D header->size; + continue; + } + + basic =3D at + sizeof(struct arch_pebs_header); + pebs_status =3D mask & basic->applicable_counters; + __intel_pmu_handle_pebs_record(iregs, regs, data, at, + pebs_status, counts, last, + setup_arch_pebs_sample_data); + + /* Skip non-last fragments */ + while (arch_pebs_record_continued(header)) { + if (!header->size) + break; + at +=3D header->size; + header =3D at; + } + + /* Skip last fragment or the single record */ + at +=3D header->size; + } + + __intel_pmu_handle_last_pebs_record(iregs, regs, data, mask, counts, + last, setup_arch_pebs_sample_data); + + index.split.wr =3D 0; + index.split.full =3D 0; + wrmsrq(MSR_IA32_PEBS_INDEX, index.full); +} + static void __init intel_arch_pebs_init(void) { /* @@ -2759,6 +2938,7 @@ static void __init intel_arch_pebs_init(void) */ x86_pmu.arch_pebs =3D 1; x86_pmu.pebs_buffer_size =3D PEBS_BUFFER_SIZE; + x86_pmu.drain_pebs =3D intel_pmu_drain_arch_pebs; x86_pmu.pebs_capable =3D ~0ULL; =20 x86_pmu.pebs_enable =3D __intel_pmu_pebs_enable; diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-in= dex.h index b7dded3c8113..d3bc28230628 100644 --- a/arch/x86/include/asm/msr-index.h +++ b/arch/x86/include/asm/msr-index.h @@ -322,6 +322,12 @@ #define PERF_CAP_PEBS_MASK (PERF_CAP_PEBS_TRAP | PERF_CAP_ARCH_REG | \ PERF_CAP_PEBS_FORMAT | PERF_CAP_PEBS_BASELINE) =20 +/* Arch PEBS */ +#define MSR_IA32_PEBS_BASE 0x000003f4 +#define MSR_IA32_PEBS_INDEX 0x000003f5 +#define ARCH_PEBS_OFFSET_MASK 0x7fffff +#define ARCH_PEBS_INDEX_WR_SHIFT 4 + #define MSR_IA32_RTIT_CTL 
0x00000570 #define RTIT_CTL_TRACEEN BIT(0) #define RTIT_CTL_CYCLEACC BIT(1) diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_= event.h index 7fca9494aae9..0f70d13780fe 100644 --- a/arch/x86/include/asm/perf_event.h +++ b/arch/x86/include/asm/perf_event.h @@ -433,6 +433,8 @@ static inline bool is_topdown_idx(int idx) #define GLOBAL_STATUS_LBRS_FROZEN BIT_ULL(GLOBAL_STATUS_LBRS_FROZEN_BIT) #define GLOBAL_STATUS_TRACE_TOPAPMI_BIT 55 #define GLOBAL_STATUS_TRACE_TOPAPMI BIT_ULL(GLOBAL_STATUS_TRACE_TOPAPMI_B= IT) +#define GLOBAL_STATUS_ARCH_PEBS_THRESHOLD_BIT 54 +#define GLOBAL_STATUS_ARCH_PEBS_THRESHOLD BIT_ULL(GLOBAL_STATUS_ARCH_PEBS_= THRESHOLD_BIT) #define GLOBAL_STATUS_PERF_METRICS_OVF_BIT 48 =20 #define GLOBAL_CTRL_EN_PERF_METRICS 48 @@ -503,6 +505,100 @@ struct pebs_cntr_header { =20 #define INTEL_CNTR_METRICS 0x3 =20 +/* + * Arch PEBS + */ +union arch_pebs_index { + struct { + u64 rsvd:4, + wr:23, + rsvd2:4, + full:1, + en:1, + rsvd3:3, + thresh:23, + rsvd4:5; + } split; + u64 full; +}; + +struct arch_pebs_header { + union { + u64 format; + struct { + u64 size:16, /* Record size */ + rsvd:14, + mode:1, /* 64BIT_MODE */ + cont:1, + rsvd2:3, + cntr:5, + lbr:2, + rsvd3:7, + xmm:1, + ymmh:1, + rsvd4:2, + opmask:1, + zmmh:1, + h16zmm:1, + rsvd5:5, + gpr:1, + aux:1, + basic:1; + }; + }; + u64 rsvd6; +}; + +struct arch_pebs_basic { + u64 ip; + u64 applicable_counters; + u64 tsc; + u64 retire :16, /* Retire Latency */ + valid :1, + rsvd :47; + u64 rsvd2; + u64 rsvd3; +}; + +struct arch_pebs_aux { + u64 address; + u64 rsvd; + u64 rsvd2; + u64 rsvd3; + u64 rsvd4; + u64 aux; + u64 instr_latency :16, + pad2 :16, + cache_latency :16, + pad3 :16; + u64 tsx_tuning; +}; + +struct arch_pebs_gprs { + u64 flags, ip, ax, cx, dx, bx, sp, bp, si, di; + u64 r8, r9, r10, r11, r12, r13, r14, r15, ssp; + u64 rsvd; +}; + +struct arch_pebs_xer_header { + u64 xstate; + u64 rsvd; +}; + +#define ARCH_PEBS_LBR_NAN 0x0 +#define ARCH_PEBS_LBR_NUM_8 0x1 +#define 
ARCH_PEBS_LBR_NUM_16 0x2 +#define ARCH_PEBS_LBR_NUM_VAR 0x3 +#define ARCH_PEBS_BASE_LBR_ENTRIES 8 +struct arch_pebs_lbr_header { + u64 rsvd; + u64 ctl; + u64 depth; + u64 ler_from; + u64 ler_to; + u64 ler_info; +}; + /* * AMD Extended Performance Monitoring and Debug cpuid feature detection */ --=20 2.43.0 From nobody Thu Oct 9 04:14:03 2025 Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.18]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 20781228CBE; Fri, 20 Jun 2025 07:29:33 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=192.198.163.18 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1750404574; cv=none; b=EWK3qd4dPPLdzNUOq6iiGgJHkpLtiUHFZIGNJWduhiMz0dqF2zkXAAJ8Bs/SOFNq8Y0gOod/G7MJrNg4m5/gbJYTnmgSBu0XTbG/etMBD2uZkRd8AITyg7olsNcjOgVdQDURlMisafC5snx8pKCLpGuH2Gh5yXnCY5L42TWYSK0= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1750404574; c=relaxed/simple; bh=Bn7I4ijfObfwLsbZ5vl8m3VAhJx6IDyhwXBll6CFl1o=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=R8nydq8ty8TXJX5Z3MB4IVCfDHvKv/ckLnblyfYfPhV9Fm9QtMGwzqGH+ahJ/6zw08QRP5xYvAToupvhi9+ML/Ci1UIlaQC+0OzkBO5Kb0i7Y/58oFvtZEeZm9KVr4H8d0QMWO66BiKSDydKVaZNZTPVueQlubixXj4cqftxE/w= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com; spf=none smtp.mailfrom=linux.intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=F7Sh8Hhc; arc=none smtp.client-ip=192.198.163.18 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="F7Sh8Hhc" 
From: Dapeng Mi
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Namhyung Kim, Ian Rogers, Adrian Hunter, Alexander Shishkin, Kan Liang, Andi Kleen, Eranian Stephane
Cc: linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org, Dapeng Mi
Subject: [Patch v4 07/13] perf/x86/intel: Allocate arch-PEBS buffer and initialize PEBS_BASE MSR
Date: Fri, 20 Jun 2025 10:39:03 +0000
Message-ID: <20250620103909.1586595-8-dapeng1.mi@linux.intel.com>
In-Reply-To: <20250620103909.1586595-1-dapeng1.mi@linux.intel.com>
References: <20250620103909.1586595-1-dapeng1.mi@linux.intel.com>
X-Mailing-List:
linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

Arch-PEBS introduces a new MSR, IA32_PEBS_BASE, to store the physical address of the arch-PEBS buffer. This patch allocates the arch-PEBS buffer and then initializes the IA32_PEBS_BASE MSR with the buffer's physical address.

Co-developed-by: Kan Liang
Signed-off-by: Kan Liang
Signed-off-by: Dapeng Mi
---
 arch/x86/events/intel/core.c    |  2 +
 arch/x86/events/intel/ds.c      | 73 +++++++++++++++++++++++++++------
 arch/x86/events/perf_event.h    |  7 +++-
 arch/x86/include/asm/intel_ds.h |  3 +-
 4 files changed, 70 insertions(+), 15 deletions(-)

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 4025ea7934ac..5e6ef9f3a077 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -5460,6 +5460,7 @@ static void intel_pmu_cpu_starting(int cpu)
 		return;
 
 	init_debug_store_on_cpu(cpu);
+	init_arch_pebs_buf_on_cpu(cpu);
 	/*
 	 * Deal with CPUs that don't clear their LBRs on power-up, and that may
 	 * even boot with LBRs enabled.
@@ -5557,6 +5558,7 @@ static void free_excl_cntrs(struct cpu_hw_events *cpuc)
 static void intel_pmu_cpu_dying(int cpu)
 {
 	fini_debug_store_on_cpu(cpu);
+	fini_arch_pebs_buf_on_cpu(cpu);
 }
 
 void intel_cpuc_finish(struct cpu_hw_events *cpuc)
diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index b6eface4dccd..72b925b8c482 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -625,13 +625,22 @@ static int alloc_pebs_buffer(int cpu)
 	int max, node = cpu_to_node(cpu);
 	void *buffer, *insn_buff, *cea;
 
-	if (!x86_pmu.ds_pebs)
+	if (!intel_pmu_has_pebs())
 		return 0;
 
-	buffer = dsalloc_pages(bsiz, GFP_KERNEL, cpu);
+	/*
+	 * alloc_pebs_buffer() could be called by init_arch_pebs_buf_on_cpu()
+	 * which is in atomic context.
+	 */
+	buffer = dsalloc_pages(bsiz, preemptible() ?
GFP_KERNEL : GFP_ATOMIC, c= pu); if (unlikely(!buffer)) return -ENOMEM; =20 + if (x86_pmu.arch_pebs) { + hwev->pebs_vaddr =3D buffer; + return 0; + } + /* * HSW+ already provides us the eventing ip; no need to allocate this * buffer then. @@ -644,7 +653,7 @@ static int alloc_pebs_buffer(int cpu) } per_cpu(insn_buffer, cpu) =3D insn_buff; } - hwev->ds_pebs_vaddr =3D buffer; + hwev->pebs_vaddr =3D buffer; /* Update the cpu entry area mapping */ cea =3D &get_cpu_entry_area(cpu)->cpu_debug_buffers.pebs_buffer; ds->pebs_buffer_base =3D (unsigned long) cea; @@ -660,17 +669,20 @@ static void release_pebs_buffer(int cpu) struct cpu_hw_events *hwev =3D per_cpu_ptr(&cpu_hw_events, cpu); void *cea; =20 - if (!x86_pmu.ds_pebs) + if (!intel_pmu_has_pebs()) return; =20 - kfree(per_cpu(insn_buffer, cpu)); - per_cpu(insn_buffer, cpu) =3D NULL; + if (x86_pmu.ds_pebs) { + kfree(per_cpu(insn_buffer, cpu)); + per_cpu(insn_buffer, cpu) =3D NULL; =20 - /* Clear the fixmap */ - cea =3D &get_cpu_entry_area(cpu)->cpu_debug_buffers.pebs_buffer; - ds_clear_cea(cea, x86_pmu.pebs_buffer_size); - dsfree_pages(hwev->ds_pebs_vaddr, x86_pmu.pebs_buffer_size); - hwev->ds_pebs_vaddr =3D NULL; + /* Clear the fixmap */ + cea =3D &get_cpu_entry_area(cpu)->cpu_debug_buffers.pebs_buffer; + ds_clear_cea(cea, x86_pmu.pebs_buffer_size); + } + + dsfree_pages(hwev->pebs_vaddr, x86_pmu.pebs_buffer_size); + hwev->pebs_vaddr =3D NULL; } =20 static int alloc_bts_buffer(int cpu) @@ -823,6 +835,41 @@ void reserve_ds_buffers(void) } } =20 +void init_arch_pebs_buf_on_cpu(int cpu) +{ + struct cpu_hw_events *cpuc =3D per_cpu_ptr(&cpu_hw_events, cpu); + u64 arch_pebs_base; + + if (!x86_pmu.arch_pebs) + return; + + if (alloc_pebs_buffer(cpu) < 0 || !cpuc->pebs_vaddr) { + WARN(1, "Fail to allocate PEBS buffer on CPU %d\n", cpu); + x86_pmu.pebs_active =3D 0; + return; + } + + /* + * 4KB-aligned pointer of the output buffer + * (__alloc_pages_node() return page aligned address) + * Buffer Size =3D 4KB * 2^SIZE + * 
contiguous physical buffer (__alloc_pages_node() with order) + */ + arch_pebs_base =3D virt_to_phys(cpuc->pebs_vaddr) | PEBS_BUFFER_SHIFT; + wrmsr_on_cpu(cpu, MSR_IA32_PEBS_BASE, (u32)arch_pebs_base, + (u32)(arch_pebs_base >> 32)); + x86_pmu.pebs_active =3D 1; +} + +void fini_arch_pebs_buf_on_cpu(int cpu) +{ + if (!x86_pmu.arch_pebs) + return; + + wrmsr_on_cpu(cpu, MSR_IA32_PEBS_BASE, 0, 0); + release_pebs_buffer(cpu); +} + /* * BTS */ @@ -2877,8 +2924,8 @@ static void intel_pmu_drain_arch_pebs(struct pt_regs = *iregs, return; } =20 - base =3D cpuc->ds_pebs_vaddr; - top =3D (void *)((u64)cpuc->ds_pebs_vaddr + + base =3D cpuc->pebs_vaddr; + top =3D (void *)((u64)cpuc->pebs_vaddr + (index.split.wr << ARCH_PEBS_INDEX_WR_SHIFT)); =20 mask =3D hybrid(cpuc->pmu, arch_pebs_cap).counters & cpuc->pebs_enabled; diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h index a5145e8f1ddb..82e8c20611b9 100644 --- a/arch/x86/events/perf_event.h +++ b/arch/x86/events/perf_event.h @@ -283,8 +283,9 @@ struct cpu_hw_events { * Intel DebugStore bits */ struct debug_store *ds; - void *ds_pebs_vaddr; void *ds_bts_vaddr; + /* DS based PEBS or arch-PEBS buffer address */ + void *pebs_vaddr; u64 pebs_enabled; int n_pebs; int n_large_pebs; @@ -1618,6 +1619,10 @@ extern void intel_cpuc_finish(struct cpu_hw_events *= cpuc); =20 int intel_pmu_init(void); =20 +void init_arch_pebs_buf_on_cpu(int cpu); + +void fini_arch_pebs_buf_on_cpu(int cpu); + void init_debug_store_on_cpu(int cpu); =20 void fini_debug_store_on_cpu(int cpu); diff --git a/arch/x86/include/asm/intel_ds.h b/arch/x86/include/asm/intel_d= s.h index 5dbeac48a5b9..023c2883f9f3 100644 --- a/arch/x86/include/asm/intel_ds.h +++ b/arch/x86/include/asm/intel_ds.h @@ -4,7 +4,8 @@ #include =20 #define BTS_BUFFER_SIZE (PAGE_SIZE << 4) -#define PEBS_BUFFER_SIZE (PAGE_SIZE << 4) +#define PEBS_BUFFER_SHIFT 4 +#define PEBS_BUFFER_SIZE (PAGE_SIZE << PEBS_BUFFER_SHIFT) =20 /* The maximal number of PEBS events: */ #define 
MAX_PEBS_EVENTS_FMT4 8 --=20 2.43.0 From nobody Thu Oct 9 04:14:03 2025 From: Dapeng Mi To: Peter Zijlstra , Ingo Molnar , Arnaldo Carvalho de Melo , Namhyung Kim , Ian Rogers , Adrian Hunter , Alexander Shishkin , Kan Liang , Andi Kleen , Eranian Stephane Cc: linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org, Dapeng Mi , Dapeng Mi Subject: [Patch v4 08/13] perf/x86/intel: Update dyn_constraint based on PEBS event precise level Date: Fri, 20 Jun 2025 10:39:04 +0000 Message-ID: <20250620103909.1586595-9-dapeng1.mi@linux.intel.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20250620103909.1586595-1-dapeng1.mi@linux.intel.com> References: <20250620103909.1586595-1-dapeng1.mi@linux.intel.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" arch-PEBS provides CPUIDs to enumerate which counters support PEBS sampling
and precise distribution PEBS sampling. Thus PEBS constraints should be dynamically configured based on these counter and precise distribution bitmaps instead of being defined statically. Update the event's dyn_constraint based on the PEBS event's precise level. Signed-off-by: Dapeng Mi --- arch/x86/events/intel/core.c | 11 +++++++++++ arch/x86/events/intel/ds.c | 1 + 2 files changed, 12 insertions(+) diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c index 5e6ef9f3a077..00b41c693d13 100644 --- a/arch/x86/events/intel/core.c +++ b/arch/x86/events/intel/core.c @@ -4255,6 +4255,8 @@ static int intel_pmu_hw_config(struct perf_event *eve= nt) } =20 if (event->attr.precise_ip) { + struct arch_pebs_cap pebs_cap =3D hybrid(event->pmu, arch_pebs_cap); + if ((event->attr.config & INTEL_ARCH_EVENT_MASK) =3D=3D INTEL_FIXED_VLBR= _EVENT) return -EINVAL; =20 @@ -4268,6 +4270,15 @@ static int intel_pmu_hw_config(struct perf_event *ev= ent) } if (x86_pmu.pebs_aliases) x86_pmu.pebs_aliases(event); + + if (x86_pmu.arch_pebs) { + u64 cntr_mask =3D hybrid(event->pmu, intel_ctrl) & + ~GLOBAL_CTRL_EN_PERF_METRICS; + u64 pebs_mask =3D event->attr.precise_ip >=3D 3 ?
+ pebs_cap.pdists : pebs_cap.counters; + if (cntr_mask !=3D pebs_mask) + event->hw.dyn_constraint &=3D pebs_mask; + } } =20 if (needs_branch_stack(event)) { diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c index 72b925b8c482..30915338b929 100644 --- a/arch/x86/events/intel/ds.c +++ b/arch/x86/events/intel/ds.c @@ -2987,6 +2987,7 @@ static void __init intel_arch_pebs_init(void) x86_pmu.pebs_buffer_size =3D PEBS_BUFFER_SIZE; x86_pmu.drain_pebs =3D intel_pmu_drain_arch_pebs; x86_pmu.pebs_capable =3D ~0ULL; + x86_pmu.flags |=3D PMU_FL_PEBS_ALL; =20 x86_pmu.pebs_enable =3D __intel_pmu_pebs_enable; x86_pmu.pebs_disable =3D __intel_pmu_pebs_disable; --=20 2.43.0 From nobody Thu Oct 9 04:14:03 2025 From: Dapeng Mi To: Peter Zijlstra , Ingo Molnar , Arnaldo Carvalho de Melo , Namhyung Kim , Ian Rogers , Adrian Hunter , Alexander Shishkin , Kan Liang , Andi Kleen , Eranian Stephane Cc: linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org, Dapeng Mi , Dapeng Mi Subject: [Patch v4 09/13]
perf/x86/intel: Setup PEBS data configuration and enable legacy groups Date: Fri, 20 Jun 2025 10:39:05 +0000 Message-ID: <20250620103909.1586595-10-dapeng1.mi@linux.intel.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20250620103909.1586595-1-dapeng1.mi@linux.intel.com> References: <20250620103909.1586595-1-dapeng1.mi@linux.intel.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Different from legacy PEBS, arch-PEBS provides per-counter PEBS data configuration by programming the IA32_PMC_GPx/FXx_CFG_C MSRs. This patch obtains the PEBS data configuration from the event attribute, writes it to the IA32_PMC_GPx/FXx_CFG_C MSRs and enables the corresponding PEBS groups. Please note this patch only enables XMM SIMD regs sampling for arch-PEBS; the other SIMD regs (OPMASK/YMM/ZMM) sampling on arch-PEBS will be supported after PMI based SIMD regs (OPMASK/YMM/ZMM) sampling is supported. Co-developed-by: Kan Liang Signed-off-by: Kan Liang Signed-off-by: Dapeng Mi --- arch/x86/events/intel/core.c | 130 ++++++++++++++++++++++++++++++- arch/x86/events/intel/ds.c | 17 ++++ arch/x86/events/perf_event.h | 12 +++ arch/x86/include/asm/intel_ds.h | 7 ++ arch/x86/include/asm/msr-index.h | 8 ++ 5 files changed, 173 insertions(+), 1 deletion(-) diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c index 00b41c693d13..faea1d42ce0c 100644 --- a/arch/x86/events/intel/core.c +++ b/arch/x86/events/intel/core.c @@ -2563,6 +2563,39 @@ static void intel_pmu_disable_fixed(struct perf_even= t *event) cpuc->fixed_ctrl_val &=3D ~mask; } =20 +static inline void __intel_pmu_update_event_ext(int idx, u64 ext) +{ + struct cpu_hw_events *cpuc =3D this_cpu_ptr(&cpu_hw_events); + u32 msr =3D idx < INTEL_PMC_IDX_FIXED ?
+ x86_pmu_cfg_c_addr(idx, true) : + x86_pmu_cfg_c_addr(idx - INTEL_PMC_IDX_FIXED, false); + + cpuc->cfg_c_val[idx] =3D ext; + wrmsrq(msr, ext); +} + +static void intel_pmu_disable_event_ext(struct perf_event *event) +{ + if (!x86_pmu.arch_pebs) + return; + + /* + * Only clear CFG_C MSR for PEBS counter group events, + * it avoids the HW counter's value to be added into + * other PEBS records incorrectly after PEBS counter + * group events are disabled. + * + * For other events, it's unnecessary to clear CFG_C MSRs + * since CFG_C doesn't take effect if counter is in + * disabled state. That helps to reduce the WRMSR overhead + * in context switches. + */ + if (!is_pebs_counter_event_group(event)) + return; + + __intel_pmu_update_event_ext(event->hw.idx, 0); +} + static void intel_pmu_disable_event(struct perf_event *event) { struct hw_perf_event *hwc =3D &event->hw; @@ -2571,9 +2604,12 @@ static void intel_pmu_disable_event(struct perf_even= t *event) switch (idx) { case 0 ... INTEL_PMC_IDX_FIXED - 1: intel_clear_masks(event, idx); + intel_pmu_disable_event_ext(event); x86_pmu_disable_event(event); break; case INTEL_PMC_IDX_FIXED ... INTEL_PMC_IDX_FIXED_BTS - 1: + intel_pmu_disable_event_ext(event); + fallthrough; case INTEL_PMC_IDX_METRIC_BASE ... 
INTEL_PMC_IDX_METRIC_END: intel_pmu_disable_fixed(event); break; @@ -2944,6 +2980,67 @@ static void intel_pmu_enable_acr(struct perf_event *= event) =20 DEFINE_STATIC_CALL_NULL(intel_pmu_enable_acr_event, intel_pmu_enable_acr); =20 +static void intel_pmu_enable_event_ext(struct perf_event *event) +{ + struct cpu_hw_events *cpuc =3D this_cpu_ptr(&cpu_hw_events); + struct hw_perf_event *hwc =3D &event->hw; + union arch_pebs_index cached, index; + struct arch_pebs_cap cap; + u64 ext =3D 0; + + if (!x86_pmu.arch_pebs) + return; + + cap =3D hybrid(cpuc->pmu, arch_pebs_cap); + + if (event->attr.precise_ip) { + u64 pebs_data_cfg =3D intel_get_arch_pebs_data_config(event); + + ext |=3D ARCH_PEBS_EN; + if (hwc->flags & PERF_X86_EVENT_AUTO_RELOAD) + ext |=3D (-hwc->sample_period) & ARCH_PEBS_RELOAD; + + if (pebs_data_cfg && cap.caps) { + if (pebs_data_cfg & PEBS_DATACFG_MEMINFO) + ext |=3D ARCH_PEBS_AUX & cap.caps; + + if (pebs_data_cfg & PEBS_DATACFG_GP) + ext |=3D ARCH_PEBS_GPR & cap.caps; + + if (pebs_data_cfg & PEBS_DATACFG_XMMS) + ext |=3D ARCH_PEBS_VECR_XMM & cap.caps; + + if (pebs_data_cfg & PEBS_DATACFG_LBRS) + ext |=3D ARCH_PEBS_LBR & cap.caps; + } + + if (cpuc->n_pebs =3D=3D cpuc->n_large_pebs) + index.split.thresh =3D ARCH_PEBS_THRESH_MUL; + else + index.split.thresh =3D ARCH_PEBS_THRESH_SINGLE; + + rdmsrl(MSR_IA32_PEBS_INDEX, cached.full); + if (index.split.thresh !=3D cached.split.thresh || !cached.split.en) { + if (cached.split.thresh =3D=3D ARCH_PEBS_THRESH_MUL && + cached.split.wr > 0) { + /* + * Large PEBS was enabled. + * Drain PEBS buffer before applying the single PEBS. 
+ */ + intel_pmu_drain_pebs_buffer(); + } else { + index.split.wr =3D 0; + index.split.full =3D 0; + index.split.en =3D 1; + wrmsrq(MSR_IA32_PEBS_INDEX, index.full); + } + } + } + + if (cpuc->cfg_c_val[hwc->idx] !=3D ext) + __intel_pmu_update_event_ext(hwc->idx, ext); +} + static void intel_pmu_enable_event(struct perf_event *event) { u64 enable_mask =3D ARCH_PERFMON_EVENTSEL_ENABLE; @@ -2959,10 +3056,12 @@ static void intel_pmu_enable_event(struct perf_even= t *event) enable_mask |=3D ARCH_PERFMON_EVENTSEL_BR_CNTR; intel_set_masks(event, idx); static_call_cond(intel_pmu_enable_acr_event)(event); + intel_pmu_enable_event_ext(event); __x86_pmu_enable_event(hwc, enable_mask); break; case INTEL_PMC_IDX_FIXED ... INTEL_PMC_IDX_FIXED_BTS - 1: static_call_cond(intel_pmu_enable_acr_event)(event); + intel_pmu_enable_event_ext(event); fallthrough; case INTEL_PMC_IDX_METRIC_BASE ... INTEL_PMC_IDX_METRIC_END: intel_pmu_enable_fixed(event); @@ -5298,6 +5397,29 @@ static inline bool intel_pmu_broken_perf_cap(void) return false; } =20 +static inline void __intel_update_pmu_caps(struct pmu *pmu) +{ + struct pmu *dest_pmu =3D pmu ? 
pmu : x86_get_pmu(smp_processor_id()); + + if (hybrid(pmu, arch_pebs_cap).caps & ARCH_PEBS_VECR_XMM) + dest_pmu->capabilities |=3D PERF_PMU_CAP_EXTENDED_REGS; +} + +static inline void __intel_update_large_pebs_flags(struct pmu *pmu) +{ + u64 caps =3D hybrid(pmu, arch_pebs_cap).caps; + + x86_pmu.large_pebs_flags |=3D PERF_SAMPLE_TIME; + if (caps & ARCH_PEBS_LBR) + x86_pmu.large_pebs_flags |=3D PERF_SAMPLE_BRANCH_STACK; + + if (!(caps & ARCH_PEBS_AUX)) + x86_pmu.large_pebs_flags &=3D ~PERF_SAMPLE_DATA_SRC; + if (!(caps & ARCH_PEBS_GPR)) + x86_pmu.large_pebs_flags &=3D + ~(PERF_SAMPLE_REGS_INTR | PERF_SAMPLE_REGS_USER); +} + static void update_pmu_cap(struct pmu *pmu) { unsigned int eax, ebx, ecx, edx; @@ -5345,8 +5467,12 @@ static void update_pmu_cap(struct pmu *pmu) hybrid(pmu, arch_pebs_cap).counters =3D pebs_mask; hybrid(pmu, arch_pebs_cap).pdists =3D pdists_mask; =20 - if (WARN_ON((pebs_mask | pdists_mask) & ~cntrs_mask)) + if (WARN_ON((pebs_mask | pdists_mask) & ~cntrs_mask)) { x86_pmu.arch_pebs =3D 0; + } else { + __intel_update_pmu_caps(pmu); + __intel_update_large_pebs_flags(pmu); + } } else { WARN_ON(x86_pmu.arch_pebs =3D=3D 1); x86_pmu.arch_pebs =3D 0; @@ -5510,6 +5636,8 @@ static void intel_pmu_cpu_starting(int cpu) } } =20 + __intel_update_pmu_caps(cpuc->pmu); + if (!cpuc->shared_regs) return; =20 diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c index 30915338b929..2989893b982a 100644 --- a/arch/x86/events/intel/ds.c +++ b/arch/x86/events/intel/ds.c @@ -1517,6 +1517,18 @@ pebs_update_state(bool needed_cb, struct cpu_hw_even= ts *cpuc, } } =20 +u64 intel_get_arch_pebs_data_config(struct perf_event *event) +{ + u64 pebs_data_cfg =3D 0; + + if (WARN_ON(event->hw.idx < 0 || event->hw.idx >=3D X86_PMC_IDX_MAX)) + return 0; + + pebs_data_cfg |=3D pebs_update_adaptive_cfg(event); + + return pebs_data_cfg; +} + void intel_pmu_pebs_add(struct perf_event *event) { struct cpu_hw_events *cpuc =3D this_cpu_ptr(&cpu_hw_events); @@ -2973,6 +2985,11 
@@ static void intel_pmu_drain_arch_pebs(struct pt_regs= *iregs, =20 index.split.wr =3D 0; index.split.full =3D 0; + index.split.en =3D 1; + if (cpuc->n_pebs =3D=3D cpuc->n_large_pebs) + index.split.thresh =3D ARCH_PEBS_THRESH_MUL; + else + index.split.thresh =3D ARCH_PEBS_THRESH_SINGLE; wrmsrq(MSR_IA32_PEBS_INDEX, index.full); } =20 diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h index 82e8c20611b9..db4ec2975de4 100644 --- a/arch/x86/events/perf_event.h +++ b/arch/x86/events/perf_event.h @@ -304,6 +304,8 @@ struct cpu_hw_events { /* Intel ACR configuration */ u64 acr_cfg_b[X86_PMC_IDX_MAX]; u64 acr_cfg_c[X86_PMC_IDX_MAX]; + /* Cached CFG_C values */ + u64 cfg_c_val[X86_PMC_IDX_MAX]; =20 /* * Intel LBR bits @@ -1216,6 +1218,14 @@ static inline unsigned int x86_pmu_fixed_ctr_addr(in= t index) x86_pmu.addr_offset(index, false) : index); } =20 +static inline unsigned int x86_pmu_cfg_c_addr(int index, bool gp) +{ + u32 base =3D gp ? MSR_IA32_PMC_V6_GP0_CFG_C : MSR_IA32_PMC_V6_FX0_CFG_C; + + return base + (x86_pmu.addr_offset ? x86_pmu.addr_offset(index, false) : + index * MSR_IA32_PMC_V6_STEP); +} + static inline int x86_pmu_rdpmc_index(int index) { return x86_pmu.rdpmc_index ? x86_pmu.rdpmc_index(index) : index; @@ -1779,6 +1789,8 @@ void intel_pmu_pebs_data_source_cmt(void); =20 void intel_pmu_pebs_data_source_lnl(void); =20 +u64 intel_get_arch_pebs_data_config(struct perf_event *event); + int intel_pmu_setup_lbr_filter(struct perf_event *event); =20 void intel_pt_interrupt(void); diff --git a/arch/x86/include/asm/intel_ds.h b/arch/x86/include/asm/intel_d= s.h index 023c2883f9f3..7bb80c993bef 100644 --- a/arch/x86/include/asm/intel_ds.h +++ b/arch/x86/include/asm/intel_ds.h @@ -7,6 +7,13 @@ #define PEBS_BUFFER_SHIFT 4 #define PEBS_BUFFER_SIZE (PAGE_SIZE << PEBS_BUFFER_SHIFT) =20 +/* + * The largest PEBS record could consume a page, ensure + * a record at least can be written after triggering PMI. 
+ */ +#define ARCH_PEBS_THRESH_MUL ((PEBS_BUFFER_SIZE - PAGE_SIZE) >> PEBS_BUFFE= R_SHIFT) +#define ARCH_PEBS_THRESH_SINGLE 1 + /* The maximal number of PEBS events: */ #define MAX_PEBS_EVENTS_FMT4 8 #define MAX_PEBS_EVENTS 32 diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-in= dex.h index d3bc28230628..07a0e03feb5e 100644 --- a/arch/x86/include/asm/msr-index.h +++ b/arch/x86/include/asm/msr-index.h @@ -328,6 +328,14 @@ #define ARCH_PEBS_OFFSET_MASK 0x7fffff #define ARCH_PEBS_INDEX_WR_SHIFT 4 =20 +#define ARCH_PEBS_RELOAD 0xffffffff +#define ARCH_PEBS_LBR_SHIFT 40 +#define ARCH_PEBS_LBR (0x3ull << ARCH_PEBS_LBR_SHIFT) +#define ARCH_PEBS_VECR_XMM BIT_ULL(49) +#define ARCH_PEBS_GPR BIT_ULL(61) +#define ARCH_PEBS_AUX BIT_ULL(62) +#define ARCH_PEBS_EN BIT_ULL(63) + #define MSR_IA32_RTIT_CTL 0x00000570 #define RTIT_CTL_TRACEEN BIT(0) #define RTIT_CTL_CYCLEACC BIT(1) --=20 2.43.0 From nobody Thu Oct 9 04:14:03 2025 From: Dapeng Mi To: Peter Zijlstra , Ingo Molnar , Arnaldo Carvalho de Melo , Namhyung
Kim , Ian Rogers , Adrian Hunter , Alexander Shishkin , Kan Liang , Andi Kleen , Eranian Stephane Cc: linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org, Dapeng Mi , Dapeng Mi Subject: [Patch v4 10/13] perf/x86/intel: Add counter group support for arch-PEBS Date: Fri, 20 Jun 2025 10:39:06 +0000 Message-ID: <20250620103909.1586595-11-dapeng1.mi@linux.intel.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20250620103909.1586595-1-dapeng1.mi@linux.intel.com> References: <20250620103909.1586595-1-dapeng1.mi@linux.intel.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Based on the previous adaptive PEBS counter snapshot support, add counter group support for architectural PEBS. Since arch-PEBS shares the same counter group layout with adaptive PEBS, directly reuse the __setup_pebs_counter_group() helper to process arch-PEBS counter groups.
Signed-off-by: Dapeng Mi --- arch/x86/events/intel/core.c | 38 ++++++++++++++++++++++++++++--- arch/x86/events/intel/ds.c | 29 ++++++++++++++++++++--- arch/x86/include/asm/msr-index.h | 6 +++++ arch/x86/include/asm/perf_event.h | 13 ++++++++--- 4 files changed, 77 insertions(+), 9 deletions(-) diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c index faea1d42ce0c..b37e09ce3f0c 100644 --- a/arch/x86/events/intel/core.c +++ b/arch/x86/events/intel/core.c @@ -3012,6 +3012,17 @@ static void intel_pmu_enable_event_ext(struct perf_e= vent *event) =20 if (pebs_data_cfg & PEBS_DATACFG_LBRS) ext |=3D ARCH_PEBS_LBR & cap.caps; + + if (pebs_data_cfg & + (PEBS_DATACFG_CNTR_MASK << PEBS_DATACFG_CNTR_SHIFT)) + ext |=3D ARCH_PEBS_CNTR_GP & cap.caps; + + if (pebs_data_cfg & + (PEBS_DATACFG_FIX_MASK << PEBS_DATACFG_FIX_SHIFT)) + ext |=3D ARCH_PEBS_CNTR_FIXED & cap.caps; + + if (pebs_data_cfg & PEBS_DATACFG_METRICS) + ext |=3D ARCH_PEBS_CNTR_METRICS & cap.caps; } =20 if (cpuc->n_pebs =3D=3D cpuc->n_large_pebs) @@ -3037,6 +3048,9 @@ static void intel_pmu_enable_event_ext(struct perf_ev= ent *event) } } =20 + if (is_pebs_counter_event_group(event)) + ext |=3D ARCH_PEBS_CNTR_ALLOW; + if (cpuc->cfg_c_val[hwc->idx] !=3D ext) __intel_pmu_update_event_ext(hwc->idx, ext); } @@ -4321,6 +4335,20 @@ static bool intel_pmu_is_acr_group(struct perf_event= *event) return false; } =20 +static inline bool intel_pmu_has_pebs_counter_group(struct pmu *pmu) +{ + u64 caps; + + if (x86_pmu.intel_cap.pebs_format >=3D 6 && x86_pmu.intel_cap.pebs_baseli= ne) + return true; + + caps =3D hybrid(pmu, arch_pebs_cap).caps; + if (x86_pmu.arch_pebs && (caps & ARCH_PEBS_CNTR_MASK)) + return true; + + return false; +} + static inline void intel_pmu_set_acr_cntr_constr(struct perf_event *event, u64 *cause_mask, int *num) { @@ -4469,8 +4497,7 @@ static int intel_pmu_hw_config(struct perf_event *eve= nt) } =20 if ((event->attr.sample_type & PERF_SAMPLE_READ) && - (x86_pmu.intel_cap.pebs_format >=3D 
6) && - x86_pmu.intel_cap.pebs_baseline && + intel_pmu_has_pebs_counter_group(event->pmu) && is_sampling_event(event) && event->attr.precise_ip) event->group_leader->hw.flags |=3D PERF_X86_EVENT_PEBS_CNTR; @@ -5412,6 +5439,8 @@ static inline void __intel_update_large_pebs_flags(st= ruct pmu *pmu) x86_pmu.large_pebs_flags |=3D PERF_SAMPLE_TIME; if (caps & ARCH_PEBS_LBR) x86_pmu.large_pebs_flags |=3D PERF_SAMPLE_BRANCH_STACK; + if (caps & ARCH_PEBS_CNTR_MASK) + x86_pmu.large_pebs_flags |=3D PERF_SAMPLE_READ; =20 if (!(caps & ARCH_PEBS_AUX)) x86_pmu.large_pebs_flags &=3D ~PERF_SAMPLE_DATA_SRC; @@ -7123,8 +7152,11 @@ __init int intel_pmu_init(void) * Many features on and after V6 require dynamic constraint, * e.g., Arch PEBS, ACR. */ - if (version >=3D 6) + if (version >=3D 6) { x86_pmu.flags |=3D PMU_FL_DYN_CONSTRAINT; + x86_pmu.late_setup =3D intel_pmu_late_setup; + } + /* * Install the hw-cache-events table: */ diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c index 2989893b982a..e378f33206ed 100644 --- a/arch/x86/events/intel/ds.c +++ b/arch/x86/events/intel/ds.c @@ -1519,13 +1519,20 @@ pebs_update_state(bool needed_cb, struct cpu_hw_eve= nts *cpuc, =20 u64 intel_get_arch_pebs_data_config(struct perf_event *event) { + struct cpu_hw_events *cpuc =3D this_cpu_ptr(&cpu_hw_events); u64 pebs_data_cfg =3D 0; + u64 cntr_mask; =20 if (WARN_ON(event->hw.idx < 0 || event->hw.idx >=3D X86_PMC_IDX_MAX)) return 0; =20 pebs_data_cfg |=3D pebs_update_adaptive_cfg(event); =20 + cntr_mask =3D (PEBS_DATACFG_CNTR_MASK << PEBS_DATACFG_CNTR_SHIFT) | + (PEBS_DATACFG_FIX_MASK << PEBS_DATACFG_FIX_SHIFT) | + PEBS_DATACFG_CNTR | PEBS_DATACFG_METRICS; + pebs_data_cfg |=3D cpuc->pebs_data_cfg & cntr_mask; + return pebs_data_cfg; } =20 @@ -2430,6 +2437,24 @@ static void setup_arch_pebs_sample_data(struct perf_= event *event, } } =20 + if (header->cntr) { + struct arch_pebs_cntr_header *cntr =3D next_record; + unsigned int nr; + + next_record +=3D sizeof(struct 
arch_pebs_cntr_header); + + if (is_pebs_counter_event_group(event)) { + __setup_pebs_counter_group(cpuc, event, + (struct pebs_cntr_header *)cntr, next_record); + data->sample_flags |=3D PERF_SAMPLE_READ; + } + + nr =3D hweight32(cntr->cntr) + hweight32(cntr->fixed); + if (cntr->metrics =3D=3D INTEL_CNTR_METRICS) + nr +=3D 2; + next_record +=3D nr * sizeof(u64); + } + /* Parse followed fragments if there are. */ if (arch_pebs_record_continued(header)) { at =3D at + header->size; @@ -3076,10 +3101,8 @@ static void __init intel_ds_pebs_init(void) break; =20 case 6: - if (x86_pmu.intel_cap.pebs_baseline) { + if (x86_pmu.intel_cap.pebs_baseline) x86_pmu.large_pebs_flags |=3D PERF_SAMPLE_READ; - x86_pmu.late_setup =3D intel_pmu_late_setup; - } fallthrough; case 5: x86_pmu.pebs_ept =3D 1; diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-in= dex.h index 07a0e03feb5e..4de9c0d22fa1 100644 --- a/arch/x86/include/asm/msr-index.h +++ b/arch/x86/include/asm/msr-index.h @@ -329,12 +329,18 @@ #define ARCH_PEBS_INDEX_WR_SHIFT 4 =20 #define ARCH_PEBS_RELOAD 0xffffffff +#define ARCH_PEBS_CNTR_ALLOW BIT_ULL(35) +#define ARCH_PEBS_CNTR_GP BIT_ULL(36) +#define ARCH_PEBS_CNTR_FIXED BIT_ULL(37) +#define ARCH_PEBS_CNTR_METRICS BIT_ULL(38) #define ARCH_PEBS_LBR_SHIFT 40 #define ARCH_PEBS_LBR (0x3ull << ARCH_PEBS_LBR_SHIFT) #define ARCH_PEBS_VECR_XMM BIT_ULL(49) #define ARCH_PEBS_GPR BIT_ULL(61) #define ARCH_PEBS_AUX BIT_ULL(62) #define ARCH_PEBS_EN BIT_ULL(63) +#define ARCH_PEBS_CNTR_MASK (ARCH_PEBS_CNTR_GP | ARCH_PEBS_CNTR_FIXED | \ + ARCH_PEBS_CNTR_METRICS) =20 #define MSR_IA32_RTIT_CTL 0x00000570 #define RTIT_CTL_TRACEEN BIT(0) diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_= event.h index 0f70d13780fe..380f89fd5dac 100644 --- a/arch/x86/include/asm/perf_event.h +++ b/arch/x86/include/asm/perf_event.h @@ -137,16 +137,16 @@ #define ARCH_PERFMON_EVENTS_COUNT 7 =20 #define PEBS_DATACFG_MEMINFO BIT_ULL(0) -#define PEBS_DATACFG_GP 
BIT_ULL(1) +#define PEBS_DATACFG_GP BIT_ULL(1) #define PEBS_DATACFG_XMMS BIT_ULL(2) #define PEBS_DATACFG_LBRS BIT_ULL(3) -#define PEBS_DATACFG_LBR_SHIFT 24 #define PEBS_DATACFG_CNTR BIT_ULL(4) +#define PEBS_DATACFG_METRICS BIT_ULL(5) +#define PEBS_DATACFG_LBR_SHIFT 24 #define PEBS_DATACFG_CNTR_SHIFT 32 #define PEBS_DATACFG_CNTR_MASK GENMASK_ULL(15, 0) #define PEBS_DATACFG_FIX_SHIFT 48 #define PEBS_DATACFG_FIX_MASK GENMASK_ULL(7, 0) -#define PEBS_DATACFG_METRICS BIT_ULL(5) =20 /* Steal the highest bit of pebs_data_cfg for SW usage */ #define PEBS_UPDATE_DS_SW BIT_ULL(63) @@ -599,6 +599,13 @@ struct arch_pebs_lbr_header { u64 ler_info; }; =20 +struct arch_pebs_cntr_header { + u32 cntr; + u32 fixed; + u32 metrics; + u32 reserved; +}; + /* * AMD Extended Performance Monitoring and Debug cpuid feature detection */ --=20 2.43.0 From nobody Thu Oct 9 04:14:03 2025 From: Dapeng Mi To: Peter Zijlstra , Ingo Molnar , Arnaldo Carvalho de Melo , Namhyung Kim , Ian Rogers , Adrian Hunter , Alexander Shishkin , Kan Liang , Andi Kleen ,
Eranian Stephane Cc: linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org, Dapeng Mi , Dapeng Mi Subject: [Patch v4 11/13] perf/x86: Support to sample SSP register Date: Fri, 20 Jun 2025 10:39:07 +0000 Message-ID: <20250620103909.1586595-12-dapeng1.mi@linux.intel.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20250620103909.1586595-1-dapeng1.mi@linux.intel.com> References: <20250620103909.1586595-1-dapeng1.mi@linux.intel.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" This patch adds basic support for sampling SSP register in perf/x86 common code. The x86/intel specific support would be added in next patch. Signed-off-by: Dapeng Mi --- arch/x86/events/intel/ds.c | 2 ++ arch/x86/include/asm/perf_event.h | 1 + arch/x86/include/uapi/asm/perf_regs.h | 4 +++- arch/x86/kernel/perf_regs.c | 7 +++++++ 4 files changed, 13 insertions(+), 1 deletion(-) diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c index e378f33206ed..d3a614ed7d60 100644 --- a/arch/x86/events/intel/ds.c +++ b/arch/x86/events/intel/ds.c @@ -2244,6 +2244,7 @@ static void setup_pebs_adaptive_sample_data(struct pe= rf_event *event, return; =20 perf_regs =3D container_of(regs, struct x86_perf_regs, regs); + perf_regs->ssp =3D 0; perf_regs->xmm_regs =3D NULL; =20 format_group =3D basic->format_group; @@ -2360,6 +2361,7 @@ static void setup_arch_pebs_sample_data(struct perf_e= vent *event, return; =20 perf_regs =3D container_of(regs, struct x86_perf_regs, regs); + perf_regs->ssp =3D 0; perf_regs->xmm_regs =3D NULL; =20 __setup_perf_sample_data(event, iregs, data); diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_= event.h index 380f89fd5dac..fcfb8fb6a7a5 100644 --- a/arch/x86/include/asm/perf_event.h +++ b/arch/x86/include/asm/perf_event.h @@ -700,6 +700,7 @@ extern void 
perf_events_lapic_init(void); struct pt_regs; struct x86_perf_regs { struct pt_regs regs; + u64 ssp; u64 *xmm_regs; }; =20 diff --git a/arch/x86/include/uapi/asm/perf_regs.h b/arch/x86/include/uapi/= asm/perf_regs.h index 7c9d2bb3833b..bf4cec52f808 100644 --- a/arch/x86/include/uapi/asm/perf_regs.h +++ b/arch/x86/include/uapi/asm/perf_regs.h @@ -27,9 +27,11 @@ enum perf_event_x86_regs { PERF_REG_X86_R13, PERF_REG_X86_R14, PERF_REG_X86_R15, + /* shadow stack pointer (SSP) */ + PERF_REG_X86_SSP, /* These are the limits for the GPRs. */ PERF_REG_X86_32_MAX =3D PERF_REG_X86_GS + 1, - PERF_REG_X86_64_MAX =3D PERF_REG_X86_R15 + 1, + PERF_REG_X86_64_MAX =3D PERF_REG_X86_SSP + 1, =20 /* These all need two bits set because they are 128bit */ PERF_REG_X86_XMM0 =3D 32, diff --git a/arch/x86/kernel/perf_regs.c b/arch/x86/kernel/perf_regs.c index 624703af80a1..1cbb9c901a08 100644 --- a/arch/x86/kernel/perf_regs.c +++ b/arch/x86/kernel/perf_regs.c @@ -54,6 +54,8 @@ static unsigned int pt_regs_offset[PERF_REG_X86_MAX] =3D { PT_REGS_OFFSET(PERF_REG_X86_R13, r13), PT_REGS_OFFSET(PERF_REG_X86_R14, r14), PT_REGS_OFFSET(PERF_REG_X86_R15, r15), + /* The pt_regs struct does not store shadow stack pointer. 
*/ + (unsigned int) -1, #endif }; =20 @@ -68,6 +70,11 @@ u64 perf_reg_value(struct pt_regs *regs, int idx) return perf_regs->xmm_regs[idx - PERF_REG_X86_XMM0]; } =20 + if (idx =3D=3D PERF_REG_X86_SSP) { + perf_regs =3D container_of(regs, struct x86_perf_regs, regs); + return perf_regs->ssp; + } + if (WARN_ON_ONCE(idx >=3D ARRAY_SIZE(pt_regs_offset))) return 0; =20 --=20 2.43.0 From nobody Thu Oct 9 04:14:03 2025 Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.18]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 28EC2233145; Fri, 20 Jun 2025 07:29:51 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=192.198.163.18 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1750404592; cv=none; b=III9K3g/H9E19rmfCu+N883F9zL6peP6BQdCtSF5E9LE0c1mBL6A/7UKOdzpEqrIRALqUSfiL0HEu8hsGZMknX7yb24KJwE/BPxtqgFlhe5jsTW6VwHe3eWRgx8p/TCDM+7Kchp+ki7b+J7hUMnRgaRhrspW+OwmvwfqCr1V7m8= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1750404592; c=relaxed/simple; bh=OxQb5cqVBGmoOgOx4NgdvWTB16JH2MCnPNbtL5ULKzY=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=b3v8ZH7bzMCU6Yek2E08tHtNII0VlYfZSA4UIRdTKAfHRVlA2d6rnvFrM2wb5/l1/QRcrYQRpPsj5jJn6NcNM2rYjDI9Bj9tpwIV3aoVhTxTghPh2/BFVsrj4XXfDVOWvk7kaotz/OphSdqcHQU2J3E5FQtxWo560SZEO6khBRk= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com; spf=none smtp.mailfrom=linux.intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=YyyWQ3Xd; arc=none smtp.client-ip=192.198.163.18 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass 
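[Editorial note] The perf_reg_value() dispatch in the patch above can be sketched as a self-contained userspace program. Everything here is an illustrative stand-in (fake_pt_regs, reg_value(), the register indices), not kernel API: the point is that SSP lives in a side-band slot of the wrapper struct, because pt_regs has no field for it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { REG_AX, REG_BX, REG_SSP, REG_MAX };

struct fake_pt_regs {
	uint64_t ax;
	uint64_t bx;
};

struct fake_x86_perf_regs {
	struct fake_pt_regs regs;	/* must be first: we downcast from &regs */
	uint64_t ssp;			/* side-band slot; pt_regs has no SSP */
};

/* Offset table in the style of pt_regs_offset[]; (unsigned int)-1 marks
 * registers that pt_regs does not store. */
static const unsigned int regs_offset[REG_MAX] = {
	offsetof(struct fake_pt_regs, ax),
	offsetof(struct fake_pt_regs, bx),
	(unsigned int)-1,		/* SSP */
};

static uint64_t reg_value(struct fake_pt_regs *regs, int idx)
{
	/* SSP is resolved from the wrapper struct, as in the patch. */
	if (idx == REG_SSP) {
		struct fake_x86_perf_regs *pr =
			(struct fake_x86_perf_regs *)regs;
		return pr->ssp;
	}
	if (idx >= REG_MAX || regs_offset[idx] == (unsigned int)-1)
		return 0;
	return *(uint64_t *)((char *)regs + regs_offset[idx]);
}
```

The (unsigned int)-1 sentinel mirrors the patch's pt_regs_offset table: a register without a pt_regs slot falls back to 0 unless, like SSP, it is resolved from the wrapper first.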
From nobody Thu Oct 9 04:14:03 2025
From: Dapeng Mi
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Namhyung Kim,
	Ian Rogers, Adrian Hunter, Alexander Shishkin, Kan Liang, Andi Kleen,
	Eranian Stephane
Cc: linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org,
	Dapeng Mi, Dapeng Mi
Subject: [Patch v4 12/13] perf/x86/intel: Support to sample SSP register for arch-PEBS
Date: Fri, 20 Jun 2025 10:39:08 +0000
Message-ID: <20250620103909.1586595-13-dapeng1.mi@linux.intel.com>
In-Reply-To: <20250620103909.1586595-1-dapeng1.mi@linux.intel.com>
References: <20250620103909.1586595-1-dapeng1.mi@linux.intel.com>

Arch-PEBS supports sampling the shadow stack pointer (SSP) register in
the GPR group. This patch adds SSP register sampling for arch-PEBS.
Note that this patch only enables PEBS-based SSP sampling; PMI-based
SSP sampling will be supported in a separate patch.

Signed-off-by: Dapeng Mi
---
 arch/x86/events/core.c       | 16 ++++++++++++++++
 arch/x86/events/intel/core.c |  5 +++--
 arch/x86/events/intel/ds.c   |  7 +++++--
 arch/x86/events/perf_event.h |  2 ++
 4 files changed, 26 insertions(+), 4 deletions(-)

diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index f30c423e4bd2..6435f6686c04 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -666,6 +666,22 @@ int x86_pmu_hw_config(struct perf_event *event)
 			return -EINVAL;
 	}
 
+	/*
+	 * sample_regs_user doesn't support SSP register now, it would be
+	 * supported later.
+	 */
+	if (event->attr.sample_regs_user & BIT_ULL(PERF_REG_X86_SSP))
+		return -EINVAL;
+
+	if (event->attr.sample_regs_intr & BIT_ULL(PERF_REG_X86_SSP)) {
+		/*
+		 * sample_regs_intr doesn't support SSP register for
+		 * non-PEBS events now. It would be supported later.
+		 */
+		if (!event->attr.precise_ip || !x86_pmu.arch_pebs)
+			return -EINVAL;
+	}
+
 	return x86_setup_perfctr(event);
 }
 
diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index b37e09ce3f0c..3013e9bce330 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -4152,14 +4152,15 @@ static void intel_pebs_aliases_skl(struct perf_event *event)
 static unsigned long intel_pmu_large_pebs_flags(struct perf_event *event)
 {
 	unsigned long flags = x86_pmu.large_pebs_flags;
+	u64 gprs_mask = x86_pmu.arch_pebs ? PEBS_GP_EXT_REGS : PEBS_GP_REGS;
 
 	if (event->attr.use_clockid)
 		flags &= ~PERF_SAMPLE_TIME;
 	if (!event->attr.exclude_kernel)
 		flags &= ~PERF_SAMPLE_REGS_USER;
-	if (event->attr.sample_regs_user & ~PEBS_GP_REGS)
+	if (event->attr.sample_regs_user & ~gprs_mask)
 		flags &= ~PERF_SAMPLE_REGS_USER;
-	if (event->attr.sample_regs_intr & ~PEBS_GP_REGS)
+	if (event->attr.sample_regs_intr & ~gprs_mask)
 		flags &= ~PERF_SAMPLE_REGS_INTR;
 	return flags;
 }
diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index d3a614ed7d60..7f790602f554 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -1436,6 +1436,7 @@ static u64 pebs_update_adaptive_cfg(struct perf_event *event)
 	u64 sample_type = attr->sample_type;
 	u64 pebs_data_cfg = 0;
 	bool gprs, tsx_weight;
+	u64 gprs_mask;
 
 	if (!(sample_type & ~(PERF_SAMPLE_IP|PERF_SAMPLE_TIME)) &&
 	    attr->precise_ip > 1)
@@ -1450,10 +1451,11 @@ static u64 pebs_update_adaptive_cfg(struct perf_event *event)
 	 * + precise_ip < 2 for the non event IP
 	 * + For RTM TSX weight we need GPRs for the abort code.
 	 */
+	gprs_mask = x86_pmu.arch_pebs ? PEBS_GP_EXT_REGS : PEBS_GP_REGS;
 	gprs = ((sample_type & PERF_SAMPLE_REGS_INTR) &&
-		(attr->sample_regs_intr & PEBS_GP_REGS)) ||
+		(attr->sample_regs_intr & gprs_mask)) ||
 	       ((sample_type & PERF_SAMPLE_REGS_USER) &&
-		(attr->sample_regs_user & PEBS_GP_REGS));
+		(attr->sample_regs_user & gprs_mask));
 
 	tsx_weight = (sample_type & PERF_SAMPLE_WEIGHT_TYPE) &&
 		     ((attr->config & INTEL_ARCH_EVENT_MASK) ==
@@ -2399,6 +2401,7 @@ static void setup_arch_pebs_sample_data(struct perf_event *event,
 
 		__setup_pebs_gpr_group(event, regs, (struct pebs_gprs *)gprs,
 				       sample_type);
+		perf_regs->ssp = gprs->ssp;
 	}
 
 	if (header->aux) {
diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index db4ec2975de4..bede9dd2720c 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -183,6 +183,8 @@ struct amd_nb {
 	(1ULL << PERF_REG_X86_R14) | \
 	(1ULL << PERF_REG_X86_R15))
 
+#define PEBS_GP_EXT_REGS	(PEBS_GP_REGS | BIT_ULL(PERF_REG_X86_SSP))
+
 /*
  * Per register state.
  */
-- 
2.43.0
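[Editorial note] The validation rules in the patch above (SSP never allowed in sample_regs_user, and allowed in sample_regs_intr only for precise events on arch-PEBS hardware) can be condensed into a standalone sketch. The bit position and helper names are hypothetical, and -1 stands in for -EINVAL:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BIT_ULL(n)	(1ULL << (n))

/* Hypothetical bit position for SSP; bits 0..15 stand in for the GPRs. */
#define REG_SSP_BIT	16
#define GP_REGS_MASK	(BIT_ULL(REG_SSP_BIT) - 1)
#define GP_EXT_MASK	(GP_REGS_MASK | BIT_ULL(REG_SSP_BIT))

/* Mirrors the x86_pmu_hw_config() checks: SSP is rejected outright in
 * sample_regs_user, and accepted in sample_regs_intr only for precise
 * (PEBS) events on arch-PEBS hardware.  -1 stands in for -EINVAL. */
static int check_ssp_request(uint64_t regs_user, uint64_t regs_intr,
			     int precise_ip, bool arch_pebs)
{
	if (regs_user & BIT_ULL(REG_SSP_BIT))
		return -1;
	if (regs_intr & BIT_ULL(REG_SSP_BIT)) {
		if (!precise_ip || !arch_pebs)
			return -1;
	}
	return 0;
}

/* Mirrors the gprs_mask selection in intel_pmu_large_pebs_flags() and
 * pebs_update_adaptive_cfg(): the allowed GPR mask widens to include
 * SSP only when arch-PEBS is present. */
static uint64_t allowed_gprs_mask(bool arch_pebs)
{
	return arch_pebs ? GP_EXT_MASK : GP_REGS_MASK;
}
```

The widened mask is what lets an SSP request survive the large-PEBS and adaptive-config paths on arch-PEBS parts while older hardware keeps the legacy GPR-only mask.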
From nobody Thu Oct 9 04:14:03 2025
From: Dapeng Mi
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Namhyung Kim,
	Ian Rogers, Adrian Hunter, Alexander Shishkin, Kan Liang, Andi Kleen,
	Eranian Stephane
Cc: linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org,
	Dapeng Mi, Dapeng Mi
Subject: [Patch v4 13/13] perf tools: x86: Support to show SSP register
Date: Fri, 20 Jun 2025 10:39:09 +0000
Message-ID: <20250620103909.1586595-14-dapeng1.mi@linux.intel.com>
In-Reply-To: <20250620103909.1586595-1-dapeng1.mi@linux.intel.com>
References: <20250620103909.1586595-1-dapeng1.mi@linux.intel.com>

Add SSP register support for x86 platforms.
Reviewed-by: Ian Rogers
Signed-off-by: Dapeng Mi
---
 tools/arch/x86/include/uapi/asm/perf_regs.h    | 7 ++++++-
 tools/perf/arch/x86/util/perf_regs.c           | 2 ++
 tools/perf/util/intel-pt.c                     | 2 +-
 tools/perf/util/perf-regs-arch/perf_regs_x86.c | 2 ++
 4 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/tools/arch/x86/include/uapi/asm/perf_regs.h b/tools/arch/x86/include/uapi/asm/perf_regs.h
index 7c9d2bb3833b..fc5e520acc00 100644
--- a/tools/arch/x86/include/uapi/asm/perf_regs.h
+++ b/tools/arch/x86/include/uapi/asm/perf_regs.h
@@ -27,9 +27,14 @@ enum perf_event_x86_regs {
 	PERF_REG_X86_R13,
 	PERF_REG_X86_R14,
 	PERF_REG_X86_R15,
+	/* shadow stack pointer (SSP) */
+	PERF_REG_X86_SSP,
 	/* These are the limits for the GPRs. */
 	PERF_REG_X86_32_MAX = PERF_REG_X86_GS + 1,
-	PERF_REG_X86_64_MAX = PERF_REG_X86_R15 + 1,
+	/* PERF_REG_X86_64_MAX used generally, for PEBS, etc. */
+	PERF_REG_X86_64_MAX = PERF_REG_X86_SSP + 1,
+	/* PERF_REG_INTEL_PT_MAX ignores the SSP register. */
+	PERF_REG_INTEL_PT_MAX = PERF_REG_X86_R15 + 1,
 
 	/* These all need two bits set because they are 128bit */
 	PERF_REG_X86_XMM0 = 32,
diff --git a/tools/perf/arch/x86/util/perf_regs.c b/tools/perf/arch/x86/util/perf_regs.c
index 12fd93f04802..9f492568f3b4 100644
--- a/tools/perf/arch/x86/util/perf_regs.c
+++ b/tools/perf/arch/x86/util/perf_regs.c
@@ -36,6 +36,8 @@ static const struct sample_reg sample_reg_masks[] = {
 	SMPL_REG(R14, PERF_REG_X86_R14),
 	SMPL_REG(R15, PERF_REG_X86_R15),
 #endif
+	SMPL_REG(SSP, PERF_REG_X86_SSP),
+
 	SMPL_REG2(XMM0, PERF_REG_X86_XMM0),
 	SMPL_REG2(XMM1, PERF_REG_X86_XMM1),
 	SMPL_REG2(XMM2, PERF_REG_X86_XMM2),
diff --git a/tools/perf/util/intel-pt.c b/tools/perf/util/intel-pt.c
index 9b1011fe4826..a6b53718be7d 100644
--- a/tools/perf/util/intel-pt.c
+++ b/tools/perf/util/intel-pt.c
@@ -2181,7 +2181,7 @@ static u64 *intel_pt_add_gp_regs(struct regs_dump *intr_regs, u64 *pos,
 	u32 bit;
 	int i;
 
-	for (i = 0, bit = 1; i < PERF_REG_X86_64_MAX; i++, bit <<= 1) {
+	for (i = 0, bit = 1; i < PERF_REG_INTEL_PT_MAX; i++, bit <<= 1) {
 		/* Get the PEBS gp_regs array index */
 		int n = pebs_gp_regs[i] - 1;
 
diff --git a/tools/perf/util/perf-regs-arch/perf_regs_x86.c b/tools/perf/util/perf-regs-arch/perf_regs_x86.c
index 708954a9d35d..c0e95215b577 100644
--- a/tools/perf/util/perf-regs-arch/perf_regs_x86.c
+++ b/tools/perf/util/perf-regs-arch/perf_regs_x86.c
@@ -54,6 +54,8 @@ const char *__perf_reg_name_x86(int id)
 		return "R14";
 	case PERF_REG_X86_R15:
 		return "R15";
+	case PERF_REG_X86_SSP:
+		return "SSP";
 
 #define XMM(x) \
 	case PERF_REG_X86_XMM ## x: \
-- 
2.43.0
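[Editorial note] The effect of patch 13's two limits (PERF_REG_X86_64_MAX, which now includes SSP, versus PERF_REG_INTEL_PT_MAX, which deliberately excludes it) can be illustrated with a loop modeled on intel_pt_add_gp_regs(). The enum values here are illustrative stand-ins for the uapi constants:

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative limits mirroring the tools-side enum: the generic GPR
 * limit now includes SSP, while Intel PT keeps the old R15-based one. */
enum {
	REG_R15		 = 15,
	REG_SSP		 = 16,
	REG_64_MAX	 = REG_SSP + 1,
	REG_INTEL_PT_MAX = REG_R15 + 1,
};

/* Count how many requested registers a consumer emits, given its
 * iteration limit (modeled on the intel_pt_add_gp_regs() loop). */
static int count_emitted(uint64_t regs_mask, int max)
{
	uint64_t bit;
	int i, n = 0;

	for (i = 0, bit = 1; i < max; i++, bit <<= 1) {
		if (regs_mask & bit)
			n++;
	}
	return n;
}
```

With both the R15 and SSP bits requested, an arch-PEBS style consumer iterating to REG_64_MAX emits both registers, while an Intel PT style consumer iterating only to REG_INTEL_PT_MAX skips SSP, matching the loop-bound change in the patch.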