From nobody Fri Apr 17 23:05:01 2026
From: kan.liang@linux.intel.com
To: peterz@infradead.org, mingo@redhat.com, acme@kernel.org,
    namhyung@kernel.org, tglx@linutronix.de, dave.hansen@linux.intel.com,
    irogers@google.com, adrian.hunter@intel.com, jolsa@kernel.org,
    alexander.shishkin@linux.intel.com, linux-kernel@vger.kernel.org
Cc: dapeng1.mi@linux.intel.com, ak@linux.intel.com, zide.chen@intel.com,
    Kan Liang
Subject: [RFC PATCH 01/12] perf/x86: Use x86_perf_regs in the x86 nmi handler
Date: Fri, 13 Jun 2025 06:49:32 -0700
Message-Id: <20250613134943.3186517-2-kan.liang@linux.intel.com>
In-Reply-To: <20250613134943.3186517-1-kan.liang@linux.intel.com>
References: <20250613134943.3186517-1-kan.liang@linux.intel.com>

From: Kan Liang

More and more regs will
be supported in the overflow handler, e.g., more vector registers, SSP,
etc. The generic struct pt_regs cannot store all of them. Use the
x86-specific struct x86_perf_regs instead, which is already used in PEBS
to store the XMM registers.

The struct pt_regs *regs is still passed to x86_pmu_handle_irq(), so
there is no functional change for the existing code.

AMD IBS's NMI handler doesn't utilize the x86_pmu_handle_irq() static
call, and struct x86_perf_regs doesn't apply to AMD IBS. It can be added
separately later, when AMD IBS supports more regs.

Signed-off-by: Kan Liang
---
 arch/x86/events/core.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index 7610f26dfbd9..64a7a8aa2e38 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -1752,6 +1752,7 @@ void perf_events_lapic_init(void)
 static int
 perf_event_nmi_handler(unsigned int cmd, struct pt_regs *regs)
 {
+    struct x86_perf_regs x86_regs;
     u64 start_clock;
     u64 finish_clock;
     int ret;
@@ -1764,7 +1765,8 @@ perf_event_nmi_handler(unsigned int cmd, struct pt_regs *regs)
         return NMI_DONE;
 
     start_clock = sched_clock();
-    ret = static_call(x86_pmu_handle_irq)(regs);
+    x86_regs.regs = *regs;
+    ret = static_call(x86_pmu_handle_irq)(&x86_regs.regs);
     finish_clock = sched_clock();
 
     perf_sample_event_took(finish_clock - start_clock);
-- 
2.38.1

From nobody Fri Apr 17 23:05:01 2026
From: kan.liang@linux.intel.com
To: peterz@infradead.org, mingo@redhat.com, acme@kernel.org,
    namhyung@kernel.org, tglx@linutronix.de, dave.hansen@linux.intel.com,
    irogers@google.com, adrian.hunter@intel.com, jolsa@kernel.org,
    alexander.shishkin@linux.intel.com, linux-kernel@vger.kernel.org
Cc: dapeng1.mi@linux.intel.com, ak@linux.intel.com, zide.chen@intel.com,
    Kan Liang
Subject: [RFC PATCH 02/12] perf/x86: Setup the regs data
Date: Fri, 13 Jun 2025 06:49:33 -0700
Message-Id: <20250613134943.3186517-3-kan.liang@linux.intel.com>
In-Reply-To: <20250613134943.3186517-1-kan.liang@linux.intel.com>
References: <20250613134943.3186517-1-kan.liang@linux.intel.com>

From: Kan Liang

The current code relies on the generic code to set up the regs data,
which will not work well once more regs are introduced. Introduce an
x86-specific x86_pmu_setup_regs_data().

For now, it behaves the same as the generic code. More x86-specific
handling will be added later along with the new regs.
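[Editor's note: a minimal userspace sketch of the sample-size accounting
done by x86_pmu_setup_regs_data() above — each requested regs set adds one
u64 for the ABI word plus one u64 per bit set in the sample_regs mask. The
names regs_dyn_size() and hweight64_sketch() are made up for illustration;
hweight64_sketch() stands in for the kernel's hweight64().]

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the kernel's hweight64(): population count. */
static unsigned int hweight64_sketch(uint64_t w)
{
    unsigned int n = 0;

    for (; w; w &= w - 1)  /* clear lowest set bit each pass */
        n++;
    return n;
}

/*
 * Sketch of the dyn_size accounting in the patch: one u64 for the
 * ABI word, plus one u64 per register requested in the mask.
 */
static uint64_t regs_dyn_size(uint64_t sample_regs_mask)
{
    return sizeof(uint64_t) +
           hweight64_sketch(sample_regs_mask) * sizeof(uint64_t);
}
```

E.g. a mask requesting eight registers yields 8 + 8*8 = 72 bytes of
dynamic sample data for that regs set.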
Signed-off-by: Kan Liang
---
 arch/x86/events/core.c       | 32 ++++++++++++++++++++++++++++++++
 arch/x86/events/intel/ds.c   |  4 +++-
 arch/x86/events/perf_event.h |  4 ++++
 3 files changed, 39 insertions(+), 1 deletion(-)

diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index 64a7a8aa2e38..c601ad761534 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -1685,6 +1685,38 @@ static void x86_pmu_del(struct perf_event *event, int flags)
     static_call_cond(x86_pmu_del)(event);
 }
 
+void x86_pmu_setup_regs_data(struct perf_event *event,
+                 struct perf_sample_data *data,
+                 struct pt_regs *regs)
+{
+    u64 sample_type = event->attr.sample_type;
+
+    if (sample_type & PERF_SAMPLE_REGS_USER) {
+        if (user_mode(regs)) {
+            data->regs_user.abi = perf_reg_abi(current);
+            data->regs_user.regs = regs;
+        } else if (!(current->flags & PF_KTHREAD)) {
+            perf_get_regs_user(&data->regs_user, regs);
+        } else {
+            data->regs_user.abi = PERF_SAMPLE_REGS_ABI_NONE;
+            data->regs_user.regs = NULL;
+        }
+        data->dyn_size += sizeof(u64);
+        if (data->regs_user.regs)
+            data->dyn_size += hweight64(event->attr.sample_regs_user) * sizeof(u64);
+        data->sample_flags |= PERF_SAMPLE_REGS_USER;
+    }
+
+    if (sample_type & PERF_SAMPLE_REGS_INTR) {
+        data->regs_intr.regs = regs;
+        data->regs_intr.abi = perf_reg_abi(current);
+        data->dyn_size += sizeof(u64);
+        if (data->regs_intr.regs)
+            data->dyn_size += hweight64(event->attr.sample_regs_intr) * sizeof(u64);
+        data->sample_flags |= PERF_SAMPLE_REGS_INTR;
+    }
+}
+
 int x86_pmu_handle_irq(struct pt_regs *regs)
 {
     struct perf_sample_data data;
diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index c0b7ac1c7594..e67d8a03ddfe 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -2126,8 +2126,10 @@ static void setup_pebs_adaptive_sample_data(struct perf_event *event,
             regs->flags &= ~PERF_EFLAGS_EXACT;
     }
 
-    if (sample_type & (PERF_SAMPLE_REGS_INTR | PERF_SAMPLE_REGS_USER))
+    if (sample_type & (PERF_SAMPLE_REGS_INTR | PERF_SAMPLE_REGS_USER)) {
         adaptive_pebs_save_regs(regs, gprs);
+        x86_pmu_setup_regs_data(event, data, regs);
+    }
     }
 
     if (format_group & PEBS_DATACFG_MEMINFO) {
diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index 2b969386dcdd..12682a059608 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -1278,6 +1278,10 @@ void x86_pmu_enable_event(struct perf_event *event);
 
 int x86_pmu_handle_irq(struct pt_regs *regs);
 
+void x86_pmu_setup_regs_data(struct perf_event *event,
+                 struct perf_sample_data *data,
+                 struct pt_regs *regs);
+
 void x86_pmu_show_pmu_cap(struct pmu *pmu);
 
 static inline int x86_pmu_num_counters(struct pmu *pmu)
-- 
2.38.1

From nobody Fri Apr 17 23:05:01 2026
From: kan.liang@linux.intel.com
To: peterz@infradead.org, mingo@redhat.com, acme@kernel.org,
    namhyung@kernel.org, tglx@linutronix.de, dave.hansen@linux.intel.com,
    irogers@google.com, adrian.hunter@intel.com, jolsa@kernel.org,
    alexander.shishkin@linux.intel.com, linux-kernel@vger.kernel.org
Cc: dapeng1.mi@linux.intel.com, ak@linux.intel.com, zide.chen@intel.com,
    Kan Liang
Subject: [RFC PATCH 03/12] x86/fpu/xstate: Add xsaves_nmi
Date: Fri, 13 Jun 2025 06:49:34 -0700
Message-Id: <20250613134943.3186517-4-kan.liang@linux.intel.com>
In-Reply-To: <20250613134943.3186517-1-kan.liang@linux.intel.com>
References: <20250613134943.3186517-1-kan.liang@linux.intel.com>

From: Kan Liang

The perf_event subsystem needs to retrieve the current vector registers
on a counter overflow, and the overflow is handled in NMI context. Add
an interface to retrieve the actual register contents at the moment the
NMI hit.

It's the invoker's responsibility to make sure the contents are properly
filtered before exposing them to the end user. The mask may be changed
according to the end user's request.

XSAVES, with the modified optimization, is chosen.
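[Editor's note: the XSAVES instruction takes its 64-bit requested-feature
bitmap in EDX:EAX, which is why the patch below passes (u32)mask and
(u32)(mask >> 32) to XSTATE_OP(). A tiny userspace sketch of that split;
xstate_mask_split() is a hypothetical name, not a kernel function.]

```c
#include <assert.h>
#include <stdint.h>

/*
 * XSAVES consumes the requested-feature bitmap as EDX:EAX.
 * Sketch of the split done when invoking the instruction:
 * low 32 feature bits go in EAX, high 32 in EDX.
 */
static void xstate_mask_split(uint64_t mask, uint32_t *eax, uint32_t *edx)
{
    *eax = (uint32_t)mask;         /* low 32 feature bits  */
    *edx = (uint32_t)(mask >> 32); /* high 32 feature bits */
}
```

So a mask of, say, SSE (bit 1) plus a hypothetical high feature bit 33
splits into EAX=0x2, EDX=0x2.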
Suggested-by: Dave Hansen
Signed-off-by: Kan Liang
---
 arch/x86/include/asm/fpu/xstate.h |  1 +
 arch/x86/kernel/fpu/xstate.c      | 22 ++++++++++++++++++++++
 2 files changed, 23 insertions(+)

diff --git a/arch/x86/include/asm/fpu/xstate.h b/arch/x86/include/asm/fpu/xstate.h
index b308a76afbb7..87c170d61138 100644
--- a/arch/x86/include/asm/fpu/xstate.h
+++ b/arch/x86/include/asm/fpu/xstate.h
@@ -106,6 +106,7 @@ extern void __init update_regset_xstate_info(unsigned int size,
 int xfeature_size(int xfeature_nr);
 
 void xsaves(struct xregs_state *xsave, u64 mask);
+void xsaves_nmi(struct xregs_state *xsave, u64 mask);
 void xrstors(struct xregs_state *xsave, u64 mask);
 
 int xfd_enable_feature(u64 xfd_err);
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index 9aa9ac8399ae..5b0bae135aff 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -1424,6 +1424,28 @@ void xsaves(struct xregs_state *xstate, u64 mask)
     WARN_ON_ONCE(err);
 }
 
+/**
+ * xsaves_nmi - Save selected components to a kernel xstate buffer in NMI
+ * @xstate: Pointer to the buffer
+ * @mask:   Feature mask to select the components to save
+ *
+ * The @xstate buffer must be 64 byte aligned and correctly initialized as
+ * XSAVES does not write the full xstate header.
+ *
+ * This function can only be invoked in an NMI. It returns the *ACTUAL*
+ * register contents when the NMI hit.
+ */
+void xsaves_nmi(struct xregs_state *xstate, u64 mask)
+{
+    int err;
+
+    if (!in_nmi())
+        return;
+
+    XSTATE_OP(XSAVES, xstate, (u32)mask, (u32)(mask >> 32), err);
+    WARN_ON_ONCE(err);
+}
+
 /**
  * xrstors - Restore selected components from a kernel xstate buffer
  * @xstate: Pointer to the buffer
-- 
2.38.1

From nobody Fri Apr 17 23:05:01 2026
From: kan.liang@linux.intel.com
To: peterz@infradead.org, mingo@redhat.com, acme@kernel.org,
    namhyung@kernel.org, tglx@linutronix.de, dave.hansen@linux.intel.com,
    irogers@google.com, adrian.hunter@intel.com, jolsa@kernel.org,
    alexander.shishkin@linux.intel.com, linux-kernel@vger.kernel.org
Cc: dapeng1.mi@linux.intel.com, ak@linux.intel.com, zide.chen@intel.com,
    Kan Liang
Subject: [RFC PATCH 04/12] perf: Move has_extended_regs() to header file
Date: Fri, 13 Jun 2025 06:49:35 -0700
Message-Id: <20250613134943.3186517-5-kan.liang@linux.intel.com>
In-Reply-To: <20250613134943.3186517-1-kan.liang@linux.intel.com>
References: <20250613134943.3186517-1-kan.liang@linux.intel.com>

From: Kan Liang

The function will also be used in arch-specific code, so move it to the
header file. Rename it to follow the naming convention of the existing
functions there. No functional change.

Signed-off-by: Kan Liang
---
 include/linux/perf_event.h | 8 ++++++++
 kernel/events/core.c       | 8 +-------
 2 files changed, 9 insertions(+), 7 deletions(-)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 52dc7cfab0e0..74c188a699e4 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -1488,6 +1488,14 @@ perf_event__output_id_sample(struct perf_event *event,
 extern void perf_log_lost_samples(struct perf_event *event, u64 lost);
 
+static inline bool event_has_extended_regs(struct perf_event *event)
+{
+    struct perf_event_attr *attr = &event->attr;
+
+    return (attr->sample_regs_user & PERF_REG_EXTENDED_MASK) ||
+           (attr->sample_regs_intr & PERF_REG_EXTENDED_MASK);
+}
+
 static inline bool event_has_any_exclude_flag(struct perf_event *event)
 {
     struct perf_event_attr *attr = &event->attr;
diff --git a/kernel/events/core.c b/kernel/events/core.c
index cc77f127e11a..7f0d98d73629 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -12502,12 +12502,6 @@ int perf_pmu_unregister(struct pmu *pmu)
 }
 EXPORT_SYMBOL_GPL(perf_pmu_unregister);
 
-static inline bool has_extended_regs(struct perf_event *event)
-{
-    return (event->attr.sample_regs_user & PERF_REG_EXTENDED_MASK) ||
-           (event->attr.sample_regs_intr & PERF_REG_EXTENDED_MASK);
-}
-
 static int perf_try_init_event(struct pmu *pmu, struct perf_event *event)
 {
     struct perf_event_context *ctx = NULL;
@@ -12542,7 +12536,7 @@ static int perf_try_init_event(struct pmu *pmu, struct perf_event *event)
         goto err_pmu;
 
     if (!(pmu->capabilities & PERF_PMU_CAP_EXTENDED_REGS) &&
-        has_extended_regs(event)) {
+        event_has_extended_regs(event)) {
         ret = -EOPNOTSUPP;
         goto err_destroy;
     }
-- 
2.38.1

From nobody Fri Apr 17 23:05:01 2026
From: kan.liang@linux.intel.com
To: peterz@infradead.org, mingo@redhat.com, acme@kernel.org,
    namhyung@kernel.org, tglx@linutronix.de, dave.hansen@linux.intel.com,
    irogers@google.com, adrian.hunter@intel.com, jolsa@kernel.org,
    alexander.shishkin@linux.intel.com, linux-kernel@vger.kernel.org
Cc: dapeng1.mi@linux.intel.com, ak@linux.intel.com, zide.chen@intel.com,
    Kan Liang
Subject: [RFC PATCH 05/12] perf/x86: Support XMM register for non-PEBS and REGS_USER
Date: Fri, 13 Jun 2025 06:49:36 -0700
Message-Id: <20250613134943.3186517-6-kan.liang@linux.intel.com>
In-Reply-To: <20250613134943.3186517-1-kan.liang@linux.intel.com>
References: <20250613134943.3186517-1-kan.liang@linux.intel.com>

From: Kan Liang

Collecting the XMM registers in a PEBS record has been supported since
Ice Lake, but non-PEBS events don't support the feature. It's possible
to retrieve the XMM registers from XSAVE for non-PEBS events. Add it to
make the support complete.

To utilize XSAVE, a 64-byte-aligned buffer is required. Add a per-CPU
ext_regs_buf to store the vector registers.

Extend the support for both REGS_USER and REGS_INTR. For REGS_USER,
perf_get_regs_user() returns the regs from task_pt_regs(current), which
is a struct pt_regs. It needs to be copied into a local struct
x86_perf_regs, x86_user_regs.

For PEBS, the HW support is still preferred. The XMM registers should be
retrieved from PEBS records.

More vector registers may be supported later. Add ext_regs_mask to track
the supported vector register groups. For now, the feature is only
supported on newer Intel platforms (PEBS V4+ or archPerfmonExt (0x23)).
In theory, the vector registers can be retrieved as long as the CPU
supports them. Support for older generations may be added later if there
is a requirement.
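[Editor's note: a userspace sketch of the 64-byte-alignment trick the
patch below uses for ext_regs_buf — over-allocate by 64 bytes, then round
the pointer up, mirroring the kernel's ALIGN() applied to the kzalloc'd
per-CPU buffer. align_up_64() and alloc_xsave_buf() are illustrative
names only, not kernel APIs.]

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Round a pointer up to the next 64-byte boundary (no-op if aligned). */
static void *align_up_64(void *p)
{
    return (void *)(((uintptr_t)p + 63) & ~(uintptr_t)63);
}

/*
 * Allocate a zeroed buffer usable for an XSAVE area: 64 bytes of slack
 * are added so the aligned pointer always fits within the allocation.
 * *raw receives the pointer to pass to free().
 */
static void *alloc_xsave_buf(size_t payload, void **raw)
{
    *raw = calloc(1, payload + 64);
    return *raw ? align_up_64(*raw) : NULL;
}
```

The caller keeps the raw pointer for freeing and hands only the aligned
pointer to the save instruction, exactly as the patch keeps ext_regs_buf
and aligns it at use time.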
Signed-off-by: Kan Liang
---
 arch/x86/events/core.c       | 119 ++++++++++++++++++++++++++++++-----
 arch/x86/events/intel/core.c |  27 ++++++++
 arch/x86/events/intel/ds.c   |  10 ++-
 arch/x86/events/perf_event.h |  12 +++-
 4 files changed, 148 insertions(+), 20 deletions(-)

diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index c601ad761534..6b1c347cc17a 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -406,6 +406,63 @@ set_ext_hw_attr(struct hw_perf_event *hwc, struct perf_event *event)
     return x86_pmu_extra_regs(val, event);
 }
 
+static DEFINE_PER_CPU(void *, ext_regs_buf);
+
+static void x86_pmu_get_ext_regs(struct x86_perf_regs *perf_regs, u64 mask)
+{
+    void *xsave = (void *)ALIGN((unsigned long)per_cpu(ext_regs_buf, smp_processor_id()), 64);
+    struct xregs_state *xregs_xsave = xsave;
+    u64 xcomp_bv;
+
+    if (WARN_ON_ONCE(!xsave))
+        return;
+
+    xsaves_nmi(xsave, mask);
+
+    xcomp_bv = xregs_xsave->header.xcomp_bv;
+    if (mask & XFEATURE_MASK_SSE && xcomp_bv & XFEATURE_SSE)
+        perf_regs->xmm_regs = (u64 *)xregs_xsave->i387.xmm_space;
+}
+
+static void release_ext_regs_buffers(void)
+{
+    int cpu;
+
+    if (!x86_pmu.ext_regs_mask)
+        return;
+
+    for_each_possible_cpu(cpu) {
+        kfree(per_cpu(ext_regs_buf, cpu));
+        per_cpu(ext_regs_buf, cpu) = NULL;
+    }
+}
+
+static void reserve_ext_regs_buffers(void)
+{
+    size_t size;
+    int cpu;
+
+    if (!x86_pmu.ext_regs_mask)
+        return;
+
+    size = FXSAVE_SIZE + XSAVE_HDR_SIZE;
+
+    /* XSAVE feature requires 64-byte alignment. */
+    size += 64;
+
+    for_each_possible_cpu(cpu) {
+        per_cpu(ext_regs_buf, cpu) = kzalloc_node(size, GFP_KERNEL,
+                              cpu_to_node(cpu));
+        if (!per_cpu(ext_regs_buf, cpu))
+            goto err;
+    }
+
+    return;
+
+err:
+    release_ext_regs_buffers();
+}
+
 int x86_reserve_hardware(void)
 {
     int err = 0;
@@ -418,6 +475,7 @@ int x86_reserve_hardware(void)
         } else {
             reserve_ds_buffers();
             reserve_lbr_buffers();
+            reserve_ext_regs_buffers();
         }
     }
     if (!err)
@@ -434,6 +492,7 @@ void x86_release_hardware(void)
         release_pmc_hardware();
         release_ds_buffers();
         release_lbr_buffers();
+        release_ext_regs_buffers();
         mutex_unlock(&pmc_reserve_mutex);
     }
 }
@@ -642,21 +701,18 @@ int x86_pmu_hw_config(struct perf_event *event)
             return -EINVAL;
     }
 
-    /* sample_regs_user never support XMM registers */
-    if (unlikely(event->attr.sample_regs_user & PERF_REG_EXTENDED_MASK))
-        return -EINVAL;
-    /*
-     * Besides the general purpose registers, XMM registers may
-     * be collected in PEBS on some platforms, e.g. Icelake
-     */
-    if (unlikely(event->attr.sample_regs_intr & PERF_REG_EXTENDED_MASK)) {
-        if (!(event->pmu->capabilities & PERF_PMU_CAP_EXTENDED_REGS))
-            return -EINVAL;
-
-        if (!event->attr.precise_ip)
-            return -EINVAL;
+    if (event->attr.sample_type & (PERF_SAMPLE_REGS_INTR | PERF_SAMPLE_REGS_USER)) {
+        /*
+         * Besides the general purpose registers, XMM registers may
+         * be collected as well.
+ */ + if (event_has_extended_regs(event)) { + if (!(event->pmu->capabilities & PERF_PMU_CAP_EXTENDED_REGS)) + return -EINVAL; + if (!(x86_pmu.ext_regs_mask & BIT_ULL(X86_EXT_REGS_XMM))) + return -EINVAL; + } } - return x86_setup_perfctr(event); } @@ -1685,18 +1741,40 @@ static void x86_pmu_del(struct perf_event *event, int flags) static_call_cond(x86_pmu_del)(event); } +static DEFINE_PER_CPU(struct x86_perf_regs, x86_user_regs); + +static struct x86_perf_regs * +x86_pmu_perf_get_regs_user(struct perf_sample_data *data, + struct pt_regs *regs) +{ + struct x86_perf_regs *x86_regs_user = this_cpu_ptr(&x86_user_regs); + struct perf_regs regs_user; + + perf_get_regs_user(&regs_user, regs); + data->regs_user.abi = regs_user.abi; + x86_regs_user->regs = *regs_user.regs; + data->regs_user.regs = &x86_regs_user->regs; + return x86_regs_user; +} + void x86_pmu_setup_regs_data(struct perf_event *event, struct perf_sample_data *data, - struct pt_regs *regs) + struct pt_regs *regs, + u64 ignore_mask) { + struct x86_perf_regs *perf_regs = container_of(regs, struct x86_perf_regs, regs); u64 sample_type = event->attr.sample_type; + u64 mask = 0; + + if (!(event->attr.sample_type & (PERF_SAMPLE_REGS_INTR | PERF_SAMPLE_REGS_USER))) + return; if (sample_type & PERF_SAMPLE_REGS_USER) { if (user_mode(regs)) { data->regs_user.abi = perf_reg_abi(current); data->regs_user.regs = regs; } else if (!(current->flags & PF_KTHREAD)) { - perf_get_regs_user(&data->regs_user, regs); + perf_regs = x86_pmu_perf_get_regs_user(data, regs); } else { data->regs_user.abi = PERF_SAMPLE_REGS_ABI_NONE; data->regs_user.regs = NULL; @@ -1715,6 +1793,15 @@ void x86_pmu_setup_regs_data(struct perf_event *event, data->dyn_size += hweight64(event->attr.sample_regs_intr) * sizeof(u64); data->sample_flags |= PERF_SAMPLE_REGS_INTR; } + + if (event_has_extended_regs(event)) { + perf_regs->xmm_regs = NULL; + mask |= XFEATURE_MASK_SSE; + } + + mask &= ~ignore_mask;
+ if (mask) + x86_pmu_get_ext_regs(perf_regs, mask); } int x86_pmu_handle_irq(struct pt_regs *regs) diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c index c2fb729c270e..5706ee562684 100644 --- a/arch/x86/events/intel/core.c +++ b/arch/x86/events/intel/core.c @@ -3284,6 +3284,8 @@ static int handle_pmi_common(struct pt_regs *regs, u64 status) if (has_branch_stack(event)) intel_pmu_lbr_save_brstack(&data, cpuc, event); + x86_pmu_setup_regs_data(event, &data, regs, 0); + perf_event_overflow(event, &data, regs); } @@ -5272,6 +5274,29 @@ static inline bool intel_pmu_broken_perf_cap(void) return false; } +static void intel_extended_regs_init(struct pmu *pmu) +{ + /* + * Extend the vector registers support to non-PEBS. + * The feature is limited to newer Intel machines with + * PEBS V4+ or archPerfmonExt (0x23) enabled for now. + * In theory, the vector registers can be retrieved as + * long as the CPU supports them. Support for older + * generations may be added later if there is a + * requirement. + * Only support the extension when XSAVES is available.
*/ + if (!boot_cpu_has(X86_FEATURE_XSAVES)) + return; + + if (!boot_cpu_has(X86_FEATURE_XMM) || + !cpu_has_xfeatures(XFEATURE_MASK_SSE, NULL)) + return; + + x86_pmu.ext_regs_mask |= BIT_ULL(X86_EXT_REGS_XMM); + x86_get_pmu(smp_processor_id())->capabilities |= PERF_PMU_CAP_EXTENDED_REGS; +} + static void update_pmu_cap(struct pmu *pmu) { unsigned int cntr, fixed_cntr, ecx, edx; @@ -5306,6 +5331,8 @@ static void update_pmu_cap(struct pmu *pmu) /* Perf Metric (Bit 15) and PEBS via PT (Bit 16) are hybrid enumeration */ rdmsrq(MSR_IA32_PERF_CAPABILITIES, hybrid(pmu, intel_cap).capabilities); } + + intel_extended_regs_init(pmu); } static void intel_pmu_check_hybrid_pmus(struct x86_hybrid_pmu *pmu) diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c index e67d8a03ddfe..ccb5c3ddab3b 100644 --- a/arch/x86/events/intel/ds.c +++ b/arch/x86/events/intel/ds.c @@ -1415,8 +1415,7 @@ static u64 pebs_update_adaptive_cfg(struct perf_event *event) if (gprs || (attr->precise_ip < 2) || tsx_weight) pebs_data_cfg |= PEBS_DATACFG_GP; - if ((sample_type & PERF_SAMPLE_REGS_INTR) && - (attr->sample_regs_intr & PERF_REG_EXTENDED_MASK)) + if (event_has_extended_regs(event)) pebs_data_cfg |= PEBS_DATACFG_XMMS; if (sample_type & PERF_SAMPLE_BRANCH_STACK) { @@ -2127,8 +2126,12 @@ static void setup_pebs_adaptive_sample_data(struct perf_event *event, } if (sample_type & (PERF_SAMPLE_REGS_INTR | PERF_SAMPLE_REGS_USER)) { + u64 mask = 0; + adaptive_pebs_save_regs(regs, gprs); - x86_pmu_setup_regs_data(event, data, regs); + if (format_group & PEBS_DATACFG_XMMS) + mask |= XFEATURE_MASK_SSE; + x86_pmu_setup_regs_data(event, data, regs, mask); } } @@ -2755,6 +2758,7 @@ void __init intel_pebs_init(void) x86_pmu.flags |= PMU_FL_PEBS_ALL; x86_pmu.pebs_capable = ~0ULL; pebs_qual = "-baseline"; + x86_pmu.ext_regs_mask |= BIT_ULL(X86_EXT_REGS_XMM); x86_get_pmu(smp_processor_id())->capabilities |= PERF_PMU_CAP_EXTENDED_REGS; } else { /* Only
basic record supported */ diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h index 12682a059608..b48f4215f37c 100644 --- a/arch/x86/events/perf_event.h +++ b/arch/x86/events/perf_event.h @@ -687,6 +687,10 @@ enum { x86_lbr_exclusive_max, }; +enum { + X86_EXT_REGS_XMM = 0, +}; + #define PERF_PEBS_DATA_SOURCE_MAX 0x100 #define PERF_PEBS_DATA_SOURCE_MASK (PERF_PEBS_DATA_SOURCE_MAX - 1) #define PERF_PEBS_DATA_SOURCE_GRT_MAX 0x10 @@ -992,6 +996,11 @@ struct x86_pmu { struct extra_reg *extra_regs; unsigned int flags; + /* + * Extended regs, e.g., vector registers + */ + u64 ext_regs_mask; + /* * Intel host/guest support (KVM) */ @@ -1280,7 +1289,8 @@ int x86_pmu_handle_irq(struct pt_regs *regs); void x86_pmu_setup_regs_data(struct perf_event *event, struct perf_sample_data *data, - struct pt_regs *regs); + struct pt_regs *regs, + u64 ignore_mask); void x86_pmu_show_pmu_cap(struct pmu *pmu); -- 2.38.1 From nobody Fri Apr 17 23:05:01 2026
From: kan.liang@linux.intel.com To: peterz@infradead.org, mingo@redhat.com, acme@kernel.org, namhyung@kernel.org, tglx@linutronix.de, dave.hansen@linux.intel.com, irogers@google.com, adrian.hunter@intel.com, jolsa@kernel.org, alexander.shishkin@linux.intel.com, linux-kernel@vger.kernel.org Cc: dapeng1.mi@linux.intel.com, ak@linux.intel.com, zide.chen@intel.com, Kan Liang Subject: [RFC PATCH 06/12] perf: Support extension of sample_regs Date: Fri, 13 Jun 2025 06:49:37 -0700 Message-Id: <20250613134943.3186517-7-kan.liang@linux.intel.com> In-Reply-To: <20250613134943.3186517-1-kan.liang@linux.intel.com> References: <20250613134943.3186517-1-kan.liang@linux.intel.com> From: Kan Liang More regs may be required in a sample, e.g., the vector registers. The current sample_regs_XXX has run out of space. Add sample_ext_regs_intr/user[2] in struct perf_event_attr, used as a bitmap for the extension regs. More than 64 registers will be added. Add a new flag PERF_PMU_CAP_EXTENDED_REGS2 to indicate a PMU that supports sample_ext_regs_intr/user. Extend perf_reg_validate() to validate the extension regs. Extend perf_reg_value() to retrieve the extension regs. A reg may be larger than a u64, so add two parameters to store the pointer and size. Add a dedicated perf_output_sample_ext_regs() to dump the extension regs. This is just generic support for the extension regs. Any attempt to manipulate the extension regs will error out until the driver-specific support is implemented, which is done in the following patch.
Signed-off-by: Kan Liang --- arch/arm/kernel/perf_regs.c | 9 +++-- arch/arm64/kernel/perf_regs.c | 9 +++-- arch/csky/kernel/perf_regs.c | 9 +++-- arch/loongarch/kernel/perf_regs.c | 8 +++-- arch/mips/kernel/perf_regs.c | 9 +++-- arch/powerpc/perf/perf_regs.c | 9 +++-- arch/riscv/kernel/perf_regs.c | 8 +++-- arch/s390/kernel/perf_regs.c | 9 +++-- arch/x86/kernel/perf_regs.c | 13 ++++++-- include/linux/perf_event.h | 15 +++++++++ include/linux/perf_regs.h | 29 +++++++++++++--- include/uapi/linux/perf_event.h | 8 +++++ kernel/events/core.c | 55 ++++++++++++++++++++++++++++--- 13 files changed, 162 insertions(+), 28 deletions(-) diff --git a/arch/arm/kernel/perf_regs.c b/arch/arm/kernel/perf_regs.c index 0529f90395c9..b6161c30bd40 100644 --- a/arch/arm/kernel/perf_regs.c +++ b/arch/arm/kernel/perf_regs.c @@ -8,8 +8,10 @@ #include #include -u64 perf_reg_value(struct pt_regs *regs, int idx) +u64 perf_reg_value(struct pt_regs *regs, int idx, u64 *ext, int *ext_size) { + if (WARN_ON_ONCE(ext || ext_size)) + return 0; if (WARN_ON_ONCE((u32)idx >= PERF_REG_ARM_MAX)) return 0; @@ -18,8 +20,11 @@ u64 perf_reg_value(struct pt_regs *regs, int idx) #define REG_RESERVED (~((1ULL << PERF_REG_ARM_MAX) - 1)) -int perf_reg_validate(u64 mask) +int perf_reg_validate(u64 mask, u64 *mask_ext) { + if (mask_ext) + return -EINVAL; + if (!mask || mask & REG_RESERVED) return -EINVAL; diff --git a/arch/arm64/kernel/perf_regs.c b/arch/arm64/kernel/perf_regs.c index b4eece3eb17d..668b54a7faf9 100644 --- a/arch/arm64/kernel/perf_regs.c +++ b/arch/arm64/kernel/perf_regs.c @@ -27,8 +27,10 @@ static u64 perf_ext_regs_value(int idx) } } -u64 perf_reg_value(struct pt_regs *regs, int idx) +u64 perf_reg_value(struct pt_regs *regs, int idx, u64 *ext, int *ext_size) { + if (WARN_ON_ONCE(ext || ext_size)) + return 0; if (WARN_ON_ONCE((u32)idx >= PERF_REG_ARM64_EXTENDED_MAX)) return 0; @@ -77,10 +79,13 @@ u64 perf_reg_value(struct pt_regs *regs, int idx) #define
REG_RESERVED (~((1ULL << PERF_REG_ARM64_MAX) - 1)) -int perf_reg_validate(u64 mask) +int perf_reg_validate(u64 mask, u64 *mask_ext) { u64 reserved_mask = REG_RESERVED; + if (mask_ext) + return -EINVAL; + if (system_supports_sve()) reserved_mask &= ~(1ULL << PERF_REG_ARM64_VG); diff --git a/arch/csky/kernel/perf_regs.c b/arch/csky/kernel/perf_regs.c index 09b7f88a2d6a..5988ef55bf0a 100644 --- a/arch/csky/kernel/perf_regs.c +++ b/arch/csky/kernel/perf_regs.c @@ -8,8 +8,10 @@ #include #include -u64 perf_reg_value(struct pt_regs *regs, int idx) +u64 perf_reg_value(struct pt_regs *regs, int idx, u64 *ext, int *ext_size) { + if (WARN_ON_ONCE(ext || ext_size)) + return 0; if (WARN_ON_ONCE((u32)idx >= PERF_REG_CSKY_MAX)) return 0; @@ -18,8 +20,11 @@ u64 perf_reg_value(struct pt_regs *regs, int idx) #define REG_RESERVED (~((1ULL << PERF_REG_CSKY_MAX) - 1)) -int perf_reg_validate(u64 mask) +int perf_reg_validate(u64 mask, u64 *mask_ext) { + if (mask_ext) + return -EINVAL; + if (!mask || mask & REG_RESERVED) return -EINVAL; diff --git a/arch/loongarch/kernel/perf_regs.c b/arch/loongarch/kernel/perf_regs.c index 263ac4ab5af6..798dadee75ff 100644 --- a/arch/loongarch/kernel/perf_regs.c +++ b/arch/loongarch/kernel/perf_regs.c @@ -25,8 +25,10 @@ u64 perf_reg_abi(struct task_struct *tsk) } #endif /* CONFIG_32BIT */ -int perf_reg_validate(u64 mask) +int perf_reg_validate(u64 mask, u64 *mask_ext) { + if (mask_ext) + return -EINVAL; if (!mask) return -EINVAL; if (mask & ~((1ull << PERF_REG_LOONGARCH_MAX) - 1)) @@ -34,8 +36,10 @@ int perf_reg_validate(u64 mask) return 0; } -u64 perf_reg_value(struct pt_regs *regs, int idx) +u64 perf_reg_value(struct pt_regs *regs, int idx, u64 *ext, int *ext_size) { + if (WARN_ON_ONCE(ext || ext_size)) + return 0; if (WARN_ON_ONCE((u32)idx >= PERF_REG_LOONGARCH_MAX)) return 0; diff --git a/arch/mips/kernel/perf_regs.c b/arch/mips/kernel/perf_regs.c index e686780d1647..f3fcbf7e5aa6 100644 ---
a/arch/mips/kernel/perf_regs.c +++ b/arch/mips/kernel/perf_regs.c @@ -28,8 +28,10 @@ u64 perf_reg_abi(struct task_struct *tsk) } #endif /* CONFIG_32BIT */ -int perf_reg_validate(u64 mask) +int perf_reg_validate(u64 mask, u64 *mask_ext) { + if (mask_ext) + return -EINVAL; if (!mask) return -EINVAL; if (mask & ~((1ull << PERF_REG_MIPS_MAX) - 1)) @@ -37,10 +39,13 @@ int perf_reg_validate(u64 mask) return 0; } -u64 perf_reg_value(struct pt_regs *regs, int idx) +u64 perf_reg_value(struct pt_regs *regs, int idx, u64 *ext, int *ext_size) { long v; + if (WARN_ON_ONCE(ext || ext_size)) + return 0; + switch (idx) { case PERF_REG_MIPS_PC: v = regs->cp0_epc; diff --git a/arch/powerpc/perf/perf_regs.c b/arch/powerpc/perf/perf_regs.c index 350dccb0143c..556466409c76 100644 --- a/arch/powerpc/perf/perf_regs.c +++ b/arch/powerpc/perf/perf_regs.c @@ -99,8 +99,11 @@ static u64 get_ext_regs_value(int idx) } } -u64 perf_reg_value(struct pt_regs *regs, int idx) +u64 perf_reg_value(struct pt_regs *regs, int idx, u64 *ext, int *ext_size) { + if (WARN_ON_ONCE(ext || ext_size)) + return 0; + if (idx == PERF_REG_POWERPC_SIER && (IS_ENABLED(CONFIG_FSL_EMB_PERF_EVENT) || IS_ENABLED(CONFIG_PPC32) || @@ -125,8 +128,10 @@ u64 perf_reg_value(struct pt_regs *regs, int idx) return regs_get_register(regs, pt_regs_offset[idx]); } -int perf_reg_validate(u64 mask) +int perf_reg_validate(u64 mask, u64 *mask_ext) { + if (mask_ext) + return -EINVAL; if (!mask || mask & REG_RESERVED) return -EINVAL; return 0; diff --git a/arch/riscv/kernel/perf_regs.c b/arch/riscv/kernel/perf_regs.c index fd304a248de6..05a4f1e7b243 100644 --- a/arch/riscv/kernel/perf_regs.c +++ b/arch/riscv/kernel/perf_regs.c @@ -8,8 +8,10 @@ #include #include -u64 perf_reg_value(struct pt_regs *regs, int idx) +u64 perf_reg_value(struct pt_regs *regs, int idx, u64 *ext, int *ext_size) { + if (WARN_ON_ONCE(ext || ext_size)) + return 0; if (WARN_ON_ONCE((u32)idx >= PERF_REG_RISCV_MAX)) return 0; @@ -18,8
+20,10 @@ u64 perf_reg_value(struct pt_regs *regs, int idx) #define REG_RESERVED (~((1ULL << PERF_REG_RISCV_MAX) - 1)) -int perf_reg_validate(u64 mask) +int perf_reg_validate(u64 mask, u64 *mask_ext) { + if (mask_ext) + return -EINVAL; if (!mask || mask & REG_RESERVED) return -EINVAL; diff --git a/arch/s390/kernel/perf_regs.c b/arch/s390/kernel/perf_regs.c index a6b058ee4a36..2e17ae51279e 100644 --- a/arch/s390/kernel/perf_regs.c +++ b/arch/s390/kernel/perf_regs.c @@ -7,10 +7,13 @@ #include #include -u64 perf_reg_value(struct pt_regs *regs, int idx) +u64 perf_reg_value(struct pt_regs *regs, int idx, u64 *ext, int *ext_size) { freg_t fp; + if (WARN_ON_ONCE(ext || ext_size)) + return 0; + if (idx >= PERF_REG_S390_R0 && idx <= PERF_REG_S390_R15) return regs->gprs[idx]; @@ -34,8 +37,10 @@ u64 perf_reg_value(struct pt_regs *regs, int idx) #define REG_RESERVED (~((1UL << PERF_REG_S390_MAX) - 1)) -int perf_reg_validate(u64 mask) +int perf_reg_validate(u64 mask, u64 *mask_ext) { + if (mask_ext) + return -EINVAL; if (!mask || mask & REG_RESERVED) return -EINVAL; diff --git a/arch/x86/kernel/perf_regs.c b/arch/x86/kernel/perf_regs.c index 624703af80a1..b9d5106afc26 100644 --- a/arch/x86/kernel/perf_regs.c +++ b/arch/x86/kernel/perf_regs.c @@ -57,10 +57,13 @@ static unsigned int pt_regs_offset[PERF_REG_X86_MAX] = { #endif }; -u64 perf_reg_value(struct pt_regs *regs, int idx) +u64 perf_reg_value(struct pt_regs *regs, int idx, u64 *ext, int *ext_size) { struct x86_perf_regs *perf_regs; + if (WARN_ON_ONCE(ext || ext_size)) + return 0; + if (idx >= PERF_REG_X86_XMM0 && idx < PERF_REG_X86_XMM_MAX) { perf_regs = container_of(regs, struct x86_perf_regs, regs); if (!perf_regs->xmm_regs) @@ -87,8 +90,10 @@ u64 perf_reg_value(struct pt_regs *regs, int idx) (1ULL << PERF_REG_X86_R14) | \ (1ULL << PERF_REG_X86_R15)) -int perf_reg_validate(u64 mask) +int perf_reg_validate(u64 mask, u64 *mask_ext) { + if (mask_ext) + return
-EINVAL; if (!mask || (mask & (REG_NOSUPPORT | PERF_REG_X86_RESERVED))) return -EINVAL; @@ -112,8 +117,10 @@ void perf_get_regs_user(struct perf_regs *regs_user, (1ULL << PERF_REG_X86_FS) | \ (1ULL << PERF_REG_X86_GS)) -int perf_reg_validate(u64 mask) +int perf_reg_validate(u64 mask, u64 *mask_ext) { + if (mask_ext) + return -EINVAL; if (!mask || (mask & (REG_NOSUPPORT | PERF_REG_X86_RESERVED))) return -EINVAL; diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h index 74c188a699e4..42b288ab4d2c 100644 --- a/include/linux/perf_event.h +++ b/include/linux/perf_event.h @@ -305,6 +305,7 @@ struct perf_event_pmu_context; #define PERF_PMU_CAP_EXTENDED_HW_TYPE 0x0100 #define PERF_PMU_CAP_AUX_PAUSE 0x0200 #define PERF_PMU_CAP_AUX_PREFER_LARGE 0x0400 +#define PERF_PMU_CAP_EXTENDED_REGS2 0x0800 /* sample_ext_regs_intr/user */ /** * pmu::scope @@ -1496,6 +1497,20 @@ static inline bool event_has_extended_regs(struct perf_event *event) (attr->sample_regs_intr & PERF_REG_EXTENDED_MASK); } +static inline bool event_has_extended_regs2(struct perf_event *event) +{ + struct perf_event_attr *attr = &event->attr; + int i; + + for (i = 0; i < PERF_ATTR_EXT_REGS_SIZE; i++) { + if (attr->sample_ext_regs_intr[i] || + attr->sample_ext_regs_user[i]) + return true; + } + + return false; +} + static inline bool event_has_any_exclude_flag(struct perf_event *event) { struct perf_event_attr *attr = &event->attr; diff --git a/include/linux/perf_regs.h b/include/linux/perf_regs.h index f632c5725f16..6119bcb010fb 100644 --- a/include/linux/perf_regs.h +++ b/include/linux/perf_regs.h @@ -16,23 +16,42 @@ struct perf_regs { #define PERF_REG_EXTENDED_MASK 0 #endif -u64 perf_reg_value(struct pt_regs *regs, int idx); -int perf_reg_validate(u64 mask); +#define PERF_EXT_REGS_SIZE_MAX 8 + +/** + * perf_reg_value - Get a reg value + * @regs: The area that stores all registers + * @idx: The index of the requested register.
+ * @ext below indicates whether the index refers to + * a regular register or an extension register. + * @ext: Pointer to a buffer that stores the + * value of the requested extension register. + * NULL requests a regular register. + * @ext_size: Size of the extension register. + * + * Returns 0 on failure. + * On success, returns the register value for a + * regular register (!ext), or ext[0] for an + * extension register (ext). + */ +u64 perf_reg_value(struct pt_regs *regs, int idx, u64 *ext, int *ext_size); +int perf_reg_validate(u64 mask, u64 *mask_ext); u64 perf_reg_abi(struct task_struct *task); void perf_get_regs_user(struct perf_regs *regs_user, struct pt_regs *regs); #else #define PERF_REG_EXTENDED_MASK 0 +#define PERF_EXT_REGS_SIZE_MAX 8 -static inline u64 perf_reg_value(struct pt_regs *regs, int idx) +static inline u64 perf_reg_value(struct pt_regs *regs, int idx, u64 *ext, int *ext_size) { return 0; } -static inline int perf_reg_validate(u64 mask) +static inline int perf_reg_validate(u64 mask, u64 *mask_ext) { - return mask ? -ENOSYS : 0; + return mask || mask_ext ?
-ENOSYS : 0; } static inline u64 perf_reg_abi(struct task_struct *task) diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h index 78a362b80027..e22ba72efcdb 100644 --- a/include/uapi/linux/perf_event.h +++ b/include/uapi/linux/perf_event.h @@ -382,6 +382,10 @@ enum perf_event_read_format { #define PERF_ATTR_SIZE_VER6 120 /* Add: aux_sample_size */ #define PERF_ATTR_SIZE_VER7 128 /* Add: sig_data */ #define PERF_ATTR_SIZE_VER8 136 /* Add: config3 */ +#define PERF_ATTR_SIZE_VER9 168 /* Add: sample_ext_regs_intr */ + /* Add: sample_ext_regs_user */ + +#define PERF_ATTR_EXT_REGS_SIZE 2 /* * 'struct perf_event_attr' contains various attributes that define @@ -543,6 +547,10 @@ struct perf_event_attr { __u64 sig_data; __u64 config3; /* extension of config2 */ + + /* extension of sample_regs_XXX */ + __u64 sample_ext_regs_intr[PERF_ATTR_EXT_REGS_SIZE]; + __u64 sample_ext_regs_user[PERF_ATTR_EXT_REGS_SIZE]; }; /* diff --git a/kernel/events/core.c b/kernel/events/core.c index 7f0d98d73629..c4279e1bf91a 100644 --- a/kernel/events/core.c +++ b/kernel/events/core.c @@ -7385,11 +7385,40 @@ perf_output_sample_regs(struct perf_output_handle *handle, for_each_set_bit(bit, _mask, sizeof(mask) * BITS_PER_BYTE) { u64 val; - val = perf_reg_value(regs, bit); + val = perf_reg_value(regs, bit, NULL, NULL); perf_output_put(handle, val); } } +static void +__perf_output_sample_ext_regs(struct perf_output_handle *handle, + struct pt_regs *regs, u64 mask, int base) +{ + u64 val[PERF_EXT_REGS_SIZE_MAX]; + int i, bit, size = 0; + DECLARE_BITMAP(_mask, 64); + + if (!mask) + return; + bitmap_from_u64(_mask, mask); + for_each_set_bit(bit, _mask, sizeof(mask) * BITS_PER_BYTE) { + perf_reg_value(regs, bit + base, val, &size); + + for (i = 0; i < size; i++) + perf_output_put(handle, val[i]); + } +} + +static void +perf_output_sample_ext_regs(struct perf_output_handle *handle, + struct pt_regs *regs, u64 *mask) +{ + int i; + + for (i = 0;
i < PERF_ATTR_EXT_REGS_SIZE; i++) + __perf_output_sample_ext_regs(handle, regs, mask[i], i * 64); +} + static void perf_sample_regs_user(struct perf_regs *regs_user, struct pt_regs *regs) { @@ -7940,9 +7969,14 @@ void perf_output_sample(struct perf_output_handle *handle, if (abi) { u64 mask = event->attr.sample_regs_user; + u64 *ext_mask = event->attr.sample_ext_regs_user; + perf_output_sample_regs(handle, data->regs_user.regs, mask); + perf_output_sample_ext_regs(handle, + data->regs_user.regs, + ext_mask); } } @@ -7971,10 +8005,14 @@ void perf_output_sample(struct perf_output_handle *handle, if (abi) { u64 mask = event->attr.sample_regs_intr; + u64 *ext_mask = event->attr.sample_ext_regs_intr; perf_output_sample_regs(handle, data->regs_intr.regs, mask); + perf_output_sample_ext_regs(handle, + data->regs_intr.regs, + ext_mask); } } @@ -12535,6 +12573,12 @@ static int perf_try_init_event(struct pmu *pmu, struct perf_event *event) if (ret) goto err_pmu; + if (!(pmu->capabilities & PERF_PMU_CAP_EXTENDED_REGS2) && + event_has_extended_regs2(event)) { + ret = -EOPNOTSUPP; + goto err_destroy; + } + if (!(pmu->capabilities & PERF_PMU_CAP_EXTENDED_REGS) && event_has_extended_regs(event)) { ret = -EOPNOTSUPP; @@ -13073,7 +13117,8 @@ static int perf_copy_attr(struct perf_event_attr __user *uattr, } if (attr->sample_type & PERF_SAMPLE_REGS_USER) { - ret = perf_reg_validate(attr->sample_regs_user); + ret = perf_reg_validate(attr->sample_regs_user, + attr->sample_ext_regs_user); if (ret) return ret; } @@ -13096,8 +13141,10 @@ static int perf_copy_attr(struct perf_event_attr __user *uattr, if (!attr->sample_max_stack) attr->sample_max_stack = sysctl_perf_event_max_stack; - if (attr->sample_type & PERF_SAMPLE_REGS_INTR) - ret = perf_reg_validate(attr->sample_regs_intr); + if (attr->sample_type & PERF_SAMPLE_REGS_INTR) { + ret = perf_reg_validate(attr->sample_regs_intr, + attr->sample_ext_regs_intr); + } #ifndef
CONFIG_CGROUP_PERF if (attr->sample_type & PERF_SAMPLE_CGROUP) -- 2.38.1 From nobody Fri Apr 17 23:05:01 2026
From: kan.liang@linux.intel.com To: peterz@infradead.org, mingo@redhat.com, acme@kernel.org, namhyung@kernel.org, tglx@linutronix.de, dave.hansen@linux.intel.com, irogers@google.com, adrian.hunter@intel.com, jolsa@kernel.org, alexander.shishkin@linux.intel.com, linux-kernel@vger.kernel.org Cc: dapeng1.mi@linux.intel.com, ak@linux.intel.com, zide.chen@intel.com, Kan Liang Subject: [RFC PATCH 07/12] perf/x86: Add YMMH in extended regs Date: Fri, 13 Jun 2025 06:49:38 -0700 Message-Id: <20250613134943.3186517-8-kan.liang@linux.intel.com> In-Reply-To: <20250613134943.3186517-1-kan.liang@linux.intel.com> References: <20250613134943.3186517-1-kan.liang@linux.intel.com> From:
Kan Liang Support YMMH as an extended register set. It can be configured in sample_ext_regs_intr/user. Only a PMU with PERF_PMU_CAP_EXTENDED_REGS2 supports the feature. The value can be retrieved via XSAVES. Add a sanity check in perf_reg_validate(). Signed-off-by: Kan Liang --- arch/x86/events/core.c | 26 ++++++++++++++ arch/x86/events/perf_event.h | 22 ++++++++++++ arch/x86/include/asm/perf_event.h | 1 + arch/x86/include/uapi/asm/perf_regs.h | 28 +++++++++++++++ arch/x86/kernel/perf_regs.c | 49 +++++++++++++++++++++++++-- 5 files changed, 124 insertions(+), 2 deletions(-) diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c index 6b1c347cc17a..91039c0256b3 100644 --- a/arch/x86/events/core.c +++ b/arch/x86/events/core.c @@ -422,6 +422,14 @@ static void x86_pmu_get_ext_regs(struct x86_perf_regs *perf_regs, u64 mask) xcomp_bv = xregs_xsave->header.xcomp_bv; if (mask & XFEATURE_MASK_SSE && xcomp_bv & XFEATURE_SSE) perf_regs->xmm_regs = (u64 *)xregs_xsave->i387.xmm_space; + + xsave += FXSAVE_SIZE + XSAVE_HDR_SIZE; + + /* The XSAVES instruction always uses the compacted format */ + if (mask & XFEATURE_MASK_YMM && xcomp_bv & XFEATURE_MASK_YMM) { + perf_regs->ymmh_regs = xsave; + xsave += XSAVE_YMM_SIZE; + } } static void release_ext_regs_buffers(void) @@ -447,6 +455,9 @@ static void reserve_ext_regs_buffers(void) size = FXSAVE_SIZE + XSAVE_HDR_SIZE; + if (x86_pmu.ext_regs_mask & BIT_ULL(X86_EXT_REGS_YMM)) + size += XSAVE_YMM_SIZE; + /* XSAVE feature requires 64-byte alignment.
*/ size += 64; @@ -712,6 +723,13 @@ int x86_pmu_hw_config(struct perf_event *event) if (!(x86_pmu.ext_regs_mask & BIT_ULL(X86_EXT_REGS_XMM))) return -EINVAL; } + if (event_has_extended_regs2(event)) { + if (!(event->pmu->capabilities & PERF_PMU_CAP_EXTENDED_REGS2)) + return -EINVAL; + if (x86_pmu_get_event_num_ext_regs(event, X86_EXT_REGS_YMM) && + !(x86_pmu.ext_regs_mask & BIT_ULL(X86_EXT_REGS_YMM))) + return -EINVAL; + } } return x86_setup_perfctr(event); } @@ -1765,6 +1783,7 @@ void x86_pmu_setup_regs_data(struct perf_event *event, struct x86_perf_regs *perf_regs = container_of(regs, struct x86_perf_regs, regs); u64 sample_type = event->attr.sample_type; u64 mask = 0; + int num; if (!(event->attr.sample_type & (PERF_SAMPLE_REGS_INTR | PERF_SAMPLE_REGS_USER))) return; @@ -1799,6 +1818,13 @@ void x86_pmu_setup_regs_data(struct perf_event *event, mask |= XFEATURE_MASK_SSE; } + num = x86_pmu_get_event_num_ext_regs(event, X86_EXT_REGS_YMM); + if (num) { + perf_regs->ymmh_regs = NULL; + mask |= XFEATURE_MASK_YMM; + data->dyn_size += num * PERF_X86_EXT_REG_YMMH_SIZE * sizeof(u64); + } + mask &= ~ignore_mask; if (mask) x86_pmu_get_ext_regs(perf_regs, mask); diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h index b48f4215f37c..911916bc8e36 100644 --- a/arch/x86/events/perf_event.h +++ b/arch/x86/events/perf_event.h @@ -689,6 +689,7 @@ enum { enum { X86_EXT_REGS_XMM = 0, + X86_EXT_REGS_YMM, }; #define PERF_PEBS_DATA_SOURCE_MAX 0x100 @@ -1319,6 +1320,27 @@ static inline u64 x86_pmu_get_event_config(struct perf_event *event) return event->attr.config & hybrid(event->pmu, config_mask); } +static inline int get_num_ext_regs(u64 *ext_regs, unsigned int type) +{ + u64 mask; + + switch (type) { + case X86_EXT_REGS_YMM: + mask = GENMASK_ULL(PERF_REG_X86_YMMH15, PERF_REG_X86_YMMH0); + return hweight64(ext_regs[0] & mask); + default: + return 0; + } + return 0; +} + +static inline int
x86_pmu_get_event_num_ext_regs(struct perf_event *event, + unsigned int type) +{ + return get_num_ext_regs(event->attr.sample_ext_regs_intr, type) + + get_num_ext_regs(event->attr.sample_ext_regs_user, type); +} + extern struct event_constraint emptyconstraint; =20 extern struct event_constraint unconstrained; diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_= event.h index 70d1d94aca7e..c30571f4de26 100644 --- a/arch/x86/include/asm/perf_event.h +++ b/arch/x86/include/asm/perf_event.h @@ -593,6 +593,7 @@ struct pt_regs; struct x86_perf_regs { struct pt_regs regs; u64 *xmm_regs; + u64 *ymmh_regs; }; =20 extern unsigned long perf_arch_instruction_pointer(struct pt_regs *regs); diff --git a/arch/x86/include/uapi/asm/perf_regs.h b/arch/x86/include/uapi/= asm/perf_regs.h index 7c9d2bb3833b..f37644513e33 100644 --- a/arch/x86/include/uapi/asm/perf_regs.h +++ b/arch/x86/include/uapi/asm/perf_regs.h @@ -55,4 +55,32 @@ enum perf_event_x86_regs { =20 #define PERF_REG_EXTENDED_MASK (~((1ULL << PERF_REG_X86_XMM0) - 1)) =20 +enum perf_event_x86_ext_regs { + /* YMMH Registers */ + PERF_REG_X86_YMMH0 =3D 0, + PERF_REG_X86_YMMH1, + PERF_REG_X86_YMMH2, + PERF_REG_X86_YMMH3, + PERF_REG_X86_YMMH4, + PERF_REG_X86_YMMH5, + PERF_REG_X86_YMMH6, + PERF_REG_X86_YMMH7, + PERF_REG_X86_YMMH8, + PERF_REG_X86_YMMH9, + PERF_REG_X86_YMMH10, + PERF_REG_X86_YMMH11, + PERF_REG_X86_YMMH12, + PERF_REG_X86_YMMH13, + PERF_REG_X86_YMMH14, + PERF_REG_X86_YMMH15, + + PERF_REG_X86_EXT_REGS_MAX =3D PERF_REG_X86_YMMH15, +}; + +enum perf_event_x86_ext_reg_size { + PERF_X86_EXT_REG_YMMH_SIZE =3D 2, + + /* max of PERF_REG_X86_XXX_SIZE */ + PERF_X86_EXT_REG_SIZE_MAX =3D PERF_X86_EXT_REG_YMMH_SIZE, +}; #endif /* _ASM_X86_PERF_REGS_H */ diff --git a/arch/x86/kernel/perf_regs.c b/arch/x86/kernel/perf_regs.c index b9d5106afc26..f12ef60a1a8a 100644 --- a/arch/x86/kernel/perf_regs.c +++ b/arch/x86/kernel/perf_regs.c @@ -57,10 +57,46 @@ static unsigned int pt_regs_offset[PERF_REG_X86_MAX] = 
=3D { #endif }; =20 +static_assert(PERF_REG_X86_EXT_REGS_MAX < PERF_ATTR_EXT_REGS_SIZE * 64); +static_assert(PERF_X86_EXT_REG_SIZE_MAX <=3D PERF_EXT_REGS_SIZE_MAX); + +static inline u64 __perf_ext_reg_value(u64 *ext, int *ext_size, + int idx, u64 *regs, int size) +{ + if (!regs) + return 0; + memcpy(ext, ®s[idx * size], sizeof(u64) * size); + *ext_size =3D size; + return ext[0]; +} + +static u64 perf_ext_reg_value(struct pt_regs *regs, int idx, + u64 *ext, int *ext_size) +{ + struct x86_perf_regs *perf_regs; + + perf_regs =3D container_of(regs, struct x86_perf_regs, regs); + switch (idx) { + case PERF_REG_X86_YMMH0 ... PERF_REG_X86_YMMH15: + return __perf_ext_reg_value(ext, ext_size, + idx - PERF_REG_X86_YMMH0, + perf_regs->ymmh_regs, + PERF_X86_EXT_REG_YMMH_SIZE); + default: + WARN_ON_ONCE(1); + *ext_size =3D 0; + break; + } + return 0; +} + u64 perf_reg_value(struct pt_regs *regs, int idx, u64 *ext, int *ext_size) { struct x86_perf_regs *perf_regs; =20 + if (ext && ext_size) + return perf_ext_reg_value(regs, idx, ext, ext_size); + if (WARN_ON_ONCE(ext || ext_size)) return 0; =20 @@ -117,13 +153,22 @@ void perf_get_regs_user(struct perf_regs *regs_user, (1ULL << PERF_REG_X86_FS) | \ (1ULL << PERF_REG_X86_GS)) =20 +static_assert (PERF_ATTR_EXT_REGS_SIZE =3D=3D 2); + int perf_reg_validate(u64 mask, u64 *mask_ext) { - if (mask_ext) + if (!mask && !mask_ext) return -EINVAL; - if (!mask || (mask & (REG_NOSUPPORT | PERF_REG_X86_RESERVED))) + if (mask && (mask & (REG_NOSUPPORT | PERF_REG_X86_RESERVED))) return -EINVAL; =20 + if (mask_ext) { + int h =3D mask_ext[1] ? 
fls64(mask_ext[1]) + 64 : fls64(mask_ext[0]); + + if (h > PERF_REG_X86_EXT_REGS_MAX + 1) + return -EINVAL; + } + return 0; } =20 --=20 2.38.1 From nobody Fri Apr 17 23:05:01 2026 Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.14]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 5763B2BF079 for ; Fri, 13 Jun 2025 13:50:21 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.14 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1749822623; cv=none; b=eHLKOm5JIJIOdlQ3HUe6RupbKnhRkuxzazRtsdO7lNEA4e49b9n5FZBU32X2z+gdA0PuqmUJEaa3S5J2U5qXAOMEh23tqonpZQTPUG3dPJRvsRIMG4WfkncKwsi5k3Nw+k6w7dowvVxpPtlilVU3AFXkR4aAp1OkzZkSNTsGKl0= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1749822623; c=relaxed/simple; bh=L+oybvM8m2kMnFTwzhYuivmdWwqoG89QEMSYZi2Md7A=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=Dmbj0bb96m6dlH7QiADT6MyMy7vCeaWEl4gPOFBvQD0qUVVC+U/y38IlS0e1WG7MQ6r/gvd/5nloFgR4lhDBR+clm845oSf9e4Z3LY1NZqqAIiTvbieq3AKf6bt5gRCdbeGomBDuoai5PIU1Nv06YQmV/iZ8dTOUs8lcYaMYfPw= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com; spf=none smtp.mailfrom=linux.intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=NdOkcykB; arc=none smtp.client-ip=198.175.65.14 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="NdOkcykB" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1749822621; x=1781358621; 
From: kan.liang@linux.intel.com
Subject: [RFC PATCH 08/12] perf/x86: Add APX in extended regs
Date: Fri, 13 Jun 2025 06:49:39 -0700
Message-Id: <20250613134943.3186517-9-kan.liang@linux.intel.com>

From: Kan Liang

Support APX as extended registers. They can be configured via
sample_ext_regs_intr/user. Only a PMU with PERF_PMU_CAP_EXTENDED_REGS2
supports the feature. The values are retrieved via XSAVES.

Define several macros to simplify the code.

Signed-off-by: Kan Liang
---
 arch/x86/events/core.c                | 48 +++++++++++++++++++--------
 arch/x86/events/perf_event.h          |  4 +++
 arch/x86/include/asm/perf_event.h     |  1 +
 arch/x86/include/uapi/asm/perf_regs.h | 21 +++++++++++-
 arch/x86/kernel/perf_regs.c           |  5 +++
 5 files changed, 65 insertions(+), 14 deletions(-)

diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index 91039c0256b3..67f62268f063 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -408,6 +408,14 @@ set_ext_hw_attr(struct hw_perf_event *hwc, struct perf_event *event)
 
 static DEFINE_PER_CPU(void *, ext_regs_buf);
 
+#define __x86_pmu_get_regs(_mask, _regs, _size)		\
+do {							\
+	if (mask & _mask && xcomp_bv & _mask) {		\
+		_regs = xsave;				\
+		xsave += _size;				\
+	}						\
+} while (0)
+
 static void x86_pmu_get_ext_regs(struct x86_perf_regs *perf_regs, u64 mask)
 {
 	void *xsave = (void *)ALIGN((unsigned long)per_cpu(ext_regs_buf, smp_processor_id()), 64);
@@ -426,10 +434,8 @@ static void x86_pmu_get_ext_regs(struct x86_perf_regs *perf_regs, u64 mask)
 	xsave += FXSAVE_SIZE + XSAVE_HDR_SIZE;
 
 	/* The XSAVES instruction always uses the compacted format */
-	if (mask & XFEATURE_MASK_YMM && xcomp_bv & XFEATURE_MASK_YMM) {
-		perf_regs->ymmh_regs = xsave;
-		xsave += XSAVE_YMM_SIZE;
-	}
+	__x86_pmu_get_regs(XFEATURE_MASK_YMM, perf_regs->ymmh_regs, XSAVE_YMM_SIZE);
+	__x86_pmu_get_regs(XFEATURE_MASK_APX, perf_regs->apx_regs, sizeof(struct apx_state));
 }
 
 static void release_ext_regs_buffers(void)
@@ -457,6 +463,8 @@ static void reserve_ext_regs_buffers(void)
 
 	if (x86_pmu.ext_regs_mask & BIT_ULL(X86_EXT_REGS_YMM))
 		size += XSAVE_YMM_SIZE;
+	if (x86_pmu.ext_regs_mask & BIT_ULL(X86_EXT_REGS_APX))
+		size += sizeof(struct apx_state);
 
 	/* XSAVE feature requires 64-byte alignment. */
 	size += 64;
@@ -642,6 +650,13 @@ int x86_pmu_max_precise(void)
 	return precise;
 }
 
+#define check_ext_regs(_type)					\
+do {								\
+	if (x86_pmu_get_event_num_ext_regs(event, _type) &&	\
+	    !(x86_pmu.ext_regs_mask & BIT_ULL(_type)))		\
+		return -EINVAL;					\
+} while (0)
+
 int x86_pmu_hw_config(struct perf_event *event)
 {
 	if (event->attr.precise_ip) {
@@ -726,9 +741,8 @@ int x86_pmu_hw_config(struct perf_event *event)
 		if (event_has_extended_regs2(event)) {
 			if (!(event->pmu->capabilities & PERF_PMU_CAP_EXTENDED_REGS2))
 				return -EINVAL;
-			if (x86_pmu_get_event_num_ext_regs(event, X86_EXT_REGS_YMM) &&
-			    !(x86_pmu.ext_regs_mask & BIT_ULL(X86_EXT_REGS_YMM)))
-				return -EINVAL;
+			check_ext_regs(X86_EXT_REGS_YMM);
+			check_ext_regs(X86_EXT_REGS_APX);
 		}
 	}
 	return x86_setup_perfctr(event);
@@ -1775,6 +1789,16 @@ x86_pmu_perf_get_regs_user(struct perf_sample_data *data,
 	return x86_regs_user;
 }
 
+#define init_ext_regs_data(_type, _regs, _mask, _size)		\
+do {								\
+	num = x86_pmu_get_event_num_ext_regs(event, _type);	\
+	if (num) {						\
+		_regs = NULL;					\
+		mask |= _mask;					\
+		data->dyn_size += num * _size * sizeof(u64);	\
+	}							\
+} while (0)
+
 void x86_pmu_setup_regs_data(struct perf_event *event,
 			     struct perf_sample_data *data,
 			     struct pt_regs *regs,
@@ -1818,12 +1842,10 @@ void x86_pmu_setup_regs_data(struct perf_event *event,
 		mask |= XFEATURE_MASK_SSE;
 	}
 
-	num = x86_pmu_get_event_num_ext_regs(event, X86_EXT_REGS_YMM);
-	if (num) {
-		perf_regs->ymmh_regs = NULL;
-		mask |= XFEATURE_MASK_YMM;
-		data->dyn_size += num * PERF_X86_EXT_REG_YMMH_SIZE * sizeof(u64);
-	}
+	init_ext_regs_data(X86_EXT_REGS_YMM, perf_regs->ymmh_regs,
+			   XFEATURE_MASK_YMM, PERF_X86_EXT_REG_YMMH_SIZE);
+	init_ext_regs_data(X86_EXT_REGS_APX, perf_regs->apx_regs,
+			   XFEATURE_MASK_APX, PERF_X86_EXT_REG_APX_SIZE);
 
 	mask &= ~ignore_mask;
 	if (mask)
diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index 911916bc8e36..1c40b5d9c025 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -690,6 +690,7 @@ enum {
 enum {
 	X86_EXT_REGS_XMM = 0,
 	X86_EXT_REGS_YMM,
+	X86_EXT_REGS_APX,
 };
 
 #define PERF_PEBS_DATA_SOURCE_MAX	0x100
@@ -1328,6 +1329,9 @@ static inline int get_num_ext_regs(u64 *ext_regs, unsigned int type)
 	case X86_EXT_REGS_YMM:
 		mask = GENMASK_ULL(PERF_REG_X86_YMMH15, PERF_REG_X86_YMMH0);
 		return hweight64(ext_regs[0] & mask);
+	case X86_EXT_REGS_APX:
+		mask = GENMASK_ULL(PERF_REG_X86_R31, PERF_REG_X86_R16);
+		return hweight64(ext_regs[0] & mask);
 	default:
 		return 0;
 	}
diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h
index c30571f4de26..9e4d60f3a9a2 100644
--- a/arch/x86/include/asm/perf_event.h
+++ b/arch/x86/include/asm/perf_event.h
@@ -594,6 +594,7 @@ struct x86_perf_regs {
 	struct pt_regs	regs;
 	u64		*xmm_regs;
 	u64		*ymmh_regs;
+	u64		*apx_regs;
 };
 
 extern unsigned long perf_arch_instruction_pointer(struct pt_regs *regs);
diff --git a/arch/x86/include/uapi/asm/perf_regs.h b/arch/x86/include/uapi/asm/perf_regs.h
index f37644513e33..e23fb112faac 100644
--- a/arch/x86/include/uapi/asm/perf_regs.h
+++ b/arch/x86/include/uapi/asm/perf_regs.h
@@ -74,11 +74,30 @@ enum perf_event_x86_ext_regs {
 	PERF_REG_X86_YMMH14,
 	PERF_REG_X86_YMMH15,
 
-	PERF_REG_X86_EXT_REGS_MAX = PERF_REG_X86_YMMH15,
+	/* APX Registers */
+	PERF_REG_X86_R16,
+	PERF_REG_X86_R17,
+	PERF_REG_X86_R18,
+	PERF_REG_X86_R19,
+	PERF_REG_X86_R20,
+	PERF_REG_X86_R21,
+	PERF_REG_X86_R22,
+	PERF_REG_X86_R23,
+	PERF_REG_X86_R24,
+	PERF_REG_X86_R25,
+	PERF_REG_X86_R26,
+	PERF_REG_X86_R27,
+	PERF_REG_X86_R28,
+	PERF_REG_X86_R29,
+	PERF_REG_X86_R30,
+	PERF_REG_X86_R31,
+
+	PERF_REG_X86_EXT_REGS_MAX = PERF_REG_X86_R31,
 };
 
 enum perf_event_x86_ext_reg_size {
 	PERF_X86_EXT_REG_YMMH_SIZE = 2,
+	PERF_X86_EXT_REG_APX_SIZE = 1,
 
 	/* max of PERF_REG_X86_XXX_SIZE */
 	PERF_X86_EXT_REG_SIZE_MAX = PERF_X86_EXT_REG_YMMH_SIZE,
diff --git a/arch/x86/kernel/perf_regs.c b/arch/x86/kernel/perf_regs.c
index f12ef60a1a8a..518497bafdf0 100644
--- a/arch/x86/kernel/perf_regs.c
+++ b/arch/x86/kernel/perf_regs.c
@@ -82,6 +82,11 @@ static u64 perf_ext_reg_value(struct pt_regs *regs, int idx,
 					    idx - PERF_REG_X86_YMMH0,
 					    perf_regs->ymmh_regs,
 					    PERF_X86_EXT_REG_YMMH_SIZE);
+	case PERF_REG_X86_R16 ... PERF_REG_X86_R31:
+		return __perf_ext_reg_value(ext, ext_size,
+					    idx - PERF_REG_X86_R16,
+					    perf_regs->apx_regs,
+					    PERF_X86_EXT_REG_APX_SIZE);
 	default:
 		WARN_ON_ONCE(1);
 		*ext_size = 0;
-- 
2.38.1

From nobody Fri Apr 17 23:05:01 2026
From: kan.liang@linux.intel.com
Subject: [RFC PATCH 09/12] perf/x86: Add OPMASK in extended regs
Date: Fri, 13 Jun 2025 06:49:40 -0700
Message-Id: <20250613134943.3186517-10-kan.liang@linux.intel.com>

From: Kan Liang

Support OPMASK as extended registers. They can be configured via
sample_ext_regs_intr/user. Only a PMU with PERF_PMU_CAP_EXTENDED_REGS2
supports the feature. The values are retrieved via XSAVES.

Signed-off-by: Kan Liang
---
 arch/x86/events/core.c                |  6 ++++++
 arch/x86/events/perf_event.h          |  4 ++++
 arch/x86/include/asm/perf_event.h     |  1 +
 arch/x86/include/uapi/asm/perf_regs.h | 13 ++++++++++++-
 arch/x86/kernel/perf_regs.c           |  5 +++++
 5 files changed, 28 insertions(+), 1 deletion(-)

diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index 67f62268f063..741e6dfd50a5 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -436,6 +436,7 @@ static void x86_pmu_get_ext_regs(struct x86_perf_regs *perf_regs, u64 mask)
 	/* The XSAVES instruction always uses the compacted format */
 	__x86_pmu_get_regs(XFEATURE_MASK_YMM, perf_regs->ymmh_regs, XSAVE_YMM_SIZE);
 	__x86_pmu_get_regs(XFEATURE_MASK_APX, perf_regs->apx_regs, sizeof(struct apx_state));
+	__x86_pmu_get_regs(XFEATURE_MASK_OPMASK, perf_regs->opmask_regs, sizeof(struct avx_512_opmask_state));
 }
 
 static void release_ext_regs_buffers(void)
@@ -465,6 +466,8 @@ static void reserve_ext_regs_buffers(void)
 		size += XSAVE_YMM_SIZE;
 	if (x86_pmu.ext_regs_mask & BIT_ULL(X86_EXT_REGS_APX))
 		size += sizeof(struct apx_state);
+	if (x86_pmu.ext_regs_mask & BIT_ULL(X86_EXT_REGS_OPMASK))
+		size += sizeof(struct avx_512_opmask_state);
 
 	/* XSAVE feature requires 64-byte alignment. */
 	size += 64;
@@ -743,6 +746,7 @@ int x86_pmu_hw_config(struct perf_event *event)
 				return -EINVAL;
 			check_ext_regs(X86_EXT_REGS_YMM);
 			check_ext_regs(X86_EXT_REGS_APX);
+			check_ext_regs(X86_EXT_REGS_OPMASK);
 		}
 	}
 	return x86_setup_perfctr(event);
@@ -1846,6 +1850,8 @@ void x86_pmu_setup_regs_data(struct perf_event *event,
 			   XFEATURE_MASK_YMM, PERF_X86_EXT_REG_YMMH_SIZE);
 	init_ext_regs_data(X86_EXT_REGS_APX, perf_regs->apx_regs,
 			   XFEATURE_MASK_APX, PERF_X86_EXT_REG_APX_SIZE);
+	init_ext_regs_data(X86_EXT_REGS_OPMASK, perf_regs->opmask_regs,
+			   XFEATURE_MASK_OPMASK, PERF_X86_EXT_REG_OPMASK_SIZE);
 
 	mask &= ~ignore_mask;
 	if (mask)
diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index 1c40b5d9c025..c2626dcea1a0 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -691,6 +691,7 @@ enum {
 	X86_EXT_REGS_XMM = 0,
 	X86_EXT_REGS_YMM,
 	X86_EXT_REGS_APX,
+	X86_EXT_REGS_OPMASK,
 };
 
 #define PERF_PEBS_DATA_SOURCE_MAX	0x100
@@ -1332,6 +1333,9 @@ static inline int get_num_ext_regs(u64 *ext_regs, unsigned int type)
 	case X86_EXT_REGS_APX:
 		mask = GENMASK_ULL(PERF_REG_X86_R31, PERF_REG_X86_R16);
 		return hweight64(ext_regs[0] & mask);
+	case X86_EXT_REGS_OPMASK:
+		mask = GENMASK_ULL(PERF_REG_X86_OPMASK7, PERF_REG_X86_OPMASK0);
+		return hweight64(ext_regs[0] & mask);
 	default:
 		return 0;
 	}
diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h
index 9e4d60f3a9a2..4e971f38ff94 100644
--- a/arch/x86/include/asm/perf_event.h
+++ b/arch/x86/include/asm/perf_event.h
@@ -595,6 +595,7 @@ struct x86_perf_regs {
 	u64		*xmm_regs;
 	u64		*ymmh_regs;
 	u64		*apx_regs;
+	u64		*opmask_regs;
 };
 
 extern unsigned long perf_arch_instruction_pointer(struct pt_regs *regs);
diff --git a/arch/x86/include/uapi/asm/perf_regs.h b/arch/x86/include/uapi/asm/perf_regs.h
index e23fb112faac..b9ec58b98c5e 100644
--- a/arch/x86/include/uapi/asm/perf_regs.h
+++ b/arch/x86/include/uapi/asm/perf_regs.h
@@ -92,12 +92,23 @@ enum perf_event_x86_ext_regs {
 	PERF_REG_X86_R30,
 	PERF_REG_X86_R31,
 
-	PERF_REG_X86_EXT_REGS_MAX = PERF_REG_X86_R31,
+	/* OPMASK Registers */
+	PERF_REG_X86_OPMASK0,
+	PERF_REG_X86_OPMASK1,
+	PERF_REG_X86_OPMASK2,
+	PERF_REG_X86_OPMASK3,
+	PERF_REG_X86_OPMASK4,
+	PERF_REG_X86_OPMASK5,
+	PERF_REG_X86_OPMASK6,
+	PERF_REG_X86_OPMASK7,
+
+	PERF_REG_X86_EXT_REGS_MAX = PERF_REG_X86_OPMASK7,
 };
 
 enum perf_event_x86_ext_reg_size {
 	PERF_X86_EXT_REG_YMMH_SIZE = 2,
 	PERF_X86_EXT_REG_APX_SIZE = 1,
+	PERF_X86_EXT_REG_OPMASK_SIZE = 1,
 
 	/* max of PERF_REG_X86_XXX_SIZE */
 	PERF_X86_EXT_REG_SIZE_MAX = PERF_X86_EXT_REG_YMMH_SIZE,
diff --git a/arch/x86/kernel/perf_regs.c b/arch/x86/kernel/perf_regs.c
index 518497bafdf0..34b94b846f00 100644
--- a/arch/x86/kernel/perf_regs.c
+++ b/arch/x86/kernel/perf_regs.c
@@ -87,6 +87,11 @@ static u64 perf_ext_reg_value(struct pt_regs *regs, int idx,
 					    idx - PERF_REG_X86_R16,
 					    perf_regs->apx_regs,
 					    PERF_X86_EXT_REG_APX_SIZE);
+	case PERF_REG_X86_OPMASK0 ... PERF_REG_X86_OPMASK7:
+		return __perf_ext_reg_value(ext, ext_size,
+					    idx - PERF_REG_X86_OPMASK0,
+					    perf_regs->opmask_regs,
+					    PERF_X86_EXT_REG_OPMASK_SIZE);
 	default:
 		WARN_ON_ONCE(1);
 		*ext_size = 0;
-- 
2.38.1

From nobody Fri Apr 17 23:05:01 2026
From: kan.liang@linux.intel.com
Subject: [RFC PATCH 10/12] perf/x86: Add ZMM in extended regs
Date: Fri, 13 Jun 2025 06:49:41 -0700
Message-Id: <20250613134943.3186517-11-kan.liang@linux.intel.com>

From: Kan Liang

Support ZMM as extended registers. They can be configured via
sample_ext_regs_intr/user. Only a PMU with PERF_PMU_CAP_EXTENDED_REGS2
supports the feature. The values are retrieved via XSAVES.

Signed-off-by: Kan Liang
---
 arch/x86/events/core.c                | 14 +++++++++
 arch/x86/events/perf_event.h          | 11 ++++++-
 arch/x86/include/asm/perf_event.h     |  2 ++
 arch/x86/include/uapi/asm/perf_regs.h | 43 +++++++++++++++++++++++++--
 arch/x86/kernel/perf_regs.c           | 10 +++++++
 5 files changed, 77 insertions(+), 3 deletions(-)

diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index 741e6dfd50a5..9bcef9a32dd2 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -437,6 +437,10 @@ static void x86_pmu_get_ext_regs(struct x86_perf_regs *perf_regs, u64 mask)
 	__x86_pmu_get_regs(XFEATURE_MASK_YMM, perf_regs->ymmh_regs, XSAVE_YMM_SIZE);
 	__x86_pmu_get_regs(XFEATURE_MASK_APX, perf_regs->apx_regs, sizeof(struct apx_state));
 	__x86_pmu_get_regs(XFEATURE_MASK_OPMASK, perf_regs->opmask_regs, sizeof(struct avx_512_opmask_state));
+	__x86_pmu_get_regs(XFEATURE_MASK_ZMM_Hi256, perf_regs->zmmh_regs,
+			   sizeof(struct avx_512_zmm_uppers_state));
+	__x86_pmu_get_regs(XFEATURE_MASK_Hi16_ZMM, perf_regs->h16zmm_regs,
+			   sizeof(struct avx_512_hi16_state));
 }
 
 static void release_ext_regs_buffers(void)
@@ -468,6 +472,10 @@ static void reserve_ext_regs_buffers(void)
 		size += sizeof(struct apx_state);
 	if (x86_pmu.ext_regs_mask & BIT_ULL(X86_EXT_REGS_OPMASK))
 		size += sizeof(struct avx_512_opmask_state);
+	if (x86_pmu.ext_regs_mask & BIT_ULL(X86_EXT_REGS_ZMMH))
+		size += sizeof(struct avx_512_zmm_uppers_state);
+	if (x86_pmu.ext_regs_mask & BIT_ULL(X86_EXT_REGS_H16ZMM))
+		size += sizeof(struct avx_512_hi16_state);
 
 	/* XSAVE feature requires 64-byte alignment. */
 	size += 64;
@@ -747,6 +755,8 @@ int x86_pmu_hw_config(struct perf_event *event)
 			check_ext_regs(X86_EXT_REGS_YMM);
 			check_ext_regs(X86_EXT_REGS_APX);
 			check_ext_regs(X86_EXT_REGS_OPMASK);
+			check_ext_regs(X86_EXT_REGS_ZMMH);
+			check_ext_regs(X86_EXT_REGS_H16ZMM);
 		}
 	}
 	return x86_setup_perfctr(event);
@@ -1852,6 +1862,10 @@ void x86_pmu_setup_regs_data(struct perf_event *event,
 			   XFEATURE_MASK_APX, PERF_X86_EXT_REG_APX_SIZE);
 	init_ext_regs_data(X86_EXT_REGS_OPMASK, perf_regs->opmask_regs,
 			   XFEATURE_MASK_OPMASK, PERF_X86_EXT_REG_OPMASK_SIZE);
+	init_ext_regs_data(X86_EXT_REGS_ZMMH, perf_regs->zmmh_regs,
+			   XFEATURE_MASK_ZMM_Hi256, PERF_X86_EXT_REG_ZMMH_SIZE);
+	init_ext_regs_data(X86_EXT_REGS_H16ZMM, perf_regs->h16zmm_regs,
+			   XFEATURE_MASK_Hi16_ZMM, PERF_X86_EXT_REG_H16ZMM_SIZE);
 
 	mask &= ~ignore_mask;
 	if (mask)
diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index c2626dcea1a0..93a65c529afe 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -692,6 +692,8 @@ enum {
 	X86_EXT_REGS_YMM,
 	X86_EXT_REGS_APX,
 	X86_EXT_REGS_OPMASK,
+	X86_EXT_REGS_ZMMH,
+	X86_EXT_REGS_H16ZMM,
 };
 
 #define PERF_PEBS_DATA_SOURCE_MAX	0x100
@@ -1324,7 +1326,7 @@ static inline u64 x86_pmu_get_event_config(struct perf_event *event)
 
 static inline int get_num_ext_regs(u64 *ext_regs, unsigned int type)
 {
-	u64 mask;
+	u64 mask, mask2;
 
 	switch (type) {
 	case X86_EXT_REGS_YMM:
@@ -1336,6 +1338,13 @@ static inline int get_num_ext_regs(u64 *ext_regs, unsigned int type)
 	case X86_EXT_REGS_OPMASK:
 		mask = GENMASK_ULL(PERF_REG_X86_OPMASK7, PERF_REG_X86_OPMASK0);
 		return hweight64(ext_regs[0] & mask);
+	case X86_EXT_REGS_ZMMH:
+		mask = GENMASK_ULL(PERF_REG_X86_ZMMH15, PERF_REG_X86_ZMMH0);
+		return hweight64(ext_regs[0] & mask);
+	case X86_EXT_REGS_H16ZMM:
+		mask = GENMASK_ULL(PERF_REG_X86_EXT_REGS_64, PERF_REG_X86_ZMM16);
+		mask2 = GENMASK_ULL(PERF_REG_X86_ZMM31 - 64, 0);
+		return hweight64(ext_regs[0] & mask) + hweight64(ext_regs[1] & mask2);
 	default:
 		return 0;
 	}
diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h
index 4e971f38ff94..eb35ba9afbb4 100644
--- a/arch/x86/include/asm/perf_event.h
+++ b/arch/x86/include/asm/perf_event.h
@@ -596,6 +596,8 @@ struct x86_perf_regs {
 	u64		*ymmh_regs;
 	u64		*apx_regs;
 	u64		*opmask_regs;
+	u64		*zmmh_regs;
+	u64		*h16zmm_regs;
 };
 
 extern unsigned long perf_arch_instruction_pointer(struct pt_regs *regs);
diff --git a/arch/x86/include/uapi/asm/perf_regs.h b/arch/x86/include/uapi/asm/perf_regs.h
index b9ec58b98c5e..c43a025b0c01 100644
--- a/arch/x86/include/uapi/asm/perf_regs.h
+++ b/arch/x86/include/uapi/asm/perf_regs.h
@@ -102,15 +102,54 @@ enum perf_event_x86_ext_regs {
 	PERF_REG_X86_OPMASK6,
 	PERF_REG_X86_OPMASK7,
 
-	PERF_REG_X86_EXT_REGS_MAX = PERF_REG_X86_OPMASK7,
+	/* ZMMH 0-15 Registers */
+	PERF_REG_X86_ZMMH0,
+	PERF_REG_X86_ZMMH1,
+	PERF_REG_X86_ZMMH2,
+	PERF_REG_X86_ZMMH3,
+	PERF_REG_X86_ZMMH4,
+	PERF_REG_X86_ZMMH5,
+	PERF_REG_X86_ZMMH6,
+	PERF_REG_X86_ZMMH7,
+	PERF_REG_X86_ZMMH8,
+	PERF_REG_X86_ZMMH9,
+	PERF_REG_X86_ZMMH10,
+	PERF_REG_X86_ZMMH11,
+	PERF_REG_X86_ZMMH12,
+	PERF_REG_X86_ZMMH13,
+	PERF_REG_X86_ZMMH14,
+	PERF_REG_X86_ZMMH15,
+
+	/* H16ZMM 16-31 Registers */
+	PERF_REG_X86_ZMM16,
+	PERF_REG_X86_ZMM17,
+	PERF_REG_X86_ZMM18,
+	PERF_REG_X86_ZMM19,
+	PERF_REG_X86_ZMM20,
+	PERF_REG_X86_ZMM21,
+	PERF_REG_X86_ZMM22,
+	PERF_REG_X86_ZMM23,
+	PERF_REG_X86_ZMM24,
+	PERF_REG_X86_ZMM25,
+	PERF_REG_X86_ZMM26,
+	PERF_REG_X86_ZMM27,
+	PERF_REG_X86_ZMM28,
+	PERF_REG_X86_ZMM29,
+	PERF_REG_X86_ZMM30,
+	PERF_REG_X86_ZMM31,
+
+	PERF_REG_X86_EXT_REGS_64 = PERF_REG_X86_ZMM23,
+	PERF_REG_X86_EXT_REGS_MAX = PERF_REG_X86_ZMM31,
 };
 
 enum perf_event_x86_ext_reg_size {
 	PERF_X86_EXT_REG_YMMH_SIZE = 2,
 	PERF_X86_EXT_REG_APX_SIZE = 1,
 	PERF_X86_EXT_REG_OPMASK_SIZE = 1,
+	PERF_X86_EXT_REG_ZMMH_SIZE = 4,
+	PERF_X86_EXT_REG_H16ZMM_SIZE = 8,
 
 	/* max of PERF_REG_X86_XXX_SIZE */
-	PERF_X86_EXT_REG_SIZE_MAX = PERF_X86_EXT_REG_YMMH_SIZE,
+	PERF_X86_EXT_REG_SIZE_MAX = PERF_X86_EXT_REG_H16ZMM_SIZE,
 };
 #endif /* _ASM_X86_PERF_REGS_H */
diff --git a/arch/x86/kernel/perf_regs.c b/arch/x86/kernel/perf_regs.c
index 34b94b846f00..d5721ea85c5d 100644
--- a/arch/x86/kernel/perf_regs.c
+++ b/arch/x86/kernel/perf_regs.c
@@ -92,6 +92,16 @@ static u64 perf_ext_reg_value(struct pt_regs *regs, int idx,
 					    idx - PERF_REG_X86_OPMASK0,
 					    perf_regs->opmask_regs,
 					    PERF_X86_EXT_REG_OPMASK_SIZE);
+	case PERF_REG_X86_ZMMH0 ... PERF_REG_X86_ZMMH15:
+		return __perf_ext_reg_value(ext, ext_size,
+					    idx - PERF_REG_X86_ZMMH0,
+					    perf_regs->zmmh_regs,
+					    PERF_X86_EXT_REG_ZMMH_SIZE);
+	case PERF_REG_X86_ZMM16 ... PERF_REG_X86_ZMM31:
+		return __perf_ext_reg_value(ext, ext_size,
+					    idx - PERF_REG_X86_ZMM16,
+					    perf_regs->h16zmm_regs,
+					    PERF_X86_EXT_REG_H16ZMM_SIZE);
 	default:
 		WARN_ON_ONCE(1);
 		*ext_size = 0;
-- 
2.38.1

From nobody Fri Apr 17 23:05:01 2026
From: kan.liang@linux.intel.com
To: peterz@infradead.org, mingo@redhat.com, acme@kernel.org, namhyung@kernel.org, tglx@linutronix.de, dave.hansen@linux.intel.com, irogers@google.com, adrian.hunter@intel.com, jolsa@kernel.org, alexander.shishkin@linux.intel.com, linux-kernel@vger.kernel.org
Cc: dapeng1.mi@linux.intel.com, ak@linux.intel.com, zide.chen@intel.com, Kan Liang
Subject: [RFC PATCH 11/12] perf/x86: Add SSP in extended regs
Date: Fri, 13 Jun 2025 06:49:42 -0700
Message-Id: <20250613134943.3186517-12-kan.liang@linux.intel.com>
In-Reply-To: <20250613134943.3186517-1-kan.liang@linux.intel.com>
References: <20250613134943.3186517-1-kan.liang@linux.intel.com>

From: Kan Liang

Support SSP as an extended register. It can be selected via
sample_ext_regs_intr/user. Only a PMU with PERF_PMU_CAP_EXTENDED_REGS2
supports the feature. The value is retrieved via XSAVES.
Signed-off-by: Kan Liang
---
 arch/x86/events/core.c                | 7 +++++++
 arch/x86/events/perf_event.h          | 5 +++++
 arch/x86/include/asm/perf_event.h     | 1 +
 arch/x86/include/uapi/asm/perf_regs.h | 6 +++++-
 arch/x86/kernel/perf_regs.c           | 5 +++++
 5 files changed, 23 insertions(+), 1 deletion(-)

diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index 9bcef9a32dd2..65e4460fdc28 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -441,6 +441,8 @@ static void x86_pmu_get_ext_regs(struct x86_perf_regs *perf_regs, u64 mask)
 			   sizeof(struct avx_512_zmm_uppers_state));
 	__x86_pmu_get_regs(XFEATURE_MASK_Hi16_ZMM, perf_regs->h16zmm_regs,
 			   sizeof(struct avx_512_hi16_state));
+	__x86_pmu_get_regs(XFEATURE_MASK_CET_USER, perf_regs->cet_regs,
+			   sizeof(struct cet_user_state));
 }

 static void release_ext_regs_buffers(void)
@@ -476,6 +478,8 @@ static void reserve_ext_regs_buffers(void)
 		size += sizeof(struct avx_512_zmm_uppers_state);
 	if (x86_pmu.ext_regs_mask & BIT_ULL(X86_EXT_REGS_H16ZMM))
 		size += sizeof(struct avx_512_hi16_state);
+	if (x86_pmu.ext_regs_mask & BIT_ULL(X86_EXT_REGS_CET))
+		size += sizeof(struct cet_user_state);

 	/* XSAVE feature requires 64-byte alignment.
 	 */
 	size += 64;
@@ -757,6 +761,7 @@ int x86_pmu_hw_config(struct perf_event *event)
 			check_ext_regs(X86_EXT_REGS_OPMASK);
 			check_ext_regs(X86_EXT_REGS_ZMMH);
 			check_ext_regs(X86_EXT_REGS_H16ZMM);
+			check_ext_regs(X86_EXT_REGS_CET);
 		}
 	}
 	return x86_setup_perfctr(event);
@@ -1866,6 +1871,8 @@ void x86_pmu_setup_regs_data(struct perf_event *event,
 			   XFEATURE_MASK_ZMM_Hi256, PERF_X86_EXT_REG_ZMMH_SIZE);
 	init_ext_regs_data(X86_EXT_REGS_H16ZMM, perf_regs->h16zmm_regs,
 			   XFEATURE_MASK_Hi16_ZMM, PERF_X86_EXT_REG_H16ZMM_SIZE);
+	init_ext_regs_data(X86_EXT_REGS_CET, perf_regs->cet_regs,
+			   XFEATURE_MASK_CET_USER, PERF_X86_EXT_REG_SSP_SIZE);

 	mask &= ~ignore_mask;
 	if (mask)
diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index 93a65c529afe..e4906c0b33da 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -694,6 +694,7 @@ enum {
 	X86_EXT_REGS_OPMASK,
 	X86_EXT_REGS_ZMMH,
 	X86_EXT_REGS_H16ZMM,
+	X86_EXT_REGS_CET,
 };

 #define PERF_PEBS_DATA_SOURCE_MAX	0x100
@@ -1345,6 +1346,10 @@ static inline int get_num_ext_regs(u64 *ext_regs, unsigned int type)
 		mask = GENMASK_ULL(PERF_REG_X86_EXT_REGS_64, PERF_REG_X86_ZMM16);
 		mask2 = GENMASK_ULL(PERF_REG_X86_ZMM31 - 64, 0);
 		return hweight64(ext_regs[0] & mask) + hweight64(ext_regs[1] & mask2);
+	case X86_EXT_REGS_CET:
+		if (ext_regs[1] & BIT_ULL(PERF_REG_X86_SSP - 64))
+			return 1;
+		return 0;
 	default:
 		return 0;
 	}
diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h
index eb35ba9afbb4..e49a26886e64 100644
--- a/arch/x86/include/asm/perf_event.h
+++ b/arch/x86/include/asm/perf_event.h
@@ -598,6 +598,7 @@ struct x86_perf_regs {
 	u64	*opmask_regs;
 	u64	*zmmh_regs;
 	u64	*h16zmm_regs;
+	u64	*cet_regs;
 };

 extern unsigned long perf_arch_instruction_pointer(struct pt_regs *regs);
diff --git a/arch/x86/include/uapi/asm/perf_regs.h b/arch/x86/include/uapi/asm/perf_regs.h
index c43a025b0c01..82df5c65d701 100644
--- a/arch/x86/include/uapi/asm/perf_regs.h
+++ b/arch/x86/include/uapi/asm/perf_regs.h
@@ -138,8 +138,11 @@ enum perf_event_x86_ext_regs {
 	PERF_REG_X86_ZMM30,
 	PERF_REG_X86_ZMM31,

+	/* shadow stack pointer (SSP) */
+	PERF_REG_X86_SSP,
+
 	PERF_REG_X86_EXT_REGS_64 = PERF_REG_X86_ZMM23,
-	PERF_REG_X86_EXT_REGS_MAX = PERF_REG_X86_ZMM31,
+	PERF_REG_X86_EXT_REGS_MAX = PERF_REG_X86_SSP,
 };

 enum perf_event_x86_ext_reg_size {
@@ -148,6 +151,7 @@ enum perf_event_x86_ext_reg_size {
 	PERF_X86_EXT_REG_OPMASK_SIZE	= 1,
 	PERF_X86_EXT_REG_ZMMH_SIZE	= 4,
 	PERF_X86_EXT_REG_H16ZMM_SIZE	= 8,
+	PERF_X86_EXT_REG_SSP_SIZE	= 1,

 	/* max of PERF_REG_X86_XXX_SIZE */
 	PERF_X86_EXT_REG_SIZE_MAX	= PERF_X86_EXT_REG_H16ZMM_SIZE,
diff --git a/arch/x86/kernel/perf_regs.c b/arch/x86/kernel/perf_regs.c
index d5721ea85c5d..6a5936ed7143 100644
--- a/arch/x86/kernel/perf_regs.c
+++ b/arch/x86/kernel/perf_regs.c
@@ -102,6 +102,11 @@ static u64 perf_ext_reg_value(struct pt_regs *regs, int idx,
 					    idx - PERF_REG_X86_ZMM16,
 					    perf_regs->h16zmm_regs,
 					    PERF_X86_EXT_REG_H16ZMM_SIZE);
+	case PERF_REG_X86_SSP:
+		return __perf_ext_reg_value(ext, ext_size,
+					    idx - PERF_REG_X86_SSP,
+					    &perf_regs->cet_regs[1],
+					    PERF_X86_EXT_REG_SSP_SIZE);
 	default:
 		WARN_ON_ONCE(1);
 		*ext_size = 0;
-- 
2.38.1

From nobody Fri Apr 17 23:05:01 2026
From: kan.liang@linux.intel.com
To: peterz@infradead.org, mingo@redhat.com, acme@kernel.org, namhyung@kernel.org, tglx@linutronix.de, dave.hansen@linux.intel.com, irogers@google.com, adrian.hunter@intel.com, jolsa@kernel.org, alexander.shishkin@linux.intel.com, linux-kernel@vger.kernel.org
Cc: dapeng1.mi@linux.intel.com, ak@linux.intel.com, zide.chen@intel.com, Kan Liang
Subject: [RFC PATCH 12/12] perf/x86/intel: Support extended registers
Date: Fri, 13 Jun 2025 06:49:43 -0700
Message-Id: <20250613134943.3186517-13-kan.liang@linux.intel.com>
In-Reply-To: <20250613134943.3186517-1-kan.liang@linux.intel.com>
References: <20250613134943.3186517-1-kan.liang@linux.intel.com>

From: Kan Liang

Support the YMM, APX, OPMASK, ZMM, and SSP registers when there is
XSAVES support. Disable large PEBS if the extended regs are required.
Signed-off-by: Kan Liang
---
 arch/x86/events/intel/core.c | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 5706ee562684..4218067b1843 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -4035,6 +4035,8 @@ static unsigned long intel_pmu_large_pebs_flags(struct perf_event *event)
 		flags &= ~PERF_SAMPLE_REGS_USER;
 	if (event->attr.sample_regs_user & ~PEBS_GP_REGS)
 		flags &= ~(PERF_SAMPLE_REGS_USER | PERF_SAMPLE_REGS_INTR);
+	if (event_has_extended_regs2(event))
+		flags &= ~(PERF_SAMPLE_REGS_USER | PERF_SAMPLE_REGS_INTR);
 	return flags;
 }

@@ -5295,6 +5297,26 @@ static void intel_extended_regs_init(struct pmu *pmu)

 	x86_pmu.ext_regs_mask |= BIT_ULL(X86_EXT_REGS_XMM);
 	x86_get_pmu(smp_processor_id())->capabilities |= PERF_PMU_CAP_EXTENDED_REGS;
+
+	if (boot_cpu_has(X86_FEATURE_AVX) &&
+	    cpu_has_xfeatures(XFEATURE_MASK_YMM, NULL))
+		x86_pmu.ext_regs_mask |= BIT_ULL(X86_EXT_REGS_YMM);
+	if (boot_cpu_has(X86_FEATURE_APX) &&
+	    cpu_has_xfeatures(XFEATURE_MASK_APX, NULL))
+		x86_pmu.ext_regs_mask |= BIT_ULL(X86_EXT_REGS_APX);
+	if (boot_cpu_has(X86_FEATURE_AVX512F)) {
+		if (cpu_has_xfeatures(XFEATURE_MASK_OPMASK, NULL))
+			x86_pmu.ext_regs_mask |= BIT_ULL(X86_EXT_REGS_OPMASK);
+		if (cpu_has_xfeatures(XFEATURE_MASK_ZMM_Hi256, NULL))
+			x86_pmu.ext_regs_mask |= BIT_ULL(X86_EXT_REGS_ZMMH);
+		if (cpu_has_xfeatures(XFEATURE_MASK_Hi16_ZMM, NULL))
+			x86_pmu.ext_regs_mask |= BIT_ULL(X86_EXT_REGS_H16ZMM);
+	}
+	if (cpu_feature_enabled(X86_FEATURE_USER_SHSTK))
+		x86_pmu.ext_regs_mask |= BIT_ULL(X86_EXT_REGS_CET);
+
+	if (x86_pmu.ext_regs_mask != BIT_ULL(X86_EXT_REGS_XMM))
+		x86_get_pmu(smp_processor_id())->capabilities |= PERF_PMU_CAP_EXTENDED_REGS2;
 }

 static void update_pmu_cap(struct pmu *pmu)
-- 
2.38.1