From: Tony Luck
To: Fenghua Yu, Reinette Chatre, Maciej Wieczor-Retman, Peter Newman, James Morse, Babu Moger, Drew Fustini, Dave Martin, Chen Yu
Cc: x86@kernel.org, linux-kernel@vger.kernel.org, patches@lists.linux.dev, Tony Luck
Subject: [PATCH v16 16/32] x86/resctrl: Discover hardware telemetry events
Date: Wed, 10 Dec 2025 15:13:55 -0800
Message-ID: <20251210231413.59102-17-tony.luck@intel.com>
In-Reply-To: <20251210231413.59102-1-tony.luck@intel.com>
References: <20251210231413.59102-1-tony.luck@intel.com>

Each CPU collects data for telemetry events that it sends to the nearest
telemetry event aggregator either when the value of
MSR_IA32_PQR_ASSOC.RMID changes, or when a two millisecond timer expires.
There is a feature type ("energy" or "perf"), guid, and MMIO region associated
with each aggregator. This combination links to an XML description of the
set of telemetry events tracked by the aggregator. XML files are published
by Intel in a GitHub repository [1].

The telemetry event aggregators maintain per-RMID per-event counts of the
total seen for all the CPUs. There may be multiple telemetry event aggregators
per package. There are separate sets of aggregators for each feature type.
Aggregators in a set may have different guids. All aggregators with the same
feature type and guid are symmetric keeping counts for the same set of events
for the CPUs that provide data to them.

The XML file for each aggregator provides the following information:
0) Feature type of the events ("perf" or "energy")
1) Which telemetry events are tracked by the aggregator.
2) The order in which the event counters appear for each RMID.
3) The value type of each event counter (integer or fixed-point).
4) The number of RMIDs supported.
5) Which additional aggregator status registers are included.
6) The total size of the MMIO region for an aggregator.

Introduce struct event_group that condenses the relevant information from
an XML file. Hereafter an "event group" refers to a group of events of a
particular feature type (event_group::pfname set to "energy" or "perf") with
a particular guid.

Use event_group::pfname to determine the feature id needed to obtain the
aggregator details. It will later be used in console messages and with the
rdt= boot parameter.

The INTEL_PMT_TELEMETRY driver enumerates support for telemetry events.
This driver provides intel_pmt_get_regions_by_feature() to list all available
telemetry event aggregators of a given feature type.
The list includes the "guid", the base address in MMIO space for the region
where the event counters are exposed, and the package id where all the CPUs
that report to this aggregator are located.

Call INTEL_PMT_TELEMETRY's intel_pmt_get_regions_by_feature() for each event
group to obtain a private copy of that event group's aggregator data. Duplicate
the aggregator data between event groups that have the same feature type
but different guid. Further processing on this private copy will be unique
to the event group. Return the aggregator data to INTEL_PMT_TELEMETRY at
resctrl exit time.

resctrl will silently ignore unknown guid values.

Add a new Kconfig option CONFIG_X86_CPU_RESCTRL_INTEL_AET for the Intel specific
parts of telemetry code. This depends on the INTEL_PMT_TELEMETRY and INTEL_TPMI
drivers being built-in to the kernel for enumeration of telemetry features.

Signed-off-by: Tony Luck
Link: https://github.com/intel/Intel-PMT # [1]
Reviewed-by: Reinette Chatre
---
 arch/x86/kernel/cpu/resctrl/internal.h  |   8 ++
 arch/x86/kernel/cpu/resctrl/core.c      |   5 ++
 arch/x86/kernel/cpu/resctrl/intel_aet.c | 109 ++++++++++++++++++++++++
 arch/x86/Kconfig                        |  13 +++
 arch/x86/kernel/cpu/resctrl/Makefile    |   1 +
 5 files changed, 136 insertions(+)
 create mode 100644 arch/x86/kernel/cpu/resctrl/intel_aet.c

diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index 11d06995810e..f2e6e3577df0 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -222,4 +222,12 @@ void __init intel_rdt_mbm_apply_quirk(void);
 void rdt_domain_reconfigure_cdp(struct rdt_resource *r);
 void resctrl_arch_mbm_cntr_assign_set_one(struct rdt_resource *r);
 
+#ifdef CONFIG_X86_CPU_RESCTRL_INTEL_AET
+bool intel_aet_get_events(void);
+void __exit intel_aet_exit(void);
+#else
+static inline bool intel_aet_get_events(void) { return false; }
+static inline void __exit intel_aet_exit(void) { }
+#endif
+
 #endif /* _ASM_X86_RESCTRL_INTERNAL_H */
diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index 986b1303efb9..88be77d5d20d 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -743,6 +743,9 @@ void resctrl_arch_pre_mount(void)
 
 	if (!atomic_try_cmpxchg(&only_once, &old, 1))
 		return;
+
+	if (!intel_aet_get_events())
+		return;
 }
 
 enum {
@@ -1104,6 +1107,8 @@ late_initcall(resctrl_arch_late_init);
 
 static void __exit resctrl_arch_exit(void)
 {
+	intel_aet_exit();
+
 	cpuhp_remove_state(rdt_online);
 
 	resctrl_exit();
diff --git a/arch/x86/kernel/cpu/resctrl/intel_aet.c b/arch/x86/kernel/cpu/resctrl/intel_aet.c
new file mode 100644
index 000000000000..6e0d063cfc80
--- /dev/null
+++ b/arch/x86/kernel/cpu/resctrl/intel_aet.c
@@ -0,0 +1,109 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Resource Director Technology(RDT)
+ * - Intel Application Energy Telemetry
+ *
+ * Copyright (C) 2025 Intel Corporation
+ *
+ * Author:
+ *    Tony Luck
+ */
+
+#define pr_fmt(fmt)   "resctrl: " fmt
+
+#include
+#include
+#include
+#include
+#include
+#include
+
+#include "internal.h"
+
+/**
+ * struct event_group - Events with the same feature type ("energy" or "perf") and guid.
+ * @pfname:		PMT feature name ("energy" or "perf") of this event group.
+ * @pfg:		Points to the aggregated telemetry space information
+ *			returned by the intel_pmt_get_regions_by_feature()
+ *			call to the INTEL_PMT_TELEMETRY driver that contains
+ *			data for all telemetry regions of type @pfname.
+ *			Valid if the system supports the event group,
+ *			NULL otherwise.
+ */
+struct event_group {
+	/* Data fields for additional structures to manage this group. */
+	const char			*pfname;
+	struct pmt_feature_group	*pfg;
+};
+
+static struct event_group *known_event_groups[] = {
+};
+
+#define for_each_event_group(_peg)				\
+	for (_peg = known_event_groups;				\
+	     _peg < &known_event_groups[ARRAY_SIZE(known_event_groups)];	\
+	     _peg++)
+
+/* Stub for now */
+static bool enable_events(struct event_group *e, struct pmt_feature_group *p)
+{
+	return false;
+}
+
+static enum pmt_feature_id lookup_pfid(const char *pfname)
+{
+	if (!strcmp(pfname, "energy"))
+		return FEATURE_PER_RMID_ENERGY_TELEM;
+	else if (!strcmp(pfname, "perf"))
+		return FEATURE_PER_RMID_PERF_TELEM;
+
+	pr_warn("Unknown PMT feature name '%s'\n", pfname);
+
+	return FEATURE_INVALID;
+}
+
+/*
+ * Request a copy of struct pmt_feature_group for each event group. If there is
+ * one, the returned structure has an array of telemetry_region structures,
+ * each element of the array describes one telemetry aggregator. The
+ * telemetry aggregators may have different guids so obtain duplicate struct
+ * pmt_feature_group for event groups with same feature type but different
+ * guid. Post-processing ensures an event group can only use the telemetry
+ * aggregators that match its guid. An event group keeps a pointer to its
+ * struct pmt_feature_group to indicate that its events are successfully
+ * enabled.
+ */
+bool intel_aet_get_events(void)
+{
+	struct pmt_feature_group *p;
+	enum pmt_feature_id pfid;
+	struct event_group **peg;
+	bool ret = false;
+
+	for_each_event_group(peg) {
+		pfid = lookup_pfid((*peg)->pfname);
+		p = intel_pmt_get_regions_by_feature(pfid);
+		if (IS_ERR_OR_NULL(p))
+			continue;
+		if (enable_events(*peg, p)) {
+			(*peg)->pfg = p;
+			ret = true;
+		} else {
+			intel_pmt_put_feature_group(p);
+		}
+	}
+
+	return ret;
+}
+
+void __exit intel_aet_exit(void)
+{
+	struct event_group **peg;
+
+	for_each_event_group(peg) {
+		if ((*peg)->pfg) {
+			intel_pmt_put_feature_group((*peg)->pfg);
+			(*peg)->pfg = NULL;
+		}
+	}
+}
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 34fb46d5341b..52dda19d584d 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -538,6 +538,19 @@ config X86_CPU_RESCTRL
 
 	  Say N if unsure.
 
+config X86_CPU_RESCTRL_INTEL_AET
+	bool "Intel Application Energy Telemetry"
+	depends on X86_CPU_RESCTRL && CPU_SUP_INTEL && INTEL_PMT_TELEMETRY=y && INTEL_TPMI=y
+	help
+	  Enable per-RMID telemetry events in resctrl.
+
+	  Intel feature that collects per-RMID execution data
+	  about energy consumption, measure of frequency independent
+	  activity and other performance metrics. Data is aggregated
+	  per package.
+
+	  Say N if unsure.
+
 config X86_FRED
 	bool "Flexible Return and Event Delivery"
 	depends on X86_64
diff --git a/arch/x86/kernel/cpu/resctrl/Makefile b/arch/x86/kernel/cpu/resctrl/Makefile
index d8a04b195da2..273ddfa30836 100644
--- a/arch/x86/kernel/cpu/resctrl/Makefile
+++ b/arch/x86/kernel/cpu/resctrl/Makefile
@@ -1,6 +1,7 @@
 # SPDX-License-Identifier: GPL-2.0
 obj-$(CONFIG_X86_CPU_RESCTRL)			+= core.o rdtgroup.o monitor.o
 obj-$(CONFIG_X86_CPU_RESCTRL)			+= ctrlmondata.o
+obj-$(CONFIG_X86_CPU_RESCTRL_INTEL_AET)		+= intel_aet.o
 obj-$(CONFIG_RESCTRL_FS_PSEUDO_LOCK)		+= pseudo_lock.o
 
 # To allow define_trace.h's recursive include:
-- 
2.51.1
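
Illustration only, not part of the patch: the get/keep/put ownership rule
used by intel_aet_get_events() and intel_aet_exit() can be tried in
userspace with a minimal sketch like the one below. Every name in it
(feature_id, feature_group, mock_get_regions(), mock_put_group(),
discover(), release(), enable_never(), enable_always()) is a hypothetical
stand-in invented for the example, and the reference count is an assumption
made so the balance of get and put calls can be observed. None of it is the
INTEL_PMT_TELEMETRY API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the kernel types; NOT the real definitions. */
enum feature_id { FEATURE_INVALID, FEATURE_ENERGY, FEATURE_PERF };

struct feature_group {
	enum feature_id id;
	int refs;		/* assumed counter so get/put balance is visible */
};

struct event_group {
	const char *pfname;
	struct feature_group *pfg;	/* non-NULL only while events are enabled */
};

typedef bool (*enable_fn)(struct event_group *, struct feature_group *);

/* Mock enumeration: pretend only the "energy" feature exists on this system. */
static struct feature_group mock_energy = { FEATURE_ENERGY, 0 };

static struct feature_group *mock_get_regions(enum feature_id id)
{
	if (id != FEATURE_ENERGY)
		return NULL;
	mock_energy.refs++;
	return &mock_energy;
}

static void mock_put_group(struct feature_group *p)
{
	assert(p->refs > 0);	/* catch an unbalanced put */
	p->refs--;
}

static enum feature_id lookup_id(const char *pfname)
{
	if (!strcmp(pfname, "energy"))
		return FEATURE_ENERGY;
	if (!strcmp(pfname, "perf"))
		return FEATURE_PERF;
	return FEATURE_INVALID;
}

/* Keep the group on success, hand it back immediately on failure. */
static bool discover(struct event_group *groups, size_t n, enable_fn enable)
{
	bool ret = false;

	for (size_t i = 0; i < n; i++) {
		struct feature_group *p = mock_get_regions(lookup_id(groups[i].pfname));

		if (!p)
			continue;
		if (enable(&groups[i], p)) {
			groups[i].pfg = p;
			ret = true;
		} else {
			mock_put_group(p);
		}
	}
	return ret;
}

/* Exit path: return every group that is still held. */
static void release(struct event_group *groups, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (groups[i].pfg) {
			mock_put_group(groups[i].pfg);
			groups[i].pfg = NULL;
		}
	}
}

/* Mirrors the "Stub for now" behaviour: nothing is ever enabled. */
static bool enable_never(struct event_group *e, struct feature_group *p)
{
	(void)e; (void)p;
	return false;
}

static bool enable_always(struct event_group *e, struct feature_group *p)
{
	(void)e; (void)p;
	return true;
}
```

With enable_never() every group obtained is put back at once and discover()
reports false, which matches what the stubbed enable_events() implies for
this patch. With enable_always() the "energy" group stays held until
release() runs.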