From nobody Sun Sep 14 03:50:47 2025
From: Tony Luck
To: Fenghua Yu, Reinette Chatre, Peter Newman, Jonathan Corbet,
	Shuah Khan, x86@kernel.org
Cc: Shaopeng Tan, James Morse, Jamie Iles, Babu Moger, Randy Dunlap,
	linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
	patches@lists.linux.dev, Tony Luck
Subject: [PATCH v3 1/8] x86/resctrl: Refactor in preparation for node-scoped resources
Date: Thu, 13 Jul 2023 09:32:00 -0700
Message-Id: <20230713163207.219710-2-tony.luck@intel.com>
In-Reply-To: <20230713163207.219710-1-tony.luck@intel.com>
References: <20230713163207.219710-1-tony.luck@intel.com>

Sub-NUMA Cluster systems provide monitoring resources at the NUMA
node scope instead of the L3 cache scope.

Rename the "cache_level" field in struct rdt_resource to the more
generic "scope", and add symbolic names and a helper function.

No functional change.
Signed-off-by: Tony Luck
Reviewed-by: Peter Newman
---
 include/linux/resctrl.h                   |  4 ++--
 arch/x86/kernel/cpu/resctrl/internal.h    |  5 +++++
 arch/x86/kernel/cpu/resctrl/core.c        | 17 +++++++++++------
 arch/x86/kernel/cpu/resctrl/pseudo_lock.c |  2 +-
 arch/x86/kernel/cpu/resctrl/rdtgroup.c    |  2 +-
 5 files changed, 20 insertions(+), 10 deletions(-)

diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index 8334eeacfec5..25051daa6655 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -150,7 +150,7 @@ struct resctrl_schema;
  * @alloc_capable:	Is allocation available on this machine
  * @mon_capable:	Is monitor feature available on this machine
  * @num_rmid:		Number of RMIDs available
- * @cache_level:	Which cache level defines scope of this resource
+ * @scope:		Scope of this resource (cache level or NUMA node)
  * @cache:		Cache allocation related data
  * @membw:		If the component has bandwidth controls, their properties.
  * @domains:		All domains for this resource
@@ -168,7 +168,7 @@ struct rdt_resource {
 	bool			alloc_capable;
 	bool			mon_capable;
 	int			num_rmid;
-	int			cache_level;
+	int			scope;
 	struct resctrl_cache	cache;
 	struct resctrl_membw	membw;
 	struct list_head	domains;
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index 85ceaf9a31ac..8275b8a74f7e 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -440,6 +440,11 @@ enum resctrl_res_level {
 	RDT_NUM_RESOURCES,
 };
 
+enum resctrl_scope {
+	SCOPE_L2_CACHE = 2,
+	SCOPE_L3_CACHE = 3
+};
+
 static inline struct rdt_resource *resctrl_inc(struct rdt_resource *res)
 {
 	struct rdt_hw_resource *hw_res = resctrl_to_arch_res(res);
diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index 030d3b409768..6571514752f3 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -65,7 +65,7 @@ struct rdt_hw_resource rdt_resources_all[] = {
 		.r_resctrl = {
 			.rid			= RDT_RESOURCE_L3,
 			.name			= "L3",
-			.cache_level		= 3,
+			.scope			= SCOPE_L3_CACHE,
 			.domains		= domain_init(RDT_RESOURCE_L3),
 			.parse_ctrlval		= parse_cbm,
 			.format_str		= "%d=%0*x",
@@ -79,7 +79,7 @@ struct rdt_hw_resource rdt_resources_all[] = {
 		.r_resctrl = {
 			.rid			= RDT_RESOURCE_L2,
 			.name			= "L2",
-			.cache_level		= 2,
+			.scope			= SCOPE_L2_CACHE,
 			.domains		= domain_init(RDT_RESOURCE_L2),
 			.parse_ctrlval		= parse_cbm,
 			.format_str		= "%d=%0*x",
@@ -93,7 +93,7 @@ struct rdt_hw_resource rdt_resources_all[] = {
 		.r_resctrl = {
 			.rid			= RDT_RESOURCE_MBA,
 			.name			= "MB",
-			.cache_level		= 3,
+			.scope			= SCOPE_L3_CACHE,
 			.domains		= domain_init(RDT_RESOURCE_MBA),
 			.parse_ctrlval		= parse_bw,
 			.format_str		= "%d=%*u",
@@ -105,7 +105,7 @@ struct rdt_hw_resource rdt_resources_all[] = {
 		.r_resctrl = {
 			.rid			= RDT_RESOURCE_SMBA,
 			.name			= "SMBA",
-			.cache_level		= 3,
+			.scope			= SCOPE_L3_CACHE,
 			.domains		= domain_init(RDT_RESOURCE_SMBA),
 			.parse_ctrlval		= parse_bw,
 			.format_str		= "%d=%*u",
@@ -487,6 +487,11 @@ static int arch_domain_mbm_alloc(u32 num_rmid, struct rdt_hw_domain *hw_dom)
 	return 0;
 }
 
+static int get_domain_id(int cpu, enum resctrl_scope scope)
+{
+	return get_cpu_cacheinfo_id(cpu, scope);
+}
+
 /*
  * domain_add_cpu - Add a cpu to a resource's domain list.
  *
@@ -502,7 +507,7 @@ static int arch_domain_mbm_alloc(u32 num_rmid, struct rdt_hw_domain *hw_dom)
  */
 static void domain_add_cpu(int cpu, struct rdt_resource *r)
 {
-	int id = get_cpu_cacheinfo_id(cpu, r->cache_level);
+	int id = get_domain_id(cpu, r->scope);
 	struct list_head *add_pos = NULL;
 	struct rdt_hw_domain *hw_dom;
 	struct rdt_domain *d;
@@ -552,7 +557,7 @@ static void domain_add_cpu(int cpu, struct rdt_resource *r)
 
 static void domain_remove_cpu(int cpu, struct rdt_resource *r)
 {
-	int id = get_cpu_cacheinfo_id(cpu, r->cache_level);
+	int id = get_domain_id(cpu, r->scope);
 	struct rdt_hw_domain *hw_dom;
 	struct rdt_domain *d;
 
diff --git a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c b/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
index 458cb7419502..42f124ffb968 100644
--- a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
+++ b/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
@@ -297,7 +297,7 @@ static int pseudo_lock_region_init(struct pseudo_lock_region *plr)
 	plr->size = rdtgroup_cbm_to_size(plr->s->res, plr->d, plr->cbm);
 
 	for (i = 0; i < ci->num_leaves; i++) {
-		if (ci->info_list[i].level == plr->s->res->cache_level) {
+		if (ci->info_list[i].level == plr->s->res->scope) {
 			plr->line_size = ci->info_list[i].coherency_line_size;
 			return 0;
 		}
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 725344048f85..418658f0a9ad 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -1348,7 +1348,7 @@ unsigned int rdtgroup_cbm_to_size(struct rdt_resource *r,
 	num_b = bitmap_weight(&cbm, r->cache.cbm_len);
 	ci = get_cpu_cacheinfo(cpumask_any(&d->cpu_mask));
 	for (i = 0; i < ci->num_leaves; i++) {
-		if (ci->info_list[i].level == r->cache_level) {
+		if (ci->info_list[i].level == r->scope) {
 			size = ci->info_list[i].size / r->cache.cbm_len * num_b;
 			break;
 		}
-- 
2.40.1
From nobody Sun Sep 14 03:50:47 2025
From: Tony Luck
To: Fenghua Yu, Reinette Chatre, Peter Newman, Jonathan Corbet,
	Shuah Khan, x86@kernel.org
Cc: Shaopeng Tan, James Morse, Jamie Iles, Babu Moger, Randy Dunlap,
	linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
	patches@lists.linux.dev, Tony Luck
Subject: [PATCH v3 2/8] x86/resctrl: Remove hard-coded RDT_RESOURCE_L3 in monitor.c
Date: Thu, 13 Jul 2023 09:32:01 -0700
Message-Id: <20230713163207.219710-3-tony.luck@intel.com>
In-Reply-To: <20230713163207.219710-1-tony.luck@intel.com>
References: <20230713163207.219710-1-tony.luck@intel.com>

Monitoring may be done at L3 cache granularity (legacy) or at node
granularity (systems with Sub-NUMA Cluster enabled). Save the struct
rdt_resource pointer that was used to initialize the monitoring code
and use it instead of the hard-coded RDT_RESOURCE_L3.

No functional change.
Signed-off-by: Tony Luck
Reviewed-by: Peter Newman
---
 arch/x86/kernel/cpu/resctrl/monitor.c | 18 +++++++++++-------
 1 file changed, 11 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index ded1fc7cb7cb..9be6ffdd01ae 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -30,6 +30,8 @@ struct rmid_entry {
 	struct list_head		list;
 };
 
+static struct rdt_resource *mon_resource;
+
 /**
  * @rmid_free_lru    A least recently used list of free RMIDs
  *     These RMIDs are guaranteed to have an occupancy less than the
@@ -268,7 +270,7 @@ int resctrl_arch_rmid_read(struct rdt_resource *r, struct rdt_domain *d,
  */
 void __check_limbo(struct rdt_domain *d, bool force_free)
 {
-	struct rdt_resource *r = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
+	struct rdt_resource *r = mon_resource;
 	struct rmid_entry *entry;
 	u32 crmid = 1, nrmid;
 	bool rmid_dirty;
@@ -333,7 +335,7 @@ int alloc_rmid(void)
 
 static void add_rmid_to_limbo(struct rmid_entry *entry)
 {
-	struct rdt_resource *r = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
+	struct rdt_resource *r = mon_resource;
 	struct rdt_domain *d;
 	int cpu, err;
 	u64 val = 0;
@@ -645,7 +647,7 @@ void cqm_handle_limbo(struct work_struct *work)
 
 	mutex_lock(&rdtgroup_mutex);
 
-	r = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
+	r = mon_resource;
 	d = container_of(work, struct rdt_domain, cqm_limbo.work);
 
 	__check_limbo(d, false);
@@ -681,7 +683,7 @@ void mbm_handle_overflow(struct work_struct *work)
 	if (!static_branch_likely(&rdt_mon_enable_key))
 		goto out_unlock;
 
-	r = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
+	r = mon_resource;
 	d = container_of(work, struct rdt_domain, mbm_over.work);
 
 	list_for_each_entry(prgrp, &rdt_all_groups, rdtgroup_list) {
@@ -759,9 +761,9 @@ static struct mon_evt mbm_local_event = {
 /*
  * Initialize the event list for the resource.
  *
- * Note that MBM events are also part of RDT_RESOURCE_L3 resource
- * because as per the SDM the total and local memory bandwidth
- * are enumerated as part of L3 monitoring.
+ * Monitor events can either be part of RDT_RESOURCE_L3 resource,
+ * or they may be per NUMA node on systems with sub-NUMA cluster
+ * enabled and are then in the RDT_RESOURCE_NODE resource.
  */
 static void l3_mon_evt_init(struct rdt_resource *r)
 {
@@ -773,6 +775,8 @@ static void l3_mon_evt_init(struct rdt_resource *r)
 		list_add_tail(&mbm_total_event.list, &r->evt_list);
 	if (is_mbm_local_enabled())
 		list_add_tail(&mbm_local_event.list, &r->evt_list);
+
+	mon_resource = r;
 }
 
 int __init rdt_get_mon_l3_config(struct rdt_resource *r)
-- 
2.40.1
From nobody Sun Sep 14 03:50:47 2025
From: Tony Luck
To: Fenghua Yu, Reinette Chatre, Peter Newman, Jonathan Corbet,
	Shuah Khan, x86@kernel.org
Cc: Shaopeng Tan, James Morse, Jamie Iles, Babu Moger, Randy Dunlap,
	linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
	patches@lists.linux.dev, Tony Luck
Subject: [PATCH v3 3/8] x86/resctrl: Add a new node-scoped resource to rdt_resources_all[]
Date: Thu, 13 Jul 2023 09:32:02 -0700
Message-Id: <20230713163207.219710-4-tony.luck@intel.com>
In-Reply-To: <20230713163207.219710-1-tony.luck@intel.com>
References: <20230713163207.219710-1-tony.luck@intel.com>

Add a placeholder in the array of struct rdt_hw_resource to be used
for event monitoring of systems with Sub-NUMA Cluster enabled.

Update get_domain_id() to handle SCOPE_NODE.
Signed-off-by: Tony Luck
Reviewed-by: Peter Newman
---
 arch/x86/kernel/cpu/resctrl/internal.h |  4 +++-
 arch/x86/kernel/cpu/resctrl/core.c     | 12 ++++++++++++
 2 files changed, 15 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index 8275b8a74f7e..243017096ddf 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -435,6 +435,7 @@ enum resctrl_res_level {
 	RDT_RESOURCE_L2,
 	RDT_RESOURCE_MBA,
 	RDT_RESOURCE_SMBA,
+	RDT_RESOURCE_NODE,
 
 	/* Must be the last */
 	RDT_NUM_RESOURCES,
@@ -442,7 +443,8 @@ enum resctrl_res_level {
 
 enum resctrl_scope {
 	SCOPE_L2_CACHE = 2,
-	SCOPE_L3_CACHE = 3
+	SCOPE_L3_CACHE = 3,
+	SCOPE_NODE,
 };
 
 static inline struct rdt_resource *resctrl_inc(struct rdt_resource *res)
diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index 6571514752f3..e4bd3072927c 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -112,6 +112,16 @@ struct rdt_hw_resource rdt_resources_all[] = {
 			.fflags			= RFTYPE_RES_MB,
 		},
 	},
+	[RDT_RESOURCE_NODE] =
+	{
+		.r_resctrl = {
+			.rid			= RDT_RESOURCE_NODE,
+			.name			= "L3",
+			.scope			= SCOPE_NODE,
+			.domains		= domain_init(RDT_RESOURCE_NODE),
+			.fflags			= 0,
+		},
+	},
 };
 
 /*
@@ -489,6 +499,8 @@ static int arch_domain_mbm_alloc(u32 num_rmid, struct rdt_hw_domain *hw_dom)
 
 static int get_domain_id(int cpu, enum resctrl_scope scope)
 {
+	if (scope == SCOPE_NODE)
+		return cpu_to_node(cpu);
 	return get_cpu_cacheinfo_id(cpu, scope);
 }
 
-- 
2.40.1
From nobody Sun Sep 14 03:50:47 2025
From: Tony Luck
To: Fenghua Yu, Reinette Chatre, Peter Newman, Jonathan Corbet,
	Shuah Khan, x86@kernel.org
Cc: Shaopeng Tan, James Morse, Jamie Iles, Babu Moger, Randy Dunlap,
	linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
	patches@lists.linux.dev, Tony Luck
Subject: [PATCH v3 4/8] x86/resctrl: Add code to setup monitoring at L3 or NODE scope
Date: Thu, 13 Jul 2023 09:32:03 -0700
Message-Id: <20230713163207.219710-5-tony.luck@intel.com>
In-Reply-To: <20230713163207.219710-1-tony.luck@intel.com>
References: <20230713163207.219710-1-tony.luck@intel.com>

When Sub-NUMA Cluster mode is enabled ("snc_ways" > 1), use
RDT_RESOURCE_NODE instead of RDT_RESOURCE_L3 for all monitoring
operations.

The mon_scale and num_rmid values from CPUID(0xf,0x1) (EBX, ECX) must
be scaled down by the number of Sub-NUMA Clusters.

A subsequent change will detect Sub-NUMA Cluster mode and set
"snc_ways". For now it is set to one (meaning each L3 cache spans one
node).

Signed-off-by: Tony Luck
Reviewed-by: Peter Newman
---
 arch/x86/kernel/cpu/resctrl/internal.h | 7 +++++++
 arch/x86/kernel/cpu/resctrl/core.c     | 7 ++++++-
 arch/x86/kernel/cpu/resctrl/monitor.c  | 4 ++--
 arch/x86/kernel/cpu/resctrl/rdtgroup.c | 2 +-
 4 files changed, 16 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index 243017096ddf..38bac0062c82 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -430,6 +430,8 @@ DECLARE_STATIC_KEY_FALSE(rdt_alloc_enable_key);
 
 extern struct dentry *debugfs_resctrl;
 
+extern int snc_ways;
+
 enum resctrl_res_level {
 	RDT_RESOURCE_L3,
 	RDT_RESOURCE_L2,
@@ -447,6 +449,11 @@ enum resctrl_scope {
 	SCOPE_NODE,
 };
 
+static inline int get_mbm_res_level(void)
+{
+	return snc_ways > 1 ? RDT_RESOURCE_NODE : RDT_RESOURCE_L3;
+}
+
 static inline struct rdt_resource *resctrl_inc(struct rdt_resource *res)
 {
 	struct rdt_hw_resource *hw_res = resctrl_to_arch_res(res);
diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index e4bd3072927c..6fe9f87d4403 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -48,6 +48,11 @@ int max_name_width, max_data_width;
  */
 bool rdt_alloc_capable;
 
+/*
+ * How many Sub-NUMA Cluster nodes share a single L3 cache
+ */
+int snc_ways = 1;
+
 static void
 mba_wrmsr_intel(struct rdt_domain *d, struct msr_param *m,
 		struct rdt_resource *r);
@@ -831,7 +836,7 @@ static __init bool get_rdt_alloc_resources(void)
 
 static __init bool get_rdt_mon_resources(void)
 {
-	struct rdt_resource *r = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
+	struct rdt_resource *r = &rdt_resources_all[get_mbm_res_level()].r_resctrl;
 
 	if (rdt_cpu_has(X86_FEATURE_CQM_OCCUP_LLC))
 		rdt_mon_features |= (1 << QOS_L3_OCCUP_EVENT_ID);
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index 9be6ffdd01ae..da3f36212898 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -787,8 +787,8 @@ int __init rdt_get_mon_l3_config(struct rdt_resource *r)
 	int ret;
 
 	resctrl_rmid_realloc_limit = boot_cpu_data.x86_cache_size * 1024;
-	hw_res->mon_scale = boot_cpu_data.x86_cache_occ_scale;
-	r->num_rmid = boot_cpu_data.x86_cache_max_rmid + 1;
+	hw_res->mon_scale = boot_cpu_data.x86_cache_occ_scale / snc_ways;
+	r->num_rmid = (boot_cpu_data.x86_cache_max_rmid + 1) / snc_ways;
 	hw_res->mbm_width = MBM_CNTR_WIDTH_BASE;
 
 	if (mbm_offset > 0 && mbm_offset <= MBM_CNTR_WIDTH_OFFSET_MAX)
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 418658f0a9ad..d037f3da9e55 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -2524,7 +2524,7 @@ static int rdt_get_tree(struct fs_context *fc)
 	static_branch_enable_cpuslocked(&rdt_enable_key);
 
 	if (is_mbm_enabled()) {
-		r = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
+		r = &rdt_resources_all[get_mbm_res_level()].r_resctrl;
 		list_for_each_entry(dom, &r->domains, list)
 			mbm_setup_overflow_handler(dom, MBM_OVERFLOW_INTERVAL);
 	}
-- 
2.40.1
From nobody Sun Sep 14 03:50:47 2025
From: Tony Luck
To: Fenghua Yu, Reinette Chatre, Peter Newman, Jonathan Corbet,
	Shuah Khan, x86@kernel.org
Cc: Shaopeng Tan, James Morse, Jamie Iles, Babu Moger, Randy Dunlap,
	linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
	patches@lists.linux.dev, Tony Luck
Subject: [PATCH v3 5/8] x86/resctrl: Add package scoped resource
Date: Thu, 13 Jul 2023 09:32:04 -0700
Message-Id: <20230713163207.219710-6-tony.luck@intel.com>
In-Reply-To: <20230713163207.219710-1-tony.luck@intel.com>
References: <20230713163207.219710-1-tony.luck@intel.com>

Some Intel features require setting a package-scoped model-specific
register. Add a new resource that builds domains for each package.
Signed-off-by: Tony Luck
Reviewed-by: Peter Newman
---
 include/linux/resctrl.h                |  1 +
 arch/x86/kernel/cpu/resctrl/internal.h |  6 ++++--
 arch/x86/kernel/cpu/resctrl/core.c     | 23 +++++++++++++++++++----
 3 files changed, 24 insertions(+), 6 deletions(-)

diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index 25051daa6655..f504f6263fec 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -167,6 +167,7 @@ struct rdt_resource {
 	int			rid;
 	bool			alloc_capable;
 	bool			mon_capable;
+	bool			pkg_actions;
 	int			num_rmid;
 	int			scope;
 	struct resctrl_cache	cache;
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index 38bac0062c82..67340c83392f 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -438,6 +438,7 @@ enum resctrl_res_level {
 	RDT_RESOURCE_MBA,
 	RDT_RESOURCE_SMBA,
 	RDT_RESOURCE_NODE,
+	RDT_RESOURCE_PKG,
 
 	/* Must be the last */
 	RDT_NUM_RESOURCES,
@@ -447,6 +448,7 @@ enum resctrl_scope {
 	SCOPE_L2_CACHE = 2,
 	SCOPE_L3_CACHE = 3,
 	SCOPE_NODE,
+	SCOPE_PKG,
 };
 
 static inline int get_mbm_res_level(void)
@@ -478,9 +480,9 @@ int resctrl_arch_set_cdp_enabled(enum resctrl_res_level l, bool enable);
 	     r <= &rdt_resources_all[RDT_NUM_RESOURCES - 1].r_resctrl;	      \
 	     r = resctrl_inc(r))
 
-#define for_each_capable_rdt_resource(r)				      \
+#define for_each_domain_needed_rdt_resource(r)				      \
 	for_each_rdt_resource(r)					      \
-		if (r->alloc_capable || r->mon_capable)
+		if (r->alloc_capable || r->mon_capable || r->pkg_actions)
 
 #define for_each_alloc_capable_rdt_resource(r)				      \
 	for_each_rdt_resource(r)					      \
diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index 6fe9f87d4403..af3be3c2db96 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -127,6 +127,16 @@ struct rdt_hw_resource rdt_resources_all[] = {
 			.fflags			= 0,
 		},
 	},
+	[RDT_RESOURCE_PKG] =
+	{
+		.r_resctrl = {
+			.rid			= RDT_RESOURCE_PKG,
+			.name			= "PKG",
+			.scope			= SCOPE_PKG,
+			.domains		= domain_init(RDT_RESOURCE_PKG),
+			.fflags			= 0,
+		},
+	},
 };
 
 /*
@@ -504,9 +514,14 @@ static int arch_domain_mbm_alloc(u32 num_rmid, struct rdt_hw_domain *hw_dom)
 
 static int get_domain_id(int cpu, enum resctrl_scope scope)
 {
-	if (scope == SCOPE_NODE)
+	switch (scope) {
+	case SCOPE_NODE:
 		return cpu_to_node(cpu);
-	return get_cpu_cacheinfo_id(cpu, scope);
+	case SCOPE_PKG:
+		return topology_physical_package_id(cpu);
+	default:
+		return get_cpu_cacheinfo_id(cpu, scope);
+	}
 }
 
 /*
@@ -630,7 +645,7 @@ static int resctrl_online_cpu(unsigned int cpu)
 	struct rdt_resource *r;
 
 	mutex_lock(&rdtgroup_mutex);
-	for_each_capable_rdt_resource(r)
+	for_each_domain_needed_rdt_resource(r)
 		domain_add_cpu(cpu, r);
 	/* The cpu is set in default rdtgroup after online. */
 	cpumask_set_cpu(cpu, &rdtgroup_default.cpu_mask);
@@ -657,7 +672,7 @@ static int resctrl_offline_cpu(unsigned int cpu)
 	struct rdt_resource *r;
 
 	mutex_lock(&rdtgroup_mutex);
-	for_each_capable_rdt_resource(r)
+	for_each_domain_needed_rdt_resource(r)
 		domain_remove_cpu(cpu, r);
 	list_for_each_entry(rdtgrp, &rdt_all_groups, rdtgroup_list) {
 		if (cpumask_test_and_clear_cpu(cpu, &rdtgrp->cpu_mask)) {
-- 
2.40.1
From: Tony Luck
Subject: [PATCH v3 6/8] x86/resctrl: Update documentation with Sub-NUMA cluster changes
Date: Thu, 13 Jul 2023 09:32:05 -0700
Message-Id: <20230713163207.219710-7-tony.luck@intel.com>
In-Reply-To: <20230713163207.219710-1-tony.luck@intel.com>

With Sub-NUMA Cluster mode enabled the scope of monitoring resources is
per-NODE instead of per-L3 cache. Suffixes of directories with "L3" in
their name refer to Sub-NUMA nodes instead of L3 cache ids.

Signed-off-by: Tony Luck
Reviewed-by: Peter Newman
---
 Documentation/arch/x86/resctrl.rst | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/Documentation/arch/x86/resctrl.rst b/Documentation/arch/x86/resctrl.rst
index cb05d90111b4..4d9ddb91751d 100644
--- a/Documentation/arch/x86/resctrl.rst
+++ b/Documentation/arch/x86/resctrl.rst
@@ -345,9 +345,13 @@ When control is enabled all CTRL_MON groups will also contain:
 When monitoring is enabled all MON groups will also contain:
 
 "mon_data":
-	This contains a set of files organized by L3 domain and by
-	RDT event. E.g. on a system with two L3 domains there will
-	be subdirectories "mon_L3_00" and "mon_L3_01". Each of these
+	This contains a set of files organized by L3 domain or by NUMA
+	node (depending on whether Sub-NUMA Cluster (SNC) mode is disabled
+	or enabled respectively) and by RDT event. E.g. on a system with
+	SNC mode disabled with two L3 domains there will be subdirectories
+	"mon_L3_00" and "mon_L3_01". The numerical suffix refers to the
+	L3 cache id. With SNC enabled the directory names are the same,
+	but the numerical suffix refers to the node id. Each of these
 	directories have one file per event (e.g. "llc_occupancy",
 	"mbm_total_bytes", and "mbm_local_bytes").
	In a MON group these files provide a read out of the current
	value of the event for
-- 
2.40.1

From nobody Sun Sep 14 03:50:47 2025
From: Tony Luck
Subject: [PATCH v3 7/8] x86/resctrl: Determine if Sub-NUMA Cluster is enabled and initialize.
Date: Thu, 13 Jul 2023 09:32:06 -0700
Message-Id: <20230713163207.219710-8-tony.luck@intel.com>
In-Reply-To: <20230713163207.219710-1-tony.luck@intel.com>

There isn't a simple hardware enumeration to indicate to software that
a system is running with Sub-NUMA Cluster enabled. Compare the number
of NUMA nodes with the number of L3 caches to calculate the number of
Sub-NUMA nodes per L3 cache.

When Sub-NUMA Cluster mode is enabled in BIOS setup the RMID counters
are distributed equally between the SNC nodes within each socket. E.g.
if there are 400 RMID counters, and the system is configured with two
SNC nodes per socket, then RMID counters 0..199 are used on SNC node 0
on the socket, and RMID counters 200..399 on SNC node 1.

A model specific MSR (0xca0) can change the configuration of the RMIDs
when SNC mode is enabled. The MSR controls the interpretation of the
RMID field in the IA32_PQR_ASSOC MSR so that the appropriate hardware
counters within the SNC node are updated.

Also initialize a per-cpu RMID offset value. Use this to calculate the
value to write to the IA32_QM_EVTSEL MSR when reading RMID event
values.

N.B.
this works well for well-behaved NUMA applications that access memory
predominantly from the local memory node. For applications that access
memory across multiple nodes it may be necessary for the user to read
counters for all SNC nodes on a socket and add the values to get the
actual LLC occupancy or memory bandwidth. Perhaps this isn't all that
different from applications that span across multiple sockets in a
legacy system.

Signed-off-by: Tony Luck
Reviewed-by: Peter Newman
Tested-by: Peter Newman
---
 arch/x86/include/asm/resctrl.h         |  2 +
 arch/x86/kernel/cpu/resctrl/core.c     | 99 +++++++++++++++++++++++++-
 arch/x86/kernel/cpu/resctrl/monitor.c  |  2 +-
 arch/x86/kernel/cpu/resctrl/rdtgroup.c |  2 +-
 4 files changed, 100 insertions(+), 5 deletions(-)

diff --git a/arch/x86/include/asm/resctrl.h b/arch/x86/include/asm/resctrl.h
index 255a78d9d906..f95e69bacc65 100644
--- a/arch/x86/include/asm/resctrl.h
+++ b/arch/x86/include/asm/resctrl.h
@@ -35,6 +35,8 @@ DECLARE_STATIC_KEY_FALSE(rdt_enable_key);
 DECLARE_STATIC_KEY_FALSE(rdt_alloc_enable_key);
 DECLARE_STATIC_KEY_FALSE(rdt_mon_enable_key);
 
+DECLARE_PER_CPU(int, rmid_offset);
+
 /*
  * __resctrl_sched_in() - Writes the task's CLOSid/RMID to IA32_PQR_MSR
  *
diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index af3be3c2db96..a03ff1a95624 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -16,11 +16,14 @@
 
 #define pr_fmt(fmt)	"resctrl: " fmt
 
+#include
 #include
 #include
 #include
 #include
+#include
 
+#include
 #include
 #include
 #include "internal.h"
@@ -524,6 +527,39 @@ static int get_domain_id(int cpu, enum resctrl_scope scope)
 	}
 }
 
+DEFINE_PER_CPU(int, rmid_offset);
+
+static void set_per_cpu_rmid_offset(int cpu, struct rdt_resource *r)
+{
+	this_cpu_write(rmid_offset, (cpu_to_node(cpu) % snc_ways) * r->num_rmid);
+}
+
+/*
+ * This MSR provides for configuration of RMIDs on Sub-NUMA Cluster
+ * systems.
+ * Bit0 = 1 (default) For legacy configuration
+ * Bit0 = 0 RMIDs are divided evenly between SNC nodes.
+ */
+#define MSR_RMID_SNC_CONFIG	0xCA0
+
+static void snc_add_pkg(void)
+{
+	u64 msrval;
+
+	rdmsrl(MSR_RMID_SNC_CONFIG, msrval);
+	msrval &= ~BIT_ULL(0);
+	wrmsrl(MSR_RMID_SNC_CONFIG, msrval);
+}
+
+static void snc_remove_pkg(void)
+{
+	u64 msrval;
+
+	rdmsrl(MSR_RMID_SNC_CONFIG, msrval);
+	msrval |= BIT_ULL(0);
+	wrmsrl(MSR_RMID_SNC_CONFIG, msrval);
+}
+
 /*
  * domain_add_cpu - Add a cpu to a resource's domain list.
  *
@@ -555,6 +591,8 @@ static void domain_add_cpu(int cpu, struct rdt_resource *r)
 		cpumask_set_cpu(cpu, &d->cpu_mask);
 		if (r->cache.arch_has_per_cpu_cfg)
 			rdt_domain_reconfigure_cdp(r);
+		if (r->mon_capable)
+			set_per_cpu_rmid_offset(cpu, r);
 		return;
 	}
 
@@ -573,11 +611,17 @@ static void domain_add_cpu(int cpu, struct rdt_resource *r)
 		return;
 	}
 
-	if (r->mon_capable && arch_domain_mbm_alloc(r->num_rmid, hw_dom)) {
-		domain_free(hw_dom);
-		return;
+	if (r->mon_capable) {
+		if (arch_domain_mbm_alloc(r->num_rmid, hw_dom)) {
+			domain_free(hw_dom);
+			return;
+		}
+		set_per_cpu_rmid_offset(cpu, r);
 	}
 
+	if (r->pkg_actions)
+		snc_add_pkg();
+
 	list_add_tail(&d->list, add_pos);
 
 	err = resctrl_online_domain(r, d);
@@ -613,6 +657,9 @@ static void domain_remove_cpu(int cpu, struct rdt_resource *r)
 			d->plr->d = NULL;
 		domain_free(hw_dom);
 
+		if (r->pkg_actions)
+			snc_remove_pkg();
+
 		return;
 	}
 
@@ -899,11 +946,57 @@ static __init bool get_rdt_resources(void)
 	return (rdt_mon_capable || rdt_alloc_capable);
 }
 
+static const struct x86_cpu_id snc_cpu_ids[] __initconst = {
+	X86_MATCH_INTEL_FAM6_MODEL(ICELAKE_X, 0),
+	X86_MATCH_INTEL_FAM6_MODEL(SAPPHIRERAPIDS_X, 0),
+	X86_MATCH_INTEL_FAM6_MODEL(EMERALDRAPIDS_X, 0),
+	{}
+};
+
+/*
+ * There isn't a simple enumeration bit to show whether SNC mode
+ * is enabled. Look at the ratio of number of NUMA nodes to the
+ * number of distinct L3 caches. Take care to skip memory-only nodes.
+ */
+static __init int find_snc_ways(void)
+{
+	unsigned long *node_caches;
+	int mem_only_nodes = 0;
+	int cpu, node, ret;
+
+	if (!x86_match_cpu(snc_cpu_ids))
+		return 1;
+
+	node_caches = kcalloc(BITS_TO_LONGS(nr_node_ids), sizeof(*node_caches), GFP_KERNEL);
+	if (!node_caches)
+		return 1;
+
+	cpus_read_lock();
+	for_each_node(node) {
+		cpu = cpumask_first(cpumask_of_node(node));
+		if (cpu < nr_cpu_ids)
+			set_bit(get_cpu_cacheinfo_id(cpu, 3), node_caches);
+		else
+			mem_only_nodes++;
+	}
+	cpus_read_unlock();
+
+	ret = (nr_node_ids - mem_only_nodes) / bitmap_weight(node_caches, nr_node_ids);
+	kfree(node_caches);
+
+	if (ret > 1)
+		rdt_resources_all[RDT_RESOURCE_PKG].r_resctrl.pkg_actions = true;
+
+	return ret;
+}
+
 static __init void rdt_init_res_defs_intel(void)
 {
 	struct rdt_hw_resource *hw_res;
 	struct rdt_resource *r;
 
+	snc_ways = find_snc_ways();
+
 	for_each_rdt_resource(r) {
 		hw_res = resctrl_to_arch_res(r);
 
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index da3f36212898..74db99d299e1 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -160,7 +160,7 @@ static int __rmid_read(u32 rmid, enum resctrl_event_id eventid, u64 *val)
 	 * IA32_QM_CTR.Error (bit 63) and IA32_QM_CTR.Unavailable (bit 62)
 	 * are error bits.
	 */
-	wrmsr(MSR_IA32_QM_EVTSEL, eventid, rmid);
+	wrmsr(MSR_IA32_QM_EVTSEL, eventid, rmid + this_cpu_read(rmid_offset));
 	rdmsrl(MSR_IA32_QM_CTR, msr_val);
 
 	if (msr_val & RMID_VAL_ERROR)
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index d037f3da9e55..1a9c38b018ba 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -1354,7 +1354,7 @@ unsigned int rdtgroup_cbm_to_size(struct rdt_resource *r,
 		}
 	}
 
-	return size;
+	return size / snc_ways;
 }
 
 /**
-- 
2.40.1

From nobody Sun Sep 14 03:50:47 2025
From: Tony Luck
Subject: [PATCH v3 8/8] selftests/resctrl: Adjust effective L3 cache size when SNC enabled
Date: Thu, 13 Jul 2023 09:32:07 -0700
Message-Id: <20230713163207.219710-9-tony.luck@intel.com>
In-Reply-To: <20230713163207.219710-1-tony.luck@intel.com>

Sub-NUMA Cluster divides CPUs sharing an L3 cache into separate NUMA
nodes. Systems may support splitting into either two or four nodes.

When SNC mode is enabled the effective amount of L3 cache available
for allocation is divided by the number of nodes per L3.

Detect which SNC mode is active by comparing the number of CPUs that
share a cache with CPU0 against the number of CPUs on node0.
Reported-by: "Shaopeng Tan (Fujitsu)"
Closes: https://lore.kernel.org/r/TYAPR01MB6330B9B17686EF426D2C3F308B25A@TYAPR01MB6330.jpnprd01.prod.outlook.com
Signed-off-by: Tony Luck
---
 tools/testing/selftests/resctrl/resctrl.h   |  1 +
 tools/testing/selftests/resctrl/resctrlfs.c | 57 +++++++++++++++++++++
 2 files changed, 58 insertions(+)

diff --git a/tools/testing/selftests/resctrl/resctrl.h b/tools/testing/selftests/resctrl/resctrl.h
index 87e39456dee0..a8b43210b573 100644
--- a/tools/testing/selftests/resctrl/resctrl.h
+++ b/tools/testing/selftests/resctrl/resctrl.h
@@ -13,6 +13,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
diff --git a/tools/testing/selftests/resctrl/resctrlfs.c b/tools/testing/selftests/resctrl/resctrlfs.c
index fb00245dee92..79eecbf9f863 100644
--- a/tools/testing/selftests/resctrl/resctrlfs.c
+++ b/tools/testing/selftests/resctrl/resctrlfs.c
@@ -130,6 +130,61 @@ int get_resource_id(int cpu_no, int *resource_id)
 	return 0;
 }
 
+/*
+ * Count number of CPUs in a /sys bit map
+ */
+static int count_sys_bitmap_bits(char *name)
+{
+	FILE *fp = fopen(name, "r");
+	int count = 0, c;
+
+	if (!fp)
+		return 0;
+
+	while ((c = fgetc(fp)) != EOF) {
+		if (!isxdigit(c))
+			continue;
+		switch (c) {
+		case 'f':
+			count++;	/* fall through */
+		case '7': case 'b': case 'd': case 'e':
+			count++;	/* fall through */
+		case '3': case '5': case '6': case '9': case 'a': case 'c':
+			count++;	/* fall through */
+		case '1': case '2': case '4': case '8':
+			count++;
+		}
+	}
+	fclose(fp);
+
+	return count;
+}
+
+/*
+ * Detect SNC by comparing #CPUs in node0 with #CPUs sharing LLC with CPU0.
+ * Try to get this right, even if a few CPUs are offline so that the number
+ * of CPUs in node0 is not exactly half or a quarter of the CPUs sharing the
+ * LLC of CPU0.
+ */
+static int snc_ways(void)
+{
+	int node_cpus, cache_cpus;
+
+	node_cpus = count_sys_bitmap_bits("/sys/devices/system/node/node0/cpumap");
+	cache_cpus = count_sys_bitmap_bits("/sys/devices/system/cpu/cpu0/cache/index3/shared_cpu_map");
+
+	if (!node_cpus || !cache_cpus) {
+		fprintf(stderr, "Warning: could not determine Sub-NUMA Cluster mode\n");
+		return 1;
+	}
+
+	if (4 * node_cpus <= cache_cpus)
+		return 4;
+	else if (2 * node_cpus <= cache_cpus)
+		return 2;
+	return 1;
+}
+
 /*
  * get_cache_size - Get cache size for a specified CPU
  * @cpu_no: CPU number
@@ -190,6 +245,8 @@ int get_cache_size(int cpu_no, char *cache_type, unsigned long *cache_size)
 		break;
 	}
 
+	if (cache_num == 3)
+		*cache_size /= snc_ways();
 	return 0;
 }
 
-- 
2.40.1