From nobody Tue Dec 16 02:35:51 2025
From: Tony Luck
To: Fenghua Yu, Reinette Chatre, Peter Newman, Jonathan Corbet, Shuah Khan, x86@kernel.org
Cc: Shaopeng Tan, James Morse, Jamie Iles, Babu Moger, Randy Dunlap, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, patches@lists.linux.dev, Tony Luck
Subject: [PATCH v5 1/8] x86/resctrl: Prepare for new domain scope
Date: Tue, 29 Aug 2023 16:44:19 -0700
Message-ID: <20230829234426.64421-2-tony.luck@intel.com>
X-Mailer: git-send-email 2.41.0
In-Reply-To: <20230829234426.64421-1-tony.luck@intel.com>
References: <20230722190740.326190-1-tony.luck@intel.com> <20230829234426.64421-1-tony.luck@intel.com>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Type: text/plain; charset="utf-8"

Legacy resctrl features operated on subsets of CPUs in the system with the defining attribute of each subset being an instance of a particular level of cache. E.g. all CPUs sharing an L3 cache would be part of the same domain.

In preparation for features that are scoped at the NUMA node level, change the code from explicit references to "cache_level" to a more generic scope. At this point the only options for this scope are groups of CPUs that share an L2 cache or L3 cache.

No functional change.
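For illustration only (not part of this patch): the scope-to-level translation that this change open-codes in pseudo_lock_region_init() and rdtgroup_cbm_to_size() could be sketched as a single helper. The name scope_to_cache_level() is hypothetical, and the sketch is plain userspace C so that it builds outside the kernel.

```c
/* Mirrors the enum introduced by this patch. */
enum resctrl_scope {
	RESCTRL_L3_CACHE,
	RESCTRL_L2_CACHE,
};

/*
 * Hypothetical helper (an assumption, not in the patch): translate a
 * scope into the cache level it refers to, or -1 if the scope does
 * not name a cache level.
 */
static int scope_to_cache_level(enum resctrl_scope scope)
{
	switch (scope) {
	case RESCTRL_L3_CACHE:
		return 3;
	case RESCTRL_L2_CACHE:
		return 2;
	default:
		return -1;
	}
}
```

A caller would compare the returned level against ci->info_list[i].level and treat -1 as an unsupported scope, which is where the patch itself uses WARN_ON_ONCE().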
Signed-off-by: Tony Luck --- include/linux/resctrl.h | 9 ++++++-- arch/x86/kernel/cpu/resctrl/core.c | 27 ++++++++++++++++++----- arch/x86/kernel/cpu/resctrl/pseudo_lock.c | 15 ++++++++++++- arch/x86/kernel/cpu/resctrl/rdtgroup.c | 15 ++++++++++++- 4 files changed, 56 insertions(+), 10 deletions(-) diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h index 8334eeacfec5..2db1244ae642 100644 --- a/include/linux/resctrl.h +++ b/include/linux/resctrl.h @@ -144,13 +144,18 @@ struct resctrl_membw { struct rdt_parse_data; struct resctrl_schema; =20 +enum resctrl_scope { + RESCTRL_L3_CACHE, + RESCTRL_L2_CACHE, +}; + /** * struct rdt_resource - attributes of a resctrl resource * @rid: The index of the resource * @alloc_capable: Is allocation available on this machine * @mon_capable: Is monitor feature available on this machine * @num_rmid: Number of RMIDs available - * @cache_level: Which cache level defines scope of this resource + * @scope: Scope of this resource * @cache: Cache allocation related data * @membw: If the component has bandwidth controls, their properties. 
* @domains: All domains for this resource @@ -168,7 +173,7 @@ struct rdt_resource { bool alloc_capable; bool mon_capable; int num_rmid; - int cache_level; + enum resctrl_scope scope; struct resctrl_cache cache; struct resctrl_membw membw; struct list_head domains; diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resct= rl/core.c index 030d3b409768..0d3bae523ecb 100644 --- a/arch/x86/kernel/cpu/resctrl/core.c +++ b/arch/x86/kernel/cpu/resctrl/core.c @@ -65,7 +65,7 @@ struct rdt_hw_resource rdt_resources_all[] =3D { .r_resctrl =3D { .rid =3D RDT_RESOURCE_L3, .name =3D "L3", - .cache_level =3D 3, + .scope =3D RESCTRL_L3_CACHE, .domains =3D domain_init(RDT_RESOURCE_L3), .parse_ctrlval =3D parse_cbm, .format_str =3D "%d=3D%0*x", @@ -79,7 +79,7 @@ struct rdt_hw_resource rdt_resources_all[] =3D { .r_resctrl =3D { .rid =3D RDT_RESOURCE_L2, .name =3D "L2", - .cache_level =3D 2, + .scope =3D RESCTRL_L2_CACHE, .domains =3D domain_init(RDT_RESOURCE_L2), .parse_ctrlval =3D parse_cbm, .format_str =3D "%d=3D%0*x", @@ -93,7 +93,7 @@ struct rdt_hw_resource rdt_resources_all[] =3D { .r_resctrl =3D { .rid =3D RDT_RESOURCE_MBA, .name =3D "MB", - .cache_level =3D 3, + .scope =3D RESCTRL_L3_CACHE, .domains =3D domain_init(RDT_RESOURCE_MBA), .parse_ctrlval =3D parse_bw, .format_str =3D "%d=3D%*u", @@ -105,7 +105,7 @@ struct rdt_hw_resource rdt_resources_all[] =3D { .r_resctrl =3D { .rid =3D RDT_RESOURCE_SMBA, .name =3D "SMBA", - .cache_level =3D 3, + .scope =3D RESCTRL_L3_CACHE, .domains =3D domain_init(RDT_RESOURCE_SMBA), .parse_ctrlval =3D parse_bw, .format_str =3D "%d=3D%*u", @@ -487,6 +487,21 @@ static int arch_domain_mbm_alloc(u32 num_rmid, struct = rdt_hw_domain *hw_dom) return 0; } =20 +static int get_domain_id_from_scope(int cpu, enum resctrl_scope scope) +{ + switch (scope) { + case RESCTRL_L3_CACHE: + return get_cpu_cacheinfo_id(cpu, 3); + case RESCTRL_L2_CACHE: + return get_cpu_cacheinfo_id(cpu, 2); + default: + WARN_ON_ONCE(1); + break; + } + + return -1; 
+} + /* * domain_add_cpu - Add a cpu to a resource's domain list. * @@ -502,7 +517,7 @@ static int arch_domain_mbm_alloc(u32 num_rmid, struct r= dt_hw_domain *hw_dom) */ static void domain_add_cpu(int cpu, struct rdt_resource *r) { - int id =3D get_cpu_cacheinfo_id(cpu, r->cache_level); + int id =3D get_domain_id_from_scope(cpu, r->scope); struct list_head *add_pos =3D NULL; struct rdt_hw_domain *hw_dom; struct rdt_domain *d; @@ -552,7 +567,7 @@ static void domain_add_cpu(int cpu, struct rdt_resource= *r) =20 static void domain_remove_cpu(int cpu, struct rdt_resource *r) { - int id =3D get_cpu_cacheinfo_id(cpu, r->cache_level); + int id =3D get_domain_id_from_scope(cpu, r->scope); struct rdt_hw_domain *hw_dom; struct rdt_domain *d; =20 diff --git a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c b/arch/x86/kernel/cp= u/resctrl/pseudo_lock.c index 458cb7419502..e79324676f57 100644 --- a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c +++ b/arch/x86/kernel/cpu/resctrl/pseudo_lock.c @@ -279,6 +279,7 @@ static void pseudo_lock_region_clear(struct pseudo_lock= _region *plr) static int pseudo_lock_region_init(struct pseudo_lock_region *plr) { struct cpu_cacheinfo *ci; + int cache_level; int ret; int i; =20 @@ -296,8 +297,20 @@ static int pseudo_lock_region_init(struct pseudo_lock_= region *plr) =20 plr->size =3D rdtgroup_cbm_to_size(plr->s->res, plr->d, plr->cbm); =20 + switch (plr->s->res->scope) { + case RESCTRL_L3_CACHE: + cache_level =3D 3; + break; + case RESCTRL_L2_CACHE: + cache_level =3D 2; + break; + default: + WARN_ON_ONCE(1); + return -ENODEV; + } + for (i =3D 0; i < ci->num_leaves; i++) { - if (ci->info_list[i].level =3D=3D plr->s->res->cache_level) { + if (ci->info_list[i].level =3D=3D cache_level) { plr->line_size =3D ci->info_list[i].coherency_line_size; return 0; } diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/r= esctrl/rdtgroup.c index 725344048f85..f510414bf6ce 100644 --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c +++ 
b/arch/x86/kernel/cpu/resctrl/rdtgroup.c @@ -1343,12 +1343,25 @@ unsigned int rdtgroup_cbm_to_size(struct rdt_resour= ce *r, { struct cpu_cacheinfo *ci; unsigned int size =3D 0; + int cache_level; int num_b, i; =20 + switch (r->scope) { + case RESCTRL_L3_CACHE: + cache_level =3D 3; + break; + case RESCTRL_L2_CACHE: + cache_level =3D 2; + break; + default: + WARN_ON_ONCE(1); + return size; + } + num_b =3D bitmap_weight(&cbm, r->cache.cbm_len); ci =3D get_cpu_cacheinfo(cpumask_any(&d->cpu_mask)); for (i =3D 0; i < ci->num_leaves; i++) { - if (ci->info_list[i].level =3D=3D r->cache_level) { + if (ci->info_list[i].level =3D=3D cache_level) { size =3D ci->info_list[i].size / r->cache.cbm_len * num_b; break; }
--=20
2.41.0

From nobody Tue Dec 16 02:35:51 2025
From: Tony Luck
To: Fenghua Yu, Reinette Chatre, Peter Newman, Jonathan Corbet, Shuah Khan, x86@kernel.org
Cc: Shaopeng Tan, James Morse, Jamie Iles, Babu Moger, Randy Dunlap, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, patches@lists.linux.dev, Tony Luck
Subject: [PATCH v5 2/8] x86/resctrl: Prepare for different scope for control/monitor operations
Date: Tue, 29 Aug 2023 16:44:20 -0700
Message-ID: <20230829234426.64421-3-tony.luck@intel.com>
X-Mailer: git-send-email 2.41.0
In-Reply-To: <20230829234426.64421-1-tony.luck@intel.com>
References: <20230722190740.326190-1-tony.luck@intel.com> <20230829234426.64421-1-tony.luck@intel.com>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Type: text/plain; charset="utf-8"

Existing resctrl assumes that control and monitor operations on a resource are performed at the same scope.

Prepare for systems that use different scopes (specifically L3 scope for cache control and NODE scope for cache occupancy and memory bandwidth monitoring).

Create separate domain lists for control and monitor operations.
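For illustration only (not part of this patch): a small userspace sketch of why one domain list stops being enough once the scopes differ. The 8-CPU topology, the table names and count_domains() are all made-up assumptions; the point is only that the same CPUs group into a different number of domains depending on which scope supplies the id.

```c
#define NR_CPUS 8

/* Hypothetical topology: one L3 cache shared by all CPUs, split across two nodes. */
static const int cpu_l3_id[NR_CPUS]   = { 0, 0, 0, 0, 0, 0, 0, 0 };
static const int cpu_node_id[NR_CPUS] = { 0, 0, 0, 0, 1, 1, 1, 1 };

/*
 * Count the distinct domain ids in a per-CPU id table.
 * Ids are assumed to lie in the range [0, NR_CPUS).
 */
static int count_domains(const int *ids)
{
	int seen[NR_CPUS] = { 0 };
	int cpu, n = 0;

	for (cpu = 0; cpu < NR_CPUS; cpu++) {
		if (!seen[ids[cpu]]) {
			seen[ids[cpu]] = 1;
			n++;
		}
	}
	return n;
}
```

With this made-up topology, the control side (L3 scope) sees a single domain while the monitor side (node scope) sees two, so the two sides cannot share one list of domains.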
No important functional change. But note that errors during initialization of either control or monitor functions on a domain would previously result in that domain being excluded from both control and monitor operations. Now that the domains are allocated independently, it is no longer required to disable both control and monitor operations if either fails.

Signed-off-by: Tony Luck --- include/linux/resctrl.h | 16 +- arch/x86/kernel/cpu/resctrl/internal.h | 6 +- arch/x86/kernel/cpu/resctrl/core.c | 227 +++++++++++++++------- arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 2 +- arch/x86/kernel/cpu/resctrl/monitor.c | 2 +- arch/x86/kernel/cpu/resctrl/pseudo_lock.c | 2 +- arch/x86/kernel/cpu/resctrl/rdtgroup.c | 32 +-- 7 files changed, 199 insertions(+), 88 deletions(-) diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h index 2db1244ae642..33856943a787 100644 --- a/include/linux/resctrl.h +++ b/include/linux/resctrl.h @@ -155,10 +155,12 @@ enum resctrl_scope { * @alloc_capable: Is allocation available on this machine * @mon_capable: Is monitor feature available on this machine * @num_rmid: Number of RMIDs available - * @scope: Scope of this resource + * @ctrl_scope: Scope of this resource for control functions + * @mon_scope: Scope of this resource for monitor functions * @cache: Cache allocation related data * @membw: If the component has bandwidth controls, their properties. - * @domains: All domains for this resource + * @domains: Control domains for this resource + * @mondomains: Monitor domains for this resource * @name: Name to use in "schemata" file. * @data_width: Character width of data when displaying * @default_ctrl: Specifies default cache cbm or memory B/W percent. 
@@ -173,10 +175,12 @@ struct rdt_resource { bool alloc_capable; bool mon_capable; int num_rmid; - enum resctrl_scope scope; + enum resctrl_scope ctrl_scope; + enum resctrl_scope mon_scope; struct resctrl_cache cache; struct resctrl_membw membw; struct list_head domains; + struct list_head mondomains; char *name; int data_width; u32 default_ctrl; @@ -222,8 +226,10 @@ int resctrl_arch_update_one(struct rdt_resource *r, st= ruct rdt_domain *d, =20 u32 resctrl_arch_get_config(struct rdt_resource *r, struct rdt_domain *d, u32 closid, enum resctrl_conf_type type); -int resctrl_online_domain(struct rdt_resource *r, struct rdt_domain *d); -void resctrl_offline_domain(struct rdt_resource *r, struct rdt_domain *d); +int resctrl_online_ctrl_domain(struct rdt_resource *r, struct rdt_domain *= d); +int resctrl_online_mon_domain(struct rdt_resource *r, struct rdt_domain *d= ); +void resctrl_offline_ctrl_domain(struct rdt_resource *r, struct rdt_domain= *d); +void resctrl_offline_mon_domain(struct rdt_resource *r, struct rdt_domain = *d); =20 /** * resctrl_arch_rmid_read() - Read the eventid counter corresponding to rm= id diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/r= esctrl/internal.h index 85ceaf9a31ac..31a5fc3b717f 100644 --- a/arch/x86/kernel/cpu/resctrl/internal.h +++ b/arch/x86/kernel/cpu/resctrl/internal.h @@ -511,8 +511,10 @@ void rdtgroup_kn_unlock(struct kernfs_node *kn); int rdtgroup_kn_mode_restrict(struct rdtgroup *r, const char *name); int rdtgroup_kn_mode_restore(struct rdtgroup *r, const char *name, umode_t mask); -struct rdt_domain *rdt_find_domain(struct rdt_resource *r, int id, - struct list_head **pos); +struct rdt_domain *rdt_find_ctrldomain(struct list_head *h, int id, + struct list_head **pos); +struct rdt_domain *rdt_find_mondomain(struct list_head *h, int id, + struct list_head **pos); ssize_t rdtgroup_schemata_write(struct kernfs_open_file *of, char *buf, size_t nbytes, loff_t off); int rdtgroup_schemata_show(struct 
kernfs_open_file *of, diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resct= rl/core.c index 0d3bae523ecb..97f6f9715fdb 100644 --- a/arch/x86/kernel/cpu/resctrl/core.c +++ b/arch/x86/kernel/cpu/resctrl/core.c @@ -57,7 +57,7 @@ static void mba_wrmsr_amd(struct rdt_domain *d, struct msr_param *m, struct rdt_resource *r); =20 -#define domain_init(id) LIST_HEAD_INIT(rdt_resources_all[id].r_resctrl.dom= ains) +#define domain_init(id, field) LIST_HEAD_INIT(rdt_resources_all[id].r_resc= trl.field) =20 struct rdt_hw_resource rdt_resources_all[] =3D { [RDT_RESOURCE_L3] =3D @@ -65,8 +65,10 @@ struct rdt_hw_resource rdt_resources_all[] =3D { .r_resctrl =3D { .rid =3D RDT_RESOURCE_L3, .name =3D "L3", - .scope =3D RESCTRL_L3_CACHE, - .domains =3D domain_init(RDT_RESOURCE_L3), + .ctrl_scope =3D RESCTRL_L3_CACHE, + .mon_scope =3D RESCTRL_L3_CACHE, + .domains =3D domain_init(RDT_RESOURCE_L3, domains), + .mondomains =3D domain_init(RDT_RESOURCE_L3, mondomains), .parse_ctrlval =3D parse_cbm, .format_str =3D "%d=3D%0*x", .fflags =3D RFTYPE_RES_CACHE, @@ -79,8 +81,8 @@ struct rdt_hw_resource rdt_resources_all[] =3D { .r_resctrl =3D { .rid =3D RDT_RESOURCE_L2, .name =3D "L2", - .scope =3D RESCTRL_L2_CACHE, - .domains =3D domain_init(RDT_RESOURCE_L2), + .ctrl_scope =3D RESCTRL_L2_CACHE, + .domains =3D domain_init(RDT_RESOURCE_L2, domains), .parse_ctrlval =3D parse_cbm, .format_str =3D "%d=3D%0*x", .fflags =3D RFTYPE_RES_CACHE, @@ -93,8 +95,8 @@ struct rdt_hw_resource rdt_resources_all[] =3D { .r_resctrl =3D { .rid =3D RDT_RESOURCE_MBA, .name =3D "MB", - .scope =3D RESCTRL_L3_CACHE, - .domains =3D domain_init(RDT_RESOURCE_MBA), + .ctrl_scope =3D RESCTRL_L3_CACHE, + .domains =3D domain_init(RDT_RESOURCE_MBA, domains), .parse_ctrlval =3D parse_bw, .format_str =3D "%d=3D%*u", .fflags =3D RFTYPE_RES_MB, @@ -105,8 +107,8 @@ struct rdt_hw_resource rdt_resources_all[] =3D { .r_resctrl =3D { .rid =3D RDT_RESOURCE_SMBA, .name =3D "SMBA", - .scope =3D RESCTRL_L3_CACHE, - 
.domains =3D domain_init(RDT_RESOURCE_SMBA), + .ctrl_scope =3D RESCTRL_L3_CACHE, + .domains =3D domain_init(RDT_RESOURCE_SMBA, domains), .parse_ctrlval =3D parse_bw, .format_str =3D "%d=3D%*u", .fflags =3D RFTYPE_RES_MB, @@ -384,15 +386,16 @@ void rdt_ctrl_update(void *arg) } =20 /* - * rdt_find_domain - Find a domain in a resource that matches input resour= ce id + * __rdt_find_domain - Find a domain in either the list of control or + * monitor domains that matches input resource id * * Search resource r's domain list to find the resource id. If the resource * id is found in a domain, return the domain. Otherwise, if requested by * caller, return the first domain whose id is bigger than the input id. * The domain list is sorted by id in ascending order. */ -struct rdt_domain *rdt_find_domain(struct rdt_resource *r, int id, - struct list_head **pos) +static void *__rdt_find_domain(struct list_head *h, int id, + struct list_head **pos) { struct rdt_domain *d; struct list_head *l; @@ -400,7 +403,7 @@ struct rdt_domain *rdt_find_domain(struct rdt_resource = *r, int id, if (id < 0) return ERR_PTR(-ENODEV); =20 - list_for_each(l, &r->domains) { + list_for_each(l, h) { d =3D list_entry(l, struct rdt_domain, list); /* When id is found, return its domain. 
*/ if (id =3D=3D d->id) @@ -416,6 +419,18 @@ struct rdt_domain *rdt_find_domain(struct rdt_resource= *r, int id, return NULL; } =20 +struct rdt_domain *rdt_find_ctrldomain(struct list_head *h, int id, + struct list_head **pos) +{ + return __rdt_find_domain(h, id, pos); +} + +struct rdt_domain *rdt_find_mondomain(struct list_head *h, int id, + struct list_head **pos) +{ + return __rdt_find_domain(h, id, pos); +} + static void setup_default_ctrlval(struct rdt_resource *r, u32 *dc) { struct rdt_hw_resource *hw_res =3D resctrl_to_arch_res(r); @@ -431,10 +446,15 @@ static void setup_default_ctrlval(struct rdt_resource= *r, u32 *dc) } =20 static void domain_free(struct rdt_hw_domain *hw_dom) +{ + kfree(hw_dom->ctrl_val); + kfree(hw_dom); +} + +static void mondomain_free(struct rdt_hw_domain *hw_dom) { kfree(hw_dom->arch_mbm_total); kfree(hw_dom->arch_mbm_local); - kfree(hw_dom->ctrl_val); kfree(hw_dom); } =20 @@ -502,6 +522,93 @@ static int get_domain_id_from_scope(int cpu, enum resc= trl_scope scope) return -1; } =20 +static void domain_add_cpu_ctrl(int cpu, struct rdt_resource *r) +{ + int id =3D get_domain_id_from_scope(cpu, r->ctrl_scope); + struct list_head *add_pos =3D NULL; + struct rdt_hw_domain *hw_dom; + struct rdt_domain *d; + int err; + + d =3D rdt_find_ctrldomain(&r->domains, id, &add_pos); + if (IS_ERR(d)) { + pr_warn("Couldn't find scope id=3D%d for CPU %d\n", id, cpu); + return; + } + + if (d) { + cpumask_set_cpu(cpu, &d->cpu_mask); + if (r->cache.arch_has_per_cpu_cfg) + rdt_domain_reconfigure_cdp(r); + return; + } + + hw_dom =3D kzalloc_node(sizeof(*hw_dom), GFP_KERNEL, cpu_to_node(cpu)); + if (!hw_dom) + return; + + d =3D &hw_dom->d_resctrl; + d->id =3D id; + cpumask_set_cpu(cpu, &d->cpu_mask); + + rdt_domain_reconfigure_cdp(r); + + if (domain_setup_ctrlval(r, d)) { + domain_free(hw_dom); + return; + } + + list_add_tail(&d->list, add_pos); + + err =3D resctrl_online_ctrl_domain(r, d); + if (err) { + list_del(&d->list); + domain_free(hw_dom); + } +} + 
+static void domain_add_cpu_mon(int cpu, struct rdt_resource *r) +{ + int id =3D get_domain_id_from_scope(cpu, r->mon_scope); + struct rdt_hw_domain *hw_mondom; + struct list_head *add_pos =3D NULL; + struct rdt_domain *d; + int err; + + d =3D rdt_find_mondomain(&r->mondomains, id, &add_pos); + if (IS_ERR(d)) { + pr_warn("Couldn't find scope id=3D%d for CPU %d\n", id, cpu); + return; + } + + if (d) { + cpumask_set_cpu(cpu, &d->cpu_mask); + + return; + } + + hw_mondom =3D kzalloc_node(sizeof(*hw_mondom), GFP_KERNEL, cpu_to_node(cp= u)); + if (!hw_mondom) + return; + + d =3D &hw_mondom->d_resctrl; + d->id =3D id; + cpumask_set_cpu(cpu, &d->cpu_mask); + + if (arch_domain_mbm_alloc(r->num_rmid, hw_mondom)) { + mondomain_free(hw_mondom); + return; + } + + list_add_tail(&d->list, add_pos); + + err =3D resctrl_online_mon_domain(r, d); + if (err) { + list_del(&d->list); + mondomain_free(hw_mondom); + } +} + /* * domain_add_cpu - Add a cpu to a resource's domain list. * @@ -517,70 +624,28 @@ static int get_domain_id_from_scope(int cpu, enum res= ctrl_scope scope) */ static void domain_add_cpu(int cpu, struct rdt_resource *r) { - int id =3D get_domain_id_from_scope(cpu, r->scope); - struct list_head *add_pos =3D NULL; - struct rdt_hw_domain *hw_dom; - struct rdt_domain *d; - int err; - - d =3D rdt_find_domain(r, id, &add_pos); - if (IS_ERR(d)) { - pr_warn("Couldn't find cache id for CPU %d\n", cpu); - return; - } - - if (d) { - cpumask_set_cpu(cpu, &d->cpu_mask); - if (r->cache.arch_has_per_cpu_cfg) - rdt_domain_reconfigure_cdp(r); - return; - } - - hw_dom =3D kzalloc_node(sizeof(*hw_dom), GFP_KERNEL, cpu_to_node(cpu)); - if (!hw_dom) - return; - - d =3D &hw_dom->d_resctrl; - d->id =3D id; - cpumask_set_cpu(cpu, &d->cpu_mask); - - rdt_domain_reconfigure_cdp(r); - - if (r->alloc_capable && domain_setup_ctrlval(r, d)) { - domain_free(hw_dom); - return; - } - - if (r->mon_capable && arch_domain_mbm_alloc(r->num_rmid, hw_dom)) { - domain_free(hw_dom); - return; - } - - 
list_add_tail(&d->list, add_pos); - - err =3D resctrl_online_domain(r, d); - if (err) { - list_del(&d->list); - domain_free(hw_dom); - } + if (r->alloc_capable) + domain_add_cpu_ctrl(cpu, r); + if (r->mon_capable) + domain_add_cpu_mon(cpu, r); } =20 -static void domain_remove_cpu(int cpu, struct rdt_resource *r) +static void domain_remove_cpu_ctrl(int cpu, struct rdt_resource *r) { - int id =3D get_domain_id_from_scope(cpu, r->scope); + int id =3D get_domain_id_from_scope(cpu, r->ctrl_scope); struct rdt_hw_domain *hw_dom; struct rdt_domain *d; =20 - d =3D rdt_find_domain(r, id, NULL); + d =3D rdt_find_ctrldomain(&r->domains, id, NULL); if (IS_ERR_OR_NULL(d)) { - pr_warn("Couldn't find cache id for CPU %d\n", cpu); + pr_warn("Couldn't find scope id=3D%d for CPU %d\n", id, cpu); return; } hw_dom =3D resctrl_to_arch_dom(d); =20 cpumask_clear_cpu(cpu, &d->cpu_mask); if (cpumask_empty(&d->cpu_mask)) { - resctrl_offline_domain(r, d); + resctrl_offline_ctrl_domain(r, d); list_del(&d->list); =20 /* @@ -593,6 +658,30 @@ static void domain_remove_cpu(int cpu, struct rdt_reso= urce *r) =20 return; } +} + +static void domain_remove_cpu_mon(int cpu, struct rdt_resource *r) +{ + int id =3D get_domain_id_from_scope(cpu, r->mon_scope); + struct rdt_hw_domain *hw_mondom; + struct rdt_domain *d; + + d =3D rdt_find_mondomain(&r->mondomains, id, NULL); + if (IS_ERR_OR_NULL(d)) { + pr_warn("Couldn't find scope id=3D%d for CPU %d\n", id, cpu); + return; + } + hw_mondom =3D resctrl_to_arch_dom(d); + + cpumask_clear_cpu(cpu, &d->cpu_mask); + if (cpumask_empty(&d->cpu_mask)) { + resctrl_offline_mon_domain(r, d); + list_del(&d->list); + + mondomain_free(hw_mondom); + + return; + } =20 if (r =3D=3D &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl) { if (is_mbm_enabled() && cpu =3D=3D d->mbm_work_cpu) { @@ -607,6 +696,14 @@ static void domain_remove_cpu(int cpu, struct rdt_reso= urce *r) } } =20 +static void domain_remove_cpu(int cpu, struct rdt_resource *r) +{ + if (r->alloc_capable) + 
domain_remove_cpu_ctrl(cpu, r); + if (r->mon_capable) + domain_remove_cpu_mon(cpu, r); +} + static void clear_closid_rmid(int cpu) { struct resctrl_pqr_state *state =3D this_cpu_ptr(&pqr_state); diff --git a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c b/arch/x86/kernel/cp= u/resctrl/ctrlmondata.c index b44c487727d4..468c1815edfd 100644 --- a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c +++ b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c @@ -560,7 +560,7 @@ int rdtgroup_mondata_show(struct seq_file *m, void *arg) evtid =3D md.u.evtid; =20 r =3D &rdt_resources_all[resid].r_resctrl; - d =3D rdt_find_domain(r, domid, NULL); + d =3D rdt_find_mondomain(&r->mondomains, domid, NULL); if (IS_ERR_OR_NULL(d)) { ret =3D -ENOENT; goto out; diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/re= sctrl/monitor.c index ded1fc7cb7cb..66beca785535 100644 --- a/arch/x86/kernel/cpu/resctrl/monitor.c +++ b/arch/x86/kernel/cpu/resctrl/monitor.c @@ -340,7 +340,7 @@ static void add_rmid_to_limbo(struct rmid_entry *entry) =20 entry->busy =3D 0; cpu =3D get_cpu(); - list_for_each_entry(d, &r->domains, list) { + list_for_each_entry(d, &r->mondomains, list) { if (cpumask_test_cpu(cpu, &d->cpu_mask)) { err =3D resctrl_arch_rmid_read(r, d, entry->rmid, QOS_L3_OCCUP_EVENT_ID, diff --git a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c b/arch/x86/kernel/cp= u/resctrl/pseudo_lock.c index e79324676f57..be8b5f28e638 100644 --- a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c +++ b/arch/x86/kernel/cpu/resctrl/pseudo_lock.c @@ -297,7 +297,7 @@ static int pseudo_lock_region_init(struct pseudo_lock_r= egion *plr) =20 plr->size =3D rdtgroup_cbm_to_size(plr->s->res, plr->d, plr->cbm); =20 - switch (plr->s->res->scope) { + switch (plr->s->res->ctrl_scope) { case RESCTRL_L3_CACHE: cache_level =3D 3; break; diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/r= esctrl/rdtgroup.c index f510414bf6ce..f2aec39c49df 100644 --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c +++ 
b/arch/x86/kernel/cpu/resctrl/rdtgroup.c @@ -1346,7 +1346,7 @@ unsigned int rdtgroup_cbm_to_size(struct rdt_resource= *r, int cache_level; int num_b, i; =20 - switch (r->scope) { + switch (r->ctrl_scope) { case RESCTRL_L3_CACHE: cache_level =3D 3; break; @@ -1509,7 +1509,7 @@ static int mbm_config_show(struct seq_file *s, struct= rdt_resource *r, u32 evtid =20 mutex_lock(&rdtgroup_mutex); =20 - list_for_each_entry(dom, &r->domains, list) { + list_for_each_entry(dom, &r->mondomains, list) { if (sep) seq_puts(s, ";"); =20 @@ -1632,7 +1632,7 @@ static int mon_config_write(struct rdt_resource *r, c= har *tok, u32 evtid) return -EINVAL; } =20 - list_for_each_entry(d, &r->domains, list) { + list_for_each_entry(d, &r->mondomains, list) { if (d->id =3D=3D dom_id) { ret =3D mbm_config_write_domain(r, d, evtid, val); if (ret) @@ -2538,7 +2538,7 @@ static int rdt_get_tree(struct fs_context *fc) =20 if (is_mbm_enabled()) { r =3D &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl; - list_for_each_entry(dom, &r->domains, list) + list_for_each_entry(dom, &r->mondomains, list) mbm_setup_overflow_handler(dom, MBM_OVERFLOW_INTERVAL); } =20 @@ -2932,7 +2932,7 @@ static int mkdir_mondata_subdir_alldom(struct kernfs_= node *parent_kn, struct rdt_domain *dom; int ret; =20 - list_for_each_entry(dom, &r->domains, list) { + list_for_each_entry(dom, &r->mondomains, list) { ret =3D mkdir_mondata_subdir(parent_kn, dom, r, prgrp); if (ret) return ret; @@ -3721,15 +3721,17 @@ static void domain_destroy_mon_state(struct rdt_dom= ain *d) kfree(d->mbm_local); } =20 -void resctrl_offline_domain(struct rdt_resource *r, struct rdt_domain *d) +void resctrl_offline_ctrl_domain(struct rdt_resource *r, struct rdt_domain= *d) { lockdep_assert_held(&rdtgroup_mutex); =20 if (supports_mba_mbps() && r->rid =3D=3D RDT_RESOURCE_MBA) mba_sc_domain_destroy(r, d); +} =20 - if (!r->mon_capable) - return; +void resctrl_offline_mon_domain(struct rdt_resource *r, struct rdt_domain = *d) +{ + 
lockdep_assert_held(&rdtgroup_mutex); =20 /* * If resctrl is mounted, remove all the @@ -3786,18 +3788,22 @@ static int domain_setup_mon_state(struct rdt_resour= ce *r, struct rdt_domain *d) return 0; } =20 -int resctrl_online_domain(struct rdt_resource *r, struct rdt_domain *d) +int resctrl_online_ctrl_domain(struct rdt_resource *r, struct rdt_domain *= d) { - int err; - lockdep_assert_held(&rdtgroup_mutex); =20 if (supports_mba_mbps() && r->rid =3D=3D RDT_RESOURCE_MBA) /* RDT_RESOURCE_MBA is never mon_capable */ return mba_sc_domain_allocate(r, d); =20 - if (!r->mon_capable) - return 0; + return 0; +} + +int resctrl_online_mon_domain(struct rdt_resource *r, struct rdt_domain *d) +{ + int err; + + lockdep_assert_held(&rdtgroup_mutex); =20 err =3D domain_setup_mon_state(r, d); if (err)
--=20
2.41.0

From nobody Tue Dec 16 02:35:51 2025
From: Tony Luck
To: Fenghua Yu, Reinette Chatre, Peter Newman, Jonathan Corbet, Shuah Khan, x86@kernel.org
Cc: Shaopeng Tan, James Morse, Jamie Iles, Babu Moger, Randy Dunlap, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, patches@lists.linux.dev, Tony Luck
Subject: [PATCH v5 3/8] x86/resctrl: Split the rdt_domain structure
Date: Tue, 29 Aug 2023 16:44:21 -0700
Message-ID: <20230829234426.64421-4-tony.luck@intel.com>
X-Mailer: git-send-email 2.41.0
In-Reply-To: <20230829234426.64421-1-tony.luck@intel.com>
References: <20230722190740.326190-1-tony.luck@intel.com> <20230829234426.64421-1-tony.luck@intel.com>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Type: text/plain; charset="utf-8"

The same rdt_domain structure is used for both control and monitor functions. But this results in wasted memory as some of the fields are only used by control functions, while most are only used for monitor functions.
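For illustration only (not part of this patch): a rough userspace sketch of one way to let a single lookup routine serve both lists while each structure carries only its own state. All type and function names here (list_node, domain_hdr, ctrl_domain, mon_domain, find_domain) are hypothetical. The sketch embeds a shared leading header, whereas the diff in this patch keeps the first three fields of the two structures identical by convention.

```c
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct list_head. */
struct list_node {
	struct list_node *next, *prev;
};

/* Fields common to both kinds of domain, kept in one place. */
struct domain_hdr {
	struct list_node list;
	int id;
	unsigned long cpu_mask;	/* simplified: one bit per CPU */
};

/* Control-side domain: header plus control-only state. */
struct ctrl_domain {
	struct domain_hdr hdr;
	unsigned int *mbps_val;
};

/* Monitor-side domain: header plus monitor-only state. */
struct mon_domain {
	struct domain_hdr hdr;
	unsigned long *rmid_busy_llc;
	int mbm_work_cpu;
	int cqm_work_cpu;
};

/*
 * Look up a domain by id in an array of header pointers sorted by id
 * in ascending order. Returns the matching header, or NULL if the id
 * is negative or not present.
 */
static struct domain_hdr *find_domain(struct domain_hdr **hdrs, size_t n, int id)
{
	size_t i;

	if (id < 0)
		return NULL;

	for (i = 0; i < n; i++) {
		if (hdrs[i]->id == id)
			return hdrs[i];
		if (hdrs[i]->id > id)
			break;	/* sorted ascending: no later entry can match */
	}
	return NULL;
}
```

Because the header is the first member of each structure, a pointer to either structure converts cleanly to a pointer to its header, so find_domain() never needs to know which kind of domain it is walking.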
Create a new rdt_mondomain structure tailored for use in the monitor parts of the core. Slim down the rdt_domain structure by removing the monitor fields it no longer needs. Similarly, break out struct rdt_hw_mondomain from struct rdt_hw_domain.

Signed-off-by: Tony Luck --- include/linux/resctrl.h | 46 +++++++++++++++-------- arch/x86/kernel/cpu/resctrl/internal.h | 38 +++++++++++++------ arch/x86/kernel/cpu/resctrl/core.c | 18 ++++----- arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 4 +- arch/x86/kernel/cpu/resctrl/monitor.c | 40 ++++++++++---------- arch/x86/kernel/cpu/resctrl/rdtgroup.c | 24 ++++++------ 6 files changed, 101 insertions(+), 69 deletions(-) diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h index 33856943a787..08382548571e 100644 --- a/include/linux/resctrl.h +++ b/include/linux/resctrl.h @@ -53,7 +53,29 @@ struct resctrl_staged_config { }; =20 /** - * struct rdt_domain - group of CPUs sharing a resctrl resource + * struct rdt_domain - group of CPUs sharing a resctrl control resource + * @list: all instances of this resource + * @id: unique id for this instance + * @cpu_mask: which CPUs share this resource + * @plr: pseudo-locked region (if any) associated with domain + * @staged_config: parsed configuration to be applied + * @mbps_val: When mba_sc is enabled, this holds the array of user + * specified control values for mba_sc in MBps, indexed + * by closid + */ +struct rdt_domain { + // First three fields must match struct rdt_mondomain below.
+ struct list_head list; + int id; + struct cpumask cpu_mask; + + struct pseudo_lock_region *plr; + struct resctrl_staged_config staged_config[CDP_NUM_TYPES]; + u32 *mbps_val; +}; + +/** + * struct rdt_mondomain - group of CPUs sharing a resctrl monitor resource * @list: all instances of this resource * @id: unique id for this instance * @cpu_mask: which CPUs share this resource @@ -64,16 +86,13 @@ struct resctrl_staged_config { * @cqm_limbo: worker to periodically read CQM h/w counters * @mbm_work_cpu: worker CPU for MBM h/w counters * @cqm_work_cpu: worker CPU for CQM h/w counters - * @plr: pseudo-locked region (if any) associated with domain - * @staged_config: parsed configuration to be applied - * @mbps_val: When mba_sc is enabled, this holds the array of user - * specified control values for mba_sc in MBps, indexed - * by closid */ -struct rdt_domain { +struct rdt_mondomain { + // First three fields must match struct rdt_domain above. struct list_head list; int id; struct cpumask cpu_mask; + unsigned long *rmid_busy_llc; struct mbm_state *mbm_total; struct mbm_state *mbm_local; @@ -81,9 +100,6 @@ struct rdt_domain { struct delayed_work cqm_limbo; int mbm_work_cpu; int cqm_work_cpu; - struct pseudo_lock_region *plr; - struct resctrl_staged_config staged_config[CDP_NUM_TYPES]; - u32 *mbps_val; }; =20 /** @@ -227,9 +243,9 @@ int resctrl_arch_update_one(struct rdt_resource *r, str= uct rdt_domain *d, u32 resctrl_arch_get_config(struct rdt_resource *r, struct rdt_domain *d, u32 closid, enum resctrl_conf_type type); int resctrl_online_ctrl_domain(struct rdt_resource *r, struct rdt_domain *= d); -int resctrl_online_mon_domain(struct rdt_resource *r, struct rdt_domain *d= ); +int resctrl_online_mon_domain(struct rdt_resource *r, struct rdt_mondomain= *d); void resctrl_offline_ctrl_domain(struct rdt_resource *r, struct rdt_domain= *d); -void resctrl_offline_mon_domain(struct rdt_resource *r, struct rdt_domain = *d); +void resctrl_offline_mon_domain(struct rdt_resource 
*r, struct rdt_mondoma= in *d); =20 /** * resctrl_arch_rmid_read() - Read the eventid counter corresponding to rm= id @@ -245,7 +261,7 @@ void resctrl_offline_mon_domain(struct rdt_resource *r,= struct rdt_domain *d); * Return: * 0 on success, or -EIO, -EINVAL etc on error. */ -int resctrl_arch_rmid_read(struct rdt_resource *r, struct rdt_domain *d, +int resctrl_arch_rmid_read(struct rdt_resource *r, struct rdt_mondomain *d, u32 rmid, enum resctrl_event_id eventid, u64 *val); =20 /** @@ -258,7 +274,7 @@ int resctrl_arch_rmid_read(struct rdt_resource *r, stru= ct rdt_domain *d, * * This can be called from any CPU. */ -void resctrl_arch_reset_rmid(struct rdt_resource *r, struct rdt_domain *d, +void resctrl_arch_reset_rmid(struct rdt_resource *r, struct rdt_mondomain = *d, u32 rmid, enum resctrl_event_id eventid); =20 /** @@ -270,7 +286,7 @@ void resctrl_arch_reset_rmid(struct rdt_resource *r, st= ruct rdt_domain *d, * * This can be called from any CPU. */ -void resctrl_arch_reset_rmid_all(struct rdt_resource *r, struct rdt_domain= *d); +void resctrl_arch_reset_rmid_all(struct rdt_resource *r, struct rdt_mondom= ain *d); =20 extern unsigned int resctrl_rmid_realloc_threshold; extern unsigned int resctrl_rmid_realloc_limit; diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/r= esctrl/internal.h index 31a5fc3b717f..c61fd6709730 100644 --- a/arch/x86/kernel/cpu/resctrl/internal.h +++ b/arch/x86/kernel/cpu/resctrl/internal.h @@ -106,7 +106,7 @@ union mon_data_bits { struct rmid_read { struct rdtgroup *rgrp; struct rdt_resource *r; - struct rdt_domain *d; + struct rdt_mondomain *d; enum resctrl_event_id evtid; bool first; int err; @@ -320,17 +320,28 @@ struct arch_mbm_state { =20 /** * struct rdt_hw_domain - Arch private attributes of a set of CPUs that sh= are - * a resource + * a control resource * @d_resctrl: Properties exposed to the resctrl file system * @ctrl_val: array of cache or mem ctrl values (indexed by CLOSID) - * @arch_mbm_total: arch 
private state for MBM total bandwidth - * @arch_mbm_local: arch private state for MBM local bandwidth * * Members of this structure are accessed via helpers that provide abstrac= tion. */ struct rdt_hw_domain { struct rdt_domain d_resctrl; u32 *ctrl_val; +}; + +/** + * struct rdt_hw_mondomain - Arch private attributes of a set of CPUs that= share + * a monitor resource + * @d_resctrl: Properties exposed to the resctrl file system + * @arch_mbm_total: arch private state for MBM total bandwidth + * @arch_mbm_local: arch private state for MBM local bandwidth + * + * Members of this structure are accessed via helpers that provide abstrac= tion. + */ +struct rdt_hw_mondomain { + struct rdt_mondomain d_resctrl; struct arch_mbm_state *arch_mbm_total; struct arch_mbm_state *arch_mbm_local; }; @@ -340,6 +351,11 @@ static inline struct rdt_hw_domain *resctrl_to_arch_do= m(struct rdt_domain *r) return container_of(r, struct rdt_hw_domain, d_resctrl); } =20 +static inline struct rdt_hw_mondomain *resctrl_to_arch_mondom(struct rdt_m= ondomain *r) +{ + return container_of(r, struct rdt_hw_mondomain, d_resctrl); +} + /** * struct msr_param - set a range of MSRs from a domain * @res: The resource to use @@ -513,8 +529,8 @@ int rdtgroup_kn_mode_restore(struct rdtgroup *r, const = char *name, umode_t mask); struct rdt_domain *rdt_find_ctrldomain(struct list_head *h, int id, struct list_head **pos); -struct rdt_domain *rdt_find_mondomain(struct list_head *h, int id, - struct list_head **pos); +struct rdt_mondomain *rdt_find_mondomain(struct list_head *h, int id, + struct list_head **pos); ssize_t rdtgroup_schemata_write(struct kernfs_open_file *of, char *buf, size_t nbytes, loff_t off); int rdtgroup_schemata_show(struct kernfs_open_file *of, @@ -543,17 +559,17 @@ bool __init rdt_cpu_has(int flag); void mon_event_count(void *info); int rdtgroup_mondata_show(struct seq_file *m, void *arg); void mon_event_read(struct rmid_read *rr, struct rdt_resource *r, - struct rdt_domain *d, struct 
rdtgroup *rdtgrp, + struct rdt_mondomain *d, struct rdtgroup *rdtgrp, int evtid, int first); -void mbm_setup_overflow_handler(struct rdt_domain *dom, +void mbm_setup_overflow_handler(struct rdt_mondomain *dom, unsigned long delay_ms); void mbm_handle_overflow(struct work_struct *work); void __init intel_rdt_mbm_apply_quirk(void); bool is_mba_sc(struct rdt_resource *r); -void cqm_setup_limbo_handler(struct rdt_domain *dom, unsigned long delay_m= s); +void cqm_setup_limbo_handler(struct rdt_mondomain *dom, unsigned long dela= y_ms); void cqm_handle_limbo(struct work_struct *work); -bool has_busy_rmid(struct rdt_resource *r, struct rdt_domain *d); -void __check_limbo(struct rdt_domain *d, bool force_free); +bool has_busy_rmid(struct rdt_resource *r, struct rdt_mondomain *d); +void __check_limbo(struct rdt_mondomain *d, bool force_free); void rdt_domain_reconfigure_cdp(struct rdt_resource *r); void __init thread_throttle_mode_init(void); void __init mbm_config_rftype_init(const char *config); diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resct= rl/core.c index 97f6f9715fdb..3e08aa04a7ff 100644 --- a/arch/x86/kernel/cpu/resctrl/core.c +++ b/arch/x86/kernel/cpu/resctrl/core.c @@ -425,8 +425,8 @@ struct rdt_domain *rdt_find_ctrldomain(struct list_head= *h, int id, return __rdt_find_domain(h, id, pos); } =20 -struct rdt_domain *rdt_find_mondomain(struct list_head *h, int id, - struct list_head **pos) +struct rdt_mondomain *rdt_find_mondomain(struct list_head *h, int id, + struct list_head **pos) { return __rdt_find_domain(h, id, pos); } @@ -451,7 +451,7 @@ static void domain_free(struct rdt_hw_domain *hw_dom) kfree(hw_dom); } =20 -static void mondomain_free(struct rdt_hw_domain *hw_dom) +static void mondomain_free(struct rdt_hw_mondomain *hw_dom) { kfree(hw_dom->arch_mbm_total); kfree(hw_dom->arch_mbm_local); @@ -484,7 +484,7 @@ static int domain_setup_ctrlval(struct rdt_resource *r,= struct rdt_domain *d) * @num_rmid: The size of the MBM counter 
array * @hw_dom: The domain that owns the allocated arrays */ -static int arch_domain_mbm_alloc(u32 num_rmid, struct rdt_hw_domain *hw_do= m) +static int arch_domain_mbm_alloc(u32 num_rmid, struct rdt_hw_mondomain *hw= _dom) { size_t tsize; =20 @@ -570,9 +570,9 @@ static void domain_add_cpu_ctrl(int cpu, struct rdt_res= ource *r) static void domain_add_cpu_mon(int cpu, struct rdt_resource *r) { int id =3D get_domain_id_from_scope(cpu, r->mon_scope); - struct rdt_hw_domain *hw_mondom; + struct rdt_hw_mondomain *hw_mondom; struct list_head *add_pos =3D NULL; - struct rdt_domain *d; + struct rdt_mondomain *d; int err; =20 d =3D rdt_find_mondomain(&r->mondomains, id, &add_pos); @@ -663,15 +663,15 @@ static void domain_remove_cpu_ctrl(int cpu, struct rd= t_resource *r) static void domain_remove_cpu_mon(int cpu, struct rdt_resource *r) { int id =3D get_domain_id_from_scope(cpu, r->mon_scope); - struct rdt_hw_domain *hw_mondom; - struct rdt_domain *d; + struct rdt_hw_mondomain *hw_mondom; + struct rdt_mondomain *d; =20 d =3D rdt_find_mondomain(&r->mondomains, id, NULL); if (IS_ERR_OR_NULL(d)) { pr_warn("Couldn't find scope id=3D%d for CPU %d\n", id, cpu); return; } - hw_mondom =3D resctrl_to_arch_dom(d); + hw_mondom =3D resctrl_to_arch_mondom(d); =20 cpumask_clear_cpu(cpu, &d->cpu_mask); if (cpumask_empty(&d->cpu_mask)) { diff --git a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c b/arch/x86/kernel/cp= u/resctrl/ctrlmondata.c index 468c1815edfd..5167ac9cbe98 100644 --- a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c +++ b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c @@ -521,7 +521,7 @@ int rdtgroup_schemata_show(struct kernfs_open_file *of, } =20 void mon_event_read(struct rmid_read *rr, struct rdt_resource *r, - struct rdt_domain *d, struct rdtgroup *rdtgrp, + struct rdt_mondomain *d, struct rdtgroup *rdtgrp, int evtid, int first) { /* @@ -544,7 +544,7 @@ int rdtgroup_mondata_show(struct seq_file *m, void *arg) struct rdtgroup *rdtgrp; struct rdt_resource *r; union mon_data_bits md; 
- struct rdt_domain *d; + struct rdt_mondomain *d; struct rmid_read rr; int ret =3D 0; =20 diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/re= sctrl/monitor.c index 66beca785535..42262d59ef9b 100644 --- a/arch/x86/kernel/cpu/resctrl/monitor.c +++ b/arch/x86/kernel/cpu/resctrl/monitor.c @@ -170,7 +170,7 @@ static int __rmid_read(u32 rmid, enum resctrl_event_id = eventid, u64 *val) return 0; } =20 -static struct arch_mbm_state *get_arch_mbm_state(struct rdt_hw_domain *hw_= dom, +static struct arch_mbm_state *get_arch_mbm_state(struct rdt_hw_mondomain *= hw_dom, u32 rmid, enum resctrl_event_id eventid) { @@ -189,10 +189,10 @@ static struct arch_mbm_state *get_arch_mbm_state(stru= ct rdt_hw_domain *hw_dom, return NULL; } =20 -void resctrl_arch_reset_rmid(struct rdt_resource *r, struct rdt_domain *d, +void resctrl_arch_reset_rmid(struct rdt_resource *r, struct rdt_mondomain = *d, u32 rmid, enum resctrl_event_id eventid) { - struct rdt_hw_domain *hw_dom =3D resctrl_to_arch_dom(d); + struct rdt_hw_mondomain *hw_dom =3D resctrl_to_arch_mondom(d); struct arch_mbm_state *am; =20 am =3D get_arch_mbm_state(hw_dom, rmid, eventid); @@ -208,9 +208,9 @@ void resctrl_arch_reset_rmid(struct rdt_resource *r, st= ruct rdt_domain *d, * Assumes that hardware counters are also reset and thus that there is * no need to record initial non-zero counts. 
*/ -void resctrl_arch_reset_rmid_all(struct rdt_resource *r, struct rdt_domain= *d) +void resctrl_arch_reset_rmid_all(struct rdt_resource *r, struct rdt_mondom= ain *d) { - struct rdt_hw_domain *hw_dom =3D resctrl_to_arch_dom(d); + struct rdt_hw_mondomain *hw_dom =3D resctrl_to_arch_mondom(d); =20 if (is_mbm_total_enabled()) memset(hw_dom->arch_mbm_total, 0, @@ -229,11 +229,11 @@ static u64 mbm_overflow_count(u64 prev_msr, u64 cur_m= sr, unsigned int width) return chunks >> shift; } =20 -int resctrl_arch_rmid_read(struct rdt_resource *r, struct rdt_domain *d, +int resctrl_arch_rmid_read(struct rdt_resource *r, struct rdt_mondomain *d, u32 rmid, enum resctrl_event_id eventid, u64 *val) { struct rdt_hw_resource *hw_res =3D resctrl_to_arch_res(r); - struct rdt_hw_domain *hw_dom =3D resctrl_to_arch_dom(d); + struct rdt_hw_mondomain *hw_mondom =3D resctrl_to_arch_mondom(d); struct arch_mbm_state *am; u64 msr_val, chunks; int ret; @@ -245,7 +245,7 @@ int resctrl_arch_rmid_read(struct rdt_resource *r, stru= ct rdt_domain *d, if (ret) return ret; =20 - am =3D get_arch_mbm_state(hw_dom, rmid, eventid); + am =3D get_arch_mbm_state(hw_mondom, rmid, eventid); if (am) { am->chunks +=3D mbm_overflow_count(am->prev_msr, msr_val, hw_res->mbm_width); @@ -266,7 +266,7 @@ int resctrl_arch_rmid_read(struct rdt_resource *r, stru= ct rdt_domain *d, * decrement the count. 
If the busy count gets to zero on an RMID, we * free the RMID */ -void __check_limbo(struct rdt_domain *d, bool force_free) +void __check_limbo(struct rdt_mondomain *d, bool force_free) { struct rdt_resource *r =3D &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl; struct rmid_entry *entry; @@ -305,7 +305,7 @@ void __check_limbo(struct rdt_domain *d, bool force_fre= e) } } =20 -bool has_busy_rmid(struct rdt_resource *r, struct rdt_domain *d) +bool has_busy_rmid(struct rdt_resource *r, struct rdt_mondomain *d) { return find_first_bit(d->rmid_busy_llc, r->num_rmid) !=3D r->num_rmid; } @@ -334,7 +334,7 @@ int alloc_rmid(void) static void add_rmid_to_limbo(struct rmid_entry *entry) { struct rdt_resource *r =3D &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl; - struct rdt_domain *d; + struct rdt_mondomain *d; int cpu, err; u64 val =3D 0; =20 @@ -383,7 +383,7 @@ void free_rmid(u32 rmid) list_add_tail(&entry->list, &rmid_free_lru); } =20 -static struct mbm_state *get_mbm_state(struct rdt_domain *d, u32 rmid, +static struct mbm_state *get_mbm_state(struct rdt_mondomain *d, u32 rmid, enum resctrl_event_id evtid) { switch (evtid) { @@ -516,7 +516,7 @@ void mon_event_count(void *info) * throttle MSRs already have low percentage values. To avoid * unnecessarily restricting such rdtgroups, we also increase the bandwidt= h. 
*/ -static void update_mba_bw(struct rdtgroup *rgrp, struct rdt_domain *dom_mb= m) +static void update_mba_bw(struct rdtgroup *rgrp, struct rdt_mondomain *dom= _mbm) { u32 closid, rmid, cur_msr_val, new_msr_val; struct mbm_state *pmbm_data, *cmbm_data; @@ -600,7 +600,7 @@ static void update_mba_bw(struct rdtgroup *rgrp, struct= rdt_domain *dom_mbm) } } =20 -static void mbm_update(struct rdt_resource *r, struct rdt_domain *d, int r= mid) +static void mbm_update(struct rdt_resource *r, struct rdt_mondomain *d, in= t rmid) { struct rmid_read rr; =20 @@ -641,12 +641,12 @@ void cqm_handle_limbo(struct work_struct *work) unsigned long delay =3D msecs_to_jiffies(CQM_LIMBOCHECK_INTERVAL); int cpu =3D smp_processor_id(); struct rdt_resource *r; - struct rdt_domain *d; + struct rdt_mondomain *d; =20 mutex_lock(&rdtgroup_mutex); =20 r =3D &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl; - d =3D container_of(work, struct rdt_domain, cqm_limbo.work); + d =3D container_of(work, struct rdt_mondomain, cqm_limbo.work); =20 __check_limbo(d, false); =20 @@ -656,7 +656,7 @@ void cqm_handle_limbo(struct work_struct *work) mutex_unlock(&rdtgroup_mutex); } =20 -void cqm_setup_limbo_handler(struct rdt_domain *dom, unsigned long delay_m= s) +void cqm_setup_limbo_handler(struct rdt_mondomain *dom, unsigned long dela= y_ms) { unsigned long delay =3D msecs_to_jiffies(delay_ms); int cpu; @@ -674,7 +674,7 @@ void mbm_handle_overflow(struct work_struct *work) int cpu =3D smp_processor_id(); struct list_head *head; struct rdt_resource *r; - struct rdt_domain *d; + struct rdt_mondomain *d; =20 mutex_lock(&rdtgroup_mutex); =20 @@ -682,7 +682,7 @@ void mbm_handle_overflow(struct work_struct *work) goto out_unlock; =20 r =3D &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl; - d =3D container_of(work, struct rdt_domain, mbm_over.work); + d =3D container_of(work, struct rdt_mondomain, mbm_over.work); =20 list_for_each_entry(prgrp, &rdt_all_groups, rdtgroup_list) { mbm_update(r, d, prgrp->mon.rmid); @@ 
-701,7 +701,7 @@ void mbm_handle_overflow(struct work_struct *work) mutex_unlock(&rdtgroup_mutex); } =20 -void mbm_setup_overflow_handler(struct rdt_domain *dom, unsigned long dela= y_ms) +void mbm_setup_overflow_handler(struct rdt_mondomain *dom, unsigned long d= elay_ms) { unsigned long delay =3D msecs_to_jiffies(delay_ms); int cpu; diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/r= esctrl/rdtgroup.c index f2aec39c49df..5feec2c33544 100644 --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c @@ -1496,7 +1496,7 @@ static void mon_event_config_read(void *info) mon_info->mon_config =3D msrval & MAX_EVT_CONFIG_BITS; } =20 -static void mondata_config_read(struct rdt_domain *d, struct mon_config_in= fo *mon_info) +static void mondata_config_read(struct rdt_mondomain *d, struct mon_config= _info *mon_info) { smp_call_function_any(&d->cpu_mask, mon_event_config_read, mon_info, 1); } @@ -1504,7 +1504,7 @@ static void mondata_config_read(struct rdt_domain *d,= struct mon_config_info *mo static int mbm_config_show(struct seq_file *s, struct rdt_resource *r, u32= evtid) { struct mon_config_info mon_info =3D {0}; - struct rdt_domain *dom; + struct rdt_mondomain *dom; bool sep =3D false; =20 mutex_lock(&rdtgroup_mutex); @@ -1561,7 +1561,7 @@ static void mon_event_config_write(void *info) } =20 static int mbm_config_write_domain(struct rdt_resource *r, - struct rdt_domain *d, u32 evtid, u32 val) + struct rdt_mondomain *d, u32 evtid, u32 val) { struct mon_config_info mon_info =3D {0}; int ret =3D 0; @@ -1611,7 +1611,7 @@ static int mon_config_write(struct rdt_resource *r, c= har *tok, u32 evtid) { char *dom_str =3D NULL, *id_str; unsigned long dom_id, val; - struct rdt_domain *d; + struct rdt_mondomain *d; int ret =3D 0; =20 next: @@ -2476,7 +2476,7 @@ static void schemata_list_destroy(void) static int rdt_get_tree(struct fs_context *fc) { struct rdt_fs_context *ctx =3D rdt_fc2context(fc); - struct rdt_domain *dom; + 
struct rdt_mondomain *dom; struct rdt_resource *r; int ret; =20 @@ -2858,7 +2858,7 @@ static void rmdir_mondata_subdir_allrdtgrp(struct rdt= _resource *r, } =20 static int mkdir_mondata_subdir(struct kernfs_node *parent_kn, - struct rdt_domain *d, + struct rdt_mondomain *d, struct rdt_resource *r, struct rdtgroup *prgrp) { union mon_data_bits priv; @@ -2907,7 +2907,7 @@ static int mkdir_mondata_subdir(struct kernfs_node *p= arent_kn, * and "monitor" groups with given domain id. */ static void mkdir_mondata_subdir_allrdtgrp(struct rdt_resource *r, - struct rdt_domain *d) + struct rdt_mondomain *d) { struct kernfs_node *parent_kn; struct rdtgroup *prgrp, *crgrp; @@ -2929,7 +2929,7 @@ static int mkdir_mondata_subdir_alldom(struct kernfs_= node *parent_kn, struct rdt_resource *r, struct rdtgroup *prgrp) { - struct rdt_domain *dom; + struct rdt_mondomain *dom; int ret; =20 list_for_each_entry(dom, &r->mondomains, list) { @@ -3714,7 +3714,7 @@ static int __init rdtgroup_setup_root(void) return ret; } =20 -static void domain_destroy_mon_state(struct rdt_domain *d) +static void domain_destroy_mon_state(struct rdt_mondomain *d) { bitmap_free(d->rmid_busy_llc); kfree(d->mbm_total); @@ -3729,7 +3729,7 @@ void resctrl_offline_ctrl_domain(struct rdt_resource = *r, struct rdt_domain *d) mba_sc_domain_destroy(r, d); } =20 -void resctrl_offline_mon_domain(struct rdt_resource *r, struct rdt_domain = *d) +void resctrl_offline_mon_domain(struct rdt_resource *r, struct rdt_mondoma= in *d) { lockdep_assert_held(&rdtgroup_mutex); =20 @@ -3758,7 +3758,7 @@ void resctrl_offline_mon_domain(struct rdt_resource *= r, struct rdt_domain *d) domain_destroy_mon_state(d); } =20 -static int domain_setup_mon_state(struct rdt_resource *r, struct rdt_domai= n *d) +static int domain_setup_mon_state(struct rdt_resource *r, struct rdt_mondo= main *d) { size_t tsize; =20 @@ -3799,7 +3799,7 @@ int resctrl_online_ctrl_domain(struct rdt_resource *r= , struct rdt_domain *d) return 0; } =20 -int 
resctrl_online_mon_domain(struct rdt_resource *r, struct rdt_domain *d) +int resctrl_online_mon_domain(struct rdt_resource *r, struct rdt_mondomain= *d) { int err; =20 --=20 2.41.0 From nobody Tue Dec 16 02:35:51 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id DD3A7C83F26 for ; Tue, 29 Aug 2023 23:45:23 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S241242AbjH2Xo4 (ORCPT ); Tue, 29 Aug 2023 19:44:56 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:45746 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238658AbjH2Xon (ORCPT ); Tue, 29 Aug 2023 19:44:43 -0400 Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.136]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id ADCED1B3; Tue, 29 Aug 2023 16:44:41 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1693352681; x=1724888681; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=S/EXhJCpyz3uBL/oEdqmrNfXk9bhwq1mYy59t3f/24c=; b=GPARNH8f6vG+PD8c1Wv/gfiCnllw2lTotjM1TPiHjf0ZXrAeu0Dfgb4d 36X4ccxTatGqI++8ucrEORflen8Db8Vf5jsRVUyF1WDcbGNxfTsDbF/Si z32PTh8R6VnXHlXb+bu0udwJySFK9MAvrjsjLZR2b546AA9LTHBjdjTXZ L9K7HgRJA9fMrRpeBjW8n8qXddibFnyjYv0bY5Jqact3sjfrNKZTnBEf8 p1xpn1dTvf6p4pfleSV4FXevJUmwKefAEulXoAvOdj5+9PG8dyszk0Gvq hXNYrtBriAb/ySxKDn4m//Vvx4HghS9NqMC8Y3FWWVEa0qT0Dlkf4KZgU w==; X-IronPort-AV: E=McAfee;i="6600,9927,10817"; a="355015448" X-IronPort-AV: E=Sophos;i="6.02,211,1688454000"; d="scan'208";a="355015448" Received: from orsmga003.jf.intel.com ([10.7.209.27]) by fmsmga106.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Aug 2023 16:44:40 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10817"; 
a="688691030" X-IronPort-AV: E=Sophos;i="6.02,211,1688454000"; d="scan'208";a="688691030" Received: from agluck-desk3.sc.intel.com ([172.25.222.74]) by orsmga003-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Aug 2023 16:44:39 -0700 From: Tony Luck To: Fenghua Yu , Reinette Chatre , Peter Newman , Jonathan Corbet , Shuah Khan , x86@kernel.org Cc: Shaopeng Tan , James Morse , Jamie Iles , Babu Moger , Randy Dunlap , linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, patches@lists.linux.dev, Tony Luck Subject: [PATCH v5 4/8] x86/resctrl: Add node-scope to the options for feature scope Date: Tue, 29 Aug 2023 16:44:22 -0700 Message-ID: <20230829234426.64421-5-tony.luck@intel.com> X-Mailer: git-send-email 2.41.0 In-Reply-To: <20230829234426.64421-1-tony.luck@intel.com> References: <20230722190740.326190-1-tony.luck@intel.com> <20230829234426.64421-1-tony.luck@intel.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8"

All currently supported resctrl features have a domain scope that matches the scope of the L2 or L3 caches. Add "node" as a new option for domain scope.
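As a rough illustration of what a per-scope domain id lookup amounts to, here is a userspace sketch. The cpu_topo table and the names scope, SCOPE_* and domain_id_from_scope are hypothetical and exist only for this example; the kernel helper queries cache info and NUMA topology rather than a caller-supplied struct.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical mirror of the scope options discussed above. */
enum scope {
	SCOPE_L3_CACHE,
	SCOPE_L2_CACHE,
	SCOPE_NODE,
};

/* Hypothetical per-CPU topology record, filled in by the caller. */
struct cpu_topo {
	int l2_id;  /* id of the L2 cache this CPU shares */
	int l3_id;  /* id of the L3 cache this CPU shares */
	int node;   /* NUMA node this CPU belongs to */
};

/* Pick the domain id that matches the requested scope, or -1 if unknown. */
static int domain_id_from_scope(const struct cpu_topo *t, enum scope s)
{
	assert(t != NULL);

	switch (s) {
	case SCOPE_L3_CACHE:
		return t->l3_id;
	case SCOPE_L2_CACHE:
		return t->l2_id;
	case SCOPE_NODE:
		return t->node;
	}
	return -1;  /* unrecognized scope */
}
```

Two CPUs land in the same domain exactly when this function returns the same id for both, so with node scope the grouping follows NUMA nodes instead of cache instances.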
Signed-off-by: Tony Luck
---
 include/linux/resctrl.h            | 1 +
 arch/x86/kernel/cpu/resctrl/core.c | 2 ++
 2 files changed, 3 insertions(+)

diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index 08382548571e..f55cf7afd4eb 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -163,6 +163,7 @@ struct resctrl_schema;
 enum resctrl_scope {
 	RESCTRL_L3_CACHE,
 	RESCTRL_L2_CACHE,
+	RESCTRL_NODE,
 };
 
 /**
diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index 3e08aa04a7ff..9fcc264fac6c 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -514,6 +514,8 @@ static int get_domain_id_from_scope(int cpu, enum resctrl_scope scope)
 		return get_cpu_cacheinfo_id(cpu, 3);
 	case RESCTRL_L2_CACHE:
 		return get_cpu_cacheinfo_id(cpu, 2);
+	case RESCTRL_NODE:
+		return cpu_to_node(cpu);
 	default:
 		WARN_ON_ONCE(1);
 		break;
-- 
2.41.0

From nobody Tue Dec 16 02:35:51 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 09846C83F28 for ; Tue, 29 Aug 2023 23:45:24 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S241261AbjH2Xo6 (ORCPT ); Tue, 29 Aug 2023 19:44:58 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:45762 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S241220AbjH2Xoo (ORCPT ); Tue, 29 Aug 2023 19:44:44 -0400 Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.136]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id DB6141BC; Tue, 29 Aug 2023 16:44:41 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1693352681; x=1724888681; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding;
bh=pyMs9Xff6XyhCPf/fJEHPprla4QB5nGuCyGEiqgsovY=; b=Pdy2oM+XG7fbXVmzppHlxxrGDqJ92b6rhBAqvRDntoccl6s681dtjn1P VBnAaxM2hHnKpH8AOIKoDix7QdprKE+nfh5Ta1zejT5f8gZKDNv9lqZk5 5uwPXJI+NrQWhkIY+BbNHRjd4ytfvEW52MHNGomNUxN8Saiad70LyDNAw K6sn/jbBsRimCNp5ykRgSiu4kVbhbrk2/u5ohy7HxHMKaFt4/V2cOIn1n /lYulO936RqYPjjzFM0ZPPg49/kg479RUJphbYHU6x0xVUUjNouXp9q8p +MM8HN49zYGx3iAVQTRYxrBgvYglwjE0nRRWZgnlSPsL3Hhk4zo7L3yWd w==; X-IronPort-AV: E=McAfee;i="6600,9927,10817"; a="355015461" X-IronPort-AV: E=Sophos;i="6.02,211,1688454000"; d="scan'208";a="355015461" Received: from orsmga003.jf.intel.com ([10.7.209.27]) by fmsmga106.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Aug 2023 16:44:41 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10817"; a="688691033" X-IronPort-AV: E=Sophos;i="6.02,211,1688454000"; d="scan'208";a="688691033" Received: from agluck-desk3.sc.intel.com ([172.25.222.74]) by orsmga003-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Aug 2023 16:44:40 -0700 From: Tony Luck To: Fenghua Yu , Reinette Chatre , Peter Newman , Jonathan Corbet , Shuah Khan , x86@kernel.org Cc: Shaopeng Tan , James Morse , Jamie Iles , Babu Moger , Randy Dunlap , linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, patches@lists.linux.dev, Tony Luck Subject: [PATCH v5 5/8] x86/resctrl: Introduce snc_nodes_per_l3_cache Date: Tue, 29 Aug 2023 16:44:23 -0700 Message-ID: <20230829234426.64421-6-tony.luck@intel.com> X-Mailer: git-send-email 2.41.0 In-Reply-To: <20230829234426.64421-1-tony.luck@intel.com> References: <20230722190740.326190-1-tony.luck@intel.com> <20230829234426.64421-1-tony.luck@intel.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" Intel Sub-NUMA Cluster mode requires several changes in resctrl behavior for correct operation. 
Add a global integer "snc_nodes_per_l3_cache" that records how many
SNC nodes share each L3 cache. When this is "1", SNC mode is either
not implemented or not enabled.

A later patch will detect SNC mode and set snc_nodes_per_l3_cache to
the appropriate value. For now it remains at the default "1" to
indicate that SNC mode is not active.

The following code must take action when SNC is enabled:
1) The number of logical RMIDs available for use is the number of
   physical RMIDs divided by the number of SNC nodes.
2) Likewise, the "mon_scale" value must be adjusted for the number of
   SNC nodes.
3) When reading an RMID counter, code must adjust from the logical
   RMID used to the physical RMID value that must be loaded into the
   IA32_QM_EVTSEL MSR.
4) The L3 cache is divided between the SNC nodes, so the value
   reported in the resctrl "size" file is adjusted.
5) The "-o mba_MBps" mount option must be disabled in SNC mode because
   monitoring is done per SNC node, while bandwidth allocation is
   still done at the L3 cache scope.
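The RMID arithmetic in points 1) and 3) can be sketched in userspace as follows. This is an illustration under stated assumptions, not the kernel code: the function names snc_logical_rmids and snc_physical_rmid are hypothetical, and the NUMA node number is passed in by the caller instead of being looked up with cpu_to_node().

```c
#include <assert.h>

/* Point 1: logical RMIDs per SNC node = physical RMIDs / SNC nodes. */
static unsigned int snc_logical_rmids(unsigned int phys_rmids,
				      unsigned int snc_nodes)
{
	assert(snc_nodes >= 1);
	return phys_rmids / snc_nodes;
}

/*
 * Point 3: map a logical RMID to the physical RMID to program into the
 * event select register. 'node' is the NUMA node of the reading CPU and
 * 'num_rmid' is the per-node logical RMID count computed above.
 */
static unsigned int snc_physical_rmid(unsigned int logical_rmid,
				      unsigned int node,
				      unsigned int snc_nodes,
				      unsigned int num_rmid)
{
	assert(snc_nodes >= 1);
	assert(logical_rmid < num_rmid);

	/* With SNC off (snc_nodes == 1) the offset is always zero. */
	return logical_rmid + (node % snc_nodes) * num_rmid;
}
```

For example, with 256 physical RMIDs and two SNC nodes per L3 cache, each node gets 128 logical RMIDs, and logical RMID 5 read from a CPU on node 3 maps to physical RMID 5 + (3 % 2) * 128 = 133.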
Signed-off-by: Tony Luck --- arch/x86/kernel/cpu/resctrl/internal.h | 2 ++ arch/x86/kernel/cpu/resctrl/core.c | 7 +++++++ arch/x86/kernel/cpu/resctrl/monitor.c | 16 +++++++++++++--- arch/x86/kernel/cpu/resctrl/rdtgroup.c | 4 ++-- 4 files changed, 24 insertions(+), 5 deletions(-) diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/r= esctrl/internal.h index c61fd6709730..326ca6b3688a 100644 --- a/arch/x86/kernel/cpu/resctrl/internal.h +++ b/arch/x86/kernel/cpu/resctrl/internal.h @@ -446,6 +446,8 @@ DECLARE_STATIC_KEY_FALSE(rdt_alloc_enable_key); =20 extern struct dentry *debugfs_resctrl; =20 +extern int snc_nodes_per_l3_cache; + enum resctrl_res_level { RDT_RESOURCE_L3, RDT_RESOURCE_L2, diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resct= rl/core.c index 9fcc264fac6c..ed4f55b3e5e4 100644 --- a/arch/x86/kernel/cpu/resctrl/core.c +++ b/arch/x86/kernel/cpu/resctrl/core.c @@ -48,6 +48,13 @@ int max_name_width, max_data_width; */ bool rdt_alloc_capable; =20 +/* + * Number of SNC nodes that share each L3 cache. + * Default is 1 for systems that do not support + * SNC, or have SNC disabled. 
+ */ +int snc_nodes_per_l3_cache =3D 1; + static void mba_wrmsr_intel(struct rdt_domain *d, struct msr_param *m, struct rdt_resource *r); diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/re= sctrl/monitor.c index 42262d59ef9b..b6b3fb0f9abe 100644 --- a/arch/x86/kernel/cpu/resctrl/monitor.c +++ b/arch/x86/kernel/cpu/resctrl/monitor.c @@ -148,8 +148,18 @@ static inline struct rmid_entry *__rmid_entry(u32 rmid) =20 static int __rmid_read(u32 rmid, enum resctrl_event_id eventid, u64 *val) { + struct rdt_resource *r =3D &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl; + int cpu =3D smp_processor_id(); + int rmid_offset =3D 0; u64 msr_val; =20 + /* + * When SNC mode is on, need to compute the offset to read the + * physical RMID counter for the node to which this CPU belongs + */ + if (snc_nodes_per_l3_cache > 1) + rmid_offset =3D (cpu_to_node(cpu) % snc_nodes_per_l3_cache) * r->num_rmi= d; + /* * As per the SDM, when IA32_QM_EVTSEL.EvtID (bits 7:0) is configured * with a valid event code for supported resource type and the bits @@ -158,7 +168,7 @@ static int __rmid_read(u32 rmid, enum resctrl_event_id = eventid, u64 *val) * IA32_QM_CTR.Error (bit 63) and IA32_QM_CTR.Unavailable (bit 62) * are error bits. 
*/ - wrmsr(MSR_IA32_QM_EVTSEL, eventid, rmid); + wrmsr(MSR_IA32_QM_EVTSEL, eventid, rmid + rmid_offset); rdmsrl(MSR_IA32_QM_CTR, msr_val); =20 if (msr_val & RMID_VAL_ERROR) @@ -783,8 +793,8 @@ int __init rdt_get_mon_l3_config(struct rdt_resource *r) int ret; =20 resctrl_rmid_realloc_limit =3D boot_cpu_data.x86_cache_size * 1024; - hw_res->mon_scale =3D boot_cpu_data.x86_cache_occ_scale; - r->num_rmid =3D boot_cpu_data.x86_cache_max_rmid + 1; + hw_res->mon_scale =3D boot_cpu_data.x86_cache_occ_scale / snc_nodes_per_l= 3_cache; + r->num_rmid =3D (boot_cpu_data.x86_cache_max_rmid + 1) / snc_nodes_per_l3= _cache; hw_res->mbm_width =3D MBM_CNTR_WIDTH_BASE; =20 if (mbm_offset > 0 && mbm_offset <=3D MBM_CNTR_WIDTH_OFFSET_MAX) diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/r= esctrl/rdtgroup.c index 5feec2c33544..a8cf6251e506 100644 --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c @@ -1367,7 +1367,7 @@ unsigned int rdtgroup_cbm_to_size(struct rdt_resource= *r, } } =20 - return size; + return size / snc_nodes_per_l3_cache; } =20 /** @@ -2600,7 +2600,7 @@ static int rdt_parse_param(struct fs_context *fc, str= uct fs_parameter *param) ctx->enable_cdpl2 =3D true; return 0; case Opt_mba_mbps: - if (!supports_mba_mbps()) + if (!supports_mba_mbps() || snc_nodes_per_l3_cache > 1) return -EINVAL; ctx->enable_mba_mbps =3D true; return 0; --=20 2.41.0 From nobody Tue Dec 16 02:35:51 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id CD5D3C83F25 for ; Tue, 29 Aug 2023 23:45:23 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S241249AbjH2Xo5 (ORCPT ); Tue, 29 Aug 2023 19:44:57 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:45774 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) 
by vger.kernel.org with ESMTP id S241222AbjH2Xop (ORCPT ); Tue, 29 Aug 2023 19:44:45 -0400 Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.136]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 5E0CB1B1; Tue, 29 Aug 2023 16:44:42 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1693352682; x=1724888682; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=E2VN7C6YvAgSWRKL0T6DnmwaxuervSEnb92Npb1t5PA=; b=IJsk+CTfEH++PVxPLsCB6kVX0y8E0heUDtSodip5laQtO/Rl2KuKpIrS qodwnWfx3BxaowM5imgD2QVlEQSOD9GLLbMyWPTVyCuvIpBIekypUMe9S Cuoi17n5yem9nsI6tThR0RiTNdLKPs0yuwcitQKritoWdTWq+362GJmgH YDeF9+7TqGYd9NvKo5YUv7gSx7LH0FvBo0PhAnIY9aEXsS1wbAJH4AvKG b06IZWGdgNdFAGEWWjjYy4Tz3kRlZZfQ0as4eBSYG0UrvOTqEuu2YAS3Z dq2DRdPk8CaV67oKpEsE8hMnNBJVT8qzTjb/FYtzPfyy4Q4lHcdvFxGcG w==; X-IronPort-AV: E=McAfee;i="6600,9927,10817"; a="355015472" X-IronPort-AV: E=Sophos;i="6.02,211,1688454000"; d="scan'208";a="355015472" Received: from orsmga003.jf.intel.com ([10.7.209.27]) by fmsmga106.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Aug 2023 16:44:41 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10817"; a="688691036" X-IronPort-AV: E=Sophos;i="6.02,211,1688454000"; d="scan'208";a="688691036" Received: from agluck-desk3.sc.intel.com ([172.25.222.74]) by orsmga003-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Aug 2023 16:44:41 -0700 From: Tony Luck To: Fenghua Yu , Reinette Chatre , Peter Newman , Jonathan Corbet , Shuah Khan , x86@kernel.org Cc: Shaopeng Tan , James Morse , Jamie Iles , Babu Moger , Randy Dunlap , linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, patches@lists.linux.dev, Tony Luck Subject: [PATCH v5 6/8] x86/resctrl: Sub NUMA Cluster detection and enable Date: Tue, 29 Aug 2023 16:44:24 -0700 Message-ID: <20230829234426.64421-7-tony.luck@intel.com> X-Mailer: git-send-email 2.41.0 In-Reply-To: 
<20230829234426.64421-1-tony.luck@intel.com> References: <20230722190740.326190-1-tony.luck@intel.com> <20230829234426.64421-1-tony.luck@intel.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" There isn't a simple h/w bit that indicates whether a CPU is running in Sub NUMA Cluster mode. Infer the state by comparing the ratio of NUMA nodes to L3 cache instances. When SNC mode is detected, reconfigure the RMID counters by updating the MSR_RMID_SNC_CONFIG MSR on each socket as CPUs are seen. Signed-off-by: Tony Luck --- arch/x86/include/asm/msr-index.h | 1 + arch/x86/kernel/cpu/resctrl/core.c | 68 ++++++++++++++++++++++++++++++ 2 files changed, 69 insertions(+) diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-in= dex.h index 1d111350197f..393d1b047617 100644 --- a/arch/x86/include/asm/msr-index.h +++ b/arch/x86/include/asm/msr-index.h @@ -1100,6 +1100,7 @@ #define MSR_IA32_QM_CTR 0xc8e #define MSR_IA32_PQR_ASSOC 0xc8f #define MSR_IA32_L3_CBM_BASE 0xc90 +#define MSR_RMID_SNC_CONFIG 0xca0 #define MSR_IA32_L2_CBM_BASE 0xd10 #define MSR_IA32_MBA_THRTL_BASE 0xd50 =20 diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resct= rl/core.c index ed4f55b3e5e4..9f0ac9721fab 100644 --- a/arch/x86/kernel/cpu/resctrl/core.c +++ b/arch/x86/kernel/cpu/resctrl/core.c @@ -16,11 +16,14 @@ =20 #define pr_fmt(fmt) "resctrl: " fmt =20 +#include #include #include #include #include +#include =20 +#include #include #include #include "internal.h" @@ -724,11 +727,34 @@ static void clear_closid_rmid(int cpu) wrmsr(MSR_IA32_PQR_ASSOC, 0, 0); } =20 +/* + * The power-on reset value of MSR_RMID_SNC_CONFIG is 0x1 + * which indicates that RMIDs are configured in legacy mode. + * Clearing bit 0 reconfigures the RMID counters for use + * in Sub NUMA Cluster mode. 
+ */ +static void snc_remap_rmids(int cpu) +{ + u64 val; + + /* Only need to enable once per package */ + if (cpumask_first(topology_core_cpumask(cpu)) !=3D cpu) + return; + + rdmsrl(MSR_RMID_SNC_CONFIG, val); + val &=3D ~BIT_ULL(0); + wrmsrl(MSR_RMID_SNC_CONFIG, val); +} + static int resctrl_online_cpu(unsigned int cpu) { struct rdt_resource *r; =20 mutex_lock(&rdtgroup_mutex); + + if (snc_nodes_per_l3_cache > 1) + snc_remap_rmids(cpu); + for_each_capable_rdt_resource(r) domain_add_cpu(cpu, r); /* The cpu is set in default rdtgroup after online. */ @@ -983,11 +1009,53 @@ static __init bool get_rdt_resources(void) return (rdt_mon_capable || rdt_alloc_capable); } =20 +/* CPU models that support MSR_RMID_SNC_CONFIG */ +static const struct x86_cpu_id snc_cpu_ids[] __initconst =3D { + X86_MATCH_INTEL_FAM6_MODEL(ICELAKE_X, 0), + X86_MATCH_INTEL_FAM6_MODEL(SAPPHIRERAPIDS_X, 0), + X86_MATCH_INTEL_FAM6_MODEL(EMERALDRAPIDS_X, 0), + {} +}; + +static __init int get_snc_config(void) +{ + unsigned long *node_caches; + int mem_only_nodes =3D 0; + int cpu, node, ret; + + if (!x86_match_cpu(snc_cpu_ids)) + return 1; + + node_caches =3D kcalloc(BITS_TO_LONGS(nr_node_ids), sizeof(*node_caches),= GFP_KERNEL); + if (!node_caches) + return 1; + + cpus_read_lock(); + for_each_node(node) { + cpu =3D cpumask_first(cpumask_of_node(node)); + if (cpu < nr_cpu_ids) + set_bit(get_cpu_cacheinfo_id(cpu, 3), node_caches); + else + mem_only_nodes++; + } + cpus_read_unlock(); + + ret =3D (nr_node_ids - mem_only_nodes) / bitmap_weight(node_caches, nr_no= de_ids); + kfree(node_caches); + + if (ret > 1) + rdt_resources_all[RDT_RESOURCE_L3].r_resctrl.mon_scope =3D RESCTRL_NODE; + + return ret; +} + static __init void rdt_init_res_defs_intel(void) { struct rdt_hw_resource *hw_res; struct rdt_resource *r; =20 + snc_nodes_per_l3_cache =3D get_snc_config(); + for_each_rdt_resource(r) { hw_res =3D resctrl_to_arch_res(r); =20 --=20 2.41.0 From nobody Tue Dec 16 02:35:51 2025 Return-Path: 
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1BC89C83F27 for ; Tue, 29 Aug 2023 23:45:24 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S241268AbjH2Xo7 (ORCPT ); Tue, 29 Aug 2023 19:44:59 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:45776 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S241224AbjH2Xoq (ORCPT ); Tue, 29 Aug 2023 19:44:46 -0400 Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.136]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 736881B3; Tue, 29 Aug 2023 16:44:43 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1693352683; x=1724888683; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=+XDRINXx73/bMenFlRGiWAYMrGt/x5/6+qfN9SAWjoE=; b=mD2YNV0sxoFjkCNSufMSC/J/WfVb9bxSfWBn5W35J9yj5lpjB1jJwhED v57f+yYLurPTmGtHlHwdq6lCG3r6QneIv0cveZncZvB8qN5RKixprW3At qlEzU1DlWImfw/p8Dy6KFrAbKkPQVHdRyrrl95xmxOV8H+owf7t9o6Cnu vlijXKmYpiTxq7+oASzOZLIlZQWwXydja2D31KxBFBFl9ZPcRRYvK2H1e SF/xQoSDogqoYDGU6B2JmUZlL3m7/wyHpSVdy7+q8qlCRYfy8Atth4b37 IHn+Yxs6I3CEiPIEPBkvlHtQePnzDYv94WT26fLzYiqFcqoZYOiPB1VqQ w==; X-IronPort-AV: E=McAfee;i="6600,9927,10817"; a="355015483" X-IronPort-AV: E=Sophos;i="6.02,211,1688454000"; d="scan'208";a="355015483" Received: from orsmga003.jf.intel.com ([10.7.209.27]) by fmsmga106.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Aug 2023 16:44:42 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10817"; a="688691039" X-IronPort-AV: E=Sophos;i="6.02,211,1688454000"; d="scan'208";a="688691039" Received: from agluck-desk3.sc.intel.com ([172.25.222.74]) by orsmga003-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Aug 
2023 16:44:41 -0700 From: Tony Luck To: Fenghua Yu , Reinette Chatre , Peter Newman , Jonathan Corbet , Shuah Khan , x86@kernel.org Cc: Shaopeng Tan , James Morse , Jamie Iles , Babu Moger , Randy Dunlap , linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, patches@lists.linux.dev, Tony Luck Subject: [PATCH v5 7/8] x86/resctrl: Update documentation with Sub-NUMA cluster changes Date: Tue, 29 Aug 2023 16:44:25 -0700 Message-ID: <20230829234426.64421-8-tony.luck@intel.com> X-Mailer: git-send-email 2.41.0 In-Reply-To: <20230829234426.64421-1-tony.luck@intel.com> References: <20230722190740.326190-1-tony.luck@intel.com> <20230829234426.64421-1-tony.luck@intel.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" With Sub-NUMA Cluster mode enabled the scope of monitoring resources is per-NODE instead of per-L3 cache. Suffixes of directories with "L3" in their name refer to Sub-NUMA nodes instead of L3 cache ids. Users should be aware that SNC mode also affects the amount of L3 cache available for allocation within each SNC node. Signed-off-by: Tony Luck --- Documentation/arch/x86/resctrl.rst | 25 ++++++++++++++++++++++--- 1 file changed, 22 insertions(+), 3 deletions(-) diff --git a/Documentation/arch/x86/resctrl.rst b/Documentation/arch/x86/re= sctrl.rst index cb05d90111b4..407764f43f25 100644 --- a/Documentation/arch/x86/resctrl.rst +++ b/Documentation/arch/x86/resctrl.rst @@ -345,9 +345,15 @@ When control is enabled all CTRL_MON groups will also = contain: When monitoring is enabled all MON groups will also contain: =20 "mon_data": - This contains a set of files organized by L3 domain and by - RDT event. E.g. on a system with two L3 domains there will - be subdirectories "mon_L3_00" and "mon_L3_01". 
Each of these + This contains a set of files organized by L3 domain or by NUMA + node (depending on whether Sub-NUMA Cluster (SNC) mode is disabled + or enabled respectively) and by RDT event. E.g. on a system with + SNC mode disabled with two L3 domains there will be subdirectories + "mon_L3_00" and "mon_L3_01". The numerical suffix refers to the + L3 cache id. With SNC enabled the directory names are the same, + but the numerical suffix refers to the node id. + Mappings from node ids to CPUs are available in the + /sys/devices/system/node/node*/cpulist files. Each of these directories have one file per event (e.g. "llc_occupancy", "mbm_total_bytes", and "mbm_local_bytes"). In a MON group these files provide a read out of the current value of the event for @@ -452,6 +458,19 @@ and 0xA are not. On a system with a 20-bit mask each = bit represents 5% of the capacity of the cache. You could partition the cache into four equal parts with masks: 0x1f, 0x3e0, 0x7c00, 0xf8000. =20 +Notes on Sub-NUMA Cluster mode +=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D +When SNC mode is enabled the "llc_occupancy", "mbm_total_bytes", and +"mbm_local_bytes" will only give accurate results for well behaved NUMA +applications. I.e. those that perform the majority of memory accesses +to memory on the local NUMA node to the CPU where the task is executing. + +The cache allocation feature still provides the same number of +bits in a mask to control allocation into the L3 cache. But each +of those ways has its capacity reduced because the cache is divided +between the SNC nodes. The values reported in the resctrl +"size" files are adjusted accordingly. 
+ Memory bandwidth Allocation and monitoring =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D =20 --=20 2.41.0 From nobody Tue Dec 16 02:35:51 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2D59FC71153 for ; Tue, 29 Aug 2023 23:45:24 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S241263AbjH2Xo7 (ORCPT ); Tue, 29 Aug 2023 19:44:59 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:45790 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S241225AbjH2Xoq (ORCPT ); Tue, 29 Aug 2023 19:44:46 -0400 Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.136]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 332181BB; Tue, 29 Aug 2023 16:44:44 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1693352684; x=1724888684; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=+1T9Do0MpE3rexZYVFZx+fj/FcnYDzdw9uIqJHZ+mV4=; b=cwzK/P/uWTFMt27jICrS8euvoysFaQaFPP9XRy1XQl7y+XQahYg+zpGL vtc/3IRnLU6LQIuAowK0oRzBRzEe37sJrXNNcHv/vI2Ai/0JnQbNzJ0xi oKFE/GN0KuK8JYXxFP6kMYvLabC8MmTGKoH8vMa9XqV4OYzXJzsDIempP AcwWuQtDLDtE8GCPlNiuMS6DZrQtUd0A/HT+rknvIpx8EKCw1fLglSwP6 Gyh/oEGBkm9JXgF85FKYOwNmqeNKJAteFspYKbGwxxELfNFl/wLbyvELx 0LhsGJnpyD2wKo27hI7R/QisX/ascM6D2qt9JXDz428SRYPO7gWV6f7XC Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10817"; a="355015494" X-IronPort-AV: E=Sophos;i="6.02,211,1688454000"; d="scan'208";a="355015494" Received: from orsmga003.jf.intel.com ([10.7.209.27]) by fmsmga106.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Aug 2023 16:44:43 -0700 X-ExtLoop1: 1 X-IronPort-AV: 
E=McAfee;i="6600,9927,10817"; a="688691042" X-IronPort-AV: E=Sophos;i="6.02,211,1688454000"; d="scan'208";a="688691042" Received: from agluck-desk3.sc.intel.com ([172.25.222.74]) by orsmga003-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Aug 2023 16:44:42 -0700 From: Tony Luck To: Fenghua Yu , Reinette Chatre , Peter Newman , Jonathan Corbet , Shuah Khan , x86@kernel.org Cc: Shaopeng Tan , James Morse , Jamie Iles , Babu Moger , Randy Dunlap , linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, patches@lists.linux.dev, Tony Luck Subject: [PATCH v5 8/8] selftests/resctrl: Adjust effective L3 cache size when SNC enabled Date: Tue, 29 Aug 2023 16:44:26 -0700 Message-ID: <20230829234426.64421-9-tony.luck@intel.com> X-Mailer: git-send-email 2.41.0 In-Reply-To: <20230829234426.64421-1-tony.luck@intel.com> References: <20230722190740.326190-1-tony.luck@intel.com> <20230829234426.64421-1-tony.luck@intel.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" Sub-NUMA Cluster divides CPUs sharing an L3 cache into separate NUMA nodes. Systems may support splitting into either two or four nodes. When SNC mode is enabled the effective amount of L3 cache available for allocation is divided by the number of nodes per L3. Detect which SNC mode is active by comparing the number of CPUs that share a cache with CPU0, with the number of CPUs on node0. This gives some hope of tests passing. But additional test infrastructure changes are needed to bind tests to nodes and guarantee memory allocation from the local node. 
Reported-by: "Shaopeng Tan (Fujitsu)" Signed-off-by: Tony Luck --- tools/testing/selftests/resctrl/resctrl.h | 1 + tools/testing/selftests/resctrl/resctrlfs.c | 57 +++++++++++++++++++++ 2 files changed, 58 insertions(+) diff --git a/tools/testing/selftests/resctrl/resctrl.h b/tools/testing/self= tests/resctrl/resctrl.h index 87e39456dee0..a8b43210b573 100644 --- a/tools/testing/selftests/resctrl/resctrl.h +++ b/tools/testing/selftests/resctrl/resctrl.h @@ -13,6 +13,7 @@ #include #include #include +#include #include #include #include diff --git a/tools/testing/selftests/resctrl/resctrlfs.c b/tools/testing/se= lftests/resctrl/resctrlfs.c index fb00245dee92..79eecbf9f863 100644 --- a/tools/testing/selftests/resctrl/resctrlfs.c +++ b/tools/testing/selftests/resctrl/resctrlfs.c @@ -130,6 +130,61 @@ int get_resource_id(int cpu_no, int *resource_id) return 0; } =20 +/* + * Count number of CPUs in a /sys bit map + */ +static int count_sys_bitmap_bits(char *name) +{ + FILE *fp =3D fopen(name, "r"); + int count =3D 0, c; + + if (!fp) + return 0; + + while ((c =3D fgetc(fp)) !=3D EOF) { + if (!isxdigit(c)) + continue; + switch (c) { + case 'f': + count++; + case '7': case 'b': case 'd': case 'e': + count++; + case '3': case '5': case '6': case '9': case 'a': case 'c': + count++; + case '1': case '2': case '4': case '8': + count++; + } + } + fclose(fp); + + return count; +} + +/* + * Detect SNC by comparing #CPUs in node0 with #CPUs sharing LLC with CPU0. + * Try to get this right, even if a few CPUs are offline so that the number + * of CPUs in node0 is not exactly half or a quarter of the CPUs sharing t= he + * LLC of CPU0.
+ */ +static int snc_ways(void) +{ + int node_cpus, cache_cpus; + + node_cpus =3D count_sys_bitmap_bits("/sys/devices/system/node/node0/cpuma= p"); + cache_cpus =3D count_sys_bitmap_bits("/sys/devices/system/cpu/cpu0/cache/= index3/shared_cpu_map"); + + if (!node_cpus || !cache_cpus) { + fprintf(stderr, "Warning: could not determine Sub-NUMA Cluster mode\n"); + return 1; + } + + if (4 * node_cpus <=3D cache_cpus) + return 4; + else if (2 * node_cpus <=3D cache_cpus) + return 2; + return 1; +} + /* * get_cache_size - Get cache size for a specified CPU * @cpu_no: CPU number @@ -190,6 +245,8 @@ int get_cache_size(int cpu_no, char *cache_type, unsign= ed long *cache_size) break; } =20 + if (cache_num =3D=3D 3) + *cache_size /=3D snc_ways(); return 0; } =20 --=20 2.41.0
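For anyone who wants to try the detection idea from patch 8/8 outside the selftest harness, here is a minimal standalone sketch. It is not the code from the patch: the helper names count_mask_bits() and snc_ways_from_counts() are hypothetical, the helpers take strings and counts instead of reading /sys directly, and rounding the ratio to the nearest integer is an assumption about how "a few CPUs offline" might be tolerated.

```c
#include <ctype.h>

/*
 * Hypothetical helper: count the set bits in a sysfs-style hex CPU
 * mask such as "ff,0000ffff\n". Separators and newlines are skipped.
 */
int count_mask_bits(const char *mask)
{
	int count = 0;

	for (; *mask; mask++) {
		unsigned char c = (unsigned char)*mask;
		int nibble;

		if (!isxdigit(c))
			continue;
		nibble = isdigit(c) ? c - '0' : tolower(c) - 'a' + 10;
		/* Clear the lowest set bit until the nibble is empty. */
		for (; nibble; nibble &= nibble - 1)
			count++;
	}
	return count;
}

/*
 * Hypothetical helper: infer the number of SNC nodes per L3 cache from
 * the number of CPUs on node0 and the number of CPUs sharing the L3 of
 * CPU0. Assumption: round cache_cpus / node_cpus to the nearest integer
 * and map it onto the modes named in the commit message (1, 2 or 4).
 */
int snc_ways_from_counts(int node_cpus, int cache_cpus)
{
	int ratio;

	if (node_cpus <= 0 || cache_cpus <= 0)
		return 1;	/* cannot tell, assume SNC is off */

	ratio = (cache_cpus + node_cpus / 2) / node_cpus;
	if (ratio >= 4)
		return 4;
	if (ratio >= 2)
		return 2;
	return 1;
}
```

A caller would read /sys/devices/system/node/node0/cpumap and /sys/devices/system/cpu/cpu0/cache/index3/shared_cpu_map into buffers, pass each buffer to count_mask_bits(), and divide the reported L3 size by the result of snc_ways_from_counts(). A rounded ratio of 3 is treated as SNC2 here, which is a guess; with many CPUs offline the inference is unreliable either way.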