From nobody Fri Dec 19 02:50:45 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id A9FF6E7F154 for ; Thu, 28 Sep 2023 19:14:10 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231195AbjI1TOK (ORCPT ); Thu, 28 Sep 2023 15:14:10 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:44830 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231293AbjI1TOE (ORCPT ); Thu, 28 Sep 2023 15:14:04 -0400 Received: from mgamail.intel.com (mgamail.intel.com [134.134.136.126]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 839841A8; Thu, 28 Sep 2023 12:14:01 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1695928441; x=1727464441; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=gwQ2brqgMUhemRvUUMd53Pt9MPLz6Qcd9Aym5LDIjhE=; b=CnVusxo/lqVNv3Ox1StzKjaJxsc4c15qxirxl8lpsdkSprwA5JJ8GdoW f27O9I7RVYw1ezjimwKHkAEvnU5aWqARkkJRCkC3V+g36eT+D8Q0V1lUO WZYrYGCMCqLbBLuT2RNOW3koPzmJVoD/PWnRkClRXK5SSUwNcyfmYllQf mzZgQUjS2TAICKKu8CanbYsb/9ra38Yt5LgXi37Ihmpg4GBAqmaXgr0Ni nrncv+Ytw4Z0WokZIjJ52AXHmhkOqLBjeH7YiCaZ0jangoDSt0c58GUrf xQ06/1ZhmVomODObO4LNkuUpFx6yoVd/uA4WO+ydXYF6L+QnjCQL03fLy A==; X-IronPort-AV: E=McAfee;i="6600,9927,10847"; a="367213873" X-IronPort-AV: E=Sophos;i="6.03,185,1694761200"; d="scan'208";a="367213873" Received: from orsmga008.jf.intel.com ([10.7.209.65]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 28 Sep 2023 12:13:58 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10847"; a="779020022" X-IronPort-AV: E=Sophos;i="6.03,185,1694761200"; d="scan'208";a="779020022" Received: from agluck-desk3.sc.intel.com ([172.25.222.74]) by orsmga008-auth.jf.intel.com 
 with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 28 Sep 2023 12:13:58 -0700
From: Tony Luck
To: Fenghua Yu, Reinette Chatre, Peter Newman, Jonathan Corbet,
 Shuah Khan, x86@kernel.org
Cc: Shaopeng Tan, James Morse, Jamie Iles, Babu Moger, Randy Dunlap,
 linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
 patches@lists.linux.dev, Tony Luck
Subject: [PATCH v6 1/8] x86/resctrl: Prepare for new domain scope
Date: Thu, 28 Sep 2023 12:13:42 -0700
Message-ID: <20230928191350.205703-2-tony.luck@intel.com>
X-Mailer: git-send-email 2.41.0
In-Reply-To: <20230928191350.205703-1-tony.luck@intel.com>
References: <20230829234426.64421-1-tony.luck@intel.com>
 <20230928191350.205703-1-tony.luck@intel.com>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Type: text/plain; charset="utf-8"

Resctrl resources operate on subsets of CPUs in the system with the
defining attribute of each subset being an instance of a particular
level of cache. E.g. all CPUs sharing an L3 cache would be part of the
same domain.

In preparation for features that are scoped at the NUMA node level,
change the code from explicit references to "cache_level" to a more
generic scope. At this point the only options for this scope are groups
of CPUs that share an L2 cache or L3 cache.

Provide a more detailed warning message if a domain id cannot be found
when adding a CPU. Just check and silently return if the domain id
can't be found when removing a CPU.

No functional change.

Signed-off-by: Tony Luck
---
Changes since v5:

1) Set enum values of RESCTRL_L2_CACHE and RESCTRL_L3_CACHE to "2" and
   "3" respectively so they can be passed to get_cpu_cacheinfo_id() (in
   code paths that check that scope is one of the cache types).
2) Simplified the check on scope in pseudo_lock_region_init() and
   rdtgroup_cbm_to_size().
3) Added detailed warning if the domain id for a CPU cannot be
   determined when adding a CPU.
Simpler check on removal. --- include/linux/resctrl.h | 9 +++++-- arch/x86/kernel/cpu/resctrl/core.c | 33 ++++++++++++++++++----- arch/x86/kernel/cpu/resctrl/pseudo_lock.c | 6 ++++- arch/x86/kernel/cpu/resctrl/rdtgroup.c | 5 +++- 4 files changed, 43 insertions(+), 10 deletions(-) diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h index 8334eeacfec5..618735e396cb 100644 --- a/include/linux/resctrl.h +++ b/include/linux/resctrl.h @@ -144,13 +144,18 @@ struct resctrl_membw { struct rdt_parse_data; struct resctrl_schema; =20 +enum resctrl_scope { + RESCTRL_L2_CACHE =3D 2, + RESCTRL_L3_CACHE =3D 3, +}; + /** * struct rdt_resource - attributes of a resctrl resource * @rid: The index of the resource * @alloc_capable: Is allocation available on this machine * @mon_capable: Is monitor feature available on this machine * @num_rmid: Number of RMIDs available - * @cache_level: Which cache level defines scope of this resource + * @scope: Scope of this resource * @cache: Cache allocation related data * @membw: If the component has bandwidth controls, their properties. 
* @domains: All domains for this resource @@ -168,7 +173,7 @@ struct rdt_resource { bool alloc_capable; bool mon_capable; int num_rmid; - int cache_level; + enum resctrl_scope scope; struct resctrl_cache cache; struct resctrl_membw membw; struct list_head domains; diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resct= rl/core.c index 030d3b409768..3b1837e1fb6b 100644 --- a/arch/x86/kernel/cpu/resctrl/core.c +++ b/arch/x86/kernel/cpu/resctrl/core.c @@ -65,7 +65,7 @@ struct rdt_hw_resource rdt_resources_all[] =3D { .r_resctrl =3D { .rid =3D RDT_RESOURCE_L3, .name =3D "L3", - .cache_level =3D 3, + .scope =3D RESCTRL_L3_CACHE, .domains =3D domain_init(RDT_RESOURCE_L3), .parse_ctrlval =3D parse_cbm, .format_str =3D "%d=3D%0*x", @@ -79,7 +79,7 @@ struct rdt_hw_resource rdt_resources_all[] =3D { .r_resctrl =3D { .rid =3D RDT_RESOURCE_L2, .name =3D "L2", - .cache_level =3D 2, + .scope =3D RESCTRL_L2_CACHE, .domains =3D domain_init(RDT_RESOURCE_L2), .parse_ctrlval =3D parse_cbm, .format_str =3D "%d=3D%0*x", @@ -93,7 +93,7 @@ struct rdt_hw_resource rdt_resources_all[] =3D { .r_resctrl =3D { .rid =3D RDT_RESOURCE_MBA, .name =3D "MB", - .cache_level =3D 3, + .scope =3D RESCTRL_L3_CACHE, .domains =3D domain_init(RDT_RESOURCE_MBA), .parse_ctrlval =3D parse_bw, .format_str =3D "%d=3D%*u", @@ -105,7 +105,7 @@ struct rdt_hw_resource rdt_resources_all[] =3D { .r_resctrl =3D { .rid =3D RDT_RESOURCE_SMBA, .name =3D "SMBA", - .cache_level =3D 3, + .scope =3D RESCTRL_L3_CACHE, .domains =3D domain_init(RDT_RESOURCE_SMBA), .parse_ctrlval =3D parse_bw, .format_str =3D "%d=3D%*u", @@ -487,6 +487,19 @@ static int arch_domain_mbm_alloc(u32 num_rmid, struct = rdt_hw_domain *hw_dom) return 0; } =20 +static int get_domain_id_from_scope(int cpu, enum resctrl_scope scope) +{ + switch (scope) { + case RESCTRL_L2_CACHE: + case RESCTRL_L3_CACHE: + return get_cpu_cacheinfo_id(cpu, scope); + default: + break; + } + + return -EINVAL; +} + /* * domain_add_cpu - Add a cpu to a 
resource's domain list. * @@ -502,12 +515,17 @@ static int arch_domain_mbm_alloc(u32 num_rmid, struct= rdt_hw_domain *hw_dom) */ static void domain_add_cpu(int cpu, struct rdt_resource *r) { - int id =3D get_cpu_cacheinfo_id(cpu, r->cache_level); + int id =3D get_domain_id_from_scope(cpu, r->scope); struct list_head *add_pos =3D NULL; struct rdt_hw_domain *hw_dom; struct rdt_domain *d; int err; =20 + if (id < 0) { + pr_warn_once("Can't find domain id for CPU:%d scope:%d for resource %s\n= ", + cpu, r->scope, r->name); + return; + } d =3D rdt_find_domain(r, id, &add_pos); if (IS_ERR(d)) { pr_warn("Couldn't find cache id for CPU %d\n", cpu); @@ -552,10 +570,13 @@ static void domain_add_cpu(int cpu, struct rdt_resour= ce *r) =20 static void domain_remove_cpu(int cpu, struct rdt_resource *r) { - int id =3D get_cpu_cacheinfo_id(cpu, r->cache_level); + int id =3D get_domain_id_from_scope(cpu, r->scope); struct rdt_hw_domain *hw_dom; struct rdt_domain *d; =20 + if (id < 0) + return; + d =3D rdt_find_domain(r, id, NULL); if (IS_ERR_OR_NULL(d)) { pr_warn("Couldn't find cache id for CPU %d\n", cpu); diff --git a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c b/arch/x86/kernel/cp= u/resctrl/pseudo_lock.c index 8f559eeae08e..8c5f932bc00b 100644 --- a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c +++ b/arch/x86/kernel/cpu/resctrl/pseudo_lock.c @@ -292,10 +292,14 @@ static void pseudo_lock_region_clear(struct pseudo_lo= ck_region *plr) */ static int pseudo_lock_region_init(struct pseudo_lock_region *plr) { + int scope =3D plr->s->res->scope; struct cpu_cacheinfo *ci; int ret; int i; =20 + if (WARN_ON_ONCE(scope !=3D RESCTRL_L2_CACHE && scope !=3D RESCTRL_L3_CAC= HE)) + return -ENODEV; + /* Pick the first cpu we find that is associated with the cache. 
*/ plr->cpu =3D cpumask_first(&plr->d->cpu_mask); =20 @@ -311,7 +315,7 @@ static int pseudo_lock_region_init(struct pseudo_lock_r= egion *plr) plr->size =3D rdtgroup_cbm_to_size(plr->s->res, plr->d, plr->cbm); =20 for (i =3D 0; i < ci->num_leaves; i++) { - if (ci->info_list[i].level =3D=3D plr->s->res->cache_level) { + if (ci->info_list[i].level =3D=3D scope) { plr->line_size =3D ci->info_list[i].coherency_line_size; return 0; } diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/r= esctrl/rdtgroup.c index 725344048f85..1cf2b36f5bf8 100644 --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c @@ -1345,10 +1345,13 @@ unsigned int rdtgroup_cbm_to_size(struct rdt_resour= ce *r, unsigned int size =3D 0; int num_b, i; =20 + if (WARN_ON_ONCE(r->scope !=3D RESCTRL_L2_CACHE && r->scope !=3D RESCTRL_= L3_CACHE)) + return -EINVAL; + num_b =3D bitmap_weight(&cbm, r->cache.cbm_len); ci =3D get_cpu_cacheinfo(cpumask_any(&d->cpu_mask)); for (i =3D 0; i < ci->num_leaves; i++) { - if (ci->info_list[i].level =3D=3D r->cache_level) { + if (ci->info_list[i].level =3D=3D r->scope) { size =3D ci->info_list[i].size / r->cache.cbm_len * num_b; break; } --=20 2.41.0 From nobody Fri Dec 19 02:50:45 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id AC0EACE7B1F for ; Thu, 28 Sep 2023 19:14:13 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231684AbjI1TOM (ORCPT ); Thu, 28 Sep 2023 15:14:12 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:44848 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231774AbjI1TOF (ORCPT ); Thu, 28 Sep 2023 15:14:05 -0400 Received: from mgamail.intel.com (mgamail.intel.com [134.134.136.126]) by lindbergh.monkeyblade.net (Postfix) 
 with ESMTPS id 34B62199; Thu, 28 Sep 2023 12:14:02 -0700 (PDT)
From: Tony Luck
To: Fenghua Yu, Reinette Chatre, Peter Newman, Jonathan Corbet,
 Shuah Khan, x86@kernel.org
Cc: Shaopeng Tan, James Morse, Jamie Iles, Babu Moger, Randy Dunlap,
 linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
 patches@lists.linux.dev, Tony Luck
Subject: [PATCH v6 2/8] x86/resctrl: Prepare to split rdt_domain structure
Date: Thu, 28 Sep 2023 12:13:43 -0700
Message-ID: <20230928191350.205703-3-tony.luck@intel.com>
X-Mailer: git-send-email 2.41.0
In-Reply-To: <20230928191350.205703-1-tony.luck@intel.com>
References: <20230829234426.64421-1-tony.luck@intel.com>
 <20230928191350.205703-1-tony.luck@intel.com>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" The rdt_domain structure is used for both control and monitor features. It is about to be split into separate structures for these two usages because the scope for control and monitoring features for a resource will be different for future resources. To allow for common code that scans a list of domains looking for a specific domain id, move the "list" and "id" fields into their own structure within the rdt_domain structure. Signed-off-by: Tony Luck Reviewed-by: Peter Newman --- Changes since v5 This is a new patch. Spawned from Reinette's review comment about my trying to mark some fields in the rdt_ctrl_domain and rdt_mon_domain structures as required to be the same. This is a better solution that doesn't require that developers read and obey those comments. --- include/linux/resctrl.h | 14 ++++++-- arch/x86/kernel/cpu/resctrl/core.c | 16 ++++----- arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 16 ++++----- arch/x86/kernel/cpu/resctrl/monitor.c | 2 +- arch/x86/kernel/cpu/resctrl/pseudo_lock.c | 2 +- arch/x86/kernel/cpu/resctrl/rdtgroup.c | 44 +++++++++++------------ 6 files changed, 51 insertions(+), 43 deletions(-) diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h index 618735e396cb..a583fa88ea5a 100644 --- a/include/linux/resctrl.h +++ b/include/linux/resctrl.h @@ -53,9 +53,18 @@ struct resctrl_staged_config { }; =20 /** - * struct rdt_domain - group of CPUs sharing a resctrl resource + * struct rdt_domain_hdr - common header for different domain types * @list: all instances of this resource * @id: unique id for this instance + */ +struct rdt_domain_hdr { + struct list_head list; + int id; +}; + +/** + * struct rdt_domain - group of CPUs sharing a resctrl resource + * @hdr: common header for different domain types * @cpu_mask: which CPUs share this resource * @rmid_busy_llc: bitmap of which limbo RMIDs are above threshold * @mbm_total: 
saved state for MBM total bandwidth @@ -71,8 +80,7 @@ struct resctrl_staged_config { * by closid */ struct rdt_domain { - struct list_head list; - int id; + struct rdt_domain_hdr hdr; struct cpumask cpu_mask; unsigned long *rmid_busy_llc; struct mbm_state *mbm_total; diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resct= rl/core.c index 3b1837e1fb6b..05369add4578 100644 --- a/arch/x86/kernel/cpu/resctrl/core.c +++ b/arch/x86/kernel/cpu/resctrl/core.c @@ -352,7 +352,7 @@ struct rdt_domain *get_domain_from_cpu(int cpu, struct = rdt_resource *r) { struct rdt_domain *d; =20 - list_for_each_entry(d, &r->domains, list) { + list_for_each_entry(d, &r->domains, hdr.list) { /* Find the domain that contains this CPU */ if (cpumask_test_cpu(cpu, &d->cpu_mask)) return d; @@ -401,12 +401,12 @@ struct rdt_domain *rdt_find_domain(struct rdt_resourc= e *r, int id, return ERR_PTR(-ENODEV); =20 list_for_each(l, &r->domains) { - d =3D list_entry(l, struct rdt_domain, list); + d =3D list_entry(l, struct rdt_domain, hdr.list); /* When id is found, return its domain. */ - if (id =3D=3D d->id) + if (id =3D=3D d->hdr.id) return d; /* Stop searching when finding id's position in sorted list. 
*/ - if (id < d->id) + if (id < d->hdr.id) break; } =20 @@ -544,7 +544,7 @@ static void domain_add_cpu(int cpu, struct rdt_resource= *r) return; =20 d =3D &hw_dom->d_resctrl; - d->id =3D id; + d->hdr.id =3D id; cpumask_set_cpu(cpu, &d->cpu_mask); =20 rdt_domain_reconfigure_cdp(r); @@ -559,11 +559,11 @@ static void domain_add_cpu(int cpu, struct rdt_resour= ce *r) return; } =20 - list_add_tail(&d->list, add_pos); + list_add_tail(&d->hdr.list, add_pos); =20 err =3D resctrl_online_domain(r, d); if (err) { - list_del(&d->list); + list_del(&d->hdr.list); domain_free(hw_dom); } } @@ -587,7 +587,7 @@ static void domain_remove_cpu(int cpu, struct rdt_resou= rce *r) cpumask_clear_cpu(cpu, &d->cpu_mask); if (cpumask_empty(&d->cpu_mask)) { resctrl_offline_domain(r, d); - list_del(&d->list); + list_del(&d->hdr.list); =20 /* * rdt_domain "d" is going to be freed below, so clear diff --git a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c b/arch/x86/kernel/cp= u/resctrl/ctrlmondata.c index b44c487727d4..8bce591a1018 100644 --- a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c +++ b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c @@ -67,7 +67,7 @@ int parse_bw(struct rdt_parse_data *data, struct resctrl_= schema *s, =20 cfg =3D &d->staged_config[s->conf_type]; if (cfg->have_new_ctrl) { - rdt_last_cmd_printf("Duplicate domain %d\n", d->id); + rdt_last_cmd_printf("Duplicate domain %d\n", d->hdr.id); return -EINVAL; } =20 @@ -144,7 +144,7 @@ int parse_cbm(struct rdt_parse_data *data, struct resct= rl_schema *s, =20 cfg =3D &d->staged_config[s->conf_type]; if (cfg->have_new_ctrl) { - rdt_last_cmd_printf("Duplicate domain %d\n", d->id); + rdt_last_cmd_printf("Duplicate domain %d\n", d->hdr.id); return -EINVAL; } =20 @@ -224,8 +224,8 @@ static int parse_line(char *line, struct resctrl_schema= *s, return -EINVAL; } dom =3D strim(dom); - list_for_each_entry(d, &r->domains, list) { - if (d->id =3D=3D dom_id) { + list_for_each_entry(d, &r->domains, hdr.list) { + if (d->hdr.id =3D=3D dom_id) { data.buf =3D 
dom; data.rdtgrp =3D rdtgrp; if (r->parse_ctrlval(&data, s, d)) @@ -316,7 +316,7 @@ int resctrl_arch_update_domains(struct rdt_resource *r,= u32 closid) return -ENOMEM; =20 msr_param.res =3D NULL; - list_for_each_entry(d, &r->domains, list) { + list_for_each_entry(d, &r->domains, hdr.list) { hw_dom =3D resctrl_to_arch_dom(d); for (t =3D 0; t < CDP_NUM_TYPES; t++) { cfg =3D &hw_dom->d_resctrl.staged_config[t]; @@ -464,7 +464,7 @@ static void show_doms(struct seq_file *s, struct resctr= l_schema *schema, int clo u32 ctrl_val; =20 seq_printf(s, "%*s:", max_name_width, schema->name); - list_for_each_entry(dom, &r->domains, list) { + list_for_each_entry(dom, &r->domains, hdr.list) { if (sep) seq_puts(s, ";"); =20 @@ -474,7 +474,7 @@ static void show_doms(struct seq_file *s, struct resctr= l_schema *schema, int clo ctrl_val =3D resctrl_arch_get_config(r, dom, closid, schema->conf_type); =20 - seq_printf(s, r->format_str, dom->id, max_data_width, + seq_printf(s, r->format_str, dom->hdr.id, max_data_width, ctrl_val); sep =3D true; } @@ -503,7 +503,7 @@ int rdtgroup_schemata_show(struct kernfs_open_file *of, } else { seq_printf(s, "%s:%d=3D%x\n", rdtgrp->plr->s->res->name, - rdtgrp->plr->d->id, + rdtgrp->plr->d->hdr.id, rdtgrp->plr->cbm); } } else { diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/re= sctrl/monitor.c index ded1fc7cb7cb..27cda5988d7f 100644 --- a/arch/x86/kernel/cpu/resctrl/monitor.c +++ b/arch/x86/kernel/cpu/resctrl/monitor.c @@ -340,7 +340,7 @@ static void add_rmid_to_limbo(struct rmid_entry *entry) =20 entry->busy =3D 0; cpu =3D get_cpu(); - list_for_each_entry(d, &r->domains, list) { + list_for_each_entry(d, &r->domains, hdr.list) { if (cpumask_test_cpu(cpu, &d->cpu_mask)) { err =3D resctrl_arch_rmid_read(r, d, entry->rmid, QOS_L3_OCCUP_EVENT_ID, diff --git a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c b/arch/x86/kernel/cp= u/resctrl/pseudo_lock.c index 8c5f932bc00b..18b6183a1b48 100644 --- 
a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c +++ b/arch/x86/kernel/cpu/resctrl/pseudo_lock.c @@ -856,7 +856,7 @@ bool rdtgroup_pseudo_locked_in_hierarchy(struct rdt_dom= ain *d) * associated with them. */ for_each_alloc_capable_rdt_resource(r) { - list_for_each_entry(d_i, &r->domains, list) { + list_for_each_entry(d_i, &r->domains, hdr.list) { if (d_i->plr) cpumask_or(cpu_with_psl, cpu_with_psl, &d_i->cpu_mask); diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/r= esctrl/rdtgroup.c index 1cf2b36f5bf8..42adf17ea6fa 100644 --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c @@ -86,7 +86,7 @@ void rdt_staged_configs_clear(void) lockdep_assert_held(&rdtgroup_mutex); =20 for_each_alloc_capable_rdt_resource(r) { - list_for_each_entry(dom, &r->domains, list) + list_for_each_entry(dom, &r->domains, hdr.list) memset(dom->staged_config, 0, sizeof(dom->staged_config)); } } @@ -928,12 +928,12 @@ static int rdt_bit_usage_show(struct kernfs_open_file= *of, =20 mutex_lock(&rdtgroup_mutex); hw_shareable =3D r->cache.shareable_bits; - list_for_each_entry(dom, &r->domains, list) { + list_for_each_entry(dom, &r->domains, hdr.list) { if (sep) seq_putc(seq, ';'); sw_shareable =3D 0; exclusive =3D 0; - seq_printf(seq, "%d=3D", dom->id); + seq_printf(seq, "%d=3D", dom->hdr.id); for (i =3D 0; i < closids_supported(); i++) { if (!closid_allocated(i)) continue; @@ -1233,7 +1233,7 @@ static bool rdtgroup_mode_test_exclusive(struct rdtgr= oup *rdtgrp) if (r->rid =3D=3D RDT_RESOURCE_MBA || r->rid =3D=3D RDT_RESOURCE_SMBA) continue; has_cache =3D true; - list_for_each_entry(d, &r->domains, list) { + list_for_each_entry(d, &r->domains, hdr.list) { ctrl =3D resctrl_arch_get_config(r, d, closid, s->conf_type); if (rdtgroup_cbm_overlaps(s, d, ctrl, closid, false)) { @@ -1398,7 +1398,7 @@ static int rdtgroup_size_show(struct kernfs_open_file= *of, size =3D rdtgroup_cbm_to_size(rdtgrp->plr->s->res, rdtgrp->plr->d, rdtgrp->plr->cbm); - 
seq_printf(s, "%d=3D%u\n", rdtgrp->plr->d->id, size); + seq_printf(s, "%d=3D%u\n", rdtgrp->plr->d->hdr.id, size); } goto out; } @@ -1410,7 +1410,7 @@ static int rdtgroup_size_show(struct kernfs_open_file= *of, type =3D schema->conf_type; sep =3D false; seq_printf(s, "%*s:", max_name_width, schema->name); - list_for_each_entry(d, &r->domains, list) { + list_for_each_entry(d, &r->domains, hdr.list) { if (sep) seq_putc(s, ';'); if (rdtgrp->mode =3D=3D RDT_MODE_PSEUDO_LOCKSETUP) { @@ -1428,7 +1428,7 @@ static int rdtgroup_size_show(struct kernfs_open_file= *of, else size =3D rdtgroup_cbm_to_size(r, d, ctrl); } - seq_printf(s, "%d=3D%u", d->id, size); + seq_printf(s, "%d=3D%u", d->hdr.id, size); sep =3D true; } seq_putc(s, '\n'); @@ -1499,7 +1499,7 @@ static int mbm_config_show(struct seq_file *s, struct= rdt_resource *r, u32 evtid =20 mutex_lock(&rdtgroup_mutex); =20 - list_for_each_entry(dom, &r->domains, list) { + list_for_each_entry(dom, &r->domains, hdr.list) { if (sep) seq_puts(s, ";"); =20 @@ -1507,7 +1507,7 @@ static int mbm_config_show(struct seq_file *s, struct= rdt_resource *r, u32 evtid mon_info.evtid =3D evtid; mondata_config_read(dom, &mon_info); =20 - seq_printf(s, "%d=3D0x%02x", dom->id, mon_info.mon_config); + seq_printf(s, "%d=3D0x%02x", dom->hdr.id, mon_info.mon_config); sep =3D true; } seq_puts(s, "\n"); @@ -1622,8 +1622,8 @@ static int mon_config_write(struct rdt_resource *r, c= har *tok, u32 evtid) return -EINVAL; } =20 - list_for_each_entry(d, &r->domains, list) { - if (d->id =3D=3D dom_id) { + list_for_each_entry(d, &r->domains, hdr.list) { + if (d->hdr.id =3D=3D dom_id) { ret =3D mbm_config_write_domain(r, d, evtid, val); if (ret) return -EINVAL; @@ -2141,7 +2141,7 @@ static int set_cache_qos_cfg(int level, bool enable) return -ENOMEM; =20 r_l =3D &rdt_resources_all[level].r_resctrl; - list_for_each_entry(d, &r_l->domains, list) { + list_for_each_entry(d, &r_l->domains, hdr.list) { if (r_l->cache.arch_has_per_cpu_cfg) /* Pick all the CPUs in the 
domain instance */ for_each_cpu(cpu, &d->cpu_mask) @@ -2226,7 +2226,7 @@ static int set_mba_sc(bool mba_sc) =20 r->membw.mba_sc =3D mba_sc; =20 - list_for_each_entry(d, &r->domains, list) { + list_for_each_entry(d, &r->domains, hdr.list) { for (i =3D 0; i < num_closid; i++) d->mbps_val[i] =3D MBA_MAX_MBPS; } @@ -2528,7 +2528,7 @@ static int rdt_get_tree(struct fs_context *fc) =20 if (is_mbm_enabled()) { r =3D &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl; - list_for_each_entry(dom, &r->domains, list) + list_for_each_entry(dom, &r->domains, hdr.list) mbm_setup_overflow_handler(dom, MBM_OVERFLOW_INTERVAL); } =20 @@ -2652,7 +2652,7 @@ static int reset_all_ctrls(struct rdt_resource *r) * CBMs in all domains to the maximum mask value. Pick one CPU * from each domain to update the MSRs below. */ - list_for_each_entry(d, &r->domains, list) { + list_for_each_entry(d, &r->domains, hdr.list) { hw_dom =3D resctrl_to_arch_dom(d); cpumask_set_cpu(cpumask_any(&d->cpu_mask), cpu_mask); =20 @@ -2858,7 +2858,7 @@ static int mkdir_mondata_subdir(struct kernfs_node *p= arent_kn, char name[32]; int ret; =20 - sprintf(name, "mon_%s_%02d", r->name, d->id); + sprintf(name, "mon_%s_%02d", r->name, d->hdr.id); /* create the directory */ kn =3D kernfs_create_dir(parent_kn, name, parent_kn->mode, prgrp); if (IS_ERR(kn)) @@ -2874,7 +2874,7 @@ static int mkdir_mondata_subdir(struct kernfs_node *p= arent_kn, } =20 priv.u.rid =3D r->rid; - priv.u.domid =3D d->id; + priv.u.domid =3D d->hdr.id; list_for_each_entry(mevt, &r->evt_list, list) { priv.u.evtid =3D mevt->evtid; ret =3D mon_addfile(kn, mevt->name, priv.priv); @@ -2922,7 +2922,7 @@ static int mkdir_mondata_subdir_alldom(struct kernfs_= node *parent_kn, struct rdt_domain *dom; int ret; =20 - list_for_each_entry(dom, &r->domains, list) { + list_for_each_entry(dom, &r->domains, hdr.list) { ret =3D mkdir_mondata_subdir(parent_kn, dom, r, prgrp); if (ret) return ret; @@ -3081,7 +3081,7 @@ static int __init_one_rdt_domain(struct rdt_domain *d= , 
struct resctrl_schema *s, */ tmp_cbm =3D cfg->new_ctrl; if (bitmap_weight(&tmp_cbm, r->cache.cbm_len) < r->cache.min_cbm_bits) { - rdt_last_cmd_printf("No space on %s:%d\n", s->name, d->id); + rdt_last_cmd_printf("No space on %s:%d\n", s->name, d->hdr.id); return -ENOSPC; } cfg->have_new_ctrl =3D true; @@ -3104,7 +3104,7 @@ static int rdtgroup_init_cat(struct resctrl_schema *s= , u32 closid) struct rdt_domain *d; int ret; =20 - list_for_each_entry(d, &s->res->domains, list) { + list_for_each_entry(d, &s->res->domains, hdr.list) { ret =3D __init_one_rdt_domain(d, s, closid); if (ret < 0) return ret; @@ -3119,7 +3119,7 @@ static void rdtgroup_init_mba(struct rdt_resource *r,= u32 closid) struct resctrl_staged_config *cfg; struct rdt_domain *d; =20 - list_for_each_entry(d, &r->domains, list) { + list_for_each_entry(d, &r->domains, hdr.list) { if (is_mba_sc(r)) { d->mbps_val[closid] =3D MBA_MAX_MBPS; continue; @@ -3726,7 +3726,7 @@ void resctrl_offline_domain(struct rdt_resource *r, s= truct rdt_domain *d) * per domain monitor data directories. 
*/ if (static_branch_unlikely(&rdt_mon_enable_key)) - rmdir_mondata_subdir_allrdtgrp(r, d->id); + rmdir_mondata_subdir_allrdtgrp(r, d->hdr.id); =20 if (is_mbm_enabled()) cancel_delayed_work(&d->mbm_over); --=20 2.41.0 From nobody Fri Dec 19 02:50:45 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id D6F2FE7F154 for ; Thu, 28 Sep 2023 19:14:19 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232283AbjI1TOT (ORCPT ); Thu, 28 Sep 2023 15:14:19 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:57924 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231426AbjI1TOG (ORCPT ); Thu, 28 Sep 2023 15:14:06 -0400 Received: from mgamail.intel.com (mgamail.intel.com [134.134.136.126]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 36DA7195; Thu, 28 Sep 2023 12:14:03 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1695928443; x=1727464443; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=WjyFjMEQ9QfPNTyw95WjG0m4TEBlmo9zg/UynmB0ffQ=; b=TxI8/5AQ7zaNdcJPdLkBerzgtkiOcxryWib6cx4VGYw/aADgzAPCfFNs XBND2IY9ycBSvyQ5JrKtvl7Bxa2iTtAl8o/iz72AosbHKk1u1SnkdG+Ku RXyU3TjXIP3La60wTqcgw/q6mhE8V9Jv9JRMxTllkl35aoBm/UziDmhdV csEWFi6WKasPRxVuimmSGKF4Tx1fVVsVF+4jQ2Nzg3FXqkLVd8Tu3sT6x +xmxVB4dB+WfQ4HadIPP01EmEeIQJS5I+GN7qvXVuugsGqGAVmwTluLYS Q//EmxpfVItbpCV/+Ct0wXPrsIAlXUat3M/ZKccVNA0g/JVuigZT4Mu8g Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10847"; a="367213897" X-IronPort-AV: E=Sophos;i="6.03,185,1694761200"; d="scan'208";a="367213897" Received: from orsmga008.jf.intel.com ([10.7.209.65]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 28 Sep 2023 12:13:59 -0700 X-ExtLoop1: 1 
From: Tony Luck
To: Fenghua Yu, Reinette Chatre, Peter Newman, Jonathan Corbet,
 Shuah Khan, x86@kernel.org
Cc: Shaopeng Tan, James Morse, Jamie Iles, Babu Moger, Randy Dunlap,
 linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
 patches@lists.linux.dev, Tony Luck
Subject: [PATCH v6 3/8] x86/resctrl: Prepare for different scope for
 control/monitor operations
Date: Thu, 28 Sep 2023 12:13:44 -0700
Message-ID: <20230928191350.205703-4-tony.luck@intel.com>
X-Mailer: git-send-email 2.41.0
In-Reply-To: <20230928191350.205703-1-tony.luck@intel.com>
References: <20230829234426.64421-1-tony.luck@intel.com>
 <20230928191350.205703-1-tony.luck@intel.com>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Type: text/plain; charset="utf-8"

Resctrl assumes that control and monitor operations on a resource are
performed at the same scope.

Prepare for systems that use different scopes (specifically L3 scope
for cache control and NODE scope for cache occupancy and memory
bandwidth monitoring).

Create separate domain lists for control and monitor operations.

Note that errors during initialization of either control or monitor
functions on a domain would previously result in that domain being
excluded from both control and monitor operations. Now that the domains
are allocated independently, it is no longer necessary to disable both
control and monitor operations if either fails.

Signed-off-by: Tony Luck
---
Changes since v5:

Commit comment: s/Existing resctrl assumes/Resctrl assumes/

Many new names. Put an underscore in "mon_domains" for consistency with
"mon_scope".
Do the same with all the other "mon" changes. Also rename "scope" to
"ctrl_scope", "domains" to "ctrl_domains" and all the associated
functions and macros.
---
 include/linux/resctrl.h                   |  18 +-
 arch/x86/kernel/cpu/resctrl/internal.h    |   4 +-
 arch/x86/kernel/cpu/resctrl/core.c        | 198 ++++++++++++++++------
 arch/x86/kernel/cpu/resctrl/ctrlmondata.c |  12 +-
 arch/x86/kernel/cpu/resctrl/monitor.c     |   2 +-
 arch/x86/kernel/cpu/resctrl/pseudo_lock.c |   4 +-
 arch/x86/kernel/cpu/resctrl/rdtgroup.c    |  54 +++---
 7 files changed, 200 insertions(+), 92 deletions(-)

diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index a583fa88ea5a..0af5c5aa5a6f 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -163,10 +163,12 @@ enum resctrl_scope {
  * @alloc_capable: Is allocation available on this machine
  * @mon_capable: Is monitor feature available on this machine
  * @num_rmid: Number of RMIDs available
- * @scope: Scope of this resource
+ * @ctrl_scope: Scope of this resource for control functions
+ * @mon_scope: Scope of this resource for monitor functions
  * @cache: Cache allocation related data
  * @membw: If the component has bandwidth controls, their properties.
- * @domains: All domains for this resource
+ * @ctrl_domains: Control domains for this resource
+ * @mon_domains: Monitor domains for this resource
  * @name: Name to use in "schemata" file.
  * @data_width: Character width of data when displaying
  * @default_ctrl: Specifies default cache cbm or memory B/W percent.
@@ -181,10 +183,12 @@ struct rdt_resource { bool alloc_capable; bool mon_capable; int num_rmid; - enum resctrl_scope scope; + enum resctrl_scope ctrl_scope; + enum resctrl_scope mon_scope; struct resctrl_cache cache; struct resctrl_membw membw; - struct list_head domains; + struct list_head ctrl_domains; + struct list_head mon_domains; char *name; int data_width; u32 default_ctrl; @@ -230,8 +234,10 @@ int resctrl_arch_update_one(struct rdt_resource *r, st= ruct rdt_domain *d, =20 u32 resctrl_arch_get_config(struct rdt_resource *r, struct rdt_domain *d, u32 closid, enum resctrl_conf_type type); -int resctrl_online_domain(struct rdt_resource *r, struct rdt_domain *d); -void resctrl_offline_domain(struct rdt_resource *r, struct rdt_domain *d); +int resctrl_online_ctrl_domain(struct rdt_resource *r, struct rdt_domain *= d); +int resctrl_online_mon_domain(struct rdt_resource *r, struct rdt_domain *d= ); +void resctrl_offline_ctrl_domain(struct rdt_resource *r, struct rdt_domain= *d); +void resctrl_offline_mon_domain(struct rdt_resource *r, struct rdt_domain = *d); =20 /** * resctrl_arch_rmid_read() - Read the eventid counter corresponding to rm= id diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/r= esctrl/internal.h index 85ceaf9a31ac..e9a2a8993d14 100644 --- a/arch/x86/kernel/cpu/resctrl/internal.h +++ b/arch/x86/kernel/cpu/resctrl/internal.h @@ -511,8 +511,8 @@ void rdtgroup_kn_unlock(struct kernfs_node *kn); int rdtgroup_kn_mode_restrict(struct rdtgroup *r, const char *name); int rdtgroup_kn_mode_restore(struct rdtgroup *r, const char *name, umode_t mask); -struct rdt_domain *rdt_find_domain(struct rdt_resource *r, int id, - struct list_head **pos); +struct rdt_domain_hdr *rdt_find_domain(struct list_head *h, int id, + struct list_head **pos); ssize_t rdtgroup_schemata_write(struct kernfs_open_file *of, char *buf, size_t nbytes, loff_t off); int rdtgroup_schemata_show(struct kernfs_open_file *of, diff --git 
a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resct= rl/core.c index 05369add4578..7ef178fb7c77 100644 --- a/arch/x86/kernel/cpu/resctrl/core.c +++ b/arch/x86/kernel/cpu/resctrl/core.c @@ -57,7 +57,8 @@ static void mba_wrmsr_amd(struct rdt_domain *d, struct msr_param *m, struct rdt_resource *r); =20 -#define domain_init(id) LIST_HEAD_INIT(rdt_resources_all[id].r_resctrl.dom= ains) +#define ctrl_domain_init(id) LIST_HEAD_INIT(rdt_resources_all[id].r_resctr= l.ctrl_domains) +#define mon_domain_init(id) LIST_HEAD_INIT(rdt_resources_all[id].r_resctrl= .mon_domains) =20 struct rdt_hw_resource rdt_resources_all[] =3D { [RDT_RESOURCE_L3] =3D @@ -65,8 +66,10 @@ struct rdt_hw_resource rdt_resources_all[] =3D { .r_resctrl =3D { .rid =3D RDT_RESOURCE_L3, .name =3D "L3", - .scope =3D RESCTRL_L3_CACHE, - .domains =3D domain_init(RDT_RESOURCE_L3), + .ctrl_scope =3D RESCTRL_L3_CACHE, + .mon_scope =3D RESCTRL_L3_CACHE, + .ctrl_domains =3D ctrl_domain_init(RDT_RESOURCE_L3), + .mon_domains =3D mon_domain_init(RDT_RESOURCE_L3), .parse_ctrlval =3D parse_cbm, .format_str =3D "%d=3D%0*x", .fflags =3D RFTYPE_RES_CACHE, @@ -79,8 +82,8 @@ struct rdt_hw_resource rdt_resources_all[] =3D { .r_resctrl =3D { .rid =3D RDT_RESOURCE_L2, .name =3D "L2", - .scope =3D RESCTRL_L2_CACHE, - .domains =3D domain_init(RDT_RESOURCE_L2), + .ctrl_scope =3D RESCTRL_L2_CACHE, + .ctrl_domains =3D ctrl_domain_init(RDT_RESOURCE_L2), .parse_ctrlval =3D parse_cbm, .format_str =3D "%d=3D%0*x", .fflags =3D RFTYPE_RES_CACHE, @@ -93,8 +96,8 @@ struct rdt_hw_resource rdt_resources_all[] =3D { .r_resctrl =3D { .rid =3D RDT_RESOURCE_MBA, .name =3D "MB", - .scope =3D RESCTRL_L3_CACHE, - .domains =3D domain_init(RDT_RESOURCE_MBA), + .ctrl_scope =3D RESCTRL_L3_CACHE, + .ctrl_domains =3D ctrl_domain_init(RDT_RESOURCE_MBA), .parse_ctrlval =3D parse_bw, .format_str =3D "%d=3D%*u", .fflags =3D RFTYPE_RES_MB, @@ -105,8 +108,8 @@ struct rdt_hw_resource rdt_resources_all[] =3D { .r_resctrl =3D { .rid =3D 
RDT_RESOURCE_SMBA, .name =3D "SMBA", - .scope =3D RESCTRL_L3_CACHE, - .domains =3D domain_init(RDT_RESOURCE_SMBA), + .ctrl_scope =3D RESCTRL_L3_CACHE, + .ctrl_domains =3D ctrl_domain_init(RDT_RESOURCE_SMBA), .parse_ctrlval =3D parse_bw, .format_str =3D "%d=3D%*u", .fflags =3D RFTYPE_RES_MB, @@ -352,7 +355,7 @@ struct rdt_domain *get_domain_from_cpu(int cpu, struct = rdt_resource *r) { struct rdt_domain *d; =20 - list_for_each_entry(d, &r->domains, hdr.list) { + list_for_each_entry(d, &r->ctrl_domains, hdr.list) { /* Find the domain that contains this CPU */ if (cpumask_test_cpu(cpu, &d->cpu_mask)) return d; @@ -384,29 +387,39 @@ void rdt_ctrl_update(void *arg) } =20 /* - * rdt_find_domain - Find a domain in a resource that matches input resour= ce id + * rdt_find_domain - Find a domain in one of a resource domain lists. * - * Search resource r's domain list to find the resource id. If the resource - * id is found in a domain, return the domain. Otherwise, if requested by - * caller, return the first domain whose id is bigger than the input id. + * Search the list to find the resource id. If the resource id is found + * in a domain, return the domain. Otherwise, if requested by caller, + * return the first domain whose id is bigger than the input id. * The domain list is sorted by id in ascending order. + * + * If an existing domain in the resource r's domain list matches the cpu's + * resource id, add the cpu in the domain. + * + * Otherwise, caller will allocate a new domain and insert into the right = position + * in the domain list sorted by id in ascending order. + * + * The order in the domain list is visible to users when we print entries + * in the schemata file and schemata input is validated to have the same o= rder + * as this list. 
*/ -struct rdt_domain *rdt_find_domain(struct rdt_resource *r, int id, - struct list_head **pos) +struct rdt_domain_hdr *rdt_find_domain(struct list_head *h, int id, + struct list_head **pos) { - struct rdt_domain *d; + struct rdt_domain_hdr *d; struct list_head *l; =20 if (id < 0) return ERR_PTR(-ENODEV); =20 - list_for_each(l, &r->domains) { - d =3D list_entry(l, struct rdt_domain, hdr.list); + list_for_each(l, h) { + d =3D list_entry(l, struct rdt_domain_hdr, list); /* When id is found, return its domain. */ - if (id =3D=3D d->hdr.id) + if (id =3D=3D d->id) return d; /* Stop searching when finding id's position in sorted list. */ - if (id < d->hdr.id) + if (id < d->id) break; } =20 @@ -500,37 +513,27 @@ static int get_domain_id_from_scope(int cpu, enum res= ctrl_scope scope) return -EINVAL; } =20 -/* - * domain_add_cpu - Add a cpu to a resource's domain list. - * - * If an existing domain in the resource r's domain list matches the cpu's - * resource id, add the cpu in the domain. - * - * Otherwise, a new domain is allocated and inserted into the right positi= on - * in the domain list sorted by id in ascending order. - * - * The order in the domain list is visible to users when we print entries - * in the schemata file and schemata input is validated to have the same o= rder - * as this list. 
- */ -static void domain_add_cpu(int cpu, struct rdt_resource *r) +static void domain_add_cpu_ctrl(int cpu, struct rdt_resource *r) { - int id =3D get_domain_id_from_scope(cpu, r->scope); + int id =3D get_domain_id_from_scope(cpu, r->ctrl_scope); struct list_head *add_pos =3D NULL; struct rdt_hw_domain *hw_dom; + struct rdt_domain_hdr *hdr; struct rdt_domain *d; int err; =20 if (id < 0) { - pr_warn_once("Can't find domain id for CPU:%d scope:%d for resource %s\n= ", - cpu, r->scope, r->name); + pr_warn_once("Can't find control domain id for CPU:%d scope:%d for resou= rce %s\n", + cpu, r->ctrl_scope, r->name); return; } - d =3D rdt_find_domain(r, id, &add_pos); - if (IS_ERR(d)) { - pr_warn("Couldn't find cache id for CPU %d\n", cpu); + + hdr =3D rdt_find_domain(&r->ctrl_domains, id, &add_pos); + if (IS_ERR(hdr)) { + pr_warn("Couldn't find control scope id=3D%d for CPU %d\n", id, cpu); return; } + d =3D container_of(hdr, struct rdt_domain, hdr); =20 if (d) { cpumask_set_cpu(cpu, &d->cpu_mask); @@ -549,44 +552,101 @@ static void domain_add_cpu(int cpu, struct rdt_resou= rce *r) =20 rdt_domain_reconfigure_cdp(r); =20 - if (r->alloc_capable && domain_setup_ctrlval(r, d)) { + if (domain_setup_ctrlval(r, d)) { domain_free(hw_dom); return; } =20 - if (r->mon_capable && arch_domain_mbm_alloc(r->num_rmid, hw_dom)) { + list_add_tail(&d->hdr.list, add_pos); + + err =3D resctrl_online_ctrl_domain(r, d); + if (err) { + list_del(&d->hdr.list); domain_free(hw_dom); + } +} + +static void domain_add_cpu_mon(int cpu, struct rdt_resource *r) +{ + int id =3D get_domain_id_from_scope(cpu, r->mon_scope); + struct list_head *add_pos =3D NULL; + struct rdt_hw_domain *hw_mondom; + struct rdt_domain_hdr *hdr; + struct rdt_domain *d; + int err; + + if (id < 0) { + pr_warn_once("Can't find monitor domain id for CPU:%d scope:%d for resou= rce %s\n", + cpu, r->mon_scope, r->name); + return; + } + + hdr =3D rdt_find_domain(&r->mon_domains, id, &add_pos); + if (IS_ERR(hdr)) { + pr_warn("Couldn't 
find monitor scope id=3D%d for CPU %d\n", id, cpu); + return; + } + d =3D container_of(hdr, struct rdt_domain, hdr); + + if (d) { + cpumask_set_cpu(cpu, &d->cpu_mask); + return; + } + + hw_mondom =3D kzalloc_node(sizeof(*hw_mondom), GFP_KERNEL, cpu_to_node(cp= u)); + if (!hw_mondom) + return; + + d =3D &hw_mondom->d_resctrl; + d->hdr.id =3D id; + cpumask_set_cpu(cpu, &d->cpu_mask); + + if (arch_domain_mbm_alloc(r->num_rmid, hw_mondom)) { + domain_free(hw_mondom); return; } =20 list_add_tail(&d->hdr.list, add_pos); =20 - err =3D resctrl_online_domain(r, d); + err =3D resctrl_online_mon_domain(r, d); if (err) { list_del(&d->hdr.list); - domain_free(hw_dom); + domain_free(hw_mondom); } } =20 -static void domain_remove_cpu(int cpu, struct rdt_resource *r) +/* + * domain_add_cpu - Add a cpu to either/both resource's domain lists. + */ +static void domain_add_cpu(int cpu, struct rdt_resource *r) +{ + if (r->alloc_capable) + domain_add_cpu_ctrl(cpu, r); + if (r->mon_capable) + domain_add_cpu_mon(cpu, r); +} + +static void domain_remove_cpu_ctrl(int cpu, struct rdt_resource *r) { - int id =3D get_domain_id_from_scope(cpu, r->scope); + int id =3D get_domain_id_from_scope(cpu, r->ctrl_scope); struct rdt_hw_domain *hw_dom; + struct rdt_domain_hdr *hdr; struct rdt_domain *d; =20 if (id < 0) return; =20 - d =3D rdt_find_domain(r, id, NULL); - if (IS_ERR_OR_NULL(d)) { - pr_warn("Couldn't find cache id for CPU %d\n", cpu); + hdr =3D rdt_find_domain(&r->ctrl_domains, id, NULL); + if (IS_ERR_OR_NULL(hdr)) { + pr_warn("Couldn't find control scope id=3D%d for CPU %d\n", id, cpu); return; } + d =3D container_of(hdr, struct rdt_domain, hdr); hw_dom =3D resctrl_to_arch_dom(d); =20 cpumask_clear_cpu(cpu, &d->cpu_mask); if (cpumask_empty(&d->cpu_mask)) { - resctrl_offline_domain(r, d); + resctrl_offline_ctrl_domain(r, d); list_del(&d->hdr.list); =20 /* @@ -599,6 +659,34 @@ static void domain_remove_cpu(int cpu, struct rdt_reso= urce *r) =20 return; } +} + +static void 
domain_remove_cpu_mon(int cpu, struct rdt_resource *r) +{ + int id =3D get_domain_id_from_scope(cpu, r->mon_scope); + struct rdt_hw_domain *hw_mondom; + struct rdt_domain_hdr *hdr; + struct rdt_domain *d; + + if (id < 0) + return; + + hdr =3D rdt_find_domain(&r->mon_domains, id, NULL); + if (IS_ERR_OR_NULL(hdr)) { + pr_warn("Couldn't find scope id=3D%d for CPU %d\n", id, cpu); + return; + } + d =3D container_of(hdr, struct rdt_domain, hdr); + hw_mondom =3D resctrl_to_arch_dom(d); + + cpumask_clear_cpu(cpu, &d->cpu_mask); + if (cpumask_empty(&d->cpu_mask)) { + resctrl_offline_mon_domain(r, d); + list_del(&d->hdr.list); + domain_free(hw_mondom); + + return; + } =20 if (r =3D=3D &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl) { if (is_mbm_enabled() && cpu =3D=3D d->mbm_work_cpu) { @@ -613,6 +701,14 @@ static void domain_remove_cpu(int cpu, struct rdt_reso= urce *r) } } =20 +static void domain_remove_cpu(int cpu, struct rdt_resource *r) +{ + if (r->alloc_capable) + domain_remove_cpu_ctrl(cpu, r); + if (r->mon_capable) + domain_remove_cpu_mon(cpu, r); +} + static void clear_closid_rmid(int cpu) { struct resctrl_pqr_state *state =3D this_cpu_ptr(&pqr_state); diff --git a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c b/arch/x86/kernel/cp= u/resctrl/ctrlmondata.c index 8bce591a1018..a6261e177cc1 100644 --- a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c +++ b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c @@ -224,7 +224,7 @@ static int parse_line(char *line, struct resctrl_schema= *s, return -EINVAL; } dom =3D strim(dom); - list_for_each_entry(d, &r->domains, hdr.list) { + list_for_each_entry(d, &r->ctrl_domains, hdr.list) { if (d->hdr.id =3D=3D dom_id) { data.buf =3D dom; data.rdtgrp =3D rdtgrp; @@ -316,7 +316,7 @@ int resctrl_arch_update_domains(struct rdt_resource *r,= u32 closid) return -ENOMEM; =20 msr_param.res =3D NULL; - list_for_each_entry(d, &r->domains, hdr.list) { + list_for_each_entry(d, &r->ctrl_domains, hdr.list) { hw_dom =3D resctrl_to_arch_dom(d); for (t =3D 0; t < 
CDP_NUM_TYPES; t++) { cfg =3D &hw_dom->d_resctrl.staged_config[t]; @@ -464,7 +464,7 @@ static void show_doms(struct seq_file *s, struct resctr= l_schema *schema, int clo u32 ctrl_val; =20 seq_printf(s, "%*s:", max_name_width, schema->name); - list_for_each_entry(dom, &r->domains, hdr.list) { + list_for_each_entry(dom, &r->ctrl_domains, hdr.list) { if (sep) seq_puts(s, ";"); =20 @@ -540,6 +540,7 @@ void mon_event_read(struct rmid_read *rr, struct rdt_re= source *r, int rdtgroup_mondata_show(struct seq_file *m, void *arg) { struct kernfs_open_file *of =3D m->private; + struct rdt_domain_hdr *hdr; u32 resid, evtid, domid; struct rdtgroup *rdtgrp; struct rdt_resource *r; @@ -560,11 +561,12 @@ int rdtgroup_mondata_show(struct seq_file *m, void *a= rg) evtid =3D md.u.evtid; =20 r =3D &rdt_resources_all[resid].r_resctrl; - d =3D rdt_find_domain(r, domid, NULL); - if (IS_ERR_OR_NULL(d)) { + hdr =3D rdt_find_domain(&r->mon_domains, domid, NULL); + if (IS_ERR_OR_NULL(hdr)) { ret =3D -ENOENT; goto out; } + d =3D container_of(hdr, struct rdt_domain, hdr); =20 mon_event_read(&rr, r, d, rdtgrp, evtid, false); =20 diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/re= sctrl/monitor.c index 27cda5988d7f..3265b8499e2a 100644 --- a/arch/x86/kernel/cpu/resctrl/monitor.c +++ b/arch/x86/kernel/cpu/resctrl/monitor.c @@ -340,7 +340,7 @@ static void add_rmid_to_limbo(struct rmid_entry *entry) =20 entry->busy =3D 0; cpu =3D get_cpu(); - list_for_each_entry(d, &r->domains, hdr.list) { + list_for_each_entry(d, &r->mon_domains, hdr.list) { if (cpumask_test_cpu(cpu, &d->cpu_mask)) { err =3D resctrl_arch_rmid_read(r, d, entry->rmid, QOS_L3_OCCUP_EVENT_ID, diff --git a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c b/arch/x86/kernel/cp= u/resctrl/pseudo_lock.c index 18b6183a1b48..bda32b4e1c1e 100644 --- a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c +++ b/arch/x86/kernel/cpu/resctrl/pseudo_lock.c @@ -292,7 +292,7 @@ static void pseudo_lock_region_clear(struct pseudo_lock= _region 
*plr) */ static int pseudo_lock_region_init(struct pseudo_lock_region *plr) { - int scope =3D plr->s->res->scope; + int scope =3D plr->s->res->ctrl_scope; struct cpu_cacheinfo *ci; int ret; int i; @@ -856,7 +856,7 @@ bool rdtgroup_pseudo_locked_in_hierarchy(struct rdt_dom= ain *d) * associated with them. */ for_each_alloc_capable_rdt_resource(r) { - list_for_each_entry(d_i, &r->domains, hdr.list) { + list_for_each_entry(d_i, &r->ctrl_domains, hdr.list) { if (d_i->plr) cpumask_or(cpu_with_psl, cpu_with_psl, &d_i->cpu_mask); diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/r= esctrl/rdtgroup.c index 42adf17ea6fa..8132f81f31bb 100644 --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c @@ -86,7 +86,7 @@ void rdt_staged_configs_clear(void) lockdep_assert_held(&rdtgroup_mutex); =20 for_each_alloc_capable_rdt_resource(r) { - list_for_each_entry(dom, &r->domains, hdr.list) + list_for_each_entry(dom, &r->ctrl_domains, hdr.list) memset(dom->staged_config, 0, sizeof(dom->staged_config)); } } @@ -928,7 +928,7 @@ static int rdt_bit_usage_show(struct kernfs_open_file *= of, =20 mutex_lock(&rdtgroup_mutex); hw_shareable =3D r->cache.shareable_bits; - list_for_each_entry(dom, &r->domains, hdr.list) { + list_for_each_entry(dom, &r->ctrl_domains, hdr.list) { if (sep) seq_putc(seq, ';'); sw_shareable =3D 0; @@ -1233,7 +1233,7 @@ static bool rdtgroup_mode_test_exclusive(struct rdtgr= oup *rdtgrp) if (r->rid =3D=3D RDT_RESOURCE_MBA || r->rid =3D=3D RDT_RESOURCE_SMBA) continue; has_cache =3D true; - list_for_each_entry(d, &r->domains, hdr.list) { + list_for_each_entry(d, &r->ctrl_domains, hdr.list) { ctrl =3D resctrl_arch_get_config(r, d, closid, s->conf_type); if (rdtgroup_cbm_overlaps(s, d, ctrl, closid, false)) { @@ -1345,13 +1345,13 @@ unsigned int rdtgroup_cbm_to_size(struct rdt_resour= ce *r, unsigned int size =3D 0; int num_b, i; =20 - if (WARN_ON_ONCE(r->scope !=3D RESCTRL_L2_CACHE && r->scope !=3D RESCTRL_= 
L3_CACHE)) + if (WARN_ON_ONCE(r->ctrl_scope !=3D RESCTRL_L2_CACHE && r->ctrl_scope != =3D RESCTRL_L3_CACHE)) return -EINVAL; =20 num_b =3D bitmap_weight(&cbm, r->cache.cbm_len); ci =3D get_cpu_cacheinfo(cpumask_any(&d->cpu_mask)); for (i =3D 0; i < ci->num_leaves; i++) { - if (ci->info_list[i].level =3D=3D r->scope) { + if (ci->info_list[i].level =3D=3D r->ctrl_scope) { size =3D ci->info_list[i].size / r->cache.cbm_len * num_b; break; } @@ -1410,7 +1410,7 @@ static int rdtgroup_size_show(struct kernfs_open_file= *of, type =3D schema->conf_type; sep =3D false; seq_printf(s, "%*s:", max_name_width, schema->name); - list_for_each_entry(d, &r->domains, hdr.list) { + list_for_each_entry(d, &r->ctrl_domains, hdr.list) { if (sep) seq_putc(s, ';'); if (rdtgrp->mode =3D=3D RDT_MODE_PSEUDO_LOCKSETUP) { @@ -1499,7 +1499,7 @@ static int mbm_config_show(struct seq_file *s, struct= rdt_resource *r, u32 evtid =20 mutex_lock(&rdtgroup_mutex); =20 - list_for_each_entry(dom, &r->domains, hdr.list) { + list_for_each_entry(dom, &r->mon_domains, hdr.list) { if (sep) seq_puts(s, ";"); =20 @@ -1622,7 +1622,7 @@ static int mon_config_write(struct rdt_resource *r, c= har *tok, u32 evtid) return -EINVAL; } =20 - list_for_each_entry(d, &r->domains, hdr.list) { + list_for_each_entry(d, &r->mon_domains, hdr.list) { if (d->hdr.id =3D=3D dom_id) { ret =3D mbm_config_write_domain(r, d, evtid, val); if (ret) @@ -2141,7 +2141,7 @@ static int set_cache_qos_cfg(int level, bool enable) return -ENOMEM; =20 r_l =3D &rdt_resources_all[level].r_resctrl; - list_for_each_entry(d, &r_l->domains, hdr.list) { + list_for_each_entry(d, &r_l->ctrl_domains, hdr.list) { if (r_l->cache.arch_has_per_cpu_cfg) /* Pick all the CPUs in the domain instance */ for_each_cpu(cpu, &d->cpu_mask) @@ -2226,7 +2226,7 @@ static int set_mba_sc(bool mba_sc) =20 r->membw.mba_sc =3D mba_sc; =20 - list_for_each_entry(d, &r->domains, hdr.list) { + list_for_each_entry(d, &r->ctrl_domains, hdr.list) { for (i =3D 0; i < num_closid; i++) 
d->mbps_val[i] =3D MBA_MAX_MBPS; } @@ -2528,7 +2528,7 @@ static int rdt_get_tree(struct fs_context *fc) =20 if (is_mbm_enabled()) { r =3D &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl; - list_for_each_entry(dom, &r->domains, hdr.list) + list_for_each_entry(dom, &r->mon_domains, hdr.list) mbm_setup_overflow_handler(dom, MBM_OVERFLOW_INTERVAL); } =20 @@ -2649,10 +2649,10 @@ static int reset_all_ctrls(struct rdt_resource *r) =20 /* * Disable resource control for this resource by setting all - * CBMs in all domains to the maximum mask value. Pick one CPU + * CBMs in all ctrl_domains to the maximum mask value. Pick one CPU * from each domain to update the MSRs below. */ - list_for_each_entry(d, &r->domains, hdr.list) { + list_for_each_entry(d, &r->ctrl_domains, hdr.list) { hw_dom =3D resctrl_to_arch_dom(d); cpumask_set_cpu(cpumask_any(&d->cpu_mask), cpu_mask); =20 @@ -2922,7 +2922,7 @@ static int mkdir_mondata_subdir_alldom(struct kernfs_= node *parent_kn, struct rdt_domain *dom; int ret; =20 - list_for_each_entry(dom, &r->domains, hdr.list) { + list_for_each_entry(dom, &r->mon_domains, hdr.list) { ret =3D mkdir_mondata_subdir(parent_kn, dom, r, prgrp); if (ret) return ret; @@ -3104,7 +3104,7 @@ static int rdtgroup_init_cat(struct resctrl_schema *s= , u32 closid) struct rdt_domain *d; int ret; =20 - list_for_each_entry(d, &s->res->domains, hdr.list) { + list_for_each_entry(d, &s->res->ctrl_domains, hdr.list) { ret =3D __init_one_rdt_domain(d, s, closid); if (ret < 0) return ret; @@ -3119,7 +3119,7 @@ static void rdtgroup_init_mba(struct rdt_resource *r,= u32 closid) struct resctrl_staged_config *cfg; struct rdt_domain *d; =20 - list_for_each_entry(d, &r->domains, hdr.list) { + list_for_each_entry(d, &r->ctrl_domains, hdr.list) { if (is_mba_sc(r)) { d->mbps_val[closid] =3D MBA_MAX_MBPS; continue; @@ -3711,16 +3711,16 @@ static void domain_destroy_mon_state(struct rdt_dom= ain *d) kfree(d->mbm_local); } =20 -void resctrl_offline_domain(struct rdt_resource *r, struct 
rdt_domain *d) +void resctrl_offline_ctrl_domain(struct rdt_resource *r, struct rdt_domain= *d) { lockdep_assert_held(&rdtgroup_mutex); =20 if (supports_mba_mbps() && r->rid =3D=3D RDT_RESOURCE_MBA) mba_sc_domain_destroy(r, d); +} =20 - if (!r->mon_capable) - return; - +void resctrl_offline_mon_domain(struct rdt_resource *r, struct rdt_domain = *d) +{ /* * If resctrl is mounted, remove all the * per domain monitor data directories. @@ -3776,18 +3776,22 @@ static int domain_setup_mon_state(struct rdt_resour= ce *r, struct rdt_domain *d) return 0; } =20 -int resctrl_online_domain(struct rdt_resource *r, struct rdt_domain *d) +int resctrl_online_ctrl_domain(struct rdt_resource *r, struct rdt_domain *= d) { - int err; - lockdep_assert_held(&rdtgroup_mutex); =20 if (supports_mba_mbps() && r->rid =3D=3D RDT_RESOURCE_MBA) /* RDT_RESOURCE_MBA is never mon_capable */ return mba_sc_domain_allocate(r, d); =20 - if (!r->mon_capable) - return 0; + return 0; +} + +int resctrl_online_mon_domain(struct rdt_resource *r, struct rdt_domain *d) +{ + int err; + + lockdep_assert_held(&rdtgroup_mutex); =20 err =3D domain_setup_mon_state(r, d); if (err) --=20 2.41.0 From nobody Fri Dec 19 02:50:45 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id ABEEACE7B1E for ; Thu, 28 Sep 2023 19:14:25 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232252AbjI1TOZ (ORCPT ); Thu, 28 Sep 2023 15:14:25 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:58022 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232117AbjI1TON (ORCPT ); Thu, 28 Sep 2023 15:14:13 -0400 Received: from mgamail.intel.com (mgamail.intel.com [134.134.136.126]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A4328194; Thu, 28 Sep 2023 12:14:03 -0700 
(PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1695928443; x=1727464443; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=PKKJBbnMer+IGpPdDFuKBb4TtXUxA2W2eBeyRg4qQhM=; b=NIzpmkwPlOiEmq11ZAAOExubA83nw3+Fs1kKVVgFNmRzLL38h8Ku8yNF LmLKJ2DKp3KEjS45iWNwDyZAuxhh05UXa34esy68YTXOuyXFQ3sYFqq3x +EOkCMQWxvynmhgvTyEXCWtdrm3425gGI7jiXgdYEIDqs+kL25CDrfdVq HBXleI6ldvP9aGeehOD5RjzUNJBjTOqgVP0X6aYXY0hK+B5m7trsLK5VS UaduRpDsMP/UjcEBsjpe2nXsgjHSfwYyJjmkmx4ESrTnmE9Ty/O5xTfGG N9vjjSH9F1ZJM4rcABMnYahLSScko87f3xC0iL4dGk1yi6+TdkXioGf/f A==; X-IronPort-AV: E=McAfee;i="6600,9927,10847"; a="367213913" X-IronPort-AV: E=Sophos;i="6.03,185,1694761200"; d="scan'208";a="367213913" Received: from orsmga008.jf.intel.com ([10.7.209.65]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 28 Sep 2023 12:14:00 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10847"; a="779020033" X-IronPort-AV: E=Sophos;i="6.03,185,1694761200"; d="scan'208";a="779020033" Received: from agluck-desk3.sc.intel.com ([172.25.222.74]) by orsmga008-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 28 Sep 2023 12:13:59 -0700 From: Tony Luck To: Fenghua Yu , Reinette Chatre , Peter Newman , Jonathan Corbet , Shuah Khan , x86@kernel.org Cc: Shaopeng Tan , James Morse , Jamie Iles , Babu Moger , Randy Dunlap , linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, patches@lists.linux.dev, Tony Luck Subject: [PATCH v6 4/8] x86/resctrl: Split the rdt_domain and rdt_hw_domain structures Date: Thu, 28 Sep 2023 12:13:45 -0700 Message-ID: <20230928191350.205703-5-tony.luck@intel.com> X-Mailer: git-send-email 2.41.0 In-Reply-To: <20230928191350.205703-1-tony.luck@intel.com> References: <20230829234426.64421-1-tony.luck@intel.com> <20230928191350.205703-1-tony.luck@intel.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: 
linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" The same rdt_domain structure is used for both control and monitor functions. But this results in wasted memory as some of the fields are only used by control functions, while most are only used for monitor functions. Split into separate rdt_ctrl_domain and rdt_mon_domain structures with just the fields required for control and monitoring respectively. Split the rdt_hw_domain structure similarly into rdt_hw_ctrl_domain and rdt_hw_mon_domain. Signed-off-by: Tony Luck Reviewed-by: Peter Newman --- Changes since v5: Make rdt_find_domain() work on either control or monitor domains using the infrastructure set up in the previous patch to have a common header structure in each. Don't use a field parameter in the domain_init() macro, just provide separate "ctrl_domain_init()" and "mon_domain_init()" versions. Improve error messages if domain_add_cpu_{ctrl,mon}() fail to locate a domain when adding a CPU. Re-order local variable declarations to maintain reverse fir tree pattern in functions where the name changes in this patch broke the pattern. Moved the comment describing how domain lists are ordered that used to be in front of domain_add_cpu() to rdt_find_domain() which is doing the majority of this work. Dropped two blank lines that don't belong. 
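For readers following the first changelog item: the idea of searching either domain list through a header shared by both domain types can be sketched in plain userspace C. This is only an illustration under stated assumptions, not code from this series. Every demo_* name is hypothetical, a singly linked list stands in for the kernel's struct list_head, and the ERR_PTR(-ENODEV) handling for negative ids is left out.

```c
/* Userspace sketch only: all demo_* names are hypothetical, not kernel API. */
#include <assert.h>
#include <stddef.h>

/* Common header embedded in every domain type. */
struct demo_hdr {
	struct demo_hdr *next;	/* singly linked, sorted by ascending id */
	int id;
};

/* Control domains carry only control state. */
struct demo_ctrl_domain {
	struct demo_hdr hdr;
	unsigned int staged_cbm;
};

/* Monitor domains carry only monitor state. */
struct demo_mon_domain {
	struct demo_hdr hdr;
	int mbm_work_cpu;
};

/* Recover the enclosing structure from a pointer to one of its members. */
#define demo_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/*
 * Search a sorted list for id. Returns the matching header, or NULL.
 * On a miss, if pos is non-NULL, *pos is set to the link where a new
 * header with this id must be stored to keep the list sorted.
 */
struct demo_hdr *demo_find(struct demo_hdr **head, int id,
			   struct demo_hdr ***pos)
{
	struct demo_hdr **link;

	for (link = head; *link; link = &(*link)->next) {
		if ((*link)->id == id)
			return *link;
		/* Stop at the first entry with a bigger id. */
		if (id < (*link)->id)
			break;
	}
	if (pos)
		*pos = link;
	return NULL;
}

/* Insert hdr in sorted position. Returns 0, or -1 if the id exists. */
int demo_insert(struct demo_hdr **head, struct demo_hdr *hdr)
{
	struct demo_hdr **pos;

	assert(head && hdr);
	if (demo_find(head, hdr->id, &pos))
		return -1;
	hdr->next = *pos;
	*pos = hdr;
	return 0;
}
```

The lookup never needs to know which kind of domain it is walking. A caller that knows which list it searched converts the returned header back with demo_container_of(), much as the patch does with container_of() after calling rdt_find_domain().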
--- include/linux/resctrl.h | 50 +++++++------ arch/x86/kernel/cpu/resctrl/internal.h | 60 ++++++++++------ arch/x86/kernel/cpu/resctrl/core.c | 87 ++++++++++++----------- arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 32 ++++----- arch/x86/kernel/cpu/resctrl/monitor.c | 40 +++++------ arch/x86/kernel/cpu/resctrl/pseudo_lock.c | 6 +- arch/x86/kernel/cpu/resctrl/rdtgroup.c | 62 ++++++++-------- 7 files changed, 184 insertions(+), 153 deletions(-) diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h index 0af5c5aa5a6f..1c925e3db2ea 100644 --- a/include/linux/resctrl.h +++ b/include/linux/resctrl.h @@ -63,7 +63,25 @@ struct rdt_domain_hdr { }; =20 /** - * struct rdt_domain - group of CPUs sharing a resctrl resource + * struct rdt_ctrl_domain - group of CPUs sharing a resctrl control resour= ce + * @hdr: common header for different domain types + * @cpu_mask: which CPUs share this resource + * @plr: pseudo-locked region (if any) associated with domain + * @staged_config: parsed configuration to be applied + * @mbps_val: When mba_sc is enabled, this holds the array of user + * specified control values for mba_sc in MBps, indexed + * by closid + */ +struct rdt_ctrl_domain { + struct rdt_domain_hdr hdr; + struct cpumask cpu_mask; + struct pseudo_lock_region *plr; + struct resctrl_staged_config staged_config[CDP_NUM_TYPES]; + u32 *mbps_val; +}; + +/** + * struct rdt_mon_domain - group of CPUs sharing a resctrl monitor resource * @hdr: common header for different domain types * @cpu_mask: which CPUs share this resource * @rmid_busy_llc: bitmap of which limbo RMIDs are above threshold @@ -73,13 +91,8 @@ struct rdt_domain_hdr { * @cqm_limbo: worker to periodically read CQM h/w counters * @mbm_work_cpu: worker CPU for MBM h/w counters * @cqm_work_cpu: worker CPU for CQM h/w counters - * @plr: pseudo-locked region (if any) associated with domain - * @staged_config: parsed configuration to be applied - * @mbps_val: When mba_sc is enabled, this holds the array of user - * 
specified control values for mba_sc in MBps, indexed - * by closid */ -struct rdt_domain { +struct rdt_mon_domain { struct rdt_domain_hdr hdr; struct cpumask cpu_mask; unsigned long *rmid_busy_llc; @@ -89,9 +102,6 @@ struct rdt_domain { struct delayed_work cqm_limbo; int mbm_work_cpu; int cqm_work_cpu; - struct pseudo_lock_region *plr; - struct resctrl_staged_config staged_config[CDP_NUM_TYPES]; - u32 *mbps_val; }; =20 /** @@ -195,7 +205,7 @@ struct rdt_resource { const char *format_str; int (*parse_ctrlval)(struct rdt_parse_data *data, struct resctrl_schema *s, - struct rdt_domain *d); + struct rdt_ctrl_domain *d); struct list_head evt_list; unsigned long fflags; bool cdp_capable; @@ -229,15 +239,15 @@ int resctrl_arch_update_domains(struct rdt_resource *= r, u32 closid); * Update the ctrl_val and apply this config right now. * Must be called on one of the domain's CPUs. */ -int resctrl_arch_update_one(struct rdt_resource *r, struct rdt_domain *d, +int resctrl_arch_update_one(struct rdt_resource *r, struct rdt_ctrl_domain= *d, u32 closid, enum resctrl_conf_type t, u32 cfg_val); =20 -u32 resctrl_arch_get_config(struct rdt_resource *r, struct rdt_domain *d, +u32 resctrl_arch_get_config(struct rdt_resource *r, struct rdt_ctrl_domain= *d, u32 closid, enum resctrl_conf_type type); -int resctrl_online_ctrl_domain(struct rdt_resource *r, struct rdt_domain *= d); -int resctrl_online_mon_domain(struct rdt_resource *r, struct rdt_domain *d= ); -void resctrl_offline_ctrl_domain(struct rdt_resource *r, struct rdt_domain= *d); -void resctrl_offline_mon_domain(struct rdt_resource *r, struct rdt_domain = *d); +int resctrl_online_ctrl_domain(struct rdt_resource *r, struct rdt_ctrl_dom= ain *d); +int resctrl_online_mon_domain(struct rdt_resource *r, struct rdt_mon_domai= n *d); +void resctrl_offline_ctrl_domain(struct rdt_resource *r, struct rdt_ctrl_d= omain *d); +void resctrl_offline_mon_domain(struct rdt_resource *r, struct rdt_mon_dom= ain *d); =20 /** * 
resctrl_arch_rmid_read() - Read the eventid counter corresponding to rm= id @@ -253,7 +263,7 @@ void resctrl_offline_mon_domain(struct rdt_resource *r,= struct rdt_domain *d); * Return: * 0 on success, or -EIO, -EINVAL etc on error. */ -int resctrl_arch_rmid_read(struct rdt_resource *r, struct rdt_domain *d, +int resctrl_arch_rmid_read(struct rdt_resource *r, struct rdt_mon_domain *= d, u32 rmid, enum resctrl_event_id eventid, u64 *val); =20 /** @@ -266,7 +276,7 @@ int resctrl_arch_rmid_read(struct rdt_resource *r, stru= ct rdt_domain *d, * * This can be called from any CPU. */ -void resctrl_arch_reset_rmid(struct rdt_resource *r, struct rdt_domain *d, +void resctrl_arch_reset_rmid(struct rdt_resource *r, struct rdt_mon_domain= *d, u32 rmid, enum resctrl_event_id eventid); =20 /** @@ -278,7 +288,7 @@ void resctrl_arch_reset_rmid(struct rdt_resource *r, st= ruct rdt_domain *d, * * This can be called from any CPU. */ -void resctrl_arch_reset_rmid_all(struct rdt_resource *r, struct rdt_domain= *d); +void resctrl_arch_reset_rmid_all(struct rdt_resource *r, struct rdt_mon_do= main *d); =20 extern unsigned int resctrl_rmid_realloc_threshold; extern unsigned int resctrl_rmid_realloc_limit; diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/r= esctrl/internal.h index e9a2a8993d14..ee38249c6f1d 100644 --- a/arch/x86/kernel/cpu/resctrl/internal.h +++ b/arch/x86/kernel/cpu/resctrl/internal.h @@ -106,7 +106,7 @@ union mon_data_bits { struct rmid_read { struct rdtgroup *rgrp; struct rdt_resource *r; - struct rdt_domain *d; + struct rdt_mon_domain *d; enum resctrl_event_id evtid; bool first; int err; @@ -191,7 +191,7 @@ struct mongroup { */ struct pseudo_lock_region { struct resctrl_schema *s; - struct rdt_domain *d; + struct rdt_ctrl_domain *d; u32 cbm; wait_queue_head_t lock_thread_wq; int thread_done; @@ -319,25 +319,41 @@ struct arch_mbm_state { }; =20 /** - * struct rdt_hw_domain - Arch private attributes of a set of CPUs that sh= are - * a resource + 
* struct rdt_hw_ctrl_domain - Arch private attributes of a set of CPUs th= at share + * a resource for a control function * @d_resctrl: Properties exposed to the resctrl file system * @ctrl_val: array of cache or mem ctrl values (indexed by CLOSID) + * + * Members of this structure are accessed via helpers that provide abstrac= tion. + */ +struct rdt_hw_ctrl_domain { + struct rdt_ctrl_domain d_resctrl; + u32 *ctrl_val; +}; + +/** + * struct rdt_hw_mon_domain - Arch private attributes of a set of CPUs tha= t share + * a resource for a monitor function + * @d_resctrl: Properties exposed to the resctrl file system * @arch_mbm_total: arch private state for MBM total bandwidth * @arch_mbm_local: arch private state for MBM local bandwidth * * Members of this structure are accessed via helpers that provide abstrac= tion. */ -struct rdt_hw_domain { - struct rdt_domain d_resctrl; - u32 *ctrl_val; +struct rdt_hw_mon_domain { + struct rdt_mon_domain d_resctrl; struct arch_mbm_state *arch_mbm_total; struct arch_mbm_state *arch_mbm_local; }; =20 -static inline struct rdt_hw_domain *resctrl_to_arch_dom(struct rdt_domain = *r) +static inline struct rdt_hw_ctrl_domain *resctrl_to_arch_ctrl_dom(struct r= dt_ctrl_domain *r) { - return container_of(r, struct rdt_hw_domain, d_resctrl); + return container_of(r, struct rdt_hw_ctrl_domain, d_resctrl); +} + +static inline struct rdt_hw_mon_domain *resctrl_to_arch_mon_dom(struct rdt= _mon_domain *r) +{ + return container_of(r, struct rdt_hw_mon_domain, d_resctrl); } =20 /** @@ -405,7 +421,7 @@ struct rdt_hw_resource { struct rdt_resource r_resctrl; u32 num_closid; unsigned int msr_base; - void (*msr_update) (struct rdt_domain *d, struct msr_param *m, + void (*msr_update) (struct rdt_ctrl_domain *d, struct msr_param *m, struct rdt_resource *r); unsigned int mon_scale; unsigned int mbm_width; @@ -418,9 +434,9 @@ static inline struct rdt_hw_resource *resctrl_to_arch_r= es(struct rdt_resource *r } =20 int parse_cbm(struct rdt_parse_data *data, 
struct resctrl_schema *s, - struct rdt_domain *d); + struct rdt_ctrl_domain *d); int parse_bw(struct rdt_parse_data *data, struct resctrl_schema *s, - struct rdt_domain *d); + struct rdt_ctrl_domain *d); =20 extern struct mutex rdtgroup_mutex; =20 @@ -517,21 +533,21 @@ ssize_t rdtgroup_schemata_write(struct kernfs_open_fi= le *of, char *buf, size_t nbytes, loff_t off); int rdtgroup_schemata_show(struct kernfs_open_file *of, struct seq_file *s, void *v); -bool rdtgroup_cbm_overlaps(struct resctrl_schema *s, struct rdt_domain *d, +bool rdtgroup_cbm_overlaps(struct resctrl_schema *s, struct rdt_ctrl_domai= n *d, unsigned long cbm, int closid, bool exclusive); -unsigned int rdtgroup_cbm_to_size(struct rdt_resource *r, struct rdt_domai= n *d, +unsigned int rdtgroup_cbm_to_size(struct rdt_resource *r, struct rdt_ctrl_= domain *d, unsigned long cbm); enum rdtgrp_mode rdtgroup_mode_by_closid(int closid); int rdtgroup_tasks_assigned(struct rdtgroup *r); int rdtgroup_locksetup_enter(struct rdtgroup *rdtgrp); int rdtgroup_locksetup_exit(struct rdtgroup *rdtgrp); -bool rdtgroup_cbm_overlaps_pseudo_locked(struct rdt_domain *d, unsigned lo= ng cbm); -bool rdtgroup_pseudo_locked_in_hierarchy(struct rdt_domain *d); +bool rdtgroup_cbm_overlaps_pseudo_locked(struct rdt_ctrl_domain *d, unsign= ed long cbm); +bool rdtgroup_pseudo_locked_in_hierarchy(struct rdt_ctrl_domain *d); int rdt_pseudo_lock_init(void); void rdt_pseudo_lock_release(void); int rdtgroup_pseudo_lock_create(struct rdtgroup *rdtgrp); void rdtgroup_pseudo_lock_remove(struct rdtgroup *rdtgrp); -struct rdt_domain *get_domain_from_cpu(int cpu, struct rdt_resource *r); +struct rdt_ctrl_domain *get_domain_from_cpu(int cpu, struct rdt_resource *= r); int closids_supported(void); void closid_free(int closid); int alloc_rmid(void); @@ -541,17 +557,17 @@ bool __init rdt_cpu_has(int flag); void mon_event_count(void *info); int rdtgroup_mondata_show(struct seq_file *m, void *arg); void mon_event_read(struct rmid_read *rr, struct 
rdt_resource *r, - struct rdt_domain *d, struct rdtgroup *rdtgrp, + struct rdt_mon_domain *d, struct rdtgroup *rdtgrp, int evtid, int first); -void mbm_setup_overflow_handler(struct rdt_domain *dom, +void mbm_setup_overflow_handler(struct rdt_mon_domain *dom, unsigned long delay_ms); void mbm_handle_overflow(struct work_struct *work); void __init intel_rdt_mbm_apply_quirk(void); bool is_mba_sc(struct rdt_resource *r); -void cqm_setup_limbo_handler(struct rdt_domain *dom, unsigned long delay_m= s); +void cqm_setup_limbo_handler(struct rdt_mon_domain *dom, unsigned long del= ay_ms); void cqm_handle_limbo(struct work_struct *work); -bool has_busy_rmid(struct rdt_resource *r, struct rdt_domain *d); -void __check_limbo(struct rdt_domain *d, bool force_free); +bool has_busy_rmid(struct rdt_resource *r, struct rdt_mon_domain *d); +void __check_limbo(struct rdt_mon_domain *d, bool force_free); void rdt_domain_reconfigure_cdp(struct rdt_resource *r); void __init thread_throttle_mode_init(void); void __init mbm_config_rftype_init(const char *config); diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resct= rl/core.c index 7ef178fb7c77..726f00c01079 100644 --- a/arch/x86/kernel/cpu/resctrl/core.c +++ b/arch/x86/kernel/cpu/resctrl/core.c @@ -49,12 +49,12 @@ int max_name_width, max_data_width; bool rdt_alloc_capable; =20 static void -mba_wrmsr_intel(struct rdt_domain *d, struct msr_param *m, +mba_wrmsr_intel(struct rdt_ctrl_domain *d, struct msr_param *m, struct rdt_resource *r); static void -cat_wrmsr(struct rdt_domain *d, struct msr_param *m, struct rdt_resource *= r); +cat_wrmsr(struct rdt_ctrl_domain *d, struct msr_param *m, struct rdt_resou= rce *r); static void -mba_wrmsr_amd(struct rdt_domain *d, struct msr_param *m, +mba_wrmsr_amd(struct rdt_ctrl_domain *d, struct msr_param *m, struct rdt_resource *r); =20 #define ctrl_domain_init(id) LIST_HEAD_INIT(rdt_resources_all[id].r_resctr= l.ctrl_domains) @@ -303,11 +303,11 @@ static void 
rdt_get_cdp_l2_config(void) } =20 static void -mba_wrmsr_amd(struct rdt_domain *d, struct msr_param *m, struct rdt_resour= ce *r) +mba_wrmsr_amd(struct rdt_ctrl_domain *d, struct msr_param *m, struct rdt_r= esource *r) { - unsigned int i; - struct rdt_hw_domain *hw_dom =3D resctrl_to_arch_dom(d); + struct rdt_hw_ctrl_domain *hw_dom =3D resctrl_to_arch_ctrl_dom(d); struct rdt_hw_resource *hw_res =3D resctrl_to_arch_res(r); + unsigned int i; =20 for (i =3D m->low; i < m->high; i++) wrmsrl(hw_res->msr_base + i, hw_dom->ctrl_val[i]); @@ -328,12 +328,12 @@ static u32 delay_bw_map(unsigned long bw, struct rdt_= resource *r) } =20 static void -mba_wrmsr_intel(struct rdt_domain *d, struct msr_param *m, +mba_wrmsr_intel(struct rdt_ctrl_domain *d, struct msr_param *m, struct rdt_resource *r) { - unsigned int i; - struct rdt_hw_domain *hw_dom =3D resctrl_to_arch_dom(d); + struct rdt_hw_ctrl_domain *hw_dom =3D resctrl_to_arch_ctrl_dom(d); struct rdt_hw_resource *hw_res =3D resctrl_to_arch_res(r); + unsigned int i; =20 /* Write the delay values for mba. 
*/ for (i =3D m->low; i < m->high; i++) @@ -341,19 +341,19 @@ mba_wrmsr_intel(struct rdt_domain *d, struct msr_para= m *m, } =20 static void -cat_wrmsr(struct rdt_domain *d, struct msr_param *m, struct rdt_resource *= r) +cat_wrmsr(struct rdt_ctrl_domain *d, struct msr_param *m, struct rdt_resou= rce *r) { - unsigned int i; - struct rdt_hw_domain *hw_dom =3D resctrl_to_arch_dom(d); + struct rdt_hw_ctrl_domain *hw_dom =3D resctrl_to_arch_ctrl_dom(d); struct rdt_hw_resource *hw_res =3D resctrl_to_arch_res(r); + unsigned int i; =20 for (i =3D m->low; i < m->high; i++) wrmsrl(hw_res->msr_base + i, hw_dom->ctrl_val[i]); } =20 -struct rdt_domain *get_domain_from_cpu(int cpu, struct rdt_resource *r) +struct rdt_ctrl_domain *get_domain_from_cpu(int cpu, struct rdt_resource *= r) { - struct rdt_domain *d; + struct rdt_ctrl_domain *d; =20 list_for_each_entry(d, &r->ctrl_domains, hdr.list) { /* Find the domain that contains this CPU */ @@ -375,7 +375,7 @@ void rdt_ctrl_update(void *arg) struct rdt_hw_resource *hw_res =3D resctrl_to_arch_res(m->res); struct rdt_resource *r =3D m->res; int cpu =3D smp_processor_id(); - struct rdt_domain *d; + struct rdt_ctrl_domain *d; =20 d =3D get_domain_from_cpu(cpu, r); if (d) { @@ -443,18 +443,23 @@ static void setup_default_ctrlval(struct rdt_resource= *r, u32 *dc) *dc =3D r->default_ctrl; } =20 -static void domain_free(struct rdt_hw_domain *hw_dom) +static void ctrl_domain_free(struct rdt_hw_ctrl_domain *hw_dom) +{ + kfree(hw_dom->ctrl_val); + kfree(hw_dom); +} + +static void mon_domain_free(struct rdt_hw_mon_domain *hw_dom) { kfree(hw_dom->arch_mbm_total); kfree(hw_dom->arch_mbm_local); - kfree(hw_dom->ctrl_val); kfree(hw_dom); } =20 -static int domain_setup_ctrlval(struct rdt_resource *r, struct rdt_domain = *d) +static int domain_setup_ctrlval(struct rdt_resource *r, struct rdt_ctrl_do= main *d) { + struct rdt_hw_ctrl_domain *hw_dom =3D resctrl_to_arch_ctrl_dom(d); struct rdt_hw_resource *hw_res =3D resctrl_to_arch_res(r); - struct 
rdt_hw_domain *hw_dom =3D resctrl_to_arch_dom(d); struct msr_param m; u32 *dc; =20 @@ -477,7 +482,7 @@ static int domain_setup_ctrlval(struct rdt_resource *r,= struct rdt_domain *d) * @num_rmid: The size of the MBM counter array * @hw_dom: The domain that owns the allocated arrays */ -static int arch_domain_mbm_alloc(u32 num_rmid, struct rdt_hw_domain *hw_do= m) +static int arch_domain_mbm_alloc(u32 num_rmid, struct rdt_hw_mon_domain *h= w_dom) { size_t tsize; =20 @@ -516,10 +521,10 @@ static int get_domain_id_from_scope(int cpu, enum res= ctrl_scope scope) static void domain_add_cpu_ctrl(int cpu, struct rdt_resource *r) { int id =3D get_domain_id_from_scope(cpu, r->ctrl_scope); + struct rdt_hw_ctrl_domain *hw_dom; struct list_head *add_pos =3D NULL; - struct rdt_hw_domain *hw_dom; struct rdt_domain_hdr *hdr; - struct rdt_domain *d; + struct rdt_ctrl_domain *d; int err; =20 if (id < 0) { @@ -533,7 +538,7 @@ static void domain_add_cpu_ctrl(int cpu, struct rdt_res= ource *r) pr_warn("Couldn't find control scope id=3D%d for CPU %d\n", id, cpu); return; } - d =3D container_of(hdr, struct rdt_domain, hdr); + d =3D container_of(hdr, struct rdt_ctrl_domain, hdr); =20 if (d) { cpumask_set_cpu(cpu, &d->cpu_mask); @@ -553,7 +558,7 @@ static void domain_add_cpu_ctrl(int cpu, struct rdt_res= ource *r) rdt_domain_reconfigure_cdp(r); =20 if (domain_setup_ctrlval(r, d)) { - domain_free(hw_dom); + ctrl_domain_free(hw_dom); return; } =20 @@ -562,17 +567,17 @@ static void domain_add_cpu_ctrl(int cpu, struct rdt_r= esource *r) err =3D resctrl_online_ctrl_domain(r, d); if (err) { list_del(&d->hdr.list); - domain_free(hw_dom); + ctrl_domain_free(hw_dom); } } =20 static void domain_add_cpu_mon(int cpu, struct rdt_resource *r) { int id =3D get_domain_id_from_scope(cpu, r->mon_scope); + struct rdt_hw_mon_domain *hw_mondom; struct list_head *add_pos =3D NULL; - struct rdt_hw_domain *hw_mondom; struct rdt_domain_hdr *hdr; - struct rdt_domain *d; + struct rdt_mon_domain *d; int err; =20 if 
(id < 0) { @@ -586,7 +591,7 @@ static void domain_add_cpu_mon(int cpu, struct rdt_reso= urce *r) pr_warn("Couldn't find monitor scope id=3D%d for CPU %d\n", id, cpu); return; } - d =3D container_of(hdr, struct rdt_domain, hdr); + d =3D container_of(hdr, struct rdt_mon_domain, hdr); =20 if (d) { cpumask_set_cpu(cpu, &d->cpu_mask); @@ -602,7 +607,7 @@ static void domain_add_cpu_mon(int cpu, struct rdt_reso= urce *r) cpumask_set_cpu(cpu, &d->cpu_mask); =20 if (arch_domain_mbm_alloc(r->num_rmid, hw_mondom)) { - domain_free(hw_mondom); + mon_domain_free(hw_mondom); return; } =20 @@ -611,7 +616,7 @@ static void domain_add_cpu_mon(int cpu, struct rdt_reso= urce *r) err =3D resctrl_online_mon_domain(r, d); if (err) { list_del(&d->hdr.list); - domain_free(hw_mondom); + mon_domain_free(hw_mondom); } } =20 @@ -629,9 +634,9 @@ static void domain_add_cpu(int cpu, struct rdt_resource= *r) static void domain_remove_cpu_ctrl(int cpu, struct rdt_resource *r) { int id =3D get_domain_id_from_scope(cpu, r->ctrl_scope); - struct rdt_hw_domain *hw_dom; + struct rdt_hw_ctrl_domain *hw_dom; struct rdt_domain_hdr *hdr; - struct rdt_domain *d; + struct rdt_ctrl_domain *d; =20 if (id < 0) return; @@ -641,8 +646,8 @@ static void domain_remove_cpu_ctrl(int cpu, struct rdt_= resource *r) pr_warn("Couldn't find control scope id=3D%d for CPU %d\n", id, cpu); return; } - d =3D container_of(hdr, struct rdt_domain, hdr); - hw_dom =3D resctrl_to_arch_dom(d); + d =3D container_of(hdr, struct rdt_ctrl_domain, hdr); + hw_dom =3D resctrl_to_arch_ctrl_dom(d); =20 cpumask_clear_cpu(cpu, &d->cpu_mask); if (cpumask_empty(&d->cpu_mask)) { @@ -650,12 +655,12 @@ static void domain_remove_cpu_ctrl(int cpu, struct rd= t_resource *r) list_del(&d->hdr.list); =20 /* - * rdt_domain "d" is going to be freed below, so clear + * rdt_ctrl_domain "d" is going to be freed below, so clear * its pointer from pseudo_lock_region struct. 
*/ if (d->plr) d->plr->d =3D NULL; - domain_free(hw_dom); + ctrl_domain_free(hw_dom); =20 return; } @@ -664,9 +669,9 @@ static void domain_remove_cpu_ctrl(int cpu, struct rdt_= resource *r) static void domain_remove_cpu_mon(int cpu, struct rdt_resource *r) { int id =3D get_domain_id_from_scope(cpu, r->mon_scope); - struct rdt_hw_domain *hw_mondom; + struct rdt_hw_mon_domain *hw_mondom; struct rdt_domain_hdr *hdr; - struct rdt_domain *d; + struct rdt_mon_domain *d; =20 if (id < 0) return; @@ -676,14 +681,14 @@ static void domain_remove_cpu_mon(int cpu, struct rdt= _resource *r) pr_warn("Couldn't find scope id=3D%d for CPU %d\n", id, cpu); return; } - d =3D container_of(hdr, struct rdt_domain, hdr); - hw_mondom =3D resctrl_to_arch_dom(d); + d =3D container_of(hdr, struct rdt_mon_domain, hdr); + hw_mondom =3D resctrl_to_arch_mon_dom(d); =20 cpumask_clear_cpu(cpu, &d->cpu_mask); if (cpumask_empty(&d->cpu_mask)) { resctrl_offline_mon_domain(r, d); list_del(&d->hdr.list); - domain_free(hw_mondom); + mon_domain_free(hw_mondom); =20 return; } diff --git a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c b/arch/x86/kernel/cp= u/resctrl/ctrlmondata.c index a6261e177cc1..7513eba9feaf 100644 --- a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c +++ b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c @@ -58,7 +58,7 @@ static bool bw_validate(char *buf, unsigned long *data, s= truct rdt_resource *r) } =20 int parse_bw(struct rdt_parse_data *data, struct resctrl_schema *s, - struct rdt_domain *d) + struct rdt_ctrl_domain *d) { struct resctrl_staged_config *cfg; u32 closid =3D data->rdtgrp->closid; @@ -135,7 +135,7 @@ static bool cbm_validate(char *buf, u32 *data, struct r= dt_resource *r) * resource type. 
*/ int parse_cbm(struct rdt_parse_data *data, struct resctrl_schema *s, - struct rdt_domain *d) + struct rdt_ctrl_domain *d) { struct rdtgroup *rdtgrp =3D data->rdtgrp; struct resctrl_staged_config *cfg; @@ -205,7 +205,7 @@ static int parse_line(char *line, struct resctrl_schema= *s, struct rdt_resource *r =3D s->res; struct rdt_parse_data data; char *dom =3D NULL, *id; - struct rdt_domain *d; + struct rdt_ctrl_domain *d; unsigned long dom_id; =20 if (rdtgrp->mode =3D=3D RDT_MODE_PSEUDO_LOCKSETUP && @@ -265,11 +265,11 @@ static u32 get_config_index(u32 closid, enum resctrl_= conf_type type) } } =20 -static bool apply_config(struct rdt_hw_domain *hw_dom, +static bool apply_config(struct rdt_hw_ctrl_domain *hw_dom, struct resctrl_staged_config *cfg, u32 idx, cpumask_var_t cpu_mask) { - struct rdt_domain *dom =3D &hw_dom->d_resctrl; + struct rdt_ctrl_domain *dom =3D &hw_dom->d_resctrl; =20 if (cfg->new_ctrl !=3D hw_dom->ctrl_val[idx]) { cpumask_set_cpu(cpumask_any(&dom->cpu_mask), cpu_mask); @@ -281,11 +281,11 @@ static bool apply_config(struct rdt_hw_domain *hw_dom, return false; } =20 -int resctrl_arch_update_one(struct rdt_resource *r, struct rdt_domain *d, +int resctrl_arch_update_one(struct rdt_resource *r, struct rdt_ctrl_domain= *d, u32 closid, enum resctrl_conf_type t, u32 cfg_val) { + struct rdt_hw_ctrl_domain *hw_dom =3D resctrl_to_arch_ctrl_dom(d); struct rdt_hw_resource *hw_res =3D resctrl_to_arch_res(r); - struct rdt_hw_domain *hw_dom =3D resctrl_to_arch_dom(d); u32 idx =3D get_config_index(closid, t); struct msr_param msr_param; =20 @@ -305,11 +305,11 @@ int resctrl_arch_update_one(struct rdt_resource *r, s= truct rdt_domain *d, int resctrl_arch_update_domains(struct rdt_resource *r, u32 closid) { struct resctrl_staged_config *cfg; - struct rdt_hw_domain *hw_dom; + struct rdt_hw_ctrl_domain *hw_dom; struct msr_param msr_param; enum resctrl_conf_type t; + struct rdt_ctrl_domain *d; cpumask_var_t cpu_mask; - struct rdt_domain *d; u32 idx; =20 if 
(!zalloc_cpumask_var(&cpu_mask, GFP_KERNEL)) @@ -317,7 +317,7 @@ int resctrl_arch_update_domains(struct rdt_resource *r,= u32 closid) =20 msr_param.res =3D NULL; list_for_each_entry(d, &r->ctrl_domains, hdr.list) { - hw_dom =3D resctrl_to_arch_dom(d); + hw_dom =3D resctrl_to_arch_ctrl_dom(d); for (t =3D 0; t < CDP_NUM_TYPES; t++) { cfg =3D &hw_dom->d_resctrl.staged_config[t]; if (!cfg->have_new_ctrl) @@ -447,10 +447,10 @@ ssize_t rdtgroup_schemata_write(struct kernfs_open_fi= le *of, return ret ?: nbytes; } =20 -u32 resctrl_arch_get_config(struct rdt_resource *r, struct rdt_domain *d, +u32 resctrl_arch_get_config(struct rdt_resource *r, struct rdt_ctrl_domain= *d, u32 closid, enum resctrl_conf_type type) { - struct rdt_hw_domain *hw_dom =3D resctrl_to_arch_dom(d); + struct rdt_hw_ctrl_domain *hw_dom =3D resctrl_to_arch_ctrl_dom(d); u32 idx =3D get_config_index(closid, type); =20 return hw_dom->ctrl_val[idx]; @@ -459,7 +459,7 @@ u32 resctrl_arch_get_config(struct rdt_resource *r, str= uct rdt_domain *d, static void show_doms(struct seq_file *s, struct resctrl_schema *schema, i= nt closid) { struct rdt_resource *r =3D schema->res; - struct rdt_domain *dom; + struct rdt_ctrl_domain *dom; bool sep =3D false; u32 ctrl_val; =20 @@ -521,7 +521,7 @@ int rdtgroup_schemata_show(struct kernfs_open_file *of, } =20 void mon_event_read(struct rmid_read *rr, struct rdt_resource *r, - struct rdt_domain *d, struct rdtgroup *rdtgrp, + struct rdt_mon_domain *d, struct rdtgroup *rdtgrp, int evtid, int first) { /* @@ -541,11 +541,11 @@ int rdtgroup_mondata_show(struct seq_file *m, void *a= rg) { struct kernfs_open_file *of =3D m->private; struct rdt_domain_hdr *hdr; + struct rdt_mon_domain *d; u32 resid, evtid, domid; struct rdtgroup *rdtgrp; struct rdt_resource *r; union mon_data_bits md; - struct rdt_domain *d; struct rmid_read rr; int ret =3D 0; =20 @@ -566,7 +566,7 @@ int rdtgroup_mondata_show(struct seq_file *m, void *arg) ret =3D -ENOENT; goto out; } - d =3D container_of(hdr, 
struct rdt_domain, hdr); + d =3D container_of(hdr, struct rdt_mon_domain, hdr); =20 mon_event_read(&rr, r, d, rdtgrp, evtid, false); =20 diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/re= sctrl/monitor.c index 3265b8499e2a..97d2ed829f5d 100644 --- a/arch/x86/kernel/cpu/resctrl/monitor.c +++ b/arch/x86/kernel/cpu/resctrl/monitor.c @@ -170,7 +170,7 @@ static int __rmid_read(u32 rmid, enum resctrl_event_id = eventid, u64 *val) return 0; } =20 -static struct arch_mbm_state *get_arch_mbm_state(struct rdt_hw_domain *hw_= dom, +static struct arch_mbm_state *get_arch_mbm_state(struct rdt_hw_mon_domain = *hw_dom, u32 rmid, enum resctrl_event_id eventid) { @@ -189,10 +189,10 @@ static struct arch_mbm_state *get_arch_mbm_state(stru= ct rdt_hw_domain *hw_dom, return NULL; } =20 -void resctrl_arch_reset_rmid(struct rdt_resource *r, struct rdt_domain *d, +void resctrl_arch_reset_rmid(struct rdt_resource *r, struct rdt_mon_domain= *d, u32 rmid, enum resctrl_event_id eventid) { - struct rdt_hw_domain *hw_dom =3D resctrl_to_arch_dom(d); + struct rdt_hw_mon_domain *hw_dom =3D resctrl_to_arch_mon_dom(d); struct arch_mbm_state *am; =20 am =3D get_arch_mbm_state(hw_dom, rmid, eventid); @@ -208,9 +208,9 @@ void resctrl_arch_reset_rmid(struct rdt_resource *r, st= ruct rdt_domain *d, * Assumes that hardware counters are also reset and thus that there is * no need to record initial non-zero counts. 
*/ -void resctrl_arch_reset_rmid_all(struct rdt_resource *r, struct rdt_domain= *d) +void resctrl_arch_reset_rmid_all(struct rdt_resource *r, struct rdt_mon_do= main *d) { - struct rdt_hw_domain *hw_dom =3D resctrl_to_arch_dom(d); + struct rdt_hw_mon_domain *hw_dom =3D resctrl_to_arch_mon_dom(d); =20 if (is_mbm_total_enabled()) memset(hw_dom->arch_mbm_total, 0, @@ -229,11 +229,11 @@ static u64 mbm_overflow_count(u64 prev_msr, u64 cur_m= sr, unsigned int width) return chunks >> shift; } =20 -int resctrl_arch_rmid_read(struct rdt_resource *r, struct rdt_domain *d, +int resctrl_arch_rmid_read(struct rdt_resource *r, struct rdt_mon_domain *= d, u32 rmid, enum resctrl_event_id eventid, u64 *val) { + struct rdt_hw_mon_domain *hw_dom =3D resctrl_to_arch_mon_dom(d); struct rdt_hw_resource *hw_res =3D resctrl_to_arch_res(r); - struct rdt_hw_domain *hw_dom =3D resctrl_to_arch_dom(d); struct arch_mbm_state *am; u64 msr_val, chunks; int ret; @@ -266,7 +266,7 @@ int resctrl_arch_rmid_read(struct rdt_resource *r, stru= ct rdt_domain *d, * decrement the count. 
If the busy count gets to zero on an RMID, we * free the RMID */ -void __check_limbo(struct rdt_domain *d, bool force_free) +void __check_limbo(struct rdt_mon_domain *d, bool force_free) { struct rdt_resource *r =3D &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl; struct rmid_entry *entry; @@ -305,7 +305,7 @@ void __check_limbo(struct rdt_domain *d, bool force_fre= e) } } =20 -bool has_busy_rmid(struct rdt_resource *r, struct rdt_domain *d) +bool has_busy_rmid(struct rdt_resource *r, struct rdt_mon_domain *d) { return find_first_bit(d->rmid_busy_llc, r->num_rmid) !=3D r->num_rmid; } @@ -334,7 +334,7 @@ int alloc_rmid(void) static void add_rmid_to_limbo(struct rmid_entry *entry) { struct rdt_resource *r =3D &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl; - struct rdt_domain *d; + struct rdt_mon_domain *d; int cpu, err; u64 val =3D 0; =20 @@ -383,7 +383,7 @@ void free_rmid(u32 rmid) list_add_tail(&entry->list, &rmid_free_lru); } =20 -static struct mbm_state *get_mbm_state(struct rdt_domain *d, u32 rmid, +static struct mbm_state *get_mbm_state(struct rdt_mon_domain *d, u32 rmid, enum resctrl_event_id evtid) { switch (evtid) { @@ -516,13 +516,13 @@ void mon_event_count(void *info) * throttle MSRs already have low percentage values. To avoid * unnecessarily restricting such rdtgroups, we also increase the bandwidt= h. 
*/ -static void update_mba_bw(struct rdtgroup *rgrp, struct rdt_domain *dom_mb= m) +static void update_mba_bw(struct rdtgroup *rgrp, struct rdt_mon_domain *do= m_mbm) { u32 closid, rmid, cur_msr_val, new_msr_val; struct mbm_state *pmbm_data, *cmbm_data; + struct rdt_ctrl_domain *dom_mba; u32 cur_bw, delta_bw, user_bw; struct rdt_resource *r_mba; - struct rdt_domain *dom_mba; struct list_head *head; struct rdtgroup *entry; =20 @@ -600,7 +600,7 @@ static void update_mba_bw(struct rdtgroup *rgrp, struct= rdt_domain *dom_mbm) } } =20 -static void mbm_update(struct rdt_resource *r, struct rdt_domain *d, int r= mid) +static void mbm_update(struct rdt_resource *r, struct rdt_mon_domain *d, i= nt rmid) { struct rmid_read rr; =20 @@ -640,13 +640,13 @@ void cqm_handle_limbo(struct work_struct *work) { unsigned long delay =3D msecs_to_jiffies(CQM_LIMBOCHECK_INTERVAL); int cpu =3D smp_processor_id(); + struct rdt_mon_domain *d; struct rdt_resource *r; - struct rdt_domain *d; =20 mutex_lock(&rdtgroup_mutex); =20 r =3D &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl; - d =3D container_of(work, struct rdt_domain, cqm_limbo.work); + d =3D container_of(work, struct rdt_mon_domain, cqm_limbo.work); =20 __check_limbo(d, false); =20 @@ -656,7 +656,7 @@ void cqm_handle_limbo(struct work_struct *work) mutex_unlock(&rdtgroup_mutex); } =20 -void cqm_setup_limbo_handler(struct rdt_domain *dom, unsigned long delay_m= s) +void cqm_setup_limbo_handler(struct rdt_mon_domain *dom, unsigned long del= ay_ms) { unsigned long delay =3D msecs_to_jiffies(delay_ms); int cpu; @@ -672,9 +672,9 @@ void mbm_handle_overflow(struct work_struct *work) unsigned long delay =3D msecs_to_jiffies(MBM_OVERFLOW_INTERVAL); struct rdtgroup *prgrp, *crgrp; int cpu =3D smp_processor_id(); + struct rdt_mon_domain *d; struct list_head *head; struct rdt_resource *r; - struct rdt_domain *d; =20 mutex_lock(&rdtgroup_mutex); =20 @@ -682,7 +682,7 @@ void mbm_handle_overflow(struct work_struct *work) goto out_unlock; =20 r =3D 
&rdt_resources_all[RDT_RESOURCE_L3].r_resctrl; - d =3D container_of(work, struct rdt_domain, mbm_over.work); + d =3D container_of(work, struct rdt_mon_domain, mbm_over.work); =20 list_for_each_entry(prgrp, &rdt_all_groups, rdtgroup_list) { mbm_update(r, d, prgrp->mon.rmid); @@ -701,7 +701,7 @@ void mbm_handle_overflow(struct work_struct *work) mutex_unlock(&rdtgroup_mutex); } =20 -void mbm_setup_overflow_handler(struct rdt_domain *dom, unsigned long dela= y_ms) +void mbm_setup_overflow_handler(struct rdt_mon_domain *dom, unsigned long = delay_ms) { unsigned long delay =3D msecs_to_jiffies(delay_ms); int cpu; diff --git a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c b/arch/x86/kernel/cp= u/resctrl/pseudo_lock.c index bda32b4e1c1e..675e9e47af54 100644 --- a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c +++ b/arch/x86/kernel/cpu/resctrl/pseudo_lock.c @@ -814,7 +814,7 @@ int rdtgroup_locksetup_exit(struct rdtgroup *rdtgrp) * Return: true if @cbm overlaps with pseudo-locked region on @d, false * otherwise. */ -bool rdtgroup_cbm_overlaps_pseudo_locked(struct rdt_domain *d, unsigned lo= ng cbm) +bool rdtgroup_cbm_overlaps_pseudo_locked(struct rdt_ctrl_domain *d, unsign= ed long cbm) { unsigned int cbm_len; unsigned long cbm_b; @@ -841,11 +841,11 @@ bool rdtgroup_cbm_overlaps_pseudo_locked(struct rdt_d= omain *d, unsigned long cbm * if it is not possible to test due to memory allocation issue, * false otherwise. 
*/ -bool rdtgroup_pseudo_locked_in_hierarchy(struct rdt_domain *d) +bool rdtgroup_pseudo_locked_in_hierarchy(struct rdt_ctrl_domain *d) { + struct rdt_ctrl_domain *d_i; cpumask_var_t cpu_with_psl; struct rdt_resource *r; - struct rdt_domain *d_i; bool ret =3D false; =20 if (!zalloc_cpumask_var(&cpu_with_psl, GFP_KERNEL)) diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/r= esctrl/rdtgroup.c index 8132f81f31bb..b0901fb95aa9 100644 --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c @@ -80,8 +80,8 @@ void rdt_last_cmd_printf(const char *fmt, ...) =20 void rdt_staged_configs_clear(void) { + struct rdt_ctrl_domain *dom; struct rdt_resource *r; - struct rdt_domain *dom; =20 lockdep_assert_held(&rdtgroup_mutex); =20 @@ -920,7 +920,7 @@ static int rdt_bit_usage_show(struct kernfs_open_file *= of, unsigned long sw_shareable =3D 0, hw_shareable =3D 0; unsigned long exclusive =3D 0, pseudo_locked =3D 0; struct rdt_resource *r =3D s->res; - struct rdt_domain *dom; + struct rdt_ctrl_domain *dom; int i, hwb, swb, excl, psl; enum rdtgrp_mode mode; bool sep =3D false; @@ -1137,7 +1137,7 @@ static enum resctrl_conf_type resctrl_peer_type(enum = resctrl_conf_type my_type) * * Return: false if CBM does not overlap, true if it does. 
*/ -static bool __rdtgroup_cbm_overlaps(struct rdt_resource *r, struct rdt_dom= ain *d, +static bool __rdtgroup_cbm_overlaps(struct rdt_resource *r, struct rdt_ctr= l_domain *d, unsigned long cbm, int closid, enum resctrl_conf_type type, bool exclusive) { @@ -1192,7 +1192,7 @@ static bool __rdtgroup_cbm_overlaps(struct rdt_resour= ce *r, struct rdt_domain *d * * Return: true if CBM overlap detected, false if there is no overlap */ -bool rdtgroup_cbm_overlaps(struct resctrl_schema *s, struct rdt_domain *d, +bool rdtgroup_cbm_overlaps(struct resctrl_schema *s, struct rdt_ctrl_domai= n *d, unsigned long cbm, int closid, bool exclusive) { enum resctrl_conf_type peer_type =3D resctrl_peer_type(s->conf_type); @@ -1222,10 +1222,10 @@ bool rdtgroup_cbm_overlaps(struct resctrl_schema *s= , struct rdt_domain *d, static bool rdtgroup_mode_test_exclusive(struct rdtgroup *rdtgrp) { int closid =3D rdtgrp->closid; + struct rdt_ctrl_domain *d; struct resctrl_schema *s; struct rdt_resource *r; bool has_cache =3D false; - struct rdt_domain *d; u32 ctrl; =20 list_for_each_entry(s, &resctrl_schema_all, list) { @@ -1339,7 +1339,7 @@ static ssize_t rdtgroup_mode_write(struct kernfs_open= _file *of, * bitmap functions work correctly. 
*/ unsigned int rdtgroup_cbm_to_size(struct rdt_resource *r, - struct rdt_domain *d, unsigned long cbm) + struct rdt_ctrl_domain *d, unsigned long cbm) { struct cpu_cacheinfo *ci; unsigned int size =3D 0; @@ -1372,9 +1372,9 @@ static int rdtgroup_size_show(struct kernfs_open_file= *of, { struct resctrl_schema *schema; enum resctrl_conf_type type; + struct rdt_ctrl_domain *d; struct rdtgroup *rdtgrp; struct rdt_resource *r; - struct rdt_domain *d; unsigned int size; int ret =3D 0; u32 closid; @@ -1486,7 +1486,7 @@ static void mon_event_config_read(void *info) mon_info->mon_config =3D msrval & MAX_EVT_CONFIG_BITS; } =20 -static void mondata_config_read(struct rdt_domain *d, struct mon_config_in= fo *mon_info) +static void mondata_config_read(struct rdt_mon_domain *d, struct mon_confi= g_info *mon_info) { smp_call_function_any(&d->cpu_mask, mon_event_config_read, mon_info, 1); } @@ -1494,7 +1494,7 @@ static void mondata_config_read(struct rdt_domain *d,= struct mon_config_info *mo static int mbm_config_show(struct seq_file *s, struct rdt_resource *r, u32= evtid) { struct mon_config_info mon_info =3D {0}; - struct rdt_domain *dom; + struct rdt_mon_domain *dom; bool sep =3D false; =20 mutex_lock(&rdtgroup_mutex); @@ -1551,7 +1551,7 @@ static void mon_event_config_write(void *info) } =20 static int mbm_config_write_domain(struct rdt_resource *r, - struct rdt_domain *d, u32 evtid, u32 val) + struct rdt_mon_domain *d, u32 evtid, u32 val) { struct mon_config_info mon_info =3D {0}; int ret =3D 0; @@ -1601,7 +1601,7 @@ static int mon_config_write(struct rdt_resource *r, c= har *tok, u32 evtid) { char *dom_str =3D NULL, *id_str; unsigned long dom_id, val; - struct rdt_domain *d; + struct rdt_mon_domain *d; int ret =3D 0; =20 next: @@ -2125,9 +2125,9 @@ static inline bool is_mba_linear(void) static int set_cache_qos_cfg(int level, bool enable) { void (*update)(void *arg); + struct rdt_ctrl_domain *d; struct rdt_resource *r_l; cpumask_var_t cpu_mask; - struct rdt_domain *d; int 
cpu; =20 if (level =3D=3D RDT_RESOURCE_L3) @@ -2174,7 +2174,7 @@ void rdt_domain_reconfigure_cdp(struct rdt_resource *= r) l3_qos_cfg_update(&hw_res->cdp_enabled); } =20 -static int mba_sc_domain_allocate(struct rdt_resource *r, struct rdt_domai= n *d) +static int mba_sc_domain_allocate(struct rdt_resource *r, struct rdt_ctrl_= domain *d) { u32 num_closid =3D resctrl_arch_get_num_closid(r); int cpu =3D cpumask_any(&d->cpu_mask); @@ -2192,7 +2192,7 @@ static int mba_sc_domain_allocate(struct rdt_resource= *r, struct rdt_domain *d) } =20 static void mba_sc_domain_destroy(struct rdt_resource *r, - struct rdt_domain *d) + struct rdt_ctrl_domain *d) { kfree(d->mbps_val); d->mbps_val =3D NULL; @@ -2218,7 +2218,7 @@ static int set_mba_sc(bool mba_sc) { struct rdt_resource *r =3D &rdt_resources_all[RDT_RESOURCE_MBA].r_resctrl; u32 num_closid =3D resctrl_arch_get_num_closid(r); - struct rdt_domain *d; + struct rdt_ctrl_domain *d; int i; =20 if (!supports_mba_mbps() || mba_sc =3D=3D is_mba_sc(r)) @@ -2466,7 +2466,7 @@ static void schemata_list_destroy(void) static int rdt_get_tree(struct fs_context *fc) { struct rdt_fs_context *ctx =3D rdt_fc2context(fc); - struct rdt_domain *dom; + struct rdt_mon_domain *dom; struct rdt_resource *r; int ret; =20 @@ -2634,10 +2634,10 @@ static int rdt_init_fs_context(struct fs_context *f= c) static int reset_all_ctrls(struct rdt_resource *r) { struct rdt_hw_resource *hw_res =3D resctrl_to_arch_res(r); - struct rdt_hw_domain *hw_dom; + struct rdt_hw_ctrl_domain *hw_dom; struct msr_param msr_param; + struct rdt_ctrl_domain *d; cpumask_var_t cpu_mask; - struct rdt_domain *d; int i; =20 if (!zalloc_cpumask_var(&cpu_mask, GFP_KERNEL)) @@ -2653,7 +2653,7 @@ static int reset_all_ctrls(struct rdt_resource *r) * from each domain to update the MSRs below. 
*/ list_for_each_entry(d, &r->ctrl_domains, hdr.list) { - hw_dom =3D resctrl_to_arch_dom(d); + hw_dom =3D resctrl_to_arch_ctrl_dom(d); cpumask_set_cpu(cpumask_any(&d->cpu_mask), cpu_mask); =20 for (i =3D 0; i < hw_res->num_closid; i++) @@ -2848,7 +2848,7 @@ static void rmdir_mondata_subdir_allrdtgrp(struct rdt= _resource *r, } =20 static int mkdir_mondata_subdir(struct kernfs_node *parent_kn, - struct rdt_domain *d, + struct rdt_mon_domain *d, struct rdt_resource *r, struct rdtgroup *prgrp) { union mon_data_bits priv; @@ -2897,7 +2897,7 @@ static int mkdir_mondata_subdir(struct kernfs_node *p= arent_kn, * and "monitor" groups with given domain id. */ static void mkdir_mondata_subdir_allrdtgrp(struct rdt_resource *r, - struct rdt_domain *d) + struct rdt_mon_domain *d) { struct kernfs_node *parent_kn; struct rdtgroup *prgrp, *crgrp; @@ -2919,7 +2919,7 @@ static int mkdir_mondata_subdir_alldom(struct kernfs_= node *parent_kn, struct rdt_resource *r, struct rdtgroup *prgrp) { - struct rdt_domain *dom; + struct rdt_mon_domain *dom; int ret; =20 list_for_each_entry(dom, &r->mon_domains, hdr.list) { @@ -3021,7 +3021,7 @@ static u32 cbm_ensure_valid(u32 _val, struct rdt_reso= urce *r) * Set the RDT domain up to start off with all usable allocations. That is, * all shareable and unused bits. All-zero CBM is invalid. 
*/ -static int __init_one_rdt_domain(struct rdt_domain *d, struct resctrl_sche= ma *s, +static int __init_one_rdt_domain(struct rdt_ctrl_domain *d, struct resctrl= _schema *s, u32 closid) { enum resctrl_conf_type peer_type =3D resctrl_peer_type(s->conf_type); @@ -3101,7 +3101,7 @@ static int __init_one_rdt_domain(struct rdt_domain *d= , struct resctrl_schema *s, */ static int rdtgroup_init_cat(struct resctrl_schema *s, u32 closid) { - struct rdt_domain *d; + struct rdt_ctrl_domain *d; int ret; =20 list_for_each_entry(d, &s->res->ctrl_domains, hdr.list) { @@ -3117,7 +3117,7 @@ static int rdtgroup_init_cat(struct resctrl_schema *s= , u32 closid) static void rdtgroup_init_mba(struct rdt_resource *r, u32 closid) { struct resctrl_staged_config *cfg; - struct rdt_domain *d; + struct rdt_ctrl_domain *d; =20 list_for_each_entry(d, &r->ctrl_domains, hdr.list) { if (is_mba_sc(r)) { @@ -3704,14 +3704,14 @@ static int __init rdtgroup_setup_root(void) return ret; } =20 -static void domain_destroy_mon_state(struct rdt_domain *d) +static void domain_destroy_mon_state(struct rdt_mon_domain *d) { bitmap_free(d->rmid_busy_llc); kfree(d->mbm_total); kfree(d->mbm_local); } =20 -void resctrl_offline_ctrl_domain(struct rdt_resource *r, struct rdt_domain= *d) +void resctrl_offline_ctrl_domain(struct rdt_resource *r, struct rdt_ctrl_d= omain *d) { lockdep_assert_held(&rdtgroup_mutex); =20 @@ -3719,7 +3719,7 @@ void resctrl_offline_ctrl_domain(struct rdt_resource = *r, struct rdt_domain *d) mba_sc_domain_destroy(r, d); } =20 -void resctrl_offline_mon_domain(struct rdt_resource *r, struct rdt_domain = *d) +void resctrl_offline_mon_domain(struct rdt_resource *r, struct rdt_mon_dom= ain *d) { /* * If resctrl is mounted, remove all the @@ -3746,7 +3746,7 @@ void resctrl_offline_mon_domain(struct rdt_resource *= r, struct rdt_domain *d) domain_destroy_mon_state(d); } =20 -static int domain_setup_mon_state(struct rdt_resource *r, struct rdt_domai= n *d) +static int domain_setup_mon_state(struct 
rdt_resource *r, struct rdt_mon_d= omain *d) { size_t tsize; =20 @@ -3776,7 +3776,7 @@ static int domain_setup_mon_state(struct rdt_resource= *r, struct rdt_domain *d) return 0; } =20 -int resctrl_online_ctrl_domain(struct rdt_resource *r, struct rdt_domain *= d) +int resctrl_online_ctrl_domain(struct rdt_resource *r, struct rdt_ctrl_dom= ain *d) { lockdep_assert_held(&rdtgroup_mutex); =20 @@ -3787,7 +3787,7 @@ int resctrl_online_ctrl_domain(struct rdt_resource *r= , struct rdt_domain *d) return 0; } =20 -int resctrl_online_mon_domain(struct rdt_resource *r, struct rdt_domain *d) +int resctrl_online_mon_domain(struct rdt_resource *r, struct rdt_mon_domai= n *d) { int err; =20 --=20 2.41.0 From nobody Fri Dec 19 02:50:45 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id BB76BE80ABC for ; Thu, 28 Sep 2023 19:14:16 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232210AbjI1TOQ (ORCPT ); Thu, 28 Sep 2023 15:14:16 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:44892 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231925AbjI1TOF (ORCPT ); Thu, 28 Sep 2023 15:14:05 -0400 Received: from mgamail.intel.com (mgamail.intel.com [134.134.136.126]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8C17219F; Thu, 28 Sep 2023 12:14:04 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1695928444; x=1727464444; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=SnEmTSRpqMnt7CEuYlf9Sv/kATyxfsRrWWBykdECv8A=; b=jrXQzhtDGETtVcR9ndmTJf5/XMT6vGogDdb3pwF7iEI20LitlXyGlBVB Q5FTh7TPV75TUuvrloxMmA9WfVP66JukAprqew/Qp8bGWHwz8xtjzIKtK 5pEwoGp9tcwWHbqyQSMe97fBYjS4oLGqrYbJQYexXHc+MtcgvhkGI4kZq 
JVbZAVIyZWYLMrqzFkxGW2ZiRdUF9+p9Fr9bvmL1XmvUFhD/nLqDRmVcU qbEqtleX/aC7EvgFQFOBYtfW6zZxrSrtS4KnZGg/mqzrT3ZwpsA5UPAY7 WO9myZksU7fsa21Q8kocooYy70/eYFZ0b7f1ns0hysVGoJOVobpLNQyjj w==; X-IronPort-AV: E=McAfee;i="6600,9927,10847"; a="367213934" X-IronPort-AV: E=Sophos;i="6.03,185,1694761200"; d="scan'208";a="367213934" Received: from orsmga008.jf.intel.com ([10.7.209.65]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 28 Sep 2023 12:14:01 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10847"; a="779020036" X-IronPort-AV: E=Sophos;i="6.03,185,1694761200"; d="scan'208";a="779020036" Received: from agluck-desk3.sc.intel.com ([172.25.222.74]) by orsmga008-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 28 Sep 2023 12:14:00 -0700 From: Tony Luck To: Fenghua Yu , Reinette Chatre , Peter Newman , Jonathan Corbet , Shuah Khan , x86@kernel.org Cc: Shaopeng Tan , James Morse , Jamie Iles , Babu Moger , Randy Dunlap , linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, patches@lists.linux.dev, Tony Luck Subject: [PATCH v6 5/8] x86/resctrl: Add node-scope to the options for feature scope Date: Thu, 28 Sep 2023 12:13:46 -0700 Message-ID: <20230928191350.205703-6-tony.luck@intel.com> X-Mailer: git-send-email 2.41.0 In-Reply-To: <20230928191350.205703-1-tony.luck@intel.com> References: <20230829234426.64421-1-tony.luck@intel.com> <20230928191350.205703-1-tony.luck@intel.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" Currently supported resctrl features are all domain scoped the same as the scope of the L2 or L3 caches. Add RESCTRL_NODE as a new option for features that are scoped at the same granularity as NUMA nodes. This is needed for Intel's Sub-NUMA Cluster (SNC) feature where monitoring features are node scoped. 
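For illustration only, the scope-to-domain-id mapping that get_domain_id_from_scope() performs after this change can be sketched in user-space C. This is a sketch under assumptions, not the kernel code: the demo_* names and the 8-CPU lookup tables are hypothetical stand-ins for get_cpu_cacheinfo_id() and cpu_to_node(), and the -1 return merely stands in for whatever error the kernel's default case reports.

```c
#include <assert.h>

/* Mirrors the enum in this series: cache scopes keep the cache level as value. */
enum resctrl_scope {
	RESCTRL_L2_CACHE = 2,
	RESCTRL_L3_CACHE = 3,
	RESCTRL_NODE,
};

/* Hypothetical topology: 8 CPUs, 4 L2 caches, 1 L3 cache, 2 SNC nodes. */
#define DEMO_NR_CPUS 8
static const int demo_l2_id[DEMO_NR_CPUS]   = { 0, 0, 1, 1, 2, 2, 3, 3 };
static const int demo_l3_id[DEMO_NR_CPUS]   = { 0, 0, 0, 0, 0, 0, 0, 0 };
static const int demo_node_id[DEMO_NR_CPUS] = { 0, 0, 0, 0, 1, 1, 1, 1 };

/* Return the domain id of @cpu for @scope, or -1 for an unknown scope. */
static int demo_domain_id_from_scope(int cpu, enum resctrl_scope scope)
{
	assert(cpu >= 0 && cpu < DEMO_NR_CPUS);

	switch (scope) {
	case RESCTRL_L2_CACHE:
		return demo_l2_id[cpu];		/* stand-in for get_cpu_cacheinfo_id(cpu, 2) */
	case RESCTRL_L3_CACHE:
		return demo_l3_id[cpu];		/* stand-in for get_cpu_cacheinfo_id(cpu, 3) */
	case RESCTRL_NODE:
		return demo_node_id[cpu];	/* stand-in for cpu_to_node(cpu) */
	default:
		break;
	}

	return -1;
}
```

With these made-up tables, CPU 5 falls in L3 domain 0 but in node domain 1, which is the distinction that node-scoped monitoring relies on.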
Signed-off-by: Tony Luck Reviewed-by: Peter Newman --- Changes since v5: Updates to commit message. include/linux/resctrl.h | 1 + arch/x86/kernel/cpu/resctrl/core.c | 2 ++ 2 files changed, 3 insertions(+) diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h index 1c925e3db2ea..18ed787f9798 100644 --- a/include/linux/resctrl.h +++ b/include/linux/resctrl.h @@ -165,6 +165,7 @@ struct resctrl_schema; enum resctrl_scope { RESCTRL_L2_CACHE =3D 2, RESCTRL_L3_CACHE =3D 3, + RESCTRL_NODE, }; =20 /** diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resct= rl/core.c index 726f00c01079..e61bf919ac78 100644 --- a/arch/x86/kernel/cpu/resctrl/core.c +++ b/arch/x86/kernel/cpu/resctrl/core.c @@ -511,6 +511,8 @@ static int get_domain_id_from_scope(int cpu, enum resct= rl_scope scope) case RESCTRL_L2_CACHE: case RESCTRL_L3_CACHE: return get_cpu_cacheinfo_id(cpu, scope); + case RESCTRL_NODE: + return cpu_to_node(cpu); default: break; } --=20 2.41.0 From nobody Fri Dec 19 02:50:45 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id A0543CE7B1F for ; Thu, 28 Sep 2023 19:14:23 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232215AbjI1TOX (ORCPT ); Thu, 28 Sep 2023 15:14:23 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:57940 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231954AbjI1TOH (ORCPT ); Thu, 28 Sep 2023 15:14:07 -0400 Received: from mgamail.intel.com (mgamail.intel.com [134.134.136.126]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 4E3551A2; Thu, 28 Sep 2023 12:14:05 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1695928445; x=1727464445; h=from:to:cc:subject:date:message-id:in-reply-to: 
references:mime-version:content-transfer-encoding; bh=CjDzME9E1sh2o1kULbVTBnwuHu5JOjSatiCrJ26MprE=; b=cWsdEcPQAIpIuhqVoaSjAEkIakuoiLzBoxYqN5DCFZvOqHjqJkzy97Xa RpLVynXb8g46cngoQxd1B5EZZvfou2jb9r+hga8OD2B1gOxv8fS4GHW+t h/f8oladaqGqXhFB5vBppX163QqTLLH0AZG21vYeDX2gNmyaXkhNDk2fv vKH9JPt1Hyge6/9wWVkel4wLshoUIvjZKuxCxLZ8f8g3PR7CZbKDPFokY kPkjThW30AxRxdWAsLG4o9IAHgCf8Rs4+fX0cVLJMfVk9NQr8OkncyWXl CH0vWYlosm6Jr/vva44tcdh4QhNhwS/XdfolaesAlkHZfRNZzOXZ9uh5c Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10847"; a="367213943" X-IronPort-AV: E=Sophos;i="6.03,185,1694761200"; d="scan'208";a="367213943" Received: from orsmga008.jf.intel.com ([10.7.209.65]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 28 Sep 2023 12:14:01 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10847"; a="779020039" X-IronPort-AV: E=Sophos;i="6.03,185,1694761200"; d="scan'208";a="779020039" Received: from agluck-desk3.sc.intel.com ([172.25.222.74]) by orsmga008-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 28 Sep 2023 12:14:00 -0700 From: Tony Luck To: Fenghua Yu , Reinette Chatre , Peter Newman , Jonathan Corbet , Shuah Khan , x86@kernel.org Cc: Shaopeng Tan , James Morse , Jamie Iles , Babu Moger , Randy Dunlap , linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, patches@lists.linux.dev, Tony Luck Subject: [PATCH v6 6/8] x86/resctrl: Introduce snc_nodes_per_l3_cache Date: Thu, 28 Sep 2023 12:13:47 -0700 Message-ID: <20230928191350.205703-7-tony.luck@intel.com> X-Mailer: git-send-email 2.41.0 In-Reply-To: <20230928191350.205703-1-tony.luck@intel.com> References: <20230829234426.64421-1-tony.luck@intel.com> <20230928191350.205703-1-tony.luck@intel.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" Intel Sub-NUMA Cluster (SNC) is a feature that subdivides the CPU cores and memory controllers on a socket into two or more groups. 
These are presented to the operating system as NUMA nodes.

This may enable some workloads to have slightly lower latency to memory
as the memory controller(s) in an SNC node are electrically closer to the
CPU cores on that SNC node. This benefit may be offset by lower bandwidth
since the memory accesses for each core can only be interleaved between
the memory controllers on the same SNC node.

Resctrl monitoring on Intel systems depends upon attaching RMIDs to tasks
to track L3 cache occupancy and memory bandwidth. There is an MSR that
controls how the RMIDs are shared between SNC nodes. The default mode
divides them numerically. E.g. when there are two SNC nodes on a socket
the lower numbered half of the RMIDs is given to the first node, the
remainder to the second node. This would be difficult to use with the
Linux resctrl interface as specific RMID values assigned to resctrl groups
are not visible to users.

The other mode divides the RMIDs and renumbers the ones on the second
SNC node to start from zero.

Even with this renumbering, SNC mode requires several changes in resctrl
behavior for correct operation.

Add a global integer "snc_nodes_per_l3_cache" that will show how many
SNC nodes share each L3 cache. When this is "1", SNC mode is either
not implemented, or not enabled.

A later patch will detect SNC mode and set snc_nodes_per_l3_cache to
the appropriate value. For now it remains at the default "1" to
indicate SNC mode is not active.

The changes in behavior needed when SNC is enabled are:
1) The number of logical RMIDs per L3 cache available for use is the
   number of physical RMIDs divided by the number of SNC nodes.
2) Likewise the "mon_scale" value must be adjusted for the number
   of SNC nodes.
3) The RMID renumbering operates when using the value from the
   IA32_PQR_ASSOC MSR to count accesses by a task.
   When reading an RMID counter, code must adjust from the logical
   RMID used to the physical RMID value for the SNC node that it
   wishes to read and load the adjusted value into the
   IA32_QM_EVTSEL MSR.
4) The L3 cache is divided between the SNC nodes. So the value
   reported in the resctrl "size" file is adjusted.
5) The "-o mba_MBps" mount option must be disabled in SNC mode
   because the monitoring is being done per SNC node, while the
   bandwidth allocation is still done at the L3 cache scope.
   Trying to use this feedback loop might result in contradictory
   changes to the throttling level coming from each of the SNC
   node bandwidth measurements.

Signed-off-by: Tony Luck
Reviewed-by: Peter Newman
---

Changes since v5:

Major overhaul to the commit message. Starts with high level overview
of what SNC is, before going into details on changes needed.

Begin with definition of the SNC acronym.

Clarify in point "1" that available RMIDs are per L3 cache.

Add extra detail in "5" why mba_MBps is incompatible with SNC mode.

Code changes:

Reformat a comment to use longer lines.

Added a period at end of sentence for a comment.
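The arithmetic in points 1) and 3) of the commit message can be sketched as follows. This is a minimal user-space illustration under stated assumptions, not the kernel implementation: the demo_* function names are hypothetical, and node ids are assumed to be numbered contiguously within each L3 cache so that (node % snc_nodes_per_l3) gives the position of the node inside its cache.

```c
#include <assert.h>

/*
 * Point 1: logical RMIDs usable per SNC node. The physical RMIDs
 * (max_rmid + 1 of them) are split evenly between the SNC nodes
 * that share an L3 cache.
 */
static unsigned int demo_logical_rmids(unsigned int max_rmid, int snc_nodes_per_l3)
{
	assert(snc_nodes_per_l3 >= 1);

	return (max_rmid + 1) / (unsigned int)snc_nodes_per_l3;
}

/*
 * Point 3: physical RMID to select when reading the counter for
 * @logical_rmid on SNC node @node. @num_rmid is the per-node count
 * returned by demo_logical_rmids().
 */
static unsigned int demo_physical_rmid(unsigned int logical_rmid, int node,
				       int snc_nodes_per_l3, unsigned int num_rmid)
{
	unsigned int offset = 0;

	assert(node >= 0 && snc_nodes_per_l3 >= 1);

	/* No adjustment when SNC is absent or disabled. */
	if (snc_nodes_per_l3 > 1)
		offset = (unsigned int)(node % snc_nodes_per_l3) * num_rmid;

	return logical_rmid + offset;
}
```

E.g. assuming an invented maximum RMID of 255 and two SNC nodes per L3 cache, each node gets 128 logical RMIDs, and logical RMID 5 read on node 1 selects physical RMID 133.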
--- arch/x86/kernel/cpu/resctrl/internal.h | 2 ++ arch/x86/kernel/cpu/resctrl/core.c | 6 ++++++ arch/x86/kernel/cpu/resctrl/monitor.c | 16 +++++++++++++--- arch/x86/kernel/cpu/resctrl/rdtgroup.c | 4 ++-- 4 files changed, 23 insertions(+), 5 deletions(-) diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/r= esctrl/internal.h index ee38249c6f1d..0a17ace5811e 100644 --- a/arch/x86/kernel/cpu/resctrl/internal.h +++ b/arch/x86/kernel/cpu/resctrl/internal.h @@ -446,6 +446,8 @@ DECLARE_STATIC_KEY_FALSE(rdt_alloc_enable_key); =20 extern struct dentry *debugfs_resctrl; =20 +extern int snc_nodes_per_l3_cache; + enum resctrl_res_level { RDT_RESOURCE_L3, RDT_RESOURCE_L2, diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resct= rl/core.c index e61bf919ac78..1f94b7b11f3e 100644 --- a/arch/x86/kernel/cpu/resctrl/core.c +++ b/arch/x86/kernel/cpu/resctrl/core.c @@ -48,6 +48,12 @@ int max_name_width, max_data_width; */ bool rdt_alloc_capable; =20 +/* + * Number of SNC nodes that share each L3 cache. Default is 1 for + * systems that do not support SNC, or have SNC disabled. + */ +int snc_nodes_per_l3_cache =3D 1; + static void mba_wrmsr_intel(struct rdt_ctrl_domain *d, struct msr_param *m, struct rdt_resource *r); diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/re= sctrl/monitor.c index 97d2ed829f5d..e6e566921a60 100644 --- a/arch/x86/kernel/cpu/resctrl/monitor.c +++ b/arch/x86/kernel/cpu/resctrl/monitor.c @@ -148,8 +148,18 @@ static inline struct rmid_entry *__rmid_entry(u32 rmid) =20 static int __rmid_read(u32 rmid, enum resctrl_event_id eventid, u64 *val) { + struct rdt_resource *r =3D &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl; + int cpu =3D smp_processor_id(); + int rmid_offset =3D 0; u64 msr_val; =20 + /* + * When SNC mode is on, need to compute the offset to read the + * physical RMID counter for the node to which this CPU belongs. 
+ */ + if (snc_nodes_per_l3_cache > 1) + rmid_offset =3D (cpu_to_node(cpu) % snc_nodes_per_l3_cache) * r->num_rmi= d; + /* * As per the SDM, when IA32_QM_EVTSEL.EvtID (bits 7:0) is configured * with a valid event code for supported resource type and the bits @@ -158,7 +168,7 @@ static int __rmid_read(u32 rmid, enum resctrl_event_id = eventid, u64 *val) * IA32_QM_CTR.Error (bit 63) and IA32_QM_CTR.Unavailable (bit 62) * are error bits. */ - wrmsr(MSR_IA32_QM_EVTSEL, eventid, rmid); + wrmsr(MSR_IA32_QM_EVTSEL, eventid, rmid + rmid_offset); rdmsrl(MSR_IA32_QM_CTR, msr_val); =20 if (msr_val & RMID_VAL_ERROR) @@ -783,8 +793,8 @@ int __init rdt_get_mon_l3_config(struct rdt_resource *r) int ret; =20 resctrl_rmid_realloc_limit =3D boot_cpu_data.x86_cache_size * 1024; - hw_res->mon_scale =3D boot_cpu_data.x86_cache_occ_scale; - r->num_rmid =3D boot_cpu_data.x86_cache_max_rmid + 1; + hw_res->mon_scale =3D boot_cpu_data.x86_cache_occ_scale / snc_nodes_per_l= 3_cache; + r->num_rmid =3D (boot_cpu_data.x86_cache_max_rmid + 1) / snc_nodes_per_l3= _cache; hw_res->mbm_width =3D MBM_CNTR_WIDTH_BASE; =20 if (mbm_offset > 0 && mbm_offset <=3D MBM_CNTR_WIDTH_OFFSET_MAX) diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/r= esctrl/rdtgroup.c index b0901fb95aa9..a5404c412f53 100644 --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c @@ -1357,7 +1357,7 @@ unsigned int rdtgroup_cbm_to_size(struct rdt_resource= *r, } } =20 - return size; + return size / snc_nodes_per_l3_cache; } =20 /** @@ -2590,7 +2590,7 @@ static int rdt_parse_param(struct fs_context *fc, str= uct fs_parameter *param) ctx->enable_cdpl2 =3D true; return 0; case Opt_mba_mbps: - if (!supports_mba_mbps()) + if (!supports_mba_mbps() || snc_nodes_per_l3_cache > 1) return -EINVAL; ctx->enable_mba_mbps =3D true; return 0; --=20 2.41.0 From nobody Fri Dec 19 02:50:45 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on 
aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1CC4CCE7B1F for ; Thu, 28 Sep 2023 19:14:31 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232394AbjI1TO3 (ORCPT ); Thu, 28 Sep 2023 15:14:29 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:58010 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232107AbjI1TON (ORCPT ); Thu, 28 Sep 2023 15:14:13 -0400 Received: from mgamail.intel.com (mgamail.intel.com [134.134.136.126]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3E4691A7; Thu, 28 Sep 2023 12:14:06 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1695928446; x=1727464446; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=pVbIcVXxYuj9/HHxZC2rWHaPLAO4XTekywHD+3qbsvI=; b=IuQvVlb+oshVxv6+XxS5CDmUertTkhqWmw0WUSEK+fTZ5CPz1SAso9yw KMUPHzN/YQ4tGcist6R52Lb5rrz/Y30+wHhQzCUG43uGxD2DNCe0CGHwZ qNumCAlcVJmPr77+Aj5WT8CUqr9zon9nDhXY2L/2mEhkE2MsiTVAq5BZN hUJ474nAZZ8s29p5SYI5F0tObEpequUoSfd7oil5wCCo3J4f5c6OI1EM7 2m5LnUec7H4JA2uUM+G2Ga2W2dwr4oY+6t0HcX65Fudq4giKcMpLv4sER zZWTDNgo7SaFbjb5V7TqEIz+va/4YRHID0GAxXJTS4DKY7kENlqrbkvq0 A==; X-IronPort-AV: E=McAfee;i="6600,9927,10847"; a="367213946" X-IronPort-AV: E=Sophos;i="6.03,185,1694761200"; d="scan'208";a="367213946" Received: from orsmga008.jf.intel.com ([10.7.209.65]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 28 Sep 2023 12:14:01 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10847"; a="779020042" X-IronPort-AV: E=Sophos;i="6.03,185,1694761200"; d="scan'208";a="779020042" Received: from agluck-desk3.sc.intel.com ([172.25.222.74]) by orsmga008-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 28 Sep 2023 12:14:01 -0700 From: Tony Luck To: Fenghua Yu , 
Reinette Chatre , Peter Newman , Jonathan Corbet , Shuah Khan , x86@kernel.org
Cc: Shaopeng Tan , James Morse , Jamie Iles , Babu Moger , Randy Dunlap , linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, patches@lists.linux.dev, Tony Luck
Subject: [PATCH v6 7/8] x86/resctrl: Sub NUMA Cluster detection and enable
Date: Thu, 28 Sep 2023 12:13:48 -0700
Message-ID: <20230928191350.205703-8-tony.luck@intel.com>
X-Mailer: git-send-email 2.41.0
In-Reply-To: <20230928191350.205703-1-tony.luck@intel.com>
References: <20230829234426.64421-1-tony.luck@intel.com> <20230928191350.205703-1-tony.luck@intel.com>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Type: text/plain; charset="utf-8"

There isn't a simple h/w bit that indicates whether a CPU is running
in Sub NUMA Cluster (SNC) mode. Infer the state by comparing the ratio
of NUMA nodes to L3 cache instances.

When SNC mode is detected, reconfigure the RMID counters by updating
the MSR_RMID_SNC_CONFIG MSR on each socket as CPUs are seen.

Clearing bit zero of the MSR divides the RMIDs and renumbers the ones
on the second SNC node to start from zero.

An earlier commit includes all the required changes in Linux to
operate in this reconfigured mode.

Signed-off-by: Tony Luck
Reviewed-by: Peter Newman
---

Changes since v5:

Short explanation of RMID reconfiguration added in the commit message
(longer one included in previous patch that has all the code that takes
action when "snc_nodes_per_l3_cache" is greater than "1").

Added to the comment before snc_remap_rmids() describing what
"remapping" is occurring.

Added a comment before snc_get_config() [renamed from get_snc_config()
to be consistent with "snc_" as a prefix] describing how it works, and
that it can fail if the system is booted with "maxcpus=3DN" parameter.

Now using bitmap_zalloc() to allocate bitmap.

Add code to defend against divide by zero if no caches are found.
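The detection heuristic described in the commit message can be sketched in user-space C as follows. This is an illustration under assumptions, not the kernel code: demo_snc_nodes_per_l3() and the sample topologies are hypothetical, each node is represented only by the L3 cache id of its first CPU (or -1 for a memory-only node), and cache ids are assumed to be small non-negative integers.

```c
#include <assert.h>

/* Assumption for this sketch: L3 cache ids are below this bound. */
#define DEMO_MAX_L3_ID 64

/* Hypothetical topologies: L3 id of the first CPU on each node, -1 if no CPUs. */
static const int demo_snc_on[]   = { 0, 0, 1, 1 };	/* 4 nodes, 2 L3 caches */
static const int demo_snc_off[]  = { 0, 1 };		/* 2 nodes, 2 L3 caches */
static const int demo_mem_only[] = { 0, 0, -1 };	/* third node has memory only */

/* Return the number of CPU-bearing NUMA nodes per distinct L3 cache. */
static int demo_snc_nodes_per_l3(const int *node_l3_id, int nr_nodes)
{
	unsigned char seen[DEMO_MAX_L3_ID] = { 0 };
	int nodes_with_cpus = 0;
	int num_l3_caches = 0;
	int i;

	for (i = 0; i < nr_nodes; i++) {
		int id = node_l3_id[i];

		/* Memory-only nodes do not count towards the ratio. */
		if (id < 0)
			continue;
		assert(id < DEMO_MAX_L3_ID);

		nodes_with_cpus++;
		if (!seen[id]) {
			seen[id] = 1;
			num_l3_caches++;
		}
	}

	/* Defend against divide by zero if no caches were found. */
	if (!num_l3_caches)
		return 1;

	return nodes_with_cpus / num_l3_caches;
}
```

A result greater than 1 corresponds to the case where the patch sets snc_nodes_per_l3_cache and switches the L3 monitoring scope to RESCTRL_NODE.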
--- arch/x86/include/asm/msr-index.h | 1 + arch/x86/kernel/cpu/resctrl/core.c | 90 ++++++++++++++++++++++++++++++ 2 files changed, 91 insertions(+) diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h index 1d111350197f..393d1b047617 100644 --- a/arch/x86/include/asm/msr-index.h +++ b/arch/x86/include/asm/msr-index.h @@ -1100,6 +1100,7 @@ #define MSR_IA32_QM_CTR 0xc8e #define MSR_IA32_PQR_ASSOC 0xc8f #define MSR_IA32_L3_CBM_BASE 0xc90 +#define MSR_RMID_SNC_CONFIG 0xca0 #define MSR_IA32_L2_CBM_BASE 0xd10 #define MSR_IA32_MBA_THRTL_BASE 0xd50 =20 diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c index 1f94b7b11f3e..0041c80c3b2c 100644 --- a/arch/x86/kernel/cpu/resctrl/core.c +++ b/arch/x86/kernel/cpu/resctrl/core.c @@ -16,11 +16,14 @@ =20 #define pr_fmt(fmt) "resctrl: " fmt =20 +#include #include #include #include #include +#include =20 +#include #include #include #include "internal.h" @@ -733,11 +736,42 @@ static void clear_closid_rmid(int cpu) wrmsr(MSR_IA32_PQR_ASSOC, 0, 0); } =20 +/* + * The power-on reset value of MSR_RMID_SNC_CONFIG is 0x1 + * which indicates that RMIDs are configured in legacy mode. + * This mode is incompatible with Linux resctrl semantics + * as RMIDs are partitioned between SNC nodes, which requires + * a user to know which RMID is allocated to a task. + * Clearing bit 0 reconfigures the RMID counters for use + * in Sub NUMA Cluster mode. This mode is better for Linux. + * The RMID space is divided between all SNC nodes with the + * RMIDs renumbered to start from zero in each node when + * counting operations from tasks. Code to read the counters + * must adjust RMID counter numbers based on SNC node. See + * __rmid_read() for code that does this. + */ +static void snc_remap_rmids(int cpu) +{ + u64 val; + + /* Only need to enable once per package.
*/ + if (cpumask_first(topology_core_cpumask(cpu)) !=3D cpu) + return; + + rdmsrl(MSR_RMID_SNC_CONFIG, val); + val &=3D ~BIT_ULL(0); + wrmsrl(MSR_RMID_SNC_CONFIG, val); +} + static int resctrl_online_cpu(unsigned int cpu) { struct rdt_resource *r; =20 mutex_lock(&rdtgroup_mutex); + + if (snc_nodes_per_l3_cache > 1) + snc_remap_rmids(cpu); + for_each_capable_rdt_resource(r) domain_add_cpu(cpu, r); /* The cpu is set in default rdtgroup after online. */ @@ -992,11 +1026,67 @@ static __init bool get_rdt_resources(void) return (rdt_mon_capable || rdt_alloc_capable); } =20 +/* CPU models that support MSR_RMID_SNC_CONFIG */ +static const struct x86_cpu_id snc_cpu_ids[] __initconst =3D { + X86_MATCH_INTEL_FAM6_MODEL(ICELAKE_X, 0), + X86_MATCH_INTEL_FAM6_MODEL(SAPPHIRERAPIDS_X, 0), + X86_MATCH_INTEL_FAM6_MODEL(EMERALDRAPIDS_X, 0), + {} +}; + +/* + * There isn't a simple h/w bit that indicates whether a CPU is running + * in Sub NUMA Cluster (SNC) mode. Infer the state by comparing the + * ratio of NUMA nodes to L3 cache instances. + * It is not possible to accurately determine SNC state if the system is + * booted with a maxcpus=3DN parameter. That distorts the ratio of SNC nod= es + * to L3 caches. It will be OK if system is booted with hyperthreading + * disabled (since this doesn't affect the ratio). 
+ */ +static __init int snc_get_config(void) +{ + unsigned long *node_caches; + int mem_only_nodes =3D 0; + int cpu, node, ret; + int num_l3_caches; + + if (!x86_match_cpu(snc_cpu_ids)) + return 1; + + node_caches =3D bitmap_zalloc(nr_node_ids, GFP_KERNEL); + if (!node_caches) + return 1; + + cpus_read_lock(); + for_each_node(node) { + cpu =3D cpumask_first(cpumask_of_node(node)); + if (cpu < nr_cpu_ids) + set_bit(get_cpu_cacheinfo_id(cpu, 3), node_caches); + else + mem_only_nodes++; + } + cpus_read_unlock(); + + num_l3_caches =3D bitmap_weight(node_caches, nr_node_ids); + bitmap_free(node_caches); + if (!num_l3_caches) + return 1; + + ret =3D (nr_node_ids - mem_only_nodes) / num_l3_caches; + + if (ret > 1) + rdt_resources_all[RDT_RESOURCE_L3].r_resctrl.mon_scope =3D RESCTRL_NODE; + + return ret; +} + static __init void rdt_init_res_defs_intel(void) { struct rdt_hw_resource *hw_res; struct rdt_resource *r; =20 + snc_nodes_per_l3_cache =3D snc_get_config(); + for_each_rdt_resource(r) { hw_res =3D resctrl_to_arch_res(r); =20 --=20 2.41.0 From nobody Fri Dec 19 02:50:45 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id CDB2BCE7B1F for ; Thu, 28 Sep 2023 19:14:27 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232305AbjI1TO0 (ORCPT ); Thu, 28 Sep 2023 15:14:26 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:58034 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232178AbjI1TON (ORCPT ); Thu, 28 Sep 2023 15:14:13 -0400 Received: from mgamail.intel.com (mgamail.intel.com [134.134.136.126]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A5DD31AB; Thu, 28 Sep 2023 12:14:06 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1695928446;
x=1727464446; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=mn6nYHQ+REP1+buEIYzgvASYIMVA5gLIDjoJUxkB0Ok=; b=duWY74iPCS0aoa5n5Gjw2tawpKPLUFeu/NI0PYnZv6O8k4rxd0uWWOX/ PK7cKYgAWHc0XCTkgg/J4D0ezok4af01egNJRsDdcCUxNf/MJbWBJ3MyP afgLnuek101P3i38jS9+D+a8XCFtxLIsPCczd6HFrzMkw9+7kUbfZZH0L ZQQB6jq0b8CwfAR1RjeyHNKewWZfLdT4ezf1FOuKCbZUTps6k8gTLeaNi s5fVOX9eJb4qN87kpYXHrL1pWYExBeSH6rnaR7KxyuOT+ieJ8DMbSLWX8 8X8n/hQLkF6FlRRhEWZuZ3PJjqwt4dJyGVlDtX7jm3EqE3CA3ACBYA3VY g==; X-IronPort-AV: E=McAfee;i="6600,9927,10847"; a="367213958" X-IronPort-AV: E=Sophos;i="6.03,185,1694761200"; d="scan'208";a="367213958" Received: from orsmga008.jf.intel.com ([10.7.209.65]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 28 Sep 2023 12:14:02 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10847"; a="779020047" X-IronPort-AV: E=Sophos;i="6.03,185,1694761200"; d="scan'208";a="779020047" Received: from agluck-desk3.sc.intel.com ([172.25.222.74]) by orsmga008-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 28 Sep 2023 12:14:01 -0700 From: Tony Luck To: Fenghua Yu , Reinette Chatre , Peter Newman , Jonathan Corbet , Shuah Khan , x86@kernel.org Cc: Shaopeng Tan , James Morse , Jamie Iles , Babu Moger , Randy Dunlap , linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, patches@lists.linux.dev, Tony Luck Subject: [PATCH v6 8/8] x86/resctrl: Update documentation with Sub-NUMA cluster changes Date: Thu, 28 Sep 2023 12:13:49 -0700 Message-ID: <20230928191350.205703-9-tony.luck@intel.com> X-Mailer: git-send-email 2.41.0 In-Reply-To: <20230928191350.205703-1-tony.luck@intel.com> References: <20230829234426.64421-1-tony.luck@intel.com> <20230928191350.205703-1-tony.luck@intel.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" With Sub-NUMA Cluster mode enabled the scope of 
monitoring resources is per-NODE instead of per-L3 cache. Suffixes of directories with "L3" in their name refer to Sub-NUMA nodes instead of L3 cache ids. Users should be aware that SNC mode also affects the amount of L3 cache available for allocation within each SNC node. Signed-off-by: Tony Luck Reviewed-by: Peter Newman --- Changes since v5: Added additional details about challenges tracking tasks when SNC mode is enabled. --- Documentation/arch/x86/resctrl.rst | 34 +++++++++++++++++++++++++++--- 1 file changed, 31 insertions(+), 3 deletions(-) diff --git a/Documentation/arch/x86/resctrl.rst b/Documentation/arch/x86/resctrl.rst index cb05d90111b4..d6b6a4cfd967 100644 --- a/Documentation/arch/x86/resctrl.rst +++ b/Documentation/arch/x86/resctrl.rst @@ -345,9 +345,15 @@ When control is enabled all CTRL_MON groups will also contain: When monitoring is enabled all MON groups will also contain: =20 "mon_data": - This contains a set of files organized by L3 domain and by - RDT event. E.g. on a system with two L3 domains there will - be subdirectories "mon_L3_00" and "mon_L3_01". Each of these + This contains a set of files organized by L3 domain or by NUMA + node (depending on whether Sub-NUMA Cluster (SNC) mode is disabled + or enabled respectively) and by RDT event. E.g. on a system with + SNC mode disabled with two L3 domains there will be subdirectories + "mon_L3_00" and "mon_L3_01". The numerical suffix refers to the + L3 cache id. With SNC enabled the directory names are the same, + but the numerical suffix refers to the node id. + Mappings from node ids to CPUs are available in the + /sys/devices/system/node/node*/cpulist files. Each of these directories have one file per event (e.g. "llc_occupancy", "mbm_total_bytes", and "mbm_local_bytes"). In a MON group these files provide a read out of the current value of the event for @@ -452,6 +458,28 @@ and 0xA are not. On a system with a 20-bit mask each bit represents 5% of the capacity of the cache.
You could partition the cache into four equal parts with masks: 0x1f, 0x3e0, 0x7c00, 0xf8000. =20 +Notes on Sub-NUMA Cluster mode +=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D +When SNC mode is enabled the "llc_occupancy", "mbm_total_bytes", and +"mbm_local_bytes" will only give meaningful results for well behaved NUMA +applications. I.e. those that perform the majority of memory accesses +to memory on the local NUMA node to the CPU where the task is executing. +Note that Linux may load balance tasks between Sub-NUMA nodes much +more readily than between regular NUMA nodes since the CPUs on SNC +share the same L3 cache and the system may report the NUMA distance +between SNC nodes with a lower value than used for regular NUMA nodes. +Tasks that migrate between nodes will have their traffic recorded by the +counters in different SNC nodes so a user will need to read mon_data +files from each node on which the task executed to get the full +view of traffic for which the task was the source. + + +The cache allocation feature still provides the same number of +bits in a mask to control allocation into the L3 cache. But each +of those ways has its capacity reduced because the cache is divided +between the SNC nodes. The values reported in the resctrl +"size" files are adjusted accordingly. + Memory bandwidth Allocation and monitoring =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D =20 --=20 2.41.0