From: Tony Luck
To: Fenghua Yu, Reinette Chatre, Peter Newman, Jonathan Corbet, Shuah Khan, x86@kernel.org
Cc: Shaopeng Tan, James Morse, Jamie Iles, Babu Moger, Randy Dunlap, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, patches@lists.linux.dev, Tony Luck
Subject: [PATCH v7 1/4] x86/resctrl: Make input event for MBA Software Controller configurable
Date: Thu, 3 Oct 2024 12:12:25 -0700
Message-ID: <20241003191228.67541-2-tony.luck@intel.com>
In-Reply-To: <20241003191228.67541-1-tony.luck@intel.com>
References: <20241003191228.67541-1-tony.luck@intel.com>

The MBA Software Controller(mba_sc) is a feedback loop that uses
measurements of local memory bandwidth to
adjust MBA throttling levels to keep workloads in a resctrl group within
a target bandwidth set in the schemata file.

Users may want to use total memory bandwidth instead of local to handle
workloads that have poor NUMA localization.

Update the once-per-second polling code to pick a chosen event (local
or total memory bandwidth).

Signed-off-by: Tony Luck
---
 include/linux/resctrl.h                |  2 +
 arch/x86/kernel/cpu/resctrl/monitor.c  | 80 ++++++++++++--------------
 arch/x86/kernel/cpu/resctrl/rdtgroup.c |  2 +
 3 files changed, 40 insertions(+), 44 deletions(-)

diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index d94abba1c716..ccb0f50dc18c 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -161,6 +161,7 @@ enum membw_throttle_mode {
  * @throttle_mode:	Bandwidth throttling mode when threads request
  *			different memory bandwidths
  * @mba_sc:		True if MBA software controller(mba_sc) is enabled
+ * @mba_mbps_event:	Monitoring event guiding feedback loop when @mba_sc is true
  * @mb_map:		Mapping of memory B/W percentage to memory B/W delay
  */
 struct resctrl_membw {
@@ -170,6 +171,7 @@ struct resctrl_membw {
 	bool				arch_needs_linear;
 	enum membw_throttle_mode	throttle_mode;
 	bool				mba_sc;
+	enum resctrl_event_id		mba_mbps_event;
 	u32				*mb_map;
 };
 
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index 851b561850e0..2692ce7f708e 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -663,10 +663,11 @@ static int __mon_event_count(u32 closid, u32 rmid, struct rmid_read *rr)
  */
 static void mbm_bw_count(u32 closid, u32 rmid, struct rmid_read *rr)
 {
-	u32 idx = resctrl_arch_rmid_idx_encode(closid, rmid);
-	struct mbm_state *m = &rr->d->mbm_local[idx];
 	u64 cur_bw, bytes, cur_bytes;
+	struct mbm_state *m;
 
+	m = get_mbm_state(rr->d, closid, rmid, rr->evtid);
+	WARN_ON(!m);
 	cur_bytes = rr->val;
 	bytes = cur_bytes - m->prev_bw_bytes;
 	m->prev_bw_bytes = cur_bytes;
@@ -752,20 +753,22 @@ static void update_mba_bw(struct rdtgroup *rgrp, struct rdt_mon_domain *dom_mbm)
 	u32 closid, rmid, cur_msr_val, new_msr_val;
 	struct mbm_state *pmbm_data, *cmbm_data;
 	struct rdt_ctrl_domain *dom_mba;
+	enum resctrl_event_id evt_id;
 	struct rdt_resource *r_mba;
-	u32 cur_bw, user_bw, idx;
 	struct list_head *head;
 	struct rdtgroup *entry;
+	u32 cur_bw, user_bw;
 
-	if (!is_mbm_local_enabled())
+	if (!is_mbm_enabled())
 		return;
 
 	r_mba = &rdt_resources_all[RDT_RESOURCE_MBA].r_resctrl;
+	evt_id = r_mba->membw.mba_mbps_event;
 
 	closid = rgrp->closid;
 	rmid = rgrp->mon.rmid;
-	idx = resctrl_arch_rmid_idx_encode(closid, rmid);
-	pmbm_data = &dom_mbm->mbm_local[idx];
+	pmbm_data = get_mbm_state(dom_mbm, closid, rmid, evt_id);
+	WARN_ON(!pmbm_data);
 
 	dom_mba = get_ctrl_domain_from_cpu(smp_processor_id(), r_mba);
 	if (!dom_mba) {
@@ -784,7 +787,8 @@ static void update_mba_bw(struct rdtgroup *rgrp, struct rdt_mon_domain *dom_mbm)
 	 */
 	head = &rgrp->mon.crdtgrp_list;
 	list_for_each_entry(entry, head, mon.crdtgrp_list) {
-		cmbm_data = &dom_mbm->mbm_local[entry->mon.rmid];
+		cmbm_data = get_mbm_state(dom_mbm, entry->closid, entry->mon.rmid, evt_id);
+		WARN_ON(!cmbm_data);
 		cur_bw += cmbm_data->prev_bw;
 	}
 
@@ -813,54 +817,42 @@ static void update_mba_bw(struct rdtgroup *rgrp, struct rdt_mon_domain *dom_mbm)
 	resctrl_arch_update_one(r_mba, dom_mba, closid, CDP_NONE, new_msr_val);
 }
 
-static void mbm_update(struct rdt_resource *r, struct rdt_mon_domain *d,
-		       u32 closid, u32 rmid)
+static void mbm_update_one_event(struct rdt_resource *r, struct rdt_mon_domain *d,
+				 u32 closid, u32 rmid, enum resctrl_event_id evtid)
 {
+	struct rdt_resource *rmba = &rdt_resources_all[RDT_RESOURCE_MBA].r_resctrl;
 	struct rmid_read rr = {0};
 
 	rr.r = r;
 	rr.d = d;
+	rr.evtid = evtid;
+	rr.arch_mon_ctx = resctrl_arch_mon_ctx_alloc(rr.r, rr.evtid);
+	if (IS_ERR(rr.arch_mon_ctx)) {
+		pr_warn_ratelimited("Failed to allocate monitor context: %ld",
+				    PTR_ERR(rr.arch_mon_ctx));
+		return;
+	}
 
+	__mon_event_count(closid, rmid, &rr);
+
+	if (is_mba_sc(NULL) && rr.evtid == rmba->membw.mba_mbps_event)
+		mbm_bw_count(closid, rmid, &rr);
+
+	resctrl_arch_mon_ctx_free(rr.r, rr.evtid, rr.arch_mon_ctx);
+}
+
+static void mbm_update(struct rdt_resource *r, struct rdt_mon_domain *d,
+		       u32 closid, u32 rmid)
+{
 	/*
 	 * This is protected from concurrent reads from user
 	 * as both the user and we hold the global mutex.
 	 */
-	if (is_mbm_total_enabled()) {
-		rr.evtid = QOS_L3_MBM_TOTAL_EVENT_ID;
-		rr.val = 0;
-		rr.arch_mon_ctx = resctrl_arch_mon_ctx_alloc(rr.r, rr.evtid);
-		if (IS_ERR(rr.arch_mon_ctx)) {
-			pr_warn_ratelimited("Failed to allocate monitor context: %ld",
-					    PTR_ERR(rr.arch_mon_ctx));
-			return;
-		}
-
-		__mon_event_count(closid, rmid, &rr);
-
-		resctrl_arch_mon_ctx_free(rr.r, rr.evtid, rr.arch_mon_ctx);
-	}
-	if (is_mbm_local_enabled()) {
-		rr.evtid = QOS_L3_MBM_LOCAL_EVENT_ID;
-		rr.val = 0;
-		rr.arch_mon_ctx = resctrl_arch_mon_ctx_alloc(rr.r, rr.evtid);
-		if (IS_ERR(rr.arch_mon_ctx)) {
-			pr_warn_ratelimited("Failed to allocate monitor context: %ld",
-					    PTR_ERR(rr.arch_mon_ctx));
-			return;
-		}
-
-		__mon_event_count(closid, rmid, &rr);
-
-		/*
-		 * Call the MBA software controller only for the
-		 * control groups and when user has enabled
-		 * the software controller explicitly.
-		 */
-		if (is_mba_sc(NULL))
-			mbm_bw_count(closid, rmid, &rr);
+	if (is_mbm_total_enabled())
+		mbm_update_one_event(r, d, closid, rmid, QOS_L3_MBM_TOTAL_EVENT_ID);
 
-		resctrl_arch_mon_ctx_free(rr.r, rr.evtid, rr.arch_mon_ctx);
-	}
+	if (is_mbm_local_enabled())
+		mbm_update_one_event(r, d, closid, rmid, QOS_L3_MBM_LOCAL_EVENT_ID);
 }
 
 /*
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index d7163b764c62..aedb30120d50 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -2505,6 +2505,7 @@ static void rdt_disable_ctx(void)
 
 static int rdt_enable_ctx(struct rdt_fs_context *ctx)
 {
+	struct rdt_resource *r = &rdt_resources_all[RDT_RESOURCE_MBA].r_resctrl;
 	int ret = 0;
 
 	if (ctx->enable_cdpl2) {
@@ -2520,6 +2521,7 @@ static int rdt_enable_ctx(struct rdt_fs_context *ctx)
 	}
 
 	if (ctx->enable_mba_mbps) {
+		r->membw.mba_mbps_event = QOS_L3_MBM_LOCAL_EVENT_ID;
 		ret = set_mba_sc(true);
 		if (ret)
 			goto out_cdpl3;
-- 
2.46.1

From: Tony Luck
Subject: [PATCH v7 2/4] x86/resctrl: Add mount option to pick input event for mba_MBps mode
Date: Thu, 3 Oct 2024 12:12:26 -0700
Message-ID: <20241003191228.67541-3-tony.luck@intel.com>
In-Reply-To: <20241003191228.67541-1-tony.luck@intel.com>
References: <20241003191228.67541-1-tony.luck@intel.com>

Add a new mount option "mba_MBps_event={event_name}" where event_name
is one of "mbm_local_bytes" or "mbm_total_bytes" that allows a user to
specify which monitoring event to use.

Do not allow "mba_MBps" and "mba_MBps_event" to be specified at the
same time.

Signed-off-by: Tony Luck
---
 arch/x86/kernel/cpu/resctrl/internal.h |  1 +
 arch/x86/kernel/cpu/resctrl/rdtgroup.c | 49 ++++++++++++++++++++------
 2 files changed, 40 insertions(+), 10 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index 955999aecfca..b3d4fc80a496 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -102,6 +102,7 @@ struct rdt_fs_context {
 	bool				enable_cdpl2;
 	bool				enable_cdpl3;
 	bool				enable_mba_mbps;
+	enum resctrl_event_id		mba_mbps_event;
 	bool				enable_debug;
 };
 
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index aedb30120d50..606cf635ea94 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -2343,7 +2343,7 @@ static bool supports_mba_mbps(void)
 	struct rdt_resource *rmbm = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
 	struct rdt_resource *r = &rdt_resources_all[RDT_RESOURCE_MBA].r_resctrl;
 
-	return (is_mbm_local_enabled() &&
+	return (is_mbm_enabled() &&
 		r->alloc_capable && is_mba_linear() &&
 		r->ctrl_scope == rmbm->mon_scope);
 }
@@ -2521,7 +2521,7 @@ static int rdt_enable_ctx(struct rdt_fs_context *ctx)
 	}
 
 	if (ctx->enable_mba_mbps) {
-		r->membw.mba_mbps_event = QOS_L3_MBM_LOCAL_EVENT_ID;
+		r->membw.mba_mbps_event = ctx->mba_mbps_event;
 		ret = set_mba_sc(true);
 		if (ret)
 			goto out_cdpl3;
@@ -2739,15 +2739,17 @@ enum rdt_param {
 	Opt_cdp,
 	Opt_cdpl2,
 	Opt_mba_mbps,
+	Opt_mba_mbps_event,
 	Opt_debug,
 	nr__rdt_params
 };
 
 static const struct fs_parameter_spec rdt_fs_parameters[] = {
-	fsparam_flag("cdp",		Opt_cdp),
-	fsparam_flag("cdpl2",		Opt_cdpl2),
-	fsparam_flag("mba_MBps",	Opt_mba_mbps),
-	fsparam_flag("debug",		Opt_debug),
+	fsparam_flag("cdp",			Opt_cdp),
+	fsparam_flag("cdpl2",			Opt_cdpl2),
+	fsparam_flag("mba_MBps",		Opt_mba_mbps),
+	fsparam_string("mba_MBps_event",	Opt_mba_mbps_event),
+	fsparam_flag("debug",			Opt_debug),
 	{}
 };
 
@@ -2770,9 +2772,25 @@ static int rdt_parse_param(struct fs_context *fc, struct fs_parameter *param)
 		ctx->enable_cdpl2 = true;
 		return 0;
 	case Opt_mba_mbps:
-		msg = "mba_MBps requires local MBM and linear scale MBA at L3 scope";
-		if (!supports_mba_mbps())
+		msg = "mba_MBps requires MBM and linear scale MBA at L3 scope";
+		if (!supports_mba_mbps() || ctx->enable_mba_mbps)
 			return invalfc(fc, msg);
+		if (is_mbm_local_enabled())
+			ctx->mba_mbps_event = QOS_L3_MBM_LOCAL_EVENT_ID;
+		else
+			return invalfc(fc, msg);
+		ctx->enable_mba_mbps = true;
+		return 0;
+	case Opt_mba_mbps_event:
+		msg = "mba_MBps requires MBM and linear scale MBA at L3 scope";
+		if (!supports_mba_mbps() || ctx->enable_mba_mbps)
+			return invalfc(fc, msg);
+		if (!strcmp("mbm_local_bytes", param->string))
+			ctx->mba_mbps_event = QOS_L3_MBM_LOCAL_EVENT_ID;
+		else if (!strcmp("mbm_total_bytes", param->string))
+			ctx->mba_mbps_event = QOS_L3_MBM_TOTAL_EVENT_ID;
+		else
+			return invalfc(fc, "unknown MBM event %s", param->string);
 		ctx->enable_mba_mbps = true;
 		return 0;
 	case Opt_debug:
@@ -3931,16 +3949,27 @@ static int rdtgroup_rename(struct kernfs_node *kn,
 	return ret;
 }
 
+static const char *mba_sc_event_opt_name(struct rdt_resource *r)
+{
+	if (r->membw.mba_mbps_event == QOS_L3_MBM_LOCAL_EVENT_ID)
+		return ",mba_MBps_event=mbm_local_bytes";
+	else if (r->membw.mba_mbps_event == QOS_L3_MBM_TOTAL_EVENT_ID)
+		return ",mba_MBps_event=mbm_total_bytes";
+	return "";
+}
+
 static int rdtgroup_show_options(struct seq_file *seq, struct kernfs_root *kf)
 {
+	struct rdt_resource *r_mba = &rdt_resources_all[RDT_RESOURCE_MBA].r_resctrl;
+
 	if (resctrl_arch_get_cdp_enabled(RDT_RESOURCE_L3))
 		seq_puts(seq, ",cdp");
 
 	if (resctrl_arch_get_cdp_enabled(RDT_RESOURCE_L2))
 		seq_puts(seq, ",cdpl2");
 
-	if (is_mba_sc(&rdt_resources_all[RDT_RESOURCE_MBA].r_resctrl))
-		seq_puts(seq, ",mba_MBps");
+	if (is_mba_sc(r_mba))
+		seq_puts(seq, mba_sc_event_opt_name(r_mba));
 
 	if (resctrl_debug)
 		seq_puts(seq, ",debug");
-- 
2.46.1

From: Tony Luck
Subject: [PATCH v7 3/4] x86/resctrl: Use total bandwidth for mba_MBps option when local isn't present
Date: Thu, 3 Oct 2024 12:12:27 -0700
Message-ID: <20241003191228.67541-4-tony.luck@intel.com>
In-Reply-To: <20241003191228.67541-1-tony.luck@intel.com>
References: <20241003191228.67541-1-tony.luck@intel.com>

On Intel systems the memory bandwidth monitoring events are
independently enumerated.
It is possible for a system to support total memory
bandwidth monitoring, but not support local bandwidth monitoring.
On such a system a user could not enable mba_sc mode.

Modify the existing "mba_MBps" mount option to switch to total bandwidth
monitoring if local monitoring is not available.

Signed-off-by: Tony Luck
---
 arch/x86/kernel/cpu/resctrl/rdtgroup.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 606cf635ea94..433daaa4d125 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -2777,6 +2777,8 @@ static int rdt_parse_param(struct fs_context *fc, struct fs_parameter *param)
 			return invalfc(fc, msg);
 		if (is_mbm_local_enabled())
 			ctx->mba_mbps_event = QOS_L3_MBM_LOCAL_EVENT_ID;
+		else if (is_mbm_total_enabled())
+			ctx->mba_mbps_event = QOS_L3_MBM_TOTAL_EVENT_ID;
 		else
 			return invalfc(fc, msg);
 		ctx->enable_mba_mbps = true;
-- 
2.46.1

From: Tony Luck
Subject: [PATCH v7 4/4] x86/resctrl: Add new "mba_MBps_event" mount option to documentation
Date: Thu, 3 Oct 2024 12:12:28 -0700
Message-ID: <20241003191228.67541-5-tony.luck@intel.com>
In-Reply-To: <20241003191228.67541-1-tony.luck@intel.com>
References: <20241003191228.67541-1-tony.luck@intel.com>

The new mount option may be used to choose a specific memory bandwidth
monitoring event to feed the MBA Software Controller(mba_sc) feedback loop.

Resctrl will automatically switch to using total memory bandwidth on
systems that do not support monitoring local bandwidth.

Signed-off-by: Tony Luck
---
 Documentation/arch/x86/resctrl.rst | 27 +++++++++++++++++++++------
 1 file changed, 21 insertions(+), 6 deletions(-)

diff --git a/Documentation/arch/x86/resctrl.rst b/Documentation/arch/x86/resctrl.rst
index a824affd741d..ab0868713f4a 100644
--- a/Documentation/arch/x86/resctrl.rst
+++ b/Documentation/arch/x86/resctrl.rst
@@ -35,7 +35,8 @@ about the feature from resctrl's info directory.
 
 To use the feature mount the file system::
 
- # mount -t resctrl resctrl [-o cdp[,cdpl2][,mba_MBps][,debug]] /sys/fs/resctrl
+ # mount -t resctrl resctrl [-o cdp[,cdpl2][,mba_MBps] \
+ [,mba_MBps_event=[mbm_local_bytes|mbm_total_bytes]][,debug]] /sys/fs/resctrl
 
 mount options are:
 
@@ -44,8 +45,14 @@ mount options are:
 "cdpl2":
 	Enable code/data prioritization in L2 cache allocations.
 "mba_MBps":
-	Enable the MBA Software Controller(mba_sc) to specify MBA
-	bandwidth in MiBps
+	Use a software feedback loop from the memory bandwidth monitoring
+	feature to automatically adjust memory bandwidth allocation
+	throttling so that the user can specify MBA control values in MiBps.
+	Defaults to using MBM local bandwidth, but will use total bandwidth on
+	systems that do not support local bandwidth monitoring.
+"mba_MBps_event=[mbm_local_bytes|mbm_total_bytes]":
+	Enable the "mba_MBps" option with an explicitly chosen monitor
+	event as input to the software feedback loop.
 "debug":
 	Make debug files accessible. Available debug files are annotated with
 	"Available only with debug option".
@@ -561,16 +568,24 @@ increase or vary although user specified bandwidth percentage is same.
 In order to mitigate this and make the interface more user friendly,
 resctrl added support for specifying the bandwidth in MiBps as well. The
 kernel underneath would use a software feedback mechanism or a "Software
-Controller(mba_sc)" which reads the actual bandwidth using MBM counters
-and adjust the memory bandwidth percentages to ensure::
+Controller(mba_sc)" which reads the actual bandwidth using either local
+or total memory bandwidth MBM counters and adjusts the memory bandwidth
+percentages to ensure::
 
 	"actual bandwidth < user specified bandwidth".
 
 By default, the schemata would take the bandwidth percentage values
 where as user can switch to the "MBA software controller" mode using
-a mount option 'mba_MBps'. The schemata format is specified in the below
+the mount option 'mba_MBps' or explicitly choose which MBM event with
+the 'mba_MBps_event' option. The schemata format is specified in the below
 sections.
 
+The software feedback mechanism uses measurement of local
+memory bandwidth to make adjustments to throttling levels. If a system
+is running applications with poor NUMA locality users may want to use
+the "mba_MBps_event=mbm_total_bytes" mount option which will use total
+memory bandwidth measurements instead of local.
+
 L3 schemata file details (code and data prioritization disabled)
 ----------------------------------------------------------------
 With CDP disabled the L3 schemata format is::
-- 
2.46.1
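
For readers following the series, the mount-option handling in
rdt_parse_param() as it stands after patches 2/4 and 3/4 can be modelled
in user space. This is a minimal sketch under stated assumptions, not
kernel code: every sketch_* name, the SKETCH_EVT_* values and the
mba_usable flag are hypothetical stand-ins for the kernel's
rdt_fs_context, the QOS_L3_MBM_*_EVENT_ID values and the remaining
supports_mba_mbps() checks, and -1 stands in for the invalfc() error
return.

```c
#include <stdbool.h>
#include <string.h>

/* Illustrative stand-ins for QOS_L3_MBM_{TOTAL,LOCAL}_EVENT_ID. */
enum sketch_event {
	SKETCH_EVT_NONE = 0,
	SKETCH_EVT_TOTAL,
	SKETCH_EVT_LOCAL,
};

/* Hypothetical snapshot of what the platform enumerates. */
struct sketch_hw {
	bool mbm_local;		/* models is_mbm_local_enabled() */
	bool mbm_total;		/* models is_mbm_total_enabled() */
	bool mba_usable;	/* models the alloc_capable/linear/scope checks */
};

/* Reduced model of the two fields the series uses in rdt_fs_context. */
struct sketch_ctx {
	bool enable_mba_mbps;
	enum sketch_event mba_mbps_event;
};

/* Either MBM event is enough, mirroring the switch to is_mbm_enabled(). */
static bool sketch_supports_mba_mbps(const struct sketch_hw *hw)
{
	return (hw->mbm_local || hw->mbm_total) && hw->mba_usable;
}

/* "mba_MBps": prefer local bandwidth, fall back to total (patch 3/4). */
static int sketch_opt_mba_mbps(struct sketch_ctx *ctx,
			       const struct sketch_hw *hw)
{
	/* Reject unsupported systems and a second mba_MBps* option. */
	if (!sketch_supports_mba_mbps(hw) || ctx->enable_mba_mbps)
		return -1;
	if (hw->mbm_local)
		ctx->mba_mbps_event = SKETCH_EVT_LOCAL;
	else if (hw->mbm_total)
		ctx->mba_mbps_event = SKETCH_EVT_TOTAL;
	else
		return -1;
	ctx->enable_mba_mbps = true;
	return 0;
}

/* "mba_MBps_event=<name>": use the event the user names (patch 2/4). */
static int sketch_opt_mba_mbps_event(struct sketch_ctx *ctx,
				     const struct sketch_hw *hw,
				     const char *name)
{
	if (!sketch_supports_mba_mbps(hw) || ctx->enable_mba_mbps)
		return -1;
	if (!strcmp(name, "mbm_local_bytes"))
		ctx->mba_mbps_event = SKETCH_EVT_LOCAL;
	else if (!strcmp(name, "mbm_total_bytes"))
		ctx->mba_mbps_event = SKETCH_EVT_TOTAL;
	else
		return -1;	/* unknown event name */
	ctx->enable_mba_mbps = true;
	return 0;
}
```

In this model a mount with "-o mba_MBps" on a system that enumerates only
total bandwidth selects the total event, while
"-o mba_MBps_event=mbm_total_bytes" selects it regardless of whether local
is present. Because both handlers test enable_mba_mbps first, whichever of
the two options is parsed second is rejected.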