Resctrl subsystem can support two monitoring modes, "mbm_event" or
"default". In mbm_event mode, monitoring event can only accumulate data
while it is backed by a hardware counter. In "default" mode, resctrl
assumes there is a hardware counter for each event within every CTRL_MON
and MON group.
Introduce interface to switch between mbm_event and default modes.
Example:
To list the MBM monitor modes supported:
$ cat /sys/fs/resctrl/info/L3_MON/mbm_assign_mode
[mbm_event]
default
To enable the "mbm_event" monitoring mode:
$ echo "mbm_event" > /sys/fs/resctrl/info/L3_MON/mbm_assign_mode
To enable the "default" monitoring mode:
$ echo "default" > /sys/fs/resctrl/info/L3_MON/mbm_assign_mode
MBM event counters are automatically reset as part of changing the mode.
Clear both architectural and non-architectural event states to prevent
overflow conditions during the next event read. Also clear assignable
counter configuration on all the domains.
Signed-off-by: Babu Moger <babu.moger@amd.com>
---
v14: Updated the changelog to reflect the change in monitor mode naming.
Added the call resctrl_bmec_files_show() to enable/disable files
related to BMEC.
Added resctrl_set_mon_evt_cfg() to reset event configuration values
when mode is changes.
v13: Resolved the conflicts due to FS/ARCH restructure.
Introduced the new resctrl_init_evt_configuration() to initialize
the event modes and configuration values.
Added the call to resctrl_bmec_files_show() hide/show BMEC related
files.
v12: Fixed the documentation for a consistency.
Introduced mbm_cntr_free_all() and resctrl_reset_rmid_all() to clear
counters and non-architectural states when monitor mode is changed.
https://lore.kernel.org/lkml/b60b4f72-6245-46db-a126-428fb13b6310@intel.com/
v11: Changed the name of the function rdtgroup_mbm_assign_mode_write() to
resctrl_mbm_assign_mode_write().
Rewrote the commit message with context.
Added few more details in resctrl.rst about mbm_cntr_assign mode.
Re-arranged the text in resctrl.rst file.
v10: The call mbm_cntr_reset() has been moved to earlier patch.
Minor documentation update.
v9: Fixed extra spaces in user documentation.
Fixed problem changing the mode to mbm_cntr_assign mode when it is
not supported. Added extra checks to detect if systems supports it.
Used the rdtgroup_cntr_id_init to initialize cntr_id.
v8: Reset the internal counters after mbm_cntr_assign mode is changed.
Renamed rdtgroup_mbm_cntr_reset() to mbm_cntr_reset()
Updated the documentation to make text generic.
v7: Changed the interface name to mbm_assign_mode.
Removed the references of ABMC.
Added the changes to reset global and domain bitmaps.
Added the changes to reset rmid.
v6: Changed the mode name to mbm_cntr_assign.
Moved all the FS related code here.
Added changes to reset mbm_cntr_map and resctrl group counters.
v5: Change log and mode description text correction.
v4: Minor commit text changes. Keep the default to ABMC when supported.
Fixed comments to reflect changed interface "mbm_mode".
v3: New patch to address the review comments from upstream.
---
Documentation/filesystems/resctrl.rst | 23 +++++++-
fs/resctrl/internal.h | 2 +
fs/resctrl/monitor.c | 27 ++++++++++
fs/resctrl/rdtgroup.c | 78 +++++++++++++++++++++++++--
4 files changed, 126 insertions(+), 4 deletions(-)
diff --git a/Documentation/filesystems/resctrl.rst b/Documentation/filesystems/resctrl.rst
index cd82c2966ed7..7e62c7fdcefa 100644
--- a/Documentation/filesystems/resctrl.rst
+++ b/Documentation/filesystems/resctrl.rst
@@ -259,7 +259,9 @@ with the following files:
"mbm_assign_mode":
The supported monitoring modes. The enclosed brackets indicate which mode
- is enabled.
+ is enabled. The MBM events (mbm_total_bytes and/or mbm_local_bytes) associated
+ with counters may reset when "mbm_assign_mode" is changed.
+
::
# cat /sys/fs/resctrl/info/L3_MON/mbm_assign_mode
@@ -279,6 +281,15 @@ with the following files:
of counters available is described in the "num_mbm_cntrs" file. Changing the
mode may cause all counters on the resource to reset.
+ Moving to mbm_event mode require users to assign the counters to the events.
+ Otherwise, the MBM event counters will return 'Unassigned' when read.
+
+ The mode is beneficial for AMD platforms that support more CTRL_MON
+ and MON groups than available hardware counters. By default, this
+ feature is enabled on AMD platforms with the ABMC (Assignable Bandwidth
+ Monitoring Counters) capability, ensuring counters remain assigned even
+ when the corresponding RMID is not actively used by any processor.
+
"default":
In default mode, resctrl assumes there is a hardware counter for each
@@ -288,6 +299,16 @@ with the following files:
result in misleading values or display "Unavailable" if no counter is assigned
to the event.
+ * To enable "mbm_event" monitoring mode:
+ ::
+
+ # echo "mbm_event" > /sys/fs/resctrl/info/L3_MON/mbm_assign_mode
+
+ * To enable "default" monitoring mode:
+ ::
+
+ # echo "default" > /sys/fs/resctrl/info/L3_MON/mbm_assign_mode
+
"num_mbm_cntrs":
The maximum number of counter IDs (total of available and assigned counters)
in each domain when the system supports mbm_event mode.
diff --git a/fs/resctrl/internal.h b/fs/resctrl/internal.h
index 14d99c723ea5..adc9ff3efdfd 100644
--- a/fs/resctrl/internal.h
+++ b/fs/resctrl/internal.h
@@ -414,6 +414,8 @@ void resctrl_unassign_cntr_event(struct rdt_resource *r, struct rdt_mon_domain *
struct rdtgroup *rdtgrp, struct mon_evt *mevt);
int mbm_cntr_get(struct rdt_resource *r, struct rdt_mon_domain *d,
struct rdtgroup *rdtgrp, enum resctrl_event_id evtid);
+void resctrl_reset_rmid_all(struct rdt_resource *r, struct rdt_mon_domain *d);
+void mbm_cntr_free_all(struct rdt_resource *r, struct rdt_mon_domain *d);
#ifdef CONFIG_RESCTRL_FS_PSEUDO_LOCK
int rdtgroup_locksetup_enter(struct rdtgroup *rdtgrp);
diff --git a/fs/resctrl/monitor.c b/fs/resctrl/monitor.c
index 618c94cd1ad8..504b869570e6 100644
--- a/fs/resctrl/monitor.c
+++ b/fs/resctrl/monitor.c
@@ -1045,6 +1045,33 @@ static void mbm_cntr_free(struct rdt_mon_domain *d, int cntr_id)
memset(&d->cntr_cfg[cntr_id], 0, sizeof(struct mbm_cntr_cfg));
}
+/**
+ * mbm_cntr_free_all() - Clear all the counter ID configuration details in the
+ * domain @d. Called when mbm_assign_mode is changed.
+ */
+void mbm_cntr_free_all(struct rdt_resource *r, struct rdt_mon_domain *d)
+{
+ memset(d->cntr_cfg, 0, sizeof(*d->cntr_cfg) * r->mon.num_mbm_cntrs);
+}
+
+/**
+ * resctrl_reset_rmid_all() - Reset all non-architecture states for all the
+ * supported RMIDs.
+ */
+void resctrl_reset_rmid_all(struct rdt_resource *r, struct rdt_mon_domain *d)
+{
+ u32 idx_limit = resctrl_arch_system_num_rmid_idx();
+ enum resctrl_event_id evt;
+ int idx;
+
+ for_each_mbm_event_id(evt) {
+ if (!resctrl_is_mon_event_enabled(evt))
+ continue;
+ idx = MBM_STATE_IDX(evt);
+ memset(d->mbm_states[idx], 0, sizeof(struct mbm_state) * idx_limit);
+ }
+}
+
/**
* resctrl_alloc_config_cntr() - Allocate a counter ID and configure it for the
* event pointed to by @mevt and the resctrl group @rdtgrp within the domain @d.
diff --git a/fs/resctrl/rdtgroup.c b/fs/resctrl/rdtgroup.c
index 8c67e0897f25..6bb61fcf8673 100644
--- a/fs/resctrl/rdtgroup.c
+++ b/fs/resctrl/rdtgroup.c
@@ -1876,6 +1876,77 @@ static int resctrl_mbm_assign_mode_show(struct kernfs_open_file *of,
return 0;
}
+static ssize_t resctrl_mbm_assign_mode_write(struct kernfs_open_file *of,
+ char *buf, size_t nbytes, loff_t off)
+{
+ struct rdt_resource *r = rdt_kn_parent_priv(of->kn);
+ struct rdt_mon_domain *d;
+ int ret = 0;
+ bool enable;
+
+ /* Valid input requires a trailing newline */
+ if (nbytes == 0 || buf[nbytes - 1] != '\n')
+ return -EINVAL;
+
+ buf[nbytes - 1] = '\0';
+
+ cpus_read_lock();
+ mutex_lock(&rdtgroup_mutex);
+
+ rdt_last_cmd_clear();
+
+ if (!strcmp(buf, "default")) {
+ enable = 0;
+ } else if (!strcmp(buf, "mbm_event")) {
+ if (r->mon.mbm_cntr_assignable) {
+ enable = 1;
+ } else {
+ ret = -EINVAL;
+ rdt_last_cmd_puts("mbm_event mode is not supported\n");
+ goto write_exit;
+ }
+ } else {
+ ret = -EINVAL;
+ rdt_last_cmd_puts("Unsupported assign mode\n");
+ goto write_exit;
+ }
+
+ if (enable != resctrl_arch_mbm_cntr_assign_enabled(r)) {
+ ret = resctrl_arch_mbm_cntr_assign_set(r, enable);
+ if (ret)
+ goto write_exit;
+
+ /* Update the visibility of BMEC related files */
+ resctrl_bmec_files_show(r, !enable);
+
+ /*
+ * Initialize the default memory transaction values for
+ * total and local events.
+ */
+ if (resctrl_is_mon_event_enabled(QOS_L3_MBM_TOTAL_EVENT_ID))
+ resctrl_set_mon_evt_cfg(QOS_L3_MBM_TOTAL_EVENT_ID,
+ MAX_EVT_CONFIG_BITS);
+ if (resctrl_is_mon_event_enabled(QOS_L3_MBM_LOCAL_EVENT_ID))
+ resctrl_set_mon_evt_cfg(QOS_L3_MBM_LOCAL_EVENT_ID,
+ READS_TO_LOCAL_MEM |
+ READS_TO_LOCAL_S_MEM |
+ NON_TEMP_WRITE_TO_LOCAL_MEM);
+ /*
+ * Reset all the non-achitectural RMID state and assignable counters.
+ */
+ list_for_each_entry(d, &r->mon_domains, hdr.list) {
+ mbm_cntr_free_all(r, d);
+ resctrl_reset_rmid_all(r, d);
+ }
+ }
+
+write_exit:
+ mutex_unlock(&rdtgroup_mutex);
+ cpus_read_unlock();
+
+ return ret ?: nbytes;
+}
+
static int resctrl_num_mbm_cntrs_show(struct kernfs_open_file *of,
struct seq_file *s, void *v)
{
@@ -2203,8 +2274,8 @@ static int resctrl_process_assign(struct rdt_resource *r, struct rdtgroup *rdtgr
struct mon_evt *mevt;
int assign_state;
char domain[10];
+ int ret = 0;
bool found;
- int ret;
mevt = mbm_get_mon_event_by_name(r, event);
if (!mevt) {
@@ -2249,7 +2320,7 @@ static int resctrl_process_assign(struct rdt_resource *r, struct rdtgroup *rdtgr
switch (assign_state) {
case ASSIGN_NONE:
- ret = resctrl_unassign_cntr_event(r, d, rdtgrp, mevt);
+ resctrl_unassign_cntr_event(r, d, rdtgrp, mevt);
break;
case ASSIGN_EXCLUSIVE:
ret = resctrl_assign_cntr_event(r, d, rdtgrp, mevt);
@@ -2463,9 +2534,10 @@ static struct rftype res_common_files[] = {
},
{
.name = "mbm_assign_mode",
- .mode = 0444,
+ .mode = 0644,
.kf_ops = &rdtgroup_kf_single_ops,
.seq_show = resctrl_mbm_assign_mode_show,
+ .write = resctrl_mbm_assign_mode_write,
.fflags = RFTYPE_MON_INFO | RFTYPE_RES_CACHE,
},
{
--
2.34.1
Hi Babu, On 6/13/25 2:05 PM, Babu Moger wrote: > Resctrl subsystem can support two monitoring modes, "mbm_event" or > "default". In mbm_event mode, monitoring event can only accumulate data > while it is backed by a hardware counter. In "default" mode, resctrl > assumes there is a hardware counter for each event within every CTRL_MON > and MON group. > > Introduce interface to switch between mbm_event and default modes. "Introduce interface" -> "Introduce mbm_assign_mode resctrl file" > > Example: > To list the MBM monitor modes supported: > $ cat /sys/fs/resctrl/info/L3_MON/mbm_assign_mode > [mbm_event] > default > > To enable the "mbm_event" monitoring mode: > $ echo "mbm_event" > /sys/fs/resctrl/info/L3_MON/mbm_assign_mode > > To enable the "default" monitoring mode: > $ echo "default" > /sys/fs/resctrl/info/L3_MON/mbm_assign_mode > > MBM event counters are automatically reset as part of changing the mode. > Clear both architectural and non-architectural event states to prevent > overflow conditions during the next event read. Also clear assignable > counter configuration on all the domains. > > Signed-off-by: Babu Moger <babu.moger@amd.com> > --- ... > --- > Documentation/filesystems/resctrl.rst | 23 +++++++- > fs/resctrl/internal.h | 2 + > fs/resctrl/monitor.c | 27 ++++++++++ > fs/resctrl/rdtgroup.c | 78 +++++++++++++++++++++++++-- > 4 files changed, 126 insertions(+), 4 deletions(-) > > diff --git a/Documentation/filesystems/resctrl.rst b/Documentation/filesystems/resctrl.rst > index cd82c2966ed7..7e62c7fdcefa 100644 > --- a/Documentation/filesystems/resctrl.rst > +++ b/Documentation/filesystems/resctrl.rst > @@ -259,7 +259,9 @@ with the following files: > > "mbm_assign_mode": > The supported monitoring modes. The enclosed brackets indicate which mode > - is enabled. > + is enabled. The MBM events (mbm_total_bytes and/or mbm_local_bytes) associated Since there may be more events in future I think the "(mbm_total_bytes and/or mbm_local_bytes)" can be dropped. > + with counters may reset when "mbm_assign_mode" is changed. > + > :: > > # cat /sys/fs/resctrl/info/L3_MON/mbm_assign_mode > @@ -279,6 +281,15 @@ with the following files: > of counters available is described in the "num_mbm_cntrs" file. Changing the > mode may cause all counters on the resource to reset. > > + Moving to mbm_event mode require users to assign the counters to the events. "Moving to mbm_event mode require" -> "mbm_event counter assignment mode requires" > + Otherwise, the MBM event counters will return 'Unassigned' when read. > + > + The mode is beneficial for AMD platforms that support more CTRL_MON > + and MON groups than available hardware counters. By default, this > + feature is enabled on AMD platforms with the ABMC (Assignable Bandwidth > + Monitoring Counters) capability, ensuring counters remain assigned even > + when the corresponding RMID is not actively used by any processor. > + > "default": > > In default mode, resctrl assumes there is a hardware counter for each > @@ -288,6 +299,16 @@ with the following files: > result in misleading values or display "Unavailable" if no counter is assigned > to the event. > > + * To enable "mbm_event" monitoring mode: > + :: > + > + # echo "mbm_event" > /sys/fs/resctrl/info/L3_MON/mbm_assign_mode > + > + * To enable "default" monitoring mode: > + :: > + > + # echo "default" > /sys/fs/resctrl/info/L3_MON/mbm_assign_mode > + > "num_mbm_cntrs": > The maximum number of counter IDs (total of available and assigned counters) > in each domain when the system supports mbm_event mode. > diff --git a/fs/resctrl/internal.h b/fs/resctrl/internal.h > index 14d99c723ea5..adc9ff3efdfd 100644 > --- a/fs/resctrl/internal.h > +++ b/fs/resctrl/internal.h > @@ -414,6 +414,8 @@ void resctrl_unassign_cntr_event(struct rdt_resource *r, struct rdt_mon_domain * > struct rdtgroup *rdtgrp, struct mon_evt *mevt); > int mbm_cntr_get(struct rdt_resource *r, struct rdt_mon_domain *d, > struct rdtgroup *rdtgrp, enum resctrl_event_id evtid); > +void resctrl_reset_rmid_all(struct rdt_resource *r, struct rdt_mon_domain *d); > +void mbm_cntr_free_all(struct rdt_resource *r, struct rdt_mon_domain *d); > > #ifdef CONFIG_RESCTRL_FS_PSEUDO_LOCK > int rdtgroup_locksetup_enter(struct rdtgroup *rdtgrp); > diff --git a/fs/resctrl/monitor.c b/fs/resctrl/monitor.c > index 618c94cd1ad8..504b869570e6 100644 > --- a/fs/resctrl/monitor.c > +++ b/fs/resctrl/monitor.c > @@ -1045,6 +1045,33 @@ static void mbm_cntr_free(struct rdt_mon_domain *d, int cntr_id) > memset(&d->cntr_cfg[cntr_id], 0, sizeof(struct mbm_cntr_cfg)); > } > > +/** > + * mbm_cntr_free_all() - Clear all the counter ID configuration details in the > + * domain @d. Called when mbm_assign_mode is changed. > + */ > +void mbm_cntr_free_all(struct rdt_resource *r, struct rdt_mon_domain *d) > +{ > + memset(d->cntr_cfg, 0, sizeof(*d->cntr_cfg) * r->mon.num_mbm_cntrs); > +} > + > +/** > + * resctrl_reset_rmid_all() - Reset all non-architecture states for all the > + * supported RMIDs. > + */ > +void resctrl_reset_rmid_all(struct rdt_resource *r, struct rdt_mon_domain *d) struct rdt_resource *r is unused? At this time it seems unnecessary to check if an MBM event belongs to particular resource since at this point I expect only L3 is possible. Even so, to be consistent and robust I think it would make flows easier to understand by always ensureing that mon_evt::rid matches expected resource. > +{ > + u32 idx_limit = resctrl_arch_system_num_rmid_idx(); > + enum resctrl_event_id evt; > + int idx; > + > + for_each_mbm_event_id(evt) { > + if (!resctrl_is_mon_event_enabled(evt)) > + continue; > + idx = MBM_STATE_IDX(evt); > + memset(d->mbm_states[idx], 0, sizeof(struct mbm_state) * idx_limit); sizeof(*d->mbm_states[0])? > + } > +} > + > /** > * resctrl_alloc_config_cntr() - Allocate a counter ID and configure it for the > * event pointed to by @mevt and the resctrl group @rdtgrp within the domain @d. > diff --git a/fs/resctrl/rdtgroup.c b/fs/resctrl/rdtgroup.c > index 8c67e0897f25..6bb61fcf8673 100644 > --- a/fs/resctrl/rdtgroup.c > +++ b/fs/resctrl/rdtgroup.c > @@ -1876,6 +1876,77 @@ static int resctrl_mbm_assign_mode_show(struct kernfs_open_file *of, > return 0; > } > > +static ssize_t resctrl_mbm_assign_mode_write(struct kernfs_open_file *of, > + char *buf, size_t nbytes, loff_t off) > +{ > + struct rdt_resource *r = rdt_kn_parent_priv(of->kn); > + struct rdt_mon_domain *d; > + int ret = 0; > + bool enable; > + > + /* Valid input requires a trailing newline */ > + if (nbytes == 0 || buf[nbytes - 1] != '\n') > + return -EINVAL; > + > + buf[nbytes - 1] = '\0'; > + > + cpus_read_lock(); > + mutex_lock(&rdtgroup_mutex); > + > + rdt_last_cmd_clear(); > + > + if (!strcmp(buf, "default")) { > + enable = 0; > + } else if (!strcmp(buf, "mbm_event")) { > + if (r->mon.mbm_cntr_assignable) { > + enable = 1; > + } else { > + ret = -EINVAL; > + rdt_last_cmd_puts("mbm_event mode is not supported\n"); > + goto write_exit; write_exit -> out_unlock > + } > + } else { > + ret = -EINVAL; > + rdt_last_cmd_puts("Unsupported assign mode\n"); > + goto write_exit; > + } > + > + if (enable != resctrl_arch_mbm_cntr_assign_enabled(r)) { > + ret = resctrl_arch_mbm_cntr_assign_set(r, enable); > + if (ret) > + goto write_exit; > + > + /* Update the visibility of BMEC related files */ > + resctrl_bmec_files_show(r, !enable); > + > + /* > + * Initialize the default memory transaction values for > + * total and local events. > + */ > + if (resctrl_is_mon_event_enabled(QOS_L3_MBM_TOTAL_EVENT_ID)) > + resctrl_set_mon_evt_cfg(QOS_L3_MBM_TOTAL_EVENT_ID, > + MAX_EVT_CONFIG_BITS); > + if (resctrl_is_mon_event_enabled(QOS_L3_MBM_LOCAL_EVENT_ID)) > + resctrl_set_mon_evt_cfg(QOS_L3_MBM_LOCAL_EVENT_ID, > + READS_TO_LOCAL_MEM | > + READS_TO_LOCAL_S_MEM | > + NON_TEMP_WRITE_TO_LOCAL_MEM); Nice, yes, this belongs in resctrl fs. > + /* > + * Reset all the non-achitectural RMID state and assignable counters. > + */ > + list_for_each_entry(d, &r->mon_domains, hdr.list) { > + mbm_cntr_free_all(r, d); > + resctrl_reset_rmid_all(r, d); > + } > + } > + > +write_exit: > + mutex_unlock(&rdtgroup_mutex); > + cpus_read_unlock(); > + > + return ret ?: nbytes; > +} > + > static int resctrl_num_mbm_cntrs_show(struct kernfs_open_file *of, > struct seq_file *s, void *v) > { > @@ -2203,8 +2274,8 @@ static int resctrl_process_assign(struct rdt_resource *r, struct rdtgroup *rdtgr > struct mon_evt *mevt; > int assign_state; > char domain[10]; > + int ret = 0; > bool found; > - int ret; > > mevt = mbm_get_mon_event_by_name(r, event); > if (!mevt) { > @@ -2249,7 +2320,7 @@ static int resctrl_process_assign(struct rdt_resource *r, struct rdtgroup *rdtgr > > switch (assign_state) { > case ASSIGN_NONE: > - ret = resctrl_unassign_cntr_event(r, d, rdtgrp, mevt); > + resctrl_unassign_cntr_event(r, d, rdtgrp, mevt); > break; > case ASSIGN_EXCLUSIVE: > ret = resctrl_assign_cntr_event(r, d, rdtgrp, mevt); Two stray hunks? > @@ -2463,9 +2534,10 @@ static struct rftype res_common_files[] = { > }, > { > .name = "mbm_assign_mode", > - .mode = 0444, > + .mode = 0644, > .kf_ops = &rdtgroup_kf_single_ops, > .seq_show = resctrl_mbm_assign_mode_show, > + .write = resctrl_mbm_assign_mode_write, > .fflags = RFTYPE_MON_INFO | RFTYPE_RES_CACHE, > }, > { Reinette
Hi Reinette, On 6/25/25 18:40, Reinette Chatre wrote: > Hi Babu, > > On 6/13/25 2:05 PM, Babu Moger wrote: >> Resctrl subsystem can support two monitoring modes, "mbm_event" or >> "default". In mbm_event mode, monitoring event can only accumulate data >> while it is backed by a hardware counter. In "default" mode, resctrl >> assumes there is a hardware counter for each event within every CTRL_MON >> and MON group. >> >> Introduce interface to switch between mbm_event and default modes. > > "Introduce interface" -> "Introduce mbm_assign_mode resctrl file" > Sure. >> >> Example: >> To list the MBM monitor modes supported: >> $ cat /sys/fs/resctrl/info/L3_MON/mbm_assign_mode >> [mbm_event] >> default >> >> To enable the "mbm_event" monitoring mode: >> $ echo "mbm_event" > /sys/fs/resctrl/info/L3_MON/mbm_assign_mode >> >> To enable the "default" monitoring mode: >> $ echo "default" > /sys/fs/resctrl/info/L3_MON/mbm_assign_mode >> >> MBM event counters are automatically reset as part of changing the mode. >> Clear both architectural and non-architectural event states to prevent >> overflow conditions during the next event read. Also clear assignable >> counter configuration on all the domains. >> >> Signed-off-by: Babu Moger <babu.moger@amd.com> >> --- > > ... > >> --- >> Documentation/filesystems/resctrl.rst | 23 +++++++- >> fs/resctrl/internal.h | 2 + >> fs/resctrl/monitor.c | 27 ++++++++++ >> fs/resctrl/rdtgroup.c | 78 +++++++++++++++++++++++++-- >> 4 files changed, 126 insertions(+), 4 deletions(-) >> >> diff --git a/Documentation/filesystems/resctrl.rst b/Documentation/filesystems/resctrl.rst >> index cd82c2966ed7..7e62c7fdcefa 100644 >> --- a/Documentation/filesystems/resctrl.rst >> +++ b/Documentation/filesystems/resctrl.rst >> @@ -259,7 +259,9 @@ with the following files: >> >> "mbm_assign_mode": >> The supported monitoring modes. The enclosed brackets indicate which mode >> - is enabled. >> + is enabled. The MBM events (mbm_total_bytes and/or mbm_local_bytes) associated > > Since there may be more events in future I think the "(mbm_total_bytes and/or > mbm_local_bytes)" can be dropped. Sure. > >> + with counters may reset when "mbm_assign_mode" is changed. >> + >> :: >> >> # cat /sys/fs/resctrl/info/L3_MON/mbm_assign_mode >> @@ -279,6 +281,15 @@ with the following files: >> of counters available is described in the "num_mbm_cntrs" file. Changing the >> mode may cause all counters on the resource to reset. >> >> + Moving to mbm_event mode require users to assign the counters to the events. > > "Moving to mbm_event mode require" -> "mbm_event counter assignment mode requires" Sure. > >> + Otherwise, the MBM event counters will return 'Unassigned' when read. >> + >> + The mode is beneficial for AMD platforms that support more CTRL_MON >> + and MON groups than available hardware counters. By default, this >> + feature is enabled on AMD platforms with the ABMC (Assignable Bandwidth >> + Monitoring Counters) capability, ensuring counters remain assigned even >> + when the corresponding RMID is not actively used by any processor. >> + >> "default": >> >> In default mode, resctrl assumes there is a hardware counter for each >> @@ -288,6 +299,16 @@ with the following files: >> result in misleading values or display "Unavailable" if no counter is assigned >> to the event. >> >> + * To enable "mbm_event" monitoring mode: >> + :: >> + >> + # echo "mbm_event" > /sys/fs/resctrl/info/L3_MON/mbm_assign_mode >> + >> + * To enable "default" monitoring mode: >> + :: >> + >> + # echo "default" > /sys/fs/resctrl/info/L3_MON/mbm_assign_mode >> + >> "num_mbm_cntrs": >> The maximum number of counter IDs (total of available and assigned counters) >> in each domain when the system supports mbm_event mode. >> diff --git a/fs/resctrl/internal.h b/fs/resctrl/internal.h >> index 14d99c723ea5..adc9ff3efdfd 100644 >> --- a/fs/resctrl/internal.h >> +++ b/fs/resctrl/internal.h >> @@ -414,6 +414,8 @@ void resctrl_unassign_cntr_event(struct rdt_resource *r, struct rdt_mon_domain * >> struct rdtgroup *rdtgrp, struct mon_evt *mevt); >> int mbm_cntr_get(struct rdt_resource *r, struct rdt_mon_domain *d, >> struct rdtgroup *rdtgrp, enum resctrl_event_id evtid); >> +void resctrl_reset_rmid_all(struct rdt_resource *r, struct rdt_mon_domain *d); >> +void mbm_cntr_free_all(struct rdt_resource *r, struct rdt_mon_domain *d); >> >> #ifdef CONFIG_RESCTRL_FS_PSEUDO_LOCK >> int rdtgroup_locksetup_enter(struct rdtgroup *rdtgrp); >> diff --git a/fs/resctrl/monitor.c b/fs/resctrl/monitor.c >> index 618c94cd1ad8..504b869570e6 100644 >> --- a/fs/resctrl/monitor.c >> +++ b/fs/resctrl/monitor.c >> @@ -1045,6 +1045,33 @@ static void mbm_cntr_free(struct rdt_mon_domain *d, int cntr_id) >> memset(&d->cntr_cfg[cntr_id], 0, sizeof(struct mbm_cntr_cfg)); >> } >> >> +/** >> + * mbm_cntr_free_all() - Clear all the counter ID configuration details in the >> + * domain @d. Called when mbm_assign_mode is changed. >> + */ >> +void mbm_cntr_free_all(struct rdt_resource *r, struct rdt_mon_domain *d) >> +{ >> + memset(d->cntr_cfg, 0, sizeof(*d->cntr_cfg) * r->mon.num_mbm_cntrs); >> +} >> + >> +/** >> + * resctrl_reset_rmid_all() - Reset all non-architecture states for all the >> + * supported RMIDs. >> + */ >> +void resctrl_reset_rmid_all(struct rdt_resource *r, struct rdt_mon_domain *d) > > struct rdt_resource *r is unused? At this time it seems unnecessary to check if > an MBM event belongs to particular resource since at this point I expect only L3 > is possible. Even so, to be consistent and robust I think it would make flows > easier to understand by always ensureing that mon_evt::rid matches > expected resource. Agree. But in this we dont access mon_evt structure here. > >> +{ >> + u32 idx_limit = resctrl_arch_system_num_rmid_idx(); >> + enum resctrl_event_id evt; >> + int idx; >> + >> + for_each_mbm_event_id(evt) { >> + if (!resctrl_is_mon_event_enabled(evt)) >> + continue; >> + idx = MBM_STATE_IDX(evt); >> + memset(d->mbm_states[idx], 0, sizeof(struct mbm_state) * idx_limit); > > sizeof(*d->mbm_states[0])? > Sure. >> + } >> +} >> + >> /** >> * resctrl_alloc_config_cntr() - Allocate a counter ID and configure it for the >> * event pointed to by @mevt and the resctrl group @rdtgrp within the domain @d. >> diff --git a/fs/resctrl/rdtgroup.c b/fs/resctrl/rdtgroup.c >> index 8c67e0897f25..6bb61fcf8673 100644 >> --- a/fs/resctrl/rdtgroup.c >> +++ b/fs/resctrl/rdtgroup.c >> @@ -1876,6 +1876,77 @@ static int resctrl_mbm_assign_mode_show(struct kernfs_open_file *of, >> return 0; >> } >> >> +static ssize_t resctrl_mbm_assign_mode_write(struct kernfs_open_file *of, >> + char *buf, size_t nbytes, loff_t off) >> +{ >> + struct rdt_resource *r = rdt_kn_parent_priv(of->kn); >> + struct rdt_mon_domain *d; >> + int ret = 0; >> + bool enable; >> + >> + /* Valid input requires a trailing newline */ >> + if (nbytes == 0 || buf[nbytes - 1] != '\n') >> + return -EINVAL; >> + >> + buf[nbytes - 1] = '\0'; >> + >> + cpus_read_lock(); >> + mutex_lock(&rdtgroup_mutex); >> + >> + rdt_last_cmd_clear(); >> + >> + if (!strcmp(buf, "default")) { >> + enable = 0; >> + } else if (!strcmp(buf, "mbm_event")) { >> + if (r->mon.mbm_cntr_assignable) { >> + enable = 1; >> + } else { >> + ret = -EINVAL; >> + rdt_last_cmd_puts("mbm_event mode is not supported\n"); >> + goto write_exit; > > write_exit -> out_unlock > Sure. >> + } >> + } else { >> + ret = -EINVAL; >> + rdt_last_cmd_puts("Unsupported assign mode\n"); >> + goto write_exit; >> + } >> + >> + if (enable != resctrl_arch_mbm_cntr_assign_enabled(r)) { >> + ret = resctrl_arch_mbm_cntr_assign_set(r, enable); >> + if (ret) >> + goto write_exit; >> + >> + /* Update the visibility of BMEC related files */ >> + resctrl_bmec_files_show(r, !enable); >> + >> + /* >> + * Initialize the default memory transaction values for >> + * total and local events. >> + */ >> + if (resctrl_is_mon_event_enabled(QOS_L3_MBM_TOTAL_EVENT_ID)) >> + resctrl_set_mon_evt_cfg(QOS_L3_MBM_TOTAL_EVENT_ID, >> + MAX_EVT_CONFIG_BITS); >> + if (resctrl_is_mon_event_enabled(QOS_L3_MBM_LOCAL_EVENT_ID)) >> + resctrl_set_mon_evt_cfg(QOS_L3_MBM_LOCAL_EVENT_ID, >> + READS_TO_LOCAL_MEM | >> + READS_TO_LOCAL_S_MEM | >> + NON_TEMP_WRITE_TO_LOCAL_MEM); > > Nice, yes, this belongs in resctrl fs. > >> + /* >> + * Reset all the non-achitectural RMID state and assignable counters. >> + */ >> + list_for_each_entry(d, &r->mon_domains, hdr.list) { >> + mbm_cntr_free_all(r, d); >> + resctrl_reset_rmid_all(r, d); >> + } >> + } >> + >> +write_exit: >> + mutex_unlock(&rdtgroup_mutex); >> + cpus_read_unlock(); >> + >> + return ret ?: nbytes; >> +} >> + >> static int resctrl_num_mbm_cntrs_show(struct kernfs_open_file *of, >> struct seq_file *s, void *v) >> { >> @@ -2203,8 +2274,8 @@ static int resctrl_process_assign(struct rdt_resource *r, struct rdtgroup *rdtgr >> struct mon_evt *mevt; >> int assign_state; >> char domain[10]; >> + int ret = 0; >> bool found; >> - int ret; >> >> mevt = mbm_get_mon_event_by_name(r, event); >> if (!mevt) { >> @@ -2249,7 +2320,7 @@ static int resctrl_process_assign(struct rdt_resource *r, struct rdtgroup *rdtgr >> >> switch (assign_state) { >> case ASSIGN_NONE: >> - ret = resctrl_unassign_cntr_event(r, d, rdtgrp, mevt); >> + resctrl_unassign_cntr_event(r, d, rdtgrp, mevt); >> break; >> case ASSIGN_EXCLUSIVE: >> ret = resctrl_assign_cntr_event(r, d, rdtgrp, mevt); > > Two stray hunks? > Yes. Fixed it. >> @@ -2463,9 +2534,10 @@ static struct rftype res_common_files[] = { >> }, >> { >> .name = "mbm_assign_mode", >> - .mode = 0444, >> + .mode = 0644, >> .kf_ops = &rdtgroup_kf_single_ops, >> .seq_show = resctrl_mbm_assign_mode_show, >> + .write = resctrl_mbm_assign_mode_write, >> .fflags = RFTYPE_MON_INFO | RFTYPE_RES_CACHE, >> }, >> { > > Reinette > -- Thanks Babu Moger
© 2016 - 2025 Red Hat, Inc.