When "mbm_event" counter assignment mode is supported the
/sys/fs/resctrl/info/L3_MON/event_configs directory contains a
sub-directory for each MBM event that can be assigned to a counter.
The MBM event sub-directory contains a file named "event_filter" that
is used to view and modify which memory transactions the MBM event is
configured with.
Create /sys/fs/resctrl/info/L3_MON/event_configs directory on resctrl
mount and pre-populate it with directories for the two existing MBM events:
mbm_total_bytes and mbm_local_bytes. Create the "event_filter" file within
each MBM event directory with the needed *show() that displays the memory
transactions with which the MBM event is configured.
Example:
$ mount -t resctrl resctrl /sys/fs/resctrl
$ cd /sys/fs/resctrl/
$ cat info/L3_MON/event_configs/mbm_total_bytes/event_filter
local_reads,remote_reads,local_non_temporal_writes,
remote_non_temporal_writes,local_reads_slow_memory,
remote_reads_slow_memory,dirty_victim_writes_all
$ cat info/L3_MON/event_configs/mbm_local_bytes/event_filter
local_reads,local_non_temporal_writes,local_reads_slow_memory
Signed-off-by: Babu Moger <babu.moger@amd.com>
---
v16: Moved event_filter_show() to fs/resctrl/monitor.c
Changed the goto label out_config to out.
Added rdtgroup_mutex in event_filter_show().
Removed extern for mbm_transactions. Not required.
0025-fs-resctrl-Add-event-configuration-directory-under
0025-fs-resctrl-Add-event-configuration-directory-under
0025-fs-resctrl-Add-event-configuration-directory-under
Added prototype rdt_kn_parent_priv() in so it can be called from monitor.c
v15: Fixed the event_filter display with proper spacing.
Updated the changelog.
Changed the function name resctrl_mkdir_counter_configs() to
resctrl_mkdir_event_configs().
Called resctrl_mkdir_event_configs from rdtgroup_mkdir_info_resdir().
It avoids the call kernfs_find_and_get() to get the node for info directory.
Used for_each_mon_event() where applicable.
v14: Updated the changelog with context. Thanks to Reinette.
Changed the name of directory to event_configs from counter_config.
Updated user doc about the memory transactions supported by assignment.
Removed mbm_mode from struct mon_evt. Not required anymore.
v13: Updated user doc (resctrl.rst).
Changed the name of the function resctrl_mkdir_info_configs to
resctrl_mkdir_counter_configs().
Replaced seq_puts() with seq_putc() where applicable.
Removed RFTYPE_MON_CONFIG definition. Not required.
Changed the name of the flag RFTYPE_CONFIG to RFTYPE_ASSIGN_CONFIG.
Reinette suggested RFTYPE_MBM_EVENT_CONFIG but RFTYPE_ASSIGN_CONFIG
seemed shorter and pricise.
The configuration is created using evt_list.
Resolved conflicts caused by the recent FS/ARCH code restructure.
The monitor.c/rdtgroup.c files have been split between the FS and ARCH directories.
v12: New patch to hold the MBM event configurations for mbm_cntr_assign mode.
---
Documentation/filesystems/resctrl.rst | 32 +++++++++++++++
fs/resctrl/internal.h | 6 +++
fs/resctrl/monitor.c | 24 +++++++++++
fs/resctrl/rdtgroup.c | 58 ++++++++++++++++++++++++++-
4 files changed, 119 insertions(+), 1 deletion(-)
diff --git a/Documentation/filesystems/resctrl.rst b/Documentation/filesystems/resctrl.rst
index 4c24c5f3f4c1..3dfc177f9792 100644
--- a/Documentation/filesystems/resctrl.rst
+++ b/Documentation/filesystems/resctrl.rst
@@ -310,6 +310,38 @@ with the following files:
# cat /sys/fs/resctrl/info/L3_MON/available_mbm_cntrs
0=30;1=30
+"event_configs":
+ Directory that exists when "mbm_event" counter assignment mode is supported.
+ Contains sub-directory for each MBM event that can be assigned to a counter.
+
+ Two MBM events are supported by default: mbm_local_bytes and mbm_total_bytes.
+ Each MBM event's sub-directory contains a file named "event_filter" that is
+ used to view and modify which memory transactions the MBM event is configured
+ with.
+
+ List of memory transaction types supported:
+
+ ========================== ========================================================
+ Name Description
+ ========================== ========================================================
+ dirty_victim_writes_all Dirty Victims from the QOS domain to all types of memory
+ remote_reads_slow_memory Reads to slow memory in the non-local NUMA domain
+ local_reads_slow_memory Reads to slow memory in the local NUMA domain
+ remote_non_temporal_writes Non-temporal writes to non-local NUMA domain
+ local_non_temporal_writes Non-temporal writes to local NUMA domain
+ remote_reads Reads to memory in the non-local NUMA domain
+ local_reads Reads to memory in the local NUMA domain
+ ========================== ========================================================
+
+ For example::
+
+ # cat /sys/fs/resctrl/info/L3_MON/event_configs/mbm_total_bytes/event_filter
+ local_reads,remote_reads,local_non_temporal_writes,remote_non_temporal_writes,
+ local_reads_slow_memory,remote_reads_slow_memory,dirty_victim_writes_all
+
+ # cat /sys/fs/resctrl/info/L3_MON/event_configs/mbm_local_bytes/event_filter
+ local_reads,local_non_temporal_writes,local_reads_slow_memory
+
"max_threshold_occupancy":
Read/write file provides the largest value (in
bytes) at which a previously used LLC_occupancy
diff --git a/fs/resctrl/internal.h b/fs/resctrl/internal.h
index 693268bcbad2..e082d8718199 100644
--- a/fs/resctrl/internal.h
+++ b/fs/resctrl/internal.h
@@ -252,6 +252,8 @@ struct mbm_transaction {
#define RFTYPE_DEBUG BIT(10)
+#define RFTYPE_ASSIGN_CONFIG BIT(11)
+
#define RFTYPE_CTRL_INFO (RFTYPE_INFO | RFTYPE_CTRL)
#define RFTYPE_MON_INFO (RFTYPE_INFO | RFTYPE_MON)
@@ -408,6 +410,10 @@ void rdtgroup_unassign_cntr_event(struct rdt_mon_domain *d, struct rdtgroup *rdt
int mbm_cntr_get(struct rdt_resource *r, struct rdt_mon_domain *d,
struct rdtgroup *rdtgrp, enum resctrl_event_id evtid);
+void *rdt_kn_parent_priv(struct kernfs_node *kn);
+
+int event_filter_show(struct kernfs_open_file *of, struct seq_file *seq, void *v);
+
#ifdef CONFIG_RESCTRL_FS_PSEUDO_LOCK
int rdtgroup_locksetup_enter(struct rdtgroup *rdtgrp);
diff --git a/fs/resctrl/monitor.c b/fs/resctrl/monitor.c
index 16bcfeeb89e6..fa5f63126682 100644
--- a/fs/resctrl/monitor.c
+++ b/fs/resctrl/monitor.c
@@ -929,6 +929,29 @@ struct mbm_transaction mbm_transactions[NUM_MBM_TRANSACTIONS] = {
{"dirty_victim_writes_all", DIRTY_VICTIMS_TO_ALL_MEM},
};
+int event_filter_show(struct kernfs_open_file *of, struct seq_file *seq, void *v)
+{
+ struct mon_evt *mevt = rdt_kn_parent_priv(of->kn);
+ bool sep = false;
+ int i;
+
+ mutex_lock(&rdtgroup_mutex);
+
+ for (i = 0; i < NUM_MBM_TRANSACTIONS; i++) {
+ if (mevt->evt_cfg & mbm_transactions[i].val) {
+ if (sep)
+ seq_putc(seq, ',');
+ seq_printf(seq, "%s", mbm_transactions[i].name);
+ sep = true;
+ }
+ }
+ seq_putc(seq, '\n');
+
+ mutex_unlock(&rdtgroup_mutex);
+
+ return 0;
+}
+
/**
* resctrl_mon_resource_init() - Initialise global monitoring structures.
*
@@ -982,6 +1005,7 @@ int resctrl_mon_resource_init(void)
RFTYPE_MON_INFO | RFTYPE_RES_CACHE);
resctrl_file_fflags_init("available_mbm_cntrs",
RFTYPE_MON_INFO | RFTYPE_RES_CACHE);
+ resctrl_file_fflags_init("event_filter", RFTYPE_ASSIGN_CONFIG);
}
return 0;
diff --git a/fs/resctrl/rdtgroup.c b/fs/resctrl/rdtgroup.c
index 15d10c346307..11fc8e362ead 100644
--- a/fs/resctrl/rdtgroup.c
+++ b/fs/resctrl/rdtgroup.c
@@ -975,7 +975,7 @@ static int rdt_last_cmd_status_show(struct kernfs_open_file *of,
return 0;
}
-static void *rdt_kn_parent_priv(struct kernfs_node *kn)
+void *rdt_kn_parent_priv(struct kernfs_node *kn)
{
/*
* The parent pointer is only valid within RCU section since it can be
@@ -2019,6 +2019,12 @@ static struct rftype res_common_files[] = {
.seq_show = mbm_local_bytes_config_show,
.write = mbm_local_bytes_config_write,
},
+ {
+ .name = "event_filter",
+ .mode = 0444,
+ .kf_ops = &rdtgroup_kf_single_ops,
+ .seq_show = event_filter_show,
+ },
{
.name = "mbm_assign_mode",
.mode = 0444,
@@ -2279,10 +2285,48 @@ int rdtgroup_kn_mode_restore(struct rdtgroup *r, const char *name,
return ret;
}
+static int resctrl_mkdir_event_configs(struct rdt_resource *r, struct kernfs_node *l3_mon_kn)
+{
+ struct kernfs_node *kn_subdir, *kn_subdir2;
+ struct mon_evt *mevt;
+ int ret;
+
+ kn_subdir = kernfs_create_dir(l3_mon_kn, "event_configs", l3_mon_kn->mode, NULL);
+ if (IS_ERR(kn_subdir))
+ return PTR_ERR(kn_subdir);
+
+ ret = rdtgroup_kn_set_ugid(kn_subdir);
+ if (ret)
+ return ret;
+
+ for_each_mon_event(mevt) {
+ if (mevt->rid != r->rid || !mevt->enabled || !resctrl_is_mbm_event(mevt->evtid))
+ continue;
+
+ kn_subdir2 = kernfs_create_dir(kn_subdir, mevt->name, kn_subdir->mode, mevt);
+ if (IS_ERR(kn_subdir2)) {
+ ret = PTR_ERR(kn_subdir2);
+ goto out;
+ }
+
+ ret = rdtgroup_kn_set_ugid(kn_subdir2);
+ if (ret)
+ goto out;
+
+ ret = rdtgroup_add_files(kn_subdir2, RFTYPE_ASSIGN_CONFIG);
+ if (ret)
+ break;
+ }
+
+out:
+ return ret;
+}
+
static int rdtgroup_mkdir_info_resdir(void *priv, char *name,
unsigned long fflags)
{
struct kernfs_node *kn_subdir;
+ struct rdt_resource *r;
int ret;
kn_subdir = kernfs_create_dir(kn_info, name,
@@ -2295,6 +2339,18 @@ static int rdtgroup_mkdir_info_resdir(void *priv, char *name,
return ret;
ret = rdtgroup_add_files(kn_subdir, fflags);
+ if (ret)
+ return ret;
+
+ if ((fflags & RFTYPE_MON_INFO) == RFTYPE_MON_INFO) {
+ r = priv;
+ if (r->mon.mbm_cntr_assignable) {
+ ret = resctrl_mkdir_event_configs(r, kn_subdir);
+ if (ret)
+ return ret;
+ }
+ }
+
if (!ret)
kernfs_activate(kn_subdir);
--
2.34.1
Hi Babu, On 7/25/25 11:29 AM, Babu Moger wrote: > --- > diff --git a/Documentation/filesystems/resctrl.rst b/Documentation/filesystems/resctrl.rst > index 4c24c5f3f4c1..3dfc177f9792 100644 > --- a/Documentation/filesystems/resctrl.rst > +++ b/Documentation/filesystems/resctrl.rst > @@ -310,6 +310,38 @@ with the following files: > # cat /sys/fs/resctrl/info/L3_MON/available_mbm_cntrs > 0=30;1=30 > > +"event_configs": > + Directory that exists when "mbm_event" counter assignment mode is supported. > + Contains sub-directory for each MBM event that can be assigned to a counter. "Contains sub-directory" -> "Contains a sub-directory"? > + > + Two MBM events are supported by default: mbm_local_bytes and mbm_total_bytes. > + Each MBM event's sub-directory contains a file named "event_filter" that is > + used to view and modify which memory transactions the MBM event is configured > + with. > + > + List of memory transaction types supported: > + > + ========================== ======================================================== > + Name Description > + ========================== ======================================================== > + dirty_victim_writes_all Dirty Victims from the QOS domain to all types of memory > + remote_reads_slow_memory Reads to slow memory in the non-local NUMA domain > + local_reads_slow_memory Reads to slow memory in the local NUMA domain > + remote_non_temporal_writes Non-temporal writes to non-local NUMA domain > + local_non_temporal_writes Non-temporal writes to local NUMA domain > + remote_reads Reads to memory in the non-local NUMA domain > + local_reads Reads to memory in the local NUMA domain > + ========================== ======================================================== > + > + For example:: > + > + # cat /sys/fs/resctrl/info/L3_MON/event_configs/mbm_total_bytes/event_filter > + local_reads,remote_reads,local_non_temporal_writes,remote_non_temporal_writes, > + local_reads_slow_memory,remote_reads_slow_memory,dirty_victim_writes_all > + > + # cat /sys/fs/resctrl/info/L3_MON/event_configs/mbm_local_bytes/event_filter > + local_reads,local_non_temporal_writes,local_reads_slow_memory > + > "max_threshold_occupancy": > Read/write file provides the largest value (in > bytes) at which a previously used LLC_occupancy ... > diff --git a/fs/resctrl/monitor.c b/fs/resctrl/monitor.c > index 16bcfeeb89e6..fa5f63126682 100644 > --- a/fs/resctrl/monitor.c > +++ b/fs/resctrl/monitor.c > @@ -929,6 +929,29 @@ struct mbm_transaction mbm_transactions[NUM_MBM_TRANSACTIONS] = { > {"dirty_victim_writes_all", DIRTY_VICTIMS_TO_ALL_MEM}, > }; > > +int event_filter_show(struct kernfs_open_file *of, struct seq_file *seq, void *v) > +{ > + struct mon_evt *mevt = rdt_kn_parent_priv(of->kn); > + bool sep = false; > + int i; > + > + mutex_lock(&rdtgroup_mutex); > + There is inconsistency among the files introduced on how "mbm_event mode disabled" case is handled. Some files return failure from their _show()/_write() when "mbm_event mode is disabled", some don't. The "event_filter" file always prints the MBM transactions monitored when assignable counters are supported, whether mbm_event mode is enabled or not. This means that the MBM event's configuration values are printed when "default" mode is enabled. I have two concerns about this 1) This is potentially very confusing since switching to "default" will make the BMEC files visible that will enable the user to modify the event configurations per domain. Having this file print a global event configuration while there are potentially various different domain-specific configuration active will be confusing. 2) Can it be guaranteed that the MBM events will monitor the default assignable counter memory transactions when in "default" mode? It has never been possible to query which memory transactions are monitored by the default X86_FEATURE_CQM_MBM_TOTAL and X86_FEATURE_CQM_MBM_LOCAL features so this seems to use one feature to deduce capabilities or another? > + for (i = 0; i < NUM_MBM_TRANSACTIONS; i++) { > + if (mevt->evt_cfg & mbm_transactions[i].val) { > + if (sep) > + seq_putc(seq, ','); > + seq_printf(seq, "%s", mbm_transactions[i].name); > + sep = true; > + } > + } > + seq_putc(seq, '\n'); > + > + mutex_unlock(&rdtgroup_mutex); > + > + return 0; > +} > + > /** > * resctrl_mon_resource_init() - Initialise global monitoring structures. > * > @@ -982,6 +1005,7 @@ int resctrl_mon_resource_init(void) > RFTYPE_MON_INFO | RFTYPE_RES_CACHE); > resctrl_file_fflags_init("available_mbm_cntrs", > RFTYPE_MON_INFO | RFTYPE_RES_CACHE); > + resctrl_file_fflags_init("event_filter", RFTYPE_ASSIGN_CONFIG); > } > > return 0; ... > @@ -2295,6 +2339,18 @@ static int rdtgroup_mkdir_info_resdir(void *priv, char *name, > return ret; > > ret = rdtgroup_add_files(kn_subdir, fflags); > + if (ret) > + return ret; > + > + if ((fflags & RFTYPE_MON_INFO) == RFTYPE_MON_INFO) { > + r = priv; > + if (r->mon.mbm_cntr_assignable) { > + ret = resctrl_mkdir_event_configs(r, kn_subdir); > + if (ret) > + return ret; > + } > + } > + > if (!ret) > kernfs_activate(kn_subdir); > Looks like the "if (!ret)" above can be dropped to always call "kernfs_activate(kn_subdir)" on exit making it clear that this is success path and function exits early on any error. Reinette
Hi Reinette, On 7/30/2025 3:04 PM, Reinette Chatre wrote: > Hi Babu, > > On 7/25/25 11:29 AM, Babu Moger wrote: > > >> --- >> diff --git a/Documentation/filesystems/resctrl.rst b/Documentation/filesystems/resctrl.rst >> index 4c24c5f3f4c1..3dfc177f9792 100644 >> --- a/Documentation/filesystems/resctrl.rst >> +++ b/Documentation/filesystems/resctrl.rst >> @@ -310,6 +310,38 @@ with the following files: >> # cat /sys/fs/resctrl/info/L3_MON/available_mbm_cntrs >> 0=30;1=30 >> >> +"event_configs": >> + Directory that exists when "mbm_event" counter assignment mode is supported. >> + Contains sub-directory for each MBM event that can be assigned to a counter. > "Contains sub-directory" -> "Contains a sub-directory"? Sure. >> + >> + Two MBM events are supported by default: mbm_local_bytes and mbm_total_bytes. >> + Each MBM event's sub-directory contains a file named "event_filter" that is >> + used to view and modify which memory transactions the MBM event is configured >> + with. >> + >> + List of memory transaction types supported: >> + >> + ========================== ======================================================== >> + Name Description >> + ========================== ======================================================== >> + dirty_victim_writes_all Dirty Victims from the QOS domain to all types of memory >> + remote_reads_slow_memory Reads to slow memory in the non-local NUMA domain >> + local_reads_slow_memory Reads to slow memory in the local NUMA domain >> + remote_non_temporal_writes Non-temporal writes to non-local NUMA domain >> + local_non_temporal_writes Non-temporal writes to local NUMA domain >> + remote_reads Reads to memory in the non-local NUMA domain >> + local_reads Reads to memory in the local NUMA domain >> + ========================== ======================================================== >> + >> + For example:: >> + >> + # cat /sys/fs/resctrl/info/L3_MON/event_configs/mbm_total_bytes/event_filter >> + local_reads,remote_reads,local_non_temporal_writes,remote_non_temporal_writes, >> + local_reads_slow_memory,remote_reads_slow_memory,dirty_victim_writes_all >> + >> + # cat /sys/fs/resctrl/info/L3_MON/event_configs/mbm_local_bytes/event_filter >> + local_reads,local_non_temporal_writes,local_reads_slow_memory >> + >> "max_threshold_occupancy": >> Read/write file provides the largest value (in >> bytes) at which a previously used LLC_occupancy > ... > >> diff --git a/fs/resctrl/monitor.c b/fs/resctrl/monitor.c >> index 16bcfeeb89e6..fa5f63126682 100644 >> --- a/fs/resctrl/monitor.c >> +++ b/fs/resctrl/monitor.c >> @@ -929,6 +929,29 @@ struct mbm_transaction mbm_transactions[NUM_MBM_TRANSACTIONS] = { >> {"dirty_victim_writes_all", DIRTY_VICTIMS_TO_ALL_MEM}, >> }; >> >> +int event_filter_show(struct kernfs_open_file *of, struct seq_file *seq, void *v) >> +{ >> + struct mon_evt *mevt = rdt_kn_parent_priv(of->kn); >> + bool sep = false; >> + int i; >> + >> + mutex_lock(&rdtgroup_mutex); >> + > There is inconsistency among the files introduced on how > "mbm_event mode disabled" case is handled. Some files return failure > from their _show()/_write() when "mbm_event mode is disabled", some don't. > > The "event_filter" file always prints the MBM transactions monitored > when assignable counters are supported, whether mbm_event mode is enabled > or not. This means that the MBM event's configuration values are printed > when "default" mode is enabled. I have two concerns about this > 1) This is potentially very confusing since switching to "default" will > make the BMEC files visible that will enable the user to modify the > event configurations per domain. Having this file print a global event > configuration while there are potentially various different domain-specific > configuration active will be confusing. Yes. I agree. > 2) Can it be guaranteed that the MBM events will monitor the default > assignable counter memory transactions when in "default" mode? It has > never been possible to query which memory transactions are monitored by > the default X86_FEATURE_CQM_MBM_TOTAL and X86_FEATURE_CQM_MBM_LOCAL features > so this seems to use one feature to deduce capabilities or another? So, initialize both total and local event configuration to default values when switched to "default mode"? Something like this? mon_event_all[QOS_L3_MBM_TOTAL_EVENT_ID].evt_cfg = r->mon.mbm_cfg_mask; mon_event_all[QOS_L3_MBM_LOCAL_EVENT_ID].evt_cfg = READS_TO_LOCAL_MEM | READS_TO_LOCAL_S_MEM | NON_TEMP_WRITE_TO_LOCAL_MEM; We are already doing that right (in patch 32)? > > >> + for (i = 0; i < NUM_MBM_TRANSACTIONS; i++) { >> + if (mevt->evt_cfg & mbm_transactions[i].val) { >> + if (sep) >> + seq_putc(seq, ','); >> + seq_printf(seq, "%s", mbm_transactions[i].name); >> + sep = true; >> + } >> + } >> + seq_putc(seq, '\n'); >> + >> + mutex_unlock(&rdtgroup_mutex); >> + >> + return 0; >> +} >> + >> /** >> * resctrl_mon_resource_init() - Initialise global monitoring structures. >> * >> @@ -982,6 +1005,7 @@ int resctrl_mon_resource_init(void) >> RFTYPE_MON_INFO | RFTYPE_RES_CACHE); >> resctrl_file_fflags_init("available_mbm_cntrs", >> RFTYPE_MON_INFO | RFTYPE_RES_CACHE); >> + resctrl_file_fflags_init("event_filter", RFTYPE_ASSIGN_CONFIG); >> } >> >> return 0; > ... > >> @@ -2295,6 +2339,18 @@ static int rdtgroup_mkdir_info_resdir(void *priv, char *name, >> return ret; >> >> ret = rdtgroup_add_files(kn_subdir, fflags); >> + if (ret) >> + return ret; >> + >> + if ((fflags & RFTYPE_MON_INFO) == RFTYPE_MON_INFO) { >> + r = priv; >> + if (r->mon.mbm_cntr_assignable) { >> + ret = resctrl_mkdir_event_configs(r, kn_subdir); >> + if (ret) >> + return ret; >> + } >> + } >> + >> if (!ret) >> kernfs_activate(kn_subdir); >> > Looks like the "if (!ret)" above can be dropped to always call "kernfs_activate(kn_subdir)" > on exit making it clear that this is success path and function exits early on any error. Sure. Will do, Thanks Babu
Hi Babu, On 8/8/25 6:56 AM, Moger, Babu wrote: > On 7/30/2025 3:04 PM, Reinette Chatre wrote: >> On 7/25/25 11:29 AM, Babu Moger wrote: >>> diff --git a/fs/resctrl/monitor.c b/fs/resctrl/monitor.c >>> index 16bcfeeb89e6..fa5f63126682 100644 >>> --- a/fs/resctrl/monitor.c >>> +++ b/fs/resctrl/monitor.c >>> @@ -929,6 +929,29 @@ struct mbm_transaction mbm_transactions[NUM_MBM_TRANSACTIONS] = { >>> {"dirty_victim_writes_all", DIRTY_VICTIMS_TO_ALL_MEM}, >>> }; >>> +int event_filter_show(struct kernfs_open_file *of, struct seq_file *seq, void *v) >>> +{ >>> + struct mon_evt *mevt = rdt_kn_parent_priv(of->kn); >>> + bool sep = false; >>> + int i; >>> + >>> + mutex_lock(&rdtgroup_mutex); >>> + >> There is inconsistency among the files introduced on how >> "mbm_event mode disabled" case is handled. Some files return failure >> from their _show()/_write() when "mbm_event mode is disabled", some don't. >> >> The "event_filter" file always prints the MBM transactions monitored >> when assignable counters are supported, whether mbm_event mode is enabled >> or not. This means that the MBM event's configuration values are printed >> when "default" mode is enabled. I have two concerns about this >> 1) This is potentially very confusing since switching to "default" will >> make the BMEC files visible that will enable the user to modify the >> event configurations per domain. Having this file print a global event >> configuration while there are potentially various different domain-specific >> configuration active will be confusing. > Yes. I agree. hmmm ... ok, you say that you agree but I cannot tell if you are going to do anything about it. On a system with BMEC enabled the mbm_total_bytes_config and mbm_local_bytes_config files should be the *only* source of MBM event configuration information, no? It may be ok to have event_filter print configuration when assignable counters are disabled if BMEC is not supported but that would require that this information will always be known for a "default" monitoring setup. While this may be true for AMD it is not obvious to me that this is universally true. Once this file exists in this form resctrl will always be required to provide data for the event configuration and it is not clear to me that this can always be guaranteed. >> 2) Can it be guaranteed that the MBM events will monitor the default >> assignable counter memory transactions when in "default" mode? It has >> never been possible to query which memory transactions are monitored by >> the default X86_FEATURE_CQM_MBM_TOTAL and X86_FEATURE_CQM_MBM_LOCAL features >> so this seems to use one feature to deduce capabilities or another? > > So, initialize both total and local event configuration to default values when switched to "default mode"? > > Something like this? > > mon_event_all[QOS_L3_MBM_TOTAL_EVENT_ID].evt_cfg = r->mon.mbm_cfg_mask; > > mon_event_all[QOS_L3_MBM_LOCAL_EVENT_ID].evt_cfg = READS_TO_LOCAL_MEM | READS_TO_LOCAL_S_MEM | NON_TEMP_WRITE_TO_LOCAL_MEM; > > We are already doing that right (in patch 32)? Yes, but it creates this strange dependency where the "default" monitoring mode (that has been supported long before configurable events and assignable counters came along) requires support of "assignable counter mode". Consider it from another view, if resctrl wants to make event configuration available for the "default" mode then the "event_filter" file could always be visible when MBM local/total is supported to give users insight into what is monitored, whether assignable counters are supported or not. This is not possible on systems that do not support ABMC though, right? Reinette
Hi Reinette, On 8/8/2025 10:12 AM, Reinette Chatre wrote: > Hi Babu, > > On 8/8/25 6:56 AM, Moger, Babu wrote: >> On 7/30/2025 3:04 PM, Reinette Chatre wrote: >>> On 7/25/25 11:29 AM, Babu Moger wrote: >>>> diff --git a/fs/resctrl/monitor.c b/fs/resctrl/monitor.c >>>> index 16bcfeeb89e6..fa5f63126682 100644 >>>> --- a/fs/resctrl/monitor.c >>>> +++ b/fs/resctrl/monitor.c >>>> @@ -929,6 +929,29 @@ struct mbm_transaction mbm_transactions[NUM_MBM_TRANSACTIONS] = { >>>> {"dirty_victim_writes_all", DIRTY_VICTIMS_TO_ALL_MEM}, >>>> }; >>>> +int event_filter_show(struct kernfs_open_file *of, struct seq_file *seq, void *v) >>>> +{ >>>> + struct mon_evt *mevt = rdt_kn_parent_priv(of->kn); >>>> + bool sep = false; >>>> + int i; >>>> + >>>> + mutex_lock(&rdtgroup_mutex); >>>> + >>> There is inconsistency among the files introduced on how >>> "mbm_event mode disabled" case is handled. Some files return failure >>> from their _show()/_write() when "mbm_event mode is disabled", some don't. >>> >>> The "event_filter" file always prints the MBM transactions monitored >>> when assignable counters are supported, whether mbm_event mode is enabled >>> or not. This means that the MBM event's configuration values are printed >>> when "default" mode is enabled. I have two concerns about this >>> 1) This is potentially very confusing since switching to "default" will >>> make the BMEC files visible that will enable the user to modify the >>> event configurations per domain. Having this file print a global event >>> configuration while there are potentially various different domain-specific >>> configuration active will be confusing. >> Yes. I agree. > hmmm ... ok, you say that you agree but I cannot tell if you are going to > do anything about it. > > On a system with BMEC enabled the mbm_total_bytes_config and mbm_local_bytes_config > files should be the *only* source of MBM event configuration information, no? That is correct. > > It may be ok to have event_filter print configuration when assignable counters are disabled > if BMEC is not supported but that would require that this information will always be > known for a "default" monitoring setup. While this may be true for AMD it is not obvious > to me that this is universally true. Once this file exists in this form resctrl will always > be required to provide data for the event configuration and it is not clear to me that > this can always be guaranteed. Yea. It is not true universally true. I don't know what these values are for Intel and ARM. > >>> 2) Can it be guaranteed that the MBM events will monitor the default >>> assignable counter memory transactions when in "default" mode? It has >>> never been possible to query which memory transactions are monitored by >>> the default X86_FEATURE_CQM_MBM_TOTAL and X86_FEATURE_CQM_MBM_LOCAL features >>> so this seems to use one feature to deduce capabilities or another? >> So, initialize both total and local event configuration to default values when switched to "default mode"? >> >> Something like this? >> >> mon_event_all[QOS_L3_MBM_TOTAL_EVENT_ID].evt_cfg = r->mon.mbm_cfg_mask; >> >> mon_event_all[QOS_L3_MBM_LOCAL_EVENT_ID].evt_cfg = READS_TO_LOCAL_MEM | READS_TO_LOCAL_S_MEM | NON_TEMP_WRITE_TO_LOCAL_MEM; >> >> We are already doing that right (in patch 32)? > Yes, but it creates this strange dependency where the "default" monitoring mode > (that has been supported long before configurable events and assignable counters came > along) requires support of "assignable counter mode". > > Consider it from another view, if resctrl wants to make event configuration available > for the "default" mode then the "event_filter" file could always be visible when MBM > local/total is supported to give users insight into what is monitored, whether > assignable counters are supported or not. This is not possible on systems that do > not support ABMC though, right? That is correct. With BMEC, each domain can have its own settings. So, printing the default will not be accurate. What options do we have here. How about adding the check if (resctrl_arch_mbm_cntr_assign_enabled())? Only print the values when ABMC is supported else print information in "last_cmd_status". Thanks Babu
Hi Babu, On 8/8/25 10:47 AM, Moger, Babu wrote: > On 8/8/2025 10:12 AM, Reinette Chatre wrote: >> On 8/8/25 6:56 AM, Moger, Babu wrote: >>> On 7/30/2025 3:04 PM, Reinette Chatre wrote: >>>> On 7/25/25 11:29 AM, Babu Moger wrote: >>>>> diff --git a/fs/resctrl/monitor.c b/fs/resctrl/monitor.c >>>>> index 16bcfeeb89e6..fa5f63126682 100644 >>>>> --- a/fs/resctrl/monitor.c >>>>> +++ b/fs/resctrl/monitor.c >>>>> @@ -929,6 +929,29 @@ struct mbm_transaction mbm_transactions[NUM_MBM_TRANSACTIONS] = { >>>>> {"dirty_victim_writes_all", DIRTY_VICTIMS_TO_ALL_MEM}, >>>>> }; >>>>> +int event_filter_show(struct kernfs_open_file *of, struct seq_file *seq, void *v) >>>>> +{ >>>>> + struct mon_evt *mevt = rdt_kn_parent_priv(of->kn); >>>>> + bool sep = false; >>>>> + int i; >>>>> + >>>>> + mutex_lock(&rdtgroup_mutex); >>>>> + >>>> There is inconsistency among the files introduced on how >>>> "mbm_event mode disabled" case is handled. Some files return failure >>>> from their _show()/_write() when "mbm_event mode is disabled", some don't. >>>> >>>> The "event_filter" file always prints the MBM transactions monitored >>>> when assignable counters are supported, whether mbm_event mode is enabled >>>> or not. This means that the MBM event's configuration values are printed >>>> when "default" mode is enabled. I have two concerns about this >>>> 1) This is potentially very confusing since switching to "default" will >>>> make the BMEC files visible that will enable the user to modify the >>>> event configurations per domain. Having this file print a global event >>>> configuration while there are potentially various different domain-specific >>>> configuration active will be confusing. >>> Yes. I agree. >> hmmm ... ok, you say that you agree but I cannot tell if you are going to >> do anything about it. >> >> On a system with BMEC enabled the mbm_total_bytes_config and mbm_local_bytes_config >> files should be the *only* source of MBM event configuration information, no? > > That is correct. > > >> >> It may be ok to have event_filter print configuration when assignable counters are disabled >> if BMEC is not supported but that would require that this information will always be >> known for a "default" monitoring setup. While this may be true for AMD it is not obvious >> to me that this is universally true. Once this file exists in this form resctrl will always >> be required to provide data for the event configuration and it is not clear to me that >> this can always be guaranteed. > > Yea. It is not true universally true. I don't know what these values are for Intel and ARM. > >> >>>> 2) Can it be guaranteed that the MBM events will monitor the default >>>> assignable counter memory transactions when in "default" mode? It has >>>> never been possible to query which memory transactions are monitored by >>>> the default X86_FEATURE_CQM_MBM_TOTAL and X86_FEATURE_CQM_MBM_LOCAL features >>>> so this seems to use one feature to deduce capabilities or another? >>> So, initialize both total and local event configuration to default values when switched to "default mode"? >>> >>> Something like this? >>> >>> mon_event_all[QOS_L3_MBM_TOTAL_EVENT_ID].evt_cfg = r->mon.mbm_cfg_mask; >>> >>> mon_event_all[QOS_L3_MBM_LOCAL_EVENT_ID].evt_cfg = READS_TO_LOCAL_MEM | READS_TO_LOCAL_S_MEM | NON_TEMP_WRITE_TO_LOCAL_MEM; >>> >>> We are already doing that right (in patch 32)? >> Yes, but it creates this strange dependency where the "default" monitoring mode >> (that has been supported long before configurable events and assignable counters came >> along) requires support of "assignable counter mode". >> >> Consider it from another view, if resctrl wants to make event configuration available >> for the "default" mode then the "event_filter" file could always be visible when MBM >> local/total is supported to give users insight into what is monitored, whether >> assignable counters are supported or not. This is not possible on systems that do >> not support ABMC though, right? > > That is correct. With BMEC, each domain can have its own settings. So, printing the default will not be accurate. > > What options do we have here. > > How about adding the check if (resctrl_arch_mbm_cntr_assign_enabled())? Only print the values when ABMC is supported else print information in "last_cmd_status". > Did you perhaps intend to write "Only print the values when ABMC is *enabled* else print information in "last_cmd_status".? If this is what you actually meant then I agree. I think doing so places clear boundary on this feature that gives us more options/flexibility for future changes. Reinette
Hi Reinette, On 8/8/2025 1:23 PM, Reinette Chatre wrote: > Hi Babu, > > On 8/8/25 10:47 AM, Moger, Babu wrote: >> On 8/8/2025 10:12 AM, Reinette Chatre wrote: >>> On 8/8/25 6:56 AM, Moger, Babu wrote: >>>> On 7/30/2025 3:04 PM, Reinette Chatre wrote: >>>>> On 7/25/25 11:29 AM, Babu Moger wrote: >>>>>> diff --git a/fs/resctrl/monitor.c b/fs/resctrl/monitor.c >>>>>> index 16bcfeeb89e6..fa5f63126682 100644 >>>>>> --- a/fs/resctrl/monitor.c >>>>>> +++ b/fs/resctrl/monitor.c >>>>>> @@ -929,6 +929,29 @@ struct mbm_transaction mbm_transactions[NUM_MBM_TRANSACTIONS] = { >>>>>> {"dirty_victim_writes_all", DIRTY_VICTIMS_TO_ALL_MEM}, >>>>>> }; >>>>>> +int event_filter_show(struct kernfs_open_file *of, struct seq_file *seq, void *v) >>>>>> +{ >>>>>> + struct mon_evt *mevt = rdt_kn_parent_priv(of->kn); >>>>>> + bool sep = false; >>>>>> + int i; >>>>>> + >>>>>> + mutex_lock(&rdtgroup_mutex); >>>>>> + >>>>> There is inconsistency among the files introduced on how >>>>> "mbm_event mode disabled" case is handled. Some files return failure >>>>> from their _show()/_write() when "mbm_event mode is disabled", some don't. >>>>> >>>>> The "event_filter" file always prints the MBM transactions monitored >>>>> when assignable counters are supported, whether mbm_event mode is enabled >>>>> or not. This means that the MBM event's configuration values are printed >>>>> when "default" mode is enabled. I have two concerns about this >>>>> 1) This is potentially very confusing since switching to "default" will >>>>> make the BMEC files visible that will enable the user to modify the >>>>> event configurations per domain. Having this file print a global event >>>>> configuration while there are potentially various different domain-specific >>>>> configuration active will be confusing. >>>> Yes. I agree. >>> hmmm ... ok, you say that you agree but I cannot tell if you are going to >>> do anything about it. >>> >>> On a system with BMEC enabled the mbm_total_bytes_config and mbm_local_bytes_config >>> files should be the *only* source of MBM event configuration information, no? >> That is correct. >> >> >>> It may be ok to have event_filter print configuration when assignable counters are disabled >>> if BMEC is not supported but that would require that this information will always be >>> known for a "default" monitoring setup. While this may be true for AMD it is not obvious >>> to me that this is universally true. Once this file exists in this form resctrl will always >>> be required to provide data for the event configuration and it is not clear to me that >>> this can always be guaranteed. >> Yea. It is not true universally true. I don't know what these values are for Intel and ARM. >> >>>>> 2) Can it be guaranteed that the MBM events will monitor the default >>>>> assignable counter memory transactions when in "default" mode? It has >>>>> never been possible to query which memory transactions are monitored by >>>>> the default X86_FEATURE_CQM_MBM_TOTAL and X86_FEATURE_CQM_MBM_LOCAL features >>>>> so this seems to use one feature to deduce capabilities or another? >>>> So, initialize both total and local event configuration to default values when switched to "default mode"? >>>> >>>> Something like this? >>>> >>>> mon_event_all[QOS_L3_MBM_TOTAL_EVENT_ID].evt_cfg = r->mon.mbm_cfg_mask; >>>> >>>> mon_event_all[QOS_L3_MBM_LOCAL_EVENT_ID].evt_cfg = READS_TO_LOCAL_MEM | READS_TO_LOCAL_S_MEM | NON_TEMP_WRITE_TO_LOCAL_MEM; >>>> >>>> We are already doing that right (in patch 32)? >>> Yes, but it creates this strange dependency where the "default" monitoring mode >>> (that has been supported long before configurable events and assignable counters came >>> along) requires support of "assignable counter mode". >>> >>> Consider it from another view, if resctrl wants to make event configuration available >>> for the "default" mode then the "event_filter" file could always be visible when MBM >>> local/total is supported to give users insight into what is monitored, whether >>> assignable counters are supported or not. This is not possible on systems that do >>> not support ABMC though, right? >> That is correct. With BMEC, each domain can have its own settings. So, printing the default will not be accurate. >> >> What options do we have here. >> >> How about adding the check if (resctrl_arch_mbm_cntr_assign_enabled())? Only print the values when ABMC is supported else print information in "last_cmd_status". >> > Did you perhaps intend to write "Only print the values when ABMC is *enabled* else print > information in "last_cmd_status".? If this is what you actually meant then I agree. I think > doing so places clear boundary on this feature that gives us more options/flexibility for > future changes. Yes. That is what I meant. We have same check in event_filer_write(). Will add the same check in event_filter_show(). Thanks Babu
Hi Babu, On 8/8/25 11:48 AM, Moger, Babu wrote: > On 8/8/2025 1:23 PM, Reinette Chatre wrote: >> On 8/8/25 10:47 AM, Moger, Babu wrote: >>> On 8/8/2025 10:12 AM, Reinette Chatre wrote: >>>> On 8/8/25 6:56 AM, Moger, Babu wrote: >>>>> On 7/30/2025 3:04 PM, Reinette Chatre wrote: >>>>>> On 7/25/25 11:29 AM, Babu Moger wrote: >>>>>>> diff --git a/fs/resctrl/monitor.c b/fs/resctrl/monitor.c >>>>>>> index 16bcfeeb89e6..fa5f63126682 100644 >>>>>>> --- a/fs/resctrl/monitor.c >>>>>>> +++ b/fs/resctrl/monitor.c >>>>>>> @@ -929,6 +929,29 @@ struct mbm_transaction mbm_transactions[NUM_MBM_TRANSACTIONS] = { >>>>>>> {"dirty_victim_writes_all", DIRTY_VICTIMS_TO_ALL_MEM}, >>>>>>> }; >>>>>>> +int event_filter_show(struct kernfs_open_file *of, struct seq_file *seq, void *v) >>>>>>> +{ >>>>>>> + struct mon_evt *mevt = rdt_kn_parent_priv(of->kn); >>>>>>> + bool sep = false; >>>>>>> + int i; >>>>>>> + >>>>>>> + mutex_lock(&rdtgroup_mutex); >>>>>>> + >>>>>> There is inconsistency among the files introduced on how >>>>>> "mbm_event mode disabled" case is handled. Some files return failure >>>>>> from their _show()/_write() when "mbm_event mode is disabled", some don't. >>>>>> >>>>>> The "event_filter" file always prints the MBM transactions monitored >>>>>> when assignable counters are supported, whether mbm_event mode is enabled >>>>>> or not. This means that the MBM event's configuration values are printed >>>>>> when "default" mode is enabled. I have two concerns about this >>>>>> 1) This is potentially very confusing since switching to "default" will >>>>>> make the BMEC files visible that will enable the user to modify the >>>>>> event configurations per domain. Having this file print a global event >>>>>> configuration while there are potentially various different domain-specific >>>>>> configuration active will be confusing. >>>>> Yes. I agree. >>>> hmmm ... ok, you say that you agree but I cannot tell if you are going to >>>> do anything about it. >>>> >>>> On a system with BMEC enabled the mbm_total_bytes_config and mbm_local_bytes_config >>>> files should be the *only* source of MBM event configuration information, no? >>> That is correct. >>> >>> >>>> It may be ok to have event_filter print configuration when assignable counters are disabled >>>> if BMEC is not supported but that would require that this information will always be >>>> known for a "default" monitoring setup. While this may be true for AMD it is not obvious >>>> to me that this is universally true. Once this file exists in this form resctrl will always >>>> be required to provide data for the event configuration and it is not clear to me that >>>> this can always be guaranteed. >>> Yea. It is not true universally true. I don't know what these values are for Intel and ARM. >>> >>>>>> 2) Can it be guaranteed that the MBM events will monitor the default >>>>>> assignable counter memory transactions when in "default" mode? It has >>>>>> never been possible to query which memory transactions are monitored by >>>>>> the default X86_FEATURE_CQM_MBM_TOTAL and X86_FEATURE_CQM_MBM_LOCAL features >>>>>> so this seems to use one feature to deduce capabilities or another? >>>>> So, initialize both total and local event configuration to default values when switched to "default mode"? >>>>> >>>>> Something like this? >>>>> >>>>> mon_event_all[QOS_L3_MBM_TOTAL_EVENT_ID].evt_cfg = r->mon.mbm_cfg_mask; >>>>> >>>>> mon_event_all[QOS_L3_MBM_LOCAL_EVENT_ID].evt_cfg = READS_TO_LOCAL_MEM | READS_TO_LOCAL_S_MEM | NON_TEMP_WRITE_TO_LOCAL_MEM; >>>>> >>>>> We are already doing that right (in patch 32)? >>>> Yes, but it creates this strange dependency where the "default" monitoring mode >>>> (that has been supported long before configurable events and assignable counters came >>>> along) requires support of "assignable counter mode". >>>> >>>> Consider it from another view, if resctrl wants to make event configuration available >>>> for the "default" mode then the "event_filter" file could always be visible when MBM >>>> local/total is supported to give users insight into what is monitored, whether >>>> assignable counters are supported or not. This is not possible on systems that do >>>> not support ABMC though, right? >>> That is correct. With BMEC, each domain can have its own settings. So, printing the default will not be accurate. >>> >>> What options do we have here. >>> >>> How about adding the check if (resctrl_arch_mbm_cntr_assign_enabled())? Only print the values when ABMC is supported else print information in "last_cmd_status". >>> >> Did you perhaps intend to write "Only print the values when ABMC is *enabled* else print >> information in "last_cmd_status".? If this is what you actually meant then I agree. I think >> doing so places clear boundary on this feature that gives us more options/flexibility for >> future changes. > > Yes. That is what I meant. We have same check in event_filer_write(). Will add the same check in event_filter_show(). > Thank you. This makes this specific behavior consistent and addresses the topic that started this thread: > There is inconsistency among the files introduced on how > "mbm_event mode disabled" case is handled." Could you please check the final work to confirm that the new resctrl files are consistent in handling of "mbm_event mode supported" and "mbm_event mode" enabled vs disabled cases? For example, they consider same scenarios as valid/invalid and return same error code in invalid cases. Reinette
© 2016 - 2025 Red Hat, Inc.