Assign/unassign counters on resctrl group creation/deletion. Two counters
are required per group, one for MBM total event and one for MBM local
event.

There are a limited number of counters available for assignment. If these
counters are exhausted, the kernel will display the error message: "Out of
MBM assignable counters". However, it is not necessary to fail the
creation of a group due to assignment failures. Users have the flexibility
to modify the assignments at a later time.
Signed-off-by: Babu Moger <babu.moger@amd.com>
---
v11: Moved mbm_cntr_reset() to monitor.c.
Added code to reset non-architectural state in mbm_cntr_reset().
Added missing rdtgroup_unassign_cntrs() calls on failure path.
v10: Assigned the counter before exposing the event files.
Moved the call rdtgroup_assign_cntrs() inside mkdir_rdt_prepare_rmid_alloc().
This is called for both CTRL_MON and MON group creation.
Call mbm_cntr_reset() when unmounted to clear all the assignments.
Taken care of a few other feedback comments.
v9: Changed rdtgroup_assign_cntrs() and rdtgroup_unassign_cntrs() to return void.
Updated a couple of rdtgroup_unassign_cntrs() calls properly.
Updated function comments.
v8: Renamed rdtgroup_assign_grp to rdtgroup_assign_cntrs.
Renamed rdtgroup_unassign_grp to rdtgroup_unassign_cntrs.
Fixed the problem with unassigning the child MON groups of CTRL_MON group.
v7: Reworded the commit message.
Removed the reference of ABMC with mbm_cntr_assign.
Renamed the function rdtgroup_assign_cntrs to rdtgroup_assign_grp.
v6: Removed the redundant comments on all the calls of
rdtgroup_assign_cntrs. Updated the commit message.
Dropped printing error message on every call of rdtgroup_assign_cntrs.
v5: Removed the code to enable/disable ABMC during the mount.
That will be another patch.
Added arch callers to get the arch specific data.
Renamed functions to match the other abmc function.
Added code comments for assignment failures.
v4: Few name changes based on the upstream discussion.
Commit message update.
v3: This is a new patch. Patch addresses the upstream comment to enable
ABMC feature by default if the feature is available.
---
arch/x86/kernel/cpu/resctrl/internal.h | 1 +
arch/x86/kernel/cpu/resctrl/monitor.c | 27 +++++++++++
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 63 +++++++++++++++++++++++++-
3 files changed, 89 insertions(+), 2 deletions(-)
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index a5b8eadc7f5c..c979abb3d3b0 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -731,4 +731,5 @@ int resctrl_assign_cntr_event(struct rdt_resource *r, struct rdt_mon_domain *d,
struct rdtgroup *rdtgrp, enum resctrl_event_id evtid);
int resctrl_unassign_cntr_event(struct rdt_resource *r, struct rdt_mon_domain *d,
struct rdtgroup *rdtgrp, enum resctrl_event_id evtid);
+void mbm_cntr_reset(struct rdt_resource *r);
#endif /* _ASM_X86_RESCTRL_INTERNAL_H */
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index b6d188d0f9b7..118b39fbb01e 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -1557,3 +1557,30 @@ int resctrl_unassign_cntr_event(struct rdt_resource *r, struct rdt_mon_domain *d
return ret;
}
+
+void mbm_cntr_reset(struct rdt_resource *r)
+{
+ u32 idx_limit = resctrl_arch_system_num_rmid_idx();
+ struct rdt_mon_domain *dom;
+
+ /*
+ * Reset the domain counter configuration. Hardware counters
+ * will reset after switching the monitor mode. So, reset the
+ * architectural and non-architectural state so that reading
+ * of hardware counter is not considered as an overflow in the
+ * next update.
+ */
+ if (is_mbm_enabled() && r->mon.mbm_cntr_assignable) {
+ list_for_each_entry(dom, &r->mon_domains, hdr.list) {
+ memset(dom->cntr_cfg, 0,
+ sizeof(*dom->cntr_cfg) * r->mon.num_mbm_cntrs);
+ if (is_mbm_total_enabled())
+ memset(dom->mbm_total, 0,
+ sizeof(struct mbm_state) * idx_limit);
+ if (is_mbm_local_enabled())
+ memset(dom->mbm_local, 0,
+ sizeof(struct mbm_state) * idx_limit);
+ resctrl_arch_reset_rmid_all(r, dom);
+ }
+ }
+}
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 2b86124c336b..f61f0cd032ef 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -2687,6 +2687,46 @@ static void schemata_list_destroy(void)
}
}
+/*
+ * Called when a new group is created. If "mbm_cntr_assign" mode is enabled,
+ * counters are automatically assigned. Each group can accommodate two counters:
+ * one for the total event and one for the local event. Assignments may fail
+ * due to the limited number of counters. However, it is not necessary to fail
+ * the group creation and thus no failure is returned. Users have the option
+ * to modify the counter assignments after the group has been created.
+ */
+static void rdtgroup_assign_cntrs(struct rdtgroup *rdtgrp)
+{
+ struct rdt_resource *r = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
+
+ if (!resctrl_arch_mbm_cntr_assign_enabled(r))
+ return;
+
+ if (is_mbm_total_enabled())
+ resctrl_assign_cntr_event(r, NULL, rdtgrp, QOS_L3_MBM_TOTAL_EVENT_ID);
+
+ if (is_mbm_local_enabled())
+ resctrl_assign_cntr_event(r, NULL, rdtgrp, QOS_L3_MBM_LOCAL_EVENT_ID);
+}
+
+/*
+ * Called when a group is deleted. Counters are unassigned if it was in
+ * assigned state.
+ */
+static void rdtgroup_unassign_cntrs(struct rdtgroup *rdtgrp)
+{
+ struct rdt_resource *r = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
+
+ if (!resctrl_arch_mbm_cntr_assign_enabled(r))
+ return;
+
+ if (is_mbm_total_enabled())
+ resctrl_unassign_cntr_event(r, NULL, rdtgrp, QOS_L3_MBM_TOTAL_EVENT_ID);
+
+ if (is_mbm_local_enabled())
+ resctrl_unassign_cntr_event(r, NULL, rdtgrp, QOS_L3_MBM_LOCAL_EVENT_ID);
+}
+
static int rdt_get_tree(struct fs_context *fc)
{
struct rdt_fs_context *ctx = rdt_fc2context(fc);
@@ -2741,6 +2781,8 @@ static int rdt_get_tree(struct fs_context *fc)
if (ret < 0)
goto out_info;
+ rdtgroup_assign_cntrs(&rdtgroup_default);
+
ret = mkdir_mondata_all(rdtgroup_default.kn,
&rdtgroup_default, &kn_mondata);
if (ret < 0)
@@ -2779,8 +2821,10 @@ static int rdt_get_tree(struct fs_context *fc)
if (resctrl_arch_mon_capable())
kernfs_remove(kn_mondata);
out_mongrp:
- if (resctrl_arch_mon_capable())
+ if (resctrl_arch_mon_capable()) {
+ rdtgroup_unassign_cntrs(&rdtgroup_default);
kernfs_remove(kn_mongrp);
+ }
out_info:
kernfs_remove(kn_info);
out_schemata_free:
@@ -2956,6 +3000,7 @@ static void free_all_child_rdtgrp(struct rdtgroup *rdtgrp)
head = &rdtgrp->mon.crdtgrp_list;
list_for_each_entry_safe(sentry, stmp, head, mon.crdtgrp_list) {
+ rdtgroup_unassign_cntrs(sentry);
free_rmid(sentry->closid, sentry->mon.rmid);
list_del(&sentry->mon.crdtgrp_list);
@@ -2996,6 +3041,8 @@ static void rmdir_all_sub(void)
cpumask_or(&rdtgroup_default.cpu_mask,
&rdtgroup_default.cpu_mask, &rdtgrp->cpu_mask);
+ rdtgroup_unassign_cntrs(rdtgrp);
+
free_rmid(rdtgrp->closid, rdtgrp->mon.rmid);
kernfs_remove(rdtgrp->kn);
@@ -3027,6 +3074,8 @@ static void rdt_kill_sb(struct super_block *sb)
for_each_alloc_capable_rdt_resource(r)
reset_all_ctrls(r);
rmdir_all_sub();
+ rdtgroup_unassign_cntrs(&rdtgroup_default);
+ mbm_cntr_reset(&rdt_resources_all[RDT_RESOURCE_L3].r_resctrl);
rdt_pseudo_lock_release();
rdtgroup_default.mode = RDT_MODE_SHAREABLE;
schemata_list_destroy();
@@ -3490,9 +3539,12 @@ static int mkdir_rdt_prepare_rmid_alloc(struct rdtgroup *rdtgrp)
}
rdtgrp->mon.rmid = ret;
+ rdtgroup_assign_cntrs(rdtgrp);
+
ret = mkdir_mondata_all(rdtgrp->kn, rdtgrp, &rdtgrp->mon.mon_data_kn);
if (ret) {
rdt_last_cmd_puts("kernfs subdir error\n");
+ rdtgroup_unassign_cntrs(rdtgrp);
free_rmid(rdtgrp->closid, rdtgrp->mon.rmid);
return ret;
}
@@ -3502,8 +3554,10 @@ static int mkdir_rdt_prepare_rmid_alloc(struct rdtgroup *rdtgrp)
static void mkdir_rdt_prepare_rmid_free(struct rdtgroup *rgrp)
{
- if (resctrl_arch_mon_capable())
+ if (resctrl_arch_mon_capable()) {
+ rdtgroup_unassign_cntrs(rgrp);
free_rmid(rgrp->closid, rgrp->mon.rmid);
+ }
}
static int mkdir_rdt_prepare(struct kernfs_node *parent_kn,
@@ -3764,6 +3818,9 @@ static int rdtgroup_rmdir_mon(struct rdtgroup *rdtgrp, cpumask_var_t tmpmask)
update_closid_rmid(tmpmask, NULL);
rdtgrp->flags = RDT_DELETED;
+
+ rdtgroup_unassign_cntrs(rdtgrp);
+
free_rmid(rdtgrp->closid, rdtgrp->mon.rmid);
/*
@@ -3810,6 +3867,8 @@ static int rdtgroup_rmdir_ctrl(struct rdtgroup *rdtgrp, cpumask_var_t tmpmask)
cpumask_or(tmpmask, tmpmask, &rdtgrp->cpu_mask);
update_closid_rmid(tmpmask, NULL);
+ rdtgroup_unassign_cntrs(rdtgrp);
+
free_rmid(rdtgrp->closid, rdtgrp->mon.rmid);
closid_free(rdtgrp->closid);
--
2.34.1
Hi,

On Wed, Jan 22, 2025 at 02:20:25PM -0600, Babu Moger wrote:
> Assign/unassign counters on resctrl group creation/deletion. Two counters
> are required per group, one for MBM total event and one for MBM local
> event.
>
> There are a limited number of counters available for assignment. If these
> counters are exhausted, the kernel will display the error message: "Out of
> MBM assignable counters". However, it is not necessary to fail the
> creation of a group due to assignment failures. Users have the flexibility
> to modify the assignments at a later time.

If we are doing this, should turning mbm_cntr_assign mode on also
trigger auto-assignment for all extant monitoring groups?

Either way though, this auto-assignment feels like a potential nuisance
for userspace.

If the userspace use-case requires too many monitoring groups for the
available counters, then the kernel will auto-assign counters to a
random subset of groups which may or may not be the ones that userspace
wanted to monitor; then userspace must manually look for the assigned
counters and unassign some of them before they can be assigned where
userspace actually wanted them.

This is not impossible for userspace to cope with, but it feels
awkward.

Is there a way to inhibit auto-assignment?

Or could automatic assignments be considered somehow "weak", so that
new explicit assignments can steal automatically assigned counters
without the need to unassign them explicitly?

[...]

Cheers
---Dave
Hi Dave,

On Wed, Feb 19, 2025 at 2:41 PM Dave Martin <Dave.Martin@arm.com> wrote:
> [...]
> Is there a way to inhibit auto-assignment?
>
> Or could automatic assignments be considered somehow "weak", so that
> new explicit assignments can steal automatically assigned counters
> without the need to unassign them explicitly?

We had an incomplete discussion about this early on[1]. I guess I
didn't revisit it because I found it was trivial to add a flag that
inhibits the assignment behavior during mkdir and had moved on to
bigger issues.

If an agent creating directories isn't coordinated with the agent
managing counters, a series of creating and destroying a group could
prevent a monitor assignment from ever succeeding because it's not
possible to atomically discover the name of the new directory that
stole the previously-available counter and reassign it.

However, if the counter-manager can get all the counters assigned once
and only move them with atomic reassignments, it will become
impossible to snatch them with a mkdir.

-Peter

[1] https://lore.kernel.org/lkml/CALPaoCihfQ9VtLYzyHB9-PsQzXLc06BW8bzhBXwj9-i+Q8RVFQ@mail.gmail.com/
Hi Dave and Peter,

On 2/19/25 6:09 AM, Peter Newman wrote:
> Hi Dave,
>
> On Wed, Feb 19, 2025 at 2:41 PM Dave Martin <Dave.Martin@arm.com> wrote:
>> [...]
>> If we are doing this, should turning mbm_cntr_assign mode on also
>> trigger auto-assignment for all extant monitoring groups?
>>
>> Either way though, this auto-assignment feels like a potential nuisance
>> for userspace.

hmmm ... this auto-assignment was created with the goal to help userspace.
In mbm_cntr_assign mode the user will only see data when a counter is
assigned to an event. mbm_cntr_assign mode is selected as default on a
system that supports ABMC. Without auto assignment a user will thus see
different behavior when reading the monitoring events when the user
switches to a kernel with assignable counter support: before assignable
counter support events will have data, with assignable counter support
the events will not have data.

I understood that interfaces should not behave differently when user space
switches kernels and that is what the auto assignment aims to solve.

>> [...]
>> Is there a way to inhibit auto-assignment?
>>
>> Or could automatic assignments be considered somehow "weak", so that
>> new explicit assignments can steal automatically assigned counters
>> without the need to unassign them explicitly?
>
> We had an incomplete discussion about this early on[1]. I guess I
> didn't revisit it because I found it was trivial to add a flag that
> inhibits the assignment behavior during mkdir and had moved on to
> bigger issues.

Could you please remind me how a user will set this flag?

> If an agent creating directories isn't coordinated with the agent
> managing counters, a series of creating and destroying a group could
> prevent a monitor assignment from ever succeeding because it's not
> possible to atomically discover the name of the new directory that
> stole the previously-available counter and reassign it.
>
> However, if the counter-manager can get all the counters assigned once
> and only move them with atomic reassignments, it will become
> impossible to snatch them with a mkdir.

You have many points that makes auto-assignment not be ideal but I
remain concerned that not doing something like this will break
existing users who are not as familiar with resctrl internals.

Reinette
Hi Reinette,

On Wed, Feb 19, 2025 at 6:55 PM Reinette Chatre
<reinette.chatre@intel.com> wrote:
> [...]
> Could you please remind me how a user will set this flag?

Quoting my original suggestion[1]:

"info/L3_MON/mbm_assign_on_mkdir?

boolean (parsed with kstrtobool()), defaulting to true?"

After mount, any groups that got counters on creation would have to be
cleaned up, but at least that can be done with forward progress once
the flag is cleared.

I was able to live with that as long as there aren't users polling for
resctrl to be mounted and immediately creating groups. For us, a
single container manager service manages resctrl.

> You have many points that makes auto-assignment not be ideal but I
> remain concerned that not doing something like this will break
> existing users who are not as familiar with resctrl internals.

I agree auto-assignment should be the default. I just want an official
way to turn it off.

Thanks!
-Peter

[1] https://lore.kernel.org/lkml/CALPaoCiJ9ELXkij-zsAhxC1hx8UUR+KMPJH6i8c8AT6_mtXs+Q@mail.gmail.com/
On Thu, Feb 20, 2025 at 11:35:56AM +0100, Peter Newman wrote:
> Hi Reinette,
>
> [...]
>
> I agree auto-assignment should be the default. I just want an official
> way to turn it off.

+1

That's basically my position -- the auto-assignment feels like a
_potential_ nuisance for ABMC-aware users, but it depends on what they
are trying to do. Migration of non-ABMC-aware users will be easier for
basic use cases if auto-assignment occurs by default (as in this
series).

Having an explicit way to turn this off seems perfectly reasonable
(and could be added later on, if not provided in this series).

What about the question re whether turning mbm_cntr_assign mode on
should trigger auto-assignment?

Currently turning this mode off and then on again has the effect of
removing all automatic assignments for extant groups. This feels
surprising and/or unintentional (?)

Cheers
---Dave
Hi Dave,

On 2/20/25 5:40 AM, Dave Martin wrote:
> [...]
>
> What about the question re whether turning mbm_cntr_assign mode on
> should trigger auto-assignment?
>
> Currently turning this mode off and then on again has the effect of
> removing all automatic assignments for extant groups. This feels
> surprising and/or unintentional (?)

Connecting to what you start off by saying I also see auto-assignment
as the way to provide a smooth transition for "non-ABMC-aware" users.

To me a user that turns this mode off and then on again can be
considered as a user that is "ABMC-aware" and turning it "off and then
on again" seems like an intuitive way to get to a "clean slate" wrt
counter assignments. This may also be a convenient way for an
"ABMC-aware" user space to unassign all counters and thus also helpful
if resctrl supports the flag that Peter proposed.

The flag seems to already keep something like this in its context with
a name of "mbm_assign_on_mkdir" that could be interpreted as "only
auto assign on mkdir"?

I am not taking a stand for one or the other approach but instead
trying to be more specific about pros/cons. Could you please provide
more insight in the use case you have in mind so that we can see how
resctrl could behave with few surprises?

Reinette
Hi,

On Thu, Feb 20, 2025 at 09:08:17AM -0800, Reinette Chatre wrote:
> Hi Dave,
>
> [...]
>
> Connecting to what you start off by saying I also see auto-assignment
> as the way to provide a smooth transition for "non-ABMC-aware" users.

I agree, and having this on by default also helps non-ABMC-aware users.

> To me a user that turns this mode off and then on again can be
> considered as a user that is "ABMC-aware" and turning it "off and then
> on again" seems like an intuitive way to get to a "clean slate"
> wrt counter assignments. This may also be a convenient way for
> an "ABMC-aware" user space to unassign all counters and thus also
> helpful if resctrl supports the flag that Peter proposed. The flag
> seems to already keep something like this in its context with
> a name of "mbm_assign_on_mkdir" that could be interpreted as
> "only auto assign on mkdir"?

Yes, that's reasonable.

It could be a good idea to document this behaviour of switching the
mbm_cntr_assign mode, if we think it is useful and people are likely
to rely on it.

Since mkdir is an implementation detail of the resctrl interface, I'd
be tempted to go for a more generic name, say,
"mbm_assign_new_mon_groups". But that's just bikeshedding. The
proposed behaviour seems fine.

Either way, if this is not included in this series, it could be added
later without breaking anything.

> I am not taking a stand for one or the other approach but instead
> trying to be more specific about pros/cons. Could you please provide
> more insight in the use case you have in mind so that we can see how
> resctrl could behave with few surprises?

I don't have a strong view either. I don't have a concrete use case
here -- I was just trying to imagine the experience of an ABMC-aware
user who wants full control over what counters get assigned.

I agree that the convenience of the non-ABMC-aware user should
probably take priority over that of the ABMC-aware user, at least in
situations where the expected behaviour is achievable (i.e., where we
didn't run out of counters to auto-assign.)

Cheers
---Dave
Hi All,

On 2/21/2025 11:14 AM, Dave Martin wrote:

[...]

> Yes, that's reasonable. It could be a good idea to document this
> behaviour of switching the mbm_cntr_assign mode, if we think it is
> useful and people are likely to rely on it.
>
> Since mkdir is an implementation detail of the resctrl interface, I'd
> be tempted to go for a more generic name, say,
> "mbm_assign_new_mon_groups". But that's just bikeshedding. The
> proposed behaviour seems fine.
>
> Either way, if this is not included in this series, it could be added
> later without breaking anything.

How about the more generic "mbm_cntr_assign_auto"?

We can add it as part of "struct resctrl_mon" and set it "on" when ABMC
is detected. It will be part of the check in rdtgroup_assign_cntrs(),
which is called when new groups are created. Also, provide a user
interface to disable it. Seems simple to me.

Thanks
Babu

[...]
Hi Babu,

On 2/21/25 10:23 AM, Moger, Babu wrote:

[...]

>> Since mkdir is an implementation detail of the resctrl interface, I'd
>> be tempted to go for a more generic name, say,
>> "mbm_assign_new_mon_groups". But that's just bikeshedding.
>> The proposed behaviour seems fine.
>>
>> Either way, if this is not included in this series, it could be added
>> later without breaking anything.
>
> How about more generic "mbm_cntr_assign_auto" ?

I would like to be careful to not make it _too_ generic. Dave already
pointed out that users may be surprised that counters are not
auto-assigned when switching between the different modes, so using the
name to help highlight when this auto-assignment can be expected to
happen seems very useful.

Reinette
Hi Reinette,

On 2/21/2025 4:48 PM, Reinette Chatre wrote:

[...]

>>> Since mkdir is an implementation detail of the resctrl interface, I'd
>>> be tempted to go for a more generic name, say,
>>> "mbm_assign_new_mon_groups". But that's just bikeshedding.
>>> The proposed behaviour seems fine.
>>
>> How about more generic "mbm_cntr_assign_auto" ?
>
> I would like to be careful to not make it _too_ generic. Dave already
> pointed out that users may be surprised that counters are not
> auto-assigned when switching between the different modes, so using the
> name to help highlight when this auto-assignment can be expected to
> happen seems very useful.

In that case "mbm_assign_on_mkdir" seems on point and precise.

Thanks
Babu
Hi Babu,

On Sat, Feb 22, 2025 at 12:43 AM Moger, Babu <bmoger@amd.com> wrote:

[...]

> > I would like to be careful to not make it _too_ generic. Dave already
> > pointed out that users may be surprised that counters are not
> > auto-assigned when switching between the different modes, so using the
> > name to help highlight when this auto-assignment can be expected to
> > happen seems very useful.
>
> In that case "mbm_assign_on_mkdir" seems on point and precise.
> Thanks

It also looks like counters are not assigned when a domain is
hotplugged, so explicitly stating that it's on mkdir gets us off the
hook for that.

-Peter
Hi Babu,
On 1/22/25 12:20 PM, Babu Moger wrote:
> Assign/unassign counters on resctrl group creation/deletion. Two counters
> are required per group, one for MBM total event and one for MBM local
> event.
>
> There are a limited number of counters available for assignment. If these
> counters are exhausted, the kernel will display the error message: "Out of
> MBM assignable counters". However, it is not necessary to fail the
> creation of a group due to assignment failures. Users have the flexibility
> to modify the assignments at a later time.
>
> Signed-off-by: Babu Moger <babu.moger@amd.com>
> ---
> diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
> index b6d188d0f9b7..118b39fbb01e 100644
> --- a/arch/x86/kernel/cpu/resctrl/monitor.c
> +++ b/arch/x86/kernel/cpu/resctrl/monitor.c
> @@ -1557,3 +1557,30 @@ int resctrl_unassign_cntr_event(struct rdt_resource *r, struct rdt_mon_domain *d
>
> return ret;
> }
> +
> +void mbm_cntr_reset(struct rdt_resource *r)
> +{
> + u32 idx_limit = resctrl_arch_system_num_rmid_idx();
> + struct rdt_mon_domain *dom;
> +
> + /*
> + * Reset the domain counter configuration. Hardware counters
> + * will reset after switching the monitor mode. So, reset the
> + * architectural amd non-architectural state so that reading
"amd" -> "and"
> + * of hardware counter is not considered as an overflow in the
> + * next update.
> + */
> + if (is_mbm_enabled() && r->mon.mbm_cntr_assignable) {
> + list_for_each_entry(dom, &r->mon_domains, hdr.list) {
> + memset(dom->cntr_cfg, 0,
> + sizeof(*dom->cntr_cfg) * r->mon.num_mbm_cntrs);
> + if (is_mbm_total_enabled())
> + memset(dom->mbm_total, 0,
> + sizeof(struct mbm_state) * idx_limit);
> + if (is_mbm_local_enabled())
> + memset(dom->mbm_local, 0,
> + sizeof(struct mbm_state) * idx_limit);
> + resctrl_arch_reset_rmid_all(r, dom);
> + }
> + }
> +}
I looked back at the previous versions to better understand how this function
came about and I do not think it actually solves the problem it aims to solve.
rdtgroup_unassign_cntrs() can fail and when it does the counter is not free'd. That
leaves a monitoring domain's array with an entry that points to a resource group
that no longer exists (unless it is the default resource group) since
rdtgroup_unassign_cntrs() does not check the return and proceeds to remove the
resource group. mbm_cntr_reset() is called on umount of resctrl but
rdtgroup_unassign_cntrs() is called on every group remove and those scenarios
are not handled.
To address this I believe that I need to go back on a previous request to have
resctrl_arch_config_cntr() return an error code. AMD does not need this and
it is difficult to predict what will work for MPAM. I originally wanted to be
flexible here but this appears to be impractical. With a new requirement that
resctrl_arch_config_cntr() always succeeds the counter will in turn always
be free'd and not leave dangling pointers. I believe doing so eliminates
the need for mbm_cntr_reset() as used in this patch. My apologies for the
misdirection. We can re-evaluate these flows if MPAM needs anything different.
> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> index 2b86124c336b..f61f0cd032ef 100644
> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> @@ -2687,6 +2687,46 @@ static void schemata_list_destroy(void)
> }
> }
>
> +/*
> + * Called when a new group is created. If "mbm_cntr_assign" mode is enabled,
> + * counters are automatically assigned. Each group can accommodate two counters:
> + * one for the total event and one for the local event. Assignments may fail
> + * due to the limited number of counters. However, it is not necessary to fail
> + * the group creation and thus no failure is returned. Users have the option
> + * to modify the counter assignments after the group has been created.
> + */
> +static void rdtgroup_assign_cntrs(struct rdtgroup *rdtgrp)
> +{
> + struct rdt_resource *r = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
> +
> + if (!resctrl_arch_mbm_cntr_assign_enabled(r))
> + return;
> +
> + if (is_mbm_total_enabled())
> + resctrl_assign_cntr_event(r, NULL, rdtgrp, QOS_L3_MBM_TOTAL_EVENT_ID);
> +
> + if (is_mbm_local_enabled())
> + resctrl_assign_cntr_event(r, NULL, rdtgrp, QOS_L3_MBM_LOCAL_EVENT_ID);
> +}
> +
> +/*
> + * Called when a group is deleted. Counters are unassigned if it was in
> + * assigned state.
> + */
> +static void rdtgroup_unassign_cntrs(struct rdtgroup *rdtgrp)
> +{
> + struct rdt_resource *r = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
> +
> + if (!resctrl_arch_mbm_cntr_assign_enabled(r))
> + return;
> +
It looks to me as though there are a couple of places (rmdir_all_sub(), rdt_kill_sb(),
and rdtgroup_rmdir_ctrl()) where rdtgroup_unassign_cntrs() could be called on a system that
does not support monitoring and/or only supports cache allocation monitoring.
In these paths it is only the architecture's resctrl_arch_mbm_cntr_assign_enabled(r) that
gates the resctrl flow. I think rdtgroup_unassign_cntrs(), and rdtgroup_assign_cntrs() to
match, could do with at least an r->mon_capable check.
> + if (is_mbm_total_enabled())
> + resctrl_unassign_cntr_event(r, NULL, rdtgrp, QOS_L3_MBM_TOTAL_EVENT_ID);
> +
> + if (is_mbm_local_enabled())
> + resctrl_unassign_cntr_event(r, NULL, rdtgrp, QOS_L3_MBM_LOCAL_EVENT_ID);
> +}
> +
> static int rdt_get_tree(struct fs_context *fc)
> {
> struct rdt_fs_context *ctx = rdt_fc2context(fc);
> @@ -2741,6 +2781,8 @@ static int rdt_get_tree(struct fs_context *fc)
> if (ret < 0)
> goto out_info;
>
> + rdtgroup_assign_cntrs(&rdtgroup_default);
> +
> ret = mkdir_mondata_all(rdtgroup_default.kn,
> &rdtgroup_default, &kn_mondata);
> if (ret < 0)
> @@ -2779,8 +2821,10 @@ static int rdt_get_tree(struct fs_context *fc)
> if (resctrl_arch_mon_capable())
> kernfs_remove(kn_mondata);
> out_mongrp:
> - if (resctrl_arch_mon_capable())
> + if (resctrl_arch_mon_capable()) {
> + rdtgroup_unassign_cntrs(&rdtgroup_default);
> kernfs_remove(kn_mongrp);
> + }
> out_info:
> kernfs_remove(kn_info);
> out_schemata_free:
> @@ -2956,6 +3000,7 @@ static void free_all_child_rdtgrp(struct rdtgroup *rdtgrp)
>
> head = &rdtgrp->mon.crdtgrp_list;
> list_for_each_entry_safe(sentry, stmp, head, mon.crdtgrp_list) {
> + rdtgroup_unassign_cntrs(sentry);
> free_rmid(sentry->closid, sentry->mon.rmid);
> list_del(&sentry->mon.crdtgrp_list);
>
> @@ -2996,6 +3041,8 @@ static void rmdir_all_sub(void)
> cpumask_or(&rdtgroup_default.cpu_mask,
> &rdtgroup_default.cpu_mask, &rdtgrp->cpu_mask);
>
> + rdtgroup_unassign_cntrs(rdtgrp);
> +
> free_rmid(rdtgrp->closid, rdtgrp->mon.rmid);
>
> kernfs_remove(rdtgrp->kn);
> @@ -3027,6 +3074,8 @@ static void rdt_kill_sb(struct super_block *sb)
> for_each_alloc_capable_rdt_resource(r)
> reset_all_ctrls(r);
> rmdir_all_sub();
> + rdtgroup_unassign_cntrs(&rdtgroup_default);
> + mbm_cntr_reset(&rdt_resources_all[RDT_RESOURCE_L3].r_resctrl);
> rdt_pseudo_lock_release();
> rdtgroup_default.mode = RDT_MODE_SHAREABLE;
> schemata_list_destroy();
> @@ -3490,9 +3539,12 @@ static int mkdir_rdt_prepare_rmid_alloc(struct rdtgroup *rdtgrp)
> }
> rdtgrp->mon.rmid = ret;
>
> + rdtgroup_assign_cntrs(rdtgrp);
> +
> ret = mkdir_mondata_all(rdtgrp->kn, rdtgrp, &rdtgrp->mon.mon_data_kn);
> if (ret) {
> rdt_last_cmd_puts("kernfs subdir error\n");
> + rdtgroup_unassign_cntrs(rdtgrp);
> free_rmid(rdtgrp->closid, rdtgrp->mon.rmid);
> return ret;
> }
> @@ -3502,8 +3554,10 @@ static int mkdir_rdt_prepare_rmid_alloc(struct rdtgroup *rdtgrp)
>
> static void mkdir_rdt_prepare_rmid_free(struct rdtgroup *rgrp)
> {
> - if (resctrl_arch_mon_capable())
> + if (resctrl_arch_mon_capable()) {
> + rdtgroup_unassign_cntrs(rgrp);
> free_rmid(rgrp->closid, rgrp->mon.rmid);
> + }
> }
>
> static int mkdir_rdt_prepare(struct kernfs_node *parent_kn,
> @@ -3764,6 +3818,9 @@ static int rdtgroup_rmdir_mon(struct rdtgroup *rdtgrp, cpumask_var_t tmpmask)
> update_closid_rmid(tmpmask, NULL);
>
> rdtgrp->flags = RDT_DELETED;
> +
> + rdtgroup_unassign_cntrs(rdtgrp);
> +
> free_rmid(rdtgrp->closid, rdtgrp->mon.rmid);
>
> /*
> @@ -3810,6 +3867,8 @@ static int rdtgroup_rmdir_ctrl(struct rdtgroup *rdtgrp, cpumask_var_t tmpmask)
> cpumask_or(tmpmask, tmpmask, &rdtgrp->cpu_mask);
> update_closid_rmid(tmpmask, NULL);
>
> + rdtgroup_unassign_cntrs(rdtgrp);
> +
> free_rmid(rdtgrp->closid, rdtgrp->mon.rmid);
> closid_free(rdtgrp->closid);
>
Reinette
Hi Reinette,
On 2/6/25 12:03, Reinette Chatre wrote:
> Hi Babu,
>
> On 1/22/25 12:20 PM, Babu Moger wrote:
>> Assign/unassign counters on resctrl group creation/deletion. Two counters
>> are required per group, one for MBM total event and one for MBM local
>> event.
>>
>> There are a limited number of counters available for assignment. If these
>> counters are exhausted, the kernel will display the error message: "Out of
>> MBM assignable counters". However, it is not necessary to fail the
>> creation of a group due to assignment failures. Users have the flexibility
>> to modify the assignments at a later time.
>>
>> Signed-off-by: Babu Moger <babu.moger@amd.com>
>> ---
>
>> diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
>> index b6d188d0f9b7..118b39fbb01e 100644
>> --- a/arch/x86/kernel/cpu/resctrl/monitor.c
>> +++ b/arch/x86/kernel/cpu/resctrl/monitor.c
>> @@ -1557,3 +1557,30 @@ int resctrl_unassign_cntr_event(struct rdt_resource *r, struct rdt_mon_domain *d
>>
>> return ret;
>> }
>> +
>> +void mbm_cntr_reset(struct rdt_resource *r)
>> +{
>> + u32 idx_limit = resctrl_arch_system_num_rmid_idx();
>> + struct rdt_mon_domain *dom;
>> +
>> + /*
>> + * Reset the domain counter configuration. Hardware counters
>> + * will reset after switching the monitor mode. So, reset the
>> + * architectural amd non-architectural state so that reading
>
> "amd" -> "and"
Sure.
>
>> + * of hardware counter is not considered as an overflow in the
>> + * next update.
>> + */
>> + if (is_mbm_enabled() && r->mon.mbm_cntr_assignable) {
>> + list_for_each_entry(dom, &r->mon_domains, hdr.list) {
>> + memset(dom->cntr_cfg, 0,
>> + sizeof(*dom->cntr_cfg) * r->mon.num_mbm_cntrs);
>> + if (is_mbm_total_enabled())
>> + memset(dom->mbm_total, 0,
>> + sizeof(struct mbm_state) * idx_limit);
>> + if (is_mbm_local_enabled())
>> + memset(dom->mbm_local, 0,
>> + sizeof(struct mbm_state) * idx_limit);
>> + resctrl_arch_reset_rmid_all(r, dom);
>> + }
>> + }
>> +}
>
> I looked back at the previous versions to better understand how this function
> came about and I do not think it actually solves the problem it aims to solve.
>
> rdtgroup_unassign_cntrs() can fail and when it does the counter is not free'd. That
> leaves a monitoring domain's array with an entry that points to a resource group
> that no longer exists (unless it is the default resource group) since
> rdtgroup_unassign_cntrs() does not check the return and proceeds to remove the
> resource group. mbm_cntr_reset() is called on umount of resctrl but
> rdtgroup_unassign_cntrs() is called on every group remove and those scenarios
> are not handled.
>
> To address this I believe that I need to go back on a previous request to have
> resctrl_arch_config_cntr() return an error code. AMD does not need this and
> it is difficult to predict what will work for MPAM. I originally wanted to be
> flexible here but this appears to be impractical. With a new requirement that
> resctrl_arch_config_cntr() always succeeds the counter will in turn always
> be free'd and not leave dangling pointers. I believe doing so eliminates
> the need for mbm_cntr_reset() as used in this patch. My apologies for the
> misdirection. We can re-evaluate these flows if MPAM needs anything different.
So, the new requirement is to free the counter even if the
resctrl_arch_config_cntr() call fails. That way, after calling
rdtgroup_unassign_cntrs(), the counter is freed and the state is clean,
so we don't need to call mbm_cntr_reset() at the end to clear all the
entries.

Here is the call sequence:

rdtgroup_unassign_cntrs() -> resctrl_unassign_cntr_event() ->
resctrl_free_config_cntr() -> resctrl_config_cntr() ->
resctrl_arch_config_cntr().

So, the only change here is:
/*
 * Unassign and free the counter if assigned, else return success.
 */
static int resctrl_free_config_cntr(struct rdt_resource *r,
				    struct rdt_mon_domain *d,
				    struct rdtgroup *rdtgrp,
				    enum resctrl_event_id evtid)
{
	int cntr_id, ret = 0;

	cntr_id = mbm_cntr_get(r, d, rdtgrp, evtid);
	if (cntr_id < 0)
		return ret;

	/* Unassign and free the counter */
	ret = resctrl_config_cntr(r, d, evtid, rdtgrp->mon.rmid,
				  rdtgrp->closid, cntr_id, false);
	mbm_cntr_free(d, cntr_id);

	return ret;
}
>
>> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> index 2b86124c336b..f61f0cd032ef 100644
>> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> @@ -2687,6 +2687,46 @@ static void schemata_list_destroy(void)
>> }
>> }
>>
>> +/*
>> + * Called when a new group is created. If "mbm_cntr_assign" mode is enabled,
>> + * counters are automatically assigned. Each group can accommodate two counters:
>> + * one for the total event and one for the local event. Assignments may fail
>> + * due to the limited number of counters. However, it is not necessary to fail
>> + * the group creation and thus no failure is returned. Users have the option
>> + * to modify the counter assignments after the group has been created.
>> + */
>> +static void rdtgroup_assign_cntrs(struct rdtgroup *rdtgrp)
>> +{
>> + struct rdt_resource *r = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
>> +
>> + if (!resctrl_arch_mbm_cntr_assign_enabled(r))
>> + return;
>> +
>> + if (is_mbm_total_enabled())
>> + resctrl_assign_cntr_event(r, NULL, rdtgrp, QOS_L3_MBM_TOTAL_EVENT_ID);
>> +
>> + if (is_mbm_local_enabled())
>> + resctrl_assign_cntr_event(r, NULL, rdtgrp, QOS_L3_MBM_LOCAL_EVENT_ID);
>> +}
>> +
>> +/*
>> + * Called when a group is deleted. Counters are unassigned if it was in
>> + * assigned state.
>> + */
>> +static void rdtgroup_unassign_cntrs(struct rdtgroup *rdtgrp)
>> +{
>> + struct rdt_resource *r = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
>> +
>> + if (!resctrl_arch_mbm_cntr_assign_enabled(r))
>> + return;
>> +
>
> It looks to me as though there are a couple of places (rmdir_all_sub(), rdt_kill_sb(),
> and rdtgroup_rmdir_ctrl()) where rdtgroup_unassign_cntrs() could be called on a system that
> does not support monitoring and/or only supports cache allocation monitoring.
>
> In these paths it is only the architecture's resctrl_arch_mbm_cntr_assign_enabled(r) that
> gates the resctrl flow. I think rdtgroup_unassign_cntrs(), and rdtgroup_assign_cntrs() to
> match, could do with at least an r->mon_capable check.
Ok. Will add the following check:
if (!r->mon_capable || !resctrl_arch_mbm_cntr_assign_enabled(r))
return;
>
>> + if (is_mbm_total_enabled())
>> + resctrl_unassign_cntr_event(r, NULL, rdtgrp, QOS_L3_MBM_TOTAL_EVENT_ID);
>> +
>> + if (is_mbm_local_enabled())
>> + resctrl_unassign_cntr_event(r, NULL, rdtgrp, QOS_L3_MBM_LOCAL_EVENT_ID);
>> +}
>> +
>> static int rdt_get_tree(struct fs_context *fc)
>> {
>> struct rdt_fs_context *ctx = rdt_fc2context(fc);
>> @@ -2741,6 +2781,8 @@ static int rdt_get_tree(struct fs_context *fc)
>> if (ret < 0)
>> goto out_info;
>>
>> + rdtgroup_assign_cntrs(&rdtgroup_default);
>> +
>> ret = mkdir_mondata_all(rdtgroup_default.kn,
>> &rdtgroup_default, &kn_mondata);
>> if (ret < 0)
>> @@ -2779,8 +2821,10 @@ static int rdt_get_tree(struct fs_context *fc)
>> if (resctrl_arch_mon_capable())
>> kernfs_remove(kn_mondata);
>> out_mongrp:
>> - if (resctrl_arch_mon_capable())
>> + if (resctrl_arch_mon_capable()) {
>> + rdtgroup_unassign_cntrs(&rdtgroup_default);
>> kernfs_remove(kn_mongrp);
>> + }
>> out_info:
>> kernfs_remove(kn_info);
>> out_schemata_free:
>> @@ -2956,6 +3000,7 @@ static void free_all_child_rdtgrp(struct rdtgroup *rdtgrp)
>>
>> head = &rdtgrp->mon.crdtgrp_list;
>> list_for_each_entry_safe(sentry, stmp, head, mon.crdtgrp_list) {
>> + rdtgroup_unassign_cntrs(sentry);
>> free_rmid(sentry->closid, sentry->mon.rmid);
>> list_del(&sentry->mon.crdtgrp_list);
>>
>> @@ -2996,6 +3041,8 @@ static void rmdir_all_sub(void)
>> cpumask_or(&rdtgroup_default.cpu_mask,
>> &rdtgroup_default.cpu_mask, &rdtgrp->cpu_mask);
>>
>> + rdtgroup_unassign_cntrs(rdtgrp);
>> +
>> free_rmid(rdtgrp->closid, rdtgrp->mon.rmid);
>>
>> kernfs_remove(rdtgrp->kn);
>> @@ -3027,6 +3074,8 @@ static void rdt_kill_sb(struct super_block *sb)
>> for_each_alloc_capable_rdt_resource(r)
>> reset_all_ctrls(r);
>> rmdir_all_sub();
>> + rdtgroup_unassign_cntrs(&rdtgroup_default);
>> + mbm_cntr_reset(&rdt_resources_all[RDT_RESOURCE_L3].r_resctrl);
>> rdt_pseudo_lock_release();
>> rdtgroup_default.mode = RDT_MODE_SHAREABLE;
>> schemata_list_destroy();
>> @@ -3490,9 +3539,12 @@ static int mkdir_rdt_prepare_rmid_alloc(struct rdtgroup *rdtgrp)
>> }
>> rdtgrp->mon.rmid = ret;
>>
>> + rdtgroup_assign_cntrs(rdtgrp);
>> +
>> ret = mkdir_mondata_all(rdtgrp->kn, rdtgrp, &rdtgrp->mon.mon_data_kn);
>> if (ret) {
>> rdt_last_cmd_puts("kernfs subdir error\n");
>> + rdtgroup_unassign_cntrs(rdtgrp);
>> free_rmid(rdtgrp->closid, rdtgrp->mon.rmid);
>> return ret;
>> }
>> @@ -3502,8 +3554,10 @@ static int mkdir_rdt_prepare_rmid_alloc(struct rdtgroup *rdtgrp)
>>
>> static void mkdir_rdt_prepare_rmid_free(struct rdtgroup *rgrp)
>> {
>> - if (resctrl_arch_mon_capable())
>> + if (resctrl_arch_mon_capable()) {
>> + rdtgroup_unassign_cntrs(rgrp);
>> free_rmid(rgrp->closid, rgrp->mon.rmid);
>> + }
>> }
>>
>> static int mkdir_rdt_prepare(struct kernfs_node *parent_kn,
>> @@ -3764,6 +3818,9 @@ static int rdtgroup_rmdir_mon(struct rdtgroup *rdtgrp, cpumask_var_t tmpmask)
>> update_closid_rmid(tmpmask, NULL);
>>
>> rdtgrp->flags = RDT_DELETED;
>> +
>> + rdtgroup_unassign_cntrs(rdtgrp);
>> +
>> free_rmid(rdtgrp->closid, rdtgrp->mon.rmid);
>>
>> /*
>> @@ -3810,6 +3867,8 @@ static int rdtgroup_rmdir_ctrl(struct rdtgroup *rdtgrp, cpumask_var_t tmpmask)
>> cpumask_or(tmpmask, tmpmask, &rdtgrp->cpu_mask);
>> update_closid_rmid(tmpmask, NULL);
>>
>> + rdtgroup_unassign_cntrs(rdtgrp);
>> +
>> free_rmid(rdtgrp->closid, rdtgrp->mon.rmid);
>> closid_free(rdtgrp->closid);
>>
>
> Reinette
>
--
Thanks
Babu Moger
Hi Babu,
On 2/10/25 9:27 AM, Moger, Babu wrote:
> On 2/6/25 12:03, Reinette Chatre wrote:
>> On 1/22/25 12:20 PM, Babu Moger wrote:
>>
>>> + * of hardware counter is not considered as an overflow in the
>>> + * next update.
>>> + */
>>> + if (is_mbm_enabled() && r->mon.mbm_cntr_assignable) {
>>> + list_for_each_entry(dom, &r->mon_domains, hdr.list) {
>>> + memset(dom->cntr_cfg, 0,
>>> + sizeof(*dom->cntr_cfg) * r->mon.num_mbm_cntrs);
>>> + if (is_mbm_total_enabled())
>>> + memset(dom->mbm_total, 0,
>>> + sizeof(struct mbm_state) * idx_limit);
>>> + if (is_mbm_local_enabled())
>>> + memset(dom->mbm_local, 0,
>>> + sizeof(struct mbm_state) * idx_limit);
>>> + resctrl_arch_reset_rmid_all(r, dom);
>>> + }
>>> + }
>>> +}
>>
>> I looked back at the previous versions to better understand how this function
>> came about and I do not think it actually solves the problem it aims to solve.
>>
>> rdtgroup_unassign_cntrs() can fail and when it does the counter is not free'd. That
>> leaves a monitoring domain's array with an entry that points to a resource group
>> that no longer exists (unless it is the default resource group) since
>> rdtgroup_unassign_cntrs() does not check the return and proceeds to remove the
>> resource group. mbm_cntr_reset() is called on umount of resctrl but
>> rdtgroup_unassign_cntrs() is called on every group remove and those scenarios
>> are not handled.
>>
>> To address this I believe that I need to go back on a previous request to have
>> resctrl_arch_config_cntr() return an error code. AMD does not need this and
>> it is difficult to predict what will work for MPAM. I originally wanted to be
>> flexible here but this appears to be impractical. With a new requirement that
>> resctrl_arch_config_cntr() always succeeds the counter will in turn always
>> be free'd and not leave dangling pointers. I believe doing so eliminates
>> the need for mbm_cntr_reset() as used in this patch. My apologies for the
>> misdirection. We can re-evaluate these flows if MPAM needs anything different.
>
> So, the new requirement is to free the counter even if the
> resctrl_arch_config_cntr() call fails. That way after calling
No. Quoting above: "new requirement that resctrl_arch_config_cntr() always succeeds".
As I see it this will eliminate a lot of error checking on the calling path,
not ignore errors.
Reinette