The mbm_cntr_assign mode offers several hardware counters that can be
assigned to an RMID, event pair to monitor the bandwidth as long as the
counter remains assigned.
Counters are managed at two levels. The global assignment is tracked
using the mbm_cntr_free_map field in the struct resctrl_mon, while
domain-specific assignments are tracked using the mbm_cntr_map field
in the struct rdt_mon_domain. Allocation begins at the global level
and is then applied individually to each domain.
Introduce an interface to allocate these counters and update the
corresponding domains accordingly.
Signed-off-by: Babu Moger <babu.moger@amd.com>
---
v9: Introduced new function resctrl_config_cntr() to assign the counter,
update the bitmap and reset the architectural state.
Took care of error handling (freeing the counter) when assignment fails.
Moved mbm_cntr_assigned_to_domain() here as it is used in this patch.
Minor text changes.
v8: Renamed rdtgroup_assign_cntr() to rdtgroup_assign_cntr_event().
Added the code to return the error if rdtgroup_assign_cntr_event fails.
Moved definition of MBM_EVENT_ARRAY_INDEX to resctrl/internal.h.
Updated typo in the comments.
v7: New patch. Moved all the FS code here.
Merged rdtgroup_assign_cntr and rdtgroup_alloc_cntr.
Added new #define MBM_EVENT_ARRAY_INDEX.
---
arch/x86/kernel/cpu/resctrl/internal.h | 2 +
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 87 ++++++++++++++++++++++++++
2 files changed, 89 insertions(+)
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index 00f7bf60e16a..cb496bd97007 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -717,6 +717,8 @@ unsigned int mon_event_config_index_get(u32 evtid);
 int resctrl_arch_config_cntr(struct rdt_resource *r, struct rdt_mon_domain *d,
			      enum resctrl_event_id evtid, u32 rmid, u32 closid,
			      u32 cntr_id, bool assign);
+int rdtgroup_assign_cntr_event(struct rdt_resource *r, struct rdtgroup *rdtgrp,
+			       struct rdt_mon_domain *d, enum resctrl_event_id evtid);
void rdt_staged_configs_clear(void);
bool closid_allocated(unsigned int closid);
int resctrl_find_cleanest_closid(void);
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 1b5529c212f5..bc3752967c44 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -1924,6 +1924,93 @@ int resctrl_arch_config_cntr(struct rdt_resource *r, struct rdt_mon_domain *d,
return 0;
}
+/*
+ * Configure the counter for the (event, RMID) pair in the given domain.
+ * Update the domain's counter bitmap and reset the architectural state.
+ */
+static int resctrl_config_cntr(struct rdt_resource *r, struct rdt_mon_domain *d,
+			       enum resctrl_event_id evtid, u32 rmid, u32 closid,
+			       u32 cntr_id, bool assign)
+{
+	int ret;
+
+	ret = resctrl_arch_config_cntr(r, d, evtid, rmid, closid, cntr_id, assign);
+	if (ret)
+		return ret;
+
+	if (assign)
+		__set_bit(cntr_id, d->mbm_cntr_map);
+	else
+		__clear_bit(cntr_id, d->mbm_cntr_map);
+
+	/*
+	 * Reset the architectural state so that reading of hardware
+	 * counter is not considered as an overflow in next update.
+	 */
+	resctrl_arch_reset_rmid(r, d, closid, rmid, evtid);
+
+	return ret;
+}
+
+static bool mbm_cntr_assigned_to_domain(struct rdt_resource *r, u32 cntr_id)
+{
+	struct rdt_mon_domain *d;
+
+	list_for_each_entry(d, &r->mon_domains, hdr.list)
+		if (test_bit(cntr_id, d->mbm_cntr_map))
+			return true;
+
+	return false;
+}
+
+/*
+ * Assign a hardware counter to event @evtid of group @rdtgrp.
+ * The counter is assigned to all monitoring domains if @d is NULL;
+ * otherwise it is assigned to the specified domain only.
+ */
+int rdtgroup_assign_cntr_event(struct rdt_resource *r, struct rdtgroup *rdtgrp,
+ struct rdt_mon_domain *d, enum resctrl_event_id evtid)
+{
+	int index = MBM_EVENT_ARRAY_INDEX(evtid);
+	int cntr_id = rdtgrp->mon.cntr_id[index];
+	int ret = 0;
+
+	/*
+	 * Allocate a new counter id for the event if one is not
+	 * already assigned.
+	 */
+	if (cntr_id == MON_CNTR_UNSET) {
+		cntr_id = mbm_cntr_alloc(r);
+		if (cntr_id < 0) {
+			rdt_last_cmd_puts("Out of MBM assignable counters\n");
+			return -ENOSPC;
+		}
+		rdtgrp->mon.cntr_id[index] = cntr_id;
+	}
+
+	if (!d) {
+		list_for_each_entry(d, &r->mon_domains, hdr.list) {
+			ret = resctrl_config_cntr(r, d, evtid, rdtgrp->mon.rmid,
+						  rdtgrp->closid, cntr_id, true);
+			if (ret)
+				goto out_done_assign;
+		}
+	} else {
+		ret = resctrl_config_cntr(r, d, evtid, rdtgrp->mon.rmid,
+					  rdtgrp->closid, cntr_id, true);
+		if (ret)
+			goto out_done_assign;
+	}
+
+out_done_assign:
+	if (ret && !mbm_cntr_assigned_to_domain(r, cntr_id)) {
+		mbm_cntr_free(r, cntr_id);
+		rdtgroup_cntr_id_init(rdtgrp, evtid);
+	}
+
+	return ret;
+}
+
/* rdtgroup information files for one cache resource. */
static struct rftype res_common_files[] = {
{
--
2.34.1
Hi Babu,

On 10/29/24 4:21 PM, Babu Moger wrote:
> The mbm_cntr_assign mode offers several hardware counters that can be
> assigned to an RMID, event pair and monitor the bandwidth as long as it
> is assigned.

...

> +	/*
> +	 * Reset the architectural state so that reading of hardware
> +	 * counter is not considered as an overflow in next update.
> +	 */
> +	resctrl_arch_reset_rmid(r, d, closid, rmid, evtid);

resctrl_arch_reset_rmid() expects to be run on a CPU that is in the domain
@d ... note that after the architectural state is reset it initializes the
state by reading the event on the current CPU. By running it here it is
run on a random CPU that may not be in the right domain.

...

> +	if (!d) {
> +		list_for_each_entry(d, &r->mon_domains, hdr.list) {
> +			ret = resctrl_config_cntr(r, d, evtid, rdtgrp->mon.rmid,
> +						  rdtgrp->closid, cntr_id, true);
> +			if (ret)
> +				goto out_done_assign;

This may not be what users expect. What if, for example, domain #1 has a
counter assigned to "total" event and then user wants to change that to
assign a counter to "total" event of all domains. Would this not
reconfigure the counter associated with domain #1 and unnecessarily reset
it? Could this be made a bit smarter to only configure a counter on a
domain if it is not already configured? This could perhaps form part of
resctrl_config_cntr() to not scatter the duplicate check everywhere. What
do you think?

Also, looks like this can do partial assignment. For example, if one of
the domains encounter a failure then domains already configured are not
undone. This matches other similar flows but is not documented and left
to reader to decipher.

> +	} else {
> +		ret = resctrl_config_cntr(r, d, evtid, rdtgrp->mon.rmid,
> +					  rdtgrp->closid, cntr_id, true);

Looking at flows calling rdtgroup_assign_cntr_event() I do not see a
check if counter is already assigned. So, if a user makes a loop of
assigning a counter to the same event over and over it will result in an
IPI every time. This seems unnecessary, what do you think?

Reinette
Hi Reinette,

On 11/15/24 18:57, Reinette Chatre wrote:
...
>> +	/*
>> +	 * Reset the architectural state so that reading of hardware
>> +	 * counter is not considered as an overflow in next update.
>> +	 */
>> +	resctrl_arch_reset_rmid(r, d, closid, rmid, evtid);
>
> resctrl_arch_reset_rmid() expects to be run on a CPU that is in the domain
> @d ... note that after the architectural state is reset it initializes the
> state by reading the event on the current CPU. By running it here it is
> run on a random CPU that may not be in the right domain.

Yes. That is correct. We can move this part to our earlier
implementation. We don't need to read the RMID. We just have to reset the
counter.

https://lore.kernel.org/lkml/16d88cc4091cef1999b7ec329364e12dd0dc748d.1728495588.git.babu.moger@amd.com/

diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 9fe419d0c536..bc3654ec3a08 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -2371,6 +2371,13 @@ int resctrl_arch_config_cntr(struct rdt_resource *r, struct rdt_mon_domain *d,
 	smp_call_function_any(&d->hdr.cpu_mask, resctrl_abmc_config_one_amd,
 			      &abmc_cfg, 1);

+	/*
+	 * Reset the architectural state so that reading of hardware
+	 * counter is not considered as an overflow in next update.
+	 */
+	if (arch_mbm)
+		memset(arch_mbm, 0, sizeof(struct arch_mbm_state));
+
 	return 0;
 }

> This may not be what users expect. What if, for example, domain #1 has a counter
> assigned to "total" event and then user wants to change that to
> assign a counter to "total" event of all domains. Would this not reconfigure the
> counter associated with domain #1 and unnecessarily reset it? Could this be
> made a bit smarter to only configure a counter on a domain if it is not already
> configured? This could perhaps form part of resctrl_config_cntr() to not scatter
> the duplicate check everywhere. What do you think?

Yes, that is correct. We can add a check in resctrl_config_cntr().

diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 9fe419d0c536..bc3654ec3a08 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -2384,6 +2391,10 @@ static int resctrl_config_cntr(struct rdt_resource *r, struct rdt_mon_domain *d,
 {
 	int ret;

+	/* Return success if the domain is in expected assign state already */
+	if (assign == test_bit(cntr_id, d->mbm_cntr_map))
+		return 0;
+
 	ret = resctrl_arch_config_cntr(r, d, evtid, rmid, closid, cntr_id, assign);
 	if (ret)
 		return ret;

> Also, looks like this can do partial assignment. For example, if one of the
> domains encounter a failure then domains already configured are not undone. This
> matches other similar flows but is not documented and left to reader to decipher.

I will add the text in patch 26 (x86/resctrl: Introduce interface to
modify assignment states of the groups).

> Looking at flows calling rdtgroup_assign_cntr_event() I do not see a check
> if counter is already assigned. So, if a user makes a loop of assigning a counter
> to the same event over and over it will result in an IPI every time. This seems
> unnecessary, what do you think?

This will be taken care by the above check in resctrl_config_cntr().

--
Thanks
Babu Moger
Hi Babu,

On 11/20/24 10:05 AM, Moger, Babu wrote:
...
> Yes. That is correct. We can move this part to our earlier
> implementation. We dont need to read the RMID. We just have to reset the
> counter.
>
> https://lore.kernel.org/lkml/16d88cc4091cef1999b7ec329364e12dd0dc748d.1728495588.git.babu.moger@amd.com/
>
> +	/*
> +	 * Reset the architectural state so that reading of hardware
> +	 * counter is not considered as an overflow in next update.
> +	 */
> +	if (arch_mbm)
> +		memset(arch_mbm, 0, sizeof(struct arch_mbm_state));

I am not sure what you envision here. One motivation for the move out of
resctrl_arch_config_cntr() was to avoid architectural state being reset
twice. For reference, mbm_config_write_domain()->resctrl_arch_reset_rmid_all().
Will architectural state be reset twice again?

One thing that I did not notice before is that the non-architectural MBM
state is not reset. Care should be taken to reset this also when
considering that there is a plan to use that MBM state to build a generic
rate event for all platforms:
https://lore.kernel.org/all/CALPaoCgFRFgQqG00Uc0GhMHK47bsbtFw6Bxy5O9A_HeYmGa5sA@mail.gmail.com/

Reinette
Hi Reinette,

On 11/21/2024 2:50 PM, Reinette Chatre wrote:
...
> I am not sure what you envision here. One motivation for the move out of
> resctrl_arch_config_cntr() was to avoid architectural state being reset
> twice. For reference, mbm_config_write_domain()->resctrl_arch_reset_rmid_all().
> Will architectural state be reset twice again?

That is a good point. We don't have to do it twice. We can move the whole
reset(arch_mbm) in resctrl_arch_config_cntr().

> One thing that I did not notice before is that the non-architectural MBM
> state is not reset. Care should be taken to reset this also when
> considering that there is a plan to use that MBM state to build a generic
> rate event for all platforms:
> https://lore.kernel.org/all/CALPaoCgFRFgQqG00Uc0GhMHK47bsbtFw6Bxy5O9A_HeYmGa5sA@mail.gmail.com/

Did you mean we should add the following code in resctrl_arch_config_cntr()?

	m = get_mbm_state(d, closid, rmid, evtid);
	if (m)
		memset(m, 0, sizeof(struct mbm_state));

--
Babu Moger
Hi Babu,

On 11/22/24 1:04 PM, Moger, Babu wrote:
...
> That is a good point. We don't have to do it twice. We can move the whole
> reset(arch_mbm) in resctrl_arch_config_cntr().

This is not clear to me. The architectural state needs to be reset on MBM
config write even when assignable mode is not supported and/or enabled.
Moving it to resctrl_arch_config_cntr() will break this, no?

I wonder if it may not simplify things to call resctrl_arch_reset_rmid()
from resctrl_abmc_config_one_amd()?

> Did you mean we should add the following code in resctrl_arch_config_cntr()?
>
> 	m = get_mbm_state(d, closid, rmid, evtid);
> 	if (m)
> 		memset(m, 0, sizeof(struct mbm_state));

This is not arch code but instead resctrl fs, so resctrl_config_cntr()
may be more appropriate?

Reinette
Hi Reinette,

On 11/22/2024 4:07 PM, Reinette Chatre wrote:
> Hi Babu,
>
> On 11/22/24 1:04 PM, Moger, Babu wrote:

...

>> That is good point. We don't have to do it twice.
>>
>> We can move the whole reset(arch_mbm) in resctrl_arch_config_cntr().
>
> This is not clear to me. The architectural state needs to be reset on MBM
> config write even when assignable mode is not supported and/or enabled.
> Moving it to resctrl_arch_config_cntr() will break this, no?

Yes. The architectural state needs to be reset on MBM config write whether
ABMC is enabled or not.

> I wonder if it may not simplify things to call resctrl_arch_reset_rmid()
> from resctrl_abmc_config_one_amd()?

Yes. That is an option. I can try.

>>> One thing that I did not notice before is that the non-architectural MBM
>>> state is not reset. Care should be taken to reset this also when
>>> considering that there is a plan to use that MBM state to build a generic
>>> rate event for all platforms:
>>> https://lore.kernel.org/all/CALPaoCgFRFgQqG00Uc0GhMHK47bsbtFw6Bxy5O9A_HeYmGa5sA@mail.gmail.com/
>>
>> Did you mean we should add the following code in resctrl_arch_config_cntr()?
>>
>> m = get_mbm_state(d, closid, rmid, evtid);
>> if (m)
>> 	memset(m, 0, sizeof(struct mbm_state));
>
> This is not arch code but instead resctrl fs, so resctrl_config_cntr() may
> be more appropriate?

Sure. We can do that.

> Reinette

--
- Babu Moger