The ABMC feature provides an option to the user to assign a hardware
counter to an RMID, event pair and monitor the bandwidth as long as it
is assigned. The assigned RMID will be tracked by the hardware until the
user unassigns it manually.
Implement an architecture-specific handler to assign and unassign the
counter. Configure counters by writing to the L3_QOS_ABMC_CFG MSR,
specifying the counter ID, bandwidth source (RMID), and event
configuration.
The feature details are documented in the APM listed below [1].
[1] AMD64 Architecture Programmer's Manual Volume 2: System Programming
Publication # 24593 Revision 3.41 section 19.3.3.3 Assignable Bandwidth
Monitoring (ABMC).
Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
Signed-off-by: Babu Moger <babu.moger@amd.com>
---
v13: Moved resctrl_arch_config_cntr() prototype to include/linux/resctrl.h.
Changed resctrl_arch_config_cntr() to retun void from int.
Updated the kernal doc for the prototype.
Updated the code comment.
12: Added the check to reset the architecture-specific state only when
assign is requested.
Added evt_cfg as the parameter as the user will be passing the event
configuration from /info/L3_MON/event_configs/.
v11: Moved resctrl_arch_assign_cntr() and resctrl_abmc_config_one_amd() to
monitor.c.
Added the code to reset the arch state in resctrl_arch_assign_cntr().
Also removed resctrl_arch_reset_rmid() inside IPI as the counters are
reset from the callers.
Re-wrote commit message.
v10: Added call resctrl_arch_reset_rmid() to reset the RMID in the domain
inside IPI call.
SMP and non-SMP call support is not required in resctrl_arch_config_cntr
with new domain specific assign approach/data structure.
Commit message update.
v9: Removed the code to reset the architectural state. It will done
in another patch.
v8: Rename resctrl_arch_assign_cntr to resctrl_arch_config_cntr.
v7: Separated arch and fs functions. This patch only has arch implementation.
Added struct rdt_resource to the interface resctrl_arch_assign_cntr.
Rename rdtgroup_abmc_cfg() to resctrl_abmc_config_one_amd().
v6: Removed mbm_cntr_alloc() from this patch to keep fs and arch code
separate.
Added code to update the counter assignment at domain level.
v5: Few name changes to match cntr_id.
Changed the function names to
rdtgroup_assign_cntr
resctr_arch_assign_cntr
More comments on commit log.
Added function summary.
v4: Commit message update.
User bitmap APIs where applicable.
Changed the interfaces considering MPAM(arm).
Added domain specific assignment.
v3: Removed the static from the prototype of rdtgroup_assign_abmc.
The function is not called directly from user anymore. These
---
arch/x86/kernel/cpu/resctrl/monitor.c | 37 +++++++++++++++++++++++++++
include/linux/resctrl.h | 17 ++++++++++++
2 files changed, 54 insertions(+)
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index ff4b2abfa044..e31084f7babd 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -448,3 +448,40 @@ inline bool resctrl_arch_mbm_cntr_assign_enabled(struct rdt_resource *r)
{
return resctrl_to_arch_res(r)->mbm_cntr_assign_enabled;
}
+
+static void resctrl_abmc_config_one_amd(void *info)
+{
+ union l3_qos_abmc_cfg *abmc_cfg = info;
+
+ wrmsrl(MSR_IA32_L3_QOS_ABMC_CFG, abmc_cfg->full);
+}
+
+/*
+ * Send an IPI to the domain to assign the counter to RMID, event pair.
+ */
+void resctrl_arch_config_cntr(struct rdt_resource *r, struct rdt_mon_domain *d,
+ enum resctrl_event_id evtid, u32 rmid, u32 closid,
+ u32 cntr_id, u32 evt_cfg, bool assign)
+{
+ struct rdt_hw_mon_domain *hw_dom = resctrl_to_arch_mon_dom(d);
+ union l3_qos_abmc_cfg abmc_cfg = { 0 };
+ struct arch_mbm_state *am;
+
+ abmc_cfg.split.cfg_en = 1;
+ abmc_cfg.split.cntr_en = assign ? 1 : 0;
+ abmc_cfg.split.cntr_id = cntr_id;
+ abmc_cfg.split.bw_src = rmid;
+ abmc_cfg.split.bw_type = evt_cfg;
+
+ smp_call_function_any(&d->hdr.cpu_mask, resctrl_abmc_config_one_amd, &abmc_cfg, 1);
+
+ /*
+ * The hardware counter is reset (because cfg_en == 1) so there is no
+ * need to record initial non-zero counts.
+ */
+ if (assign) {
+ am = get_arch_mbm_state(hw_dom, rmid, evtid);
+ if (am)
+ memset(am, 0, sizeof(*am));
+ }
+}
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index d77981d1fcb9..59a4fe60ab46 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -559,6 +559,23 @@ void resctrl_arch_reset_rmid_all(struct rdt_resource *r, struct rdt_mon_domain *
*/
void resctrl_arch_reset_all_ctrls(struct rdt_resource *r);
+/**
+ * resctrl_arch_config_cntr() - Configure the counter id to RMID, event
+ * pair on the domain.
+ * @r: Resource structure.
+ * @d: Domain that the counter id to be configured.
+ * @evtid: Event type to configure.
+ * @rmid: RMID to configure.
+ * @closid: CLOSID to configure.
+ * @cntr_id: Counter ID to configure.
+ * @evt_cfg: MBM event configuration value representing reads,
+ * writes etc.
+ * @assign: Assign or unassign.
+ */
+void resctrl_arch_config_cntr(struct rdt_resource *r, struct rdt_mon_domain *d,
+ enum resctrl_event_id evtid, u32 rmid, u32 closid,
+ u32 cntr_id, u32 evt_cfg, bool assign);
+
extern unsigned int resctrl_rmid_realloc_threshold;
extern unsigned int resctrl_rmid_realloc_limit;
--
2.34.1
Hi Babu,
On 5/15/25 3:51 PM, Babu Moger wrote:
> The ABMC feature provides an option to the user to assign a hardware
> counter to an RMID, event pair and monitor the bandwidth as long as it
> is assigned. The assigned RMID will be tracked by the hardware until the
> user unassigns it manually.
(please review this often repeated snippet to match new design)
>
> Implement an architecture-specific handler to assign and unassign the
> counter. Configure counters by writing to the L3_QOS_ABMC_CFG MSR,
> specifying the counter ID, bandwidth source (RMID), and event
> configuration.
>
...
> diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
> index ff4b2abfa044..e31084f7babd 100644
> --- a/arch/x86/kernel/cpu/resctrl/monitor.c
> +++ b/arch/x86/kernel/cpu/resctrl/monitor.c
> @@ -448,3 +448,40 @@ inline bool resctrl_arch_mbm_cntr_assign_enabled(struct rdt_resource *r)
> {
> return resctrl_to_arch_res(r)->mbm_cntr_assign_enabled;
> }
> +
> +static void resctrl_abmc_config_one_amd(void *info)
> +{
> + union l3_qos_abmc_cfg *abmc_cfg = info;
> +
> + wrmsrl(MSR_IA32_L3_QOS_ABMC_CFG, abmc_cfg->full);
> +}
> +
> +/*
> + * Send an IPI to the domain to assign the counter to RMID, event pair.
> + */
> +void resctrl_arch_config_cntr(struct rdt_resource *r, struct rdt_mon_domain *d,
> + enum resctrl_event_id evtid, u32 rmid, u32 closid,
> + u32 cntr_id, u32 evt_cfg, bool assign)
> +{
> + struct rdt_hw_mon_domain *hw_dom = resctrl_to_arch_mon_dom(d);
> + union l3_qos_abmc_cfg abmc_cfg = { 0 };
> + struct arch_mbm_state *am;
> +
> + abmc_cfg.split.cfg_en = 1;
> + abmc_cfg.split.cntr_en = assign ? 1 : 0;
> + abmc_cfg.split.cntr_id = cntr_id;
> + abmc_cfg.split.bw_src = rmid;
> + abmc_cfg.split.bw_type = evt_cfg;
Is evt_cfg really needed to be programmed when unassigning a counter? Looking ahead at
patch #14 resctrl_free_config_cntr() needs to go through extra list walk to get this data
but why would hardware need an accurate event configuration to *unassign* a counter?
It seems unnecessary to provide both the event ID *and* the configuration.
resctrl_arch_config_cntr() could drop the "evt_cfg" parameter and instead there
can be a new resctrl utility that architecture can use to query the event's configuration.
Similar to resctrl_is_mon_event_enabled() introduced in
https://lore.kernel.org/lkml/20250521225049.132551-3-tony.luck@intel.com/ that exposes an
event property.
It looks to me as though there are a couple of changes in the telemetry work
that would benefit this work. https://lore.kernel.org/lkml/20250521225049.132551-2-tony.luck@intel.com/
switches the monitor events to be maintained in an array indexed by event ID, eliminating the
need for searching the evt_list that this work does in a couple of places. Also note the handy
new for_each_mbm_event() helper (https://lore.kernel.org/lkml/20250521225049.132551-5-tony.luck@intel.com/).
> +
> + smp_call_function_any(&d->hdr.cpu_mask, resctrl_abmc_config_one_amd, &abmc_cfg, 1);
> +
> + /*
> + * The hardware counter is reset (because cfg_en == 1) so there is no
> + * need to record initial non-zero counts.
> + */
> + if (assign) {
> + am = get_arch_mbm_state(hw_dom, rmid, evtid);
> + if (am)
> + memset(am, 0, sizeof(*am));
> + }
> +}
> diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
> index d77981d1fcb9..59a4fe60ab46 100644
> --- a/include/linux/resctrl.h
> +++ b/include/linux/resctrl.h
> @@ -559,6 +559,23 @@ void resctrl_arch_reset_rmid_all(struct rdt_resource *r, struct rdt_mon_domain *
> */
> void resctrl_arch_reset_all_ctrls(struct rdt_resource *r);
>
> +/**
> + * resctrl_arch_config_cntr() - Configure the counter id to RMID, event
> + * pair on the domain.
The sentence seem strange, should "Configure the counter" perhaps be
"Assign the counter"? Or if the naming requires "configure" ...
"Configure the counter with its new RMID and event details."? Please feel
free to improve.
> + * @r: Resource structure.
> + * @d: Domain that the counter id to be configured.
I am unable to parse description of @d.
> + * @evtid: Event type to configure.
> + * @rmid: RMID to configure.
> + * @closid: CLOSID to configure.
> + * @cntr_id: Counter ID to configure.
All four parameters descriptions end with "to configure" ... but it is actually only
the counter that is configured while the rest is the data that the counter is configured with, no?
> + * @evt_cfg: MBM event configuration value representing reads,
> + * writes etc.
Needs definition about what the contents of @evt_cfg means. This is the API ...it
cannot be vague like "reads, write, etc." but should be specific about which bit means
what.
> + * @assign: Assign or unassign.
"True to assign the counter, false to unassign the counter."
Needs some context here about what architecture can expect on how this function will
be called. For example, "Can be called from any CPU."
> + */
> +void resctrl_arch_config_cntr(struct rdt_resource *r, struct rdt_mon_domain *d,
> + enum resctrl_event_id evtid, u32 rmid, u32 closid,
> + u32 cntr_id, u32 evt_cfg, bool assign);
> +
> extern unsigned int resctrl_rmid_realloc_threshold;
> extern unsigned int resctrl_rmid_realloc_limit;
>
Reinette
Hi Reinette,
On 5/22/2025 4:51 PM, Reinette Chatre wrote:
> Hi Babu,
>
> On 5/15/25 3:51 PM, Babu Moger wrote:
>> The ABMC feature provides an option to the user to assign a hardware
>> counter to an RMID, event pair and monitor the bandwidth as long as it
>> is assigned. The assigned RMID will be tracked by the hardware until the
>> user unassigns it manually.
>
> (please review this often repeated snippet to match new design)
Sure.
>
>>
>> Implement an architecture-specific handler to assign and unassign the
>> counter. Configure counters by writing to the L3_QOS_ABMC_CFG MSR,
>> specifying the counter ID, bandwidth source (RMID), and event
>> configuration.
>>
>
> ...
>
>> diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
>> index ff4b2abfa044..e31084f7babd 100644
>> --- a/arch/x86/kernel/cpu/resctrl/monitor.c
>> +++ b/arch/x86/kernel/cpu/resctrl/monitor.c
>> @@ -448,3 +448,40 @@ inline bool resctrl_arch_mbm_cntr_assign_enabled(struct rdt_resource *r)
>> {
>> return resctrl_to_arch_res(r)->mbm_cntr_assign_enabled;
>> }
>> +
>> +static void resctrl_abmc_config_one_amd(void *info)
>> +{
>> + union l3_qos_abmc_cfg *abmc_cfg = info;
>> +
>> + wrmsrl(MSR_IA32_L3_QOS_ABMC_CFG, abmc_cfg->full);
>> +}
>> +
>> +/*
>> + * Send an IPI to the domain to assign the counter to RMID, event pair.
>> + */
>> +void resctrl_arch_config_cntr(struct rdt_resource *r, struct rdt_mon_domain *d,
>> + enum resctrl_event_id evtid, u32 rmid, u32 closid,
>> + u32 cntr_id, u32 evt_cfg, bool assign)
>> +{
>> + struct rdt_hw_mon_domain *hw_dom = resctrl_to_arch_mon_dom(d);
>> + union l3_qos_abmc_cfg abmc_cfg = { 0 };
>> + struct arch_mbm_state *am;
>> +
>> + abmc_cfg.split.cfg_en = 1;
>> + abmc_cfg.split.cntr_en = assign ? 1 : 0;
>> + abmc_cfg.split.cntr_id = cntr_id;
>> + abmc_cfg.split.bw_src = rmid;
>> + abmc_cfg.split.bw_type = evt_cfg;
>
> Is evt_cfg really needed to be programmed when unassigning a counter? Looking ahead at
> patch #14 resctrl_free_config_cntr() needs to go through extra list walk to get this data
> but why would hardware need an accurate event configuration to *unassign* a counter?
evt_cfg is not required during unassign. I can remove it.
>
> It seems unnecessary to provide both the event ID *and* the configuration.
> resctrl_arch_config_cntr() could drop the "evt_cfg" parameter and instead there
> can be a new resctrl utility that architecture can use to query the event's configuration.
> Similar to resctrl_is_mon_event_enabled() introduced in
> https://lore.kernel.org/lkml/20250521225049.132551-3-tony.luck@intel.com/ that exposes an
> event property.
Sounds good.
I can add a new function resctrl_get_mon_event_config(evtid) and call it
only during the "assign". It will be called inside
resctrl_arch_config_cntr().
>
> It looks to me as though there are a couple of changes in the telemetry work
> that would benefit this work. https://lore.kernel.org/lkml/20250521225049.132551-2-tony.luck@intel.com/
> switches the monitor events to be maintained in an array indexed by event ID, eliminating the
> need for searching the evt_list that this work does in a couple of places. Also note the handy
> new for_each_mbm_event() helper (https://lore.kernel.org/lkml/20250521225049.132551-5-tony.luck@intel.com/).
Sure. Looking at it now.
>
>
>> +
>> + smp_call_function_any(&d->hdr.cpu_mask, resctrl_abmc_config_one_amd, &abmc_cfg, 1);
>> +
>> + /*
>> + * The hardware counter is reset (because cfg_en == 1) so there is no
>> + * need to record initial non-zero counts.
>> + */
>> + if (assign) {
>> + am = get_arch_mbm_state(hw_dom, rmid, evtid);
>> + if (am)
>> + memset(am, 0, sizeof(*am));
>> + }
>> +}
>> diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
>> index d77981d1fcb9..59a4fe60ab46 100644
>> --- a/include/linux/resctrl.h
>> +++ b/include/linux/resctrl.h
>> @@ -559,6 +559,23 @@ void resctrl_arch_reset_rmid_all(struct rdt_resource *r, struct rdt_mon_domain *
>> */
>> void resctrl_arch_reset_all_ctrls(struct rdt_resource *r);
>>
>> +/**
>> + * resctrl_arch_config_cntr() - Configure the counter id to RMID, event
>> + * pair on the domain.
>
> The sentence seem strange, should "Configure the counter" perhaps be
> "Assign the counter"? Or if the naming requires "configure" ...
> "Configure the counter with its new RMID and event details."? Please feel
> free to improve.
Last one looks good.
>
>> + * @r: Resource structure.
>> + * @d: Domain that the counter id to be configured.
>
> I am unable to parse description of @d.
The domain in which the counter ID is to be configured.
>
>> + * @evtid: Event type to configure.
>> + * @rmid: RMID to configure.
>> + * @closid: CLOSID to configure.
>> + * @cntr_id: Counter ID to configure.
>
> All four parameters descriptions end with "to configure" ... but it is actually only
> the counter that is configured while the rest is the data that the counter is configured with, no?
Will remove "to configure" from all the other fields except the cntr_id.
>
>> + * @evt_cfg: MBM event configuration value representing reads,
>> + * writes etc.
>
> Needs definition about what the contents of @evt_cfg means. This is the API ...it
> cannot be vague like "reads, write, etc." but should be specific about which bit means
> what.
Copying your comment on other patch
https://lore.kernel.org/lkml/14ca1527-ee25-448d-949b-ed8df546c916@intel.com/
@evt_cfg: Event configuration created using the READS_TO_LOCAL_MEM,
READS_TO_REMOTE_MEM, etc. bits that represent the memory transactions
being counted.
>
>> + * @assign: Assign or unassign.
>
> "True to assign the counter, false to unassign the counter."
>
Sure.
>
> Needs some context here about what architecture can expect on how this function will
> be called. For example, "Can be called from any CPU."
>
Sure.
>> + */
>> +void resctrl_arch_config_cntr(struct rdt_resource *r, struct rdt_mon_domain *d,
>> + enum resctrl_event_id evtid, u32 rmid, u32 closid,
>> + u32 cntr_id, u32 evt_cfg, bool assign);
>> +
>> extern unsigned int resctrl_rmid_realloc_threshold;
>> extern unsigned int resctrl_rmid_realloc_limit;
>>
>
> Reinette
>
Thanks
Babu
> It looks to me as though there are a couple of changes in the telemetry work > that would benefit this work. https://lore.kernel.org/lkml/20250521225049.132551-2-tony.luck@intel.com/ > switches the monitor events to be maintained in an array indexed by event ID, eliminating the > need for searching the evt_list that this work does in a couple of places. Also note the handy > new for_each_mbm_event() helper (https://lore.kernel.org/lkml/20250521225049.132551-5-tony.luck@intel.com/). Yesterday I ran through the exercise of rebasing my AET patches on top of these ABMC patches in order to check whether the ABMC patches painted resctrl into some corner that would be hard to get back out of. Good news: they don't. There was a bunch of manual patching to make the first four patches fit on top of the ABMC code, but I also noticed a few places where things were simpler after combining the two series. Maybe a good path forward would be to take those first four patches from my AET series and then build ABMC on top of those. -Tony
On Thu, May 22, 2025 at 10:16:16PM +0000, Luck, Tony wrote: > > It looks to me as though there are a couple of changes in the telemetry work > > that would benefit this work. https://lore.kernel.org/lkml/20250521225049.132551-2-tony.luck@intel.com/ > > switches the monitor events to be maintained in an array indexed by event ID, eliminating the > > need for searching the evt_list that this work does in a couple of places. Also note the handy > > new for_each_mbm_event() helper (https://lore.kernel.org/lkml/20250521225049.132551-5-tony.luck@intel.com/). > > Yesterday I ran through the exercise of rebasing my AET patches on top of these > ABMC patches in order to check whether the ABMC patches painted resctrl > into some corner that would be hard to get back out of. > > Good news: they don't. > > There was a bunch of manual patching to make the first four patches fit on top > of the ABMC code, but I also noticed a few places where things were simpler > after combining the two series. > > Maybe a good path forward would be to take those first four patches from > my AET series and then build ABMC on top of those. As an encouragement to try this direction, I took my four patches on top of tip x86/cache and then applied Babu's ABMC series. Changes to Babu's code: 1) Adapt where needed for removal of evt_list. Use event array instead. 2) Use for_each_mbm_event() [Maybe didn't get all places?] 3) Bring the s/evt_val/evt_cfg/ fix into patch 20 from 21 4) Fix fir tree declaration for resctrl_process_assign() I don't have an AMD system to check if the ABMC parts still work. But it does pass the resctrl self tests, so legacy isn't broken. Patches in the "my_mbm_plus_babu_abmc" branch of my kernel.org repo: git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux.git -Tony
Hi Tony,
On Fri, May 23, 2025 at 11:08 PM Luck, Tony <tony.luck@intel.com> wrote:
>
> On Thu, May 22, 2025 at 10:16:16PM +0000, Luck, Tony wrote:
> > > It looks to me as though there are a couple of changes in the telemetry work
> > > that would benefit this work. https://lore.kernel.org/lkml/20250521225049.132551-2-tony.luck@intel.com/
> > > switches the monitor events to be maintained in an array indexed by event ID, eliminating the
> > > need for searching the evt_list that this work does in a couple of places. Also note the handy
> > > new for_each_mbm_event() helper (https://lore.kernel.org/lkml/20250521225049.132551-5-tony.luck@intel.com/).
> >
> > Yesterday I ran through the exercise of rebasing my AET patches on top of these
> > ABMC patches in order to check whether the ABMC patches painted resctrl
> > into some corner that would be hard to get back out of.
> >
> > Good news: they don't.
> >
> > There was a bunch of manual patching to make the first four patches fit on top
> > of the ABMC code, but I also noticed a few places where things were simpler
> > after combining the two series.
> >
> > Maybe a good path forward would be to take those first four patches from
> > my AET series and then build ABMC on top of those.
>
> As an encouragement to try this direction, I took my four patches
> on top of tip x86/cache and then applied Babu's ABMC series.
I did the same thing last week, except in the other order, so I
switched to your changes to test.
>
> Changes to Babu's code:
> 1) Adapt where needed for removal of evt_list. Use event array instead.
> 2) Use for_each_mbm_event() [Maybe didn't get all places?]
> 3) Bring the s/evt_val/evt_cfg/ fix into patch 20 from 21
> 4) Fix fir tree declaration for resctrl_process_assign()
>
> I don't have an AMD system to check if the ABMC parts still work. But
> it does pass the resctrl self tests, so legacy isn't broken.
>
> Patches in the "my_mbm_plus_babu_abmc" branch of my kernel.org
> repo: git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux.git
Thanks for applying my suggestion[1] about the array entry sizes, but
you needed one more dereference:
diff --git a/arch/x86/kernel/cpu/resctrl/core.c
b/arch/x86/kernel/cpu/resctrl/core.c
index 1db6a61e27746..0c27e0a5a7b96 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -399,7 +399,7 @@ static int domain_setup_ctrlval(struct
rdt_resource *r, struct rdt_ctrl_domain *
*/
static int arch_domain_mbm_alloc(u32 num_rmid, struct
rdt_hw_mon_domain *hw_dom)
{
- size_t tsize = sizeof(hw_dom->arch_mbm_states[0]);
+ size_t tsize = sizeof(*hw_dom->arch_mbm_states[0]);
enum resctrl_event_id evt;
int idx;
diff --git a/fs/resctrl/rdtgroup.c b/fs/resctrl/rdtgroup.c
index 098ff002d2232..44ec33cb165f7 100644
--- a/fs/resctrl/rdtgroup.c
+++ b/fs/resctrl/rdtgroup.c
@@ -4819,7 +4823,7 @@ void resctrl_offline_mon_domain(struct
rdt_resource *r, struct rdt_mon_domain *d
static int domain_setup_mon_state(struct rdt_resource *r, struct
rdt_mon_domain *d)
{
u32 idx_limit = resctrl_arch_system_num_rmid_idx();
- size_t tsize = sizeof(d->mbm_states[0]);
+ size_t tsize = sizeof(*d->mbm_states[0]);
enum resctrl_event_id evt;
int idx;
You should be able to repro an array overrun without ABMC, and a page
fault is likely if the system implements a lot of RMIDs. The AMD EPYC
9B45 I tested on implements 4096 RMIDs.
Thanks,
-Peter
[1] https://lore.kernel.org/lkml/CALPaoCj8yfzJ=5CkxTPQXc0-WRWpu0xKRX8v4FAWFGQKtXtMUw@mail.gmail.com/
> Thanks for applying my suggestion[1] about the array entry sizes, but > you needed one more dereference: > - size_t tsize = sizeof(hw_dom->arch_mbm_states[0]); > + size_t tsize = sizeof(*hw_dom->arch_mbm_states[0]); > - size_t tsize = sizeof(d->mbm_states[0]); > + size_t tsize = sizeof(*d->mbm_states[0]); Indeed yes. Thanks. -Tony
Hi Tony, Peter, On 5/27/2025 4:41 PM, Luck, Tony wrote: > >> Thanks for applying my suggestion[1] about the array entry sizes, but >> you needed one more dereference: > >> - size_t tsize = sizeof(hw_dom->arch_mbm_states[0]); >> + size_t tsize = sizeof(*hw_dom->arch_mbm_states[0]); > >> - size_t tsize = sizeof(d->mbm_states[0]); >> + size_t tsize = sizeof(*d->mbm_states[0]); > > Indeed yes. Thanks. > Tony, Thanks for porting patches. I can actually pick your branch [1] and apply review comments on top for v14 series. Hope that is fine with everyone. [1] https://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux.git/log/?h=my_mbm_plus_babu_abmc One question though: Where will the Peter's fix [2] go? [2] https://lore.kernel.org/lkml/CALPaoCj7FBv_vfDp+4tgqo4p8T7Eov_Ys+CQRoAX6u43a4OTDQ@mail.gmail.com/ thanks Babu
Hi Tony,
On 5/28/25 16:41, Moger, Babu wrote:
> Hi Tony, Peter,
>
> On 5/27/2025 4:41 PM, Luck, Tony wrote:
>>
>>> Thanks for applying my suggestion[1] about the array entry sizes, but
>>> you needed one more dereference:
>>
>>> - size_t tsize = sizeof(hw_dom->arch_mbm_states[0]);
>>> + size_t tsize = sizeof(*hw_dom->arch_mbm_states[0]);
>>
>>> - size_t tsize = sizeof(d->mbm_states[0]);
>>> + size_t tsize = sizeof(*d->mbm_states[0]);
>>
>> Indeed yes. Thanks.
>>
>
> Tony, Thanks for porting patches.
>
> I can actually pick your branch [1] and apply review comments on top for
> v14 series. Hope that is fine with everyone.
> [1]
> https://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux.git/log/?h=my_mbm_plus_babu_abmc
>
> One question though: Where will the Peter's fix [2] go?
> [2]
> https://lore.kernel.org/lkml/CALPaoCj7FBv_vfDp+4tgqo4p8T7Eov_Ys+CQRoAX6u43a4OTDQ@mail.gmail.com/
>
> thanks
> Babu
>
>
I'm currently working on v14 and plan to post the updated ABMC series
tomorrow. I've used your multi-event support patches as the base:
x86, fs/resctrl: Consolidate monitor event descriptions
x86, fs/resctrl: Replace architecture event enabled checks
x86/resctrl: Remove 'rdt_mon_features' global variable
x86, fs/resctrl: Prepare for more monitor events
I noticed there are a few comments on your series here:
https://lore.kernel.org/lkml/20250521225049.132551-1-tony.luck@intel.com/
Let me know if you've updated the patches. If so, I’ll incorporate the
latest version. Otherwise, I’ll proceed with the current base as-is.
--
Thanks
Babu Moger
> One question though: Where will the Peter's fix [2] go? > [2] > https://lore.kernel.org/lkml/CALPaoCj7FBv_vfDp+4tgqo4p8T7Eov_Ys+CQRoAX6u43a4OTDQ@mail.gmail.com/ Babu, I'll backport into my patches and then rebase the whole branch. Hopefully should not take long. -Tony
> I'll backport into my patches and then rebase the whole branch. Hopefully should not take long. Done and force pushed to same branch at kernel .org. Both Peter's changes went into my 4th patch, and your patches didn't touch those functions so git did all the real work. -Tony
Hi Tony, On 5/28/2025 5:13 PM, Luck, Tony wrote: >> I'll backport into my patches and then rebase the whole branch. Hopefully should not take long. > > Done and force pushed to same branch at kernel .org. > > Both Peter's changes went into my 4th patch, and your patches didn't touch those functions > so git did all the real work. > I will base my v14 on top of this. Thanks Babu
© 2016 - 2025 Red Hat, Inc.