[PATCH v11 06/23] x86/resctrl: Add support to enable/disable AMD ABMC feature

Babu Moger posted 23 patches 1 year ago
There is a newer version of this series
[PATCH v11 06/23] x86/resctrl: Add support to enable/disable AMD ABMC feature
Posted by Babu Moger 1 year ago
Add the functionality to enable/disable AMD ABMC feature.

AMD ABMC feature is enabled by setting enabled bit(0) in MSR
L3_QOS_EXT_CFG. When the state of ABMC is changed, the MSR needs
to be updated on all the logical processors in the QOS Domain.

Hardware counters will reset when ABMC state is changed.

The ABMC feature details are documented in APM listed below [1].
[1] AMD64 Architecture Programmer's Manual Volume 2: System Programming
Publication # 24593 Revision 3.41 section 19.3.3.3 Assignable Bandwidth
Monitoring (ABMC).

Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
Signed-off-by: Babu Moger <babu.moger@amd.com>
---
v11: Moved the monitoring related calls to monitor.c file.
     Moved the changes from include/linux/resctrl.h to
     arch/x86/kernel/cpu/resctrl/internal.h.
     Removed the Reviewed-by tag as patch changed.
     Actual code did not change.

v10: No changes.

v9: Re-ordered the MSR and added Reviewed-by tag.

v8: Commit message update and moved around the comments about L3_QOS_EXT_CFG
    to _resctrl_abmc_enable.

v7: Renamed the function
    resctrl_arch_get_abmc_enabled() to resctrl_arch_mbm_cntr_assign_enabled().

    Merged resctrl_arch_mbm_cntr_assign_disable, resctrl_arch_mbm_cntr_assign_disable
    and renamed to resctrl_arch_mbm_cntr_assign_set().

    Moved the function definition to linux/resctrl.h.

    Passed the struct rdt_resource to these functions.
    Removed resctrl_arch_reset_rmid_all() from arch code. This will be done
    from the caller.

v6: Renamed abmc_enabled to mbm_cntr_assign_enabled.
    Used msr_set_bit and msr_clear_bit for msr updates.
    Renamed resctrl_arch_abmc_enable() to resctrl_arch_mbm_cntr_assign_enable().
    Renamed resctrl_arch_abmc_disable() to resctrl_arch_mbm_cntr_assign_disable().
    Made _resctrl_abmc_enable to return void.

v5: Renamed resctrl_abmc_enable to resctrl_arch_abmc_enable.
    Renamed resctrl_abmc_disable to resctrl_arch_abmc_disable.
    Introduced resctrl_arch_get_abmc_enabled to get abmc state from
    non-arch code.
    Renamed resctrl_abmc_set_all to _resctrl_abmc_enable().
    Modified commit log to make it clear about AMD ABMC feature.

v3: No changes.

v2: Few text changes in commit message.
---
 arch/x86/include/asm/msr-index.h       |  1 +
 arch/x86/kernel/cpu/resctrl/core.c     |  5 ++++
 arch/x86/kernel/cpu/resctrl/internal.h |  7 +++++
 arch/x86/kernel/cpu/resctrl/monitor.c  | 36 ++++++++++++++++++++++++++
 4 files changed, 49 insertions(+)

diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 9a71880eec07..fea1f3afe197 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -1197,6 +1197,7 @@
 /* - AMD: */
 #define MSR_IA32_MBA_BW_BASE		0xc0000200
 #define MSR_IA32_SMBA_BW_BASE		0xc0000280
+#define MSR_IA32_L3_QOS_EXT_CFG		0xc00003ff
 #define MSR_IA32_EVT_CFG_BASE		0xc0000400
 
 /* AMD-V MSRs */
diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index cb7feb7c990f..3f847728aa7a 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -405,6 +405,11 @@ void rdt_ctrl_update(void *arg)
 	hw_res->msr_update(m);
 }
 
+bool resctrl_arch_mbm_cntr_assign_enabled(struct rdt_resource *r)
+{
+	return resctrl_to_arch_res(r)->mbm_cntr_assign_enabled;
+}
+
 /*
  * rdt_find_domain - Search for a domain id in a resource domain list.
  *
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index 05358e78147b..ca69f2e0909f 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -56,6 +56,9 @@
 /* Max event bits supported */
 #define MAX_EVT_CONFIG_BITS		GENMASK(6, 0)
 
+/* Setting bit 0 in L3_QOS_EXT_CFG enables the ABMC feature. */
+#define ABMC_ENABLE_BIT			0
+
 /**
  * cpumask_any_housekeeping() - Choose any CPU in @mask, preferring those that
  *			        aren't marked nohz_full
@@ -479,6 +482,7 @@ struct rdt_parse_data {
  * @mbm_cfg_mask:	Bandwidth sources that can be tracked when Bandwidth
  *			Monitoring Event Configuration (BMEC) is supported.
  * @cdp_enabled:	CDP state of this resource
+ * @mbm_cntr_assign_enabled:	ABMC feature is enabled
  *
  * Members of this structure are either private to the architecture
  * e.g. mbm_width, or accessed via helpers that provide abstraction. e.g.
@@ -493,6 +497,7 @@ struct rdt_hw_resource {
 	unsigned int		mbm_width;
 	unsigned int		mbm_cfg_mask;
 	bool			cdp_enabled;
+	bool			mbm_cntr_assign_enabled;
 };
 
 static inline struct rdt_hw_resource *resctrl_to_arch_res(struct rdt_resource *r)
@@ -658,4 +663,6 @@ void resctrl_file_fflags_init(const char *config, unsigned long fflags);
 void rdt_staged_configs_clear(void);
 bool closid_allocated(unsigned int closid);
 int resctrl_find_cleanest_closid(void);
+int resctrl_arch_mbm_cntr_assign_set(struct rdt_resource *r, bool enable);
+bool resctrl_arch_mbm_cntr_assign_enabled(struct rdt_resource *r);
 #endif /* _ASM_X86_RESCTRL_INTERNAL_H */
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index c3d7d4c3009a..a7526306f5e4 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -1261,3 +1261,39 @@ void __init intel_rdt_mbm_apply_quirk(void)
 	mbm_cf_rmidthreshold = mbm_cf_table[cf_index].rmidthreshold;
 	mbm_cf = mbm_cf_table[cf_index].cf;
 }
+
+static void resctrl_abmc_set_one_amd(void *arg)
+{
+	bool *enable = arg;
+
+	if (*enable)
+		msr_set_bit(MSR_IA32_L3_QOS_EXT_CFG, ABMC_ENABLE_BIT);
+	else
+		msr_clear_bit(MSR_IA32_L3_QOS_EXT_CFG, ABMC_ENABLE_BIT);
+}
+
+/*
+ * Update L3_QOS_EXT_CFG MSR on all the CPUs associated with the monitor
+ * domain.
+ */
+static void _resctrl_abmc_enable(struct rdt_resource *r, bool enable)
+{
+	struct rdt_mon_domain *d;
+
+	list_for_each_entry(d, &r->mon_domains, hdr.list)
+		on_each_cpu_mask(&d->hdr.cpu_mask,
+				 resctrl_abmc_set_one_amd, &enable, 1);
+}
+
+int resctrl_arch_mbm_cntr_assign_set(struct rdt_resource *r, bool enable)
+{
+	struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
+
+	if (r->mon.mbm_cntr_assignable &&
+	    hw_res->mbm_cntr_assign_enabled != enable) {
+		_resctrl_abmc_enable(r, enable);
+		hw_res->mbm_cntr_assign_enabled = enable;
+	}
+
+	return 0;
+}
-- 
2.34.1
Re: [PATCH v11 06/23] x86/resctrl: Add support to enable/disable AMD ABMC feature
Posted by James Morse 11 months, 3 weeks ago
Hi Babu,

On 22/01/2025 20:20, Babu Moger wrote:
> Add the functionality to enable/disable AMD ABMC feature.
> 
> AMD ABMC feature is enabled by setting enabled bit(0) in MSR
> L3_QOS_EXT_CFG. When the state of ABMC is changed, the MSR needs
> to be updated on all the logical processors in the QOS Domain.
> 
> Hardware counters will reset when ABMC state is changed.
> 
> The ABMC feature details are documented in APM listed below [1].
> [1] AMD64 Architecture Programmer's Manual Volume 2: System Programming
> Publication # 24593 Revision 3.41 section 19.3.3.3 Assignable Bandwidth
> Monitoring (ABMC).


> diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
> index 05358e78147b..ca69f2e0909f 100644
> --- a/arch/x86/kernel/cpu/resctrl/internal.h
> +++ b/arch/x86/kernel/cpu/resctrl/internal.h
> @@ -658,4 +663,6 @@ void resctrl_file_fflags_init(const char *config, unsigned long fflags);
>  void rdt_staged_configs_clear(void);
>  bool closid_allocated(unsigned int closid);
>  int resctrl_find_cleanest_closid(void);
> +int resctrl_arch_mbm_cntr_assign_set(struct rdt_resource *r, bool enable);
> +bool resctrl_arch_mbm_cntr_assign_enabled(struct rdt_resource *r);
>  #endif /* _ASM_X86_RESCTRL_INTERNAL_H */

A minor nit - but could these be added to include/linux/resctrl.h instead?
This is where they need to end up after the arch/fs split, and its harmless to do it from
the beginning.


Thanks,

James
Re: [PATCH v11 06/23] x86/resctrl: Add support to enable/disable AMD ABMC feature
Posted by Reinette Chatre 11 months, 3 weeks ago
Hi James,

On 2/21/25 10:05 AM, James Morse wrote:
> Hi Babu,
> 
> On 22/01/2025 20:20, Babu Moger wrote:
>> Add the functionality to enable/disable AMD ABMC feature.
>>
>> AMD ABMC feature is enabled by setting enabled bit(0) in MSR
>> L3_QOS_EXT_CFG. When the state of ABMC is changed, the MSR needs
>> to be updated on all the logical processors in the QOS Domain.
>>
>> Hardware counters will reset when ABMC state is changed.
>>
>> The ABMC feature details are documented in APM listed below [1].
>> [1] AMD64 Architecture Programmer's Manual Volume 2: System Programming
>> Publication # 24593 Revision 3.41 section 19.3.3.3 Assignable Bandwidth
>> Monitoring (ABMC).
> 
> 
>> diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
>> index 05358e78147b..ca69f2e0909f 100644
>> --- a/arch/x86/kernel/cpu/resctrl/internal.h
>> +++ b/arch/x86/kernel/cpu/resctrl/internal.h
>> @@ -658,4 +663,6 @@ void resctrl_file_fflags_init(const char *config, unsigned long fflags);
>>  void rdt_staged_configs_clear(void);
>>  bool closid_allocated(unsigned int closid);
>>  int resctrl_find_cleanest_closid(void);
>> +int resctrl_arch_mbm_cntr_assign_set(struct rdt_resource *r, bool enable);
>> +bool resctrl_arch_mbm_cntr_assign_enabled(struct rdt_resource *r);
>>  #endif /* _ASM_X86_RESCTRL_INTERNAL_H */
> 
> A minor nit - but could these be added to include/linux/resctrl.h instead?
> This is where they need to end up after the arch/fs split, and its harmless to do it from
> the beginning.

These prototypes were moved back to the internal header to follow guidance
from Boris that was received during recent software controller enhancement.
Boris advised [1] that items needed by other architecture should only be
moved to include/linux/resctrl.h when that support is added.

Reinette

[1] https://lore.kernel.org/lkml/20241209222047.GKZ1dtPxIu5_Hxs1fp@fat_crate.local/
Re: [PATCH v11 06/23] x86/resctrl: Add support to enable/disable AMD ABMC feature
Posted by Reinette Chatre 1 year ago
Hi Babu,

On 1/22/25 12:20 PM, Babu Moger wrote:
> Add the functionality to enable/disable AMD ABMC feature.
> 
> AMD ABMC feature is enabled by setting enabled bit(0) in MSR
> L3_QOS_EXT_CFG. When the state of ABMC is changed, the MSR needs
> to be updated on all the logical processors in the QOS Domain.
> 
> Hardware counters will reset when ABMC state is changed.

I find that the state management in this series is organized better
and easier to understand. I do think that it can be simplified more
and a hint to this is that it is mentioned here but not done in the
code introduced here but instead required from the caller. It seems
simpler to me that the architectural state can just be reset at the
same time as enable/disable of ABMC? 

> 
> The ABMC feature details are documented in APM listed below [1].
> [1] AMD64 Architecture Programmer's Manual Volume 2: System Programming
> Publication # 24593 Revision 3.41 section 19.3.3.3 Assignable Bandwidth
> Monitoring (ABMC).
> 
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
> Signed-off-by: Babu Moger <babu.moger@amd.com>
> ---

...

> diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
> index c3d7d4c3009a..a7526306f5e4 100644
> --- a/arch/x86/kernel/cpu/resctrl/monitor.c
> +++ b/arch/x86/kernel/cpu/resctrl/monitor.c
> @@ -1261,3 +1261,39 @@ void __init intel_rdt_mbm_apply_quirk(void)
>  	mbm_cf_rmidthreshold = mbm_cf_table[cf_index].rmidthreshold;
>  	mbm_cf = mbm_cf_table[cf_index].cf;
>  }
> +
> +static void resctrl_abmc_set_one_amd(void *arg)
> +{
> +	bool *enable = arg;
> +
> +	if (*enable)
> +		msr_set_bit(MSR_IA32_L3_QOS_EXT_CFG, ABMC_ENABLE_BIT);
> +	else
> +		msr_clear_bit(MSR_IA32_L3_QOS_EXT_CFG, ABMC_ENABLE_BIT);
> +}
> +
> +/*
> + * Update L3_QOS_EXT_CFG MSR on all the CPUs associated with the monitor
> + * domain.

All monitor domains are impacted and above does not clearly state "why".
How about
 * ABMC enable/disable requires update of L3_QOS_EXT_CFG MSR on all the CPUs
 * associated with all monitor domains.


> + */
> +static void _resctrl_abmc_enable(struct rdt_resource *r, bool enable)
> +{
> +	struct rdt_mon_domain *d;
> +
> +	list_for_each_entry(d, &r->mon_domains, hdr.list)
> +		on_each_cpu_mask(&d->hdr.cpu_mask,
> +				 resctrl_abmc_set_one_amd, &enable, 1);
> +}
> +
> +int resctrl_arch_mbm_cntr_assign_set(struct rdt_resource *r, bool enable)
> +{
> +	struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
> +
> +	if (r->mon.mbm_cntr_assignable &&
> +	    hw_res->mbm_cntr_assign_enabled != enable) {
> +		_resctrl_abmc_enable(r, enable);
> +		hw_res->mbm_cntr_assign_enabled = enable;

Added benefit of resetting architectural state within this if statement
(perhaps simpler to be done within _resctrl_abmc_enable()) is that it will
not be done unnecessarily if ABMC is already in requested state.

> +	}
> +
> +	return 0;
> +}

Reinette
Re: [PATCH v11 06/23] x86/resctrl: Add support to enable/disable AMD ABMC feature
Posted by Moger, Babu 1 year ago
Hi Reinette,

On 2/5/2025 4:49 PM, Reinette Chatre wrote:
> Hi Babu,
> 
> On 1/22/25 12:20 PM, Babu Moger wrote:
>> Add the functionality to enable/disable AMD ABMC feature.
>>
>> AMD ABMC feature is enabled by setting enabled bit(0) in MSR
>> L3_QOS_EXT_CFG. When the state of ABMC is changed, the MSR needs
>> to be updated on all the logical processors in the QOS Domain.
>>
>> Hardware counters will reset when ABMC state is changed.
> 
> I find that the state management in this series is organized better
> and easier to understand. I do think that it can be simplified more
> and a hint to this is that it is mentioned here but not done in the
> code introduced here but instead required from the caller. It seems
> simpler to me that the architectural state can just be reset at the
> same time as enable/disable of ABMC?

Right now, it is done from mbm_cntr_reset(). It does both arch and 
non-arch state reset for all the RMIDs in all the domains. It is called 
in two places.

1 rdtgroup.c resctrl_mbm_assign_mode_write -> mbm_cntr_reset();
2 rdtgroup.c rdt_kill_sb()-> mbm_cntr_reset();

I will have to introduce another function to reset RMIDs in all the 
domains. Also, make sure it is called from both these places.

list_for_each_entry(dom, &r->mon_domains, hdr.list)
             resctrl_arch_reset_rmid_all(r, dom);


I feel current code is much more cleaner.  What do you think?

> 
>>
>> The ABMC feature details are documented in APM listed below [1].
>> [1] AMD64 Architecture Programmer's Manual Volume 2: System Programming
>> Publication # 24593 Revision 3.41 section 19.3.3.3 Assignable Bandwidth
>> Monitoring (ABMC).
>>
>> Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
>> Signed-off-by: Babu Moger <babu.moger@amd.com>
>> ---
> 
> ...
> 
>> diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
>> index c3d7d4c3009a..a7526306f5e4 100644
>> --- a/arch/x86/kernel/cpu/resctrl/monitor.c
>> +++ b/arch/x86/kernel/cpu/resctrl/monitor.c
>> @@ -1261,3 +1261,39 @@ void __init intel_rdt_mbm_apply_quirk(void)
>>   	mbm_cf_rmidthreshold = mbm_cf_table[cf_index].rmidthreshold;
>>   	mbm_cf = mbm_cf_table[cf_index].cf;
>>   }
>> +
>> +static void resctrl_abmc_set_one_amd(void *arg)
>> +{
>> +	bool *enable = arg;
>> +
>> +	if (*enable)
>> +		msr_set_bit(MSR_IA32_L3_QOS_EXT_CFG, ABMC_ENABLE_BIT);
>> +	else
>> +		msr_clear_bit(MSR_IA32_L3_QOS_EXT_CFG, ABMC_ENABLE_BIT);
>> +}
>> +
>> +/*
>> + * Update L3_QOS_EXT_CFG MSR on all the CPUs associated with the monitor
>> + * domain.
> 
> All monitor domains are impacted and above does not clearly state "why".
> How about
>   * ABMC enable/disable requires update of L3_QOS_EXT_CFG MSR on all the CPUs
>   * associated with all monitor domains.

Sure.

> 
> 
>> + */
>> +static void _resctrl_abmc_enable(struct rdt_resource *r, bool enable)
>> +{
>> +	struct rdt_mon_domain *d;
>> +
>> +	list_for_each_entry(d, &r->mon_domains, hdr.list)
>> +		on_each_cpu_mask(&d->hdr.cpu_mask,
>> +				 resctrl_abmc_set_one_amd, &enable, 1);
>> +}
>> +
>> +int resctrl_arch_mbm_cntr_assign_set(struct rdt_resource *r, bool enable)
>> +{
>> +	struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
>> +
>> +	if (r->mon.mbm_cntr_assignable &&
>> +	    hw_res->mbm_cntr_assign_enabled != enable) {
>> +		_resctrl_abmc_enable(r, enable);
>> +		hw_res->mbm_cntr_assign_enabled = enable;
> 
> Added benefit of resetting architectural state within this if statement
> (perhaps simpler to be done within _resctrl_abmc_enable()) is that it will
> not be done unnecessarily if ABMC is already in requested state.

It will be
       list_for_each_entry(dom, &r->mon_domains, hdr.list)
             resctrl_arch_reset_rmid_all(r, dom);
> 
>> +	}
>> +
>> +	return 0;
>> +}
> 
> Reinette
> 

Thanks
Babu
Re: [PATCH v11 06/23] x86/resctrl: Add support to enable/disable AMD ABMC feature
Posted by Reinette Chatre 1 year ago
Hi Babu,

On 2/6/25 8:15 AM, Moger, Babu wrote:
> Hi Reinette,
> 
> On 2/5/2025 4:49 PM, Reinette Chatre wrote:
>> Hi Babu,
>>
>> On 1/22/25 12:20 PM, Babu Moger wrote:
>>> Add the functionality to enable/disable AMD ABMC feature.
>>>
>>> AMD ABMC feature is enabled by setting enabled bit(0) in MSR
>>> L3_QOS_EXT_CFG. When the state of ABMC is changed, the MSR needs
>>> to be updated on all the logical processors in the QOS Domain.
>>>
>>> Hardware counters will reset when ABMC state is changed.
>>
>> I find that the state management in this series is organized better
>> and easier to understand. I do think that it can be simplified more
>> and a hint to this is that it is mentioned here but not done in the
>> code introduced here but instead required from the caller. It seems
>> simpler to me that the architectural state can just be reset at the
>> same time as enable/disable of ABMC?
> 
> Right now, it is done from mbm_cntr_reset(). It does both arch and non-arch state reset for all the RMIDs in all the domains. It is called in two places.
> 
> 1 rdtgroup.c resctrl_mbm_assign_mode_write -> mbm_cntr_reset();
Please see my response to this usage in the related patch:
https://lore.kernel.org/lkml/b60b4f72-6245-46db-a126-428fb13b6310@intel.com/
In summary, I find mbm_cntr_reset() ended up being a catch-all for random
cleanup and creates confusion with the other mbm_cntr_*() calls.

> 2 rdtgroup.c rdt_kill_sb()-> mbm_cntr_reset();
Please see my response to this usage in the related patch:
https://lore.kernel.org/lkml/8d04f824-d1cc-461c-9c57-0f26c6aa96e0@intel.com/
In summary, it does not solve the problem it originally set out to solve
and it can be eliminated.

> 
> I will have to introduce another function to reset RMIDs in all the domains. Also, make sure it is called from both these places.
> 
> list_for_each_entry(dom, &r->mon_domains, hdr.list)
>             resctrl_arch_reset_rmid_all(r, dom);

I do not see need for new functions, except the one I mention in 
https://lore.kernel.org/lkml/b60b4f72-6245-46db-a126-428fb13b6310@intel.com/
that suggests a new helper for reset of architectural state that does not
exist and ends up being open coded in two places in this series.

With only one place (resctrl_mbm_assign_mode_write()) remaining that needs
all state reset I think it will be easier to understand if the state reset
is open coded within it, replacing mbm_cntr_reset() with:

	list_for_each_entry(dom, &r->mon_domains, hdr.list) {
		mbm_cntr_free_all()
		resctrl_reset_rmid_all() // Just for architectural state
	}

I would not insist on reset of architectural state within the
architectural helper. I find that it is best for architecture to
maintain its state but I also see there are many precedent for
resctrl explicitly managing the state.

> I feel current code is much more cleaner.  What do you think?

It is better that previous versions, yes.

> 
>>
>>>
>>> The ABMC feature details are documented in APM listed below [1].
>>> [1] AMD64 Architecture Programmer's Manual Volume 2: System Programming
>>> Publication # 24593 Revision 3.41 section 19.3.3.3 Assignable Bandwidth
>>> Monitoring (ABMC).
>>>
>>> Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
>>> Signed-off-by: Babu Moger <babu.moger@amd.com>
>>> ---
>>
>> ...
>>

...

>>> + */
>>> +static void _resctrl_abmc_enable(struct rdt_resource *r, bool enable)
>>> +{
>>> +    struct rdt_mon_domain *d;
>>> +
>>> +    list_for_each_entry(d, &r->mon_domains, hdr.list)
>>> +        on_each_cpu_mask(&d->hdr.cpu_mask,
>>> +                 resctrl_abmc_set_one_amd, &enable, 1);
>>> +}
>>> +
>>> +int resctrl_arch_mbm_cntr_assign_set(struct rdt_resource *r, bool enable)
>>> +{
>>> +    struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
>>> +
>>> +    if (r->mon.mbm_cntr_assignable &&
>>> +        hw_res->mbm_cntr_assign_enabled != enable) {
>>> +        _resctrl_abmc_enable(r, enable);
>>> +        hw_res->mbm_cntr_assign_enabled = enable;
>>
>> Added benefit of resetting architectural state within this if statement
>> (perhaps simpler to be done within _resctrl_abmc_enable()) is that it will
>> not be done unnecessarily if ABMC is already in requested state.
> 
> It will be
>       list_for_each_entry(dom, &r->mon_domains, hdr.list)
>             resctrl_arch_reset_rmid_all(r, dom);

I am not sure if you are actually planning a new loop here ... as
I suggested above this can be added to _resctrl_abmc_enable() where
there is already a loop over all monitor domains and all that is
needed is to add a call to resctrl_arch_reset_rmid_all(r, dom). 
Even so, as I mentioned above, if after fixing automatic counter
unassignment you still find that resetting architectural and
non-architectural state together then we can go with that to match
the other flows (eg. mbm_config_write_domain()).

Reinette

Re: [PATCH v11 06/23] x86/resctrl: Add support to enable/disable AMD ABMC feature
Posted by Moger, Babu 1 year ago
Hi Reinette,

Lots of things in here.

On 2/6/2025 12:42 PM, Reinette Chatre wrote:
> Hi Babu,
> 
> On 2/6/25 8:15 AM, Moger, Babu wrote:
>> Hi Reinette,
>>
>> On 2/5/2025 4:49 PM, Reinette Chatre wrote:
>>> Hi Babu,
>>>
>>> On 1/22/25 12:20 PM, Babu Moger wrote:
>>>> Add the functionality to enable/disable AMD ABMC feature.
>>>>
>>>> AMD ABMC feature is enabled by setting enabled bit(0) in MSR
>>>> L3_QOS_EXT_CFG. When the state of ABMC is changed, the MSR needs
>>>> to be updated on all the logical processors in the QOS Domain.
>>>>
>>>> Hardware counters will reset when ABMC state is changed.
>>>
>>> I find that the state management in this series is organized better
>>> and easier to understand. I do think that it can be simplified more
>>> and a hint to this is that it is mentioned here but not done in the
>>> code introduced here but instead required from the caller. It seems
>>> simpler to me that the architectural state can just be reset at the
>>> same time as enable/disable of ABMC?
>>
>> Right now, it is done from mbm_cntr_reset(). It does both arch and non-arch state reset for all the RMIDs in all the domains. It is called in two places.
>>
>> 1 rdtgroup.c resctrl_mbm_assign_mode_write -> mbm_cntr_reset();
> Please see my response to this usage in the related patch:
> https://lore.kernel.org/lkml/b60b4f72-6245-46db-a126-428fb13b6310@intel.com/
> In summary, I find mbm_cntr_reset() ended up being a catch-all for random
> cleanup and creates confusion with the other mbm_cntr_*() calls.

Yes. It should work. Will respond that comment later.

>> 2 rdtgroup.c rdt_kill_sb()-> mbm_cntr_reset();
> Please see my response to this usage in the related patch:
> https://lore.kernel.org/lkml/8d04f824-d1cc-461c-9c57-0f26c6aa96e0@intel.com/
> In summary, it does not solve the problem it originally set out to solve
> and it can be eliminated.

Yes. Should be fine. mbm_cntr_reset() can be completely removed.
Will respond that comment later.

> 
>>
>> I will have to introduce another function to reset RMIDs in all the domains. Also, make sure it is called from both these places.
>>
>> list_for_each_entry(dom, &r->mon_domains, hdr.list)
>>              resctrl_arch_reset_rmid_all(r, dom);
> 
> I do not see need for new functions, except the one I mention in
> https://lore.kernel.org/lkml/b60b4f72-6245-46db-a126-428fb13b6310@intel.com/
> that suggests a new helper for reset of architectural state that does not
> exist and ends up being open coded in two places in this series.
> 
> With only one place (resctrl_mbm_assign_mode_write()) remaining that needs
> all state reset I think it will be easier to understand if the state reset
> is open coded within it, replacing mbm_cntr_reset() with:
> 
> 	list_for_each_entry(dom, &r->mon_domains, hdr.list) {
> 		mbm_cntr_free_all()
> 		resctrl_reset_rmid_all() // Just for architectural state
> 	}

You meant "Just for non-architectural state" ?


> I would not insist on reset of architectural state within the
> architectural helper. I find that it is best for architecture to
> maintain its state but I also see there are many precedent for
> resctrl explicitly managing the state.
> 
>> I feel current code is much more cleaner.  What do you think?
> 
> It is better that previous versions, yes.
> 
>>
>>>
>>>>
>>>> The ABMC feature details are documented in APM listed below [1].
>>>> [1] AMD64 Architecture Programmer's Manual Volume 2: System Programming
>>>> Publication # 24593 Revision 3.41 section 19.3.3.3 Assignable Bandwidth
>>>> Monitoring (ABMC).
>>>>
>>>> Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
>>>> Signed-off-by: Babu Moger <babu.moger@amd.com>
>>>> ---
>>>
>>> ...
>>>
> 
> ...
> 
>>>> + */
>>>> +static void _resctrl_abmc_enable(struct rdt_resource *r, bool enable)
>>>> +{
>>>> +    struct rdt_mon_domain *d;
>>>> +
>>>> +    list_for_each_entry(d, &r->mon_domains, hdr.list)
>>>> +        on_each_cpu_mask(&d->hdr.cpu_mask,
>>>> +                 resctrl_abmc_set_one_amd, &enable, 1);
>>>> +}
>>>> +
>>>> +int resctrl_arch_mbm_cntr_assign_set(struct rdt_resource *r, bool enable)
>>>> +{
>>>> +    struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
>>>> +
>>>> +    if (r->mon.mbm_cntr_assignable &&
>>>> +        hw_res->mbm_cntr_assign_enabled != enable) {
>>>> +        _resctrl_abmc_enable(r, enable);
>>>> +        hw_res->mbm_cntr_assign_enabled = enable;
>>>
>>> Added benefit of resetting architectural state within this if statement
>>> (perhaps simpler to be done within _resctrl_abmc_enable()) is that it will
>>> not be done unnecessarily if ABMC is already in requested state.
>>
>> It will be
>>        list_for_each_entry(dom, &r->mon_domains, hdr.list)
>>              resctrl_arch_reset_rmid_all(r, dom);
> 
> I am not sure if you are actually planning a new loop here ... as
> I suggested above this can be added to _resctrl_abmc_enable() where
> there is already a loop over all monitor domains and all that is
> needed is to add a call to resctrl_arch_reset_rmid_all(r, dom).

Sure.

> Even so, as I mentioned above, if after fixing automatic counter
> unassignment you still find that resetting architectural and
> non-architectural state together then we can go with that to match
> the other flows (eg. mbm_config_write_domain()).
> 

Sure.
Thanks
Babu
Re: [PATCH v11 06/23] x86/resctrl: Add support to enable/disable AMD ABMC feature
Posted by Reinette Chatre 1 year ago
Hi Babu,

On 2/6/25 2:57 PM, Moger, Babu wrote:
> On 2/6/2025 12:42 PM, Reinette Chatre wrote:
>> On 2/6/25 8:15 AM, Moger, Babu wrote:
>>> On 2/5/2025 4:49 PM, Reinette Chatre wrote:
>>>> On 1/22/25 12:20 PM, Babu Moger wrote:


>>>
>>> I will have to introduce another function to reset RMIDs in all the domains. Also, make sure it is called from both these places.
>>>
>>> list_for_each_entry(dom, &r->mon_domains, hdr.list)
>>>              resctrl_arch_reset_rmid_all(r, dom);
>>
>> I do not see need for new functions, except the one I mention in
>> https://lore.kernel.org/lkml/b60b4f72-6245-46db-a126-428fb13b6310@intel.com/
>> that suggests a new helper for reset of architectural state that does not
>> exist and ends up being open coded in two places in this series.
>>
>> With only one place (resctrl_mbm_assign_mode_write()) remaining that needs
>> all state reset I think it will be easier to understand if the state reset
>> is open coded within it, replacing mbm_cntr_reset() with:
>>
>>     list_for_each_entry(dom, &r->mon_domains, hdr.list) {
>>         mbm_cntr_free_all()
>>         resctrl_reset_rmid_all() // Just for architectural state
>>     }
> 
> You meant "Just for non-architectural state" ?

Yes, thank you, what a confusion causing typo.

Reinette