[PATCH v10 12/24] x86/resctrl: Introduce cntr_cfg to track assignable counters at domain

Babu Moger posted 24 patches 1 year ago
There is a newer version of this series
[PATCH v10 12/24] x86/resctrl: Introduce cntr_cfg to track assignable counters at domain
Posted by Babu Moger 1 year ago
In mbm_assign_mode, the MBM counters are assigned/unassigned to an RMID,
event pair in a resctrl group and monitor the bandwidth as long as it is
assigned. Counters are assigned/unassigned at domain level and needs to
be tracked at domain level.

Add the mbm_assign_cntr_cfg data structure to struct rdt_ctrl_domain to
manage and track MBM counter assignments at the domain level.

Suggested-by: Peter Newman <peternewman@google.com>
Signed-off-by: Babu Moger <babu.moger@amd.com>
---
v10: Patch changed completely to handle the counters at domain level.
     https://lore.kernel.org/lkml/CALPaoCj+zWq1vkHVbXYP0znJbe6Ke3PXPWjtri5AFgD9cQDCUg@mail.gmail.com/
     Removed Reviewed-by tag.
     Did not see the need to add cntr_id in mbm_state structure. Not used in the code.

v9: Added Reviewed-by tag. No other changes.

v8: Minor commit message changes.

v7: Added check mbm_cntr_assignable for allocating bitmap mbm_cntr_map

v6: New patch to add domain level assignment.
---
 arch/x86/kernel/cpu/resctrl/rdtgroup.c | 11 +++++++++++
 include/linux/resctrl.h                | 12 ++++++++++++
 2 files changed, 23 insertions(+)

diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 682f47e0beb1..1ee008a63d8b 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -4068,6 +4068,7 @@ static void __init rdtgroup_setup_default(void)
 
 static void domain_destroy_mon_state(struct rdt_mon_domain *d)
 {
+	kfree(d->cntr_cfg);
 	bitmap_free(d->rmid_busy_llc);
 	kfree(d->mbm_total);
 	kfree(d->mbm_local);
@@ -4141,6 +4142,16 @@ static int domain_setup_mon_state(struct rdt_resource *r, struct rdt_mon_domain
 			return -ENOMEM;
 		}
 	}
+	if (is_mbm_enabled() && r->mon.mbm_cntr_assignable) {
+		tsize = sizeof(*d->cntr_cfg);
+		d->cntr_cfg = kcalloc(r->mon.num_mbm_cntrs, tsize, GFP_KERNEL);
+		if (!d->cntr_cfg) {
+			bitmap_free(d->rmid_busy_llc);
+			kfree(d->mbm_total);
+			kfree(d->mbm_local);
+			return -ENOMEM;
+		}
+	}
 
 	return 0;
 }
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index c8ab3d7a0dab..03c67d9156f3 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -94,6 +94,16 @@ struct rdt_ctrl_domain {
 	u32				*mbps_val;
 };
 
+/**
+ * struct mbm_cntr_cfg -Assignable counter configuration
+ * @evtid:		Event type
+ * @rdtgroup:		Resctrl group assigned to the counter
+ */
+struct mbm_cntr_cfg {
+	enum resctrl_event_id	evtid;
+	struct rdtgroup		*rdtgrp;
+};
+
 /**
  * struct rdt_mon_domain - group of CPUs sharing a resctrl monitor resource
  * @hdr:		common header for different domain types
@@ -105,6 +115,7 @@ struct rdt_ctrl_domain {
  * @cqm_limbo:		worker to periodically read CQM h/w counters
  * @mbm_work_cpu:	worker CPU for MBM h/w counters
  * @cqm_work_cpu:	worker CPU for CQM h/w counters
+ * @cntr_cfg:		Assignable counters configuration
  */
 struct rdt_mon_domain {
 	struct rdt_domain_hdr		hdr;
@@ -116,6 +127,7 @@ struct rdt_mon_domain {
 	struct delayed_work		cqm_limbo;
 	int				mbm_work_cpu;
 	int				cqm_work_cpu;
+	struct mbm_cntr_cfg		*cntr_cfg;
 };
 
 /**
-- 
2.34.1
Re: [PATCH v10 12/24] x86/resctrl: Introduce cntr_cfg to track assignable counters at domain
Posted by Reinette Chatre 11 months, 4 weeks ago
Hi Babu,

Did subject intend to use name of new struct?

On 12/12/24 12:15 PM, Babu Moger wrote:
> In mbm_assign_mode, the MBM counters are assigned/unassigned to an RMID,
> event pair in a resctrl group and monitor the bandwidth as long as it is
> assigned. Counters are assigned/unassigned at domain level and needs to
> be tracked at domain level.
> 
> Add the mbm_assign_cntr_cfg data structure to struct rdt_ctrl_domain to

"mbm_assign_cntr_cfg" -> "mbm_cntr_cfg"

> manage and track MBM counter assignments at the domain level.

This can really use some more information about this data structure. I think
it will be helpful to provide more information about how the data structure
looks ... for example, that it is an array indexed by counter ID where the
assignment details of each counter is stored. I also think it will be helpful
to describe how interactions with  this data structure works, that a NULL
rdtgrp means that the counter is free and that it is not possible to find
a counter from a resource group and arrays need to be searched instead and doing
so is ok for $REASON (when considering the number of RMID and domain combinations
possible on AMD). A lot is left for the reader to figure out.
> 
> Suggested-by: Peter Newman <peternewman@google.com>
> Signed-off-by: Babu Moger <babu.moger@amd.com>
> ---
> v10: Patch changed completely to handle the counters at domain level.
>      https://lore.kernel.org/lkml/CALPaoCj+zWq1vkHVbXYP0znJbe6Ke3PXPWjtri5AFgD9cQDCUg@mail.gmail.com/
>      Removed Reviewed-by tag.
>      Did not see the need to add cntr_id in mbm_state structure. Not used in the code.
> 
> v9: Added Reviewed-by tag. No other changes.
> 
> v8: Minor commit message changes.
> 
> v7: Added check mbm_cntr_assignable for allocating bitmap mbm_cntr_map
> 
> v6: New patch to add domain level assignment.
> ---
>  arch/x86/kernel/cpu/resctrl/rdtgroup.c | 11 +++++++++++
>  include/linux/resctrl.h                | 12 ++++++++++++
>  2 files changed, 23 insertions(+)
> 
> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> index 682f47e0beb1..1ee008a63d8b 100644
> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> @@ -4068,6 +4068,7 @@ static void __init rdtgroup_setup_default(void)
>  
>  static void domain_destroy_mon_state(struct rdt_mon_domain *d)
>  {
> +	kfree(d->cntr_cfg);
>  	bitmap_free(d->rmid_busy_llc);
>  	kfree(d->mbm_total);
>  	kfree(d->mbm_local);
> @@ -4141,6 +4142,16 @@ static int domain_setup_mon_state(struct rdt_resource *r, struct rdt_mon_domain
>  			return -ENOMEM;
>  		}
>  	}
> +	if (is_mbm_enabled() && r->mon.mbm_cntr_assignable) {
> +		tsize = sizeof(*d->cntr_cfg);
> +		d->cntr_cfg = kcalloc(r->mon.num_mbm_cntrs, tsize, GFP_KERNEL);
> +		if (!d->cntr_cfg) {
> +			bitmap_free(d->rmid_busy_llc);
> +			kfree(d->mbm_total);
> +			kfree(d->mbm_local);
> +			return -ENOMEM;
> +		}
> +	}
>  
>  	return 0;
>  }
> diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
> index c8ab3d7a0dab..03c67d9156f3 100644
> --- a/include/linux/resctrl.h
> +++ b/include/linux/resctrl.h
> @@ -94,6 +94,16 @@ struct rdt_ctrl_domain {
>  	u32				*mbps_val;
>  };
>  
> +/**
> + * struct mbm_cntr_cfg -Assignable counter configuration

Please compare with style use in rest of the file. For example,
"-Assignable" -> "- assignable"

> + * @evtid:		Event type

This description is not useful. Consider: "MBM event to which
the counter is assigned. Only valid if @rdtgroup is not NULL."
(This was the first thing that came to my mind, please improve)

> + * @rdtgroup:		Resctrl group assigned to the counter

Can add "NULL if counter is free"

> + */
> +struct mbm_cntr_cfg {
> +	enum resctrl_event_id	evtid;
> +	struct rdtgroup		*rdtgrp;
> +};
> +
>  /**
>   * struct rdt_mon_domain - group of CPUs sharing a resctrl monitor resource
>   * @hdr:		common header for different domain types
> @@ -105,6 +115,7 @@ struct rdt_ctrl_domain {
>   * @cqm_limbo:		worker to periodically read CQM h/w counters
>   * @mbm_work_cpu:	worker CPU for MBM h/w counters
>   * @cqm_work_cpu:	worker CPU for CQM h/w counters
> + * @cntr_cfg:		Assignable counters configuration

Match capitalization of surrounding text. 
Will be helpful to add that this is an array indexed by counter ID.

>   */
>  struct rdt_mon_domain {
>  	struct rdt_domain_hdr		hdr;
> @@ -116,6 +127,7 @@ struct rdt_mon_domain {
>  	struct delayed_work		cqm_limbo;
>  	int				mbm_work_cpu;
>  	int				cqm_work_cpu;
> +	struct mbm_cntr_cfg		*cntr_cfg;
>  };
>  
>  /**

Reinette
Re: [PATCH v10 12/24] x86/resctrl: Introduce cntr_cfg to track assignable counters at domain
Posted by Moger, Babu 11 months, 4 weeks ago
Hi Reinette,

On 12/19/2024 4:33 PM, Reinette Chatre wrote:
> Hi Babu,
> 
> Did subject intend to use name of new struct?

Yes. Will change it to "mbm_cntr_cfg.
> 
> On 12/12/24 12:15 PM, Babu Moger wrote:
>> In mbm_assign_mode, the MBM counters are assigned/unassigned to an RMID,
>> event pair in a resctrl group and monitor the bandwidth as long as it is
>> assigned. Counters are assigned/unassigned at domain level and needs to
>> be tracked at domain level.
>>
>> Add the mbm_assign_cntr_cfg data structure to struct rdt_ctrl_domain to
> 
> "mbm_assign_cntr_cfg" -> "mbm_cntr_cfg"

Sure.

> 
>> manage and track MBM counter assignments at the domain level.
> 
> This can really use some more information about this data structure. I think
> it will be helpful to provide more information about how the data structure
> looks ... for example, that it is an array indexed by counter ID where the
> assignment details of each counter is stored. I also think it will be helpful
> to describe how interactions with  this data structure works, that a NULL
> rdtgrp means that the counter is free and that it is not possible to find
> a counter from a resource group and arrays need to be searched instead and doing
> so is ok for $REASON (when considering the number of RMID and domain combinations
> possible on AMD). A lot is left for the reader to figure out.

How about this?


In mbm_assign_mode, the MBM counters are assigned/unassigned to an RMID,
event pair in a resctrl group and monitor the bandwidth as long as it is
assigned. Counters are assigned/unassigned at domain level and needs to
be tracked at domain level.

Add the mbm_cntr_cfg data structure to struct rdt_ctrl_domain to
manage and track MBM counter assignments at the domain level.

Each domain will contain num_mbm_cntrs entries, indexed by cntr_id. 
During initialization, all entries will be set to zero. When a counter 
is allocated, its corresponding entry will be populated with the 
assigned struct rdtgroup and enum resctrl_event_id. When the counter is 
released, its entry will be reset to zero.

>>
>> Suggested-by: Peter Newman <peternewman@google.com>
>> Signed-off-by: Babu Moger <babu.moger@amd.com>
>> ---
>> v10: Patch changed completely to handle the counters at domain level.
>>       https://lore.kernel.org/lkml/CALPaoCj+zWq1vkHVbXYP0znJbe6Ke3PXPWjtri5AFgD9cQDCUg@mail.gmail.com/
>>       Removed Reviewed-by tag.
>>       Did not see the need to add cntr_id in mbm_state structure. Not used in the code.
>>
>> v9: Added Reviewed-by tag. No other changes.
>>
>> v8: Minor commit message changes.
>>
>> v7: Added check mbm_cntr_assignable for allocating bitmap mbm_cntr_map
>>
>> v6: New patch to add domain level assignment.
>> ---
>>   arch/x86/kernel/cpu/resctrl/rdtgroup.c | 11 +++++++++++
>>   include/linux/resctrl.h                | 12 ++++++++++++
>>   2 files changed, 23 insertions(+)
>>
>> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> index 682f47e0beb1..1ee008a63d8b 100644
>> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> @@ -4068,6 +4068,7 @@ static void __init rdtgroup_setup_default(void)
>>   
>>   static void domain_destroy_mon_state(struct rdt_mon_domain *d)
>>   {
>> +	kfree(d->cntr_cfg);
>>   	bitmap_free(d->rmid_busy_llc);
>>   	kfree(d->mbm_total);
>>   	kfree(d->mbm_local);
>> @@ -4141,6 +4142,16 @@ static int domain_setup_mon_state(struct rdt_resource *r, struct rdt_mon_domain
>>   			return -ENOMEM;
>>   		}
>>   	}
>> +	if (is_mbm_enabled() && r->mon.mbm_cntr_assignable) {
>> +		tsize = sizeof(*d->cntr_cfg);
>> +		d->cntr_cfg = kcalloc(r->mon.num_mbm_cntrs, tsize, GFP_KERNEL);
>> +		if (!d->cntr_cfg) {
>> +			bitmap_free(d->rmid_busy_llc);
>> +			kfree(d->mbm_total);
>> +			kfree(d->mbm_local);
>> +			return -ENOMEM;
>> +		}
>> +	}
>>   
>>   	return 0;
>>   }
>> diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
>> index c8ab3d7a0dab..03c67d9156f3 100644
>> --- a/include/linux/resctrl.h
>> +++ b/include/linux/resctrl.h
>> @@ -94,6 +94,16 @@ struct rdt_ctrl_domain {
>>   	u32				*mbps_val;
>>   };
>>   
>> +/**
>> + * struct mbm_cntr_cfg -Assignable counter configuration
> 
> Please compare with style use in rest of the file. For example,
> "-Assignable" -> "- assignable"

Sure.

> 
>> + * @evtid:		Event type
> 
> This description is not useful. Consider: "MBM event to which
> the counter is assigned. Only valid if @rdtgroup is not NULL."
> (This was the first thing that came to my mind, please improve)
> 
>> + * @rdtgroup:		Resctrl group assigned to the counter
> 
> Can add "NULL if counter is free"

Sure.

> 
>> + */
>> +struct mbm_cntr_cfg {
>> +	enum resctrl_event_id	evtid;
>> +	struct rdtgroup		*rdtgrp;
>> +};
>> +
>>   /**
>>    * struct rdt_mon_domain - group of CPUs sharing a resctrl monitor resource
>>    * @hdr:		common header for different domain types
>> @@ -105,6 +115,7 @@ struct rdt_ctrl_domain {
>>    * @cqm_limbo:		worker to periodically read CQM h/w counters
>>    * @mbm_work_cpu:	worker CPU for MBM h/w counters
>>    * @cqm_work_cpu:	worker CPU for CQM h/w counters
>> + * @cntr_cfg:		Assignable counters configuration
> 
> Match capitalization of surrounding text.
> Will be helpful to add that this is an array indexed by counter ID.

ok. Sure.

> 
>>    */
>>   struct rdt_mon_domain {
>>   	struct rdt_domain_hdr		hdr;
>> @@ -116,6 +127,7 @@ struct rdt_mon_domain {
>>   	struct delayed_work		cqm_limbo;
>>   	int				mbm_work_cpu;
>>   	int				cqm_work_cpu;
>> +	struct mbm_cntr_cfg		*cntr_cfg;
>>   };
>>   
>>   /**
> 
> Reinette
>
Re: [PATCH v10 12/24] x86/resctrl: Introduce cntr_cfg to track assignable counters at domain
Posted by Reinette Chatre 11 months, 4 weeks ago
Hi Babu,

On 12/20/24 9:33 AM, Moger, Babu wrote:
> On 12/19/2024 4:33 PM, Reinette Chatre wrote:
>> On 12/12/24 12:15 PM, Babu Moger wrote:
>>> In mbm_assign_mode, the MBM counters are assigned/unassigned to an RMID,
>>> event pair in a resctrl group and monitor the bandwidth as long as it is
>>> assigned. Counters are assigned/unassigned at domain level and needs to
>>> be tracked at domain level.
>>>
>>> Add the mbm_assign_cntr_cfg data structure to struct rdt_ctrl_domain to
>>
>> "mbm_assign_cntr_cfg" -> "mbm_cntr_cfg"
> 
> Sure.
> 
>>
>>> manage and track MBM counter assignments at the domain level.
>>
>> This can really use some more information about this data structure. I think
>> it will be helpful to provide more information about how the data structure
>> looks ... for example, that it is an array indexed by counter ID where the
>> assignment details of each counter is stored. I also think it will be helpful
>> to describe how interactions with  this data structure works, that a NULL
>> rdtgrp means that the counter is free and that it is not possible to find
>> a counter from a resource group and arrays need to be searched instead and doing
>> so is ok for $REASON (when considering the number of RMID and domain combinations
>> possible on AMD). A lot is left for the reader to figure out.
> 
> How about this?
> 
> 
> In mbm_assign_mode, the MBM counters are assigned/unassigned to an RMID,
> event pair in a resctrl group and monitor the bandwidth as long as it is
> assigned. Counters are assigned/unassigned at domain level and needs to
> be tracked at domain level.
> 
> Add the mbm_cntr_cfg data structure to struct rdt_ctrl_domain to
> manage and track MBM counter assignments at the domain level.
> 
> Each domain will contain num_mbm_cntrs entries, indexed by cntr_id. During initialization, all entries will be set to zero. When a counter is allocated, its corresponding entry will be populated with the assigned struct rdtgroup and enum resctrl_event_id. When the counter is released, its entry will be reset to zero.

It will be better if you take a step back and create a coherent changelog
instead of appending independent text snippets. What you present has the
same mistake as before (mbm_assign_mode vs mbm_cntr_assign mode) and does
not address all the points raised.
Consider something like below (please check, improve, and complete):

	In mbm_cntr_assign mode hardware counters are assigned/unassigned
	to an MBM event of a monitor group. Hardware counters are
	assigned/unassigned at monitoring domain level.

	Manage a monitoring domain's hardware counters using a per monitoring
	domain array of struct mbm_cntr_cfg that is indexed by the hardware
	counter	ID. A hardware counter's configuration contains the MBM event
	ID and points to the monitoring group that it is assigned to, with a
	NULL pointer meaning that the hardware counter is available for assignment.

	There is no direct way to determine which hardware counters are	assigned
	to a particular monitoring group. Check every entry of every hardware
	counter	configuration array in every monitoring domain to query which
	MBM events of a monitoring group is tracked by hardware. Such queries
	are acceptable because <insert reason here>.

                                                                                
Please work on creating good changelogs. The requirements should be clear to you. 

Reinette