[PATCH v4 09/13] x86/resctrl: Add sysfs interface files to read/write event configuration

Babu Moger posted 13 patches 3 years, 7 months ago
Posted by Babu Moger 3 years, 7 months ago
Add two new sysfs files to read/write the event configuration if
the feature Bandwidth Monitoring Event Configuration (BMEC) is
supported. The file mbm_local_config is for the configuration
of the event mbm_local_bytes and the file mbm_total_config is
for the configuration of mbm_total_bytes.

$ ls /sys/fs/resctrl/mon_data/mon_L3_00/mbm_local*
/sys/fs/resctrl/mon_data/mon_L3_00/mbm_local_bytes
/sys/fs/resctrl/mon_data/mon_L3_00/mbm_local_config

$ ls /sys/fs/resctrl/mon_data/mon_L3_00/mbm_total*
/sys/fs/resctrl/mon_data/mon_L3_00/mbm_total_bytes
/sys/fs/resctrl/mon_data/mon_L3_00/mbm_total_config

Signed-off-by: Babu Moger <babu.moger@amd.com>
---
 arch/x86/kernel/cpu/resctrl/rdtgroup.c |   40 ++++++++++++++++++++++++--------
 1 file changed, 30 insertions(+), 10 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index f55a693fa958..da11fdad204d 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -254,6 +254,10 @@ static const struct kernfs_ops kf_mondata_ops = {
 	.seq_show		= rdtgroup_mondata_show,
 };
 
+static const struct kernfs_ops kf_mondata_config_ops = {
+	.atomic_write_len       = PAGE_SIZE,
+};
+
 static bool is_cpu_list(struct kernfs_open_file *of)
 {
 	struct rftype *rft = of->kn->priv;
@@ -2478,24 +2482,40 @@ static struct file_system_type rdt_fs_type = {
 	.kill_sb		= rdt_kill_sb,
 };
 
-static int mon_addfile(struct kernfs_node *parent_kn, const char *name,
+static int mon_addfile(struct kernfs_node *parent_kn, struct mon_evt *mevt,
 		       void *priv)
 {
-	struct kernfs_node *kn;
+	struct kernfs_node *kn_evt, *kn_evt_config;
 	int ret = 0;
 
-	kn = __kernfs_create_file(parent_kn, name, 0444,
-				  GLOBAL_ROOT_UID, GLOBAL_ROOT_GID, 0,
-				  &kf_mondata_ops, priv, NULL, NULL);
-	if (IS_ERR(kn))
-		return PTR_ERR(kn);
+	kn_evt = __kernfs_create_file(parent_kn, mevt->name, 0444,
+			GLOBAL_ROOT_UID, GLOBAL_ROOT_GID, 0,
+			&kf_mondata_ops, priv, NULL, NULL);
+	if (IS_ERR(kn_evt))
+		return PTR_ERR(kn_evt);
 
-	ret = rdtgroup_kn_set_ugid(kn);
+	ret = rdtgroup_kn_set_ugid(kn_evt);
 	if (ret) {
-		kernfs_remove(kn);
+		kernfs_remove(kn_evt);
 		return ret;
 	}
 
+	if (mevt->configurable) {
+		kn_evt_config = __kernfs_create_file(parent_kn,
+				mevt->config_name, 0644,
+				GLOBAL_ROOT_UID, GLOBAL_ROOT_GID, 0,
+				&kf_mondata_config_ops, priv, NULL, NULL);
+		if (IS_ERR(kn_evt_config))
+			return PTR_ERR(kn_evt_config);
+
+		ret = rdtgroup_kn_set_ugid(kn_evt_config);
+		if (ret) {
+			kernfs_remove(kn_evt_config);
+			kernfs_remove(kn_evt);
+			return ret;
+		}
+	}
+
 	return ret;
 }
 
@@ -2550,7 +2570,7 @@ static int mkdir_mondata_subdir(struct kernfs_node *parent_kn,
 	priv.u.domid = d->id;
 	list_for_each_entry(mevt, &r->evt_list, list) {
 		priv.u.evtid = mevt->evtid;
-		ret = mon_addfile(kn, mevt->name, priv.priv);
+		ret = mon_addfile(kn, mevt, priv.priv);
 		if (ret)
 			goto out_destroy;
 

Re: [PATCH v4 09/13] x86/resctrl: Add sysfs interface files to read/write event configuration
Posted by Reinette Chatre 3 years, 6 months ago
Hi Babu,

On 9/7/2022 11:01 AM, Babu Moger wrote:
> Add two new sysfs files to read/write the event configuration if
> the feature Bandwidth Monitoring Event Configuration (BMEC) is
> supported. The file mbm_local_config is for the configuration
> of the event mbm_local_bytes and the file mbm_total_config is
> for the configuration of mbm_total_bytes.
> 
> $ls /sys/fs/resctrl/mon_data/mon_L3_00/mbm_local*
> /sys/fs/resctrl/mon_data/mon_L3_00/mbm_local_bytes
> /sys/fs/resctrl/mon_data/mon_L3_00/mbm_local_config
> 
> $ls /sys/fs/resctrl/mon_data/mon_L3_00/mbm_total*
> /sys/fs/resctrl/mon_data/mon_L3_00/mbm_total_bytes
> /sys/fs/resctrl/mon_data/mon_L3_00/mbm_total_config
> 

This patch makes the mbm*config files per monitor group. Looking
ahead at how later patches set the configuration, it is not clear
to me that this is the right place for these configuration files.

Looking ahead to patch 10, there is neither rmid nor closid within
the (MSR_IA32_EVT_CFG_BASE + index) register - it only takes
the bits indicating which access types need to be counted. Also
in patch 10 I understand that the scope of this register is per L3 cache
domain.

Considering this, why is the sysfs file associated with each
monitor group?

For example, consider the following scenario:
# cd /sys/fs/resctrl
# mkdir g2
# mkdir mon_groups/m1
# mkdir mon_groups/m2
# find . | grep mbm_local_config
./mon_data/mon_L3_00/mbm_local_config
./mon_data/mon_L3_01/mbm_local_config
./g2/mon_data/mon_L3_00/mbm_local_config
./g2/mon_data/mon_L3_01/mbm_local_config
./mon_groups/m2/mon_data/mon_L3_00/mbm_local_config
./mon_groups/m2/mon_data/mon_L3_01/mbm_local_config
./mon_groups/m1/mon_data/mon_L3_00/mbm_local_config
./mon_groups/m1/mon_data/mon_L3_01/mbm_local_config


From what I understand, the following sysfs files are
associated with cache domain #0 and thus writing to any of these
files would change the same configuration:
./mon_data/mon_L3_00/mbm_local_config
./g2/mon_data/mon_L3_00/mbm_local_config
./mon_groups/m2/mon_data/mon_L3_00/mbm_local_config
./mon_groups/m1/mon_data/mon_L3_00/mbm_local_config

Could you please correct me where I am wrong?
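
The concern above can be illustrated with a small userspace model. This is
only a sketch under the assumption stated in the review, namely that the
event configuration lives in the L3 domain and not in the monitor group.
All names here (struct l3_domain, struct mon_group, the two helpers) are
hypothetical and are not part of the patch or of resctrl.

```c
#include <assert.h>

#define NUM_DOMAINS 2

/* Hypothetical model: one configuration value per L3 domain. */
struct l3_domain {
	unsigned int mbm_local_config;
};

static struct l3_domain domains[NUM_DOMAINS];

/* A group only identifies whose counts are read; it stores no config. */
struct mon_group {
	const char *name;
};

/* The group argument is deliberately unused: the write lands in the domain. */
static void group_write_local_config(const struct mon_group *grp, int dom_id,
				     unsigned int val)
{
	(void)grp;
	assert(dom_id >= 0 && dom_id < NUM_DOMAINS);
	domains[dom_id].mbm_local_config = val;
}

/* Reading through any group returns the single per-domain value. */
static unsigned int group_read_local_config(const struct mon_group *grp,
					    int dom_id)
{
	(void)grp;
	assert(dom_id >= 0 && dom_id < NUM_DOMAINS);
	return domains[dom_id].mbm_local_config;
}
```

In this model, a value written through m1 for domain 0 is the value read
back through g2 for domain 0, which is the aliasing the four paths listed
above would exhibit if the understanding in this mail is correct.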


> Signed-off-by: Babu Moger <babu.moger@amd.com>
> ---
>  arch/x86/kernel/cpu/resctrl/rdtgroup.c |   40 ++++++++++++++++++++++++--------
>  1 file changed, 30 insertions(+), 10 deletions(-)
> 
> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> index f55a693fa958..da11fdad204d 100644
> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> @@ -254,6 +254,10 @@ static const struct kernfs_ops kf_mondata_ops = {
>  	.seq_show		= rdtgroup_mondata_show,
>  };
>  
> +static const struct kernfs_ops kf_mondata_config_ops = {
> +	.atomic_write_len       = PAGE_SIZE,
> +};
> +

Please use coding style (tabs vs spaces) that is consistent with area
you are contributing to.

>  static bool is_cpu_list(struct kernfs_open_file *of)
>  {
>  	struct rftype *rft = of->kn->priv;
> @@ -2478,24 +2482,40 @@ static struct file_system_type rdt_fs_type = {
>  	.kill_sb		= rdt_kill_sb,
>  };
>  
> -static int mon_addfile(struct kernfs_node *parent_kn, const char *name,
> +static int mon_addfile(struct kernfs_node *parent_kn, struct mon_evt *mevt,
>  		       void *priv)
>  {
> -	struct kernfs_node *kn;
> +	struct kernfs_node *kn_evt, *kn_evt_config;
>  	int ret = 0;
>  
> -	kn = __kernfs_create_file(parent_kn, name, 0444,
> -				  GLOBAL_ROOT_UID, GLOBAL_ROOT_GID, 0,
> -				  &kf_mondata_ops, priv, NULL, NULL);
> -	if (IS_ERR(kn))
> -		return PTR_ERR(kn);
> +	kn_evt = __kernfs_create_file(parent_kn, mevt->name, 0444,
> +			GLOBAL_ROOT_UID, GLOBAL_ROOT_GID, 0,
> +			&kf_mondata_ops, priv, NULL, NULL);

Please run your series through checkpatch (alignment issue above)

> +	if (IS_ERR(kn_evt))
> +		return PTR_ERR(kn_evt);
>  
> -	ret = rdtgroup_kn_set_ugid(kn);
> +	ret = rdtgroup_kn_set_ugid(kn_evt);
>  	if (ret) {
> -		kernfs_remove(kn);
> +		kernfs_remove(kn_evt);
>  		return ret;
>  	}
>  
> +	if (mevt->configurable) {
> +		kn_evt_config = __kernfs_create_file(parent_kn,
> +				mevt->config_name, 0644,
> +				GLOBAL_ROOT_UID, GLOBAL_ROOT_GID, 0,
> +				&kf_mondata_config_ops, priv, NULL, NULL);
> +		if (IS_ERR(kn_evt_config))
> +			return PTR_ERR(kn_evt_config);
> +

Since an error is returned here it seems that some cleanup (kn_evt) is missing?
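
A minimal userspace sketch of the cleanup pattern being asked for, assuming
the intent is that a failure while creating the second file also removes the
first. The node type and the create_node()/remove_node() helpers are
hypothetical stand-ins, not the kernfs API, and this is not the fix that was
eventually posted.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a created file node. */
struct node {
	int live;
};

static struct node nodes[2];
static int create_calls;
static int fail_on_create = -1;	/* index of the create call that fails, or -1 */

static void reset_nodes(void)
{
	nodes[0].live = 0;
	nodes[1].live = 0;
	create_calls = 0;
	fail_on_create = -1;
}

static struct node *create_node(void)
{
	int idx = create_calls++;

	if (idx == fail_on_create || idx >= 2)
		return NULL;
	nodes[idx].live = 1;
	return &nodes[idx];
}

static void remove_node(struct node *n)
{
	assert(n != NULL);
	n->live = 0;
}

/* Same shape as mon_addfile(): event file first, config file second. */
static int add_event_files(int configurable)
{
	struct node *evt;

	evt = create_node();
	if (!evt)
		return -1;

	if (configurable && !create_node())
		goto out_remove_evt;

	return 0;

out_remove_evt:
	/* Undo the first creation so no half-built pair is left behind. */
	remove_node(evt);
	return -1;
}

static int live_nodes(void)
{
	return nodes[0].live + nodes[1].live;
}
```

With the second creation forced to fail, add_event_files() returns an error
and live_nodes() reports that nothing was left behind.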


> +		ret = rdtgroup_kn_set_ugid(kn_evt_config);
> +		if (ret) {
> +			kernfs_remove(kn_evt_config);
> +			kernfs_remove(kn_evt);
> +			return ret;
> +		}
> +	}
> +
>  	return ret;
>  }
>  

Reinette
Re: [PATCH v4 09/13] x86/resctrl: Add sysfs interface files to read/write event configuration
Posted by Moger, Babu 3 years, 6 months ago
Hi Reinette,

On 9/16/22 10:58, Reinette Chatre wrote:
> Hi Babu,
>
> On 9/7/2022 11:01 AM, Babu Moger wrote:
>> Add two new sysfs files to read/write the event configuration if
>> the feature Bandwidth Monitoring Event Configuration (BMEC) is
>> supported. The file mbm_local_config is for the configuration
>> of the event mbm_local_bytes and the file mbm_total_config is
>> for the configuration of mbm_total_bytes.
>>
>> $ls /sys/fs/resctrl/mon_data/mon_L3_00/mbm_local*
>> /sys/fs/resctrl/mon_data/mon_L3_00/mbm_local_bytes
>> /sys/fs/resctrl/mon_data/mon_L3_00/mbm_local_config
>>
>> $ls /sys/fs/resctrl/mon_data/mon_L3_00/mbm_total*
>> /sys/fs/resctrl/mon_data/mon_L3_00/mbm_total_bytes
>> /sys/fs/resctrl/mon_data/mon_L3_00/mbm_total_config
>>
> This patch makes the mbm*config files per monitor group. Looking
> ahead at later patches how the configuration is set it is not clear
> to me that this is the right place for these configuration files.
>
> Looking ahead to patch 10 there is neither rmid nor closid within
> the (MSR_IA32_EVT_CFG_BASE + index) register - it only takes
> the bits indicating what access types needs to be counted. Also
> in patch 10 I understand that the scope of this register is per L3 cache
> domain.
Yes. The scope of MSR_IA32_EVT_CFG_BASE is per L3 domain.
>
> Considering this, why is the sysfs file associated with each
> monitor group?
Please see the response below.
>
> For example, consider the following scenario:
> # cd /sys/fs/resctrl
> # mkdir g2
> # mkdir mon_groups/m1
> # mkdir mon_groups/m2
> # find . | grep mbm_local_config
> ./mon_data/mon_L3_00/mbm_local_config
> ./mon_data/mon_L3_01/mbm_local_config
> ./g2/mon_data/mon_L3_00/mbm_local_config
> ./g2/mon_data/mon_L3_01/mbm_local_config
> ./mon_groups/m2/mon_data/mon_L3_00/mbm_local_config
> ./mon_groups/m2/mon_data/mon_L3_01/mbm_local_config
> ./mon_groups/m1/mon_data/mon_L3_00/mbm_local_config
> ./mon_groups/m1/mon_data/mon_L3_01/mbm_local_config
>
>
> From what I understand, the following sysfs files are
> associated with cache domain #0 and thus writing to any of these
> files would change the same configuration:
> ./mon_data/mon_L3_00/mbm_local_config
> ./g2/mon_data/mon_L3_00/mbm_local_config
> ./mon_groups/m2/mon_data/mon_L3_00/mbm_local_config
> ./mon_groups/m1/mon_data/mon_L3_00/mbm_local_config
>
> Could you please correct me where I am wrong?

For example, we have CPUs 0-7 in domain 0. We have two counters which are
configurable.

Let's consider the same example you mentioned above.

g2 is a control group.

m1 and m2 are monitor groups.

We can have control group g2 with CPUs 0-7 to limit the L3 bandwidth (or
memory bandwidth with the required schemata setting).

We can have mon group m1 with CPUs 0-3 to monitor mbm_local_bytes.

We can have mon group m2 with CPUs 4-7 to monitor mbm_total_bytes.

Each group independently monitors a separate thing. Without the sysfs
files (mbm_local_config and mbm_total_config) in each monitor group,
we won't be able to set up the above configuration.


>
>> Signed-off-by: Babu Moger <babu.moger@amd.com>
>> ---
>>  arch/x86/kernel/cpu/resctrl/rdtgroup.c |   40 ++++++++++++++++++++++++--------
>>  1 file changed, 30 insertions(+), 10 deletions(-)
>>
>> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> index f55a693fa958..da11fdad204d 100644
>> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> @@ -254,6 +254,10 @@ static const struct kernfs_ops kf_mondata_ops = {
>>  	.seq_show		= rdtgroup_mondata_show,
>>  };
>>  
>> +static const struct kernfs_ops kf_mondata_config_ops = {
>> +	.atomic_write_len       = PAGE_SIZE,
>> +};
>> +
> Please use coding style (tabs vs spaces) that is consistent with area
> you are contributing to.
Sure
>
>>  static bool is_cpu_list(struct kernfs_open_file *of)
>>  {
>>  	struct rftype *rft = of->kn->priv;
>> @@ -2478,24 +2482,40 @@ static struct file_system_type rdt_fs_type = {
>>  	.kill_sb		= rdt_kill_sb,
>>  };
>>  
>> -static int mon_addfile(struct kernfs_node *parent_kn, const char *name,
>> +static int mon_addfile(struct kernfs_node *parent_kn, struct mon_evt *mevt,
>>  		       void *priv)
>>  {
>> -	struct kernfs_node *kn;
>> +	struct kernfs_node *kn_evt, *kn_evt_config;
>>  	int ret = 0;
>>  
>> -	kn = __kernfs_create_file(parent_kn, name, 0444,
>> -				  GLOBAL_ROOT_UID, GLOBAL_ROOT_GID, 0,
>> -				  &kf_mondata_ops, priv, NULL, NULL);
>> -	if (IS_ERR(kn))
>> -		return PTR_ERR(kn);
>> +	kn_evt = __kernfs_create_file(parent_kn, mevt->name, 0444,
>> +			GLOBAL_ROOT_UID, GLOBAL_ROOT_GID, 0,
>> +			&kf_mondata_ops, priv, NULL, NULL);
> Please run your series through checkpatch (alignment issue above)
Sure
>
>> +	if (IS_ERR(kn_evt))
>> +		return PTR_ERR(kn_evt);
>>  
>> -	ret = rdtgroup_kn_set_ugid(kn);
>> +	ret = rdtgroup_kn_set_ugid(kn_evt);
>>  	if (ret) {
>> -		kernfs_remove(kn);
>> +		kernfs_remove(kn_evt);
>>  		return ret;
>>  	}
>>  
>> +	if (mevt->configurable) {
>> +		kn_evt_config = __kernfs_create_file(parent_kn,
>> +				mevt->config_name, 0644,
>> +				GLOBAL_ROOT_UID, GLOBAL_ROOT_GID, 0,
>> +				&kf_mondata_config_ops, priv, NULL, NULL);
>> +		if (IS_ERR(kn_evt_config))
>> +			return PTR_ERR(kn_evt_config);
>> +
> Since an error is returned here it seems that some cleanup (kn_evt) is missing?

Yes. That is correct.  Will fix it.

Thanks

Babu

>
>
>> +		ret = rdtgroup_kn_set_ugid(kn_evt_config);
>> +		if (ret) {
>> +			kernfs_remove(kn_evt_config);
>> +			kernfs_remove(kn_evt);
>> +			return ret;
>> +		}
>> +	}
>> +
>>  	return ret;
>>  }
>>  
> Reinette

-- 
Thanks
Babu Moger

Re: [PATCH v4 09/13] x86/resctrl: Add sysfs interface files to read/write event configuration
Posted by Reinette Chatre 3 years, 6 months ago
Hi Babu,

On 9/19/2022 8:46 AM, Moger, Babu wrote:
> Hi Reinette,
> 
> On 9/16/22 10:58, Reinette Chatre wrote:
>> Hi Babu,
>>
>> On 9/7/2022 11:01 AM, Babu Moger wrote:
>>> Add two new sysfs files to read/write the event configuration if
>>> the feature Bandwidth Monitoring Event Configuration (BMEC) is
>>> supported. The file mbm_local_config is for the configuration
>>> of the event mbm_local_bytes and the file mbm_total_config is
>>> for the configuration of mbm_total_bytes.
>>>
>>> $ls /sys/fs/resctrl/mon_data/mon_L3_00/mbm_local*
>>> /sys/fs/resctrl/mon_data/mon_L3_00/mbm_local_bytes
>>> /sys/fs/resctrl/mon_data/mon_L3_00/mbm_local_config
>>>
>>> $ls /sys/fs/resctrl/mon_data/mon_L3_00/mbm_total*
>>> /sys/fs/resctrl/mon_data/mon_L3_00/mbm_total_bytes
>>> /sys/fs/resctrl/mon_data/mon_L3_00/mbm_total_config
>>>
>> This patch makes the mbm*config files per monitor group. Looking
>> ahead at later patches how the configuration is set it is not clear
>> to me that this is the right place for these configuration files.
>>
>> Looking ahead to patch 10 there is neither rmid nor closid within
>> the (MSR_IA32_EVT_CFG_BASE + index) register - it only takes
>> the bits indicating what access types needs to be counted. Also
>> in patch 10 I understand that the scope of this register is per L3 cache
>> domain.
> Yes. Scope of  MSR_IA32_EVT_CFG_BASE per L3 domain.
>>
>> Considering this, why is the sysfs file associated with each
>> monitor group?
> Please see the response below.
>>
>> For example, consider the following scenario:
>> # cd /sys/fs/resctrl
>> # mkdir g2
>> # mkdir mon_groups/m1
>> # mkdir mon_groups/m2
>> # find . | grep mbm_local_config
>> ./mon_data/mon_L3_00/mbm_local_config
>> ./mon_data/mon_L3_01/mbm_local_config
>> ./g2/mon_data/mon_L3_00/mbm_local_config
>> ./g2/mon_data/mon_L3_01/mbm_local_config
>> ./mon_groups/m2/mon_data/mon_L3_00/mbm_local_config
>> ./mon_groups/m2/mon_data/mon_L3_01/mbm_local_config
>> ./mon_groups/m1/mon_data/mon_L3_00/mbm_local_config
>> ./mon_groups/m1/mon_data/mon_L3_01/mbm_local_config
>>
>>
>> From what I understand, the following sysfs files are
>> associated with cache domain #0 and thus writing to any of these
>> files would change the same configuration:
>> ./mon_data/mon_L3_00/mbm_local_config
>> ./g2/mon_data/mon_L3_00/mbm_local_config
>> ./mon_groups/m2/mon_data/mon_L3_00/mbm_local_config
>> ./mon_groups/m1/mon_data/mon_L3_00/mbm_local_config
>>
>> Could you please correct me where I am wrong?
> 
> For example, we have CPUs 0-7 in domain 0. We have two counters which are
> configurable.
> 
> Lets consider same example as your mentioned about.
> 
> g2 is a control group.
> 
> m1 and m2 are monitor group.
> 
> We can have control group g2 with CPUs 0-7 to limit the L3 bandwidth (or
> memory bandwidth with required schemata setting).
> 
> We can have mon group m1 with cpus 0-3 to monitor mbm_local_bytes.
> 
> We can have mon group m2 with cpus  4-7 to monitor mbm_total_bytes.
> 
> Each group is independently, monitoring two separate thing. Without having

Right, because monitoring, the actual counting of the events, is per monitor
group. When a monitor group is created a new RMID is created and when the
counter is read it is per-RMID. 

The event configuration is independent from the RMID using the counter.

> sysfs file (mbm_local_config and mbm_total_config) in each monitor group,
> we wont be able to configure the above configuration.

I do not understand this reasoning. From what I understand the
event configuration is independent from the monitoring group. Thus, changing
an event configuration for one monitoring group would impact all
monitoring groups using that event counter. This implementation associates
an event configuration with each monitoring group and by doing so it
implies that it is unique to the monitoring group, but that is not
how it works.

Reinette

Re: [PATCH v4 09/13] x86/resctrl: Add sysfs interface files to read/write event configuration
Posted by Moger, Babu 3 years, 6 months ago
Hi Reinette,

On 9/19/22 11:42, Reinette Chatre wrote:
> Hi Babu,
>
> On 9/19/2022 8:46 AM, Moger, Babu wrote:
>> Hi Reinette,
>>
>> On 9/16/22 10:58, Reinette Chatre wrote:
>>> Hi Babu,
>>>
>>> On 9/7/2022 11:01 AM, Babu Moger wrote:
>>>> Add two new sysfs files to read/write the event configuration if
>>>> the feature Bandwidth Monitoring Event Configuration (BMEC) is
>>>> supported. The file mbm_local_config is for the configuration
>>>> of the event mbm_local_bytes and the file mbm_total_config is
>>>> for the configuration of mbm_total_bytes.
>>>>
>>>> $ls /sys/fs/resctrl/mon_data/mon_L3_00/mbm_local*
>>>> /sys/fs/resctrl/mon_data/mon_L3_00/mbm_local_bytes
>>>> /sys/fs/resctrl/mon_data/mon_L3_00/mbm_local_config
>>>>
>>>> $ls /sys/fs/resctrl/mon_data/mon_L3_00/mbm_total*
>>>> /sys/fs/resctrl/mon_data/mon_L3_00/mbm_total_bytes
>>>> /sys/fs/resctrl/mon_data/mon_L3_00/mbm_total_config
>>>>
>>> This patch makes the mbm*config files per monitor group. Looking
>>> ahead at later patches how the configuration is set it is not clear
>>> to me that this is the right place for these configuration files.
>>>
>>> Looking ahead to patch 10 there is neither rmid nor closid within
>>> the (MSR_IA32_EVT_CFG_BASE + index) register - it only takes
>>> the bits indicating what access types needs to be counted. Also
>>> in patch 10 I understand that the scope of this register is per L3 cache
>>> domain.
>> Yes. Scope of  MSR_IA32_EVT_CFG_BASE per L3 domain.
>>> Considering this, why is the sysfs file associated with each
>>> monitor group?
>> Please see the response below.
>>> For example, consider the following scenario:
>>> # cd /sys/fs/resctrl
>>> # mkdir g2
>>> # mkdir mon_groups/m1
>>> # mkdir mon_groups/m2
>>> # find . | grep mbm_local_config
>>> ./mon_data/mon_L3_00/mbm_local_config
>>> ./mon_data/mon_L3_01/mbm_local_config
>>> ./g2/mon_data/mon_L3_00/mbm_local_config
>>> ./g2/mon_data/mon_L3_01/mbm_local_config
>>> ./mon_groups/m2/mon_data/mon_L3_00/mbm_local_config
>>> ./mon_groups/m2/mon_data/mon_L3_01/mbm_local_config
>>> ./mon_groups/m1/mon_data/mon_L3_00/mbm_local_config
>>> ./mon_groups/m1/mon_data/mon_L3_01/mbm_local_config
>>>
>>>
>>> From what I understand, the following sysfs files are
>>> associated with cache domain #0 and thus writing to any of these
>>> files would change the same configuration:
>>> ./mon_data/mon_L3_00/mbm_local_config
>>> ./g2/mon_data/mon_L3_00/mbm_local_config
>>> ./mon_groups/m2/mon_data/mon_L3_00/mbm_local_config
>>> ./mon_groups/m1/mon_data/mon_L3_00/mbm_local_config
>>>
>>> Could you please correct me where I am wrong?
>> For example, we have CPUs 0-7 in domain 0. We have two counters which are
>> configurable.
>>
>> Lets consider same example as your mentioned about.
>>
>> g2 is a control group.
>>
>> m1 and m2 are monitor group.
>>
>> We can have control group g2 with CPUs 0-7 to limit the L3 bandwidth (or
>> memory bandwidth with required schemata setting).
>>
>> We can have mon group m1 with cpus 0-3 to monitor mbm_local_bytes.
>>
>> We can have mon group m2 with cpus  4-7 to monitor mbm_total_bytes.
>>
>> Each group is independently, monitoring two separate thing. Without having
> Right, because monitoring, the actual counting of the events, is per monitor
> group. When a monitor group is created a new RMID is created and when the
> counter is read it is per-RMID. 
>
> The event configuration is independent from the RMID using the counter.
>
>> sysfs file (mbm_local_config and mbm_total_config) in each monitor group,
>> we wont be able to configure the above configuration.
> I do not understand this reasoning. From what I understand the
> event configuration is independent from the monitoring group. Thus, changing
> an event configuration for one monitoring group would impact all
> monitoring groups using that event counter. This implementation associates
> an event configuration with each monitoring group and by doing so it
> implies that it is unique to the monitoring group, but that is not
> how it works.

The event configuration is designed per L3 domain. The mon_data is also
per domain (like mon_L3_00, mon_L3_01, etc.). So I added the event
configuration file inside each domain. We have all the information inside
the domain, so I thought that was the right place. I am open to suggestions.

Thanks

Babu


Re: [PATCH v4 09/13] x86/resctrl: Add sysfs interface files to read/write event configuration
Posted by Reinette Chatre 3 years, 6 months ago
Hi Babu,

On 9/19/2022 1:26 PM, Moger, Babu wrote:
> On 9/19/22 11:42, Reinette Chatre wrote:
>> On 9/19/2022 8:46 AM, Moger, Babu wrote:
>>> On 9/16/22 10:58, Reinette Chatre wrote:
>>>> On 9/7/2022 11:01 AM, Babu Moger wrote:
>>>>> Add two new sysfs files to read/write the event configuration if
>>>>> the feature Bandwidth Monitoring Event Configuration (BMEC) is
>>>>> supported. The file mbm_local_config is for the configuration
>>>>> of the event mbm_local_bytes and the file mbm_total_config is
>>>>> for the configuration of mbm_total_bytes.
>>>>>
>>>>> $ls /sys/fs/resctrl/mon_data/mon_L3_00/mbm_local*
>>>>> /sys/fs/resctrl/mon_data/mon_L3_00/mbm_local_bytes
>>>>> /sys/fs/resctrl/mon_data/mon_L3_00/mbm_local_config
>>>>>
>>>>> $ls /sys/fs/resctrl/mon_data/mon_L3_00/mbm_total*
>>>>> /sys/fs/resctrl/mon_data/mon_L3_00/mbm_total_bytes
>>>>> /sys/fs/resctrl/mon_data/mon_L3_00/mbm_total_config
>>>>>
>>>> This patch makes the mbm*config files per monitor group. Looking
>>>> ahead at later patches how the configuration is set it is not clear
>>>> to me that this is the right place for these configuration files.
>>>>
>>>> Looking ahead to patch 10 there is neither rmid nor closid within
>>>> the (MSR_IA32_EVT_CFG_BASE + index) register - it only takes
>>>> the bits indicating what access types needs to be counted. Also
>>>> in patch 10 I understand that the scope of this register is per L3 cache
>>>> domain.
>>> Yes. Scope of  MSR_IA32_EVT_CFG_BASE per L3 domain.
>>>> Considering this, why is the sysfs file associated with each
>>>> monitor group?
>>> Please see the response below.
>>>> For example, consider the following scenario:
>>>> # cd /sys/fs/resctrl
>>>> # mkdir g2
>>>> # mkdir mon_groups/m1
>>>> # mkdir mon_groups/m2
>>>> # find . | grep mbm_local_config
>>>> ./mon_data/mon_L3_00/mbm_local_config
>>>> ./mon_data/mon_L3_01/mbm_local_config
>>>> ./g2/mon_data/mon_L3_00/mbm_local_config
>>>> ./g2/mon_data/mon_L3_01/mbm_local_config
>>>> ./mon_groups/m2/mon_data/mon_L3_00/mbm_local_config
>>>> ./mon_groups/m2/mon_data/mon_L3_01/mbm_local_config
>>>> ./mon_groups/m1/mon_data/mon_L3_00/mbm_local_config
>>>> ./mon_groups/m1/mon_data/mon_L3_01/mbm_local_config
>>>>
>>>>
>>>> From what I understand, the following sysfs files are
>>>> associated with cache domain #0 and thus writing to any of these
>>>> files would change the same configuration:
>>>> ./mon_data/mon_L3_00/mbm_local_config
>>>> ./g2/mon_data/mon_L3_00/mbm_local_config
>>>> ./mon_groups/m2/mon_data/mon_L3_00/mbm_local_config
>>>> ./mon_groups/m1/mon_data/mon_L3_00/mbm_local_config
>>>>
>>>> Could you please correct me where I am wrong?
>>> For example, we have CPUs 0-7 in domain 0. We have two counters which are
>>> configurable.
>>>
>>> Lets consider same example as your mentioned about.
>>>
>>> g2 is a control group.
>>>
>>> m1 and m2 are monitor group.
>>>
>>> We can have control group g2 with CPUs 0-7 to limit the L3 bandwidth (or
>>> memory bandwidth with required schemata setting).
>>>
>>> We can have mon group m1 with cpus 0-3 to monitor mbm_local_bytes.
>>>
>>> We can have mon group m2 with cpus  4-7 to monitor mbm_total_bytes.
>>>
>>> Each group is independently, monitoring two separate thing. Without having
>> Right, because monitoring, the actual counting of the events, is per monitor
>> group. When a monitor group is created a new RMID is created and when the
>> counter is read it is per-RMID. 
>>
>> The event configuration is independent from the RMID using the counter.
>>
>>> sysfs file (mbm_local_config and mbm_total_config) in each monitor group,
>>> we wont be able to configure the above configuration.
>> I do not understand this reasoning. From what I understand the
>> event configuration is independent from the monitoring group. Thus, changing
>> an event configuration for one monitoring group would impact all
>> monitoring groups using that event counter. This implementation associates
>> an event configuration with each monitoring group and by doing so it
>> implies that it is unique to the monitoring group, but that is not
>> how it works.
> 
> The event configuration is designed per L3 domain. The mon_data is also
> per domain (like mon_L3_00.. mon_L3_01 etc). So, added the event
> configuration file inside each domain. We have all the information inside
> the domain. Thought, that is right place. I am open for suggestions.

It is not clear to me if you are also seeing all the duplication that
accompanies this implementation. As you can see in the example I provided in
https://lore.kernel.org/lkml/13294a8f-e76f-a6a9-284c-67adbc80ec7c@intel.com/,
if I understand the implementation correctly, there will be several
configuration files scattered through resctrl that all configure the same
value. I asked you to correct me where I am wrong but you did not correct me.
Instead you keep repeating that placing the files in the duplicate locations
is convenient. I can see how this is convenient for you but please do consider
that having these duplicate configuration files scattered through resctrl
makes for a very confusing user interface and unexpected behavior. Users
would expect that a configuration associated with a monitor group impacts
that monitor group only - not all monitor groups associated with that
domain.

User API is hard so this does need careful thought. Perhaps the architects
can chime in here.

One option could be:
# cd /sys/fs/resctrl/info/L3_MON
# cat mbm_total_config
0=7f;1=7f
# cat mbm_local_config
0=15;1=15

It would be clear when changing mbm_total_config or mbm_local_config that
it would impact all monitoring groups within all resource groups. What do
you think?
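
To make the proposed file format concrete, here is a hedged sketch of a
userspace parser for a line such as "0=7f;1=7f". It assumes that domain IDs
are decimal and values are hexadecimal, which the example suggests but the
mail does not state. The function name, MAX_DOMAINS, and the error
convention are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_DOMAINS 16

/*
 * Parse "id=val;id=val" into cfg[id] = val.
 * Assumption: ids are decimal, values are hexadecimal.
 * Returns the number of entries parsed, or -1 on malformed input.
 */
static int parse_domain_configs(const char *s, unsigned long cfg[MAX_DOMAINS])
{
	int count = 0;

	assert(s != NULL);

	while (*s != '\0' && *s != '\n') {
		char *end;
		unsigned long id, val;

		/* Domain id, which must be followed by '='. */
		id = strtoul(s, &end, 10);
		if (end == s || *end != '=' || id >= MAX_DOMAINS)
			return -1;
		s = end + 1;

		/* Configuration value for that domain. */
		val = strtoul(s, &end, 16);
		if (end == s)
			return -1;
		cfg[id] = val;
		count++;

		/* Entries are separated by ';'; anything else must end the line. */
		if (*end == ';')
			end++;
		else if (*end != '\0' && *end != '\n')
			return -1;
		s = end;
	}

	return count;
}
```

A single file in this form names every domain explicitly, so a reader can
see from the content alone that the setting is per domain and not per group.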

Reinette
RE: [PATCH v4 09/13] x86/resctrl: Add sysfs interface files to read/write event configuration
Posted by Moger, Babu 3 years, 6 months ago
Hi Reinette,

> -----Original Message-----
> From: Reinette Chatre <reinette.chatre@intel.com>
> Sent: Monday, September 19, 2022 4:07 PM
> To: Moger, Babu <Babu.Moger@amd.com>; corbet@lwn.net;
> tglx@linutronix.de; mingo@redhat.com; bp@alien8.de
> Cc: fenghua.yu@intel.com; dave.hansen@linux.intel.com; x86@kernel.org;
> hpa@zytor.com; paulmck@kernel.org; akpm@linux-foundation.org;
> quic_neeraju@quicinc.com; rdunlap@infradead.org;
> damien.lemoal@opensource.wdc.com; songmuchun@bytedance.com;
> peterz@infradead.org; jpoimboe@kernel.org; pbonzini@redhat.com;
> chang.seok.bae@intel.com; pawan.kumar.gupta@linux.intel.com;
> jmattson@google.com; daniel.sneddon@linux.intel.com; Das1, Sandipan
> <Sandipan.Das@amd.com>; tony.luck@intel.com; james.morse@arm.com;
> linux-doc@vger.kernel.org; linux-kernel@vger.kernel.org;
> bagasdotme@gmail.com; eranian@google.com
> Subject: Re: [PATCH v4 09/13] x86/resctrl: Add sysfs interface files to
> read/write event configuration
> 
> Hi Babu,
> 
> On 9/19/2022 1:26 PM, Moger, Babu wrote:
> > On 9/19/22 11:42, Reinette Chatre wrote:
> >> On 9/19/2022 8:46 AM, Moger, Babu wrote:
> >>> On 9/16/22 10:58, Reinette Chatre wrote:
> >>>> On 9/7/2022 11:01 AM, Babu Moger wrote:
> >>>>> Add two new sysfs files to read/write the event configuration if
> >>>>> the feature Bandwidth Monitoring Event Configuration (BMEC) is
> >>>>> supported. The file mbm_local_config is for the configuration of
> >>>>> the event mbm_local_bytes and the file mbm_total_config is for the
> >>>>> configuration of mbm_total_bytes.
> >>>>>
> >>>>> $ls /sys/fs/resctrl/mon_data/mon_L3_00/mbm_local*
> >>>>> /sys/fs/resctrl/mon_data/mon_L3_00/mbm_local_bytes
> >>>>> /sys/fs/resctrl/mon_data/mon_L3_00/mbm_local_config
> >>>>>
> >>>>> $ls /sys/fs/resctrl/mon_data/mon_L3_00/mbm_total*
> >>>>> /sys/fs/resctrl/mon_data/mon_L3_00/mbm_total_bytes
> >>>>> /sys/fs/resctrl/mon_data/mon_L3_00/mbm_total_config
> >>>>>
> >>>> This patch makes the mbm*config files per monitor group. Looking
> >>>> ahead at later patches how the configuration is set it is not clear
> >>>> to me that this is the right place for these configuration files.
> >>>>
> >>>> Looking ahead to patch 10 there is neither rmid nor closid within
> >>>> the (MSR_IA32_EVT_CFG_BASE + index) register - it only takes the
> >>>> bits indicating what access types needs to be counted. Also in
> >>>> patch 10 I understand that the scope of this register is per L3
> >>>> cache domain.
> >>> Yes. Scope of  MSR_IA32_EVT_CFG_BASE per L3 domain.
> >>>> Considering this, why is the sysfs file associated with each
> >>>> monitor group?
> >>> Please see the response below.
> >>>> For example, consider the following scenario:
> >>>> # cd /sys/fs/resctrl
> >>>> # mkdir g2
> >>>> # mkdir mon_groups/m1
> >>>> # mkdir mon_groups/m2
> >>>> # find . | grep mbm_local_config
> >>>> ./mon_data/mon_L3_00/mbm_local_config
> >>>> ./mon_data/mon_L3_01/mbm_local_config
> >>>> ./g2/mon_data/mon_L3_00/mbm_local_config
> >>>> ./g2/mon_data/mon_L3_01/mbm_local_config
> >>>> ./mon_groups/m2/mon_data/mon_L3_00/mbm_local_config
> >>>> ./mon_groups/m2/mon_data/mon_L3_01/mbm_local_config
> >>>> ./mon_groups/m1/mon_data/mon_L3_00/mbm_local_config
> >>>> ./mon_groups/m1/mon_data/mon_L3_01/mbm_local_config
> >>>>
> >>>>
> >>>> From what I understand, the following sysfs files are associated
> >>>> with cache domain #0 and thus writing to any of these files would
> >>>> change the same configuration:
> >>>> ./mon_data/mon_L3_00/mbm_local_config
> >>>> ./g2/mon_data/mon_L3_00/mbm_local_config
> >>>> ./mon_groups/m2/mon_data/mon_L3_00/mbm_local_config
> >>>> ./mon_groups/m1/mon_data/mon_L3_00/mbm_local_config
> >>>>
> >>>> Could you please correct me where I am wrong?
> >>> For example, we have CPUs 0-7 in domain 0. We have two counters
> >>> which are configurable.
> >>>
> >>> Let's consider the same example you mentioned above.
> >>>
> >>> g2 is a control group.
> >>>
> >>> m1 and m2 are monitor groups.
> >>>
> >>> We can have control group g2 with CPUs 0-7 to limit the L3 bandwidth
> >>> (or memory bandwidth with required schemata setting).
> >>>
> >>> We can have mon group m1 with cpus 0-3 to monitor mbm_local_bytes.
> >>>
> >>> We can have mon group m2 with cpus 4-7 to monitor mbm_total_bytes.
> >>>
> >>> Each group independently monitors a separate thing. Without
> >>> having
> >> Right, because monitoring, the actual counting of the events, is per
> >> monitor group. When a monitor group is created a new RMID is created
> >> and when the counter is read it is per-RMID.
> >>
> >> The event configuration is independent from the RMID using the counter.
> >>
> >>> sysfs files (mbm_local_config and mbm_total_config) in each monitor
> >>> group, we won't be able to set up the above configuration.
> >> I do not understand this reasoning. From what I understand the event
> >> configuration is independent from the monitoring group. Thus,
> >> changing an event configuration for one monitoring group would impact
> >> all monitoring groups using that event counter. This implementation
> >> associates an event configuration with each monitoring group and by
> >> doing so it implies that it is unique to the monitoring group, but
> >> that is not how it works.
> >
> > The event configuration is designed per L3 domain. The mon_data is
> > also per domain (like mon_L3_00, mon_L3_01, etc.). So I added the event
> > configuration file inside each domain. We have all the information
> > inside the domain, so I thought that was the right place. I am open to
> > suggestions.
> 
> It is not clear to me if you are also seeing all the duplication that accompanies
> this implementation. As you can see in the example I provided in
> https://lore.kernel.org/lkml/13294a8f-e76f-a6a9-284c-67adbc80ec7c@intel.com/,
> if I understand the implementation correctly, there will be several
> configuration files scattered through resctrl that all configure the same value. I
> asked you to correct me where I am wrong but you did not correct me.
> Instead you keep repeating that placing the files in the duplicate locations is
> convenient. I can see how this is convenient for you but please do consider that
> having these duplicate configuration files scattered through resctrl makes for a
> very confusing user interface and unexpected behavior. Users would expect
> that a configuration associated with a monitor group impacts that monitor
> group only - not all monitor groups associated with that domain.
> 
> User API is hard so this does need careful thought. Perhaps the architects can
> chime in here.
> 
> One option could be:
> # cd /sys/fs/resctrl/info/L3_MON
> # cat mbm_total_config
> 0=7f;1=7f
> # cat mbm_local_config
> 0=15;1=15

I think this should work.
# cat mbm_total_config
0=7f;1=7f
I would think 0 and 1 are domain ids here.

We also have to provide an interface to write.
# echo "0=0x70" > mbm_total_config  (update mbm_total_config for domain 0)
# echo "1=0x10" > mbm_local_config  (update mbm_local_config for domain 1)

We will have to parse the string and update the specific domains.
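
The parsing could look roughly like the userspace sketch below. This is
only an illustration of the "id=value;id=value" format discussed above,
not the code from the patch: parse_evt_config(), evt_cfg[], MAX_DOMAINS
and the 0x7f upper bound are all hypothetical names and limits chosen for
the example. In the kernel the final step would update the per-domain
configuration instead of a plain array.

```c
#include <errno.h>
#include <stdlib.h>

#define MAX_DOMAINS 16        /* hypothetical limit for this sketch */
#define MAX_EVT_CFG 0x7fUL    /* assumed upper bound, taken from "7f" above */

/* Hypothetical per-domain store; stands in for the real per-domain update. */
static unsigned long evt_cfg[MAX_DOMAINS];

/*
 * Parse "id=val[;id=val...]" where id is a decimal domain id and val is a
 * hex value with an optional 0x prefix. A single trailing newline (as sent
 * by echo) is accepted. Returns 0 on success or -EINVAL on malformed input.
 * Nothing is written to evt_cfg[] unless the whole string is valid.
 */
static int parse_evt_config(const char *buf)
{
	unsigned long pending[MAX_DOMAINS];
	int seen[MAX_DOMAINS] = { 0 };
	const char *p = buf;
	char *end;
	int i;

	while (*p) {
		unsigned long id, val;

		errno = 0;
		id = strtoul(p, &end, 10);
		if (end == p || *end != '=' || errno || id >= MAX_DOMAINS)
			return -EINVAL;

		p = end + 1;
		val = strtoul(p, &end, 16);
		if (end == p || errno || val > MAX_EVT_CFG)
			return -EINVAL;

		pending[id] = val;
		seen[id] = 1;

		if (*end == ';')
			p = end + 1;	/* more domains follow */
		else if (*end == '\n' || *end == '\0')
			break;		/* end of input */
		else
			return -EINVAL;
	}

	/* Commit only after the whole string has been validated. */
	for (i = 0; i < MAX_DOMAINS; i++)
		if (seen[i])
			evt_cfg[i] = pending[i];

	return 0;
}
```

Validating everything first and committing afterwards means a malformed
write such as "1=0x10;bad" leaves every domain unchanged, rather than
updating domain 1 and then failing.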

> 
> It would be clear when changing mbm_total_config or mbm_local_config that
> it would impact all monitoring groups within all resource groups. What do you
> think?

Yes. Thank you. It should work. As long as we have a way to modify (and
read) the specific L3 domains, it should be fine. Let me start on that.
I will reply if I see any major issues.
Thanks
Babu