[PATCH v9 20/26] x86/resctrl: Auto assign/unassign counters when mbm_cntr_assign is enabled

Posted by Babu Moger 3 weeks, 5 days ago
Assign/unassign counters on resctrl group creation/deletion. Two counters
are required per group, one for MBM total event and one for MBM local
event.

There are a limited number of counters available for assignment. If these
counters are exhausted, the kernel will display the error message: "Out of
MBM assignable counters". However, it is not necessary to fail the
creation of a group due to assignment failures. Users have the flexibility
to modify the assignments at a later time.

Signed-off-by: Babu Moger <babu.moger@amd.com>
---
v9: Changed rdtgroup_assign_cntrs() and rdtgroup_unassign_cntrs() to return void.
    Updated a couple of rdtgroup_unassign_cntrs() calls properly.
    Updated function comments.

v8: Renamed rdtgroup_assign_grp to rdtgroup_assign_cntrs.
    Renamed rdtgroup_unassign_grp to rdtgroup_unassign_cntrs.
    Fixed the problem with unassigning the child MON groups of CTRL_MON group.

v7: Reworded the commit message.
    Removed the reference of ABMC with mbm_cntr_assign.
    Renamed the function rdtgroup_assign_cntrs to rdtgroup_assign_grp.

v6: Removed the redundant comments on all the calls of
    rdtgroup_assign_cntrs. Updated the commit message.
    Dropped printing error message on every call of rdtgroup_assign_cntrs.

v5: Removed the code to enable/disable ABMC during the mount.
    That will be another patch.
    Added arch callers to get the arch specific data.
    Renamed functions to match the other ABMC functions.
    Added code comments for assignment failures.

v4: Few name changes based on the upstream discussion.
    Commit message update.

v3: This is a new patch. Patch addresses the upstream comment to enable
    ABMC feature by default if the feature is available.
---
 arch/x86/kernel/cpu/resctrl/rdtgroup.c | 61 +++++++++++++++++++++++++-
 1 file changed, 60 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index b0cce3dfd062..a8d21b0b2054 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -2932,6 +2932,46 @@ static void schemata_list_destroy(void)
 	}
 }
 
+/*
+ * Called when a new group is created. If "mbm_cntr_assign" mode is enabled,
+ * counters are automatically assigned. Each group can accommodate two counters:
+ * one for the total event and one for the local event. Assignments may fail
+ * due to the limited number of counters. However, it is not necessary to fail
+ * the group creation and thus no failure is returned. Users have the option
+ * to modify the counter assignments after the group has been created.
+ */
+static void rdtgroup_assign_cntrs(struct rdtgroup *rdtgrp)
+{
+	struct rdt_resource *r = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
+
+	if (!resctrl_arch_mbm_cntr_assign_enabled(r))
+		return;
+
+	if (is_mbm_total_enabled())
+		rdtgroup_assign_cntr_event(r, rdtgrp, NULL, QOS_L3_MBM_TOTAL_EVENT_ID);
+
+	if (is_mbm_local_enabled())
+		rdtgroup_assign_cntr_event(r, rdtgrp, NULL, QOS_L3_MBM_LOCAL_EVENT_ID);
+}
+
+/*
+ * Called when a group is deleted. Counters are unassigned if they are in
+ * the assigned state.
+ */
+static void rdtgroup_unassign_cntrs(struct rdtgroup *rdtgrp)
+{
+	struct rdt_resource *r = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
+
+	if (!resctrl_arch_mbm_cntr_assign_enabled(r))
+		return;
+
+	if (is_mbm_total_enabled())
+		rdtgroup_unassign_cntr_event(r, rdtgrp, NULL, QOS_L3_MBM_TOTAL_EVENT_ID);
+
+	if (is_mbm_local_enabled())
+		rdtgroup_unassign_cntr_event(r, rdtgrp, NULL, QOS_L3_MBM_LOCAL_EVENT_ID);
+}
+
 static int rdt_get_tree(struct fs_context *fc)
 {
 	struct rdt_fs_context *ctx = rdt_fc2context(fc);
@@ -2991,6 +3031,8 @@ static int rdt_get_tree(struct fs_context *fc)
 		if (ret < 0)
 			goto out_mongrp;
 		rdtgroup_default.mon.mon_data_kn = kn_mondata;
+
+		rdtgroup_assign_cntrs(&rdtgroup_default);
 	}
 
 	ret = rdt_pseudo_lock_init();
@@ -3021,8 +3063,10 @@ static int rdt_get_tree(struct fs_context *fc)
 out_psl:
 	rdt_pseudo_lock_release();
 out_mondata:
-	if (resctrl_arch_mon_capable())
+	if (resctrl_arch_mon_capable()) {
+		rdtgroup_unassign_cntrs(&rdtgroup_default);
 		kernfs_remove(kn_mondata);
+	}
 out_mongrp:
 	if (resctrl_arch_mon_capable())
 		kernfs_remove(kn_mongrp);
@@ -3201,6 +3245,7 @@ static void free_all_child_rdtgrp(struct rdtgroup *rdtgrp)
 
 	head = &rdtgrp->mon.crdtgrp_list;
 	list_for_each_entry_safe(sentry, stmp, head, mon.crdtgrp_list) {
+		rdtgroup_unassign_cntrs(sentry);
 		free_rmid(sentry->closid, sentry->mon.rmid);
 		list_del(&sentry->mon.crdtgrp_list);
 
@@ -3241,6 +3286,8 @@ static void rmdir_all_sub(void)
 		cpumask_or(&rdtgroup_default.cpu_mask,
 			   &rdtgroup_default.cpu_mask, &rdtgrp->cpu_mask);
 
+		rdtgroup_unassign_cntrs(rdtgrp);
+
 		free_rmid(rdtgrp->closid, rdtgrp->mon.rmid);
 
 		kernfs_remove(rdtgrp->kn);
@@ -3272,6 +3319,7 @@ static void rdt_kill_sb(struct super_block *sb)
 	for_each_alloc_capable_rdt_resource(r)
 		reset_all_ctrls(r);
 	rmdir_all_sub();
+	rdtgroup_unassign_cntrs(&rdtgroup_default);
 	rdt_pseudo_lock_release();
 	rdtgroup_default.mode = RDT_MODE_SHAREABLE;
 	schemata_list_destroy();
@@ -3280,6 +3328,7 @@ static void rdt_kill_sb(struct super_block *sb)
 		resctrl_arch_disable_alloc();
 	if (resctrl_arch_mon_capable())
 		resctrl_arch_disable_mon();
+
 	resctrl_mounted = false;
 	kernfs_kill_sb(sb);
 	mutex_unlock(&rdtgroup_mutex);
@@ -3871,6 +3920,8 @@ static int rdtgroup_mkdir_mon(struct kernfs_node *parent_kn,
 		goto out_unlock;
 	}
 
+	rdtgroup_assign_cntrs(rdtgrp);
+
 	kernfs_activate(rdtgrp->kn);
 
 	/*
@@ -3915,6 +3966,8 @@ static int rdtgroup_mkdir_ctrl_mon(struct kernfs_node *parent_kn,
 	if (ret)
 		goto out_closid_free;
 
+	rdtgroup_assign_cntrs(rdtgrp);
+
 	kernfs_activate(rdtgrp->kn);
 
 	ret = rdtgroup_init_alloc(rdtgrp);
@@ -3940,6 +3993,7 @@ static int rdtgroup_mkdir_ctrl_mon(struct kernfs_node *parent_kn,
 out_del_list:
 	list_del(&rdtgrp->rdtgroup_list);
 out_rmid_free:
+	rdtgroup_unassign_cntrs(rdtgrp);
 	mkdir_rdt_prepare_rmid_free(rdtgrp);
 out_closid_free:
 	closid_free(closid);
@@ -4010,6 +4064,9 @@ static int rdtgroup_rmdir_mon(struct rdtgroup *rdtgrp, cpumask_var_t tmpmask)
 	update_closid_rmid(tmpmask, NULL);
 
 	rdtgrp->flags = RDT_DELETED;
+
+	rdtgroup_unassign_cntrs(rdtgrp);
+
 	free_rmid(rdtgrp->closid, rdtgrp->mon.rmid);
 
 	/*
@@ -4056,6 +4113,8 @@ static int rdtgroup_rmdir_ctrl(struct rdtgroup *rdtgrp, cpumask_var_t tmpmask)
 	cpumask_or(tmpmask, tmpmask, &rdtgrp->cpu_mask);
 	update_closid_rmid(tmpmask, NULL);
 
+	rdtgroup_unassign_cntrs(rdtgrp);
+
 	free_rmid(rdtgrp->closid, rdtgrp->mon.rmid);
 	closid_free(rdtgrp->closid);
 
-- 
2.34.1
Re: [PATCH v9 20/26] x86/resctrl: Auto assign/unassign counters when mbm_cntr_assign is enabled
Posted by Reinette Chatre 6 days, 11 hours ago
Hi Babu,

On 10/29/24 4:21 PM, Babu Moger wrote:
> Assign/unassign counters on resctrl group creation/deletion. Two counters
> are required per group, one for MBM total event and one for MBM local
> event.
> 
> There are a limited number of counters available for assignment. If these
> counters are exhausted, the kernel will display the error message: "Out of
> MBM assignable counters". However, it is not necessary to fail the
> creation of a group due to assignment failures. Users have the flexibility
> to modify the assignments at a later time.
> 
> Signed-off-by: Babu Moger <babu.moger@amd.com>
> ---
> v9: Changed rdtgroup_assign_cntrs() and rdtgroup_unassign_cntrs() to return void.
>     Updated a couple of rdtgroup_unassign_cntrs() calls properly.
>     Updated function comments.
> 
> v8: Renamed rdtgroup_assign_grp to rdtgroup_assign_cntrs.
>     Renamed rdtgroup_unassign_grp to rdtgroup_unassign_cntrs.
>     Fixed the problem with unassigning the child MON groups of CTRL_MON group.
> 
> v7: Reworded the commit message.
>     Removed the reference of ABMC with mbm_cntr_assign.
>     Renamed the function rdtgroup_assign_cntrs to rdtgroup_assign_grp.
> 
> v6: Removed the redundant comments on all the calls of
>     rdtgroup_assign_cntrs. Updated the commit message.
>     Dropped printing error message on every call of rdtgroup_assign_cntrs.
> 
> v5: Removed the code to enable/disable ABMC during the mount.
>     That will be another patch.
>     Added arch callers to get the arch specific data.
>     Renamed functions to match the other ABMC functions.
>     Added code comments for assignment failures.
> 
> v4: Few name changes based on the upstream discussion.
>     Commit message update.
> 
> v3: This is a new patch. Patch addresses the upstream comment to enable
>     ABMC feature by default if the feature is available.
> ---
>  arch/x86/kernel/cpu/resctrl/rdtgroup.c | 61 +++++++++++++++++++++++++-
>  1 file changed, 60 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> index b0cce3dfd062..a8d21b0b2054 100644
> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> @@ -2932,6 +2932,46 @@ static void schemata_list_destroy(void)
>  	}
>  }
>  
> +/*
> + * Called when a new group is created. If "mbm_cntr_assign" mode is enabled,
> + * counters are automatically assigned. Each group can accommodate two counters:
> + * one for the total event and one for the local event. Assignments may fail
> + * due to the limited number of counters. However, it is not necessary to fail
> + * the group creation and thus no failure is returned. Users have the option
> + * to modify the counter assignments after the group has been created.
> + */
> +static void rdtgroup_assign_cntrs(struct rdtgroup *rdtgrp)
> +{
> +	struct rdt_resource *r = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
> +
> +	if (!resctrl_arch_mbm_cntr_assign_enabled(r))
> +		return;
> +
> +	if (is_mbm_total_enabled())
> +		rdtgroup_assign_cntr_event(r, rdtgrp, NULL, QOS_L3_MBM_TOTAL_EVENT_ID);
> +
> +	if (is_mbm_local_enabled())
> +		rdtgroup_assign_cntr_event(r, rdtgrp, NULL, QOS_L3_MBM_LOCAL_EVENT_ID);
> +}
> +
> +/*
> + * Called when a group is deleted. Counters are unassigned if they are in
> + * the assigned state.
> + */
> +static void rdtgroup_unassign_cntrs(struct rdtgroup *rdtgrp)
> +{
> +	struct rdt_resource *r = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
> +
> +	if (!resctrl_arch_mbm_cntr_assign_enabled(r))
> +		return;
> +
> +	if (is_mbm_total_enabled())
> +		rdtgroup_unassign_cntr_event(r, rdtgrp, NULL, QOS_L3_MBM_TOTAL_EVENT_ID);
> +
> +	if (is_mbm_local_enabled())
> +		rdtgroup_unassign_cntr_event(r, rdtgrp, NULL, QOS_L3_MBM_LOCAL_EVENT_ID);
> +}
> +
>  static int rdt_get_tree(struct fs_context *fc)
>  {
>  	struct rdt_fs_context *ctx = rdt_fc2context(fc);
> @@ -2991,6 +3031,8 @@ static int rdt_get_tree(struct fs_context *fc)
>  		if (ret < 0)
>  			goto out_mongrp;
>  		rdtgroup_default.mon.mon_data_kn = kn_mondata;
> +
> +		rdtgroup_assign_cntrs(&rdtgroup_default);

I think counters should be assigned *before* the files exposing them
are added to resctrl.
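
Something like this, for illustration (an untested sketch against the
hunk above; if mkdir_mondata_all() then fails, the out_mongrp error path
would also need the unassign):

	rdtgroup_assign_cntrs(&rdtgroup_default);

	ret = mkdir_mondata_all(rdtgroup_default.kn, &rdtgroup_default,
				&kn_mondata);
	if (ret < 0)
		goto out_mongrp;
	rdtgroup_default.mon.mon_data_kn = kn_mondata;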

>  	}
>  
>  	ret = rdt_pseudo_lock_init();
> @@ -3021,8 +3063,10 @@ static int rdt_get_tree(struct fs_context *fc)
>  out_psl:
>  	rdt_pseudo_lock_release();
>  out_mondata:
> -	if (resctrl_arch_mon_capable())
> +	if (resctrl_arch_mon_capable()) {
> +		rdtgroup_unassign_cntrs(&rdtgroup_default);
>  		kernfs_remove(kn_mondata);

... and here remove the files before taking away the data exposed by them.
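
That is, roughly (sketch):

out_mondata:
	if (resctrl_arch_mon_capable()) {
		kernfs_remove(kn_mondata);
		rdtgroup_unassign_cntrs(&rdtgroup_default);
	}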

> +	}
>  out_mongrp:
>  	if (resctrl_arch_mon_capable())
>  		kernfs_remove(kn_mongrp);
> @@ -3201,6 +3245,7 @@ static void free_all_child_rdtgrp(struct rdtgroup *rdtgrp)
>  
>  	head = &rdtgrp->mon.crdtgrp_list;
>  	list_for_each_entry_safe(sentry, stmp, head, mon.crdtgrp_list) {
> +		rdtgroup_unassign_cntrs(sentry);
>  		free_rmid(sentry->closid, sentry->mon.rmid);
>  		list_del(&sentry->mon.crdtgrp_list);
>  
> @@ -3241,6 +3286,8 @@ static void rmdir_all_sub(void)
>  		cpumask_or(&rdtgroup_default.cpu_mask,
>  			   &rdtgroup_default.cpu_mask, &rdtgrp->cpu_mask);
>  
> +		rdtgroup_unassign_cntrs(rdtgrp);
> +
>  		free_rmid(rdtgrp->closid, rdtgrp->mon.rmid);
>  
>  		kernfs_remove(rdtgrp->kn);
> @@ -3272,6 +3319,7 @@ static void rdt_kill_sb(struct super_block *sb)
>  	for_each_alloc_capable_rdt_resource(r)
>  		reset_all_ctrls(r);
>  	rmdir_all_sub();
> +	rdtgroup_unassign_cntrs(&rdtgroup_default);
>  	rdt_pseudo_lock_release();
>  	rdtgroup_default.mode = RDT_MODE_SHAREABLE;
>  	schemata_list_destroy();
> @@ -3280,6 +3328,7 @@ static void rdt_kill_sb(struct super_block *sb)
>  		resctrl_arch_disable_alloc();
>  	if (resctrl_arch_mon_capable())
>  		resctrl_arch_disable_mon();
> +
>  	resctrl_mounted = false;
>  	kernfs_kill_sb(sb);
>  	mutex_unlock(&rdtgroup_mutex);

Unnecessary hunk.

> @@ -3871,6 +3920,8 @@ static int rdtgroup_mkdir_mon(struct kernfs_node *parent_kn,
>  		goto out_unlock;
>  	}
>  
> +	rdtgroup_assign_cntrs(rdtgrp);
> +
>  	kernfs_activate(rdtgrp->kn);
>  
>  	/*
> @@ -3915,6 +3966,8 @@ static int rdtgroup_mkdir_ctrl_mon(struct kernfs_node *parent_kn,
>  	if (ret)
>  		goto out_closid_free;
>  
> +	rdtgroup_assign_cntrs(rdtgrp);
> +
>  	kernfs_activate(rdtgrp->kn);
>  
>  	ret = rdtgroup_init_alloc(rdtgrp);

Please compare the above two hunks with earlier "x86/resctrl: Introduce cntr_id in mongroup for assignments".
Earlier patch initializes the counters within mkdir_rdt_prepare_rmid_alloc() while the above
hunk assigns the counters after mkdir_rdt_prepare_rmid_alloc() is called. Could this fragmentation be avoided
with init done once within mkdir_rdt_prepare_rmid_alloc()?

> @@ -3940,6 +3993,7 @@ static int rdtgroup_mkdir_ctrl_mon(struct kernfs_node *parent_kn,
>  out_del_list:
>  	list_del(&rdtgrp->rdtgroup_list);
>  out_rmid_free:
> +	rdtgroup_unassign_cntrs(rdtgrp);
>  	mkdir_rdt_prepare_rmid_free(rdtgrp);
>  out_closid_free:
>  	closid_free(closid);
> @@ -4010,6 +4064,9 @@ static int rdtgroup_rmdir_mon(struct rdtgroup *rdtgrp, cpumask_var_t tmpmask)
>  	update_closid_rmid(tmpmask, NULL);
>  
>  	rdtgrp->flags = RDT_DELETED;
> +
> +	rdtgroup_unassign_cntrs(rdtgrp);
> +
>  	free_rmid(rdtgrp->closid, rdtgrp->mon.rmid);
>  
>  	/*
> @@ -4056,6 +4113,8 @@ static int rdtgroup_rmdir_ctrl(struct rdtgroup *rdtgrp, cpumask_var_t tmpmask)
>  	cpumask_or(tmpmask, tmpmask, &rdtgrp->cpu_mask);
>  	update_closid_rmid(tmpmask, NULL);
>  
> +	rdtgroup_unassign_cntrs(rdtgrp);
> +
>  	free_rmid(rdtgrp->closid, rdtgrp->mon.rmid);
>  	closid_free(rdtgrp->closid);
>  

There is a potential problem here. rdtgroup_unassign_cntrs() attempts to remove the counter from
all domains associated with the resource group. This may fail in any of the domains, which results
in the counter not being marked as free in the global map and not being reset in the
resource group ... but the resource group is removed right after calling rdtgroup_unassign_cntrs().
In this scenario there is thus a counter that is considered to be in use but not assigned to any
resource group.

From what I can tell there is a difference here between default resource group and the others:
on remount of resctrl the default resource group will maintain knowledge of the counter that could
not be unassigned. This means that unmount/remount of resctrl does not provide a real "clean slate"
when it comes to counter assignment. Is this intended?
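
To illustrate, the sequence in question from the rmdir hunk above is
(comments added here; not part of the patch):

	rdtgroup_unassign_cntrs(rdtgrp);	/* a domain may fail; returns void */
	free_rmid(rdtgrp->closid, rdtgrp->mon.rmid);
	closid_free(rdtgrp->closid);
	/*
	 * If the unassign failed in any domain, the counter remains set in
	 * the global map while the group that held it is gone.
	 */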

Reinette
Re: [PATCH v9 20/26] x86/resctrl: Auto assign/unassign counters when mbm_cntr_assign is enabled
Posted by Moger, Babu 3 days, 4 hours ago
Hi Reinette,

On 11/18/2024 11:18 AM, Reinette Chatre wrote:
> Hi Babu,
> 
> On 10/29/24 4:21 PM, Babu Moger wrote:
>> Assign/unassign counters on resctrl group creation/deletion. Two counters
>> are required per group, one for MBM total event and one for MBM local
>> event.
>>
>> There are a limited number of counters available for assignment. If these
>> counters are exhausted, the kernel will display the error message: "Out of
>> MBM assignable counters". However, it is not necessary to fail the
>> creation of a group due to assignment failures. Users have the flexibility
>> to modify the assignments at a later time.
>>
>> Signed-off-by: Babu Moger <babu.moger@amd.com>
>> ---
>> v9: Changed rdtgroup_assign_cntrs() and rdtgroup_unassign_cntrs() to return void.
>>      Updated a couple of rdtgroup_unassign_cntrs() calls properly.
>>      Updated function comments.
>>
>> v8: Renamed rdtgroup_assign_grp to rdtgroup_assign_cntrs.
>>      Renamed rdtgroup_unassign_grp to rdtgroup_unassign_cntrs.
>>      Fixed the problem with unassigning the child MON groups of CTRL_MON group.
>>
>> v7: Reworded the commit message.
>>      Removed the reference of ABMC with mbm_cntr_assign.
>>      Renamed the function rdtgroup_assign_cntrs to rdtgroup_assign_grp.
>>
>> v6: Removed the redundant comments on all the calls of
>>      rdtgroup_assign_cntrs. Updated the commit message.
>>      Dropped printing error message on every call of rdtgroup_assign_cntrs.
>>
>> v5: Removed the code to enable/disable ABMC during the mount.
>>      That will be another patch.
>>      Added arch callers to get the arch specific data.
>>      Renamed functions to match the other ABMC functions.
>>      Added code comments for assignment failures.
>>
>> v4: Few name changes based on the upstream discussion.
>>      Commit message update.
>>
>> v3: This is a new patch. Patch addresses the upstream comment to enable
>>      ABMC feature by default if the feature is available.
>> ---
>>   arch/x86/kernel/cpu/resctrl/rdtgroup.c | 61 +++++++++++++++++++++++++-
>>   1 file changed, 60 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> index b0cce3dfd062..a8d21b0b2054 100644
>> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> @@ -2932,6 +2932,46 @@ static void schemata_list_destroy(void)
>>   	}
>>   }
>>   
>> +/*
>> + * Called when a new group is created. If "mbm_cntr_assign" mode is enabled,
>> + * counters are automatically assigned. Each group can accommodate two counters:
>> + * one for the total event and one for the local event. Assignments may fail
>> + * due to the limited number of counters. However, it is not necessary to fail
>> + * the group creation and thus no failure is returned. Users have the option
>> + * to modify the counter assignments after the group has been created.
>> + */
>> +static void rdtgroup_assign_cntrs(struct rdtgroup *rdtgrp)
>> +{
>> +	struct rdt_resource *r = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
>> +
>> +	if (!resctrl_arch_mbm_cntr_assign_enabled(r))
>> +		return;
>> +
>> +	if (is_mbm_total_enabled())
>> +		rdtgroup_assign_cntr_event(r, rdtgrp, NULL, QOS_L3_MBM_TOTAL_EVENT_ID);
>> +
>> +	if (is_mbm_local_enabled())
>> +		rdtgroup_assign_cntr_event(r, rdtgrp, NULL, QOS_L3_MBM_LOCAL_EVENT_ID);
>> +}
>> +
>> +/*
>> + * Called when a group is deleted. Counters are unassigned if they are in
>> + * the assigned state.
>> + */
>> +static void rdtgroup_unassign_cntrs(struct rdtgroup *rdtgrp)
>> +{
>> +	struct rdt_resource *r = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
>> +
>> +	if (!resctrl_arch_mbm_cntr_assign_enabled(r))
>> +		return;
>> +
>> +	if (is_mbm_total_enabled())
>> +		rdtgroup_unassign_cntr_event(r, rdtgrp, NULL, QOS_L3_MBM_TOTAL_EVENT_ID);
>> +
>> +	if (is_mbm_local_enabled())
>> +		rdtgroup_unassign_cntr_event(r, rdtgrp, NULL, QOS_L3_MBM_LOCAL_EVENT_ID);
>> +}
>> +
>>   static int rdt_get_tree(struct fs_context *fc)
>>   {
>>   	struct rdt_fs_context *ctx = rdt_fc2context(fc);
>> @@ -2991,6 +3031,8 @@ static int rdt_get_tree(struct fs_context *fc)
>>   		if (ret < 0)
>>   			goto out_mongrp;
>>   		rdtgroup_default.mon.mon_data_kn = kn_mondata;
>> +
>> +		rdtgroup_assign_cntrs(&rdtgroup_default);
> 
> I think counters should be assigned *before* the files exposing them
> are added to resctrl.

Sure.

> 
>>   	}
>>   
>>   	ret = rdt_pseudo_lock_init();
>> @@ -3021,8 +3063,10 @@ static int rdt_get_tree(struct fs_context *fc)
>>   out_psl:
>>   	rdt_pseudo_lock_release();
>>   out_mondata:
>> -	if (resctrl_arch_mon_capable())
>> +	if (resctrl_arch_mon_capable()) {
>> +		rdtgroup_unassign_cntrs(&rdtgroup_default);
>>   		kernfs_remove(kn_mondata);
> 
> ... and here remove the files before taking away the data exposed by them.

Sure.

> 
>> +	}
>>   out_mongrp:
>>   	if (resctrl_arch_mon_capable())
>>   		kernfs_remove(kn_mongrp);
>> @@ -3201,6 +3245,7 @@ static void free_all_child_rdtgrp(struct rdtgroup *rdtgrp)
>>   
>>   	head = &rdtgrp->mon.crdtgrp_list;
>>   	list_for_each_entry_safe(sentry, stmp, head, mon.crdtgrp_list) {
>> +		rdtgroup_unassign_cntrs(sentry);
>>   		free_rmid(sentry->closid, sentry->mon.rmid);
>>   		list_del(&sentry->mon.crdtgrp_list);
>>   
>> @@ -3241,6 +3286,8 @@ static void rmdir_all_sub(void)
>>   		cpumask_or(&rdtgroup_default.cpu_mask,
>>   			   &rdtgroup_default.cpu_mask, &rdtgrp->cpu_mask);
>>   
>> +		rdtgroup_unassign_cntrs(rdtgrp);
>> +
>>   		free_rmid(rdtgrp->closid, rdtgrp->mon.rmid);
>>   
>>   		kernfs_remove(rdtgrp->kn);
>> @@ -3272,6 +3319,7 @@ static void rdt_kill_sb(struct super_block *sb)
>>   	for_each_alloc_capable_rdt_resource(r)
>>   		reset_all_ctrls(r);
>>   	rmdir_all_sub();
>> +	rdtgroup_unassign_cntrs(&rdtgroup_default);
>>   	rdt_pseudo_lock_release();
>>   	rdtgroup_default.mode = RDT_MODE_SHAREABLE;
>>   	schemata_list_destroy();
>> @@ -3280,6 +3328,7 @@ static void rdt_kill_sb(struct super_block *sb)
>>   		resctrl_arch_disable_alloc();
>>   	if (resctrl_arch_mon_capable())
>>   		resctrl_arch_disable_mon();
>> +
>>   	resctrl_mounted = false;
>>   	kernfs_kill_sb(sb);
>>   	mutex_unlock(&rdtgroup_mutex);
> 
> Unnecessary hunk.

ok

> 
>> @@ -3871,6 +3920,8 @@ static int rdtgroup_mkdir_mon(struct kernfs_node *parent_kn,
>>   		goto out_unlock;
>>   	}
>>   
>> +	rdtgroup_assign_cntrs(rdtgrp);
>> +
>>   	kernfs_activate(rdtgrp->kn);
>>   
>>   	/*
>> @@ -3915,6 +3966,8 @@ static int rdtgroup_mkdir_ctrl_mon(struct kernfs_node *parent_kn,
>>   	if (ret)
>>   		goto out_closid_free;
>>   
>> +	rdtgroup_assign_cntrs(rdtgrp);
>> +
>>   	kernfs_activate(rdtgrp->kn);
>>   
>>   	ret = rdtgroup_init_alloc(rdtgrp);
> 
> Please compare the above two hunks with earlier "x86/resctrl: Introduce cntr_id in mongroup for assignments".
> Earlier patch initializes the counters within mkdir_rdt_prepare_rmid_alloc() while the above
> hunk assigns the counters after mkdir_rdt_prepare_rmid_alloc() is called. Could this fragmentation be avoided
> with init done once within mkdir_rdt_prepare_rmid_alloc()?

It seems more appropriate to call rdtgroup_cntr_id_init() inside 
mkdir_rdt_prepare(). This will initialize the cntr_id to MON_CNTR_UNSET.

And then call rdtgroup_assign_cntrs() inside mkdir_rdt_prepare_rmid_alloc().

what do you think?
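
For example (a rough, untested sketch against the upstream helper; the
rdtgroup_cntr_id_init() name is from the earlier patch in this series):

	static int mkdir_rdt_prepare_rmid_alloc(struct rdtgroup *rdtgrp)
	{
		int ret;

		if (!resctrl_arch_mon_capable())
			return 0;

		ret = alloc_rmid(rdtgrp->closid);
		if (ret < 0) {
			rdt_last_cmd_puts("Out of RMIDs\n");
			return ret;
		}
		rdtgrp->mon.rmid = ret;

		ret = mkdir_mondata_all(rdtgrp->kn, rdtgrp, &rdtgrp->mon.mon_data_kn);
		if (ret) {
			rdt_last_cmd_puts("kernfs subdir error\n");
			free_rmid(rdtgrp->closid, rdtgrp->mon.rmid);
			return ret;
		}

		/* One shared call site for both MON and CTRL_MON mkdir paths. */
		rdtgroup_assign_cntrs(rdtgrp);

		return 0;
	}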


> 
>> @@ -3940,6 +3993,7 @@ static int rdtgroup_mkdir_ctrl_mon(struct kernfs_node *parent_kn,
>>   out_del_list:
>>   	list_del(&rdtgrp->rdtgroup_list);
>>   out_rmid_free:
>> +	rdtgroup_unassign_cntrs(rdtgrp);
>>   	mkdir_rdt_prepare_rmid_free(rdtgrp);
>>   out_closid_free:
>>   	closid_free(closid);
>> @@ -4010,6 +4064,9 @@ static int rdtgroup_rmdir_mon(struct rdtgroup *rdtgrp, cpumask_var_t tmpmask)
>>   	update_closid_rmid(tmpmask, NULL);
>>   
>>   	rdtgrp->flags = RDT_DELETED;
>> +
>> +	rdtgroup_unassign_cntrs(rdtgrp);
>> +
>>   	free_rmid(rdtgrp->closid, rdtgrp->mon.rmid);
>>   
>>   	/*
>> @@ -4056,6 +4113,8 @@ static int rdtgroup_rmdir_ctrl(struct rdtgroup *rdtgrp, cpumask_var_t tmpmask)
>>   	cpumask_or(tmpmask, tmpmask, &rdtgrp->cpu_mask);
>>   	update_closid_rmid(tmpmask, NULL);
>>   
>> +	rdtgroup_unassign_cntrs(rdtgrp);
>> +
>>   	free_rmid(rdtgrp->closid, rdtgrp->mon.rmid);
>>   	closid_free(rdtgrp->closid);
>>   
> 
> There is a potential problem here. rdtgroup_unassign_cntrs() attempts to remove counter from
> all domains associated with the resource group. This may fail in any of the domains that results
> in the counter not being marked as free in the global map and not reset the counter in the
> resource group ... but the resource group is removed right after calling rdtgroup_unassign_cntrs().
> In this scenario there is thus a counter that is considered to be in use but not assigned to any
> resource group.
> 
> From what I can tell there is a difference here between default resource group and the others:
> on remount of resctrl the default resource group will maintain knowledge of the counter that could
> not be unassigned. This means that unmount/remount of resctrl does not provide a real "clean slate"
> when it comes to counter assignment. Is this intended?
> 

Yes. Agree. It is not intended.

How about adding bitmap_zero() inside rdt_get_tree() to address this 
problem? Also need to reset the cntr_id in rdt_kill_sb().

-- 
- Babu Moger
Re: [PATCH v9 20/26] x86/resctrl: Auto assign/unassign counters when mbm_cntr_assign is enabled
Posted by Moger, Babu 3 days, 4 hours ago

On 11/21/2024 6:22 PM, Moger, Babu wrote:
> Hi Reinette,
> 
> On 11/18/2024 11:18 AM, Reinette Chatre wrote:
>> Hi Babu,
>>
>> On 10/29/24 4:21 PM, Babu Moger wrote:
>>> Assign/unassign counters on resctrl group creation/deletion. Two 
>>> counters
>>> are required per group, one for MBM total event and one for MBM local
>>> event.
>>>
>>> There are a limited number of counters available for assignment. If 
>>> these
>>> counters are exhausted, the kernel will display the error message: 
>>> "Out of
>>> MBM assignable counters". However, it is not necessary to fail the
>>> creation of a group due to assignment failures. Users have the 
>>> flexibility
>>> to modify the assignments at a later time.
>>>
>>> Signed-off-by: Babu Moger <babu.moger@amd.com>
>>> ---
>>> v9: Changed rdtgroup_assign_cntrs() and rdtgroup_unassign_cntrs() to 
>>> return void.
>>>      Updated a couple of rdtgroup_unassign_cntrs() calls properly.
>>>      Updated function comments.
>>>
>>> v8: Renamed rdtgroup_assign_grp to rdtgroup_assign_cntrs.
>>>      Renamed rdtgroup_unassign_grp to rdtgroup_unassign_cntrs.
>>>      Fixed the problem with unassigning the child MON groups of 
>>> CTRL_MON group.
>>>
>>> v7: Reworded the commit message.
>>>      Removed the reference of ABMC with mbm_cntr_assign.
>>>      Renamed the function rdtgroup_assign_cntrs to rdtgroup_assign_grp.
>>>
>>> v6: Removed the redundant comments on all the calls of
>>>      rdtgroup_assign_cntrs. Updated the commit message.
>>>      Dropped printing error message on every call of 
>>> rdtgroup_assign_cntrs.
>>>
>>> v5: Removed the code to enable/disable ABMC during the mount.
>>>      That will be another patch.
>>>      Added arch callers to get the arch specific data.
>>>      Renamed functions to match the other ABMC functions.
>>>      Added code comments for assignment failures.
>>>
>>> v4: Few name changes based on the upstream discussion.
>>>      Commit message update.
>>>
>>> v3: This is a new patch. Patch addresses the upstream comment to enable
>>>      ABMC feature by default if the feature is available.
>>> ---
>>>   arch/x86/kernel/cpu/resctrl/rdtgroup.c | 61 +++++++++++++++++++++++++-
>>>   1 file changed, 60 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c 
>>> b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>>> index b0cce3dfd062..a8d21b0b2054 100644
>>> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>>> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>>> @@ -2932,6 +2932,46 @@ static void schemata_list_destroy(void)
>>>       }
>>>   }
>>> +/*
>>> + * Called when a new group is created. If "mbm_cntr_assign" mode is 
>>> enabled,
>>> + * counters are automatically assigned. Each group can accommodate 
>>> two counters:
>>> + * one for the total event and one for the local event. Assignments 
>>> may fail
>>> + * due to the limited number of counters. However, it is not 
>>> necessary to fail
>>> + * the group creation and thus no failure is returned. Users have 
>>> the option
>>> + * to modify the counter assignments after the group has been created.
>>> + */
>>> +static void rdtgroup_assign_cntrs(struct rdtgroup *rdtgrp)
>>> +{
>>> +    struct rdt_resource *r = 
>>> &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
>>> +
>>> +    if (!resctrl_arch_mbm_cntr_assign_enabled(r))
>>> +        return;
>>> +
>>> +    if (is_mbm_total_enabled())
>>> +        rdtgroup_assign_cntr_event(r, rdtgrp, NULL, 
>>> QOS_L3_MBM_TOTAL_EVENT_ID);
>>> +
>>> +    if (is_mbm_local_enabled())
>>> +        rdtgroup_assign_cntr_event(r, rdtgrp, NULL, 
>>> QOS_L3_MBM_LOCAL_EVENT_ID);
>>> +}
>>> +
>>> +/*
>>> + * Called when a group is deleted. Counters are unassigned if they are in
>>> + * the assigned state.
>>> + */
>>> +static void rdtgroup_unassign_cntrs(struct rdtgroup *rdtgrp)
>>> +{
>>> +    struct rdt_resource *r = 
>>> &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
>>> +
>>> +    if (!resctrl_arch_mbm_cntr_assign_enabled(r))
>>> +        return;
>>> +
>>> +    if (is_mbm_total_enabled())
>>> +        rdtgroup_unassign_cntr_event(r, rdtgrp, NULL, 
>>> QOS_L3_MBM_TOTAL_EVENT_ID);
>>> +
>>> +    if (is_mbm_local_enabled())
>>> +        rdtgroup_unassign_cntr_event(r, rdtgrp, NULL, 
>>> QOS_L3_MBM_LOCAL_EVENT_ID);
>>> +}
>>> +
>>>   static int rdt_get_tree(struct fs_context *fc)
>>>   {
>>>       struct rdt_fs_context *ctx = rdt_fc2context(fc);
>>> @@ -2991,6 +3031,8 @@ static int rdt_get_tree(struct fs_context *fc)
>>>           if (ret < 0)
>>>               goto out_mongrp;
>>>           rdtgroup_default.mon.mon_data_kn = kn_mondata;
>>> +
>>> +        rdtgroup_assign_cntrs(&rdtgroup_default);
>>
>> I think counters should be assigned *before* the files exposing them
>> are added to resctrl.
> 
> Sure.
> 
>>
>>>       }
>>>       ret = rdt_pseudo_lock_init();
>>> @@ -3021,8 +3063,10 @@ static int rdt_get_tree(struct fs_context *fc)
>>>   out_psl:
>>>       rdt_pseudo_lock_release();
>>>   out_mondata:
>>> -    if (resctrl_arch_mon_capable())
>>> +    if (resctrl_arch_mon_capable()) {
>>> +        rdtgroup_unassign_cntrs(&rdtgroup_default);
>>>           kernfs_remove(kn_mondata);
>>
>> ... and here remove the files before taking away the data exposed by 
>> them.
> 
> Sure.
> 
>>
>>> +    }
>>>   out_mongrp:
>>>       if (resctrl_arch_mon_capable())
>>>           kernfs_remove(kn_mongrp);
>>> @@ -3201,6 +3245,7 @@ static void free_all_child_rdtgrp(struct 
>>> rdtgroup *rdtgrp)
>>>       head = &rdtgrp->mon.crdtgrp_list;
>>>       list_for_each_entry_safe(sentry, stmp, head, mon.crdtgrp_list) {
>>> +        rdtgroup_unassign_cntrs(sentry);
>>>           free_rmid(sentry->closid, sentry->mon.rmid);
>>>           list_del(&sentry->mon.crdtgrp_list);
>>> @@ -3241,6 +3286,8 @@ static void rmdir_all_sub(void)
>>>           cpumask_or(&rdtgroup_default.cpu_mask,
>>>                  &rdtgroup_default.cpu_mask, &rdtgrp->cpu_mask);
>>> +        rdtgroup_unassign_cntrs(rdtgrp);
>>> +
>>>           free_rmid(rdtgrp->closid, rdtgrp->mon.rmid);
>>>           kernfs_remove(rdtgrp->kn);
>>> @@ -3272,6 +3319,7 @@ static void rdt_kill_sb(struct super_block *sb)
>>>       for_each_alloc_capable_rdt_resource(r)
>>>           reset_all_ctrls(r);
>>>       rmdir_all_sub();
>>> +    rdtgroup_unassign_cntrs(&rdtgroup_default);
>>>       rdt_pseudo_lock_release();
>>>       rdtgroup_default.mode = RDT_MODE_SHAREABLE;
>>>       schemata_list_destroy();
>>> @@ -3280,6 +3328,7 @@ static void rdt_kill_sb(struct super_block *sb)
>>>           resctrl_arch_disable_alloc();
>>>       if (resctrl_arch_mon_capable())
>>>           resctrl_arch_disable_mon();
>>> +
>>>       resctrl_mounted = false;
>>>       kernfs_kill_sb(sb);
>>>       mutex_unlock(&rdtgroup_mutex);
>>
>> Unnecessary hunk.
> 
> ok
> 
>>
>>> @@ -3871,6 +3920,8 @@ static int rdtgroup_mkdir_mon(struct 
>>> kernfs_node *parent_kn,
>>>           goto out_unlock;
>>>       }
>>> +    rdtgroup_assign_cntrs(rdtgrp);
>>> +
>>>       kernfs_activate(rdtgrp->kn);
>>>       /*
>>> @@ -3915,6 +3966,8 @@ static int rdtgroup_mkdir_ctrl_mon(struct 
>>> kernfs_node *parent_kn,
>>>       if (ret)
>>>           goto out_closid_free;
>>> +    rdtgroup_assign_cntrs(rdtgrp);
>>> +
>>>       kernfs_activate(rdtgrp->kn);
>>>       ret = rdtgroup_init_alloc(rdtgrp);
>>
>> Please compare the above two hunks with earlier "x86/resctrl: 
>> Introduce cntr_id in mongroup for assignments".
>> Earlier patch initializes the counters within 
>> mkdir_rdt_prepare_rmid_alloc() while the above
>> hunk assigns the counters after mkdir_rdt_prepare_rmid_alloc() is 
>> called. Could this fragmentation be avoided
>> with init done once within mkdir_rdt_prepare_rmid_alloc()?
> 
> It seems more appropriate to call rdtgroup_cntr_id_init() inside 
> mkdir_rdt_prepare(). This will initialize the cntr_id to MON_CNTR_UNSET.
> 
> And then call rdtgroup_assign_cntrs() inside 
> mkdir_rdt_prepare_rmid_alloc().
> 
> what do you think?
> 
> 
>>
>>> @@ -3940,6 +3993,7 @@ static int rdtgroup_mkdir_ctrl_mon(struct 
>>> kernfs_node *parent_kn,
>>>   out_del_list:
>>>       list_del(&rdtgrp->rdtgroup_list);
>>>   out_rmid_free:
>>> +    rdtgroup_unassign_cntrs(rdtgrp);
>>>       mkdir_rdt_prepare_rmid_free(rdtgrp);
>>>   out_closid_free:
>>>       closid_free(closid);
>>> @@ -4010,6 +4064,9 @@ static int rdtgroup_rmdir_mon(struct rdtgroup 
>>> *rdtgrp, cpumask_var_t tmpmask)
>>>       update_closid_rmid(tmpmask, NULL);
>>>       rdtgrp->flags = RDT_DELETED;
>>> +
>>> +    rdtgroup_unassign_cntrs(rdtgrp);
>>> +
>>>       free_rmid(rdtgrp->closid, rdtgrp->mon.rmid);
>>>       /*
>>> @@ -4056,6 +4113,8 @@ static int rdtgroup_rmdir_ctrl(struct rdtgroup 
>>> *rdtgrp, cpumask_var_t tmpmask)
>>>       cpumask_or(tmpmask, tmpmask, &rdtgrp->cpu_mask);
>>>       update_closid_rmid(tmpmask, NULL);
>>> +    rdtgroup_unassign_cntrs(rdtgrp);
>>> +
>>>       free_rmid(rdtgrp->closid, rdtgrp->mon.rmid);
>>>       closid_free(rdtgrp->closid);
>>
>> There is a potential problem here. rdtgroup_unassign_cntrs() attempts 
>> to remove counter from
>> all domains associated with the resource group. This may fail in any 
>> of the domains that results
>> in the counter not being marked as free in the global map and not 
>> reset the counter in the
>> resource group ... but the resource group is removed right after 
>> calling rdtgroup_unassign_cntrs().
>> In this scenario there is thus a counter that is considered to be in 
>> use but not assigned to any
>> resource group.
>>
>> From what I can tell there is a difference here between default
>> resource group and the others:
>> on remount of resctrl the default resource group will maintain 
>> knowledge of the counter that could
>> not be unassigned. This means that unmount/remount of resctrl does not 
>> provide a real "clean slate"
>> when it comes to counter assignment. Is this intended?
>>
> 
> Yes. Agree. It is not intended.
> 
> How about adding bitmap_zero() inside rdt_get_tree() to address this 
> problem? Also need to reset the cntr_id in rdt_kill_sb().

I meant reset the cntr_id for the default group in rdt_kill_sb().
-- 
- Babu Moger
Re: [PATCH v9 20/26] x86/resctrl: Auto assign/unassign counters when mbm_cntr_assign is enabled
Posted by Reinette Chatre 2 days, 10 hours ago
Hi Babu,

On 11/21/24 4:26 PM, Moger, Babu wrote:
> On 11/21/2024 6:22 PM, Moger, Babu wrote:
>> On 11/18/2024 11:18 AM, Reinette Chatre wrote:
>>> On 10/29/24 4:21 PM, Babu Moger wrote:

>>>> @@ -3871,6 +3920,8 @@ static int rdtgroup_mkdir_mon(struct kernfs_node *parent_kn,
>>>>           goto out_unlock;
>>>>       }
>>>> +    rdtgroup_assign_cntrs(rdtgrp);
>>>> +
>>>>       kernfs_activate(rdtgrp->kn);
>>>>       /*
>>>> @@ -3915,6 +3966,8 @@ static int rdtgroup_mkdir_ctrl_mon(struct kernfs_node *parent_kn,
>>>>       if (ret)
>>>>           goto out_closid_free;
>>>> +    rdtgroup_assign_cntrs(rdtgrp);
>>>> +
>>>>       kernfs_activate(rdtgrp->kn);
>>>>       ret = rdtgroup_init_alloc(rdtgrp);
>>>
>>> Please compare the above two hunks with earlier "x86/resctrl: Introduce cntr_id in mongroup for assignments".
>>> Earlier patch initializes the counters within mkdir_rdt_prepare_rmid_alloc() while the above
>>> hunk assigns the counters after mkdir_rdt_prepare_rmid_alloc() is called. Could this fragmentation be avoided
>>> with init done once within mkdir_rdt_prepare_rmid_alloc()?
>>
>> It seems more appropriate to call rdtgroup_cntr_id_init() inside mkdir_rdt_prepare(). This will initialize the cntr_id to MON_CNTR_UNSET.
>>
>> And then call rdtgroup_assign_cntrs() inside mkdir_rdt_prepare_rmid_alloc().
>>
>> what do you think?

Taking a closer look this seems most appropriate. mkdir_rdt_prepare() is where the resource group
is created and all fields initialized, control and monitoring (irrespective of whether monitoring is enabled).
Doing the MON_CNTR_UNSET initialization in that central place seems good.
Yes, and then assigning the counters in mkdir_rdt_prepare_rmid_alloc() sounds good.

>>>> @@ -3940,6 +3993,7 @@ static int rdtgroup_mkdir_ctrl_mon(struct kernfs_node *parent_kn,
>>>>   out_del_list:
>>>>       list_del(&rdtgrp->rdtgroup_list);
>>>>   out_rmid_free:
>>>> +    rdtgroup_unassign_cntrs(rdtgrp);
>>>>       mkdir_rdt_prepare_rmid_free(rdtgrp);
>>>>   out_closid_free:
>>>>       closid_free(closid);
>>>> @@ -4010,6 +4064,9 @@ static int rdtgroup_rmdir_mon(struct rdtgroup *rdtgrp, cpumask_var_t tmpmask)
>>>>       update_closid_rmid(tmpmask, NULL);
>>>>       rdtgrp->flags = RDT_DELETED;
>>>> +
>>>> +    rdtgroup_unassign_cntrs(rdtgrp);
>>>> +
>>>>       free_rmid(rdtgrp->closid, rdtgrp->mon.rmid);
>>>>       /*
>>>> @@ -4056,6 +4113,8 @@ static int rdtgroup_rmdir_ctrl(struct rdtgroup *rdtgrp, cpumask_var_t tmpmask)
>>>>       cpumask_or(tmpmask, tmpmask, &rdtgrp->cpu_mask);
>>>>       update_closid_rmid(tmpmask, NULL);
>>>> +    rdtgroup_unassign_cntrs(rdtgrp);
>>>> +
>>>>       free_rmid(rdtgrp->closid, rdtgrp->mon.rmid);
>>>>       closid_free(rdtgrp->closid);
>>>
>>> There is a potential problem here. rdtgroup_unassign_cntrs() attempts to remove counter from
>>> all domains associated with the resource group. This may fail in any of the domains that results
>>> in the counter not being marked as free in the global map and not reset the counter in the
>>> resource group ... but the resource group is removed right after calling rdtgroup_unassign_cntrs().
>>> In this scenario there is thus a counter that is considered to be in use but not assigned to any
>>> resource group.
>>>
>>> From what I can tell there is a difference here between default resource group and the others:
>>> on remount of resctrl the default resource group will maintain knowledge of the counter that could
>>> not be unassigned. This means that unmount/remount of resctrl does not provide a real "clean slate"
>>> when it comes to counter assignment. Is this intended?
>>>
>>
>> Yes. Agree. It is not intended.
>>
>> How about adding bitmap_zero() inside rdt_get_tree() to address this problem? Also need to reset the cntr_id in rdt_kill_sb().
> 
> I meant reset the cntr_id for the default group in rdt_kill_sb().

Doing the cntr_id reset like this matches the custom of resetting to defaults in rdt_kill_sb(). I am not sure
what you envision with the bitmap_zero() in rdt_get_tree() ... I wonder if it may not just be simpler to
call mbm_cntr_reset() from rdt_kill_sb()? This does raise the question if mbm_cntr_reset() should reset
architectural state. I do not think it does harm, the state will just be reset again when the mon dirs are
created?
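
Something like this (sketch; mbm_cntr_reset() is the helper from earlier
in this series, and its exact signature may differ):

	struct rdt_resource *l3 = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;

	rmdir_all_sub();
	mbm_cntr_reset(l3);	/* free all counters, including the default group's */
	rdt_pseudo_lock_release();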

Reinette

Re: [PATCH v9 20/26] x86/resctrl: Auto assign/unassign counters when mbm_cntr_assign is enabled
Posted by Moger, Babu 2 days, 6 hours ago
Hi Reinette,

On 11/22/2024 12:12 PM, Reinette Chatre wrote:
> Hi Babu,
> 
> On 11/21/24 4:26 PM, Moger, Babu wrote:
>> On 11/21/2024 6:22 PM, Moger, Babu wrote:
>>> On 11/18/2024 11:18 AM, Reinette Chatre wrote:
>>>> On 10/29/24 4:21 PM, Babu Moger wrote:
> 
>>>>> @@ -3871,6 +3920,8 @@ static int rdtgroup_mkdir_mon(struct kernfs_node *parent_kn,
>>>>>            goto out_unlock;
>>>>>        }
>>>>> +    rdtgroup_assign_cntrs(rdtgrp);
>>>>> +
>>>>>        kernfs_activate(rdtgrp->kn);
>>>>>        /*
>>>>> @@ -3915,6 +3966,8 @@ static int rdtgroup_mkdir_ctrl_mon(struct kernfs_node *parent_kn,
>>>>>        if (ret)
>>>>>            goto out_closid_free;
>>>>> +    rdtgroup_assign_cntrs(rdtgrp);
>>>>> +
>>>>>        kernfs_activate(rdtgrp->kn);
>>>>>        ret = rdtgroup_init_alloc(rdtgrp);
>>>>
>>>> Please compare the above two hunks with earlier "x86/resctrl: Introduce cntr_id in mongroup for assignments".
>>>> Earlier patch initializes the counters within mkdir_rdt_prepare_rmid_alloc() while the above
>>>> hunk assigns the counters after mkdir_rdt_prepare_rmid_alloc() is called. Could this fragmentation be avoided
>>>> with init done once within mkdir_rdt_prepare_rmid_alloc()?
>>>
>>> It seems more appropriate to call rdtgroup_cntr_id_init() inside mkdir_rdt_prepare(). This will initialize the cntr_id to MON_CNTR_UNSET.
>>>
>>> And then call rdtgroup_assign_cntrs() inside mkdir_rdt_prepare_rmid_alloc().
>>>
>>> what do you think?
> 
> Taking a closer look this seems most appropriate. mkdir_rdt_prepare() is where the resource group
> is created and all fields initialized, control and monitoring (irrespective of whether monitoring is enabled).
> Doing the MON_CNTR_UNSET initialization in that central place seems good.
> Yes, and then assigning the counters in mkdir_rdt_prepare_rmid_alloc() sounds good.

ok.

> 
>>>>> @@ -3940,6 +3993,7 @@ static int rdtgroup_mkdir_ctrl_mon(struct kernfs_node *parent_kn,
>>>>>    out_del_list:
>>>>>        list_del(&rdtgrp->rdtgroup_list);
>>>>>    out_rmid_free:
>>>>> +    rdtgroup_unassign_cntrs(rdtgrp);
>>>>>        mkdir_rdt_prepare_rmid_free(rdtgrp);
>>>>>    out_closid_free:
>>>>>        closid_free(closid);
>>>>> @@ -4010,6 +4064,9 @@ static int rdtgroup_rmdir_mon(struct rdtgroup *rdtgrp, cpumask_var_t tmpmask)
>>>>>        update_closid_rmid(tmpmask, NULL);
>>>>>        rdtgrp->flags = RDT_DELETED;
>>>>> +
>>>>> +    rdtgroup_unassign_cntrs(rdtgrp);
>>>>> +
>>>>>        free_rmid(rdtgrp->closid, rdtgrp->mon.rmid);
>>>>>        /*
>>>>> @@ -4056,6 +4113,8 @@ static int rdtgroup_rmdir_ctrl(struct rdtgroup *rdtgrp, cpumask_var_t tmpmask)
>>>>>        cpumask_or(tmpmask, tmpmask, &rdtgrp->cpu_mask);
>>>>>        update_closid_rmid(tmpmask, NULL);
>>>>> +    rdtgroup_unassign_cntrs(rdtgrp);
>>>>> +
>>>>>        free_rmid(rdtgrp->closid, rdtgrp->mon.rmid);
>>>>>        closid_free(rdtgrp->closid);
>>>>
>>>> There is a potential problem here. rdtgroup_unassign_cntrs() attempts to remove counter from
>>>> all domains associated with the resource group. This may fail in any of the domains that results
>>>> in the counter not being marked as free in the global map and not reset the counter in the
>>>> resource group ... but the resource group is removed right after calling rdtgroup_unassign_cntrs().
>>>> In this scenario there is thus a counter that is considered to be in use but not assigned to any
>>>> resource group.
>>>>
>>>> From what I can tell there is a difference here between default resource group and the others:
>>>> on remount of resctrl the default resource group will maintain knowledge of the counter that could
>>>> not be unassigned. This means that unmount/remount of resctrl does not provide a real "clean slate"
>>>> when it comes to counter assignment. Is this intended?
>>>>
>>>
>>> Yes. Agree. It is not intended.
>>>
>>> How about adding bitmap_zero() inside rdt_get_tree() to address this problem? Also need to reset the cntr_id in rdt_kill_sb().
>>
>> I meant reset the cntr_id for the default group in rdt_kill_sb().
> 
> Doing the cntr_id reset like this matches the custom of resetting to defaults in rdt_kill_sb(). I am not sure
> what you envision with the bitmap_zero() in rdt_get_tree() ... I wonder if it may not just be simpler to
> call mbm_cntr_reset() from rdt_kill_sb()? This does raise the question if mbm_cntr_reset() should reset
> architectural state. I do not think it does harm, the state will just be reset again when the mon dirs are
> created?

Yes. Calling mbm_cntr_reset() from rdt_kill_sb() seems more clean approach.

Architectural state will be reset again when the counter is assigned (when
mon directories are created).

thanks
- Babu Moger