[PATCH v5 20/20] x86/resctrl: Introduce interface to modify assignment states of the groups

Babu Moger posted 20 patches 1 year, 7 months ago
There is a newer version of this series
[PATCH v5 20/20] x86/resctrl: Introduce interface to modify assignment states of the groups
Posted by Babu Moger 1 year, 7 months ago
Introduce the interface to enable events in ABMC mode.

Events can be enabled or disabled by writing to file
/sys/fs/resctrl/info/L3_MON/mbm_control

Format is similar to the list format with addition of op-code for the
assignment operation.
 "<CTRL_MON group>/<MON group>/<op-code><flags>"

Format for specific type of groups:

 * Default CTRL_MON group:
         "//<domain_id><op-code><flags>"

 * Non-default CTRL_MON group:
         "<CTRL_MON group>//<domain_id><op-code><flags>"

 * Child MON group of default CTRL_MON group:
         "/<MON group>/<domain_id><op-code><flags>"

 * Child MON group of non-default CTRL_MON group:
         "<CTRL_MON group>/<MON group>/<domain_id><op-code><flags>"

Op-code can be one of the following:

 = Update the assignment to match the flags
 + enable a new state
 - disable a new state

Assignment flags can be one of the following:
 t  MBM total event is enabled
 l  MBM local event is enabled
 tl Both total and local MBM events are enabled
 _  None of the MBM events are enabled. Valid only with '=" opcode.

Signed-off-by: Babu Moger <babu.moger@amd.com>
---
v5: Interface name changed from mbm_assign_control to mbm_control.
    Fixed opcode and flags combination.
    '=_" is valid.
    "-_" amd "+_" is not valid.
    Minor message update.
    Renamed the function with prefix - rdtgroup_.
    Corrected few documentation mistakes.
    Rebase related changes after SNC support.

v4: Added domain specific assignments. Fixed the opcode parsing.

v3: New patch.
    Addresses the feedback to provide the global assignment interface.
    https://lore.kernel.org/lkml/c73f444b-83a1-4e9a-95d3-54c5165ee782@intel.com/
---
 Documentation/arch/x86/resctrl.rst     |  81 +++++++-
 arch/x86/kernel/cpu/resctrl/rdtgroup.c | 250 ++++++++++++++++++++++++-
 2 files changed, 329 insertions(+), 2 deletions(-)

diff --git a/Documentation/arch/x86/resctrl.rst b/Documentation/arch/x86/resctrl.rst
index 05fee779e109..5a621235eb2b 100644
--- a/Documentation/arch/x86/resctrl.rst
+++ b/Documentation/arch/x86/resctrl.rst
@@ -331,7 +331,7 @@ with the following files:
 	 t  MBM total event is enabled.
 	 l  MBM local event is enabled.
 	 tl Both total and local MBM events are enabled.
-	 _  None of the MBM events are enabled.
+	 _  None of the MBM events are enabled. Only works with opcode '=' for write.
 
 	Examples:
 	::
@@ -358,6 +358,85 @@ with the following files:
 
 	 /child_default_mon_grp/ - This is a child monitor group of default CTRL_MON group.
 
+	Assignment state can be updated by writing to the interface.
+
+	Format is similar to the list format with addition of op-code for the
+	assignment operation.
+
+		"<CTRL_MON group>/<MON group>/<op-code><flags>"
+
+	Format for each type of groups:
+
+        * Default CTRL_MON group:
+                "//<domain_id><op-code><flags>"
+
+        * Non-default CTRL_MON group:
+                "<CTRL_MON group>//<domain_id><op-code><flags>"
+
+        * Child MON group of default CTRL_MON group:
+                "/<MON group>/<domain_id><op-code><flags>"
+
+        * Child MON group of non-default CTRL_MON group:
+                "<CTRL_MON group>/<MON group>/<domain_id><op-code><flags>"
+
+	Op-code can be one of the following:
+	::
+
+	 = Update the assignment to match the flags.
+	 + Add a new state.
+	 - delete a new state.
+
+	Examples:
+	::
+
+	  Initial group status:
+	  # cat /sys/fs/resctrl/info/L3_MON/mbm_control
+	  non_default_ctrl_mon_grp//0=tl;1=tl;
+	  non_default_ctrl_mon_grp/child_non_default_mon_grp/0=tl;1=tl;
+	  //0=tl;1=tl;
+	  /child_default_mon_grp/0=tl;1=tl;
+
+	  To update the default group to enable only total event on domain 0:
+	  # echo "//0=t" > /sys/fs/resctrl/info/L3_MON/mbm_control
+
+	  Assignment status after the update:
+	  # cat /sys/fs/resctrl/info/L3_MON/mbm_control
+	  non_default_ctrl_mon_grp//0=tl;1=tl;
+	  non_default_ctrl_mon_grp/child_non_default_mon_grp/0=tl;1=tl;
+	  //0=t;1=tl;
+	  /child_default_mon_grp/0=tl;1=tl;
+
+	  To update the MON group child_default_mon_grp to remove total event on domain 1:
+	  # echo "/child_default_mon_grp/1-t" > /sys/fs/resctrl/info/L3_MON/mbm_control
+
+	  Assignment status after the update:
+	  $ cat /sys/fs/resctrl/info/L3_MON/mbm_control
+	  non_default_ctrl_mon_grp//0=tl;1=tl;
+	  non_default_ctrl_mon_grp/child_non_default_mon_grp/0=tl;1=tl;
+	  //0=t;1=tl;
+	  /child_default_mon_grp/0=tl;1=l;
+
+	  To update the MON group non_default_ctrl_mon_grp/child_non_default_mon_grp to
+	  remove both local and total events on domain 1:
+	  # echo "non_default_ctrl_mon_grp/child_non_default_mon_grp/1=_" >
+			/sys/fs/resctrl/info/L3_MON/mbm_control
+
+	  Assignment status after the update:
+	  non_default_ctrl_mon_grp//0=tl;1=tl;
+	  non_default_ctrl_mon_grp/child_non_default_mon_grp/0=tl;1=_;
+	  //0=t;1=tl;
+	  /child_default_mon_grp/0=tl;1=l;
+
+	  To update the default group to add a local event domain 0.
+	  # echo "//0+l" > /sys/fs/resctrl/info/L3_MON/mbm_control
+
+	  Assignment status after the update:
+	  # cat /sys/fs/resctrl/info/L3_MON/mbm_control
+	  non_default_ctrl_mon_grp//0=tl;1=tl;
+	  non_default_ctrl_mon_grp/child_non_default_mon_grp/0=tl;1=_;
+	  //0=tl;1=tl;
+	  /child_default_mon_grp/0=tl;1=l;
+
 "max_threshold_occupancy":
 		Read/write file provides the largest value (in
 		bytes) at which a previously used LLC_occupancy
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 0de9f23d5389..84c0874d7872 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -1068,6 +1068,253 @@ static int rdtgroup_mbm_control_show(struct kernfs_open_file *of,
 	return 0;
 }
 
+static int rdtgroup_str_to_mon_state(char *flag)
+{
+	int i, mon_state = 0;
+
+	for (i = 0; i < strlen(flag); i++) {
+		switch (*(flag + i)) {
+		case 't':
+			mon_state |= ASSIGN_TOTAL;
+			break;
+		case 'l':
+			mon_state |= ASSIGN_LOCAL;
+			break;
+		case '_':
+			mon_state = ASSIGN_NONE;
+			break;
+		default:
+			mon_state = ASSIGN_NONE;
+			break;
+		}
+	}
+
+	return mon_state;
+}
+
+static struct rdtgroup *rdtgroup_find_grp(enum rdt_group_type rtype, char *p_grp, char *c_grp)
+{
+	struct rdtgroup *rdtg, *crg;
+
+	if (rtype == RDTCTRL_GROUP && *p_grp == '\0') {
+		return &rdtgroup_default;
+	} else if (rtype == RDTCTRL_GROUP) {
+		list_for_each_entry(rdtg, &rdt_all_groups, rdtgroup_list)
+			if (!strcmp(p_grp, rdtg->kn->name))
+				return rdtg;
+	} else if (rtype == RDTMON_GROUP) {
+		list_for_each_entry(rdtg, &rdt_all_groups, rdtgroup_list) {
+			if (!strcmp(p_grp, rdtg->kn->name)) {
+				list_for_each_entry(crg, &rdtg->mon.crdtgrp_list,
+						    mon.crdtgrp_list) {
+					if (!strcmp(c_grp, crg->kn->name))
+						return crg;
+				}
+			}
+		}
+	}
+
+	return NULL;
+}
+
+static int rdtgroup_process_flags(enum rdt_group_type rtype, char *p_grp, char *c_grp, char *tok)
+{
+	struct rdt_resource *r = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
+	int op, mon_state, assign_state, unassign_state;
+	char *dom_str, *id_str, *op_str;
+	struct rdt_mon_domain *d;
+	struct rdtgroup *rdtgrp;
+	unsigned long dom_id;
+	int ret, found = 0;
+
+	rdtgrp = rdtgroup_find_grp(rtype, p_grp, c_grp);
+
+	if (!rdtgrp) {
+		rdt_last_cmd_puts("Not a valid resctrl group\n");
+		return -EINVAL;
+	}
+
+next:
+	if (!tok || tok[0] == '\0')
+		return 0;
+
+	/* Start processing the strings for each domain */
+	dom_str = strim(strsep(&tok, ";"));
+
+	op_str = strpbrk(dom_str, "=+-");
+
+	if (op_str) {
+		op = *op_str;
+	} else {
+		rdt_last_cmd_puts("Missing operation =, +, -, _ character\n");
+		return -EINVAL;
+	}
+
+	id_str = strsep(&dom_str, "=+-");
+
+	if (!id_str || kstrtoul(id_str, 10, &dom_id)) {
+		rdt_last_cmd_puts("Missing domain id\n");
+		return -EINVAL;
+	}
+
+	/* Verify if the dom_id is valid */
+	list_for_each_entry(d, &r->mon_domains, hdr.list) {
+		if (d->hdr.id == dom_id) {
+			found = 1;
+			break;
+		}
+	}
+	if (!found) {
+		rdt_last_cmd_printf("Invalid domain id %ld\n", dom_id);
+		return -EINVAL;
+	}
+
+	mon_state = rdtgroup_str_to_mon_state(dom_str);
+
+	assign_state = 0;
+	unassign_state = 0;
+
+	switch (op) {
+	case '+':
+		if (mon_state == ASSIGN_NONE) {
+			rdt_last_cmd_puts("Invalid assign opcode\n");
+			goto out_fail;
+		}
+		assign_state = mon_state;
+		break;
+	case '-':
+		if (mon_state == ASSIGN_NONE) {
+			rdt_last_cmd_puts("Invalid assign opcode\n");
+			goto out_fail;
+		}
+		unassign_state = mon_state;
+		break;
+	case '=':
+		assign_state = mon_state;
+		unassign_state = (ASSIGN_TOTAL | ASSIGN_LOCAL) & ~assign_state;
+		break;
+	default:
+		break;
+	}
+
+	if (assign_state & ASSIGN_TOTAL)
+		ret = resctrl_arch_assign_cntr(d, QOS_L3_MBM_TOTAL_EVENT_ID,
+					       rdtgrp->mon.rmid,
+					       rdtgrp->mon.cntr_id[0],
+					       rdtgrp->closid, 1);
+	if (ret)
+		goto out_fail;
+
+	if (assign_state & ASSIGN_LOCAL)
+		ret = resctrl_arch_assign_cntr(d, QOS_L3_MBM_LOCAL_EVENT_ID,
+					       rdtgrp->mon.rmid,
+					       rdtgrp->mon.cntr_id[1],
+					       rdtgrp->closid, 1);
+
+	if (ret)
+		goto out_fail;
+
+	if (unassign_state & ASSIGN_TOTAL)
+		ret = resctrl_arch_assign_cntr(d, QOS_L3_MBM_TOTAL_EVENT_ID,
+					       rdtgrp->mon.rmid,
+					       rdtgrp->mon.cntr_id[0],
+					       rdtgrp->closid, 0);
+
+	if (ret)
+		goto out_fail;
+
+	if (unassign_state & ASSIGN_LOCAL)
+		ret = resctrl_arch_assign_cntr(d, QOS_L3_MBM_LOCAL_EVENT_ID,
+					       rdtgrp->mon.rmid,
+					       rdtgrp->mon.cntr_id[1],
+					       rdtgrp->closid, 0);
+	if (ret)
+		goto out_fail;
+
+	goto next;
+
+out_fail:
+
+	return -EINVAL;
+}
+
+static ssize_t rdtgroup_mbm_control_write(struct kernfs_open_file *of,
+					  char *buf, size_t nbytes,
+					  loff_t off)
+{
+	struct rdt_resource *r = of->kn->parent->priv;
+	char *token, *cmon_grp, *mon_grp;
+	struct rdt_hw_resource *hw_res;
+	int ret;
+
+	hw_res = resctrl_to_arch_res(r);
+	if (!hw_res->abmc_enabled)
+		return -EINVAL;
+
+	/* Valid input requires a trailing newline */
+	if (nbytes == 0 || buf[nbytes - 1] != '\n')
+		return -EINVAL;
+
+	buf[nbytes - 1] = '\0';
+	rdt_last_cmd_clear();
+
+	cpus_read_lock();
+	mutex_lock(&rdtgroup_mutex);
+
+	while ((token = strsep(&buf, "\n")) != NULL) {
+		if (strstr(token, "//")) {
+			/*
+			 * The CTRL_MON group processing:
+			 * default CTRL_MON group: "//<flags>"
+			 * non-default CTRL_MON group: "<CTRL_MON group>//flags"
+			 * The CTRL_MON group will be empty string if it is a
+			 * default group.
+			 */
+			cmon_grp = strsep(&token, "//");
+
+			/*
+			 * strsep returns empty string for contiguous delimiters.
+			 * Make sure check for two consicutive delimiters and
+			 * advance the token.
+			 */
+			mon_grp = strsep(&token, "//");
+			if (*mon_grp != '\0') {
+				rdt_last_cmd_printf("Invalid CTRL_MON group format %s\n", token);
+				ret = -EINVAL;
+				break;
+			}
+
+			ret = rdtgroup_process_flags(RDTCTRL_GROUP, cmon_grp, mon_grp, token);
+			if (ret)
+				break;
+		} else if (strstr(token, "/")) {
+			/*
+			 * MON group processing:
+			 * MON_GROUP inside default CTRL_MON group: "/<MON group>/<flags>"
+			 * MON_GROUP within CTRL_MON group: "<CTRL_MON group>/<MON group>/<flags>"
+			 */
+			cmon_grp = strsep(&token, "/");
+
+			/* Extract the MON_GROUP. It cannot be empty string */
+			mon_grp = strsep(&token, "/");
+			if (*mon_grp == '\0') {
+				rdt_last_cmd_printf("Invalid MON_GROUP format %s\n", token);
+				ret = -EINVAL;
+				break;
+			}
+
+			ret = rdtgroup_process_flags(RDTMON_GROUP, cmon_grp, mon_grp, token);
+			if (ret)
+				break;
+		}
+	}
+
+	mutex_unlock(&rdtgroup_mutex);
+	cpus_read_unlock();
+
+	return ret ?: nbytes;
+}
+
 #ifdef CONFIG_PROC_CPU_RESCTRL
 
 /*
@@ -2282,9 +2529,10 @@ static struct rftype res_common_files[] = {
 	},
 	{
 		.name		= "mbm_control",
-		.mode		= 0444,
+		.mode		= 0644,
 		.kf_ops		= &rdtgroup_kf_single_ops,
 		.seq_show	= rdtgroup_mbm_control_show,
+		.write		= rdtgroup_mbm_control_write,
 	},
 	{
 		.name		= "cpus_list",
-- 
2.34.1
Re: [PATCH v5 20/20] x86/resctrl: Introduce interface to modify assignment states of the groups
Posted by Peter Newman 1 year, 6 months ago
Hi Babu,

On Wed, Jul 3, 2024 at 2:51 PM Babu Moger <babu.moger@amd.com> wrote:
>
> Introduce the interface to enable events in ABMC mode.
>
> Events can be enabled or disabled by writing to file
> /sys/fs/resctrl/info/L3_MON/mbm_control
>
> Format is similar to the list format with addition of op-code for the
> assignment operation.
>  "<CTRL_MON group>/<MON group>/<op-code><flags>"
>
> Format for specific type of groups:
>
>  * Default CTRL_MON group:
>          "//<domain_id><op-code><flags>"
>
>  * Non-default CTRL_MON group:
>          "<CTRL_MON group>//<domain_id><op-code><flags>"
>
>  * Child MON group of default CTRL_MON group:
>          "/<MON group>/<domain_id><op-code><flags>"
>
>  * Child MON group of non-default CTRL_MON group:
>          "<CTRL_MON group>/<MON group>/<domain_id><op-code><flags>"

Just a reminder, Reinette and I had discussed[1] omitting the
domain_id for performing the same operation on all domains.

I would really appreciate this, otherwise our most typical operations
could be really tedious and needlessly serialized.

# cat mbm_control
//0=tl;1=tl;2=tl;3=tl;4=tl;5=tl;6=tl;7=tl;8=tl;9=tl;10=tl;11=tl;12=tl;13=tl;14=tl;15=tl;16=tl;17=tl;18=tl;19=tl;20=tl;21=tl;22=tl;23=tl;24=tl;25=tl;26=tl;27=tl;28=tl;29=tl;30=tl;31=tl;
# echo '//-l' > mbm_control
-bash: echo: write error: Invalid argument
# cat ../last_cmd_status
Missing domain id

If you can't get to it in this series, I'll push a
scalability-oriented series after the basic assignment support is
merged.

Thanks!
-Peter

[1] https://lore.kernel.org/lkml/CALPaoChcJq5zoPchB2j0aM+nZpQe1xoo7w2QQUjtH+c58Yyxag@mail.gmail.com/
Re: [PATCH v5 20/20] x86/resctrl: Introduce interface to modify assignment states of the groups
Posted by Moger, Babu 1 year, 6 months ago
Hi Peter,

On 7/24/2024 7:03 PM, Peter Newman wrote:
> Hi Babu,
> 
> On Wed, Jul 3, 2024 at 2:51 PM Babu Moger <babu.moger@amd.com> wrote:
>>
>> Introduce the interface to enable events in ABMC mode.
>>
>> Events can be enabled or disabled by writing to file
>> /sys/fs/resctrl/info/L3_MON/mbm_control
>>
>> Format is similar to the list format with addition of op-code for the
>> assignment operation.
>>   "<CTRL_MON group>/<MON group>/<op-code><flags>"
>>
>> Format for specific type of groups:
>>
>>   * Default CTRL_MON group:
>>           "//<domain_id><op-code><flags>"
>>
>>   * Non-default CTRL_MON group:
>>           "<CTRL_MON group>//<domain_id><op-code><flags>"
>>
>>   * Child MON group of default CTRL_MON group:
>>           "/<MON group>/<domain_id><op-code><flags>"
>>
>>   * Child MON group of non-default CTRL_MON group:
>>           "<CTRL_MON group>/<MON group>/<domain_id><op-code><flags>"
> 
> Just a reminder, Reinette and I had discussed[1] omitting the
> domain_id for performing the same operation on all domains.

Yes. I remember. Lets refresh our memory.
> 
> I would really appreciate this, otherwise our most typical operations
> could be really tedious and needlessly serialized.

> 
> # cat mbm_control
> //0=tl;1=tl;2=tl;3=tl;4=tl;5=tl;6=tl;7=tl;8=tl;9=tl;10=tl;11=tl;12=tl;13=tl;14=tl;15=tl;16=tl;17=tl;18=tl;19=tl;20=tl;21=tl;22=tl;23=tl;24=tl;25=tl;26=tl;27=tl;28=tl;29=tl;30=tl;31=tl;
> # echo '//-l' > mbm_control

What is the expectation here?
You want to unassign local event on all the domains?

Domain id makes it easy to parse the command. Without that it parsing 
code becomes  messy.

How about something like this? We can use the max domain id to mean all 
the domains. In the above case there are 32 domains(0-31). 32 is total 
number of domains. We can get that details looking through all the 
domains. We can print that detail when we list it.

# cat mbm_control
//0=tl;1=tl;2=tl;3=tl;... 31=tl;
Max domain id is 31. Use domain-id 32 to apply the flags on all the 
domains.

echo '//32-l' > mbm_control

There is only on syscall but IPIs will be sent to all the domains.

Any other ideas?

> -bash: echo: write error: Invalid argument
> # cat ../last_cmd_status
> Missing domain id
> 
> If you can't get to it in this series, I'll push a
> scalability-oriented series after the basic assignment support is
> merged.

Lets try to get this resolved in this series.

> 
> Thanks!
> -Peter
> 
> [1] https://lore.kernel.org/lkml/CALPaoChcJq5zoPchB2j0aM+nZpQe1xoo7w2QQUjtH+c58Yyxag@mail.gmail.com/

-- 
- Babu Moger
Re: [PATCH v5 20/20] x86/resctrl: Introduce interface to modify assignment states of the groups
Posted by Peter Newman 1 year, 6 months ago
Hi Babu,

On Wed, Jul 24, 2024 at 6:23 PM Moger, Babu <bmoger@amd.com> wrote:
>
> Hi Peter,
>
> On 7/24/2024 7:03 PM, Peter Newman wrote:
> > Hi Babu,
> >
> > On Wed, Jul 3, 2024 at 2:51 PM Babu Moger <babu.moger@amd.com> wrote:
> >>
> >> Introduce the interface to enable events in ABMC mode.
> >>
> >> Events can be enabled or disabled by writing to file
> >> /sys/fs/resctrl/info/L3_MON/mbm_control
> >>
> >> Format is similar to the list format with addition of op-code for the
> >> assignment operation.
> >>   "<CTRL_MON group>/<MON group>/<op-code><flags>"
> >>
> >> Format for specific type of groups:
> >>
> >>   * Default CTRL_MON group:
> >>           "//<domain_id><op-code><flags>"
> >>
> >>   * Non-default CTRL_MON group:
> >>           "<CTRL_MON group>//<domain_id><op-code><flags>"
> >>
> >>   * Child MON group of default CTRL_MON group:
> >>           "/<MON group>/<domain_id><op-code><flags>"
> >>
> >>   * Child MON group of non-default CTRL_MON group:
> >>           "<CTRL_MON group>/<MON group>/<domain_id><op-code><flags>"
> >
> > Just a reminder, Reinette and I had discussed[1] omitting the
> > domain_id for performing the same operation on all domains.
>
> Yes. I remember. Lets refresh our memory.
> >
> > I would really appreciate this, otherwise our most typical operations
> > could be really tedious and needlessly serialized.
>
> >
> > # cat mbm_control
> > //0=tl;1=tl;2=tl;3=tl;4=tl;5=tl;6=tl;7=tl;8=tl;9=tl;10=tl;11=tl;12=tl;13=tl;14=tl;15=tl;16=tl;17=tl;18=tl;19=tl;20=tl;21=tl;22=tl;23=tl;24=tl;25=tl;26=tl;27=tl;28=tl;29=tl;30=tl;31=tl;
> > # echo '//-l' > mbm_control
>
> What is the expectation here?
> You want to unassign local event on all the domains?

Correct.

>
> Domain id makes it easy to parse the command. Without that it parsing
> code becomes  messy.
>
> How about something like this? We can use the max domain id to mean all
> the domains. In the above case there are 32 domains(0-31). 32 is total
> number of domains. We can get that details looking through all the
> domains. We can print that detail when we list it.

This sounds like only a minor simplification to the parsing code. It
seems like it would be easy to determine if the final '/' is
immediately followed by an opcode (+-=_) rather than a number.

-Peter
Re: [PATCH v5 20/20] x86/resctrl: Introduce interface to modify assignment states of the groups
Posted by Moger, Babu 1 year, 6 months ago
Hi Peter,

On 7/25/24 12:11, Peter Newman wrote:
> Hi Babu,
> 
> On Wed, Jul 24, 2024 at 6:23 PM Moger, Babu <bmoger@amd.com> wrote:
>>
>> Hi Peter,
>>
>> On 7/24/2024 7:03 PM, Peter Newman wrote:
>>> Hi Babu,
>>>
>>> On Wed, Jul 3, 2024 at 2:51 PM Babu Moger <babu.moger@amd.com> wrote:
>>>>
>>>> Introduce the interface to enable events in ABMC mode.
>>>>
>>>> Events can be enabled or disabled by writing to file
>>>> /sys/fs/resctrl/info/L3_MON/mbm_control
>>>>
>>>> Format is similar to the list format with addition of op-code for the
>>>> assignment operation.
>>>>   "<CTRL_MON group>/<MON group>/<op-code><flags>"
>>>>
>>>> Format for specific type of groups:
>>>>
>>>>   * Default CTRL_MON group:
>>>>           "//<domain_id><op-code><flags>"
>>>>
>>>>   * Non-default CTRL_MON group:
>>>>           "<CTRL_MON group>//<domain_id><op-code><flags>"
>>>>
>>>>   * Child MON group of default CTRL_MON group:
>>>>           "/<MON group>/<domain_id><op-code><flags>"
>>>>
>>>>   * Child MON group of non-default CTRL_MON group:
>>>>           "<CTRL_MON group>/<MON group>/<domain_id><op-code><flags>"
>>>
>>> Just a reminder, Reinette and I had discussed[1] omitting the
>>> domain_id for performing the same operation on all domains.
>>
>> Yes. I remember. Lets refresh our memory.
>>>
>>> I would really appreciate this, otherwise our most typical operations
>>> could be really tedious and needlessly serialized.
>>
>>>
>>> # cat mbm_control
>>> //0=tl;1=tl;2=tl;3=tl;4=tl;5=tl;6=tl;7=tl;8=tl;9=tl;10=tl;11=tl;12=tl;13=tl;14=tl;15=tl;16=tl;17=tl;18=tl;19=tl;20=tl;21=tl;22=tl;23=tl;24=tl;25=tl;26=tl;27=tl;28=tl;29=tl;30=tl;31=tl;
>>> # echo '//-l' > mbm_control
>>
>> What is the expectation here?
>> You want to unassign local event on all the domains?
> 
> Correct.
> 
>>
>> Domain id makes it easy to parse the command. Without that it parsing
>> code becomes  messy.
>>
>> How about something like this? We can use the max domain id to mean all
>> the domains. In the above case there are 32 domains(0-31). 32 is total
>> number of domains. We can get that details looking through all the
>> domains. We can print that detail when we list it.
> 
> This sounds like only a minor simplification to the parsing code. It
> seems like it would be easy to determine if the final '/' is
> immediately followed by an opcode (+-=_) rather than a number.

Ok. Will try to get that working. Will let you know if there are
complexities with that.--
Thanks
Babu Moger
Re: [PATCH v5 20/20] x86/resctrl: Introduce interface to modify assignment states of the groups
Posted by Reinette Chatre 1 year, 6 months ago
Hi Babu and Peter,

On 7/25/24 10:28 AM, Moger, Babu wrote:
> Hi Peter,
> 
> On 7/25/24 12:11, Peter Newman wrote:
>> Hi Babu,
>>
>> On Wed, Jul 24, 2024 at 6:23 PM Moger, Babu <bmoger@amd.com> wrote:
>>>
>>> Hi Peter,
>>>
>>> On 7/24/2024 7:03 PM, Peter Newman wrote:
>>>> Hi Babu,
>>>>
>>>> On Wed, Jul 3, 2024 at 2:51 PM Babu Moger <babu.moger@amd.com> wrote:
>>>>>
>>>>> Introduce the interface to enable events in ABMC mode.
>>>>>
>>>>> Events can be enabled or disabled by writing to file
>>>>> /sys/fs/resctrl/info/L3_MON/mbm_control
>>>>>
>>>>> Format is similar to the list format with addition of op-code for the
>>>>> assignment operation.
>>>>>    "<CTRL_MON group>/<MON group>/<op-code><flags>"
>>>>>
>>>>> Format for specific type of groups:
>>>>>
>>>>>    * Default CTRL_MON group:
>>>>>            "//<domain_id><op-code><flags>"
>>>>>
>>>>>    * Non-default CTRL_MON group:
>>>>>            "<CTRL_MON group>//<domain_id><op-code><flags>"
>>>>>
>>>>>    * Child MON group of default CTRL_MON group:
>>>>>            "/<MON group>/<domain_id><op-code><flags>"
>>>>>
>>>>>    * Child MON group of non-default CTRL_MON group:
>>>>>            "<CTRL_MON group>/<MON group>/<domain_id><op-code><flags>"
>>>>
>>>> Just a reminder, Reinette and I had discussed[1] omitting the
>>>> domain_id for performing the same operation on all domains.
>>>
>>> Yes. I remember. Lets refresh our memory.
>>>>
>>>> I would really appreciate this, otherwise our most typical operations
>>>> could be really tedious and needlessly serialized.
>>>
>>>>
>>>> # cat mbm_control
>>>> //0=tl;1=tl;2=tl;3=tl;4=tl;5=tl;6=tl;7=tl;8=tl;9=tl;10=tl;11=tl;12=tl;13=tl;14=tl;15=tl;16=tl;17=tl;18=tl;19=tl;20=tl;21=tl;22=tl;23=tl;24=tl;25=tl;26=tl;27=tl;28=tl;29=tl;30=tl;31=tl;
>>>> # echo '//-l' > mbm_control
>>>
>>> What is the expectation here?
>>> You want to unassign local event on all the domains?
>>
>> Correct.
>>
>>>
>>> Domain id makes it easy to parse the command. Without that it parsing
>>> code becomes  messy.
>>>
>>> How about something like this? We can use the max domain id to mean all
>>> the domains. In the above case there are 32 domains(0-31). 32 is total
>>> number of domains. We can get that details looking through all the
>>> domains. We can print that detail when we list it.
>>
>> This sounds like only a minor simplification to the parsing code. It
>> seems like it would be easy to determine if the final '/' is
>> immediately followed by an opcode (+-=_) rather than a number.
> 
> Ok. Will try to get that working. Will let you know if there are
> complexities with that.--
> Thanks
> Babu Moger

Dave suggested [1]  "*" to indicate "all domains". This seems an intuitive
addition to the interface to accomplish this goal.

Reinette

[1] https://lore.kernel.org/lkml/ZierjRNDMfg5swT8@e133380.arm.com/
Re: [PATCH v5 20/20] x86/resctrl: Introduce interface to modify assignment states of the groups
Posted by Moger, Babu 1 year, 6 months ago
Hi Reinette,

On 8/1/24 13:56, Reinette Chatre wrote:
> Hi Babu and Peter,
> 
> On 7/25/24 10:28 AM, Moger, Babu wrote:
>> Hi Peter,
>>
>> On 7/25/24 12:11, Peter Newman wrote:
>>> Hi Babu,
>>>
>>> On Wed, Jul 24, 2024 at 6:23 PM Moger, Babu <bmoger@amd.com> wrote:
>>>>
>>>> Hi Peter,
>>>>
>>>> On 7/24/2024 7:03 PM, Peter Newman wrote:
>>>>> Hi Babu,
>>>>>
>>>>> On Wed, Jul 3, 2024 at 2:51 PM Babu Moger <babu.moger@amd.com> wrote:
>>>>>>
>>>>>> Introduce the interface to enable events in ABMC mode.
>>>>>>
>>>>>> Events can be enabled or disabled by writing to file
>>>>>> /sys/fs/resctrl/info/L3_MON/mbm_control
>>>>>>
>>>>>> Format is similar to the list format with addition of op-code for the
>>>>>> assignment operation.
>>>>>>    "<CTRL_MON group>/<MON group>/<op-code><flags>"
>>>>>>
>>>>>> Format for specific type of groups:
>>>>>>
>>>>>>    * Default CTRL_MON group:
>>>>>>            "//<domain_id><op-code><flags>"
>>>>>>
>>>>>>    * Non-default CTRL_MON group:
>>>>>>            "<CTRL_MON group>//<domain_id><op-code><flags>"
>>>>>>
>>>>>>    * Child MON group of default CTRL_MON group:
>>>>>>            "/<MON group>/<domain_id><op-code><flags>"
>>>>>>
>>>>>>    * Child MON group of non-default CTRL_MON group:
>>>>>>            "<CTRL_MON group>/<MON group>/<domain_id><op-code><flags>"
>>>>>
>>>>> Just a reminder, Reinette and I had discussed[1] omitting the
>>>>> domain_id for performing the same operation on all domains.
>>>>
>>>> Yes. I remember. Lets refresh our memory.
>>>>>
>>>>> I would really appreciate this, otherwise our most typical operations
>>>>> could be really tedious and needlessly serialized.
>>>>
>>>>>
>>>>> # cat mbm_control
>>>>> //0=tl;1=tl;2=tl;3=tl;4=tl;5=tl;6=tl;7=tl;8=tl;9=tl;10=tl;11=tl;12=tl;13=tl;14=tl;15=tl;16=tl;17=tl;18=tl;19=tl;20=tl;21=tl;22=tl;23=tl;24=tl;25=tl;26=tl;27=tl;28=tl;29=tl;30=tl;31=tl;
>>>>> # echo '//-l' > mbm_control
>>>>
>>>> What is the expectation here?
>>>> You want to unassign local event on all the domains?
>>>
>>> Correct.
>>>
>>>>
>>>> Domain id makes it easy to parse the command. Without that it parsing
>>>> code becomes  messy.
>>>>
>>>> How about something like this? We can use the max domain id to mean all
>>>> the domains. In the above case there are 32 domains(0-31). 32 is total
>>>> number of domains. We can get that details looking through all the
>>>> domains. We can print that detail when we list it.
>>>
>>> This sounds like only a minor simplification to the parsing code. It
>>> seems like it would be easy to determine if the final '/' is
>>> immediately followed by an opcode (+-=_) rather than a number.
>>
>> Ok. Will try to get that working. Will let you know if there are
>> complexities with that.--
>> Thanks
>> Babu Moger
> 
> Dave suggested [1]  "*" to indicate "all domains". This seems an intuitive
> addition to the interface to accomplish this goal.

Yes. that is correct. Will try to address that in v6.

-- 
Thanks
Babu Moger
Re: [PATCH v5 20/20] x86/resctrl: Introduce interface to modify assignment states of the groups
Posted by Reinette Chatre 1 year, 7 months ago
Hi Babu,

On 7/3/24 2:48 PM, Babu Moger wrote:
> Introduce the interface to enable events in ABMC mode.

As mentioned in cover letter, please take care with terms. This
interface does not "enable events" - note that events can be
"enabled" even in legacy mode. This is the interface to
assign counters.

> 
> Events can be enabled or disabled by writing to file
> /sys/fs/resctrl/info/L3_MON/mbm_control
> 
> Format is similar to the list format with addition of op-code for the
> assignment operation.
>   "<CTRL_MON group>/<MON group>/<op-code><flags>"

Missing a "domain_id".

> 
> Format for specific type of groups:
> 
>   * Default CTRL_MON group:
>           "//<domain_id><op-code><flags>"
> 
>   * Non-default CTRL_MON group:
>           "<CTRL_MON group>//<domain_id><op-code><flags>"
> 
>   * Child MON group of default CTRL_MON group:
>           "/<MON group>/<domain_id><op-code><flags>"
> 
>   * Child MON group of non-default CTRL_MON group:
>           "<CTRL_MON group>/<MON group>/<domain_id><op-code><flags>"
> 
> Op-code can be one of the following:
> 
>   = Update the assignment to match the flags
>   + enable a new state
>   - disable a new state

(note comment in cover letter about consistent terms)

> 
> Assignment flags can be one of the following:
>   t  MBM total event is enabled
>   l  MBM local event is enabled
>   tl Both total and local MBM events are enabled
>   _  None of the MBM events are enabled. Valid only with '=" opcode.
> 
> Signed-off-by: Babu Moger <babu.moger@amd.com>
> ---
> v5: Interface name changed from mbm_assign_control to mbm_control.
>      Fixed opcode and flags combination.
>      '=_" is valid.
>      "-_" amd "+_" is not valid.
>      Minor message update.
>      Renamed the function with prefix - rdtgroup_.
>      Corrected few documentation mistakes.
>      Rebase related changes after SNC support.
> 
> v4: Added domain specific assignments. Fixed the opcode parsing.
> 
> v3: New patch.
>      Addresses the feedback to provide the global assignment interface.
>      https://lore.kernel.org/lkml/c73f444b-83a1-4e9a-95d3-54c5165ee782@intel.com/
> ---
>   Documentation/arch/x86/resctrl.rst     |  81 +++++++-
>   arch/x86/kernel/cpu/resctrl/rdtgroup.c | 250 ++++++++++++++++++++++++-
>   2 files changed, 329 insertions(+), 2 deletions(-)
> 
> diff --git a/Documentation/arch/x86/resctrl.rst b/Documentation/arch/x86/resctrl.rst
> index 05fee779e109..5a621235eb2b 100644
> --- a/Documentation/arch/x86/resctrl.rst
> +++ b/Documentation/arch/x86/resctrl.rst
> @@ -331,7 +331,7 @@ with the following files:
>   	 t  MBM total event is enabled.
>   	 l  MBM local event is enabled.
>   	 tl Both total and local MBM events are enabled.
> -	 _  None of the MBM events are enabled.
> +	 _  None of the MBM events are enabled. Only works with opcode '=' for write.
>   
>   	Examples:
>   	::
> @@ -358,6 +358,85 @@ with the following files:
>   
>   	 /child_default_mon_grp/ - This is a child monitor group of default CTRL_MON group.
>   
> +	Assignment state can be updated by writing to the interface.
> +
> +	Format is similar to the list format with addition of op-code for the
> +	assignment operation.
> +
> +		"<CTRL_MON group>/<MON group>/<op-code><flags>"

Missing domain_id

> +
> +	Format for each type of groups:
> +
> +        * Default CTRL_MON group:
> +                "//<domain_id><op-code><flags>"
> +
> +        * Non-default CTRL_MON group:
> +                "<CTRL_MON group>//<domain_id><op-code><flags>"
> +
> +        * Child MON group of default CTRL_MON group:
> +                "/<MON group>/<domain_id><op-code><flags>"
> +
> +        * Child MON group of non-default CTRL_MON group:
> +                "<CTRL_MON group>/<MON group>/<domain_id><op-code><flags>"
> +
> +	Op-code can be one of the following:
> +	::
> +
> +	 = Update the assignment to match the flags.
> +	 + Add a new state.
> +	 - delete a new state.
> +
> +	Examples:
> +	::
> +
> +	  Initial group status:
> +	  # cat /sys/fs/resctrl/info/L3_MON/mbm_control
> +	  non_default_ctrl_mon_grp//0=tl;1=tl;
> +	  non_default_ctrl_mon_grp/child_non_default_mon_grp/0=tl;1=tl;
> +	  //0=tl;1=tl;
> +	  /child_default_mon_grp/0=tl;1=tl;
> +
> +	  To update the default group to enable only total event on domain 0:
> +	  # echo "//0=t" > /sys/fs/resctrl/info/L3_MON/mbm_control
> +
> +	  Assignment status after the update:
> +	  # cat /sys/fs/resctrl/info/L3_MON/mbm_control
> +	  non_default_ctrl_mon_grp//0=tl;1=tl;
> +	  non_default_ctrl_mon_grp/child_non_default_mon_grp/0=tl;1=tl;
> +	  //0=t;1=tl;
> +	  /child_default_mon_grp/0=tl;1=tl;
> +
> +	  To update the MON group child_default_mon_grp to remove total event on domain 1:
> +	  # echo "/child_default_mon_grp/1-t" > /sys/fs/resctrl/info/L3_MON/mbm_control
> +
> +	  Assignment status after the update:
> +	  $ cat /sys/fs/resctrl/info/L3_MON/mbm_control
> +	  non_default_ctrl_mon_grp//0=tl;1=tl;
> +	  non_default_ctrl_mon_grp/child_non_default_mon_grp/0=tl;1=tl;
> +	  //0=t;1=tl;
> +	  /child_default_mon_grp/0=tl;1=l;
> +
> +	  To update the MON group non_default_ctrl_mon_grp/child_non_default_mon_grp to
> +	  remove both local and total events on domain 1:
> +	  # echo "non_default_ctrl_mon_grp/child_non_default_mon_grp/1=_" >
> +			/sys/fs/resctrl/info/L3_MON/mbm_control
> +
> +	  Assignment status after the update:
> +	  non_default_ctrl_mon_grp//0=tl;1=tl;
> +	  non_default_ctrl_mon_grp/child_non_default_mon_grp/0=tl;1=_;
> +	  //0=t;1=tl;
> +	  /child_default_mon_grp/0=tl;1=l;
> +
> +	  To update the default group to add a local event domain 0.
> +	  # echo "//0+l" > /sys/fs/resctrl/info/L3_MON/mbm_control
> +
> +	  Assignment status after the update:
> +	  # cat /sys/fs/resctrl/info/L3_MON/mbm_control
> +	  non_default_ctrl_mon_grp//0=tl;1=tl;
> +	  non_default_ctrl_mon_grp/child_non_default_mon_grp/0=tl;1=_;
> +	  //0=tl;1=tl;
> +	  /child_default_mon_grp/0=tl;1=l;
> +
>   "max_threshold_occupancy":
>   		Read/write file provides the largest value (in
>   		bytes) at which a previously used LLC_occupancy
> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> index 0de9f23d5389..84c0874d7872 100644
> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> @@ -1068,6 +1068,253 @@ static int rdtgroup_mbm_control_show(struct kernfs_open_file *of,
>   	return 0;
>   }
>   
> +static int rdtgroup_str_to_mon_state(char *flag)
> +{
> +	int i, mon_state = 0;
> +
> +	for (i = 0; i < strlen(flag); i++) {
> +		switch (*(flag + i)) {
> +		case 't':
> +			mon_state |= ASSIGN_TOTAL;
> +			break;
> +		case 'l':
> +			mon_state |= ASSIGN_LOCAL;
> +			break;
> +		case '_':
> +			mon_state = ASSIGN_NONE;
> +			break;
> +		default:
> +			mon_state = ASSIGN_NONE;
> +			break;
> +		}
> +	}
> +

No. As I mentioned before this makes all this work for nothing
by preventing us from ever adding another flag. Please do not
have a default catchall that unassigns all flags.

> +	return mon_state;
> +}
> +
> +static struct rdtgroup *rdtgroup_find_grp(enum rdt_group_type rtype, char *p_grp, char *c_grp)
> +{
> +	struct rdtgroup *rdtg, *crg;
> +
> +	if (rtype == RDTCTRL_GROUP && *p_grp == '\0') {
> +		return &rdtgroup_default;
> +	} else if (rtype == RDTCTRL_GROUP) {
> +		list_for_each_entry(rdtg, &rdt_all_groups, rdtgroup_list)
> +			if (!strcmp(p_grp, rdtg->kn->name))
> +				return rdtg;
> +	} else if (rtype == RDTMON_GROUP) {
> +		list_for_each_entry(rdtg, &rdt_all_groups, rdtgroup_list) {
> +			if (!strcmp(p_grp, rdtg->kn->name)) {
> +				list_for_each_entry(crg, &rdtg->mon.crdtgrp_list,
> +						    mon.crdtgrp_list) {
> +					if (!strcmp(c_grp, crg->kn->name))
> +						return crg;
> +				}
> +			}
> +		}
> +	}
> +
> +	return NULL;
> +}
> +
> +static int rdtgroup_process_flags(enum rdt_group_type rtype, char *p_grp, char *c_grp, char *tok)
> +{
> +	struct rdt_resource *r = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
> +	int op, mon_state, assign_state, unassign_state;
> +	char *dom_str, *id_str, *op_str;
> +	struct rdt_mon_domain *d;
> +	struct rdtgroup *rdtgrp;
> +	unsigned long dom_id;
> +	int ret, found = 0;
> +
> +	rdtgrp = rdtgroup_find_grp(rtype, p_grp, c_grp);
> +
> +	if (!rdtgrp) {
> +		rdt_last_cmd_puts("Not a valid resctrl group\n");
> +		return -EINVAL;
> +	}
> +
> +next:
> +	if (!tok || tok[0] == '\0')
> +		return 0;
> +
> +	/* Start processing the strings for each domain */
> +	dom_str = strim(strsep(&tok, ";"));
> +
> +	op_str = strpbrk(dom_str, "=+-");
> +
> +	if (op_str) {
> +		op = *op_str;
> +	} else {
> +		rdt_last_cmd_puts("Missing operation =, +, -, _ character\n");
> +		return -EINVAL;
> +	}
> +
> +	id_str = strsep(&dom_str, "=+-");
> +
> +	if (!id_str || kstrtoul(id_str, 10, &dom_id)) {
> +		rdt_last_cmd_puts("Missing domain id\n");
> +		return -EINVAL;
> +	}
> +
> +	/* Verify if the dom_id is valid */
> +	list_for_each_entry(d, &r->mon_domains, hdr.list) {
> +		if (d->hdr.id == dom_id) {
> +			found = 1;
> +			break;
> +		}
> +	}
> +	if (!found) {
> +		rdt_last_cmd_printf("Invalid domain id %ld\n", dom_id);
> +		return -EINVAL;
> +	}
> +
> +	mon_state = rdtgroup_str_to_mon_state(dom_str);
> +
> +	assign_state = 0;
> +	unassign_state = 0;
> +
> +	switch (op) {
> +	case '+':
> +		if (mon_state == ASSIGN_NONE) {
> +			rdt_last_cmd_puts("Invalid assign opcode\n");
> +			goto out_fail;
> +		}
> +		assign_state = mon_state;
> +		break;
> +	case '-':
> +		if (mon_state == ASSIGN_NONE) {
> +			rdt_last_cmd_puts("Invalid assign opcode\n");
> +			goto out_fail;
> +		}
> +		unassign_state = mon_state;
> +		break;
> +	case '=':
> +		assign_state = mon_state;
> +		unassign_state = (ASSIGN_TOTAL | ASSIGN_LOCAL) & ~assign_state;
> +		break;
> +	default:
> +		break;
> +	}
> +

this flow is not clear to me ... I see how an existing counter is
configured but I do not see any counter being freed/allocated, where is that
done?

> +	if (assign_state & ASSIGN_TOTAL)
> +		ret = resctrl_arch_assign_cntr(d, QOS_L3_MBM_TOTAL_EVENT_ID,
> +					       rdtgrp->mon.rmid,
> +					       rdtgrp->mon.cntr_id[0],
> +					       rdtgrp->closid, 1);
> +	if (ret)
> +		goto out_fail;
> +
> +	if (assign_state & ASSIGN_LOCAL)
> +		ret = resctrl_arch_assign_cntr(d, QOS_L3_MBM_LOCAL_EVENT_ID,
> +					       rdtgrp->mon.rmid,
> +					       rdtgrp->mon.cntr_id[1],
> +					       rdtgrp->closid, 1);
> +
> +	if (ret)
> +		goto out_fail;
> +
> +	if (unassign_state & ASSIGN_TOTAL)
> +		ret = resctrl_arch_assign_cntr(d, QOS_L3_MBM_TOTAL_EVENT_ID,
> +					       rdtgrp->mon.rmid,
> +					       rdtgrp->mon.cntr_id[0],
> +					       rdtgrp->closid, 0);
> +
> +	if (ret)
> +		goto out_fail;
> +
> +	if (unassign_state & ASSIGN_LOCAL)
> +		ret = resctrl_arch_assign_cntr(d, QOS_L3_MBM_LOCAL_EVENT_ID,
> +					       rdtgrp->mon.rmid,
> +					       rdtgrp->mon.cntr_id[1],
> +					       rdtgrp->closid, 0);
> +	if (ret)
> +		goto out_fail;
> +
> +	goto next;
> +
> +out_fail:
> +
> +	return -EINVAL;
> +}
> +
> +static ssize_t rdtgroup_mbm_control_write(struct kernfs_open_file *of,
> +					  char *buf, size_t nbytes,
> +					  loff_t off)
> +{
> +	struct rdt_resource *r = of->kn->parent->priv;
> +	char *token, *cmon_grp, *mon_grp;
> +	struct rdt_hw_resource *hw_res;
> +	int ret;
> +
> +	hw_res = resctrl_to_arch_res(r);
> +	if (!hw_res->abmc_enabled)
> +		return -EINVAL;
> +
> +	/* Valid input requires a trailing newline */
> +	if (nbytes == 0 || buf[nbytes - 1] != '\n')
> +		return -EINVAL;
> +
> +	buf[nbytes - 1] = '\0';
> +	rdt_last_cmd_clear();

rdt_last_cmd_clear() should be called with mutex held

> +
> +	cpus_read_lock();
> +	mutex_lock(&rdtgroup_mutex);
> +
> +	while ((token = strsep(&buf, "\n")) != NULL) {
> +		if (strstr(token, "//")) {
> +			/*
> +			 * The CTRL_MON group processing:
> +			 * default CTRL_MON group: "//<flags>"
> +			 * non-default CTRL_MON group: "<CTRL_MON group>//flags"
> +			 * The CTRL_MON group will be empty string if it is a
> +			 * default group.
> +			 */
> +			cmon_grp = strsep(&token, "//");
> +
> +			/*
> +			 * strsep returns empty string for contiguous delimiters.
> +			 * Make sure check for two consicutive delimiters and

consicutive -> consecutive

> +			 * advance the token.
> +			 */
> +			mon_grp = strsep(&token, "//");
> +			if (*mon_grp != '\0') {
> +				rdt_last_cmd_printf("Invalid CTRL_MON group format %s\n", token);
> +				ret = -EINVAL;
> +				break;
> +			}
> +
> +			ret = rdtgroup_process_flags(RDTCTRL_GROUP, cmon_grp, mon_grp, token);
> +			if (ret)
> +				break;
> +		} else if (strstr(token, "/")) {
> +			/*
> +			 * MON group processing:
> +			 * MON_GROUP inside default CTRL_MON group: "/<MON group>/<flags>"
> +			 * MON_GROUP within CTRL_MON group: "<CTRL_MON group>/<MON group>/<flags>"
> +			 */
> +			cmon_grp = strsep(&token, "/");
> +
> +			/* Extract the MON_GROUP. It cannot be empty string */
> +			mon_grp = strsep(&token, "/");
> +			if (*mon_grp == '\0') {
> +				rdt_last_cmd_printf("Invalid MON_GROUP format %s\n", token);
> +				ret = -EINVAL;
> +				break;
> +			}
> +
> +			ret = rdtgroup_process_flags(RDTMON_GROUP, cmon_grp, mon_grp, token);
> +			if (ret)
> +				break;
> +		}

can these two blocks not be merged? strsep(&token, "//") and strsep(&token, "/") do the same
thing, no?

> +	}
> +
> +	mutex_unlock(&rdtgroup_mutex);
> +	cpus_read_unlock();
> +
> +	return ret ?: nbytes;
> +}
> +
>   #ifdef CONFIG_PROC_CPU_RESCTRL
>   
>   /*
> @@ -2282,9 +2529,10 @@ static struct rftype res_common_files[] = {
>   	},
>   	{
>   		.name		= "mbm_control",
> -		.mode		= 0444,
> +		.mode		= 0644,
>   		.kf_ops		= &rdtgroup_kf_single_ops,
>   		.seq_show	= rdtgroup_mbm_control_show,
> +		.write		= rdtgroup_mbm_control_write,
>   	},
>   	{
>   		.name		= "cpus_list",

Reinette
Re: [PATCH v5 20/20] x86/resctrl: Introduce interface to modify assignment states of the groups
Posted by Moger, Babu 1 year, 6 months ago
Hi Reinette,

On 7/12/24 17:17, Reinette Chatre wrote:
> Hi Babu,
> 
> On 7/3/24 2:48 PM, Babu Moger wrote:
>> Introduce the interface to enable events in ABMC mode.
> 
> As mentioned in cover letter, please take care with terms. This
> interface does not "enable events" - note that events can be
> "enabled" even in legacy mode. This is the interface to
> assign counters.
> 
>>
>> Events can be enabled or disabled by writing to file
>> /sys/fs/resctrl/info/L3_MON/mbm_control
>>
>> Format is similar to the list format with addition of op-code for the
>> assignment operation.
>>   "<CTRL_MON group>/<MON group>/<op-code><flags>"
> 
> Missing a "domain_id".

Yea. Will fix it.

> 
>>
>> Format for specific type of groups:
>>
>>   * Default CTRL_MON group:
>>           "//<domain_id><op-code><flags>"
>>
>>   * Non-default CTRL_MON group:
>>           "<CTRL_MON group>//<domain_id><op-code><flags>"
>>
>>   * Child MON group of default CTRL_MON group:
>>           "/<MON group>/<domain_id><op-code><flags>"
>>
>>   * Child MON group of non-default CTRL_MON group:
>>           "<CTRL_MON group>/<MON group>/<domain_id><op-code><flags>"
>>
>> Op-code can be one of the following:
>>
>>   = Update the assignment to match the flags
>>   + enable a new state
>>   - disable a new state
> 
> (note comment in cover letter about consistent terms)

Sure.
> 
>>
>> Assignment flags can be one of the following:
>>   t  MBM total event is enabled
>>   l  MBM local event is enabled
>>   tl Both total and local MBM events are enabled
>>   _  None of the MBM events are enabled. Valid only with '=" opcode.
>>
>> Signed-off-by: Babu Moger <babu.moger@amd.com>
>> ---
>> v5: Interface name changed from mbm_assign_control to mbm_control.
>>      Fixed opcode and flags combination.
>>      '=_" is valid.
>>      "-_" amd "+_" is not valid.
>>      Minor message update.
>>      Renamed the function with prefix - rdtgroup_.
>>      Corrected few documentation mistakes.
>>      Rebase related changes after SNC support.
>>
>> v4: Added domain specific assignments. Fixed the opcode parsing.
>>
>> v3: New patch.
>>      Addresses the feedback to provide the global assignment interface.
>>     
>> https://lore.kernel.org/lkml/c73f444b-83a1-4e9a-95d3-54c5165ee782@intel.com/
>> ---
>>   Documentation/arch/x86/resctrl.rst     |  81 +++++++-
>>   arch/x86/kernel/cpu/resctrl/rdtgroup.c | 250 ++++++++++++++++++++++++-
>>   2 files changed, 329 insertions(+), 2 deletions(-)
>>
>> diff --git a/Documentation/arch/x86/resctrl.rst
>> b/Documentation/arch/x86/resctrl.rst
>> index 05fee779e109..5a621235eb2b 100644
>> --- a/Documentation/arch/x86/resctrl.rst
>> +++ b/Documentation/arch/x86/resctrl.rst
>> @@ -331,7 +331,7 @@ with the following files:
>>        t  MBM total event is enabled.
>>        l  MBM local event is enabled.
>>        tl Both total and local MBM events are enabled.
>> -     _  None of the MBM events are enabled.
>> +     _  None of the MBM events are enabled. Only works with opcode '='
>> for write.
>>         Examples:
>>       ::
>> @@ -358,6 +358,85 @@ with the following files:
>>          /child_default_mon_grp/ - This is a child monitor group of
>> default CTRL_MON group.
>>   +    Assignment state can be updated by writing to the interface.
>> +
>> +    Format is similar to the list format with addition of op-code for the
>> +    assignment operation.
>> +
>> +        "<CTRL_MON group>/<MON group>/<op-code><flags>"
> 
> Missing domain_id

Sure.

> 
>> +
>> +    Format for each type of groups:
>> +
>> +        * Default CTRL_MON group:
>> +                "//<domain_id><op-code><flags>"
>> +
>> +        * Non-default CTRL_MON group:
>> +                "<CTRL_MON group>//<domain_id><op-code><flags>"
>> +
>> +        * Child MON group of default CTRL_MON group:
>> +                "/<MON group>/<domain_id><op-code><flags>"
>> +
>> +        * Child MON group of non-default CTRL_MON group:
>> +                "<CTRL_MON group>/<MON group>/<domain_id><op-code><flags>"
>> +
>> +    Op-code can be one of the following:
>> +    ::
>> +
>> +     = Update the assignment to match the flags.
>> +     + Add a new state.
>> +     - delete a new state.
>> +
>> +    Examples:
>> +    ::
>> +
>> +      Initial group status:
>> +      # cat /sys/fs/resctrl/info/L3_MON/mbm_control
>> +      non_default_ctrl_mon_grp//0=tl;1=tl;
>> +      non_default_ctrl_mon_grp/child_non_default_mon_grp/0=tl;1=tl;
>> +      //0=tl;1=tl;
>> +      /child_default_mon_grp/0=tl;1=tl;
>> +
>> +      To update the default group to enable only total event on domain 0:
>> +      # echo "//0=t" > /sys/fs/resctrl/info/L3_MON/mbm_control
>> +
>> +      Assignment status after the update:
>> +      # cat /sys/fs/resctrl/info/L3_MON/mbm_control
>> +      non_default_ctrl_mon_grp//0=tl;1=tl;
>> +      non_default_ctrl_mon_grp/child_non_default_mon_grp/0=tl;1=tl;
>> +      //0=t;1=tl;
>> +      /child_default_mon_grp/0=tl;1=tl;
>> +
>> +      To update the MON group child_default_mon_grp to remove total
>> event on domain 1:
>> +      # echo "/child_default_mon_grp/1-t" >
>> /sys/fs/resctrl/info/L3_MON/mbm_control
>> +
>> +      Assignment status after the update:
>> +      $ cat /sys/fs/resctrl/info/L3_MON/mbm_control
>> +      non_default_ctrl_mon_grp//0=tl;1=tl;
>> +      non_default_ctrl_mon_grp/child_non_default_mon_grp/0=tl;1=tl;
>> +      //0=t;1=tl;
>> +      /child_default_mon_grp/0=tl;1=l;
>> +
>> +      To update the MON group
>> non_default_ctrl_mon_grp/child_non_default_mon_grp to
>> +      remove both local and total events on domain 1:
>> +      # echo "non_default_ctrl_mon_grp/child_non_default_mon_grp/1=_" >
>> +            /sys/fs/resctrl/info/L3_MON/mbm_control
>> +
>> +      Assignment status after the update:
>> +      non_default_ctrl_mon_grp//0=tl;1=tl;
>> +      non_default_ctrl_mon_grp/child_non_default_mon_grp/0=tl;1=_;
>> +      //0=t;1=tl;
>> +      /child_default_mon_grp/0=tl;1=l;
>> +
>> +      To update the default group to add a local event domain 0.
>> +      # echo "//0+l" > /sys/fs/resctrl/info/L3_MON/mbm_control
>> +
>> +      Assignment status after the update:
>> +      # cat /sys/fs/resctrl/info/L3_MON/mbm_control
>> +      non_default_ctrl_mon_grp//0=tl;1=tl;
>> +      non_default_ctrl_mon_grp/child_non_default_mon_grp/0=tl;1=_;
>> +      //0=tl;1=tl;
>> +      /child_default_mon_grp/0=tl;1=l;
>> +
>>   "max_threshold_occupancy":
>>           Read/write file provides the largest value (in
>>           bytes) at which a previously used LLC_occupancy
>> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> index 0de9f23d5389..84c0874d7872 100644
>> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> @@ -1068,6 +1068,253 @@ static int rdtgroup_mbm_control_show(struct
>> kernfs_open_file *of,
>>       return 0;
>>   }
>>   +static int rdtgroup_str_to_mon_state(char *flag)
>> +{
>> +    int i, mon_state = 0;
>> +
>> +    for (i = 0; i < strlen(flag); i++) {
>> +        switch (*(flag + i)) {
>> +        case 't':
>> +            mon_state |= ASSIGN_TOTAL;
>> +            break;
>> +        case 'l':
>> +            mon_state |= ASSIGN_LOCAL;
>> +            break;
>> +        case '_':
>> +            mon_state = ASSIGN_NONE;
>> +            break;
>> +        default:
>> +            mon_state = ASSIGN_NONE;
>> +            break;
>> +        }
>> +    }
>> +
> 
> No. As I mentioned before this makes all this work for nothing
> by preventing us from ever adding another flag. Please do not
> have a default catchall that unassigns all flags.

Will remove default ASSIGN_NONE.

> 
>> +    return mon_state;
>> +}
>> +
>> +static struct rdtgroup *rdtgroup_find_grp(enum rdt_group_type rtype,
>> char *p_grp, char *c_grp)
>> +{
>> +    struct rdtgroup *rdtg, *crg;
>> +
>> +    if (rtype == RDTCTRL_GROUP && *p_grp == '\0') {
>> +        return &rdtgroup_default;
>> +    } else if (rtype == RDTCTRL_GROUP) {
>> +        list_for_each_entry(rdtg, &rdt_all_groups, rdtgroup_list)
>> +            if (!strcmp(p_grp, rdtg->kn->name))
>> +                return rdtg;
>> +    } else if (rtype == RDTMON_GROUP) {
>> +        list_for_each_entry(rdtg, &rdt_all_groups, rdtgroup_list) {
>> +            if (!strcmp(p_grp, rdtg->kn->name)) {
>> +                list_for_each_entry(crg, &rdtg->mon.crdtgrp_list,
>> +                            mon.crdtgrp_list) {
>> +                    if (!strcmp(c_grp, crg->kn->name))
>> +                        return crg;
>> +                }
>> +            }
>> +        }
>> +    }
>> +
>> +    return NULL;
>> +}
>> +
>> +static int rdtgroup_process_flags(enum rdt_group_type rtype, char
>> *p_grp, char *c_grp, char *tok)
>> +{
>> +    struct rdt_resource *r =
>> &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
>> +    int op, mon_state, assign_state, unassign_state;
>> +    char *dom_str, *id_str, *op_str;
>> +    struct rdt_mon_domain *d;
>> +    struct rdtgroup *rdtgrp;
>> +    unsigned long dom_id;
>> +    int ret, found = 0;
>> +
>> +    rdtgrp = rdtgroup_find_grp(rtype, p_grp, c_grp);
>> +
>> +    if (!rdtgrp) {
>> +        rdt_last_cmd_puts("Not a valid resctrl group\n");
>> +        return -EINVAL;
>> +    }
>> +
>> +next:
>> +    if (!tok || tok[0] == '\0')
>> +        return 0;
>> +
>> +    /* Start processing the strings for each domain */
>> +    dom_str = strim(strsep(&tok, ";"));
>> +
>> +    op_str = strpbrk(dom_str, "=+-");
>> +
>> +    if (op_str) {
>> +        op = *op_str;
>> +    } else {
>> +        rdt_last_cmd_puts("Missing operation =, +, -, _ character\n");
>> +        return -EINVAL;
>> +    }
>> +
>> +    id_str = strsep(&dom_str, "=+-");
>> +
>> +    if (!id_str || kstrtoul(id_str, 10, &dom_id)) {
>> +        rdt_last_cmd_puts("Missing domain id\n");
>> +        return -EINVAL;
>> +    }
>> +
>> +    /* Verify if the dom_id is valid */
>> +    list_for_each_entry(d, &r->mon_domains, hdr.list) {
>> +        if (d->hdr.id == dom_id) {
>> +            found = 1;
>> +            break;
>> +        }
>> +    }
>> +    if (!found) {
>> +        rdt_last_cmd_printf("Invalid domain id %ld\n", dom_id);
>> +        return -EINVAL;
>> +    }
>> +
>> +    mon_state = rdtgroup_str_to_mon_state(dom_str);
>> +
>> +    assign_state = 0;
>> +    unassign_state = 0;
>> +
>> +    switch (op) {
>> +    case '+':
>> +        if (mon_state == ASSIGN_NONE) {
>> +            rdt_last_cmd_puts("Invalid assign opcode\n");
>> +            goto out_fail;
>> +        }
>> +        assign_state = mon_state;
>> +        break;
>> +    case '-':
>> +        if (mon_state == ASSIGN_NONE) {
>> +            rdt_last_cmd_puts("Invalid assign opcode\n");
>> +            goto out_fail;
>> +        }
>> +        unassign_state = mon_state;
>> +        break;
>> +    case '=':
>> +        assign_state = mon_state;
>> +        unassign_state = (ASSIGN_TOTAL | ASSIGN_LOCAL) & ~assign_state;
>> +        break;
>> +    default:
>> +        break;
>> +    }
>> +
> 
> this flow is not clear to me ... I see how an existing counter is
> configured but I do not see any counter being freed/allocated, where is that
> done?

My bad. There is a bug here. The following code updates the assignment
states if the group has the counter assigned already.
I need to add the check to allocated/free the counters based on
assign/unassign state requested. Good catch.



> 
>> +    if (assign_state & ASSIGN_TOTAL)
>> +        ret = resctrl_arch_assign_cntr(d, QOS_L3_MBM_TOTAL_EVENT_ID,
>> +                           rdtgrp->mon.rmid,
>> +                           rdtgrp->mon.cntr_id[0],
>> +                           rdtgrp->closid, 1);
>> +    if (ret)
>> +        goto out_fail;
>> +
>> +    if (assign_state & ASSIGN_LOCAL)
>> +        ret = resctrl_arch_assign_cntr(d, QOS_L3_MBM_LOCAL_EVENT_ID,
>> +                           rdtgrp->mon.rmid,
>> +                           rdtgrp->mon.cntr_id[1],
>> +                           rdtgrp->closid, 1);
>> +
>> +    if (ret)
>> +        goto out_fail;
>> +
>> +    if (unassign_state & ASSIGN_TOTAL)
>> +        ret = resctrl_arch_assign_cntr(d, QOS_L3_MBM_TOTAL_EVENT_ID,
>> +                           rdtgrp->mon.rmid,
>> +                           rdtgrp->mon.cntr_id[0],
>> +                           rdtgrp->closid, 0);
>> +
>> +    if (ret)
>> +        goto out_fail;
>> +
>> +    if (unassign_state & ASSIGN_LOCAL)
>> +        ret = resctrl_arch_assign_cntr(d, QOS_L3_MBM_LOCAL_EVENT_ID,
>> +                           rdtgrp->mon.rmid,
>> +                           rdtgrp->mon.cntr_id[1],
>> +                           rdtgrp->closid, 0);
>> +    if (ret)
>> +        goto out_fail;
>> +
>> +    goto next;
>> +
>> +out_fail:
>> +
>> +    return -EINVAL;
>> +}
>> +
>> +static ssize_t rdtgroup_mbm_control_write(struct kernfs_open_file *of,
>> +                      char *buf, size_t nbytes,
>> +                      loff_t off)
>> +{
>> +    struct rdt_resource *r = of->kn->parent->priv;
>> +    char *token, *cmon_grp, *mon_grp;
>> +    struct rdt_hw_resource *hw_res;
>> +    int ret;
>> +
>> +    hw_res = resctrl_to_arch_res(r);
>> +    if (!hw_res->abmc_enabled)
>> +        return -EINVAL;
>> +
>> +    /* Valid input requires a trailing newline */
>> +    if (nbytes == 0 || buf[nbytes - 1] != '\n')
>> +        return -EINVAL;
>> +
>> +    buf[nbytes - 1] = '\0';
>> +    rdt_last_cmd_clear();
> 
> rdt_last_cmd_clear() should be called with mutex held

Sure.
> 
>> +
>> +    cpus_read_lock();
>> +    mutex_lock(&rdtgroup_mutex);
>> +
>> +    while ((token = strsep(&buf, "\n")) != NULL) {
>> +        if (strstr(token, "//")) {
>> +            /*
>> +             * The CTRL_MON group processing:
>> +             * default CTRL_MON group: "//<flags>"
>> +             * non-default CTRL_MON group: "<CTRL_MON group>//flags"
>> +             * The CTRL_MON group will be empty string if it is a
>> +             * default group.
>> +             */
>> +            cmon_grp = strsep(&token, "//");
>> +
>> +            /*
>> +             * strsep returns empty string for contiguous delimiters.
>> +             * Make sure check for two consicutive delimiters and
> 
> consicutive -> consecutive

Sure.

> 
>> +             * advance the token.
>> +             */
>> +            mon_grp = strsep(&token, "//");
>> +            if (*mon_grp != '\0') {
>> +                rdt_last_cmd_printf("Invalid CTRL_MON group format
>> %s\n", token);
>> +                ret = -EINVAL;
>> +                break;
>> +            }
>> +
>> +            ret = rdtgroup_process_flags(RDTCTRL_GROUP, cmon_grp,
>> mon_grp, token);
>> +            if (ret)
>> +                break;
>> +        } else if (strstr(token, "/")) {
>> +            /*
>> +             * MON group processing:
>> +             * MON_GROUP inside default CTRL_MON group: "/<MON
>> group>/<flags>"
>> +             * MON_GROUP within CTRL_MON group: "<CTRL_MON group>/<MON
>> group>/<flags>"
>> +             */
>> +            cmon_grp = strsep(&token, "/");
>> +
>> +            /* Extract the MON_GROUP. It cannot be empty string */
>> +            mon_grp = strsep(&token, "/");
>> +            if (*mon_grp == '\0') {
>> +                rdt_last_cmd_printf("Invalid MON_GROUP format %s\n",
>> token);
>> +                ret = -EINVAL;
>> +                break;
>> +            }
>> +
>> +            ret = rdtgroup_process_flags(RDTMON_GROUP, cmon_grp,
>> mon_grp, token);
>> +            if (ret)
>> +                break;
>> +        }
> 
> can these two blocks not be merged? strsep(&token, "//") and
> strsep(&token, "/") do the same
> thing, no?

Sure. Will check.

> 
>> +    }
>> +
>> +    mutex_unlock(&rdtgroup_mutex);
>> +    cpus_read_unlock();
>> +
>> +    return ret ?: nbytes;
>> +}
>> +
>>   #ifdef CONFIG_PROC_CPU_RESCTRL
>>     /*
>> @@ -2282,9 +2529,10 @@ static struct rftype res_common_files[] = {
>>       },
>>       {
>>           .name        = "mbm_control",
>> -        .mode        = 0444,
>> +        .mode        = 0644,
>>           .kf_ops        = &rdtgroup_kf_single_ops,
>>           .seq_show    = rdtgroup_mbm_control_show,
>> +        .write        = rdtgroup_mbm_control_write,
>>       },
>>       {
>>           .name        = "cpus_list",
> 
> Reinette
> 

-- 
Thanks
Babu Moger