[PATCH v6 5/8] fs/resctrl: Add user interface to enable/disable io_alloc feature

Posted by Babu Moger 4 months ago
The io_alloc feature in resctrl is a mechanism that enables direct
insertion of data from I/O devices into the L3 cache.

On AMD systems, io_alloc feature is backed by SDCIAE (L3 Smart Data Cache
Injection Allocation Enforcement). When enabled, SDCIAE forces all SDCI
lines to be placed into the L3 cache partitions identified by the
highest-supported L3_MASK_n register as reported by CPUID
Fn0000_0010_EDX_x1.MAX_COS. For example, if MAX_COS=15, SDCI lines will
be allocated into the L3 cache partitions determined by the bitmask in
the L3_MASK_15 register.

When CDP is enabled, io_alloc routes I/O traffic using the highest CLOSID
allocated for the instruction cache (L3CODE).
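
For illustration, the CLOSID selection described above amounts to the following (a userspace sketch, not part of the patch; values are made up):

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Model of the io_alloc CLOSID arithmetic: the feature uses the highest
 * CLOSID of the resource. With 16 CLOSIDs (MAX_COS=15) io_alloc uses
 * CLOSID 15; when CDP halves the effective CLOSID space, the highest
 * remaining CLOSID is 7.
 */
static unsigned int io_alloc_closid(unsigned int num_closid, bool cdp_enabled)
{
	if (cdp_enabled)
		return num_closid / 2 - 1;
	return num_closid - 1;
}
```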

Introduce user interface to enable/disable "io_alloc" feature.

Signed-off-by: Babu Moger <babu.moger@amd.com>
---
v6: Changed the subject to fs/resctrl:

v5: Resolved conflicts due to recent resctrl FS/ARCH code restructure.
    Used rdt_kn_name() to get the rdtgroup name instead of accessing it directly
    while printing the name of the group using the io_alloc_closid.

    Updated bit_usage to reflect the io_alloc CBM as discussed in the thread:
    https://lore.kernel.org/lkml/3ca0a5dc-ad9c-4767-9011-b79d986e1e8d@intel.com/
    Modified rdt_bit_usage_show() to read io_alloc_cbm in hw_shareable, ensuring
    that bit_usage accurately represents the CBMs.

    Updated the code to modify io_alloc either with L3CODE or L3DATA.
    https://lore.kernel.org/lkml/c00c00ea-a9ac-4c56-961c-dc5bf633476b@intel.com/

v4: Updated the change log.
    Updated the user doc.
    The "io_alloc" interface will report "enabled/disabled/not supported".
    Updated resctrl_io_alloc_closid_get() to verify the max closid availability.
    Updated the documentation for "shareable_bits" and "bit_usage".
    Introduced io_alloc_init() to initialize fflags.
    Printed the group name when io_alloc enablement fails.

    NOTE: io_alloc is about a specific CLOS. rdt_bit_usage_show() is not designed
    to handle bit_usage for a specific CLOS; it is about the overall system. So we
    cannot really tell the user which CLOS is shared across both hardware and
    software. We need to discuss this.

v3: Rewrote the change to make it generic.
    Rewrote the documentation in resctrl.rst to be generic and added
    AMD feature details in the end.
    Added a check to verify MAX CLOSID availability on the system.
    Added CDP check to make sure io_alloc is configured in CDP_CODE.
    Added resctrl_io_alloc_closid_free() to free the io_alloc CLOSID.
    Added errors in a few cases when CLOSID allocation fails.
    Fixed a splat reported when info/L3/bit_usage is accessed while io_alloc
    is enabled.
    https://lore.kernel.org/lkml/SJ1PR11MB60837B532254E7B23BC27E84FC052@SJ1PR11MB6083.namprd11.prod.outlook.com/

v2: Renamed the feature to "io_alloc".
    Added generic texts for the feature in commit log and resctrl.rst doc.
    Added resctrl_io_alloc_init_cat() to initialize io_alloc to default
    values when enabled.
    Fixed io_alloc show functionality to display only on the L3 resource.
---
 Documentation/filesystems/resctrl.rst |  34 ++++
 fs/resctrl/rdtgroup.c                 | 216 +++++++++++++++++++++++++-
 2 files changed, 248 insertions(+), 2 deletions(-)

diff --git a/Documentation/filesystems/resctrl.rst b/Documentation/filesystems/resctrl.rst
index c7949dd44f2f..03c829b2c276 100644
--- a/Documentation/filesystems/resctrl.rst
+++ b/Documentation/filesystems/resctrl.rst
@@ -95,6 +95,11 @@ related to allocation:
 		some platforms support devices that have their
 		own settings for cache use which can over-ride
 		these bits.
+
+		When the "io_alloc" feature is enabled, a portion of the cache
+		is reserved for shared use between hardware and software. Refer
+		to "bit_usage" to see which portion is allocated for this purpose.
+
 "bit_usage":
 		Annotated capacity bitmasks showing how all
 		instances of the resource are used. The legend is:
@@ -135,6 +140,35 @@ related to allocation:
 			"1":
 			      Non-contiguous 1s value in CBM is supported.
 
+"io_alloc":
+		The "io_alloc" enables system software to configure the portion
+		of the L3 cache allocated for I/O traffic.
+
+		The feature routes the I/O traffic via specific CLOSID reserved
+		for io_alloc feature. By configuring the CBM (Capacity Bit Mask)
+		for the CLOSID, users can control the L3 portions available for
+		I/O traffic. The reserved CLOSID will be excluded for group creation.
+
+		The interface provides a means to query the status of the feature.
+
+		Example::
+
+			# cat /sys/fs/resctrl/info/L3/io_alloc
+			disabled
+
+		Feature can be enabled/disabled by writing to the interface.
+		Example::
+
+			# echo 1 > /sys/fs/resctrl/info/L3/io_alloc
+			# cat /sys/fs/resctrl/info/L3/io_alloc
+			enabled
+
+		On AMD systems, the io_alloc feature is supported by the L3 Smart
+		Data Cache Injection Allocation Enforcement (SDCIAE). The CLOSID for
+		io_alloc is determined by the highest CLOSID supported by the resource.
+		When CDP is enabled, io_alloc routes I/O traffic using the highest
+		CLOSID allocated for the instruction cache (L3CODE).
+
 Memory bandwidth(MB) subdirectory contains the following files
 with respect to allocation:
 
diff --git a/fs/resctrl/rdtgroup.c b/fs/resctrl/rdtgroup.c
index 1beb124e25f6..bbc032b4d0e9 100644
--- a/fs/resctrl/rdtgroup.c
+++ b/fs/resctrl/rdtgroup.c
@@ -70,6 +70,7 @@ static struct seq_buf last_cmd_status;
 static char last_cmd_status_buf[512];
 
 static int rdtgroup_setup_root(struct rdt_fs_context *ctx);
+static int rdtgroup_init_cat(struct resctrl_schema *s, u32 closid);
 
 static void rdtgroup_destroy_root(void);
 
@@ -232,6 +233,19 @@ bool closid_allocated(unsigned int closid)
 	return !test_bit(closid, closid_free_map);
 }
 
+static int resctrl_io_alloc_closid_alloc(u32 io_alloc_closid)
+{
+	if (__test_and_clear_bit(io_alloc_closid, closid_free_map))
+		return io_alloc_closid;
+	else
+		return -ENOSPC;
+}
+
+static void resctrl_io_alloc_closid_free(u32 io_alloc_closid)
+{
+	closid_free(io_alloc_closid);
+}
+
 /**
  * rdtgroup_mode_by_closid - Return mode of resource group with closid
  * @closid: closid if the resource group
@@ -1030,6 +1044,29 @@ static int rdt_shareable_bits_show(struct kernfs_open_file *of,
 	return 0;
 }
 
+/*
+ * resctrl_io_alloc_closid_get - io_alloc feature uses max CLOSID to route
+ * the IO traffic. Get the max CLOSID and verify if the CLOSID is available.
+ *
+ * The total number of CLOSIDs is determined in closid_init(),  based on the
+ * minimum supported across all resources. If CDP (Code Data Prioritization)
+ * is enabled, the number of CLOSIDs is halved. The final value is returned
+ * by closids_supported(). Make sure this value aligns with the maximum
+ * CLOSID supported by the respective resource.
+ */
+static int resctrl_io_alloc_closid_get(struct rdt_resource *r)
+{
+	int num_closids = closids_supported();
+
+	if (resctrl_arch_get_cdp_enabled(r->rid))
+		num_closids *= 2;
+
+	if (num_closids != resctrl_arch_get_num_closid(r))
+		return -ENOSPC;
+
+	return closids_supported() - 1;
+}
+
 /*
  * rdt_bit_usage_show - Display current usage of resources
  *
@@ -1058,20 +1095,23 @@ static int rdt_bit_usage_show(struct kernfs_open_file *of,
 	struct rdt_ctrl_domain *dom;
 	int i, hwb, swb, excl, psl;
 	enum rdtgrp_mode mode;
+	int io_alloc_closid;
 	bool sep = false;
 	u32 ctrl_val;
 
 	cpus_read_lock();
 	mutex_lock(&rdtgroup_mutex);
-	hw_shareable = r->cache.shareable_bits;
 	list_for_each_entry(dom, &r->ctrl_domains, hdr.list) {
 		if (sep)
 			seq_putc(seq, ';');
+		hw_shareable = r->cache.shareable_bits;
 		sw_shareable = 0;
 		exclusive = 0;
 		seq_printf(seq, "%d=", dom->hdr.id);
 		for (i = 0; i < closids_supported(); i++) {
-			if (!closid_allocated(i))
+			if (!closid_allocated(i) ||
+			    (resctrl_arch_get_io_alloc_enabled(r) &&
+			     i == resctrl_io_alloc_closid_get(r)))
 				continue;
 			ctrl_val = resctrl_arch_get_config(r, dom, i,
 							   s->conf_type);
@@ -1099,6 +1139,24 @@ static int rdt_bit_usage_show(struct kernfs_open_file *of,
 				break;
 			}
 		}
+
+		/*
+		 * When the "io_alloc" feature is enabled, a portion of the
+		 * cache is reserved for shared use between hardware and software.
+		 */
+		if (resctrl_arch_get_io_alloc_enabled(r)) {
+			io_alloc_closid = resctrl_io_alloc_closid_get(r);
+			if (resctrl_arch_get_cdp_enabled(r->rid))
+				ctrl_val = resctrl_arch_get_config(r, dom,
+								   io_alloc_closid,
+								   CDP_CODE);
+			else
+				ctrl_val = resctrl_arch_get_config(r, dom,
+								   io_alloc_closid,
+								   CDP_NONE);
+			hw_shareable |= ctrl_val;
+		}
+
 		for (i = r->cache.cbm_len - 1; i >= 0; i--) {
 			pseudo_locked = dom->plr ? dom->plr->cbm : 0;
 			hwb = test_bit(i, &hw_shareable);
@@ -1803,6 +1861,142 @@ static ssize_t mbm_local_bytes_config_write(struct kernfs_open_file *of,
 	return ret ?: nbytes;
 }
 
+static int resctrl_io_alloc_show(struct kernfs_open_file *of,
+				 struct seq_file *seq, void *v)
+{
+	struct resctrl_schema *s = rdt_kn_parent_priv(of->kn);
+	struct rdt_resource *r = s->res;
+
+	if (r->cache.io_alloc_capable) {
+		if (resctrl_arch_get_io_alloc_enabled(r))
+			seq_puts(seq, "enabled\n");
+		else
+			seq_puts(seq, "disabled\n");
+	} else {
+		seq_puts(seq, "not supported\n");
+	}
+
+	return 0;
+}
+
+/*
+ * Initialize io_alloc CLOSID cache resource with default CBM values.
+ */
+static int resctrl_io_alloc_init_cat(struct rdt_resource *r,
+				     struct resctrl_schema *s, u32 closid)
+{
+	int ret;
+
+	rdt_staged_configs_clear();
+
+	ret = rdtgroup_init_cat(s, closid);
+	if (ret < 0)
+		goto out_init_cat;
+
+	ret = resctrl_arch_update_domains(r, closid);
+
+out_init_cat:
+	rdt_staged_configs_clear();
+	return ret;
+}
+
+static const char *rdtgroup_name_by_closid(int closid)
+{
+	struct rdtgroup *rdtgrp;
+
+	list_for_each_entry(rdtgrp, &rdt_all_groups, rdtgroup_list) {
+		if (rdtgrp->closid == closid)
+			return rdt_kn_name(rdtgrp->kn);
+	}
+
+	return NULL;
+}
+
+/*
+ * When CDP is enabled, io_alloc directs traffic using the highest CLOSID
+ * linked to an L3CODE resource. Although CBMs can be accessed through
+ * either L3CODE or L3DATA resources, any updates to the schemata must
+ * always be performed on L3CODE.
+ */
+static struct resctrl_schema *resctrl_schema_io_alloc(struct resctrl_schema *s)
+{
+	struct resctrl_schema *schema;
+
+	if (s->conf_type == CDP_DATA) {
+		list_for_each_entry(schema, &resctrl_schema_all, list) {
+			if (schema->conf_type == CDP_CODE)
+				return schema;
+		}
+	}
+
+	return s;
+}
+
+static ssize_t resctrl_io_alloc_write(struct kernfs_open_file *of, char *buf,
+				      size_t nbytes, loff_t off)
+{
+	struct resctrl_schema *s = rdt_kn_parent_priv(of->kn);
+	struct rdt_resource *r = s->res;
+	char const *grp_name;
+	int io_alloc_closid;
+	bool enable;
+	int ret;
+
+	ret = kstrtobool(buf, &enable);
+	if (ret)
+		return ret;
+
+	cpus_read_lock();
+	mutex_lock(&rdtgroup_mutex);
+
+	rdt_last_cmd_clear();
+
+	if (!r->cache.io_alloc_capable) {
+		rdt_last_cmd_puts("io_alloc feature is not supported on the resource\n");
+		ret = -ENODEV;
+		goto out_io_alloc;
+	}
+
+	io_alloc_closid = resctrl_io_alloc_closid_get(r);
+	if (io_alloc_closid < 0) {
+		rdt_last_cmd_puts("Max CLOSID to support io_alloc is not available\n");
+		ret = -EINVAL;
+		goto out_io_alloc;
+	}
+
+	if (resctrl_arch_get_io_alloc_enabled(r) != enable) {
+		if (enable) {
+			ret = resctrl_io_alloc_closid_alloc(io_alloc_closid);
+			if (ret < 0) {
+				grp_name = rdtgroup_name_by_closid(io_alloc_closid);
+				rdt_last_cmd_printf("CLOSID for io_alloc is used by %s group\n",
+						    grp_name ? grp_name : "another");
+				ret = -EINVAL;
+				goto out_io_alloc;
+			}
+
+			ret = resctrl_io_alloc_init_cat(r, resctrl_schema_io_alloc(s),
+							io_alloc_closid);
+			if (ret) {
+				rdt_last_cmd_puts("Failed to initialize io_alloc allocations\n");
+				resctrl_io_alloc_closid_free(io_alloc_closid);
+				goto out_io_alloc;
+			}
+
+		} else {
+			resctrl_io_alloc_closid_free(io_alloc_closid);
+		}
+
+		ret = resctrl_arch_io_alloc_enable(r, enable);
+	}
+
+out_io_alloc:
+	mutex_unlock(&rdtgroup_mutex);
+	cpus_read_unlock();
+
+	return ret ?: nbytes;
+}
+
 /* rdtgroup information files for one cache resource. */
 static struct rftype res_common_files[] = {
 	{
@@ -1955,6 +2149,13 @@ static struct rftype res_common_files[] = {
 		.seq_show	= rdtgroup_schemata_show,
 		.fflags		= RFTYPE_CTRL_BASE,
 	},
+	{
+		.name           = "io_alloc",
+		.mode           = 0644,
+		.kf_ops         = &rdtgroup_kf_single_ops,
+		.seq_show       = resctrl_io_alloc_show,
+		.write          = resctrl_io_alloc_write,
+	},
 	{
 		.name		= "mba_MBps_event",
 		.mode		= 0644,
@@ -2062,6 +2263,15 @@ static void thread_throttle_mode_init(void)
 				 RFTYPE_CTRL_INFO | RFTYPE_RES_MB);
 }
 
+static void io_alloc_init(void)
+{
+	struct rdt_resource *r = resctrl_arch_get_resource(RDT_RESOURCE_L3);
+
+	if (r->cache.io_alloc_capable)
+		resctrl_file_fflags_init("io_alloc",
+					 RFTYPE_CTRL_INFO | RFTYPE_RES_CACHE);
+}
+
 void resctrl_file_fflags_init(const char *config, unsigned long fflags)
 {
 	struct rftype *rft;
@@ -4249,6 +4459,8 @@ int resctrl_init(void)
 
 	thread_throttle_mode_init();
 
+	io_alloc_init();
+
 	ret = resctrl_mon_resource_init();
 	if (ret)
 		return ret;
-- 
2.34.1
Re: [PATCH v6 5/8] fs/resctrl: Add user interface to enable/disable io_alloc feature
Posted by Reinette Chatre 3 months, 3 weeks ago
Hi Babu,

On 6/11/25 2:23 PM, Babu Moger wrote:
> The io_alloc feature in resctrl is a mechanism that enables direct

"The ... feature ... is a mechanism"? What does it mean when a feature
is a mechanism? How about just "The io_alloc feature in resctrl enables ..."?

> insertion of data from I/O devices into the L3 cache.

Drop "L3"?

> 
> On AMD systems, io_alloc feature is backed by SDCIAE (L3 Smart Data Cache
> Injection Allocation Enforcement). When enabled, SDCIAE forces all SDCI
> lines to be placed into the L3 cache partitions identified by the
> highest-supported L3_MASK_n register as reported by CPUID
> Fn0000_0010_EDX_x1.MAX_COS. For example, if MAX_COS=15, SDCI lines will
> be allocated into the L3 cache partitions determined by the bitmask in
> the L3_MASK_15 register.

This is a resctrl fs patch but most of changelog so far is about AMD's implementation
details. I do think it is relevant, but all the register details can
probably be dropped since it is too low level to support the goal. I believe the
goal here is to motivate resctrl fs's implementation that needs to pick highest
CLOSID. So, essentially the changelog needs to highlight that AMD's implementation
requires the highest CLOSID and, without any other reference, that is what resctrl fs's
implementation supports. I think it is important to highlight that the
user interface is not tied to this implementation decision to avoid future issues
if another architecture supports "io_alloc" differently. Internals of
resctrl fs can then be changed.

> 
> When CDP is enabled, io_alloc routes I/O traffic using the highest CLOSID
> allocated for the instruction cache (L3CODE).

Again, this is a resctrl fs patch and above is an AMD implementation detail. resctrl
fs should not be limited by how AMD implements the feature but can use it as first
reference.

> 
> Introduce user interface to enable/disable "io_alloc" feature.

This patch does more than this.

> 
> Signed-off-by: Babu Moger <babu.moger@amd.com>
> ---
...

> ---
>  Documentation/filesystems/resctrl.rst |  34 ++++
>  fs/resctrl/rdtgroup.c                 | 216 +++++++++++++++++++++++++-

This patch contains several logical changes that are not adequately described
in changelog.
I think the following can be separate patches:
- rdt_bit_usage_show() change
- io_alloc_init() definition and usage
- show() and write() helpers
- possibly the io_alloc_closid helpers (more later)

>  2 files changed, 248 insertions(+), 2 deletions(-)
> 
> diff --git a/Documentation/filesystems/resctrl.rst b/Documentation/filesystems/resctrl.rst
> index c7949dd44f2f..03c829b2c276 100644
> --- a/Documentation/filesystems/resctrl.rst
> +++ b/Documentation/filesystems/resctrl.rst
> @@ -95,6 +95,11 @@ related to allocation:
>  		some platforms support devices that have their
>  		own settings for cache use which can over-ride
>  		these bits.
> +
> +		When the "io_alloc" feature is enabled, a portion of the cache
> +		is reserved for shared use between hardware and software. Refer

"reserved" and "shared" sounds like a contradiction. How about "is reserved" ->
"can be configured"?

> +		to "bit_usage" to see which portion is allocated for this purpose.

This is the "shareable_bits" section and the reason why user is pointed to
"bit_usage" is because "shareable_bits" is a global CBM that applies to all
cache domains/instances while "bit_usage" is per cache domain/instance. I think
it will be helpful to point this out to the user.
Perhaps second sentence can be replaced with something like:
""bit_usage" should be used to see which portions of each cache instance
 are configured for hardware use via the "io_alloc" feature because every cache
 instance can have its "io_alloc" bitmask configured independently".

Please do improve this.

To complete this the first part of the "shareable_bits" doc can be amended:
"Bitmask of shareable resource" -> "Bitmask (applicable to all instances of this resource) of shareable resource"
What do you think?

> +
>  "bit_usage":
>  		Annotated capacity bitmasks showing how all
>  		instances of the resource are used. The legend is:
> @@ -135,6 +140,35 @@ related to allocation:
>  			"1":
>  			      Non-contiguous 1s value in CBM is supported.
>  
> +"io_alloc":
> +		The "io_alloc" enables system software to configure the portion

"The "io_alloc" enables"? Maybe just ""io_alloc" enables"?

> +		of the L3 cache allocated for I/O traffic.

Drop "L3"?

Can append, for example, "File may only exist if the system supports this feature on some of its cache
resources".

> +
> +		The feature routes the I/O traffic via specific CLOSID reserved
> +		for io_alloc feature. By configuring the CBM (Capacity Bit Mask)
> +		for the CLOSID, users can control the L3 portions available for
> +		I/O traffic. The reserved CLOSID will be excluded for group creation.

Looking back I've commented *four* times already about how resctrl fs user interface
should not be made specific to AMD's implementation.
Should this paragraph just be dropped?


The rest can be something like:

		"disabled": Portions of cache used for allocation of I/O traffic cannot be configured.
		"enabled": Portions of cache used for allocation of I/O traffic can be configured using "io_alloc_cbm"
		"not supported": ...

		The underlying implementation may reduce resources available to
		general (CPU) cache allocation. See architecture specific notes below.
		Depending on usage requirements the feature can be enabled or disabled:

		To enable:
			# echo 1 > /sys/fs/resctrl/info/L3/io_alloc

		To disable:
			# echo 0 > /sys/fs/resctrl/info/L3/io_alloc

> +
> +		The interface provides a means to query the status of the feature.
> +
> +		Example::
> +
> +			# cat /sys/fs/resctrl/info/L3/io_alloc
> +			disabled
> +
> +		Feature can be enabled/disabled by writing to the interface.
> +		Example::
> +
> +			# echo 1 > /sys/fs/resctrl/info/L3/io_alloc
> +			# cat /sys/fs/resctrl/info/L3/io_alloc
> +			enabled
> +
> +		On AMD systems, the io_alloc feature is supported by the L3 Smart
> +		Data Cache Injection Allocation Enforcement (SDCIAE). The CLOSID for
> +		io_alloc is determined by the highest CLOSID supported by the resource.
> +		When CDP is enabled, io_alloc routes I/O traffic using the highest
> +		CLOSID allocated for the instruction cache (L3CODE).
> +
>  Memory bandwidth(MB) subdirectory contains the following files
>  with respect to allocation:
>  
> diff --git a/fs/resctrl/rdtgroup.c b/fs/resctrl/rdtgroup.c
> index 1beb124e25f6..bbc032b4d0e9 100644
> --- a/fs/resctrl/rdtgroup.c
> +++ b/fs/resctrl/rdtgroup.c
> @@ -70,6 +70,7 @@ static struct seq_buf last_cmd_status;
>  static char last_cmd_status_buf[512];
>  
>  static int rdtgroup_setup_root(struct rdt_fs_context *ctx);
> +static int rdtgroup_init_cat(struct resctrl_schema *s, u32 closid);
>  
>  static void rdtgroup_destroy_root(void);
>  
> @@ -232,6 +233,19 @@ bool closid_allocated(unsigned int closid)
>  	return !test_bit(closid, closid_free_map);
>  }
>  
> +static int resctrl_io_alloc_closid_alloc(u32 io_alloc_closid)
> +{
> +	if (__test_and_clear_bit(io_alloc_closid, closid_free_map))
> +		return io_alloc_closid;
> +	else
> +		return -ENOSPC;
> +}
> +
> +static void resctrl_io_alloc_closid_free(u32 io_alloc_closid)
> +{
> +	closid_free(io_alloc_closid);
> +}
> +
>  /**
>   * rdtgroup_mode_by_closid - Return mode of resource group with closid
>   * @closid: closid if the resource group
> @@ -1030,6 +1044,29 @@ static int rdt_shareable_bits_show(struct kernfs_open_file *of,
>  	return 0;
>  }
>  
> +/*
> + * resctrl_io_alloc_closid_get - io_alloc feature uses max CLOSID to route
> + * the IO traffic. Get the max CLOSID and verify if the CLOSID is available.

The function name should be followed by a *brief* description.

> + *
> + * The total number of CLOSIDs is determined in closid_init(),  based on the
> + * minimum supported across all resources. If CDP (Code Data Prioritization)
> + * is enabled, the number of CLOSIDs is halved. The final value is returned
> + * by closids_supported(). Make sure this value aligns with the maximum
> + * CLOSID supported by the respective resource.

All but the last sentence is unrelated to this function and the last sentence
is very vague. What is "this value" that it refers to?

> + */
> +static int resctrl_io_alloc_closid_get(struct rdt_resource *r)
> +{
> +	int num_closids = closids_supported();
> +
> +	if (resctrl_arch_get_cdp_enabled(r->rid))
> +		num_closids *= 2;
> +
> +	if (num_closids != resctrl_arch_get_num_closid(r))
> +		return -ENOSPC;
> +
> +	return closids_supported() - 1;
> +}

resctrl_io_alloc_closid_get() seems to be trying to do two things: 
- determine what the io_alloc_closid is
- make sure the io_alloc_closid is supported

I think this should be split into two functions. Once the
io_alloc_closid is determined to be supported and io_alloc
enabled then there is no reason to keep checking if it is
supported whenever the io_alloc_closid is queried.

How about simplifying this to:

/*
 * note how this returns u32 that will eliminate
 * unnecessary error checking in usages where io_alloc_closid
 * needs to be determined after an resctrl_arch_get_io_alloc_enabled(r)
 * already confirmed io_alloc is enabled
 * function comment could note that this returns the CLOSID
 * required by io_alloc but not whether the CLOSID can
 * be supported, for this resctrl_io_alloc_closid_supported() should
 * be used.
 * Can also note that returned value will always be valid if
 * resctrl_arch_get_io_alloc_enabled(r) is true.
 */
u32 resctrl_io_alloc_closid(struct rdt_resource *r)
{
	if (resctrl_arch_get_cdp_enabled(r->rid))
		return resctrl_arch_get_num_closid(r) / 2 - 1;
	else
		return resctrl_arch_get_num_closid(r) - 1;
}

/*
 * note how below already makes resctrl's io_alloc implementation
 * more generic
 */
bool resctrl_io_alloc_closid_supported(u32 io_alloc_closid)
{
	return io_alloc_closid < closids_supported();
}
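
As a rough userspace model of why the split matters (all names and values here are hypothetical stand-ins, not kernel code): "sys_closids" plays the role of closids_supported() (the minimum across resources, already halved for CDP) while "l3_num_closid" plays the role of resctrl_arch_get_num_closid() on L3. The supported() check can fail when another resource limits the system-wide CLOSID count:

```c
#include <assert.h>
#include <stdbool.h>

static unsigned int l3_num_closid = 16; /* CLOSIDs of the L3 resource */
static unsigned int sys_closids = 8;    /* min across all resources   */

/* Highest CLOSID of the L3 resource, per CDP state. */
static unsigned int io_alloc_closid(bool cdp)
{
	return cdp ? l3_num_closid / 2 - 1 : l3_num_closid - 1;
}

/* The required CLOSID is usable only if it is within the system minimum. */
static bool io_alloc_closid_supported(unsigned int closid)
{
	return closid < sys_closids;
}
```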



> +
>  /*
>   * rdt_bit_usage_show - Display current usage of resources
>   *
> @@ -1058,20 +1095,23 @@ static int rdt_bit_usage_show(struct kernfs_open_file *of,
>  	struct rdt_ctrl_domain *dom;
>  	int i, hwb, swb, excl, psl;
>  	enum rdtgrp_mode mode;
> +	int io_alloc_closid;
>  	bool sep = false;
>  	u32 ctrl_val;
>  
>  	cpus_read_lock();
>  	mutex_lock(&rdtgroup_mutex);
> -	hw_shareable = r->cache.shareable_bits;
>  	list_for_each_entry(dom, &r->ctrl_domains, hdr.list) {
>  		if (sep)
>  			seq_putc(seq, ';');
> +		hw_shareable = r->cache.shareable_bits;
>  		sw_shareable = 0;
>  		exclusive = 0;
>  		seq_printf(seq, "%d=", dom->hdr.id);
>  		for (i = 0; i < closids_supported(); i++) {
> -			if (!closid_allocated(i))
> +			if (!closid_allocated(i) ||
> +			    (resctrl_arch_get_io_alloc_enabled(r) &&
> +			     i == resctrl_io_alloc_closid_get(r)))
>  				continue;
>  			ctrl_val = resctrl_arch_get_config(r, dom, i,
>  							   s->conf_type);
> @@ -1099,6 +1139,24 @@ static int rdt_bit_usage_show(struct kernfs_open_file *of,
>  				break;
>  			}
>  		}
> +
> +		/*
> +		 * When the "io_alloc" feature is enabled, a portion of the
> +		 * cache is reserved for shared use between hardware and software.
> +		 */
> +		if (resctrl_arch_get_io_alloc_enabled(r)) {
> +			io_alloc_closid = resctrl_io_alloc_closid_get(r);

In this implementation io_alloc_closid can be negative and the static
checker I used complained about the subsequent usage in 
resctrl_arch_get_config() that must be unsigned.
Since resctrl_arch_get_io_alloc_enabled(r) already passed this is one
example where resctrl_io_alloc_closid() can be used.

> +			if (resctrl_arch_get_cdp_enabled(r->rid))
> +				ctrl_val = resctrl_arch_get_config(r, dom,
> +								   io_alloc_closid,
> +								   CDP_CODE);
> +			else
> +				ctrl_val = resctrl_arch_get_config(r, dom,
> +								   io_alloc_closid,
> +								   CDP_NONE);
> +			hw_shareable |= ctrl_val;
> +		}
> +
>  		for (i = r->cache.cbm_len - 1; i >= 0; i--) {
>  			pseudo_locked = dom->plr ? dom->plr->cbm : 0;
>  			hwb = test_bit(i, &hw_shareable);
> @@ -1803,6 +1861,142 @@ static ssize_t mbm_local_bytes_config_write(struct kernfs_open_file *of,
>  	return ret ?: nbytes;
>  }
>  
> +static int resctrl_io_alloc_show(struct kernfs_open_file *of,
> +				 struct seq_file *seq, void *v)
> +{
> +	struct resctrl_schema *s = rdt_kn_parent_priv(of->kn);
> +	struct rdt_resource *r = s->res;
> +

Needs rdtgroup_mutex

> +	if (r->cache.io_alloc_capable) {
> +		if (resctrl_arch_get_io_alloc_enabled(r))
> +			seq_puts(seq, "enabled\n");
> +		else
> +			seq_puts(seq, "disabled\n");
> +	} else {
> +		seq_puts(seq, "not supported\n");
> +	}
> +
> +	return 0;
> +}
> +
> +/*
> + * Initialize io_alloc CLOSID cache resource with default CBM values.

It is unclear what "default" means.
Could be "Initialize io_alloc CLOSID cache resource CBM with all usable (shared and unused) cache portions".

> + */
> +static int resctrl_io_alloc_init_cat(struct rdt_resource *r,
> +				     struct resctrl_schema *s, u32 closid)
> +{
> +	int ret;
> +
> +	rdt_staged_configs_clear();
> +
> +	ret = rdtgroup_init_cat(s, closid);
> +	if (ret < 0)
> +		goto out_init_cat;

The "out" label should reflect what is done at target, not the source.
Consider all the usages of "out_unlock" in resctrl.
Since this is the only label it can just be "out".

> +
> +	ret = resctrl_arch_update_domains(r, closid);
> +
> +out_init_cat:
> +	rdt_staged_configs_clear();
> +	return ret;
> +}
> +
> +static const char *rdtgroup_name_by_closid(int closid)
> +{
> +	struct rdtgroup *rdtgrp;
> +
> +	list_for_each_entry(rdtgrp, &rdt_all_groups, rdtgroup_list) {
> +		if (rdtgrp->closid == closid)
> +			return rdt_kn_name(rdtgrp->kn);
> +	}
> +
> +	return NULL;
> +}
> +
> +/*
> + * When CDP is enabled, io_alloc directs traffic using the highest CLOSID
> + * linked to an L3CODE resource. Although CBMs can be accessed through
> + * either L3CODE or L3DATA resources, any updates to the schemata must
> + * always be performed on L3CODE.
> + */

I think that updates to the schemata needs to be made on *both* L3DATA and L3CODE.
Consider how __init_one_rdt_domain() works when a new resource group is created ...
the algorithm looks through all allocated CLOSID and the associated schemata impact
the CBM of the new group. If an allocated CLOSID is associated with L3CODE
then its "peer" L3DATA is also taken into account, similar for L3DATA.
If only L3CODE is updated for the io_alloc_closid then it looks to me that
whatever the original L3DATA schema was will keep impacting new resource
groups. To avoid that and ensure only accurate CBMs are used it looks to me
as though the L3DATA and L3CODE schema needs to be kept in sync.
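
A toy model of that sync requirement (hypothetical arrays, not kernel code): if writing the io_alloc CBM mirrors the value into both CDP halves, new-group initialization sees the same bits whether it inspects the L3CODE or the L3DATA schema.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_DOMAINS 2

/* Per-domain CBMs of the io_alloc CLOSID in each CDP half. */
static uint32_t l3code_cbm[NUM_DOMAINS];
static uint32_t l3data_cbm[NUM_DOMAINS];

/* Write through either schema; mirror the value into the peer half. */
static void io_alloc_set_cbm(int dom, uint32_t cbm)
{
	l3code_cbm[dom] = cbm;
	l3data_cbm[dom] = cbm;
}
```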

> +static struct resctrl_schema *resctrl_schema_io_alloc(struct resctrl_schema *s)
> +{
> +	struct resctrl_schema *schema;
> +
> +	if (s->conf_type == CDP_DATA) {
> +		list_for_each_entry(schema, &resctrl_schema_all, list) {
> +			if (schema->conf_type == CDP_CODE)
> +				return schema;
> +		}
> +	}
> +
> +	return s;
> +}
> +
> +static ssize_t resctrl_io_alloc_write(struct kernfs_open_file *of, char *buf,
> +				      size_t nbytes, loff_t off)
> +{
> +	struct resctrl_schema *s = rdt_kn_parent_priv(of->kn);
> +	struct rdt_resource *r = s->res;
> +	char const *grp_name;
> +	u32 io_alloc_closid;
> +	bool enable;
> +	int ret;
> +
> +	ret = kstrtobool(buf, &enable);
> +	if (ret)
> +		return ret;
> +
> +	cpus_read_lock();
> +	mutex_lock(&rdtgroup_mutex);
> +
> +	rdt_last_cmd_clear();
> +
> +	if (!r->cache.io_alloc_capable) {
> +		rdt_last_cmd_puts("io_alloc feature is not supported on the resource\n");

This could be more useful with a small change:
"io_alloc is not supported on %s\n", s->name

> +		ret = -ENODEV;
> +		goto out_io_alloc;

out_io_alloc -> out_unlock

> +	}
> +
> +	io_alloc_closid = resctrl_io_alloc_closid_get(r);
> +	if (io_alloc_closid < 0) {
> +		rdt_last_cmd_puts("Max CLOSID to support io_alloc is not available\n");

This could be more useful to help debug by printing the value of io_alloc_closid
that user can compare against output of num_closids files. Here the terms become
a bit complicated since ideally we want to use ctrl_hw_id but that does not match
"num_closids", so perhaps use both terms, for example "CLOSID (ctrl_hw_id)"?
I am not sure here.

> +		ret = -EINVAL;
> +		goto out_io_alloc;
> +	}
> +
> +	if (resctrl_arch_get_io_alloc_enabled(r) != enable) {
> +		if (enable) {
> +			ret = resctrl_io_alloc_closid_alloc(io_alloc_closid);
> +			if (ret < 0) {
> +				grp_name = rdtgroup_name_by_closid(io_alloc_closid);

Below handles !grp_name but that would be a kernel bug, no? Maybe WARN_ON_ONCE()?

> +				rdt_last_cmd_printf("CLOSID for io_alloc is used by %s group\n",
> +						    grp_name ? grp_name : "another");


CLOSID -> ctrl_hw_id

> +				ret = -EINVAL;
> +				goto out_io_alloc;
> +			}
> +
> +			ret = resctrl_io_alloc_init_cat(r, resctrl_schema_io_alloc(s),
> +							io_alloc_closid);
> +			if (ret) {
> +				rdt_last_cmd_puts("Failed to initialize io_alloc allocations\n");
> +				resctrl_io_alloc_closid_free(io_alloc_closid);
> +				goto out_io_alloc;
> +			}
> +
> +		} else {
> +			resctrl_io_alloc_closid_free(io_alloc_closid);
> +		}
> +
> +		ret = resctrl_arch_io_alloc_enable(r, enable);
> +	}
> +
> +out_io_alloc:

out_unlock ... to match the other places in resctrl.

> +	mutex_unlock(&rdtgroup_mutex);
> +	cpus_read_unlock();
> +
> +	return ret ?: nbytes;
> +}
> +
>  /* rdtgroup information files for one cache resource. */
>  static struct rftype res_common_files[] = {
>  	{
> @@ -1955,6 +2149,13 @@ static struct rftype res_common_files[] = {
>  		.seq_show	= rdtgroup_schemata_show,
>  		.fflags		= RFTYPE_CTRL_BASE,
>  	},
> +	{
> +		.name           = "io_alloc",
> +		.mode           = 0644,
> +		.kf_ops         = &rdtgroup_kf_single_ops,
> +		.seq_show       = resctrl_io_alloc_show,
> +		.write          = resctrl_io_alloc_write,
> +	},

Please match existing code wrt tab usage.

>  	{
>  		.name		= "mba_MBps_event",
>  		.mode		= 0644,
> @@ -2062,6 +2263,15 @@ static void thread_throttle_mode_init(void)
>  				 RFTYPE_CTRL_INFO | RFTYPE_RES_MB);
>  }
>  
> +static void io_alloc_init(void)
> +{
> +	struct rdt_resource *r = resctrl_arch_get_resource(RDT_RESOURCE_L3);
> +
> +	if (r->cache.io_alloc_capable)
> +		resctrl_file_fflags_init("io_alloc",
> +					 RFTYPE_CTRL_INFO | RFTYPE_RES_CACHE);
> +}

Note that even if it only checks the L3 cache resource, this will make the
file visible to all cache resources, including L2. This is why it is important
to ensure documentation and implementation also accommodate resources that
do not support io_alloc.

> +
>  void resctrl_file_fflags_init(const char *config, unsigned long fflags)
>  {
>  	struct rftype *rft;
> @@ -4249,6 +4459,8 @@ int resctrl_init(void)
>  
>  	thread_throttle_mode_init();
>  
> +	io_alloc_init();
> +
>  	ret = resctrl_mon_resource_init();
>  	if (ret)
>  		return ret;

Reinette
Re: [PATCH v6 5/8] fs/resctrl: Add user interface to enable/disable io_alloc feature
Posted by Moger, Babu 3 months, 3 weeks ago
Hi Reinette,

On 6/17/25 22:59, Reinette Chatre wrote:
> Hi Babu,
> 
> On 6/11/25 2:23 PM, Babu Moger wrote:
>> The io_alloc feature in resctrl is a mechanism that enables direct
> 
> "The ... feature ... is a mechanism"? What does it mean when a feature
> is a mechanism? How about just "The io_alloc feature in resctrl enables ..."?

Sure.

> 
>> insertion of data from I/O devices into the L3 cache.
> 
> Drop "L3"?
> 

Sure.

>>
>> On AMD systems, io_alloc feature is backed by SDCIAE (L3 Smart Data Cache
>> Injection Allocation Enforcement). When enabled, SDCIAE forces all SDCI
>> lines to be placed into the L3 cache partitions identified by the
>> highest-supported L3_MASK_n register as reported by CPUID
>> Fn0000_0010_EDX_x1.MAX_COS. For example, if MAX_COS=15, SDCI lines will
>> be allocated into the L3 cache partitions determined by the bitmask in
>> the L3_MASK_15 register.
> 
> This is a resctrl fs patch but most of the changelog so far is about AMD's implementation
> details. I do think it is relevant, but all the register details can
> probably be dropped since it is too low level to support the goal. I believe the
> goal here is to motivate resctrl fs's implementation that needs to pick highest
> CLOSID. So, essentially the changelog needs to highlight that AMD's implementation
> requires highest CLOSID and without any other reference that is what resctrl fs's
> implementation supports. I think it is important to highlight that the
> user interface is not tied to this implementation decision to avoid future issues
> if another architecture supports "io_alloc" "differently". Internals of
> resctrl fs can then be changed.
> 

Sure. Will split these patches. Will try to make the text more generic.

>>
>> When CDP is enabled, io_alloc routes I/O traffic using the highest CLOSID
>> allocated for the instruction cache (L3CODE).
> 
> Again, this is a resctrl fs patch and above is an AMD implementation detail. resctrl
> fs should not be limited by how AMD implements the feature but can use it as first
> reference.

Sure.

> 
>>
>> Introduce user interface to enable/disable "io_alloc" feature.
> 
> This patch does more than this.

Sure.

> 
>>
>> Signed-off-by: Babu Moger <babu.moger@amd.com>
>> ---
> ...
> 
>> ---
>>  Documentation/filesystems/resctrl.rst |  34 ++++
>>  fs/resctrl/rdtgroup.c                 | 216 +++++++++++++++++++++++++-
> 
> This patch contains several logical changes that are not adequately described
> in changelog.
> I think the following can be separate patches:
> - rdt_bit_usage_show() change
> - io_alloc_init() definition and usage
> - show() and write() helpers
> - possibly the io_alloc_closid helpers (more later)

Yes. Splitting this into 3 patches:

1. rdt_bit_usage_show() change
   Updates the shareable_bits section.
   It calls resctrl_io_alloc_closid().

2. io_alloc_init() definition and usage
   Introduce the interface and add show().

3. Add io_alloc write().
   Calls resctrl_io_alloc_closid_supported() and
   resctrl_io_alloc_closid_get().


> 
>>  2 files changed, 248 insertions(+), 2 deletions(-)
>>
>> diff --git a/Documentation/filesystems/resctrl.rst b/Documentation/filesystems/resctrl.rst
>> index c7949dd44f2f..03c829b2c276 100644
>> --- a/Documentation/filesystems/resctrl.rst
>> +++ b/Documentation/filesystems/resctrl.rst
>> @@ -95,6 +95,11 @@ related to allocation:
>>  		some platforms support devices that have their
>>  		own settings for cache use which can over-ride
>>  		these bits.
>> +
>> +		When the "io_alloc" feature is enabled, a portion of the cache
>> +		is reserved for shared use between hardware and software. Refer
> 
> "reserved" and "shared" sounds like a contradiction. How about "is reserved" ->
> "can be configured"?
> 
>> +		to "bit_usage" to see which portion is allocated for this purpose.
> 
> This is the "shareable_bits" section and the reason why user is pointed to
> "bit_usage" is because "shareable_bits" is a global CBM that applies to all
> cache domains/instances while "bit_usage" is per cache domain/instance. I think
> it will be helpful to point this out to the user.
> Perhaps second sentence can be replaced with something like:
> ""bit_usage" should be used to see which portions of each cache instance
>  is configured for hardware use via the "io_alloc" feature because every cache
>  instance can have its "io_alloc" bitmask configured independently".

Sure.

""bit_usage" should be used to see which portions of each cache
instance is configured for hardware use via the "io_alloc" feature
because every cache instance can have its "io_alloc" bitmask
configured independently.


> 
> Please do improve this.
> 
> To complete this the first part of the "shareable_bits" doc can be amended:
> "Bitmask of shareable resource" -> "Bitmask (applicable to all instances of this resource) of shareable resource"
> What do you think?

Sure. Added one sentence for that.

Bitmask of shareable resource with other executing entities
(e.g. I/O). Applies to all instances of this resource.



> 
>> +
>>  "bit_usage":
>>  		Annotated capacity bitmasks showing how all
>>  		instances of the resource are used. The legend is:
>> @@ -135,6 +140,35 @@ related to allocation:
>>  			"1":
>>  			      Non-contiguous 1s value in CBM is supported.
>>  
>> +"io_alloc":
>> +		The "io_alloc" enables system software to configure the portion
> 
> "The "io_alloc" enables"? Maybe just ""io_alloc" enables"?
> 
>> +		of the L3 cache allocated for I/O traffic.
> 
> Drop "L3"?
> 
> Can append, for example, "File may only exist if the system supports this feature on some of its cache
> resources".
> 
>> +
>> +		The feature routes the I/O traffic via specific CLOSID reserved
>> +		for io_alloc feature. By configuring the CBM (Capacity Bit Mask)
>> +		for the CLOSID, users can control the L3 portions available for
>> +		I/0 traffic. The reserved CLOSID will be excluded for group creation.
> 
> Looking back I've commented *four* times already about how resctrl fs user interface
> should not be made specific to AMD's implementation.
> Can this paragraph just be dropped?
> 
> 
> The rest can be something like:
> 
> 		"disabled": Portions of cache used for allocation of I/O traffic cannot be configured.
> 		"enabled": Portions of cache used for allocation of I/O traffic can be configured using "io_alloc_cbm"
> 		"not supported": ...
> 
> 		The underlying implementation may reduce resources available to
> 		general (CPU) cache allocation. See architecture specific notes below.
> 		Depending on usage requirements the feature can be enabled or disabled:
> 
> 		To enable:
> 			# echo 1 > /sys/fs/resctrl/info/L3/io_alloc
> 
> 		To disable:
> 			# echo 0 > /sys/fs/resctrl/info/L3/io_alloc

Sure.

> 
>> +
>> +		The interface provides a means to query the status of the feature.
>> +
>> +		Example::
>> +
>> +			# cat /sys/fs/resctrl/info/L3/io_alloc
>> +			disabled
>> +
>> +		Feature can be enabled/disabled by writing to the interface.
>> +		Example::
>> +
>> +			# echo 1 > /sys/fs/resctrl/info/L3/io_alloc
>> +			# cat /sys/fs/resctrl/info/L3/io_alloc
>> +			enabled
>> +
>> +		On AMD systems, the io_alloc feature is supported by the L3 Smart
>> +		Data Cache Injection Allocation Enforcement (SDCIAE). The CLOSID for
>> +		io_alloc is determined by the highest CLOSID supported by the resource.
>> +		When CDP is enabled, io_alloc routes I/O traffic using the highest
>> +		CLOSID allocated for the instruction cache (L3CODE).
>> +
>>  Memory bandwidth(MB) subdirectory contains the following files
>>  with respect to allocation:
>>  
>> diff --git a/fs/resctrl/rdtgroup.c b/fs/resctrl/rdtgroup.c
>> index 1beb124e25f6..bbc032b4d0e9 100644
>> --- a/fs/resctrl/rdtgroup.c
>> +++ b/fs/resctrl/rdtgroup.c
>> @@ -70,6 +70,7 @@ static struct seq_buf last_cmd_status;
>>  static char last_cmd_status_buf[512];
>>  
>>  static int rdtgroup_setup_root(struct rdt_fs_context *ctx);
>> +static int rdtgroup_init_cat(struct resctrl_schema *s, u32 closid);
>>  
>>  static void rdtgroup_destroy_root(void);
>>  
>> @@ -232,6 +233,19 @@ bool closid_allocated(unsigned int closid)
>>  	return !test_bit(closid, closid_free_map);
>>  }
>>  
>> +static int resctrl_io_alloc_closid_alloc(u32 io_alloc_closid)
>> +{
>> +	if (__test_and_clear_bit(io_alloc_closid, closid_free_map))
>> +		return io_alloc_closid;
>> +	else
>> +		return -ENOSPC;
>> +}
>> +
>> +static void resctrl_io_alloc_closid_free(u32 io_alloc_closid)
>> +{
>> +	closid_free(io_alloc_closid);
>> +}
>> +
>>  /**
>>   * rdtgroup_mode_by_closid - Return mode of resource group with closid
>>   * @closid: closid if the resource group
>> @@ -1030,6 +1044,29 @@ static int rdt_shareable_bits_show(struct kernfs_open_file *of,
>>  	return 0;
>>  }
>>  
>> +/*
>> + * resctrl_io_alloc_closid_get - io_alloc feature uses max CLOSID to route
>> + * the IO traffic. Get the max CLOSID and verify if the CLOSID is available.
> 
> The function name should be followed by a *brief* description.

Sure.

> 
>> + *
>> + * The total number of CLOSIDs is determined in closid_init(),  based on the
>> + * minimum supported across all resources. If CDP (Code Data Prioritization)
>> + * is enabled, the number of CLOSIDs is halved. The final value is returned
>> + * by closids_supported(). Make sure this value aligns with the maximum
>> + * CLOSID supported by the respective resource.
> 
> All but the last sentence is unrelated to this function and the last sentence
> is very vague. What is "this value" that it refers to?

Removed it.

> 
>> + */
>> +static int resctrl_io_alloc_closid_get(struct rdt_resource *r)
>> +{
>> +	int num_closids = closids_supported();
>> +
>> +	if (resctrl_arch_get_cdp_enabled(r->rid))
>> +		num_closids *= 2;
>> +
>> +	if (num_closids != resctrl_arch_get_num_closid(r))
>> +		return -ENOSPC;
>> +
>> +	return closids_supported() - 1;
>> +}
> 
> resctrl_io_alloc_closid_get() seems to be trying to do two things: 
> - determine what the io_alloc_closid is
> - make sure the io_alloc_closid is supported
> 
> I think this should be split into two functions. Once the
> io_alloc_closid is determined to be supported and io_alloc
> enabled then there is no reason to keep checking if it is
> supported whenever the io_alloc_closid is queried.
> 
> How about simplifying this to:
> 
> /*
>  * note how this returns u32 that will eliminate
>  * unnecessary error checking in usages where io_alloc_closid
>  * needs to be determined after an resctrl_arch_get_io_alloc_enabled(r)
>  * already confirmed io_alloc is enabled
>  * function comment could note that this returns the CLOSID
>  * required by io_alloc but not whether the CLOSID can
>  * be supported, for this resctrl_io_alloc_closid_supported() should
>  * be used.
>  * Can also note that returned value will always be valid if
>  * resctrl_arch_get_io_alloc_enabled(r) is true.
>  */
> u32 resctrl_io_alloc_closid(struct rdt_resource *r) {
> 	if (resctrl_arch_get_cdp_enabled(r->rid))
> 		return resctrl_arch_get_num_closid(r)/2  - 1
> 	else
> 		return resctrl_arch_get_num_closid(r) -1
> }
> 
> /*
>  * note how below already makes resctrl's io_alloc implementation
>  * more generic
>  */
> resctrl_io_alloc_closid_supported(u32 io_alloc_closid) {
> 	return io_alloc_closid <  closids_supported()
> }
> 

Sure.
   Changed the check to

    return io_alloc_closid == (closids_supported() -1)

> 
> 
>> +
>>  /*
>>   * rdt_bit_usage_show - Display current usage of resources
>>   *
>> @@ -1058,20 +1095,23 @@ static int rdt_bit_usage_show(struct kernfs_open_file *of,
>>  	struct rdt_ctrl_domain *dom;
>>  	int i, hwb, swb, excl, psl;
>>  	enum rdtgrp_mode mode;
>> +	int io_alloc_closid;
>>  	bool sep = false;
>>  	u32 ctrl_val;
>>  
>>  	cpus_read_lock();
>>  	mutex_lock(&rdtgroup_mutex);
>> -	hw_shareable = r->cache.shareable_bits;
>>  	list_for_each_entry(dom, &r->ctrl_domains, hdr.list) {
>>  		if (sep)
>>  			seq_putc(seq, ';');
>> +		hw_shareable = r->cache.shareable_bits;
>>  		sw_shareable = 0;
>>  		exclusive = 0;
>>  		seq_printf(seq, "%d=", dom->hdr.id);
>>  		for (i = 0; i < closids_supported(); i++) {
>> -			if (!closid_allocated(i))
>> +			if (!closid_allocated(i) ||
>> +			    (resctrl_arch_get_io_alloc_enabled(r) &&
>> +			     i == resctrl_io_alloc_closid_get(r)))
>>  				continue;
>>  			ctrl_val = resctrl_arch_get_config(r, dom, i,
>>  							   s->conf_type);
>> @@ -1099,6 +1139,24 @@ static int rdt_bit_usage_show(struct kernfs_open_file *of,
>>  				break;
>>  			}
>>  		}
>> +
>> +		/*
>> +		 * When the "io_alloc" feature is enabled, a portion of the
>> +		 * cache is reserved for shared use between hardware and software.
>> +		 */
>> +		if (resctrl_arch_get_io_alloc_enabled(r)) {
>> +			io_alloc_closid = resctrl_io_alloc_closid_get(r);
> 
> In this implementation io_alloc_closid can be negative and the static
> checker I used complained about the subsequent usage in 
> resctrl_arch_get_config() that must be unsigned.
> Since resctrl_arch_get_io_alloc_enabled(r) already passed this is one
> example where resctrl_io_alloc_closid() can be used.

Sure.

> 
>> +			if (resctrl_arch_get_cdp_enabled(r->rid))
>> +				ctrl_val = resctrl_arch_get_config(r, dom,
>> +								   io_alloc_closid,
>> +								   CDP_CODE);
>> +			else
>> +				ctrl_val = resctrl_arch_get_config(r, dom,
>> +								   io_alloc_closid,
>> +								   CDP_NONE);
>> +			hw_shareable |= ctrl_val;
>> +		}
>> +
>>  		for (i = r->cache.cbm_len - 1; i >= 0; i--) {
>>  			pseudo_locked = dom->plr ? dom->plr->cbm : 0;
>>  			hwb = test_bit(i, &hw_shareable);
>> @@ -1803,6 +1861,142 @@ static ssize_t mbm_local_bytes_config_write(struct kernfs_open_file *of,
>>  	return ret ?: nbytes;
>>  }
>>  
>> +static int resctrl_io_alloc_show(struct kernfs_open_file *of,
>> +				 struct seq_file *seq, void *v)
>> +{
>> +	struct resctrl_schema *s = rdt_kn_parent_priv(of->kn);
>> +	struct rdt_resource *r = s->res;
>> +
> 
> Needs rdtgroup_mutex

Sure.

> 
>> +	if (r->cache.io_alloc_capable) {
>> +		if (resctrl_arch_get_io_alloc_enabled(r))
>> +			seq_puts(seq, "enabled\n");
>> +		else
>> +			seq_puts(seq, "disabled\n");
>> +	} else {
>> +		seq_puts(seq, "not supported\n");
>> +	}
>> +
>> +	return 0;
>> +}
>> +
>> +/*
>> + * Initialize io_alloc CLOSID cache resource with default CBM values.
> 
> It is unclear what "default" means.
> Could be "Initialize io_alloc CLOSID cache resource CBM with all usable (shared and unused) cache portions".

Sure.

> 
>> + */
>> +static int resctrl_io_alloc_init_cat(struct rdt_resource *r,
>> +				     struct resctrl_schema *s, u32 closid)
>> +{
>> +	int ret;
>> +
>> +	rdt_staged_configs_clear();
>> +
>> +	ret = rdtgroup_init_cat(s, closid);
>> +	if (ret < 0)
>> +		goto out_init_cat;
> 
> The "out" label should reflect what is done at target, not the source.
> Consider all the usages of "out_unlock" in resctrl.
> Since this is the only label it can just be "out".

Sure.

> 
>> +
>> +	ret = resctrl_arch_update_domains(r, closid);
>> +
>> +out_init_cat:
>> +	rdt_staged_configs_clear();
>> +	return ret;
>> +}
>> +
>> +static const char *rdtgroup_name_by_closid(int closid)
>> +{
>> +	struct rdtgroup *rdtgrp;
>> +
>> +	list_for_each_entry(rdtgrp, &rdt_all_groups, rdtgroup_list) {
>> +		if (rdtgrp->closid == closid)
>> +			return rdt_kn_name(rdtgrp->kn);
>> +	}
>> +
>> +	return NULL;
>> +}
>> +
>> +/*
>> + * When CDP is enabled, io_alloc directs traffic using the highest CLOSID
>> + * linked to an L3CODE resource. Although CBMs can be accessed through
>> + * either L3CODE or L3DATA resources, any updates to the schemata must
>> + * always be performed on L3CODE.
>> + */
> 
> I think that updates to the schemata needs to be made on *both* L3DATA and L3CODE.
> Consider how __init_one_rdt_domain() works when a new resource group is created ...
> the algorithm looks through all allocated CLOSID and the associated schemata impact
> the CBM of the new group. If an allocated CLOSID is associated with L3CODE
> then its "peer" L3DATA is also taken into account, similar for L3DATA.
> If only L3CODE is updated for the io_alloc_closid then it looks to me that
> whatever the original L3DATA schema was will keep impacting new resource
> groups. To avoid that and ensure only accurate CBMs are used it looks to me
> as though the L3DATA and L3CODE schema needs to be kept in sync.

Sure. Will verify this.

> 
>> +static struct resctrl_schema *resctrl_schema_io_alloc(struct resctrl_schema *s)
>> +{
>> +	struct resctrl_schema *schema;
>> +
>> +	if (s->conf_type == CDP_DATA) {
>> +		list_for_each_entry(schema, &resctrl_schema_all, list) {
>> +			if (schema->conf_type == CDP_CODE)
>> +				return schema;
>> +		}
>> +	}
>> +
>> +	return s;
>> +}
>> +
>> +static ssize_t resctrl_io_alloc_write(struct kernfs_open_file *of, char *buf,
>> +				      size_t nbytes, loff_t off)
>> +{
>> +	struct resctrl_schema *s = rdt_kn_parent_priv(of->kn);
>> +	struct rdt_resource *r = s->res;
>> +	char const *grp_name;
>> +	u32 io_alloc_closid;
>> +	bool enable;
>> +	int ret;
>> +
>> +	ret = kstrtobool(buf, &enable);
>> +	if (ret)
>> +		return ret;
>> +
>> +	cpus_read_lock();
>> +	mutex_lock(&rdtgroup_mutex);
>> +
>> +	rdt_last_cmd_clear();
>> +
>> +	if (!r->cache.io_alloc_capable) {
>> +		rdt_last_cmd_puts("io_alloc feature is not supported on the resource\n");
> 
> This could be more useful with a small change:
> "io_alloc is not supported on %s\n", s->name

Yes.

> 
>> +		ret = -ENODEV;
>> +		goto out_io_alloc;
> 
> out_io_alloc -> out_unlock

Sure.

> 
>> +	}
>> +
>> +	io_alloc_closid = resctrl_io_alloc_closid_get(r);
>> +	if (io_alloc_closid < 0) {
>> +		rdt_last_cmd_puts("Max CLOSID to support io_alloc is not available\n");
> 
> This could be more useful to help debug by printing the value of io_alloc_closid
> that user can compare against output of num_closids files. Here the terms become
> a bit complicated since ideally we want to use ctrl_hw_id but that does not match
> "num_closids", so perhaps use both terms, for example "CLOSID (ctrl_hw_id)"?
> I am not sure here.

Fine with me.

> 
>> +		ret = -EINVAL;
>> +		goto out_io_alloc;
>> +	}
>> +
>> +	if (resctrl_arch_get_io_alloc_enabled(r) != enable) {
>> +		if (enable) {
>> +			ret = resctrl_io_alloc_closid_alloc(io_alloc_closid);
>> +			if (ret < 0) {
>> +				grp_name = rdtgroup_name_by_closid(io_alloc_closid);
> 
> Below handles !grp_name but that would be a kernel bug, no? Maybe WARN_ON_ONCE()?

Sure.

> 
>> +				rdt_last_cmd_printf("CLOSID for io_alloc is used by %s group\n",
>> +						    grp_name ? grp_name : "another");
> 
> 
> CLOSID -> ctrl_hw_id
> 
Sure.

>> +				ret = -EINVAL;
>> +				goto out_io_alloc;
>> +			}
>> +
>> +			ret = resctrl_io_alloc_init_cat(r, resctrl_schema_io_alloc(s),
>> +							io_alloc_closid);
>> +			if (ret) {
>> +				rdt_last_cmd_puts("Failed to initialize io_alloc allocations\n");
>> +				resctrl_io_alloc_closid_free(io_alloc_closid);
>> +				goto out_io_alloc;
>> +			}
>> +
>> +		} else {
>> +			resctrl_io_alloc_closid_free(io_alloc_closid);
>> +		}
>> +
>> +		ret = resctrl_arch_io_alloc_enable(r, enable);
>> +	}
>> +
>> +out_io_alloc:
> 
> out_unlock ... to match the other places in resctrl.

Sure.

> 
>> +	mutex_unlock(&rdtgroup_mutex);
>> +	cpus_read_unlock();
>> +
>> +	return ret ?: nbytes;
>> +}
>> +
>>  /* rdtgroup information files for one cache resource. */
>>  static struct rftype res_common_files[] = {
>>  	{
>> @@ -1955,6 +2149,13 @@ static struct rftype res_common_files[] = {
>>  		.seq_show	= rdtgroup_schemata_show,
>>  		.fflags		= RFTYPE_CTRL_BASE,
>>  	},
>> +	{
>> +		.name           = "io_alloc",
>> +		.mode           = 0644,
>> +		.kf_ops         = &rdtgroup_kf_single_ops,
>> +		.seq_show       = resctrl_io_alloc_show,
>> +		.write          = resctrl_io_alloc_write,
>> +	},
> 
> Please match existing code wrt tab usage.

Sure.

> 
>>  	{
>>  		.name		= "mba_MBps_event",
>>  		.mode		= 0644,
>> @@ -2062,6 +2263,15 @@ static void thread_throttle_mode_init(void)
>>  				 RFTYPE_CTRL_INFO | RFTYPE_RES_MB);
>>  }
>>  
>> +static void io_alloc_init(void)
>> +{
>> +	struct rdt_resource *r = resctrl_arch_get_resource(RDT_RESOURCE_L3);
>> +
>> +	if (r->cache.io_alloc_capable)
>> +		resctrl_file_fflags_init("io_alloc",
>> +					 RFTYPE_CTRL_INFO | RFTYPE_RES_CACHE);
>> +}
> 
> Note that even if it only checks the L3 cache resource, this will make the
> file visible to all cache resources, including L2. This is why it is important
> to ensure documentation and implementation also accommodate resources that
> do not support io_alloc.

Agree.
> 
>> +
>>  void resctrl_file_fflags_init(const char *config, unsigned long fflags)
>>  {
>>  	struct rftype *rft;
>> @@ -4249,6 +4459,8 @@ int resctrl_init(void)
>>  
>>  	thread_throttle_mode_init();
>>  
>> +	io_alloc_init();
>> +
>>  	ret = resctrl_mon_resource_init();
>>  	if (ret)
>>  		return ret;
> 
> Reinette
> 

-- 
Thanks
Babu Moger
Re: [PATCH v6 5/8] fs/resctrl: Add user interface to enable/disable io_alloc feature
Posted by Reinette Chatre 3 months, 3 weeks ago
Hi Babu,

On 6/19/25 11:41 AM, Moger, Babu wrote:
> On 6/17/25 22:59, Reinette Chatre wrote:
>> On 6/11/25 2:23 PM, Babu Moger wrote:

...

>>> + */
>>> +static int resctrl_io_alloc_closid_get(struct rdt_resource *r)
>>> +{
>>> +	int num_closids = closids_supported();
>>> +
>>> +	if (resctrl_arch_get_cdp_enabled(r->rid))
>>> +		num_closids *= 2;
>>> +
>>> +	if (num_closids != resctrl_arch_get_num_closid(r))
>>> +		return -ENOSPC;
>>> +
>>> +	return closids_supported() - 1;
>>> +}
>>
>> resctrl_io_alloc_closid_get() seems to be trying to do two things: 
>> - determine what the io_alloc_closid is
>> - make sure the io_alloc_closid is supported
>>
>> I think this should be split into two functions. Once the
>> io_alloc_closid is determined to be supported and io_alloc
>> enabled then there is no reason to keep checking if it is
>> supported whenever the io_alloc_closid is queried.
>>
>> How about simplifying this to:
>>
>> /*
>>  * note how this returns u32 that will eliminate
>>  * unnecessary error checking in usages where io_alloc_closid
>>  * needs to be determined after an resctrl_arch_get_io_alloc_enabled(r)
>>  * already confirmed io_alloc is enabled
>>  * function comment could note that this returns the CLOSID
>>  * required by io_alloc but not whether the CLOSID can
>>  * be supported, for this resctrl_io_alloc_closid_supported() should
>>  * be used.
>>  * Can also note that returned value will always be valid if
>>  * resctrl_arch_get_io_alloc_enabled(r) is true.
>>  */
>> u32 resctrl_io_alloc_closid(struct rdt_resource *r) {
>> 	if (resctrl_arch_get_cdp_enabled(r->rid))
>> 		return resctrl_arch_get_num_closid(r)/2  - 1
>> 	else
>> 		return resctrl_arch_get_num_closid(r) -1
>> }
>>
>> /*
>>  * note how below already makes resctrl's io_alloc implementation
>>  * more generic
>>  */
>> resctrl_io_alloc_closid_supported(u32 io_alloc_closid) {
>> 	return io_alloc_closid <  closids_supported()
>> }
>>
> 
> Sure.
>    Changed the check to
> 
>     return io_alloc_closid == (closids_supported() -1)
> 

resctrl_io_alloc_closid_supported() is not intended to reflect what the
value is but just check if provided value is supported. By changing the
check to above resctrl_io_alloc_closid_supported() does two things again
(what the move to new functions aimed to avoid): checking that the CLOSID
is supported while requiring that it is the highest supported CLOSID.
What issue(s) do you see with using "io_alloc_closid <  closids_supported()"
as the check?

Reinette
Re: [PATCH v6 5/8] fs/resctrl: Add user interface to enable/disable io_alloc feature
Posted by Moger, Babu 3 months, 3 weeks ago
Hi Reinette,

On 6/20/2025 10:53 AM, Reinette Chatre wrote:
> Hi Babu,
> 
> On 6/19/25 11:41 AM, Moger, Babu wrote:
>> On 6/17/25 22:59, Reinette Chatre wrote:
>>> On 6/11/25 2:23 PM, Babu Moger wrote:
> 
> ...
> 
>>>> + */
>>>> +static int resctrl_io_alloc_closid_get(struct rdt_resource *r)
>>>> +{
>>>> +	int num_closids = closids_supported();
>>>> +
>>>> +	if (resctrl_arch_get_cdp_enabled(r->rid))
>>>> +		num_closids *= 2;
>>>> +
>>>> +	if (num_closids != resctrl_arch_get_num_closid(r))
>>>> +		return -ENOSPC;
>>>> +
>>>> +	return closids_supported() - 1;
>>>> +}
>>>
>>> resctrl_io_alloc_closid_get() seems to be trying to do two things:
>>> - determine what the io_alloc_closid is
>>> - make sure the io_alloc_closid is supported
>>>
>>> I think this should be split into two functions. Once the
>>> io_alloc_closid is determined to be supported and io_alloc
>>> enabled then there is no reason to keep checking if it is
>>> supported whenever the io_alloc_closid is queried.
>>>
>>> How about simplifying this to:
>>>
>>> /*
>>>   * note how this returns u32 that will eliminate
>>>   * unnecessary error checking in usages where io_alloc_closid
>>>   * needs to be determined after an resctrl_arch_get_io_alloc_enabled(r)
>>>   * already confirmed io_alloc is enabled
>>>   * function comment could note that this returns the CLOSID
>>>   * required by io_alloc but not whether the CLOSID can
>>>   * be supported, for this resctrl_io_alloc_closid_supported() should
>>>   * be used.
>>>   * Can also note that returned value will always be valid if
>>>   * resctrl_arch_get_io_alloc_enabled(r) is true.
>>>   */
>>> u32 resctrl_io_alloc_closid(struct rdt_resource *r) {
>>> 	if (resctrl_arch_get_cdp_enabled(r->rid))
>>> 		return resctrl_arch_get_num_closid(r)/2  - 1
>>> 	else
>>> 		return resctrl_arch_get_num_closid(r) -1
>>> }
>>>
>>> /*
>>>   * note how below already makes resctrl's io_alloc implementation
>>>   * more generic
>>>   */
>>> resctrl_io_alloc_closid_supported(u32 io_alloc_closid) {
>>> 	return io_alloc_closid <  closids_supported()
>>> }
>>>
>>
>> Sure.
>>     Changed the check to
>>
>>      return io_alloc_closid == (closids_supported() -1)
>>
> 
> resctrl_io_alloc_closid_supported() is not intended to reflect what the
> value is but just check if provided value is supported. By changing the
> check to above resctrl_io_alloc_closid_supported() does two things again
> (what the move to new functions aimed to avoid): checking that the CLOSID
> is supported while requiring that it is the highest supported CLOSID.
> What issue(s) do you see with using "io_alloc_closid <  closids_supported()"
> as the check?

I don't see any issue. It should be fine. Will test and verify it.

thanks
Babu