[PATCH 06/33] ACPI / PPTT: Add a helper to fill a cpumask from a cache_id

James Morse posted 33 patches 1 month, 1 week ago
There is a newer version of this series
[PATCH 06/33] ACPI / PPTT: Add a helper to fill a cpumask from a cache_id
Posted by James Morse 1 month, 1 week ago
MPAM identifies CPUs by the cache_id in the PPTT cache structure.

The driver needs to know which CPUs are associated with the cache,
the CPUs may not all be online, so cacheinfo does not have the
information.

Add a helper to pull this information out of the PPTT.

CC: Rohit Mathew <Rohit.Mathew@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
---
Changes since RFC:
 * acpi_count_levels() now returns a value.
 * Converted the table-get stuff to use Jonathan's cleanup helper.
 * Dropped Sudeep's Review tag due to the cleanup change.
---
 drivers/acpi/pptt.c  | 62 ++++++++++++++++++++++++++++++++++++++++++++
 include/linux/acpi.h |  6 +++++
 2 files changed, 68 insertions(+)

diff --git a/drivers/acpi/pptt.c b/drivers/acpi/pptt.c
index 660457644a5b..cb93a9a7f9b6 100644
--- a/drivers/acpi/pptt.c
+++ b/drivers/acpi/pptt.c
@@ -971,3 +971,65 @@ int find_acpi_cache_level_from_id(u32 cache_id)
 
 	return -ENOENT;
 }
+
+/**
+ * acpi_pptt_get_cpumask_from_cache_id() - Get the cpus associated with the
+ *					   specified cache
+ * @cache_id: The id field of the unified cache
+ * @cpus: Where to build the cpumask
+ *
+ * Determine which CPUs are below this cache in the PPTT. This allows the property
+ * to be found even if the CPUs are offline.
+ *
+ * The PPTT table must be rev 3 or later,
+ *
+ * Return: -ENOENT if the PPTT doesn't exist, or the cache cannot be found.
+ * Otherwise returns 0 and sets the cpus in the provided cpumask.
+ */
+int acpi_pptt_get_cpumask_from_cache_id(u32 cache_id, cpumask_t *cpus)
+{
+	u32 acpi_cpu_id;
+	int level, cpu, num_levels;
+	struct acpi_pptt_cache *cache;
+	struct acpi_pptt_cache_v1 *cache_v1;
+	struct acpi_pptt_processor *cpu_node;
+	struct acpi_table_header *table __free(acpi_table) = acpi_get_table_ret(ACPI_SIG_PPTT, 0);
+
+	cpumask_clear(cpus);
+
+	if (IS_ERR(table))
+		return -ENOENT;
+
+	if (table->revision < 3)
+		return -ENOENT;
+
+	/*
+	 * If we found the cache first, we'd still need to walk from each cpu.
+	 */
+	for_each_possible_cpu(cpu) {
+		acpi_cpu_id = get_acpi_id_for_cpu(cpu);
+		cpu_node = acpi_find_processor_node(table, acpi_cpu_id);
+		if (!cpu_node)
+			return 0;
+		num_levels = acpi_count_levels(table, cpu_node, NULL);
+
+		/* Start at 1 for L1 */
+		for (level = 1; level <= num_levels; level++) {
+			cache = acpi_find_cache_node(table, acpi_cpu_id,
+						     ACPI_PPTT_CACHE_TYPE_UNIFIED,
+						     level, &cpu_node);
+			if (!cache)
+				continue;
+
+			cache_v1 = ACPI_ADD_PTR(struct acpi_pptt_cache_v1,
+						cache,
+						sizeof(struct acpi_pptt_cache));
+
+			if (cache->flags & ACPI_PPTT_CACHE_ID_VALID &&
+			    cache_v1->cache_id == cache_id)
+				cpumask_set_cpu(cpu, cpus);
+		}
+	}
+
+	return 0;
+}
diff --git a/include/linux/acpi.h b/include/linux/acpi.h
index 30c10b1dcdb2..4ad08f5f1d83 100644
--- a/include/linux/acpi.h
+++ b/include/linux/acpi.h
@@ -1555,6 +1555,7 @@ int find_acpi_cpu_topology_package(unsigned int cpu);
 int find_acpi_cpu_topology_hetero_id(unsigned int cpu);
 void acpi_pptt_get_cpus_from_container(u32 acpi_cpu_id, cpumask_t *cpus);
 int find_acpi_cache_level_from_id(u32 cache_id);
+int acpi_pptt_get_cpumask_from_cache_id(u32 cache_id, cpumask_t *cpus);
 #else
 static inline int acpi_pptt_cpu_is_thread(unsigned int cpu)
 {
@@ -1582,6 +1583,11 @@ static inline int find_acpi_cache_level_from_id(u32 cache_id)
 {
 	return -EINVAL;
 }
+static inline int acpi_pptt_get_cpumask_from_cache_id(u32 cache_id,
+						      cpumask_t *cpus)
+{
+	return -EINVAL;
+}
 #endif
 
 void acpi_arch_init(void);
-- 
2.20.1
Re: [PATCH 06/33] ACPI / PPTT: Add a helper to fill a cpumask from a cache_id
Posted by Dave Martin 1 month, 1 week ago
Hi James,

On Fri, Aug 22, 2025 at 03:29:47PM +0000, James Morse wrote:
> MPAM identifies CPUs by the cache_id in the PPTT cache structure.
> 
> The driver needs to know which CPUs are associated with the cache,
> the CPUs may not all be online, so cacheinfo does not have the
> information.

Nit: cacheinfo lacking the information is not a consequence of the
driver needing it.

Maybe split the sentence:

-> "[...] associated with the cache. The CPUs may not [...]"

> 
> Add a helper to pull this information out of the PPTT.
> 
> CC: Rohit Mathew <Rohit.Mathew@arm.com>
> Signed-off-by: James Morse <james.morse@arm.com>
> Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
> ---
> Changes since RFC:
>  * acpi_count_levels() now returns a value.
>  * Converted the table-get stuff to use Jonathan's cleanup helper.
>  * Dropped Sudeep's Review tag due to the cleanup change.
> ---
>  drivers/acpi/pptt.c  | 62 ++++++++++++++++++++++++++++++++++++++++++++
>  include/linux/acpi.h |  6 +++++
>  2 files changed, 68 insertions(+)
> 
> diff --git a/drivers/acpi/pptt.c b/drivers/acpi/pptt.c
> index 660457644a5b..cb93a9a7f9b6 100644
> --- a/drivers/acpi/pptt.c
> +++ b/drivers/acpi/pptt.c
> @@ -971,3 +971,65 @@ int find_acpi_cache_level_from_id(u32 cache_id)
>  
>  	return -ENOENT;
>  }
> +
> +/**
> + * acpi_pptt_get_cpumask_from_cache_id() - Get the cpus associated with the
> + *					   specified cache
> + * @cache_id: The id field of the unified cache
> + * @cpus: Where to build the cpumask
> + *
> + * Determine which CPUs are below this cache in the PPTT. This allows the property
> + * to be found even if the CPUs are offline.
> + *
> + * The PPTT table must be rev 3 or later,
> + *
> + * Return: -ENOENT if the PPTT doesn't exist, or the cache cannot be found.
> + * Otherwise returns 0 and sets the cpus in the provided cpumask.
> + */
> +int acpi_pptt_get_cpumask_from_cache_id(u32 cache_id, cpumask_t *cpus)
> +{
> +	u32 acpi_cpu_id;
> +	int level, cpu, num_levels;
> +	struct acpi_pptt_cache *cache;
> +	struct acpi_pptt_cache_v1 *cache_v1;
> +	struct acpi_pptt_processor *cpu_node;
> +	struct acpi_table_header *table __free(acpi_table) = acpi_get_table_ret(ACPI_SIG_PPTT, 0);
> +
> +	cpumask_clear(cpus);
> +
> +	if (IS_ERR(table))
> +		return -ENOENT;
> +
> +	if (table->revision < 3)
> +		return -ENOENT;
> +
> +	/*
> +	 * If we found the cache first, we'd still need to walk from each cpu.
> +	 */
> +	for_each_possible_cpu(cpu) {
> +		acpi_cpu_id = get_acpi_id_for_cpu(cpu);
> +		cpu_node = acpi_find_processor_node(table, acpi_cpu_id);
> +		if (!cpu_node)
> +			return 0;
> +		num_levels = acpi_count_levels(table, cpu_node, NULL);
> +
> +		/* Start at 1 for L1 */
> +		for (level = 1; level <= num_levels; level++) {
> +			cache = acpi_find_cache_node(table, acpi_cpu_id,
> +						     ACPI_PPTT_CACHE_TYPE_UNIFIED,
> +						     level, &cpu_node);
> +			if (!cache)
> +				continue;
> +
> +			cache_v1 = ACPI_ADD_PTR(struct acpi_pptt_cache_v1,
> +						cache,
> +						sizeof(struct acpi_pptt_cache));
> +
> +			if (cache->flags & ACPI_PPTT_CACHE_ID_VALID &&
> +			    cache_v1->cache_id == cache_id)
> +				cpumask_set_cpu(cpu, cpus);

Again, it feels like we are repeating the same walk multiple times to
determine how deep the table is (on which point the table is self-
describing anyway), and then again to derive some static property, and
then we are then doing all of that work multiple times to derive
different static properties, etc.

Can we not just walk over the tables once and stash the derived
properties somewhere?

I'm still getting my head around this parsing code, so I'm not saying
that the approach is incorrect here -- just wondering whether there is
a way to make it simpler.

[...]

Cheers
---Dave
Re: [PATCH 06/33] ACPI / PPTT: Add a helper to fill a cpumask from a cache_id
Posted by James Morse 1 month ago
Hi Dave,

On 27/08/2025 11:53, Dave Martin wrote:
> On Fri, Aug 22, 2025 at 03:29:47PM +0000, James Morse wrote:
>> MPAM identifies CPUs by the cache_id in the PPTT cache structure.
>>
>> The driver needs to know which CPUs are associated with the cache,
>> the CPUs may not all be online, so cacheinfo does not have the
>> information.
> 
> Nit: cacheinfo lacking the information is not a consequence of the
> driver needing it.
> 
> Maybe split the sentence:
> 
> -> "[...] associated with the cache. The CPUs may not [...]"

Sure,


>> Add a helper to pull this information out of the PPTT.

>> diff --git a/drivers/acpi/pptt.c b/drivers/acpi/pptt.c
>> index 660457644a5b..cb93a9a7f9b6 100644
>> --- a/drivers/acpi/pptt.c
>> +++ b/drivers/acpi/pptt.c
>> @@ -971,3 +971,65 @@ int find_acpi_cache_level_from_id(u32 cache_id)
>>  
>>  	return -ENOENT;
>>  }
>> +
>> +/**
>> + * acpi_pptt_get_cpumask_from_cache_id() - Get the cpus associated with the
>> + *					   specified cache
>> + * @cache_id: The id field of the unified cache
>> + * @cpus: Where to build the cpumask
>> + *
>> + * Determine which CPUs are below this cache in the PPTT. This allows the property
>> + * to be found even if the CPUs are offline.
>> + *
>> + * The PPTT table must be rev 3 or later,
>> + *
>> + * Return: -ENOENT if the PPTT doesn't exist, or the cache cannot be found.
>> + * Otherwise returns 0 and sets the cpus in the provided cpumask.
>> + */
>> +int acpi_pptt_get_cpumask_from_cache_id(u32 cache_id, cpumask_t *cpus)
>> +{
>> +	u32 acpi_cpu_id;
>> +	int level, cpu, num_levels;
>> +	struct acpi_pptt_cache *cache;
>> +	struct acpi_pptt_cache_v1 *cache_v1;
>> +	struct acpi_pptt_processor *cpu_node;
>> +	struct acpi_table_header *table __free(acpi_table) = acpi_get_table_ret(ACPI_SIG_PPTT, 0);
>> +
>> +	cpumask_clear(cpus);
>> +
>> +	if (IS_ERR(table))
>> +		return -ENOENT;
>> +
>> +	if (table->revision < 3)
>> +		return -ENOENT;
>> +
>> +	/*
>> +	 * If we found the cache first, we'd still need to walk from each cpu.
>> +	 */
>> +	for_each_possible_cpu(cpu) {
>> +		acpi_cpu_id = get_acpi_id_for_cpu(cpu);
>> +		cpu_node = acpi_find_processor_node(table, acpi_cpu_id);
>> +		if (!cpu_node)
>> +			return 0;
>> +		num_levels = acpi_count_levels(table, cpu_node, NULL);
>> +
>> +		/* Start at 1 for L1 */
>> +		for (level = 1; level <= num_levels; level++) {
>> +			cache = acpi_find_cache_node(table, acpi_cpu_id,
>> +						     ACPI_PPTT_CACHE_TYPE_UNIFIED,
>> +						     level, &cpu_node);
>> +			if (!cache)
>> +				continue;
>> +
>> +			cache_v1 = ACPI_ADD_PTR(struct acpi_pptt_cache_v1,
>> +						cache,
>> +						sizeof(struct acpi_pptt_cache));
>> +
>> +			if (cache->flags & ACPI_PPTT_CACHE_ID_VALID &&
>> +			    cache_v1->cache_id == cache_id)
>> +				cpumask_set_cpu(cpu, cpus);

> Again, it feels like we are repeating the same walk multiple times to
> determine how deep the table is (on which point the table is self-
> describing anyway), and then again to derive some static property, and
> then we are then doing all of that work multiple times to derive
> different static properties, etc.
> 
> Can we not just walk over the tables once and stash the derived
> properties somewhere?

That is possible - but its a more invasive change to the PPTT parsing code.
Before the introduction of the leaf flag, the search for a processor also included a
search to check if the discovered node was a leaf.

I think this is trading time - walking over the table multiple times, against the memory
you'd need to de-serialise the tree to find the necessary properties quickly. I think the
reason Jeremy L went this way was because there may never be another request into this
code, so being ready with a quick answer was a waste of memory.

MPAM doesn't change this - all these things are done up front during driver probing, and
the values are cached by the driver.


> I'm still getting my head around this parsing code, so I'm not saying
> that the approach is incorrect here -- just wondering whether there is
> a way to make it simpler.

It's walked at boot, and on cpu-hotplug. Neither are particularly performance critical.

I agree that as platforms get bigger, there will be a tipping point ... I don't think
anyone has complained yet!


Thanks,

James
Re: [PATCH 06/33] ACPI / PPTT: Add a helper to fill a cpumask from a cache_id
Posted by Dave Martin 3 weeks, 3 days ago
Hi,

On Thu, Aug 28, 2025 at 04:58:16PM +0100, James Morse wrote:
> Hi Dave,
> 
> On 27/08/2025 11:53, Dave Martin wrote:
> > On Fri, Aug 22, 2025 at 03:29:47PM +0000, James Morse wrote:
> >> MPAM identifies CPUs by the cache_id in the PPTT cache structure.
> >>
> >> The driver needs to know which CPUs are associated with the cache,
> >> the CPUs may not all be online, so cacheinfo does not have the
> >> information.
> > 
> > Nit: cacheinfo lacking the information is not a consequence of the
> > driver needing it.
> > 
> > Maybe split the sentence:
> > 
> > -> "[...] associated with the cache. The CPUs may not [...]"
> 
> Sure,

OK

> >> diff --git a/drivers/acpi/pptt.c b/drivers/acpi/pptt.c
> >> index 660457644a5b..cb93a9a7f9b6 100644
> >> --- a/drivers/acpi/pptt.c
> >> +++ b/drivers/acpi/pptt.c
> >> @@ -971,3 +971,65 @@ int find_acpi_cache_level_from_id(u32 cache_id)

[...]

> >> + * acpi_pptt_get_cpumask_from_cache_id() - Get the cpus associated with the
> >> + *					   specified cache
> >> + * @cache_id: The id field of the unified cache
> >> + * @cpus: Where to build the cpumask
> >> + *
> >> + * Determine which CPUs are below this cache in the PPTT. This allows the property
> >> + * to be found even if the CPUs are offline.
> >> + *
> >> + * The PPTT table must be rev 3 or later,
> >> + *
> >> + * Return: -ENOENT if the PPTT doesn't exist, or the cache cannot be found.
> >> + * Otherwise returns 0 and sets the cpus in the provided cpumask.
> >> + */
> >> +int acpi_pptt_get_cpumask_from_cache_id(u32 cache_id, cpumask_t *cpus)
> >> +{

[...]

> >> +	/*
> >> +	 * If we found the cache first, we'd still need to walk from each cpu.
> >> +	 */
> >> +	for_each_possible_cpu(cpu) {

[...]

> > Again, it feels like we are repeating the same walk multiple times to
> > determine how deep the table is (on which point the table is self-
> > describing anyway), and then again to derive some static property, and
> > then we are then doing all of that work multiple times to derive
> > different static properties, etc.
> > 
> > Can we not just walk over the tables once and stash the derived
> > properties somewhere?
> 
> That is possible - but its a more invasive change to the PPTT parsing code.
> Before the introduction of the leaf flag, the search for a processor also included a
> search to check if the discovered node was a leaf.
> 
> I think this is trading time - walking over the table multiple times, against the memory
> you'd need to de-serialise the tree to find the necessary properties quickly. I think the
> reason Jeremy L went this way was because there may never be another request into this
> code, so being ready with a quick answer was a waste of memory.
> 
> MPAM doesn't change this - all these things are done up front during driver probing, and
> the values are cached by the driver.

I guess that's true.

> > I'm still getting my head around this parsing code, so I'm not saying
> > that the approach is incorrect here -- just wondering whether there is
> > a way to make it simpler.
> 
> It's walked at boot, and on cpu-hotplug. Neither are particularly performance critical.

Do we do this only for unknown late secondaries (e.g., that haven't
previously come online?)  I haven't gone to track this down but, if not,
this cuts across the assertion that "there may never be another request
into this code".

cpu hotlug is slow in practice, but gratuitous cost on this path should
still be avoided where feasible.

> I agree that as platforms get bigger, there will be a tipping point ... I don't think
> anyone has complained yet!

Ack -- when in ACPI, do as the ACPI folks do, I guess.

Cheers
---Dave
Re: [PATCH 06/33] ACPI / PPTT: Add a helper to fill a cpumask from a cache_id
Posted by James Morse 3 weeks, 2 days ago
Hi Dave,

On 09/09/2025 11:14, Dave Martin wrote:
> On Thu, Aug 28, 2025 at 04:58:16PM +0100, James Morse wrote:
>> On 27/08/2025 11:53, Dave Martin wrote:
>>> On Fri, Aug 22, 2025 at 03:29:47PM +0000, James Morse wrote:
>>>> MPAM identifies CPUs by the cache_id in the PPTT cache structure.
>>>>
>>>> The driver needs to know which CPUs are associated with the cache,
>>>> the CPUs may not all be online, so cacheinfo does not have the
>>>> information.

>>>> diff --git a/drivers/acpi/pptt.c b/drivers/acpi/pptt.c
>>>> index 660457644a5b..cb93a9a7f9b6 100644
>>>> --- a/drivers/acpi/pptt.c
>>>> +++ b/drivers/acpi/pptt.c
>>>> @@ -971,3 +971,65 @@ int find_acpi_cache_level_from_id(u32 cache_id)
> 
> [...]
> 
>>>> + * acpi_pptt_get_cpumask_from_cache_id() - Get the cpus associated with the
>>>> + *					   specified cache
>>>> + * @cache_id: The id field of the unified cache
>>>> + * @cpus: Where to build the cpumask
>>>> + *
>>>> + * Determine which CPUs are below this cache in the PPTT. This allows the property
>>>> + * to be found even if the CPUs are offline.
>>>> + *
>>>> + * The PPTT table must be rev 3 or later,
>>>> + *
>>>> + * Return: -ENOENT if the PPTT doesn't exist, or the cache cannot be found.
>>>> + * Otherwise returns 0 and sets the cpus in the provided cpumask.
>>>> + */
>>>> +int acpi_pptt_get_cpumask_from_cache_id(u32 cache_id, cpumask_t *cpus)
>>>> +{
> 
> [...]
> 
>>>> +	/*
>>>> +	 * If we found the cache first, we'd still need to walk from each cpu.
>>>> +	 */
>>>> +	for_each_possible_cpu(cpu) {
> 
> [...]
> 
>>> Again, it feels like we are repeating the same walk multiple times to
>>> determine how deep the table is (on which point the table is self-
>>> describing anyway), and then again to derive some static property, and
>>> then we are then doing all of that work multiple times to derive
>>> different static properties, etc.
>>>
>>> Can we not just walk over the tables once and stash the derived
>>> properties somewhere?
>>
>> That is possible - but its a more invasive change to the PPTT parsing code.
>> Before the introduction of the leaf flag, the search for a processor also included a
>> search to check if the discovered node was a leaf.
>>
>> I think this is trading time - walking over the table multiple times, against the memory
>> you'd need to de-serialise the tree to find the necessary properties quickly. I think the
>> reason Jeremy L went this way was because there may never be another request into this
>> code, so being ready with a quick answer was a waste of memory.
>>
>> MPAM doesn't change this - all these things are done up front during driver probing, and
>> the values are cached by the driver.
> 
> I guess that's true.
> 
>>> I'm still getting my head around this parsing code, so I'm not saying
>>> that the approach is incorrect here -- just wondering whether there is
>>> a way to make it simpler.
>>
>> It's walked at boot, and on cpu-hotplug. Neither are particularly performance critical.

> Do we do this only for unknown late secondaries (e.g., that haven't
> previously come online?) 

No, each time a CPU comes online.


> I haven't gone to track this down but, if not,
> this cuts across the assertion that "there may never be another request
> into this code".

CPU hotplug is optional - you don't have to bounce CPUs. It's very common on mobile parts
for power saving. I think its fairly unusual on server parts, once CPUs are online they
stay online.

The cacheinfo code doesn't cache this, it re-reads it every time. That turns out to be
because of PowerPC where some of these properties can be changed while a CPU is offline.
Sure, we could have a Kconfig thing to say ARCH_STATIC_TABLES_ARE_STATIC, but that would
be a different piece of work.
(I've had a couple of stabs at this, but cacheinfo is the shape it needs to be)


> cpu hotlug is slow in practice, but gratuitous cost on this path should
> still be avoided where feasible.
> 
>> I agree that as platforms get bigger, there will be a tipping point ... I don't think
>> anyone has complained yet!
> 
> Ack -- when in ACPI, do as the ACPI folks do, I guess.


Thanks,

James