[PATCH v3 08/12] PM: EM: Introduce runtime modifiable table

Lukasz Luba posted 12 patches 2 years, 5 months ago
There is a newer version of this series
Posted by Lukasz Luba 2 years, 5 months ago
This patch introduces the new feature: modifiable EM perf_state table.
The new runtime table is populated with new power data to better
reflect the actual power. The power can vary over time, e.g. due to
SoC temperature changes. Higher temperature can increase power values.
In longer-running scenarios, such as gaming or camera use, where other
devices (e.g. GPU, ISP) are also active, the CPU power can change. The
new EM framework is able to address this issue and change the data
at runtime safely.

The runtime modifiable EM data is used by the Energy Aware Scheduler (EAS)
for the task placement. The EAS is the only user of the 'runtime
modifiable EM'. All the other users (thermal, etc.) are still using the
default (basic) EM. This fact drove the design of this feature.

Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
---
 include/linux/energy_model.h |  4 +++-
 kernel/power/energy_model.c  | 26 ++++++++++++++++++++++++++
 2 files changed, 29 insertions(+), 1 deletion(-)

diff --git a/include/linux/energy_model.h b/include/linux/energy_model.h
index 9b67f54ddcf0..cfb1759ffd45 100644
--- a/include/linux/energy_model.h
+++ b/include/linux/energy_model.h
@@ -39,7 +39,7 @@ struct em_perf_state {
 /**
  * struct em_perf_table - Performance states table
  * @state:	List of performance states, in ascending order
- * @rcu:	RCU used for safe access and destruction
+ * @rcu:	RCU used only for runtime modifiable table
  */
 struct em_perf_table {
 	struct em_perf_state *state;
@@ -49,6 +49,7 @@ struct em_perf_table {
 /**
  * struct em_perf_domain - Performance domain
  * @default_table:	Pointer to the default em_perf_table
+ * @runtime_table:	Pointer to the runtime modifiable em_perf_table
  * @nr_perf_states:	Number of performance states
  * @flags:		See "em_perf_domain flags"
  * @cpus:		Cpumask covering the CPUs of the domain. It's here
@@ -64,6 +65,7 @@ struct em_perf_table {
  */
 struct em_perf_domain {
 	struct em_perf_table *default_table;
+	struct em_perf_table __rcu *runtime_table;
 	int nr_perf_states;
 	unsigned long flags;
 	unsigned long cpus[];
diff --git a/kernel/power/energy_model.c b/kernel/power/energy_model.c
index 6cd94f92701d..c2f8a0046f8a 100644
--- a/kernel/power/energy_model.c
+++ b/kernel/power/energy_model.c
@@ -212,6 +212,7 @@ static int em_create_pd(struct device *dev, int nr_states,
 			unsigned long flags)
 {
 	struct em_perf_table *default_table;
+	struct em_perf_table *runtime_table;
 	struct em_perf_domain *pd;
 	struct device *cpu_dev;
 	int cpu, ret, num_cpus;
@@ -244,13 +245,25 @@ static int em_create_pd(struct device *dev, int nr_states,
 
 	pd->default_table = default_table;
 
+	runtime_table = kzalloc(sizeof(*runtime_table), GFP_KERNEL);
+	if (!runtime_table) {
+		kfree(default_table);
+		kfree(pd);
+		return -ENOMEM;
+	}
+
 	ret = em_create_perf_table(dev, pd, nr_states, cb, flags);
 	if (ret) {
 		kfree(default_table);
+		kfree(runtime_table);
 		kfree(pd);
 		return ret;
 	}
 
+	/* Temporarily re-use the memory (until the first modification) */
+	runtime_table->state = default_table->state;
+	rcu_assign_pointer(pd->runtime_table, runtime_table);
+
 	if (_is_cpu_device(dev))
 		for_each_cpu(cpu, cpus) {
 			cpu_dev = get_cpu_device(cpu);
@@ -448,23 +461,36 @@ EXPORT_SYMBOL_GPL(em_dev_register_perf_domain);
  */
 void em_dev_unregister_perf_domain(struct device *dev)
 {
+	struct em_perf_table __rcu *runtime_table;
+	struct em_perf_domain *pd;
+
 	if (IS_ERR_OR_NULL(dev) || !dev->em_pd)
 		return;
 
 	if (_is_cpu_device(dev))
 		return;
 
+	pd = dev->em_pd;
 	/*
 	 * The mutex separates all register/unregister requests and protects
 	 * from potential clean-up/setup issues in the debugfs directories.
 	 * The debugfs directory name is the same as device's name.
 	 */
 	mutex_lock(&em_pd_mutex);
+
 	em_debug_remove_pd(dev);
 
+	runtime_table = pd->runtime_table;
+
+	rcu_assign_pointer(pd->runtime_table, NULL);
+	synchronize_rcu();
+
+	kfree(runtime_table);
+
 	kfree(pd->default_table->state);
 	kfree(pd->default_table);
 	kfree(dev->em_pd);
+
 	dev->em_pd = NULL;
 	mutex_unlock(&em_pd_mutex);
 }
-- 
2.25.1
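
For readers following the thread, the lifecycle this patch sets up (a runtime table that shares the default state array until the first modification, then a pointer swap) can be sketched in userspace C. This is a minimal sketch, not the kernel code: the names perf_state, perf_table, perf_domain, pd_create(), pd_update_power(), pd_runtime_power() and pd_destroy() are hypothetical, C11 atomics stand in for rcu_assign_pointer()/rcu_dereference(), and the update path is an assumption, since this patch does not yet add one. There is no grace period here; the sketch is single-threaded, so the old table is freed right after the swap.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct perf_state { unsigned long frequency; unsigned long power; };
struct perf_table { struct perf_state *state; };
struct perf_domain {
	struct perf_table *default_table;
	_Atomic(struct perf_table *) runtime_table; /* stand-in for __rcu */
	int nr_perf_states;
};

/* Create a domain; the runtime table initially shares the default states. */
static struct perf_domain *pd_create(const struct perf_state *states, int nr)
{
	struct perf_domain *pd = calloc(1, sizeof(*pd));
	struct perf_table *def = calloc(1, sizeof(*def));
	struct perf_table *rt = calloc(1, sizeof(*rt));

	if (!pd || !def || !rt)
		goto err;
	def->state = malloc((size_t)nr * sizeof(*def->state));
	if (!def->state)
		goto err;
	memcpy(def->state, states, (size_t)nr * sizeof(*states));

	rt->state = def->state; /* shared until the first modification */
	pd->default_table = def;
	pd->nr_perf_states = nr;
	atomic_init(&pd->runtime_table, rt);
	return pd;
err:
	free(rt);
	free(def);
	free(pd);
	return NULL;
}

/* Reader side: load the current runtime table, then read from it. */
static unsigned long pd_runtime_power(struct perf_domain *pd, int idx)
{
	struct perf_table *rt =
		atomic_load_explicit(&pd->runtime_table, memory_order_acquire);

	return rt->state[idx].power;
}

/* Writer side (assumed): build a private copy, publish it, free the old one. */
static int pd_update_power(struct perf_domain *pd, const unsigned long *power)
{
	size_t n = (size_t)pd->nr_perf_states;
	struct perf_table *new_rt, *old_rt;

	assert(power);
	new_rt = calloc(1, sizeof(*new_rt));
	if (!new_rt)
		return -1;
	new_rt->state = malloc(n * sizeof(*new_rt->state));
	if (!new_rt->state) {
		free(new_rt);
		return -1;
	}
	memcpy(new_rt->state, pd->default_table->state, n * sizeof(*new_rt->state));
	for (size_t i = 0; i < n; i++)
		new_rt->state[i].power = power[i];

	old_rt = atomic_exchange_explicit(&pd->runtime_table, new_rt,
					  memory_order_acq_rel);
	/* The kernel would wait for a grace period here before freeing. */
	if (old_rt->state != pd->default_table->state)
		free(old_rt->state);
	free(old_rt);
	return 0;
}

/* Tear down: unpublish the runtime table, then free everything. */
static void pd_destroy(struct perf_domain *pd)
{
	struct perf_table *rt = atomic_exchange(&pd->runtime_table, NULL);

	if (rt) {
		if (rt->state != pd->default_table->state)
			free(rt->state);
		free(rt);
	}
	free(pd->default_table->state);
	free(pd->default_table);
	free(pd);
}
```

The default table is never written after creation in this sketch, which matches the commit message's point that the other users (thermal, etc.) keep seeing the default EM while only the runtime copy changes.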
Re: [PATCH v3 08/12] PM: EM: Introduce runtime modifiable table
Posted by Dietmar Eggemann 2 years, 4 months ago
On 21/07/2023 17:50, Lukasz Luba wrote:
> This patch introduces the new feature: modifiable EM perf_state table.

nit pick: The first sentence doesn't add any information. I would skip it.

[...]

> The runtime modifiable EM data is used by the Energy Aware Scheduler (EAS)
> for the task placement. The EAS is the only user of the 'runtime
> modifiable EM'. 

The runtime modifiable EM is currently only used ...
Then you can skip the next sentence: "The EAS is the only user ..."

> All the other users (thermal, etc.) are still using the
> default (basic) EM. This fact drove the design of this feature.

[...]
Re: [PATCH v3 08/12] PM: EM: Introduce runtime modifiable table
Posted by Lukasz Luba 2 years, 4 months ago

On 8/16/23 14:05, Dietmar Eggemann wrote:
> On 21/07/2023 17:50, Lukasz Luba wrote:
>> This patch introduces the new feature: modifiable EM perf_state table.
> 
> nit pick: The first sentence doesn't add any information. I would skip it.
> 
> [...]
> 
>> The runtime modifiable EM data is used by the Energy Aware Scheduler (EAS)
>> for the task placement. The EAS is the only user of the 'runtime
>> modifiable EM'.
> 
> The runtime modifiable EM is currently only used ...
> Then you can skip the next sentence: "The EAS is the only user ..."
> 
>> All the other users (thermal, etc.) are still using the
>> default (basic) EM. This fact drove the design of this feature.
> 
> [...]
> 

Thanks, I'll remove them in the next version.