[PATCH v1] thermal/core: change mm alloc method to avoid kernel warning

Junwen Wu posted 1 patch 4 years ago
drivers/thermal/thermal_sysfs.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
[PATCH v1] thermal/core: change mm alloc method to avoid kernel warning
Posted by Junwen Wu 4 years ago
Very high cooling device max state value makes cooling device stats
buffer allocation fails,like below.Using kzvalloc instead of kzalloc
can avoid this issue.

[    7.392644]WARNING: CPU: 7 PID: 1747 at mm/page_alloc.c:5090 __alloc_pages_nodemask+0x1c0/0x3dc
[    7.392989]Call trace:
[    7.392992]__alloc_pages_nodemask+0x1c0/0x3dc
[    7.392995]kmalloc_order+0x54/0x358
[    7.392997]kmalloc_order_trace+0x34/0x1bc
[    7.393001]__kmalloc+0x5cc/0x9c8
[    7.393005]thermal_cooling_device_setup_sysfs+0x90/0x218
[    7.393008]__thermal_cooling_device_register+0x160/0x7a4
[    7.393012]thermal_of_cooling_device_register+0x14/0x24
[    7.393140]backlight_cdev_register+0x88/0x100 [msm_drm]

Signed-off-by: Junwen Wu <wudaemon@163.com>
---
 drivers/thermal/thermal_sysfs.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/thermal/thermal_sysfs.c b/drivers/thermal/thermal_sysfs.c
index f154bada2906..361e0d0c241b 100644
--- a/drivers/thermal/thermal_sysfs.c
+++ b/drivers/thermal/thermal_sysfs.c
@@ -829,7 +829,7 @@ static void cooling_device_stats_setup(struct thermal_cooling_device *cdev)
 	var += sizeof(*stats->time_in_state) * states;
 	var += sizeof(*stats->trans_table) * states * states;
 
-	stats = kzalloc(var, GFP_KERNEL);
+	stats = kvzalloc(var, GFP_KERNEL);
 	if (!stats)
 		return;
 
@@ -848,7 +848,7 @@ static void cooling_device_stats_setup(struct thermal_cooling_device *cdev)
 
 static void cooling_device_stats_destroy(struct thermal_cooling_device *cdev)
 {
-	kfree(cdev->stats);
+	kvfree(cdev->stats);
 	cdev->stats = NULL;
 }
 
-- 
2.25.1
Re: [PATCH v1] thermal/core: change mm alloc method to avoid kernel warning
Posted by Zhang Rui 4 years ago
On Sun, 2022-04-17 at 12:56 +0000, Junwen Wu wrote:
> Very high cooling device max state value makes cooling device stats
> buffer allocation fails,like below.Using kzvalloc instead of kzalloc
> can avoid this issue.

When a cooling device has big max_state, this patch can get ride of the
warning here, but still we end up with the read failure of the
trans_table in sysfs because it is larger than PAGE_SIZE.

$ cat /sys/class/thermal/cooling_device8/stats/trans_table 
cat: /sys/class/thermal/cooling_device8/stats/trans_table: File too
large

IMO, unless we can fix both places, I'd suggest we skip allocating and
creating the broken trans_table attr. Like a prototype patch below

thanks,
rui

From 2a504596d06a91d6f01d25eee71ebcdeff164e59 Mon Sep 17 00:00:00 2001
From: Zhang Rui <rui.zhang@intel.com>
Date: Tue, 19 Apr 2022 16:40:04 +0800
Subject: [PATCH] thermal: thermal_stats: skip broken trans_table

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
---
 drivers/thermal/thermal_sysfs.c | 40 ++++++++++++++++++++++++++-------
 1 file changed, 32 insertions(+), 8 deletions(-)

diff --git a/drivers/thermal/thermal_sysfs.c b/drivers/thermal/thermal_sysfs.c
index f154bada2906..d917489f89bc 100644
--- a/drivers/thermal/thermal_sysfs.c
+++ b/drivers/thermal/thermal_sysfs.c
@@ -686,7 +686,8 @@ void thermal_cooling_device_stats_update(struct thermal_cooling_device *cdev,
 		goto unlock;
 
 	update_time_in_state(stats);
-	stats->trans_table[stats->state * stats->max_states + new_state]++;
+	if (stats->trans_table)
+		stats->trans_table[stats->state * stats->max_states + new_state]++;
 	stats->state = new_state;
 	stats->total_trans++;
 
@@ -741,8 +742,9 @@ reset_store(struct device *dev, struct device_attribute *attr, const char *buf,
 
 	stats->total_trans = 0;
 	stats->last_time = ktime_get();
-	memset(stats->trans_table, 0,
-	       states * states * sizeof(*stats->trans_table));
+	if (stats->trans_table)
+		memset(stats->trans_table, 0,
+		       states * states * sizeof(*stats->trans_table));
 
 	for (i = 0; i < stats->max_states; i++)
 		stats->time_in_state[i] = ktime_set(0, 0);
@@ -805,7 +807,6 @@ static struct attribute *cooling_device_stats_attrs[] = {
 	&dev_attr_total_trans.attr,
 	&dev_attr_time_in_state_ms.attr,
 	&dev_attr_reset.attr,
-	&dev_attr_trans_table.attr,
 	NULL
 };
 
@@ -814,11 +815,25 @@ static const struct attribute_group cooling_device_stats_attr_group = {
 	.name = "stats"
 };
 
+static struct attribute *cooling_device_stats_ext_attrs[] = {
+	&dev_attr_total_trans.attr,
+	&dev_attr_time_in_state_ms.attr,
+	&dev_attr_reset.attr,
+	&dev_attr_trans_table.attr,
+	NULL
+};
+
+static const struct attribute_group cooling_device_stats_ext_attr_group = {
+	.attrs = cooling_device_stats_ext_attrs,
+	.name = "stats"
+};
+
 static void cooling_device_stats_setup(struct thermal_cooling_device *cdev)
 {
 	struct cooling_dev_stats *stats;
 	unsigned long states;
-	int var;
+	int var, size;
+	bool ext = false;
 
 	if (cdev->ops->get_max_state(cdev, &states))
 		return;
@@ -827,14 +842,20 @@ static void cooling_device_stats_setup(struct thermal_cooling_device *cdev)
 
 	var = sizeof(*stats);
 	var += sizeof(*stats->time_in_state) * states;
-	var += sizeof(*stats->trans_table) * states * states;
+	size = sizeof(*stats->trans_table) * states * states;
+
+	if (var + size < PAGE_SIZE) {
+		ext = true;
+		var += size;
+	}
 
 	stats = kzalloc(var, GFP_KERNEL);
 	if (!stats)
 		return;
 
 	stats->time_in_state = (ktime_t *)(stats + 1);
-	stats->trans_table = (unsigned int *)(stats->time_in_state + states);
+	if (ext)
+		stats->trans_table = (unsigned int *)(stats->time_in_state + states);
 	cdev->stats = stats;
 	stats->last_time = ktime_get();
 	stats->max_states = states;
@@ -843,7 +864,10 @@ static void cooling_device_stats_setup(struct thermal_cooling_device *cdev)
 
 	/* Fill the empty slot left in cooling_device_attr_groups */
 	var = ARRAY_SIZE(cooling_device_attr_groups) - 2;
-	cooling_device_attr_groups[var] = &cooling_device_stats_attr_group;
+	if (ext)
+		cooling_device_attr_groups[var] = &cooling_device_stats_ext_attr_group;
+	else
+		cooling_device_attr_groups[var] = &cooling_device_stats_attr_group;
 }
 
 static void cooling_device_stats_destroy(struct thermal_cooling_device *cdev)
-- 
2.17.1

> 
> [    7.392644]WARNING: CPU: 7 PID: 1747 at mm/page_alloc.c:5090
> __alloc_pages_nodemask+0x1c0/0x3dc
> [    7.392989]Call trace:
> [    7.392992]__alloc_pages_nodemask+0x1c0/0x3dc
> [    7.392995]kmalloc_order+0x54/0x358
> [    7.392997]kmalloc_order_trace+0x34/0x1bc
> [    7.393001]__kmalloc+0x5cc/0x9c8
> [    7.393005]thermal_cooling_device_setup_sysfs+0x90/0x218
> [    7.393008]__thermal_cooling_device_register+0x160/0x7a4
> [    7.393012]thermal_of_cooling_device_register+0x14/0x24
> [    7.393140]backlight_cdev_register+0x88/0x100 [msm_drm]
> 
> 
> Signed-off-by: Junwen Wu <wudaemon@163.com>
> ---
>  drivers/thermal/thermal_sysfs.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/thermal/thermal_sysfs.c
> b/drivers/thermal/thermal_sysfs.c
> index f154bada2906..361e0d0c241b 100644
> --- a/drivers/thermal/thermal_sysfs.c
> +++ b/drivers/thermal/thermal_sysfs.c
> @@ -829,7 +829,7 @@ static void cooling_device_stats_setup(struct
> thermal_cooling_device *cdev)
>  	var += sizeof(*stats->time_in_state) * states;
>  	var += sizeof(*stats->trans_table) * states * states;
>  
> -	stats = kzalloc(var, GFP_KERNEL);
> +	stats = kvzalloc(var, GFP_KERNEL);
>  	if (!stats)
>  		return;
>  
> @@ -848,7 +848,7 @@ static void cooling_device_stats_setup(struct
> thermal_cooling_device *cdev)
>  
>  static void cooling_device_stats_destroy(struct
> thermal_cooling_device *cdev)
>  {
> -	kfree(cdev->stats);
> +	kvfree(cdev->stats);
>  	cdev->stats = NULL;
>  }
>
Re: [PATCH v1] thermal/core: change mm alloc method to avoid kernel warning
Posted by Daniel Lezcano 4 years ago
On 19/04/2022 10:48, Zhang Rui wrote:
> On Sun, 2022-04-17 at 12:56 +0000, Junwen Wu wrote:
>> Very high cooling device max state value makes cooling device stats
>> buffer allocation fails,like below.Using kzvalloc instead of kzalloc
>> can avoid this issue.
> 
> When a cooling device has big max_state, this patch can get ride of the
> warning here, but still we end up with the read failure of the
> trans_table in sysfs because it is larger than PAGE_SIZE.
> 
> $ cat /sys/class/thermal/cooling_device8/stats/trans_table
> cat: /sys/class/thermal/cooling_device8/stats/trans_table: File too
> large
> 
> IMO, unless we can fix both places, I'd suggest we skip allocating and
> creating the broken trans_table attr. Like a prototype patch below

Why not create a thermal debugfs with real useful information and get 
rid of this broken code ?

I've some prototype code I can respin to RFC



-- 
<http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog
Re: [PATCH v1] thermal/core: change mm alloc method to avoid kernel warning
Posted by Zhang Rui 4 years ago
CC Viresh.

On Tue, 2022-04-19 at 11:14 +0200, Daniel Lezcano wrote:
> On 19/04/2022 10:48, Zhang Rui wrote:
> > On Sun, 2022-04-17 at 12:56 +0000, Junwen Wu wrote:
> > > Very high cooling device max state value makes cooling device
> > > stats
> > > buffer allocation fails,like below.Using kzvalloc instead of
> > > kzalloc
> > > can avoid this issue.
> > 
> > When a cooling device has big max_state, this patch can get ride of
> > the
> > warning here, but still we end up with the read failure of the
> > trans_table in sysfs because it is larger than PAGE_SIZE.
> > 
> > $ cat /sys/class/thermal/cooling_device8/stats/trans_table
> > cat: /sys/class/thermal/cooling_device8/stats/trans_table: File too
> > large
> > 
> > IMO, unless we can fix both places, I'd suggest we skip allocating
> > and
> > creating the broken trans_table attr. Like a prototype patch below
> 
> Why not create a thermal debugfs with real useful information and
> get 
> rid of this broken code ?

The idea looks good to me.

thanks,
rui
Re: [PATCH v1] thermal/core: change mm alloc method to avoid kernel warning
Posted by Daniel Lezcano 4 years ago
On 19/04/2022 15:54, Zhang Rui wrote:
> CC Viresh.
> 
> On Tue, 2022-04-19 at 11:14 +0200, Daniel Lezcano wrote:
>> On 19/04/2022 10:48, Zhang Rui wrote:
>>> On Sun, 2022-04-17 at 12:56 +0000, Junwen Wu wrote:
>>>> Very high cooling device max state value makes cooling device
>>>> stats
>>>> buffer allocation fails,like below.Using kzvalloc instead of
>>>> kzalloc
>>>> can avoid this issue.
>>>
>>> When a cooling device has big max_state, this patch can get ride of
>>> the
>>> warning here, but still we end up with the read failure of the
>>> trans_table in sysfs because it is larger than PAGE_SIZE.
>>>
>>> $ cat /sys/class/thermal/cooling_device8/stats/trans_table
>>> cat: /sys/class/thermal/cooling_device8/stats/trans_table: File too
>>> large
>>>
>>> IMO, unless we can fix both places, I'd suggest we skip allocating
>>> and
>>> creating the broken trans_table attr. Like a prototype patch below
>>
>> Why not create a thermal debugfs with real useful information and
>> get
>> rid of this broken code ?
> 
> The idea looks good to me.

What about doing a percentile approach of the state indexes changes 
instead of a raw matrix full of zeros ? So we show the most significant 
transitions, perhaps something like:

99%:	7->6	6->7
98%:	6->5	5->6
95%:	5->4	4->5
90%:	7->5	5->7
80%:	6->4	4->6
70%:	7->1	7->2
50%:	...	...

total:	123456	124573


And another statistics file containing some timings information like the 
total duration in mitigation, and the duration in the most significant 
states above?

-- 
<http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog