[PATCH] thermal/thresholds: Fix thermal lock annotation issue

Daniel Lezcano posted 1 patch 1 month ago
drivers/thermal/thermal_thresholds.c | 13 +++++++++----
1 file changed, 9 insertions(+), 4 deletions(-)
[PATCH] thermal/thresholds: Fix thermal lock annotation issue
Posted by Daniel Lezcano 1 month ago
When the thermal zone is unregistered (thermal sensor module being
unloaded), no lock is held when flushing the thresholds. That results
in a WARN when the lockdep validation is set in the kernel config.

This has been reported by syzbot.

As the thermal zone is in the process of being destroyed, there is no
need to send a notification about purging the thresholds to the
userspace as this one will receive a thermal zone deletion
notification which imply the deletion of all the associated resources
like the trip points or the user thresholds.

Split the function thermal_thresholds_flush() into a lockless one
without notification and its call with the lock annotation followed
with the thresholds flushing notification.

Please note this scenario is unlikely to happen, as the sensor drivers
are usually compiled-in in order to have the thermal framework to be
able to kick in at boot time if needed.

Link: https://lore.kernel.org/all/67124175.050a0220.10f4f4.0012.GAE@google.com
Reported-by: syzbot+f24dd060c1911fe54c85@syzkaller.appspotmail.com
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
---
 drivers/thermal/thermal_thresholds.c | 13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/drivers/thermal/thermal_thresholds.c b/drivers/thermal/thermal_thresholds.c
index ea4aa5a2e86c..2888eabd3efe 100644
--- a/drivers/thermal/thermal_thresholds.c
+++ b/drivers/thermal/thermal_thresholds.c
@@ -20,17 +20,22 @@ int thermal_thresholds_init(struct thermal_zone_device *tz)
 	return 0;
 }
 
-void thermal_thresholds_flush(struct thermal_zone_device *tz)
+static void __thermal_thresholds_flush(struct thermal_zone_device *tz)
 {
 	struct list_head *thresholds = &tz->user_thresholds;
 	struct user_threshold *entry, *tmp;
 
-	lockdep_assert_held(&tz->lock);
-
 	list_for_each_entry_safe(entry, tmp, thresholds, list_node) {
 		list_del(&entry->list_node);
 		kfree(entry);
 	}
+}
+
+void thermal_thresholds_flush(struct thermal_zone_device *tz)
+{
+	lockdep_assert_held(&tz->lock);
+
+	__thermal_thresholds_flush(tz);
 
 	thermal_notify_threshold_flush(tz);
 
@@ -39,7 +44,7 @@ void thermal_thresholds_flush(struct thermal_zone_device *tz)
 
 void thermal_thresholds_exit(struct thermal_zone_device *tz)
 {
-	thermal_thresholds_flush(tz);
+	__thermal_thresholds_flush(tz);
 }
 
 static int __thermal_thresholds_cmp(void *data,
-- 
2.43.0
Re: [PATCH] thermal/thresholds: Fix thermal lock annotation issue
Posted by Daniel Lezcano 1 month ago
Hi,

please note this fix has been written on top of the thermal thresholds 
series, so I don't know how it conflicts if it is applied before

Thanks

   -- D.

On 24/10/2024 12:23, Daniel Lezcano wrote:
> When the thermal zone is unregistered (thermal sensor module being
> unloaded), no lock is held when flushing the thresholds. That results
> in a WARN when the lockdep validation is set in the kernel config.
> 
> This has been reported by syzbot.
> 
> As the thermal zone is in the process of being destroyed, there is no
> need to send a notification about purging the thresholds to the
> userspace as this one will receive a thermal zone deletion
> notification which imply the deletion of all the associated resources
> like the trip points or the user thresholds.
> 
> Split the function thermal_thresholds_flush() into a lockless one
> without notification and its call with the lock annotation followed
> with the thresholds flushing notification.
> 
> Please note this scenario is unlikely to happen, as the sensor drivers
> are usually compiled-in in order to have the thermal framework to be
> able to kick in at boot time if needed.
> 
> Link: https://lore.kernel.org/all/67124175.050a0220.10f4f4.0012.GAE@google.com
> Reported-by: syzbot+f24dd060c1911fe54c85@syzkaller.appspotmail.com
> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
> ---
>   drivers/thermal/thermal_thresholds.c | 13 +++++++++----
>   1 file changed, 9 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/thermal/thermal_thresholds.c b/drivers/thermal/thermal_thresholds.c
> index ea4aa5a2e86c..2888eabd3efe 100644
> --- a/drivers/thermal/thermal_thresholds.c
> +++ b/drivers/thermal/thermal_thresholds.c
> @@ -20,17 +20,22 @@ int thermal_thresholds_init(struct thermal_zone_device *tz)
>   	return 0;
>   }
>   
> -void thermal_thresholds_flush(struct thermal_zone_device *tz)
> +static void __thermal_thresholds_flush(struct thermal_zone_device *tz)
>   {
>   	struct list_head *thresholds = &tz->user_thresholds;
>   	struct user_threshold *entry, *tmp;
>   
> -	lockdep_assert_held(&tz->lock);
> -
>   	list_for_each_entry_safe(entry, tmp, thresholds, list_node) {
>   		list_del(&entry->list_node);
>   		kfree(entry);
>   	}
> +}
> +
> +void thermal_thresholds_flush(struct thermal_zone_device *tz)
> +{
> +	lockdep_assert_held(&tz->lock);
> +
> +	__thermal_thresholds_flush(tz);
>   
>   	thermal_notify_threshold_flush(tz);
>   
> @@ -39,7 +44,7 @@ void thermal_thresholds_flush(struct thermal_zone_device *tz)
>   
>   void thermal_thresholds_exit(struct thermal_zone_device *tz)
>   {
> -	thermal_thresholds_flush(tz);
> +	__thermal_thresholds_flush(tz);
>   }
>   
>   static int __thermal_thresholds_cmp(void *data,


-- 
<http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog
Re: [PATCH] thermal/thresholds: Fix thermal lock annotation issue
Posted by Rafael J. Wysocki 1 month ago
On Thu, Oct 24, 2024 at 12:36 PM Daniel Lezcano
<daniel.lezcano@linaro.org> wrote:
>
>
> Hi,
>
> please note this fix has been written on top of the thermal thresholds
> series, so I don't know how it conflicts if it is applied before

No worries.

Applied (on top of the thresholds series).

> On 24/10/2024 12:23, Daniel Lezcano wrote:
> > When the thermal zone is unregistered (thermal sensor module being
> > unloaded), no lock is held when flushing the thresholds. That results
> > in a WARN when the lockdep validation is set in the kernel config.
> >
> > This has been reported by syzbot.
> >
> > As the thermal zone is in the process of being destroyed, there is no
> > need to send a notification about purging the thresholds to the
> > userspace as this one will receive a thermal zone deletion
> > notification which imply the deletion of all the associated resources
> > like the trip points or the user thresholds.
> >
> > Split the function thermal_thresholds_flush() into a lockless one
> > without notification and its call with the lock annotation followed
> > with the thresholds flushing notification.
> >
> > Please note this scenario is unlikely to happen, as the sensor drivers
> > are usually compiled-in in order to have the thermal framework to be
> > able to kick in at boot time if needed.
> >
> > Link: https://lore.kernel.org/all/67124175.050a0220.10f4f4.0012.GAE@google.com
> > Reported-by: syzbot+f24dd060c1911fe54c85@syzkaller.appspotmail.com
> > Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
> > ---
> >   drivers/thermal/thermal_thresholds.c | 13 +++++++++----
> >   1 file changed, 9 insertions(+), 4 deletions(-)
> >
> > diff --git a/drivers/thermal/thermal_thresholds.c b/drivers/thermal/thermal_thresholds.c
> > index ea4aa5a2e86c..2888eabd3efe 100644
> > --- a/drivers/thermal/thermal_thresholds.c
> > +++ b/drivers/thermal/thermal_thresholds.c
> > @@ -20,17 +20,22 @@ int thermal_thresholds_init(struct thermal_zone_device *tz)
> >       return 0;
> >   }
> >
> > -void thermal_thresholds_flush(struct thermal_zone_device *tz)
> > +static void __thermal_thresholds_flush(struct thermal_zone_device *tz)
> >   {
> >       struct list_head *thresholds = &tz->user_thresholds;
> >       struct user_threshold *entry, *tmp;
> >
> > -     lockdep_assert_held(&tz->lock);
> > -
> >       list_for_each_entry_safe(entry, tmp, thresholds, list_node) {
> >               list_del(&entry->list_node);
> >               kfree(entry);
> >       }
> > +}
> > +
> > +void thermal_thresholds_flush(struct thermal_zone_device *tz)
> > +{
> > +     lockdep_assert_held(&tz->lock);
> > +
> > +     __thermal_thresholds_flush(tz);
> >
> >       thermal_notify_threshold_flush(tz);
> >
> > @@ -39,7 +44,7 @@ void thermal_thresholds_flush(struct thermal_zone_device *tz)
> >
> >   void thermal_thresholds_exit(struct thermal_zone_device *tz)
> >   {
> > -     thermal_thresholds_flush(tz);
> > +     __thermal_thresholds_flush(tz);
> >   }
> >
> >   static int __thermal_thresholds_cmp(void *data,
>
>
> --
> <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs
>
> Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
> <http://twitter.com/#!/linaroorg> Twitter |
> <http://www.linaro.org/linaro-blog/> Blog