The function tegra_tsensor_enable_hw_channel() takes the thermal zone
lock to prevent "a potential" race with a call to set_trips()
callback.
The driver must not play with the thermal framework core code
internals.
The tegra_tsensor_enable_hw_channel() is called by:
- the suspend / resume callbacks
- the probe function after the thermal zones are registered
The thermal zone lock taken in this function is supposed to protect
from a call to the set_trips() callback which writes in the same
register.
The potential race is when suspend / resume are called at the same
time as set_trips. This one is called only in
thermal_zone_device_update().
- At suspend time, the 'in_suspend' is set, thus the
thermal_zone_device_update() bails out immediately and set_trips is
not called during this moment.
- At resume time, the thermal zone is updated at PM_POST_SUSPEND,
thus the driver has already set the TH2 temperature.
- At probe time, we register the thermal zone and then we set the
TH2. The only scenario I can see so far is the interrupt fires, the
thermal_zone_update() is called exactly at the moment
tegra_tsensor_enable_hw_channel() a few lines after registering it.
Disable the interrupt before setting up the hw channels and then
enable it. We close the potential race window without using the
thermal zone's lock.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
---
drivers/thermal/tegra/tegra30-tsensor.c | 17 ++++++++++++-----
1 file changed, 12 insertions(+), 5 deletions(-)
diff --git a/drivers/thermal/tegra/tegra30-tsensor.c b/drivers/thermal/tegra/tegra30-tsensor.c
index 4b2ea17910cd..3506c3f3c474 100644
--- a/drivers/thermal/tegra/tegra30-tsensor.c
+++ b/drivers/thermal/tegra/tegra30-tsensor.c
@@ -359,9 +359,6 @@ static int tegra_tsensor_enable_hw_channel(const struct tegra_tsensor *ts,
tegra_tsensor_get_hw_channel_trips(tzd, &hot_trip, &crit_trip);
- /* prevent potential racing with tegra_tsensor_set_trips() */
- mutex_lock(&tzd->lock);
-
dev_info_once(ts->dev, "ch%u: PMC emergency shutdown trip set to %dC\n",
id, DIV_ROUND_CLOSEST(crit_trip, 1000));
@@ -404,8 +401,6 @@ static int tegra_tsensor_enable_hw_channel(const struct tegra_tsensor *ts,
val |= FIELD_PREP(TSENSOR_SENSOR0_CONFIG0_INTR_THERMAL_RST_EN, 1);
writel_relaxed(val, tsc->regs + TSENSOR_SENSOR0_CONFIG0);
- mutex_unlock(&tzd->lock);
-
err = thermal_zone_device_enable(tzd);
if (err) {
dev_err(ts->dev, "ch%u: failed to enable zone: %d\n", id, err);
@@ -592,12 +587,24 @@ static int tegra_tsensor_probe(struct platform_device *pdev)
return dev_err_probe(&pdev->dev, err,
"failed to request interrupt\n");
+ /*
+ * Disable the interrupt so set_trips() can not be called
+ * while we are setting up the register
+ * TSENSOR_SENSOR0_CONFIG1. With this we close a potential
+ * race window where we are setting up the TH2 and the
+ * temperature hits TH1 resulting to an update of the
+ * TSENSOR_SENSOR0_CONFIG1 register in the ISR.
+ */
+ disable_irq(irq);
+
for (i = 0; i < ARRAY_SIZE(ts->ch); i++) {
err = tegra_tsensor_enable_hw_channel(ts, i);
if (err)
return err;
}
+ enable_irq(irq);
+
return 0;
}
--
2.34.1
On Tue, Feb 28, 2023 at 12:22:35PM +0100, Daniel Lezcano wrote: > The function tegra_tsensor_enable_hw_channel() takes the thermal zone > lock to prevent "a potential" race with a call to set_trips() > callback. > > The driver must not play with the thermal framework core code > internals. > > The tegra_tsensor_enable_hw_channel() is called by: > > - the suspend / resume callbacks > - the probe function after the thermal zones are registered > > The thermal zone lock taken in this function is supposed to protect > from a call to the set_trips() callback which writes in the same > register. > > The potential race is when suspend / resume are called at the same > time as set_trips. This one is called only in > thermal_zone_device_update(). > > - At suspend time, the 'in_suspend' is set, thus the > thermal_zone_device_update() bails out immediately and set_trips is > not called during this moment. > > - At resume time, the thermal zone is updated at PM_POST_SUSPEND, > thus the driver has already set the TH2 temperature. > > - At probe time, we register the thermal zone and then we set the > TH2. The only scenario I can see so far is the interrupt fires, the > thermal_zone_update() is called exactly at the moment > tegra_tsensor_enable_hw_channel() a few lines after registering it. > > Disable the interrupt before setting up the hw channels and then > enable it. We close the potential race window without using the > thermal zone's lock. > > Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> > --- > drivers/thermal/tegra/tegra30-tsensor.c | 17 ++++++++++++----- > 1 file changed, 12 insertions(+), 5 deletions(-) > > diff --git a/drivers/thermal/tegra/tegra30-tsensor.c b/drivers/thermal/tegra/tegra30-tsensor.c > index 4b2ea17910cd..3506c3f3c474 100644 > --- a/drivers/thermal/tegra/tegra30-tsensor.c > +++ b/drivers/thermal/tegra/tegra30-tsensor.c > @@ -359,9 +359,6 @@ static int tegra_tsensor_enable_hw_channel(const struct tegra_tsensor *ts, > > tegra_tsensor_get_hw_channel_trips(tzd, &hot_trip, &crit_trip); > > - /* prevent potential racing with tegra_tsensor_set_trips() */ > - mutex_lock(&tzd->lock); > - > dev_info_once(ts->dev, "ch%u: PMC emergency shutdown trip set to %dC\n", > id, DIV_ROUND_CLOSEST(crit_trip, 1000)); > > @@ -404,8 +401,6 @@ static int tegra_tsensor_enable_hw_channel(const struct tegra_tsensor *ts, > val |= FIELD_PREP(TSENSOR_SENSOR0_CONFIG0_INTR_THERMAL_RST_EN, 1); > writel_relaxed(val, tsc->regs + TSENSOR_SENSOR0_CONFIG0); > > - mutex_unlock(&tzd->lock); > - > err = thermal_zone_device_enable(tzd); > if (err) { > dev_err(ts->dev, "ch%u: failed to enable zone: %d\n", id, err); > @@ -592,12 +587,24 @@ static int tegra_tsensor_probe(struct platform_device *pdev) > return dev_err_probe(&pdev->dev, err, > "failed to request interrupt\n"); > > + /* > + * Disable the interrupt so set_trips() can not be called > + * while we are setting up the register > + * TSENSOR_SENSOR0_CONFIG1. With this we close a potential > + * race window where we are setting up the TH2 and the > + * temperature hits TH1 resulting to an update of the > + * TSENSOR_SENSOR0_CONFIG1 register in the ISR. > + */ > + disable_irq(irq); > + > for (i = 0; i < ARRAY_SIZE(ts->ch); i++) { > err = tegra_tsensor_enable_hw_channel(ts, i); > if (err) > return err; > } > > + enable_irq(irq); Instead of disabling and reenabling the interrupt, could we simply move the channel enabling code a couple of lines above, before the IRQ request call? If enabling the channels were to trigger an interrupt, it should get triggered right after requesting the IRQ. Thierry
On 01/03/2023 15:18, Thierry Reding wrote: > On Tue, Feb 28, 2023 at 12:22:35PM +0100, Daniel Lezcano wrote: >> The function tegra_tsensor_enable_hw_channel() takes the thermal zone >> lock to prevent "a potential" race with a call to set_trips() >> callback. >> >> The driver must not play with the thermal framework core code >> internals. >> >> The tegra_tsensor_enable_hw_channel() is called by: >> >> - the suspend / resume callbacks >> - the probe function after the thermal zones are registered >> >> The thermal zone lock taken in this function is supposed to protect >> from a call to the set_trips() callback which writes in the same >> register. >> >> The potential race is when suspend / resume are called at the same >> time as set_trips. This one is called only in >> thermal_zone_device_update(). >> >> - At suspend time, the 'in_suspend' is set, thus the >> thermal_zone_device_update() bails out immediately and set_trips is >> not called during this moment. >> >> - At resume time, the thermal zone is updated at PM_POST_SUSPEND, >> thus the driver has already set the TH2 temperature. >> >> - At probe time, we register the thermal zone and then we set the >> TH2. The only scenario I can see so far is the interrupt fires, the >> thermal_zone_update() is called exactly at the moment >> tegra_tsensor_enable_hw_channel() a few lines after registering it. >> >> Disable the interrupt before setting up the hw channels and then >> enable it. We close the potential race window without using the >> thermal zone's lock. >> >> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> >> --- >> drivers/thermal/tegra/tegra30-tsensor.c | 17 ++++++++++++----- >> 1 file changed, 12 insertions(+), 5 deletions(-) >> >> diff --git a/drivers/thermal/tegra/tegra30-tsensor.c b/drivers/thermal/tegra/tegra30-tsensor.c >> index 4b2ea17910cd..3506c3f3c474 100644 >> --- a/drivers/thermal/tegra/tegra30-tsensor.c >> +++ b/drivers/thermal/tegra/tegra30-tsensor.c >> @@ -359,9 +359,6 @@ static int tegra_tsensor_enable_hw_channel(const struct tegra_tsensor *ts, >> >> tegra_tsensor_get_hw_channel_trips(tzd, &hot_trip, &crit_trip); >> >> - /* prevent potential racing with tegra_tsensor_set_trips() */ >> - mutex_lock(&tzd->lock); >> - >> dev_info_once(ts->dev, "ch%u: PMC emergency shutdown trip set to %dC\n", >> id, DIV_ROUND_CLOSEST(crit_trip, 1000)); >> >> @@ -404,8 +401,6 @@ static int tegra_tsensor_enable_hw_channel(const struct tegra_tsensor *ts, >> val |= FIELD_PREP(TSENSOR_SENSOR0_CONFIG0_INTR_THERMAL_RST_EN, 1); >> writel_relaxed(val, tsc->regs + TSENSOR_SENSOR0_CONFIG0); >> >> - mutex_unlock(&tzd->lock); >> - >> err = thermal_zone_device_enable(tzd); >> if (err) { >> dev_err(ts->dev, "ch%u: failed to enable zone: %d\n", id, err); >> @@ -592,12 +587,24 @@ static int tegra_tsensor_probe(struct platform_device *pdev) >> return dev_err_probe(&pdev->dev, err, >> "failed to request interrupt\n"); >> >> + /* >> + * Disable the interrupt so set_trips() can not be called >> + * while we are setting up the register >> + * TSENSOR_SENSOR0_CONFIG1. With this we close a potential >> + * race window where we are setting up the TH2 and the >> + * temperature hits TH1 resulting to an update of the >> + * TSENSOR_SENSOR0_CONFIG1 register in the ISR. >> + */ >> + disable_irq(irq); >> + >> for (i = 0; i < ARRAY_SIZE(ts->ch); i++) { >> err = tegra_tsensor_enable_hw_channel(ts, i); >> if (err) >> return err; >> } >> >> + enable_irq(irq); > > Instead of disabling and reenabling the interrupt, could we simply move > the channel enabling code a couple of lines above, before the IRQ > request call? If enabling the channels were to trigger an interrupt, it > should get triggered right after requesting the IRQ. Won't we have a spurious interrupt if that situation happen ? -- <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs Follow Linaro: <http://www.facebook.com/pages/Linaro> Facebook | <http://twitter.com/#!/linaroorg> Twitter | <http://www.linaro.org/linaro-blog/> Blog
On Wed, Mar 01, 2023 at 03:35:13PM +0100, Daniel Lezcano wrote: > On 01/03/2023 15:18, Thierry Reding wrote: > > On Tue, Feb 28, 2023 at 12:22:35PM +0100, Daniel Lezcano wrote: > > > The function tegra_tsensor_enable_hw_channel() takes the thermal zone > > > lock to prevent "a potential" race with a call to set_trips() > > > callback. > > > > > > The driver must not play with the thermal framework core code > > > internals. > > > > > > The tegra_tsensor_enable_hw_channel() is called by: > > > > > > - the suspend / resume callbacks > > > - the probe function after the thermal zones are registered > > > > > > The thermal zone lock taken in this function is supposed to protect > > > from a call to the set_trips() callback which writes in the same > > > register. > > > > > > The potential race is when suspend / resume are called at the same > > > time as set_trips. This one is called only in > > > thermal_zone_device_update(). > > > > > > - At suspend time, the 'in_suspend' is set, thus the > > > thermal_zone_device_update() bails out immediately and set_trips is > > > not called during this moment. > > > > > > - At resume time, the thermal zone is updated at PM_POST_SUSPEND, > > > thus the driver has already set the TH2 temperature. > > > > > > - At probe time, we register the thermal zone and then we set the > > > TH2. The only scenario I can see so far is the interrupt fires, the > > > thermal_zone_update() is called exactly at the moment > > > tegra_tsensor_enable_hw_channel() a few lines after registering it. > > > > > > Disable the interrupt before setting up the hw channels and then > > > enable it. We close the potential race window without using the > > > thermal zone's lock. > > > > > > Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> > > > --- > > > drivers/thermal/tegra/tegra30-tsensor.c | 17 ++++++++++++----- > > > 1 file changed, 12 insertions(+), 5 deletions(-) > > > > > > diff --git a/drivers/thermal/tegra/tegra30-tsensor.c b/drivers/thermal/tegra/tegra30-tsensor.c > > > index 4b2ea17910cd..3506c3f3c474 100644 > > > --- a/drivers/thermal/tegra/tegra30-tsensor.c > > > +++ b/drivers/thermal/tegra/tegra30-tsensor.c > > > @@ -359,9 +359,6 @@ static int tegra_tsensor_enable_hw_channel(const struct tegra_tsensor *ts, > > > tegra_tsensor_get_hw_channel_trips(tzd, &hot_trip, &crit_trip); > > > - /* prevent potential racing with tegra_tsensor_set_trips() */ > > > - mutex_lock(&tzd->lock); > > > - > > > dev_info_once(ts->dev, "ch%u: PMC emergency shutdown trip set to %dC\n", > > > id, DIV_ROUND_CLOSEST(crit_trip, 1000)); > > > @@ -404,8 +401,6 @@ static int tegra_tsensor_enable_hw_channel(const struct tegra_tsensor *ts, > > > val |= FIELD_PREP(TSENSOR_SENSOR0_CONFIG0_INTR_THERMAL_RST_EN, 1); > > > writel_relaxed(val, tsc->regs + TSENSOR_SENSOR0_CONFIG0); > > > - mutex_unlock(&tzd->lock); > > > - > > > err = thermal_zone_device_enable(tzd); > > > if (err) { > > > dev_err(ts->dev, "ch%u: failed to enable zone: %d\n", id, err); > > > @@ -592,12 +587,24 @@ static int tegra_tsensor_probe(struct platform_device *pdev) > > > return dev_err_probe(&pdev->dev, err, > > > "failed to request interrupt\n"); > > > + /* > > > + * Disable the interrupt so set_trips() can not be called > > > + * while we are setting up the register > > > + * TSENSOR_SENSOR0_CONFIG1. With this we close a potential > > > + * race window where we are setting up the TH2 and the > > > + * temperature hits TH1 resulting to an update of the > > > + * TSENSOR_SENSOR0_CONFIG1 register in the ISR. > > > + */ > > > + disable_irq(irq); > > > + > > > for (i = 0; i < ARRAY_SIZE(ts->ch); i++) { > > > err = tegra_tsensor_enable_hw_channel(ts, i); > > > if (err) > > > return err; > > > } > > > + enable_irq(irq); > > > > Instead of disabling and reenabling the interrupt, could we simply move > > the channel enabling code a couple of lines above, before the IRQ > > request call? If enabling the channels were to trigger an interrupt, it > > should get triggered right after requesting the IRQ. > > Won't we have a spurious interrupt if that situation happen ? It wouldn't be a spurious interrupt, but rather something that we want to react to. It's ultimately the same result as your patch. In your variant we basically request the IRQ (which automatically enables it), then immediately disable it, enable the HW channels and then reenable the interrupt. If enabling the HW channels were to trigger an interrupt, that interrupt would be raised immediately after enable_irq() as well, as far as I can tell. Thierry
On 02/03/2023 10:45, Thierry Reding wrote: [ ... ] >>>> + /* >>>> + * Disable the interrupt so set_trips() can not be called >>>> + * while we are setting up the register >>>> + * TSENSOR_SENSOR0_CONFIG1. With this we close a potential >>>> + * race window where we are setting up the TH2 and the >>>> + * temperature hits TH1 resulting to an update of the >>>> + * TSENSOR_SENSOR0_CONFIG1 register in the ISR. >>>> + */ >>>> + disable_irq(irq); >>>> + >>>> for (i = 0; i < ARRAY_SIZE(ts->ch); i++) { >>>> err = tegra_tsensor_enable_hw_channel(ts, i); >>>> if (err) >>>> return err; >>>> } >>>> + enable_irq(irq); >>> >>> Instead of disabling and reenabling the interrupt, could we simply move >>> the channel enabling code a couple of lines above, before the IRQ >>> request call? If enabling the channels were to trigger an interrupt, it >>> should get triggered right after requesting the IRQ. >> >> Won't we have a spurious interrupt if that situation happen ? > > It wouldn't be a spurious interrupt, but rather something that we want > to react to. It's ultimately the same result as your patch. In your > variant we basically request the IRQ (which automatically enables it), > then immediately disable it, enable the HW channels and then reenable > the interrupt. If enabling the HW channels were to trigger an interrupt, > that interrupt would be raised immediately after enable_irq() as well, > as far as I can tell. I see, thanks for the clarification -- Daniel -- <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs Follow Linaro: <http://www.facebook.com/pages/Linaro> Facebook | <http://twitter.com/#!/linaroorg> Twitter | <http://www.linaro.org/linaro-blog/> Blog
© 2016 - 2024 Red Hat, Inc.