From: Aaron Kling <webgeek1234@gmail.com>
Instead, unregister the cpufreq device for this fatal fail case.
Signed-off-by: Aaron Kling <webgeek1234@gmail.com>
---
drivers/cpufreq/tegra124-cpufreq.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/cpufreq/tegra124-cpufreq.c b/drivers/cpufreq/tegra124-cpufreq.c
index 514146d98bca2d8aa59980a14dff3487cd8045f6..bc0691e8971f9454def37f489e4a3e244100b9f4 100644
--- a/drivers/cpufreq/tegra124-cpufreq.c
+++ b/drivers/cpufreq/tegra124-cpufreq.c
@@ -168,7 +168,10 @@ static int __maybe_unused tegra124_cpufreq_resume(struct device *dev)
disable_dfll:
clk_disable_unprepare(priv->dfll_clk);
disable_cpufreq:
- disable_cpufreq();
+ if (!IS_ERR(priv->cpufreq_dt_pdev)) {
+ platform_device_unregister(priv->cpufreq_dt_pdev);
+ priv->cpufreq_dt_pdev = ERR_PTR(-ENODEV);
+ }
return err;
}
--
2.48.1
On 09/05/2025 01:04, Aaron Kling via B4 Relay wrote:
> From: Aaron Kling <webgeek1234@gmail.com>
>
> Instead, unregister the cpufreq device for this fatal fail case.
This is not a complete sentence. Seems to be a continuation of the
subject which is not clear to the reader (at least not to me). No
mention of why or what this is fixing, if anything?
>
> Signed-off-by: Aaron Kling <webgeek1234@gmail.com>
> ---
> drivers/cpufreq/tegra124-cpufreq.c | 5 ++++-
> 1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/cpufreq/tegra124-cpufreq.c b/drivers/cpufreq/tegra124-cpufreq.c
> index 514146d98bca2d8aa59980a14dff3487cd8045f6..bc0691e8971f9454def37f489e4a3e244100b9f4 100644
> --- a/drivers/cpufreq/tegra124-cpufreq.c
> +++ b/drivers/cpufreq/tegra124-cpufreq.c
> @@ -168,7 +168,10 @@ static int __maybe_unused tegra124_cpufreq_resume(struct device *dev)
> disable_dfll:
> clk_disable_unprepare(priv->dfll_clk);
> disable_cpufreq:
> - disable_cpufreq();
> + if (!IS_ERR(priv->cpufreq_dt_pdev)) {
> + platform_device_unregister(priv->cpufreq_dt_pdev);
> + priv->cpufreq_dt_pdev = ERR_PTR(-ENODEV);
> + }
So you are proposing to unregister the device in resume? That seems odd.
I see there is no remove for this driver, but I really don't see the
value in this.
Jon
--
nvpublic
On 09-05-25, 12:04, Jon Hunter wrote:
> > diff --git a/drivers/cpufreq/tegra124-cpufreq.c b/drivers/cpufreq/tegra124-cpufreq.c
> > index 514146d98bca2d8aa59980a14dff3487cd8045f6..bc0691e8971f9454def37f489e4a3e244100b9f4 100644
> > --- a/drivers/cpufreq/tegra124-cpufreq.c
> > +++ b/drivers/cpufreq/tegra124-cpufreq.c
> > @@ -168,7 +168,10 @@ static int __maybe_unused tegra124_cpufreq_resume(struct device *dev)
> > disable_dfll:
> > clk_disable_unprepare(priv->dfll_clk);
> > disable_cpufreq:
> > - disable_cpufreq();
> > + if (!IS_ERR(priv->cpufreq_dt_pdev)) {
> > + platform_device_unregister(priv->cpufreq_dt_pdev);
> > + priv->cpufreq_dt_pdev = ERR_PTR(-ENODEV);
> > + }
>
> So you are proposing to unregister the device in resume? That seems odd. I
> see there is no remove for this driver, but I really don't see the value in
> this.
This is the failure path and the driver is trying to disable itself
here. Instead of using the disable_cpufreq() (which isn't designed for
this usecase), I suggested removing the device itself as the driver
will be unusable after this anyway.
--
viresh
On 19/05/2025 11:17, Viresh Kumar wrote:
> On 09-05-25, 12:04, Jon Hunter wrote:
>>> diff --git a/drivers/cpufreq/tegra124-cpufreq.c b/drivers/cpufreq/tegra124-cpufreq.c
>>> index 514146d98bca2d8aa59980a14dff3487cd8045f6..bc0691e8971f9454def37f489e4a3e244100b9f4 100644
>>> --- a/drivers/cpufreq/tegra124-cpufreq.c
>>> +++ b/drivers/cpufreq/tegra124-cpufreq.c
>>> @@ -168,7 +168,10 @@ static int __maybe_unused tegra124_cpufreq_resume(struct device *dev)
>>> disable_dfll:
>>> clk_disable_unprepare(priv->dfll_clk);
>>> disable_cpufreq:
>>> - disable_cpufreq();
>>> + if (!IS_ERR(priv->cpufreq_dt_pdev)) {
>>> + platform_device_unregister(priv->cpufreq_dt_pdev);
>>> + priv->cpufreq_dt_pdev = ERR_PTR(-ENODEV);
>>> + }
>>
>> So you are proposing to unregister the device in resume? That seems odd. I
>> see there is no remove for this driver, but I really don't see the value in
>> this.
>
> This is the failure path and the driver is trying to disable itself
> here. Instead of using the disable_cpufreq() (which isn't designed for
> this usecase), I suggested removing the device itself as the driver
> will be unusable after this anyway.
I understand, but this seems odd. It would be odd that the device may
just disappear after resuming from suspend if it fails to resume. I have
not seen this done for other drivers that fail to resume. Presumably
this is not the only CPU Freq driver that could fail to resume either?
It makes the code messy because now we have more than one place where
the device could be unregistered.
Jon
--
nvpublic
On 20-05-25, 10:53, Jon Hunter wrote: > I understand, but this seems odd. It would be odd that the device may just > disappear after resuming from suspend if it fails to resume. I have not seen > this done for other drivers that fail to resume. Presumably this is not the > only CPU Freq driver that could fail to resume either? > > It makes the code messy because now we have more than one place where the > device could be unregistered. Fair enough. This driver, along with other cpufreq drivers, can fail at multiple places during suspend/resume (and other operations). If something goes wrong, we print an error to inform the user. Should we avoid doing anything else (like everyone else) ? i.e. Just remove the call to disable_cpufreq(), as all later calls will fail anyway. This is the only driver that behaves differently on failures. -- viresh
On 20/05/2025 11:02, Viresh Kumar wrote: > On 20-05-25, 10:53, Jon Hunter wrote: >> I understand, but this seems odd. It would be odd that the device may just >> disappear after resuming from suspend if it fails to resume. I have not seen >> this done for other drivers that fail to resume. Presumably this is not the >> only CPU Freq driver that could fail to resume either? >> >> It makes the code messy because now we have more than one place where the >> device could be unregistered. > > Fair enough. > > This driver, along with other cpufreq drivers, can fail at multiple > places during suspend/resume (and other operations). If something goes > wrong, we print an error to inform the user. Should we avoid doing > anything else (like everyone else) ? i.e. Just remove the call to > disable_cpufreq(), as all later calls will fail anyway. I think that would be fine. Given that the tegra124-cpufreq driver is the parent, if it fails to resume, then I assume that cpufreq-dt driver would not resume either? Has anyone tested this? Jon -- nvpublic
On 05-06-25, 11:34, Jon Hunter wrote: > I think that would be fine. Given that the tegra124-cpufreq driver is the > parent, if it fails to resume, then I assume that cpufreq-dt driver would > not resume either? There is no resume interface in the cpufreq-dt driver, it is the cpufreq core which resumes to doing DVFS and I think it will try to do DVFS even if tegra's driver failed. > Has anyone tested this? -- viresh
On Thu, Jun 5, 2025 at 5:51 AM Viresh Kumar <viresh.kumar@linaro.org> wrote: > > On 05-06-25, 11:34, Jon Hunter wrote: > > I think that would be fine. Given that the tegra124-cpufreq driver is the > > parent, if it fails to resume, then I assume that cpufreq-dt driver would > > not resume either? > > There is no resume interface in the cpufreq-dt driver, it is the cpufreq core > which resumes to doing DVFS and I think it will try to do DVFS even if tegra's > driver failed. In my opinion, I'm thinking the original flow makes more sense. If resume fails, disable cpufreq. Then the subsystem doesn't keep trying and failing and causing who knows what kind of havoc. But if that's still not desired, what should I do to get this moving again? Just drop the error handling entirely, as suggested? Aaron
On 30-06-25, 13:43, Aaron Kling wrote: > In my opinion, I'm thinking the original flow makes more sense. If > resume fails, disable cpufreq. Then the subsystem doesn't keep trying > and failing and causing who knows what kind of havoc. Lets do this and move ahead since we aren't able to conclude for a while. -- viresh
On Tue, May 20, 2025 at 5:02 AM Viresh Kumar <viresh.kumar@linaro.org> wrote: > > On 20-05-25, 10:53, Jon Hunter wrote: > > I understand, but this seems odd. It would be odd that the device may just > > disappear after resuming from suspend if it fails to resume. I have not seen > > this done for other drivers that fail to resume. Presumably this is not the > > only CPU Freq driver that could fail to resume either? > > > > It makes the code messy because now we have more than one place where the > > device could be unregistered. > > Fair enough. > > This driver, along with other cpufreq drivers, can fail at multiple > places during suspend/resume (and other operations). If something goes > wrong, we print an error to inform the user. Should we avoid doing > anything else (like everyone else) ? i.e. Just remove the call to > disable_cpufreq(), as all later calls will fail anyway. > > This is the only driver that behaves differently on failures. > > -- > viresh Is there any consensus on the best way to handle this? I'd like to keep the series moving. Sincerely, Aaron
On 28-05-25, 12:29, Aaron Kling wrote: > Is there any consensus on the best way to handle this? I'd like to > keep the series moving. I was waiting for Jon to provide further feedback here. -- viresh
On Fri, May 9, 2025 at 6:04 AM Jon Hunter <jonathanh@nvidia.com> wrote:
>
>
>
> On 09/05/2025 01:04, Aaron Kling via B4 Relay wrote:
> > From: Aaron Kling <webgeek1234@gmail.com>
> >
> > Instead, unregister the cpufreq device for this fatal fail case.
>
> This is not a complete sentence. Seems to be a continuation of the
> subject which is not clear to the reader (at least not to me). No
> mention of why or what this is fixing, if anything?
I can clean up the subject and message in a new revision. More on the
reasoning below.
> >
> > Signed-off-by: Aaron Kling <webgeek1234@gmail.com>
> > ---
> > drivers/cpufreq/tegra124-cpufreq.c | 5 ++++-
> > 1 file changed, 4 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/cpufreq/tegra124-cpufreq.c b/drivers/cpufreq/tegra124-cpufreq.c
> > index 514146d98bca2d8aa59980a14dff3487cd8045f6..bc0691e8971f9454def37f489e4a3e244100b9f4 100644
> > --- a/drivers/cpufreq/tegra124-cpufreq.c
> > +++ b/drivers/cpufreq/tegra124-cpufreq.c
> > @@ -168,7 +168,10 @@ static int __maybe_unused tegra124_cpufreq_resume(struct device *dev)
> > disable_dfll:
> > clk_disable_unprepare(priv->dfll_clk);
> > disable_cpufreq:
> > - disable_cpufreq();
> > + if (!IS_ERR(priv->cpufreq_dt_pdev)) {
> > + platform_device_unregister(priv->cpufreq_dt_pdev);
> > + priv->cpufreq_dt_pdev = ERR_PTR(-ENODEV);
> > + }
>
> So you are proposing to unregister the device in resume? That seems odd.
> I see there is no remove for this driver, but I really don't see the
> value in this.
This was suggested by Viresh in v1 [0] to replace the call to
disable_cpufreq, which is not currently an exported function. I'm open
to other suggestions.
Sincerely,
Aaron
[0] https://lore.kernel.org/all/20250421054452.fdlrrhtsimyucbup@vireshk-i7/
On Fri, May 9, 2025 at 11:57 AM Aaron Kling <webgeek1234@gmail.com> wrote:
>
> On Fri, May 9, 2025 at 6:04 AM Jon Hunter <jonathanh@nvidia.com> wrote:
> >
> >
> >
> > On 09/05/2025 01:04, Aaron Kling via B4 Relay wrote:
> > > From: Aaron Kling <webgeek1234@gmail.com>
> > >
> > > Instead, unregister the cpufreq device for this fatal fail case.
> >
> > This is not a complete sentence. Seems to be a continuation of the
> > subject which is not clear to the reader (at least not to me). No
> > mention of why or what this is fixing, if anything?
>
> I can clean up the subject and message in a new revision. More on the
> reasoning below.
>
> > >
> > > Signed-off-by: Aaron Kling <webgeek1234@gmail.com>
> > > ---
> > > drivers/cpufreq/tegra124-cpufreq.c | 5 ++++-
> > > 1 file changed, 4 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/cpufreq/tegra124-cpufreq.c b/drivers/cpufreq/tegra124-cpufreq.c
> > > index 514146d98bca2d8aa59980a14dff3487cd8045f6..bc0691e8971f9454def37f489e4a3e244100b9f4 100644
> > > --- a/drivers/cpufreq/tegra124-cpufreq.c
> > > +++ b/drivers/cpufreq/tegra124-cpufreq.c
> > > @@ -168,7 +168,10 @@ static int __maybe_unused tegra124_cpufreq_resume(struct device *dev)
> > > disable_dfll:
> > > clk_disable_unprepare(priv->dfll_clk);
> > > disable_cpufreq:
> > > - disable_cpufreq();
> > > + if (!IS_ERR(priv->cpufreq_dt_pdev)) {
> > > + platform_device_unregister(priv->cpufreq_dt_pdev);
> > > + priv->cpufreq_dt_pdev = ERR_PTR(-ENODEV);
> > > + }
> >
> > So you are proposing to unregister the device in resume? That seems odd.
> > I see there is no remove for this driver, but I really don't see the
> > value in this.
>
> This was suggested by Viresh in v1 [0] to replace the call to
> disable_cpufreq, which is not currently an exported function. I'm open
> to other suggestions.
>
> Sincerely,
> Aaron
>
> [0] https://lore.kernel.org/all/20250421054452.fdlrrhtsimyucbup@vireshk-i7/
Viresh, could you comment here please? As you were the one that
suggested replacing disable_cpufreq with an unregister instead of
exporting said function. I can make the code changes and verify they
work as intended, but I'm not familiar enough with this subsystem to
know what a good option here is. Are there any other cpufreq drivers
that have to handle a resume failure like this?
Sincerely,
Aaron
© 2016 - 2026 Red Hat, Inc.