[PATCH v2] clk: check for disabled clock-provider in of_clk_get_hw_from_clkspec

Heiko Stuebner posted 1 patch 9 months, 3 weeks ago
drivers/clk/clk.c | 4 ++++
1 file changed, 4 insertions(+)
[PATCH v2] clk: check for disabled clock-provider in of_clk_get_hw_from_clkspec
Posted by Heiko Stuebner 9 months, 3 weeks ago
of_clk_get_hw_from_clkspec checks all available clock-providers by
compairing their of-nodes to the one from the clkspec. If no matching
clock-provider is found, the function returns EPROBE_DEFER to cause a
re-check at a later date.

If a matching clock-provider is found, an authoritative answer can be
retrieved from it whether the clock exists or not.

This does not take into account that the clock-provider may never appear,
because it's node is disabled. This can happen for example when a clock
is optional, provided by a separate block which just never gets enabled.

One example of this happening is the rk3588's VOP, which has optional
additional display-clock-supplies coming from PLLs inside the hdmiphy
blocks. These can be used for better rates, but the system will also
work without them.

The problem around that is described in the followups to:
https://lore.kernel.org/dri-devel/20250215-vop2-hdmi1-disp-modes-v1-3-81962a7151d6@collabora.com/

As we already know the of-node of the presumed clock-provider, just add
a check via of_device_is_available whether this is a "valid" device node.
This prevents eternal defer-loops.

Reviewed-by: Sebastian Reichel <sebastian.reichel@collabora.com>
Tested-by: Cristian Ciocaltea <cristian.ciocaltea@collabora.com>
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
---
changes in v2:
- add received Reviewed and Tested tags
- fix two spelling issues (Cristian and Sebastian)

 drivers/clk/clk.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/clk/clk.c b/drivers/clk/clk.c
index cf7720b9172f..50faafbf5dda 100644
--- a/drivers/clk/clk.c
+++ b/drivers/clk/clk.c
@@ -5258,6 +5258,10 @@ of_clk_get_hw_from_clkspec(struct of_phandle_args *clkspec)
 	if (!clkspec)
 		return ERR_PTR(-EINVAL);
 
+	/* Check if node in clkspec is in disabled/fail state */
+	if (!of_device_is_available(clkspec->np))
+		return ERR_PTR(-ENOENT);
+
 	mutex_lock(&of_clk_mutex);
 	list_for_each_entry(provider, &of_clk_providers, link) {
 		if (provider->node == clkspec->np) {
-- 
2.47.2
Re: [PATCH v2] clk: check for disabled clock-provider in of_clk_get_hw_from_clkspec
Posted by Stephen Boyd 9 months, 3 weeks ago
Quoting Heiko Stuebner (2025-02-22 14:37:33)
> of_clk_get_hw_from_clkspec checks all available clock-providers by
> compairing their of-nodes to the one from the clkspec. If no matching
> clock-provider is found, the function returns EPROBE_DEFER to cause a
> re-check at a later date.
> 
> If a matching clock-provider is found, an authoritative answer can be
> retrieved from it whether the clock exists or not.
> 
> This does not take into account that the clock-provider may never appear,
> because it's node is disabled. This can happen for example when a clock
> is optional, provided by a separate block which just never gets enabled.
> 
> One example of this happening is the rk3588's VOP, which has optional
> additional display-clock-supplies coming from PLLs inside the hdmiphy
> blocks. These can be used for better rates, but the system will also
> work without them.
> 
> The problem around that is described in the followups to:
> https://lore.kernel.org/dri-devel/20250215-vop2-hdmi1-disp-modes-v1-3-81962a7151d6@collabora.com/
> 
> As we already know the of-node of the presumed clock-provider, just add
> a check via of_device_is_available whether this is a "valid" device node.
> This prevents eternal defer-loops.
> 
> Reviewed-by: Sebastian Reichel <sebastian.reichel@collabora.com>
> Tested-by: Cristian Ciocaltea <cristian.ciocaltea@collabora.com>
> Signed-off-by: Heiko Stuebner <heiko@sntech.de>
> ---

Applied to clk-next (unless this needs to fix something urgent?)

Please write a unit test (or many). I also wonder if we should use a
different return value so that we don't try to look up the clk by name
(see clk_core_fill_parent_index()). We could go even further and stop
trying to find the clk over and over again too. Maybe -ENODEV can
indicate that and we can cache that parent entry value so we stop
trying.
Re: [PATCH v2] clk: check for disabled clock-provider in of_clk_get_hw_from_clkspec
Posted by Heiko Stübner 9 months, 3 weeks ago
Hi Stephen,

Am Mittwoch, 26. Februar 2025, 23:32:28 MEZ schrieb Stephen Boyd:
> Quoting Heiko Stuebner (2025-02-22 14:37:33)
> > of_clk_get_hw_from_clkspec checks all available clock-providers by
> > compairing their of-nodes to the one from the clkspec. If no matching
> > clock-provider is found, the function returns EPROBE_DEFER to cause a
> > re-check at a later date.
> > 
> > If a matching clock-provider is found, an authoritative answer can be
> > retrieved from it whether the clock exists or not.
> > 
> > This does not take into account that the clock-provider may never appear,
> > because it's node is disabled. This can happen for example when a clock
> > is optional, provided by a separate block which just never gets enabled.
> > 
> > One example of this happening is the rk3588's VOP, which has optional
> > additional display-clock-supplies coming from PLLs inside the hdmiphy
> > blocks. These can be used for better rates, but the system will also
> > work without them.
> > 
> > The problem around that is described in the followups to:
> > https://lore.kernel.org/dri-devel/20250215-vop2-hdmi1-disp-modes-v1-3-81962a7151d6@collabora.com/
> > 
> > As we already know the of-node of the presumed clock-provider, just add
> > a check via of_device_is_available whether this is a "valid" device node.
> > This prevents eternal defer-loops.
> > 
> > Reviewed-by: Sebastian Reichel <sebastian.reichel@collabora.com>
> > Tested-by: Cristian Ciocaltea <cristian.ciocaltea@collabora.com>
> > Signed-off-by: Heiko Stuebner <heiko@sntech.de>
> > ---
> 
> Applied to clk-next (unless this needs to fix something urgent?)

the area where this affects something for me is slated for 6.15, so
personally I see no urge to have this in 6.14 - especially as the effect
is present for so long already and nobody complained.

Though it looks like clk-next hasn't been pushed yet :-) .


> Please write a unit test (or many).

Had to look a bit ... never noticed the kunit dtso files before.
But will look into that :-) .

> I also wonder if we should use a
> different return value so that we don't try to look up the clk by name
> (see clk_core_fill_parent_index()). We could go even further and stop
> trying to find the clk over and over again too. Maybe -ENODEV can
> indicate that and we can cache that parent entry value so we stop
> trying.

Pffff ... no clue :-)

I.e. in the case I have, we're coming from clk_get_optional() [0].
which is supposed to just return NULL if the clock is not found, so at
least for the consumer view, the internals are not fixed and we could have
different "internal" error codes.

Not sure if more direct users of the of_clk_ functions would be affected
though?


In the case above, the optional clock is just a single one coming from a
phy-block, which may probe later (needing defer) or never (if disabled).

As for caching the ENODEV, I'm not sure how often we'd experience that?

Like "normally" you have that one big clock-controller + maybe a number
of smaller ones + maybe some blocks that expose one or two clocks
to one specific user - in my case the hdmi-phy exposing its hdmi-pll
for a nicer rate to the display controller when generating a hdmi output.

Does a case exist where some never-probed clock controller would have
so many clock-consumers that caching that single of-property check
would matter?


Heiko

[0] https://gitlab.freedesktop.org/drm/misc/kernel/-/blob/drm-misc-next/drivers/gpu/drm/rockchip/rockchip_drm_vop2.c#L3742
Re: Re: [PATCH v2] clk: check for disabled clock-provider in of_clk_get_hw_from_clkspec
Posted by Stephen Boyd 9 months, 3 weeks ago
Quoting Heiko Stübner (2025-02-27 06:36:39)
> Hi Stephen,
> 
> Am Mittwoch, 26. Februar 2025, 23:32:28 MEZ schrieb Stephen Boyd:
> > 
> > Applied to clk-next (unless this needs to fix something urgent?)
> 
> the area where this affects something for me is slated for 6.15, so
> personally I see no urge to have this in 6.14 - especially as the effect
> is present for so long already and nobody complained.
> 

Ok cool.

> 
> > I also wonder if we should use a
> > different return value so that we don't try to look up the clk by name
> > (see clk_core_fill_parent_index()). We could go even further and stop
> > trying to find the clk over and over again too. Maybe -ENODEV can
> > indicate that and we can cache that parent entry value so we stop
> > trying.
> 
> Pffff ... no clue :-)
> 
> I.e. in the case I have, we're coming from clk_get_optional() [0].
> which is supposed to just return NULL if the clock is not found, so at
> least for the consumer view, the internals are not fixed and we could have
> different "internal" error codes.
> 
> Not sure if more direct users of the of_clk_ functions would be affected
> though?
> 
> 
> In the case above, the optional clock is just a single one coming from a
> phy-block, which may probe later (needing defer) or never (if disabled).
> 
> As for caching the ENODEV, I'm not sure how often we'd experience that?
> 
> Like "normally" you have that one big clock-controller + maybe a number
> of smaller ones + maybe some blocks that expose one or two clocks
> to one specific user - in my case the hdmi-phy exposing its hdmi-pll
> for a nicer rate to the display controller when generating a hdmi output.
> 
> Does a case exist where some never-probed clock controller would have
> so many clock-consumers that caching that single of-property check
> would matter?

I don't know, nobody has measured it likely because this almost never
happens. I'm happy to not worry about the caching part until it's shown
to be a problem worth fixing.

Is there any point in searching by name in clk_core_fill_parent_index()
though? We've already found the reference in DT, and it isn't available,
so we know that the search by name is wrong. It may actually be harmful
if some parent can be found by name via the fallback path for a disabled
node. I'm thinking of this patch. I also noticed that we could have just
returned NULL from of_clk_get_hw_from_clkspec() and it would work the
same, but this is probably better so that we can build EPROBE_DEFER
logic on top.

----8<----
diff --git a/drivers/clk/clk.c b/drivers/clk/clk.c
index 50faafbf5dda..17aa6e0a8ff7 100644
--- a/drivers/clk/clk.c
+++ b/drivers/clk/clk.c
@@ -475,11 +475,14 @@ of_clk_get_hw_from_clkspec(struct of_phandle_args *clkspec)
  *              #clock-cells = <1>;
  *      };
  *
- * Returns: -ENOENT when the provider can't be found or the clk doesn't
- * exist in the provider or the name can't be found in the DT node or
- * in a clkdev lookup. NULL when the provider knows about the clk but it
- * isn't provided on this system.
- * A valid clk_core pointer when the clk can be found in the provider.
+ * Return:
+ * * %-ENOENT  - The provider can't be found or the clk doesn't
+ *               exist in the provider or the name can't be found in the DT node or
+ *               in a clkdev lookup.
+ * * %-ENODEV  - The provider is disabled or in the failed state.
+ * * %NULL     - The provider knows about the clk but it isn't provided on this
+ *               system.
+ * * A valid clk_core pointer when the clk can be found in the provider.
  */
 static struct clk_core *clk_core_get(struct clk_core *core, u8 p_index)
 {
@@ -493,7 +496,14 @@ static struct clk_core *clk_core_get(struct clk_core *core, u8 p_index)
 
 	if (np && (name || index >= 0) &&
 	    !of_parse_clkspec(np, index, name, &clkspec)) {
-		hw = of_clk_get_hw_from_clkspec(&clkspec);
+		/*
+		 * Skip lookup by name in clk_core_fill_parent_index() if the
+		 * device node is unavailable.
+		 */
+		if (!of_device_is_available(clkspec.np))
+			hw = ERR_PTR(-ENODEV);
+		else
+			hw = of_clk_get_hw_from_clkspec(&clkspec);
 		of_node_put(clkspec.np);
 	} else if (name) {
 		/*