[PATCH net] net: stmmac: dwmac-rk: Ensure clk_phy doesn't contain invalid address

Yao Zi posted 1 patch 4 weeks, 1 day ago
drivers/net/ethernet/stmicro/stmmac/dwmac-rk.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
[PATCH net] net: stmmac: dwmac-rk: Ensure clk_phy doesn't contain invalid address
Posted by Yao Zi 4 weeks, 1 day ago
We must set the clk_phy pointer to NULL to indicate that it isn't available
if the optional phy clock couldn't be obtained. Otherwise the error code
returned by of_clk_get() could be wrongly taken as an address, causing an
invalid pointer dereference when clk_phy is later passed to
clk_prepare_enable().

Fixes: da114122b831 ("net: ethernet: stmmac: dwmac-rk: Make the clk_phy could be used for external phy")
Signed-off-by: Yao Zi <ziyao@disroot.org>
---
 drivers/net/ethernet/stmicro/stmmac/dwmac-rk.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

On next-20250903, the commit being fixed causes an invalid pointer
dereference on the Radxa E20C during probe of dwmac-rk. A typical dmesg
looks like:

[    0.273324] rk_gmac-dwmac ffbe0000.ethernet: IRQ eth_lpi not found
[    0.273888] rk_gmac-dwmac ffbe0000.ethernet: IRQ sfty not found
[    0.274520] rk_gmac-dwmac ffbe0000.ethernet: PTP uses main clock
[    0.275226] rk_gmac-dwmac ffbe0000.ethernet: clock input or output? (output).
[    0.275867] rk_gmac-dwmac ffbe0000.ethernet: Can not read property: tx_delay.
[    0.276491] rk_gmac-dwmac ffbe0000.ethernet: set tx_delay to 0x30
[    0.277026] rk_gmac-dwmac ffbe0000.ethernet: Can not read property: rx_delay.
[    0.278086] rk_gmac-dwmac ffbe0000.ethernet: set rx_delay to 0x10
[    0.278658] rk_gmac-dwmac ffbe0000.ethernet: integrated PHY? (no).
[    0.279249] Unable to handle kernel paging request at virtual address fffffffffffffffe
[    0.279948] Mem abort info:
[    0.280195]   ESR = 0x000000096000006
[    0.280523]   EC = 0x25: DABT (current EL), IL = 32 bits
[    0.280989]   SET = 0, FnV = 0
[    0.281287]   EA = 0, S1PTW = 0
[    0.281574]   FSC = 0x06: level 2 translation fault

where the invalid address is just -ENOENT (-2).
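
For readers less familiar with the kernel's error-pointer convention, here
is a minimal userspace sketch of how a negative errno ends up looking like
an address. The err_ptr(), ptr_err() and is_err() helpers are simplified
stand-ins written for this illustration and only modelled on the kernel's
ERR_PTR(), PTR_ERR() and IS_ERR(); enoent_as_address() is a hypothetical
helper as well.

```c
#include <errno.h>
#include <stdint.h>

/* Assumption: same upper bound the kernel uses for encoded errno values. */
#define MAX_ERRNO 4095

/* Encode a negative errno as a pointer value (stand-in for ERR_PTR()). */
static inline void *err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

/* Recover the errno from an encoded pointer (stand-in for PTR_ERR()). */
static inline long ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* True if the pointer lies in the top MAX_ERRNO bytes of the address space. */
static inline int is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* The raw address a dereference of err_ptr(-ENOENT) would touch. */
static inline uintptr_t enoent_as_address(void)
{
	return (uintptr_t)err_ptr(-ENOENT);
}
```

With ENOENT equal to 2, enoent_as_address() is the all-ones address minus
one, which on a 64-bit system matches the faulting address in the log
above. Note that such a value is not NULL, so a plain NULL check does not
catch it.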

diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac-rk.c b/drivers/net/ethernet/stmicro/stmmac/dwmac-rk.c
index cf619a428664..26ec8ae662a6 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac-rk.c
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac-rk.c
@@ -1414,11 +1414,17 @@ static int rk_gmac_clk_init(struct plat_stmmacenet_data *plat)
 	if (plat->phy_node) {
 		bsp_priv->clk_phy = of_clk_get(plat->phy_node, 0);
 		ret = PTR_ERR_OR_ZERO(bsp_priv->clk_phy);
-		/* If it is not integrated_phy, clk_phy is optional */
+		/*
+		 * If it is not integrated_phy, clk_phy is optional. But we must
+		 * set bsp_priv->clk_phy to NULL if clk_phy isn't provided, or
+		 * the error code could be wrongly taken as a valid pointer.
+		 */
 		if (bsp_priv->integrated_phy) {
 			if (ret)
 				return dev_err_probe(dev, ret, "Cannot get PHY clock\n");
 			clk_set_rate(bsp_priv->clk_phy, 50000000);
+		} else if (ret) {
+			bsp_priv->clk_phy = NULL;
 		}
 	}
 
-- 
2.50.1
Re: [PATCH net] net: stmmac: dwmac-rk: Ensure clk_phy doesn't contain invalid address
Posted by Russell King (Oracle) 4 weeks ago
On Thu, Sep 04, 2025 at 03:12:24AM +0000, Yao Zi wrote:
>  	if (plat->phy_node) {
>  		bsp_priv->clk_phy = of_clk_get(plat->phy_node, 0);
>  		ret = PTR_ERR_OR_ZERO(bsp_priv->clk_phy);
> -		/* If it is not integrated_phy, clk_phy is optional */
> +		/*
> +		 * If it is not integrated_phy, clk_phy is optional. But we must
> +		 * set bsp_priv->clk_phy to NULL if clk_phy isn't proivded, or
> +		 * the error code could be wrongly taken as an invalid pointer.
> +		 */

I'm concerned by this. This code is getting the first clock from the DT
description of the PHY. We don't know what type of PHY it is, or what
the DT description of that PHY might suggest that the first clock would
be.

However, we're getting it and setting it to 50MHz. What if the clock is
not what we think it is?

I'm not sure we should be delving into some other device's DT
properties to then get resources that it _uses_ to then effectively
take control of those resources.

I think we need way more detail on what's going on. Commit da114122b83
merely stated:

    For external phy, clk_phy should be optional, and some external phy
    need the clock input from clk_phy. This patch adds support for setting
    clk_phy for external phy.

If the external PHY requires a clock supplied to it, shouldn't the PHY
driver itself be getting that clock and setting it appropriately?

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!
Re: [PATCH net] net: stmmac: dwmac-rk: Ensure clk_phy doesn't contain invalid address
Posted by Chaoyi Chen 4 weeks ago
On 9/4/2025 6:58 PM, Russell King (Oracle) wrote:
> On Thu, Sep 04, 2025 at 03:12:24AM +0000, Yao Zi wrote:
>>   	if (plat->phy_node) {
>>   		bsp_priv->clk_phy = of_clk_get(plat->phy_node, 0);
>>   		ret = PTR_ERR_OR_ZERO(bsp_priv->clk_phy);
>> -		/* If it is not integrated_phy, clk_phy is optional */
>> +		/*
>> +		 * If it is not integrated_phy, clk_phy is optional. But we must
>> +		 * set bsp_priv->clk_phy to NULL if clk_phy isn't proivded, or
>> +		 * the error code could be wrongly taken as an invalid pointer.
>> +		 */
> I'm concerned by this. This code is getting the first clock from the DT
> description of the PHY. We don't know what type of PHY it is, or what
> the DT description of that PHY might suggest that the first clock would
> be.
>
> However, we're geting it and setting it to 50MHz. What if the clock is
> not what we think it is?

We only set the rate to 50 MHz for integrated_phy, which are all known targets. For external PHYs, we do not set the frequency.

>
> I'm not sure we should be delving in to some other device's DT
> properties to then get resources that it _uses_ to then effectively
> take control those resources.
>
> I think we need way more detail on what's going on. Commit da114122b83
> merely stated:
>
>      For external phy, clk_phy should be optional, and some external phy
>      need the clock input from clk_phy. This patch adds support for setting
>      clk_phy for external phy.
>
> If the external PHY requires a clock supplied to it, shouldn't the PHY
> driver itself be getting that clock and setting it appropriately?
>
Re: [PATCH net] net: stmmac: dwmac-rk: Ensure clk_phy doesn't contain invalid address
Posted by Russell King (Oracle) 4 weeks ago
On Thu, Sep 04, 2025 at 07:03:10PM +0800, Chaoyi Chen wrote:
> 
> On 9/4/2025 6:58 PM, Russell King (Oracle) wrote:
> > On Thu, Sep 04, 2025 at 03:12:24AM +0000, Yao Zi wrote:
> > >   	if (plat->phy_node) {
> > >   		bsp_priv->clk_phy = of_clk_get(plat->phy_node, 0);
> > >   		ret = PTR_ERR_OR_ZERO(bsp_priv->clk_phy);
> > > -		/* If it is not integrated_phy, clk_phy is optional */
> > > +		/*
> > > +		 * If it is not integrated_phy, clk_phy is optional. But we must
> > > +		 * set bsp_priv->clk_phy to NULL if clk_phy isn't proivded, or
> > > +		 * the error code could be wrongly taken as an invalid pointer.
> > > +		 */
> > I'm concerned by this. This code is getting the first clock from the DT
> > description of the PHY. We don't know what type of PHY it is, or what
> > the DT description of that PHY might suggest that the first clock would
> > be.
> > 
> > However, we're geting it and setting it to 50MHz. What if the clock is
> > not what we think it is?
> 
> We only set integrated_phy to 50M, which are all known targets. For external PHYs, we do not perform frequency settings.

Same question concerning enabling and disabling another device's clock
that the other device should be handling.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!
Re: [PATCH net] net: stmmac: dwmac-rk: Ensure clk_phy doesn't contain invalid address
Posted by Russell King (Oracle) 4 weeks ago
On Thu, Sep 04, 2025 at 12:05:19PM +0100, Russell King (Oracle) wrote:
> On Thu, Sep 04, 2025 at 07:03:10PM +0800, Chaoyi Chen wrote:
> > 
> > On 9/4/2025 6:58 PM, Russell King (Oracle) wrote:
> > > On Thu, Sep 04, 2025 at 03:12:24AM +0000, Yao Zi wrote:
> > > >   	if (plat->phy_node) {
> > > >   		bsp_priv->clk_phy = of_clk_get(plat->phy_node, 0);
> > > >   		ret = PTR_ERR_OR_ZERO(bsp_priv->clk_phy);
> > > > -		/* If it is not integrated_phy, clk_phy is optional */
> > > > +		/*
> > > > +		 * If it is not integrated_phy, clk_phy is optional. But we must
> > > > +		 * set bsp_priv->clk_phy to NULL if clk_phy isn't proivded, or
> > > > +		 * the error code could be wrongly taken as an invalid pointer.
> > > > +		 */
> > > I'm concerned by this. This code is getting the first clock from the DT
> > > description of the PHY. We don't know what type of PHY it is, or what
> > > the DT description of that PHY might suggest that the first clock would
> > > be.
> > > 
> > > However, we're geting it and setting it to 50MHz. What if the clock is
> > > not what we think it is?
> > 
> > We only set integrated_phy to 50M, which are all known targets. For external PHYs, we do not perform frequency settings.
> 
> Same question concerning enabling and disabling another device's clock
> that the other device should be handling.

Let me be absolutely clear: I consider *everything* that is going on
with clk_phy here to be a dirty hack.

Resources used by a device that has its own driver should be managed
by _that_ driver alone, not by some other random driver.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!
Re: [PATCH net] net: stmmac: dwmac-rk: Ensure clk_phy doesn't contain invalid address
Posted by Yao Zi 3 weeks, 5 days ago
On Thu, Sep 04, 2025 at 12:07:26PM +0100, Russell King (Oracle) wrote:
> On Thu, Sep 04, 2025 at 12:05:19PM +0100, Russell King (Oracle) wrote:
> > On Thu, Sep 04, 2025 at 07:03:10PM +0800, Chaoyi Chen wrote:
> > > 
> > > On 9/4/2025 6:58 PM, Russell King (Oracle) wrote:
> > > > On Thu, Sep 04, 2025 at 03:12:24AM +0000, Yao Zi wrote:
> > > > >   	if (plat->phy_node) {
> > > > >   		bsp_priv->clk_phy = of_clk_get(plat->phy_node, 0);
> > > > >   		ret = PTR_ERR_OR_ZERO(bsp_priv->clk_phy);
> > > > > -		/* If it is not integrated_phy, clk_phy is optional */
> > > > > +		/*
> > > > > +		 * If it is not integrated_phy, clk_phy is optional. But we must
> > > > > +		 * set bsp_priv->clk_phy to NULL if clk_phy isn't proivded, or
> > > > > +		 * the error code could be wrongly taken as an invalid pointer.
> > > > > +		 */
> > > > I'm concerned by this. This code is getting the first clock from the DT
> > > > description of the PHY. We don't know what type of PHY it is, or what
> > > > the DT description of that PHY might suggest that the first clock would
> > > > be.
> > > > 
> > > > However, we're geting it and setting it to 50MHz. What if the clock is
> > > > not what we think it is?
> > > 
> > > We only set integrated_phy to 50M, which are all known targets. For external PHYs, we do not perform frequency settings.
> > 
> > Same question concerning enabling and disabling another device's clock
> > that the other device should be handling.
> 
> Let me be absolutely clear: I consider *everything* that is going on
> with clk_phy here to be a dirty hack.
> 
> Resources used by a device that has its own driver should be managed
> by _that_ driver alone, not by some other random driver.

Agreed. Should we drop the patch, or fix it up for now to at
least prevent the oops? Chaoyi, I guess there's no user of the feature
for now, is there?

Best regards,
Yao Zi
Re: [PATCH net] net: stmmac: dwmac-rk: Ensure clk_phy doesn't contain invalid address
Posted by Chaoyi Chen 3 weeks, 5 days ago
On 9/6/2025 1:36 PM, Yao Zi wrote:

> On Thu, Sep 04, 2025 at 12:07:26PM +0100, Russell King (Oracle) wrote:
>> On Thu, Sep 04, 2025 at 12:05:19PM +0100, Russell King (Oracle) wrote:
>>> On Thu, Sep 04, 2025 at 07:03:10PM +0800, Chaoyi Chen wrote:
>>>> On 9/4/2025 6:58 PM, Russell King (Oracle) wrote:
>>>>> On Thu, Sep 04, 2025 at 03:12:24AM +0000, Yao Zi wrote:
>>>>>>    	if (plat->phy_node) {
>>>>>>    		bsp_priv->clk_phy = of_clk_get(plat->phy_node, 0);
>>>>>>    		ret = PTR_ERR_OR_ZERO(bsp_priv->clk_phy);
>>>>>> -		/* If it is not integrated_phy, clk_phy is optional */
>>>>>> +		/*
>>>>>> +		 * If it is not integrated_phy, clk_phy is optional. But we must
>>>>>> +		 * set bsp_priv->clk_phy to NULL if clk_phy isn't proivded, or
>>>>>> +		 * the error code could be wrongly taken as an invalid pointer.
>>>>>> +		 */
>>>>> I'm concerned by this. This code is getting the first clock from the DT
>>>>> description of the PHY. We don't know what type of PHY it is, or what
>>>>> the DT description of that PHY might suggest that the first clock would
>>>>> be.
>>>>>
>>>>> However, we're geting it and setting it to 50MHz. What if the clock is
>>>>> not what we think it is?
>>>> We only set integrated_phy to 50M, which are all known targets. For external PHYs, we do not perform frequency settings.
>>> Same question concerning enabling and disabling another device's clock
>>> that the other device should be handling.
>> Let me be absolutely clear: I consider *everything* that is going on
>> with clk_phy here to be a dirty hack.
>>
>> Resources used by a device that has its own driver should be managed
>> by _that_ driver alone, not by some other random driver.
> Agree on this. Should we drop the patch, or fix it up for now to at
> least prevent the oops? Chaoyi, I guess there's no user of the feature
> for now, is it?

This at least needs fixing. Sorry, I have no idea how to implement this in the PHY.
Re: [PATCH net] net: stmmac: dwmac-rk: Ensure clk_phy doesn't contain invalid address
Posted by Sebastian Reichel 3 weeks, 5 days ago
Hi,

On Sat, Sep 06, 2025 at 02:26:31PM +0800, Chaoyi Chen wrote:
> On 9/6/2025 1:36 PM, Yao Zi wrote:
> 
> > On Thu, Sep 04, 2025 at 12:07:26PM +0100, Russell King (Oracle) wrote:
> > > On Thu, Sep 04, 2025 at 12:05:19PM +0100, Russell King (Oracle) wrote:
> > > > On Thu, Sep 04, 2025 at 07:03:10PM +0800, Chaoyi Chen wrote:
> > > > > On 9/4/2025 6:58 PM, Russell King (Oracle) wrote:
> > > > > > On Thu, Sep 04, 2025 at 03:12:24AM +0000, Yao Zi wrote:
> > > > > > >    	if (plat->phy_node) {
> > > > > > >    		bsp_priv->clk_phy = of_clk_get(plat->phy_node, 0);
> > > > > > >    		ret = PTR_ERR_OR_ZERO(bsp_priv->clk_phy);
> > > > > > > -		/* If it is not integrated_phy, clk_phy is optional */
> > > > > > > +		/*
> > > > > > > +		 * If it is not integrated_phy, clk_phy is optional. But we must
> > > > > > > +		 * set bsp_priv->clk_phy to NULL if clk_phy isn't proivded, or
> > > > > > > +		 * the error code could be wrongly taken as an invalid pointer.
> > > > > > > +		 */
> > > > > > I'm concerned by this. This code is getting the first clock from the DT
> > > > > > description of the PHY. We don't know what type of PHY it is, or what
> > > > > > the DT description of that PHY might suggest that the first clock would
> > > > > > be.
> > > > > > 
> > > > > > However, we're geting it and setting it to 50MHz. What if the clock is
> > > > > > not what we think it is?
> > > > > We only set integrated_phy to 50M, which are all known targets. For external PHYs, we do not perform frequency settings.
> > > > Same question concerning enabling and disabling another device's clock
> > > > that the other device should be handling.
> > > Let me be absolutely clear: I consider *everything* that is going on
> > > with clk_phy here to be a dirty hack.
> > > 
> > > Resources used by a device that has its own driver should be managed
> > > by _that_ driver alone, not by some other random driver.
> > Agree on this. Should we drop the patch, or fix it up for now to at
> > least prevent the oops? Chaoyi, I guess there's no user of the feature
> > for now, is it?
> 
> This at least needs fixing. Sorry, I have no idea how to implement
> this in the PHY.

I think the proper fix is to revert da114122b8314 ("net: ethernet:
stmmac: dwmac-rk: Make the clk_phy could be used for external phy"),
which has only recently been merged. External PHYs should reference
their clocks themselves instead of the MAC doing it.

Chaoyi Chen: Have a look at the ROCK 4D devicetree:

&mdio0 {
	rgmii_phy0: ethernet-phy@1 {
		compatible = "ethernet-phy-id001c.c916";
		reg = <0x1>;
		clocks = <&cru REFCLKO25M_GMAC0_OUT>;
		assigned-clocks = <&cru REFCLKO25M_GMAC0_OUT>;
		assigned-clock-rates = <25000000>;
		...
	};
};

The clock is enabled by the RTL8211F PHY driver (check for
devm_clk_get_optional_enabled in drivers/net/phy/realtek/realtek_main.c),
as the PHY is the one needing the clock and not the Rockchip MAC. For
this to work it is important to set the right compatible string, so
that the kernel can probe the right driver without needing to read the
identification registers (as that would require the clock to be already
configured before the driver is probed).

Greetings,

-- Sebastian
Re: [PATCH net] net: stmmac: dwmac-rk: Ensure clk_phy doesn't contain invalid address
Posted by Chaoyi Chen 2 weeks, 4 days ago
Hi Sebastian,

On 9/7/2025 4:25 AM, Sebastian Reichel wrote:
> Hi,
>
> On Sat, Sep 06, 2025 at 02:26:31PM +0800, Chaoyi Chen wrote:
>> On 9/6/2025 1:36 PM, Yao Zi wrote:
>>
>>> On Thu, Sep 04, 2025 at 12:07:26PM +0100, Russell King (Oracle) wrote:
>>>> On Thu, Sep 04, 2025 at 12:05:19PM +0100, Russell King (Oracle) wrote:
>>>>> On Thu, Sep 04, 2025 at 07:03:10PM +0800, Chaoyi Chen wrote:
>>>>>> On 9/4/2025 6:58 PM, Russell King (Oracle) wrote:
>>>>>>> On Thu, Sep 04, 2025 at 03:12:24AM +0000, Yao Zi wrote:
>>>>>>>>     	if (plat->phy_node) {
>>>>>>>>     		bsp_priv->clk_phy = of_clk_get(plat->phy_node, 0);
>>>>>>>>     		ret = PTR_ERR_OR_ZERO(bsp_priv->clk_phy);
>>>>>>>> -		/* If it is not integrated_phy, clk_phy is optional */
>>>>>>>> +		/*
>>>>>>>> +		 * If it is not integrated_phy, clk_phy is optional. But we must
>>>>>>>> +		 * set bsp_priv->clk_phy to NULL if clk_phy isn't proivded, or
>>>>>>>> +		 * the error code could be wrongly taken as an invalid pointer.
>>>>>>>> +		 */
>>>>>>> I'm concerned by this. This code is getting the first clock from the DT
>>>>>>> description of the PHY. We don't know what type of PHY it is, or what
>>>>>>> the DT description of that PHY might suggest that the first clock would
>>>>>>> be.
>>>>>>>
>>>>>>> However, we're geting it and setting it to 50MHz. What if the clock is
>>>>>>> not what we think it is?
>>>>>> We only set integrated_phy to 50M, which are all known targets. For external PHYs, we do not perform frequency settings.
>>>>> Same question concerning enabling and disabling another device's clock
>>>>> that the other device should be handling.
>>>> Let me be absolutely clear: I consider *everything* that is going on
>>>> with clk_phy here to be a dirty hack.
>>>>
>>>> Resources used by a device that has its own driver should be managed
>>>> by _that_ driver alone, not by some other random driver.
>>> Agree on this. Should we drop the patch, or fix it up for now to at
>>> least prevent the oops? Chaoyi, I guess there's no user of the feature
>>> for now, is it?
>> This at least needs fixing. Sorry, I have no idea how to implement
>> this in the PHY.
> I think the proper fix is to revert da114122b8314 ("net: ethernet:
> stmmac: dwmac-rk: Make the clk_phy could be used for external phy"),
> which has only recently been merged. External PHYs should reference
> their clocks themself instead of the MAC doing it.
>
> Chaoyi Chen: Have a look at the ROCK 4D devicetree:
>
> &mdio0 {
> 	rgmii_phy0: ethernet-phy@1 {
> 		compatible = "ethernet-phy-id001c.c916";
> 		reg = <0x1>;
> 		clocks = <&cru REFCLKO25M_GMAC0_OUT>;
> 		assigned-clocks = <&cru REFCLKO25M_GMAC0_OUT>;
> 		assigned-clock-rates = <25000000>;
>          ...
>      };
> };
>
> The clock is enabled by the RTL8211F PHY driver (check for
> devm_clk_get_optional_enabled in drivers/net/phy/realtek/realtek_main.c),
> as the PHY is the one needing the clock and not the Rockchip MAC. For
> this to work it is important to set the right compatible string, so
> that the kernel can probe the right driver without needing to read the
> identification registers (as that would require the clock to be already
> configured before the driver is being probed).

Yes, what you said is correct. This is also the issue we encountered earlier on the RK3576 board :)
Re: [PATCH net] net: stmmac: dwmac-rk: Ensure clk_phy doesn't contain invalid address
Posted by Simon Horman 4 weeks ago
On Thu, Sep 04, 2025 at 03:12:24AM +0000, Yao Zi wrote:
> We must set the clk_phy pointer to NULL to indicating it isn't available
> if the optional phy clock couldn't be obtained. Otherwise the error code
> returned by of_clk_get() could be wrongly taken as an address, causing
> invalid pointer dereference when later clk_phy is passed to
> clk_prepare_enable().
> 
> Fixes: da114122b831 ("net: ethernet: stmmac: dwmac-rk: Make the clk_phy could be used for external phy")
> Signed-off-by: Yao Zi <ziyao@disroot.org>
> ---
>  drivers/net/ethernet/stmicro/stmmac/dwmac-rk.c | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
> 
> On next-20250903, the fixed commit causes NULL pointer dereference on
> Radxa E20C during probe of dwmac-rk, a typical dmesg looks like
> 
> [    0.273324] rk_gmac-dwmac ffbe0000.ethernet: IRQ eth_lpi not found
> [    0.273888] rk_gmac-dwmac ffbe0000.ethernet: IRQ sfty not found
> [    0.274520] rk_gmac-dwmac ffbe0000.ethernet: PTP uses main clock
> [    0.275226] rk_gmac-dwmac ffbe0000.ethernet: clock input or output? (output).
> [    0.275867] rk_gmac-dwmac ffbe0000.ethernet: Can not read property: tx_delay.
> [    0.276491] rk_gmac-dwmac ffbe0000.ethernet: set tx_delay to 0x30
> [    0.277026] rk_gmac-dwmac ffbe0000.ethernet: Can not read property: rx_delay.
> [    0.278086] rk_gmac-dwmac ffbe0000.ethernet: set rx_delay to 0x10
> [    0.278658] rk_gmac-dwmac ffbe0000.ethernet: integrated PHY? (no).
> [    0.279249] Unable to handle kernel paging request at virtual address fffffffffffffffe
> [    0.279948] Mem abort info:
> [    0.280195]   ESR = 0x000000096000006
> [    0.280523]   EC = 0x25: DABT (current EL), IL = 32 bits
> [    0.280989]   SET = 0, FnV = 0
> [    0.281287]   EA = 0, S1PTW = 0
> [    0.281574]   FSC = 0x06: level 2 translation fault
> 
> where the invalid address is just -ENOENT (-2).
> 
> diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac-rk.c b/drivers/net/ethernet/stmicro/stmmac/dwmac-rk.c
> index cf619a428664..26ec8ae662a6 100644
> --- a/drivers/net/ethernet/stmicro/stmmac/dwmac-rk.c
> +++ b/drivers/net/ethernet/stmicro/stmmac/dwmac-rk.c
> @@ -1414,11 +1414,17 @@ static int rk_gmac_clk_init(struct plat_stmmacenet_data *plat)
>  	if (plat->phy_node) {
>  		bsp_priv->clk_phy = of_clk_get(plat->phy_node, 0);
>  		ret = PTR_ERR_OR_ZERO(bsp_priv->clk_phy);
> -		/* If it is not integrated_phy, clk_phy is optional */
> +		/*
> +		 * If it is not integrated_phy, clk_phy is optional. But we must
> +		 * set bsp_priv->clk_phy to NULL if clk_phy isn't proivded, or
> +		 * the error code could be wrongly taken as an invalid pointer.
> +		 */
>  		if (bsp_priv->integrated_phy) {
>  			if (ret)
>  				return dev_err_probe(dev, ret, "Cannot get PHY clock\n");
>  			clk_set_rate(bsp_priv->clk_phy, 50000000);
> +		} else if (ret) {
> +			bsp_priv->clk_phy = NULL;
>  		}
>  	}

Thanks, and sorry for my earlier confusion about applying this patch.

I agree that the bug you point out is addressed by this patch.
Although I wonder if it is cleaner not to set bsp_priv->clk_phy
unless there is no error, rather than setting it then resetting
it if there is an error.

More importantly, I wonder if there is another bug: does clk_set_rate need
to be called in the case where there is no error and bsp_priv->integrated_phy
is false?

So I am wondering if it makes sense to go with something like this.
(Compile tested only!)

diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac-rk.c b/drivers/net/ethernet/stmicro/stmmac/dwmac-rk.c
index 266c53379236..a25816af2c37 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac-rk.c
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac-rk.c
@@ -1411,12 +1411,16 @@ static int rk_gmac_clk_init(struct plat_stmmacenet_data *plat)
 	}
 
 	if (plat->phy_node) {
-		bsp_priv->clk_phy = of_clk_get(plat->phy_node, 0);
-		ret = PTR_ERR_OR_ZERO(bsp_priv->clk_phy);
-		/* If it is not integrated_phy, clk_phy is optional */
-		if (bsp_priv->integrated_phy) {
-			if (ret)
+		struct clk *clk_phy;
+
+		clk_phy = of_clk_get(plat->phy_node, 0);
+		ret = PTR_ERR_OR_ZERO(clk_phy);
+		if (ret) {
+			/* If it is not integrated_phy, clk_phy is optional */
+			if (bsp_priv->integrated_phy)
 				return dev_err_probe(dev, ret, "Cannot get PHY clock\n");
+		} else {
+			bsp_priv->clk_phy = clk_phy;
 			clk_set_rate(bsp_priv->clk_phy, 50000000);
 		}
 	}

Please note: if you send an updated patch (against net) please
make sure you wait 24h after the original post.

See: https://docs.kernel.org/process/maintainer-netdev.html
Re: [PATCH net] net: stmmac: dwmac-rk: Ensure clk_phy doesn't contain invalid address
Posted by Yao Zi 4 weeks ago
On Thu, Sep 04, 2025 at 11:34:43AM +0100, Simon Horman wrote:
> On Thu, Sep 04, 2025 at 03:12:24AM +0000, Yao Zi wrote:
> > We must set the clk_phy pointer to NULL to indicating it isn't available
> > if the optional phy clock couldn't be obtained. Otherwise the error code
> > returned by of_clk_get() could be wrongly taken as an address, causing
> > invalid pointer dereference when later clk_phy is passed to
> > clk_prepare_enable().
> > 
> > Fixes: da114122b831 ("net: ethernet: stmmac: dwmac-rk: Make the clk_phy could be used for external phy")
> > Signed-off-by: Yao Zi <ziyao@disroot.org>
> > ---
> >  drivers/net/ethernet/stmicro/stmmac/dwmac-rk.c | 8 +++++++-
> >  1 file changed, 7 insertions(+), 1 deletion(-)
> > 
> > On next-20250903, the fixed commit causes NULL pointer dereference on
> > Radxa E20C during probe of dwmac-rk, a typical dmesg looks like
> > 
> > [    0.273324] rk_gmac-dwmac ffbe0000.ethernet: IRQ eth_lpi not found
> > [    0.273888] rk_gmac-dwmac ffbe0000.ethernet: IRQ sfty not found
> > [    0.274520] rk_gmac-dwmac ffbe0000.ethernet: PTP uses main clock
> > [    0.275226] rk_gmac-dwmac ffbe0000.ethernet: clock input or output? (output).
> > [    0.275867] rk_gmac-dwmac ffbe0000.ethernet: Can not read property: tx_delay.
> > [    0.276491] rk_gmac-dwmac ffbe0000.ethernet: set tx_delay to 0x30
> > [    0.277026] rk_gmac-dwmac ffbe0000.ethernet: Can not read property: rx_delay.
> > [    0.278086] rk_gmac-dwmac ffbe0000.ethernet: set rx_delay to 0x10
> > [    0.278658] rk_gmac-dwmac ffbe0000.ethernet: integrated PHY? (no).
> > [    0.279249] Unable to handle kernel paging request at virtual address fffffffffffffffe
> > [    0.279948] Mem abort info:
> > [    0.280195]   ESR = 0x000000096000006
> > [    0.280523]   EC = 0x25: DABT (current EL), IL = 32 bits
> > [    0.280989]   SET = 0, FnV = 0
> > [    0.281287]   EA = 0, S1PTW = 0
> > [    0.281574]   FSC = 0x06: level 2 translation fault
> > 
> > where the invalid address is just -ENOENT (-2).
> > 
> > diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac-rk.c b/drivers/net/ethernet/stmicro/stmmac/dwmac-rk.c
> > index cf619a428664..26ec8ae662a6 100644
> > --- a/drivers/net/ethernet/stmicro/stmmac/dwmac-rk.c
> > +++ b/drivers/net/ethernet/stmicro/stmmac/dwmac-rk.c
> > @@ -1414,11 +1414,17 @@ static int rk_gmac_clk_init(struct plat_stmmacenet_data *plat)
> >  	if (plat->phy_node) {
> >  		bsp_priv->clk_phy = of_clk_get(plat->phy_node, 0);
> >  		ret = PTR_ERR_OR_ZERO(bsp_priv->clk_phy);
> > -		/* If it is not integrated_phy, clk_phy is optional */
> > +		/*
> > +		 * If it is not integrated_phy, clk_phy is optional. But we must
> > +		 * set bsp_priv->clk_phy to NULL if clk_phy isn't proivded, or
> > +		 * the error code could be wrongly taken as an invalid pointer.
> > +		 */
> >  		if (bsp_priv->integrated_phy) {
> >  			if (ret)
> >  				return dev_err_probe(dev, ret, "Cannot get PHY clock\n");
> >  			clk_set_rate(bsp_priv->clk_phy, 50000000);
> > +		} else if (ret) {
> > +			bsp_priv->clk_phy = NULL;
> >  		}
> >  	}
> 
> Thanks, and sorry for my early confusion about applying this patch.
> 
> I agree that the bug you point out is addressed by this patch.
> Although I wonder if it is cleaner not to set bsp_priv->clk_phy
> unless there is no error, rather than setting it then resetting
> it if there is an error.

Yes, it sounds more natural to have a temporary variable storing the result
of of_clk_get() and only assign it to clk_phy when the result is valid.
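
To make the pattern concrete, here is a minimal userspace sketch of the
"look up into a temporary, assign only on success" idea. Everything in it
is a hypothetical model written for illustration: struct rk_priv_model,
model_clk_init() and the is_err()/ptr_err() helpers are stand-ins, not the
driver's or the kernel's actual code, and the lookup result is passed in
as a parameter instead of calling of_clk_get(). The clock-rate question is
deliberately left out.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's opaque struct clk. */
struct clk {
	unsigned long rate;
};

/* Hypothetical, reduced model of the driver's private data. */
struct rk_priv_model {
	int integrated_phy;
	struct clk *clk_phy;	/* NULL means "no PHY clock" */
};

/* Simplified stand-ins for IS_ERR() and PTR_ERR(). */
static int is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-4095;
}

static long ptr_err(const void *p)
{
	return (long)(intptr_t)p;
}

/*
 * `lookup` models the value a clock lookup returned: either a valid
 * pointer or an encoded errno. priv->clk_phy is written only on success,
 * so it never holds an error pointer.
 */
static int model_clk_init(struct rk_priv_model *priv, struct clk *lookup)
{
	if (is_err(lookup)) {
		/* The clock is optional unless the PHY is integrated. */
		return priv->integrated_phy ? (int)ptr_err(lookup) : 0;
	}

	priv->clk_phy = lookup;
	return 0;
}
```

In this model a failed lookup for an external PHY returns 0 and leaves
clk_phy as NULL, while the same failure for an integrated PHY propagates
the error code.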

> More importantly, I wonder if there is another bug: does clk_set_rate need
> to be called in the case where there is no error and bsp_priv->integrated_phy
> is false?

In my understanding this may be intended: bsp_priv->integrated_phy is
only false when an external phy is used, and an external phy might
require arbitrary clock rates, so it doesn't seem a good idea to me to
hardcode the clock rate in the driver.

I guess the rate of clk_phy could also be set up with assigned-clock-rates
in devicetree. If so it may be reasonable to enable the clock only.

> So I am wondering if it makes sense to go with something like this.
> (Compile tested only!)
> 
> diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac-rk.c b/drivers/net/ethernet/stmicro/stmmac/dwmac-rk.c
> index 266c53379236..a25816af2c37 100644
> --- a/drivers/net/ethernet/stmicro/stmmac/dwmac-rk.c
> +++ b/drivers/net/ethernet/stmicro/stmmac/dwmac-rk.c
> @@ -1411,12 +1411,16 @@ static int rk_gmac_clk_init(struct plat_stmmacenet_data *plat)
>  	}
>  
>  	if (plat->phy_node) {
> -		bsp_priv->clk_phy = of_clk_get(plat->phy_node, 0);
> -		ret = PTR_ERR_OR_ZERO(bsp_priv->clk_phy);
> -		/* If it is not integrated_phy, clk_phy is optional */
> -		if (bsp_priv->integrated_phy) {
> -			if (ret)
> +		struct clk *clk_phy;
> +
> +		clk_phy = of_clk_get(plat->phy_node, 0);
> +		ret = PTR_ERR_OR_ZERO(clk_phy);
> +		if (ret) {
> +			/* If it is not integrated_phy, clk_phy is optional */
> +			if (bsp_priv->integrated_phy)
>  				return dev_err_probe(dev, ret, "Cannot get PHY clock\n");
> +		} else {
> +			bsp_priv->clk_phy = clk_phy;
>  			clk_set_rate(bsp_priv->clk_phy, 50000000);
>  		}
>  	}
> 
> Please note: if you send an updated patch (against net) please
> make sure you wait 24h before the original post.
> 
> See: https://docs.kernel.org/process/maintainer-netdev.html

Thanks for the tip. While digging through the problematic commit for the
clk_phy's rate problem, I found others have discovered the problem[1]
and proposed some fixes (though there hasn't been a formal patch).

I should have read the original thread before sending this patch! Will
wait for some time and see whether the netdev maintainer prefers waiting
for original author's fix or taking mine.

Best regards,
Yao Zi

[1]: https://lore.kernel.org/netdev/a30a8c97-6b96-45ba-bad7-8a40401babc2@samsung.com/
Re: [PATCH net] net: stmmac: dwmac-rk: Ensure clk_phy doesn't contain invalid address
Posted by Russell King (Oracle) 4 weeks ago
On Thu, Sep 04, 2025 at 11:34:43AM +0100, Simon Horman wrote:
> Thanks, and sorry for my early confusion about applying this patch.
> 
> I agree that the bug you point out is addressed by this patch.
> Although I wonder if it is cleaner not to set bsp_priv->clk_phy
> unless there is no error, rather than setting it then resetting
> it if there is an error.

+1 !

> More importantly, I wonder if there is another bug: does clk_set_rate need
> to be called in the case where there is no error and bsp_priv->integrated_phy
> is false?

I think there's another issue:

static int rk_gmac_clk_init(struct plat_stmmacenet_data *plat)
{
...
        if (plat->phy_node) {
                bsp_priv->clk_phy = of_clk_get(plat->phy_node, 0);
...

static void rk_gmac_remove(struct platform_device *pdev)
{
...
        if (priv->plat->phy_node && bsp_priv->integrated_phy)
                clk_put(bsp_priv->clk_phy);

So if bsp_priv->integrated_phy is false, then we get the clock but
don't put it.
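
The imbalance can be reproduced in a small userspace model. This is only a sketch: struct mock_clk, struct mock_priv and every mock_* function are hypothetical stand-ins, not the kernel clk API or the driver's real structures. It pairs the quoted get with the quoted put condition and counts the references left over for an external PHY.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in for a reference-counted clock (not the kernel's struct clk). */
struct mock_clk {
	int refs;
};

/* Hypothetical reduction of the driver state to the fields that matter here. */
struct mock_priv {
	bool has_phy_node;
	bool integrated_phy;
	struct mock_clk *clk_phy;
};

static struct mock_clk *mock_clk_get(struct mock_clk *clk)
{
	clk->refs++;
	return clk;
}

static void mock_clk_put(struct mock_clk *clk)
{
	if (!clk)
		return;
	assert(clk->refs > 0);
	clk->refs--;
}

/* Same shape as the quoted init path: the clock is taken whenever a PHY node exists. */
static void mock_init(struct mock_priv *priv, struct mock_clk *clk)
{
	if (priv->has_phy_node)
		priv->clk_phy = mock_clk_get(clk);
}

/* Same condition as the quoted remove path: put only for an integrated PHY. */
static void mock_remove_quoted(struct mock_priv *priv)
{
	if (priv->has_phy_node && priv->integrated_phy)
		mock_clk_put(priv->clk_phy);
}

/* Put under the same condition as the get, so the two always pair up. */
static void mock_remove_balanced(struct mock_priv *priv)
{
	if (priv->has_phy_node)
		mock_clk_put(priv->clk_phy);
}

/* References still held after init + remove with an external (non-integrated) PHY. */
int leaked_refs(bool balanced)
{
	struct mock_clk clk = { .refs = 0 };
	struct mock_priv priv = { .has_phy_node = true, .integrated_phy = false };

	mock_init(&priv, &clk);
	if (balanced)
		mock_remove_balanced(&priv);
	else
		mock_remove_quoted(&priv);
	return clk.refs;
}
```

In this model one reference survives the quoted remove condition, and none survives once the put uses the same condition as the get.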

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!
Re: [PATCH net] net: stmmac: dwmac-rk: Ensure clk_phy doesn't contain invalid address
Posted by Chaoyi Chen 4 weeks ago
On 9/4/2025 6:49 PM, Russell King (Oracle) wrote:
> On Thu, Sep 04, 2025 at 11:34:43AM +0100, Simon Horman wrote:
>> Thanks, and sorry for my early confusion about applying this patch.
>>
>> I agree that the bug you point out is addressed by this patch.
>> Although I wonder if it is cleaner not to set bsp_priv->clk_phy
>> unless there is no error, rather than setting it then resetting
>> it if there is an error.
> +1 !
>
>> More importantly, I wonder if there is another bug: does clk_set_rate need
>> to be called in the case where there is no error and bsp_priv->integrated_phy
>> is false?
> I think there's another issue:
>
> static int rk_gmac_clk_init(struct plat_stmmacenet_data *plat)
> {
> ...
>          if (plat->phy_node) {
>                  bsp_priv->clk_phy = of_clk_get(plat->phy_node, 0);
> ...
>
> static void rk_gmac_remove(struct platform_device *pdev)
> {
> ...
>          if (priv->plat->phy_node && bsp_priv->integrated_phy)
>                  clk_put(bsp_priv->clk_phy);
>
> So if bsp_priv->integrated_phy is false, then we get the clock but
> don't put it.

Yes! Just remove the "bsp_priv->integrated_phy" check.
Re: [PATCH net] net: stmmac: dwmac-rk: Ensure clk_phy doesn't contain invalid address
Posted by Yao Zi 4 weeks ago
On Thu, Sep 04, 2025 at 06:58:12PM +0800, Chaoyi Chen wrote:
> 
> On 9/4/2025 6:49 PM, Russell King (Oracle) wrote:
> > On Thu, Sep 04, 2025 at 11:34:43AM +0100, Simon Horman wrote:
> > > Thanks, and sorry for my early confusion about applying this patch.
> > > 
> > > I agree that the bug you point out is addressed by this patch.
> > > Although I wonder if it is cleaner not to set bsp_priv->clk_phy
> > > unless there is no error, rather than setting it then resetting
> > > it if there is an error.
> > +1 !
> > 
> > > More importantly, I wonder if there is another bug: does clk_set_rate need
> > > to be called in the case where there is no error and bsp_priv->integrated_phy
> > > is false?
> > I think there's another issue:
> > 
> > static int rk_gmac_clk_init(struct plat_stmmacenet_data *plat)
> > {
> > ...
> >          if (plat->phy_node) {
> >                  bsp_priv->clk_phy = of_clk_get(plat->phy_node, 0);

This is the only invocation of of_clk_get() in the driver. Should we
convert it to the devres-managed variant? Then rk_gmac_remove() could be
simplified further.
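
To illustrate why a managed lookup would simplify the remove path, here is a userspace sketch of the devres idea. All names in it (struct mock_device, mock_devm_clk_get(), mock_device_release_all()) are hypothetical and only model the concept: the release is queued when the resource is acquired and runs automatically at unbind. In the kernel, devm_get_clk_from_child() is one managed helper that takes a device node; whether it fits this driver is an assumption the thread does not confirm.

```c
#include <assert.h>
#include <stddef.h>

#define MOCK_MAX_ACTIONS 8

/* Userspace stand-in for a reference-counted clock. */
struct mock_clk {
	int refs;
};

/* Hypothetical device that remembers what to release when it goes away. */
struct mock_device {
	void (*release[MOCK_MAX_ACTIONS])(void *data);
	void *data[MOCK_MAX_ACTIONS];
	size_t nr;
};

static void mock_clk_put(void *data)
{
	struct mock_clk *clk = data;

	assert(clk->refs > 0);
	clk->refs--;
}

/* Take a reference and queue the matching put on the device. */
struct mock_clk *mock_devm_clk_get(struct mock_device *dev, struct mock_clk *clk)
{
	if (dev->nr == MOCK_MAX_ACTIONS)
		return NULL;

	clk->refs++;
	dev->release[dev->nr] = mock_clk_put;
	dev->data[dev->nr] = clk;
	dev->nr++;
	return clk;
}

/* Run the queued releases in reverse order, as at unbind or after a failed probe. */
void mock_device_release_all(struct mock_device *dev)
{
	while (dev->nr > 0) {
		dev->nr--;
		dev->release[dev->nr](dev->data[dev->nr]);
	}
}
```

With this pattern the remove callback needs no explicit put and no condition on integrated_phy, because the put is tied to the get at the point where the clock is obtained.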

> > ...
> > 
> > static void rk_gmac_remove(struct platform_device *pdev)
> > {
> > ...
> >          if (priv->plat->phy_node && bsp_priv->integrated_phy)
> >                  clk_put(bsp_priv->clk_phy);
> > 
> > So if bsp_priv->integrated_phy is false, then we get the clock but
> > don't put it.
> 
> Yes! Just remove "bsp_priv->integrated_phy"
> 

Cheers,
Yao Zi
Re: [PATCH net] net: stmmac: dwmac-rk: Ensure clk_phy doesn't contain invalid address
Posted by Simon Horman 4 weeks ago
On Thu, Sep 04, 2025 at 03:12:24AM +0000, Yao Zi wrote:
> We must set the clk_phy pointer to NULL to indicate it isn't available
> if the optional phy clock couldn't be obtained. Otherwise the error code
> returned by of_clk_get() could be wrongly taken as an address, causing
> invalid pointer dereference when later clk_phy is passed to
> clk_prepare_enable().
> 
> Fixes: da114122b831 ("net: ethernet: stmmac: dwmac-rk: Make the clk_phy could be used for external phy")
> Signed-off-by: Yao Zi <ziyao@disroot.org>

...

Hi,

This patch doesn't seem to match the upstream code.

Looking over the upstream code, it seems to me that
going into the code in question .clk_phy should
be NULL, as bsp_priv is allocated using devm_kzalloc()
over in rk_gmac_setup().

The upstream version of the code your patch modifies
is as follows, and it doesn't touch .clk_phy if integrated_phy is not set.

        if (plat->phy_node && bsp_priv->integrated_phy) {
                bsp_priv->clk_phy = of_clk_get(plat->phy_node, 0);
                ret = PTR_ERR_OR_ZERO(bsp_priv->clk_phy);
                if (ret)
                        return dev_err_probe(dev, ret, "Cannot get PHY clock\n");
                clk_set_rate(bsp_priv->clk_phy, 50000000);
        }

Am I missing something?
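
For reference, the difference between a NULL clk_phy and one holding an error value can be sketched in userspace. The mock_* helpers below are hypothetical imitations of the kernel's ERR_PTR(), IS_ERR() and PTR_ERR_OR_ZERO(), and mock_clk_init() only mirrors the shape of the proposed hunk (store the pointer on success only); none of this is the driver's code. Note that -ENOENT (-2 on Linux) viewed as a 64-bit pointer is 0xfffffffffffffffe, which is consistent with the faulting address in the posted dmesg.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MOCK_MAX_ERRNO 4095UL

/* Encode a negative errno in a pointer value, in the style of ERR_PTR(). */
static inline void *mock_err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

/* True if the pointer value lies in the top 4095 addresses, in the style of IS_ERR(). */
static inline int mock_is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MOCK_MAX_ERRNO;
}

/* Errno for an error pointer, 0 for anything else, in the style of PTR_ERR_OR_ZERO(). */
static inline int mock_ptr_err_or_zero(const void *ptr)
{
	return mock_is_err(ptr) ? (int)(intptr_t)ptr : 0;
}

/* Hypothetical reduction of the driver state. */
struct mock_priv {
	void *clk_phy;		/* starts out NULL, as with a zeroed allocation */
	int integrated_phy;
};

/*
 * Store the lookup result only on success. A failed lookup is fatal for an
 * integrated PHY and ignored otherwise, leaving clk_phy NULL rather than an
 * error value that later code could mistake for an address.
 */
int mock_clk_init(struct mock_priv *priv, void *lookup_result)
{
	int ret = mock_ptr_err_or_zero(lookup_result);

	if (ret)
		return priv->integrated_phy ? ret : 0;

	priv->clk_phy = lookup_result;
	return 0;
}
```

The point of the sketch is that an error value is a non-NULL pointer, so a plain NULL check on clk_phy would not catch it; keeping the field NULL on failure avoids that.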
Re: [PATCH net] net: stmmac: dwmac-rk: Ensure clk_phy doesn't contain invalid address
Posted by Simon Horman 4 weeks ago
On Thu, Sep 04, 2025 at 10:54:57AM +0100, Simon Horman wrote:
> On Thu, Sep 04, 2025 at 03:12:24AM +0000, Yao Zi wrote:
> > We must set the clk_phy pointer to NULL to indicate it isn't available
> > if the optional phy clock couldn't be obtained. Otherwise the error code
> > returned by of_clk_get() could be wrongly taken as an address, causing
> > invalid pointer dereference when later clk_phy is passed to
> > clk_prepare_enable().
> > 
> > Fixes: da114122b831 ("net: ethernet: stmmac: dwmac-rk: Make the clk_phy could be used for external phy")
> > Signed-off-by: Yao Zi <ziyao@disroot.org>
> 
> ...
> 
> Hi,
> 
> This patch doesn't seem to match the upstream code.
> 
> Looking over the upstream code, it seems to me that
> going into the code in question .clk_phy should
> be NULL, as bsp_priv is allocated using devm_kzalloc()
> over in rk_gmac_setup().
> 
> The upstream version of the code your patch modifies
> is as follows, and it doesn't touch .clk_phy if integrated_phy is not set.
> 
>         if (plat->phy_node && bsp_priv->integrated_phy) {
>                 bsp_priv->clk_phy = of_clk_get(plat->phy_node, 0);
>                 ret = PTR_ERR_OR_ZERO(bsp_priv->clk_phy);
>                 if (ret)
>                         return dev_err_probe(dev, ret, "Cannot get PHY clock\n");
>                 clk_set_rate(bsp_priv->clk_phy, 50000000);
>         }
> 
> Am I missing something?

Oops, I missed that da114122b831 is present in net-next (but not net).
Let me look over this a second time.
Re: [PATCH net] net: stmmac: dwmac-rk: Ensure clk_phy doesn't contain invalid address
Posted by Yao Zi 4 weeks ago
On Thu, Sep 04, 2025 at 10:56:57AM +0100, Simon Horman wrote:
> On Thu, Sep 04, 2025 at 10:54:57AM +0100, Simon Horman wrote:
> > On Thu, Sep 04, 2025 at 03:12:24AM +0000, Yao Zi wrote:
> > > We must set the clk_phy pointer to NULL to indicate it isn't available
> > > if the optional phy clock couldn't be obtained. Otherwise the error code
> > > returned by of_clk_get() could be wrongly taken as an address, causing
> > > invalid pointer dereference when later clk_phy is passed to
> > > clk_prepare_enable().
> > > 
> > > Fixes: da114122b831 ("net: ethernet: stmmac: dwmac-rk: Make the clk_phy could be used for external phy")
> > > Signed-off-by: Yao Zi <ziyao@disroot.org>
> > 
> > ...
> > 
> > Hi,
> > 
> > This patch doesn't seem to match the upstream code.
> > 
> > Looking over the upstream code, it seems to me that
> > going into the code in question .clk_phy should
> > be NULL, as bsp_priv is allocated using devm_kzalloc()
> > over in rk_gmac_setup().
> > 
> > The upstream version of the code your patch modifies
> > is as follows, and it doesn't touch .clk_phy if integrated_phy is not set.
> > 
> >         if (plat->phy_node && bsp_priv->integrated_phy) {
> >                 bsp_priv->clk_phy = of_clk_get(plat->phy_node, 0);
> >                 ret = PTR_ERR_OR_ZERO(bsp_priv->clk_phy);
> >                 if (ret)
> >                         return dev_err_probe(dev, ret, "Cannot get PHY clock\n");
> >                 clk_set_rate(bsp_priv->clk_phy, 50000000);
> >         }
> > 
> > Am I missing something?
> 
> Oops, I missed that da114122b831 is present in net-next (but not net).
> Let me look over this a second time.

Oops, yes. Though this is a fix patch, it should target net-next instead
of net since the commit causing problems hasn't landed in net. Sorry for
the confusion.

Best regards,
Yao Zi
Re: [PATCH net] net: stmmac: dwmac-rk: Ensure clk_phy doesn't contain invalid address
Posted by Simon Horman 4 weeks ago
On Thu, Sep 04, 2025 at 10:10:01AM +0000, Yao Zi wrote:
> On Thu, Sep 04, 2025 at 10:56:57AM +0100, Simon Horman wrote:
> > On Thu, Sep 04, 2025 at 10:54:57AM +0100, Simon Horman wrote:
> > > On Thu, Sep 04, 2025 at 03:12:24AM +0000, Yao Zi wrote:
> > > > We must set the clk_phy pointer to NULL to indicate it isn't available
> > > > if the optional phy clock couldn't be obtained. Otherwise the error code
> > > > returned by of_clk_get() could be wrongly taken as an address, causing
> > > > invalid pointer dereference when later clk_phy is passed to
> > > > clk_prepare_enable().
> > > > 
> > > > Fixes: da114122b831 ("net: ethernet: stmmac: dwmac-rk: Make the clk_phy could be used for external phy")
> > > > Signed-off-by: Yao Zi <ziyao@disroot.org>
> > > 
> > > ...
> > > 
> > > Hi,
> > > 
> > > This patch doesn't seem to match the upstream code.
> > > 
> > > Looking over the upstream code, it seems to me that
> > > going into the code in question .clk_phy should
> > > be NULL, as bsp_priv is allocated using devm_kzalloc()
> > > over in rk_gmac_setup().
> > > 
> > > The upstream version of the code your patch modifies
> > > is as follows, and it doesn't touch .clk_phy if integrated_phy is not set.
> > > 
> > >         if (plat->phy_node && bsp_priv->integrated_phy) {
> > >                 bsp_priv->clk_phy = of_clk_get(plat->phy_node, 0);
> > >                 ret = PTR_ERR_OR_ZERO(bsp_priv->clk_phy);
> > >                 if (ret)
> > >                         return dev_err_probe(dev, ret, "Cannot get PHY clock\n");
> > >                 clk_set_rate(bsp_priv->clk_phy, 50000000);
> > >         }
> > > 
> > > Am I missing something?
> > 
> > Oops, I missed that da114122b831 is present in net-next (but not net).
> > Let me look over this a second time.
> 
> Oops, yes. Though this is a fix patch, it should target net-next instead
> of net since the commit causing problems hasn't landed in net. Sorry for
> the confusion.

It's ok. I'm pretty good at confusing myself without any assistance.