[v3] net: dsa: mv88e6xxx: propperly shutdown PPU re-enable timer on destroy

[PATCH v3] net: dsa: mv88e6xxx: propperly shutdown PPU re-enable timer on destroy

Posted by David Oberhollenzer 1 year ago

The mv88e6xxx has an internal PPU that polls PHY state. If we want to
access the internal PHYs, we need to disable it. Because enable/disable
of the PPU is a slow operation, a 10ms timer is used to re-enable it,
canceled with every access, so bulk operations effectively only disable
it once and re-enable it some 10ms after the last access.

If a PHY is accessed and then the mv88e6xxx module is removed before
the 10ms are up, the PPU re-enable ends up accessing a dangling pointer.

This especially affects probing during bootup. The MDIO bus and PHY
registration may succeed, but registration with the DSA framework
may fail later on (e.g. because the CPU port depends on another,
very slow device that isn't done probing yet, returning -EPROBE_DEFER).
In this case, probe() fails, but the MDIO subsystem may already have
accessed the MIDO bus or PHYs, arming timer.

This is fixed as follows:
 - If probe fails after mv88e6xxx_phy_init(), make sure we also call
   mv88e6xxx_phy_destroy() before returning
 - In mv88e6xxx_phy_destroy(), grab the ppu_mutex to make sure the work
   function either has already exited, or (should it run) cannot do
   anything, mutex_trylock() in the worker function fails to grab the
   mutex and returns.
 - In addition to destroying the timer, also destroy the work item, in
   case the timer has already fired.
 - Do all of this synchronously, to make sure timer & work item are
   destroyed and none of the callbacks are running.
 - Because of the nesting of the timer & work item, cancel_work_sync()
   alone is not sufficient and the order of the cancellation steps matters.

Fixes: 2e5f032095ff ("dsa: add support for the Marvell 88E6131 switch chip")
Signed-off-by: David Oberhollenzer <david.oberhollenzer@sigma-star.at>
---
FWIW, this is a forward port of a patch I'm using on v6.6.

Thanks,

David
---
 drivers/net/dsa/mv88e6xxx/chip.c | 8 +++++---
 drivers/net/dsa/mv88e6xxx/phy.c  | 3 +++
 2 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c
index 284270a4ade1..c2af69bed660 100644
--- a/drivers/net/dsa/mv88e6xxx/chip.c
+++ b/drivers/net/dsa/mv88e6xxx/chip.c
@@ -7264,13 +7264,13 @@ static int mv88e6xxx_probe(struct mdio_device *mdiodev)
 	err = mv88e6xxx_switch_reset(chip);
 	mv88e6xxx_reg_unlock(chip);
 	if (err)
-		goto out;
+		goto out_phy;
 
 	if (np) {
 		chip->irq = of_irq_get(np, 0);
 		if (chip->irq == -EPROBE_DEFER) {
 			err = chip->irq;
-			goto out;
+			goto out_phy;
 		}
 	}
 
@@ -7289,7 +7289,7 @@ static int mv88e6xxx_probe(struct mdio_device *mdiodev)
 	mv88e6xxx_reg_unlock(chip);
 
 	if (err)
-		goto out;
+		goto out_phy;
 
 	if (chip->info->g2_irqs > 0) {
 		err = mv88e6xxx_g2_irq_setup(chip);
@@ -7323,6 +7323,8 @@ static int mv88e6xxx_probe(struct mdio_device *mdiodev)
 		mv88e6xxx_g1_irq_free(chip);
 	else
 		mv88e6xxx_irq_poll_free(chip);
+out_phy:
+	mv88e6xxx_phy_destroy(chip);
 out:
 	if (pdata)
 		dev_put(pdata->netdev);
diff --git a/drivers/net/dsa/mv88e6xxx/phy.c b/drivers/net/dsa/mv88e6xxx/phy.c
index 8bb88b3d900d..ee9e5d7e5277 100644
--- a/drivers/net/dsa/mv88e6xxx/phy.c
+++ b/drivers/net/dsa/mv88e6xxx/phy.c
@@ -229,7 +229,10 @@ static void mv88e6xxx_phy_ppu_state_init(struct mv88e6xxx_chip *chip)
 
 static void mv88e6xxx_phy_ppu_state_destroy(struct mv88e6xxx_chip *chip)
 {
+	mutex_lock(&chip->ppu_mutex);
 	del_timer_sync(&chip->ppu_timer);
+	cancel_work_sync(&chip->ppu_work);
+	mutex_unlock(&chip->ppu_mutex);
 }
 
 int mv88e6185_phy_ppu_read(struct mv88e6xxx_chip *chip, struct mii_bus *bus,
-- 
2.47.1

Re: [PATCH v3] net: dsa: mv88e6xxx: propperly shutdown PPU re-enable timer on destroy

Posted by Jakub Kicinski 1 year ago

On Mon, 13 Jan 2025 09:49:12 +0100 David Oberhollenzer wrote:
> @@ -7323,6 +7323,8 @@ static int mv88e6xxx_probe(struct mdio_device *mdiodev)
>  		mv88e6xxx_g1_irq_free(chip);
>  	else
>  		mv88e6xxx_irq_poll_free(chip);
> +out_phy:
> +	mv88e6xxx_phy_destroy(chip);
>  out:
>  	if (pdata)
>  		dev_put(pdata->netdev);

If this is the right ordering the order in mv88e6xxx_remove()
looks suspicious. We call mv88e6xxx_phy_destroy() pretty early 
and then unregister from DSA. Isn't there a window where DSA 
callbacks can reschedule the timer?
-- 
pw-bot: cr

Re: [PATCH v3] net: dsa: mv88e6xxx: propperly shutdown PPU re-enable timer on destroy

Posted by David Oberhollenzer 10 months, 1 week ago

Hi,

I did some further re-testing on the fix, regarding the the similar race
in remove() as well as the previous question regarding the locking and
cancellation order. V3 already expands on this, and the point still stands,
the nested timer+queue+trylock mechanism is somewhat tricky and I manage
to hit the race window with just cancel_work_sync(), without the lock or
a different order for tear down.

On 1/15/25 12:27 AM, Jakub Kicinski wrote:
> On Mon, 13 Jan 2025 09:49:12 +0100 David Oberhollenzer wrote:
>> @@ -7323,6 +7323,8 @@ static int mv88e6xxx_probe(struct mdio_device *mdiodev)
>>   		mv88e6xxx_g1_irq_free(chip);
>>   	else
>>   		mv88e6xxx_irq_poll_free(chip);
>> +out_phy:
>> +	mv88e6xxx_phy_destroy(chip);
>>   out:
>>   	if (pdata)
>>   		dev_put(pdata->netdev);
> 
> If this is the right ordering the order in mv88e6xxx_remove()
> looks suspicious. We call mv88e6xxx_phy_destroy() pretty early
> and then unregister from DSA. Isn't there a window where DSA
> callbacks can reschedule the timer?

yes, this does looks suspicious, mv88e6xxx_phy_destroy() should be done
after the switch is unregistered, otherwise it should logically cause
the same issue.

However, I did not manage to trigger this during testing, and this also
did not fix the original issue I saw, but I will fix the order in a
followup v4 patch.

Greetings,

David