[PATCH] drm/panthor: Ensure MCU is disabled on suspend

Ketil Johnsen posted 1 patch 4 months ago
[PATCH] drm/panthor: Ensure MCU is disabled on suspend
Posted by Ketil Johnsen 4 months ago
Currently the Panthor driver needs the GPU to be powered down
between suspend and resume. If this is not done, then the
MCU_CONTROL register will be preserved as AUTO, which in turn will
cause a premature FW boot on resume. The FW will go directly into
a fatal state in this case.

This case needs to be handled as there is no guarantee that the
GPU will be powered down after the suspend callback on all platforms.

The fix is to call panthor_fw_stop() in the "pre-reset" path to ensure
the MCU_CONTROL register is cleared (set to DISABLE). This matches
well with the already existing call to panthor_fw_start() from the
"post-reset" path.

Signed-off-by: Ketil Johnsen <ketil.johnsen@arm.com>
---
 drivers/gpu/drm/panthor/panthor_fw.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/panthor/panthor_fw.c b/drivers/gpu/drm/panthor/panthor_fw.c
index 9bf06e55eaee..df767e82148a 100644
--- a/drivers/gpu/drm/panthor/panthor_fw.c
+++ b/drivers/gpu/drm/panthor/panthor_fw.c
@@ -1099,6 +1099,7 @@ void panthor_fw_pre_reset(struct panthor_device *ptdev, bool on_hang)
 	}
 
 	panthor_job_irq_suspend(&ptdev->fw->irq);
+	panthor_fw_stop(ptdev);
 }
 
 /**
-- 
2.43.0
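
The failure mode described in the commit message can be illustrated with a
small standalone model. This is only a sketch, not driver code: the sim_*
names are hypothetical, and it assumes that a real power-down clears
MCU_CONTROL while a suspend without power-down preserves it.

```c
#include <stdbool.h>

/* Toy model only: these names are illustrative, not the driver's. */
enum sim_mcu_control { SIM_MCU_DISABLE, SIM_MCU_ENABLE, SIM_MCU_AUTO };

struct sim_gpu {
	enum sim_mcu_control mcu_control; /* stands in for MCU_CONTROL */
	bool fw_fatal;                    /* set when the FW boots prematurely */
};

/* Suspend callback; stop_mcu models the added panthor_fw_stop() call. */
static void sim_suspend(struct sim_gpu *gpu, bool stop_mcu)
{
	if (stop_mcu)
		gpu->mcu_control = SIM_MCU_DISABLE;
}

/* Assumption: only a real power-down clears the register. */
static void sim_platform_sleep(struct sim_gpu *gpu, bool powered_down)
{
	if (powered_down)
		gpu->mcu_control = SIM_MCU_DISABLE;
}

/* Resume: a register left at AUTO lets the FW boot before the driver is ready. */
static void sim_resume(struct sim_gpu *gpu)
{
	if (gpu->mcu_control == SIM_MCU_AUTO) {
		gpu->fw_fatal = true;
		return;
	}
	gpu->mcu_control = SIM_MCU_AUTO; /* driver-controlled FW start */
}

/* Run one suspend/resume cycle and report whether the FW ended up fatal. */
static bool sim_cycle(bool stop_mcu, bool powered_down)
{
	struct sim_gpu gpu = { SIM_MCU_AUTO, false };

	sim_suspend(&gpu, stop_mcu);
	sim_platform_sleep(&gpu, powered_down);
	sim_resume(&gpu);
	return gpu.fw_fatal;
}
```

In this model, sim_cycle(false, false) is the only combination that reports a
fatal FW state: the MCU was not stopped and the platform did not power the GPU
down. Stopping the MCU on suspend makes the outcome independent of what the
platform does.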
Re: [PATCH] drm/panthor: Ensure MCU is disabled on suspend
Posted by Steven Price 4 months ago
On 08/10/2025 11:51, Ketil Johnsen wrote:
> Currently the Panthor driver needs the GPU to be powered down
> between suspend and resume. If this is not done, then the
> MCU_CONTROL register will be preserved as AUTO, which again will
> cause a premature FW boot on resume. The FW will go directly into
> fatal state in this case.
> 
> This case needs to be handled as there is no guarantee that the
> GPU will be powered down after the suspend callback on all platforms.
> 
> The fix is to call panthor_fw_stop() in "pre-reset" path to ensure
> the MCU_CONTROL register is cleared (set DISABLE). This matches
> well with the already existing call to panthor_fw_start() from the
> "post-reset" path.
> 
> Signed-off-by: Ketil Johnsen <ketil.johnsen@arm.com>

Reviewed-by: Steven Price <steven.price@arm.com>

Do we need a Fixes tag? Or is this only actually an issue on newer GPUs?

Thanks,
Steve

> ---
>  drivers/gpu/drm/panthor/panthor_fw.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/gpu/drm/panthor/panthor_fw.c b/drivers/gpu/drm/panthor/panthor_fw.c
> index 9bf06e55eaee..df767e82148a 100644
> --- a/drivers/gpu/drm/panthor/panthor_fw.c
> +++ b/drivers/gpu/drm/panthor/panthor_fw.c
> @@ -1099,6 +1099,7 @@ void panthor_fw_pre_reset(struct panthor_device *ptdev, bool on_hang)
>  	}
>  
>  	panthor_job_irq_suspend(&ptdev->fw->irq);
> +	panthor_fw_stop(ptdev);
>  }
>  
>  /**
Re: [PATCH] drm/panthor: Ensure MCU is disabled on suspend
Posted by Boris Brezillon 4 months ago
On Thu, 9 Oct 2025 11:29:14 +0100
Steven Price <steven.price@arm.com> wrote:

> On 08/10/2025 11:51, Ketil Johnsen wrote:
> > Currently the Panthor driver needs the GPU to be powered down
> > between suspend and resume. If this is not done, then the
> > MCU_CONTROL register will be preserved as AUTO, which again will
> > cause a premature FW boot on resume. The FW will go directly into
> > fatal state in this case.
> > 
> > This case needs to be handled as there is no guarantee that the
> > GPU will be powered down after the suspend callback on all platforms.
> > 
> > The fix is to call panthor_fw_stop() in "pre-reset" path to ensure
> > the MCU_CONTROL register is cleared (set DISABLE). This matches
> > well with the already existing call to panthor_fw_start() from the
> > "post-reset" path.
> > 
> > Signed-off-by: Ketil Johnsen <ketil.johnsen@arm.com>  
> 
> Reviewed-by: Steven Price <steven.price@arm.com>
> 
> Do we need a Fixes tag? Or is this only actually an issue on newer GPUs?

I think it'd be good to have a Fixes tag, if it's known to be the right
thing to do after a HALT, even if it's not needed on the GPUs currently
supported by this driver.
Re: [PATCH] drm/panthor: Ensure MCU is disabled on suspend
Posted by Steven Price 4 months ago
On 09/10/2025 12:45, Boris Brezillon wrote:
> On Thu, 9 Oct 2025 11:29:14 +0100
> Steven Price <steven.price@arm.com> wrote:
> 
>> On 08/10/2025 11:51, Ketil Johnsen wrote:
>>> Currently the Panthor driver needs the GPU to be powered down
>>> between suspend and resume. If this is not done, then the
>>> MCU_CONTROL register will be preserved as AUTO, which again will
>>> cause a premature FW boot on resume. The FW will go directly into
>>> fatal state in this case.
>>>
>>> This case needs to be handled as there is no guarantee that the
>>> GPU will be powered down after the suspend callback on all platforms.
>>>
>>> The fix is to call panthor_fw_stop() in "pre-reset" path to ensure
>>> the MCU_CONTROL register is cleared (set DISABLE). This matches
>>> well with the already existing call to panthor_fw_start() from the
>>> "post-reset" path.
>>>
>>> Signed-off-by: Ketil Johnsen <ketil.johnsen@arm.com>  
>>
>> Reviewed-by: Steven Price <steven.price@arm.com>
>>
>> Do we need a Fixes tag? Or is this only actually an issue on newer GPUs?
> 
> I think it'd be good to have a Fixes tag, if it's known to be the right
> thing to do after a HALT, even if it's not needed on the GPUs currently
> supported by this driver.

Yeah, at the very least it won't do any harm. I'll add a:

Fixes: 2718d91816ee ("drm/panthor: Add the FW logical block")

And merge this to drm-misc-fixes.

Thanks,
Steve
Re: [PATCH] drm/panthor: Ensure MCU is disabled on suspend
Posted by Boris Brezillon 4 months ago
On Wed,  8 Oct 2025 12:51:11 +0200
Ketil Johnsen <ketil.johnsen@arm.com> wrote:

> Currently the Panthor driver needs the GPU to be powered down
> between suspend and resume. If this is not done, then the
> MCU_CONTROL register will be preserved as AUTO, which again will
> cause a premature FW boot on resume. The FW will go directly into
> fatal state in this case.
> 
> This case needs to be handled as there is no guarantee that the
> GPU will be powered down after the suspend callback on all platforms.
> 
> The fix is to call panthor_fw_stop() in "pre-reset" path to ensure
> the MCU_CONTROL register is cleared (set DISABLE). This matches
> well with the already existing call to panthor_fw_start() from the
> "post-reset" path.
> 
> Signed-off-by: Ketil Johnsen <ketil.johnsen@arm.com>
> ---
>  drivers/gpu/drm/panthor/panthor_fw.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/gpu/drm/panthor/panthor_fw.c b/drivers/gpu/drm/panthor/panthor_fw.c
> index 9bf06e55eaee..df767e82148a 100644
> --- a/drivers/gpu/drm/panthor/panthor_fw.c
> +++ b/drivers/gpu/drm/panthor/panthor_fw.c
> @@ -1099,6 +1099,7 @@ void panthor_fw_pre_reset(struct panthor_device *ptdev, bool on_hang)
>  	}
>  
>  	panthor_job_irq_suspend(&ptdev->fw->irq);
> +	panthor_fw_stop(ptdev);

Is this not preventing fast resets? My understanding was that
MCU_CONTROL shouldn't be touched if the MCU was halted from the FW, but
maybe I got that wrong. If it's just for the MCU crash case, we can
have:

	if (on_hang)
		panthor_fw_stop(ptdev);

>  }
>  
>  /**
Re: [PATCH] drm/panthor: Ensure MCU is disabled on suspend
Posted by Karunika Choo 4 months ago
On 08/10/2025 12:32, Boris Brezillon wrote:
> On Wed,  8 Oct 2025 12:51:11 +0200
> Ketil Johnsen <ketil.johnsen@arm.com> wrote:
> 
>> Currently the Panthor driver needs the GPU to be powered down
>> between suspend and resume. If this is not done, then the
>> MCU_CONTROL register will be preserved as AUTO, which again will
>> cause a premature FW boot on resume. The FW will go directly into
>> fatal state in this case.
>>
>> This case needs to be handled as there is no guarantee that the
>> GPU will be powered down after the suspend callback on all platforms.
>>
>> The fix is to call panthor_fw_stop() in "pre-reset" path to ensure
>> the MCU_CONTROL register is cleared (set DISABLE). This matches
>> well with the already existing call to panthor_fw_start() from the
>> "post-reset" path.
>>
>> Signed-off-by: Ketil Johnsen <ketil.johnsen@arm.com>
>> ---
>>  drivers/gpu/drm/panthor/panthor_fw.c | 1 +
>>  1 file changed, 1 insertion(+)
>>
>> diff --git a/drivers/gpu/drm/panthor/panthor_fw.c b/drivers/gpu/drm/panthor/panthor_fw.c
>> index 9bf06e55eaee..df767e82148a 100644
>> --- a/drivers/gpu/drm/panthor/panthor_fw.c
>> +++ b/drivers/gpu/drm/panthor/panthor_fw.c
>> @@ -1099,6 +1099,7 @@ void panthor_fw_pre_reset(struct panthor_device *ptdev, bool on_hang)
>>  	}
>>  
>>  	panthor_job_irq_suspend(&ptdev->fw->irq);
>> +	panthor_fw_stop(ptdev);
> 
> Is this not preventing fast resets? My understanding was that
> MCU_CONTROL shouldn't be touched if the MCU was halted from the FW, but
> maybe I got that wrong. If it's just for the MCU crash case, we can
> have:
> 
> 	if (on_hang)
> 		panthor_fw_stop(ptdev);
> 

Hi Boris, I think as long as the FW is properly halted, we can safely
disable the MCU. In fact, because of the ptdev->reset.fast tracking, we
can call panthor_fw_stop() in both cases, as the flag allows us to
handle the resume path correctly.

As per Ketil's commit message, if we don't clear the HALT bit on L2
power on, the MCU can start booting the FW with the bit enabled, which
in certain cases can make it fail to boot. So this patch fixes that by
gating when the FW is allowed to boot.

I also believe this behaviour will be better aligned with the expected
behaviour from the FW of newer GPUs (Mali-G1 for example).

Kind regards,
Karunika

>>  }
>>  
>>  /**
>
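
The two variants discussed above (stopping the MCU only on a hang versus
stopping it unconditionally) can be compared in a small standalone model.
This is a hedged sketch, not the driver's code: the rst_* names are
hypothetical, fast_reset merely stands in for ptdev->reset.fast, and the
model assumes the fast-reset flag is set when the FW acknowledges a clean
halt and that suspend reaches the pre-reset path with on_hang false.

```c
#include <stdbool.h>

/* Toy model only: these names are illustrative, not the driver's. */
enum rst_mcu_control { RST_MCU_DISABLE, RST_MCU_ENABLE, RST_MCU_AUTO };

struct rst_fw {
	enum rst_mcu_control mcu_control; /* stands in for MCU_CONTROL */
	bool fast_reset;                  /* stands in for ptdev->reset.fast */
};

/*
 * Model of the pre-reset path.
 * halt_ok:           the FW acknowledged the halt request (clean halt).
 * stop_only_on_hang: true models the "if (on_hang)" variant,
 *                    false models the unconditional stop from the patch.
 */
static void rst_pre_reset(struct rst_fw *fw, bool on_hang, bool halt_ok,
			  bool stop_only_on_hang)
{
	/* Assumption: a fast reset is only possible after a clean halt. */
	fw->fast_reset = !on_hang && halt_ok;

	if (!stop_only_on_hang || on_hang)
		fw->mcu_control = RST_MCU_DISABLE;
}

/* Convenience wrapper: start from AUTO and return the resulting state. */
static struct rst_fw rst_run(bool on_hang, bool halt_ok, bool stop_only_on_hang)
{
	struct rst_fw fw = { RST_MCU_AUTO, false };

	rst_pre_reset(&fw, on_hang, halt_ok, stop_only_on_hang);
	return fw;
}
```

Under these assumptions, the unconditional variant leaves MCU_CONTROL at
DISABLE on the clean-halt path while the fast-reset flag stays set, whereas
the on_hang-only variant leaves the register at AUTO on that same path, which
is the case the patch sets out to handle.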
Re: [PATCH] drm/panthor: Ensure MCU is disabled on suspend
Posted by Boris Brezillon 4 months ago
On Thu, 9 Oct 2025 09:42:08 +0100
Karunika Choo <karunika.choo@arm.com> wrote:

> On 08/10/2025 12:32, Boris Brezillon wrote:
> > On Wed,  8 Oct 2025 12:51:11 +0200
> > Ketil Johnsen <ketil.johnsen@arm.com> wrote:
> >   
> >> Currently the Panthor driver needs the GPU to be powered down
> >> between suspend and resume. If this is not done, then the
> >> MCU_CONTROL register will be preserved as AUTO, which again will
> >> cause a premature FW boot on resume. The FW will go directly into
> >> fatal state in this case.
> >>
> >> This case needs to be handled as there is no guarantee that the
> >> GPU will be powered down after the suspend callback on all platforms.
> >>
> >> The fix is to call panthor_fw_stop() in "pre-reset" path to ensure
> >> the MCU_CONTROL register is cleared (set DISABLE). This matches
> >> well with the already existing call to panthor_fw_start() from the
> >> "post-reset" path.
> >>
> >> Signed-off-by: Ketil Johnsen <ketil.johnsen@arm.com>
> >> ---
> >>  drivers/gpu/drm/panthor/panthor_fw.c | 1 +
> >>  1 file changed, 1 insertion(+)
> >>
> >> diff --git a/drivers/gpu/drm/panthor/panthor_fw.c b/drivers/gpu/drm/panthor/panthor_fw.c
> >> index 9bf06e55eaee..df767e82148a 100644
> >> --- a/drivers/gpu/drm/panthor/panthor_fw.c
> >> +++ b/drivers/gpu/drm/panthor/panthor_fw.c
> >> @@ -1099,6 +1099,7 @@ void panthor_fw_pre_reset(struct panthor_device *ptdev, bool on_hang)
> >>  	}
> >>  
> >>  	panthor_job_irq_suspend(&ptdev->fw->irq);
> >> +	panthor_fw_stop(ptdev);  
> > 
> > Is this not preventing fast resets? My understanding was that
> > MCU_CONTROL shouldn't be touched if the MCU was halted from the FW, but
> > maybe I got that wrong. If it's just for the MCU crash case, we can
> > have:
> > 
> > 	if (on_hang)
> > 		panthor_fw_stop(ptdev);
> >   
> 
> Hi Boris, I think as long as the FW is properly halted, we can safely
> disable the MCU. In fact, because of the ptdev->reset.fast tracking, we
> can call panthor_fw_stop() in both cases, as the flag allows us to
> handle the resume path correctly.
> 
> As per Ketil's commit message, if we don't clear the HALT bit on L2
> power on, the MCU can start booting the FW with the bit enabled, which
> in certain cases can make it fail to boot. So this patch fixes that by
> gating when the FW is allowed to boot.
> 
> I also believe this behaviour will be better aligned with the expected
> behaviour from the FW of newer GPUs (Mali-G1 for example).

Okay, as long as you're sure it doesn't screw up the fast reset, I'm
happy to get that in.

Acked-by: Boris Brezillon <boris.brezillon@collabora.com>