[PATCH 1/2] drm/msm/dpu: fix mismatch between power and frequency

yuanjie yang posted 2 patches 1 month ago
[PATCH 1/2] drm/msm/dpu: fix mismatch between power and frequency
Posted by yuanjie yang 1 month ago
From: Yuanjie Yang <yuanjie.yang@oss.qualcomm.com>

During DPU runtime suspend, calling dev_pm_opp_set_rate(dev, 0) drops
the MMCX rail to MIN_SVS while the core clock frequency remains at its
original (highest) rate. When runtime resume re-enables the clock, this
may result in a mismatch between the rail voltage and the clock rate.

For example, in the DPU bind path, the sequence could be:
  cpu0: dev_sync_state -> rpmhpd_sync_state
  cpu1:                                     dpu_kms_hw_init
timeline 0 ------------------------------------------------> t

After rpmhpd_sync_state, the voltage performance is no longer guaranteed
to stay at the highest level. During dpu_kms_hw_init, calling
dev_pm_opp_set_rate(dev, 0) drops the voltage, causing the MMCX rail to
fall to MIN_SVS while the core clock is still at its maximum frequency.
When the power is re-enabled, only the clock is enabled, leading to a
situation where the MMCX rail is at MIN_SVS but the core clock is at its
highest rate. In this state, the rail cannot sustain the clock rate,
which may cause instability or system crash.

Fix this by setting the corresponding OPP corner during both power-on
and power-off sequences to ensure proper alignment of rail voltage and
clock frequency.

Fixes: b0530eb11913 ("drm/msm/dpu: Use OPP API to set clk/perf state")

Signed-off-by: Yuanjie Yang <yuanjie.yang@oss.qualcomm.com>
---
 drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c | 16 ++++++++++++----
 drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h |  3 +++
 2 files changed, 15 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c
index 0623f1dbed97..c31488335f2b 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c
@@ -1306,9 +1306,14 @@ static int dpu_kms_init(struct drm_device *ddev)
 	struct dpu_kms *dpu_kms = to_dpu_kms(priv->kms);
 	struct dev_pm_opp *opp;
 	int ret = 0;
-	unsigned long max_freq = ULONG_MAX;
+	dpu_kms->max_freq = ULONG_MAX;
+	dpu_kms->min_freq = 0;
 
-	opp = dev_pm_opp_find_freq_floor(dev, &max_freq);
+	opp = dev_pm_opp_find_freq_floor(dev, &dpu_kms->max_freq);
+	if (!IS_ERR(opp))
+		dev_pm_opp_put(opp);
+
+	opp = dev_pm_opp_find_freq_ceil(dev, &dpu_kms->min_freq);
 	if (!IS_ERR(opp))
 		dev_pm_opp_put(opp);
 
@@ -1461,8 +1466,8 @@ static int __maybe_unused dpu_runtime_suspend(struct device *dev)
 	struct msm_drm_private *priv = platform_get_drvdata(pdev);
 	struct dpu_kms *dpu_kms = to_dpu_kms(priv->kms);
 
-	/* Drop the performance state vote */
-	dev_pm_opp_set_rate(dev, 0);
+	/* adjust the performance state vote to low performance state */
+	dev_pm_opp_set_rate(dev, dpu_kms->min_freq);
 	clk_bulk_disable_unprepare(dpu_kms->num_clocks, dpu_kms->clocks);
 
 	for (i = 0; i < dpu_kms->num_paths; i++)
@@ -1481,6 +1486,9 @@ static int __maybe_unused dpu_runtime_resume(struct device *dev)
 	struct drm_device *ddev;
 
 	ddev = dpu_kms->dev;
+	/* adjust the performance state vote to high performance state */
+	if (dpu_kms->max_freq != ULONG_MAX)
+		dev_pm_opp_set_rate(dev, dpu_kms->max_freq);
 
 	rc = clk_bulk_prepare_enable(dpu_kms->num_clocks, dpu_kms->clocks);
 	if (rc) {
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h
index 993cf512f8c5..8d2595d8a5f6 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h
@@ -92,6 +92,9 @@ struct dpu_kms {
 	struct clk_bulk_data *clocks;
 	size_t num_clocks;
 
+	unsigned long max_freq;
+	unsigned long min_freq;
+
 	/* reference count bandwidth requests, so we know when we can
 	 * release bandwidth.  Each atomic update increments, and frame-
 	 * done event decrements.  Additionally, for video mode, the
-- 
2.34.1
Re: [PATCH 1/2] drm/msm/dpu: fix mismatch between power and frequency
Posted by Dmitry Baryshkov 1 month ago
On Fri, Jan 09, 2026 at 04:38:07PM +0800, yuanjie yang wrote:
> From: Yuanjie Yang <yuanjie.yang@oss.qualcomm.com>
> 
> During DPU runtime suspend, calling dev_pm_opp_set_rate(dev, 0) drops
> the MMCX rail to MIN_SVS while the core clock frequency remains at its
> original (highest) rate. When runtime resume re-enables the clock, this
> may result in a mismatch between the rail voltage and the clock rate.
> 
> For example, in the DPU bind path, the sequence could be:
>   cpu0: dev_sync_state -> rpmhpd_sync_state
>   cpu1:                                     dpu_kms_hw_init
> timeline 0 ------------------------------------------------> t
> 
> After rpmhpd_sync_state, the voltage performance is no longer guaranteed
> to stay at the highest level. During dpu_kms_hw_init, calling
> dev_pm_opp_set_rate(dev, 0) drops the voltage, causing the MMCX rail to
> fall to MIN_SVS while the core clock is still at its maximum frequency.

Ah, I see. dev_pm_set_rate(0) transforms to  _disable_opp_table(), which
doesn't do anything with clocks. I think we should have a call to
clk_disable_unprepare() before that and clk_prepare_enable() in the
resume path.

> When the power is re-enabled, only the clock is enabled, leading to a
> situation where the MMCX rail is at MIN_SVS but the core clock is at its
> highest rate. In this state, the rail cannot sustain the clock rate,
> which may cause instability or system crash.
> 
> Fix this by setting the corresponding OPP corner during both power-on
> and power-off sequences to ensure proper alignment of rail voltage and
> clock frequency.
> 
> Fixes: b0530eb11913 ("drm/msm/dpu: Use OPP API to set clk/perf state")
> 
> Signed-off-by: Yuanjie Yang <yuanjie.yang@oss.qualcomm.com>

No empty lines between the tags. Also please cc stable.

> ---
>  drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c | 16 ++++++++++++----
>  drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h |  3 +++
>  2 files changed, 15 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c
> index 0623f1dbed97..c31488335f2b 100644
> --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c
> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c
> @@ -1306,9 +1306,14 @@ static int dpu_kms_init(struct drm_device *ddev)
>  	struct dpu_kms *dpu_kms = to_dpu_kms(priv->kms);
>  	struct dev_pm_opp *opp;
>  	int ret = 0;
> -	unsigned long max_freq = ULONG_MAX;
> +	dpu_kms->max_freq = ULONG_MAX;
> +	dpu_kms->min_freq = 0;
>  
> -	opp = dev_pm_opp_find_freq_floor(dev, &max_freq);
> +	opp = dev_pm_opp_find_freq_floor(dev, &dpu_kms->max_freq);
> +	if (!IS_ERR(opp))
> +		dev_pm_opp_put(opp);
> +
> +	opp = dev_pm_opp_find_freq_ceil(dev, &dpu_kms->min_freq);
>  	if (!IS_ERR(opp))
>  		dev_pm_opp_put(opp);
>  
> @@ -1461,8 +1466,8 @@ static int __maybe_unused dpu_runtime_suspend(struct device *dev)
>  	struct msm_drm_private *priv = platform_get_drvdata(pdev);
>  	struct dpu_kms *dpu_kms = to_dpu_kms(priv->kms);
>  
> -	/* Drop the performance state vote */
> -	dev_pm_opp_set_rate(dev, 0);
> +	/* adjust the performance state vote to low performance state */
> +	dev_pm_opp_set_rate(dev, dpu_kms->min_freq);

Here min_freq is the minumum working frequency, which will keep it
ticking at a high frequency.  I think we are supposed to turn it off
(well, switch to XO). Would it be enough to swap these two lines
instead?

>  	clk_bulk_disable_unprepare(dpu_kms->num_clocks, dpu_kms->clocks);
>  
>  	for (i = 0; i < dpu_kms->num_paths; i++)
> @@ -1481,6 +1486,9 @@ static int __maybe_unused dpu_runtime_resume(struct device *dev)
>  	struct drm_device *ddev;
>  
>  	ddev = dpu_kms->dev;
> +	/* adjust the performance state vote to high performance state */
> +	if (dpu_kms->max_freq != ULONG_MAX)
> +		dev_pm_opp_set_rate(dev, dpu_kms->max_freq);

This one should not be necessary, we should be setting the performance
point while comitting the DRM state.

>  
>  	rc = clk_bulk_prepare_enable(dpu_kms->num_clocks, dpu_kms->clocks);
>  	if (rc) {
> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h
> index 993cf512f8c5..8d2595d8a5f6 100644
> --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h
> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h
> @@ -92,6 +92,9 @@ struct dpu_kms {
>  	struct clk_bulk_data *clocks;
>  	size_t num_clocks;
>  
> +	unsigned long max_freq;
> +	unsigned long min_freq;
> +
>  	/* reference count bandwidth requests, so we know when we can
>  	 * release bandwidth.  Each atomic update increments, and frame-
>  	 * done event decrements.  Additionally, for video mode, the
> -- 
> 2.34.1
> 

-- 
With best wishes
Dmitry
Re: [PATCH 1/2] drm/msm/dpu: fix mismatch between power and frequency
Posted by yuanjiey 4 weeks ago
On Fri, Jan 09, 2026 at 05:22:37PM +0200, Dmitry Baryshkov wrote:
> On Fri, Jan 09, 2026 at 04:38:07PM +0800, yuanjie yang wrote:
> > From: Yuanjie Yang <yuanjie.yang@oss.qualcomm.com>
> > 
> > During DPU runtime suspend, calling dev_pm_opp_set_rate(dev, 0) drops
> > the MMCX rail to MIN_SVS while the core clock frequency remains at its
> > original (highest) rate. When runtime resume re-enables the clock, this
> > may result in a mismatch between the rail voltage and the clock rate.
> > 
> > For example, in the DPU bind path, the sequence could be:
> >   cpu0: dev_sync_state -> rpmhpd_sync_state
> >   cpu1:                                     dpu_kms_hw_init
> > timeline 0 ------------------------------------------------> t
> > 
> > After rpmhpd_sync_state, the voltage performance is no longer guaranteed
> > to stay at the highest level. During dpu_kms_hw_init, calling
> > dev_pm_opp_set_rate(dev, 0) drops the voltage, causing the MMCX rail to
> > fall to MIN_SVS while the core clock is still at its maximum frequency.
> 
> Ah, I see. dev_pm_set_rate(0) transforms to  _disable_opp_table(), which
> doesn't do anything with clocks. I think we should have a call to
> clk_disable_unprepare() before that and clk_prepare_enable() in the
> resume path.

I do check the backtrace on kaanapali: 

active_corner = 3 (MIN_SVS)
rpmhpd_aggregate_corner+0x168/0x1d0
rpmhpd_set_performance_state+0x84/0xd0
_genpd_set_performance_state+0x50/0x198
genpd_set_performance_state.isra.0+0xbc/0xdc
genpd_dev_pm_set_performance_state+0x60/0xc4
dev_pm_domain_set_performance_state+0x24/0x3c
_set_opp+0x370/0x584
dev_pm_opp_set_rate+0x258/0x2b4
dpu_runtime_suspend+0x50/0x11c [msm]
pm_generic_runtime_suspend+0x2c/0x44
genpd_runtime_suspend+0xac/0x2d0
__rpm_callback+0x48/0x19c
rpm_callback+0x74/0x80
rpm_suspend+0x108/0x588
rpm_idle+0xe8/0x190
__pm_runtime_idle+0x50/0x94
dpu_kms_hw_init+0x3a0/0x4a8 

dev_pm_opp_set_rate(dev, 0); just low power to MIN_SVS. 
And freq MDP: 650MHz


And I try change the order:
from: 
	dev_pm_opp_set_rate(dev, 0);
	clk_bulk_disable_unprepare(dpu_kms->num_clocks, dpu_kms->clocks);
to:
	clk_bulk_disable_unprepare(dpu_kms->num_clocks, dpu_kms->clocks);
	dev_pm_opp_set_rate(dev, 0);

But the issue is still here.

 
> > When the power is re-enabled, only the clock is enabled, leading to a
> > situation where the MMCX rail is at MIN_SVS but the core clock is at its
> > highest rate. In this state, the rail cannot sustain the clock rate,
> > which may cause instability or system crash.
> > 
> > Fix this by setting the corresponding OPP corner during both power-on
> > and power-off sequences to ensure proper alignment of rail voltage and
> > clock frequency.
> > 
> > Fixes: b0530eb11913 ("drm/msm/dpu: Use OPP API to set clk/perf state")
> > 
> > Signed-off-by: Yuanjie Yang <yuanjie.yang@oss.qualcomm.com>
> 
> No empty lines between the tags. Also please cc stable.

Sure, will fix.

> > ---
> >  drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c | 16 ++++++++++++----
> >  drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h |  3 +++
> >  2 files changed, 15 insertions(+), 4 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c
> > index 0623f1dbed97..c31488335f2b 100644
> > --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c
> > +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c
> > @@ -1306,9 +1306,14 @@ static int dpu_kms_init(struct drm_device *ddev)
> >  	struct dpu_kms *dpu_kms = to_dpu_kms(priv->kms);
> >  	struct dev_pm_opp *opp;
> >  	int ret = 0;
> > -	unsigned long max_freq = ULONG_MAX;
> > +	dpu_kms->max_freq = ULONG_MAX;
> > +	dpu_kms->min_freq = 0;
> >  
> > -	opp = dev_pm_opp_find_freq_floor(dev, &max_freq);
> > +	opp = dev_pm_opp_find_freq_floor(dev, &dpu_kms->max_freq);
> > +	if (!IS_ERR(opp))
> > +		dev_pm_opp_put(opp);
> > +
> > +	opp = dev_pm_opp_find_freq_ceil(dev, &dpu_kms->min_freq);
> >  	if (!IS_ERR(opp))
> >  		dev_pm_opp_put(opp);
> >  
> > @@ -1461,8 +1466,8 @@ static int __maybe_unused dpu_runtime_suspend(struct device *dev)
> >  	struct msm_drm_private *priv = platform_get_drvdata(pdev);
> >  	struct dpu_kms *dpu_kms = to_dpu_kms(priv->kms);
> >  
> > -	/* Drop the performance state vote */
> > -	dev_pm_opp_set_rate(dev, 0);
> > +	/* adjust the performance state vote to low performance state */
> > +	dev_pm_opp_set_rate(dev, dpu_kms->min_freq);
> 
> Here min_freq is the minumum working frequency, which will keep it
> ticking at a high frequency.  I think we are supposed to turn it off
> (well, switch to XO). Would it be enough to swap these two lines
> instead? 

Yes, just clk_bulk_disable_unprepare(dpu_kms->num_clocks, dpu_kms->clocks) to
disable clk is OK. 
we can drop change here.

> >  	clk_bulk_disable_unprepare(dpu_kms->num_clocks, dpu_kms->clocks);
> >  
> >  	for (i = 0; i < dpu_kms->num_paths; i++)
> > @@ -1481,6 +1486,9 @@ static int __maybe_unused dpu_runtime_resume(struct device *dev)
> >  	struct drm_device *ddev;
> >  
> >  	ddev = dpu_kms->dev;
> > +	/* adjust the performance state vote to high performance state */
> > +	if (dpu_kms->max_freq != ULONG_MAX)
> > +		dev_pm_opp_set_rate(dev, dpu_kms->max_freq);
> 
> This one should not be necessary, we should be setting the performance
> point while comitting the DRM state.

No, issue is during dpu binding path. And in msm_drm_kms_init driver just
pm_runtime_resume_and_get enable power and pm_runtime_put_sync disable power.
But We do not set the appropriate OPP each time the power is enabled. 
As a result, a situation may occur where the rail stays at MIN_SVS while the 
MDP is running at a high frequency.

rpm_idle+0xe8/0x190
__pm_runtime_idle+0x50/0x94
dpu_kms_hw_init+0x3a0/0x4a8 [msm]
msm_drm_kms_init+0xb8/0x340 [msm]
msm_drm_init+0x16c/0x22c [msm]
msm_drm_bind+0x30/0x3c [msm]
try_to_bring_up_aggregate_device+0x168/0x1d4
__component_add+0xa4/0x170
component_add+0x14/0x20
dsi_dev_attach+0x20/0x2c [msm]
dsi_host_attach+0x58/0x98 [msm]
mipi_dsi_attach+0x30/0x54
novatek_nt37801_probe+0x13c/0x1c8 [panel_novatek_nt37801]

And I found a a similar bug.
https://lore.kernel.org/all/20220915205559.14574-1-quic_bjorande@quicinc.com/

Since the panel driver does not hold the property power-domains = <&rpmhpd RPMHPD_MMCX> 
once all drivers that do own the RPMHPD_MMCX power domain finish probing, 
the interconnect’s dev_sync_state is triggered, which eventually runs rpmhpd_sync_state 
and starts dynamic voltage adjustment. This is when the issue occurs.


if do change below, this issue can also be fixed.
&mdss_dsi0 {
    ...
	panel@0 {
		compatible = "novatek,nt37801";
        	...
	++	power-domains = <&rpmhpd RPMHPD_MMCX>;
    }
}
But I don't think panel driver should own a power-domains = <&rpmhpd RPMHPD_MMCX>;



Thanks,
Yuanjie

> >  
> >  	rc = clk_bulk_prepare_enable(dpu_kms->num_clocks, dpu_kms->clocks);
> >  	if (rc) {
> > diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h
> > index 993cf512f8c5..8d2595d8a5f6 100644
> > --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h
> > +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h
> > @@ -92,6 +92,9 @@ struct dpu_kms {
> >  	struct clk_bulk_data *clocks;
> >  	size_t num_clocks;
> >  
> > +	unsigned long max_freq;
> > +	unsigned long min_freq;
> > +
> >  	/* reference count bandwidth requests, so we know when we can
> >  	 * release bandwidth.  Each atomic update increments, and frame-
> >  	 * done event decrements.  Additionally, for video mode, the
> > -- 
> > 2.34.1
> > 
> 
> -- 
> With best wishes
> Dmitry
Re: [PATCH 1/2] drm/msm/dpu: fix mismatch between power and frequency
Posted by Dmitry Baryshkov 4 weeks ago
On Mon, 12 Jan 2026 at 08:23, yuanjiey <yuanjie.yang@oss.qualcomm.com> wrote:
>
> On Fri, Jan 09, 2026 at 05:22:37PM +0200, Dmitry Baryshkov wrote:
> > On Fri, Jan 09, 2026 at 04:38:07PM +0800, yuanjie yang wrote:
> > > From: Yuanjie Yang <yuanjie.yang@oss.qualcomm.com>
> > >
> > > During DPU runtime suspend, calling dev_pm_opp_set_rate(dev, 0) drops
> > > the MMCX rail to MIN_SVS while the core clock frequency remains at its
> > > original (highest) rate. When runtime resume re-enables the clock, this
> > > may result in a mismatch between the rail voltage and the clock rate.
> > >
> > > For example, in the DPU bind path, the sequence could be:
> > >   cpu0: dev_sync_state -> rpmhpd_sync_state
> > >   cpu1:                                     dpu_kms_hw_init
> > > timeline 0 ------------------------------------------------> t
> > >
> > > After rpmhpd_sync_state, the voltage performance is no longer guaranteed
> > > to stay at the highest level. During dpu_kms_hw_init, calling
> > > dev_pm_opp_set_rate(dev, 0) drops the voltage, causing the MMCX rail to
> > > fall to MIN_SVS while the core clock is still at its maximum frequency.
> >
> > Ah, I see. dev_pm_set_rate(0) transforms to  _disable_opp_table(), which
> > doesn't do anything with clocks. I think we should have a call to
> > clk_disable_unprepare() before that and clk_prepare_enable() in the
> > resume path.
>
> I do check the backtrace on kaanapali:
>
> active_corner = 3 (MIN_SVS)
> rpmhpd_aggregate_corner+0x168/0x1d0
> rpmhpd_set_performance_state+0x84/0xd0
> _genpd_set_performance_state+0x50/0x198
> genpd_set_performance_state.isra.0+0xbc/0xdc
> genpd_dev_pm_set_performance_state+0x60/0xc4
> dev_pm_domain_set_performance_state+0x24/0x3c
> _set_opp+0x370/0x584
> dev_pm_opp_set_rate+0x258/0x2b4
> dpu_runtime_suspend+0x50/0x11c [msm]
> pm_generic_runtime_suspend+0x2c/0x44
> genpd_runtime_suspend+0xac/0x2d0
> __rpm_callback+0x48/0x19c
> rpm_callback+0x74/0x80
> rpm_suspend+0x108/0x588
> rpm_idle+0xe8/0x190
> __pm_runtime_idle+0x50/0x94
> dpu_kms_hw_init+0x3a0/0x4a8
>
> dev_pm_opp_set_rate(dev, 0); just low power to MIN_SVS.
> And freq MDP: 650MHz
>
>
> And I try change the order:
> from:
>         dev_pm_opp_set_rate(dev, 0);
>         clk_bulk_disable_unprepare(dpu_kms->num_clocks, dpu_kms->clocks);
> to:
>         clk_bulk_disable_unprepare(dpu_kms->num_clocks, dpu_kms->clocks);
>         dev_pm_opp_set_rate(dev, 0);
>
> But the issue is still here.

But which clock is still demanding higher MMCX voltage? All DPU clocks
should be turned off at this point.

>
>
> > > When the power is re-enabled, only the clock is enabled, leading to a
> > > situation where the MMCX rail is at MIN_SVS but the core clock is at its
> > > highest rate. In this state, the rail cannot sustain the clock rate,
> > > which may cause instability or system crash.
> > >
> > > Fix this by setting the corresponding OPP corner during both power-on
> > > and power-off sequences to ensure proper alignment of rail voltage and
> > > clock frequency.
> > >
> > > Fixes: b0530eb11913 ("drm/msm/dpu: Use OPP API to set clk/perf state")
> > >
> > > Signed-off-by: Yuanjie Yang <yuanjie.yang@oss.qualcomm.com>
> >
> > No empty lines between the tags. Also please cc stable.
>
> Sure, will fix.
>
> > > ---
> > >  drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c | 16 ++++++++++++----
> > >  drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h |  3 +++
> > >  2 files changed, 15 insertions(+), 4 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c
> > > index 0623f1dbed97..c31488335f2b 100644
> > > --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c
> > > +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c
> > > @@ -1306,9 +1306,14 @@ static int dpu_kms_init(struct drm_device *ddev)
> > >     struct dpu_kms *dpu_kms = to_dpu_kms(priv->kms);
> > >     struct dev_pm_opp *opp;
> > >     int ret = 0;
> > > -   unsigned long max_freq = ULONG_MAX;
> > > +   dpu_kms->max_freq = ULONG_MAX;
> > > +   dpu_kms->min_freq = 0;
> > >
> > > -   opp = dev_pm_opp_find_freq_floor(dev, &max_freq);
> > > +   opp = dev_pm_opp_find_freq_floor(dev, &dpu_kms->max_freq);
> > > +   if (!IS_ERR(opp))
> > > +           dev_pm_opp_put(opp);
> > > +
> > > +   opp = dev_pm_opp_find_freq_ceil(dev, &dpu_kms->min_freq);
> > >     if (!IS_ERR(opp))
> > >             dev_pm_opp_put(opp);
> > >
> > > @@ -1461,8 +1466,8 @@ static int __maybe_unused dpu_runtime_suspend(struct device *dev)
> > >     struct msm_drm_private *priv = platform_get_drvdata(pdev);
> > >     struct dpu_kms *dpu_kms = to_dpu_kms(priv->kms);
> > >
> > > -   /* Drop the performance state vote */
> > > -   dev_pm_opp_set_rate(dev, 0);
> > > +   /* adjust the performance state vote to low performance state */
> > > +   dev_pm_opp_set_rate(dev, dpu_kms->min_freq);
> >
> > Here min_freq is the minumum working frequency, which will keep it
> > ticking at a high frequency.  I think we are supposed to turn it off
> > (well, switch to XO). Would it be enough to swap these two lines
> > instead?
>
> Yes, just clk_bulk_disable_unprepare(dpu_kms->num_clocks, dpu_kms->clocks) to
> disable clk is OK.
> we can drop change here.
>
> > >     clk_bulk_disable_unprepare(dpu_kms->num_clocks, dpu_kms->clocks);
> > >
> > >     for (i = 0; i < dpu_kms->num_paths; i++)
> > > @@ -1481,6 +1486,9 @@ static int __maybe_unused dpu_runtime_resume(struct device *dev)
> > >     struct drm_device *ddev;
> > >
> > >     ddev = dpu_kms->dev;
> > > +   /* adjust the performance state vote to high performance state */
> > > +   if (dpu_kms->max_freq != ULONG_MAX)
> > > +           dev_pm_opp_set_rate(dev, dpu_kms->max_freq);
> >
> > This one should not be necessary, we should be setting the performance
> > point while comitting the DRM state.
>
> No, issue is during dpu binding path. And in msm_drm_kms_init driver just
> pm_runtime_resume_and_get enable power and pm_runtime_put_sync disable power.
> But We do not set the appropriate OPP each time the power is enabled.
> As a result, a situation may occur where the rail stays at MIN_SVS while the
> MDP is running at a high frequency.

Please move dev_pm_opp_set_rate() from dpu_kms_init() to dpu_kms_hw_init().

>
> rpm_idle+0xe8/0x190
> __pm_runtime_idle+0x50/0x94
> dpu_kms_hw_init+0x3a0/0x4a8 [msm]
> msm_drm_kms_init+0xb8/0x340 [msm]
> msm_drm_init+0x16c/0x22c [msm]
> msm_drm_bind+0x30/0x3c [msm]
> try_to_bring_up_aggregate_device+0x168/0x1d4
> __component_add+0xa4/0x170
> component_add+0x14/0x20
> dsi_dev_attach+0x20/0x2c [msm]
> dsi_host_attach+0x58/0x98 [msm]
> mipi_dsi_attach+0x30/0x54
> novatek_nt37801_probe+0x13c/0x1c8 [panel_novatek_nt37801]
>
> And I found a a similar bug.
> https://lore.kernel.org/all/20220915205559.14574-1-quic_bjorande@quicinc.com/
>
> Since the panel driver does not hold the property power-domains = <&rpmhpd RPMHPD_MMCX>
> once all drivers that do own the RPMHPD_MMCX power domain finish probing,
> the interconnect’s dev_sync_state is triggered, which eventually runs rpmhpd_sync_state
> and starts dynamic voltage adjustment. This is when the issue occurs.
>
>
> if do change below, this issue can also be fixed.
> &mdss_dsi0 {
>     ...
>         panel@0 {
>                 compatible = "novatek,nt37801";
>                 ...
>         ++      power-domains = <&rpmhpd RPMHPD_MMCX>;
>     }
> }
> But I don't think panel driver should own a power-domains = <&rpmhpd RPMHPD_MMCX>;

That's not related.

>
>
>
> Thanks,
> Yuanjie
>
> > >
> > >     rc = clk_bulk_prepare_enable(dpu_kms->num_clocks, dpu_kms->clocks);
> > >     if (rc) {
> > > diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h
> > > index 993cf512f8c5..8d2595d8a5f6 100644
> > > --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h
> > > +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h
> > > @@ -92,6 +92,9 @@ struct dpu_kms {
> > >     struct clk_bulk_data *clocks;
> > >     size_t num_clocks;
> > >
> > > +   unsigned long max_freq;
> > > +   unsigned long min_freq;
> > > +
> > >     /* reference count bandwidth requests, so we know when we can
> > >      * release bandwidth.  Each atomic update increments, and frame-
> > >      * done event decrements.  Additionally, for video mode, the
> > > --
> > > 2.34.1
> > >
> >
> > --
> > With best wishes
> > Dmitry



-- 
With best wishes
Dmitry
Re: [PATCH 1/2] drm/msm/dpu: fix mismatch between power and frequency
Posted by yuanjiey 4 weeks ago
On Mon, Jan 12, 2026 at 09:38:41AM +0200, Dmitry Baryshkov wrote:
> On Mon, 12 Jan 2026 at 08:23, yuanjiey <yuanjie.yang@oss.qualcomm.com> wrote:
> >
> > On Fri, Jan 09, 2026 at 05:22:37PM +0200, Dmitry Baryshkov wrote:
> > > On Fri, Jan 09, 2026 at 04:38:07PM +0800, yuanjie yang wrote:
> > > > From: Yuanjie Yang <yuanjie.yang@oss.qualcomm.com>
> > > >
> > > > During DPU runtime suspend, calling dev_pm_opp_set_rate(dev, 0) drops
> > > > the MMCX rail to MIN_SVS while the core clock frequency remains at its
> > > > original (highest) rate. When runtime resume re-enables the clock, this
> > > > may result in a mismatch between the rail voltage and the clock rate.
> > > >
> > > > For example, in the DPU bind path, the sequence could be:
> > > >   cpu0: dev_sync_state -> rpmhpd_sync_state
> > > >   cpu1:                                     dpu_kms_hw_init
> > > > timeline 0 ------------------------------------------------> t
> > > >
> > > > After rpmhpd_sync_state, the voltage performance is no longer guaranteed
> > > > to stay at the highest level. During dpu_kms_hw_init, calling
> > > > dev_pm_opp_set_rate(dev, 0) drops the voltage, causing the MMCX rail to
> > > > fall to MIN_SVS while the core clock is still at its maximum frequency.
> > >
> > > Ah, I see. dev_pm_set_rate(0) transforms to  _disable_opp_table(), which
> > > doesn't do anything with clocks. I think we should have a call to
> > > clk_disable_unprepare() before that and clk_prepare_enable() in the
> > > resume path.
> >
> > I do check the backtrace on kaanapali:
> >
> > active_corner = 3 (MIN_SVS)
> > rpmhpd_aggregate_corner+0x168/0x1d0
> > rpmhpd_set_performance_state+0x84/0xd0
> > _genpd_set_performance_state+0x50/0x198
> > genpd_set_performance_state.isra.0+0xbc/0xdc
> > genpd_dev_pm_set_performance_state+0x60/0xc4
> > dev_pm_domain_set_performance_state+0x24/0x3c
> > _set_opp+0x370/0x584
> > dev_pm_opp_set_rate+0x258/0x2b4
> > dpu_runtime_suspend+0x50/0x11c [msm]
> > pm_generic_runtime_suspend+0x2c/0x44
> > genpd_runtime_suspend+0xac/0x2d0
> > __rpm_callback+0x48/0x19c
> > rpm_callback+0x74/0x80
> > rpm_suspend+0x108/0x588
> > rpm_idle+0xe8/0x190
> > __pm_runtime_idle+0x50/0x94
> > dpu_kms_hw_init+0x3a0/0x4a8
> >
> > dev_pm_opp_set_rate(dev, 0); just low power to MIN_SVS.
> > And freq MDP: 650MHz
> >
> >
> > And I try change the order:
> > from:
> >         dev_pm_opp_set_rate(dev, 0);
> >         clk_bulk_disable_unprepare(dpu_kms->num_clocks, dpu_kms->clocks);
> > to:
> >         clk_bulk_disable_unprepare(dpu_kms->num_clocks, dpu_kms->clocks);
> >         dev_pm_opp_set_rate(dev, 0);
> >
> > But the issue is still here.
> 
> But which clock is still demanding higher MMCX voltage? All DPU clocks
> should be turned off at this point.
Yes, no DPU clock demand higher MMCX voltage here. But next time pm_runtime_get_sync
need higher MMCX voltagei due to high freq.
 
> >
> >
> > > > When the power is re-enabled, only the clock is enabled, leading to a
> > > > situation where the MMCX rail is at MIN_SVS but the core clock is at its
> > > > highest rate. In this state, the rail cannot sustain the clock rate,
> > > > which may cause instability or system crash.
> > > >
> > > > Fix this by setting the corresponding OPP corner during both power-on
> > > > and power-off sequences to ensure proper alignment of rail voltage and
> > > > clock frequency.
> > > >
> > > > Fixes: b0530eb11913 ("drm/msm/dpu: Use OPP API to set clk/perf state")
> > > >
> > > > Signed-off-by: Yuanjie Yang <yuanjie.yang@oss.qualcomm.com>
> > >
> > > No empty lines between the tags. Also please cc stable.
> >
> > Sure, will fix.
> >
> > > > ---
> > > >  drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c | 16 ++++++++++++----
> > > >  drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h |  3 +++
> > > >  2 files changed, 15 insertions(+), 4 deletions(-)
> > > >
> > > > diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c
> > > > index 0623f1dbed97..c31488335f2b 100644
> > > > --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c
> > > > +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c
> > > > @@ -1306,9 +1306,14 @@ static int dpu_kms_init(struct drm_device *ddev)
> > > >     struct dpu_kms *dpu_kms = to_dpu_kms(priv->kms);
> > > >     struct dev_pm_opp *opp;
> > > >     int ret = 0;
> > > > -   unsigned long max_freq = ULONG_MAX;
> > > > +   dpu_kms->max_freq = ULONG_MAX;
> > > > +   dpu_kms->min_freq = 0;
> > > >
> > > > -   opp = dev_pm_opp_find_freq_floor(dev, &max_freq);
> > > > +   opp = dev_pm_opp_find_freq_floor(dev, &dpu_kms->max_freq);
> > > > +   if (!IS_ERR(opp))
> > > > +           dev_pm_opp_put(opp);
> > > > +
> > > > +   opp = dev_pm_opp_find_freq_ceil(dev, &dpu_kms->min_freq);
> > > >     if (!IS_ERR(opp))
> > > >             dev_pm_opp_put(opp);
> > > >
> > > > @@ -1461,8 +1466,8 @@ static int __maybe_unused dpu_runtime_suspend(struct device *dev)
> > > >     struct msm_drm_private *priv = platform_get_drvdata(pdev);
> > > >     struct dpu_kms *dpu_kms = to_dpu_kms(priv->kms);
> > > >
> > > > -   /* Drop the performance state vote */
> > > > -   dev_pm_opp_set_rate(dev, 0);
> > > > +   /* adjust the performance state vote to low performance state */
> > > > +   dev_pm_opp_set_rate(dev, dpu_kms->min_freq);
> > >
> > > Here min_freq is the minumum working frequency, which will keep it
> > > ticking at a high frequency.  I think we are supposed to turn it off
> > > (well, switch to XO). Would it be enough to swap these two lines
> > > instead?
> >
> > Yes, just clk_bulk_disable_unprepare(dpu_kms->num_clocks, dpu_kms->clocks) to
> > disable clk is OK.
> > we can drop change here.
> >
> > > >     clk_bulk_disable_unprepare(dpu_kms->num_clocks, dpu_kms->clocks);
> > > >
> > > >     for (i = 0; i < dpu_kms->num_paths; i++)
> > > > @@ -1481,6 +1486,9 @@ static int __maybe_unused dpu_runtime_resume(struct device *dev)
> > > >     struct drm_device *ddev;
> > > >
> > > >     ddev = dpu_kms->dev;
> > > > +   /* adjust the performance state vote to high performance state */
> > > > +   if (dpu_kms->max_freq != ULONG_MAX)
> > > > +           dev_pm_opp_set_rate(dev, dpu_kms->max_freq);
> > >
> > > This one should not be necessary, we should be setting the performance
> > > point while comitting the DRM state.
> >
> > No, issue is during dpu binding path. And in msm_drm_kms_init driver just
> > pm_runtime_resume_and_get enable power and pm_runtime_put_sync disable power.
> > But We do not set the appropriate OPP each time the power is enabled.
> > As a result, a situation may occur where the rail stays at MIN_SVS while the
> > MDP is running at a high frequency.
> 
> Please move dev_pm_opp_set_rate() from dpu_kms_init() to dpu_kms_hw_init().

During dpu_kms_hw_init and msm_drm_kms_init and msm_drm_kms_post_init.

There are multiple places where pm_runtime_get_sync(pm_runtime_resume_and_get)and pm_runtime_put_sync are called.
And each time after pm_runtime_get_sync(pm_runtime_resume_and_get) will access register depond on MDP clk.

Do I need to add dev_pm_opp_set_rate after every pm_runtime_get_sync and pm_runtime_resume_and_get?

pm_runtime_get_sync
dev_pm_opp_set_rate
"access register"
pm_runtime_put_sync

pm_runtime_resume_and_get
dev_pm_opp_set_rate
"access register"
pm_runtime_put_sync

Thanks,
Yuanjie

> >
> > rpm_idle+0xe8/0x190
> > __pm_runtime_idle+0x50/0x94
> > dpu_kms_hw_init+0x3a0/0x4a8 [msm]
> > msm_drm_kms_init+0xb8/0x340 [msm]
> > msm_drm_init+0x16c/0x22c [msm]
> > msm_drm_bind+0x30/0x3c [msm]
> > try_to_bring_up_aggregate_device+0x168/0x1d4
> > __component_add+0xa4/0x170
> > component_add+0x14/0x20
> > dsi_dev_attach+0x20/0x2c [msm]
> > dsi_host_attach+0x58/0x98 [msm]
> > mipi_dsi_attach+0x30/0x54
> > novatek_nt37801_probe+0x13c/0x1c8 [panel_novatek_nt37801]
> >
> > And I found a a similar bug.
> > https://lore.kernel.org/all/20220915205559.14574-1-quic_bjorande@quicinc.com/
> >
> > Since the panel driver does not hold the property power-domains = <&rpmhpd RPMHPD_MMCX>
> > once all drivers that do own the RPMHPD_MMCX power domain finish probing,
> > the interconnect’s dev_sync_state is triggered, which eventually runs rpmhpd_sync_state
> > and starts dynamic voltage adjustment. This is when the issue occurs.
> >
> >
> > if do change below, this issue can also be fixed.
> > &mdss_dsi0 {
> >     ...
> >         panel@0 {
> >                 compatible = "novatek,nt37801";
> >                 ...
> >         ++      power-domains = <&rpmhpd RPMHPD_MMCX>;
> >     }
> > }
> > But I don't think panel driver should own a power-domains = <&rpmhpd RPMHPD_MMCX>;
> 
> That's not related.
> 
> >
> >
> >
> > Thanks,
> > Yuanjie
> >
> > > >
> > > >     rc = clk_bulk_prepare_enable(dpu_kms->num_clocks, dpu_kms->clocks);
> > > >     if (rc) {
> > > > diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h
> > > > index 993cf512f8c5..8d2595d8a5f6 100644
> > > > --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h
> > > > +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h
> > > > @@ -92,6 +92,9 @@ struct dpu_kms {
> > > >     struct clk_bulk_data *clocks;
> > > >     size_t num_clocks;
> > > >
> > > > +   unsigned long max_freq;
> > > > +   unsigned long min_freq;
> > > > +
> > > >     /* reference count bandwidth requests, so we know when we can
> > > >      * release bandwidth.  Each atomic update increments, and frame-
> > > >      * done event decrements.  Additionally, for video mode, the
> > > > --
> > > > 2.34.1
> > > >
> > >
> > > --
> > > With best wishes
> > > Dmitry
> 
> 
> 
> -- 
> With best wishes
> Dmitry
Re: [PATCH 1/2] drm/msm/dpu: fix mismatch between power and frequency
Posted by Konrad Dybcio 1 month ago
On 1/9/26 9:38 AM, yuanjie yang wrote:
> From: Yuanjie Yang <yuanjie.yang@oss.qualcomm.com>
> 
> During DPU runtime suspend, calling dev_pm_opp_set_rate(dev, 0) drops
> the MMCX rail to MIN_SVS while the core clock frequency remains at its
> original (highest) rate. When runtime resume re-enables the clock, this
> may result in a mismatch between the rail voltage and the clock rate.
> 
> For example, in the DPU bind path, the sequence could be:
>   cpu0: dev_sync_state -> rpmhpd_sync_state
>   cpu1:                                     dpu_kms_hw_init
> timeline 0 ------------------------------------------------> t
> 
> After rpmhpd_sync_state, the voltage performance is no longer guaranteed
> to stay at the highest level. During dpu_kms_hw_init, calling
> dev_pm_opp_set_rate(dev, 0) drops the voltage, causing the MMCX rail to
> fall to MIN_SVS while the core clock is still at its maximum frequency.
> When the power is re-enabled, only the clock is enabled, leading to a
> situation where the MMCX rail is at MIN_SVS but the core clock is at its
> highest rate. In this state, the rail cannot sustain the clock rate,
> which may cause instability or system crash.

So what this message essentially says is that dev_pm_opp_set_rate(dev, 0)
doesn't actually set the rate of "0" (or any other rate, unless opp-level
is at play), nor does it disable the clock.

Seems like a couple of our drivers make this oversight..

I see that originally calling dev_pm_opp_set_rate(dev, 0) was forbidden,
up until Commit cd7ea582866f ("opp: Make dev_pm_opp_set_rate() handle freq
= 0 to drop performance votes")..

In fact,

$ rg 'dev_pm_opp_set_rate\(.*, 0\)'

returns exclusively Qualcomm drivers where I believe the intention is always
to disable the clock.. perhaps we should just do that instead. We don't have
to worry about setting F_MIN beforehand, because a disabled clock won't be
eating up power and when enabling it back up, calling opp_set_rate with a
non-zero frequency will bring back the rails to a suitable level

Konrad
Re: [PATCH 1/2] drm/msm/dpu: fix mismatch between power and frequency
Posted by yuanjiey 4 weeks ago
On Fri, Jan 09, 2026 at 11:44:48AM +0100, Konrad Dybcio wrote:
> On 1/9/26 9:38 AM, yuanjie yang wrote:
> > From: Yuanjie Yang <yuanjie.yang@oss.qualcomm.com>
> > 
> > During DPU runtime suspend, calling dev_pm_opp_set_rate(dev, 0) drops
> > the MMCX rail to MIN_SVS while the core clock frequency remains at its
> > original (highest) rate. When runtime resume re-enables the clock, this
> > may result in a mismatch between the rail voltage and the clock rate.
> > 
> > For example, in the DPU bind path, the sequence could be:
> >   cpu0: dev_sync_state -> rpmhpd_sync_state
> >   cpu1:                                     dpu_kms_hw_init
> > timeline 0 ------------------------------------------------> t
> > 
> > After rpmhpd_sync_state, the voltage performance is no longer guaranteed
> > to stay at the highest level. During dpu_kms_hw_init, calling
> > dev_pm_opp_set_rate(dev, 0) drops the voltage, causing the MMCX rail to
> > fall to MIN_SVS while the core clock is still at its maximum frequency.
> > When the power is re-enabled, only the clock is enabled, leading to a
> > situation where the MMCX rail is at MIN_SVS but the core clock is at its
> > highest rate. In this state, the rail cannot sustain the clock rate,
> > which may cause instability or system crash.
> 
> So what this message essentially says is that dev_pm_opp_set_rate(dev, 0)
> doesn't actually set the rate of "0" (or any other rate, unless opp-level
> is at play), nor does it disable the clock.
> 
> Seems like a couple of our drivers make this oversight..
> 
> I see that originally calling dev_pm_opp_set_rate(dev, 0) was forbidden,
> up until Commit cd7ea582866f ("opp: Make dev_pm_opp_set_rate() handle freq
> = 0 to drop performance votes")..
> 
> In fact,
> 
> $ rg 'dev_pm_opp_set_rate\(.*, 0\)'
> 
> returns exclusively Qualcomm drivers where I believe the intention is always
> to disable the clock.. perhaps we should just do that instead. We don't have
> to worry about setting F_MIN beforehand, because a disabled clock won't be
> eating up power and when enabling it back up, calling opp_set_rate with a
> non-zero frequency will bring back the rails to a suitable level

Yes, calling opp_set_rate with a non-zero frequency will bring back 
the rails to a suitable level. The other steps are unnecessary.

And here I just choose the highest freq, because I see 
dpu binding patch:
dpu_kms_init(struct drm_device *ddev) use highest freq to do initialization. 

I want to keep it consistent with the previous initialization logic.

Thanks,
Yuanjie

> 
> Konrad